International Journal of Research Available at https://edupediapublications.org/journals
p-ISSN: 2348-6848 e-ISSN: 2348-795X Volume 03 Issue 12
August 2016
Available online: http://internationaljournalofresearch.org/ P a g e | 203
Design of High Performance Baugh Wooley Multiplier Using Compressors
*T. MOUNIKA **MR. T. SAMMAIAH ***MR. G. BABU
*PG Scholar, Dept. of ECE, Vaagdevi College of Engineering, Warangal
**Assoc. Professor, Dept. of ECE, Vaagdevi College of Engineering
***Assoc. Professor, Dept. of ECE, Vaagdevi College of Engineering
Abstract
A multiplier is one of the key hardware
blocks in most digital and high-performance systems, such as FIR filters, digital signal
processors, and microprocessors. With advances in technology, many researchers have tried,
and are still trying, to design multipliers that offer high speed, low power consumption,
regularity of layout (and hence less area), or some combination of these, making them
suitable for various high-speed, low-power, and compact VLSI implementations. However, area
and speed are conflicting constraints: improving speed generally results in larger area. Here
we look for the best trade-off between the two. To improve speed, the Baugh-Wooley
multiplication technique is used for signed multiplication. Multiplication generally proceeds
in three basic steps: partial product generation, partial product reduction, and final
addition. In this paper we first design different adders and compare their speed and circuit
complexity, i.e. the area occupied. We then design a Wallace tree multiplier, followed by
conventional and proposed Wallace multipliers, and compare the speed and power consumption
of the two.
I. INTRODUCTION
Multiplication is a heavily used arithmetic operation that figures prominently in signal
processing and scientific applications. Typical DSP applications where a multiplier plays an
important role include digital filtering, digital communications, and spectral analysis. Many
current DSP applications target portable, battery-operated systems, so power dissipation is a
key design constraint. Because multipliers are rather complex circuits and must typically
operate at a high system clock rate, reducing the delay of a multiplier is a vital part of
satisfying the overall design. The ALU is the core of DSP and ASIC designs, where it is used
in comparison, convolution, correlation, and digital filtering. An ALU combines a variety of
arithmetic and logic operations into a single unit, and its speed depends greatly on its
multiplier circuit. This in turn increases the demand for high-speed multipliers that also
keep area low and power consumption moderate. Generation of partial products and their
accumulation are the two basic operations of multiplication. A binary multiplier is an
electronic circuit used in digital electronics, such as a computer, to multiply two binary
numbers. The aim is higher efficiency and lower power consumption while occupying less
silicon area. Among the various multipliers, the Baugh-Wooley multiplier is well known for
multiplying signed operands in 2's complement representation. The Wallace tree is an unsigned
multiplier in which the overall delay and the number of stages are reduced. A signed
Baugh-Wooley multiplier combined with a Wallace tree increases performance and reduces power
consumption.
Compressors are the basic components used for the summing operation; they are built using
only multiplexers and basic gates.
II. BAUGH-WOOLEY MULTIPLIER
The Baugh-Wooley array multiplier is an efficient way of multiplying both signed and
unsigned numbers. The Baugh-Wooley algorithm is used in the High Performance Multiplier (HPM)
tree, which inherits the regular and repeating structure of the array multiplier. The
Baugh-Wooley multiplier exhibits less delay and low power dissipation, and the area occupied is also
small compared to other array multipliers. The architecture of the Baugh-Wooley multiplier
is based on the carry-save algorithm.
WORKING
The multiplication algorithm can be represented as shown below. Here, two 4-bit numbers are
multiplied using the Baugh-Wooley algorithm, and the partial products are given by Pp0 to Pp6. The
MSBs are the sign bits, and the terms involving them are marked with a “bar”.
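As a rough illustration of the scheme described above, the following Python sketch multiplies two n-bit two's-complement numbers by forming the AND partial products, complementing the terms in which exactly one of the two operand bits is a sign bit, and adding constant correction bits at positions n and 2n-1. This is the commonly used form of the Baugh-Wooley rearrangement and is assumed here rather than taken from the paper's figure; the function name `baugh_wooley` and its parameters are hypothetical.

```python
def baugh_wooley(a: int, b: int, n: int = 4) -> int:
    """Multiply two n-bit two's-complement integers (illustrative sketch)."""
    mask = (1 << n) - 1
    ua, ub = a & mask, b & mask                 # raw n-bit patterns
    a_bits = [(ua >> i) & 1 for i in range(n)]
    b_bits = [(ub >> j) & 1 for j in range(n)]

    total = 0
    for i in range(n):
        for j in range(n):
            pp = a_bits[i] & b_bits[j]          # AND-gate partial product
            # Complement ("bar") terms where exactly one bit is a sign bit.
            if (i == n - 1) != (j == n - 1):
                pp ^= 1
            total += pp << (i + j)

    # Constant correction bits of the Baugh-Wooley rearrangement.
    total += (1 << n) + (1 << (2 * n - 1))
    total &= (1 << (2 * n)) - 1                 # keep 2n result bits

    # Reinterpret the 2n-bit pattern as a signed value.
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total
```

For 4-bit operands the sketch can be checked exhaustively against ordinary integer multiplication over the range -8 to 7.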
III. BACKGROUND ON COMPRESSORS
A. 3:2 Compressor (CM3EA1)
The existing 3:2 compressor [5] works in the same way as a full adder, with the transitions
000, 001, 010, 100, 101, 110, 111. The equations for sum and carry are given below. Using a
combination of XOR gates and a multiplexer improves performance and reduces area, because
the XOR gates increase performance and the multiplexer is used for the carry
operation.
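Since the equations themselves are not reproduced here, the following Python sketch shows one widely used XOR-plus-multiplexer formulation of a 3:2 compressor. It is an assumption for illustration, not necessarily the exact CM3EA1 equations of [5]; the helper names `mux` and `compressor_3_2` are hypothetical.

```python
def mux(sel: int, d0: int, d1: int) -> int:
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def compressor_3_2(x1: int, x2: int, x3: int) -> tuple[int, int]:
    """Return (sum, carry) so that x1 + x2 + x3 == sum + 2 * carry."""
    t = x1 ^ x2               # first XOR stage
    s = t ^ x3                # second XOR stage gives the sum bit
    carry = mux(t, x1, x3)    # x1 != x2: carry follows x3, else x1
    return s, carry
```

The multiplexer reuses the intermediate XOR output as its select line, so the carry needs no separate AND/OR network.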
B. 4:2 Compressor
Previous work on the 4:2 compressor shows a significant improvement in delay obtained by
replacing some XOR gates with multiplexers. The equations governing the outputs of the existing
4-2 compressor [5] are shown below.
Figure 2. 4-2 compressor
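The behaviour of a 4-2 compressor can be sketched in Python as follows. The equations used are the textbook XOR/multiplexer form (two cascaded full adders with the carries produced by multiplexers) and are assumed for illustration; they may differ in detail from those of [5]. The names `mux`, `compressor_4_2`, and `cin` are hypothetical.

```python
def mux(sel: int, d0: int, d1: int) -> int:
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def compressor_4_2(x1, x2, x3, x4, cin):
    """Return (sum, carry, cout) with
    x1 + x2 + x3 + x4 + cin == sum + 2 * (carry + cout)."""
    t12 = x1 ^ x2
    t34 = x3 ^ x4
    t = t12 ^ t34                 # XOR of all four data inputs
    s = t ^ cin                   # sum bit
    cout = mux(t12, x1, x3)       # independent of cin, so no carry ripple
    carry = mux(t, x4, cin)       # carry into the next column
    return s, carry, cout
```

Because `cout` does not depend on `cin`, a row of such compressors can be chained horizontally without creating a ripple path.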
C. 5-2 Compressor
The 5-2 compressor block has 5 inputs (X1, X2, X3, X4, X5) and 2 outputs (Sum and Carry), along
with 2 input carry bits (Cin1, Cin2) and 2 output carry bits (Cout1, Cout2), as shown in Fig. 3.
The input carry bits are the outputs of the neighbouring less significant compressor block, and
the output carry bits are passed on to the next more significant block.
In the proposed architecture these outputs are used efficiently by placing multiplexers at
selected stages in the circuit. Additional inverter stages are also eliminated. This in turn
contributes to a reduction in delay, power consumption, and transistor count (area). The
equations governing the outputs are shown below.
Figure 3. 5-2 compressor
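As a behavioural reference for the 5-2 compressor interface described above, the sketch below chains three full adders, which is the conventional arrangement. It is not the proposed multiplexer-based circuit, whose equations are not reproduced here; the function names are hypothetical and the signal names follow the text.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum, carry) of three input bits."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def compressor_5_2(x1, x2, x3, x4, x5, cin1, cin2):
    """Return (sum, carry, cout1, cout2) with
    x1 + ... + x5 + cin1 + cin2 == sum + 2 * (carry + cout1 + cout2)."""
    s1, cout1 = full_adder(x1, x2, x3)     # cout1 depends on data inputs only
    s2, cout2 = full_adder(s1, x4, cin1)
    s, carry = full_adder(s2, x5, cin2)
    return s, carry, cout1, cout2
```

All three carry outputs have weight 2, so they feed the next more significant column, while `sum` stays in the current column.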
IV. WALLACE TREE MULTIPLIER
The Wallace tree reduces the partial products to be added down to two final intermediate results.
It basically multiplies two unsigned integers. The Wallace tree is an efficient hardware
implementation of a digital circuit that multiplies two integers, devised by the
Australian computer scientist Chris Wallace in 1964. The Wallace tree has three steps:
1. Partial Product Generation Stage
2. Partial Product Reduction Stage
3. Partial Product Addition Stage
Partial Product Generation Stage:
Partial product generation is the very first step in a binary multiplier. The partial products
are the intermediate terms generated from the value of the multiplier. If the multiplier bit is
‘0’, the partial product row is also zero; if it is ‘1’, the multiplicand is copied as it is.
From the second bit onwards, each partial product row is shifted one position to the left, as
shown in the above-mentioned example. In signed multiplication, the sign bit is also extended
to the left. Partial product generators for a conventional multiplier consist of a series of
logic AND gates, as shown in Figure 3.1.
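The AND-gate generation step can be sketched in Python as below for the unsigned case. The function names `partial_products` and `sum_rows` and the default width are hypothetical and chosen only for illustration.

```python
def partial_products(multiplicand: int, multiplier: int, n: int = 8):
    """Return n rows of n bits; rows[i][j] = multiplicand bit j AND multiplier bit i."""
    rows = []
    for i in range(n):
        m_bit = (multiplier >> i) & 1
        # Row is all zeros when m_bit is 0, a copy of the multiplicand when it is 1.
        rows.append([((multiplicand >> j) & 1) & m_bit for j in range(n)])
    return rows


def sum_rows(rows) -> int:
    """Add the rows, shifting row i left by i positions."""
    total = 0
    for i, row in enumerate(rows):
        for j, bit in enumerate(row):
            total += bit << (i + j)
    return total
```

Summing the shifted rows gives the unsigned product, which is the job of the reduction and addition stages.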
The main operation in multiplying two numbers is the addition of the partial
products. Therefore, the performance and speed of the multiplier depend on the performance of
the adder that forms the core of the multiplier. To achieve higher performance, the multiplier
must be pipelined.
Partial Product Reduction Stage:
The design analysis starts with the elementary multiplication algorithm of the Wallace tree
multiplier. Figure 3.1 shows the algorithm for an 8-bit x 8-bit multiplication performed by
a Wallace tree multiplier. There are five stages to go through to complete the multiplication
process. Each stage uses half adders and full adders, denoted by a red circle for the 1-bit
half adder and a blue circle for the 1-bit full adder. Firstly, we reduce the partial products
using half adders and full adders, combined to build a carry-save adder (CSA), until just two
rows of partial products are left. Next, we add the remaining two rows using a
carry-propagate adder. For this project, a ripple-carry adder (RCA) is used to obtain the
final product of the two operands. Secondly, the schematic of the conventional 8-bit x 8-bit
high-speed
Wallace tree multiplier is designed by
referring to the algorithm.
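The reduce-then-add flow described above can be sketched at word level in Python. This is a simplified model, assuming rows are grouped three at a time into carry-save adders at each level; it does not reproduce the exact half-adder and full-adder placement of Figure 3.1, and the function names are hypothetical.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """3:2 reduction of three rows into a sum row and a shifted carry row."""
    sum_row = a ^ b ^ c
    carry_row = ((a & b) | (b & c) | (a & c)) << 1
    return sum_row, carry_row


def wallace_multiply(x: int, y: int, n: int = 8) -> tuple[int, int]:
    """Unsigned n-bit multiply; returns (product, number of CSA levels used)."""
    # Partial product generation: one shifted row per multiplier bit.
    rows = [(x << i) if (y >> i) & 1 else 0 for i in range(n)]

    levels = 0
    while len(rows) > 2:                      # partial product reduction
        grouped = len(rows) - len(rows) % 3
        next_rows = []
        for k in range(0, grouped, 3):
            next_rows.extend(carry_save_add(*rows[k:k + 3]))
        next_rows.extend(rows[grouped:])      # leftover rows pass through
        rows = next_rows
        levels += 1

    return sum(rows), levels                  # final carry-propagate addition
```

Each carry-save level preserves the total of the rows, so the final addition of the two remaining rows yields the product.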
Compressors for Partial Product Reduction:
In the proposed architecture, partial product reduction is accomplished using 4:2 and 5:2
compressor structures, and the final stage of addition is performed by a Sklansky adder. The
multiplier architecture comprises a partial product generation stage, a partial product
reduction stage, and the final addition stage. The latency of the Wallace tree multiplier
can be reduced by decreasing the number of adders in the partial product reduction stage. In
the proposed architecture, multi-bit compressors are used to reduce the number of partial
product addition stages. The combination of low power, low transistor count, and minimum
delay makes the 5:2 and 4:2 compressors the appropriate choice. In these compressors, the
outputs generated at each stage are used efficiently by replacing the XOR blocks with
multiplexer blocks.
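The text names a Sklansky adder for the final addition without describing it. As a hedged illustration, the Python sketch below models the standard Sklansky (divide-and-conquer) parallel-prefix scheme on generate/propagate signals; the word width, function name, and return values are assumptions for illustration and are not taken from the paper.

```python
def sklansky_add(a: int, b: int, n: int = 16) -> tuple[int, int]:
    """Add two n-bit unsigned numbers; returns (sum, number of prefix levels)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]   # bit propagate
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]   # bit generate
    G, P = g[:], p[:]                                   # group signals

    level = 0
    while (1 << level) < n:
        for i in range(n):
            if (i >> level) & 1:
                # Top bit of the neighbouring lower block at this level.
                j = ((i >> level) << level) - 1
                G[i] = G[i] | (P[i] & G[j])
                P[i] = P[i] & P[j]
        level += 1

    total = 0
    for i in range(n):
        carry_in = G[i - 1] if i > 0 else 0             # carry into bit i
        total |= (p[i] ^ carry_in) << i
    total |= G[n - 1] << n                              # carry out
    return total, level
```

The number of prefix levels grows with the logarithm of the word width, which is why such an adder is attractive for the final carry-propagate stage.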
V. ARCHITECTURE OF PROPOSED WALLACE TREE MULTIPLIER USING COMPRESSORS
Our proposed architecture aims to reduce the overall latency, which leads to increased
speed and reduced power consumption. The design uses compressors in place of full adders,
and the final carry-propagate stage is replaced by 4-2 compressors and 5-2 compressors.
The first stage consists of a full adder. In the second stage, two full adders are grouped
and implemented using a 4:2 compressor. Similarly, the third stage consists of a 5:2
compressor, which is a combination of 3 full adders, and so on. In this manner the
individual full adder blocks of the original structure are grouped and implemented using
compressors. Care is taken over the number of interconnections, since they play a vital role
in the flow of carries from one stage to the next in the tree. From Fig. 12, we can see that
the longest delay path of our design is the one consisting of two 5:2 compressors, which
gives a reduced latency of only 8 (four per compressor). The use of the Sklansky adder in the
structure further results in a reduced latency of 6, with a latency of 1 for
the AND array. Hence, this novel structure brings down the overall latency count to 15.
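Putting the figures quoted above together, the overall latency count can be written as a short sum. The symbols are introduced here only for illustration, and the values are those stated in the text, in the paper's own latency units:

```latex
\begin{align*}
T_{\text{total}} &= T_{\text{AND}} + 2\,T_{5:2} + T_{\text{Sklansky}} \\
                 &= 1 + 2 \times 4 + 6 \\
                 &= 15
\end{align*}
```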
VI. RESULT
A. Simulation Results
When the proposed 5:2 compressor architecture is used in the Wallace tree multiplier, the
results show a 6.90% reduction in cell area and a 5.71% improvement in speed over the
existing design. When the proposed 5:2 compressor is used in the signed Baugh-Wooley
multiplier with Wallace tree, the results show a 21.75% reduction in cell area and a 15.34%
improvement in speed over the existing full-adder-based design.
RTL Schematic:
Fig. 4 shows the resulting RTL schematic of the Baugh-Wooley multiplier, Fig. 5 shows its
technology schematic, and Fig. 6 shows the simulation result for the Baugh-Wooley
multiplier.
Device utilization summary:
VII. CONCLUSION
In this paper, a new approach to using compressors in a signed Baugh-Wooley multiplier with
a Wallace tree is presented. Performance in terms of speed and area was verified in Xilinx,
synthesized with RTL Compiler, and compared using 180 nm standard-cell technology. The
results indicate that the proposed architecture is more efficient than the conventional one
in terms of power consumption and speed.
References
[1] List I. Abdellatif, E. Mohamed, “Low-Power Digital VLSI Design, Circuits and Systems,” Kluwer Academic Publishers, 1995.
[2] H. Neil. Weste and Kamran Eshraghian, “Principles of CMOS VLSI Design - A Systems Perspective,” Pearson Edition Pvt Ltd., 3rd edition, 2005.
[3] Sreehari Veeramachaneni, Kirthi M. Krishna, Lingamneni Avinash, Sreekanth Reddy Puppala, M. B. Srinivas, “Novel Architectures for High-Speed and Low-Power 3-2, 4-2 and 5-2 Compressors,” 20th International Conference on VLSI Design, Jan 2007, pp. 324-329.
[4] K. Prasad and K. K. Parhi, “Low-power 4-2 and 5-2 compressors,” in Proc. of the 35th Asilomar Conf. on Signals, Systems and Computers, 2001, Vol. 1, pp. 129–133.
AUTHOR 1:
*T. MOUNIKA completed her B.Tech at VAAGDEVI COLLEGE OF ENGINEERING in 2013 and her M.Tech at VAAGDEVI COLLEGE OF ENGINEERING.
AUTHOR 2:
**Mr. T. SAMMAIAH is working as Assoc. Prof. in the Dept. of ECE, VAAGDEVI COLLEGE OF ENGINEERING.
AUTHOR 3:
***Mr. G. BABU is working as Assoc. Prof. in the Dept. of ECE, VAAGDEVI COLLEGE OF ENGINEERING.