Page 1
EE1411
© Digital Integrated Circuits2nd Arithmetic Circuits
Digital Integrated Circuits: A Design Perspective
Arithmetic Circuits
Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic
January, 2003
Page 2
A Generic Digital Processor
[Figure: block diagram of a processor with MEMORY, DATAPATH, CONTROL, and INPUT-OUTPUT blocks]
Page 3
Building Blocks for Digital Architectures
Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
Memory
- RAM, ROM, buffers, shift registers
Control
- Finite state machine (PLA, random logic)
- Counters
Interconnect
- Switches
- Arbiters
- Bus
Page 4
An Intel Microprocessor
[Figure: integer execution unit, scale bar 1000 um. Labels: 9-1 Mux (two), 5-1 Mux, 2-1 Mux, CARRYGEN, SUMGEN + LU (LU: Logical Unit), SUM SEL, REG, ck1, a, b, s0, s1, g64, sum, sumb, node1, to Cache]
Itanium has 6 integer execution units like this.
Page 5
Bit-Sliced Design
[Figure: bit slices Bit 0 to Bit 3; Data-In passes through Register, Adder, Shifter, and Multiplexer blocks to Data-Out, with Control lines running across the slices]
Tile identical processing elements.
Page 6
Bit-Sliced Datapath
[Figure: bit slices 0 to 63 containing adder stages 1 to 3 separated by wiring, sum select, shifter, and multiplexers; inputs from register files / cache / bypass, outputs to register files / cache, with loopback buses]
Page 7
Itanium Integer Datapath
Fetzer, Orton, ISSCC’02
Page 8
Adders
Page 9
Full-Adder
[Figure: full-adder block with inputs A, B, Cin and outputs Sum, Cout]
Page 10
The Binary Adder
[Figure: full-adder block with inputs A, B, Cin and outputs Sum, Cout]
$S = A \oplus B \oplus C_i = A\bar{B}\bar{C_i} + \bar{A}B\bar{C_i} + \bar{A}\bar{B}C_i + ABC_i$
$C_o = AB + BC_i + AC_i$
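The sum and carry expressions of the full adder can be checked exhaustively over all eight input combinations. The following is a minimal Python sketch; the names `full_adder` and `sum_minterms` are illustrative and not from the slides, and the complemented literals in the sum-of-products form are assumed from the standard full-adder truth table.

```python
from itertools import product

def full_adder(a, b, ci):
    """Return (sum, carry_out) for one-bit inputs a, b, ci."""
    s = a ^ b ^ ci                          # S = A xor B xor Ci
    co = (a & b) | (b & ci) | (a & ci)      # Co = AB + BCi + ACi
    return s, co

def sum_minterms(a, b, ci):
    """Sum written as a sum of products (assumed placement of complements)."""
    na, nb, nci = 1 - a, 1 - b, 1 - ci
    return (a & nb & nci) | (na & b & nci) | (na & nb & ci) | (a & b & ci)

def check_full_adder():
    """True if both forms agree and (Co, S) encodes A + B + Ci in binary."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        if s != sum_minterms(a, b, ci) or 2 * co + s != a + b + ci:
            return False
    return True
```

Calling `check_full_adder()` exercises every row of the truth table, which is feasible here because the cell has only three inputs.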
Page 11
Express Sum and Carry as a function of P, G, D
Define three new variables that depend ONLY on A and B:
Generate (G) = AB
Propagate (P) = A ⊕ B
Delete (D) = $\bar{A}\,\bar{B}$
Expressions for S and Co can also be derived based on D and P.
Note that we will sometimes use an alternate definition: Propagate (P) = A + B
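One way to see how S and Co follow from these intermediate signals is to evaluate them for every input. This is a sketch under assumptions: the expressions Co = G + P·Ci and S = P ⊕ Ci are the commonly used forms, the D-based carry expression is one possible formulation rather than the one on the original slide, and all function names are hypothetical.

```python
from itertools import product

def pgd(a, b):
    """Generate, propagate (XOR form), and delete signals for one bit."""
    return a & b, a ^ b, (1 - a) & (1 - b)

def sum_carry_from_gp(a, b, ci):
    """S = P xor Ci, Co = G + P*Ci (assumed standard forms)."""
    g, p, _ = pgd(a, b)
    return p ^ ci, g | (p & ci)

def carry_from_dp(a, b, ci):
    """Carry from D and P: killed if D, passed if P, generated otherwise."""
    _, p, d = pgd(a, b)
    return (1 - d) & ((1 - p) | ci)

def carry_alt_propagate(a, b, ci):
    """Carry using the alternate definition P = A + B."""
    return (a & b) | ((a | b) & ci)

def check_pgd():
    for a, b, ci in product((0, 1), repeat=3):
        s, co = sum_carry_from_gp(a, b, ci)
        total = a + b + ci
        if 2 * co + s != total:
            return False
        if carry_from_dp(a, b, ci) != co or carry_alt_propagate(a, b, ci) != co:
            return False
    return True
```

The alternate definition P = A + B yields the same carry, but the sum still needs the XOR form of P.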
Page 12
Complementary Static CMOS Full Adder
28 transistors
[Figure: transistor-level schematic with inputs A, B, Ci, internal node X, and outputs S and Co]
Page 13
A Better Structure: The Mirror Adder
24 transistors
[Figure: transistor-level schematic with inputs A, B, Ci and outputs S and Co; branches labeled Kill, Generate, "0"-Propagate, and "1"-Propagate]
Page 14
Mirror Adder: Stick Diagram
[Figure: stick diagram between VDD and GND rails with inputs A, B, Ci and outputs Co and S]
Page 15
The Mirror Adder
• The NMOS and PMOS chains are completely symmetrical. A maximum of two series transistors can be observed in the carry-generation circuitry.
• When laying out the cell, the most critical issue is minimizing the capacitance at node Co. Reducing the diffusion capacitances is particularly important.
• The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell.
• The transistors connected to Ci are placed closest to the output.
• Only the transistors in the carry stage have to be optimized for speed. All transistors in the sum stage can be minimal size.
Page 16
Transmission Gate Full Adder
[Figure: transmission-gate schematic in three sections (Setup, Sum Generation, Carry Generation) with inputs A, B, Ci, propagate signal P, and outputs S and Co]
Page 17
One-phase dynamic CMOS adder
Page 18
One-phase dynamic CMOS adder
Page 19
One-phase dynamic CMOS adder
Page 20
The Ripple-Carry Adder
[Figure: 4-bit ripple-carry adder built from four full adders (FA) with inputs A0, B0 to A3, B3, sums S0 to S3, and carry chain Ci,0, Co,0 (= Ci,1), Co,1, Co,2, Co,3]
Worst-case delay is linear in the number of bits: $t_d = O(N)$
$t_{adder} = (N-1)\,t_{carry} + t_{sum}$
Goal: make the fastest possible carry path circuit.
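The ripple structure and its linear delay can be modeled in a few lines. This is a behavioral sketch, not a circuit description: bits are stored least-significant first, and the helper names (`to_bits`, `from_bits`, `ripple_carry_add`, `ripple_delay`) are assumptions made for illustration.

```python
def full_adder(a, b, ci):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ ci, (a & b) | (b & ci) | (a & ci)

def to_bits(value, n):
    """Split an integer into n bits, least-significant bit first."""
    return [(value >> i) & 1 for i in range(n)]

def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_carry_add(a_bits, b_bits, ci=0):
    """Chain full adders so that each carry-out feeds the next carry-in."""
    sums, carry = [], ci
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry

def ripple_delay(n, t_carry, t_sum):
    """Worst-case delay: t_adder = (N - 1) * t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum
```

For example, `ripple_carry_add(to_bits(9, 4), to_bits(8, 4))` produces the 4-bit sum bits of 17 modulo 16 together with a carry-out of 1.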
Page 21
Inversion Property
[Figure: two full-adder (FA) blocks, each with inputs A, B, Ci and outputs S, Co]
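The inversion property of the full adder states that complementing all three inputs complements both outputs. The slide text itself does not spell this out, so the statement here is taken as the standard form of the property; the sketch below checks it exhaustively, with function names chosen for illustration.

```python
from itertools import product

def full_adder(a, b, ci):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ ci, (a & b) | (b & ci) | (a & ci)

def inversion_property_holds():
    """Check that FA(not A, not B, not Ci) == (not S, not Co) for all inputs."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        s_inv, co_inv = full_adder(1 - a, 1 - b, 1 - ci)
        if s_inv != 1 - s or co_inv != 1 - co:
            return False
    return True
```

This is why a carry stage that produces an inverted carry can drive the next cell directly, provided that cell also receives inverted operands and its outputs are interpreted as inverted.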
Page 22
Minimize Critical Path by Reducing Inverting Stages
Exploit the inversion property.
[Figure: 4-bit ripple-carry adder with alternating even and odd FA cells; inputs A0, B0 to A3, B3, sums S0 to S3, carries Ci,0, Co,0, Co,1, Co,2, Co,3]
Page 23
Carry Look-Ahead Adders
Page 24
Carry-Lookahead Adders
Page 25
Carry-Lookahead Adders
Page 26
Look-Ahead: Topology
Expanding lookahead equations:
$C_{o,k} = G_k + P_k\,(G_{k-1} + P_{k-1}\,C_{o,k-2})$
All the way:
$C_{o,k} = G_k + P_k\,(G_{k-1} + P_{k-1}\,(\dots + P_1\,(G_0 + P_0\,C_{i,0})))$
[Figure: single-gate implementation of Co,3 between VDD and ground with inputs G0 to G3, P0 to P3, and Ci,0]
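The fully expanded lookahead equation can be compared against the bit-by-bit recursion to confirm that both give the same carry. A minimal sketch, assuming `g` and `p` are lists of per-bit generate and propagate values indexed from bit 0; the function names are hypothetical.

```python
from itertools import product

def carry_recursive(g, p, ci0):
    """Co,k from the recursion Co,k = Gk + Pk * Co,k-1."""
    carry = ci0
    for gk, pk in zip(g, p):
        carry = gk | (pk & carry)
    return carry

def carry_lookahead(g, p, ci0):
    """Co,k from the expanded form:
    Gk + Pk G(k-1) + Pk P(k-1) G(k-2) + ... + Pk ... P0 Ci,0."""
    result = 0
    prefix = 1                      # running product Pk * P(k-1) * ...
    for j in range(len(g) - 1, -1, -1):
        result |= prefix & g[j]
        prefix &= p[j]
    return result | (prefix & ci0)

def check_lookahead(n=4):
    for bits in product((0, 1), repeat=2 * n + 1):
        g, p, ci0 = list(bits[:n]), list(bits[n:2 * n]), bits[-1]
        if carry_recursive(g, p, ci0) != carry_lookahead(g, p, ci0):
            return False
    return True
```

The expanded form has one product term per bit position plus one for the incoming carry, which is why the gate fan-in grows with the lookahead width.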
Page 27
Manchester Carry Chain
[Figure: two carry-chain cells between Ci and Co: one controlled by Pi, Gi, and Di, and one controlled by Pi and Gi with clock φ]
Page 28
Manchester Carry Chain
[Figure: 4-bit Manchester carry chain clocked by φ, with input Ci,0, propagate signals P0 to P3, generate signals G0 to G3, and carry nodes C0 to C3]
Page 29
Manchester Carry Chain: Stick Diagram
[Figure: stick diagram between VDD and GND with a Propagate/Generate row and an Inverter/Sum row; signals Pi, Gi, φ, Pi+1, Gi+1 and carries Ci-1, Ci, Ci+1]
Page 30
Carry-Bypass Adder
[Figure: two versions of a 4-bit adder block built from four FA cells with inputs P0, G0 to P3, G3 and carries Ci,0, Co,0, Co,1, Co,2, Co,3; in the second version a multiplexer controlled by BP = P0 P1 P2 P3 produces Co,3]
Idea: if (P0 and P1 and P2 and P3) = 1 then Co,3 = Ci,0, else "kill" or "generate".
Also called carry-skip.
Page 31
Carry-Bypass Adder (cont.)
[Figure: 16-bit adder split into four M-bit blocks (bits 0-3, 4-7, 8-11, 12-15), each with Setup, Carry propagation, and Sum stages; annotated delays tsetup, tbypass, tsum]
$t_{adder} = t_{setup} + M\,t_{carry} + (N/M - 1)\,t_{bypass} + (M - 1)\,t_{carry} + t_{sum}$
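The bypass delay expression is easy to evaluate for different block sizes. The sketch below simply codes the formula; the unit delays used in any call are placeholder assumptions, not measured values, and the function name is illustrative.

```python
def bypass_adder_delay(n, m, t_setup, t_carry, t_bypass, t_sum):
    """t_adder = t_setup + M t_carry + (N/M - 1) t_bypass + (M - 1) t_carry + t_sum.

    n is the total word length N, m is the block size M (must divide n).
    """
    if n % m != 0:
        raise ValueError("block size M must divide word length N")
    return (t_setup + m * t_carry + (n // m - 1) * t_bypass
            + (m - 1) * t_carry + t_sum)
```

With all component delays set to 1, `bypass_adder_delay(16, 4, 1, 1, 1, 1)` evaluates to 12. The term in N grows with slope t_bypass / M rather than t_carry, which is where the advantage over a ripple adder comes from for longer words.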
Page 32
Carry Ripple versus Carry Bypass
[Figure: propagation delay tp versus number of bits N for the ripple adder and the bypass adder, with N = 4..8 marked on the N axis]
Page 33
Carry-Select Adder
[Figure: one adder block with Setup producing P, G; "0" Carry Propagation and "1" Carry Propagation paths; a Multiplexer driven by Co,k-1 that selects the carry vector and produces Co,k+3; and Sum Generation]
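The carry-select idea, computing each block for both possible incoming carries and then choosing one with a multiplexer, can be modeled behaviorally. This is a sketch under assumptions: blocks are added with a simple ripple loop, bits are least-significant first, and every name below is hypothetical.

```python
def add_block(a_bits, b_bits, cin):
    """Ripple-add one block; returns (sum_bits, carry_out)."""
    sums, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ carry)
        carry = (a & b) | (b & carry) | (a & carry)
    return sums, carry

def carry_select_add(a_bits, b_bits, cin=0, block=4):
    """Precompute each block for carry-in 0 and 1, then select."""
    out, carry = [], cin
    for i in range(0, len(a_bits), block):
        a_blk, b_blk = a_bits[i:i + block], b_bits[i:i + block]
        result0 = add_block(a_blk, b_blk, 0)   # "0" carry path
        result1 = add_block(a_blk, b_blk, 1)   # "1" carry path
        sums, carry = result1 if carry else result0  # multiplexer
        out.extend(sums)
    return out, carry

def to_bits(value, n):
    return [(value >> i) & 1 for i in range(n)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))
```

In hardware the two paths of every block run in parallel, so only the multiplexer selection ripples from block to block; the loop above reproduces the result but not the timing.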
Page 34
Carry Select Adder: Critical Path
[Figure: 16-bit carry-select adder in four blocks (bits 0-3, 4-7, 8-11, 12-15), each with Setup, 0-Carry, 1-Carry, Multiplexer, and Sum Generation stages; carries Ci,0, Co,3, Co,7, Co,11, Co,15; sums S0-3, S4-7, S8-11, S12-15]
Page 35
Linear Carry Select
[Figure: 16-bit linear carry-select adder with four 4-bit blocks (bits 0-3, 4-7, 8-11, 12-15), each with Setup, "0" Carry, "1" Carry, Multiplexer, and Sum Generation stages; input Ci,0, sums S0-3, S4-7, S8-11, S12-15; annotated arrival times from (1) to (10)]
Page 36
Square Root Carry Select
[Figure: 20-bit square-root carry-select adder with blocks of increasing size (bits 0-1, 2-4, 5-8, 9-13, 14-19), each with Setup, "0" Carry, "1" Carry, Multiplexer, and Sum Generation stages; input Ci,0, sums S0-1, S2-4, S5-8, S9-13, S14-19; annotated arrival times from (1) to (9)]
Page 37
Adder Delays - Comparison
[Figure: propagation delay tp (in unit delays, 0 to 50) versus number of bits N (0 to 60) for the ripple adder, linear select, and square root select]
Page 38
"O" Operator
Definition
Page 39
Properties of the "O" operator
Page 40
Properties of the "O" operator
Page 41
Group Generate and Propagate
Page 42
Group Generate and Propagate
Page 43
Group Generate and Propagate
Page 44
Look-Ahead - Basic Idea
$C_{o,k} = f(A_k, B_k, C_{o,k-1}) = G_k + P_k\,C_{o,k-1}$
[Figure: N-bit adder with inputs A0, B0 to AN-1, BN-1, propagate signals P0 to PN-1, carries Ci,0, Ci,1, ..., Ci,N-1, and sums S0 to SN-1]
Page 45
Logarithmic Look-Ahead Adder
[Figure: inputs A0 to A7 combined into output F by a linear chain, tp ~ N, and by a tree, tp ~ log2(N)]
Page 46
Brent-Kung BLC adder
Page 47
Folding of the inverse tree
Page 48
Folding the inverse tree
Page 49
Dense tree with minimum fan-out
Page 50
Dense tree with simple connections
Page 51
Carry Lookahead Trees
$C_{o,0} = G_0 + P_0\,C_{i,0}$
$C_{o,1} = G_1 + P_1 G_0 + P_1 P_0\,C_{i,0}$
$C_{o,2} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0\,C_{i,0}$
$\quad = (G_2 + P_2 G_1) + (P_2 P_1)(G_0 + P_0\,C_{i,0}) = G_{2:1} + P_{2:1}\,C_{o,0}$
Can continue building the tree hierarchically.
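The hierarchical step in the last equation amounts to merging the (G, P) pairs of two adjacent groups. The sketch below checks the factored form of Co,2 against the flat sum of products for every input; the helper name `combine` is an assumption chosen for illustration.

```python
from itertools import product

def combine(upper, lower):
    """Merge (G, P) of an upper group with (G, P) of the adjacent lower group:
    G = G_upper + P_upper * G_lower, P = P_upper * P_lower."""
    g_u, p_u = upper
    g_l, p_l = lower
    return g_u | (p_u & g_l), p_u & p_l

def carry_out_2_tree(g, p, ci0):
    """Co,2 = G(2:1) + P(2:1) * Co,0."""
    co0 = g[0] | (p[0] & ci0)
    g21, p21 = combine((g[2], p[2]), (g[1], p[1]))
    return g21 | (p21 & co0)

def carry_out_2_flat(g, p, ci0):
    """Co,2 = G2 + P2 G1 + P2 P1 G0 + P2 P1 P0 Ci,0."""
    return (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
            | (p[2] & p[1] & p[0] & ci0))

def check_tree_form():
    for bits in product((0, 1), repeat=7):
        g, p, ci0 = bits[:3], bits[3:6], bits[6]
        if carry_out_2_tree(g, p, ci0) != carry_out_2_flat(g, p, ci0):
            return False
    return True
```

Because `combine` is associative, groups can be merged in any bracketing, which is what allows the tree to be built with logarithmic depth.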
Page 52
Tree Adders
16-bit radix-2 Kogge-Stone tree
[Figure: prefix tree with inputs (A0, B0) to (A15, B15) and outputs S0 to S15]
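A radix-2 Kogge-Stone network can be modeled as repeated merging of (G, P) pairs at distances 1, 2, 4, and 8. The following is a behavioral sketch operating on Python integers, not a description of the circuit in the figure; `kogge_stone_add` and its parameters are hypothetical names.

```python
def kogge_stone_add(a, b, n=16, cin=0):
    """Add two n-bit integers with a radix-2 Kogge-Stone prefix network.

    Returns (sum modulo 2**n, carry_out).
    """
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    g = [x & y for x, y in zip(a_bits, b_bits)]   # bit generate
    p = [x ^ y for x, y in zip(a_bits, b_bits)]   # bit propagate (XOR form)

    G, P = list(g), list(p)
    dist = 1
    while dist < n:                               # log2(n) prefix levels
        new_G, new_P = list(G), list(P)
        for i in range(dist, n):
            new_G[i] = G[i] | (P[i] & G[i - dist])
            new_P[i] = P[i] & P[i - dist]
        G, P = new_G, new_P
        dist *= 2

    # G[i], P[i] now cover bits i..0, so the carry out of bit i is
    # G[i] + P[i] * cin; the carry into bit 0 is cin itself.
    carry_in = [cin] + [G[i] | (P[i] & cin) for i in range(n - 1)]
    total = sum((p[i] ^ carry_in[i]) << i for i in range(n))
    carry_out = G[n - 1] | (P[n - 1] & cin)
    return total, carry_out
```

For 16 bits the loop runs four times, matching the logarithmic depth that motivates tree adders; the price is the large number of merge cells and wires at each level.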
Page 53
Tree Adders
16-bit radix-4 Kogge-Stone tree
[Figure: prefix tree with inputs (a0, b0) to (a15, b15) and outputs S0 to S15]
Page 54
Sparse Trees
16-bit radix-2 sparse tree with sparseness of 2
[Figure: prefix tree with inputs (a0, b0) to (a15, b15) and outputs S0 to S15]
Page 55
Tree Adders
Brent-Kung tree
[Figure: prefix tree with inputs (A0, B0) to (A15, B15) and outputs S0 to S15]
Page 56
Example: Domino Adder
[Figure: two domino gates clocked by Clk: a Propagate gate computing $P_i = a_i + b_i$ and a Generate gate computing $G_i = a_i b_i$]
Page 57
Example: Domino Adder
[Figure: two domino gates clocked by Clkk. Propagate gate: inputs $P_{i:i-k+1}$ and $P_{i-k:i-2k+1}$, output $P_{i:i-2k+1}$. Generate gate: inputs $G_{i:i-k+1}$, $P_{i:i-k+1}$, and $G_{i-k:i-2k+1}$, output $G_{i:i-2k+1}$]
Page 58
Example: Domino Sum
[Figure: domino sum circuit clocked by Clk and Clkd, with input Gi:0, nodes Si0 and Si1, a keeper, and output Sum]
Page 59
Adders - Summary
Page 60
Multipliers
Page 61
The Binary Multiplication
Page 62
The Binary Multiplication
Page 63
The Binary Multiplication
          1 0 1 0 1 0    Multiplicand
x             1 0 1 1    Multiplier
          1 0 1 0 1 0
        1 0 1 0 1 0      Partial products
      0 0 0 0 0 0
+   1 0 1 0 1 0
    1 1 1 0 0 1 1 1 0    Result
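The worked example (multiplicand 101010, multiplier 1011) can be reproduced by generating one shifted partial product per multiplier bit and summing them. A minimal sketch; the function name `shift_add_multiply` is an assumption.

```python
def shift_add_multiply(multiplicand, multiplier):
    """Return (partial_products, product) for unsigned integers.

    Partial product i is the multiplicand shifted left by i if bit i of
    the multiplier is 1, and zero otherwise.
    """
    partials = []
    shift = 0
    while multiplier >> shift:
        bit = (multiplier >> shift) & 1
        partials.append((multiplicand if bit else 0) << shift)
        shift += 1
    return partials, sum(partials)

partials, result = shift_add_multiply(0b101010, 0b1011)
print([format(pp, "b") for pp in partials])
print(format(result, "b"))   # 111001110
```

The third partial product is zero because the corresponding multiplier bit is 0, which matches the row of zeros in the example.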
Page 64
The Array Multiplier
[Figure: 4x4 array multiplier with inputs X0 to X3 and Y0 to Y3, rows of HA and FA cells, and product outputs Z0 to Z7]
Page 65
The MxN Array Multiplier - Critical Path
[Figure: array of HA and FA cells with two highlighted paths, Critical Path 1 and Critical Path 2, and the portion shared by both]
Page 66
Carry-Save Multiplier
[Figure: rows of HA and FA cells followed by a vector-merging adder]
Page 67
Multiplier Floorplan
[Figure: floorplan with inputs X0 to X3 and Y0 to Y3 and outputs Z0 to Z7; cells carry S and C signals; legend: HA multiplier cell, FA multiplier cell, vector-merging cell]
X and Y signals are broadcast through the complete array.
Page 68
Wallace-Tree Multiplier
[Figure: dot diagrams over bit positions 6 to 0 showing (a) partial products, (b) first stage, (c) second stage, (d) final adder, with FA and HA groupings marked]
Page 69
Wallace-Tree Multiplier
[Figure: 4x4 Wallace tree reducing the partial products x0y0 to x3y3 through a first stage, a second stage, and a final adder built from FA and HA cells, producing z0 to z7]
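The reduction in a Wallace tree can be approximated at word level with 3:2 carry-save compression: three operands are compressed to two without propagating carries, and only the last two operands go through a carry-propagate adder. This sketch is a simplification, since the tree in the figure works column by column and also uses half adders; the names `carry_save_add`, `reduce_operands`, and `tree_multiply` are hypothetical.

```python
def carry_save_add(x, y, z):
    """3:2 compressor on whole words: x + y + z == sum_word + carry_word."""
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (y & z) | (x & z)) << 1
    return sum_word, carry_word

def reduce_operands(operands):
    """Compress operands in groups of three until at most two remain.

    Returns (remaining_operands, number_of_stages).
    """
    ops, stages = list(operands), 0
    while len(ops) > 2:
        nxt = []
        for i in range(0, len(ops) - 2, 3):
            nxt.extend(carry_save_add(*ops[i:i + 3]))
        nxt.extend(ops[len(ops) - len(ops) % 3:])   # leftovers pass through
        ops, stages = nxt, stages + 1
    return ops, stages

def tree_multiply(x, y, n=4):
    """Unsigned n-bit multiply: partial products, tree reduction, final add."""
    partials = [(x if (y >> i) & 1 else 0) << i for i in range(n)]
    ops, stages = reduce_operands(partials)
    return sum(ops), stages
```

For four partial products the reduction takes two stages before the final adder, consistent with the first-stage and second-stage labels in the figure.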
Page 70
Wallace-Tree Multiplier
[Figure: two arrangements of four full adders (FA) reducing inputs y0 to y5, with carry-ins Ci-1 and carry-outs Ci, to outputs C and S]
Page 71
Wallace Tree Mult. Performance
Page 72
Wallace Tree Multiplier Complexity
Page 73
4:2 Adder
Page 74
Eight-input Tree
Page 75
Architectural comparison of multiplier solutions
Page 76
SPIM Architecture
Page 77
SPIM Pipe Timing
Page 78
SPIM Microphotograph
Page 79
SPIM clock generator circuit
Page 80
Binary Tree Multiplier Performance
Page 81
Binary Tree Multiplier Complexity
Page 82
Multipliers - Summary
• Optimization goals differ from those of the binary adder
• Once again: identify the critical path
• Other possible techniques
- Logarithmic versus linear (Wallace tree multiplier)
- Data encoding (Booth)
- Pipelining
FIRST GLIMPSE AT SYSTEM LEVEL OPTIMIZATION
Page 83
Multipliers - Summary
Page 84
Booth encoding
Page 85
Booth encoding
Page 86
Tree Multiplier with Booth Encoding
Page 87
The f.p. addition algorithm
- Exponent comparison and swap (if needed)
- Alignment of the mantissas
- Addition
- Normalization of the result
- Rounding of the result
Page 88
The f.p. multiplication algorithm
- Multiplication of the mantissas
- Addition of the exponents
- Mantissa normalization and exponent adjustment (if needed)
- Rounding of the result
Page 89
Dividers
Page 90
Iterative Division (Newton-Raphson)
Page 91
Iterative Division (Newton-Raphson)
Page 92
Quadratic Convergence of the Newton Method
Page 93
Properties of the Newton Method
- Asymptotically quadratic convergence
- Correction of round-off errors
- The final multiplication by a generates a round-off problem incompatible with the IEEE-754 standard
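The quadratic convergence can be observed numerically. The sketch below uses the standard Newton-Raphson reciprocal iteration x(n+1) = x(n)·(2 − b·x(n)), which is assumed here because the slide's own formula did not survive extraction; it also assumes the divisor is normalized to [0.5, 1) and starts from x0 = 1. Function names are hypothetical, and ordinary Python floats stand in for hardware arithmetic.

```python
def newton_reciprocal(b, iterations=6):
    """Return the list of iterates x0, x1, ... approximating 1/b.

    Assumes 0.5 <= b < 1 so that the starting guess x0 = 1 converges.
    """
    if not 0.5 <= b < 1.0:
        raise ValueError("divisor must be normalized to [0.5, 1)")
    x = 1.0
    iterates = [x]
    for _ in range(iterations):
        x = x * (2.0 - b * x)        # Newton step for f(x) = 1/x - b
        iterates.append(x)
    return iterates

def newton_divide(a, b, iterations=6):
    """a / b computed as a * (1/b); the last multiply adds its own rounding."""
    return a * newton_reciprocal(b, iterations)[-1]

def relative_errors(b, iterations=4):
    """|1 - b * x_n| for each iterate; each value is the square of the last."""
    return [abs(1.0 - b * x) for x in newton_reciprocal(b, iterations)]
```

For b = 0.75 the relative error goes 0.25, 0.0625, 0.00390625, squaring at every step, so the number of correct bits roughly doubles per iteration.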
Page 94
Iterative Division (Goldschmidt)
Page 95
Iterative Division (Goldschmidt)
Page 96
Iterative Division (Goldschmidt)
- The sequence x_n tends to a/b
- The convergence is quadratic
- In its present form, this method is affected by round-off errors
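A basic Goldschmidt iteration multiplies numerator and denominator by the same factor until the denominator reaches 1, at which point the numerator is the quotient. The sketch below assumes the common formulation x0 = a, y0 = b, r = 2 − y, with b normalized to [0.5, 1); the slide's own equations are not available in the text, so this is an illustration rather than the author's exact scheme, and the function name is hypothetical.

```python
def goldschmidt_divide(a, b, iterations=6):
    """Approximate a / b by driving the denominator toward 1.

    Assumes 0.5 <= b < 1. Returns the list of numerator iterates x_n.
    """
    if not 0.5 <= b < 1.0:
        raise ValueError("divisor must be normalized to [0.5, 1)")
    x, y = a, b
    iterates = [x]
    for _ in range(iterations):
        r = 2.0 - y          # correction factor
        x = x * r            # numerator tends to a / b
        y = y * r            # denominator tends to 1
        iterates.append(x)
    return iterates
```

The two multiplications in each step are independent of each other, which suits a pipelined multiplier. Rounding errors made in x are not corrected by later steps, which is the round-off sensitivity noted in the last bullet.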
Page 97
Modified Goldschmidt Algorithm (correction of round-off errors)
Page 98
The Link between the Newton-Raphson and Goldschmidt Methods
Page 99
The Link between the Newton-Raphson and Goldschmidt Methods
Page 100
Comparison between Newton and Goldschmidt methods
- The Newton and Goldschmidt methods are essentially equivalent.
- Both methods exhibit asymptotically quadratic convergence.
- Both methods are able to correct round-off errors.
- The Goldschmidt method directly computes the a/b ratio.
Page 101
Shifters
Page 102
The Binary Shifter
[Figure: bit-slice i of a one-bit shifter with control signals Right, nop, and Left connecting inputs Ai, Ai-1 to outputs Bi, Bi-1]
Page 103
The Barrel Shifter
[Figure: 4-bit barrel shifter with data wires from A0-A3 to B0-B3 and control wires Sh0 to Sh3]
Area dominated by wiring.
Page 104
4x4 barrel shifter
[Figure: layout with inputs A0 to A3, shift controls Sh0 to Sh3, and an output buffer]
$\mathrm{Width}_{barrel} \approx 2\,p_m\,M$
Page 105
Logarithmic Shifter
[Figure: shift stages controlled by Sh1, Sh2, and Sh4 (each with its complement), with inputs A0 to A3 and outputs B0 to B3]
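A logarithmic shifter builds any shift amount from stages that shift by 1, 2, and 4 positions, each enabled by one bit of the shift amount. The following behavioral sketch assumes a logical right shift on an 8-bit word; the direction, the word width, and the names are illustrative assumptions, not taken from the figure.

```python
def logarithmic_shift_right(value, amount, width=8):
    """Shift right by 0..7 using three stages of size 1, 2, and 4.

    Stage k is active when the corresponding bit of `amount` is set,
    mirroring control signals Sh1, Sh2, and Sh4.
    """
    if not 0 <= amount <= 7:
        raise ValueError("amount must be in 0..7")
    value &= (1 << width) - 1          # keep only `width` bits
    for stage in (1, 2, 4):
        if amount & stage:             # Sh<stage> asserted
            value >>= stage
    return value
```

A shift by 5, for example, passes through the shift-by-1 and shift-by-4 stages and bypasses the shift-by-2 stage. The number of stages grows with log2 of the maximum shift, whereas a barrel shifter needs one control wire per shift amount.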
Page 106
0-7 bit Logarithmic Shifter
[Figure: layout with inputs A0 to A3 and outputs Out0 to Out3]
Page 107
ALUs
Page 108
Two-bit MUX
Page 109
Two-bit MUX Truth Table
Page 110
Two-bit Selector Truth Table
Page 111
Carry-chain Truth Table
Page 112
ALU block diagram (Mead-Conway)
Page 113
ALU Operations
Page 114
ALU Operations
Page 115
ALU Operations
Page 116
ALU Operations
Page 117
ALU Operations
Page 118
MIPS-X Instruction Format
Page 119
Pipeline dependencies in MIPS-X
Page 120
Die Photo of MIPS-X
Page 121
MIPS-X Architecture
Page 122
MIPS-X Instruction Cache-miss timing
Page 123
MIPS-X Tag Memory
Page 124
MIPS-X Valid Store Array
Page 125
RAM Sense Amplifier
Page 126
CMOS Dual-port Register Cell
Page 127
Self-timed bit-line write circuit
Page 128
Register bypass logic
Page 129
Schematic of comparator circuit
Page 130
Squash FSM
Page 131
Cache-miss FSM