Digital Integrated Circuits
Arithmetic Circuits
EE141 © Digital Integrated Circuits 2nd, Arithmetic Circuits
A Generic Digital Processor
[Figure: block diagram with MEMORY, CONTROL, DATAPATH, and INPUT-OUTPUT blocks]
Building Blocks for Digital Architectures
Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
Memory
- RAM, ROM, Buffers, Shift registers
Control
- Finite state machine (PLA, random logic)
- Counters
Interconnect
- Switches
- Arbiters
- Bus
An Intel Microprocessor
[Figure: integer execution unit with 9-1, 5-1 and 2-1 multiplexers, CARRYGEN, SUMGEN + LU (LU: Logical Unit), SUMSEL, and REG blocks, with output to cache; scale bar 1000 um]
Itanium has 6 integer execution units like this
Bit-Sliced Design
[Figure: bit slices Bit 0 to Bit 3, each passing from Data-In through Register, Adder, Shifter, and Multiplexer to Data-Out, with shared Control lines]
Tile identical processing elements
Bit-Sliced Datapath
[Figure: datapath floorplan running from "From register files / Cache / Bypass" at the top to "To register files / Cache" at the bottom; rows for Shifter, Multiplexers, Adder stages 1 to 3 separated by Wiring and Loopback Bus tracks, and Sum Select; bit slices 0 to 63 tiled side by side]
Itanium Integer Datapath
[Figure: Itanium integer datapath. Fetzer, Orton, ISSCC'02]
Adders
Full-Adder
[Figure: full adder block with inputs A, B, Cin and outputs Sum, Cout]
The Binary Adder
[Figure: full adder block with inputs A, B, Cin and outputs Sum, Cout]
$S = A \oplus B \oplus C_i = A\bar{B}\bar{C_i} + \bar{A}B\bar{C_i} + \bar{A}\bar{B}C_i + ABC_i$
$C_o = AB + BC_i + AC_i$
Express Sum and Carry as a function of P, G, D
Define 3 new variables which ONLY depend on A, B:
Generate (G) = $AB$
Propagate (P) = $A \oplus B$
Delete (D) = $\bar{A}\bar{B}$
Can also derive expressions for S and Co based on D and P.
Note that we will sometimes be using an alternate definition for Propagate (P) = $A + B$
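The definitions above can be checked with a small sketch. It assumes the usual forms S = P ⊕ Ci and Co = G + P·Ci, which the slide text does not spell out; the function name `full_adder_pgd` is illustrative only.

```python
def full_adder_pgd(a, b, ci):
    """One-bit full adder written in terms of generate, propagate and delete."""
    g = a & b              # generate: carry out is 1 whatever ci is
    p = a ^ b              # propagate: carry out follows ci
    d = (1 - a) & (1 - b)  # delete: carry out is 0 whatever ci is
    s = p ^ ci             # assumed form: S = P xor Ci
    co = g | (p & ci)      # assumed form: Co = G + P*Ci
    return s, co, (g, p, d)


def carry_with_alternate_p(a, b, ci):
    """Carry computed with the alternate definition P = A + B."""
    return (a & b) | ((a | b) & ci)
```

For every input combination exactly one of G, P, D is 1. The alternate definition P = A + B yields the same carry, but it cannot replace A ⊕ B in the sum expression.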
The Ripple-Carry Adder
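A minimal behavioural sketch of a ripple-carry adder, built by chaining the full-adder equations for S and Co. Bit lists are LSB first; the helper names `to_bits` and `from_bits` are illustrative assumptions, not part of the slides.

```python
def full_adder(a, b, ci):
    """S = A xor B xor Ci, Co = AB + BCi + ACi."""
    s = a ^ b ^ ci
    co = (a & b) | (b & ci) | (a & ci)
    return s, co


def ripple_carry_add(a_bits, b_bits, ci=0):
    """Add two equal-length bit lists (LSB first); returns (sum_bits, carry_out)."""
    sum_bits = []
    carry = ci
    for a, b in zip(a_bits, b_bits):
        # The carry of each stage feeds the next one, so it ripples through all bits.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(x, n):
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))
```

Because each stage waits for the carry of the previous one, the worst-case path in this structure passes through every full adder.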
Complementary Static CMOS Full Adder
Inversion Property
[Figure: two full-adder (FA) blocks with inputs A, B, Ci and outputs S, Co, illustrating the inversion property]
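The inversion property is usually stated as: inverting all inputs of a full adder inverts both outputs. The sketch below checks that statement exhaustively; it assumes the standard full-adder equations, and the function names are illustrative.

```python
from itertools import product


def full_adder(a, b, ci):
    s = a ^ b ^ ci
    co = (a & b) | (b & ci) | (a & ci)
    return s, co


def inversion_property_holds():
    """True if FA(not A, not B, not Ci) equals (not S, not Co) for all inputs."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        s_inv, co_inv = full_adder(1 - a, 1 - b, 1 - ci)
        if (s_inv, co_inv) != (1 - s, 1 - co):
            return False
    return True
```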
Minimize Critical Path by Reducing Inverting Stages
A Better Structure: The Mirror Adder
[Figure: mirror adder schematic with inputs A, B, Ci and outputs Co, S; carry branches labelled Kill, Generate, "0"-Propagate and "1"-Propagate]
24 transistors
Mirror Adder
Stick Diagram
[Figure: stick diagram between VDD and GND rails, with gate columns for A, B, Ci and outputs Co, S]
The Mirror Adder
• The NMOS and PMOS chains are completely symmetrical. A maximum of two series transistors can be observed in the carry-generation circuitry.
• When laying out the cell, the most critical issue is the minimization of the capacitance at node Co. The reduction of the diffusion capacitances is particularly important.
• The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell.
• The transistors connected to Ci are placed closest to the output.
• Only the transistors in the carry stage have to be optimized for optimal speed. All transistors in the sum stage can be minimal size.
Transmission Gate Full Adder
[Figure: transmission-gate full adder with inputs A, B, Ci, internal propagate signal P, and three sections labelled Setup, Sum Generation, and Carry Generation, producing S and Co]
Manchester Carry Chain
[Figure: two Manchester carry gate implementations between Ci and Co: a static version using Pi, Gi, Di, and a dynamic version using Pi, Gi with clock φ]
Manchester Carry Chain
[Figure: four-bit dynamic Manchester carry chain clocked by φ, with input Ci,0, pass transistors P0..P3, pull-downs G0..G3, and carry nodes C0..C3]
Manchester Carry Chain
Stick Diagram
[Figure: stick diagram with a Propagate/Generate Row (Pi, Gi, φ, Pi+1, Gi+1, φ), VDD and GND rails, carry nodes Ci-1, Ci, Ci+1, and an Inverter/Sum Row]
Carry-Bypass Adder
Also called Carry-Skip
[Figure: four full adders (FA) with propagate/generate inputs rippling Ci,0 through Co,0, Co,1, Co,2 to Co,3, shown without and with an output multiplexer controlled by BP = P0P1P2P3]
Idea: If (P0 and P1 and P2 and P3 = 1) then Co,3 = Ci,0, else "kill" or "generate".
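The bypass idea can be sketched behaviourally. This is a functional model only, assuming 4-bit blocks and P = A ⊕ B; the names `bypass_block` and `carry_bypass_add` are illustrative. The multiplexer never changes the result, it only shortens the path the carry takes when every bit in the block propagates.

```python
def bypass_block(a_bits, b_bits, ci):
    """One carry-bypass block (LSB first); returns (sum_bits, carry_out, bypass_taken)."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    g = [a & b for a, b in zip(a_bits, b_bits)]
    carry = ci
    sum_bits = []
    for pi, gi in zip(p, g):
        sum_bits.append(pi ^ carry)
        carry = gi | (pi & carry)   # ripple path inside the block
    bp = all(p)                      # BP = P0 P1 P2 P3
    co = ci if bp else carry         # output multiplexer selects the bypass path
    return sum_bits, co, bp


def carry_bypass_add(a, b, n=16, m=4, ci=0):
    """Add two n-bit integers using n/m bypass blocks (n must be a multiple of m)."""
    total, carry = 0, ci
    for lo in range(0, n, m):
        a_bits = [(a >> (lo + i)) & 1 for i in range(m)]
        b_bits = [(b >> (lo + i)) & 1 for i in range(m)]
        s_bits, carry, _ = bypass_block(a_bits, b_bits, carry)
        total |= sum(s << (lo + i) for i, s in enumerate(s_bits))
    return total, carry
```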
Carry-Bypass Adder (cont.)
Carry Ripple versus Carry Bypass
[Figure: propagation delay tp versus number of bits N for the ripple adder and the bypass adder; the curves cross at N of about 4..8]
Carry-Select Adder
[Figure: one carry-select block: Setup produces P, G; a "0" Carry Propagation chain and a "1" Carry Propagation chain run in parallel; a Multiplexer driven by Co,k-1 selects the Carry Vector and produces Co,k+3; Sum Generation follows]
Carry Select Adder: Critical Path
Linear Carry Select
[Figure: 16-bit linear carry-select adder with four blocks (Bit 0-3, Bit 4-7, Bit 8-11, Bit 12-15), each with Setup, "0" Carry, "1" Carry, Multiplexer, and Sum Generation, producing S0-3, S4-7, S8-11, S12-15. Annotated arrival times: (1) after setup, (5) at the end of every carry chain, (6) (7) (8) (9) at the successive multiplexer outputs, (10) at the final sum]
Square Root Carry Select
[Figure: 20-bit square-root carry-select adder with blocks of increasing size (Bit 0-1, Bit 2-4, Bit 5-8, Bit 9-13, Bit 14-19), each with Setup, "0" Carry, "1" Carry, Multiplexer, and Sum Generation, producing S0-1, S2-4, S5-8, S9-13, S14-19. Annotated arrival times: (1) after setup, (3) (4) (5) (6) (7) at the ends of the carry chains, (4) (5) (6) (7) (8) at the multiplexer outputs, (9) at the final sum]
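The two carry-select organisations can be compared with a small sketch. It assumes unit delays for setup, each carry stage, each multiplexer and the sum stage, which is how the parenthesised arrival times in the linear and square-root figures appear to be counted. The function names are illustrative, and the functional model uses integer addition inside each block for brevity.

```python
def carry_select_add(a, b, block_sizes, ci=0):
    """Functional model: each block precomputes both results, a mux picks one."""
    result, carry, lo = 0, ci, 0
    for m in block_sizes:
        mask = (1 << m) - 1
        xa, xb = (a >> lo) & mask, (b >> lo) & mask
        t0 = xa + xb          # "0" carry propagation
        t1 = xa + xb + 1      # "1" carry propagation
        chosen = t1 if carry else t0   # multiplexer driven by the incoming carry
        result |= (chosen & mask) << lo
        carry = chosen >> m
        lo += m
    return result, carry


def carry_select_delay(block_sizes, t_setup=1, t_carry=1, t_mux=1, t_sum=1):
    """Arrival time of the last sum bit under the unit-delay assumption."""
    mux_out = 0
    for m in block_sizes:
        chain_ready = t_setup + m * t_carry      # both chains of a block finish together
        mux_out = max(chain_ready, mux_out) + t_mux
    return mux_out + t_sum
```

With these assumptions, four blocks of 4 bits give a delay of 10 for 16 bits, while blocks of 2, 3, 4, 5, 6 bits give 9 for 20 bits: growing each block by one stage lets its carry chain finish just as the previous multiplexer output arrives.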
Adder Delays - Comparison
LookAhead - Basic Idea
Look-Ahead: Topology
Logarithmic Look-Ahead Adder
[Figure: inputs A0..A7 combined in a linear chain, $t_p \sim N$, versus combined in a balanced tree, $t_p \sim \log_2(N)$]
Carry Lookahead Trees
$C_{o,0} = G_0 + P_0 C_{i,0}$
$C_{o,1} = G_1 + P_1 G_0 + P_1 P_0 C_{i,0}$
$C_{o,2} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{i,0}$
$\phantom{C_{o,2}} = (G_2 + P_2 G_1) + (P_2 P_1)(G_0 + P_0 C_{i,0}) = G_{2:1} + P_{2:1} C_{o,0}$
Can continue building the tree hierarchically.
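The expanded carries and the factored form for Co,2 can be cross-checked with a short sketch. The group operator `dot` and the other function names are illustrative assumptions; P is taken as A ⊕ B.

```python
from itertools import product


def dot(hi, lo):
    """Merge (G, P) of an upper group with (G, P) of the adjacent lower group."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def lookahead_carries(g, p, ci0):
    """Fully expanded carries Co,0, Co,1, Co,2."""
    co0 = g[0] | (p[0] & ci0)
    co1 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & ci0)
    co2 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & ci0)
    return co0, co1, co2


def check_factorisation():
    """True if Co,2 = G2:1 + P2:1 Co,0 and Co,2 matches the arithmetic carry."""
    for a0, b0, a1, b1, a2, b2, ci0 in product((0, 1), repeat=7):
        g = [a0 & b0, a1 & b1, a2 & b2]
        p = [a0 ^ b0, a1 ^ b1, a2 ^ b2]
        co0, _, co2 = lookahead_carries(g, p, ci0)
        g21, p21 = dot((g[2], p[2]), (g[1], p[1]))
        total = (a0 + 2 * a1 + 4 * a2) + (b0 + 2 * b1 + 4 * b2) + ci0
        if co2 != (g21 | (p21 & co0)) or co2 != total >> 3:
            return False
    return True
```

Because `dot` is associative, groups can be merged in any bracketing, which is what allows the carries to be computed by a tree.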
Tree Adders
[Figure: prefix tree with inputs (A0, B0) to (A15, B15) and outputs S0 to S15]
16-bit radix-2 Kogge-Stone tree
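A behavioural sketch of a radix-2 Kogge-Stone prefix computation is given below. It models only the logic function, not the circuit; the function name and the way the input carry is folded into bit 0 are assumptions made for illustration.

```python
def kogge_stone_add(a, b, n=16, ci=0):
    """Add two n-bit integers with a radix-2 Kogge-Stone prefix network."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    G, P = g[:], p[:]
    G[0] = g[0] | (p[0] & ci)      # fold the input carry into bit 0
    dist = 1
    while dist < n:                # log2(n) levels, span doubles at each level
        new_g, new_p = G[:], P[:]
        for i in range(dist, n):   # every position has its own merge node
            new_g[i] = G[i] | (P[i] & G[i - dist])
            new_p[i] = P[i] & P[i - dist]
        G, P = new_g, new_p
        dist *= 2
    carries = [ci] + G[:-1]        # carry into bit i is the group generate of bits i-1..0
    total = sum((p[i] ^ carries[i]) << i for i in range(n))
    return total, G[n - 1]
```

For 16 bits the loop runs four times, and at each level every bit position with a neighbour at the current distance gets its own merge node, which is why this tree has short depth but dense wiring.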
Tree Adders
[Figure: prefix tree with inputs (a0, b0) to (a15, b15) and outputs S0 to S15]
16-bit radix-4 Kogge-Stone Tree
Sparse Trees
[Figure: prefix tree with inputs (a0, b0) to (a15, b15) and outputs S0 to S15]
16-bit radix-2 sparse tree with sparseness of 2
Tree Adders
[Figure: prefix tree with inputs (A0, B0) to (A15, B15) and outputs S0 to S15]
Brent-Kung Tree
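For comparison, a behavioural sketch of a Brent-Kung style prefix computation follows. It assumes the common two-phase organisation (an up-sweep building power-of-two groups, then a down-sweep filling in the remaining positions); the function name is illustrative and the word length is assumed to be a power of two.

```python
def brent_kung_add(a, b, n=16, ci=0):
    """Add two n-bit integers with a Brent-Kung style prefix network (n a power of two)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    G, P = g[:], p[:]
    G[0] = g[0] | (p[0] & ci)              # fold the input carry into bit 0

    def combine(i, j):
        # Merge group ending at i with the adjacent lower group ending at j.
        G[i] = G[i] | (P[i] & G[j])
        P[i] = P[i] & P[j]

    levels = n.bit_length() - 1
    for d in range(levels):                 # up-sweep: groups of size 2, 4, 8, ...
        step = 1 << d
        for i in range(2 * step - 1, n, 2 * step):
            combine(i, i - step)
    for d in range(levels - 2, -1, -1):     # down-sweep: complete the missing prefixes
        step = 1 << d
        for i in range(3 * step - 1, n, 2 * step):
            combine(i, i - step)
    carries = [ci] + G[:-1]
    total = sum((p[i] ^ carries[i]) << i for i in range(n))
    return total, G[n - 1]
```

Compared with a Kogge-Stone network of the same width, this organisation uses fewer merge nodes and less wiring at the cost of more logic levels.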
Example: Domino Adder
[Figure: domino Propagate and Generate gates clocked by Clk, with inputs ai, bi]
$P_i = a_i + b_i$
$G_i = a_i b_i$
Example: Domino Adder
[Figure: domino Propagate gate clocked by Clkk, combining P_{i:i-k+1} and P_{i-k:i-2k+1} into P_{i:i-2k+1}; domino Generate gate clocked by Clkk, combining G_{i:i-k+1}, P_{i:i-k+1} and G_{i-k:i-2k+1} into G_{i:i-2k+1}]
Example: Domino Sum
[Figure: domino sum circuit with clocks Clk and Clkd and a Keeper; the carry Gi:0 selects between the precomputed sums Si0 and Si1 to produce Sum]
Multipliers
The Binary Multiplication
$Z = X \times Y = \sum_{k=0}^{M+N-1} Z_k 2^k$
$\phantom{Z} = \left( \sum_{i=0}^{M-1} X_i 2^i \right) \left( \sum_{j=0}^{N-1} Y_j 2^j \right)$
$\phantom{Z} = \sum_{i=0}^{M-1} \left( \sum_{j=0}^{N-1} X_i Y_j 2^{i+j} \right)$
with
$X = \sum_{i=0}^{M-1} X_i 2^i$
$Y = \sum_{j=0}^{N-1} Y_j 2^j$
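The double sum can be evaluated directly as a sanity check. This is a minimal sketch; the function name is illustrative, and `m` and `n` stand for the word lengths M and N of the two operands.

```python
def multiply_partial_products(x, y, m, n):
    """Z = sum over i < m, j < n of X_i * Y_j * 2^(i+j)."""
    z = 0
    for i in range(m):
        xi = (x >> i) & 1
        for j in range(n):
            yj = (y >> j) & 1
            # Each partial-product bit is a single AND of one bit from each operand.
            z += (xi & yj) << (i + j)
    return z
```

Each term X_i Y_j is one AND gate in hardware; the multiplier architectures that follow differ only in how these M×N bits are summed.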
The Binary Multiplication
The Array Multiplier
The MxN Array Multiplier: Critical Path
[Figure: array of half adders (HA) and full adders (FA) with two highlighted paths, Critical Path 1 and Critical Path 2, sharing the segment labelled Critical Path 1 & 2]
Carry-Save Multiplier
[Figure: rows of half adders (HA) and full adders (FA) whose carries are passed to the next row instead of along the row, followed by a Vector Merging Adder]
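The carry-save idea can be sketched at word level: each partial-product row is absorbed by a row of full adders that keeps sum and carry as two separate vectors, and only the final vector-merging adder propagates carries. This is a behavioural model under that assumption, not the cell-level array of the figure; the function name is illustrative.

```python
def carry_save_multiply(x, y, m=4, n=4):
    """Multiply an m-bit x by an n-bit y, keeping sum and carry vectors separate."""
    mask = (1 << (m + n)) - 1
    sum_vec, carry_vec = 0, 0
    for j in range(n):
        pp = (x << j) if (y >> j) & 1 else 0   # partial-product row j
        # Bitwise 3:2 compression: no carry travels along the row.
        new_sum = sum_vec ^ carry_vec ^ pp
        new_carry = ((sum_vec & carry_vec) | (sum_vec & pp) | (carry_vec & pp)) << 1
        sum_vec, carry_vec = new_sum & mask, new_carry & mask
    return (sum_vec + carry_vec) & mask        # vector-merging adder
```

Since no row has to wait for a carry rippling sideways, only the final adder has a carry-propagation path.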
Multiplier Floorplan
[Figure: 4x4 array of cells with sum (S) and carry (C) outputs; X0..X3 enter along the top, Y0..Y3 along the side, outputs Z0..Z7. Legend: HA Multiplier Cell, FA Multiplier Cell, Vector Merging Cell]
X and Y signals are broadcast through the complete array.
Wallace-Tree Multiplier
[Figure: Wallace tree of full adders (FA) compressing inputs y0..y5 together with carries Ci-1 from the previous column, passing carries Ci to the next column, and producing outputs S and C]
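A simplified sketch of tree-style partial-product reduction: bits of equal weight are grouped in columns, full adders compress three bits into a sum bit and a carry bit per stage, and the loop stops when every column holds at most two bits. This is an illustrative model using full adders only; the function names are assumptions and the final addition is done with ordinary integer arithmetic.

```python
def full_adder(a, b, c):
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def wallace_multiply(x, y, m=4, n=4):
    """Multiply by column-wise 3:2 compression followed by a final two-row addition."""
    # columns[k] holds all bits of weight 2^k
    columns = [[] for _ in range(m + n)]
    for i in range(m):
        for j in range(n):
            columns[i + j].append(((x >> i) & 1) & ((y >> j) & 1))
    while any(len(col) > 2 for col in columns):
        nxt = [[] for _ in range(len(columns) + 1)]
        for k, col in enumerate(columns):
            idx = 0
            while len(col) - idx >= 3:          # compress three bits of equal weight
                s, c = full_adder(col[idx], col[idx + 1], col[idx + 2])
                nxt[k].append(s)                # sum keeps the same weight
                nxt[k + 1].append(c)            # carry moves to the next column
                idx += 3
            nxt[k].extend(col[idx:])            # leftover bits pass through unchanged
        columns = nxt
    # At most two bits per column remain; adding them is the final carry-propagate addition.
    return sum(bit << k for k, col in enumerate(columns) for bit in col)
```

All full adders within one stage work in parallel, so the number of stages grows with the logarithm of the number of partial products rather than linearly.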
Multipliers: Summary
• Optimization goals different vs. binary adder
• Once again: identify critical path
• Other possible techniques
- Logarithmic versus linear (Wallace tree mult)
- Data encoding (Booth)
- Pipelining
FIRST GLIMPSE AT SYSTEM LEVEL OPTIMIZATION
Shifters
The Binary Shifter
[Figure: one-bit programmable shifter with control signals Right, Left, nop; inputs Ai, Ai-1 and outputs Bi, Bi-1; Bit-Slice i]
The Barrel Shifter
[Figure: transistor array with data inputs A0..A3, data outputs B0..B3, and control lines Sh0..Sh3. Legend: Data Wire, Control Wire]
Area Dominated by Wiring
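A behavioural sketch of a 4-bit barrel shifter with one-hot control lines Sh0..Sh3. Two points are assumptions, since the figure does not survive in the text: the shift is modelled as a right shift, and the top input bit A3 is repeated into vacated positions. The function name is illustrative.

```python
def barrel_shift(a_bits, sh):
    """a_bits = [A0, A1, A2, A3], sh = one-hot [Sh0, Sh1, Sh2, Sh3]; returns [B0..B3]."""
    if sum(sh) != 1:
        raise ValueError("exactly one shift control line must be active")
    k = sh.index(1)                 # shift amount selected by the active control line
    n = len(a_bits)
    # Output B_i connects to A_(i+k); positions past the top bit repeat A_(n-1) (assumed).
    return [a_bits[min(i + k, n - 1)] for i in range(n)]
```

Every data bit passes through a single switch regardless of the shift amount, but the array needs one switch per input/output pair, which fits the remark that the area is dominated by wiring.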
4x4 barrel shifter
[Figure: layout of a 4x4 barrel shifter with inputs A0..A3, control lines Sh0..Sh3, and output Buffer]
$\text{Width}_{barrel} \sim 2\, p_m\, M$
Logarithmic Shifter
[Figure: three cascaded stages controlled by Sh1, Sh2, Sh4 and their complements, with inputs A0..A3 and outputs B0..B3]
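A behavioural sketch of the logarithmic shifter: three stages shift by 1, 2 and 4 positions when their control bit is set, so any shift from 0 to 7 is the binary combination of the stages. A right shift with zero fill is assumed here, and the function and helper names are illustrative.

```python
def log_shift_right(a_bits, sh1, sh2, sh4):
    """a_bits is LSB first; each enabled stage shifts right by its fixed distance."""
    bits = list(a_bits)
    for enable, dist in ((sh1, 1), (sh2, 2), (sh4, 4)):
        if enable:
            shifted = bits[dist:]
            bits = shifted + [0] * (len(bits) - len(shifted))   # zero fill (assumed)
        # A disabled stage passes the word through unchanged.
    return bits


def to_bits(x, n):
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))
```

A word passes through log2 of the maximum shift distance in stages, compared with a single switch in the barrel shifter, but each stage needs only two control wires.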
0-7 bit Logarithmic Shifter
[Figure: layout of a 0-7 bit logarithmic shifter with inputs A0..A3 and outputs Out0..Out3]