ECE553
Lecture 7: Adders
Reading: CH 11.1 – 11.3.1

Sep 01, 2018


hacong

The Latest News in EE world

•  20nm NAND, Hynix, 2010/2/10: R&D, third-quarter mass production
•  20nm NAND, Samsung, 2010/4/19: 50% more density than 30nm; 1st product level; write speed: 10MB/s

$~180

A Generic Digital Processor

[Figure: block diagram with MEMORY, DATAPATH, CONTROL, and INPUT-OUTPUT blocks]

Building blocks:

•  Arithmetic unit
-  Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
•  Memory
-  RAM, ROM, buffers, shift registers
•  Control
-  Finite state machine (PLA, random logic)
-  Counters
•  Interconnect
-  Switches
-  Arbiters
-  Bus

An Intel Microprocessor

[Figure: Itanium integer execution unit with a 1000um dimension marker. Labels: 9-1, 5-1, and 2-1 muxes; ck1; CARRYGEN; SUMGEN + LU (LU: Logical Unit); SUMSEL; REG; signals a, b, s0, s1, g64, sum, sumb; output to cache]

Itanium has 6 integer execution units like this

Bit-Sliced Design

[Figure: bit-sliced datapath. Labels: Bit 3, Bit 2, Bit 1, Bit 0; Register, Adder, Shifter, Multiplexer; Control; Data-In; Data-Out]

•  Tile identical processing elements
•  Datapath: all computations are performed word-based

Bit-Sliced Datapath

[Figure: bit-sliced datapath with bit slices 0 to 63. Labels: from register files / cache / bypass; multiplexers; shifter; adder stages 1, 2, and 3 with wiring between them; sum select; loopback bus; to register files / cache]

Itanium Integer Datapath

Fetzer, Orton, ISSCC’02

Full-Adder

[Figure: full adder block (inputs A, B, Cin; outputs Sum, Cout) and half adder]

Carry status:

P (propagate): Cout = Cin
G (generate): Cout = 1
D (delete): Cout = 0

The Binary Adder

[Figure: full adder block with inputs A, B, Cin and outputs Sum, Cout]

$S = A \oplus B \oplus C_i = A\bar{B}\bar{C_i} + \bar{A}B\bar{C_i} + \bar{A}\bar{B}C_i + ABC_i$

$C_o = AB + BC_i + AC_i$

$C_{out} = (A \cdot B) + (C_{in} \cdot (A \oplus B))$
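
As a quick sanity check of the equations on this slide, the following Python sketch evaluates a one-bit full adder over all eight input combinations and confirms that the two carry expressions agree. The function names `full_adder` and `carry_xor_form` are illustrative and do not come from the lecture.

```python
from itertools import product

def full_adder(a, b, ci):
    """One-bit full adder: S = A xor B xor Ci, Co = AB + BCi + ACi."""
    s = a ^ b ^ ci
    co = (a & b) | (b & ci) | (a & ci)
    return s, co

def carry_xor_form(a, b, ci):
    """Alternative carry expression: Cout = (A.B) + (Cin.(A xor B))."""
    return (a & b) | (ci & (a ^ b))

# Exhaustive check against ordinary integer addition.
for a, b, ci in product((0, 1), repeat=3):
    s, co = full_adder(a, b, ci)
    assert a + b + ci == 2 * co + s
    assert co == carry_xor_form(a, b, ci)
```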

Express Sum and Carry as a function of P, G, D

Define 3 new variables which depend ONLY on A and B, not on Ci (why?):

Generate: $G = AB$
Propagate: $P = A \oplus B$
Delete: $D = \bar{A}\bar{B}$

Can also derive expressions for S and Co based on D and P.

Note that we will sometimes be using an alternate definition for Propagate: $P = A + B$

$S = A \oplus B \oplus C_i$
$C_o = (A \cdot B) + (C_i \cdot (A \oplus B))$

$C_o(G, P) = G + PC_i$
$S(G, P) = P \oplus C_i$
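
A minimal Python sketch of the P/G/D formulation, assuming the XOR definition of propagate for the sum. The helper names `pgd` and `adder_from_pg` are made up for illustration. The loop also checks that the alternate OR definition of propagate yields the same carry, which is why it can be swapped in for carry logic but not for the sum.

```python
from itertools import product

def pgd(a, b):
    """Return (P, G, D), which depend only on A and B."""
    g = a & b                # generate: Co = 1 regardless of Ci
    p = a ^ b                # propagate: Co = Ci
    d = (1 - a) & (1 - b)    # delete: Co = 0 regardless of Ci
    return p, g, d

def adder_from_pg(a, b, ci):
    """S = P xor Ci, Co = G + P.Ci."""
    p, g, _ = pgd(a, b)
    return p ^ ci, g | (p & ci)

for a, b, ci in product((0, 1), repeat=3):
    p, g, d = pgd(a, b)
    assert p + g + d == 1                # exactly one status is active
    s, co = adder_from_pg(a, b, ci)
    assert a + b + ci == 2 * co + s
    assert co == g | ((a | b) & ci)      # alternate P = A + B, same carry
```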

The Ripple-Carry Adder

[Figure: four full adders (FA) in a chain. Stage k takes Ak and Bk and produces Sk; the carry ripples from Ci,0 through Co,0 (= Ci,1), Co,1, and Co,2 to Co,3]

•  Worst-case delay is linear in the number of bits (N): $t_d = O(N)$
•  $t_{adder} = (N-1)t_{carry} + t_{sum}$
•  Critical path: the worst-case delay over all possible input patterns
•  Simple but slow!
•  Goal: make the fastest possible carry path circuit
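
The chain of full adders can be modelled bit by bit in Python. This is only a behavioural sketch: `to_bits`, `from_bits`, `ripple_carry_add`, and `ripple_delay` are hypothetical helper names, bit lists are LSB first, and the unit delays passed to `ripple_delay` are placeholders rather than measured values.

```python
def to_bits(x, n):
    """Integer to an LSB-first list of n bits."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """LSB-first bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_carry_add(a_bits, b_bits, ci=0):
    """Chain one full adder per bit; the carry ripples from bit 0 upward."""
    sums, c = [], ci
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ c)
        c = (a & b) | (c & (a ^ b))
    return sums, c

def ripple_delay(n, t_carry=1.0, t_sum=1.0):
    """Worst-case delay from the slide: (N-1)*tcarry + tsum."""
    return (n - 1) * t_carry + t_sum

s, co = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s), co)       # 11 + 6 = 17 = 16 + 1
print(ripple_delay(4), ripple_delay(64))
```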

Static Adder Circuit

$C_o = AB + BC_i + AC_i = AB + C_i(A + B)$

$S = A \oplus B \oplus C_i = A\bar{B}\bar{C_i} + \bar{A}B\bar{C_i} + \bar{A}\bar{B}C_i + ABC_i = ABC_i + \bar{C_o}(A + B + C_i)$
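
The sum expression that reuses the inverted carry is not obviously equal to the three-input XOR, so a brute-force check helps. The sketch below assumes that form, S = ABCi + (not Co)(A + B + Ci); `static_adder` is an illustrative name.

```python
from itertools import product

def static_adder(a, b, ci):
    """Full adder with the sum built from the inverted carry."""
    co = (a & b) | (ci & (a | b))
    co_bar = 1 - co
    s = (a & b & ci) | (co_bar & (a | b | ci))
    return s, co

for a, b, ci in product((0, 1), repeat=3):
    s, co = static_adder(a, b, ci)
    assert s == a ^ b ^ ci
    assert a + b + ci == 2 * co + s
```

Sharing the carry gate with the sum logic saves transistors, at the price of putting the carry on the path to the sum output.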

Complementary Static CMOS Full Adder

[Figure: complementary static CMOS full adder schematic with inputs A, B, Ci, internal node X, and outputs Co and S]

28 transistors

$C_o = AB + BC_i + AC_i$
$S = ABC_i + \bar{C_o}(A + B + C_i)$

Implementation is ugly!!

Sizing?

Inversion Property

[Figure: two full adder (FA) blocks with inputs A, B, Ci and outputs S, Co; the second block has inverted inputs and inverted outputs]

$\bar{S}(A, B, C_i) = S(\bar{A}, \bar{B}, \bar{C_i})$
$\bar{C_o}(A, B, C_i) = C_o(\bar{A}, \bar{B}, \bar{C_i})$

Inverted inputs → inverted outputs
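
The inversion property can be confirmed by exhaustive enumeration. In this sketch `full_adder` and `inversion_property_holds` are illustrative names, and inversion of a bit x is written as 1 - x.

```python
from itertools import product

def full_adder(a, b, ci):
    """Return (S, Co) for one-bit inputs."""
    return a ^ b ^ ci, (a & b) | (ci & (a ^ b))

def inversion_property_holds():
    """Check that inverting all inputs inverts both outputs."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        s_inv, co_inv = full_adder(1 - a, 1 - b, 1 - ci)
        if (s_inv, co_inv) != (1 - s, 1 - co):
            return False
    return True

print(inversion_property_holds())
```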

Minimize Critical Path by Reducing Inverting Stages

Exploit Inversion Property

[Figure: 4-bit ripple-carry adder with alternating even and odd FA cells; inputs A0 to A3 and B0 to B3, outputs S0 to S3, carries Ci,0, Co,0, Co,1, Co,2, Co,3]

Allows us to remove the inverter in the carry chain. At what cost?

A Better Structure: The Mirror Adder

[Figure: mirror adder transistor schematic with inputs A, B, Ci and outputs Co, S; the carry-stage paths are labelled Kill, Generate, "0"-Propagate, and "1"-Propagate]

24 transistors

The Mirror Adder Details

•  The NMOS and PMOS chains are completely symmetrical. A maximum of 2 series transistors can be observed in the carry-generation circuitry.
•  In layout, it is critical to minimize the capacitance at node Co. The reduction of the diffusion (junction) capacitances is particularly important.
•  The capacitance at node Co is composed of 4 diffusion (junction) capacitances, 2 internal gate capacitances, and 6 gate capacitances in the connecting adder cell.
•  The transistors connected to Ci are placed closest to the output.
•  Only the transistors in the carry stage have to be optimized for speed. All transistors in the sum stage can be minimum size.

Mirror Adder

[Figure: stick diagram of the mirror adder between VDD and GND rails, with inputs A, B, Ci and outputs Co, S]

Layout of Mirror Adder

•  2 possible layouts of mirror adder
-  (a) Corresponds roughly to last slide’s stick diagram
-  (b) Layout is datapath-oriented (e.g., M2 can easily run horizontally across the cell)

[Figure: layouts (a) and (b)]

Sizing Mirror Adder

Transmission Gate Full Adder

[Figure: transmission-gate full adder schematic in three sections: Setup (producing P from A and B), Sum Generation, and Carry Generation; inputs A, B, Ci; outputs S, Co]

•  24 transistors
•  Similar delay for Sum and Carry

$C_o(G, P) = G + PC_i$
$S(G, P) = P \oplus C_i$

Manchester Carry Chain

[Figure: one Manchester carry chain stage in two versions: static (signals Ci, Co, Pi, Gi, Di) and dynamic (signals Ci, Co, Pi, Gi, clock φ)]

Manchester Carry Chain

[Figure: 4-bit dynamic Manchester carry chain clocked by φ, with input Ci,0, propagate signals P0 to P3, generate signals G0 to G3, and carries C0 to C3]

•  Implement P with pass-transistors (1 → 0 transition), not transmission gates (TG)
•  Implement G with pull-down OR delete with pull-down (note inversion)
•  Use dynamic logic to reduce complexity and speed up

Worst delay path?

Manchester Carry Chain

[Figure: stick diagram of two Manchester carry chain stages (i and i+1) between VDD and GND, with a Propagate/Generate row (Pi, Gi, φ, Pi+1, Gi+1, φ), an Inverter/Sum row, and carries Ci-1, Ci, Ci+1]

Carry-Bypass Adder

[Figure: two versions of a 4-bit adder block built from four FA cells with propagate/generate inputs for bits 0 to 3 and carries Ci,0, Co,0, Co,1, Co,2, Co,3. In the second version a multiplexer controlled by BP = P0P1P2P3 drives Co,3]

Idea: if (P0 and P1 and P2 and P3) = 1 then Co,3 = Ci,0, else “kill” or “generate”.

Also called Carry-Skip

•  Small area overhead (10~20%)
•  But not a regular bit-slice structure
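
A behavioural sketch of one bypass block in Python, assuming the XOR definition of propagate. The name `bypass_block` and the LSB-first list convention are illustrative choices. Note that when BP = 1 the rippled carry already equals Ci,0, so the multiplexer changes timing, not the logical result.

```python
def bypass_block(a_bits, b_bits, ci):
    """One carry-bypass block; returns (sum bits, Co, BP)."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    g = [a & b for a, b in zip(a_bits, b_bits)]
    sums, c = [], ci
    for pk, gk in zip(p, g):
        sums.append(pk ^ c)
        c = gk | (pk & c)          # ripple path inside the block
    bp = int(all(p))               # BP = P0.P1.P2.P3
    co = ci if bp else c           # multiplexer: skip the block when BP = 1
    return sums, co, bp

print(bypass_block([1, 1, 1, 1], [0, 0, 0, 0], 1))  # all bits propagate
print(bypass_block([1, 0, 0, 0], [1, 0, 0, 0], 0))  # bit 0 generates
```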

Carry-Bypass Adder (cont.)

[Figure: total N-bit adder (16 bits shown) split into M-bit blocks (bits 0–3, 4–7, 8–11, 12–15), each with Setup, Carry propagation, and Sum stages; the delays tsetup, tbypass, and tsum are marked along the critical path]

$t_{adder} = t_{setup} + M t_{carry} + (N/M - 1) t_{bypass} + (M - 1) t_{carry} + t_{sum}$

•  Inner blocks do not contribute to worst-case delay since they have time to compute while bits 0-3 are propagating (assuming they have a generate or delete)
•  Block sizes can be made non-uniform (HOW?)
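
To get a feel for the delay expression, it can be evaluated for a few word lengths. The sketch below assumes N is a multiple of M and uses unit values for every delay term; those numbers are placeholders, not data from the lecture, and `bypass_delay` is an illustrative name.

```python
def bypass_delay(n, m, t_setup=1, t_carry=1, t_bypass=1, t_sum=1):
    """tsetup + M*tcarry + (N/M - 1)*tbypass + (M - 1)*tcarry + tsum."""
    blocks = n // m
    return (t_setup + m * t_carry + (blocks - 1) * t_bypass
            + (m - 1) * t_carry + t_sum)

for n in (16, 32, 64):
    print(n, bypass_delay(n, 4))
```

Doubling N only adds N/M more bypass delays, which is why the slope of the bypass adder's delay is smaller than that of the ripple adder.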

Carry Ripple versus Carry Bypass

[Figure: tp versus N for the ripple adder and the bypass adder, with 4..8 marked on the N axis]

Why?

Carry-Select Adder

[Figure: one carry-select block. Setup produces P, G; a "0" carry propagation chain and a "1" carry propagation chain run in parallel; a multiplexer driven by Co,k-1 selects the carry vector and Co,k+3; sum generation follows]

Basic idea: pre-compute both possibilities (0 or 1) → choose when Cin is available

Area overhead: additional carry path + mux ~ 30%
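
The pre-compute-then-select idea can be sketched behaviourally in Python. Each block is evaluated twice, once per assumed carry-in, and the real carry only picks a result. The names `ripple` and `carry_select_add`, the default 16-bit width, and the 4-bit block size are assumptions made for this example.

```python
def ripple(a_bits, b_bits, ci):
    """Ripple-carry addition of two LSB-first bit lists."""
    sums, c = [], ci
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ c)
        c = (a & b) | (c & (a ^ b))
    return sums, c

def carry_select_add(x, y, n=16, m=4, ci=0):
    """Add two n-bit integers using m-bit carry-select blocks."""
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> i) & 1 for i in range(n)]
    out, c = [], ci
    for k in range(0, n, m):
        a_blk, b_blk = xb[k:k + m], yb[k:k + m]
        s0, c0 = ripple(a_blk, b_blk, 0)   # "0" carry propagation
        s1, c1 = ripple(a_blk, b_blk, 1)   # "1" carry propagation
        s_blk, c = (s1, c1) if c else (s0, c0)   # multiplexer
        out.extend(s_blk)
    return sum(bit << i for i, bit in enumerate(out)), c

print(carry_select_add(40000, 30000))  # 70000 = 65536 + 4464
```

In hardware the two chains of every block run at the same time, so only the chain of multiplexers remains on the critical path.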

Carry Select Adder: Critical Path

[Figure: 16-bit carry-select adder made of four 4-bit blocks (bits 0–3, 4–7, 8–11, 12–15), each with Setup, 0-Carry, 1-Carry, Multiplexer, and Sum Generation stages. Block carries: Ci,0, Co,3, Co,7, Co,11, Co,15; outputs S0–3, S4–7, S8–11, S12–15]

$t_{add} = t_{setup} + M t_{carry} + (N/M) t_{mux} + t_{sum}$

Still O(N)

Linear Carry Select

[Figure: 16-bit carry-select adder with four equal 4-bit blocks (bits 0-3, 4-7, 8-11, 12-15; outputs S0-3, S4-7, S8-11, S12-15), each with Setup, "0" Carry, "1" Carry, Multiplexer, and Sum Generation stages. Annotated arrival times in unit delays: setup (1); carry chains (5), (5), (5), (5); multiplexer outputs (6), (7), (8), (9); final sum (10)]

$t_{add} = t_{setup} + M t_{carry} + (N/M) t_{mux} + t_{sum}$

Still ~ O(N)

Square Root Carry Select

[Figure: 20-bit carry-select adder with blocks of increasing size (bits 0-1, 2-4, 5-8, 9-13, 14-19; outputs S0-1, S2-4, S5-8, S9-13, S14-19), each with Setup, "0" Carry, "1" Carry, Multiplexer, and Sum Generation stages. Annotated arrival times in unit delays: setup (1); carry chains (3), (4), (5), (6), (7); multiplexer outputs (4), (5), (6), (7), (8); final sum (9)]

$t_{add} = t_{setup} + P \cdot t_{carry} + \sqrt{2N}\, t_{mux} + t_{sum}$

~O(sqrt(N))
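
The arrival times annotated on the linear and square-root carry-select figures can be reproduced with a small timing model. This is a sketch under stated assumptions: every setup, per-bit carry, multiplexer, and sum delay is one unit, and a block's multiplexer fires one unit after both its own carry chains and the previous multiplexer output are ready. `carry_select_delay` is an illustrative name.

```python
def carry_select_delay(block_sizes, t_setup=1, t_carry=1, t_mux=1, t_sum=1):
    """Worst-case delay of a carry-select adder with the given block sizes."""
    mux_out = 0
    for size in block_sizes:
        carry_ready = t_setup + size * t_carry      # both chains run in parallel
        mux_out = max(mux_out, carry_ready) + t_mux
    return mux_out + t_sum

print(carry_select_delay([4, 4, 4, 4]))      # linear select, 16 bits
print(carry_select_delay([2, 3, 4, 5, 6]))   # square-root select, 20 bits
print(carry_select_delay([4, 4, 4, 4, 4]))   # linear select, 20 bits
```

Growing each block by one bit lets its carry chains finish just as the previous multiplexer output arrives, so no block waits on another.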

Adder Delays - Comparison

[Figure: tp (in unit delays, 0 to 50) versus N (0 to 60) for the ripple adder, linear select, and square root select]

Carry Lookahead Adder - Basic Idea

To remove the rippling effect of the carry

[Figure: N bit positions evaluated in parallel; position k takes Ak, Bk, the carry-in Ci,k, and the propagate signal Pk, and produces Sk, for k = 0 to N-1]

$C_{o,k} = f(A_k, B_k, C_{o,k-1}) = G_k + P_k C_{o,k-1}$

$C_{o,k} = G_k + P_k(G_{k-1} + P_{k-1}(\cdots + P_1(G_0 + P_0 C_{i,0})))$
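
Unrolling the carry recurrence gives each carry as a sum of products of P, G, and Ci,0 only. The Python sketch below builds that flat expression for every bit and compares it with the rippled recurrence. `ripple_carries` and `lookahead_carries` are illustrative names, and lists are LSB first.

```python
from itertools import product

def ripple_carries(p, g, ci0):
    """Carries from the recurrence Co,k = Gk + Pk.Co,k-1."""
    carries, c = [], ci0
    for pk, gk in zip(p, g):
        c = gk | (pk & c)
        carries.append(c)
    return carries

def lookahead_carries(p, g, ci0):
    """Carries from the fully unrolled expression (no carry-to-carry dependence)."""
    carries = []
    for k in range(len(p)):
        c = ci0
        for pm in p[:k + 1]:          # Ci,0 . P0 . P1 ... Pk
            c &= pm
        for j in range(k + 1):        # Gj . Pj+1 ... Pk
            term = g[j]
            for pm in p[j + 1:k + 1]:
                term &= pm
            c |= term
        carries.append(c)
    return carries

for bits in product((0, 1), repeat=9):
    p, g, ci0 = list(bits[:4]), list(bits[4:8]), bits[8]
    assert lookahead_carries(p, g, ci0) == ripple_carries(p, g, ci0)
```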

Lookahead Adder

•  Carry and sum are independent of the previous bits
•  Addition time is independent of the number of bits

Lookahead Adder

•  Unrolling of the carry recurrence can be continued
•  If unrolled to level k, the result is a two-level AND-OR structure
•  AND fan-in = k + 1, OR fan-in = k + 1
•  k + 1 transistors in the MOS stack
•  Limits k to 3-4

Look-Ahead: Topology – 4 bits

Expanding Lookahead equations:

$C_{o,k} = G_k + P_k C_{o,k-1} = G_k + P_k(G_{k-1} + P_{k-1} C_{o,k-2})$

All the way:

$C_{o,k} = G_k + P_k(G_{k-1} + P_{k-1}(\cdots + P_1(G_0 + P_0 \cdot C_{i,0})))$

k = 3:

$C_{o,3} = G_3 + P_3G_2 + P_3P_2G_1 + P_3P_2P_1G_0 + P_3P_2P_1P_0C_{i,0} = (G_3 + P_3G_2) + (P_3P_2)C_{o,1} = G_{3:2} + P_{3:2}C_{o,1}$

[Figure: mirror implementation of the 4-bit lookahead carry, with inputs Ci,0, P0 to P3, G0 to G3 and output Co,3]
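
The regrouping of Co,3 into a group generate G3:2 and a group propagate P3:2 is a Boolean identity, which the following Python sketch checks for every combination of inputs. `co3_flat` and `co3_grouped` are illustrative names; `p` and `g` are lists indexed by bit position.

```python
from itertools import product

def co3_flat(p, g, ci0):
    """Fully expanded Co,3."""
    return (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
            | (p[3] & p[2] & p[1] & g[0])
            | (p[3] & p[2] & p[1] & p[0] & ci0))

def co3_grouped(p, g, ci0):
    """Co,3 = G3:2 + P3:2 . Co,1."""
    co1 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & ci0)
    g32 = g[3] | (p[3] & g[2])     # group generate for bits 3:2
    p32 = p[3] & p[2]              # group propagate for bits 3:2
    return g32 | (p32 & co1)

for bits in product((0, 1), repeat=9):
    p, g, ci0 = bits[:4], bits[4:8], bits[8]
    assert co3_flat(p, g, ci0) == co3_grouped(p, g, ci0)
```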

Block (Logarithmic) Lookahead

Block lookahead

Can create groups of groups, or ‘super-groups’
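
One way to picture groups of groups is to merge (G, P) pairs two at a time, using the same rule that produced G3:2 and P3:2, until a single pair covers the whole word. The Python sketch below does this with a simple balanced tree. It is an assumption-laden illustration rather than the specific structure drawn on the slide, and `combine`, `group_gp`, and `carry_out` are made-up names.

```python
def combine(hi, lo):
    """Merge two adjacent groups: G = Ghi + Phi.Glo, P = Phi.Plo."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo

def group_gp(pairs):
    """Reduce LSB-first (G, P) pairs to one group pair, level by level."""
    level = list(pairs)
    while len(level) > 1:
        nxt = [combine(level[i + 1], level[i])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:             # odd group out moves up unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]

def carry_out(x, y, n, ci=0):
    """Carry out of an n-bit addition from the top-level group (G, P)."""
    pairs = [(((x >> i) & (y >> i)) & 1, ((x >> i) ^ (y >> i)) & 1)
             for i in range(n)]
    g, p = group_gp(pairs)
    return g | (p & ci)

print(carry_out(200, 100, 8), carry_out(100, 100, 8))
```

Each pass of the loop halves the number of groups, so the number of levels grows with log2(N) rather than N.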

Summary

•  Many topologies for adders
-  Extensions of carry lookahead are dominant today (why?)
•  For 16-bit addition, complex techniques such as carry lookahead do not offer much benefit
-  Carry lookahead based techniques are much faster for N>16, at the cost of lots more area/complexity
-  Carry select and carry bypass yield good performance in this case
-  Can string together 4-bit lookahead structures in a ripple fashion to achieve decent performance
•  Adder cells may use mirror structure or transmission gates