7/27/2019 Cse477 19 Timing
1/20
CSE477 L19 Timing Issues; Datapaths.1 Irwin&Vijay, PSU, 2002
CSE477 VLSI Digital Circuits
Fall 2002
Lecture 19: Timing Issues; Introduction to Datapath Design
Mary Jane Irwin (www.cse.psu.edu/~mji)
www.cse.psu.edu/~cg477
[Adapted from Rabaey's Digital Integrated Circuits, 2002, J. Rabaey et al.]
Review: Sequential Definitions
Use two level-sensitive latches of opposite type to build one master-slave flip-flop that changes state on a clock edge (when the slave is transparent)
Static storage
static uses a bistable element with feedback to store its state and thus preserves state as long as the power is on
- Loading new data into the element: 1) cutting the feedback path (mux based); 2) overpowering the feedback path (SRAM based)
Dynamic storage
dynamic stores state on parasitic capacitors, so the state is held for only a limited time (milliseconds); requires periodic refresh
dynamic is usually simpler (fewer transistors), faster, and lower power, but because of noise immunity issues the circuit is always modified so that it is pseudostatic
Timing Classifications
Synchronous systems
All memory elements in the system are simultaneously updated using a globally distributed periodic synchronization signal (i.e., a global clock signal)
Functionality is ensured by strict constraints on the clock signal generation and distribution to minimize
- Clock skew (spatial variations in clock edges)
- Clock jitter (temporal variations in clock edges)
Asynchronous systems
Self-timed (controlled) systems
No need for a globally distributed clock, but have asynchronous
circuit overheads (handshaking logic, etc.)
Hybrid systems
Synchronization between different clock domains
Interfacing between asynchronous and synchronous domains
Review: Synchronous Timing Basics
Under ideal conditions (i.e., when tclk1 = tclk2)
$T \ge t_{c-q} + t_{plogic} + t_{su}$
$t_{hold} \le t_{cdlogic} + t_{cdreg}$
Under real conditions, the clock signal can have both spatial (clock skew) and temporal (clock jitter) variations
skew is constant from cycle to cycle (by definition); skew can be positive (clock and data flowing in the same direction) or negative (clock and data flowing in opposite directions)
jitter causes T to change on a cycle-by-cycle basis
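[Figure: register R1 (D, Q) driven by In feeds combinational logic, which feeds register R2 (D, Q); clk arrives at R1 at tclk1 and at R2 at tclk2. Register parameters: tc-q, tsu, thold, tcdreg. Logic parameters: tplogic, tcdlogic.]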
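The two ideal-condition constraints above (T at least tc-q + tplogic + tsu, and thold at most tcdlogic + tcdreg) can be checked numerically. The following is a minimal sketch; the function names and the picosecond values are illustrative assumptions, not figures from the lecture.

```python
def min_clock_period(t_cq, t_plogic, t_su):
    """Smallest period T that satisfies T >= t_c-q + t_plogic + t_su."""
    return t_cq + t_plogic + t_su


def hold_met(t_hold, t_cdlogic, t_cdreg):
    """True when t_hold <= t_cdlogic + t_cdreg (no race through the logic)."""
    return t_hold <= t_cdlogic + t_cdreg


# Illustrative delays in picoseconds (assumed values, not from the slides).
print(min_clock_period(t_cq=100, t_plogic=700, t_su=50))   # 850
print(hold_met(t_hold=40, t_cdlogic=60, t_cdreg=30))       # True
```

Note that the setup constraint uses the worst-case (propagation) delays, while the hold constraint uses the best-case (contamination) delays.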
Sources of Clock Skew and Jitter in Clock Network
[Figure: clock network driven by a PLL, with seven numbered sources of skew and jitter, including clock generation, clock drivers, power supply, interconnect, capacitive load, capacitive coupling, and temperature.]
Skew
- manufacturing device variations in clock drivers
- interconnect variations
- environmental variations (power supply and temperature)
Jitter
- clock generation
- capacitive loading and coupling
- environmental variations (power supply and temperature)
Positive Clock Skew
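[Figure: register R1 feeds combinational logic, which feeds register R2; clk reaches R1 at tclk1 and, after a delay $\delta$, reaches R2 at tclk2. Clock and data flow in the same direction. Timing diagram: clock edges 1 to 4, with T, T + $\delta$, and $\delta$ + thold marked.]
$\delta > 0$: Improves performance, but makes $t_{hold}$ harder to meet. If $t_{hold}$ is not met (race conditions), the circuit malfunctions independent of the clock period!
T: $T + \delta \ge t_{c-q} + t_{plogic} + t_{su}$, so $T \ge t_{c-q} + t_{plogic} + t_{su} - \delta$
thold: $t_{hold} + \delta \le t_{cdlogic} + t_{cdreg}$, so $t_{hold} \le t_{cdlogic} + t_{cdreg} - \delta$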
Negative Clock Skew
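[Figure: register R1 feeds combinational logic, which feeds register R2; clk is delayed on its way from R2 back to R1, so tclk2 is earlier than tclk1. Clock and data flow in opposite directions. Timing diagram: clock edges 1 to 4, with T and T + $\delta$ marked, $\delta < 0$.]
$\delta < 0$: Degrades performance, but $t_{hold}$ is easier to meet (eliminating race conditions)
T: $T + \delta \ge t_{c-q} + t_{plogic} + t_{su}$, so $T \ge t_{c-q} + t_{plogic} + t_{su} - \delta$
thold: $t_{hold} + \delta \le t_{cdlogic} + t_{cdreg}$, so $t_{hold} \le t_{cdlogic} + t_{cdreg} - \delta$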
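The trade-off between positive and negative skew can be seen by sweeping the skew (written `skew` here, equal to tclk2 - tclk1) through the two constraints T >= tc-q + tplogic + tsu - skew and thold <= tcdlogic + tcdreg - skew. This is a sketch only; the function names and the picosecond values are assumptions chosen for illustration.

```python
def min_period_with_skew(t_cq, t_plogic, t_su, skew):
    """Minimum clock period: T >= t_c-q + t_plogic + t_su - skew."""
    return t_cq + t_plogic + t_su - skew


def hold_margin_with_skew(t_hold, t_cdlogic, t_cdreg, skew):
    """Slack on t_hold <= t_cdlogic + t_cdreg - skew; negative means a race."""
    return t_cdlogic + t_cdreg - skew - t_hold


# Illustrative delays in picoseconds (assumed values, not from the slides).
for skew in (-50, 0, 50, 100):
    period = min_period_with_skew(100, 700, 50, skew)
    margin = hold_margin_with_skew(40, 60, 30, skew)
    print(skew, period, margin)
# -50 900 100
# 0 850 50
# 50 800 0
# 100 750 -50
```

With these numbers, each extra picosecond of positive skew shortens the minimum period by one picosecond and removes one picosecond of hold margin; at a skew of 100 ps the hold margin is negative, so the circuit would fail regardless of the clock period.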
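Clock Jitter
Jitter causes T to vary on a cycle-by-cycle basis
[Figure: register R1 and combinational logic clocked by clk at tclk; each clock edge of the period T may arrive between -tjitter and +tjitter around its nominal time.]
T: $T - 2t_{jitter} \ge t_{c-q} + t_{plogic} + t_{su}$, so $T \ge t_{c-q} + t_{plogic} + t_{su} + 2t_{jitter}$
Jitter directly reduces the performance of a sequential circuit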
Combined Impact of Skew and Jitter
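[Figure: register R1, combinational logic, and register R2 with clock arrival times tclk1 and tclk2; timing diagram of the constraints on the minimum clock period ($\delta > 0$), with T, T + $\delta$, and a jitter of -tjitter marked on the clock edges.]
$\delta > 0$ with jitter: Degrades performance, and makes $t_{hold}$ even harder to meet. (The acceptable skew is reduced by jitter.)
$T \ge t_{c-q} + t_{plogic} + t_{su} - \delta + 2t_{jitter}$
$t_{hold} \le t_{cdlogic} + t_{cdreg} - \delta - 2t_{jitter}$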
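Rearranging the hold constraint for the skew gives the largest skew a path can tolerate, skew <= tcdlogic + tcdreg - thold - 2 tjitter, which makes the statement that jitter reduces the acceptable skew concrete. The sketch below assumes that rearrangement; the function names and the picosecond values are illustrative and not taken from the lecture.

```python
def max_skew_for_hold(t_hold, t_cdlogic, t_cdreg, t_jitter):
    """Largest skew that still satisfies
    t_hold <= t_cdlogic + t_cdreg - skew - 2 * t_jitter."""
    return t_cdlogic + t_cdreg - t_hold - 2 * t_jitter


def min_period(t_cq, t_plogic, t_su, skew, t_jitter):
    """Minimum period: T >= t_c-q + t_plogic + t_su - skew + 2 * t_jitter."""
    return t_cq + t_plogic + t_su - skew + 2 * t_jitter


# Illustrative delays in picoseconds (assumed values, not from the slides).
print(max_skew_for_hold(40, 60, 30, t_jitter=0))    # 50
print(max_skew_for_hold(40, 60, 30, t_jitter=10))   # 30
print(min_period(100, 700, 50, skew=30, t_jitter=10))  # 840
```

In this example 10 ps of jitter removes 20 ps from the skew budget and adds 20 ps to the minimum clock period.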
Clock Distribution Networks
Clock skew and jitter can ultimately limit the performance of a digital system, so designing a clock network that minimizes both is important
In many high-speed processors, a majority of the dynamic power is dissipated in the clock network.
To reduce dynamic power, the clock network must support clock gating (shutting down units by disabling their clock)
Clock distribution techniques
Balanced paths (H-tree network, matched RC trees)
- In the ideal case, can eliminate skew
- Could take multiple cycles for the clock signal to propagate to the leaves of the tree
Clock grids
- Typically used in the final stage of the clock distribution network
- Minimizes absolute delay (not relative delay)
H-Tree Clock Network
[Figure: H-tree clock network distributing Clock; a gate combines Clock with an idle condition to produce a gated clock.]
Can insert clock gating at multiple levels in clock tree
Can shut off entire subtree if all gating conditions are satisfied
If the paths are perfectly balanced, clock skew is zero
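The zero-skew property of a perfectly balanced H-tree follows from every leaf sitting at the same wire distance from the root. The sketch below builds an idealized H-tree geometrically and checks that property. The construction (alternating horizontal and vertical branches, halving the segment length after each vertical branch) and all names are assumptions made for illustration; real clock trees also have to match loads and buffers, which this ignores.

```python
def h_tree_leaves(levels, x=0.0, y=0.0, seg=1.0, length=0.0, horizontal=True):
    """Return (x, y, wire_length_from_root) for every leaf of an ideal H-tree.

    Each level splits into two branches of length `seg` along one axis.
    The axis alternates per level, and `seg` halves after each vertical branch.
    """
    if levels == 0:
        return [(x, y, length)]
    leaves = []
    for sign in (-1, 1):
        nx = x + sign * seg if horizontal else x
        ny = y if horizontal else y + sign * seg
        next_seg = seg if horizontal else seg / 2
        leaves += h_tree_leaves(levels - 1, nx, ny, next_seg,
                                length + seg, not horizontal)
    return leaves


leaves = h_tree_leaves(4)
path_lengths = {length for _, _, length in leaves}
print(len(leaves), path_lengths)  # 16 {3.0}
```

All 16 leaves share one root-to-leaf wire length, so with identical wires and loads the clock edge would arrive at every leaf at the same time.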
DEC Alpha 21164 (EV5)
300 MHz clock (9.3 million transistors on a 16.5 x 18.1 mm die in 0.5 micron CMOS technology)
single phase clock
3.75 nF total clock load
Extensive use of dynamic logic
20 W (out of 50) in clock distribution network
Two level clock distribution
Single 6 stage driver at the center of the chip
Secondary buffers drive the left and right sides of the clock grid in m3 and m4
Total equivalent driver size of 58 cm!
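As a rough sanity check on the clock power figure, the dynamic power needed to switch the stated 3.75 nF clock load at 300 MHz can be estimated with P = C * Vdd^2 * f. The supply voltage is not given on the slide; 3.3 V is assumed here, and the function name is hypothetical.

```python
def clock_dynamic_power(c_load_farads, v_dd_volts, f_clk_hz):
    """Dynamic power P = C * Vdd^2 * f for a load that charges and
    discharges once per clock cycle."""
    return c_load_farads * v_dd_volts ** 2 * f_clk_hz


V_DD_ASSUMED = 3.3  # volts; an assumption, the slide does not state the supply

p_load = clock_dynamic_power(3.75e-9, V_DD_ASSUMED, 300e6)
print(round(p_load, 2))  # about 12.25 (watts)
```

Under that assumption the final clock load alone accounts for roughly 12 W. The 20 W quoted above is for the whole clock distribution network, which presumably also includes the drivers and buffers themselves.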
Clock Drivers
Clock Skew in Alpha Processor
Absolute skew smaller than 90 ps
The critical instruction and execution units all see the clock within 65 ps
Dealing with Clock Skew and Jitter
To minimize skew, balance clock paths using H-tree or matched-tree clock distribution structures.
If possible, route data and clock in opposite directions; this eliminates races at the cost of performance.
The use of gated clocks to help with dynamic power consumption makes jitter worse.
Shield clock wires (route power lines, VDD or GND, next to clock lines) to minimize/eliminate coupling with neighboring signal nets.
Use dummy fills to reduce skew by reducing variations in interconnect capacitances due to interlayer dielectric thickness variations.
Beware of temperature and supply rail variations and their effects on skew and jitter. Power supply noise fundamentally limits the performance of clock networks.
Major Components of a Computer
Processor
Control
Datapath
Memory
Devices
Input
Output
Modern processor architecture styles (CSE 431)
Pipelined, single issue (e.g., ARM)
Pipelined, hardware controlled multiple issue - superscalar
Pipelined, software controlled multiple issue - VLIW
Pipelined, multiple issue from multiple process threads - multithreaded
Basic Building Blocks
Datapath
Execution units
- Adder, multiplier, divider, shifter, etc.
Register file and pipeline registers
Multiplexers, decoders
Control
Finite state machines (PLA, ROM, random logic)
Interconnect
Switches, arbiters, buses
Memory
Caches, TLBs, DRAM, buffers
MIPS 5-Stage Pipelined (Single Issue) Datapath
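[Figure: MIPS 5-stage pipelined datapath. Stages: Fetch, Decode, Execute, Memory, WriteBack, separated by the pipeline stage isolation registers IF/Dec, Dec/Exec, Exec/Mem, and Mem/WB. Blocks: PC, I$ (Read Address), Add (PC + 4), Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data, Read Data 1, Read Data 2), Sign Extend (16 to 32 bits), Shift left 2, Add, ALU, D$ (Address, Write Data, Read Data), and 2-input multiplexers. Timing signals: clk, Icache precharge, Dcache precharge, RegWrite.]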
Datapath Bit-Sliced Organization
Tile identical bit-slice elements
[Figure: bit-sliced datapath with slices Bit 0 to Bit 3. Control flow runs across the bit slices and data flow runs along them, from I$ to/from D$. Blocks along each slice include Pipeline Registers, Register File, Multiplexers, Adder, and Shifter.]
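The idea of tiling identical bit-slice elements can be sketched in software with a ripple-carry adder, where one full-adder slice is repeated once per bit and only the carry passes between neighboring slices. This is an illustration under that assumption, with hypothetical function names; it is not the adder design covered in the course.

```python
def full_adder_slice(a, b, carry_in):
    """One bit slice: returns (sum_bit, carry_out) for single-bit inputs."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def bit_sliced_add(a_bits, b_bits):
    """Add two equal-length bit lists (least significant bit first) by
    tiling the same slice once per bit position."""
    carry = 0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bit, carry = full_adder_slice(a, b, carry)
        sum_bits.append(sum_bit)
    return sum_bits, carry


# 4-bit example for slices Bit 0 to Bit 3: 5 + 6 = 11
print(bit_sliced_add([1, 0, 1, 0], [0, 1, 1, 0]))  # ([1, 1, 0, 1], 0)
```

Widening the datapath only means adding more copies of the same slice, which is what makes the bit-sliced layout regular.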
Next Lecture and Reminders
Next lecture
Adder design
- Reading assignment: Rabaey, et al, 11.3
Reminders
Pick up second half of the new edition of the book from Sue in 202 Pond Lab
Project final reports due December 5th
HW4 due today
HW5 due November 19th
Final grading negotiations/correction (except for the final exam) must be concluded by December 10th
Final exam scheduled
- Monday, December 16th from 10:10 to noon in 118 and 121 Thomas