S. Veneziano, 2010, Rome
ASIC design
Available technologies, design flow. Overview of the languages used and of the libraries. Specifications, project partitioning. Simulation and testbenches. Logic synthesis. Design constraints. Static timing analysis, floorplanning and constraints for physical synthesis. Placement. Routing. Extraction of parasitic resistances and capacitances.
Friday, January 22, 2010
Design domains
• A single hierarchy is not sufficient to describe the problem. Three domains are distinguished:
• behavioral domain
• structural domain
• physical domain
[Figure: the three design domains and their levels of abstraction. Behavioral domain: systems, algorithms, register transfers, logic, transfer functions. Structural domain: processors; ALUs, RAM, etc.; gates, flip-flops, etc.; transistors. Physical domain: physical partitions, floorplans, module layout, cell layout, transistor layout.]
Design actions
• A design methodology can be described by describing the transitions between the three domains.
• Transitions between domains, or within a single domain, are of various kinds and rely on different tools for:
• synthesis (correct by construction, or verified)
• verification
• optimization
• design management
[Figure: the behavioral, structural and physical domains with the same levels of abstraction as on the "Design domains" slide.]
Design automation tools
• algorithmic and system design
• structural and logic design
• transistor-level design
• layout design
• verification
• design management
• The CMOS solutions available today differ in the set of design automation tools they use.
[Figure: the behavioral, structural and physical domains, annotated with the tool classes: algorithmic and system design, structural and logic design, transistor-level design, layout design.]
Verification methods
• prototyping
• simulation
• formal verification (specification vs. implementation)
Technologies and design approach
• CMOS solutions available today:
• Full custom
• Standard cell
• Structured ASIC
• Embedded array
• Gate array
• Field-programmable gate array (FPGA)
• They share the technology of the individual cell; what changes is the number of dedicated masks and the design flow.
Full-custom designs
• Wikipedia:
• Full-custom ASIC design defines all the photolithographic layers of the device. Full-custom design is used for both ASIC design and standard-product design.
• The benefits of full-custom design usually include reduced area (and therefore recurring component cost), performance improvements, and the ability to integrate analog components and other pre-designed (and thus fully verified) components, such as microprocessor cores, that form a System-on-Chip.
• The disadvantages of full-custom design can include increased manufacturing and design time, increased non-recurring engineering costs, more complexity in the Computer Aided Design (CAD) system, and a much higher skill requirement on the part of the design team.
• Full-custom design is still used, and necessary, for building standard-library cells, memories, analog blocks, I/O cells and PLLs.
Full-custom designs: tools
• A full-custom design makes use of:
• layout editors (symbolic editors + compactors)
• design rule checkers
• circuit extractors (parasitics)
Design-Rule Checking
Design rules were introduced in Chapter 2 as a set of layout restrictions that ensure the manufactured design will operate as desired with no short or open circuits. A prime requirement of the physical layout of a design is that it adhere to these rules. This can be verified with the aid of a design-rule checker (DRC), which uses as inputs the physical layout of a design and a description of the design rules presented in the form of a technology file. Since a complex circuit can contain millions of polygons that must be checked against each other, efficiency is the most important property of a good DRC tool. The verification of a large chip can take hours or days of computation time. One way of expediting the process is to preserve the design hierarchy at the physical level. For instance, if a cell is used multiple times in a design, it should be checked only once. Besides speeding up the process, the use of hierarchy can make error messages more informative by retaining knowledge of the circuit structure.
DRC tools come in two formats: (1) The on-line DRC runs concurrent with the layout editor and flags design violations during the cell layout. For instance, max has a built-in design-rule checking facility. An example of on-line DRC is shown in Figure 3.4. (2) Batch DRC is used as a post-design verifier, and is run on a complete chip prior to shipping the mask descriptions to the manufacturer.
Circuit Extraction
Another important tool in the custom-design methodology is the circuit extractor, which derives a circuit schematic from a physical layout. By scanning the various layers and their interactions, the extractor reconstructs the transistor network, including the sizes of the devices and the interconnections. The schematic produced can be used to verify that the artwork implements the intended function. Furthermore, the resulting circuit diagram contains precise information on the parasitics, such as the diffusion and wiring capacitances and resistances. This allows for a more accurate simulation and analysis. The complexity of the extraction depends greatly upon the desired information. Most extractors extract the transistor network and the capacitances of the interconnect with respect to GND or other network nodes. Extraction of the wiring resistances already comes at a […]
Figure 3.4 On-line design rule checking. The white dots indicate a design rule violation. The violated rule can be obtained with a simple mouse click. (Message shown in the figure: poly_not_fet to all_diff minimum spacing = 0.14 um.)
[…] wires). The absolute coordinates of these elements are determined automatically by the editor using a compactor [Hsueh79, Weste93]. The compactor translates the design rules into a set of constraints on the component positions, and solves a constrained optimization problem that attempts to minimize the area or another cost function.
An example of a symbolic notation for a circuit topology, called a sticks diagram, is shown in Figure 3.2. The different layout entities are dimensionless, since only positioning is important. The advantage of this approach is that the designer does not have to worry about design rules, because the compactor ensures that the final layout is physically correct. Thus, she can avoid cumbersome polygon manipulations. Another plus of the symbolic approach is that cells can adjust themselves automatically to the environment. For example, automatic pitch-matching of cells is an attractive feature in module generators. Consider the case of Figure 3.3 (from [Croes88]), in which the original cells have different heights, and the terminal positions do not match. Connecting the cells would require extra wiring. The symbolic approach allows the cells to adjust themselves and connect without any overhead.
The disadvantage of the symbolic approach is that the outcome of the compaction phase is often unpredictable. The resulting layout can be less dense than what is obtained with the manual approach. Notwithstanding, symbolic layout tools have improved considerably over the years and are currently a part of the mainstream design process.
Figure 3.2 Sticks representation of CMOS inverter. The numbers represent the (Width/Length)-ratios of the transistors. (Labels in the figure: In, Out, VDD, GND; ratios 1 and 3.)
Figure 3.3 Automatic pitch matching of datapath cells based on symbolic layout. (Panels: BEFORE, AFTER.)
INTRODUCTION: This report contains the lot average results obtained by MOSIS from measurements of MOSIS test structures on each wafer of this fabrication lot. SPICE parameters obtained from similar measurements on a selected wafer are also attached.
VHDL: the std_ulogic type
type std_ulogic is (
    'U',  -- Uninitialized
    'X',  -- Forcing Unknown
    '0',  -- Forcing 0
    '1',  -- Forcing 1
    'Z',  -- High Impedance
    'W',  -- Weak Unknown
    'L',  -- Weak 0
    'H',  -- Weak 1
    '-'   -- Don't care
);
Verilog: excerpt from the Verilog HDL On-line Quick Reference (http://www.sutherland-hdl.com/on-line_ref_guide/vlog_ref_body.html)
x or X    unknown or uninitialized
4.6 Logic Strengths
The Verilog HDL has 8 logic strengths: 4 driving, 3 capacitive, and high impedance (no strength).
Strength Level   Strength Name      Specification Keyword   Display Mnemonic
7                Supply Drive       supply0  supply1        Su0   Su1
6                Strong Drive       strong0  strong1        St0   St1
5                Pull Drive         pull0    pull1          Pu0   Pu1
4                Large Capacitive   large                   La0   La1
3                Weak Drive         weak0    weak1          We0   We1
2                Med. Capacitive    medium                  Me0   Me1
1                Small Capacitive   small                   Sm0   Sm1
0                High Impedance     highz0   highz1         HiZ0  HiZ1
4.7 Literal Integer Numbers
Syntax: size'base value    (sized integer in a specific radix)
size (optional) is the number of bits in the number. Unsized integers default to at least 32 bits.
'base (optional) represents the radix. The default base is decimal.
Base          Symbol    Legal Values
binary        b or B    0, 1, x, X, z, Z, ?, _
octal         o or O    0-7, x, X, z, Z, ?, _
decimal       d or D    0-9, _
hexadecimal   h or H    0-9, a-f, A-F, x, X, z, Z, ?, _
The ? is another way of representing the Z logic value.
An _ (underscore) is ignored (used to enhance readability).
Values are expanded from right to left (lsb to msb).
When size is less than value, the upper bits are truncated.
When size is larger than value, and the left-most bit of value is 0 or 1, zeros are left-extended to fill the size.
When size is larger than value, and the left-most bit of value is Z or X, the Z or X is left-extended to fill the size.
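The sizing rules for literal integers quoted above can be made concrete with a short Python sketch. It is only an illustration of the truncation and left-extension rules, not a Verilog parser: the helper name `expand_literal` is hypothetical, only the binary base is handled, and the result is returned as a bit string with the MSB first.

```python
def expand_literal(size: int, bits: str) -> str:
    """Expand the value part of a binary literal (e.g. the '1' in 4'b1) to `size` bits."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # '_' is ignored, '?' is another spelling of z.
    bits = bits.replace("_", "").replace("?", "z").lower()
    if not bits or set(bits) - set("01xz"):
        raise ValueError("binary values may only contain 0, 1, x, z, ? and _")
    if len(bits) >= size:
        return bits[-size:]          # size smaller than value: upper bits truncated
    # size larger than value: extend with x/z if the left-most bit is x/z, else with 0
    fill = bits[0] if bits[0] in "xz" else "0"
    return fill * (size - len(bits)) + bits


print(expand_literal(4, "1"))      # 4'b1
print(expand_literal(8, "x1"))     # 8'bx1
print(expand_literal(2, "1101"))   # 2'b1101
```
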
Design libraries
• simulation
• synthesis
• place and route
• test-vector generation
Introduction to the languages
• definition, simulation, synthesis, test:
• Verilog, VHDL ...
• scripting languages:
• tcl, perl, awk ...
• circuit description:
• EDIF, SDF ...
Verilog
module counter;
  reg clock;      // Declare a reg data type for the clock.
  integer count;  // Declare an integer data type for the count.
  initial         // Initialize things; this executes once at t=0.
    begin
      clock = 0; count = 0;  // Initialize signals.
      #340 $finish;          // Finish after 340 time ticks.
    end
  /* An always statement to generate the clock; only one statement
     follows the always so we don't need a begin and an end. */
  always
    #10 clock = ~ clock;  // Delay (10ns) is set to half the clock cycle.
  /* An always statement to do the counting; this executes at the same
     time (concurrently) as the preceding always statement. */
  always
    begin
      // Wait here until the clock goes from 1 to 0.
      @ (negedge clock);
      // Now handle the counting.
      if (count == 7)
        count = 0;
      else
        count = count + 1;
      $monitor("time = ", $time, " count = ", count);
    end
endmodule
Examples of library cells
//
// Copyright (C) 2001 Virtual Silicon Technology Inc.. All Rights Reserved.
// "eSilicon", "eSi", "eSi-Route", "eSi-Pad", "eSi-RAM", "eSi-ROM", "eSi-PLL",
// "The Heart of Great Silicon(R)", "Silicon Ready(R)", "IP Ambassador",
// "Design Service Ambassador", and "Virtual Silicon" are trademarks
// of Virtual Silicon Technology Inc.
//
// Virtual Silicon Technology Inc.
// 1200 Crossman Ave Suite 200
// Sunnyvale, CA 94089-1116
// Phone : 408-548-2700
// Fax : 408-548-2750
// Web Site : www.virtual-silicon.com
//
// File Name: DFFPB1.v
// Library Name: umcl18u250t2
// Library Release: 2.1
// Process: eSi/Route-11 UMC L180
// verigen patch-level 1.132-m 02/27/2000 13:49:14
`celldefine
// Positive Edge, D Flip-Flop; Q, QB Outputs
// Q = rising(CK) ? D : 'p';QB = !Q
module DFFPB1 (Q, QB, CK, D);
  output Q;
  output QB;
  input CK;
  input D;
  reg notifier;
  p_ff _i0 (Q, D, CK, notifier);
  not _i1 (QB,Q);
  specify
    (CK => Q) = (1,1);
    (CK => QB) = (1,1);
`ifdef no_tchk
`else
    $hold(posedge CK,negedge D,0,notifier);
    $hold(posedge CK,posedge D,0,notifier);
    $setup(negedge D,posedge CK,0,notifier);
    $setup(posedge D,posedge CK,0,notifier);
    $width(negedge CK,1,0,notifier);
    $width(posedge CK,1,0,notifier);
`endif
  endspecify
endmodule
`endcelldefine
Memory: FIFO Overview
The FIFOs in this category address a broad array of design requirements. FIFOs, which include dual-port RAM memory arrays, are offered for both synchronous and asynchronous interfaces. The memory arrays are offered in two configurations: latch-based to minimize area, and D flip-flop-based to maximize testability. These two configurations also offer flexibility when working under design constraints, such as a requirement that no latches be employed. Flip-flop-based designs employ no clock gating to minimize skew and maximize performance. All FIFOs employ a FIFO RAM controller architecture in which there is no extended "fall-through" time required before reading contents just written.
Also offered are FIFO Controllers without the RAM array. They consist of control and flag logic and an interface to common ASIC dual-port RAMs. Choosing between the two is typically based on the required size of the FIFO. For shallow FIFOs (less than 256 bits), synchronous or asynchronous FIFOs are available which include both memory and control in a single macro. These macros can be programmed via word width, depth, and level (almost-full flag) parameters.
For larger applications (greater than 256 bits), you can use the asynchronous FIFO Controller with a diffused or metal programmable RAM. See Figure 1.
All FIFOs and Controllers support full, empty, and programmable flag logic. Programmable flag logic may be statically or dynamically programmed. When statically programmed, the threshold comparison value is hardwired at synthesis compile time. When dynamically programmed, it may be changed during FIFO operation.
Figure 1: Memory: FIFOs and FIFO Controllers
[Figure contents: (1) Synthetic Designs FIFO RAM Controller, a FIFO controller to be used with a technology-specific, vendor-supplied RAM (diffused or metal programmable, on-chip or off-chip); for large FIFOs (> 256 bits); interfaces to dual-port static RAMs. (2) Synthetic Designs FIFO, a technology-independent FIFO that includes control and memory (controller plus latch- or flip-flop-based RAM).]
entity cma_pll is
  port (
    clk40  : in  STD_LOGIC;  -- Input clock
    bypass : in  STD_LOGIC;  -- pass thru enable
    clr    : in  STD_LOGIC;  -- Reset not
    lock   : out STD_LOGIC;  -- PLL is in lock
    clk    : out STD_LOGIC   -- PLL clock out
  );
end cma_pll;
architecture rtl of cma_pll is
  component PLL40x320x6LM
    port (
      REF    : in  STD_LOGIC;  -- Input clock
      FB     : in  STD_LOGIC;  -- Feedback from a delayed place on the chip
      BYPASS : in  STD_LOGIC;  -- pass thru mode enable
      RESET  : in  STD_LOGIC;  -- Reset not
      LOCK   : out STD_LOGIC;  -- PLL is in lock
      PLLOUT : out STD_LOGIC   -- PLL clock out
    );
  end component;
  signal int_clr : STD_LOGIC;
  signal int_clk : STD_LOGIC;
begin
  clk <= int_clk;
  clear : process(clr)
  begin
    if CLRVAL = '1' then
      int_clr <= clr;
    else
      int_clr <= not clr;
    end if;
  end process;
  pll : PLL40x320x6LM
    port map (
      REF    => clk40,
      FB     => int_clk,
      BYPASS => bypass,
      RESET  => int_clr,
      LOCK   => lock,
      PLLOUT => int_clk);
end rtl;
Designing an ASIC
• Specification document
• "package" of parameters
• also used in the behavioral simulation
--*****************************************************
--* file: cma_common.vhd
--* Package: cma_common
--*
--* basic package for cma, contains main parameters
--* @author : S. Veneziano, R.Vari
--* @version 1.0 : 20001001 : birth
--*****************************************************
--/
--* Frontend grouping
--* in number of strips
constant FE_GROUP : integer := 8;
--* deadtime setting of input signals
--* in number of time slices
constant FE_DEADTIME : integer := 32;
Behavioral model
module fifo(
    dout,       // head of fifo
    full,       // no more space, no shift-in allowed
    half,       // fifo is >=50% full
    quarter,    // fifo is >=25% full
    empty,      // head of fifo is invalid data
    clk,
    res_,
    din,        // data to store
    shiftin,    // store data from din in fifo
    shiftout);  // i've read the head of fifo, show me next
parameter WIDTH = 32;  // bit width
parameter DEPTH = 16;  // depth of fifo
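Since the Verilog listing stops at the parameter declarations, the intended behavior can only be inferred from the port comments. The Python sketch below is one possible reading of those comments, not the author's model: the class name `FifoModel`, the `clock` method, and the choice to evaluate `full` before a simultaneous shift-out are all assumptions.

```python
from collections import deque


class FifoModel:
    """Behavioral FIFO model with the flags named in the port list (assumed semantics)."""

    def __init__(self, width=32, depth=16):
        self.width, self.depth = width, depth
        self.data = deque()

    @property
    def empty(self):      # head of fifo is invalid data
        return len(self.data) == 0

    @property
    def full(self):       # no more space, no shift-in allowed
        return len(self.data) == self.depth

    @property
    def half(self):       # fifo is >= 50% full
        return 2 * len(self.data) >= self.depth

    @property
    def quarter(self):    # fifo is >= 25% full
        return 4 * len(self.data) >= self.depth

    @property
    def dout(self):       # head of fifo, None while empty
        return self.data[0] if self.data else None

    def clock(self, din=0, shiftin=False, shiftout=False):
        """Model one active clock edge."""
        can_write = not self.full            # assumption: flag sampled before the edge
        if shiftout and not self.empty:
            self.data.popleft()
        if shiftin and can_write:
            self.data.append(din & ((1 << self.width) - 1))


fifo = FifoModel(width=8, depth=4)
fifo.clock(din=0x1AB, shiftin=True)
print(hex(fifo.dout), fifo.quarter, fifo.half)
```

A model of this kind can serve as the reference against which the output of the synthesized device is compared in a testbench.
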
A register-transfer machine has combinational logic connecting registers:
[Figure: registers (D/Q flip-flops) connected through blocks of combinational logic.]
Parameterized blocks
• To make the code reusable within the same project or in other projects
ENTITY in_reg_gen IS
  GENERIC(WIDTH : integer);
  PORT(
    a         : IN  std_logic_vector(WIDTH-1 DOWNTO 0);
    serout    : OUT std_logic;
    clk40     : IN  std_logic;
    clken_I2C : IN  std_logic;
    clr       : IN  std_logic;
    sel       : IN  std_logic;
    serin     : IN  std_logic;
    shift     : IN  std_logic
  );
END in_reg_gen;
ARCHITECTURE rtl of in_reg_gen IS
  signal q : std_logic_vector(WIDTH-1 downto 0);
BEGIN
  serout <= q(q'HIGH);
  reg: process (clk40, clr)
  begin
    if (clr = CLRVAL) then
      q <= (others => '0');
    elsif rising_edge(clk40) then
      if (clken_I2C = CLKENVAL) then
        if (sel = '1' AND shift = '1') then
          q <= q(WIDTH-2 downto 0) & serin;
        else
          q <= a;
        end if;
      end if;
    end if;
  end process reg;
END rtl;
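To check one's reading of the `in_reg_gen` block, its register behavior can be mirrored in a few lines of Python. This is a sketch under assumptions: the class name `InRegGen` is hypothetical, and because the polarities `CLRVAL` and `CLKENVAL` are defined elsewhere in the project, the Python booleans simply mean "asserted".

```python
class InRegGen:
    """Parallel-load / serial-shift register of parameterized width."""

    def __init__(self, width):
        self.width = width
        self.q = 0

    @property
    def serout(self):
        # serout <= q(q'HIGH): the most significant bit
        return (self.q >> (self.width - 1)) & 1

    def clear(self):
        # asynchronous clear (clr = CLRVAL)
        self.q = 0

    def rising_edge(self, a, clken, sel, shift, serin):
        if not clken:                       # clock enable not asserted: hold
            return
        mask = (1 << self.width) - 1
        if sel and shift:
            # q <= q(WIDTH-2 downto 0) & serin: shift left, serin enters at the LSB
            self.q = ((self.q << 1) | (serin & 1)) & mask
        else:
            self.q = a & mask               # parallel load


reg = InRegGen(width=4)
reg.rising_edge(a=0b1010, clken=True, sel=False, shift=False, serin=0)
print(bin(reg.q), reg.serout)
```

The same class works for any width, which is the point of the VHDL generic.
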
Simulation
• At least two types can be distinguished:
• Compiler-driven
• Event-driven
Unit-delay
• All gate delays are equal to one time unit.
• Useful for observing the evolution over time (glitches).
[Figure: waveforms of nodes n1 to n9, switching between '0' and '1', over time steps 0 to 4.]
for (t ← tstart; t ≤ tend; t ← t + 1) {
    new[1] ← A;
    new[2] ← B;
    new[3] ← C;
    new[4] ← D;
    new[5] ← E;
    new[6] ← OR(old[1], old[2]);
    new[7] ← AND(old[4], old[5]);
    new[8] ← AND(old[6], old[3]);
    new[9] ← OR(old[7], old[8]);
    F ← new[9];
    old ← new;
}
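The unit-delay loop above translates almost line by line into Python. The sketch below is only an illustration: the function name `simulate`, the per-step stimulus list, and the all-zero initial node values are assumptions that the slide does not specify.

```python
def simulate(stimulus):
    """Unit-delay simulation of F = OR(AND(D, E), AND(OR(A, B), C)).

    `stimulus` is a list of (A, B, C, D, E) tuples, one per time step.
    Returns the value of F at each time step.
    """
    old = [0] * 10                 # nodes n1..n9 (index 0 unused), assumed to start at 0
    trace = []
    for a, b, c, d, e in stimulus:
        new = [0] * 10
        new[1], new[2], new[3], new[4], new[5] = a, b, c, d, e
        new[6] = old[1] | old[2]   # every gate sees the values of the previous step
        new[7] = old[4] & old[5]
        new[8] = old[6] & old[3]
        new[9] = old[7] | old[8]
        trace.append(new[9])
        old = new
    return trace


# A = C = 1 held constant: F rises after the input step plus three gate delays.
print(simulate([(1, 0, 1, 0, 0)] * 5))   # [0, 0, 0, 1, 1]
```

Every gate is evaluated at every time step, whether or not its inputs changed; this is the cost that event-driven simulation avoids.
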
Event-driven simulation
• Used in gate-level simulators.
• An event is a change in the level of a signal at a given time.
• An event can cause the level of other signals to change.
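A minimal event-driven simulator can be sketched in Python with a time-ordered event queue. Everything specific here is an assumption made for illustration: the netlist `GATES` (a small circuit computing F = OR(AND(D, E), AND(OR(A, B), C))), the unit gate delays, the all-zero initial state and the function name `simulate`.

```python
import heapq
from itertools import count

# Hypothetical netlist: output signal -> (function, input signals, delay)
GATES = {
    "n6": (lambda a, b: a | b, ("A", "B"), 1),
    "n7": (lambda a, b: a & b, ("D", "E"), 1),
    "n8": (lambda a, b: a & b, ("n6", "C"), 1),
    "F": (lambda a, b: a | b, ("n7", "n8"), 1),
}


def simulate(events, t_end):
    """Process (time, signal, value) input events; return the list of level changes."""
    values = {s: 0 for s in ("A", "B", "C", "D", "E", *GATES)}
    fanout = {}
    for out, (_, ins, _) in GATES.items():
        for sig in ins:
            fanout.setdefault(sig, []).append(out)
    seq = count()                      # tie-breaker: keep scheduling order at equal times
    queue = [(t, next(seq), sig, val) for t, sig, val in events]
    heapq.heapify(queue)
    log = []
    while queue:
        t, _, sig, val = heapq.heappop(queue)
        if t > t_end:
            break
        if values[sig] == val:
            continue                   # no level change, so not an event
        values[sig] = val
        log.append((t, sig, val))
        for out in fanout.get(sig, []):   # an event may trigger events on other signals
            fn, ins, delay = GATES[out]
            new_val = fn(*(values[i] for i in ins))
            heapq.heappush(queue, (t + delay, next(seq), out, new_val))
    return log


for change in simulate([(0, "A", 1), (0, "C", 1)], t_end=10):
    print(change)
```

Only the gates in the fan-out of a changed signal are re-evaluated, so quiet parts of the circuit cost nothing.
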
[…] simulation also contains a behavioral model of the device. The output of the behavioral model is compared with that of the device.
C001 -- CMID 0 FEL1ID 1
85A2 -- FEBCID 1442
1A1F -- BC 3 TIME 2 IJK 0 STRIP 31
1A3F -- BC 3 TIME 2 IJK 1 STRIP 31
195C -- BC 3 TIME 1 IJK 2 STRIP 28
199C -- BC 3 TIME 1 IJK 4 STRIP 28
1ADF -- BC 3 TIME 2 IJK 6 STRIP 31
1AEB -- BC 3 TIME 2 OVL 2 THR 3
4025 -- CODE 0 CRC 25
C002 -- CMID 0 FEL1ID 2
8620 -- FEBCID 1568
1A18 -- BC 3 TIME 2 IJK 0 STRIP 24
1A38 -- BC 3 TIME 2 IJK 1 STRIP 24
1955 -- BC 3 TIME 1 IJK 2 STRIP 21
1995 -- BC 3 TIME 1 IJK 4 STRIP 21
1AD8 -- BC 3 TIME 2 IJK 6 STRIP 24
1AEB -- BC 3 TIME 2 OVL 2 THR 3
408B -- CODE 0 CRC 8B
C003 -- CMID 0 FEL1ID 3
869E -- FEBCID 1694
1961 -- BC 3 TIME 1 IJK 3 STRIP 1
19A1 -- BC 3 TIME 1 IJK 5 STRIP 1
406C -- CODE 0 CRC 6C
(The slide shows this listing twice; the two copies are identical.)
Synthesis
Traditional flow: the link between synthesis and layout is discussed later.
Logic synthesis
Source: Algorithms for VLSI Design Automation, "Logic Synthesis and Verification", July 19, 1999.
Logic synthesis:
* Starts from a register-transfer level (RTL) description, given in e.g. VHDL or given as a set of Boolean expressions.
* Three different tasks: two-level combinational synthesis, multilevel combinational synthesis and sequential synthesis.
* Outputs a standard-cell netlist or some other form of realization such as a PLA.
Verification:
* Checks the equivalence of a specification and an implementation.
VHDL example
library ieee;
use ieee.std_logic_1164.all;
entity example is
  port (x1, x2, x3, x4, x5: in std_logic;
        y1, y2: out std_logic);
end example;
architecture behavioral of example is
begin
  react: process (x1, x2, x3, x4, x5)
  begin
    if x1 = '1' and x2 = '0'
    then
      y1 <= x3 and x4;
      y2 <= x3 or x4;
    elsif x2 = '1'
    then
      y1 <= not (x3 and (x4 or x5));
      y2 <= '-';
    else
      y1 <= '-';
      y2 <= '0';
    end if;
  end process react;
end behavioral;
High-level synthesis
Source: Algorithms for VLSI Design Automation, "High-Level Synthesis", June 10, 1999.
VHDL synthesis:
* Starts from a register-transfer level (RTL) description; circuit behavior in each clock cycle is fixed.
* Uses logic synthesis techniques to optimize the design.
* Generates a standard-cell netlist.
High-level synthesis (also called architectural synthesis):
* Starts from an abstract behavioral description.
* Generates an RTL description.
[Figure: the behavioral, structural and physical domains with the same levels of abstraction as on the "Design domains" slide.]
Hardware models
• functional units
• registers
• (de)multiplexers
• buses
[Figure, panels (a) to (e): a functional unit f with inputs i1, i2 and output o ← f(i1, i2); a multiplexer with data inputs i0 to i3 and control inputs c1, c0, where o ← i_k with k = 2·c1 + c0; a driver with input i, output o and an enable signal; a bus.]
Constraints for high-level synthesis
• In addition to the standard synthesis and timing constraints, there are constraints on the types of functional units that may be used and on whether pipelining, retiming, multicycle operations, etc. are allowed.
• More generally, the clocking strategy and the choice of connectivity (bus vs. mux) determine the architecture.
Adder example
Source: Synopsys, Inc., DesignWare IP Family, Synthesizable IP, December 12, 2005.
DW01_add: Adder
• Parameterized word length
• Carry-in and carry-out signals
• Module Compiler Architectures
Table 1: Pin Description
Pin Name   Width          Direction   Function
A          width bit(s)   Input       Input data
B          width bit(s)   Input       Input data
CI         1 bit          Input       Carry-in
SUM        width bit(s)   Output      Sum of (A + B + CI)
CO         1 bit          Output      Carry-out
Synthesis implementations
Implementation Name   Function                                  License Feature Required
rpl                   Ripple-carry synthesis model              none
cla                   Carry-look-ahead synthesis model          none
clf                   Fast carry-look-ahead synthesis model     DesignWare
bk                    Brent-Kung architecture synthesis model   DesignWare
a. During synthesis, Design Compiler will select the appropriate architecture for your constraints. However, you may force Design Compiler to use one of the architectures described in this table. For more details, please refer to the DesignWare Building Block IP User Guide.
b. The performance of the csm implementation is heavily dependent on the use of a high-performance inverting 2-to-1 multiplexer in the technology library. In such libraries, the csm implementation exhibits a superior area-delay product. Although the csm implementation does not always surpass the delay performance of the clf implementation, it is much lower in area.
c. This architecture is specially generated using Module Compiler technology. It is normally used as a replacement for, rather than in conjunction with, the HDL architectures available for the same DesignWare part. To use this architecture during synthesis, the dc_shell-t variable 'dw_prefer_mc_inside' must be set to 'true'. From the DC 2004.12 release onward, the MC architectures are not available by default. For more information, refer to the DesignWare Building Block IP User Guide.
d. This delay-optimized parallel-prefix architecture is generated using Datapath generator technology DW "gensh". This is ON by default in the Design Compiler flow. The DC variable 'synlib_enable_dpgen' must be set to 'true' (the default) to make use of this Datapath technology.
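All the implementations listed for the adder compute the same function, SUM and CO from A + B + CI, and differ only in architecture. As an illustration of the simplest one, the Python sketch below models a ripple-carry adder bit by bit. It is not Synopsys code; the function name `ripple_carry_add` and its argument order are assumptions.

```python
def ripple_carry_add(a, b, ci, width):
    """Return (SUM, CO) of A + B + CI for `width`-bit operands, one full adder per bit."""
    carry = ci & 1
    total = 0
    for i in range(width):
        abit = (a >> i) & 1
        bbit = (b >> i) & 1
        total |= (abit ^ bbit ^ carry) << i                    # sum bit of the full adder
        carry = (abit & bbit) | (abit & carry) | (bbit & carry)  # carry ripples to next bit
    return total, carry


print(ripple_carry_add(0xFF, 0x01, 0, 8))   # (0, 1)
print(ripple_carry_add(3, 4, 1, 4))         # (8, 0)
```

The carry chain runs through every bit position, which is why the delay of this architecture grows linearly with the word length, while the carry-look-ahead and parallel-prefix variants trade area for a shorter carry path.
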
High-level synthesis example
// master's course
// example code for high-level synthesis
//
module addera (clk,a,b,d,e,c);
input clk;
input [7:0] a,b;
input [8:0] d,e;
output [15:0] c;
reg [15:0] c;
reg [8:0] banka, bankb;
always @(posedge clk)
begin
  banka <= a + b;
  bankb <= d - banka;
  c <= bankb * e;
end
endmodule
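Because the three assignments in `addera` are non-blocking, each right-hand side uses the register values from before the clock edge, so the module behaves as a three-stage pipeline. The Python sketch below mirrors that behavior; the class name `AdderA` is hypothetical, and starting all registers at 0 is an assumption (in Verilog they would be unknown until written).

```python
class AdderA:
    """Cycle-level model of the addera module."""

    def __init__(self):
        self.banka = self.bankb = self.c = 0   # assumed initial values

    def posedge(self, a, b, d, e):
        # Compute every next value from the OLD register contents, then update together,
        # which is what non-blocking assignment means.
        banka = (a + b) & 0x1FF               # 9-bit register keeps the carry of a + b
        bankb = (d - self.banka) & 0x1FF      # 9-bit wrap-around subtraction
        c = (self.bankb * e) & 0xFFFF         # 16-bit product register
        self.banka, self.bankb, self.c = banka, bankb, c


m = AdderA()
m.posedge(a=3, b=4, d=10, e=2)    # banka = 7
m.posedge(a=0, b=0, d=10, e=2)    # bankb = 10 - 7 = 3
m.posedge(a=0, b=0, d=0, e=5)     # c = 3 * 5 = 15
print(m.c)
```

The schedule (one operation per clock cycle) is fixed by the way the RTL is written; a high-level synthesis tool starting from a behavioral description would be free to choose it.
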
High-level synthesis example 2
Mapping
• ...
Floorplan
• Each block can be placed manually by defining its geometry (fixed area).
Hierarchy and connectivity
Floorplan
• Output: floorplan file (DEF), "hard" placement of I/O cells and macrocells, "soft" placement of blocks of standard cells.
Macrocell placement
• The power distribution network is also created in this phase (IR analysis after placement).
Placement
• The problem of automatically assigning correct positions to the cells on the chip while optimizing a cost function.
• Different placement problems arise for:
• standard cells;
• building blocks;
• cells and building blocks together.
• It can be an integral part of the synthesis process (physical synthesis).
Circuit representation
• Example
[Figure: example circuit with inputs S and R, two outputs labeled Q, and gates g1 and g2.]
Cell-port-net model
Source: Algorithms for VLSI Design Automation, "Placement and Partitioning", July 22, 1999.
[Figure: the example circuit with inputs S and R, two outputs labeled Q, gates g1 and g2, and nets n1 to n4.]
struct cell {
    struct cell_master *cell_type;  /* Access to cell type, e.g. NAND gate and other generic properties */
    char id[ ];                     /* A string that uniquely identifies the cell, e.g. g1 */
    set of struct port in_ports, out_ports;
};
struct port {
    struct port_master *port_type;  /* Access to generic port information */
    char id[ ];                     /* Unique identification */
    struct cell *parent_cell;       /* To which cell does this port belong? */
    struct net *connected_net;      /* To which net is this port connected? */
};
struct net {
    char id[ ];                        /* Unique identification */
    set of struct port joined_ports;   /* Ports connected by the net */
};
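The cell-port-net model above maps naturally onto small Python classes. The sketch below builds the two-gate example circuit with them. It is an illustration only: the helper `add_port`, the function `build_latch`, the gate type "NAND" and the assignment of nets n1 to n4 to the signals S, R and the two outputs are all assumptions, since the slide does not spell out the connectivity.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Net:
    id: str
    joined_ports: list = field(default_factory=list)   # ports connected by the net


@dataclass(eq=False)
class Cell:
    id: str
    cell_type: str                                      # stands in for the cell master
    in_ports: list = field(default_factory=list)
    out_ports: list = field(default_factory=list)


@dataclass(eq=False)
class Port:
    id: str
    parent_cell: Cell
    connected_net: Net


def add_port(cell, name, net, output=False):
    """Create a port on `cell`, attach it to `net`, and register it on both sides."""
    port = Port(f"{cell.id}.{name}", cell, net)
    (cell.out_ports if output else cell.in_ports).append(port)
    net.joined_ports.append(port)
    return port


def build_latch():
    nets = {name: Net(name) for name in ("n1", "n2", "n3", "n4")}
    g1, g2 = Cell("g1", "NAND"), Cell("g2", "NAND")
    add_port(g1, "a", nets["n1"])                 # assumed: n1 carries S
    add_port(g1, "b", nets["n4"])                 # feedback from g2
    add_port(g1, "y", nets["n3"], output=True)    # assumed: n3 is one output
    add_port(g2, "a", nets["n2"])                 # assumed: n2 carries R
    add_port(g2, "b", nets["n3"])                 # feedback from g1
    add_port(g2, "y", nets["n4"], output=True)    # assumed: n4 is the other output
    return [g1, g2], nets


cells, nets = build_latch()
print(sorted(p.parent_cell.id for p in nets["n3"].joined_ports))   # ['g1', 'g2']
```

Placement and routing algorithms traverse exactly these links: from a cell to its ports, from a port to its net, and from a net back to all the cells it joins.
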
Standard cell
• Unconstrained placement.
[Figure: two standard cells, CELL 1 and CELL 2, with the VDD, GND and CLK lines.]
Relation to routing
• Ideally, placement and routing should be done simultaneously, because each depends on the results of the other. At present it is not possible to perform detailed routing and placement with the same tool.
• In practice, placement is carried out in a preliminary phase. Wire lengths are estimated using a metric (minimum spanning tree, Steiner tree).
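As an illustration of the wire-length estimate mentioned above, the following Python sketch computes the length of a minimum spanning tree over the pins of one net using Prim's algorithm. The Manhattan distance metric and the function name `mst_length` are assumptions; the slide names only the metric families.

```python
def mst_length(pins):
    """Length of a minimum spanning tree over (x, y) pin positions, Manhattan metric."""
    if len(pins) < 2:
        return 0

    def dist(p, q):
        return abs(p[0] - q[0]) + abs(p[1] - q[1])

    # best[i] = distance from pin i to the closest pin already in the tree
    best = {i: dist(pins[i], pins[0]) for i in range(1, len(pins))}
    total = 0
    while best:
        j = min(best, key=best.get)        # closest pin not yet connected
        total += best.pop(j)
        for i in best:                     # pin j may now offer a shorter connection
            best[i] = min(best[i], dist(pins[i], pins[j]))
    return total


print(mst_length([(0, 0), (4, 0), (0, 3)]))   # 7
print(mst_length([(0, 0), (2, 0), (1, 1)]))   # 4
```

For the second net a rectilinear Steiner tree with an extra branching point at (1, 0) has length 3, so the spanning tree is an upper bound that is cheap to compute, while the Steiner tree is closer to what a router can achieve.
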
Placement and routing estimation
Relation to synthesis
• wire length estimation
• from floorplan to placement
Routing
• Local routing is the process of determining the structure that connects a set of terminals within a given routing area.
• Local routing differs from global routing, which is devoted to finding the routing areas in which the connections will be made. Global routing does not fix the connection structures inside the routing areas.
Characterization
• Local routing is characterized by a number of parameters (each set of values defines a type of problem):
• number of routing layers
• orientation of the segments on each layer: horizontal, vertical, other
• grid present or absent
• presence or absence of obstacles in the routing area
• fixed or extensible routing areas
• position of the terminals: on two parallel lines, on a rectangle, on an arbitrary area
• terminals with fixed or movable position
Routing inside a standard-cell block
• A routing-density grid; it is also used during placement to identify congestion problems.
[Figure: routing grids (a) and (b), with columns numbered 0 to 6 (= n) and rows numbered 0 to 5 (= m).]
Physical synthesis
• It proceeds in phases (simplifying):
• a first synthesis pass, without any physical information, which generates the initial netlist;
• placement of the standard cells, using a floorplan (die size, I/O, macrocells) and guided by the synthesis constraints (timing-driven);
• iteration over the following two steps:
• approximate routing and estimation of the RC of each net;
• optimization of the most critical nets, those violating the timing/area constraints, through logic optimization and re-placement of the affected cells.
Physical synthesis output
• Netlist and placement of the circuit; detailed routing is still missing.
Link between synthesis and detailed routing
• Physical synthesis uses a router to estimate the RC values of each connection.
• The estimated values can be compared with those obtained from the detailed router, and calibration constants can be applied to bring the two sets of results closer together.
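The slides do not say how the calibration constants are derived. One simple possibility, shown here purely as an assumption, is a least-squares linear fit between the values estimated during synthesis and those extracted after detailed routing, giving a scale factor and an offset. The function name `fit_calibration` and the sample numbers are hypothetical.

```python
def fit_calibration(estimated, extracted):
    """Least-squares fit of extracted ≈ scale * estimated + offset.

    Both arguments are equal-length lists of per-net values (e.g. capacitance).
    Returns (scale, offset).
    """
    n = len(estimated)
    mean_x = sum(estimated) / n
    mean_y = sum(extracted) / n
    sxx = sum((x - mean_x) ** 2 for x in estimated)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(estimated, extracted))
    scale = sxy / sxx
    return scale, mean_y - scale * mean_x


# Hypothetical per-net values in arbitrary units.
estimated = [1.0, 2.0, 3.0, 4.0]
extracted = [2.5, 4.5, 6.5, 8.5]
scale, offset = fit_calibration(estimated, extracted)
print(scale, offset)
```

Applying the fitted scale and offset to the synthesis-time estimates removes a systematic offset; it does not narrow the spread caused by net-by-net routing differences.
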
RC analysis
• Problem: the estimate made by the synthesis tool is inaccurate, because the tool only estimates the routing and does not perform detailed routing; the result is an offset and a broadened distribution.
• Solution: introduce calibration constants.
• Small congestion problems are resolved during detailed routing.
Example
• Example: 6LM, 0.18 um
S.Veneziano, gennaio 2010, Roma
Post-route parasitics
• tools 3-D di estrazione
DATASHEET
FIRE & ICE QXTFULL-CHIP 3-D EXTRACTION
Fire & Ice® QXT is a fast, 3-D accurate transistor-level extractor with digital full-chip
capacity. It integrates seamlessly with GDSII and SPICE flows — and with Calibre
through a HCCI interface. Fire & Ice QXT generates parasitic data files in DSPF
and SPICE formats for sign-off timing, power, or signal integrity analysis with
VoltageStorm™ power analysis and CeltIC™ crosstalk analysis.
Figure 1: Fire & Ice QXT uses a halo to account for all near-body and multi-level interconnect capacitive effects, including the impact of crossover fringe, corners, and capacitive shading
KEY BENEFITS
• Transistor-level extraction for final tapeout verification:
– Extracts data for power grid sign-off with VoltageStorm
– Device-level and mixed mode extraction
– Integrated with GDSII, LEF/DEF and DEF/GDSII inputs and DSPF, SPICE output
SILICON-VALIDATED ACCURACY
Fire & Ice QXT incorporates an enhanced version of the production-proven 3-D adaptive analytical extraction modeling technology, which enables distributed and coupled RC extraction faster than ever before. A suite of analytical models is created once per process.
During extraction, parameters are generated based on very specific 3-D regions, then the parameters are passed to the analytical models for capacitance calculation. The models employ a special influence region, known as a “dynamic halo.” The halo accounts for all near-body and multi-level interconnect capacitive effects, including the impact of crossover fringe, corners, and capacitive shading (see Figure 1). Coupling capacitance plays a dominant role in determining the performance of UDSM designs, so 3-D
In contrast to less accurate 2-D, 2-D + 2-D, or “Quasi 3-D” methods, 3-D adaptive analytical extraction modeling identifies in three dimensions all the interacting objects within the context of the conductor being evaluated. Because 3-D adaptive analytical extraction models do not rely on pattern-matching techniques, they do not suffer from boundary errors. 3-D adaptive analytical extraction models scale with shrinking processes and increasing design sizes and can be used across a broad spectrum of design styles, without “tuning” for each design.
Cadence foundry partners have validated the accuracy of 3-D adaptive analytical extraction in actual silicon, so users can have confidence that they will meet their timing budgets and design goals. Through the Cadence Foundry Partners program, these IceCaps models are pre-validated for six of the world’s leading foundries: Chartered Semiconductor, IBM, NEC, Toshiba, TSMC, and UMC. Models are available directly from the foundry. Or you can build your own models for any process technology using the built-in, fully automated IceCaps model generation capability in Fire & Ice QXT, which makes process transitions fast and easy (see Figure 2).
3-D ACCURATE EXTRACTION
Fire & Ice QXT promotes greater accuracy in timing closure and reduces the number of layout iterations to achieve targeted design performance. It also eliminates the need for excessive over-design in order to meet timing goals. Fire & Ice QXT provides a simplified set-up process and single-step execution that give designers the fast turnaround and ease of use needed for transistor-level ASIC design flows.
PERFORMANCE ON 300 MHZ CPU
Fire & Ice QXT is a transistor-level extraction product that provides overnight, full-chip, 3-D-accurate, transistor-level extraction for final tapeout verification. It takes advantage of multiple-CPU processing, by sectioning a design into stripes, then running the resistance and capacitance engines on the stripes before re-combining the data to efficiently extract multi-million transistor chips. It also extracts the data required by VoltageStorm for power grid sign-off, an essential sign-off requirement for UDSM designs.
INTEGRATION WITH STANDARD DESIGN FLOWS
• Fire & Ice QXT includes SPICE and GDSII readers that enable easy integration with physical design tools
• Produces DSPF and SPICE outputs
• Calibre users can enter design data through the HCCI interface (see Figure 2)
• Cadence works closely with EDA partners like Synopsys, Mentor, and Magma to provide support for common design flows
PLATFORM SUPPORT
Fire & Ice runs on standard UNIX workstations from Sun Microsystems and Hewlett-Packard.
• OS Support: 32- and 64-bit
• LINUX: Red Hat 7.2 (32-bit)
• Solaris 7 and 8
• HP-UX 11
SYSTEM REQUIREMENTS
System requirements will vary depending on your circuit size. Here are some general guidelines:
MINIMUM CONFIGURATION
DRAM: 512 Mb
Swap space: 2 Gb
Disk space for software: 50 Mb
Disk space for 1 million-gate design: 2 Gb
RECOMMENDED CONFIGURATION
DRAM: 1 Gb
Swap space: 4 Gb
Disk space for software: 50 Mb
Disk space for 1 million-gate design: 4 Gb
FOR MORE INFORMATION
Log on to www.cadence.com or email us at [email protected]
[Figure 2: Fire & Ice QXT dataflow. Foundry-specific process information is fed into IceCaps to create a technology file. LibGen reads in LEF and GDSII design data to create Fire & Ice QXT cell library information. Interconnect data can be input via a hierarchical Calibre database or annotated GDSII. These are the only three inputs to Fire & Ice QXT. Diagram labels: PROCESS DATA, CELL DATA, DESIGN DATA; IceCaps, LibGen, XTC; outputs DSPF and SPICE. HCCI: Hierarchical Calibre Connectivity Interface.]
Chapter 7: Parasitic Back-Annotation
Annotating Detailed Parasitics
You can annotate detailed parasitics into PrimeTime and annotate
each physical segment of the routed netlist in the form of resistance
and capacitance (see Figure 7-3). Annotating detailed parasitics is
very accurate but more time-consuming than annotating lumped
parasitics. Because of the potential complexity of the RC network,
PrimeTime takes longer to calculate the pin-to-pin delays in the
netlist.
This RC network is used to compute effective capacitance
(Ceffective), slew, and delays at each subnode of the net.
PrimeTime can read detailed RC in DSPF and SPEF formats.
Figure 7-3 Detailed RC
You can use this model for netlists that have critical timing delays,
such as clock trees. This model can produce more accurate results,
especially in deep submicron designs where net delays are more
significant compared to cell delays.
The detailed RC network supports meshes, as shown in Figure 7-4.
[Figure: detailed RC network; resistors R1 to R6, capacitors C1 to C7]
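To give a feel for how a detailed RC network turns into a delay at each subnode, the sketch below computes the Elmore delay on an RC tree. Elmore delay is a classical first-order approximation and is used here only as an illustration; it is not claimed to be the algorithm PrimeTime uses, and it applies to trees, not to meshes. The node names and component values are made up.

```python
from typing import Dict, Optional, Tuple

# RC tree description (an assumption of this sketch):
# node -> (parent, resistance to parent in ohm, grounded capacitance in farad).
# The driver is the root and has parent None.
Tree = Dict[str, Tuple[Optional[str], float, float]]

def downstream_cap(tree: Tree) -> Dict[str, float]:
    """Total capacitance at and below each node."""
    children: Dict[str, list] = {}
    for node, (parent, _, _) in tree.items():
        if parent is not None:
            children.setdefault(parent, []).append(node)

    cap: Dict[str, float] = {}

    def total(node: str) -> float:
        cap[node] = tree[node][2] + sum(total(ch) for ch in children.get(node, []))
        return cap[node]

    for node, (parent, _, _) in tree.items():
        if parent is None:
            total(node)
    return cap

def elmore_delay(tree: Tree, node: str) -> float:
    """Sum, over each resistor on the path from the root to the node,
    of that resistance times the capacitance downstream of it."""
    cap = downstream_cap(tree)
    delay = 0.0
    while tree[node][0] is not None:
        parent, resistance, _ = tree[node]
        delay += resistance * cap[node]
        node = parent
    return delay

# Made-up net: one driver, one branch point, two sinks.
net: Tree = {
    "drv": (None, 0.0, 0.0),
    "n1": ("drv", 100.0, 10e-15),
    "n2": ("n1", 200.0, 20e-15),
    "n3": ("n1", 50.0, 5e-15),
}
delay_n2 = elmore_delay(net, "n2")
delay_n3 = elmore_delay(net, "n3")
```

The two sinks see different delays even though they share the first segment, which is the kind of per-pin detail a lumped parasitic annotation cannot express.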
Test vector generation
• fault-analysis tools are used.