Interconnect Delay Aware RTL Verilog Bus Architecture Generation for an SoC
Kyeong Ryu, Alexandru Talpasanu, Vincent Mooney and Jeffrey Davis
School of Electrical and Computer Engineering, Georgia Institute of Technology
August 2004
Outline
• Introduction
• Interconnect Delay Estimation
• Interconnect Aware Module Generation
• BusSynth Overview
• Application Example
• Conclusion
Introduction
• A methodology to generate a custom bus architecture using accurate estimations of interconnect delay
– Easy and quick design of an SoC bus system
– Fast design space exploration across performance-influencing factors
– Development of a bus synthesis tool (BusSynth)
– Register-transfer level HDL output based on user options and interconnect delay
[Figure: user options as input to the Bus Synthesis Tool (BusSynth)]
Related Work
• Shin et al. (’04), “Fast Exploration of Parameterized Bus Architecture for Communication-Centric SoC Design” [5]
– A single type of bus topology
• Thepayasuwan et al. (’04), “Layout Conscious Bus Architecture Synthesis for Deep Submicron Systems on Chip” [6]
– A single type of bus topology
• BusSynth
– A variety of bus types, including multiple and heterogeneous types
– Interconnect delay aware bus generation
Bus Synthesis (BusSynth) Overview
[Figure: BusSynth overview. The bus generation tool (BusSynth) takes user options (input) and libraries, together with floorplan design and interconnect delay estimation, and produces synthesizable Verilog HDL code.]
Interconnect Length Estimation
[Figure: (a) Estimated floorplan with four MPC755 processing elements (PE1 to PE4) and SRAM; (b) interconnect length estimation. Legend: Memory Bus Interface (MBI), Bus Arbiter, Bus Interconnect, CPU Bus Interface (CBI).]
** TSMC 0.25 µm Design Rules
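As a rough illustration of estimating bus interconnect lengths from a floorplan, the sketch below measures Manhattan (rectilinear) distances between block connection points. The coordinates, the block names used as keys, and the Manhattan routing assumption are all hypothetical; the slides do not state how the lengths are actually measured.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) position in micrometres


def manhattan_length(a: Point, b: Point) -> float:
    """Rectilinear wire length between two points (assumed routing style)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def lengths_to(target: str, pins: Dict[str, Point]) -> Dict[str, float]:
    """Length from every block except `target` to the target block."""
    return {
        name: manhattan_length(point, pins[target])
        for name, point in pins.items()
        if name != target
    }


# Hypothetical connection points on an estimated floorplan (not from the slides).
pins: Dict[str, Point] = {
    "PE1": (500.0, 500.0),
    "PE2": (3500.0, 500.0),
    "PE3": (500.0, 5500.0),
    "PE4": (4500.0, 6500.0),
    "SRAM": (2000.0, 3000.0),
}

lengths = lengths_to("SRAM", pins)
for pe, length_um in sorted(lengths.items()):
    print(f"{pe} -> SRAM: {length_um:.0f} um")
```

Each per-path length can then be fed into whatever delay model is used for the corresponding bus segment.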
Interconnect Model Parameters
[Figure: cross-section of metal layers M1, M2 and M3 above the substrate, and a distributed RC model of a wire built from n segments, each with series resistance R1/n and capacitance C1/n.]
• Ra = MOSIS sheet resistance
• Ca = MOSIS fringe capacitance
• Cb = MOSIS area capacitance
• Coupling capacitance effects are explained in the technical report [11]
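To make the distributed RC model concrete, here is a minimal sketch that derives a wire's total resistance and capacitance from sheet resistance, area capacitance and fringe capacitance, then evaluates the Elmore delay of an n-segment ladder. The Elmore formula is a first-order approximation swapped in for illustration; the slides obtain delays by simulation. The geometry formulas and all numeric values are assumptions, not process data, and coupling capacitance is ignored.

```python
def wire_rc(length_um: float, width_um: float,
            ra_ohm_per_sq: float,
            cb_f_per_um2: float,
            ca_f_per_um: float) -> tuple:
    """Total wire resistance R1 and capacitance C1 (assumed formulas).

    R1 = Ra * (length / width)                  sheet resistance times squares
    C1 = Cb * width * length + 2 * Ca * length  area term plus two fringe edges
    """
    r1 = ra_ohm_per_sq * length_um / width_um
    c1 = cb_f_per_um2 * width_um * length_um + 2.0 * ca_f_per_um * length_um
    return r1, c1


def elmore_delay(r1: float, c1: float, n: int) -> float:
    """Elmore delay at the far end of a ladder with n equal segments.

    Segment k's capacitor (C1/n) sees the k upstream resistors (k * R1/n),
    so the sum equals R1 * C1 * (n + 1) / (2 * n).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    r_seg = r1 / n
    c_seg = c1 / n
    return sum(k * r_seg * c_seg for k in range(1, n + 1))


# Placeholder values for illustration only (not real process parameters).
r1, c1 = wire_rc(length_um=4000.0, width_um=0.5,
                 ra_ohm_per_sq=0.08,
                 cb_f_per_um2=3.0e-17,
                 ca_f_per_um=4.0e-17)
print(f"R1 = {r1:.1f} ohm, C1 = {c1:.3e} F")
print(f"Elmore delay, n = 8: {elmore_delay(r1, c1, 8):.3e} s")
```

As n grows, the Elmore delay of the ladder approaches R1·C1/2, which is why a lumped single-segment model (n = 1) overestimates the delay of a distributed wire.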
Accurate Interconnect Delay Estimation
[Figure: estimation flow. The floorplan (four MPC755 PEs and SRAM, with MBI, CBI, bus arbiter and bus interconnect) feeds the bus interconnect length calculation. The lengths and the MOSIS process parameters [MOSIS website] go to the HSPICE code generation tool, the generated code runs in the HSPICE simulator, and the result is the interconnect delay calculation for each bus segment.]
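The HSPICE code generation step in this flow might look roughly like the sketch below, which emits a netlist for an n-segment RC ladder representing one bus segment. The function name, node names, the 2.5 V pulse source, the timing values and the measurement threshold are all assumptions for illustration; the slides do not show the format of the generated HSPICE code.

```python
def rc_ladder_netlist(segment_name: str, r1_ohm: float,
                      c1_farad: float, n: int = 8) -> str:
    """Return SPICE netlist text for a wire modelled as n RC segments.

    Each segment has series resistance R1/n and a capacitor C1/n to ground.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    r_seg = r1_ohm / n
    c_seg = c1_farad / n

    lines = [f"* RC ladder for bus segment {segment_name}"]
    # Assumed stimulus: a 0 V to 2.5 V pulse driving the near end of the wire.
    lines.append("Vin n0 0 PULSE(0 2.5 0 10p 10p 5n 10n)")
    for k in range(1, n + 1):
        lines.append(f"R{k} n{k - 1} n{k} {r_seg:.6g}")
        lines.append(f"C{k} n{k} 0 {c_seg:.6g}")
    lines.append(".tran 1p 10n")
    # Delay from the input crossing 50% to the far end crossing 50% (assumed).
    lines.append(
        f".measure tran tdelay trig v(n0) val=1.25 rise=1 "
        f"targ v(n{n}) val=1.25 rise=1"
    )
    lines.append(".end")
    return "\n".join(lines) + "\n"


netlist = rc_ladder_netlist("PE1_to_SRAM", r1_ohm=800.0, c1_farad=1.6e-12, n=8)
print(netlist)
```

One netlist per bus segment would be generated and simulated, and the measured delay recorded for that segment.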
A Bus System Example: General Global Bus Architecture (GGBA)
Note: BAN: Bus Access Node, PE: Processing Element, CBI: CPU Bus Interface, MBI: Memory Bus Interface
Memory Bus Interface (MBI) Module Generation 1
• One effect of interconnect delay in an SoC: it changes the memory access cycle
• The memory controller must adapt its delay clocks to the interconnect delay
[Figure: the MBI (holding delay information) sits between the PowerPCs and the SRAM. PowerPC side: aack_bars, ta_bars, address, data, control signals. SRAM side: sram_data, sram address, cs_bar, we_bar, re_bar.]
Memory Bus Interface (MBI) Module Generation 2
(a) Estimated total delay of paths between each PE and a shared memory
(b) Number of clock delays in data paths
MBI and Bus System Generation
Reference*: K. Ryu and V. Mooney, “Automated Bus Generation for Multiprocessor SoC Design,” Design, Automation and Test in Europe (DATE'03), pp. 282-287, March 2003.
• Memory Bus Interface (MBI) module generation
(a) Sequence of MBI Generation (b) Bus System Generation*
[Figure (b): BusSynth takes the user option input and uses the Wire Library and Module Library. For each bus subsystem, and for each BAN within it, module generation and Bus Access Node (BAN) generation are performed, followed by bus subsystem generation. If # of subsystems > 1 (Y), bus system generation is performed. The output is synthesizable Verilog HDL code.]
[Figure (a): steps of MBI generation]
• Input of interconnect delays
• Calculation of the number of clocks to be inserted
• Extraction of the MBI module from the Module Library
• Update of the memory access delay parameters in the MBI module
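The clock-count and parameter-update steps of MBI generation can be sketched as follows. The rounding rule (the smallest whole number of clock periods that covers the path delay), the parameter name `ACCESS_DELAY_CLOCKS`, and the delay and clock values are assumptions for illustration; the slides do not give the exact rule or the MBI parameter names.

```python
import math
from typing import Dict


def clocks_to_insert(delay_ns: float, clock_period_ns: float) -> int:
    """Whole clock periods needed to cover an interconnect delay (assumed rule)."""
    if clock_period_ns <= 0:
        raise ValueError("clock period must be positive")
    if delay_ns <= 0:
        return 0
    return math.ceil(delay_ns / clock_period_ns)


def update_mbi_parameters(base_params: Dict[str, int],
                          path_delays_ns: Dict[str, float],
                          clock_period_ns: float) -> Dict[str, int]:
    """Copy the MBI parameters and add one delay entry per PE path.

    The key format ACCESS_DELAY_CLOCKS_<PE> is hypothetical.
    """
    params = dict(base_params)
    for pe, delay_ns in path_delays_ns.items():
        params[f"ACCESS_DELAY_CLOCKS_{pe}"] = clocks_to_insert(
            delay_ns, clock_period_ns)
    return params


# Hypothetical path delays between each PE and the shared memory.
path_delays_ns = {"PE1": 3.2, "PE2": 7.9, "PE3": 12.4, "PE4": 21.0}
mbi_params = update_mbi_parameters({"DATA_WIDTH": 64}, path_delays_ns,
                                   clock_period_ns=10.0)
print(mbi_params)
```

Keeping the calculation separate from the parameter update mirrors the listed steps: the delays come in first, the clock counts are derived, and only then is the library module specialised.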
A Bus System Generation Example
User Input List
1. Bus System: # of Bus Subsystems = 1
2. Bus Subsystem: # of BANs = 5
3. Bus Properties:
- Bus Subsystem: address bus width = 32 and data bus width = 64
4. BAN Properties (for Bus Subsystem):
- BAN1: CPU Type = MPC755, non-CPU Type = None, and #s of global and local memories = 0
- BAN2: CPU Type = MPC755, non-CPU Type = None, and #s of global and local memories = 0
- BAN3: CPU Type = MPC755, non-CPU Type = None, and #s of global and local memories = 0
- BAN4: CPU Type = MPC755, non-CPU Type = None, and #s of global and local memories = 0
- BAN5: CPU Type = None, non-CPU Type = None, # of global memories = 1, and # of local memories = 0
5. Memory Properties:
- BAN5: Type = SRAM, address bus width = 21 and data bus width = 64
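One way to picture this user input list is as a small data structure. The sketch below encodes the listed values; the class names, field names and the two consistency checks are hypothetical and do not reflect BusSynth's actual input format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Memory:
    kind: str
    addr_width: int
    data_width: int


@dataclass
class BAN:
    name: str
    cpu_type: Optional[str] = None
    non_cpu_type: Optional[str] = None
    global_memories: List[Memory] = field(default_factory=list)
    local_memories: List[Memory] = field(default_factory=list)


@dataclass
class BusSubsystem:
    addr_width: int
    data_width: int
    bans: List[BAN]


def check(subsystem: BusSubsystem) -> List[str]:
    """Return a list of problems found (assumed rules, for illustration)."""
    problems = []
    for ban in subsystem.bans:
        for mem in ban.global_memories + ban.local_memories:
            if mem.data_width != subsystem.data_width:
                problems.append(f"{ban.name}: memory data width differs from bus")
            if mem.addr_width > subsystem.addr_width:
                problems.append(f"{ban.name}: memory address wider than bus")
    return problems


# Values taken from the user input list: four MPC755 BANs and one SRAM BAN.
subsystem = BusSubsystem(
    addr_width=32,
    data_width=64,
    bans=[BAN(f"BAN{i}", cpu_type="MPC755") for i in range(1, 5)]
    + [BAN("BAN5", global_memories=[Memory("SRAM", 21, 64)])],
)
bus_system = [subsystem]  # "# of Bus Subsystems = 1"

print(len(bus_system), len(subsystem.bans), check(subsystem))
```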
[Figure: animated BusSynth flow for this example, using the Wire Library and Module Library. User option input, then Bus Access Node generation for BAN1 to BAN5, then bus subsystem generation, the "# of Subsystem > 1" decision (Y/N), bus system generation, and synthesizable Verilog HDL code. BAN1 to BAN4 each contain an MPC755 with a CBI_MPC755; BAN5 contains the SRAM with an MBI_SRAM. The bus subsystem also includes an Arbiter.]
// Skipped
    .up_dataout(dataout_up_2[FIFO_D_WIDTH-1:0]),
    .up_gen_int(gen_int_up_2),
    .up_isr0_ctlhi(isr0_ctlhi_up_2),
    .up_isr0_ctllo(isr0_ctllo_up_2),
    .dn_datain(datain_up_3[FIFO_D_WIDTH-1:0]),
    .reb_dn(reb_up_3),
    .web_dn(web_up_3),
    .fifo_area_dn(fifo_area_up_3)
);
endmodule

module BusSystem(sysrstb, sysclk);
input sysrstb;
input sysclk;
// Skipped
SubSys_GGBA SubSystem(
    .sysrstb(sysrstb),
    .sysclk(sysclk)
    // Skipped
);
endmodule
[Animation step titles: Bus Access Node Generation for BAN1, BAN2, BAN3, BAN4 and BAN5]
Application Example
• Orthogonal Frequency Division Multiplexing (OFDM) Transmitter, a wireless algorithm
• Function assignment and their processing
Experimental Setup
[Figure: experimental setup. Bus generation tool: user options (input) and libraries feed BusSynth, together with floorplan design and interconnect delay estimation, producing synthesizable Verilog HDL code. Simulation environment: VCS, Seamless CVE and XRAY, with user C code compiled by GCC. Synthesis environment: Design Compiler.]
Note: VCS and Design Compiler from Synopsys, Seamless CVE and XRAY from Mentor Graphics, and GCC from GNU
Three Configurations of GGBA for Performance Comparison
– GGBA I (no wire model): a GGBA system that disregards interconnect delay on the bus
– GGBA II (accurate wire model): a GGBA system that works with the different estimated interconnect delays on the shared bus
– GGBA III (worst-case wire model): a GGBA system that applies the maximum estimated delay to all connections between PEs and the shared memory
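The difference between the three configurations can be sketched as three ways of turning per-path delay estimates into delay clocks. The model labels, the ceiling-based clock count and the delay values below are assumptions for illustration and are not measurements from the slides.

```python
import math
from typing import Dict


def delay_clocks(path_delays_ns: Dict[str, float],
                 clock_period_ns: float,
                 model: str) -> Dict[str, int]:
    """Delay clocks per PE path under one of three wire models.

    "none"       -> GGBA I:   interconnect delay is ignored
    "accurate"   -> GGBA II:  each path uses its own estimated delay
    "worst_case" -> GGBA III: every path uses the maximum estimated delay
    """
    def clocks(delay_ns: float) -> int:
        return math.ceil(delay_ns / clock_period_ns) if delay_ns > 0 else 0

    if model == "none":
        return {pe: 0 for pe in path_delays_ns}
    if model == "accurate":
        return {pe: clocks(d) for pe, d in path_delays_ns.items()}
    if model == "worst_case":
        worst = clocks(max(path_delays_ns.values()))
        return {pe: worst for pe in path_delays_ns}
    raise ValueError(f"unknown model: {model}")


# Hypothetical per-path delay estimates in nanoseconds.
delays = {"PE1": 3.2, "PE2": 7.9, "PE3": 12.4, "PE4": 21.0}
for model in ("none", "accurate", "worst_case"):
    print(model, delay_clocks(delays, 10.0, model))
```

Under the worst-case model every PE pays for the slowest path, which is the overhead an accurate per-path model avoids.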
Memory Bus Interface (MBI) Module Generation 2
(a) Estimated total delay of paths between each PE and a shared memory
(b) Number of clock delays in data paths
Conclusion
• Interconnect delay is a major concern as feature size is scaled down
• Interconnect delay estimation from the floorplan
• Memory Bus Interface (MBI) module and Bus System generation
• Performance improvement due to interconnect delay aware design
• In an OFDM transmitter example, a 35.3% reduction in execution time compared with GGBA III