Networks-on-Chip
Oct 23, 2015
Seminar contents
The premises
  Homogeneous and heterogeneous Systems-on-Chip and their interconnection networks
The Network-on-Chip approach
Slide from S. Tota and M. R. Casu [1]
The premises
The System-on-Chip (SoC) today
  Heterogeneous: ~10 IPs
  Homogeneous (MP-SoC): ~10 uP (with exceptions)
  On-chip bus (AMBA, CoreConnect, Wishbone, …)
  IPs and uPs are sold with proprietary bus interfaces
Near and long-term forecast
  100 IP/uP: buses are not scalable!
  Physical design issues: signal integrity, power consumption, timing closure
  Clock issues: is it time for the Globally Asynchronous paradigm? (Still Locally Synchronous)
  Need for “more regular” design
Slide from S. Tota and M. R. Casu [1]
Today’s heterogeneous SoC
[Figure: CPU, DSP, MEM, embedded FPGA, dedicated IP and I/O blocks attached to an interconnection network (BUS)]
Slide from S. Tota and M. R. Casu [1]
Maya (Rabaey’00)
Slide from S. Tota and M. R. Casu [1]
Homogeneous SoC (MP-SoC)
[Figure: eight CPU/MEM pairs attached to an interconnection network (BUS, XBAR)]
Slide from S. Tota and M. R. Casu [1]
MP-SoC: Cisco CRS-1 Router
The CRS-1 Router uses 188 extensible network processors per “Silicon Packet Processor” chip
16 PPE clusters of 12 PPEs each
Slide from S. Tota and M. R. Casu [1]
Very long wires
[Figure: a wire between points A and B on the chip. Year 2005: 1 ns (1 GHz). Year 2010: 0.1 ns (10 GHz).]
Slide from S. Tota and M. R. Casu [1]
Bus pros and cons
Cons:
  Every unit attached adds parasitic capacitance, therefore electrical performance degrades with growth.
  Bus timing is difficult in a deep submicron process.
  Bus arbiter delay grows with the number of masters. The arbiter is also instance-specific.
  Bandwidth is limited and shared by all units attached.
Pros:
  The silicon cost of a bus is small.
  Any bus is almost directly compatible with most available IPs, including software running on CPUs.
  The concepts are simple and well understood.
Slide from S. Tota and M. R. Casu [1]
What are NoCs?
According to Wikipedia: “Network-on-a-chip (NoC) is a new paradigm for System-on-Chip (SoC) design. NoC based-systems accommodate multiple asynchronous clocking that many of today's complex SoC designs use. The NoC solution brings a networking method to on-chip communications and claims roughly a threefold performance increase over conventional bus systems.”
Slide from S. Tota and M. R. Casu [1]
NoC example
[Figure: routing nodes interconnecting processor masters, a global memory slave and two global I/O slaves]
Slide from S. Tota and M. R. Casu [1]
Basic Ingredients of a NoC
N Computational Resources: Processing Elements (PE)
1 Connection Topology
1 Routing technique
M N Switches
N Network Interfaces
1 Addressing system
1 Communication Protocol
1 Programming model
  Message passing
  Shared memory
Slide from S. Tota and M. R. Casu [1]
Problems
Internal network contention causes (often unpredictable) latency.
The network has a significant silicon area.
Bus-oriented IPs need smart wrappers.
Software needs clean synchronization in multiprocessor systems.
System designers need reeducation for new concepts.
Slide from S. Tota and M. R. Casu [1]
Network on Chip (NoC)
Adoption of network-based packet communication paradigm.
Use abstraction and layering to decouple the communication issue from computation
Distribute the responsibility of reliable transmission evenly over higher and lower layers of abstraction
[Figure: protocol stack abstraction. Layers, top to bottom: Software (application, system); Architecture and control (transport, network, data link); Physical (wiring). Benini & De Micheli, Computer 2002]
Slide from L. Benini [2]
Physical layer - Synchronization
Physical design:
  Voltage levels
  Driver design
  Sizing
  Physical routing
Synchronization: how and when to sample the channel?
  Avoid a clock: asynchronous communication
  The clock travels with the data
  The clock can be reconstructed from the data
Synchronization recovery has a cost
  Cannot be abstracted away
  Can cause errors (e.g., metastability)
Slide from L. Benini [2]
Data-link layer
Provide reliable data transfer on an unreliable physical channel
Access to the communication medium
  Dealing with contention and arbitration
Issues
  Fairness and safe communication
  Achieving high throughput
  Error resiliency
Slide from L. Benini [2]
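As an illustration of arbitration with fairness, here is a minimal sketch of a round-robin arbiter for a shared link. Round-robin is only one possible policy and is not named in the slides; the class `RoundRobinArbiter` and its method names are hypothetical.

```python
from typing import List, Optional


class RoundRobinArbiter:
    """Grants one requester per cycle and rotates priority for fairness."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        # Start so that port 0 has the highest priority in the first cycle.
        self.last_grant = num_ports - 1

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the granted port index, or None if no port is requesting."""
        for offset in range(1, self.num_ports + 1):
            port = (self.last_grant + offset) % self.num_ports
            if requests[port]:
                self.last_grant = port
                return port
        return None


arbiter = RoundRobinArbiter(num_ports=4)
# Ports 0 and 2 request the link in every cycle.
grants = [arbiter.grant([True, False, True, False]) for _ in range(4)]
print(grants)  # [0, 2, 0, 2]
```

Because the search always starts just after the last granted port, two competing requesters alternate and neither can starve the other.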
Topologies
Heritage of networks with new constraints
  Need to accommodate interconnects in a 2D layout
  Cannot route long wires (clock frequency bound)
[Figure: a) SPIN, b) CLICHE, c) Torus, d) Folded torus, e) Octagon, f) BFT]
Slide from S. Tota and M. R. Casu [1]
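To make the effect of topology on distance concrete, the following sketch compares the average hop count of a k x k mesh with that of a k x k torus under uniform traffic. It is an illustrative calculation only: the function names and the choice k = 4 are assumptions, and it says nothing about wire length or the other topologies in the figure.

```python
from itertools import product


def mesh_hops(a, b):
    """Minimal hop count between two nodes of a 2D mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def torus_hops(a, b, k):
    """Minimal hop count on a k x k torus (wrap-around links in x and y)."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return min(dx, k - dx) + min(dy, k - dy)


def average_hops(k, hops):
    """Average hop count over all ordered pairs of distinct nodes."""
    nodes = list(product(range(k), repeat=2))
    pairs = [(a, b) for a in nodes for b in nodes if a != b]
    return sum(hops(a, b) for a, b in pairs) / len(pairs)


K = 4  # assumed network size, for illustration only
mesh_avg = average_hops(K, mesh_hops)
torus_avg = average_hops(K, lambda a, b: torus_hops(a, b, K))
print(f"mesh:  {mesh_avg:.3f} hops")   # mesh:  2.667 hops
print(f"torus: {torus_avg:.3f} hops")  # torus: 2.133 hops
```

The torus shortens the average path, but its wrap-around links are long wires unless the layout is folded, which is the constraint the slide points at.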
Topologies
Comparison of topologies according to different QoS parameters.
[Figure: throughput as a function of number of IPs]
[Figure: drop probability as a function of number of IPs]
[Figure: latency as a function of number of IPs]
Switching
Again, techniques inherited from Computer and Communication Networks
New constraints in silicon: area and power
  Use as few buffers as possible
Store & Forward and Virtual-Cut-Through
  Need buffer space for an entire packet: unsuited!
Limited buffer size in
  Wormhole
  Deflection Routing, a.k.a. “Hot Potato”
Virtual channels
  Increase buffer size…
Slide from L. Benini [2]
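The difference between Store & Forward and Wormhole can be sketched with the usual textbook zero-load latency approximation. The model below is an assumption, not taken from the slides: one flit crosses one link per cycle, every router adds a fixed `router_delay`, and contention is ignored.

```python
def store_and_forward_cycles(hops, packet_flits, router_delay=1):
    """Each router receives the whole packet before forwarding it."""
    return hops * (router_delay + packet_flits)


def wormhole_cycles(hops, packet_flits, router_delay=1):
    """The header sets up the path hop by hop; the body flits pipeline behind it."""
    return hops * router_delay + packet_flits


HOPS = 4           # assumed path length
PACKET_FLITS = 16  # assumed packet size in flits

saf = store_and_forward_cycles(HOPS, PACKET_FLITS)
wh = wormhole_cycles(HOPS, PACKET_FLITS)
print(saf, wh)  # 68 20
```

Under the same assumptions, Store & Forward needs a buffer of at least `PACKET_FLITS` flits per router input, while Wormhole can work with a few flits, which is why the latter suits the area and power constraints of silicon.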
Switching
[Figure: classification of switching techniques]
Routing
Deterministic vs. Adaptive
  Simple/complex routing logic
  Easy/hard to make deadlock-free
  Prone/robust to congestion
2D dimension-order routing (XY): the most used static routing in NoCs (e.g., with Wormhole and Mesh)
Slide from L. Benini [2]
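A minimal sketch of XY dimension-order routing on a 2D mesh follows. Nodes are addressed by assumed `(x, y)` coordinates, and the function name `xy_route` is hypothetical.

```python
def xy_route(src, dst):
    """Return the list of nodes visited from src to dst, both inclusive.

    The packet first travels along X until the column matches,
    then along Y until the row matches.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


route = xy_route((0, 0), (2, 1))
print(route)  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

The route depends only on source and destination, so the routing logic is simple. Since a packet never turns from the Y dimension back into X, the cyclic channel dependencies that lead to deadlock in a mesh cannot form. The price is that the algorithm cannot steer around congested links.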
Routing
[Figure: classification of routing algorithms]
Transport layer
Decompose and reconstruct information
Important choices
  Packet granularity
  Admission/congestion control
  Packet retransmission parameters (e.g., timeout)
All these factors heavily affect energy and performance
Application-specific schemes vs. standards
Slide from L. Benini [2]
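The "decompose and reconstruct" role of the transport layer can be sketched as follows. The `Packet` fields, the function names and the payload size are illustrative assumptions; a real NoC packet format also carries routing information and is not described in the slides.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Packet:
    seq: int        # position of this packet within the message
    last: bool      # True for the final packet of the message
    payload: bytes


def packetize(message: bytes, payload_size: int) -> List[Packet]:
    """Split a message into packets of at most payload_size bytes."""
    if payload_size <= 0:
        raise ValueError("payload_size must be positive")
    chunks = [message[i:i + payload_size]
              for i in range(0, len(message), payload_size)] or [b""]
    return [Packet(seq=i, last=(i == len(chunks) - 1), payload=chunk)
            for i, chunk in enumerate(chunks)]


def reassemble(packets: Iterable[Packet]) -> bytes:
    """Rebuild the message, tolerating out-of-order arrival."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


message = b"networks-on-chip"
packets = packetize(message, payload_size=4)
restored = reassemble(reversed(packets))  # simulate reordered delivery
print(len(packets), restored == message)  # 4 True
```

The `payload_size` parameter is the packet granularity choice listed above: smaller packets cost more header overhead, larger packets hold network resources for longer.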
Flow control
Determines how resources are allocated to packets moving in the network.
[Figure: classification of flow control algorithms]
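As one concrete example of allocating buffer resources, here is a minimal sketch of credit-based flow control on a single link. Credit-based flow control is just one common scheme and is chosen here for illustration; the class `CreditLink` and its methods are hypothetical.

```python
from collections import deque


class CreditLink:
    """Sender-side credit counter paired with a receiver-side flit buffer."""

    def __init__(self, buffer_slots: int) -> None:
        self.credits = buffer_slots  # free slots the sender may still use
        self.rx_buffer = deque()

    def try_send(self, flit) -> bool:
        """Send a flit only if the receiver is known to have a free slot."""
        if self.credits == 0:
            return False  # the sender stalls; the flit is not dropped
        self.credits -= 1
        self.rx_buffer.append(flit)
        return True

    def drain_one(self):
        """The receiver forwards one flit and returns a credit to the sender."""
        if not self.rx_buffer:
            return None
        flit = self.rx_buffer.popleft()
        self.credits += 1
        return flit


link = CreditLink(buffer_slots=2)
sent = [link.try_send(f) for f in ("head", "body", "tail")]
print(sent)  # [True, True, False]
link.drain_one()
retry = link.try_send("tail")
print(retry)  # True
```

The sender never overruns the receiver, so no flit is lost, at the cost of stalling whenever the credits run out.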
System software
Programming paradigms
  Shared memory
  Message passing
Middleware: layered system software
  Should provide low communication latency
  Modular, scalable, robust, …
Slide from L. Benini [2]
Who first had the idea?
The most cited papers according to Google (number of citations)
Guerrier’00 (204), A Generic Architecture for On-Chip Packet-Switched Interconnections
Dally’01 (392), Route Packets, Not Wires: On-Chip Interconnection Networks
Benini’02 (417), Networks on Chips: A New SoC Paradigm
Kumar’02 (184), A Network on Chip Architecture and Design Methodology
Slide from S. Tota and M. R. Casu [1]
Some NoC References
J. Rabaey et al., “A 1-V heterogeneous reconfigurable DSP IC for wireless baseband digital signal processing,” IEEE Journal of Solid State Circuits, Vol. 35, No. 11, Nov. 2000, pp. 1697-1704.
P. Guerrier and A. Greiner, “A Generic Architecture for On-Chip Packet-Switched Interconnections,” Proc. Design and Test in Europe (DATE), pp. 250-256, Mar. 2000.
A. Adriahantenaina et al., “SPIN: a Scalable, Packet Switched, On-chip Micro-network,” Proc. Design and Test in Europe (DATE), Mar. 2003.
L. Benini and G. De Micheli, “Networks on Chips: A New SoC Paradigm,” Computer, vol. 35, no. 1, Jan. 2002, pp. 70-78.
S. Kumar et al., “A network on chip architecture and design methodology,” in Proc. ISVLSI, 2002.
W. J. Dally and B. Towles, “Route packets, not wires: on-chip interconnection networks,” in Proc. Design Automation Conf., 2001.
K. Goossens et al., “Trade-offs in the design of a router with both guaranteed and best-effort services for networks on chip,” IEE Proc.-Comput. Digit. Tech., Vol. 150, No. 5, Sep. 2003, pp. 294-302.
P.P. Pande et al., “Performance Evaluation and Design Trade-offs for Network-on-Chip Interconnect Architectures,” IEEE Trans. Computers, vol. 54, no. 8, Aug. 2005, pp. 1025-1040.
Slide from S. Tota and M. R. Casu [1]
References
1. S. Tota and M. R. Casu, “Networks-on-Chip,” presentation. www.tlc.polito.it/~nordio/seminars/2006_05_05_Casu.ppt
2. L. Benini, “Networks on chip,” presentation, http://www.ida.liu.se/~petel/NoC/lecture-notes/lect2.pdf