Procedure Hopping: a Low Overhead Solution to Mitigate Variability in Shared-L1 Processor Clusters
Abbas Rahimi‡, Luca Benini†, and Rajesh Gupta‡
‡CSE, UC San Diego
†DEIS, Università di Bologna
International Symposium on Low-Power Electronics and Design
http://variability.org http://mesl.ucsd.edu http://micrel.deis.unibo.it
Procedure Hopping to Mitigate Variability
Main Point
Sources of Device Variation
[Figure labels: clock, actual circuit delay, guardband; across-wafer frequency, VCC droop, temperature, other uncertainty]
10% VCC, ~160˚C temperature, 40% VTH
Variations are more challenging in a many-core platform!
Outline
• Sources of Variations
• Variation-tolerant Shared-L1 Processor Cluster
  1. Process Variation → Variation-aware VDD-hopping
  2. Dynamic Voltage Variation → Procedure hopping
• Methodology for PLV
  – Design time characterization
  – Compile time PLV metadata generation
  – Runtime preventive compensation
• Experimental Results
Shared-L1 Processor Clusters *
Each cluster consists of:
• 16 LEON-3 cores
• An intra-cluster shared-L1 I$
• An on-chip multi-banked tightly coupled data memory (TCDM)
• Two single-cycle logarithmic interconnections for both instruction and data sides
• A hardware synchronization handler module (SHM) to coordinate and synchronize cores for accessing shared data on TCDM.
• VDD-hopping per core.
[Figure: Shared-L1 TCDM cluster template. 4x8 cluster: 4 PEs and an 8-bank TCDM]
* D. Melpignano, L. Benini, et al., “Platform 2012, a many-core computing accelerator for embedded SoCs: performance evaluation of visual analytics applications”, DAC’12
Three cores (f4, f8, f9) cannot meet the target frequency of 830MHz.
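One way to picture variation-aware VDD-hopping for such slow cores is the sketch below. It assumes two supply levels per core and a rule that raises any core whose maximum frequency at the low supply misses the target; the names `vdd_level_t` and `assign_vdd`, the two-level scheme, and the selection rule are illustrative assumptions, not details taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-level supply per core (assumption, not from the slides). */
typedef enum { VDD_LOW, VDD_HIGH } vdd_level_t;

/* Give VDD_HIGH to every core whose maximum frequency at the low supply
 * falls short of the target frequency. Returns how many cores were raised. */
size_t assign_vdd(const double *fmax_low_mhz, size_t num_cores,
                  double target_mhz, vdd_level_t *level_out)
{
    assert(fmax_low_mhz != NULL && level_out != NULL);
    size_t raised = 0;
    for (size_t i = 0; i < num_cores; ++i) {
        if (fmax_low_mhz[i] < target_mhz) {
            level_out[i] = VDD_HIGH;   /* slow core: needs the higher supply */
            ++raised;
        } else {
            level_out[i] = VDD_LOW;    /* core already meets the target */
        }
    }
    return raised;
}
```

Called with the per-core frequencies of a 16-core cluster and a target of 830 MHz, a rule like this would flag exactly the cores that miss the target, such as f4, f8, and f9 above.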
Thanks to the shared I$ and TCDM resources, procedure hopping enables fast, proactive migration of procedures within a cluster to prevent voltage variation.
Each procedure hops from one core to another if it causes voltage variation.
Procedure-level Vulnerability (PLV)
• The notion of PLV to fast dynamic voltage variation is defined.
• The design time stage analyzes the dynamic voltage droops/rises for every ProcX across all operating conditions, generating PLVx metadata.

[Figure: int ProcX (…) { … } runs on Core i at (Vi,Tj); IR-drops are observed.]

(V,T)     PLVX
V1,T1     0.75
V2,T2     0.35
V3,T3     0.01
…         …
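The (V,T) to PLV table above could be carried as per-procedure metadata along the following lines. This is a minimal sketch: the struct layout, the integer index for each characterized (V,T) point, and the names `plv_entry_t`, `plv_metadata_t`, and `plv_lookup` are assumptions for illustration; only the three PLV values come from the slide.

```c
#include <assert.h>
#include <stddef.h>

/* One characterized point: index of the (V,T) condition and the PLV seen there.
 * The layout is hypothetical; the slide only shows a (V,T) -> PLV table. */
typedef struct {
    int vt_index;   /* 0 stands for (V1,T1), 1 for (V2,T2), ... */
    double plv;     /* characterized vulnerability at that point */
} plv_entry_t;

typedef struct {
    const char *proc_name;
    const plv_entry_t *entries;
    size_t num_entries;
} plv_metadata_t;

/* Return the PLV stored for vt_index, or -1.0 if that point was not characterized. */
double plv_lookup(const plv_metadata_t *meta, int vt_index)
{
    assert(meta != NULL);
    for (size_t i = 0; i < meta->num_entries; ++i) {
        if (meta->entries[i].vt_index == vt_index) {
            return meta->entries[i].plv;
        }
    }
    return -1.0;
}

/* Metadata for ProcX using the three values shown in the table. */
const plv_entry_t procx_entries[] = {
    { 0, 0.75 },  /* (V1,T1) */
    { 1, 0.35 },  /* (V2,T2) */
    { 2, 0.01 },  /* (V3,T3) */
};
const plv_metadata_t procx_meta = { "ProcX", procx_entries, 3 };
```

A linear scan is enough here because only a handful of operating points are characterized per procedure.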
Characterization of PLV to IR-drop: Compile time + Runtime
[Figure: PLV characterization flow across design time, compile time, and runtime.
 Design time: open-source Leon3 (VHDL, timing constraints), Design Compiler, IC Compiler, ModelSim vsim, PrimeTime PX, PrimeRail; TSMC 45nm LIBs at (0.81V,-40˚C), (0.90V,25˚C), (0.99V,125˚C), (0.81V,125˚C). Intermediate data: Verilog netlists, parasitics, SDF, switching activity of ProcX, power @(Vi,Tj), dynamic voltage droop/rise @(Vi,Tj).
 Compile time: source code, VA-Proc generation (ProcX / ProcX@Caller / ProcX@Callee), VA-procedures’ source code, BCC compiler, executables; generating metadata: object code with PLV characterized metadata.
 Runtime: Leon-3 Core i with (V,T) monitor reporting the operating condition (V,T).]

For ProcX@Caller:
    Read current (V,T) sensors of Core i
    Read characterized metadata for ProcX
    If PLVX > PLV_threshold
        Invoke Procedure Hopping (ProcX@Callee)
• At compile time, PLVx metadata of ProcX is attached to the procedure.
• During runtime, the discretized (V,T) point selects the corresponding characterized PLV metadata to assess the vulnerability of ProcX at the current (V,T).
• If PLVx ≥ PLV_threshold, ProcX is hopped from the caller core to a favorable callee core.
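The runtime check described above can be sketched as follows. The binning scheme, the table dimensions, and the names `plv_table_t`, `discretize`, and `decide_hop` are assumptions made for illustration; the slides do not specify how (V,T) is discretized or how the callee core is chosen, so this sketch stops at the hop decision and uses the ≥ comparison from the last bullet.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_V_BINS 3   /* hypothetical number of voltage bins */
#define NUM_T_BINS 3   /* hypothetical number of temperature bins */

typedef enum { RUN_ON_CALLER, HOP_TO_CALLEE } hop_decision_t;

typedef struct {
    double v_upper[NUM_V_BINS];          /* ascending upper bounds of voltage bins (V) */
    double t_upper[NUM_T_BINS];          /* ascending upper bounds of temperature bins (deg C) */
    double plv[NUM_V_BINS][NUM_T_BINS];  /* characterized PLV per (V,T) bin */
} plv_table_t;

/* Map a sensor reading to a bin index; readings above the last bound
 * fall into the last bin. */
static size_t discretize(double value, const double *upper_bounds, size_t num_bins)
{
    size_t bin = 0;
    while (bin + 1 < num_bins && value > upper_bounds[bin]) {
        ++bin;
    }
    return bin;
}

/* Decide whether ProcX should hop, given the current sensor readings. */
hop_decision_t decide_hop(const plv_table_t *table, double v_now, double t_now,
                          double plv_threshold)
{
    assert(table != NULL);
    size_t vi = discretize(v_now, table->v_upper, NUM_V_BINS);
    size_t ti = discretize(t_now, table->t_upper, NUM_T_BINS);
    return (table->plv[vi][ti] >= plv_threshold) ? HOP_TO_CALLEE : RUN_ON_CALLER;
}
```

In a ProcX@Caller wrapper, a `HOP_TO_CALLEE` result would trigger the hopping mechanism, while `RUN_ON_CALLER` would simply call ProcX locally.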
Max Voltage Variation Across Corners and Procedures
Conclusion
• The notion of procedure-level vulnerability to fast dynamic voltage variation is defined.
• Based on PLV metadata, a fully software, low-cost procedure hopping technique is proposed that guarantees voltage-emergency-free migration of all procedures, quickly and proactively enough, within a shared-L1 processor cluster.
• Full post-P&R results in 45nm TSMC technology confirm that procedure hopping avoids voltage emergencies across a variability-affected cluster, while imposing only an amortized cost of less than 1% latency for any of the characterized embedded procedures.
HW/SW Collaborative Architecture to Support Intra-cluster Procedure Hopping
• The code is easily accessible via the shared-L1 I$.
• The data and parameters are passed through the shared stack in TCDM.
• A procedure hopping information table (PHIT) keeps the status for a migrated