Compiling Application-Specific Hardware
Mihai Budiu, Seth Copen Goldstein
Carnegie Mellon University
Post on 21-Dec-2015
Problems
• Complexity
• Power
• Global Signals
• Limited issue window => limited ILP
We propose a scalable architecture
Our Solution
General: applicable to today’s software
- programming languages
- applications
Automatic: compiler-driven
Scalable:
- run-time: with clock, hardware
- compile-time: with program size
Parallelism: exploit application parallelism
New
• Entire C applications
• Dynamically scheduled circuits
• Custom dataflow machines
- application-specific
- direct execution (no interpretation)
- spatial computation
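Loops
[Figure: dataflow graph of the loop below; node labels: i, +1, < 100, !, *, +, sum, ret, with initial value 0 for both i and sum]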
int sum=0, i;
for (i=0; i < 100; i++)
sum += i*i;
return sum;
Control flow => data flow
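As a rough illustration of the control-flow-to-data-flow idea, the loop above can be rewritten in C so that every operator of the graph becomes one named value. This is only a software sketch: the function name and the "wire" variable names are hypothetical, and the mapping to the operators in the figure is approximate.

```c
#include <stdbool.h>

/* Sketch: the loop "sum += i*i" written so that each dataflow
 * operator is one named value. All names here are illustrative
 * assumptions, not taken from the ASH compiler. */
int sum_squares_dataflow(void)
{
    int i = 0;    /* loop-carried value, initial token 0 */
    int sum = 0;  /* loop-carried value, initial token 0 */

    for (;;) {
        bool stay = i < 100;      /* "< 100" comparator             */
        bool leave = !stay;       /* "!" produces the exit predicate */
        if (leave)
            return sum;           /* "ret" consumes the final sum    */

        int square = i * i;       /* "*" operator                    */
        int next_sum = sum + square; /* "+" operator                 */
        int next_i = i + 1;       /* "+1" operator                   */

        /* Back edges: results flow around to the next iteration. */
        sum = next_sum;
        i = next_i;
    }
}
```

The only remaining control decision is the exit predicate; everything else is a value flowing from one operator to the next, which is the property a dataflow circuit exploits.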
Compilation
• Translate C to dataflow machines
• Optimizations: software-, hardware-, dataflow-specific
• Expose parallelism
– predication
– speculation
– localized synchronization
– pipelining
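Predication and speculation can be sketched in plain C as if-conversion: both sides of a branch are computed unconditionally, and a predicate selects the result, much like a multiplexer in hardware. The function names below are hypothetical, and unsigned arithmetic is an assumption chosen so that the speculated subtraction is well defined on the path that is not taken.

```c
#include <stdbool.h>

/* Control-flow version: the branch decides which subtraction runs. */
unsigned absdiff_branchy(unsigned x, unsigned y)
{
    if (x > y)
        return x - y;
    else
        return y - x;
}

/* Predicated version (illustrative sketch): both subtractions are
 * executed speculatively; the predicate only selects which result
 * is kept. Unsigned wrap-around makes the discarded result harmless. */
unsigned absdiff_predicated(unsigned x, unsigned y)
{
    bool p = x > y;               /* predicate               */
    unsigned if_true = x - y;     /* speculated, side-effect free */
    unsigned if_false = y - x;    /* speculated, side-effect free */
    return p ? if_true : if_false; /* multiplexer            */
}
```

Speculating like this is only safe for operations without side effects; operations such as stores would still need to be guarded by the predicate.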
ASH Features
• What you code is what you get
– no hidden control logic
– lean hardware (no CAM, multi-ported files, etc.)
– no global signals
• Compiler has complete control
• Dynamic scheduling => latency tolerant
• Natural ILP and loop pipelining
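One way to picture dynamic scheduling without global signals is a local firing rule: an operator computes whenever its own inputs have arrived and its output has been consumed, regardless of how long the producers took. The sketch below simulates that rule in C; the `Wire` type, its `valid` flag, and `try_fire_add` are assumptions made for illustration, not the handshake protocol of the actual ASH circuits.

```c
#include <stdbool.h>

/* A wire carrying one value plus a "valid" flag (hypothetical model). */
typedef struct {
    int value;
    bool valid;
} Wire;

/* Try to fire an adder. It fires only when both inputs hold a value
 * and the output slot is free; it needs no knowledge of any other
 * operator. Returns true if the adder fired. */
bool try_fire_add(Wire *a, Wire *b, Wire *out)
{
    if (!a->valid || !b->valid || out->valid)
        return false;             /* not ready: wait, do nothing */

    out->value = a->value + b->value;
    out->valid = true;            /* result is now available     */
    a->valid = false;             /* inputs are consumed         */
    b->valid = false;
    return true;
}
```

Because the adder simply waits until `b` shows up, a slow producer (for example a load with unpredictable latency) delays only the operators that depend on it.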
Conclusions
• ASH: compiler-synthesized hardware from HLL
• Exposes program parallelism
• Dataflow techniques applied to hardware
• ASH promises to scale with:
– circuit speed
– transistors
– program size
Backup slides
• Hyperblocks
• Predication
• Speculation
• Memory access
• Procedure calls
• Recursive calls
• Resources
• Performance
Memory Access
[Figure: a load operator (labels: address, predicate, token; token, data) and a store operator (labels: address, pred, token; token, data) connected through a load-store queue and an interconnection network to Memory]
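The role of the predicate and token on memory operations can be sketched as follows. This is an assumed reading of the slide: an operation waits for its incoming token, touches memory only when its predicate is true, and then passes a token on so that dependent memory operations stay in program order. The function names and signatures are hypothetical, and the load-store queue and interconnection network are not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Load sketch: returns false if it cannot fire yet (no token).
 * When the predicate is false, memory is not accessed and the
 * data output is a placeholder, but the token is still passed on. */
bool load_op(const int *memory, size_t address, bool predicate,
             bool token_in, int *data_out, bool *token_out)
{
    if (!token_in)
        return false;                 /* earlier access not done */
    *data_out = predicate ? memory[address] : 0;
    *token_out = true;                /* release later accesses  */
    return true;
}

/* Store sketch: same token rule; the write happens only when the
 * predicate is true. */
bool store_op(int *memory, size_t address, int data, bool predicate,
              bool token_in, bool *token_out)
{
    if (!token_in)
        return false;
    if (predicate)
        memory[address] = data;
    *token_out = true;
    return true;
}
```

Chaining the `token_out` of a store into the `token_in` of a later load to the same address forces the load to observe the stored value, while operations that never share a token chain are free to proceed in parallel.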
Resources
• Estimated for SpecINT95 and Mediabench
• Average < 100 bit-operations/line of code
• Routing resources harder to estimate
• Detailed data in paper