Design Support for Embedded Processors and Applications
Prof. Kurt Keutzer, EECS, University of California, Berkeley, CA
Customer needs: fast time to market, predictability, reliability, robustness, efficiency, economy
Looking for "platforms": devices that will amortize system design costs over multiple generations
"Based on our analysis, having a software approach is the only way to scale to the next generation," said Corgan (Intel PMM). "If you have to approach each fourfold increase in speed, from OC-48 [2.5 Gbits/second] to OC-192 [10 Gbits/s], say, with a new architecture, it's not cost-effective."
Addressing the problem areas
Modern Embedded Systems, Compilers, Architectures and Languages
MESCAL research mission:
To bring a disciplined methodology, and a supporting tool set, to the development, deployment, and programming of application-specific programmable platforms, aka application-specific instruction processors.
Invited paper: "From ASIC to ASIP: The Next Design Discontinuity", K. Keutzer, S. Malik, R. Newton, Proceedings of ICCD, pp. 84-91, 2002. www.gigascale.org/mescal
Press coverage Sept 2002:
Programmable Platforms will Rule: http://www.eetimes.com/story/OEG20020911S0063
High on MESCAL: http://www.eetimes.com/story/OEG20020911S0065
[Figure: block diagram with the labels SDRAM Controller, PCI Interface, SRAM Controller, StrongArm Core, I$, Mini D$, D$, six engines, IX Bus Interface, Hash Engine, ScratchPad SRAM]
Three Key Problem Areas
Development of programmable platforms:
  Domain-specific presentation
  Device-specific libraries
  Environmental support
  System architecture/micro-architecture/MOC
  Primitive computation and communication mechanisms
[Figure: IP Forwarding Engine, Port 0]
Our Approach
Bottom-up view: create abstractions of existing devices
  opacity: hide micro-architectural details from the programmer
  visibility: expose sufficient detail of the architecture to allow the programmer to improve the efficiency of the program
Top-down view: experiment with existing modeling/programming environments
  Learn from their abstractions of the devices
  Try to maximize performance within these environments
In real-time embedded systems, correct logical functionality can never be divorced from system performance
In commercial (especially consumer-oriented) embedded systems, system price is of utmost concern
Quality of results (e.g. speed, but also power, device cost)
Programmer productivity (how long does all this take?)
Basic C language constructs like loops, conditional statements and basic data types (char, int, float)
IXP library defines additional data types, macros and functions (useful for common networking applications)
Memory management is user-defined, hence memory allocation must be declared explicitly (and there is no support for pointers).
Teja was founded by Akash Deshpande, a student of Prof. Pravin Varaiya
Based on his thesis "Control of Hybrid Systems" (1994)
Teja Language Features
User interacts mostly with the graphical interface (which exports pre-defined application primitives)
Extending the Teja primitives is done via an FSM-based model (however, this still requires coding in assembly via the graphical interface)
Memory management for pre-defined primitives is done by Teja. The user can alter this process (but doing so is tedious and error-prone)
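The FSM-based extension model mentioned above can be pictured with a small table-driven state machine. This is only a sketch in Python, not Teja's actual notation or API; the state names, event names, and the `step`/`run` helpers are all hypothetical and chosen for illustration.

```python
# Hypothetical transition table for a packet-handling primitive:
# (current state, event) -> next state. None of these names come from Teja.
TRANSITIONS = {
    ("WAIT_PACKET", "packet_arrived"): "LOOKUP",
    ("LOOKUP", "route_found"): "FORWARD",
    ("LOOKUP", "no_route"): "DROP",
    ("FORWARD", "done"): "WAIT_PACKET",
    ("DROP", "done"): "WAIT_PACKET",
}


def step(state, event):
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state} on {event}") from None


def run(events, state="WAIT_PACKET"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Keeping the behaviour in a table makes the legal transitions easy to inspect, which is one reason FSM models are attractive for graphical tools.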
Teja Features
Our own NPU programming environment: NPClick
Based on Click
  Popular environment for describing/implementing network applications
  Developed by Eddie Kohler, MIT => ICSI
NPClick
  Implemented subset of element library in IXP uC
  Element communication via function calls
  header in SRAM
  payload in DRAM
Designer needs to specify:
  thread boundaries
  thread/uEngine assignment
  memory allocation of queues (SRAM, DRAM, Scratch)
Opportunities for optimization (future work)
  redundant memory loads/stores based on element/thread mapping
  schemes for multiplexing hardware resources among multiple element instantiations (e.g. muxing TFIFO among 8 ToDevice elements)
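To make the element model above more concrete, here is a minimal sketch in Python (not NPClick or IXP uC code). It shows elements handing packets to each other by plain function calls, a queue that marks a thread boundary and declares its memory region, and a designer-supplied thread/uEngine mapping. Every class, function, and element name here is hypothetical, and the TTL check is just a stand-in for real element logic.

```python
from dataclasses import dataclass

# Memory regions named after the ones listed on the slide.
MEMORY_REGIONS = {"SRAM", "DRAM", "Scratch"}


@dataclass
class Packet:
    header: dict    # on the device, the header would be kept in SRAM
    payload: bytes  # and the payload in DRAM


class Element:
    """A processing stage that passes packets on with a direct call."""

    def __init__(self, name):
        self.name = name
        self.next = None

    def connect(self, other):
        self.next = other
        return other  # returning the successor allows chained connects

    def push(self, packet):
        packet = self.process(packet)
        if packet is not None and self.next is not None:
            self.next.push(packet)  # element communication via function call

    def process(self, packet):
        return packet


class DecrementTTL(Element):
    def process(self, packet):
        if packet.header["ttl"] <= 1:
            return None  # drop expired packets
        packet.header["ttl"] -= 1
        return packet


class Queue(Element):
    """Ends the call chain, so it acts as a thread boundary in this sketch."""

    def __init__(self, name, region):
        if region not in MEMORY_REGIONS:
            raise ValueError(f"unknown memory region: {region}")
        super().__init__(name)
        self.region = region
        self.items = []

    def push(self, packet):
        self.items.append(packet)


def build_pipeline():
    source = Element("Source")
    queue = Queue("Queue", region="SRAM")
    source.connect(DecrementTTL("DecrementTTL")).connect(queue)
    return source, queue


# Designer-supplied mapping: element name -> (uEngine index, thread index).
MAPPING = {"Source": (0, 0), "DecrementTTL": (0, 0), "Queue": (1, 0)}
```

In this sketch the two elements before the queue share one thread, so they communicate by function call alone; only the queue needs a memory allocation decision.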
Programming Models for IXP1200
[Figure: chart of throughput (Mb/s, axis from 0 to 1400) for Click, Teja, uEngineC, and ASM]
Productivity Estimates
"First-time" learning-curve issues make it difficult to compare the productivity of these approaches
Based on our experience, we estimate the following design times for implementing an IPv4 router

          Time to functional correctness   Additional time for performance tuning
ASM       8 weeks                          8 weeks
uC        4 weeks                          6 weeks
Teja      2 weeks                          3-4 weeks
NPClick   2 days                           2 weeks

The advantages of Teja and NPClick come from the ability to perform design-space exploration at a higher level
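The estimates above can be added up to compare total design effort. The sketch below does that arithmetic in Python; the 5-working-day week is an assumption made here to put "2 days" and "weeks" on one scale, and the helper names are hypothetical.

```python
WORKDAYS_PER_WEEK = 5  # assumption, not stated on the slide


def weeks(low, high=None):
    """Return a (min, max) range in working days for an estimate in weeks."""
    high = low if high is None else high
    return (low * WORKDAYS_PER_WEEK, high * WORKDAYS_PER_WEEK)


def days(count):
    """Return a (min, max) range in working days for an estimate in days."""
    return (count, count)


# approach -> (time to functional correctness, additional tuning time)
ESTIMATES = {
    "ASM": (weeks(8), weeks(8)),
    "uC": (weeks(4), weeks(6)),
    "Teja": (weeks(2), weeks(3, 4)),
    "NPClick": (days(2), weeks(2)),
}


def total_days(approach):
    """Sum both phases and return the (min, max) total in working days."""
    functional, tuning = ESTIMATES[approach]
    return (functional[0] + tuning[0], functional[1] + tuning[1])
```

Under that assumption the totals come to 80 working days for ASM, 50 for uC, 25 to 30 for Teja, and 12 for NPClick.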
Conclusions: Programming Embedded Systems
Neither ASICs nor general-purpose processors will fill the needs of most embedded system applications
System design teams will increasingly choose ASIPs/programmable platforms
Programming these devices is a new challenge:
  Parallelism
  Special-purpose execution units
Need to develop matches between application development environments and programming models of ASIPs/programmable platforms
Match must consider:
  Efficiency
  Productivity
  Robustness
  Reliability