Design Methodology for Customizable Programmable Processors
Berkeley – Finland Day, Oct. 18, 2002
Prof. Jarmo Takala
Institute of Digital and Computer Systems
Tampere University of Technology
Tampere, Finland
Tel: +358 – 33115 3879; Email: [email protected]
Outline
- Motivation
- Transport Triggered Architecture (TTA)
- Design Methodology for TTAs
- Research at TUT
- Conclusions
J.Takala/TUT Berkeley – Finland Day, Oct.18, 2002
Motivation
Programmable processors are often used in products that rely on digital signal processing (DSP)
- Flexibility
- Ease of verification
Traditionally, DSP processor architectures have been developed based on average performance in several benchmark tasks (~100)
- User applications often contain only a subset of the total benchmarks
Efficiency can be improved by customizing the architecture according to the given tasks
Motivation
DSP applications are often hard real-time constrained
- Execution should be deterministic
- Dynamic run-time behaviour should be avoided
- Static scheduling lends itself to DSP
Current design complexity calls for an increase in designer productivity
- High-level languages should be used
DSP algorithms contain inherent parallelism
- Instruction-level parallelism (ILP) should be maximized
What is needed?
Application-driven design process with easy design space exploration
Replace hardware complexity by software complexity
- Compiler-driven process
Use a templated architecture
- Flexible: heterogeneous function units
- Modular: scalability
- Orthogonal: compiler friendly
Choices for Architecture Template
[Figure: ILP architectures. Division of work between compilation time (software) and run time (hardware) for sequential (superscalar), dependence (dataflow), independence (EPIC) and independence (VLIW) architectures. Stages shown: frontend, determine dependencies, determine independencies, bind function units, bind datapaths and execute.]
VLIW Gained Popularity in DSP
[Figure: VLIW processor block diagram. Instruction memory, instruction fetch, instruction decode, a CPU with function units FU-1 to FU-5, bypassing network, register file, and data memory.]
Transport Triggered Architecture
VLIW drawbacks
- Bypass complexity
- Register file complexity
- Register file design restricts FU flexibility
- Operation encoding format restricts FU flexibility
Reverse programming paradigm [H. Corporaal, 94]
- Data transport triggers the operation
Instruction set contains only a single instruction: move
From VLIW to TTA
[Figure: two block diagrams side by side. VLIW: instruction memory, instruction fetch, instruction decode, function units FU-1 to FU-5, bypassing network, register file, data memory. TTA: instruction fetch, instruction decode, bypassing network, function units FU-1 to FU-5, register file.]
TTA Datapath
[Figure: TTA datapath. Two integer ALUs, a float ALU, boolean, float and integer register files, two load/store units, an immediate unit and an instruction unit, attached through sockets; instruction memory and data memory.]
Function Units
Operands written to operand registers (O)
Operation performed when last operand written to trigger register (T)
Pipeline synchronized with control bits (C)
Standard interface
- FU_ready
- Result_ready
- Global_lock
[Figure: function unit pipeline. Operand register (O) with optional shadow register, trigger register (T), logic stages, result register (R), and control bits (C) for each stage.]
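The trigger behaviour described on this slide can be illustrated with a small behavioural model. This is only a sketch under simplifying assumptions: the class name `FunctionUnit` and its methods are hypothetical, the unit completes in a single step, and the pipeline control bits and the FU_ready / Result_ready / Global_lock signals are not modelled.

```python
class FunctionUnit:
    """Toy model of a TTA function unit with O, T and R registers.

    Hypothetical illustration only: single-step operation, no pipeline,
    no control bits, no handshake signals.
    """

    def __init__(self, operation):
        self.operation = operation  # two-input callable, e.g. a multiply
        self.o = 0                  # operand register (O)
        self.r = None               # result register (R), empty until triggered

    def write_operand(self, value):
        # A write to O only stores the value; nothing is computed.
        self.o = value

    def write_trigger(self, value):
        # A write to T starts the operation on (O, T); the result lands in R.
        self.r = self.operation(self.o, value)

    def read_result(self):
        return self.r


mul = FunctionUnit(lambda a, b: a * b)
mul.write_operand(6)          # no result yet
mul.write_trigger(7)          # the transport to T triggers the multiply
print(mul.read_result())      # prints 42
```

The point of the sketch is that no opcode is ever issued to the unit: the operation is a side effect of the data transport into the trigger register.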
ILP Architectures
[Figure: division of work between compilation time and run time for sequential (superscalar), dependence (dataflow), independence (EPIC), independence (VLIW) and independence (TTA) architectures. Stages shown: frontend, determine dependencies, determine independencies, bind function units, bind datapaths, execute.]
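The compile-time / run-time split shown in this figure can be written down as a small table. The sketch below is one reading of the figure, not a reproduction of it: the stage boundaries per architecture follow the commonly cited classification of ILP architectures and are an assumption here, as are the names `STAGES`, `COMPILE_TIME_STAGES` and `split_stages`.

```python
# Stages between the compiler frontend and execution, in order.
STAGES = [
    "determine dependencies",
    "determine independencies",
    "bind function units",
    "bind datapaths",
]

# Assumed number of leading stages handled at compilation time.
# Everything after that point is left to the hardware at run time.
COMPILE_TIME_STAGES = {
    "sequential (superscalar)": 0,
    "dependence (dataflow)": 1,
    "independence (EPIC)": 2,
    "independence (VLIW)": 3,
    "independence (TTA)": 4,
}


def split_stages(architecture):
    """Return (compile-time stages, run-time stages) for an architecture."""
    n = COMPILE_TIME_STAGES[architecture]
    return STAGES[:n], STAGES[n:] + ["execute"]


for arch in COMPILE_TIME_STAGES:
    software, hardware = split_stages(arch)
    print(f"{arch}: compiler does {len(software)} stage(s), hardware does {hardware}")
```

Under this reading, TTA is the end point of the progression: only execution remains for the hardware, which matches the goal of replacing hardware complexity by software complexity.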
TTA Characteristics: HW
Modular
- Can be constructed with standard building blocks
Very flexible and scalable
- FU functionality can be arbitrary
- Supports user-defined Special Function Units (SFU)
Lower complexity
- Reduction in number of register ports
- Reduced bypass complexity
- Reduction in bypass connectivity
- Reduced register pressure
- Trivial decoding (implies long instructions)
TTA Characteristics: SW
Traditional operation-triggered instruction:
    mul r1, r2, r3;
Transport-triggered instruction:
    r1 -> mul.o; r2 -> mul.t; mul.r -> r3;
or
    r1 -> mul.o, r2 -> mul.t; mul.r -> r3;
Resembles dataflow and time-stationary coding
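The rewrite of one operation-triggered instruction into data transports is mechanical enough to sketch in a few lines. The helper names `to_moves` and `format_moves` are hypothetical, each transport is written here as `source -> destination`, and the sketch assumes a two-operand operation whose last operand goes to the trigger register.

```python
def to_moves(op, src1, src2, dst):
    """Expand 'op src1, src2, dst' into three data transports."""
    return [
        (src1, f"{op}.o"),   # first operand into the operand register
        (src2, f"{op}.t"),   # second operand into the trigger register
        (f"{op}.r", dst),    # result register into the destination
    ]


def format_moves(moves):
    """Render transports as 'source -> destination;' in sequence."""
    return "; ".join(f"{src} -> {dst}" for src, dst in moves) + ";"


print(format_moves(to_moves("mul", "r1", "r2", "r3")))
# prints: r1 -> mul.o; r2 -> mul.t; mul.r -> r3;
```

A real compiler would additionally schedule independent transports onto parallel buses, which is what the comma-separated variant on the slide expresses; this sketch only emits them in sequence.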
TTA Design Tools
Design tools based on the TTA architecture template have been developed at Delft University of Technology (DUT), Delft, the Netherlands
- MOVE project led by Prof. Henk Corporaal
Fully parametric C/C++ compiler
- Buses, connections, function units, register files, etc.