CS 110 Computer Architecture
Single-Cycle CPU Datapath & Control
Instructor: Sören Schwertfeger
http://shtech.org/courses/ca/
School of Information Science and Technology SIST
ShanghaiTech University
Slides based on UC Berkeley's CS61C
Processor Design: 5 steps
Step 1: Analyze instruction set to determine datapath requirements
– Meaning of each instruction is given by register transfers
– Datapath must include storage element for ISA registers
– Datapath must support each register transfer
Step 2: Select set of datapath components & establish clock methodology
Step 3: Assemble datapath components that meet the requirements
Step 4: Analyze implementation of each instruction to determine setting of control points that realizes the register transfer
Step 5: Assemble the control logic
Register-Register Timing: One Complete Cycle (Add/Sub)
[Figure: timing diagram for one clock cycle (Clk). Each signal changes from Old Value to New Value in turn: PC; then Rs, Rt, Rd, Op, Func after the Instruction Memory Access Time; ALUctr and RegWr after the Delay through Control Logic; busA, busB after the Register File Access Time; busW after the ALU Delay. The end of the cycle is labelled "Register Write Occurs Here". Beside it: RegFile (Rw, Ra, Rb fed by 5-bit Rd, Rs, Rt; RegWr; clk) driving the ALU (ALUctr) over 32-bit busA and busB, with the 32-bit result on busW.]
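A Single Cycle Datapath
[Figure: complete single-cycle datapath. Instruction fetch unit: PC (clk), Inst Memory (Adr), two adders, PC Ext for imm16, and a mux selected by nPC_sel. Instruction<31:0> is split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. RegFile (Rw, Ra, Rb; RegWr; clk) with a RegDst mux choosing Rt or Rd; 32-bit busA and busB; Extender (ExtOp) and ALUSrc mux feeding the ALU (ALUctr); "=" comparator producing Equal; Data Memory (WrEn = MemWr, Adr, Data In, clk); MemtoReg mux driving busW.]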
Step 3b: Add & Subtract
• R[rd] = R[rs] op R[rt] (addu rd,rs,rt)
– Ra, Rb, and Rw come from instruction’s Rs, Rt, and Rd fields
– ALUctr and RegWr: control logic after decoding the instruction
• … Already defined the register file & ALU

  31..26   25..21   20..16   15..11   10..6    5..0
  op       rs       rt       rd       shamt    funct
  6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

[Figure: 32 x 32-bit Registers (Rw, Ra, Rb fed by 5-bit Rd, Rs, Rt; RegWr; clk) drive 32-bit busA and busB into the ALU (ALUctr); the 32-bit Result returns on busW.]
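The register transfer R[rd] = R[rs] op R[rt] can be sketched in Python. This is a minimal illustration under stated assumptions, not a reference implementation: the helper names `decode_rtype` and `exec_rtype` and the list-based register file are hypothetical, and the funct codes 100000 (add) and 100010 (sub) are the ones this deck uses for those two instructions.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits

def decode_rtype(instr):
    """Split a 32-bit R-type word into its six fields (hypothetical helper)."""
    return {
        "op":    (instr >> 26) & 0x3F,
        "rs":    (instr >> 21) & 0x1F,
        "rt":    (instr >> 16) & 0x1F,
        "rd":    (instr >> 11) & 0x1F,
        "shamt": (instr >> 6) & 0x1F,
        "funct": instr & 0x3F,
    }

def exec_rtype(regs, instr):
    """Apply R[rd] = R[rs] op R[rt] for add/sub to a list of 32 registers."""
    f = decode_rtype(instr)
    bus_a, bus_b = regs[f["rs"]], regs[f["rt"]]  # Ra = Rs, Rb = Rt
    if f["funct"] == 0b100000:      # add
        result = (bus_a + bus_b) & MASK32
    elif f["funct"] == 0b100010:    # sub
        result = (bus_a - bus_b) & MASK32
    else:
        raise ValueError("funct not modelled in this sketch")
    regs[f["rd"]] = result          # Rw = Rd, RegWr asserted
    return regs
```

The destination index comes from the Rd field, which is the point the slide makes about Rw.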
3c: Logical Op (or) with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]

  31..26   25..21   20..16   15..0
  op       rs       rt       immediate
  6 bits   5 bits   5 bits   16 bits

ZeroExt[imm16]:
  31..16              15..0
  0000000000000000    immediate
  16 bits             16 bits

• What about Rt Read?
• Writing to Rt register (not Rd)!!
[Figure: RegFile with a RegDst mux choosing Rt or Rd for Rw; imm16 passes through ZeroExt (16 to 32 bits); an ALUSrc mux selects busB or the extended immediate as the second ALU input (ALUctr).]
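As a small sketch of R[rt] = R[rs] op ZeroExt[imm16] for the `or` case, the Python below zero-extends the immediate and writes the result to Rt rather than Rd. The function names `zero_ext16` and `exec_ori` and the list-based register file are assumptions made for illustration.

```python
def zero_ext16(imm16):
    """Zero-extend a 16-bit immediate: the upper 16 bits of the result are 0."""
    return imm16 & 0xFFFF

def exec_ori(regs, rs, rt, imm16):
    """R[rt] = R[rs] | ZeroExt[imm16]; the destination is Rt, not Rd."""
    regs[rt] = regs[rs] | zero_ext16(imm16)
    return regs
```

Because the immediate is zero-extended, bit 15 of imm16 is not copied into the upper half of the operand.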
3d: Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]   Example: lw rt,rs,imm16

  31..26   25..21   20..16   15..0
  op       rs       rt       immediate
  6 bits   5 bits   5 bits   16 bits

[Figure: datapath for loads. RegFile with RegDst mux (Rt or Rd); Extender (ExtOp) and ALUSrc mux feed the ALU (ALUctr); the ALU output is the Adr input of Data Memory; a MemtoReg mux selects the ALU output or the memory output for busW.]
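A hedged sketch of the load path in Python: the Extender is modelled with an `ext_op` argument, and data memory is a dictionary keyed by byte address. The names `extend` and `exec_lw` and the dictionary memory are assumptions for illustration only.

```python
MASK32 = 0xFFFFFFFF

def extend(imm16, ext_op):
    """Extender: ext_op is "sign" or "zero" (the two ExtOp settings)."""
    imm16 &= 0xFFFF
    if ext_op == "sign" and imm16 & 0x8000:
        return imm16 | 0xFFFF0000   # replicate bit 15 into the upper half
    return imm16

def exec_lw(regs, mem, rs, rt, imm16):
    """R[rt] = Mem[R[rs] + SignExt[imm16]]."""
    addr = (regs[rs] + extend(imm16, "sign")) & MASK32  # ALU performs ADD
    regs[rt] = mem[addr]                                # MemtoReg selects memory
    return regs
```

With a base of 0x100 and an immediate of 0xFFFC (that is, -4), the computed address is 0xFC.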
3e: Store Operations
• Mem[R[rs] + SignExt[imm16]] = R[rt]   Ex.: sw rt, rs, imm16

  31..26   25..21   20..16   15..0
  op       rs       rt       immediate
  6 bits   5 bits   5 bits   16 bits

[Figure: datapath for stores. As for loads, plus busB routed to the Data In port of Data Memory and MemWr driving its WrEn input.]
3f: The Branch Instruction
beq rs, rt, imm16
– mem[PC]   Fetch the instruction from memory
– Equal = (R[rs] == R[rt])   Calculate branch condition
– if (Equal)   Calculate the next instruction’s address
  • PC = PC + 4 + (SignExt(imm16) x 4)
  else
  • PC = PC + 4

  31..26   25..21   20..16   15..0
  op       rs       rt       immediate
  6 bits   5 bits   5 bits   16 bits
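The next-PC rule for beq can be checked with a short Python sketch. The function names `sign_ext16` and `next_pc_beq` are hypothetical; the arithmetic follows the register transfers stated for the branch instruction.

```python
MASK32 = 0xFFFFFFFF

def sign_ext16(imm16):
    """Sign-extend a 16-bit immediate to a 32-bit two's-complement pattern."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16

def next_pc_beq(pc, rs_val, rt_val, imm16):
    """Return the next PC for beq given the two register values."""
    equal = rs_val == rt_val                 # branch condition
    pc_plus_4 = (pc + 4) & MASK32
    if equal:
        # The immediate counts instructions, so it is multiplied by 4.
        return (pc_plus_4 + (sign_ext16(imm16) << 2)) & MASK32
    return pc_plus_4
```

A negative immediate branches backwards: with imm16 = 0xFFFF (that is, -1) a taken branch at 0x1000 goes to 0x1000 again.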
Datapath for Branch Operations
beq rs, rt, imm16
Datapath generates condition (Equal)

  31..26   25..21   20..16   15..0
  op       rs       rt       immediate
  6 bits   5 bits   5 bits   16 bits

Already have mux, adder; need special sign extender for PC, need equal compare (sub?)
[Figure: instruction fetch side: PC (clk), Inst Address, two adders (PC + 4 and branch target), PC Ext for imm16, and a mux selected by nPC_sel. Register side: RegFile (Rw, Ra, Rb; Rs, Rt; RegWr; clk) drives busA and busB into an "=" comparator that produces Equal.]
Instruction Fetch Unit including Branch
• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4; else PC = PC + 4

  op   rs   rt   immediate   (bits 31..26, 25..21, 20..16, 15..0)

• How to encode nPC_sel?
  • Direct MUX select?
  • Branch inst. / not branch inst.
• Let’s pick 2nd option

  nPC_sel   zero?   MUX
  0         x       0
  1         0       0
  1         1       1

Q: What logic gate?
[Figure: instruction fetch unit. PC (clk) addresses Inst Memory (Adr), which outputs Instruction<31:0>; two adders and PC Ext (imm16) feed a mux whose select, MUX ctrl, is derived from nPC_sel and Equal.]
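The truth table for the mux select can be reproduced with a one-line Python function. Reading the table, the output is 1 only when both inputs are 1, which is an AND gate (the combined datapath diagram labels the select "nPC_sel & Equal"). The function name `pc_mux_select` is an assumption for illustration.

```python
def pc_mux_select(npc_sel, equal):
    """MUX control: take the branch target only for a branch whose condition holds."""
    return npc_sel & equal  # AND of two 1-bit signals

# Enumerate the table rows as (nPC_sel, zero?, MUX).
rows = [(n, z, pc_mux_select(n, z)) for n in (0, 1) for z in (0, 1)]
```

The two rows with nPC_sel = 0 correspond to the single "0 x 0" row of the slide's table.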
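Putting it All Together: A Single Cycle Datapath
[Figure: complete single-cycle datapath. Instruction fetch unit: PC (clk), Inst Memory (Adr), two adders, PC Ext for imm16, and a mux selected by nPC_sel & Equal. Instruction<31:0> is split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. RegFile (Rw, Ra, Rb; RegWr; clk) with a RegDst mux choosing Rt or Rd; 32-bit busA and busB; Extender (ExtOp) and ALUSrc mux feeding the ALU (ALUctr); "=" comparator producing Equal; Data Memory (WrEn = MemWr, Adr, Data In, clk); MemtoReg mux driving busW.]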
Question
What new instruction would need no new datapath hardware?
• A: branch if reg == immediate
• B: add two registers and branch if result zero
• C: store with auto-increment of base address:
– sw rt, rs, offset // rs incremented by offset after store
• D: shift left logical by two bits
[Figure: the complete single-cycle datapath, with the PC mux select labelled nPC_sel & Equal.]
Datapath Control Signals
• ExtOp: “zero”, “sign”
• ALUsrc: 0 => regB; 1 => immed
• ALUctr: “ADD”, “SUB”, “OR”
• nPC_sel: 1 => branch
• MemWr: 1 => write memory
• MemtoReg: 0 => ALU; 1 => Mem
• RegDst: 0 => “rt”; 1 => “rd”
• RegWr: 1 => write register
[Figure: the single-cycle datapath with each control point labelled: nPC_sel & Equal on the PC mux, RegDst on the Rw mux, RegWr on the RegFile, ExtOp on the Extender, ALUSrc on the ALU input mux, ALUctr on the ALU, MemWr on the Data Memory WrEn, MemtoReg on the busW mux.]
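To see how these control signals steer the datapath, here is a minimal Python model of one cycle of the register/ALU/memory portion. It is a sketch under stated assumptions: the function names, the dictionary of control values, and the dictionary-based memory are hypothetical, and instruction fetch is left out.

```python
MASK32 = 0xFFFFFFFF

def extend(imm16, ext_op):
    """Extender controlled by ExtOp ("zero" or "sign")."""
    imm16 &= 0xFFFF
    if ext_op == "sign" and imm16 & 0x8000:
        return imm16 | 0xFFFF0000
    return imm16

def alu(a, b, alu_ctr):
    """ALU controlled by ALUctr ("ADD", "SUB", "OR")."""
    if alu_ctr == "ADD":
        return (a + b) & MASK32
    if alu_ctr == "SUB":
        return (a - b) & MASK32
    if alu_ctr == "OR":
        return a | b
    raise ValueError(alu_ctr)

def datapath_cycle(regs, mem, rs, rt, rd, imm16, ctl):
    """One cycle; returns the Equal signal (busA == busB)."""
    bus_a, bus_b = regs[rs], regs[rt]
    # ALUSrc: 0 => regB, 1 => immed
    operand_b = extend(imm16, ctl["ExtOp"]) if ctl["ALUSrc"] else bus_b
    result = alu(bus_a, operand_b, ctl["ALUctr"])
    if ctl["MemWr"]:                       # 1 => write memory
        mem[result] = bus_b
    # MemtoReg: 0 => ALU, 1 => Mem
    bus_w = mem.get(result, 0) if ctl["MemtoReg"] else result
    if ctl["RegWr"]:                       # 1 => write register
        regs[rd if ctl["RegDst"] else rt] = bus_w   # RegDst: 0 => rt, 1 => rd
    return bus_a == bus_b
```

Each `if` corresponds to one mux or write enable in the diagram, so changing a single entry of `ctl` changes exactly one routing decision.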
Given Datapath: RTL → Control
[Figure: Instruction Memory (Address) outputs Instruction<31:0>. Control reads Op <26:31> and Fun <0:5> and drives nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg into the DATAPATH, which also receives Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>.]
RTL: The Add Instruction
add rd, rs, rt
– MEM[PC]   Fetch the instruction from memory
– R[rd] = R[rs] + R[rt]   The actual operation
– PC = PC + 4   Calculate the next instruction’s address

  31..26   25..21   20..16   15..11   10..6    5..0
  op       rs       rt       rd       shamt    funct
  6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
Instruction Fetch Unit at the Beginning of Add
• Fetch the instruction from Instruction memory: Instruction = MEM[PC]
– same for all instructions
[Figure: instruction fetch unit. PC (clk) supplies Inst Address to Inst Memory, which outputs Instruction<31:0>; two adders and PC Ext (imm16) feed a mux selected by nPC_sel.]
Single Cycle Datapath during Add
R[rd] = R[rs] + R[rt]

  op   rs   rt   rd   shamt   funct   (bits 31..26, 25..21, 20..16, 15..11, 10..6, 5..0)

Control signal settings:
  nPC_sel = +4   RegDst = 1   RegWr = 1   ALUSrc = 0
  ExtOp = x      ALUctr = ADD   MemWr = 0   MemtoReg = 0
[Figure: the single-cycle datapath annotated with these values; the instruction fetch unit supplies Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>.]
Instruction Fetch Unit at End of Add
• PC = PC + 4
– Same for all instructions except: Branch and Jump
[Figure: instruction fetch unit with nPC_sel = +4, so the mux passes PC + 4 to the PC.]
Single Cycle Datapath during Jump

  31..26   25..0
  op       target address      J-type (jump)

• New PC = {PC[31..28], target address, 00}
Control signal settings:
  nPC_sel = ?   Jump = 1     RegWr = 0    MemWr = 0
  RegDst = x    ALUSrc = x   ExtOp = x    ALUctr = x   MemtoReg = x
[Figure: the single-cycle datapath (32 x 32-bit Registers, Extender, ALU, Data Memory, three muxes) annotated with these values. The Instruction Fetch Unit receives nPC_sel, Jump, Zero, Imm16 <0:15>, and the 26-bit target address TA26 <0:25>.]
Instruction Fetch Unit at the End of Jump
• New PC = {PC[31..28], target address, 00}

  31..26   25..0
  op       target address      J-type (jump)

How do we modify this to account for jumps?
[Figure: instruction fetch unit. PC (Clk) addresses Inst Memory (Adr), which outputs Instruction<31:0>; two adders (with 4 and with imm16 followed by 00) feed a mux whose select, nPC_MUX_sel, is derived from nPC_sel and Zero. Modification: a second mux, selected by Jump, chooses between that result and the jump address formed from the 4 MSBs of PC, TA, and 00.]
Query
• Can Zero still get asserted?
• Does nPC_sel need to be 0?
  • If not, what?
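A Python sketch of the modified fetch unit may help with the Query. It assumes the Jump mux sits after the branch mux, and it follows the slide's formula in taking the upper four bits from PC. The function name `next_pc` and its argument order are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def next_pc(pc, imm16, target26, npc_sel, equal, jump):
    """Next-PC logic with a branch mux followed by a jump mux."""
    pc_plus_4 = (pc + 4) & MASK32
    offset = imm16 & 0xFFFF
    if offset & 0x8000:
        offset |= 0xFFFF0000                      # sign-extend imm16
    branch_target = (pc_plus_4 + (offset << 2)) & MASK32
    # First mux: nPC_MUX_sel = nPC_sel AND Zero/Equal
    branch_mux = branch_target if (npc_sel & equal) else pc_plus_4
    # Jump address: {PC[31..28], target address, 00}
    jump_addr = (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)
    # Second mux: Jump selects the jump address
    return jump_addr if jump else branch_mux
```

In this arrangement the first mux's output is discarded whenever `jump` is 1, so the values of `npc_sel` and `equal` do not change the result for a jump.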
Question
Which of the following is TRUE?
A. The clock can have a shorter period for instructions that don’t use memory
B. The ALU is used to set PC to PC+4 when necessary
C. Worst-delay path in Instruction Fetch unit is Add + mux delay
D. The CPU’s control needs only opcode to determine the next PC value to select
E. npc_sel affects the next PC address on a jump
Summary of the Control Signals (1/2)
inst   Register Transfer
add    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“ADD”, RegDst=rd, RegWr, nPC_sel=“+4”
sub    R[rd] ← R[rs] – R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“SUB”, RegDst=rd, RegWr, nPC_sel=“+4”
ori    R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
       ALUsrc=Im, Extop=“Z”, ALUctr=“OR”, RegDst=rt, RegWr, nPC_sel=“+4”
lw     R[rt] ← MEM[R[rs] + sign_ext(Imm16)]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemtoReg, RegDst=rt, RegWr, nPC_sel=“+4”
sw     MEM[R[rs] + sign_ext(Imm16)] ← R[rt]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemWr, nPC_sel=“+4”
beq    if (R[rs] == R[rt]) then PC ← PC + 4 + (sign_ext(Imm16) || 00) else PC ← PC + 4
       nPC_sel=“br”, ALUctr=“SUB”
Summary of the Control Signals (2/2)
See Appendix A

              add       sub       ori       lw        sw        beq       jump
  op          00 0000   00 0000   00 1101   10 0011   10 1011   00 0100   00 0010
  func        10 0000   10 0010   We Don’t Care :-)
  RegDst      1         1         0         0         x         x         x
  ALUSrc      0         0         1         1         1         0         x
  MemtoReg    0         0         0         1         x         x         x
  RegWrite    1         1         1         1         0         0         0
  MemWrite    0         0         0         0         1         0         0
  nPCsel      0         0         0         0         0         1         ?
  Jump        0         0         0         0         0         0         1
  ExtOp       x         x         0         1         1         x         x
  ALUctr<2:0> Add       Subtract  Or        Add       Add       Subtract  x

Instruction formats (bit boundaries at 31, 26, 21, 16, 11, 6, 0):
  R-type   op  rs  rt  rd  shamt  funct     add, sub
  I-type   op  rs  rt  immediate            ori, lw, sw, beq
  J-type   op  target address               jump
Boolean Expressions for Controller
  RegDst    = add + sub
  ALUSrc    = ori + lw + sw
  MemtoReg  = lw
  RegWrite  = add + sub + ori + lw
  MemWrite  = sw
  nPCsel    = beq
  Jump      = jump
  ExtOp     = lw + sw
  ALUctr[0] = sub + beq   (assume ALUctr is 00 ADD, 01 SUB, 10 OR)
  ALUctr[1] = ori

Where:
  rtype = ~op5 ∙ ~op4 ∙ ~op3 ∙ ~op2 ∙ ~op1 ∙ ~op0
  ori   = ~op5 ∙ ~op4 ∙ op3 ∙ op2 ∙ ~op1 ∙ op0
  lw    = op5 ∙ ~op4 ∙ ~op3 ∙ ~op2 ∙ op1 ∙ op0
  sw    = op5 ∙ ~op4 ∙ op3 ∙ ~op2 ∙ op1 ∙ op0
  beq   = ~op5 ∙ ~op4 ∙ ~op3 ∙ op2 ∙ ~op1 ∙ ~op0
  jump  = ~op5 ∙ ~op4 ∙ ~op3 ∙ ~op2 ∙ op1 ∙ ~op0

  add = rtype ∙ func5 ∙ ~func4 ∙ ~func3 ∙ ~func2 ∙ ~func1 ∙ ~func0
  sub = rtype ∙ func5 ∙ ~func4 ∙ ~func3 ∙ ~func2 ∙ func1 ∙ ~func0

How do we implement this in gates?
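Before turning the equations into gates, they can be checked in software. The Python sketch below evaluates the same product and sum terms on individual opcode and func bits; the function name `control` and the dictionary output are assumptions, and `ALUctr[1]` is read here as the `ori` instruction signal.

```python
def control(op, func):
    """Evaluate the controller equations for a 6-bit opcode and 6-bit func."""
    o = [(op >> i) & 1 for i in range(6)]      # o[i] is opcode bit i
    f = [(func >> i) & 1 for i in range(6)]    # f[i] is func bit i
    n = lambda b: 1 - b                        # 1-bit NOT

    # "AND" terms: one per instruction
    rtype = n(o[5]) & n(o[4]) & n(o[3]) & n(o[2]) & n(o[1]) & n(o[0])
    ori   = n(o[5]) & n(o[4]) & o[3] & o[2] & n(o[1]) & o[0]
    lw    = o[5] & n(o[4]) & n(o[3]) & n(o[2]) & o[1] & o[0]
    sw    = o[5] & n(o[4]) & o[3] & n(o[2]) & o[1] & o[0]
    beq   = n(o[5]) & n(o[4]) & n(o[3]) & o[2] & n(o[1]) & n(o[0])
    jump  = n(o[5]) & n(o[4]) & n(o[3]) & n(o[2]) & o[1] & n(o[0])
    add   = rtype & f[5] & n(f[4]) & n(f[3]) & n(f[2]) & n(f[1]) & n(f[0])
    sub   = rtype & f[5] & n(f[4]) & n(f[3]) & n(f[2]) & f[1] & n(f[0])

    # "OR" terms: one per control signal
    return {
        "RegDst":   add | sub,
        "ALUSrc":   ori | lw | sw,
        "MemtoReg": lw,
        "RegWrite": add | sub | ori | lw,
        "MemWrite": sw,
        "nPCsel":   beq,
        "Jump":     jump,
        "ExtOp":    lw | sw,
        "ALUctr":   (ori << 1) | (sub | beq),  # 00 ADD, 01 SUB, 10 OR
    }
```

Don't-care entries of the control table come out as 0 here, which is one valid choice among several.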
Controller Implementation

  ADD    0000 00ss ssst tttt dddd d000 0010 0000
  SUB    0000 00ss ssst tttt dddd d000 0010 0010
  ORI    0011 01ss ssst tttt iiii iiii iiii iiii
  LW     1000 11ss ssst tttt iiii iiii iiii iiii
  SW     1010 11ss ssst tttt iiii iiii iiii iiii
  BEQ    0001 00ss ssst tttt iiii iiii iiii iiii
  JUMP   0000 10ii iiii iiii iiii iiii iiii iiii

[Figure: opcode and func bits feed an “AND” logic block that produces the instruction signals add, sub, ori, lw, sw, beq, jump; an “OR” logic block combines them into RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, nPCsel, Jump, ExtOp, ALUctr[0], ALUctr[1].]
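The bit patterns above can be reproduced by packing fields into a 32-bit word. The Python helpers below are a minimal sketch; the names `encode_r`, `encode_i`, and `encode_j` are hypothetical, and the field positions are those of the R-, I-, and J-type formats.

```python
def encode_r(rs, rt, rd, funct, shamt=0):
    """R-type: op = 000000 | rs | rt | rd | shamt | funct."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, imm16):
    """I-type: op | rs | rt | 16-bit immediate."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)

def encode_j(op, target26):
    """J-type: op | 26-bit target address."""
    return (op << 26) | (target26 & 0x03FFFFFF)

# add with rs=1, rt=2, rd=3 and lw with rs=2, rt=8, offset 0
add_word = encode_r(1, 2, 3, 0b100000)
lw_word = encode_i(0b100011, 2, 8, 0)
```

Formatting a word with `format(word, "032b")` gives a bit string that can be compared against the ss/tt/dd/ii patterns.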
P&H Figure 4.17
Summary: Single-cycle Processor
• Five steps to design a processor:
1. Analyze instruction set → datapath requirements
2. Select set of datapath components & establish clock methodology
3. Assemble datapath meeting the requirements
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
5. Assemble the control logic
  • Formulate Logic Equations
  • Design Circuits
[Figure: block diagram of Processor (Control, Datapath), Memory, Input, Output.]
Levels of Representation/Interpretation
High Level Language Program (e.g., C)
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  ↓ Compiler
Assembly Language Program (e.g., MIPS)
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
  ↓ Assembler
Machine Language Program (MIPS)
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  ↓ Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
  ↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)

Anything can be represented as a number, i.e., data or instructions
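The correspondence between the high-level swap and the four lw/sw instructions can be traced in Python. This is an illustrative sketch: the function name `run_swap`, the dictionary memory keyed by byte address, and the `base` argument standing in for register $2 are assumptions.

```python
def run_swap(mem, base):
    """Execute the four-instruction sequence on a word-granular memory dict."""
    regs = {}
    regs["t0"] = mem[base + 0]   # lw $t0, 0($2)
    regs["t1"] = mem[base + 4]   # lw $t1, 4($2)
    mem[base + 0] = regs["t1"]   # sw $t1, 0($2)
    mem[base + 4] = regs["t0"]   # sw $t0, 4($2)
    return mem

# v[k] lives at address 0x100 and v[k+1] at 0x104 (4-byte words).
memory = run_swap({0x100: 11, 0x104: 22}, 0x100)
```

The two registers play the role of `temp` in the C version: both values are loaded before either is stored back.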
No More Magic!
Software
  Application (ex: browser)
  Compiler, Assembler, Operating System (Mac OSX)
Instruction Set Architecture
Hardware
  Processor, Memory, I/O system
  Datapath & Control
  Digital Design
  Circuit Design
  Transistors