BK TP.HCM, 2011
COMPUTER ARCHITECTURE (CS2011)
Faculty of Computer Science and Engineering
Department of Computer Engineering
Đinh Đức Anh Vũ, http://www.cse.hcmut.edu.vn/~anhvu
©2011, Dr. Dinh Duc Anh Vu
8/2/2019 CA 4 1 Handout
http://slidepdf.com/reader/full/ca-4-1-handout 1/43
Review: Design Principles
• Simplicity favors regularity
  – fixed size instructions
  – small number of instruction formats
  – opcode always the first 6 bits
• Smaller is faster
  – limited instruction set
  – limited number of registers in register file
  – limited number of addressing modes
• Make the common case fast
  – arithmetic operands from the register file (load-store machine)
  – allow instructions to contain immediate operands
• Good design demands good compromises
  – same instruction length
  – single instruction format => 3 instruction formats
The Processor: Datapath & Control
• We're ready to look at an implementation of MIPS
• Simplified to contain only:
  – memory-reference instructions: lw, sw
  – arithmetic-logical instructions: add, addu, sub, subu, and, or, xor, nor, slt, sltu
  – arithmetic-logical immediate instructions: addi, addiu, andi, ori, xori, slti, sltiu
  – control flow instructions: beq, j
• Generic implementation:
  – use the PC to supply the instruction address and fetch the instruction from memory (and update the PC)
  – decode the instruction (and read registers)
  – execute the instruction
• All instructions (except j) use the ALU after reading the registers
  – How? memory-reference? arithmetic? control flow?
[Figure: instruction cycle: Fetch (PC = PC + 4) -> Decode -> Execute]
Abstract Implementation View
• Two types of functional units:
  – elements that operate on data values (combinational)
  – elements that contain state (sequential)
• Single cycle operation
• Split memory (Harvard) model – one memory for instructions and one for data
[Figure: abstract datapath: PC, Instruction Memory (Address, Instruction), Register File (three Reg Addr inputs, Write Data, two Read Data outputs), ALU, Data Memory (Address, Write Data, Read Data)]
Aside: Clocking Methodologies
• The clocking methodology defines when data in a state element is valid and stable relative to the clock
  – State elements – a memory element such as a register
  – Edge-triggered – all state changes occur on a clock edge
• Typical execution
  – read contents of state elements -> send values through combinational logic -> write results to one or more state elements
• Assumes state elements are written on every clock cycle; if not, need explicit write control signal
  – write occurs only when both the write control is asserted and the clock edge occurs
[Figure: State element 1 -> Combinational logic -> State element 2, both state elements driven by the clock; one clock cycle marked]
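The read -> combinational logic -> write pattern above can be sketched in software. This is only an illustrative model, not part of the original slides: the class name `Register`, its methods, and the `increment` function are hypothetical names chosen for this example.

```python
class Register:
    """Hypothetical model of an edge-triggered state element with a write control."""

    def __init__(self, value=0):
        self.value = value    # output seen by combinational logic during the cycle
        self._next = value    # data waiting at the input
        self._write = False   # write control signal for the coming edge

    def set_input(self, data, write_enable=True):
        # Combinational logic drives the input; nothing changes until the edge.
        self._next = data
        self._write = write_enable

    def clock_edge(self):
        # State changes only when the write control is asserted AND the edge occurs.
        if self._write:
            self.value = self._next
        self._write = False


def increment(x):
    """Stand-in for the combinational logic between the two state elements."""
    return (x + 1) & 0xFFFFFFFF


r1 = Register(5)
r2 = Register(0)
r2.set_input(increment(r1.value), write_enable=True)
before = r2.value   # still the old value: the edge has not occurred yet
r2.clock_edge()
after = r2.value    # updated on the clock edge
```

Here `before` is still the old contents of `r2` because the write has been requested but the edge has not happened; `after` holds the new value.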
Building a Datapath
• Datapath
  – Elements that process data and addresses in the CPU
    • Registers, ALUs, mux’s, memories, …
• We will build a MIPS datapath incrementally
  – Refining the overview design
[Figure: PC, Add, and Instruction Memory (Read Address, Instruction)]
Fetching Instructions
• Fetching instructions involves
  – reading the instruction from the Instruction Memory
  – updating the PC value to be the address of the next (sequential) instruction
  – PC is updated every clock cycle, so it does not need an explicit write control signal, just a clock signal
  – Reading from the Instruction Memory is a combinational activity, so it doesn’t need an explicit read control signal
[Figure: fetch hardware: PC -> Read Address of Instruction Memory -> Instruction; Add computes PC + 4; the PC is clocked]
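A minimal sketch of the fetch step, assuming the Instruction Memory is modeled as a Python dict keyed by byte address. The function name `fetch`, the addresses, and the two sample instruction words are assumptions made for illustration only.

```python
def fetch(instruction_memory, pc):
    """Read the instruction at pc and compute the next sequential PC.

    instruction_memory is a hypothetical dict mapping byte addresses to
    32-bit instruction words; the read is purely combinational here.
    """
    instruction = instruction_memory[pc]
    next_pc = (pc + 4) & 0xFFFFFFFF   # the dedicated adder: PC = PC + 4
    return instruction, next_pc


# Two sample words: add $t0, $t1, $t2 and lw $t0, 4($t1)
imem = {0x00400000: 0x012A4020, 0x00400004: 0x8D280004}
instr, pc = fetch(imem, 0x00400000)
```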
Decoding Instructions
• Decoding instructions involves
  – sending the fetched instruction’s opcode and function field bits to the control unit
  and
  – reading two values from the Register File
    • Register File addresses are contained in the instruction
[Figure: Instruction -> Control unit; Instruction -> Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data; outputs Read Data 1, Read Data 2)]
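Decoding amounts to slicing fixed bit fields out of the 32-bit word. The sketch below assumes the standard MIPS field positions (the same ones listed later in this handout); the function name `decode` and the dict keys are hypothetical.

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into its fields (illustrative only)."""
    return {
        "op":    (instr >> 26) & 0x3F,   # bits 31-26, sent to the control unit
        "rs":    (instr >> 21) & 0x1F,   # bits 25-21, Read Addr 1
        "rt":    (instr >> 16) & 0x1F,   # bits 20-16, Read Addr 2
        "rd":    (instr >> 11) & 0x1F,   # bits 15-11, R-type destination
        "shamt": (instr >> 6) & 0x1F,    # bits 10-6
        "funct": instr & 0x3F,           # bits 5-0, function field
        "imm16": instr & 0xFFFF,         # bits 15-0, I-type offset/immediate
    }


fields = decode(0x012A4020)   # add $t0, $t1, $t2
```

All fields are extracted unconditionally, which mirrors the hardware: the register file is read before the control unit has decided which fields matter.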
Executing R Format Operations
• R format operations (add, sub, slt, and, or)
  – perform operation (op and funct) on values in rs and rt
  – store the result back into the Register File (into location rd)
  – Note that Register File is not written every cycle (e.g. sw), so we need an explicit write control signal for the Register File
R-type: op [31-26] | rs [25-21] | rt [20-16] | rd [15-11] | shamt [10-6] | funct [5-0]
[Figure: Instruction -> Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data; RegWrite) -> Read Data 1, Read Data 2 -> ALU (ALU control; outputs overflow, zero)]
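The R-format steps above can be sketched as follows. This is a simplified model under stated assumptions: the register file is a 32-entry Python list, the operation is selected by name rather than by op/funct bits, overflow detection is omitted, and the names `alu` and `execute_r_type` are hypothetical.

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement value."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op, a, b):
    """Return (result, zero) for the five R-format operations on the slide."""
    if op == "add":
        result = (a + b) & MASK
    elif op == "sub":
        result = (a - b) & MASK
    elif op == "and":
        result = a & b
    elif op == "or":
        result = a | b
    elif op == "slt":
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError("unsupported operation: " + op)
    return result, result == 0


def execute_r_type(regs, rs, rt, rd, op, reg_write=True):
    """Operate on regs[rs] and regs[rt]; write regs[rd] only if RegWrite is set."""
    result, _zero = alu(op, regs[rs], regs[rt])
    if reg_write:
        regs[rd] = result
    return result
```

Passing `reg_write=False` leaves the register file untouched, which is the role of the explicit RegWrite control signal.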
Executing Load and Store Operations
• Load and store operations have to
  – compute a memory address by adding the base register (in rs) to the 16-bit signed offset field in the instruction
    • base register was read from the Register File during decode
    • offset value in the low order 16 bits of the instruction must be sign extended to create a 32-bit signed value
  – store value, read from the Register File during decode, must be written to the Data Memory
  – load value, read from the Data Memory, must be stored in the Register File
I-Type: op [31-26] | rs [25-21] | rt [20-16] | address offset [15-0]
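A small sketch of the address computation for lw and sw, assuming the register file is a list and the Data Memory is a dict keyed by byte address. The helper names (`sign_extend16`, `effective_address`, `load_word`, `store_word`) are hypothetical and not taken from the slides.

```python
def sign_extend16(imm16):
    """Sign extend a 16-bit field to a signed Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(base, imm16):
    """Base register plus sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


def load_word(regs, dmem, rs, rt, imm16):
    # lw: the value read from Data Memory is written to register rt
    regs[rt] = dmem[effective_address(regs[rs], imm16)]


def store_word(regs, dmem, rs, rt, imm16):
    # sw: the value read from register rt is written to Data Memory
    dmem[effective_address(regs[rs], imm16)] = regs[rt]
```

For example, an offset field of 0xFFFC sign extends to -4, so with a base of 0x10010008 the access goes to 0x10010004.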
[Figure: datapath for loads and stores: Register File, ALU (ALU control; overflow, zero), Sign Extend, Data Memory (Address, Write Data, Read Data); control signals RegWrite, MemWrite, MemRead]
R-Type/Load/Store Datapath
Executing Branch Operations
• Branch operations have to
  – compare the operands read from the Register File during decode (rs and rt values) for equality (zero ALU output)
  – compute the branch target address by adding the updated PC to the sign extended 16-bit signed offset field in the instruction
    • “base register” is the updated PC
    • offset value in the low order 16 bits of the instruction must be sign extended to create a 32-bit signed value and then shifted left 2 bits to turn the word offset into a byte offset
I-Type: op [31-26] | rs [25-21] | rt [20-16] | address offset [15-0]
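The branch computation can be sketched as below. It assumes that "updated PC" means PC + 4 and models the equality test as a subtraction whose result is checked for zero; the function names are hypothetical.

```python
MASK = 0xFFFFFFFF


def sign_extend16(imm16):
    """Sign extend a 16-bit field to a signed Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc, imm16):
    """(PC + 4) plus the sign-extended offset shifted left by 2."""
    pc_plus_4 = (pc + 4) & MASK
    return (pc_plus_4 + (sign_extend16(imm16) << 2)) & MASK


def next_pc_beq(pc, rs_val, rt_val, imm16):
    """Select the branch target when rs == rt, else fall through to PC + 4."""
    zero = ((rs_val - rt_val) & MASK) == 0   # ALU subtract, zero output
    return branch_target(pc, imm16) if zero else (pc + 4) & MASK
```

With `pc = 0x00400010` and an offset field of 3, the target is 0x00400014 + 12 = 0x00400020.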
Executing Branch Operations, con’t
[Figure: branch datapath: PC and Add (+4); Sign Extend (16 -> 32) and Shift left 2 feed a second Add that produces the branch target address; Register File -> ALU (ALU control), whose zero output goes to the branch control logic]
Executing Jump Operations
• Jump operations have to
  – replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits
J-Type: op [31-26] | jump target address [25-0]
[Figure: jump datapath: PC, Add (+4), Instruction Memory; the low 26 bits of the instruction pass through Shift left 2 (28 bits) and are combined with 4 bits of the PC to form the jump address]
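A sketch of the jump address formation, assuming the upper 4 bits come from PC + 4 (as labeled on the later "Adding the Jump Operation" figure). The function name `jump_target` is hypothetical.

```python
def jump_target(pc, instr):
    """Concatenate the upper 4 bits of PC + 4 with the 26-bit field shifted left 2."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    target26 = instr & 0x03FFFFFF          # Instr[25-0]
    return (pc_plus_4 & 0xF0000000) | (target26 << 2)


# j 0x00400020 encodes as opcode 000010 followed by 0x00400020 >> 2
j_instr = (0b000010 << 26) | (0x00400020 >> 2)
new_pc = jump_target(0x00400000, j_instr)
```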
Creating a Single Datapath
• Assemble the datapath elements, add control lines as needed, and design the control path
• Fetch, decode and execute each instruction in one clock cycle – single cycle design
  – no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., why we have a separate Instruction Memory and Data Memory)
  – to share datapath elements between two different instruction classes will need multiplexors at the input of the shared elements with control lines to do the selection
• Cycle time is determined by length of the longest path
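The multiplexor idea can be illustrated with a two-input selector. This is a sketch only: the function names are hypothetical, and the select polarity (0 picks the register or ALU path, 1 picks the immediate or memory path) is chosen to match the ALUSrc and MemtoReg values in the Main Control Unit table of this handout.

```python
def mux2(select, in0, in1):
    """Two-input multiplexor: select = 0 passes in0, select = 1 passes in1."""
    return in1 if select else in0


def alu_second_operand(alu_src, read_data_2, sign_extended_imm):
    # ALUSrc = 0: register operand (R-type, beq); ALUSrc = 1: immediate (lw, sw)
    return mux2(alu_src, read_data_2, sign_extended_imm)


def register_write_data(mem_to_reg, alu_result, mem_read_data):
    # MemtoReg = 0: ALU result (R-type); MemtoReg = 1: Data Memory output (lw)
    return mux2(mem_to_reg, alu_result, mem_read_data)
```

The shared ALU input is thus fed by either instruction class, and a single control line does the selection.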
Fetch, R, and Memory Access Portions
[Figure: combined datapath: PC, Add (+4), Instruction Memory, Register File (RegWrite), Sign Extend (16 -> 32), ALU (ALU control; ovf, zero), Data Memory (MemWrite, MemRead)]
Multiplexor Insertion
[Figure: the same datapath with multiplexors inserted, controlled by ALUSrc and MemtoReg]
Clock Distribution
[Figure: the same datapath with the System Clock shown; one clock cycle marked]
Adding the Branch Portion
[Figure: the R-type/load/store datapath (PC, Add, Instruction Memory, Register File, Sign Extend, ALU, Data Memory; RegWrite, ALUSrc, MemtoReg, MemWrite, MemRead) to which the branch hardware is added]
Our Simple Control Structure
• We are ignoring some details like register setup and hold times
• We wait for everything to settle down
  – ALU might not produce “right answer” right away
  – Memory and RegFile reads are combinational (as are ALU, adders, muxes, shifter, sign extender)
  – Use write signals along with the clock edge to determine when to write to the sequential elements (to the PC, to the Register File and to the Data Memory)
• The clock cycle time is determined by the logic delay through the longest path
Adding the Control
• Selecting the operations to perform (ALU, Register File and Memory read/write)
• Controlling the flow of data (multiplexor inputs)
• Information comes from the 32 bits of the instruction
• Observations
  – op field always in bits 31-26
  – addr of two registers to be read are always specified by the rs and rt fields (bits 25-21 and 20-16)
  – base register for lw and sw always in rs (bits 25-21)
  – addr. of register to be written is in one of two places – in rt (bits 20-16) for lw; in rd (bits 15-11) for R-type instructions
  – offset for beq, lw, and sw always in bits 15-0
R-type (add, sub, and, or, slt): op [31-26] | rs [25-21] | rt [20-16] | rd [15-11] | shamt [10-6] | funct [5-0]
I-Type (lw, sw, beq): op [31-26] | rs [25-21] | rt [20-16] | address offset [15-0]
(Almost) Complete Single Cycle Datapath
[Figure: single cycle datapath: PC, Add (+4), Instruction Memory (Instr[31-0]), Register File (register addresses from Instr[25-21], Instr[20-16], Instr[15-11]), Sign Extend (Instr[15-0], 16 -> 32), ALU control (Instr[5-0], ALUOp), ALU (ovf, zero), Shift left 2 and Add for the branch target, Data Memory; mux selects RegDst, ALUSrc, MemtoReg, PCSrc; control signals RegWrite, MemWrite, MemRead]
ALU Control
• ALU's operation based on instruction type and function code
  – Load/Store: F = add
  – Branch: F = subtract
  – R-type: F depends on funct field
• Notice that we are using different encodings than in the book

ALU control input   Function
0000                and
0001                or
0010                xor
0011                nor
0110                add
1110                subtract
1111                set on less than
ALU Control, Con’t
• Controlling the ALU uses multiple decoding levels
  – main control unit generates the ALUOp bits
  – ALU control unit generates ALUcontrol bits

Instr op   funct    ALUOp   action     ALUcontrol
lw         xxxxxx   00      add        0110
sw         xxxxxx   00      add        0110
beq        xxxxxx   01      subtract   1110
add        100000   10      add        0110
sub        100010   10      subtract   1110
and        100100   10      and        0000
or         100101   10      or         0001
xor        100110   10      xor        0010
nor        100111   10      nor        0011
slt        101010   10      slt        1111
ALU Control Truth Table
• Four, 6-input truth tables

F5 F4 F3 F2 F1 F0 | ALUOp1 ALUOp0 | ALUcontrol3 ALUcontrol2 ALUcontrol1 ALUcontrol0
X  X  X  X  X  X  | 0      0      | 0           1           1           0
X  X  X  X  X  X  | 0      1      | 1           1           1           0
X  X  0  0  0  0  | 1      0      | 0           1           1           0
X  X  0  0  1  0  | 1      0      | 1           1           1           0
X  X  0  1  0  0  | 1      0      | 0           0           0           0
X  X  0  1  0  1  | 1      0      | 0           0           0           1
X  X  0  1  1  0  | 1      0      | 0           0           1           0
X  X  0  1  1  1  | 1      0      | 0           0           1           1
X  X  1  0  1  0  | 1      0      | 1           1           1           1
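The truth table can be written directly as a lookup, using this handout's ALUcontrol encodings (which differ from the textbook's). The names `FUNCT_TO_ALU_CONTROL` and `alu_control` are hypothetical; only the low four funct bits are examined, matching the don't-care F5 and F4 columns.

```python
# Low four funct bits (F3..F0) -> ALUcontrol, used when ALUOp = 10
FUNCT_TO_ALU_CONTROL = {
    0b0000: 0b0110,  # add
    0b0010: 0b1110,  # subtract
    0b0100: 0b0000,  # and
    0b0101: 0b0001,  # or
    0b0110: 0b0010,  # xor
    0b0111: 0b0011,  # nor
    0b1010: 0b1111,  # set on less than
}


def alu_control(alu_op, funct):
    """Second decoding level: map (ALUOp, funct) to the 4-bit ALUcontrol."""
    if alu_op == 0b00:          # lw / sw: always add
        return 0b0110
    if alu_op == 0b01:          # beq: always subtract
        return 0b1110
    if alu_op == 0b10:          # R-type: depends on funct, F5 and F4 ignored
        return FUNCT_TO_ALU_CONTROL[funct & 0xF]
    raise ValueError("ALUOp value not listed in the truth table")
```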
ALU Control Logic
• From the truth table we can design the ALU Control logic
[Figure: ALU control logic: inputs Instr[3], Instr[2], Instr[1], Instr[0], ALUOp1, ALUOp0; outputs ALUcontrol3, ALUcontrol2, ALUcontrol1, ALUcontrol0]
(Almost) Complete Datapath with Control Unit
[Figure: the single cycle datapath with the main Control Unit added: Instr[31-26] -> Control Unit, with control signals RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch and ALUOp; Instr[5-0] and ALUOp -> ALU control; PCSrc selects the next PC]
Main Control Unit
• Completely determined by the instruction opcode field
  – Note that a multiplexor whose control input is 0 has a definite action, even if it is not used in performing the operation

Instr (opcode)    RegDst  ALUSrc  MemReg  RegWr  MemRd  MemWr  Branch  ALUOp
R-type (000000)   1       0       0       1      0      0      0       10
lw (100011)       0       1       1       1      1      0      0       00
sw (101011)       x       1       x       0      0      1      0       00
beq (000100)      x       0       x       0      0      0      1       01
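The main control table can be modeled as a lookup keyed by the opcode. In this sketch the `Control` tuple, `MAIN_CONTROL` and `main_control` are hypothetical names, and the don't-care entries (x) are represented as `None`, which is a modeling choice rather than something the slides specify.

```python
from typing import NamedTuple, Optional


class Control(NamedTuple):
    reg_dst: Optional[int]      # None stands for a don't-care (x)
    alu_src: int
    mem_to_reg: Optional[int]
    reg_write: int
    mem_read: int
    mem_write: int
    branch: int
    alu_op: int


MAIN_CONTROL = {
    0b000000: Control(1, 0, 0, 1, 0, 0, 0, 0b10),        # R-type
    0b100011: Control(0, 1, 1, 1, 1, 0, 0, 0b00),        # lw
    0b101011: Control(None, 1, None, 0, 0, 1, 0, 0b00),  # sw
    0b000100: Control(None, 0, None, 0, 0, 0, 1, 0b01),  # beq
}


def main_control(instr):
    """Look up the control signals from the opcode, Instr[31-26]."""
    return MAIN_CONTROL[(instr >> 26) & 0x3F]
```

For example, `main_control(0x8D280004)` (a lw word) returns a tuple with `reg_write`, `mem_read` and `alu_src` all set to 1.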
R-type Instruction – Data/Control Flow
[Figure: the datapath with Control Unit, showing the data and control flow for an R-type instruction]
sw Instruction – Data/Control Flow
[Figure: the datapath with Control Unit, showing the data and control flow for a sw instruction]
lw Instruction – Data/Control Flow
[Figure: the datapath with Control Unit, showing the data and control flow for a lw instruction]
Branch Instruction – Data/Control Flow
[Figure: the datapath with Control Unit, showing the data and control flow for a branch instruction]
Control Unit Logic
• From the truth table we can design the Main Control logic
[Figure: main control logic: inputs Instr[31] to Instr[26], decoded into R-type, lw, sw and beq lines; outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0]
Review: Handling Jump Operations
• Jump operations have to
  – replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits
J-Type: op [31-26] | jump target address [25-0]
[Figure: jump datapath: PC, Add (+4), Instruction Memory; the low 26 bits of the instruction pass through Shift left 2 (28 bits) and are combined with 4 bits of the PC to form the jump address]
Adding the Jump Operation
[Figure: the datapath with Control Unit extended for jumps: Instr[25-0] (26 bits) -> Shift left 2 (28 bits), joined with PC+4[31-28] to form the 32-bit jump address; an additional mux controlled by the Jump signal]
Main Control Unit
Instr (opcode)    RegDst  ALUSrc  MemReg  RegWr  MemRd  MemWr  Branch  ALUOp  Jump
R-type (000000)   1       0       0       1      0      0      0       10
lw (100011)       0       1       1       1      1      0      0       00
sw (101011)       x       1       x       0      0      1      0       00
beq (000100)      x       0       x       0      0      0      1       01
j (000010)
• Setting of the MemRd signal (for R-type, sw, beq) depends on the memory design
Single Cycle Implementation – Cycle Time
• Unfortunately, though simple, the single cycle approach is not used because it is very slow
• Clock cycle must have the same length for every instruction
  – It is determined by the longest possible path in the processor
• What is the longest (slowest) path (slowest instruction)?
Performance Issues
• Longest delay determines clock period
  – Critical path: load instruction
  – Instruction memory -> register file -> ALU -> data memory -> register file
• Not feasible to vary period for different instructions
• Violates design principle
  – Making the common case fast
• We will improve performance by pipelining
Instruction Critical Paths
• Calculate cycle time assuming negligible delays (for muxes, control unit, sign extend, PC access, shift left 2, wires, setup and hold times) except:
  – Instruction and Data Memory (4 ns)
  – ALU and adders (2 ns)
  – Register File access (reads or writes) (1 ns)

Instr.   I Mem   Reg Rd   ALU Op   D Mem   Reg Wr   Total
R-type
load
store
beq
jump
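One way to work through the table is sketched below. The delays are the ones stated on the slide; which units lie on each instruction's path is an assumption of this sketch (the load path follows the critical path given under Performance Issues, and the PC + 4 and branch adders are assumed to operate in parallel with slower units, so they do not add to the totals). The names `DELAY_NS` and `PATHS` are hypothetical.

```python
# Unit delays from the slide, in nanoseconds
DELAY_NS = {"imem": 4, "reg_read": 1, "alu": 2, "dmem": 4, "reg_write": 1}

# Assumed sequence of units on each instruction's longest path
PATHS = {
    "R-type": ["imem", "reg_read", "alu", "reg_write"],
    "load":   ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "store":  ["imem", "reg_read", "alu", "dmem"],
    "beq":    ["imem", "reg_read", "alu"],
    "jump":   ["imem"],
}

totals = {name: sum(DELAY_NS[unit] for unit in units)
          for name, units in PATHS.items()}
cycle_time = max(totals.values())   # the clock must fit the slowest instruction
slowest = max(totals, key=totals.get)
```

Under these assumptions the load instruction is the slowest, so every instruction, including a jump, is charged the load's full cycle time.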
Single Cycle – Disadvantages & Advantages
• Uses the clock cycle inefficiently – the clock cycle must be timed to accommodate the slowest instruction
  – especially problematic for more complex instructions like floating point multiply
• May be wasteful of area since some functional units (e.g., adders) must be duplicated since they cannot be shared during a clock cycle
but
• It is simple and easy to understand
[Figure: clock timing diagram: Clk with Cycle 1 (lw) and Cycle 2 (sw, followed by wasted time)]