Designing a Single Cycle Datapath or The Do-It-Yourself CPU Kit. Reading 4.4 – HW due Monday. Peer Instruction Lecture Materials for Computer Architecture by Dr. Leo Porter is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Designing a Single Cycle Datapath
or
The Do-It-Yourself CPU Kit
Reading 4.4 – HW due Monday
Peer Instruction Lecture Materials for Computer Architecture by Dr. Leo Porter is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
• Today’s Topic: Datapath Design, then Control Design
[Diagram: the five components of a computer: Processor (Control and Datapath), Memory, Input, Output]
The Big Picture: The Performance Perspective
• Processor design (datapath and control) will determine:
– Clock cycle time
– Clock cycles per instruction
• Starting today:
– Single cycle processor:
Advantage: One clock cycle per instruction
Disadvantage: long cycle time
• ET = Insts * CPI * Cycle Time
(Single cycle: execute an entire instruction in each clock cycle)
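To make the ET formula concrete, here is a minimal sketch in Python. The function name and all of the numbers are made up for illustration; they do not describe any real machine.

```python
def execution_time(instruction_count, cpi, cycle_time_ns):
    """ET = Insts * CPI * Cycle Time, returned in nanoseconds."""
    return instruction_count * cpi * cycle_time_ns

# Hypothetical single-cycle machine: CPI is 1, but the cycle is long
# because every cycle has to fit an entire instruction.
single_cycle_et = execution_time(1_000_000, cpi=1, cycle_time_ns=8)

# Hypothetical alternative design with a shorter cycle but a higher CPI.
other_design_et = execution_time(1_000_000, cpi=3, cycle_time_ns=2)

print(single_cycle_et, other_design_et)
```

The point of the sketch is only that both CPI and cycle time feed into ET, so a CPI of 1 does not by itself guarantee the fastest machine.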
• We're ready to look at an implementation of the MIPS simplified to contain only:
– memory-reference instructions: lw, sw
– arithmetic-logical instructions: add, sub, and, or, slt
– control flow instructions: beq
• Generic Implementation:
– use the program counter (PC) to supply instruction address
– get the instruction from memory
– read registers
– use the instruction to decide exactly what to do
The Processor: Datapath & Control
Let’s look at some regularity in our instructions
Review: Two Types of Logic Components
State Element: inputs A, B, clk; output C = f(A, B, state)
Combinational Logic: inputs A, B; output C = f(A, B)
Clocking Methodology
• All storage elements are clocked by the same clock edge
[Timing diagram: Clk waveform with Setup and Hold windows around each clock edge; signal values are Don’t Care outside those windows]
Consequently, our cycle time will be the sum of:
(a) the Clock-to-Q time of the input registers,
(b) the longest delay path through the combinational logic block,
(c) the setup time of the output register, and
(d) the clock skew.
In order to avoid a hold time violation, you have to make sure this inequality is fulfilled.
---- DRAW CT
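The cycle time sum can be sketched in Python as follows. The hold-time check is an assumption: the transcript does not show the inequality, so this uses the common textbook form (Clock-to-Q + shortest path delay - skew > hold time). Function names and delay values are hypothetical.

```python
def min_cycle_time(clk_to_q, longest_logic_delay, setup, skew):
    """Sum of the four terms (a)-(d) listed in the notes."""
    return clk_to_q + longest_logic_delay + setup + skew

def hold_time_ok(clk_to_q, shortest_logic_delay, hold, skew):
    """Assumed textbook hold check; not taken from the transcript."""
    return clk_to_q + shortest_logic_delay - skew > hold

# Hypothetical delays in nanoseconds.
cycle = min_cycle_time(clk_to_q=1, longest_logic_delay=10, setup=1, skew=1)
safe = hold_time_ok(clk_to_q=1, shortest_logic_delay=2, hold=1, skew=1)
print(cycle, safe)
```

Note that the longest path sets the cycle time while the shortest path is what matters for hold time.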
Which is correct about the ALU and memory in MIPS?
A. The ALU always performs an operation before accessing data memory
B. The ALU sometimes performs an operation before accessing data memory
C. Data memory is always accessed before performing an ALU operation
D. Data memory is sometimes accessed before performing an ALU operation
E. None of the above.
Isomorphic
Which is correct about the ALU and the register file in MIPS?
A. The ALU always performs an operation before accessing the register file
B. The ALU sometimes performs an operation before accessing the register file
C. The register file is always accessed before performing an ALU operation
D. The register file is sometimes accessed before performing an ALU operation
E. None of the above.
Isomorphic
So what does this tell us?
Draw the register file before ALU before memory
Register Transfer Language (RTL)
• is a mechanism for describing the movement and manipulation of data between storage elements:
R[3] <- R[5] + R[7]
PC <- PC + 4 + R[5]
R[rd] <- R[rs] + R[rt]
R[rt] <- Mem[R[rs] + immed]
We’ll be using this from time to time. It’s just a shorthand for what is going on in hardware; we’ll use it in a second.
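One way to get a feel for RTL is to read each transfer as an assignment on a small software model. The sketch below assumes a Python list for the registers and a dict for memory; the initial values and the field numbers (rs, rt, rd, immed) are made up for illustration.

```python
R = [0] * 32      # register file: R[0] .. R[31]
Mem = {}          # sparse memory, indexed by address
PC = 0

# Hypothetical starting state.
R[5], R[7] = 12, 30
Mem[20] = 99

R[3] = R[5] + R[7]            # R[3] <- R[5] + R[7]
PC = PC + 4 + R[5]            # PC <- PC + 4 + R[5]

rs, rt, rd, immed = 5, 7, 9, 8
R[rd] = R[rs] + R[rt]         # R[rd] <- R[rs] + R[rt]
R[rt] = Mem[R[rs] + immed]    # R[rt] <- Mem[R[rs] + immed]

print(R[3], PC, R[9], R[7])
```

In hardware each transfer happens on a clock edge; the sequential assignments here are only a convenient way to trace the values.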
Review: The MIPS Instruction Formats
• All MIPS instructions are 32 bits long. The three instruction formats:
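R-type: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | rd [15:11], 5 bits | shamt [10:6], 5 bits | funct [5:0], 6 bits
I-type: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | immediate [15:0], 16 bits
J-type: op [31:26], 6 bits | target address [25:0], 26 bits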
Before we start designing our processor – we need to know how the instructions look alike.
MIPS is simple – only 3 formats and they have some common features. Let’s look more closely at the few instructions we are focusing on today.
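Because the fields sit at the same bit positions in every format, one decoder can pull them all out and let the opcode decide which ones matter. A minimal Python sketch (the function name and dict keys are illustrative choices, not part of the slides):

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into its possible fields."""
    return {
        "op":     (instr >> 26) & 0x3F,   # bits 31..26
        "rs":     (instr >> 21) & 0x1F,   # bits 25..21
        "rt":     (instr >> 16) & 0x1F,   # bits 20..16
        "rd":     (instr >> 11) & 0x1F,   # bits 15..11 (R-type)
        "shamt":  (instr >> 6) & 0x1F,    # bits 10..6  (R-type)
        "funct":  instr & 0x3F,           # bits 5..0   (R-type)
        "imm16":  instr & 0xFFFF,         # bits 15..0  (I-type)
        "target": instr & 0x3FFFFFF,      # bits 25..0  (J-type)
    }

# 0x012A4020 encodes add $8, $9, $10 (rd = 8, rs = 9, rt = 10).
fields = decode(0x012A4020)
print(fields["op"], fields["rs"], fields["rt"], fields["rd"], fields["funct"])
```

Notice that op, rs, and rt are extracted the same way for every instruction in our subset, which is the regularity the datapath takes advantage of.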
The MIPS Subset
• R-type
– add rd, rs, rt
– sub, and, or, slt
Format: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
R[rd] = R[rs] OP R[rt]
PC = PC + 4
• LOAD and STORE
– lw rt, rs, imm16
– sw rt, rs, imm16
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
R[rt] = Mem[R[rs] + SE(imm)] OR
Mem[R[rs] + SE(imm)] = R[rt]
PC = PC + 4
• BRANCH
– beq rs, rt, imm16
Format: op [31:26] | rs [25:21] | rt [20:16] | displacement [15:0]
ZERO = (R[rs] - R[rt] == 0)
PC = if (ZERO) PC + 4 + (SE(Imm) << 2) else PC + 4
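The RTL for the subset can be traced with a small behavioral model. This is a sketch under stated assumptions: the state layout, the `step` function, and its keyword arguments are hypothetical, instructions are passed as already-decoded fields, and 32-bit wraparound is not modeled.

```python
def sign_extend16(imm16):
    """SE(imm): interpret a 16-bit field as a signed value."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def step(state, name, rs=0, rt=0, rd=0, imm16=0):
    """Apply one instruction's RTL to state (keys: PC, R, Mem)."""
    R, Mem = state["R"], state["Mem"]
    next_pc = state["PC"] + 4
    if name in ("add", "sub", "and", "or", "slt"):
        a, b = R[rs], R[rt]
        R[rd] = {"add": a + b, "sub": a - b, "and": a & b,
                 "or": a | b, "slt": int(a < b)}[name]
    elif name == "lw":
        R[rt] = Mem.get(R[rs] + sign_extend16(imm16), 0)
    elif name == "sw":
        Mem[R[rs] + sign_extend16(imm16)] = R[rt]
    elif name == "beq":
        zero = (R[rs] - R[rt]) == 0
        if zero:
            next_pc = state["PC"] + 4 + (sign_extend16(imm16) << 2)
    else:
        raise ValueError(f"not in the subset: {name}")
    state["PC"] = next_pc

state = {"PC": 0, "R": [0] * 32, "Mem": {}}
state["R"][1], state["R"][2] = 5, 7
step(state, "add", rs=1, rt=2, rd=3)         # R[3] = 12
step(state, "sw", rs=0, rt=3, imm16=16)      # Mem[16] = 12
step(state, "lw", rs=0, rt=4, imm16=16)      # R[4] = 12
step(state, "beq", rs=3, rt=4, imm16=0xFFFF) # taken, offset -1 word
print(state["PC"], state["R"][3], state["R"][4])
```

Every branch of `step` uses the register values first and memory (if at all) afterwards, which hints at how the hardware units should be ordered.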
BEFORE GOING ON… quick reminder…
Storage Element: Register
• Register
– Similar to the D Flip Flop except
N-bit input and output
Write Enable input
– Write Enable:
0: Data Out will not change
1: Data Out will become Data In (on the clock edge)
[Diagram: register with N-bit Data In, N-bit Data Out, Write Enable, and Clk]
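The Write Enable behavior can be modeled in a few lines. This is a behavioral sketch only; the class and method names are hypothetical, and calling `clock_edge` stands in for the arrival of a clock edge.

```python
class Register:
    """N-bit register with a Write Enable input."""

    def __init__(self, n_bits):
        self.mask = (1 << n_bits) - 1
        self.data_out = 0

    def clock_edge(self, data_in, write_enable):
        # Write Enable = 1: Data Out becomes Data In on the clock edge.
        # Write Enable = 0: Data Out does not change.
        if write_enable:
            self.data_out = data_in & self.mask
        return self.data_out

reg = Register(8)
reg.clock_edge(0x1AB, write_enable=1)   # only the low 8 bits are kept
reg.clock_edge(0x55, write_enable=0)    # ignored, Write Enable is 0
print(hex(reg.data_out))
```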
Which of these describes our register file?
A. Two 32-bit outputs, 3 5-bit inputs, clk input, 1-bit control input
B. Two 32-bit outputs, 3 32-bit inputs, clk input, 1-bit control input
C. Two 32-bit outputs, 2 5-bit inputs, 1 32-bit input, clk input, 1-bit control input
D. Two 32-bit outputs, 2 32-bit inputs, 1 32-bit input, clk input, 1-bit control input
E. None of the above
Register File
[Diagram: register file of 32 32-bit registers. Inputs: RR1, RR2, and WR (5 bits each), Write Data (32 bits), RegWrite, Clk. Outputs: Read Data 1 and Read Data 2 (32 bits each).]
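A behavioral sketch of the register file ports, assuming reads are combinational and writes happen on the clock edge when RegWrite is 1. The class and method names are hypothetical, and the hardwired-zero behavior of register 0 is deliberately left out.

```python
class RegisterFile:
    """32 registers of 32 bits: two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rr1, rr2):
        # 5-bit register numbers select Read Data 1 and Read Data 2.
        return self.regs[rr1 & 0x1F], self.regs[rr2 & 0x1F]

    def clock_edge(self, wr, write_data, reg_write):
        # The write only takes effect when RegWrite is asserted.
        if reg_write:
            self.regs[wr & 0x1F] = write_data & 0xFFFFFFFF

rf = RegisterFile()
rf.clock_edge(wr=8, write_data=123, reg_write=1)
rf.clock_edge(wr=9, write_data=456, reg_write=0)   # RegWrite = 0, no change
read_data_1, read_data_2 = rf.read(8, 9)
print(read_data_1, read_data_2)
```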
Which of these describes our memory (for now)?
A. One 32-bit output, 1 5-bit input, 1 32-bit input, clk input, 1-bit control input, 1 bit control input
B. One 32-bit output, 2 5-bit inputs, clk input, 1-bit control input, 1 bit control input
C. One 32-bit output, 2 32-bit inputs, clk input, 2 1-bit control inputs
D. One 32-bit output, 1 32-bit input, clk input, 2 1-bit control inputs
E. None of the above
Memory
[Diagram: memory with Address and Write Data inputs (32 bits each), a 32-bit Read Data output, MemRead and MemWrite control inputs, and Clk]
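The memory interface can be sketched the same way. Assumptions: the memory is idealized and word-granular, reads are combinational when MemRead is 1, and writes happen on the clock edge when MemWrite is 1. The class and method names are hypothetical.

```python
class DataMemory:
    """Idealized memory: Address, Write Data, MemRead, MemWrite."""

    def __init__(self):
        self.words = {}   # sparse storage, address -> 32-bit word

    def read(self, address, mem_read):
        # Read Data is only meaningful when MemRead is asserted.
        if not mem_read:
            return None
        return self.words.get(address, 0)

    def clock_edge(self, address, write_data, mem_write):
        # The write only takes effect when MemWrite is asserted.
        if mem_write:
            self.words[address] = write_data & 0xFFFFFFFF

mem = DataMemory()
mem.clock_edge(address=0x10, write_data=7, mem_write=1)
mem.clock_edge(address=0x10, write_data=9, mem_write=0)   # ignored
print(mem.read(0x10, mem_read=1))
```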
Can we lay out a high-level design to do this?
Draw as much as you can implementing one instruction at a time – get the students involved
You’ll want to do something like this for your lab
Putting it All Together: A Single Cycle Datapath
• We have everything except control signals (later)
Ignoring control - which instruction does this active datapath represent?
A. R-type
B. lw
C. sw
D. beq
E. None of the above
Active Single-Cycle Datapath
Ignoring control - which instruction does this active datapath represent?
A. R-type
B. lw
C. sw
D. beq
E. None of the above
Active Single-Cycle Datapath
Ignoring control - which instruction does this active datapath represent?
A. R-type
B. lw
C. sw
D. beq
E. None of the above
Active Single-Cycle Datapath
Ignoring control - which instruction does this active datapath represent?
A. R-type
B. lw
C. sw
D. beq
E. None of the above
Active Single-Cycle Datapath
Key Points
• CPU is just a collection of state and combinational logic
• We just designed a very rich processor, at least in terms of functionality
• ET = IC * CPI * Cycle Time
– where does the single-cycle machine fit in?
Control Logic for the Single-Cycle CPU
or
Who’s in charge here?
Putting it All Together: A Single Cycle Datapath
• We have everything except control signals
We’re going to connect up all these signals to a central place, and control them from there, based on opcode/funct.
Okay, then, what about those Control Signals?
Point out we’ve just hooked these up.
Peer instruction question asking if decode can happen in parallel with register read.
Selection
Select the true statement for MIPS
A Registers can be read in parallel with control signal generation
B Instruction Read can be done in parallel with control signal generation
C Registers can be written in parallel with control signal generation
D The main ALU can execute in parallel with control signal generation
E None of the above
Okay, then, what about those Control Signals?
Start here
Notice the control bits come from the opcode and sometimes the function code bits. R-type instructions all get the same control bits except for the ALU control.
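As a sketch of what "control from the opcode" can look like, here is a lookup table in Python. The signal names RegDst, ALUSrc, MemtoReg, Branch, and ALUOp and their values are assumptions taken from the usual textbook single-cycle design, not from this transcript (only RegWrite, MemRead, and MemWrite appear in the diagrams above). None marks a don't-care.

```python
# MIPS opcode values: R-type = 0, lw = 35, sw = 43, beq = 4.
CONTROL = {
    0:  dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
             MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    35: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
             MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    43: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    4:  dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}

def main_control(opcode):
    """Return the (assumed) control signal settings for an opcode."""
    return CONTROL[opcode]

print(main_control(35)["MemRead"], main_control(43)["MemWrite"])
```

Every R-type instruction hits the same table row (opcode 0); only the ALU operation, chosen from the funct field, differs between add, sub, and, or, and slt.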
ALU control bits
• Recall: 5-function ALU
ALU control input    Function    Operations
000                  And         and
001                  Or          or
010                  Add         add, lw, sw
110                  Subtract    sub, beq
111                  Slt         slt
Take your time here, this isn’t obvious. These are the 3 bit input signals which cause the processor to do what you want.
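The ALU control table can be read as a function from the 3-bit input to an operation. Below is a behavioral sketch in Python, assuming 32-bit operands and a signed comparison for slt; the function names are hypothetical, and the Zero output is the one beq uses.

```python
MASK = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement value."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x

def alu(control, a, b):
    """Return (result, zero) for one of the five ALU control codes."""
    a &= MASK
    b &= MASK
    if control == 0b000:      # And
        result = a & b
    elif control == 0b001:    # Or
        result = a | b
    elif control == 0b010:    # Add (add, lw, sw)
        result = (a + b) & MASK
    elif control == 0b110:    # Subtract (sub, beq)
        result = (a - b) & MASK
    elif control == 0b111:    # Slt
        result = int(to_signed(a) < to_signed(b))
    else:
        raise ValueError("unused ALU control code")
    return result, int(result == 0)

print(alu(0b010, 2, 3), alu(0b110, 7, 7))
```

For beq, the control code is Subtract and only the zero output matters: equal registers give a result of 0, so zero is 1.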
Full ALU
sign bit (adder output from bit 31)
What signals accomplish each operation?
Columns: Binvert, CIn, Oper
Rows: and? or? add? sub? beq? slt?