
CA 4 1 Handout

Apr 05, 2018

BK TP.HCM (Ho Chi Minh City University of Technology)

COMPUTER ARCHITECTURE

CS2011

Faculty of Computer Science and Engineering

Department of Computer Engineering

Đinh Đức Anh Vũ, http://www.cse.hcmut.edu.vn/~anhvu

©2011, Dr. Dinh Duc Anh Vu

Review: Design Principles

• Simplicity favors regularity
 – fixed-size instructions
 – small number of instruction formats
 – opcode always the first 6 bits
• Smaller is faster
 – limited instruction set
 – limited number of registers in the register file
 – limited number of addressing modes
• Make the common case fast
 – arithmetic operands come from the register file (load-store machine)
 – allow instructions to contain immediate operands
• Good design demands good compromises
 – same instruction length for every instruction
 – ideally a single instruction format; the compromise is 3 instruction formats

The Processor: Datapath & Control

• We're ready to look at an implementation of the MIPS
• Simplified to contain only:
 – memory-reference instructions: lw, sw
 – arithmetic-logical instructions: add, addu, sub, subu, and, or, xor, nor, slt, sltu
 – arithmetic-logical immediate instructions: addi, addiu, andi, ori, xori, slti, sltiu
 – control flow instructions: beq, j
• Generic implementation:
 – use the PC to supply the instruction address and fetch the instruction from memory (and update the PC)
 – decode the instruction (and read registers)
 – execute the instruction
• All instructions (except j) use the ALU after reading the registers
 – How? memory-reference? arithmetic? control flow?

[Figure: instruction cycle, Fetch (PC = PC + 4) -> Decode -> Execute]

Abstract Implementation View

• Two types of functional units:
 – elements that operate on data values (combinational)
 – elements that contain state (sequential)
• Single cycle operation
• Split memory (Harvard) model: one memory for instructions and one for data

[Figure: abstract datapath. PC -> Instruction Memory (Address, Instruction) -> Register File (three Reg Addr inputs, Write Data, two Read Data outputs) -> ALU -> Data Memory (Address, Write Data, Read Data)]

Aside: Clocking Methodologies

• The clocking methodology defines when data in a state element is valid and stable relative to the clock
 – State element: a memory element such as a register
 – Edge-triggered: all state changes occur on a clock edge
• Typical execution
 – read contents of state elements -> send values through combinational logic -> write results to one or more state elements
• Assumes state elements are written on every clock cycle; if not, an explicit write control signal is needed
 – the write occurs only when both the write control is asserted and the clock edge occurs

[Figure: State element 1 -> Combinational logic -> State element 2, all within one clock cycle]
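
The write rule above (a state element changes only when the write control is asserted and a clock edge arrives) can be illustrated with a small behavioral model. This is only a sketch: the `Register` class and its method names are hypothetical and not part of the slides.

```python
class Register:
    """Behavioral model of an edge-triggered state element (hypothetical)."""

    def __init__(self, value=0):
        self.value = value    # value currently visible at the output
        self._next = value    # value presented at the data input
        self._write = False   # explicit write control signal

    def drive(self, data, write_enable):
        """Combinational phase: present an input; the stored value is unchanged."""
        self._next = data
        self._write = write_enable

    def clock_edge(self):
        """Clock edge: capture the input only if the write control is asserted."""
        if self._write:
            self.value = self._next
        self._write = False


reg = Register(5)

reg.drive(9, write_enable=False)
reg.clock_edge()
unchanged = reg.value      # write control not asserted, so the edge does nothing

reg.drive(9, write_enable=True)
before_edge = reg.value    # still the old value until the clock edge
reg.clock_edge()
after_edge = reg.value     # new value captured on the edge
```

A state element that is written every cycle, such as the PC in this design, corresponds to always passing `write_enable=True`.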

Building a Datapath

• Datapath
 – elements that process data and addresses in the CPU
   • registers, ALUs, muxes, memories, ...
• We will build a MIPS datapath incrementally
 – refining the overview design

[Figure: PC -> Instruction Memory (Read Address, Instruction), with an adder (Add) attached to the PC]

Fetching Instructions

• Fetching instructions involves
 – reading the instruction from the Instruction Memory
 – updating the PC value to be the address of the next (sequential) instruction
 – the PC is updated every clock cycle, so it does not need an explicit write control signal, just a clock signal
 – reading from the Instruction Memory is a combinational activity, so it doesn't need an explicit read control signal

[Figure: PC -> Read Address of Instruction Memory -> Instruction; an adder computes PC + 4 and feeds it back to the clocked PC. Cycle: Fetch (PC = PC + 4) -> Decode -> Execute]
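
As a rough illustration of the fetch step, the sketch below models the Instruction Memory as a Python list of 32-bit words and returns both the fetched word and PC + 4. The function name `fetch` and the two sample instruction words are assumptions made for the example, not taken from the slides.

```python
def fetch(instruction_memory, pc):
    """Return (instruction word, next sequential PC) for byte address pc."""
    if pc % 4 != 0:
        raise ValueError("PC must be word aligned")
    instruction = instruction_memory[pc // 4]  # combinational read, no control signal
    next_pc = (pc + 4) & 0xFFFFFFFF            # adder output, kept to 32 bits
    return instruction, next_pc


# Two sample words at byte addresses 0 and 4 (assumed encodings of
# add $t0,$t1,$t2 and lw $t3,4($t0)).
imem = [0x012A4020, 0x8D0B0004]

instr0, pc1 = fetch(imem, 0)
instr1, pc2 = fetch(imem, pc1)
```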

Decoding Instructions

• Decoding instructions involves
 – sending the fetched instruction's opcode and function field bits to the control unit
   and
 – reading two values from the Register File
   • Register File addresses are contained in the instruction

[Figure: Instruction -> Control unit; Instruction -> Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data; outputs Read Data 1, Read Data 2). Cycle: Fetch (PC = PC + 4) -> Decode -> Execute]

Executing R Format Operations

• R format operations (add, sub, slt, and, or)
 – perform the operation (op and funct) on the values in rs and rt
 – store the result back into the Register File (into location rd)
 – note that the Register File is not written every cycle (e.g., sw), so we need an explicit write control signal for the Register File

R-type:  op       rs       rt       rd       shamt    funct
         [31-26]  [25-21]  [20-16]  [15-11]  [10-6]   [5-0]

[Figure: Instruction -> Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data, RegWrite; outputs Read Data 1, Read Data 2) -> ALU (ALU control; outputs overflow, zero), with the ALU result written back to the Register File. Cycle: Fetch (PC = PC + 4) -> Decode -> Execute]
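
The R-type field layout can be made concrete with a few shifts and masks. The sketch below is illustrative only; the helper name `decode_r_type` and the sample word (an assumed encoding of add $t0,$t1,$t2) are not from the slides.

```python
def decode_r_type(instr):
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "op":    (instr >> 26) & 0x3F,  # bits 31-26
        "rs":    (instr >> 21) & 0x1F,  # bits 25-21
        "rt":    (instr >> 16) & 0x1F,  # bits 20-16
        "rd":    (instr >> 11) & 0x1F,  # bits 15-11
        "shamt": (instr >> 6) & 0x1F,   # bits 10-6
        "funct": instr & 0x3F,          # bits 5-0
    }


fields = decode_r_type(0x012A4020)
```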

Executing Load and Store Operations

• Load and store operations have to
 – compute a memory address by adding the base register (in rs) to the 16-bit signed offset field in the instruction
   • the base register was read from the Register File during decode
   • the offset value in the low-order 16 bits of the instruction must be sign extended to create a 32-bit signed value
 – store value, read from the Register File during decode, must be written to the Data Memory
 – load value, read from the Data Memory, must be stored in the Register File

I-Type:  op       rs       rt       address offset
         [31-26]  [25-21]  [20-16]  [15-0]
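
The address calculation for lw and sw can be sketched as follows. The helper names `sign_extend16` and `effective_address`, and the sample instruction words, are assumptions for illustration; the hardware does the same thing with a sign-extend unit feeding the ALU.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, instr):
    """Base register value plus sign-extended offset, kept to 32 bits."""
    offset = sign_extend16(instr & 0xFFFF)  # bits 15-0 of the instruction
    return (base + offset) & 0xFFFFFFFF


addr_pos = effective_address(0x1000, 0x8D0B0004)  # offset field 0x0004 (+4)
addr_neg = effective_address(0x1000, 0x8D0BFFFC)  # offset field 0xFFFC (-4)
```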

Executing Load and Store Operations

[Figure: load/store datapath. Instruction -> Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data, RegWrite; outputs Read Data 1, Read Data 2) -> ALU (ALU control; outputs overflow, zero) -> Data Memory (Address, Write Data, Read Data; MemWrite, MemRead); the 16-bit offset passes through Sign Extend before reaching the ALU]


R-Type/Load/Store Datapath

Executing Branch Operations

• Branch operations have to
 – compare the operands read from the Register File during decode (rs and rt values) for equality (zero ALU output)
 – compute the branch target address by adding the updated PC to the sign-extended 16-bit signed offset field in the instruction
   • the "base register" is the updated PC
   • the offset value in the low-order 16 bits of the instruction must be sign extended to create a 32-bit signed value and then shifted left 2 bits to turn the word offset into a byte offset

I-Type:  op       rs       rt       address offset
         [31-26]  [25-21]  [20-16]  [15-0]
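
A minimal sketch of the two branch computations (equality test and target address) is given below. The function names and the sample beq encodings are assumptions for illustration; the equality test is modeled as a subtraction whose result is checked for zero, mirroring the ALU zero output.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, instr):
    """Updated PC (PC + 4) plus the sign-extended offset shifted left by 2."""
    updated_pc = (pc + 4) & 0xFFFFFFFF
    byte_offset = sign_extend16(instr & 0xFFFF) << 2
    return (updated_pc + byte_offset) & 0xFFFFFFFF


def next_pc_for_beq(pc, instr, rs_value, rt_value):
    """Choose the branch target when rs == rt, otherwise fall through to PC + 4."""
    zero = ((rs_value - rt_value) & 0xFFFFFFFF) == 0  # ALU subtract, zero flag
    return branch_target(pc, instr) if zero else (pc + 4) & 0xFFFFFFFF


taken = next_pc_for_beq(0x100, 0x11090003, 7, 7)      # offset field +3 words
not_taken = next_pc_for_beq(0x100, 0x11090003, 7, 8)
backward = branch_target(0x100, 0x1109FFFE)           # offset field -2 words
```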

Executing Branch Operations, con't

[Figure: branch datapath. Instruction -> Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data; outputs Read Data 1, Read Data 2) -> ALU (ALU control), whose zero output goes to the branch control logic; the offset is sign extended from 16 to 32 bits, shifted left 2, and added to PC + 4 to form the branch target address]

Executing Jump Operations

• Jump operations have to
 – replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits

J-Type:  op       jump target address
         [31-26]  [25-0]

[Figure: PC -> Instruction Memory (Read Address, Instruction); Add computes PC + 4; the 26-bit target field is shifted left 2 to give 28 bits, which are combined with 4 bits of the PC to form the jump address]
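
The jump address formation can be sketched in a few lines. The function name `jump_target` and the sample instruction word are assumptions; the upper 4 bits are taken from PC + 4, as in the later "Adding the Jump Operation" figure.

```python
def jump_target(pc, instr):
    """Upper 4 bits of PC + 4 concatenated with the 26-bit field shifted left by 2."""
    updated_pc = (pc + 4) & 0xFFFFFFFF
    low28 = (instr & 0x03FFFFFF) << 2          # bits 25-0, now a 28-bit byte address
    return (updated_pc & 0xF0000000) | low28


# 0x08100008 is an assumed encoding of "j" with target field 0x100008.
near = jump_target(0x00400000, 0x08100008)
far = jump_target(0x90000000, 0x08100008)      # same field, different upper 4 bits
```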

Creating a Single Datapath

• Assemble the datapath elements, add control lines as needed, and design the control path
• Fetch, decode and execute each instruction in one clock cycle: single cycle design
 – no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., this is why we have a separate Instruction Memory and Data Memory)
 – to share datapath elements between two different instruction classes, we need multiplexors at the inputs of the shared elements, with control lines to do the selection
• Cycle time is determined by the length of the longest path

Fetch, R, and Memory Access Portions

[Figure: PC -> Instruction Memory (Read Address, Instruction) with Add computing PC + 4; Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data, RegWrite; outputs Read Data 1, Read Data 2); ALU (ALU control; outputs ovf, zero); Sign Extend from 16 to 32 bits; Data Memory (Address, Write Data, Read Data; MemWrite, MemRead)]

Multiplexor Insertion

[Figure: the fetch, R-type and memory access datapath (Instruction Memory, PC + 4 adder, Register File, ALU, Sign Extend 16 to 32, Data Memory) with multiplexors inserted and two new control signals, ALUSrc and MemtoReg, alongside RegWrite, ALU control, MemRead and MemWrite]

Clock Distribution

[Figure: the same datapath (Instruction Memory, PC + 4 adder, Register File, ALU, Sign Extend 16 to 32, Data Memory; signals RegWrite, ALUSrc, ALU control, MemRead, MemWrite, MemtoReg) with the System Clock distributed to it; one clock cycle is marked]

Adding the Branch Portion

[Figure: datapath with Instruction Memory (Read Address, Instruction), PC + 4 adder, Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data, RegWrite), ALU (ALU control; ovf, zero), Sign Extend 16 to 32, Data Memory (Address, Write Data, Read Data; MemRead, MemWrite), and the ALUSrc and MemtoReg selections]

Adding the Branch Portion

Our Simple Control Structure

• We wait for everything to settle down
 – the ALU might not produce the "right answer" right away
 – Memory and RegFile reads are combinational (as are the ALU, adders, muxes, shifter, and sign extender)
 – use write signals along with the clock edge to determine when to write to the sequential elements (the PC, the Register File and the Data Memory)
• The clock cycle time is determined by the logic delay through the longest path

We are ignoring some details like register setup and hold times

Adding the Control

• Selecting the operations to perform (ALU, Register File and Memory read/write)
• Controlling the flow of data (multiplexor inputs)
• Information comes from the 32 bits of the instruction
• Observations
 – op field always in bits 31-26
 – addresses of the two registers to be read are always specified by the rs and rt fields (bits 25-21 and 20-16)
 – base register for lw and sw always in rs (bits 25-21)
 – address of the register to be written is in one of two places: in rt (bits 20-16) for lw; in rd (bits 15-11) for R-type instructions
 – offset for beq, lw, and sw always in bits 15-0

R-type:  op       rs       rt       rd       shamt    funct
         [31-26]  [25-21]  [20-16]  [15-11]  [10-6]   [5-0]

I-Type:  op       rs       rt       address offset
         [31-26]  [25-21]  [20-16]  [15-0]

Instructions shown: lw, sw, beq, add, sub, and, or, slt
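
The observation that the destination register lives in one of two fields is what motivates a 2-to-1 multiplexor on the Write Addr input. A minimal sketch follows; the function name `write_register` is hypothetical, and the select bit is called `reg_dst` here after the RegDst signal that appears in the datapath figures (1 selecting rd, 0 selecting rt, matching the control table given later).

```python
def write_register(instr, reg_dst):
    """Model the Write Addr multiplexor: rd when reg_dst is 1, rt when it is 0."""
    rt = (instr >> 16) & 0x1F  # bits 20-16, destination for lw
    rd = (instr >> 11) & 0x1F  # bits 15-11, destination for R-type
    return rd if reg_dst else rt


r_type_dest = write_register(0x012A4020, reg_dst=1)  # assumed add $t0,$t1,$t2
load_dest = write_register(0x8D0B0004, reg_dst=0)    # assumed lw $t3,4($t0)
```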

(Almost) Complete Single Cycle Datapath

[Figure: single-cycle datapath. PC -> Instruction Memory (Read Address, Instr[31-0]); Add computes PC + 4; Instr[25-21] and Instr[20-16] feed Read Addr 1 and Read Addr 2 of the Register File; a RegDst mux selects Instr[20-16] (0) or Instr[15-11] (1) as Write Addr; Instr[15-0] is sign extended from 16 to 32 bits; an ALUSrc mux feeds the second ALU input; ALU control takes the 2-bit ALUOp and the 6-bit Instr[5-0] and produces a 4-bit ALU code; the ALU outputs ovf and zero; Data Memory has Address, Write Data, Read Data, MemRead and MemWrite; a MemtoReg mux selects the Register File Write Data; Shift left 2 and a second Add form the branch target; a PCSrc mux selects the next PC; RegWrite enables the Register File write]

ALU Control

• ALU's operation is based on instruction type and function code
 – Load/Store: F = add
 – Branch: F = subtract
 – R-type: F depends on funct field
• Notice that we are using different encodings than in the book

ALU control input   Function
0000                and
0001                or
0010                xor
0011                nor
0110                add
1110                subtract
1111                set on less than

ALU Control, Con't

• Controlling the ALU uses multiple decoding levels
 – main control unit generates the ALUOp bits
 – ALU control unit generates ALUcontrol bits

Instr op   funct    ALUOp   action     ALUcontrol
lw         xxxxxx   00      add        0110
sw         xxxxxx   00      add        0110
beq        xxxxxx   01      subtract   1110
add        100000   10      add        0110
sub        100010   10      subtract   1110
and        100100   10      and        0000
or         100101   10      or         0001
xor        100110   10      xor        0010
nor        100111   10      nor        0011
slt        101010   10      slt        1111

ALU Control Truth Table

• Four 6-input truth tables

F5 F4 F3 F2 F1 F0 | ALUOp1 ALUOp0 | ALUcontrol3 ALUcontrol2 ALUcontrol1 ALUcontrol0
X  X  X  X  X  X  | 0      0      | 0           1           1           0
X  X  X  X  X  X  | 0      1      | 1           1           1           0
X  X  0  0  0  0  | 1      0      | 0           1           1           0
X  X  0  0  1  0  | 1      0      | 1           1           1           0
X  X  0  1  0  0  | 1      0      | 0           0           0           0
X  X  0  1  0  1  | 1      0      | 0           0           0           1
X  X  0  1  1  0  | 1      0      | 0           0           1           0
X  X  0  1  1  1  | 1      0      | 0           0           1           1
X  X  1  0  1  0  | 1      0      | 1           1           1           1
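
The truth table can be transcribed into a small lookup function, which may make the two-level decoding easier to follow. This is a behavioral sketch rather than the gate-level logic; the names `FUNCT_TO_ALU` and `alu_control` are hypothetical, and the 4-bit codes are the ones used in these slides (not the textbook's).

```python
# Low four funct bits (F3..F0) mapped to the 4-bit ALU control code.
FUNCT_TO_ALU = {
    0b0000: 0b0110,  # add
    0b0010: 0b1110,  # subtract
    0b0100: 0b0000,  # and
    0b0101: 0b0001,  # or
    0b0110: 0b0010,  # xor
    0b0111: 0b0011,  # nor
    0b1010: 0b1111,  # set on less than
}


def alu_control(alu_op, funct):
    """Return the ALU control code for a 2-bit ALUOp and 6-bit funct field."""
    if alu_op == 0b00:                     # lw / sw: always add
        return 0b0110
    if alu_op == 0b01:                     # beq: always subtract
        return 0b1110
    if alu_op == 0b10:                     # R-type: decode F3..F0 (F5, F4 are don't-cares)
        return FUNCT_TO_ALU[funct & 0xF]
    raise ValueError("ALUOp value not used in this design")


load_code = alu_control(0b00, 0b111111)    # funct is ignored for loads and stores
beq_code = alu_control(0b01, 0b000000)
slt_code = alu_control(0b10, 0b101010)
```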

ALU Control Logic

• From the truth table we can design the ALU control logic

[Figure: gate-level ALU control logic with inputs Instr[3], Instr[2], Instr[1], Instr[0], ALUOp1, ALUOp0 and outputs ALUcontrol3, ALUcontrol2, ALUcontrol1, ALUcontrol0]

(Almost) Complete Datapath with Control Unit

[Figure: the single-cycle datapath (Instruction Memory, PC + 4 adder, Register File, Sign Extend 16 to 32, ALU with ovf and zero outputs, Data Memory, Shift left 2 and branch adder, and the RegDst, ALUSrc, MemtoReg and PCSrc muxes) with a Control Unit driven by Instr[31-26] that generates RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch and the 2-bit ALUOp; ALU control combines ALUOp with the 6-bit Instr[5-0] into a 4-bit ALU code; Instr[25-21], Instr[20-16], Instr[15-11] and Instr[15-0] feed the Register File and Sign Extend]

Main Control Unit

• Completely determined by the instruction opcode field
 – note that a multiplexor whose control input is 0 has a definite action, even if it is not used in performing the operation

Instr             RegDst  ALUSrc  MemReg  RegWr  MemRd  MemWr  Branch  ALUOp
R-type (000000)   1       0       0       1      0      0      0       10
lw (100011)       0       1       1       1      1      0      0       00
sw (101011)       x       1       x       0      0      1      0       00
beq (000100)      x       0       x       0      0      0      1       01
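
Because the main control is a pure function of the opcode, the table can be modeled as a dictionary lookup. The sketch below is one possible rendering: the `Control` tuple, its field names and `main_control` are hypothetical, and `None` stands for the don't-care entries (x).

```python
from typing import NamedTuple, Optional


class Control(NamedTuple):
    reg_dst: Optional[int]      # None marks a don't-care (x) entry
    alu_src: int
    mem_to_reg: Optional[int]
    reg_write: int
    mem_read: int
    mem_write: int
    branch: int
    alu_op: int                 # 2-bit ALUOp


MAIN_CONTROL = {
    0b000000: Control(1, 0, 0, 1, 0, 0, 0, 0b10),        # R-type
    0b100011: Control(0, 1, 1, 1, 1, 0, 0, 0b00),        # lw
    0b101011: Control(None, 1, None, 0, 0, 1, 0, 0b00),  # sw
    0b000100: Control(None, 0, None, 0, 0, 0, 1, 0b01),  # beq
}


def main_control(instr):
    """Look up the control signals from the opcode in bits 31-26."""
    opcode = (instr >> 26) & 0x3F
    return MAIN_CONTROL[opcode]


lw_signals = main_control(0x8D0B0004)  # assumed encoding of lw $t3,4($t0)
sw_signals = main_control(0xAD0B0004)  # assumed encoding of sw $t3,4($t0)
```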

R-type Instruction – Data/Control Flow

[Figure: the datapath with control unit, showing the data and control flow for an R-type instruction]

sw Instruction – Data/Control Flow

[Figure: the datapath with control unit, showing the data and control flow for a sw instruction]

lw Instruction – Data/Control Flow

[Figure: the datapath with control unit, showing the data and control flow for a lw instruction]

Branch Instruction – Data/Control Flow

[Figure: the datapath with control unit, showing the data and control flow for a branch instruction]

Control Unit Logic

• From the truth table we can design the Main Control logic

[Figure: gate-level main control logic. Inputs Instr[31] to Instr[26] are decoded into the R-type, lw, sw and beq lines, which drive the outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1 and ALUOp0]

Review: Handling Jump Operations

• Jump operations have to
 – replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits

J-Type:  op       jump target address
         [31-26]  [25-0]

[Figure: PC -> Instruction Memory (Read Address, Instruction); Add computes PC + 4; the 26-bit target field is shifted left 2 to give 28 bits, which are combined with 4 bits of the PC to form the jump address]

Adding the Jump Operation

[Figure: the datapath with control unit, extended for jump. Instr[25-0] (26 bits) is shifted left 2 to 28 bits and combined with PC+4[31-28] to form the 32-bit jump address; a mux controlled by the Jump signal from the Control Unit selects between the jump address (1) and the output of the PCSrc mux (0)]

Main Control Unit

Instr             RegDst  ALUSrc  MemReg  RegWr  MemRd  MemWr  Branch  ALUOp  Jump
R-type (000000)   1       0       0       1      0      0      0       10     0
lw (100011)       0       1       1       1      1      0      0       00     0
sw (101011)       x       1       x       0      0      1      0       00     0
beq (000100)      x       0       x       0      0      0      1       01     0
j (000010)        x       x       x       0      0      0      x       xx     1

• Setting of the MemRd signal (for R-type, sw, beq) depends on the memory design

Single Cycle Implementation – Cycle Time

• Unfortunately, though simple, the single cycle approach is not used because it is very slow
• Clock cycle must have the same length for every instruction
 – it is determined by the longest possible path in the processor
• What is the longest (slowest) path (slowest instruction)?

Performance Issues

• Longest delay determines clock period
 – critical path: load instruction
 – Instruction memory -> register file -> ALU -> data memory -> register file
• Not feasible to vary the period for different instructions
• Violates design principle
 – making the common case fast
• We will improve performance by pipelining

Instruction Critical Paths

• Calculate cycle time assuming negligible delays (for muxes, control unit, sign extend, PC access, shift left 2, wires, setup and hold times) except:
 – Instruction and Data Memory (4 ns)
 – ALU and adders (2 ns)
 – Register File access (reads or writes) (1 ns)

Instr.   I Mem   Reg Rd   ALU Op   D Mem   Reg Wr   Total
R-type
load
store
beq
jump
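
One way to work through the table is to sum the stated delays along each instruction's path. The sketch below does that; which units each instruction class passes through is an assumption based on the datapath described in these slides (for example, that jump only needs the instruction fetch), and the names `DELAY_NS` and `PATHS` are hypothetical.

```python
# Delays given on the slide, in nanoseconds.
DELAY_NS = {"I Mem": 4, "Reg Rd": 1, "ALU Op": 2, "D Mem": 4, "Reg Wr": 1}

# Assumed sequence of non-negligible units used by each instruction class.
PATHS = {
    "R-type": ["I Mem", "Reg Rd", "ALU Op", "Reg Wr"],
    "load":   ["I Mem", "Reg Rd", "ALU Op", "D Mem", "Reg Wr"],
    "store":  ["I Mem", "Reg Rd", "ALU Op", "D Mem"],
    "beq":    ["I Mem", "Reg Rd", "ALU Op"],
    "jump":   ["I Mem"],
}

totals = {name: sum(DELAY_NS[unit] for unit in units) for name, units in PATHS.items()}
cycle_time = max(totals.values())  # the single clock period must cover the slowest path

for name, total in totals.items():
    print(f"{name:7s} {total:2d} ns")
print(f"cycle time: {cycle_time} ns")
```

Under these assumptions the load instruction sets the clock period, which agrees with the critical path named on the Performance Issues slide.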

Single Cycle Disadvantages & Advantages

• Uses the clock cycle inefficiently: the clock cycle must be timed to accommodate the slowest instruction
 – especially problematic for more complex instructions like floating point multiply
• May be wasteful of area, since some functional units (e.g., adders) must be duplicated because they cannot be shared during a clock cycle

but

• It is simple and easy to understand

[Figure: clock (Clk) timing for Cycle 1 (lw) and Cycle 2 (sw); the unused remainder of the sw cycle is marked "Waste"]