Computer Organization Lecture Set – 05.2 Chapter 5 Huei-Yung Lin.

Jan 05, 2016

Adam Carroll
Computer Organization

Lecture Set – 05.2

Chapter 5

Huei-Yung Lin

H.Y. Lin, CCUEE Computer Organization 2

Outline - Processor Implementation

Overview

Single-Cycle Implementation

Multi-Cycle Implementation

1. Analyze instruction set; get datapath requirements

2. Select datapath components and establish clocking methodology

3. Assemble datapath that meets requirements

4. Determine control signal values for each instruction

5. Assemble control logic to generate control signals

Pipelined Implementation

Outline - Multicycle Design

Overview
Datapath Design
Controller Design
Microprogramming
Exceptions
Performance Considerations

Multicycle Execution - Key Idea

Break instruction execution into multiple cycles

One clock cycle for each major task

1. Instruction Fetch

2. Instruction Decode and Register Fetch

3. Execution, memory address computation, or branch computation

4. Memory access / R-type instruction completion

5. Memory read completion

Share hardware to simplify datapath

Characteristics of Multicycle Design

Instructions take more than one cycle
  Some instructions take more cycles than others
  Clock cycle is shorter than single-cycle clock

Reuse of major components simplifies datapath
  Single ALU for all calculations
  Single memory for instructions and data
  But, added registers needed to store values across cycles

Control Unit implemented w/ State Machine
  Control signals no longer a function of just the instruction

Multicycle Datapath - High-Level View

(Equivalent to Book Fig. 5.25, p. 318)

[Figure: high-level multicycle datapath. Components: PC; a single Memory (ADDR, RD, WD; MemRead, MemWrite); IR; MDR; Registers (RN1, RN2, WN, WD, RD1, RD2; RegWrite); A; B; ALU (3-bit Operation input, Zero output); ALUOUT]

Review: Steps in Processor Design

1. Analyze instruction set; get datapath requirements

2. Select datapath components and establish clocking methodology

3. Assemble datapath that meets requirements

4. Determine control signal values for each instruction

5. Assemble control logic to generate control signals

Review: Register Transfers

Instruction Fetch
Instruction <= MEM[PC]; PC <= PC + 4;

Instruction Execution
Instr.  Register Transfers
add     R[rd] <= R[rs] + R[rt];
sub     R[rd] <= R[rs] - R[rt];
and     R[rd] <= R[rs] & R[rt];
or      R[rd] <= R[rs] | R[rt];
lw      R[rt] <= MEM[R[rs] + sign_extend(offset)];
sw      MEM[R[rs] + sign_extend(offset)] <= R[rt]; PC <= PC + 4
beq     if (R[rs] == R[rt]) then PC <= PC + 4 + (sign_extend(offset) << 2)
        else PC <= PC + 4
j       PC <= upper(PC) @ (address << 2)

Key idea: break into multiple cycles!
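The register transfers above depend on pulling fields out of the 32-bit instruction word and sign-extending the 16-bit offset. The following is a minimal Python sketch of that step, not part of the original slides. The helper names `sign_extend`, `fields`, and `beq_next_pc` are hypothetical, and `pc` is taken to be the address of the branch itself, as in the beq transfer above.

```python
def sign_extend(value, bits=16):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def fields(instr):
    """Split a 32-bit MIPS instruction word into the fields used on the slides."""
    return {
        "op": (instr >> 26) & 0x3F,      # IR[31-26]
        "rs": (instr >> 21) & 0x1F,      # IR[25-21]
        "rt": (instr >> 16) & 0x1F,      # IR[20-16]
        "rd": (instr >> 11) & 0x1F,      # IR[15-11]
        "funct": instr & 0x3F,           # IR[5-0]
        "offset": instr & 0xFFFF,        # IR[15-0]
        "address": instr & 0x03FFFFFF,   # IR[25-0]
    }


def beq_next_pc(pc, instr, R):
    """Next PC for beq: PC + 4 + (sign_extend(offset) << 2) if taken, else PC + 4."""
    f = fields(instr)
    if R[f["rs"]] == R[f["rt"]]:
        return (pc + 4 + (sign_extend(f["offset"]) << 2)) & 0xFFFFFFFF
    return (pc + 4) & 0xFFFFFFFF
```

A branch with offset -1 (0xFFFF) that is taken lands back on its own address, because the offset is counted in words relative to PC + 4.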

Multicycle Execution Steps

1. Instruction Fetch

2. Instruction Decode and Register Fetch (and branch target calculation)

3. One of the following:
   Execute R-Type Instruction OR
   Calculate memory address for load/store OR
   Perform comparison for branch

4. Memory access for load/store OR R-type instruction completion (save result)

5. Memory read completion (save result - load only)

Summary - Multicycle Execution

Instructions take between 3 and 5 clock cycles

(1) Instruction fetch (all instructions)
    IR = Memory[PC]
    PC = PC + 4

(2) Instruction decode / register fetch (all instructions)
    A = Reg[IR[25-21]]
    B = Reg[IR[20-16]]
    ALUOut = PC + (sign-extend(IR[15-0]) << 2)

(3) Execution, address computation, branch/jump completion
    R-type:            ALUOut = A op B
    Memory reference:  ALUOut = A + sign-extend(IR[15-0])
    Branch:            if (A == B) then PC = ALUOut
    Jump:              PC = PC[31-28] || (IR[25-0] << 2)

(4) Memory access or R-type completion
    R-type:  Reg[IR[15-11]] = ALUOut
    Load:    MDR = Memory[ALUOut]
    Store:   Memory[ALUOut] = B

(5) Memory read completion
    Load:  Reg[IR[20-16]] = MDR

(Book Fig. 5.30, p. 329)

New registers needed to store values across clock steps!
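The step table can be traced in software. The sketch below is an illustrative Python model, not the author's implementation. It follows the five steps for lw, sw, beq, j, and two R-type operations (add and sub), and reports how many cycles each instruction uses. The function name `run_instruction`, the dictionary-based memory, and the choice of supported funct codes are assumptions made for the example.

```python
def sign_extend16(x):
    """Sign-extend the low 16 bits of x."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def run_instruction(pc, regs, mem):
    """Run one instruction through the multicycle steps.

    mem maps word-aligned byte addresses to 32-bit words.
    Returns (new_pc, cycles_used).
    """
    # Step 1: instruction fetch
    ir = mem[pc]
    pc = (pc + 4) & 0xFFFFFFFF
    # Step 2: decode, register fetch, speculative branch target
    op = (ir >> 26) & 0x3F
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]
    alu_out = (pc + (sign_extend16(ir) << 2)) & 0xFFFFFFFF
    # Step 3: execution, address computation, or branch/jump completion
    if op == 0x02:                                   # j
        return (pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2), 3
    if op == 0x04:                                   # beq
        return (alu_out if a == b else pc), 3
    if op == 0x00:                                   # R-type (add, sub only)
        funct = ir & 0x3F
        if funct == 0x20:
            alu_out = (a + b) & 0xFFFFFFFF
        elif funct == 0x22:
            alu_out = (a - b) & 0xFFFFFFFF
        else:
            raise ValueError("funct not modelled in this sketch")
        # Step 4: R-type completion
        regs[(ir >> 11) & 0x1F] = alu_out
        return pc, 4
    if op in (0x23, 0x2B):                           # lw, sw
        alu_out = (a + sign_extend16(ir)) & 0xFFFFFFFF
        if op == 0x2B:
            mem[alu_out] = b                         # Step 4: store
            return pc, 4
        mdr = mem[alu_out]                           # Step 4: load access
        regs[(ir >> 16) & 0x1F] = mdr                # Step 5: read completion
        return pc, 5
    raise ValueError("opcode not modelled in this sketch")
```

The local variables `ir`, `a`, `b`, `alu_out`, and `mdr` play the role of the added registers: each holds a value produced in one step for use in a later one.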

Multicycle Execution Step (1)
Instruction Fetch

IR = Memory[PC];
PC = PC + 4;

[Figure: high-level datapath with the fetch path highlighted; the ALU adds 4 to the PC to produce PC + 4]

Multicycle Execution Step (2)
Instruction Decode and Register Fetch

A = Reg[IR[25-21]]; (A = Reg[rs])
B = Reg[IR[20-16]]; (B = Reg[rt])
ALUOut = PC + (sign-extend(IR[15-0]) << 2);

[Figure: high-level datapath with the register reads and branch target calculation highlighted. Labels: Branch Target Address, Reg[rs], Reg[rt], PC + 4]

Multicycle Execution Step (3)
Memory Reference Instructions

ALUOut = A + sign-extend(IR[15-0]);

[Figure: high-level datapath with the address computation highlighted. Labels: Mem. Address, Reg[rs], Reg[rt], PC + 4]

Multicycle Execution Step (3)
ALU Instruction (R-Type)

ALUOut = A op B

[Figure: high-level datapath with the ALU operation highlighted. Labels: R-Type Result, Reg[rs], Reg[rt], PC + 4]

Multicycle Execution Step (3)
Branch Instructions

if (A == B) PC = ALUOut;

[Figure: high-level datapath with the comparison and PC update highlighted. Labels: Branch Target Address, Reg[rs], Reg[rt]]

Multicycle Execution Step (3)
Jump Instruction

PC = PC[31-28] concat (IR[25-0] << 2)

[Figure: high-level datapath with the jump address path highlighted. Labels: Jump Address, Reg[rs], Reg[rt], Branch Target Address]

Multicycle Execution Step (4)
Memory Access - Read (lw)

MDR = Memory[ALUOut];

[Figure: high-level datapath with the memory read highlighted. Labels: Mem. Data, PC + 4, Reg[rs], Reg[rt], Mem. Address]

Multicycle Execution Step (4)
Memory Access - Write (sw)

Memory[ALUOut] = B;

[Figure: high-level datapath with the memory write highlighted. Labels: PC + 4, Reg[rs], Reg[rt]]

Multicycle Execution Step (4)
ALU Instruction (R-Type)

Reg[IR[15-11]] = ALUOut

[Figure: high-level datapath with the register write highlighted. Labels: R-Type Result, Reg[rs], Reg[rt], PC + 4]

Multicycle Execution Step (5)
Memory Read Completion (lw)

Reg[IR[20-16]] = MDR;

[Figure: high-level datapath with the write-back from MDR highlighted. Labels: PC + 4, Reg[rs], Reg[rt], Mem. Data, Mem. Address]


Full Multicycle Datapath

(Equivalent to Book Fig. 5.27, p. 322 without ALU control)

[Figure: full multicycle datapath. Adds to the high-level view: sign extension (EXTND, 16 to 32 bits), shift-left-2 units, the jump address path (jmpaddr I[25:0], <<2, CONCAT, 28 to 32 bits), and multiplexers controlled by IorD, RegDst, MemtoReg, ALUSrcA, ALUSrcB (inputs 0-3, including the constant 4), and PCSource (inputs 0-2). Other control inputs: PCWr*, IRWrite, RegWrite, MemRead, MemWrite]

Full Multicycle Implementation

(Equivalent to Book Fig. 5.28, p. 323)

[Figure: the full multicycle datapath plus the Control Unit (input: 6-bit op, I[31:26]) and ALU Control (inputs: 2-bit ALUOp and 6-bit funct, I[5:0]; output: 3-bit ALU Operation). The PC write enable is formed from PCWrite, PCWriteCond, and Zero]
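The ALU Control block in this figure turns the 2-bit ALUOp from the Control Unit, plus the funct field for R-type instructions, into an ALU operation. The sketch below illustrates that decoding in Python. It returns operation names rather than 3-bit codes, because the slides do not list the Operation encoding. The ALUOp meanings (00 add, 01 subtract, 10 use funct) follow the state diagram later in the lecture. The funct values are the standard MIPS encodings for add, sub, and, and or, and the function name `alu_control` is hypothetical.

```python
# Standard MIPS funct codes for the R-type instructions used in this lecture.
FUNCT_TO_OPERATION = {
    0x20: "add",
    0x22: "subtract",
    0x24: "and",
    0x25: "or",
}


def alu_control(alu_op, funct):
    """Choose the ALU operation from the 2-bit ALUOp and the 6-bit funct field."""
    if alu_op == 0b00:        # fetch, decode, and address computation states
        return "add"
    if alu_op == 0b01:        # branch comparison
        return "subtract"
    if alu_op == 0b10:        # R-type: the funct field decides
        return FUNCT_TO_OPERATION[funct]
    raise ValueError("ALUOp value not used in this design")
```

Keeping this decoding in a separate block means the main control unit only has to produce ALUOp and never looks at funct.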

Review: Steps in Processor Design

1. Analyze instruction set; get datapath requirements

2. Select datapath components and establish clocking methodology

3. Assemble datapath that meets requirements

4. Determine control signal values for each instruction

5. Assemble control logic to generate control signals

Multicycle Execution Step (1)
Fetch

IR = Memory[PC];
PC = PC + 4;

[Figure: full multicycle datapath annotated with the control signal values for instruction fetch]

Multicycle Execution Step (2)
Instruction Decode and Register Fetch

A = Reg[IR[25-21]]; (A = Reg[rs])
B = Reg[IR[20-16]]; (B = Reg[rt])
ALUOut = PC + (sign-extend(IR[15-0]) << 2);

[Figure: full multicycle datapath annotated with the control signal values for instruction decode and register fetch]

Multicycle Execution Step (3)
Memory Reference Instructions

ALUOut = A + sign-extend(IR[15-0]);

[Figure: full multicycle datapath annotated with the control signal values for memory address computation]

Multicycle Execution Step (3)
ALU Instruction (R-Type)

ALUOut = A op B

[Figure: full multicycle datapath annotated with the control signal values for R-type execution; the ALU Operation value is shown as ???]

Multicycle Execution Step (3)
Branch Instructions

if (A == B) PC = ALUOut;

[Figure: full multicycle datapath annotated with the control signal values for branch completion; the PC write enable is 1 if Zero=1]

Multicycle Execution Step (3)
Jump Instruction

PC = PC[31-28] concat (IR[25-0] << 2)

[Figure: full multicycle datapath annotated with the control signal values for jump completion]

Multicycle Execution Step (4)
Memory Access - Read (lw)

MDR = Memory[ALUOut];

[Figure: full multicycle datapath annotated with the control signal values for the memory read]

Multicycle Execution Step (4)
Memory Access - Write (sw)

Memory[ALUOut] = B;

[Figure: full multicycle datapath annotated with the control signal values for the memory write]

Multicycle Execution Step (4)
ALU Instruction (R-Type)

Reg[IR[15-11]] = ALUOut; (Reg[rd] = ALUOut)

[Figure: full multicycle datapath annotated with the control signal values for R-type completion]

Multicycle Execution Step (5)
Memory Read Completion (lw)

Reg[IR[20-16]] = MDR;

[Figure: full multicycle datapath annotated with the control signal values for memory read completion]

Review: Steps in Processor Design

1. Analyze instruction set; get datapath requirements

2. Select datapath components and establish clocking methodology

3. Assemble datapath that meets requirements

4. Determine control signal values for each instruction

5. Assemble control logic to generate control signals

Full Multicycle Implementation

[Figure: full multicycle datapath with Control Unit and ALU Control, repeated from the earlier Full Multicycle Implementation slide]

Multicycle Control Unit

Review: single-cycle implementation
  All control signals can be determined by current instruction + condition
  Implemented in combinational logic

Multicycle implementation is different
  Control signals depend on
    current instruction
    which clock cycle is currently being executed
  Implemented as Finite State Machine (FSM) OR
  Microprogrammed Implementation - "stylized FSM"

Review - Finite State Machines

What's in the Finite State Machine
  Combinational Logic
  Storage Elements (Flip-flops)

[Figure: FSM block diagram. Inputs and the Current State feed the Combinational Logic, which produces the Outputs and the Next State; D flip-flops hold the state]

Review - Finite State Machines

Behavior characterized by
  States - unique values of flip-flops, e.g., "101"
  Transitions between states, depending on
    current state
    inputs
  Output values, depending on
    current state
    inputs

Describing FSMs:
  State Diagrams
  State Transition Tables

Review - State Diagrams

"Bubbles" - states
Arrows - transition edges labeled with condition expressions

Example: Car Alarm

[Figure: car alarm block with inputs arm, door, and clk (fclk = 1Hz) and output honk]

[Figure: state diagram with states IDLE, BEEP (Honk=1), and WAIT. Edge labels: ARM•DOOR, ARM, ARM, ARM', and ARM' + ARM•DOOR' = ARM' + DOOR']

State Transition Table

Transition List - lists edges in State Diagram

PS    Condition     NS    Output
IDLE  ARM' + DOOR'  IDLE  0
IDLE  ARM*DOOR      BEEP  0
BEEP  ARM           WAIT  1
BEEP  ARM'          IDLE  1
WAIT  ARM           BEEP  0
WAIT  ARM'          IDLE  0

[Figure: car alarm state diagram, repeated from the previous slide]
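The transition list maps directly onto code: one function for the next state and one for the output. The Python sketch below is an illustrative transcription of the table, not part of the original slides. The function names `next_state` and `honk` are hypothetical, and the output is treated as a function of the present state only, matching the Output column.

```python
def next_state(state, arm, door):
    """Next state of the car alarm FSM, one row of the transition list per branch."""
    if state == "IDLE":
        # ARM*DOOR goes to BEEP; ARM' + DOOR' stays in IDLE
        return "BEEP" if (arm and door) else "IDLE"
    if state == "BEEP":
        return "WAIT" if arm else "IDLE"
    if state == "WAIT":
        return "BEEP" if arm else "IDLE"
    raise ValueError("unknown state: " + state)


def honk(state):
    """Output column of the table: 1 only in the BEEP state."""
    return 1 if state == "BEEP" else 0
```

With the 1 Hz clock from the state diagram slide, an armed alarm with the door opened alternates between BEEP and WAIT, so the horn sounds every other second until ARM is released.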

State Machine Design

Traditional Approach:
  Create State Diagram
  Create State Transition Table
  Assign State Codes
  Write Excitation Equations & Minimize

Modern Approach
  Enter FSM Description into CAD program
  Synthesize implementation of FSM

FSM Control Unit Implementation

FSM Inputs:
  Clock
  Instruction register op field

FSM Outputs: control signals for datapath
  Enable signals: Register file, Memory, PC, MDR, and IR
  Multiplexer signals: ALUSrcA, ALUSrcB, etc.
  ALUOp signal - used for ALU Control as in single-cycle implementation

Describing Controller Function in State Diagrams - Conventions

Enable Control Signals
  Asserted (true) in a state if they appear in the state bubble
  Assumed Deasserted (false) in a state if they do not appear in the state bubble

Multiplexer/ALU Control Signals
  Asserted with a given value in a state if they appear in a state bubble
  Assumed Don't Care if they do not appear in the state bubble

See Figures 5.32 - 5.36

[Figure: annotated example bubble for state 3, "Memory access". Callouts: state name; enable signal (MemRead); multiplexer signal (IorD = 1); transition from previous state (conditional, Op = 'LW'); transition to next state (unconditional)]

Multicycle Control - Full State Diagram

State 0, Instruction Fetch (Start): MemRead, ALUSrcA = 0, IorD = 0, IRWrite, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00
State 1, Instruction decode / register fetch: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
State 2, Memory address computation: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
State 3, Memory access (OP = 'LW'): MemRead, IorD = 1
State 4, Writeback step: RegWrite, MemToReg = 1, RegDst = 0
State 5, Memory access (OP = 'SW'): MemWrite, IorD = 1
State 6, Execution: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
State 7, R-type completion: RegDst = 1, RegWrite, MemtoReg = 0
State 8, Branch Completion: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond, PCSource = 01
State 9, Jump Completion (OP = 'JMP'): PCWrite, PCSource = 10

[Figure: state diagram connecting states 0 through 9; the individual transitions are shown on the Control FSM slides that follow]
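One way to check the state diagram is to write it as a table and walk it for each instruction class. The sketch below is an illustrative Python model, not the author's implementation. The signal values are transcribed from the state bubbles, and the transitions follow the per-state Control FSM slides later in the lecture. The names `OUTPUTS`, `next_state`, and `state_sequence`, and the lowercase instruction-class strings, are assumptions made for the example.

```python
# Signals asserted in each state; anything absent is deasserted or don't-care.
OUTPUTS = {
    0: {"MemRead": 1, "ALUSrcA": 0, "IorD": 0, "IRWrite": 1,
        "ALUSrcB": 0b01, "ALUOp": 0b00, "PCWrite": 1, "PCSource": 0b00},
    1: {"ALUSrcA": 0, "ALUSrcB": 0b11, "ALUOp": 0b00},
    2: {"ALUSrcA": 1, "ALUSrcB": 0b10, "ALUOp": 0b00},
    3: {"MemRead": 1, "IorD": 1},
    4: {"RegWrite": 1, "MemtoReg": 1, "RegDst": 0},
    5: {"MemWrite": 1, "IorD": 1},
    6: {"ALUSrcA": 1, "ALUSrcB": 0b00, "ALUOp": 0b10},
    7: {"RegDst": 1, "RegWrite": 1, "MemtoReg": 0},
    8: {"ALUSrcA": 1, "ALUSrcB": 0b00, "ALUOp": 0b01,
        "PCWriteCond": 1, "PCSource": 0b01},
    9: {"PCWrite": 1, "PCSource": 0b10},
}


def next_state(state, op):
    """op is one of 'lw', 'sw', 'r-type', 'beq', 'j'."""
    if state == 0:
        return 1
    if state == 1:
        return {"lw": 2, "sw": 2, "r-type": 6, "beq": 8, "j": 9}[op]
    if state == 2:
        return 3 if op == "lw" else 5
    if state == 3:
        return 4
    if state == 6:
        return 7
    return 0  # states 4, 5, 7, 8, and 9 all return to instruction fetch


def state_sequence(op):
    """States visited by one instruction, starting from fetch."""
    sequence, state = [0], next_state(0, op)
    while state != 0:
        sequence.append(state)
        state = next_state(state, op)
    return sequence
```

The length of each sequence is the cycle count of that instruction class: 5 for lw, 4 for sw and R-type, 3 for beq and j.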

Control FSM - Instruction Fetch / Decode States

State 0, Instruction Fetch (Start)
  MemRead, ALUSrcA = 0, IorD = 0, IRWrite, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00

State 1, Instruction Decode / Register Fetch
  ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00

Transitions out of State 1
  (Op = 'LW' or Op = 'SW') to State 2, Mem. Ref States
  (OP = 'R-Type') to State 6, Execution States
  (OP = 'BEQ') to State 8, Branch State
  (OP = 'J') to State 9, Jump State

Control FSM - Memory Reference States

State 2, Memory Address Computation, from State 1 (OP = 'LW' OR OP = 'SW')
  ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00

State 3, Memory Access (OP = 'LW')
  MemRead, IorD = 1

State 4, Write-back step
  RegWrite, MemToReg = 1, RegDst = 0

State 5, Memory Access (OP = 'SW')
  MemWrite, IorD = 1

States 4 and 5 go to State 0

Control FSM - R-Type States

State 6, Execution, from State 1 (OP = R-Type)
  ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10

State 7, R-type completion, to State 0
  RegDst = 1, RegWrite, MemtoReg = 0

Control FSM - Branch State

State 8, Branch Completion, from State 1 (OP = 'BEQ'), to State 0
  ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond, PCSource = 01

Control FSM - Jump State

State 9, Jump Completion, from State 1 (OP = 'J'), to State 0
  PCWrite, PCSource = 10

Controller Implementation

Typical Implementation: Figure 5-37, p. 338

Variations
  Random logic
  PLA
  ROM
    address lines = inputs
    data lines = outputs
    contents = "truth table"

[Figure: Combinational Control Logic with inputs from the Instr. Reg (opcode) and the current State; its outputs are the datapath control outputs and the Next State, which is stored in the state register]
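The ROM variation can be pictured as a lookup table whose address is the FSM input and whose contents are the truth table. The Python sketch below is illustrative only and covers just the next-state half of the table. It assumes a 4-bit state number and the 6-bit opcode concatenated into a 10-bit address, uses the standard MIPS opcodes for lw, sw, beq, and j (R-type is opcode 0), and sends every unlisted address to state 0. None of these layout choices come from the slides.

```python
# Standard MIPS opcodes for the instructions covered in this lecture.
OPCODE = {"r-type": 0x00, "j": 0x02, "beq": 0x04, "lw": 0x23, "sw": 0x2B}


def build_next_state_rom():
    """Return a 1024-entry list addressed by (state << 6) | opcode."""
    rom = [0] * (16 * 64)          # unlisted entries default to state 0
    for opcode in range(64):       # transitions that ignore the opcode
        rom[(0 << 6) | opcode] = 1
        rom[(3 << 6) | opcode] = 4
        rom[(6 << 6) | opcode] = 7
    rom[(1 << 6) | OPCODE["lw"]] = 2
    rom[(1 << 6) | OPCODE["sw"]] = 2
    rom[(1 << 6) | OPCODE["r-type"]] = 6
    rom[(1 << 6) | OPCODE["beq"]] = 8
    rom[(1 << 6) | OPCODE["j"]] = 9
    rom[(2 << 6) | OPCODE["lw"]] = 3
    rom[(2 << 6) | OPCODE["sw"]] = 5
    return rom


ROM = build_next_state_rom()


def next_state(state, opcode):
    """Address lines = inputs, data lines = outputs."""
    return ROM[(state << 6) | opcode]
```

Most of the 1024 entries are never reached or are identical across opcodes, which is the usual argument for a PLA or random logic when the table is sparse.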

Performance of a Multicycle Implementation

What is the CPI of the Multicycle Implementation?
Using measured instruction mix from SPECINT2000

lw      5 cycles  25%
sw      4 cycles  10%
R-type  4 cycles  52%
branch  3 cycles  11%
jump    3 cycles   2%

What is the CPI?
CPI = (5 cycles * 0.25) + (4 cycles * 0.10) + (4 cycles * 0.52) + (3 cycles * 0.11) + (3 cycles * 0.02)
CPI = 4.12 cycles per instruction
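The CPI is a weighted average of cycle counts, which is easy to check numerically. The short Python sketch below redoes the arithmetic with the instruction mix from this slide, using 52% for R-type so that the fractions sum to 100%. The variable names are illustrative.

```python
# (cycles, fraction of executed instructions) per instruction class
MIX = {
    "lw": (5, 0.25),
    "sw": (4, 0.10),
    "R-type": (4, 0.52),
    "branch": (3, 0.11),
    "jump": (3, 0.02),
}

# Weighted average: sum of cycles times frequency over all classes
cpi = sum(cycles * fraction for cycles, fraction in MIX.values())
total_fraction = sum(fraction for _, fraction in MIX.values())

print(f"fractions sum to {total_fraction:.2f}")
print(f"CPI = {cpi:.2f}")
```

The terms are 1.25 + 0.40 + 2.08 + 0.33 + 0.06, which gives the 4.12 quoted on the slide.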

Performance Continued

Assuming a 200ps clock, what is average execution time/instruction?
  Sec/Instr = 4.12 CPI * 200ps/cycle = 824ps/instr

How does this compare to the Single-Cycle Case?
  Sec/Instr = 1 CPI * 600ps/cycle = 600ps/instr
  Single-Cycle is about 1.37 times faster than Multicycle (824ps / 600ps)

Why is Single-Cycle faster than Multicycle?
  Branch & jump are the same speed (600ps vs 600ps)
  R-type & store are faster (600ps vs 800ps)
  Load word is faster (600ps vs 1000ps)
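The comparison can be reproduced from the clock periods and cycle counts alone. The Python sketch below uses the slide's numbers (200 ps multicycle clock, 600 ps single-cycle clock, CPI of 4.12). The function name `time_per_instruction_ps` is hypothetical.

```python
def time_per_instruction_ps(cpi, clock_ps):
    """Average time per instruction = CPI * clock period."""
    return cpi * clock_ps


multicycle = time_per_instruction_ps(4.12, 200)   # about 824 ps
single_cycle = time_per_instruction_ps(1, 600)    # 600 ps
speedup = multicycle / single_cycle               # about 1.37

# Per-class multicycle latency at 200 ps per cycle, for comparison with 600 ps
CYCLES = {"lw": 5, "sw": 4, "R-type": 4, "branch": 3, "jump": 3}
latency_ps = {name: cycles * 200 for name, cycles in CYCLES.items()}
```

Only branches and jumps match the single-cycle time. Every other class is slower, and those classes make up 87% of the mix, which is why the average comes out worse under these clock assumptions.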

Multicycle Example Problem

Extend the design to implement the "jr" (jump register) instruction:

jr rs    PC = Reg[rs]

Steps:
1. Review instruction requirements (register transfer)
2. Modify datapath
3. Modify control logic

Format:
0       rs      0       0       0       8
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
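The format row can be checked by packing the six fields into a 32-bit word. The Python sketch below assumes the standard MIPS R-type layout (op, rs, rt, rd, shamt, funct) with funct = 8 for jr. The function name `encode_jr` is hypothetical.

```python
def encode_jr(rs):
    """Encode `jr rs` as a 32-bit MIPS instruction word."""
    if not 0 <= rs < 32:
        raise ValueError("rs must be a register number from 0 to 31")
    op, rt, rd, shamt, funct = 0, 0, 0, 0, 8
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


word = encode_jr(31)   # jr $31, the usual subroutine return
print(f"{word:#010x}")
```

Because the op field is 0, the same as for add or sub, the control unit cannot tell jr apart from other R-type instructions using the op field alone.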

Example Problem: Datapath

What needs to be changed?

[Figure: full multicycle implementation (Control Unit, ALU Control, and datapath), with Reg[rs] labeled as the value that must reach the PC]

Example Problem: Control

What needs to be changed?

New state, entered when (OP = 'JR'):
  PCWrite, PCSource = 11

[Figure: full state diagram (states 0 through 9) from the Multicycle Control slide, with the new JR state added]

Exceptions - "Stuff Happens"

Definition: "unexpected change in control flow"

Used to handle runtime errors
  Overflow
  Undefined Instruction
  Hardware malfunction

Used to handle external events, "service" functions
  Interrupts - external I/O Device request
  Page fault - virtual memory
  System call - user request for OS action

Example: Undefined Instructions

[Figure: full state diagram (states 0 through 9) from the Multicycle Control slide, with one extra arrow out of state 1 for the case (OP ≠ R-Type, OP ≠ LW, OP ≠ SW, OP ≠ BEQ)]

What happens here ????

What Happens During an Exception

Save user state - register values, etc.
Take action to handle exception
Restore user state and continue execution if possible (e.g., emulate undefined instr.)

[Figure: a user program (add.s f0,f1,f2; srl r1,r2,2; beq r0,r1,L; sub r5,r3,r2; add r5,r4,r3; bne r4,r3,L2; add r3,r1,r2) raises "Exception: undefined instruction"; control passes to the Exception Handler (System), which ends with rfe, "Return from exception", back to the user program]

Adding Exceptions to the Multicycle Processor

Two exceptions (for now):
  Undefined instruction
  Arithmetic overflow

Add registers to architecture to save state
  EPC - Exception Program Counter (32 bits)
  Cause - records cause of exception (32 bits)
    Undefined instruction: Cause <- 0
    Arithmetic overflow: Cause <- 1

Alternatives used by other architectures
  Save PC on stack
  Communicate exception type using Exception Vector

Implementing Exceptions

Add datapath components - Figure 5.39
  ExceptionPC (EPC) - stores PC of offending instruction
  Cause Register - records the cause of the exception

Modify control - Figure 5.40
  Undefined Instruction - state 1 (Instruction decode)
  Overflow - state 7 (R-type completion)

Implementing Exceptions

Datapath modifications
  Calculate address of offending instruction (PC-4) and store in EPC
  Store 0 or 1 in Cause
  Add overflow output from ALU (described in Ch. 4)
  Assign "80000180 hex" to PC - fixed location of handler

Control modifications
  Undefined instruction: add "default" branch out of the Instruction Decode state (state 1)
  Overflow: test after execution state for R-type instructions
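The datapath changes amount to three register writes when an exception is taken. The Python sketch below illustrates them along with a signed-overflow test for a 32-bit add. It assumes the handler sits at the standard MIPS address 0x80000180 and that the PC has already been incremented during fetch, which is why EPC receives PC - 4. The names `add_overflows` and `take_exception` are hypothetical.

```python
HANDLER_ADDRESS = 0x80000180          # assumed fixed location of the handler
CAUSE_UNDEFINED, CAUSE_OVERFLOW = 0, 1


def add_overflows(a, b):
    """True if the 32-bit two's-complement sum of a and b overflows."""
    total = (a + b) & 0xFFFFFFFF
    sign_a, sign_b, sign_total = a >> 31 & 1, b >> 31 & 1, total >> 31 & 1
    # Overflow only when both operands share a sign and the result does not.
    return sign_a == sign_b and sign_total != sign_a


def take_exception(pc, cause):
    """Return (new_pc, epc, cause) for an exception detected after fetch."""
    epc = (pc - 4) & 0xFFFFFFFF       # address of the offending instruction
    return HANDLER_ADDRESS, epc, cause
```

For example, an add at address 0x00400004 that overflows leaves EPC = 0x00400004 and Cause = 1, and the next fetch comes from the handler.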

Adding Exception Support to Datapath

[Figure: full multicycle datapath extended with the EPC register (EPCWrite), the CAUSE register (CauseWrite) fed by a multiplexer that selects 0 or 1 under IntCause, an Overflow output from the ALU, and a fourth PCSource multiplexer input (3) carrying the handler address 80000180 hex]

Adding Exception Support to Control - Undefined Instruction

State 1, Instruction Decode / Register Fetch, from State 0
  ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00

Transitions out of State 1
  (Op = 'LW' or Op = 'SW') to State 2, Mem. Ref States
  (OP = 'R-Type') to State 6, Execution States
  (OP = 'BEQ') to State 8, Branch State
  (OP = 'J') to State 9, Jump State
  (OP = Other) to State 10, Undefined Instruction

State 10, Undefined Instruction, to State 0
  IntCause = 0, CauseWrite, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 01, EPCWrite, PCWrite, PCSource = 11

Adding Exception Support to Control - Arithmetic Overflow

State 6, Execution, from State 1 (OP = R-Type)
  ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10

State 7, R-type completion
  RegDst = 1, RegWrite, MemtoReg = 0
  No Overflow: to State 0
  Overflow: to the Overflow state

State 10, Overflow, to State 0
  IntCause = 1, CauseWrite, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 01, EPCWrite, PCWrite, PCSource = 11

Note: requires storage of overflow condition from previous state

What Makes Exceptions Hard

The "real" MIPS architecture requires that instruction causing exception must have "no effect"
  Implication: undo effects of instruction
    decrement PC
    prevent storage of result during R-type instruction

What about "recursive" exceptions?
  Must add instructions to save exception registers on stack
  Disable exceptions / interrupts until registers saved
  More details in Appendix A (A.7)

Summary - Exceptions

Must consider as part of overall design
Must find "convenient" places to detect exceptions
Must find way to cleanly return from exception
Must keep control "small and fast"
Much harder in pipelined implementations!