Lecture 6: Dynamic Scheduling with Scoreboarding and Tomasulo Algorithm (Section 2.4)
Nov. 9, 2004


Feb 23, 2016


Jewel



Scoreboard Implications
• Out-of-order completion => WAR, WAW hazards
• Solutions for WAR
  – CDC 6600: Stall Write to allow Reads to take place; read registers only during the Read Operands stage.
• For WAW, must detect hazard: stall in the Issue stage until the other instruction completes
• Need to have multiple instructions in execution phase => multiple execution units or pipelined execution units
• Scoreboard replaces ID with 2 stages (Issue and RO)
• Scoreboard keeps track of dependencies and the state of operations
  – Monitors every change in the hardware.
  – Determines when an instruction can read operands, when it can execute, and when it can write back.
  – Hazard detection and resolution is centralized.

Four Stages of Scoreboard Control
1. Issue: decode instructions & check for structural hazards (ID1)
   If a functional unit for the instruction is free and no other active instruction has the same destination register (WAW), the scoreboard issues the instruction to the functional unit and updates its internal data structure. If a structural or WAW hazard exists, then the instruction issue stalls, and no further instructions will issue until these hazards are cleared.

2. Read operands: wait until no data hazards, then read operands (ID2)
   A source operand is available if no earlier issued active instruction is going to write it, or if the register containing the operand is being written by a currently active functional unit. When the source operands are available, the scoreboard tells the functional unit to proceed to read the operands from the registers and begin execution. The scoreboard resolves RAW hazards dynamically in this step, and instructions may be sent into execution out of order.
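The two checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary layout, and the snapshot values are hypothetical and are not taken from the lecture.

```python
def can_issue(fu_busy, result_status, fu, dest):
    """Issue stage: the unit must be free (no structural hazard) and no
    active instruction may already target dest (no WAW hazard)."""
    no_structural_hazard = not fu_busy[fu]
    no_waw_hazard = result_status.get(dest) is None
    return no_structural_hazard and no_waw_hazard


def can_read_operands(rj, rk):
    """Read-operands stage: proceed only when both source flags say 'ready'."""
    return rj and rk


# Hypothetical snapshot: which units are busy, and which unit will write each register.
fu_busy = {"Integer": True, "Mult1": True, "Mult2": False, "Add": True, "Divide": False}
result_status = {"F0": "Mult1", "F2": "Integer", "F8": "Add"}

print(can_issue(fu_busy, result_status, "Divide", "F10"))  # DIVD F10,F0,F6
print(can_issue(fu_busy, result_status, "Add", "F6"))      # ADDD F6,F8,F2 (Add unit busy)
print(can_issue(fu_busy, result_status, "Mult2", "F0"))    # hypothetical second write to F0
print(can_read_operands(rj=False, rk=True))                # one operand still pending
```

In this sketch a busy unit models the structural hazard and a non-empty register-result entry models the WAW hazard; either one blocks issue.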

Four Stages of Scoreboard Control
3. Execution: operate on operands (EX)
   The functional unit begins execution upon receiving operands. When the result is ready, it notifies the scoreboard that it has completed execution.

4. Write result: finish execution (WB)
   Once the scoreboard is aware that the functional unit has completed execution, the scoreboard checks for WAR hazards. If there are none, it writes the result. If there is a WAR hazard, it stalls the instruction.
   Example:
      DIVD  F0,F2,F4
      ADDD  F10,F0,F8
      SUBD  F8,F8,F14
   The CDC 6600 scoreboard would stall SUBD until ADDD reads its operands.

The scoreboarded machine in Fig. A.50 has one integer unit, 2 FP multipliers, 1 FP divider, and 1 FP adder.
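A minimal sketch of the WAR check at the write-result stage, applied to the DIVD/ADDD/SUBD example. The record layout and the names `FUStatus` and `can_write_result` are hypothetical. The sketch also assumes two adder units (`Add1`, `Add2`) so that ADDD and SUBD can be in flight together; with a single adder, SUBD would already stall at issue.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FUStatus:
    busy: bool = False
    fi: Optional[str] = None   # destination register
    fj: Optional[str] = None   # first source register
    fk: Optional[str] = None   # second source register
    rj: bool = False           # True = Fj is ready and has not been read yet
    rk: bool = False           # True = Fk is ready and has not been read yet


def can_write_result(units: Dict[str, FUStatus], name: str) -> bool:
    """A unit may write only if no other active unit still has to read
    the old value of its destination register (no WAR hazard)."""
    dest = units[name].fi
    for other, u in units.items():
        if other == name or not u.busy:
            continue
        if (u.fj == dest and u.rj) or (u.fk == dest and u.rk):
            return False
    return True


units = {
    "Divide": FUStatus(True, "F0", "F2", "F4"),                    # DIVD, executing
    "Add1": FUStatus(True, "F10", "F0", "F8", rj=False, rk=True),  # ADDD, waiting for F0
    "Add2": FUStatus(True, "F8", "F8", "F14"),                     # SUBD, finished executing
}
stalled = not can_write_result(units, "Add2")

# Later, once ADDD has read its operands, its ready flags are cleared.
units["Add1"].rj = units["Add1"].rk = False
released = can_write_result(units, "Add2")
print(stalled, released)
```

The stall lasts exactly as long as ADDD holds a ready-but-unread flag on F8, which is the condition the slide describes.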

Three Parts of the Scoreboard
1. Instruction status: which of the 4 steps the instruction is in

2. Functional unit status: indicates the state of the functional unit (FU). 9 fields for each functional unit:
   Busy: Indicates whether the unit is busy or not
   Op: Operation to perform in the unit (e.g., + or –)
   Fi: Destination register
   Fj, Fk: Source-register numbers
   Qj, Qk: Functional units producing source registers Fj, Fk
   Rj, Rk: Flags indicating when Fj, Fk are ready and not yet read. Set to No after the operands are read.

3. Register result status: indicates which functional unit will write each register, if one exists. Blank when no pending instruction will write that register.
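To make the relationship between the functional unit status and the register result status concrete, here is a small sketch of the bookkeeping done when an instruction issues. The function name `issue` and the dictionary representation are assumptions for illustration; only the nine field names come from the slide.

```python
def issue(fu_status, result_status, fu, op, dest, src1, src2):
    """Fill the nine FU fields at issue time and record the pending write."""
    qj = result_status.get(src1)   # unit that will produce src1, if any
    qk = result_status.get(src2)   # unit that will produce src2, if any
    fu_status[fu] = {
        "Busy": True, "Op": op,
        "Fi": dest, "Fj": src1, "Fk": src2,
        "Qj": qj, "Qk": qk,
        "Rj": qj is None,          # ready only if nobody is still producing it
        "Rk": qk is None,
    }
    result_status[dest] = fu       # this unit will write dest
    return fu_status[fu]


# MULTD F0,F2,F4 issues while the Integer unit is still loading F2.
fu_status = {}
result_status = {"F2": "Integer"}
row = issue(fu_status, result_status, "Mult1", "Mult", "F0", "F2", "F4")
print(row)
print(result_status)
```

The resulting row (Qj = Integer, Rj = No, Rk = Yes) has the same shape as the Mult1 entry in the Cycle 7 scoreboard example.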

Scoreboard Example Cycle 7

Instruction status
Instruction  j    k    Issue  Read operands  Execution complete  Write result
LD     F6    34+  R2   1      2              3                   4
LD     F2    45+  R3   5      6              7
MULTD  F0    F2   F4   6
SUBD   F8    F6   F2   7
DIVD   F10   F0   F6
ADDD   F6    F8   F2

Functional unit status
(Fi = dest, Fj = S1, Fk = S2, Qj = FU for j, Qk = FU for k, Rj = Fj ready?, Rk = Fk ready?)
Name     Busy  Op    Fi  Fj  Fk  Qj       Qk       Rj   Rk
Integer  Yes   Load  F2      R3                         No
Mult1    Yes   Mult  F0  F2  F4  Integer           No   Yes
Mult2    No
Add      Yes   Subd  F8  F6  F2           Integer  Yes  No
Divide   No

Register result status
Clock  F0     F2       F4  F6  F8   F10  F12  ...  F30
7      Mult1  Integer          Add

Note:
(1) In-order issue
(2) I2 could not be issued at cycle 2 due to a structural hazard
(3) I3 issued in cycle 6, but stalled at read operands because I2 isn't complete


Lec. 7, Nov. 2, 2004

Review: Scoreboard
• Limitations of 6600 scoreboard
  – No forwarding
  – Limited to instructions in basic block (small window)
  – Limited number of functional units (structural hazards)
  – Stall on WAR hazards
  – Stall on WAW hazards

  DIV.D F0, F2, F4
  ADD.D F6, F0, F8
  S.D   F6, 0(R1)
  SUB.D F8, F10, F14
  MUL.D F6, F10, F8

  WAR (antidependence): SUB.D writes F8, which the earlier ADD.D reads
  WAW (output dependence): MUL.D writes F6, which the earlier ADD.D also writes
  Both are name dependences.
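As an illustration, the name dependences in the five-instruction sequence can be found mechanically. The sketch below is a simplification with hypothetical names: it compares every pair of instructions and does not account for intervening writes, which is enough for this short sequence.

```python
def name_dependences(program):
    """Return (kind, earlier index, later index, register) for each
    antidependence (WAR) and output dependence (WAW) in program."""
    found = []
    for i, (_, dest_i, srcs_i) in enumerate(program):
        for j in range(i + 1, len(program)):
            _, dest_j, _ = program[j]
            if dest_j is None:            # stores write no register
                continue
            if dest_j in srcs_i:
                found.append(("WAR", i, j, dest_j))
            if dest_i is not None and dest_j == dest_i:
                found.append(("WAW", i, j, dest_j))
    return found


# (opcode, destination register or None, source registers)
program = [
    ("DIV.D", "F0", ["F2", "F4"]),
    ("ADD.D", "F6", ["F0", "F8"]),
    ("S.D", None, ["F6", "R1"]),
    ("SUB.D", "F8", ["F10", "F14"]),
    ("MUL.D", "F6", ["F10", "F8"]),
]
deps = name_dependences(program)
for dep in deps:
    print(dep)
```

Besides the two dependences marked on the slide, the scan also reports the WAR pair between S.D (reads F6) and MUL.D (writes F6).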

Dynamic Scheduling: Tomasulo Algorithm
• For IBM 360/91, about 3 years after the CDC 6600, which introduced scoreboarding
• Goal: High performance without special compilers
• Differences between Tomasulo Algorithm & Scoreboard
  – Control & buffers distributed with functional units vs. centralized in scoreboard; the distributed buffers are called “reservation stations”
  – Registers in instructions replaced by pointers to reservation station buffer
  – HW renaming of registers to avoid WAW hazards
  – Buffer operand values to avoid WAR hazards
  – Common Data Bus broadcasts results to all FUs
  – Loads and stores treated as FUs as well
• Why study? Led to Alpha 21264, HP 8000, MIPS 10000, Pentium II, PowerPC 604 …

[Figure: FP unit and load-store unit using Tomasulo’s algorithm]

Dynamic Algorithm: Tomasulo Algorithm
  DIV.D F0, F2, F4
  ADD.D S, F0, F8
  S.D   S, 0(R1)          (register renaming)
  SUB.D T, F10, F14
  MUL.D F6, F10, T

• Implemented through reservation stations (rs) per functional unit
  – Buffers an operand as soon as it is available: avoids WAR hazards.
  – Pending instructions designate the rs that will provide their inputs: avoids WAW hazards.
  – The last write in a sequence of writes to the same register is the one that actually updates the register
  – Decentralizes hazard detection and execution control
  – Instruction results are passed directly to the FUs from the rs, through the common data bus (CDB), rather than from registers
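The renamed sequence above (with temporaries S and T) can be reproduced with a small software sketch. This is not how the hardware does it: Tomasulo's scheme renames implicitly through reservation station tags. The function `rename` and its rule (give a destination a fresh name if it is written again later, or if its original value was read earlier) are assumptions chosen to mirror the slide.

```python
def rename(program, fresh_names):
    """Rename destinations that take part in a WAR or WAW name dependence."""
    fresh = iter(fresh_names)
    mapping = {}             # architectural register -> current name
    read_original = set()    # registers whose original value was read
    renamed = []
    for k, (op, dest, srcs) in enumerate(program):
        new_srcs = []
        for s in srcs:
            if s not in mapping:
                read_original.add(s)
            new_srcs.append(mapping.get(s, s))
        new_dest = dest
        if dest is not None:
            written_later = any(d == dest for _, d, _ in program[k + 1:])
            if written_later or dest in read_original:
                new_dest = next(fresh)      # break the name dependence
            mapping[dest] = new_dest
        renamed.append((op, new_dest, new_srcs))
    return renamed


# (opcode, destination register or None, source registers)
program = [
    ("DIV.D", "F0", ["F2", "F4"]),
    ("ADD.D", "F6", ["F0", "F8"]),
    ("S.D", None, ["F6", "R1"]),
    ("SUB.D", "F8", ["F10", "F14"]),
    ("MUL.D", "F6", ["F10", "F8"]),
]
renamed = rename(program, ["S", "T"])
for instr in renamed:
    print(instr)
```

Only the last write to F6 keeps the architectural name, which matches the slide's point that the final write in a same-register sequence is the one that updates the register.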

Three Stages of Tomasulo Algorithm
1. Issue: get instruction from FP Op Queue
   Stall if there is a structural hazard, i.e., no space in the rs. If a reservation station (rs) is free, the issue logic issues the instruction to the rs and reads the operands into the rs if they are ready (register renaming => solves WAR). Mark the destination register as waiting for this latest instruction, even if a previous instruction writing to this register hasn’t completed => solves WAW hazards.

2. Execution: operate on operands (EX)
   When both operands are ready, execute; if not ready, watch the CDB for the result => solves RAW.

3. Write result: finish execution (WB)
   Write on the Common Data Bus to all awaiting units; mark the reservation station available. Write the result into the destination register if its status is r (the rs producing the result) => solves WAW.

• Normal data bus: data + destination (“go to” bus)
• CDB: data + source (“come from” bus)
  – 64 bits of data + 4 bits of Functional Unit source address
  – Write if matches expected Functional Unit (produces result)
  – Does broadcast
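The “come from” behavior of the CDB can be sketched as a broadcast that every waiting consumer matches against its own tag. The function name `broadcast`, the dictionary layout, and the use of `None` for “no pending producer” are assumptions; values are kept symbolic, as in the slides.

```python
def broadcast(tag, value, stations, reg_status, regs):
    """Put (source tag, value) on the bus; every reservation station and
    register that is waiting for that tag captures the value."""
    for rs in stations.values():
        if rs["Qj"] == tag:
            rs["Vj"], rs["Qj"] = value, None
        if rs["Qk"] == tag:
            rs["Vk"], rs["Qk"] = value, None
    for reg, producer in reg_status.items():
        if producer == tag:          # write only if this is the expected producer
            regs[reg] = value
            reg_status[reg] = None


# State just before Load2 writes its result.
stations = {
    "Add1": {"Op": "Sub", "Vj": "M(A1)", "Vk": None, "Qj": None, "Qk": "Load2"},
    "Mult1": {"Op": "Mult", "Vj": None, "Vk": "R(F4)", "Qj": "Load2", "Qk": None},
}
reg_status = {"F0": "Mult1", "F2": "Load2", "F8": "Add1"}
regs = {}

broadcast("Load2", "M(A2)", stations, reg_status, regs)
print(stations["Add1"], stations["Mult1"], regs, sep="\n")
```

Because the bus carries the source tag rather than a destination, one broadcast updates Add1, Mult1, and register F2 at the same time, and registers waiting on a different tag (F0, F8) are left untouched.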

Reservation Station Components
Op: Operation to perform in the unit (e.g., + or –)
Vj, Vk: Values of the source operands
Qj, Qk: Names of the RSs that will provide the source operands. A value of zero means the source operand is already available in Vj or Vk, or is not necessary.
Busy: Indicates the reservation station or FU is busy

Register File Status Qi
Qi: Indicates which functional unit will write each register, if one exists. Blank (0) when no pending instruction will write that register, meaning that the value is already available.
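A minimal sketch of how these fields might be filled when an instruction is issued to a reservation station. The helper `issue_to_station` is hypothetical; `None` plays the role of the slide's zero/blank tag, and `R(Fx)` is a symbolic stand-in for a value read from the register file.

```python
def issue_to_station(name, op, dest, src_j, src_k, reg_status):
    """Fill Op, Vj/Vk, Qj/Qk for a newly issued instruction and make the
    destination register wait for this station (the renaming step)."""
    qj = reg_status.get(src_j)     # pending producer of the first operand
    qk = reg_status.get(src_k)     # pending producer of the second operand
    station = {
        "Busy": True, "Op": op,
        "Vj": None if qj else f"R({src_j})",   # capture the value if available
        "Vk": None if qk else f"R({src_k})",
        "Qj": qj, "Qk": qk,
    }
    reg_status[dest] = name        # Qi of dest now names this station
    return station


# MULTD F0,F2,F4 issues while the loads of F2 and F6 are still pending.
reg_status = {"F2": "Load2", "F6": "Load1"}
mult1 = issue_to_station("Mult1", "Mult", "F0", "F2", "F4", reg_status)
print(mult1)
print(reg_status)
```

The result (Vk = R(F4), Qj = Load2, and F0 waiting on Mult1) has the same shape as the Mult1 row in the Cycle 3 example.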

Tomasulo Status (p. 99)

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   x      x              x
LD     F2    45+  R3   x      x
MULTD  F0    F2   F4   x
SUBD   F8    F6   F2   x
DIVD   F10   F0   F6   x
ADDD   F6    F8   F2   x

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj                Vk                Qj     Qk     A
      Load1  No
      Load2  Yes   Load                                                    45+Regs[R3]
0     Add1   Yes   SUB   Mem[34+Regs[R2]]                           Load2
0     Add2   Yes   ADD                                       Add1   Load2
      Add3   No
0     Mult1  Yes   MUL                     Regs[F4]          Load2
0     Mult2  Yes   DIV                     Mem[34+Regs[R2]]  Mult1

Register result status
Field  F0  F2  F4  F6  F8  F10  F12

Tomasulo Example Cycle 0

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2
LD     F2    45+  R3
MULTD  F0    F2   F4
SUBD   F8    F6   F2
DIVD   F10   F0   F6
ADDD   F6    F8   F2

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
0     Add2   No
      Add3   No
0     Mult1  No
0     Mult2  No

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
0

Tomasulo Example Cycle 1

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1
LD     F2    45+  R3
MULTD  F0    F2   F4
SUBD   F8    F6   F2
DIVD   F10   F0   F6
ADDD   F6    F8   F2

Load buffers
Name   Busy  Address
Load1  Yes   34+R2
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
0     Add2   No
      Add3   No
0     Mult1  No
0     Mult2  No

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
1                        Load1

Tomasulo Example Cycle 2

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2-
LD     F2    45+  R3   2
MULTD  F0    F2   F4
SUBD   F8    F6   F2
DIVD   F10   F0   F6
ADDD   F6    F8   F2

Note: Assume a load takes 2 cycles.

Load buffers
Name   Busy  Address
Load1  Yes   34+R2
Load2  Yes   45+R3
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
0     Add2   No
      Add3   No
0     Mult1  No
0     Mult2  No

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
2             Load2      Load1

Tomasulo Example Cycle 3

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3
LD     F2    45+  R3   2      3-
MULTD  F0    F2   F4   3
SUBD   F8    F6   F2
DIVD   F10   F0   F6
ADDD   F6    F8   F2

Load buffers
Name   Busy  Address
Load1  Yes   34+R2
Load2  Yes   45+R3
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
0     Add2   No
      Add3   No
0     Mult1  Yes   Mult         R(F4)  Load2
0     Mult2  No

Note: R(F4) is the value read from register F4 at issue.

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
3      Mult1  Load2      Load1

Tomasulo Example Cycle 4

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4
MULTD  F0    F2   F4   3
SUBD   F8    F6   F2   4
DIVD   F10   F0   F6
ADDD   F6    F8   F2

Load buffers
Name   Busy  Address
Load1  No
Load2  Yes   45+R3
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   Yes   Sub   M(A1)                Load2
0     Add2   No
      Add3   No
0     Mult1  Yes   Mult         R(F4)  Load2
0     Mult2  No

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
4      Mult1  Load2      M(A1)  Add1

Tomasulo Example Cycle 5

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3
SUBD   F8    F6   F2   4
DIVD   F10   F0   F6   5
ADDD   F6    F8   F2

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
2     Add1   Yes   Sub   M(A1)  M(A2)
0     Add2   No
      Add3   No
10    Mult1  Yes   Mult  M(A2)  R(F4)
0     Mult2  Yes   Div          M(A1)  Mult1

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
5      Mult1  M(A2)      M(A1)  Add1   Mult2

Tomasulo Example Cycle 6

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--
SUBD   F8    F6   F2   4      6--
DIVD   F10   F0   F6   5
ADDD   F6    F8   F2   6

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
1     Add1   Yes   Sub   M(A1)  M(A2)
0     Add2   Yes   Add          M(A2)  Add1
      Add3   No
9     Mult1  Yes   Mult  M(A2)  R(F4)
0     Mult2  Yes   Div          M(A1)  Mult1

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
6      Mult1  M(A2)      Add2   Add1   Mult2

Tomasulo Example Cycle 7

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--
SUBD   F8    F6   F2   4      6--7
DIVD   F10   F0   F6   5
ADDD   F6    F8   F2   6

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   Yes   Sub   M(A1)  M(A2)
0     Add2   Yes   Add          M(A2)  Add1
      Add3   No
8     Mult1  Yes   Mult  M(A2)  R(F4)
0     Mult2  Yes   Div          M(A1)  Mult1

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
7      Mult1  M(A2)      Add2   Add1   Mult2

Tomasulo Example Cycle 8

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--
SUBD   F8    F6   F2   4      6--7           8
DIVD   F10   F0   F6   5
ADDD   F6    F8   F2   6

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
2     Add2   Yes   Add   M1-M2  M(A2)
      Add3   No
7     Mult1  Yes   Mult  M(A2)  R(F4)
0     Mult2  Yes   Div          M(A1)  Mult1

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
8      Mult1  M(A2)      Add2   M1-M2  Mult2

Tomasulo Example Cycle 9

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--
SUBD   F8    F6   F2   4      6--7           8
DIVD   F10   F0   F6   5
ADDD   F6    F8   F2   6      9--

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
1     Add2   Yes   Add   M1-M2  M(A2)
      Add3   No
6     Mult1  Yes   Mult  M(A2)  R(F4)
0     Mult2  Yes   Div          M(A1)  Mult1

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
9      Mult1  M(A2)      Add2   M1-M2  Mult2

Tomasulo Example Cycle 10

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--
SUBD   F8    F6   F2   4      6--7           8
DIVD   F10   F0   F6   5
ADDD   F6    F8   F2   6      9--10

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
0     Add2   Yes   Add   M1-M2  M(A2)
      Add3   No
5     Mult1  Yes   Mult  M(A2)  R(F4)
0     Mult2  Yes   Div          M(A1)  Mult1

Register result status
Clock  F0     F2     F4  F6     F8     F10    F12  ...  F30
10     Mult1  M(A2)      Add2   M1-M2  Mult2

Tomasulo Example Cycle 11

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--
SUBD   F8    F6   F2   4      6--7           8
DIVD   F10   F0   F6   5
ADDD   F6    F8   F2   6      9--10          11

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
      Add2   No
      Add3   No
4     Mult1  Yes   Mult  M(A2)  R(F4)
0     Mult2  Yes   Div          M(A1)  Mult1

Register result status
Clock  F0     F2     F4  F6            F8     F10    F12  ...  F30
11     Mult1  M(A2)      M1-M2+M(A2)   M1-M2  Mult2

Tomasulo Example Cycle 12

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--
SUBD   F8    F6   F2   4      6--7           8
DIVD   F10   F0   F6   5
ADDD   F6    F8   F2   6      9--10          11

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
      Add2   No
      Add3   No
3     Mult1  Yes   Mult  M(A2)  R(F4)
0     Mult2  Yes   Div          M(A1)  Mult1

Register result status
Clock  F0     F2     F4  F6            F8     F10    F12  ...  F30
12     Mult1  M(A2)      M1-M2+M(A2)   M1-M2  Mult2

Tomasulo Example Cycle 15

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--15
SUBD   F8    F6   F2   4      6--7           8
DIVD   F10   F0   F6   5
ADDD   F6    F8   F2   6      9--10          11

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
      Add2   No
      Add3   No
0     Mult1  Yes   Mult  M(A2)  R(F4)
0     Mult2  Yes   Div          M(A1)  Mult1

Register result status
Clock  F0     F2     F4  F6            F8     F10    F12  ...  F30
15     Mult1  M(A2)      M1-M2+M(A2)   M1-M2  Mult2

Tomasulo Example Cycle 16

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--15          16
SUBD   F8    F6   F2   4      6--7           8
DIVD   F10   F0   F6   5
ADDD   F6    F8   F2   6      9--10          11

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
      Add2   No
      Add3   No
      Mult1  No
40    Mult2  Yes   Div   M*F4   M(A1)

Register result status
Clock  F0     F2     F4  F6            F8     F10    F12  ...  F30
16     M*F4   M(A2)      M1-M2+M(A2)   M1-M2  Mult2

Tomasulo Example Cycle 56

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--15          16
SUBD   F8    F6   F2   4      6--7           8
DIVD   F10   F0   F6   5      17--56
ADDD   F6    F8   F2   6      9--10          11

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
      Add2   No
      Add3   No
      Mult1  No
0     Mult2  Yes   Div   M*F4   M(A1)

Register result status
Clock  F0     F2     F4  F6            F8     F10    F12  ...  F30
56     M*F4   M(A2)      M1-M2+M(A2)   M1-M2  Mult2

Tomasulo Example Cycle 57

Instruction status
Instruction  j    k    Issue  Exec complete  Write result
LD     F6    34+  R2   1      2--3           4
LD     F2    45+  R3   2      3--4           5
MULTD  F0    F2   F4   3      6--15          16
SUBD   F8    F6   F2   4      6--7           8
DIVD   F10   F0   F6   5      17--56         57
ADDD   F6    F8   F2   6      9--10          11

Load buffers
Name   Busy  Address
Load1  No
Load2  No
Load3  No

Reservation stations (Vj/Vk: S1/S2 values; Qj/Qk: RS for j/k)
Time  Name   Busy  Op    Vj     Vk     Qj     Qk
0     Add1   No
      Add2   No
      Add3   No
      Mult1  No
0     Mult2  No

Register result status
Clock  F0     F2     F4  F6            F8     F10     F12  ...  F30
57     M*F4   M(A2)      M1-M2+M(A2)   M1-M2  result
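The issue, execution-complete, and write-result cycles in the example can be cross-checked with a simple dataflow timing model. This is a sketch under stated assumptions, not a full Tomasulo simulator: it issues one instruction per cycle in order, ignores structural hazards and CDB contention, and lets an instruction start executing one cycle after it has issued and all its producers have written their results. The load latency of 2 cycles is stated in the Cycle 2 slide; the latencies of 2 (add/subtract), 10 (multiply), and 40 (divide) are inferred from the Time column of the example. All names are hypothetical.

```python
LATENCY = {"LD": 2, "ADDD": 2, "SUBD": 2, "MULTD": 10, "DIVD": 40}

# (opcode, destination, floating-point source registers)
PROGRAM = [
    ("LD", "F6", []),
    ("LD", "F2", []),
    ("MULTD", "F0", ["F2", "F4"]),
    ("SUBD", "F8", ["F6", "F2"]),
    ("DIVD", "F10", ["F0", "F6"]),
    ("ADDD", "F6", ["F8", "F2"]),
]


def timeline(program, latency):
    """Return issue, first/last execution cycle, and write cycle per instruction."""
    last_writer = {}   # register -> index of the latest earlier writer (renaming)
    rows = []
    for i, (op, dest, srcs) in enumerate(program):
        issue = i + 1
        start = issue + 1
        for s in srcs:
            if s in last_writer:   # RAW: wait for the producer's CDB broadcast
                start = max(start, rows[last_writer[s]]["write"] + 1)
        complete = start + latency[op] - 1
        rows.append({"op": op, "issue": issue, "start": start,
                     "complete": complete, "write": complete + 1})
        last_writer[dest] = i
    return rows


rows = timeline(PROGRAM, LATENCY)
for r in rows:
    print(r)
```

Under these assumptions the model gives the same cycles as the final table: for example, DIVD waits for MULTD's result in cycle 16, executes in cycles 17 through 56, and writes in cycle 57, while ADDD finishes much earlier because it depends only on SUBD and the second load.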