COMPSYS 304 Computer Architecture Speculation & Branching Morning visitors - Paradise Bay, Bay of Islands

Jan 18, 2018


Osborne Dorsey

Transcript
Page 1

COMPSYS 304
Computer Architecture
Speculation & Branching

Morning visitors - Paradise Bay, Bay of Islands

Page 2

Speculation
• High Tech Gambling?
• Data Prefetch
  • Cache instruction dcbt: data cache block touch
    • Attempts to bring data into cache
      • so that it will be "close" when needed
    • Allows SIU to use idle bus bandwidth
      • if there's no spare bandwidth, this read can be given low priority
  • Speculative because
    • a branch may occur before it's used
    • we speculate that this data may be needed

dcbt is a PowerPC mnemonic. Similar opcodes are found in other architectures: SPARC v9, MIPS, …
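The data prefetch idea above can be sketched from C. This is a minimal sketch, assuming a GCC or Clang compiler: `__builtin_prefetch` is a compiler builtin, and whether it becomes `dcbt` on a PowerPC target is up to the compiler. The function name `sum_with_prefetch` and the constant `PREFETCH_DISTANCE` are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical tuning value: how many elements ahead to touch. */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that data further ahead will be needed soon.
 * The hint is speculative: if the loop exits early, the touched data is
 * simply never used. */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = read access, 1 = low temporal locality. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 1);
        }
        total += data[i];
    }
    return total;
}
```

The prefetch never changes the result; it only tries to have the cache line "close" by the time `data[i]` is read. A good distance depends on memory latency and loop cost, so 16 here is a placeholder, not a recommendation.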

Page 3

Speculation - General
• Some functional units are almost always idle
  • Make them do some (possibly useful) work rather than idle
  • If the speculation was incorrect, results are simply abandoned
  • No loss in efficiency; chance of a gain
• Researchers are actively looking at software prefetch schemes
  • Fetch data well before it's needed
  • Reduce latency when it's actually needed
• Speculative operations have low priority and use idle resources

Page 4

Branching
• Expensive
  • 2-3 cycles lost in pipeline
    • All instructions following branch 'flushed'
    • Bandwidth wasted fetching unused instructions
    • Stall while branch target is fetched
• We can speculate about the target of a branch
• Terminology
  • Branch Target: address to which branch jumps
  • Branch Taken: control transfers to non-sequential address (target)
  • Branch Not Taken: next instruction is executed
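To see why a 2-3 cycle loss per branch matters, the cost can be folded into an average cycles-per-instruction figure. The function below is a back-of-envelope sketch; its name and all four parameters are assumptions introduced here, not values from the slides.

```c
/* Average cycles per instruction once branch stalls are included.
 * base_cpi:        CPI with no branch stalls (illustrative input)
 * branch_fraction: fraction of instructions that are branches
 * stall_rate:      fraction of those branches that cause a flush/stall
 * penalty_cycles:  cycles lost per stalling branch (the slide's 2-3) */
double effective_cpi(double base_cpi, double branch_fraction,
                     double stall_rate, double penalty_cycles)
{
    return base_cpi + branch_fraction * stall_rate * penalty_cycles;
}
```

For example, with hypothetical inputs of a base CPI of 1.0, one instruction in five being a branch, every branch stalling, and a 3-cycle penalty, `effective_cpi(1.0, 0.2, 1.0, 3.0)` gives about 1.6. Lowering `stall_rate` is exactly what branch prediction aims to do.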

Page 5

Branching - Prediction
• Branches can be
  • unconditional: branch is always taken, e.g. call subroutine, return from subroutine
  • conditional: branch depends on state of computation, e.g. has loop terminated yet?
• Unconditional branches are simple
  • New instructions are fetched as soon as the branch is recognized
  • As early in the pipeline as possible
  • Branch units often placed with fetch & decode stages

Page 6

Branching - Branch Unit
• PowerPC 603 logical layout

Page 7

Branching - Speculation
• We have the following code:
    if ( cond ) s1; else s2;
• Superscalar machine
  • Multiple functional units
  • Start executing both branches (s1 and s2)
  • Keep idle functional units busy!
• One is speculative and will be abandoned
• Processor will eventually calculate the branch condition and select which result should be retained (written back)
• MIPS R10000 - up to 4 speculative branches at once

Page 8

Branching - Speculation
• MIPS R10000
  • Up to 4 speculative branches at once
  • Instructions are "tagged" with a 4-bit mask
    • Indicates to which branch instruction it belongs
  • As soon as the condition is determined, mis-predicted instructions are aborted
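The tagging scheme can be illustrated with a small software model. This is a simplified sketch under stated assumptions, not the R10000's actual logic: the `InFlight` struct, the `resolve_branch` function, and the convention that bit i of the mask means "issued after unresolved branch i" are all hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Simplified model of one in-flight instruction. */
typedef struct {
    uint8_t branch_mask;  /* low 4 bits: unresolved branches it depends on */
    int     valid;        /* 0 once the instruction has been aborted */
} InFlight;

/* Branch `slot` (0..3) has been resolved.
 * If it was mispredicted, abort every instruction tagged with that branch.
 * Otherwise clear the bit, since the instruction no longer depends on it.
 * Returns the number of instructions aborted. */
size_t resolve_branch(InFlight *window, size_t n, unsigned slot, int mispredicted)
{
    uint8_t bit = (uint8_t)(1u << slot);
    size_t aborted = 0;
    for (size_t i = 0; i < n; i++) {
        if (!window[i].valid || !(window[i].branch_mask & bit))
            continue;
        if (mispredicted) {
            window[i].valid = 0;
            aborted++;
        } else {
            window[i].branch_mask &= (uint8_t)~bit;
        }
    }
    return aborted;
}
```

With four mask bits, an instruction can depend on up to four unresolved branches at once, which matches the "up to 4 speculative" limit on the slide. In hardware the check would happen for all instructions in parallel rather than in a loop.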

Page 9

Branching - Prediction
• We have a sequence of instructions:

    L1: add      ; some mixture of arithmetic,
        lw       ; load, store, etc, instructions
        sub
        brne L1  ; branch on some condition
    L2: or       ; some more arithmetic,
        st       ; load, store, etc, instructions

? If you were asked to guess which branch should be preferred, which would you choose:
  ? Next sequential instruction (L2)
  ? Branch target (L1)

Page 10

Branching - Prediction
• Studies show that branches are taken most of the time!
• Because of loops:

    L1: add      ; any mix of arith,
        lw       ; load, store, etc,
        sub      ; instructions
        brne L1  ; branch back to loop start
    L2: or       ; some more arith,
        st       ; memory, etc instructions

Page 11

Branching - Prediction Rule
• A simple prediction rule:
  • Take backward branches
  • works amazingly well!
• For a loop with n iterations, this is wrong in 1/n cases only!
• A system working on this rule alone would
  • detect the backward branch and
  • start fetching from the branch target rather than the next instruction
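The "take backward branches" rule and its 1/n error rate can be checked with a few lines of C. This is a minimal sketch; the function names and the idea of comparing raw addresses are assumptions made for illustration.

```c
#include <stdint.h>

/* Static rule: predict taken when the target lies before the branch. */
int predict_taken(uint32_t branch_pc, uint32_t target_pc)
{
    return target_pc < branch_pc;
}

/* Count mispredictions for a loop-closing branch that executes n times:
 * it is taken on the first n-1 executions and falls through on the last. */
unsigned loop_mispredictions(uint32_t branch_pc, uint32_t target_pc, unsigned n)
{
    unsigned wrong = 0;
    for (unsigned i = 0; i < n; i++) {
        int actually_taken = (i + 1 < n);
        if (predict_taken(branch_pc, target_pc) != actually_taken)
            wrong++;
    }
    return wrong;
}
```

For a backward branch the count is 1 for any n of at least 1, because only the final fall-through is guessed wrong. That is the "wrong in 1/n cases" figure from the slide.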

Page 12

Branching - Improving the prediction
• Static prediction systems
  • Compiler can mark branches
    • Likely to be taken or not
  • Instruction fetch unit will use the marking as advice on which instruction to fetch
  • Compiler often able to give the right advice
    • Loops are easily detected
    • Other patterns in conditions can be recognized
      • Checking for EOF when reading a file
      • Error checking
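A programmer can also pass this kind of advice to the compiler by hand. The sketch below assumes GCC or Clang, whose `__builtin_expect` builtin marks a condition as usually true or usually false. How the compiler uses the hint (code layout, or hint bits on architectures that have them) is up to the compiler and target. The `LIKELY`/`UNLIKELY` macros and `count_until` are hypothetical names.

```c
#include <stddef.h>

/* Conventional wrappers around the GCC/Clang builtin. */
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Count the bytes before a terminator, standing in for "read until EOF".
 * The end-of-input test succeeds only once, so it is marked unlikely.
 * Returns -1 on a NULL buffer (the rarely taken error check). */
long count_until(const unsigned char *buf, size_t len, unsigned char terminator)
{
    if (UNLIKELY(buf == NULL))
        return -1;
    long count = 0;
    for (size_t i = 0; i < len; i++) {
        if (UNLIKELY(buf[i] == terminator))
            break;
        count++;
    }
    return count;
}
```

The hints do not change what the function returns; they only tell the compiler which path to favour, in the same spirit as the EOF and error-checking patterns listed above.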

Page 13

Branching - Improving the prediction
• Dynamic prediction systems
  • Program history determines most likely branch
  • Branch Target Buffers - Another cache!

Page 14

Branching - Branch Target Buffer
• Instruction address bits [11:3] select the BTB entry
• Tag determines "hit"
• Stats select taken/not taken

Pentium 4: >91% prediction accuracy, 4K entry BHT (Branch History Table)
G4e: 2K entries
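The three BTB steps (index, tag check, stats) can be modelled in software. This is a simplified sketch under several assumptions not stated on the slide: 32-bit addresses, a direct-mapped table of 512 entries (the 9 bits of [11:3]), bits [31:12] as the tag, and a 2-bit saturating counter as the "stats". All type and function names are hypothetical.

```c
#include <stdint.h>

#define BTB_ENTRIES 512u   /* 9 index bits: address bits [11:3] */

typedef struct {
    uint32_t tag;      /* address bits above the index, [31:12] */
    uint32_t target;   /* predicted branch target */
    uint8_t  counter;  /* 2-bit saturating counter: 0..3, >= 2 means taken */
    uint8_t  valid;
} BtbEntry;

typedef struct {
    BtbEntry entry[BTB_ENTRIES];
} Btb;

static uint32_t btb_index(uint32_t pc) { return (pc >> 3) & (BTB_ENTRIES - 1u); }
static uint32_t btb_tag(uint32_t pc)   { return pc >> 12; }

/* Returns 1 and writes *target on a hit that predicts taken; otherwise 0. */
int btb_predict(const Btb *btb, uint32_t pc, uint32_t *target)
{
    const BtbEntry *e = &btb->entry[btb_index(pc)];
    if (e->valid && e->tag == btb_tag(pc) && e->counter >= 2) {
        *target = e->target;
        return 1;
    }
    return 0;
}

/* Record the real outcome once the branch has been resolved. */
void btb_update(Btb *btb, uint32_t pc, int taken, uint32_t target)
{
    BtbEntry *e = &btb->entry[btb_index(pc)];
    if (!e->valid || e->tag != btb_tag(pc)) {
        /* Miss: allocate, starting weakly biased toward the observed outcome. */
        e->valid = 1;
        e->tag = btb_tag(pc);
        e->counter = taken ? 2 : 1;
    } else if (taken) {
        if (e->counter < 3) e->counter++;
    } else {
        if (e->counter > 0) e->counter--;
    }
    if (taken)
        e->target = target;
}
```

A zero-initialised `Btb` starts empty, so every lookup misses until `btb_update` has seen the branch. Two branches whose addresses share bits [11:3] compete for the same entry, and the tag comparison is what keeps one from using the other's prediction.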

Page 15

Superscalar - summary
• Superscalar machines have multiple functional units (FUs)
  e.g. 2 x integer ALU, 1 x FPU, 1 x branch, 1 x load/store
• Requires complex IFU
  • Able to issue multiple instructions/cycle (typ 4)
  • Able to detect hazards (unavailability of operands)
  • Able to re-order instruction issue
• Aim to keep all the FUs busy
• Typically, 6-way superscalars can achieve instruction level parallelism of 2-3