Chapter One Introduction to Pipelined Processors

Feb 02, 2016

Hamish

Transcript
Page 1: Chapter One  Introduction to Pipelined Processors

Chapter One Introduction to Pipelined Processors

Principle of Designing Pipeline Processors

(Design Problems of Pipeline Processors)

Internal Data Forwarding and Register Tagging

Internal Forwarding and Register Tagging

• Internal Forwarding: replacing unnecessary memory accesses with register-to-register transfers.

• Register Tagging: using tagged registers to exploit concurrent activities among multiple ALUs.

Internal Forwarding

• Memory access is slower than register-to-register operations.

• Performance can be enhanced by eliminating unnecessary memory accesses.

Internal Forwarding

• This concept can be explored in 3 directions:
1. Store – Load Forwarding
2. Load – Load Forwarding
3. Store – Store Forwarding

Store – Load Forwarding

Load – Load Forwarding

Store – Store Forwarding
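
The three slides above survive only as titles. As a rough illustration, the sketch below applies the three forwarding rules as they are commonly described for adjacent instruction pairs: a load that follows a store to the same location becomes a register move, a second load from the same location becomes a register move, and the first of two stores to the same location is dropped. The tuple encoding (`STORE`, `LOAD`, `MOVE`) and the function name `forward` are assumptions made for this example, and it only looks at directly adjacent instructions.

```python
from typing import List, Tuple

# Hypothetical encoding (not from the slides):
#   ("STORE", mem, reg)  means  mem <- reg
#   ("LOAD",  reg, mem)  means  reg <- mem
#   ("MOVE",  dst, src)  means  dst <- src  (register-to-register)
Instr = Tuple[str, str, str]


def forward(program: List[Instr]) -> List[Instr]:
    """Apply store-load, load-load and store-store forwarding to adjacent pairs."""
    out: List[Instr] = []
    for ins in program:
        prev = out[-1] if out else None
        if prev and prev[0] == "STORE" and ins[0] == "LOAD" and prev[1] == ins[2]:
            # Store-load: the loaded value is still in the stored register.
            out.append(("MOVE", ins[1], prev[2]))
        elif prev and prev[0] == "LOAD" and ins[0] == "LOAD" and prev[2] == ins[2]:
            # Load-load: copy from the register filled by the first load.
            out.append(("MOVE", ins[1], prev[1]))
        elif prev and prev[0] == "STORE" and ins[0] == "STORE" and prev[1] == ins[1]:
            # Store-store: the first store is overwritten, so drop it.
            out[-1] = ins
        else:
            out.append(ins)
    return out
```

For instance, `forward([("STORE", "M", "R1"), ("LOAD", "R2", "M")])` keeps the store and turns the load into `("MOVE", "R2", "R1")`, saving one memory access.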

Example

[Figures: two worked-example slides; the images are not included in the transcript]

Register Tagging

Example : IBM Model 91 : Floating Point Execution Unit

Example : IBM Model 91-FPU

• The floating point execution unit consists of:
– Data registers
– Transfer paths
– Floating Point Adder Unit
– Multiply-Divide Unit
– Reservation stations
– Common Data Bus

Example : IBM Model 91-FPU

• There are 3 reservation stations for the adder, named A1, A2 and A3, and 2 for the multiplier, named M1 and M2.

• Each station has source and sink registers, together with their tag and control fields.

• The stations hold operands for the next execution.

Example : IBM Model 91-FPU

• The 3 store data buffers (SDBs) and 4 floating point registers (FLRs) are tagged.

• The busy bit in each FLR indicates the dependence of instructions in subsequent execution.

• The Common Data Bus (CDB) is used to transfer operands.

Example : IBM Model 91-FPU

• There are 11 units that supply information to the CDB: 6 floating point buffers (FLBs), 3 adders and 2 multiply/divide units.

• Tags for these stations are:

Unit   Tag    Unit   Tag
FLB1   0001   ADD1   1010
FLB2   0010   ADD2   1011
FLB3   0011   ADD3   1100
FLB4   0100   M1     1000
FLB5   0101   M2     1001
FLB6   0110

Example : IBM Model 91-FPU

• Internal forwarding can be achieved with the tagging scheme on the CDB.

• Example:

• Let F refer to an FLR and FLBi stand for the ith FLB, and let their contents be (F) and (FLBi).

• Consider the instruction sequence:
ADD F,FLB1    F ← (F) + (FLB1)
MPY F,FLB2    F ← (F) x (FLB2)

Example : IBM Model 91-FPU

• During addition:
– The busy bit of F is set to 1
– The contents of F and FLB1 are sent to adder A1
– The tag of F is set to 1010 (tag of the adder)

F: Busy Bit = 1, Tag = 1010

[Figure: block diagram of the floating point execution unit during the ADD. Labeled blocks: Storage Bus, Instruction Unit, Floating Point Buffers (FLB) 1-6, Floating Point Operand Stack (FLOS), Decoder, Control, Store Data Buffers (SDB) 1-3 with tags, Adder, Multiplier, Common Data Bus. Reservation station entry in the three-entry group (columns Tag, Sink, Tag, Source, CTRL): 1010, F, 0001, FLB1, CTRL. F: Busy Bit = 1, Tag = 1010.]

Example : IBM Model 91-FPU

• Meanwhile, the decode of MPY reveals that F is busy, so:
– F should set the tag of M1 to 1010 (tag of the adder)
– F should change its own tag to 1000 (tag of the multiplier)
– The content of FLB2 is sent to M1

F: Busy Bit = 1, Tag = 1000

[Figure: block diagram of the floating point execution unit (FLOS, FLB 1-6, SDB 1-3, Decoder, Control, Adder, Multiplier, Common Data Bus), captioned "Before addition". Reservation station entry in the two-entry group (columns Tag, Sink, Tag, Source, CTRL): 1010, F, 0010, FLB2, CTRL. F: Busy Bit = 1, Tag = 1000.]

[Figure: block diagram of the floating point execution unit (FLOS, FLB 1-6, SDB 1-3, Decoder, Control, Adder, Multiplier, Common Data Bus), captioned "After addition". Reservation station entry in the two-entry group (columns Tag, Sink, Tag, Source, CTRL): 1000, F, 0010, FLB2, CTRL. F: Busy Bit = 1, Tag = 1000.]

Example : IBM Model 91-FPU

• When the addition is done, the CDB finds that the result should be sent to M1.

• The multiplication is carried out once both operands are available.
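
The ADD/MPY walkthrough can be traced with a small simulation. This is a minimal sketch, not a model of the real hardware: the class names `Register` and `Station`, the helpers `issue` and `broadcast`, and the operand values are assumptions made for illustration. Only the tags 1010 (ADD1) and 1000 (M1) and the order of events come from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional

# Tags taken from the table of units that feed the CDB.
TAG_ADD1, TAG_M1 = 0b1010, 0b1000


@dataclass
class Register:
    value: float
    busy: bool = False
    tag: int = 0  # tag of the unit that will produce the next value


@dataclass
class Station:
    tag: int                        # the station's own tag
    sink: Optional[float] = None    # first operand, None while awaited
    sink_tag: int = 0               # tag of the unit the sink is waiting for
    source: Optional[float] = None  # second operand

    def ready(self) -> bool:
        return self.sink is not None and self.source is not None


def issue(dest: Register, station: Station, source_value: float) -> None:
    """Issue `dest <- dest op source` to a reservation station."""
    if dest.busy:
        # Operand not available yet: pass on the producer's tag instead.
        station.sink, station.sink_tag = None, dest.tag
    else:
        station.sink, station.sink_tag = dest.value, 0
    station.source = source_value
    dest.busy, dest.tag = True, station.tag


def broadcast(tag: int, value: float,
              stations: List[Station], registers: List[Register]) -> None:
    """Put a result on the CDB; every waiting sink with a matching tag takes it."""
    for s in stations:
        if s.sink is None and s.sink_tag == tag:
            s.sink, s.sink_tag = value, 0
    for r in registers:
        if r.busy and r.tag == tag:
            r.value, r.busy, r.tag = value, False, 0


def run_example(f0: float = 2.0, flb1: float = 3.0, flb2: float = 4.0) -> float:
    f = Register(f0)
    a1, m1 = Station(TAG_ADD1), Station(TAG_M1)
    issue(f, a1, flb1)  # ADD F,FLB1: F becomes busy with tag 1010
    issue(f, m1, flb2)  # MPY F,FLB2: M1 waits on tag 1010, F takes tag 1000
    # Adder finishes: the result matches M1's sink tag, not F's tag.
    broadcast(a1.tag, a1.sink + a1.source, [a1, m1], [f])
    # Both multiplier operands are now present; its result goes to F.
    broadcast(m1.tag, m1.sink * m1.source, [a1, m1], [f])
    return f.value
```

With the illustrative defaults, `run_example()` computes (2.0 + 3.0) x 4.0 without the intermediate sum ever being written to F, which is the forwarding effect the slides describe.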

Hazard Detection and Resolution

Hazard Detection and Resolution

• Hazards are caused by resource usage conflicts among various instructions

• They are triggered by inter-instruction dependencies

Terminologies:

• Resource Objects: the set of working registers, memory locations and special flags

Hazard Detection and Resolution

• Data Objects: the contents of resource objects.

• Each instruction can be considered as a mapping from a set of data objects to a set of data objects.

• Domain D(I): the set of resource objects whose data objects may affect the execution of instruction I (e.g. source registers).

Hazard Detection and Resolution

• Range R(I): the set of resource objects whose data objects may be modified by the execution of instruction I (e.g. destination register).

• An instruction reads from its domain and writes to its range.
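
One way to picture an instruction as a mapping from data objects to data objects is the sketch below. The `Instruction` class, the `execute` helper and the register values are hypothetical. The two-address reading of `ADD r2,r1` as r2 ← r2 + r1 is also an assumption, chosen to match the F ← (F) + (FLB1) form used earlier in the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Instruction:
    name: str
    domain: Tuple[str, ...]   # D(I): resource objects that are read
    range_: Tuple[str, ...]   # R(I): resource objects that are written
    fn: Callable[..., Tuple[float, ...]]  # maps domain values to range values


def execute(instr: Instruction, state: Dict[str, float]) -> Dict[str, float]:
    """Read only from the domain, write only to the range, return a new state."""
    inputs = [state[name] for name in instr.domain]
    outputs = instr.fn(*inputs)
    new_state = dict(state)
    new_state.update(zip(instr.range_, outputs))
    return new_state


# Assumed two-address form: r2 <- r2 + r1, so D = {r2, r1} and R = {r2}.
ADD_R2_R1 = Instruction("ADD r2,r1", ("r2", "r1"), ("r2",), lambda a, b: (a + b,))
```

Calling `execute(ADD_R2_R1, {"r1": 3.0, "r2": 4.0, "r3": 9.0})` changes only `r2`; any resource outside the range is left untouched.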

Hazard Detection and Resolution

• Consider the execution of instructions I and J, where J appears immediately after I.

• There are 3 types of data-dependent hazards:
1. RAW (Read After Write)
2. WAW (Write After Write)
3. WAR (Write After Read)

RAW (Read After Write)

• The necessary condition for this hazard is $R(I) \cap D(J) \neq \emptyset$

RAW (Read After Write)

• Example:
I1 : LOAD r1,a
I2 : ADD r2,r1

• I2 cannot be correctly executed until r1 is loaded

• Thus I2 is RAW dependent on I1

WAW(Write After Write)

• The necessary condition is $R(I) \cap R(J) \neq \emptyset$

WAW(Write After Write)

• Example:
I1 : MUL r1,r2
I2 : ADD r1,r4

• Here I1 and I2 write to the same destination and hence they are said to be WAW dependent.

WAR(Write After Read)

• The necessary condition is $D(I) \cap R(J) \neq \emptyset$

WAR(Write After Read)

• Example:
I1 : MUL r1,r2
I2 : ADD r2,r3

• Here I2 has r2 as its destination while I1 uses it as a source, and hence they are WAR dependent.
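
The three necessary conditions (R(I) ∩ D(J), R(I) ∩ R(J) and D(I) ∩ R(J) being non-empty for RAW, WAW and WAR respectively) reduce to set intersections. The sketch below checks them for a pair of instructions. The helper `two_address` is hypothetical and assumes that `OP dst,src` means dst ← dst OP src, except for `LOAD`, which reads only its source. Under that assumption the destination also belongs to the domain, so more than one condition can hold for the same pair.

```python
from typing import Set, Tuple

DomainRange = Tuple[Set[str], Set[str]]  # (D(I), R(I))


def two_address(op: str, dst: str, src: str) -> DomainRange:
    """Assumed semantics: LOAD reads only src; other ops read dst and src."""
    if op == "LOAD":
        return ({src}, {dst})
    return ({dst, src}, {dst})


def hazards(i: DomainRange, j: DomainRange) -> Set[str]:
    """Return the necessary conditions that hold when J follows I."""
    d_i, r_i = i
    d_j, r_j = j
    found = set()
    if r_i & d_j:
        found.add("RAW")  # J reads what I writes
    if r_i & r_j:
        found.add("WAW")  # J writes what I writes
    if d_i & r_j:
        found.add("WAR")  # J writes what I reads
    return found


raw_pair = hazards(two_address("LOAD", "r1", "a"), two_address("ADD", "r2", "r1"))
waw_pair = hazards(two_address("MUL", "r1", "r2"), two_address("ADD", "r1", "r4"))
war_pair = hazards(two_address("MUL", "r1", "r2"), two_address("ADD", "r2", "r3"))
```

Here `raw_pair` is `{"RAW"}` and `war_pair` is `{"WAR"}`. `waw_pair` contains `"WAW"` along with the other two conditions, because under the assumed two-address semantics the second instruction also reads and overwrites r1.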

Hazard Detection and Resolution

• Hazards can be detected in the fetch stage by comparing the domain and range of the incoming instruction with those of the instructions already in the pipe.

• Once detected, there are two methods of handling them:
1. Generate a warning signal to prevent the hazard.
2. Allow the incoming instruction through the pipe and distribute detection to all pipeline stages.
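
The first method can be sketched as a simple interlock: at fetch, the incoming instruction's domain and range are compared with those of every instruction still in the pipe, and the instruction is held back while any of the RAW, WAW or WAR conditions is met. The function name `issue_cycles`, the fixed `latency` and the in-order, one-instruction-per-cycle model are assumptions made for illustration, not details from the slides.

```python
from typing import List, Set, Tuple

DomainRange = Tuple[Set[str], Set[str]]  # (domain, range) of one instruction


def issue_cycles(program: List[DomainRange], latency: int = 3) -> List[int]:
    """Return the cycle at which each instruction leaves the fetch stage."""
    in_flight: List[Tuple[int, Set[str], Set[str]]] = []  # (finish, domain, range)
    cycles: List[int] = []
    cycle = 0
    for dom, rng in program:
        while True:
            # Drop instructions that have already left the pipe.
            in_flight = [e for e in in_flight if e[0] > cycle]
            conflict = any(
                bool(r & dom) or bool(r & rng) or bool(d & rng)  # RAW, WAW, WAR
                for _, d, r in in_flight
            )
            if not conflict:
                break
            cycle += 1  # warning signal: hold the instruction at fetch
        cycles.append(cycle)
        in_flight.append((cycle + latency, dom, rng))
        cycle += 1
    return cycles


# LOAD r1,a ; ADD r2,r1 ; then an unrelated instruction (illustrative registers).
example = issue_cycles([({"a"}, {"r1"}), ({"r2", "r1"}, {"r2"}), ({"r5"}, {"r6"})])
```

With a latency of 3 cycles, `example` is `[0, 3, 4]`: the ADD waits until the LOAD has left the pipe, and because the model is in-order, the unrelated third instruction waits behind it.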