Snoopy Coherence Protocols Small-scale multiprocessors

Dec 22, 2015


Snoopy Coherence Protocols

Small-scale multiprocessors


Assumptions

• broadcast-style interconnect
  – e.g. shared bus, free-space optical, …
  – allows passive listeners

• assume write-back caches
  – invalidation after a write rather than update

• write-through (update protocol) is also possible


Invalidate vs. update

• Write-invalidate protocol:
  – write to shared data: an invalidate is sent to all caches, which snoop and invalidate copies.
  – read miss: snoop caches to find most recent copy

• Write-update protocol:
  – write to shared data: broadcast on bus, processors snoop and update any copies.
  – read miss: memory is always up to date.
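
The difference between the two policies can be sketched in a few lines. This is a minimal illustration, not a full protocol: caches are modelled as plain dictionaries from address to value, and the function names `write_invalidate` and `write_update` are made up for this sketch.

```python
def write_invalidate(caches, writer, addr, value):
    """Write-invalidate: every other cached copy is dropped on a write."""
    for i, cache in enumerate(caches):
        if i != writer:
            cache.pop(addr, None)      # snooping cache invalidates its copy
    caches[writer][addr] = value       # writer now holds the only copy


def write_update(caches, memory, writer, addr, value):
    """Write-update: the new value is broadcast to existing copies and memory."""
    caches[writer][addr] = value
    for i, cache in enumerate(caches):
        if i != writer and addr in cache:
            cache[addr] = value        # snooping cache refreshes its copy
    memory[addr] = value               # memory is always up to date
```

With invalidation, a later read by another processor misses and has to find the most recent copy. With update, the other copies stay valid at the cost of a broadcast on every write.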


Three-state MSI protocol

• Each block of memory is in one state:
  – Clean in all caches and up-to-date in memory (shared)
  – Dirty in exactly one cache (modified)
  – Not in any cache

• Each cache line is in one state:
  – Modified: cache has only copy, it is writable and dirty
  – Shared: line can be read
  – Invalid: line contains no valid data

• Read misses cause the cache to snoop the bus

• Write to a shared block is treated as a miss - needs a (snoopy) bus transaction


[Figure: MSI state diagram, a) Processor actions]

• I → S: Read (miss)
• I → M: Write (miss)
• S → M: Write (hit)
• S → S: Read (hit)
• M → M: Read or write (hit)
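
The processor-side transitions can be written as a lookup table. This is a sketch under stated assumptions: states are the strings "I", "S" and "M", the names `PROCESSOR` and `processor_step` are invented here, and a write miss is assumed to issue a Bus read followed by a Bus write, as in the worked example in these slides.

```python
# (current state, processor operation) -> (next state, bus transactions issued)
PROCESSOR = {
    ("I", "read"):  ("S", ["Bus read"]),               # read miss
    ("I", "write"): ("M", ["Bus read", "Bus write"]),  # write miss
    ("S", "read"):  ("S", []),                         # read hit
    ("S", "write"): ("M", ["Bus write"]),              # write hit, treated as a miss
    ("M", "read"):  ("M", []),                         # read hit
    ("M", "write"): ("M", []),                         # write hit
}


def processor_step(state, op):
    """Return (next_state, bus_transactions) for one processor access."""
    return PROCESSOR[(state, op)]
```

Only accesses to a line in state M are guaranteed to stay off the bus for both reads and writes.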


[Figure: MSI state diagram, b) Bus snooping]

• S → I: Bus write
• M → I: Bus write - send data to requestor
• M → S: Bus read – send data to requestor
• S → S: Bus read
• I → I: Bus read or write
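
The snooping side can be sketched the same way. The table below is an illustrative encoding of the diagram, with invented names `SNOOP` and `snoop_step`. The boolean says whether this cache sends data to the requestor.

```python
# (current state, snooped bus transaction) -> (next state, supplies data?)
SNOOP = {
    ("I", "Bus read"):  ("I", False),
    ("I", "Bus write"): ("I", False),
    ("S", "Bus read"):  ("S", False),
    ("S", "Bus write"): ("I", False),   # another cache claims ownership
    ("M", "Bus read"):  ("S", True),    # owner sends data to requestor
    ("M", "Bus write"): ("I", True),    # owner sends data, then gives up the line
}


def snoop_step(state, transaction):
    """Return (next_state, supplies_data) for a transaction seen on the bus."""
    return SNOOP[(state, transaction)]
```

Only a cache holding the line in state M ever supplies data, because it is the only place where the most recent value is guaranteed to live.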


Example

• assume cache line is initially invalid

• consider two addresses, A1 and A2

• assume A1 and A2 map to the same cache line, but A1 != A2
  – that is, A1 and A2 refer to completely different places in memory, not adjacent (or nearby) addresses that fit within the same block
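
To make "map to the same cache line" concrete, here is a small sketch assuming a direct-mapped cache. The geometry (`BLOCK_SIZE`, `NUM_LINES`) and the numeric addresses are hypothetical values chosen for illustration. The slides do not give them.

```python
# Hypothetical geometry, chosen only for illustration.
BLOCK_SIZE = 32    # bytes per block
NUM_LINES = 256    # lines in a direct-mapped cache


def line_index(addr):
    """Cache line that a byte address maps to."""
    return (addr // BLOCK_SIZE) % NUM_LINES


def tag(addr):
    """High-order bits that tell colliding blocks apart."""
    return addr // (BLOCK_SIZE * NUM_LINES)


A1 = 0x1000
A2 = A1 + BLOCK_SIZE * NUM_LINES   # exactly one cache size away from A1
```

Here `line_index(A1)` and `line_index(A2)` are equal while the tags differ, so bringing in A2 must evict A1 from that line.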


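                   | P1               | P2               | Bus                       | Memory
Step               | State Addr Value | State Addr Value | Action    Proc Addr Value | Addr Value
-------------------+------------------+------------------+---------------------------+-----------
P1: write 10 to A1 | I                |                  | Bus read  P1   A1         |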

Step 1a: Write miss, invalid line

- is A1 cached anywhere?


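                   | P1               | P2               | Bus                       | Memory
Step               | State Addr Value | State Addr Value | Action    Proc Addr Value | Addr Value
-------------------+------------------+------------------+---------------------------+-----------
P1: write 10 to A1 | I                |                  | Bus read  P1   A1         |
                   | M     A1   10    |                  | Bus write P1   A1         |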

Step 1b: No other cache responds

- assert ownership


Wait a minute...

• if we only have one type of read transaction (“Bus read”), how do we tell whether memory or another cache is responding?

• the bus cycle allows for an “intervention”
  – more properly, a cache-to-cache intervention
  – a cache pre-empts the bus and answers instead of memory
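
A rough sketch of the intervention idea follows. The data layout (a dictionary of per-processor caches, each mapping an address to a `{"state", "value"}` record) and the function name `bus_read` are assumptions made for this illustration only.

```python
def bus_read(addr, requester, caches, memory):
    """Return (value, responder) for a Bus read issued by `requester`.

    If another cache holds the line in state M, it pre-empts the bus and
    answers instead of memory (a cache-to-cache intervention).
    """
    for name, lines in caches.items():
        if name == requester:
            continue
        line = lines.get(addr)
        if line is not None and line["state"] == "M":
            return line["value"], name      # intervention by the owning cache
    return memory[addr], "memory"           # nobody intervened
```

The requester issues the same transaction in both cases. Only the responder differs.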


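                   | P1               | P2               | Bus                       | Memory
Step               | State Addr Value | State Addr Value | Action    Proc Addr Value | Addr Value
-------------------+------------------+------------------+---------------------------+-----------
P1: write 10 to A1 | I                |                  | Bus read  P1   A1         |
                   | M     A1   10    |                  | Bus write P1   A1         |
P1: read A1        | M     A1   10    |                  |                           |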

Step 2: Read hit

- no bus action needed


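                   | P1               | P2               | Bus                       | Memory
Step               | State Addr Value | State Addr Value | Action    Proc Addr Value | Addr Value
-------------------+------------------+------------------+---------------------------+-----------
P1: write 10 to A1 | I                |                  | Bus read  P1   A1         |
                   | M     A1   10    |                  | Bus write P1   A1         |
P1: read A1        | M     A1   10    |                  |                           |
P2: read A1        |                  | I                | Bus read  P2   A1         |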

Step 3a: Read miss

- does anyone have A1 cached?


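                   | P1               | P2               | Bus                       | Memory
Step               | State Addr Value | State Addr Value | Action    Proc Addr Value | Addr Value
-------------------+------------------+------------------+---------------------------+-----------
P1: write 10 to A1 | I                |                  | Bus read  P1   A1         |
                   | M     A1   10    |                  | Bus write P1   A1         |
P1: read A1        | M     A1   10    |                  |                           |
P2: read A1        |                  | I                | Bus read  P2   A1         |
                   | S     A1   10    | S     A1   10    | Bus write P1   A1   10    | A1   10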

Step 3b: Cached elsewhere

- P1 replies


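                   | P1               | P2               | Bus                       | Memory
Step               | State Addr Value | State Addr Value | Action    Proc Addr Value | Addr Value
-------------------+------------------+------------------+---------------------------+-----------
P1: write 10 to A1 | I                |                  | Bus read  P1   A1         |
                   | M     A1   10    |                  | Bus write P1   A1         |
P1: read A1        | M     A1   10    |                  |                           |
P2: read A1        |                  | I                | Bus read  P2   A1         |
                   | S     A1   10    | S     A1   10    | Bus write P1   A1   10    | A1   10
P2: write 20 to A1 | I                | M     A1   20    | Bus write P2   A1         |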

Step 4: Write hit, shared line

- now P2 owns it


                   | P1               | P2               | Bus                       | Memory
Step               | State Addr Value | State Addr Value | Action    Proc Addr Value | Addr Value
-------------------+------------------+------------------+---------------------------+-----------
P1: write 10 to A1 | I                |                  | Bus read  P1   A1         |
                   | M     A1   10    |                  | Bus write P1   A1         |
P1: read A1        | M     A1   10    |                  |                           |
P2: read A1        |                  | I                | Bus read  P2   A1         |
                   | S     A1   10    | S     A1   10    | Bus write P1   A1   10    | A1   10
P2: write 20 to A1 | I                | M     A1   20    | Bus write P2   A1         |
P2: write 40 to A2 |                  |                  | Bus write P2   A1   20    | A1   20

Step 5a: Write miss, A2 maps to the same line as A1

- first, write back the victim


                   | P1               | P2               | Bus                       | Memory
Step               | State Addr Value | State Addr Value | Action    Proc Addr Value | Addr Value
-------------------+------------------+------------------+---------------------------+-----------
P1: write 10 to A1 | I                |                  | Bus read  P1   A1         |
                   | M     A1   10    |                  | Bus write P1   A1         |
P1: read A1        | M     A1   10    |                  |                           |
P2: read A1        |                  | I                | Bus read  P2   A1         |
                   | S     A1   10    | S     A1   10    | Bus write P1   A1   10    | A1   10
P2: write 20 to A1 | I                | M     A1   20    | Bus write P2   A1         |
P2: write 40 to A2 |                  |                  | Bus write P2   A1   20    | A1   20
                   |                  |                  | Bus read  P2   A2         |

Step 5b: Service the miss

- does anyone have A2 cached?


                   | P1               | P2               | Bus                       | Memory
Step               | State Addr Value | State Addr Value | Action    Proc Addr Value | Addr Value
-------------------+------------------+------------------+---------------------------+-----------
P1: write 10 to A1 | I                |                  | Bus read  P1   A1         |
                   | M     A1   10    |                  | Bus write P1   A1         |
P1: read A1        | M     A1   10    |                  |                           |
P2: read A1        |                  | I                | Bus read  P2   A1         |
                   | S     A1   10    | S     A1   10    | Bus write P1   A1   10    | A1   10
P2: write 20 to A1 | I                | M     A1   20    | Bus write P2   A1         |
P2: write 40 to A2 |                  |                  | Bus write P2   A1   20    | A1   20
                   |                  |                  | Bus read  P2   A2         |
                   |                  | M     A2   40    | Bus write P2   A2         |

Step 5c: Not cached elsewhere

- like the second half of step 1
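
The whole example can be replayed with a small simulator. This is a sketch under simplifying assumptions, not a reference implementation: each cache has a single line (enough here, because A1 and A2 collide), the class names `Cache` and `System` are invented, and an owner's reply to a Bus read is logged as a "Bus write" that also updates memory, as the table shows for P1.

```python
class Cache:
    """A cache with one line, which is all the example needs."""
    def __init__(self, name):
        self.name, self.state, self.addr, self.value = name, "I", None, None


class System:
    def __init__(self, names):
        self.caches = {n: Cache(n) for n in names}
        self.memory = {}
        self.log = []                       # (action, processor, addr)

    def _bus(self, action, who, addr):
        """Broadcast a transaction; the other caches snoop and react."""
        self.log.append((action, who, addr))
        data = self.memory.get(addr)
        for c in self.caches.values():
            if c.name == who or c.addr != addr or c.state == "I":
                continue
            if c.state == "M":              # intervention: owner replies
                data = c.value
                self.memory[addr] = c.value
                self.log.append(("Bus write", c.name, addr))
            c.state = "S" if action == "Bus read" else "I"
        return data

    def _make_room(self, c, addr):
        """Write back a dirty victim, then claim the line for addr."""
        if c.addr not in (None, addr) and c.state == "M":
            self.memory[c.addr] = c.value
            self.log.append(("Bus write", c.name, c.addr))
        if c.addr != addr:
            c.state, c.addr, c.value = "I", addr, None

    def read(self, who, addr):
        c = self.caches[who]
        self._make_room(c, addr)
        if c.state == "I":                  # read miss
            c.value = self._bus("Bus read", who, addr)
            c.state = "S"
        return c.value

    def write(self, who, addr, value):
        c = self.caches[who]
        self._make_room(c, addr)
        if c.state == "I":                  # write miss: is it cached anywhere?
            self._bus("Bus read", who, addr)
        if c.state != "M":                  # assert ownership
            self._bus("Bus write", who, addr)
        c.state, c.value = "M", value


system = System(["P1", "P2"])
system.write("P1", "A1", 10)
system.read("P1", "A1")
system.read("P2", "A1")
system.write("P2", "A1", 20)
system.write("P2", "A2", 40)
```

After these five accesses, `system.log` lists the same eight bus transactions as the Bus column of the table, and memory holds 20 for A1.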


Four-state protocol (MESI)

• add “exclusive” state

• indicates this is the only cached copy

• no need to broadcast an invalidation on a write hit to an E line

• goal is to reduce bus traffic

• works well for local variables


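[Figure: four-state protocol state diagram, a) Processor actions]

• I → E: Read (miss) 1
• I → S: Read (miss) 2
• I → M: Write (miss)
• E → E: Read (hit)
• E → M: Write (hit)
• S → S: Read (hit)
• S → M: Write (hit)
• M → M: Read or write (hit)

1: data comes from memory
2: data from another cache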
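
The effect of the exclusive state on bus traffic can be sketched as a small function. The name `four_state_step` and the `other_cache_has_copy` flag are assumptions for this illustration. The flag stands in for whether the read miss was answered by another cache (case 2) or by memory (case 1).

```python
def four_state_step(state, op, other_cache_has_copy=False):
    """Return (next_state, needs_bus) for one processor access.

    States are "I", "E", "S" and "M"; op is "read" or "write".
    """
    if state == "I":
        if op == "write":
            return "M", True                    # write miss
        # Read miss: E if memory answered, S if another cache did.
        return ("S" if other_cache_has_copy else "E"), True
    if op == "read":
        return state, False                     # read hit in E, S or M
    # Write hit: only a shared line needs to broadcast an invalidation.
    return "M", state == "S"
```

A read followed by a write to private data goes I → E → M with a single bus transaction, where the three-state protocol would need a second one for the write.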


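[Figure: four-state protocol state diagram, b) Bus snooping]

• E → S: Bus read
• E → I: Bus write
• M → S: Bus read
• M → I: Bus write
• S → S: Bus read
• S → I: Bus write
• I → I: Bus read or write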


Coherence misses

• a new type of miss has been added

• we still have the usual cold, capacity and conflict misses

• now we also have coherence misses

• these occur when a line has been invalidated by another processor's write; the resulting read miss is typically serviced from another cache
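
One common way to count coherence misses in a simulator, assumed here rather than taken from the slides, is to remember which addresses a cache lost to a snooped bus write and to blame the next miss on that address on coherence. The class `MissCounter` and its method names are hypothetical.

```python
class MissCounter:
    """Per-cache bookkeeping that splits misses into three buckets."""
    def __init__(self):
        self.seen = set()          # addresses this cache has ever held
        self.invalidated = set()   # addresses lost to a snooped bus write
        self.counts = {"cold": 0, "coherence": 0, "other": 0}

    def snoop_invalidate(self, addr):
        """Call when a bus write from another processor hits this cache."""
        if addr in self.seen:      # simplification: assumes the line is still cached
            self.invalidated.add(addr)

    def miss(self, addr):
        """Classify a miss on addr and record it."""
        if addr in self.invalidated:
            kind = "coherence"
            self.invalidated.discard(addr)
        elif addr not in self.seen:
            kind = "cold"
        else:
            kind = "other"         # capacity or conflict
        self.seen.add(addr)
        self.counts[kind] += 1
        return kind
```

Telling capacity misses from conflict misses needs more machinery (for example a fully associative shadow cache), so this sketch lumps them together as "other".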