Dynamic Verification of Sequential Consistency
Albert Meixner (Dept. of Computer Science, Duke University) and Daniel J. Sorin (Dept. of Electrical and Computer Engineering, Duke University)

Jan 19, 2018


Edgar Fields

Transcript
Page 1

Dynamic Verification of Sequential Consistency

Albert Meixner, Dept. of Computer Science, Duke University
Daniel J. Sorin, Dept. of Electrical and Computer Engineering, Duke University

Page 2

Introduction

• Multithreaded systems becoming ubiquitous
  • Commercial workloads rely heavily on parallel machines
  • Reliability and availability are crucial

• Backward Error Recovery can provide high availability
  • Recover to known good state upon error
  • But can only recover from errors detected in time

• Memory system is of special interest
  • Complex: many components, large transistor count
  • Numerous error hazards

Page 3

Memory System Error Detection

• Must cover all memory system components
  • DRAMs, caches, controllers, interconnect, and write buffers

• Mechanisms for individual components exist
  • Storage structures: ECC
  • Interconnect: checksums, sequence numbering
  • Cache and memory controllers: replication

• Adding detection to all components is hard
  • Complicates design of every component
  • Requires good intuition of interactions and possible errors

Want comprehensive, end-to-end error detection

Page 4

Dynamic Verification

• Dynamic verification
  • Correct system operation constantly monitored at runtime
  • End-to-end scheme
  • Detects transient errors, design bugs, and manufacturing errors
  • Differs from statically verifying that design is bug-free

• High level invariants are checked, instead of individual components
  • Simplified design of system components
  • Can detect any low-level error that violates invariant

Page 5

Memory Consistency

• Memory consistency model
  • Formal specification of memory system behavior in a multithreaded system
  • Defines order in which memory accesses from different CPUs can become globally visible
  • Many consistency models exist, we focus on one

• Verifying memory consistency = verifying correctness of the memory system
  • Ideal invariant for dynamic verification

Page 6

Sequential Consistency (SC)

• Requires appearance of total global order of all loads and stores in system
  • Each load must receive value of most recent store in total order to the same address
  • Program order of all processors is preserved in total order

• SC is most intuitive consistency model
  • Good for programmers
  • Speculation can make SC almost as fast as more relaxed models

• Our contribution: Dynamic Verification of Sequential Consistency (DVSC)

Page 7

Outline

• Introduction
• DVSC-Direct
• DVSC-Indirect
• Results
• Conclusion

Page 8

DVSC-Direct

[Figure: CPU 1 and CPU 2 each report their memory operations, in program order, to a verifier, which arranges them into a global order.]

CPU 1 (program order): t=1.1 LD A→1, t=2.1 ST B←2, t=3.1 LD A→2
CPU 2 (program order): t=1.2 LD C→1, t=2.2 ST A←2, t=3.2 LD C→1
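
The check pictured on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' hardware design: the `Op` record, the `verify_sc` function, the use of the slide's `t=` labels as sortable timestamps, and the initial memory contents are all assumptions made for the example.

```python
from typing import NamedTuple

class Op(NamedTuple):
    time: float   # timestamp as printed on the slide (assumed unique and sortable)
    cpu: int
    kind: str     # "LD" or "ST"
    addr: str
    value: int

def verify_sc(logs, initial):
    """Return a list of violations; an empty list means no violation was found."""
    violations = []
    # Program order: within one CPU's log, timestamps must strictly increase.
    for log in logs:
        for prev, cur in zip(log, log[1:]):
            if cur.time <= prev.time:
                violations.append(("program order", cur))
    # Global order: merge all logs by timestamp and replay them on a memory image.
    memory = dict(initial)
    for op in sorted((op for log in logs for op in log), key=lambda o: o.time):
        if op.kind == "ST":
            memory[op.addr] = op.value
        elif memory.get(op.addr) != op.value:
            # The load did not return the most recent store in the global order.
            violations.append(("stale load", op))
    return violations

cpu1 = [Op(1.1, 1, "LD", "A", 1), Op(2.1, 1, "ST", "B", 2), Op(3.1, 1, "LD", "A", 2)]
cpu2 = [Op(1.2, 2, "LD", "C", 1), Op(2.2, 2, "ST", "A", 2), Op(3.2, 2, "LD", "C", 1)]
initial = {"A": 1, "B": 0, "C": 1}   # assumed initial memory contents

print(verify_sc([cpu1, cpu2], initial))   # []
```

With the operations from the slide, every load matches the latest store in the merged order, so no violation is reported. Note that every load and store has to reach the verifier for this check to work.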

Page 9

DVSC-Indirect: Idea

• Verify conditions sufficient for Sequential Consistency
  • In-order performance of memory operations
  • Cache coherence

• Conditions formally defined and proven by Plakal et al. [SPAA 1998]

• Two mechanisms
  • On-chip checker for in-order performance
  • Distributed checker for cache coherence

Page 10

In-Order Performance Verification

• A load of block B receives the value of…
  • …the most recent local store to B or most recent global store to B performed after all local stores

• Trivially observed on in-order processor with coherent caches
  • Modern processors execute out-of-order
  • Results of ooo-execution are considered speculative until in-order re-execution and verification

• DVSC-Indirect uses DIVA checker core by Austin [Micro 1999]
  • Could substitute other mechanisms

Page 11

Cache Coherence

• All processors observe the same order of stores to a given memory location
  • Difficult because the same memory location can exist in different caches

• Maintained by a coherence protocol
  • Different protocols: MOSI, MSI, MOESI, Token Coherence, …
  • Different maintenance mechanisms: directory, snooping

• Verification uses “divide and conquer”
  • Verify conditions provably sufficient for cache coherence
  • Initially defined for proof of sequential consistency by Plakal et al. [SPAA 1998]

Page 12

Cache Coherence Verification

• Coherence Conditions
  1. Cache accesses are contained in an epoch
     • Stores in read-write epochs
     • Loads in read-write or read-only epochs
  2. Read-write epochs do not overlap other epochs
  3. Block data at beginning of epoch equals block data at end of last read-write epoch

• Verification
  • Check if accesses are in appropriate epoch during DIVA-replay
  • Collect epoch information at every node and send to verifier
  • Verifier checks epoch history for overlaps and data propagation

Epoch: The time interval between obtaining and losing permissions on a block.
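
The three coherence conditions can be written down as a small offline check over the complete epoch history of one block. The sketch below is a hedged illustration only: the `Epoch` record, its field names, the half-open treatment of the epoch interval, and the helper names `access_allowed` and `check_block_history` are assumptions for the example, not part of the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Epoch:
    kind: str        # "RW" (read-write) or "RO" (read-only)
    begin: int       # logical time at which permission was obtained
    end: int         # logical time at which permission was lost (treated as exclusive)
    begin_data: int  # block data (or a hash of it) at the start of the epoch
    end_data: int    # block data (or a hash of it) at the end of the epoch

def access_allowed(epoch, op, time):
    """Condition 1: stores need a read-write epoch, loads need any epoch."""
    inside = epoch.begin <= time < epoch.end
    return inside and (op == "LD" or epoch.kind == "RW")

def check_block_history(epochs, initial_data):
    """Conditions 2 and 3 over all epochs of one block; returns a list of errors."""
    errors = []
    ordered = sorted(epochs, key=lambda e: e.begin)
    # Condition 2: a read-write epoch must not overlap any other epoch.
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.begin < a.end and "RW" in (a.kind, b.kind):
                errors.append(("overlap", a, b))
    # Condition 3: each epoch starts with the data left by the last read-write epoch.
    last_rw_data = initial_data
    for e in ordered:
        if e.begin_data != last_rw_data:
            errors.append(("propagation", e))
        if e.kind == "RW":
            last_rw_data = e.end_data
    return errors

history = [
    Epoch("RW", 0, 10, 5, 7),
    Epoch("RO", 12, 20, 7, 7),
    Epoch("RO", 11, 25, 7, 7),   # read-only epochs may overlap each other
    Epoch("RW", 30, 40, 7, 9),
]
print(check_block_history(history, initial_data=5))   # []
```

This version keeps the whole history of a block in memory, which makes the conditions easy to read but costs space proportional to the number of epochs.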

Page 13

Implementation Overview

[Figure: three nodes attached to an interconnect. Each node contains a CPU Core with DIVA, a Cache, and a Memory, with the labels Record Epochs, Collect Epochs, Epoch History, and Verify Epochs.]

Page 14

At the Cache Controller

• All caches keep track of active epochs in the Cache Epoch Table (CET)
  • Epoch Inform sent to the memory controller when epoch ends
  • Begin and end data are hashed

• Every DIVA cache access checks CET for active epoch
  • Ensure access is contained in epoch

• Verification off the critical path
  • Second order performance effect from bandwidth usage

[Figure: CET entry and Epoch Inform message. Fields shown: type (read-write or read-only), begin time, begin data, end time, end data.]
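
As a rough software model of the cache-side bookkeeping, the sketch below tracks one active epoch per block and emits an inform record when the epoch ends. Everything named here is hypothetical: `CetEntry`, `EpochInform`, `CacheEpochTable`, and the block address field are invented for the example, and CRC-32 merely stands in for the hash, which the slides do not name.

```python
import zlib
from dataclasses import dataclass

def hash_block(data: bytes) -> int:
    # CRC-32 is a stand-in; the slides only say that begin and end data are hashed.
    return zlib.crc32(data)

@dataclass
class CetEntry:
    kind: str         # "RW" (read-write) or "RO" (read-only)
    begin_time: int
    begin_hash: int

@dataclass(frozen=True)
class EpochInform:
    block: int        # block address (assumed field, needed to route the inform)
    kind: str
    begin_time: int
    begin_hash: int
    end_time: int
    end_hash: int

class CacheEpochTable:
    def __init__(self):
        self.active = {}   # block address -> CetEntry for the active epoch

    def begin_epoch(self, block, kind, time, data):
        self.active[block] = CetEntry(kind, time, hash_block(data))

    def check_access(self, block, op):
        """Replay-time check: is this access contained in a suitable active epoch?"""
        entry = self.active.get(block)
        if entry is None:
            return False
        return op == "LD" or entry.kind == "RW"

    def end_epoch(self, block, time, data):
        """Close the epoch and build the inform for the memory controller."""
        entry = self.active.pop(block)
        return EpochInform(block, entry.kind, entry.begin_time,
                           entry.begin_hash, time, hash_block(data))

cet = CacheEpochTable()
cet.begin_epoch(0x40, "RO", time=5, data=b"\x01\x02")
print(cet.check_access(0x40, "LD"))   # True
print(cet.check_access(0x40, "ST"))   # False
inform = cet.end_epoch(0x40, time=9, data=b"\x01\x02")
```

Sending hashes instead of full block data keeps each inform small, which matters because the slide identifies interconnect bandwidth as the main source of overhead.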

Page 15

At the Memory Controller

• Check for epoch overlaps and correct value propagation
  • Generally requires entire block history → O(N) space

• If epoch informs are processed in order…
  • Need end value of last read-write epoch for propagation check
  • Need end time of last read-write and last read-only epoch for overlap check
  • O(1) space

• Epochs arrive almost in order
  • Fix remaining re-orderings in priority queue before verifications

• Epoch state in Memory Epoch Table (MET)
  • Last end time of read-only epoch and read-write epoch, last value
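
The constant-space check described on this slide might look roughly like the following. It is a sketch under stated assumptions: the `Inform` record, the `MemoryVerifier` class, the fixed reordering `window`, ordering informs by epoch begin time, and an initial block hash of 0 are all choices made for the example rather than details taken from the slides.

```python
import heapq
from dataclasses import dataclass
from typing import NamedTuple

class Inform(NamedTuple):
    block: int
    kind: str        # "RW" or "RO"
    begin: int
    end: int
    begin_hash: int
    end_hash: int

@dataclass
class MetEntry:
    last_rw_end: int = 0   # end time of the last read-write epoch
    last_ro_end: int = 0   # latest end time of any read-only epoch
    last_value: int = 0    # data hash at the end of the last read-write epoch

class MemoryVerifier:
    def __init__(self, window):
        self.window = window   # assumed bound on how far informs arrive out of order
        self.queue = []        # min-heap keyed by epoch begin time
        self.seq = 0           # tie-breaker for equal begin times
        self.met = {}          # block address -> MetEntry
        self.errors = []

    def receive(self, inform):
        heapq.heappush(self.queue, (inform.begin, self.seq, inform))
        self.seq += 1
        # Verify the oldest epochs once enough later informs have arrived.
        while len(self.queue) > self.window:
            self._verify(heapq.heappop(self.queue)[2])

    def flush(self):
        while self.queue:
            self._verify(heapq.heappop(self.queue)[2])

    def _verify(self, e):
        m = self.met.setdefault(e.block, MetEntry())
        # Overlap check: nothing may begin before the last read-write epoch ended,
        # and a read-write epoch may not begin before the last read-only epoch ended.
        if e.begin < m.last_rw_end or (e.kind == "RW" and e.begin < m.last_ro_end):
            self.errors.append(("overlap", e))
        # Propagation check: the epoch must start from the last written value.
        if e.begin_hash != m.last_value:
            self.errors.append(("propagation", e))
        if e.kind == "RW":
            m.last_rw_end = max(m.last_rw_end, e.end)
            m.last_value = e.end_hash
        else:
            m.last_ro_end = max(m.last_ro_end, e.end)

v = MemoryVerifier(window=2)
v.receive(Inform(1, "RW", 0, 10, 0, 7))
v.receive(Inform(1, "RO", 20, 30, 7, 7))
v.receive(Inform(1, "RO", 12, 18, 7, 7))   # arrives late, reordered by the queue
v.receive(Inform(1, "RW", 35, 40, 7, 9))
v.flush()
print(v.errors)   # []
```

Each block needs only the three fields of `MetEntry`, regardless of how many epochs it has seen. The fixed-size window is one simple way to decide when an inform is safe to verify; a real design would need some other bound on reordering.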

Page 16

Experimental Evaluation

• Empirically determine error detection capability
  • Error injection into caches, controller, interconnect, switches, etc.

• Quantify error-free overhead
  • Increase in interconnect bandwidth consumption
  • Potential decrease in application performance

Page 17

Simulation Methodology

• Full-system simulation of 8-CPU UltraSPARC SMP
• Simics functional simulation
• GEMS-based timing simulation
• 2 GB RAM, 4-way 32KB I+D L1, 4-way 1MB L2
• SafetyNet for backward error recovery
• MOSI-Directory and MOSI-Snooping

Benchmarks

Apache 2     Static web-server
SpecJBB      3-Tier Java system
OLTP         Online transaction system with DB2
Slashcode    Dynamic website with Perl and MySQL
Barnes       Barnes-Hut from SPLASH2

Page 18

[Figure: Bottleneck Link Bandwidth - Directory]

Page 19

[Figure: Error-Free Runtime - Directory. Axis annotation: slower.]

Page 20

Conclusions

• DVSC-Direct and DVSC-Indirect enable end-to-end verification of the memory system

• DVSC-Indirect imposes acceptable hardware and performance overhead

• An extension of DVSC-Indirect to relaxed consistency is currently under development

Page 21

Questions?