Light64: Lightweight Hardware Support for Data Race Detection during Systematic Testing of Parallel Programs A. Nistor, D. Marinov and J. Torellas to appear MICRO’09 LBA reading group – 09/29/09 (by Evangelos)

Dec 24, 2015


Ronald Davidson
Page 1

Light64: Lightweight Hardware Support for Data Race Detection during Systematic Testing of Parallel Programs

A. Nistor, D. Marinov and J. Torellas

to appear, MICRO’09

LBA reading group – 09/29/09

(by Evangelos)

Page 2

Introduction – Context

• Debugging of parallel applications: even for one input, there are too many interleavings

• Systematic testing: execute the program many times to explore all interleavings

• Assumptions: the input is provided; thread interleaving is the only cause of non-determinism

• Goal: hardware support for data race detection under systematic testing

Page 3

Background of Systematic Testing

• Serializing of threads (multiplexing)

• New scheduler implementation

• Happens-before definition

• Segment-based interleaving

Page 4

Background of Systematic Testing

State: represented by a serial log, i.e., an ordered list of segments
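A minimal sketch of what such a serial log could look like. The slide only says that a state is an ordered list of segments; the `Segment` fields and the `is_valid_serial_log` helper are hypothetical names chosen for illustration, and the only constraint checked here is that each thread's segments appear in program order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    thread_id: int  # assumed field: the thread this segment belongs to
    index: int      # assumed field: position of the segment within its thread


def is_valid_serial_log(log):
    """Return True if every thread's segments appear in program order."""
    next_index = {}
    for seg in log:
        if seg.index != next_index.get(seg.thread_id, 0):
            return False
        next_index[seg.thread_id] = seg.index + 1
    return True


# One possible state: thread 0 runs a segment, then thread 1, then thread 0 again.
example_log = [Segment(0, 0), Segment(1, 0), Segment(0, 1)]
```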

Page 5

Background of Systematic Testing

Page 6

Light64 – The Idea

“Two different thread interleavings that have the same happens-before graph but a flipped data race, will very likely have at least a small deviation in the execution history”
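The idea can be illustrated with a toy example. This is a sketch under assumptions, not the paper's mechanism: two unsynchronized segments (so both serial orders share the same happens-before graph) race on `x`, and the "execution history" is assumed to be a CRC over the values the reading thread loads. `zlib.crc32` stands in for the hardware hash, and `run`, `writer` and `reader` are illustrative names.

```python
import zlib


def run(order):
    """Run two unsynchronized segments in the given serial order.

    Returns the reader thread's execution history hash.
    """
    memory = {"x": 0}
    history = 0  # running CRC over every value the reader loads

    def load(addr):
        nonlocal history
        value = memory[addr]
        history = zlib.crc32(value.to_bytes(8, "little"), history)
        return value

    def writer():
        memory["x"] = 42

    def reader():
        memory["y"] = load("x") + 1

    segments = {"writer": writer, "reader": reader}
    for name in order:
        segments[name]()
    return history


h_original = run(["writer", "reader"])  # reader observes 42
h_flipped = run(["reader", "writer"])   # reader observes 0
race_suspected = h_original != h_flipped
```

Because the two orders have the same happens-before graph but different hashes, the flipped race shows up as a hash mismatch.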

Page 7

Corner cases?

• No false positives; few false negatives

• Systematic tester environment is highly deterministic

• Extremely improbable for two different streams of values to generate the same hash

• Cannot identify benign races; races on data that will never be consumed

• By construction…

Page 8

Design

• Small hardware modifications

• CRC logic at the head of the ROB

• ISA extensions: start/stop; save/load hash history

• Two modes of execution: Passive Mode and Active Mode

• Tradeoff between accuracy and performance
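A software model of the hash register and the four operations named on the slide might look as follows. This is only a sketch: the class and method names are hypothetical, `zlib.crc32` stands in for the CRC logic, and feeding the register with 64-bit values at instruction retirement is an assumption about what the logic at the head of the ROB sees.

```python
import zlib


class HistoryHashRegister:
    """Toy model of a per-core execution history hash register."""

    def __init__(self):
        self.value = 0
        self.enabled = False

    def start(self):
        self.enabled = True

    def stop(self):
        self.enabled = False

    def save(self):
        """Return the current hash, e.g. when a thread is switched out."""
        return self.value

    def load(self, saved):
        """Restore a previously saved hash, e.g. when a thread is switched in."""
        self.value = saved

    def retire(self, data):
        """Fold one retired value into the hash (assumed: a 64-bit value)."""
        if self.enabled:
            self.value = zlib.crc32(data.to_bytes(8, "little"), self.value)


# Hashing a value stream without interruption.
reg = HistoryHashRegister()
reg.start()
for v in (1, 2, 3):
    reg.retire(v)
uninterrupted = reg.save()
```

Since the CRC is computed incrementally, saving the register when a thread is switched out and loading it back later gives the same result as an uninterrupted run, which is what lets the serialized threads keep separate histories.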

Page 9

Passive Mode

• During step 4

• Augment each state with the Execution History Hash

• Check whether executions with the same happens-before graph have the same hash value (e.g., S2 & S11)

• No guarantees on coverage

• Dependent on the systematic tester’s exploration strategy and pruning heuristics

• No practical overhead
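The passive check can be sketched as a grouping step over the states the tester already explores. The representation is an assumption: each state is a tuple of a state name, some hashable key identifying its happens-before graph, and its execution history hash. The function name and the hash values below are made up for illustration; S2 and S11 are borrowed from the slide's example.

```python
from collections import defaultdict


def passive_check(states):
    """Report happens-before graphs that were seen with more than one hash.

    states: iterable of (state_id, hb_key, history_hash) tuples.
    Returns a list of (hb_key, [state_ids]) with one state per distinct hash.
    """
    by_graph = defaultdict(dict)
    for state_id, hb_key, history_hash in states:
        # Keep the first state seen for each distinct hash.
        by_graph[hb_key].setdefault(history_hash, state_id)
    reports = []
    for hb_key, hashes in by_graph.items():
        if len(hashes) > 1:
            reports.append((hb_key, sorted(hashes.values())))
    return reports


explored = [
    ("S2", "hbA", 0x1111),   # hypothetical hash values
    ("S11", "hbA", 0x2222),  # same graph as S2, different hash
    ("S5", "hbB", 0x3333),
]
suspects = passive_check(explored)
```

Nothing is re-executed here, which is why the mode adds no practical overhead, and also why its coverage depends on which states the tester happens to visit.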

Page 10

Active Mode

• During step 2

• While re-executing to reach the selected state ‘S’, flip as many segments as possible

• Compare the Execution History Hash against the original execution

• Heuristic 1 – efficient segment reordering: smallest-ID thread first during the first run; biggest-ID thread first during re-execution

• Heuristic 2 – additional re-executions to increase coverage: ActiveFIN – re-execute all final states; ActiveFULL – re-execute all states
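Heuristic 1 can be sketched with a toy serial scheduler. The simplifications are assumptions: the threads have no synchronization between them (so every serial order has the same happens-before graph), segments are lists of `("st" | "ld", address, value)` tuples, and each thread keeps its own history hash over the values it loads. `execute` and `active_check` are illustrative names.

```python
import zlib


def execute(threads, pick):
    """Run all segments serially; `pick` chooses among runnable thread IDs."""
    memory = {}
    history = {tid: 0 for tid in threads}   # one history hash per thread
    next_seg = {tid: 0 for tid in threads}  # next segment index per thread
    while True:
        runnable = [t for t in threads if next_seg[t] < len(threads[t])]
        if not runnable:
            return tuple(history[t] for t in sorted(threads))
        tid = pick(runnable)
        for op, addr, value in threads[tid][next_seg[tid]]:
            if op == "st":
                memory[addr] = value
            else:  # "ld": fold the loaded value into this thread's hash
                loaded = memory.get(addr, 0)
                history[tid] = zlib.crc32(loaded.to_bytes(8, "little"), history[tid])
        next_seg[tid] += 1


def active_check(threads):
    """Smallest-ID thread first, then biggest-ID first; compare the hashes."""
    first_run = execute(threads, min)
    re_execution = execute(threads, max)
    return first_run != re_execution


racy = {0: [[("st", "x", 7)]], 1: [[("ld", "x", None)]]}
race_free = {
    0: [[("st", "a", 1), ("ld", "a", None)]],
    1: [[("st", "b", 2), ("ld", "b", None)]],
}
```

Here `active_check(racy)` reports a mismatch because thread 1 loads a different value of `x` in the two orders, while `active_check(race_free)` does not, since each thread only reads what it wrote itself.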

Page 11

Experimental Setup

Used Pin to model a system running a systematic tester

Instruction count as a performance metric

SPLASH-2 benchmarks (modified & unmodified)

6 versions of a system: Plain, Plain+RD, ActiveNO, ActiveFIN, ActiveFULL, Passive

Page 12

State Space Characterization

Page 13

Race Detection Capability

Page 14

Runtime Overhead

Page 15

Runtime Overhead – Software-based

Page 16

Conclusions

• Lightweight support for data race detection in a systematic tester world

• Relatively low overhead for systematic testing (S.T.)

• Not a conventional MICRO paper