Dynamic Floating-Point Error Detection
Mike Lam, Jeff Hollingsworth and Pete Stewart
University of Maryland
Feb 25, 2016
Motivation
• Finite precision -> roundoff error
  • Compromises ill-conditioned calculations
  • Hard to detect and diagnose
• Increasingly important as HPC grows
  • Single-precision is faster on GPUs
  • Double-precision fails on long-running computations
• Previous solutions are problematic
  • Numerical analysis requires training
  • Manual re-writing and testing in higher precision is tedious and time-consuming
Our Solution
• Instrument floating-point instructions
• Automatic
  • Minimize developer effort
  • Ensure analysis consistency and correctness
• Binary-level
  • Include shared libraries w/o source code
  • Include compiler optimizations
• Runtime
  • Data-sensitive
Our Solution
• Three parts
  • Utility that inserts binary instrumentation
  • Runtime shared library with analysis routines
  • GUI log viewer
• General overview
  • Find floating-point instructions and insert calls to the shared library
  • Run the instrumented program
  • View output with the GUI
Our Solution
• Dyninst-based instrumentation
  • Cross-platform
  • No special hardware required
  • Stack walking and binary rewriting
• Java GUI
  • Cross-platform
  • Minimal development effort
Our Solution
• Cancellation detection
  • Instrument addition & subtraction
  • Compare runtime operand values
  • Report cancelled digits
• Side-by-side (“shadow”) calculations
  • Instrument all floating-point instructions
  • Higher/lower precision
  • Different representations (e.g. rationals)
  • Report final errors
Cancellation Detection
• Overview: loss of significant digits during operations
• For each addition/subtraction:
  • Extract the value of each operand
  • Calculate the result and compare magnitudes (binary exponents)
  • If e_ans < max(e_x, e_y), there is a cancellation
• For each cancellation event:
  • Record a “priority”: max(e_x, e_y) - e_ans
  • Save event information to log
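The exponent test above can be sketched in a few lines. This is an illustrative Python sketch only; the actual tool performs this check inside binary instrumentation, and the helper names here are invented for the example:

```python
import math

def exponent(x):
    """Binary exponent of x (the e_x of the slides)."""
    # math.frexp returns (m, e) with x = m * 2**e and 0.5 <= |m| < 1
    return math.frexp(x)[1]

def check_cancellation(x, y, op):
    """Return the cancellation priority for one add/subtract, or None."""
    ans = x + y if op == '+' else x - y
    if ans == 0.0:
        return math.inf  # complete cancellation: every bit lost
    e_max = max(exponent(x), exponent(y))
    e_ans = exponent(ans)
    if e_ans < e_max:
        return e_max - e_ans  # "priority" = number of cancelled bits
    return None  # no cancellation event

# 1.0000001 + (-1.0) cancels roughly 24 leading bits
print(check_cancellation(1.0000001, -1.0, '+'))
# 1.0 + 2.0 loses nothing
print(check_cancellation(1.0, 2.0, '+'))
```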
Gaussian Elimination
• A -> [L,U]
• Comparison of eight methods:
  • Classical
  • Classical w/ partial pivoting
  • Classical w/ full pivoting
  • Bordering (“Sherman’s march”)
  • “Pickett’s charge”
  • “Pickett’s charge” w/ partial pivoting
  • Crout’s method
  • Crout’s method w/ partial pivoting
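The classical-versus-pivoting contrast behind this comparison can be demonstrated with a toy dense solver. This is a hypothetical Python sketch, not the deck's actual test codes:

```python
def gauss_solve(A, b, partial_pivot=False):
    """Solve Ax = b by Gaussian elimination (toy dense version)."""
    n = len(b)
    A = [row[:] for row in A]  # work on copies
    b = b[:]
    for k in range(n - 1):
        if partial_pivot:
            # swap in the row with the largest pivot magnitude
            p = max(range(k, n), key=lambda i: abs(A[i][k]))
            A[k], A[p] = A[p], A[k]
            b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    # back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

# A tiny pivot wipes out all accuracy in classical elimination:
A = [[1e-20, 1.0], [1.0, 1.0]]
b = [1.0, 2.0]
print(gauss_solve(A, b))                      # x[0] badly wrong
print(gauss_solve(A, b, partial_pivot=True))  # close to [1, 1]
```

The huge multiplier 1/1e-20 swamps the entries of the second row, a loss that row swapping avoids.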
Gaussian Elimination
Gaussian Elimination
Classical vs. Bordering
Gaussian Elimination

                Classical   Bordering
Operations         285         294
Cancellations       39           9
Cancels/ops        14%          3%
Average bits      5.23       22.78
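The "Cancels/ops" row is just the ratio of the two rows above it, which can be checked directly:

```python
# Cancellations per operation, from the table above
classical = 39 / 285   # about 0.14 -> 14%
bordering = 9 / 294    # about 0.03 -> 3%
print(f"{classical:.0%} {bordering:.0%}")
```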
SPEC Benchmarks
• Results are hard to interpret without domain knowledge
• Overheads:
Roundoff Error
• Sparse “shadow value” table
  • Maps memory addresses to alternate values
  • Shadow values can be single-, double-, quad- or arbitrary-precision
  • Other ideas: rationals, # of significant digits, etc.
• Instrument every FP instruction
  • Extract operation type and operand addresses
  • Perform the same operation on corresponding shadow values
  • Output shadow values and errors upon termination
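A shadow-value table of this kind can be sketched with exact rationals as the alternate representation (one of the "other ideas" above). This is purely illustrative: the real tool keys the table on memory addresses and instruments binary FP instructions, whereas this sketch uses made-up string labels:

```python
from fractions import Fraction

class ShadowTable:
    """Sparse map from a value's 'address' to an exact shadow value."""
    def __init__(self):
        self.table = {}

    def store(self, addr, value):
        self.table[addr] = Fraction(value)

    def op(self, dst, src1, src2, fn):
        # perform the same operation on the corresponding shadow values
        self.table[dst] = fn(self.table[src1], self.table[src2])

    def error(self, addr, native_value):
        # difference between the native result and its shadow
        return abs(Fraction(native_value) - self.table[addr])

# Native computation: adding a tiny y to a huge x absorbs y entirely
shadow = ShadowTable()
x, y = 1e16, 1.0
shadow.store('x', x); shadow.store('y', y)

t = x + y                          # native: y is smaller than ulp(x)
shadow.op('t', 'x', 'y', lambda a, b: a + b)

r = t - x                          # native result: 0.0
shadow.op('r', 't', 'x', lambda a, b: a - b)

print(r, shadow.error('r', r))     # shadow exposes the lost 1.0
```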
More Gaussian Elimination

Maximum relative error:

                    25x25     50x50     100x100
Partial pivoting   9.3e-10   2.3e-2     1.0
Full pivoting      1.3e-15   2.4e-15    4.8e-15
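The deck does not spell out the metric; a standard reading of "maximum relative error" is the largest componentwise relative deviation from a reference solution, e.g.:

```python
def max_relative_error(exact, computed):
    """max_i |x_i - xhat_i| / |x_i| -- one common definition
    (assumed here; the slides do not give the formula)."""
    return max(abs(e - c) / abs(e) for e, c in zip(exact, computed))

print(max_relative_error([1.0, 2.0], [1.0, 2.002]))  # about 1e-3
```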
Issues & Possible Solutions
• Expensive overheads (100-500X)
  • Optimize with inline snippets
  • Reduce workload with data flow analysis
• Following values through compiler optimizations
  • Selectively instrument MOV instructions
• Filtering false positives
  • Deduce “root cause” of error using data flow
Conclusion
• Analysis of floating-point error is hard
• Our tool provides automatic analysis of such error
• Work in progress
Thank you!