+ Probabilistic Accuracy Bounds @aysylu22 October 28, 2015 Papers We Love Too
+Aysylu Greenberg
@aysylu22 http://aysy.lu
+Papers We Love NYC
+Today
+
https://www.youtube.com/watch?v=1o9RGnujlkI
+Inaccuracy is OK in Some Domains
• Monte Carlo
• Video/audio encoder/decoder
• Reed-Solomon codes
• Robust statistical techniques
• Near-realistic visualization of collision detection
+Benefits of Inaccurate Computations
• Latency reduction
• Working around small failures
• Reduced utilization of compute resources and power
• Resilience to software errors
+
Andy Warhol
+
+MapReduce:
Warholfaux Gallery Showing
Diagram: three Input blocks, each feeding a Map task; the Map tasks feed a single Reduce.
+
Christopher Wool
+
+
+
FAILED CRITICALITY TEST
+Criticality Test
• Will corrupt data consumed by the next task block
  • Null pointer type corruption
• Wrong or incomplete data distorts results downstream “too much”
• “Too much” = if 10% of tasks failed and
  • No output OR
  • Computed distortion > 0.1
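The "too much" rule above can be sketched as a small check. This is a minimal sketch, not the paper's implementation: the function name, the `None` convention for "no output", and the `bound` parameter are assumptions made for illustration, and the data-corruption criterion is not modelled.

```python
def fails_criticality_test(observed_output, distortion_value, bound=0.1):
    """Decide whether a task is critical after a run in which 10% of its
    instances were failed (hypothetical helper, names are illustrative).

    observed_output: output of the run with failures, or None if the
        program produced no output at all.
    distortion_value: distortion of that output against the correct one.
    bound: largest acceptable distortion (0.1 on the slide).
    """
    if observed_output is None:
        return True  # no output counts as "too much"
    return distortion_value > bound


print(fails_criticality_test(None, 0.0))      # no output
print(fails_criticality_test([1.0], 0.25))    # distortion above the bound
print(fails_criticality_test([1.0], 0.03))    # distortion within the bound
```

Tasks that pass this check are the candidates for being treated as failable.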
+Distortion
• For each input:
  • measure the difference between correct output and observed output,
  • scale it by correct output value for meaningful comparison across different inputs.
• Sum over all inputs
• Divide by total number of samples.
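The steps above can be sketched in a few lines. This is a minimal sketch under two assumptions that the slide does not state: the difference is taken as an absolute value, and every correct output is non-zero. The function name and sample numbers are made up for illustration.

```python
def distortion(correct, observed):
    """Mean relative error between correct and observed outputs."""
    if len(correct) != len(observed) or not correct:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for o, o_hat in zip(correct, observed):
        # Scale by the correct value so large and small outputs are comparable.
        # Assumption: o is never zero.
        total += abs((o - o_hat) / o)
    return total / len(correct)


# Illustrative values only: relative errors of 10%, 5%, and 0%.
correct = [10.0, 200.0, 4.0]
observed = [9.0, 210.0, 4.0]
print(distortion(correct, observed))  # about 0.05
```

A result of 0 means the observed output matches the correct one exactly; the criticality test compares this number against a bound of 0.1.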
+Obtaining a Model
• Linear least-squares regression
+Distortion Model
• Given n failable tasks, the model is a sum of terms where each term consists of:
  • least-squares coefficient for regression,
  • 95% confidence intervals for the coefficients
• Non-zero 0th term coefficient indicates different modes for small vs large task failure rates
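To make the model concrete, here is a minimal sketch that fits distortion as a 0th term plus one coefficient per failable task's failure rate, using ordinary least squares solved through the normal equations. The function name and the trial data are hypothetical (the data is synthetic, generated from known coefficients), and the confidence intervals mentioned above are not computed here.

```python
def fit_distortion_model(failure_rates, distortions):
    """Fit d = c0 + c1*x1 + ... + cn*xn by ordinary least squares.

    failure_rates: one row per trial, each holding n task failure rates.
    distortions: observed distortion for each trial.
    Returns [c0, c1, ..., cn].
    """
    rows = [[1.0] + list(r) for r in failure_rates]  # leading 1 for the 0th term
    k = len(rows[0])
    # Normal equations (X^T X) c = X^T d, stored as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * d for r, d in zip(rows, distortions))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system: trials are not varied enough")
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(k):
            if i != col:
                factor = a[i][col] / a[col][col]
                a[i] = [x - factor * y for x, y in zip(a[i], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Synthetic trials for two failable tasks, generated from d = 0.02 + 0.5*x1 + 0.1*x2.
rates = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.2, 0.05)]
observed = [0.02 + 0.5 * x1 + 0.1 * x2 for x1, x2 in rates]
coeffs = fit_distortion_model(rates, observed)
print([round(c, 6) for c in coeffs])
```

With real measurements the fit will not be exact, which is where the confidence intervals and the goodness-of-fit statistics come in.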
+Good Statistical Properties
in Layman’s Terms
• Linear least-squares regression
• R² = how much of the variation in the data the model accounts for
• Confidence interval = range of values, computed from the data, that is likely to contain the true value of a coefficient
• F value = how well the model as a whole explains the data, compared with a model that has no predictors
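As an illustration of the R² bullet, the standard definition is 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes the observed values are not all identical; the function name and sample values are made up.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: share of variation the model accounts for."""
    mean = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))  # unexplained
    ss_tot = sum((y - mean) ** 2 for y in observed)                  # total
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0]
print(r_squared(observed, [1.0, 2.0, 3.0]))  # perfect fit
print(r_squared(observed, [2.0, 2.0, 2.0]))  # no better than predicting the mean
```

A value near 1 means the fitted model tracks the measurements closely; a value near 0 means it does no better than the mean.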
+Simulations
• Water
• Search
• SOR
• String
+String
• Shoot rays
• Put results into storage
• Create data structures
• Deallocate data structures
• Compute new model
+Distortion Model
+Time Model
+Distortion & Time Model
1st failable task x1
• Distortion:
• Time:
• Ratio: 0.053 / -0.50 = -0.106
4th failable task x4
• Distortion:
• Time:
• Ratio: 0.54 / -0.004 = -135
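The ratio arithmetic can be checked directly. One way to read it, offered here as an interpretation rather than something the slide states, is distortion added per unit of time saved when a task is failed: the smaller the magnitude, the cheaper the task is to fail. The function name is hypothetical; the coefficients are the ones on the slide.

```python
def distortion_time_ratio(distortion_coeff, time_coeff):
    """Ratio of a task's distortion coefficient to its time coefficient."""
    return distortion_coeff / time_coeff


ratio_x1 = distortion_time_ratio(0.053, -0.50)   # 1st failable task
ratio_x4 = distortion_time_ratio(0.54, -0.004)   # 4th failable task
print(round(ratio_x1, 3), round(ratio_x4, 3))

# Under the reading above, x1 buys time savings far more cheaply than x4.
print(abs(ratio_x1) < abs(ratio_x4))
```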
+Amdahl’s Law
Figure labels: Theoretical Maximum Speedup, Parallelization Speedup
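For reference, Amdahl's Law in its usual form: if a fraction p of the work can be sped up by a factor s, the overall speedup is 1 / ((1 - p) + p / s), and it can never exceed 1 / (1 - p). The sketch below uses hypothetical function and parameter names and an illustrative p of 0.9.

```python
def amdahl_speedup(parallel_fraction, parallel_speedup):
    """Overall speedup when only part of the work is sped up."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / parallel_speedup)


def max_speedup(parallel_fraction):
    """Upper bound on speedup as the parallel part's speedup grows without limit."""
    return 1.0 / (1.0 - parallel_fraction)


# Illustrative: 90% of the work parallelizes.
print(round(amdahl_speedup(0.9, 10), 3))    # 10x on the parallel part
print(round(amdahl_speedup(0.9, 1000), 3))  # approaches the bound
print(round(max_speedup(0.9), 3))           # the bound itself
```

The serial fraction sets the ceiling, which is why shaving time off individual tasks only helps up to a point.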
+Jade
• Portable, implicitly parallel language designed for exploiting task-level concurrency
  • Start with a program written in a serial, imperative language
  • Use Jade constructs to declare how parts of the program access data
• Jade implementation uses data access information to automatically extract the concurrency and map the application onto the machine
• Extension to C
http://people.csail.mit.edu/rinard/paper/toplas98.pdf
+Computing with
Bounded Inaccuracies
• Purposeful failure of tasks to reduce execution time
• Simplified implementation resilient to software errors that avoids expensive handling of edge cases
• More focus on failure detection and repair mechanisms
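A toy illustration of purposeful task failure: a map/reduce-style mean where one map task is discarded rather than retried, and the result is compared with the exact answer. This is a sketch under assumptions, not code from the talk or the paper; the function name, the `failed_tasks` parameter, and the data are all made up.

```python
def map_reduce_mean(chunks, failed_tasks=frozenset()):
    """Average all values, skipping chunks whose map task was failed."""
    partials = []
    for task_id, chunk in enumerate(chunks):
        if task_id in failed_tasks:
            continue  # discard the task instead of retrying it
        partials.append((sum(chunk), len(chunk)))  # map: partial sum and count
    total = sum(s for s, _ in partials)  # reduce over surviving tasks
    count = sum(n for _, n in partials)
    return total / count


# Ten tasks of ten values each, covering 0..99.
chunks = [[float(10 * t + i) for i in range(10)] for t in range(10)]
exact = map_reduce_mean(chunks)
approx = map_reduce_mean(chunks, failed_tasks={3})  # fail 1 of 10 tasks
relative_error = abs((exact - approx) / exact)
print(exact, round(approx, 3), round(relative_error, 4))
```

Here 10% of the map work is skipped and the relative error stays under the 0.1 bound used in the criticality test; whether that holds for a real workload is exactly what the distortion model is meant to predict.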
+References
• Practical Probabilistic Programming
• Jade design: http://people.csail.mit.edu/rinard/paper/toplas98.pdf
• Distributed Information Processing in Biological and Computational Systems: http://m.cacm.acm.org/magazines/2015/1/181614-distributed-information-processing-in-biological-and-computational-systems/fulltext
• Loop perforation: http://people.csail.mit.edu/rinard/paper/fse11.pdf
• Confidence intervals: http://blog.minitab.com/blog/adventures-in-statistics/when-should-i-use-confidence-intervals-prediction-intervals-and-tolerance-intervals