Ekaterina Gonina, Jike Chong, N.R. Satish*, Mikhail Smelyanskiy*, Kurt Keutzer
Department of EECS, University of California, Berkeley
Parallel Computing Lab

This research is supported in part by an Intel Ph.D. Fellowship. It is also supported in part by Microsoft (Award #024263) and Intel (Award #024894) funding and by matching funding from U.C. Discovery (Award #DIG07-10227).

Quantitative Finance High Level Overview

[Figure: overview diagram. Components: Market Conditions, Portfolio, Pricer, Value at Risk, Strategy, Trade, Instrument Models, Market Price Feed. Legend: input data (general parameter information); algorithms/routines; input data (specific to user).]

Pricer and Value-at-Risk Estimator: Customizable Parameters
• Analytical
• Implicit & explicit finite difference
• Discretization
• Convergence criteria
• Iterative method (implicit)

Structural Patterns:
• Iterative refinement
• MapReduce
• Pipe-and-filter

Computational Patterns:
• Dense Linear Algebra
• Sparse Linear Algebra
• Structured Grids
• Monte Carlo

Monte Carlo Based Value-at-Risk Analysis

Intra-day Risk Analysis
In collaboration with Matthew Dixon, University of California, Davis

Potential Future Exposure Analysis
In collaboration with the Center for Innovative Financial Technologies, UC Berkeley

Input: all trade types and their parameters
Phase 1: Collect the reporting time steps of interest
Phase 2: Generate market data for all time steps
Phase 3: Estimate the instrument value for all trades at each reporting time step
Phase 4: Aggregate the value of all trades
Phase 5: Find the PFE based on all paths

[Figure: data layout for each phase, with axes Time Steps, Trades, and Path.]

• Billions of instrument values must be priced overnight
• The challenge lies in developing software architectures that are efficient, scalable, and maintainable
• Working with one of the top financial data providers to architect their risk analytics engine

[Figure panels: "Typical Monte Carlo Simulation" and "Simulation of market VaR optimized for manycore resource hierarchy", each showing steps (1) through (4).]
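The five Potential Future Exposure phases listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the architecture described on the poster: the geometric Brownian motion market model, the single forward-contract trade type, the 95% quantile, and all function and parameter names (`simulate_path`, `forward_value`, `pfe_profile`) are assumptions introduced for the example.

```python
import math
import random


def simulate_path(s0, mu, sigma, dt, n_steps, rng):
    """Phase 2: generate market data (one price path) for all time steps.

    Geometric Brownian motion is an assumed market model.
    """
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        drift = (mu - 0.5 * sigma ** 2) * dt
        path.append(path[-1] * math.exp(drift + sigma * math.sqrt(dt) * z))
    return path


def forward_value(trade, spot):
    """Phase 3: toy valuation of a forward contract (hypothetical trade type)."""
    return trade["notional"] * (spot - trade["strike"])


def pfe_profile(trades, report_steps, n_paths, s0=100.0, mu=0.0,
                sigma=0.2, dt=1.0 / 252, quantile=0.95, seed=0):
    """Return one PFE value per reporting time step."""
    rng = random.Random(seed)
    # Phase 1: the reporting steps of interest decide how far to simulate.
    n_steps = max(report_steps)
    # exposures[k] holds one portfolio exposure per path for report_steps[k].
    exposures = [[] for _ in report_steps]
    for _ in range(n_paths):
        path = simulate_path(s0, mu, sigma, dt, n_steps, rng)
        for k, t in enumerate(report_steps):
            # Phases 3 and 4: value every trade, then aggregate the portfolio.
            portfolio = sum(forward_value(tr, path[t]) for tr in trades)
            exposures[k].append(max(portfolio, 0.0))
    # Phase 5: PFE is a high quantile of exposure across all paths.
    profile = []
    for per_step in exposures:
        ordered = sorted(per_step)
        idx = max(0, math.ceil(quantile * len(ordered)) - 1)
        profile.append(ordered[idx])
    return profile


if __name__ == "__main__":
    trades = [{"notional": 1.0, "strike": 90.0}]
    print(pfe_profile(trades, report_steps=[21, 63, 126], n_paths=500))
```

Each path is independent of the others, so the outer loop over paths is the natural place to parallelize; only the final quantile in Phase 5 needs data from all paths.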
[Figure axis: Number of Scenarios to Simulate, with steps (1) through (4).]

Performance is optimized using a three-level approach comprising problem reformulation, module selection, and implementation styling. 148x speedup on the same platform.

(1) Uniform Random Number Generation (URNG)
(2) Parameter Distribution Transformation
(3) Instrument Pricing
(4) Data Assimilation

With the proliferation of algorithmic trading, derivative usage, and highly leveraged hedge funds, Value-at-Risk is an increasingly important metric for financial institutions. Monte Carlo simulation is typically a four-step process, and each step can be customized with alternative implementations. Monte Carlo is an important computational pattern in Our Pattern Language.

Finite-Difference Methods for Option Pricing

An option is a tradable financial security whose value depends on the value of the underlying asset. The task of option pricing is to find the price of the option given the underlying asset and market conditions.

Black-Scholes equation (standard form): \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0

Heat diffusion equation (standard form, in transformed variables u, x, \tau): \frac{\partial u}{\partial \tau} = \frac{\partial^2 u}{\partial x^2}

• S: underlying asset price
• V: option price
• \sigma (sigma): volatility
• r: riskless rate of return
• Transform the Black-Scholes PDE into the heat equation and solve it using finite-difference methods
• Discretization & stencil computation

Crank-Nicolson Method
In collaboration with N.R. Satish and Mikhail Smelyanskiy, Throughput Computing Lab, Intel

• Half-step implicit, half-step explicit method
• Gauss-Seidel iterative solver for the implicit half-step
• Explicit data dependency

Input

Market Parameters
• Starting stock price
• Volatility

Option Type
• American/European
• Payoff function

Discretization Parameters
• Number of timesteps
• Number of points

Pricing Routine

For n timesteps:
1. Forward half-step
2. Backward half-step (Gauss-Seidel solver), iterated until convergence

60%-90% of computation

Output
• Option values

Parallelization & Results

Evaluated on Core i7 and Larrabee*

SIMD-level
• Unroll the Gauss-Seidel iteration loop
• Reuse registers
• Efficient gather/scatter

Core-level
• Embarrassingly parallel: map one option per core

[Figure: Gauss-Seidel iterations before and after parallelization. A, B, C, ... each denote an iteration operating on a vector of 4 values.]

Speedup

              Nehalem   Larrabee*
1 Option      2.13x     6.08x
128 Options   5.7x      25.95x

[Figure: Core Scaling, with panels for Nehalem and Larrabee.]
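The pricing routine above (a forward explicit half-step followed by a backward implicit half-step solved by Gauss-Seidel until convergence) can be sketched in plain Python. This is a minimal scalar sketch under stated assumptions, not the optimized Core i7/Larrabee implementation: it discretizes the Black-Scholes equation directly in the asset price S instead of first transforming to the heat equation, it prices a put only, and the function name `crank_nicolson_put`, the grid bound `s_max_mult`, the tolerance, and the boundary conditions are choices made for the example. For the American case it uses a projected Gauss-Seidel update (clamping to the payoff), which is a common technique but is not confirmed by the poster.

```python
import math


def crank_nicolson_put(s0, strike, r, sigma, maturity, n_steps=100,
                       n_points=100, s_max_mult=4.0, american=False,
                       tol=1e-8, max_iter=10_000):
    """Price a put on a uniform grid S_i = i * ds, 0 <= i <= n_points."""
    ds = s_max_mult * strike / n_points
    dt = maturity / n_steps
    half = 0.5 * dt
    payoff = [max(strike - i * ds, 0.0) for i in range(n_points + 1)]
    v = payoff[:]  # option values at expiry (time to maturity = 0)

    # Central-difference stencil of the Black-Scholes operator at node i.
    a = [0.5 * (sigma ** 2 * i ** 2 - r * i) for i in range(n_points + 1)]
    b = [-(sigma ** 2 * i ** 2 + r) for i in range(n_points + 1)]
    c = [0.5 * (sigma ** 2 * i ** 2 + r * i) for i in range(n_points + 1)]

    for n in range(1, n_steps + 1):
        # 1. Forward half-step: explicit stencil applied to the old values.
        rhs = v[:]
        for i in range(1, n_points):
            rhs[i] = v[i] + half * (a[i] * v[i - 1] + b[i] * v[i]
                                    + c[i] * v[i + 1])

        # Boundary values at the new time level (assumed conditions).
        tau = n * dt
        v[0] = strike if american else strike * math.exp(-r * tau)
        v[n_points] = 0.0

        # 2. Backward half-step: Gauss-Seidel sweeps until convergence.
        #    v[i - 1] is already updated in this sweep, which is the
        #    explicit data dependency between neighbouring grid points.
        for _ in range(max_iter):
            max_change = 0.0
            for i in range(1, n_points):
                new = (rhs[i] + half * (a[i] * v[i - 1] + c[i] * v[i + 1])) \
                    / (1.0 - half * b[i])
                if american:
                    new = max(new, payoff[i])  # early-exercise constraint
                max_change = max(max_change, abs(new - v[i]))
                v[i] = new
            if max_change < tol:
                break

    # Linear interpolation of the grid values at the starting stock price.
    idx = min(int(s0 / ds), n_points - 1)
    w = (s0 - idx * ds) / ds
    return (1.0 - w) * v[idx] + w * v[idx + 1]


if __name__ == "__main__":
    print(crank_nicolson_put(100.0, 100.0, 0.05, 0.2, 1.0))
    print(crank_nicolson_put(100.0, 100.0, 0.05, 0.2, 1.0, american=True))
```

Pricing many options is independent work per option, which matches the one-option-per-core mapping above; the inner Gauss-Seidel sweep is the part whose loop-carried dependency makes SIMD vectorization harder.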