ALASTAIR HOUSTON COMPUTE FSI SALES MANAGER GPU OVERVIEW IN FINANCIAL SERVICES
May 10, 2015
© NVIDIA Corporation 2009
Agenda
NVIDIA and HPC markets
GPU Overview
CUDA and OpenCL
Current FS deployments
Tesla™: High-Performance Computing
Quadro®: Design & Creation
GeForce®: Entertainment
CUDA Runs on NVIDIA GPUs … Over 80 Million CUDA GPUs Deployed
146X: Medical Imaging (U of Utah)
36X: Molecular Dynamics (U of Illinois, Urbana)
18X: Video Transcoding (Elemental Tech)
50X: Matlab Computing (AccelerEyes)
100X: Astrophysics (RIKEN)
149X: Financial Simulation (Oxford)
47X: Linear Algebra (Universidad Jaime)
20X: 3D Ultrasound (Techniscan)
130X: Quantum Chemistry (U of Illinois, Urbana)
30X: Gene Sequencing (U of Maryland)
50x – 150x
Options Pricing, Risk Modeling, Algorithmic Trading
Options pricing uses Monte Carlo (MC) simulations
Random Number Generators (RNG) are key to MC
Up to 100x speed-up in RNGs using CUDA
25-60x overall speedup in Monte Carlo simulations
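To illustrate why random number generation dominates Monte Carlo pricing, here is a minimal CPU-only sketch in Python. It is not the CUDA implementation behind the figures above. The function names, the geometric Brownian motion model, and all parameter values are assumptions chosen for illustration. Each simulated path draws one normal random number, so the RNG call sits in the innermost loop.

```python
import math
import random


def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call (reference value)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)

    def norm_cdf(x):
        # Standard normal CDF expressed through the error function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def mc_european_call(s0, k, r, sigma, t, n_paths, seed=0):
    """Monte Carlo price of a European call under geometric Brownian motion.

    Hypothetical helper for illustration: one normal draw per path.
    """
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    payoff_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)               # the RNG call in the hot loop
        s_t = s0 * math.exp(drift + vol * z)  # terminal asset price
        payoff_sum += max(s_t - k, 0.0)       # call payoff at expiry
    # Discount the average payoff back to today.
    return math.exp(-r * t) * payoff_sum / n_paths


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the slides.
    mc = mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, 200_000, seed=42)
    bs = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
    print(f"Monte Carlo: {mc:.4f}  Black-Scholes: {bs:.4f}")
```

The paths are independent of one another, which is what makes this workload a natural fit for a massively parallel processor: each GPU thread can simulate its own paths given an independent random stream.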
Figure labels: GPU, CPU
The Right Processor for the Right Tasks
Co-Processing
The Performance Gap Widens Further
8x double precision
ECC
L1, L2 Caches
1 TF Single Precision
4GB Memory
Chart series: NVIDIA GPU, X86 CPU
3 billion transistors
Over 2× the cores (512 total)
8× the peak DP performance
ECC
L1 and L2 caches
~2× memory bandwidth (GDDR5)
Up to 1 Terabyte of GPU memory
Concurrent kernels
Hardware support for C++
Diagram labels: DRAM I/F, Host I/F, GigaThread, L2
Introducing the ‘Fermi’ Architecture
The Soul of a Supercomputer in the body of a GPU
NVIDIA Compute Products
4 Tesla GPUs
Data Center Product
1U Server Product
Board Level Products
1 Tesla GPU
Workstation Product
OEM Product
GPU Computing Applications
Momentum
CUDA C and OpenCL
Over 100,000,000 installed CUDA-Architecture GPUs
Over 60,000 GPU Computing Developers (1/09)
Windows, Linux and MacOS Platforms supported
GPU Computing spans Consumer applications to HPC
200+ Universities teaching the CUDA Architecture and GPU Computing
C with CUDA Extensions: Over 60,000 developers; running in production since 2008; SDK + Libs + Visual Profiler and Debugger
OpenCL: 1st GPU demo; shipped 1st OpenCL driver; strategic developers using NV SW today
DirectX Compute: Microsoft's GPU Computing API; supports all CUDA-Architecture GPUs since G80 (DX10 and future DX11 GPUs)
FORTRAN: SW supplied by The Portland Group, NCSA release
Python, Java, …: Compute kernels; driver API bindings
NVIDIA GPU with the CUDA Parallel Computing Architecture
NVIDIA Nexus
Nexus is a GPU application development suite that integrates directly into Visual Studio.
A C/CUDA source debugger for both the CUDA runtime and driver API
New C/CUDA performance analysis/trace tools
FSI CUSTOMER DEPLOYMENTS
Case Study: Equity Derivatives
            2 Tesla S1070    500 CPU Cores
Power       2.8 kW           37.5 kW
Cost        $24 K            $250 K
Speed       15               1
16x Less Space
13x Lower Power
10x Lower Cost
15x Faster
Source: BNP Paribas, March 4, 2009
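The headline ratios in this case study follow directly from the quoted power and cost figures. The short sketch below redoes that arithmetic. The variable names are illustrative, and the performance-per-watt figure is a derived number that does not appear on the slide; it assumes the 15x speedup was measured at the stated power draws.

```python
# Figures quoted in the Equity Derivatives case study.
gpu_power_kw, cpu_power_kw = 2.8, 37.5
gpu_cost_kusd, cpu_cost_kusd = 24.0, 250.0
speedup = 15.0

power_ratio = cpu_power_kw / gpu_power_kw    # about 13.4, quoted as "13x Lower Power"
cost_ratio = cpu_cost_kusd / gpu_cost_kusd   # about 10.4, quoted as "10x Lower Cost"

# Derived, not on the slide: throughput per watt relative to the CPU cluster,
# assuming the speedup holds at the stated power draws.
perf_per_watt_gain = speedup * power_ratio

print(f"power: {power_ratio:.1f}x, cost: {cost_ratio:.1f}x, "
      f"perf/watt: {perf_per_watt_gain:.0f}x")
```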
Case Study: Security Pricing
Source: Wall Street & Technology, September 24, 2009
            48 Tesla S1070   8000 CPU Cores
Run time    2 hours          16 hours
10x Less Space
8x Faster