GPUs: Engines for Future High-Performance Computing
John Owens
Assistant Professor, Electrical and Computer Engineering
Institute for Data Analysis and Visualization
University of California, Davis
GPUs as Compute Engines
10 years ago:
• Graphics done in software
5 years ago:
• Full graphics pipeline
Today:
• 40x geometry, 13x fill vs. 5 yrs ago
• Programmable!
Programmable, data parallel processing on every desktop
Enormous opportunity to change the way commodity computing is done!
The Rendering Pipeline
(Diagram: the four stages in sequence, with a "GPU" label.)
• Application: compute 3D geometry; make calls to graphics API
• Geometry: transform geometry from 3D to 2D
• Rasterization: generate fragments from 2D geometry
• Composite: combine fragments into image
The Programmable Rendering Pipeline
(Diagram: the four stages in sequence, with a "GPU" label.)
• Application: compute 3D geometry; make calls to graphics API
• Geometry (Vertex): transform geometry from 3D to 2D; vertex programs
• Rasterization (Fragment): generate fragments from 2D geometry; fragment programs
• Composite: combine fragments into image
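The stages above can be sketched in software. The following is a minimal illustration only, not how any GPU implements them: it assumes a trivial perspective divide as the "vertex program", a flat-color "fragment program", and a depth test as the composite step. All function names are hypothetical, and depth is interpolated linearly in screen space for simplicity.

```python
def vertex_program(v):
    """Geometry stage: project a 3D point (x, y, z) to 2D, keeping depth."""
    x, y, z = v
    return (x / z, y / z, z)


def edge(a, b, p):
    """Signed area term used for the point-in-triangle test."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(tri, width, height):
    """Rasterization stage: yield (px, py, depth) for covered pixel centers."""
    a, b, c = tri
    area = edge(a, b, c)
    if area == 0:
        return  # degenerate triangle produces no fragments
    for py in range(height):
        for px in range(width):
            p = (px + 0.5, py + 0.5)
            # Barycentric weights of p with respect to a, b, c.
            w0 = edge(b, c, p) / area
            w1 = edge(c, a, p) / area
            w2 = edge(a, b, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                yield px, py, w0 * a[2] + w1 * b[2] + w2 * c[2]


def fragment_program(attribute, depth):
    """Fragment stage: here simply returns the triangle's flat attribute."""
    return attribute


def render(triangles, width, height):
    """Run every triangle through the stages; composite with a depth test."""
    color = [[None] * width for _ in range(height)]
    zbuf = [[float("inf")] * width for _ in range(height)]
    for tri, attribute in triangles:
        projected = [vertex_program(v) for v in tri]
        for px, py, z in rasterize(projected, width, height):
            if z < zbuf[py][px]:  # composite: keep the nearest fragment
                zbuf[py][px] = z
                color[py][px] = fragment_program(attribute, z)
    return color


# "Application" stage: build the 3D geometry and submit it.
scene = [
    (((0, 0, 2), (16, 0, 2), (0, 16, 2)), "far"),
    (((0, 0, 1), (4, 0, 1), (0, 4, 1)), "near"),
]
image = render(scene, 4, 4)
```

In this toy scene the far triangle covers the whole 4x4 image and the near triangle overwrites it only where it is closer.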
NVIDIA GeForce 6800 3D Pipeline
[Block diagram. Labeled units: Vertex, Triangle Setup, Z-Cull, Shader Instruction Dispatch, Fragment, L2 Tex, Fragment Crossbar, Composite, and four Memory Partitions.]
Courtesy Nick Triantos, NVIDIA
Long-Term Trend: CPU vs. GPU
[Chart with series labeled CPU and GPU.]
Courtesy Naga Govindaraju
Recent GPU Performance Trends
[Chart: GFLOPS (32-bit FP multiplies per second), July 01 to Jan 04. Series: NVIDIA NV30, 35, 40; ATI R300, 360, 420; Pentium 4. Annotations: $294, $385, $335; 25 GB/s, 6 GB/s, 36 GB/s.]
Courtesy Pat Hanrahan/David Luebke
Why Are GPUs Fast?
Characteristics of computation permit efficient hardware implementations
• High amount of parallelism …
• … exploited by graphics hardware
• High latency tolerance and feed-forward dataflow …
• … allow very deep pipelines
• … allow optimization for bandwidth not latency
Simple control
• Restrictive programming model
Competition between vendors
What about programmability? Effect on performance? How hard to program?
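The point about deep pipelines and latency tolerance can be illustrated with a simple idealized model. This is a sketch under stated assumptions, not a description of any real GPU: one independent item enters the pipeline per cycle, there are no stalls, and the depth and item counts are made-up illustrative values.

```python
def pipeline_cycles(num_items, depth):
    """Cycles to push num_items independent items through a pipeline of
    `depth` stages, issuing one item per cycle with no stalls."""
    if num_items <= 0:
        return 0
    return depth + num_items - 1


def throughput(num_items, depth):
    """Average items completed per cycle in the idealized model above."""
    return num_items / pipeline_cycles(num_items, depth)


# Illustrative (assumed) numbers: a 200-stage pipeline.
single = throughput(1, 200)          # one item pays the full latency
many = throughput(1_000_000, 200)    # latency is amortized over many items
```

With a single item the pipeline depth dominates, while with many independent items the throughput approaches one item per cycle regardless of depth, which is why abundant parallelism makes long latencies tolerable.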
Programming a GPU for GP Programs
• Draw a screen-sized quad
• Run a SIMD program over each fragment
• “Gather” is permitted from texture memory
• Resulting buffer can be treated as texture on next pass
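The steps above can be mimicked on the CPU to show the shape of the computation. This is a minimal sketch, not graphics API code: the "texture" is a plain list of rows, `run_pass` stands in for drawing a screen-sized quad, and `blur_kernel` is a hypothetical fragment program chosen only for illustration. Clamp-to-edge addressing is an assumption.

```python
def run_pass(kernel, texture):
    """Stand-in for drawing a screen-sized quad: run `kernel` once per
    fragment (one per output pixel) and return the resulting buffer."""
    height, width = len(texture), len(texture[0])

    def gather(x, y):
        # Read from any texture location; out-of-range reads clamp to edge.
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        return texture[y][x]

    return [[kernel(gather, x, y) for x in range(width)]
            for y in range(height)]


def blur_kernel(gather, x, y):
    """Example fragment program: average a pixel with its four neighbors."""
    return (gather(x - 1, y) + gather(x + 1, y) +
            gather(x, y - 1) + gather(x, y + 1) + gather(x, y)) / 5.0


def multipass(kernel, texture, passes):
    """Feed each pass's output buffer back in as the next pass's texture."""
    for _ in range(passes):
        texture = run_pass(kernel, texture)
    return texture


source = [[0.0, 0.0, 0.0],
          [0.0, 5.0, 0.0],
          [0.0, 0.0, 0.0]]
one_pass = run_pass(blur_kernel, source)
```

Each fragment writes only its own output pixel but may read from anywhere in the input, which matches the gather-only restriction on the slide.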
GPU Programming is Hard
Must think in graphics metaphors
Requires parallel programming (CPU-GPU,
Stream programming model
• Treats GPU as streaming coprocessor
• Streams enforce data parallel computing
• Kernels encourage arithmetic intensity
• Streams and kernels explicitly specified
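The stream model in the bullets above can be sketched in plain Python. This is an illustration of the idea only and is not Brook syntax: `run_kernel` and `saxpy` are hypothetical names, streams are modeled as lists, and a kernel is a function applied independently to corresponding stream elements.

```python
def run_kernel(kernel, *streams):
    """Apply `kernel` independently to each set of corresponding elements.

    Because the kernel sees only its own elements, every invocation could
    run in parallel; this loop is just a sequential stand-in.
    """
    if len({len(s) for s in streams}) > 1:
        raise ValueError("all input streams must have the same length")
    return [kernel(*elements) for elements in zip(*streams)]


def saxpy(a):
    """Build a kernel computing a*x + y for one pair of stream elements."""
    def kernel(x, y):
        return a * x + y
    return kernel


xs = [1.0, 2.0, 3.0]
ys = [10.0, 20.0, 30.0]
result = run_kernel(saxpy(2.0), xs, ys)
# Kernels compose: the output stream of one is the input stream of the next.
chained = run_kernel(saxpy(0.5), result, ys)
```

Keeping the streams and the kernel as separate, explicit objects is what lets a runtime decide how to schedule the work on a coprocessor.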
C with stream extensions
Open-source: www.sf.net/projects/brook/
Ian Buck et al., “Brook for GPUs: Stream Computing on Graphics Hardware”, Siggraph 2004
• Major vendors support PCI-E cards now
• Multiple GPUs supported per CPU - opportunity!
• Cheap and upgradable
GPUs lack bandwidth to the host, so we won’t use it!
No one uses host bandwidth, so we won’t optimize it!
Architecture: Increase features and performance without sacrificing core mission
Craig Lund: Mercury Computer Systems
Jeremy Kepner: Lincoln Labs
Nick Triantos, Craig Kolb: NVIDIA
Mark Segal: ATI
Kari Pulli: Nokia
Aaron Lefohn: UC Davis
Ian Buck: Stanford
Funding: DOE Office of Science, Los Alamos National Laboratory, ChevronTexaco, UC MICRO, UC Davis
For more information …
GPGPU home: http://www.gpgpu.org/
• Mark Harris, UNC/NVIDIA