May 8, 2007 Farid Harhad and Alaa Shams CS7080 Overview of the GPU Architecture CS7080 Final Class Project Supervised by: Dr. Elias Khalaf By: Farid Harhad & Alaa Shams

Overview of the GPU Architecture

CS7080 Final Class Project
Supervised by: Dr. Elias Khalaf

By: Farid Harhad & Alaa Shams


Outline

• Introduction

• GPU Architecture

• GPU programming
– GPU programming model
– Toolkit and language

• Sample Code

• Conclusion


Introduction

• The GPU on commodity video cards has evolved into an extremely flexible and powerful processor.

• GPUs are fast:
– 3.0 GHz Pentium 4: 6 GFLOPs, 6 GB/s peak
– 3.0 GHz dual-core Pentium 4: 24.6 GFLOPs
– GeForce 6800: 53 GFLOPs, 34 GB/s peak
– GeForce 7800: 165 GFLOPs
– 1066 MHz FSB Pentium Extreme Edition: 8.5 GB/s
– ATI Radeon X850 XT Platinum Edition: 37.8 GB/s

• GPUs are getting faster and faster:
– CPUs: ~1.5x annual growth, ~60x per decade
– GPUs: ~2.3x annual growth, more than 1000x per decade
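The per-decade figures come from compounding the annual rate over ten years. The short Python check below is only an illustrative sketch (the function name `decade_growth` is ours, not from the slides). It suggests that 1.5x per year compounds to roughly 58x, that 2.3x per year compounds to well over 1000x, and that an annual rate of about 2.0x is what gives almost exactly 1000x.

```python
def decade_growth(annual_rate: float, years: int = 10) -> float:
    """Compound an annual growth factor over a number of years."""
    return annual_rate ** years


if __name__ == "__main__":
    # The rates 1.5 and 2.3 are the ones quoted on the slide;
    # 2.0 is added here only as a reference point.
    for label, rate in [("CPU", 1.5), ("GPU", 2.3), ("reference", 2.0)]:
        print(f"{label}: {rate}x per year -> {decade_growth(rate):.0f}x per decade")
```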


Computational power


Why are GPUs getting faster so fast?

• Arithmetic intensity: the specialized nature of GPUs makes it easier to spend additional transistors on computation rather than on cache

• Economics: the multi-billion-dollar video game market is a pressure cooker that drives innovation


Flexible and Precise

• Modern GPUs are deeply programmable
– Programmable pixel, vertex, video engines
– Solidifying high-level language support

• Modern GPUs support high precision
– 32-bit floating point throughout the pipeline
– High enough for many (not all) applications


The Potential of GPU

• In short:
– The power and flexibility of GPUs make them an attractive platform for general-purpose computation
– Example applications range from in-game physics simulation to conventional computational science
– Goal: make the inexpensive power of the GPU available to developers as a sort of computational coprocessor


The Problem: Difficult To Use

• GPUs are designed for and driven by video games
– Programming model is unusual
– Programming idioms are tied to computer graphics
– Programming environment is tightly constrained

• Underlying architectures are:
– Inherently parallel
– Rapidly evolving (even in basic feature set!)
– Largely secret

• Can’t simply “port” CPU code!


GPU Architecture: Graphics Pipeline


Modern Graphics Pipeline


Transform

• Vertex processor (multiple in parallel)
– Transform from “world space” to “image space”
– Compute per-vertex lighting
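As a rough illustration of what a vertex processor does, the Python sketch below transforms one vertex with a 4x4 matrix, applies the perspective divide, maps the result to pixel coordinates, and computes a simple diffuse lighting term. All names here (`mat_vec`, `to_image_space`, `diffuse`, `IDENTITY`) are hypothetical, and the Lambertian lighting model is an assumption chosen for brevity; the slides do not specify a particular one.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def to_image_space(vertex, mvp, width, height):
    """Project a world-space (x, y, z) vertex to pixel coordinates."""
    x, y, z, w = mat_vec(mvp, [vertex[0], vertex[1], vertex[2], 1.0])
    ndc_x, ndc_y = x / w, y / w  # perspective divide, giving values in [-1, 1]
    return ((ndc_x + 1.0) * 0.5 * width, (ndc_y + 1.0) * 0.5 * height)


def diffuse(normal, light_dir):
    """Lambertian per-vertex lighting: max(N . L, 0) for unit vectors."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))


# Identity stands in for a real model-view-projection matrix.
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```

Each vertex is processed independently of the others, which is why the hardware can run several vertex processors in parallel.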


Rasterizer

– Convert geometric representation (vertices) to image representation (fragments)
– Fragment = image fragment: a pixel plus associated data (color, depth, stencil, etc.)
– Interpolate per-vertex quantities across pixels
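The interpolation step can be sketched with barycentric weights. The Python below is a minimal software illustration, not how any particular GPU implements it: it visits every pixel centre, keeps those inside the triangle, and emits a fragment whose value is the weighted blend of three per-vertex values. The function names `edge` and `rasterize` are ours.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p); the sign gives the side of p."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(v0, v1, v2, c0, c1, c2, width, height):
    """Yield (x, y, value) fragments for a triangle with per-vertex values c0..c2."""
    area = edge(v0, v1, v2)
    if area == 0:
        return  # degenerate triangle covers no pixels
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel centre
            # Barycentric weights; dividing by the signed area makes the
            # inside test independent of the triangle's winding order.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                yield x, y, w0 * c0 + w1 * c1 + w2 * c2
```

For example, `list(rasterize((0, 0), (4, 0), (0, 4), 0.0, 1.0, 0.0, 4, 4))` produces one fragment per covered pixel, with values that increase towards the second vertex.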


Shade

• Fragment processors (multiple in parallel)
– Compute a color for each pixel
– Optionally read colors from textures (images)
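A fragment program can be pictured as a small function run once per fragment. The sketch below assumes RGB tuples in [0, 1], nearest-neighbour texture lookup, and a simple multiply to combine the interpolated color with the texel; those choices, and the names `sample_nearest` and `shade`, are ours for illustration only.

```python
def sample_nearest(texture, u, v):
    """Nearest-neighbour lookup; texture is a list of rows, (u, v) in [0, 1]."""
    height, width = len(texture), len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]


def shade(fragment_color, uv=None, texture=None):
    """Return the final color for one fragment."""
    if texture is None or uv is None:
        return fragment_color  # untextured: pass the interpolated color through
    texel = sample_nearest(texture, uv[0], uv[1])
    # Modulate the interpolated color by the texel, channel by channel.
    return tuple(c * t for c, t in zip(fragment_color, texel))
```

Because `shade` depends only on its own inputs, many fragments can be shaded at the same time.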


GPU programming


GPU Programming Model

• Useful analogies:
– Rasterization = kernel invocation
– Texture coordinates = computational domain
– Vertex coordinates = computational range

• Invoking computation amounts to drawing pixels:
– GPGPU invocation is commonly a full-screen quad

CPU | GPU
Stream / data array | Texture
Memory read | Texture sampling
Loop body / kernel / algorithm | Fragment program
Feedback: array write | Feedback: render to a texture
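The CPU-to-GPU analogy can be mimicked on the CPU. In the hedged Python sketch below, a 2D list stands in for a texture, `run_kernel` stands in for drawing a full-screen quad (one kernel call per output element), and `iterate` stands in for the render-to-texture feedback loop. The names and the example `blur` kernel are our own illustrations, not code from the presentation.

```python
def run_kernel(kernel, src):
    """Apply `kernel` once per output element, as if drawing a full-screen quad.

    `src` plays the role of the input texture; the returned grid plays the
    role of the render target. Calls are independent of one another, which
    is what would let fragment processors run them in parallel.
    """
    height, width = len(src), len(src[0])
    return [[kernel(src, x, y) for x in range(width)] for y in range(height)]


def blur(src, x, y):
    """Example kernel: average a texel with its 4 neighbours (clamped edges)."""
    h, w = len(src), len(src[0])

    def fetch(i, j):  # texture read with clamp-to-edge addressing
        return src[min(max(j, 0), h - 1)][min(max(i, 0), w - 1)]

    return (fetch(x, y) + fetch(x - 1, y) + fetch(x + 1, y)
            + fetch(x, y - 1) + fetch(x, y + 1)) / 5.0


def iterate(kernel, grid, passes):
    """Feedback loop: each pass's output becomes the next pass's input."""
    for _ in range(passes):
        grid = run_kernel(kernel, grid)
    return grid
```

Note that the kernel only reads from `src` and returns a single value for its own output position; it never writes to arbitrary locations, which mirrors the constraint a fragment program works under.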



Toolkits and Language

• High-level shading languages
– Cg: C for Graphics
– HLSL: The D3D Shading Language
– GLSL: The OpenGL Shading Language

• GPGPU languages
– Sh - University of Waterloo
– Brook - Stanford University

• CUDA SDK
– Includes a C compiler and many libraries


Sample Code


Conclusion

• GPUs provide the programmer with unparalleled flexibility and performance in a product line that spans the entire PC market.

• Utilizing the capabilities of the GPU allows programmers to develop new applications, either graphical or general-purpose, in a more efficient way.



Questions?

Thanks