Larrabee: A many-core GPU architecture.

A many-core GPU architecture. Price, performance, and evolution.

Dec 19, 2015

Larrabee: A many-core GPU architecture.

GPUs vs CPUs: Price, performance, and evolution.

Definitions

CPU (Central Processing Unit) – general-purpose processor able to execute computer programs.

GPU (Graphics Processing Unit) – dedicated graphics rendering device.

Price and Performance

The NVIDIA GeForce 6800 Ultra can reach 40 GFLOPS, whereas a 3 GHz Intel Pentium 4 reaches only 6 GFLOPS [1].

More impressive still, current cards such as the ATI Radeon HD 5870, AMD FireStream 9250, and NVIDIA GeForce 9800 are rated at roughly 1 to 3 TFLOPS.

Reasons for this include highly parallel vector processing, fast onboard memory, and a constrained pipeline that streams data without stalls.

Evolution

GPU performance has approximately doubled every 6 months since the mid-1990s.

CPU performance doubles every 18 months on average (a rate commonly associated with Moore’s law).
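To make the gap between the two doubling rates concrete, here is a minimal sketch that treats both as ideal exponential growth. The function name `growth_factor` and the chosen time spans are illustrative assumptions, not part of the original slides, and real hardware does not follow such a clean curve.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Performance multiplier after `years`, assuming performance
    doubles once every `doubling_months` months (idealized model)."""
    return 2.0 ** (years * 12.0 / doubling_months)


if __name__ == "__main__":
    for years in (1, 3, 5):
        gpu = growth_factor(years, 6)    # doubling every 6 months
        cpu = growth_factor(years, 18)   # doubling every 18 months
        print(f"{years} yr: GPU x{gpu:.0f}, CPU x{cpu:.1f}, gap x{gpu / cpu:.1f}")
```

Under this model, three years gives six GPU doublings against two CPU doublings, so the relative gap widens by a factor of 16 over that span.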

Current trends: How we use GPUs.

Alternative applications

GPUs are increasingly used for scientific computing with data-parallel algorithms. Examples include:

Clustering

A GPU cluster simulating the dispersion of airborne contaminants in New York City [1].

Image Stitching

Fast, seamless stitching and tone-mapping of gigapixel images (~1 hour on a notebook PC).

Molecular Dynamics

Evaluating the forces between atoms that do not share bonds (non-bonded forces).
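As a rough illustration of why this workload suits data-parallel hardware, the sketch below computes non-bonded forces for every pair of atoms. The slides do not name a force model, so the Lennard-Jones potential, the function name `lennard_jones_forces`, and the parameters `epsilon` and `sigma` are all assumptions made for this example. Each pair is computed independently of the others, which is the property a GPU would exploit.

```python
def lennard_jones_forces(positions, epsilon=1.0, sigma=1.0):
    """Net force on each atom from all other atoms (O(N^2) pair loop).

    Assumes the Lennard-Jones potential
        U(r) = 4 * epsilon * ((sigma / r)**12 - (sigma / r)**6)
    as a stand-in for a non-bonded interaction.
    """
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Vector from atom j to atom i, and its squared length.
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            sr6 = (sigma * sigma / r2) ** 3
            # -dU/dr divided by r, so that scale * d is the force vector on atom i.
            scale = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
            for k in range(3):
                forces[i][k] += scale * d[k]
                forces[j][k] -= scale * d[k]  # equal and opposite force on atom j
    return forces
```

This serial version is only meant to show the arithmetic; a GPU implementation would typically assign atoms or pairs to separate threads.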

Architecture: How it is built.

Key differences

TYPICAL GPU

Ordered sequence of rendering steps. Fixed hardware dedicated to each step.

LARRABEE

Runs most of its rendering pipeline in software on multiple general-purpose x86 cores.

This allows the rendering pipeline to be reconfigured dynamically, so steps can be skipped or given extra resources when required.
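The idea of a pipeline defined in software can be sketched as an ordinary list of functions that is assembled at run time. This is only an analogy: the stage names, `make_stage`, `build_pipeline`, and `run_pipeline` are hypothetical and do not correspond to Larrabee's actual renderer. The sketch shows skipping a step; allocating extra resources to a step is not modelled.

```python
def make_stage(name):
    """Create a placeholder stage that records its name when it runs."""
    def stage(frame):
        frame["log"].append(name)
        return frame
    return stage


vertex_stage = make_stage("vertex")
tessellation_stage = make_stage("tessellation")
rasterize_stage = make_stage("rasterize")
shade_stage = make_stage("shade")


def build_pipeline(use_tessellation):
    """Assemble the stage list; optional stages are simply left out."""
    stages = [vertex_stage]
    if use_tessellation:
        stages.append(tessellation_stage)
    stages += [rasterize_stage, shade_stage]
    return stages


def run_pipeline(stages, frame):
    """Pass the frame through each stage in order."""
    for stage in stages:
        frame = stage(frame)
    return frame


if __name__ == "__main__":
    print(run_pipeline(build_pipeline(False), {"log": []})["log"])
    print(run_pipeline(build_pipeline(True), {"log": []})["log"])
```

With fixed-function hardware the equivalent of `build_pipeline` is decided when the chip is designed; here it is decided per frame.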

Larrabee CPU Core

The Larrabee core is “derived” from the Pentium processor.

One scalar unit for single operations and one vector unit that applies an operation to multiple data elements at once.

32 KB L1 data and instruction caches.

256 KB L2 cache per core; the L2 caches are connected by a ring network.

Details

The 32 KB L1 cache is 4 times larger than the 8 KB L1 cache of the original Pentium.

The extra capacity is needed because each core supports four-way multithreading, keeping four thread contexts in hardware to reduce thread-switching overhead. (Not to be confused with simultaneous multithreading.)

The 256 KB L2 caches share a ring network. If a core cannot find data in its own L2 cache, it places a request on the ring network, and the data is eventually returned from another core's L2 cache or from memory.

Larrabee uses a rendering technique called binning: the screen is divided into regions (tiles), each polygon is assigned to the regions it overlaps, and each region is then rendered separately.
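The binning step can be sketched as follows. This is a simplified illustration, not Larrabee's implementation: the function name `bin_triangles`, the square tiles, and the use of each triangle's bounding box to decide which tiles it touches are assumptions made for the example. The bounding-box test is conservative, so a triangle may be listed in a tile it does not actually cover.

```python
import math


def bin_triangles(triangles, width, height, tile_size):
    """Map each tile (tx, ty) to the indices of triangles whose
    screen-space bounding box overlaps that tile."""
    tiles_x = (width + tile_size - 1) // tile_size   # tiles per row, rounded up
    tiles_y = (height + tile_size - 1) // tile_size  # tiles per column, rounded up
    bins = {}
    for index, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]
        # Convert the bounding box to tile coordinates and clamp to the screen.
        tx0 = max(0, math.floor(min(xs)) // tile_size)
        tx1 = min(tiles_x - 1, math.floor(max(xs)) // tile_size)
        ty0 = max(0, math.floor(min(ys)) // tile_size)
        ty1 = min(tiles_y - 1, math.floor(max(ys)) // tile_size)
        # Off-screen triangles produce empty ranges and land in no bin.
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins.setdefault((tx, ty), []).append(index)
    return bins
```

Once triangles are binned, each tile can be rendered independently, which lets separate cores work on separate tiles.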

Benefits of Larrabee

Game physics

Real-time ray tracing

Image and video processing

Physical simulation

Extended rendering capabilities

References

[1] Z. Fan, F. Qiu, A. Kaufman, S. Yoakum-Stover. 2004. GPU Cluster for High Performance Computing. ACM/IEEE Supercomputing Conference 2004, November 6-12, Pittsburgh, PA.

[2] L. Seiler et al. 2008. Larrabee: A Many-Core x86 Architecture for Visual Computing. ACM Transactions on Graphics, vol. 27, no. 3, Article 18, August 2008.