Demo of running CUDA programs on GPU and potential speed-up over CPU ITCS 6/8010 CUDA Programming, UNC-Charlotte, B. Wilkinson, Jan 10, 2011

Dec 20, 2015

Page 1

Demo of running CUDA programs on GPU and potential speed-up over CPU

ITCS 6/8010 CUDA Programming, UNC-Charlotte, B. Wilkinson, Jan 10, 2011

Page 2

Typical user interface (using a Windows PC)

Xclock running on client PC

Xclock running on coit-grid01.uncc.edu

Xclock running on coit-grid06.uncc.edu

Xterm running on client PC, logged onto coit-grid06.uncc.edu

WinSCP running on client PC connected to grid01.uncc.edu

To make sure all X servers are running

Page 3

Heat distribution problem (solving Laplace’s equation)

800 x 800 points with 2000 iterations

Speed-up = 21.2

(Not sufficiently converged)

Fireplace
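The slides do not show the solver itself. As a rough idea of what a sequential CPU baseline for this problem could look like, here is a minimal sketch that assumes plain Jacobi iteration on a square grid with fixed boundary values. The function names `jacobi_sweep` and `solve_heat` are hypothetical and are not taken from the course code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One Jacobi sweep on an n x n grid (assumed method, not from the slides):
   every interior point becomes the average of its four neighbours.
   Boundary points are never written, so they keep their fixed temperature. */
static void jacobi_sweep(const double *in, double *out, int n)
{
    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            out[i * n + j] = 0.25 * (in[(i - 1) * n + j] + in[(i + 1) * n + j] +
                                     in[i * n + j - 1] + in[i * n + j + 1]);
        }
    }
}

/* Hypothetical driver: runs a fixed number of sweeps and leaves the result
   in grid. Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int solve_heat(double *grid, int n, int iterations)
{
    assert(n >= 3 && iterations >= 0);

    size_t bytes = (size_t)n * (size_t)n * sizeof(double);
    double *scratch = malloc(bytes);
    if (scratch == NULL)
        return -1;
    memcpy(scratch, grid, bytes);      /* copy the fixed boundary values */

    double *cur = grid;
    double *next = scratch;
    for (int it = 0; it < iterations; it++) {
        jacobi_sweep(cur, next, n);
        double *tmp = cur;             /* swap the roles of the two buffers */
        cur = next;
        next = tmp;
    }

    if (cur != grid)                   /* odd sweep count: result is in scratch */
        memcpy(grid, cur, bytes);
    free(scratch);
    return 0;
}
```

Under these assumptions, a call such as `solve_heat(grid, 800, 2000)` would correspond to the 800 x 800, 2000-iteration run, with the hot boundary points (the fireplace) set in `grid` beforehand.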

Page 4

800 x 800 points with 50000 iterations

Different GPU block structure

Speed-up = 16.57

Fireplace
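The slides do not say which block structures were used. Purely as an illustration of what "block structure" means for a square grid of points, the sketch below computes how many thread blocks are needed along one dimension for a given block width. The helper name `blocks_needed` and the block width of 16 in the usage note are assumptions.

```c
#include <assert.h>

/* Hypothetical helper: number of thread blocks needed along one dimension
   so that every one of `points` grid points gets a thread, given
   `block_dim` threads per block in that dimension (ceiling division). */
int blocks_needed(int points, int block_dim)
{
    assert(points > 0 && block_dim > 0);
    return (points + block_dim - 1) / block_dim;
}
```

With an assumed 16 x 16 block, an 800 x 800 grid divides evenly into 50 x 50 blocks. A 200 x 200 grid needs 13 x 13 blocks, and the threads in the last row and column of blocks that fall outside the grid have to be masked off in the kernel.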

Page 5

200 x 200 points with 20000 iterations

Different GPU block structure

Speed-up = 3.9

Fireplace

Page 6

Potential speed-up

Speed-up factor = (Execution time on CPU) / (Execution time using GPU)

One can get one or two orders of magnitude speed-up just by using a single GPU!

But it will take care to achieve large speed-ups.

The algorithm used on the GPU may be different to that used on the CPU because of constraints on the GPU, so one should really compare the best sequential version on the CPU with the algorithm used on the GPU.
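The speed-up factor above can be computed directly from two measured times. The following is a minimal sketch, not the measurement code used for these demos: `speedup_factor` and `time_cpu_seconds` are hypothetical names, and the CPU side is timed with the standard C `clock()` function.

```c
#include <assert.h>
#include <time.h>

/* Speed-up factor as defined on the slide:
   execution time on CPU divided by execution time using GPU. */
double speedup_factor(double cpu_seconds, double gpu_seconds)
{
    assert(cpu_seconds >= 0.0 && gpu_seconds > 0.0);
    return cpu_seconds / gpu_seconds;
}

/* Hypothetical helper: processor time, in seconds, consumed by one call
   to work(arg), measured with clock() from <time.h>. */
double time_cpu_seconds(void (*work)(void *), void *arg)
{
    clock_t start = clock();
    work(arg);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

The GPU time would have to come from a separate measurement, for example CUDA events (`cudaEventRecord` and `cudaEventElapsedTime`), which report milliseconds and so need converting to seconds before the division. Whether host-to-device transfers are included in that time changes the reported speed-up, so it is worth stating which was measured.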

Page 7

N Body problem

Page 8

Page 9

Questions