HPCC Mid-Morning Break High Performance Computing on a GPU cluster Dirk Colbry, Ph.D. Research Specialist Institute for Cyber Enabled Discovery


Dec 16, 2015


Stephany Parker
Transcript
Page 1

HPCC Mid-Morning Break

High Performance Computing on a GPU cluster

Dirk Colbry, Ph.D.

Research Specialist

Institute for Cyber Enabled Discovery

Page 2

What is a GPU?

• Graphics Processing Unit
• Originally designed for video games
• Uses many processing cores to parallelize the math required for real-time game play.
• Early researchers wrote general programs that looked like graphics so they could run on the GPU.
• In 2006 Nvidia released the CUDA programming interface to allow users to easily write scalable general-purpose programs that run on the GPU (GPGPU).

Page 3

GPU vs CPU

Page 4

CPU and GPU working together

Page 5

Running on the GPU

• Program starts on the CPU
  - Copy data to GPU (slow-ish)
  - Run kernel threads on GPU (very fast)
  - Copy results back to CPU (slow-ish)

• There are a lot of clever ways to fully utilize both the GPU and CPU.
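
The copy, run, copy pattern above suggests a simple back-of-envelope model for deciding whether offloading is worthwhile. The sketch below is illustrative only: the function names and every number in the usage lines (link bandwidth, kernel time, CPU time) are assumptions chosen for the example, not measurements from any HPCC machine.

```python
def transfer_seconds(n_bytes, bandwidth_bytes_per_s):
    """Time to move n_bytes across the CPU<->GPU link at a given bandwidth."""
    return n_bytes / bandwidth_bytes_per_s


def gpu_total_seconds(bytes_in, bytes_out, kernel_s, bandwidth_bytes_per_s):
    """Copy input to the GPU, run the kernel, copy results back."""
    copy_in = transfer_seconds(bytes_in, bandwidth_bytes_per_s)
    copy_out = transfer_seconds(bytes_out, bandwidth_bytes_per_s)
    return copy_in + kernel_s + copy_out


def offload_pays_off(cpu_s, bytes_in, bytes_out, kernel_s, bandwidth_bytes_per_s):
    """True when the GPU path, including both copies, beats the CPU-only time."""
    gpu_s = gpu_total_seconds(bytes_in, bytes_out, kernel_s, bandwidth_bytes_per_s)
    return gpu_s < cpu_s


# Hypothetical example: 1 GB in, 1 GB out, over an assumed 4 GB/s link.
GB = 1e9
total = gpu_total_seconds(1 * GB, 1 * GB, kernel_s=0.1, bandwidth_bytes_per_s=4 * GB)
print(f"GPU path: {total:.2f} s")
print("Beats a 2.0 s CPU run:", offload_pays_off(2.0, GB, GB, 0.1, 4 * GB))
print("Beats a 0.5 s CPU run:", offload_pays_off(0.5, GB, GB, 0.1, 4 * GB))
```

With these assumed numbers the two copies take longer than the kernel itself, so the offload only wins when the CPU-only run is slow enough.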

Page 6

Pros and Cons

• Benefits
  - Lots of processing cores.
  - Works with the CPU as a co-processor
  - Very fast local memory bandwidth
  - Large online community of developers
• Drawbacks
  - Can be difficult to program.
  - Memory transfers between GPU and CPU are costly (time).
  - Cores typically run the same code.
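
The last drawback, that cores typically run the same code, is easier to see with a toy example. The sketch below is plain Python that runs serially on the CPU; it only imitates the programming model, in which one kernel function is executed by every thread and each thread uses its own index to pick its data. The names `saxpy_kernel` and `launch` are hypothetical helpers, not part of CUDA or pycuda.

```python
def saxpy_kernel(thread_idx, a, x, y, out):
    """Work done by ONE thread: out[i] = a * x[i] + y[i] for its own index i."""
    if thread_idx < len(out):  # bounds guard: surplus threads do nothing
        out[thread_idx] = a * x[thread_idx] + y[thread_idx]


def launch(kernel, n_threads, *args):
    """Serial stand-in for a GPU launch: every 'thread' runs the same kernel."""
    for thread_idx in range(n_threads):
        kernel(thread_idx, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

# Launch more threads than elements; the guard masks off the extras.
launch(saxpy_kernel, 8, 2.0, x, y, out)
print(out)
```

Every thread runs identical code and differs only in its index, which is why data-parallel problems map well to this model and heavily branching code does not.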

Page 7

gfx-000 Test hardware

• Single quad-core 2.4 GHz Intel processor.
• 8 GB of CPU RAM
• Three Nvidia GTX 280 video cards:
  - 1 GB of RAM per card
  - 240 CUDA processing cores per card
  - 1.3 GHz processor clock speed
• Total of 724 cores on a single machine (3 × 240 GPU cores + 4 CPU cores)

Page 8

Installed Software on gfx-000

• CUDA toolkit 2.2 and 2.3
  - For programming in C/C++ and Fortran
• cublas – CUDA version of the BLAS libraries
• cufft – CUDA version of the FFT libraries
• pycuda – Python CUDA interface
• Zephyr – molecular dynamics program optimized for GPUs

Page 9

Other Available Software

• OpenCL – C/C++ interface
• Jacket – Matlab GPU wrapper
• Lattice Boltzmann – PDE solver
• OpenVIDIA – machine vision
• Many, many others
• CUDA Zone
  - ~90 thousand CUDA developers.
  - Lots of software examples
  - Developer forums
  - Tutorials
• http://www.nvidia.com/object/cuda_home.html

Page 10

New GPU Cluster Buy-In

• Rack Units: 1U
• CPU: 2x Intel Xeon E5530 Quad-Core 2.40 GHz
• Memory: 18 GB of RAM
• Hard drive: 250 GB disk for OS and local scratch
• Network: Ethernet only (no InfiniBand support)
• GPU: Two Nvidia Tesla M1060 GPUs
• Support: Four-year, next-business-day hardware support
• Cost: $5,224

Page 11

Each Nvidia Tesla M1060

• Number of streaming processor cores: 240
• Frequency of processor cores: 1.3 GHz
• Single-precision peak floating point performance: 933 gigaflops
• Double-precision peak floating point performance: 78 gigaflops
• Dedicated memory: 4 GB GDDR3
• Memory speed: 800 MHz
• Memory interface: 512-bit
• Memory bandwidth: 102 GB/sec
• System interface: PCIe
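
Several of the figures on this slide can be cross-checked against each other. The sketch below relies on details that are not on the slide: a shader clock of 1.296 GHz (which the slide appears to round to 1.3 GHz), three single-precision operations per core per cycle, 30 double-precision units each counting a fused multiply-add as two operations, and double-data-rate memory. Treat all of these as assumptions about this generation of hardware, not as figures from the talk.

```python
CORES = 240                   # from the slide
SHADER_CLOCK_GHZ = 1.296      # assumption: slide rounds this to 1.3 GHz
SP_FLOPS_PER_CORE_CYCLE = 3   # assumption: multiply-add (2) plus multiply (1)
DP_UNITS = 30                 # assumption: one double-precision unit per 8 cores
DP_FLOPS_PER_UNIT_CYCLE = 2   # assumption: fused multiply-add counts as 2

MEM_CLOCK_MHZ = 800           # from the slide
MEM_BUS_BITS = 512            # from the slide
TRANSFERS_PER_CLOCK = 2       # assumption: double-data-rate memory

# Peak compute = units * clock * operations per unit per cycle.
sp_gflops = CORES * SHADER_CLOCK_GHZ * SP_FLOPS_PER_CORE_CYCLE
dp_gflops = DP_UNITS * SHADER_CLOCK_GHZ * DP_FLOPS_PER_UNIT_CYCLE

# Peak bandwidth = clock * transfers per clock * bus width in bytes.
bus_bytes = MEM_BUS_BITS / 8
bandwidth_gb_s = MEM_CLOCK_MHZ * 1e6 * TRANSFERS_PER_CLOCK * bus_bytes / 1e9

print(f"single precision: {sp_gflops:.0f} GFLOPS")
print(f"double precision: {dp_gflops:.0f} GFLOPS")
print(f"memory bandwidth: {bandwidth_gb_s:.0f} GB/s")
```

Under these assumptions the results round to the 933 gigaflops, 78 gigaflops, and 102 GB/sec quoted on the slide.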

Page 12

What are we buying

• 240 cores * 2 GPUs + 4 cores * 2 CPUs = 488 Cores / node

• 31 Nodes (minimum) * 488 Cores / node = 15,128 cores in our new cluster

• However, 20 of these nodes are dedicated buy-in nodes, so only 5,368 cores (11 nodes * 488 cores / node) will be available in the general cluster
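
The core counts on this slide can be reproduced with a few lines of arithmetic. This is only a restatement of the slide's numbers; the variable names are chosen here for illustration.

```python
GPU_CORES = 240        # streaming processor cores per Tesla M1060
GPUS_PER_NODE = 2
CPU_CORES = 4          # cores per quad-core Xeon E5530
CPUS_PER_NODE = 2
TOTAL_NODES = 31       # minimum number of nodes in the new cluster
BUY_IN_NODES = 20      # nodes dedicated to buy-in users

cores_per_node = GPU_CORES * GPUS_PER_NODE + CPU_CORES * CPUS_PER_NODE
total_cores = TOTAL_NODES * cores_per_node
general_cores = (TOTAL_NODES - BUY_IN_NODES) * cores_per_node

print("cores per node:", cores_per_node)
print("cores in cluster:", total_cores)
print("cores in general cluster:", general_cores)
```

Note that this count treats a GPU streaming processor core and a CPU core as equivalent, as the slide does.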