CUDA

By: Areeb Ahmed Khan

Topics:

• What is CUDA?

• What is FLOPS?

• CUDA C.

• Processing flow of CUDA.

• What is GPGPU?

• An Overview of Parallel Computing.

• Serial Computing VS Parallel Computing.

• World’s Most Powerful GPU.

What is CUDA?

CUDA stands for “Compute Unified Device Architecture”. It is a parallel computing platform and programming model developed by NVIDIA (a graphics card manufacturer), introduced in 2006, and implemented in its GPUs. CUDA gives developers access to the instruction set and memory of the parallel computation elements in GPUs.

CUDA-enabled GPUs can be used for general-purpose processing (GPGPU).

CUDA Platform:

The CUDA platform is accessible to programmers through several CUDA-accelerated libraries and through extensions to standard programming languages, such as:

• C extended to CUDA C.

• C++ extended to CUDA C++.

• Fortran extended to CUDA Fortran.

NVIDIA CUDA™ technology provides a C language environment that enables programmers and developers to write software that solves complex computational problems in a fraction of the time by tapping into the many-core parallel processing power of GPUs.

CUDA C:

The user writes C code and marks the functions that should run on the GPU (for example with __global__). The compiler (nvcc) then splits the code into two portions. The host portion runs on the CPU, which is best suited to sequential, control-heavy tasks, while the device portion, typically involving extensive floating-point calculations, runs on the GPU(s), which execute it in parallel. Because C is a familiar programming language, CUDA has a gentle learning curve, and it is becoming a favorite tool for accelerating various applications.

FLOPS:

FLOPS, an acronym for “FLoating-point Operations Per Second”, is a unit for measuring computer performance. It is useful in scientific fields that make heavy use of floating-point calculations; for such workloads it is a more accurate measure than generic instructions per second.
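As a rough illustration of this definition, the sketch below counts the floating-point operations a kernel performs and divides by the elapsed time measured with CUDA events. The kernel `saxpy`, the helper `measure_gflops`, and the problem size are hypothetical choices made for illustration. The number it reports is the throughput achieved by this one simple kernel, which is likely limited by memory bandwidth, and not the peak FLOPS of the GPU.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical example kernel: y[i] = a * x[i] + y[i].
// Each element costs 2 floating-point operations (one multiply, one add).
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = a * x[i] + y[i];
    }
}

// Runs saxpy once on n elements and returns the achieved GFLOPS,
// or a negative value if a CUDA call fails (for example, no GPU is present).
double measure_gflops(int n) {
    float *x = NULL, *y = NULL;
    size_t bytes = n * sizeof(float);
    if (cudaMalloc(&x, bytes) != cudaSuccess) return -1.0;
    if (cudaMalloc(&y, bytes) != cudaSuccess) { cudaFree(x); return -1.0; }
    cudaMemset(x, 0, bytes);
    cudaMemset(y, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    cudaEventRecord(start);
    saxpy<<<blocks, threads>>>(n, 2.0f, x, y);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until the kernel has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    bool ok = (cudaGetLastError() == cudaSuccess) && (ms > 0.0f);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(x);
    cudaFree(y);

    if (!ok) return -1.0;
    double flop = 2.0 * n;            // total floating-point operations
    double seconds = ms / 1000.0;     // events report milliseconds
    return flop / seconds / 1e9;      // operations per second, in GFLOPS
}

int main(void) {
    double gflops = measure_gflops(1 << 24);
    if (gflops < 0.0) {
        printf("Could not run the measurement (no usable CUDA device?)\n");
        return 1;
    }
    printf("Achieved throughput: %.2f GFLOPS\n", gflops);
    return 0;
}
```

A single timed launch also includes one-time start-up overhead, so repeated runs would give a steadier figure.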

CUDA C Sample Code:

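#include <stdio.h>
#include <cuda_runtime.h>

// Kernel: runs on the GPU and prints the product of its two arguments.
__global__ void kernel_function(int a, int b) {
    printf("The value is %d\n", a * b);
}

int main(void) {
    kernel_function<<<1, 1>>>(5, 2);  // launch 1 block containing 1 thread
    cudaDeviceSynchronize();          // wait for the kernel so its output is flushed
    return 0;
}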

Processing Flow of CUDA:

• The CPU copies data from main memory to GPU memory.

• The CPU instructs the GPU to start processing (it launches the kernel).

• The GPU executes the kernel in parallel across its cores.

• The CPU copies the result from GPU memory back to main memory.

http://upload.wikimedia.org/wikipedia/commons/5/59/CUDA_processing_flow_(En).PNG
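The four steps of the processing flow can be sketched with the CUDA runtime API as follows. The kernel `double_elements`, the wrapper `double_on_gpu`, and the array size are hypothetical names and values chosen for illustration; this is a minimal sketch with basic error checks, not a complete application.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define N 8

// Device code: each GPU thread doubles one array element.
__global__ void double_elements(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= 2;
    }
}

// Doubles every element of host[0..n-1] on the GPU.
// Assumes n fits in one thread block (at most 1024 threads on current GPUs).
// Returns 0 on success, 1 if any CUDA call fails.
int double_on_gpu(int *host, int n) {
    int *dev = NULL;
    size_t bytes = n * sizeof(int);
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return 1;

    // Step 1: copy the input from main memory to GPU memory.
    cudaError_t err = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);

    // Steps 2 and 3: the CPU launches the kernel; the GPU runs it in parallel.
    if (err == cudaSuccess) {
        double_elements<<<1, n>>>(dev, n);
        err = cudaGetLastError();
    }

    // Step 4: copy the result back to main memory.
    // This cudaMemcpy waits for the kernel to finish before copying.
    if (err == cudaSuccess) {
        err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(dev);
    return err == cudaSuccess ? 0 : 1;
}

int main(void) {
    int host[N];
    for (int i = 0; i < N; ++i) host[i] = i;

    if (double_on_gpu(host, N) != 0) {
        printf("A CUDA call failed (no usable CUDA device?)\n");
        return 1;
    }
    for (int i = 0; i < N; ++i) printf("%d ", host[i]);
    printf("\n");
    return 0;
}
```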

What is GPGPU?

GPGPU stands for “General-Purpose computing on Graphics Processing Units”. It describes the use of a GPU for computations that are traditionally handled by the Central Processing Unit (CPU). A CPU has a small number of cores working in parallel, whereas a GPU has hundreds or even thousands of cores working in parallel, which can be very useful when applied to general computations. It is also called GPU-accelerated computing.
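To get a feel for how much parallel hardware a particular GPU offers, the runtime API can be queried as in the sketch below. The helper name `count_cuda_devices` is hypothetical. Note that the runtime reports the number of multiprocessors, not individual cores; the number of cores per multiprocessor depends on the GPU architecture and is not included here.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Returns the number of CUDA-capable devices, or 0 if the query fails
// (for example, when no driver or GPU is installed).
int count_cuda_devices(void) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        return 0;
    }
    return count;
}

int main(void) {
    int count = count_cuda_devices();
    printf("CUDA devices found: %d\n", count);

    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, d) != cudaSuccess) continue;
        printf("Device %d: %s\n", d, prop.name);
        printf("  Multiprocessors:       %d\n", prop.multiProcessorCount);
        printf("  Max threads per block: %d\n", prop.maxThreadsPerBlock);
    }
    return 0;
}
```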

Parallel Computing:

It is a form of computation in which many calculations are carried out simultaneously, so that the overall computation finishes sooner.

All GPUs are built around parallel computing, whereas CPUs achieve parallelism through multiple cores, which can handle several processes simultaneously for a more time-efficient solution.

Serial Computing VS Parallel Computing

Serial Computing:

• The program runs on a single computer with a single Central Processing Unit (CPU).

• A problem is broken into a discrete series of instructions.

• Instructions are executed one after another.

• Only one instruction may execute at any moment in time.

Parallel Computing:

• The problem is broken into discrete parts that can be solved concurrently.

• Each part is further broken down into a series of instructions.

• Instructions from each part execute simultaneously on different processors.

• An overall control/coordination mechanism is employed.
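The contrast between the two models can be sketched with vector addition. In the serial version one CPU thread walks through the elements one after another; in the parallel version each GPU thread handles one element, and the host function plays the role of the control/coordination mechanism. The names `add_serial`, `add_kernel`, and `add_parallel` are hypothetical, and the block size of 256 threads is an assumed, commonly used value.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Serial: a single CPU thread processes the elements one after another.
void add_serial(const float *a, const float *b, float *c, int n) {
    for (int i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}

// Parallel: each GPU thread computes exactly one element.
__global__ void add_kernel(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Host-side coordination: allocate, copy in, launch, copy out, free.
// Returns 0 on success, 1 if any CUDA call fails.
int add_parallel(const float *a, const float *b, float *c, int n) {
    float *da = NULL, *db = NULL, *dc = NULL;
    size_t bytes = n * sizeof(float);

    cudaError_t err = cudaMalloc(&da, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&db, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&dc, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(da, a, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(db, b, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;                         // assumed block size
        int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
        add_kernel<<<blocks, threads>>>(da, db, dc, n);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) err = cudaMemcpy(c, dc, bytes, cudaMemcpyDeviceToHost);

    // cudaFree on a null pointer performs no operation, so this is safe
    // even when an earlier allocation failed.
    cudaFree(da);
    cudaFree(db);
    cudaFree(dc);
    return err == cudaSuccess ? 0 : 1;
}

int main(void) {
    const int n = 1000;
    static float a[n], b[n], c_serial[n], c_parallel[n];
    for (int i = 0; i < n; ++i) { a[i] = (float)i; b[i] = 2.0f * i; }

    add_serial(a, b, c_serial, n);
    if (add_parallel(a, b, c_parallel, n) != 0) {
        printf("A CUDA call failed (no usable CUDA device?)\n");
        return 1;
    }

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (c_serial[i] != c_parallel[i]) ++mismatches;
    }
    printf("Mismatches between serial and parallel results: %d\n", mismatches);
    return 0;
}
```

For an array this small the copies to and from the GPU are likely to cost more than the addition itself; the parallel version tends to pay off only for much larger problems or heavier per-element work.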

World’s Most Powerful GPU

THANK YOU !
