NVIDIA Visual Profiler & CUDA-MEMCHECK
NVIDIA Visual Profiler - uni-graz.at · NVIDIA Visual Profiler & CUDA-MEMCHECK . Visual Profiler – Overview •Included in CUDA Toolkit •Visualize and optimize performance of

Aug 12, 2020

dariahiddleston
Visual Profiler – Overview

• Included in CUDA Toolkit

• Visualize and optimize performance of a CUDA application

• Shows timeline on CPU and GPU

• nvvp (GUI)

• nvprof (Terminal)

• Two session types:

– Executable session

– Imported session (imports data generated by nvprof)

• Generates a PDF report
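
The imported-session workflow can be sketched with a small program that produces both kernel and memcpy activity. This is only an illustration: the file name `saxpy.cu`, the output name `saxpy.nvvp`, and the helper `run_saxpy` are assumptions, not part of the original slides. The build and profiling commands are given in the leading comments.

```cuda
// saxpy.cu (hypothetical file name): a small program to profile.
// Build:                        nvcc -lineinfo -o saxpy saxpy.cu
// Record an importable profile: nvprof -o saxpy.nvvp ./saxpy
// The resulting file can then be opened in nvvp as an imported session.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void saxpy(int n, float a, const float* x, float* y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Computes y = a*x + y on the GPU (x and y must have the same, non-zero size).
// Returns false if allocation or the final copy fails.
bool run_saxpy(float a, const std::vector<float>& x, std::vector<float>& y)
{
    int n = static_cast<int>(x.size());
    size_t bytes = n * sizeof(float);
    float *dx = nullptr, *dy = nullptr;
    if (cudaMalloc(&dx, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&dy, bytes) != cudaSuccess) { cudaFree(dx); return false; }

    // Two host-to-device copies, one kernel, one device-to-host copy:
    // each of these is a separate piece of GPU activity for the profiler.
    cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y.data(), bytes, cudaMemcpyHostToDevice);
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, a, dx, dy);
    cudaError_t err = cudaMemcpy(y.data(), dy, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dx);
    cudaFree(dy);
    return err == cudaSuccess;
}

int main()
{
    std::vector<float> x(1 << 20, 1.0f), y(1 << 20, 2.0f);
    bool ok = run_saxpy(3.0f, x, y);
    printf("ok=%d y[0]=%.1f\n", ok, y[0]);
    cudaDeviceReset();  // gives the profiler a chance to flush its data before exit
    return ok ? 0 : 1;
}
```

Running the same binary directly from nvvp would correspond to an executable session instead.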

Getting started

Timeline View

• CPU activity

• GPU activity

• Shows start & end of

– Threads

– Kernels

– Memcpy

– …

• Zoom, filter, reorder, …

Analysis View

• Guided or unguided

– For unguided analysis, compile with SET(LOCAL_CUDA_NVCC_FLAGS ${LOCAL_CUDA_NVCC_FLAGS} -lineinfo)

• CUDA Application Analysis

– Application's overall GPU utilization

– Kernel performance (ranks kernels by optimization importance, based on execution time and achieved occupancy)

• Performance-Critical Kernels

– Detailed analysis of a selected kernel

• Compute, Bandwidth, or Latency Bound

• Instruction and memory latency

– Examine occupancy

The number of warps the kernel has active on the GPU, relative to the maximum number of warps the GPU supports

– Examine stall reasons

Can give insight into why latency is still an issue for the kernel
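
The occupancy definition above can be worked through as a ratio of warps. The sketch below is an illustration rather than what the profiler does internally: the kernel `scale` and the helper `occupancy_ratio` are hypothetical names, and the value computed is the theoretical occupancy from the runtime's occupancy API, which can differ from the achieved occupancy the profiler measures.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel (hypothetical), used only as the subject of the query.
__global__ void scale(float* data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Occupancy = active warps per multiprocessor / maximum warps per multiprocessor.
double occupancy_ratio(int activeBlocksPerSM, int blockSize,
                       int warpSize, int maxThreadsPerSM)
{
    int warpsPerBlock = (blockSize + warpSize - 1) / warpSize;  // round up
    int activeWarps   = activeBlocksPerSM * warpsPerBlock;
    int maxWarps      = maxThreadsPerSM / warpSize;
    return static_cast<double>(activeWarps) / maxWarps;
}

int main()
{
    const int blockSize = 256;  // assumed launch configuration
    int activeBlocks = 0;
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;

    // Ask the runtime how many blocks of `scale` fit on one multiprocessor
    // for this block size and no dynamic shared memory.
    if (cudaOccupancyMaxActiveBlocksPerMultiprocessor(
            &activeBlocks, scale, blockSize, 0) != cudaSuccess) return 1;

    double occ = occupancy_ratio(activeBlocks, blockSize,
                                 prop.warpSize, prop.maxThreadsPerMultiProcessor);
    printf("theoretical occupancy: %.2f\n", occ);
    return 0;
}
```

As a worked case, 4 resident blocks of 256 threads give 4 × 8 = 32 active warps; on a device allowing 2048 threads (64 warps) per multiprocessor, that is an occupancy of 0.5.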

• Compute resources

GPU compute resources can limit the performance of a kernel if they are insufficient or poorly utilized

CUDA-MEMCHECK

• Detects memory access errors

• Run-time error detection

• Included in CUDA Toolkit

• Getting started:

– cuda-memcheck [options] executable [executable options]

best case:

Supported error detection

• Memory access error: errors due to out-of-bounds or misaligned accesses to memory by a global, local, shared, or global atomic access

• Hardware exception: errors reported by the hardware error reporting mechanism

• Malloc/Free errors: errors due to incorrect use of malloc or free

• CUDA API errors: failure of a CUDA API call

• cudaMalloc memory leaks: allocations of device memory which have not been freed

• Device heap memory leaks: allocations of device memory in device code which have not been freed
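
To make two of these categories concrete, the following sketch shows a correct program and marks, in comments, where a memory access error or a cudaMalloc leak would arise if a line were removed. The kernel `fill`, the helper `fill_and_count`, and the binary name `./fill` are hypothetical. The code as written is intended to run cleanly under the tool.

```cuda
// Run under the tool (assumed binary name): cuda-memcheck --leak-check full ./fill
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void fill(int* out, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Without this guard, threads with i >= n would write past the end of
    // `out`: an out-of-bounds global write, i.e. a memory access error.
    if (i < n) out[i] = value;
}

// Fills n ints (n > 0) on the device and returns how many came back equal
// to `value`, or -1 if a CUDA call failed.
int fill_and_count(int n, int value)
{
    int* d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) return -1;

    int threads = 128;
    int blocks  = (n + threads - 1) / threads;  // rounds up: last block has spare threads
    fill<<<blocks, threads>>>(d_out, n, value);

    std::vector<int> host(n);
    cudaError_t err = cudaMemcpy(host.data(), d_out, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);

    // Omitting this cudaFree would leave the allocation unfreed, which is
    // what the cudaMalloc leak check is meant to report.
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int count = 0;
    for (int v : host) if (v == value) ++count;
    return count;
}

int main()
{
    int count = fill_and_count(1000, 7);
    printf("matching elements: %d\n", count);
    cudaDeviceReset();  // destroys the context so leak checking can summarize
    return count == 1000 ? 0 : 1;
}
```

With n = 1000 and 128 threads per block, 8 blocks (1024 threads) are launched, so 24 threads rely on the bounds guard.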

Example

• __global__ : for device global memory

• __shared__ : for per-block shared memory

• __local__ : for per-thread local memory

• Information about type of access (read / write)

• Size of access in bytes

• Source file and line number

• Thread indices and block indices

• Memory address being accessed and type of access error