August, 2019 ACCELERATING THE DATACENTER
ACCELERATING THE DATACENTER - NVIDIA · NEW! NVIDIA Confidential. 10 Machine Learning Virtual Graphics Deep Learning ... T4 GPUs Containers NGC Ready Support NVIDIA Confidential CISCO

Jun 03, 2020

dariahiddleston
Transcript
Page 1

August, 2019

ACCELERATING THE DATACENTER

Page 2

BEYOND MOORE’S LAW

Progress Of Stack In 6 Years

[Chart: Relative Performance (log scale, 1 to 1000) over time, March 2012 to March 2019. Labels: GPU-Accelerated Computing, CPU, Moore’s Law.]

2013: Accelerated Server With Fermi

Base OS: CentOS 6.2
Resource Mgr: r304
CUDA: 5.0
Thrust: 1.5.3
NPP: 5.0
cuSPARSE: 5.0
cuRAND: 5.0
cuFFT: 5.0
cuBLAS: 5.0

2019: Accelerated Server with Volta

Base OS: Ubuntu 16.04
Resource Mgr: r384
CUDA: 10.0
Thrust: 1.9.0
NPP: 10.0
cuSPARSE: 10.0
cuSOLVER: 10.0
cuRAND: 10.0
cuFFT: 10.0
cuBLAS: 10.0

Page 3

AI, MACHINE LEARNING, AND DEEP LEARNING

Page 4

THE BIG PROBLEM IN DATA SCIENCE

All Data

ETL

Manage Data

Structured

Data Store

Data Preparation

Training

Model Training

Visualization

Evaluate

Inference

Deploy

Slow Training Times for Data Scientists

Page 5

DATA SCIENCE IS THE KEY TO MODERN BUSINESS

Use Cases in Every Industry

CONSUMER INTERNET

Ad Personalization
Click Through Rate Optimization
Churn Reduction

FINANCIAL SERVICES

Claim Fraud
Customer Service Chatbots/Routing
Risk Evaluation

MANUFACTURING

Remaining Useful Life Estimation
Failure Prediction
Demand Forecasting

TELECOM

Detect Network/Security Anomalies
Forecasting Network Performance
Network Resource Optimization (SON)

RETAIL

Supply Chain & Inventory Management
Price Management / Markdown Optimization
Promotion Prioritization And Ad Targeting

AUTOMOTIVE

Personalization & Intelligent Customer Interactions
Connected Vehicle Predictive Maintenance
Forecasting, Demand, & Capacity Planning

OIL & GAS

Sensor Data Tag Mapping
Anomaly Detection
Robust Fault Prediction

HEALTHCARE

Improve Clinical Care
Drive Operational Efficiency
Speed Up Drug Discovery

Page 6

HOW GPU ACCELERATION WORKS

Application Code

GPU: Compute-Intensive Functions (5% of Code)

+

CPU: Rest of Sequential CPU Code
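The split above says where the code runs, not how much faster the whole application gets. A minimal sketch using Amdahl's law gives a rough estimate. The slide gives the share of code (5%) but not the share of runtime those functions account for, so the runtime fractions and the per-kernel GPU speedup below are hypothetical inputs, not figures from the deck.

```python
def overall_speedup(accelerated_fraction: float, gpu_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when only part of the runtime is accelerated.

    accelerated_fraction: share of the original runtime spent in the offloaded
        compute-intensive functions (hypothetical; not given on the slide).
    gpu_speedup: how much faster those functions run on the GPU (hypothetical).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if gpu_speedup <= 0.0:
        raise ValueError("gpu_speedup must be positive")
    sequential_fraction = 1.0 - accelerated_fraction  # stays on the CPU
    return 1.0 / (sequential_fraction + accelerated_fraction / gpu_speedup)


if __name__ == "__main__":
    # Illustrative inputs only: the same 20x kernel speedup yields very
    # different application speedups depending on the runtime share offloaded.
    for fraction in (0.50, 0.90, 0.99):
        print(f"runtime share {fraction:.2f}: {overall_speedup(fraction, 20.0):.2f}x overall")
```

The sketch shows why the offloaded functions have to dominate the runtime: the sequential CPU portion caps the overall gain no matter how fast the GPU portion becomes.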

Page 7

NVIDIA BREAKS RECORDS IN AI PERFORMANCE

MLPerf Records Both At Scale And Per Accelerator

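Record Type | Benchmark | Record
Max Scale (Minutes To Train) | Object Detection (Heavy Weight) Mask R-CNN | 18.47 Mins
Max Scale (Minutes To Train) | Translation (Recurrent) GNMT | 1.8 Mins
Max Scale (Minutes To Train) | Reinforcement Learning (MiniGo) | 13.57 Mins
Per Accelerator (Hours To Train) | Object Detection (Heavy Weight) Mask R-CNN | 25.39 Hrs
Per Accelerator (Hours To Train) | Object Detection (Light Weight) SSD | 3.04 Hrs
Per Accelerator (Hours To Train) | Translation (Recurrent) GNMT | 2.63 Hrs
Per Accelerator (Hours To Train) | Translation (Non-recurrent) Transformer | 2.61 Hrs
Per Accelerator (Hours To Train) | Reinforcement Learning (MiniGo) | 3.65 Hrs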

Per Accelerator comparison using reported performance for MLPerf 0.6 NVIDIA DGX-2H (16 V100s) compared to other submissions at same scale except for MiniGo where NVIDIA DGX-1 (8 V100s) submission was used | MLPerf ID Max Scale: Mask R-CNN: 0.6-23, GNMT: 0.6-26, MiniGo: 0.6-11 | MLPerf ID Per Accelerator: Mask R-CNN, SSD, GNMT, Transformer: all use 0.6-20, MiniGo: 0.6-10

Page 8

4X MORE PERFORMANCE, SAME SERVER

Rapid Software Innovation Delivers Continuous Improvements

[Chart: Relative Speedup (0x to 5x) for Image Classification (RN50), comparing DGX-1 At Launch (2017) with DGX-1 MLPerf 0.6 (2019).]

Comparing the performance of a single DGX-1 server at launch and MLPerf ID 0.6-8

4X Faster From 8 Hrs to 2 Hrs

Page 9

DGX REFERENCE ARCHITECTURE SOLUTIONS

Growing Ecosystem of IT-approved Solutions for AI infrastructure

Benefits:

• No more design guesswork

• Faster, simpler deployment

• Predictable performance at scale

• Simplified, single-point of support

NEW!

NVIDIA Confidential

Page 10

Machine Learning

Virtual Graphics

Deep Learning

IVA/HPC/Others

ACCELERATING MAINSTREAM BUSINESS SERVERS

Modern Enterprise Computing Platform

NGC Containers

ML/DA/DL

Direct Support from NVIDIA

T4 GPUs

Containers

NGC Ready

Support

NVIDIA Confidential

CISCO UCS C240 M5

Dell PowerEdge R740

HPE ProLiant DL380 Gen10

Lenovo ThinkSystem SR670

Page 11

Creating A Massive Market Opportunity

VAST WORLD OF AI INFERENCE

General Purpose Computers

Embedded Computers

Embedded Devices

Page 12

NVIDIA TensorRT 5 INFERENCE PLATFORM

Accelerates Throughput On Leading Industry Platforms

TensorRT: Optimizer, Runtime

Kernel Auto-Tuning: Optimal kernels selected by activation precision
Layer & Tensor Fusion
Dynamic Tensor Memory: Efficient usage by GPU
Precision Selection: FP32, FP16, INT32
Calibration: INT8

FRAMEWORKS

GPU PLATFORMS

Embedded: Jetson (JETSON TX2)
Automotive: Drive (DRIVE PX 2)
Data center: Tesla (TESLA V100, TESLA P4)
NVIDIA DLA
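The slide lists INT8 calibration without saying what calibration computes. As a rough illustration, the sketch below shows the simplest form of the idea: choose a scale from sample activations, then map FP32 values onto the signed 8-bit range. It uses a plain max-absolute-value rule in pure Python, which is an assumption made for clarity. It is not the TensorRT API and not necessarily the calibration method TensorRT applies. The function names and sample values are hypothetical.

```python
def calibrate_scale(samples):
    """Pick a symmetric scale so the largest observed magnitude maps to 127.

    Max-abs calibration is a simplifying assumption for this sketch.
    """
    max_abs = max(abs(x) for x in samples)
    return max_abs / 127.0 if max_abs > 0.0 else 1.0


def quantize(values, scale):
    """Map floats to integers in [-127, 127] using the calibrated scale."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    activations = [-2.54, -0.5, 0.0, 0.3, 1.0]  # hypothetical calibration data
    scale = calibrate_scale(activations)
    codes = quantize(activations, scale)
    restored = dequantize(codes, scale)
    for original, code, approx in zip(activations, codes, restored):
        print(f"{original:+.4f} -> {code:+4d} -> {approx:+.4f}")
```

Values inside the calibrated range come back with an error of at most half the scale. That trade of a small accuracy loss for 8-bit arithmetic is what calibration is meant to keep under control.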

Page 13

APPS & FRAMEWORKS

NVIDIA SDK & LIBRARIES

NVIDIA DATA CENTER PLATFORM

Single Platform Drives Utilization and Productivity

VIRTUAL GPU

CUDA & CORE LIBRARIES - cuBLAS | NCCL

DEEP LEARNING

cuDNN

HPC

cuFFT

OpenACC

+550 Applications

Amber

NAMD

CUSTOMER USE CASES

VIRTUAL GRAPHICS

Speech Translate Recommender

SCIENTIFIC APPLICATIONS

Molecular Simulations

Weather Forecasting

Seismic Mapping

CONSUMER INTERNET & INDUSTRY APPLICATIONS

Manufacturing, Healthcare, Finance

GPUs & SYSTEMS

SYSTEM OEM, CLOUD, TESLA GPU, NVIDIA HGX, NVIDIA DGX FAMILY

MACHINE LEARNING

cuML, cuDF, cuGRAPH, cuDNN, CUTLASS, TensorRT, vDWS, vPC

Creative & Technical

Knowledge Workers

vAPPS

DX/OGL

Page 14

THANK YOU