JEN-HSUN HUANG, CO-FOUNDER & CEO, GTC 2016 A NEW COMPUTING MODEL

GTC 2016 Opening Keynote

Jan 07, 2017


Josef Spjut
Transcript
Page 1: GTC 2016 Opening Keynote

JEN-HSUN HUANG, CO-FOUNDER & CEO, GTC 2016

A NEW COMPUTING MODEL

Page 2: GTC 2016 Opening Keynote

2

LEAPS IN ADOPTION

2X GTC Attendees | 4X CUDA Developers, 10X in Hyperscale + Auto | 2X Accelerated Systems, 96% of New Systems on NVIDIA

Academia | Games | Finance | Manufacturing | Internet | Oil & Gas | National Labs | Automotive | Defense | M&E

Auto | Internet | Gov't / Labs | Academia | M&E | Finance | Aerospace / Defense | Manufacturing | Oil & Gas | IT / HW / SW | Medical

[Charts: 2012 vs. 2016, labeled 4x and 300K; 2012 vs. 2016, labeled 2,350 and 5,500; # accelerated systems, Nov 2013 to Nov 2015, axis 0 to 120]

Page 3: GTC 2016 Opening Keynote

3

NVIDIA SDK

The Essential Resource for GPU Computing

developer.nvidia.com | Available Now

Page 4: GTC 2016 Opening Keynote

4

NVIDIA GAMEWORKS

Volumetric Lighting | Voxel Accelerated Ambient Occlusion | Hybrid Frustum Traced Shadows

Available Now

PhysX | HairWorks | WaveWorks | FlameWorks

and other technologies such as: Clothing, VXGI, Flex, Destruction

COMPUTEWORKS | GAMEWORKS | VRWORKS | DESIGNWORKS | DRIVEWORKS | JETPACK

Page 5: GTC 2016 Opening Keynote

5

NVIDIA DESIGNWORKS

Adobe support of MDL | Siemens NX adopts Iray

Iray | MDL | OptiX | Path Rendering

and other technologies such as: GL Extensions, GRID, GPUDirect for Video, Mosaic, VXGI, Warp and Blend

COMPUTEWORKS | GAMEWORKS | VRWORKS | DESIGNWORKS | DRIVEWORKS | JETPACK

Page 6: GTC 2016 Opening Keynote

6

NVIDIA VRWORKS

Oculus Rift and HTC Vive integration | Epic, Max Play and Unity game engines

Available Now

Multi-Res Shading | VR SLI | Context Priority | Warp and Blend

and other technologies such as: Direct Mode, GPUDirect for Video

COMPUTEWORKS | GAMEWORKS | VRWORKS | DESIGNWORKS | DRIVEWORKS | JETPACK

Page 7: GTC 2016 Opening Keynote

7

NVIDIA COMPUTEWORKS

CUDA 8: Available June | cuDNN 5: Available April | nvGRAPH: Available June | IndeX plug-in for ParaView: Available May

CUDA | cuDNN | nvGRAPH | IndeX

and other technologies such as: AMGx, cuSOLVER, cuSPARSE, OpenACC, NSIGHT, THRUST

COMPUTEWORKS | GAMEWORKS | VRWORKS | DESIGNWORKS | DRIVEWORKS | JETPACK

Page 8: GTC 2016 Opening Keynote

8

NVIDIA DRIVEWORKS

JPL: Available Now | EAP: Available Q2’16 | General Release: Available Q1’17

Sensor Fusion | Detection | Localization | HD Maps

and other technologies such as: Driving, Planning

COMPUTEWORKS | GAMEWORKS | VRWORKS | DESIGNWORKS | DRIVEWORKS | JETPACK

Page 9: GTC 2016 Opening Keynote

9

NVIDIA JETPACK

Jetson TX1: 24 images/s/W | GIE (GPU Inference Engine): Available May

Deep Learning SDK | DIGITS Workflow | VisionWorks | Jetson Media SDK

and other technologies such as: Linux4Tegra, NSIGHT EE, OpenCV4Tegra, OpenGL, System Trace, Visual Profiler, Vulkan

COMPUTEWORKS | GAMEWORKS | VRWORKS | DESIGNWORKS | DRIVEWORKS | JETPACK

Page 10: GTC 2016 Opening Keynote

10

VR: A START OF A NEW PLATFORM

New York Times ships Cardboard to subscribers

Microsoft demonstrates Holoportation

Google announces Jump VR camera platform

Samsung, Oculus, HTC release headsets

VR Startups Raise $1.5B in funding

Page 11: GTC 2016 Opening Keynote

11

EVEREST VR

Page 12: GTC 2016 Opening Keynote

12

MARS 2030

Page 13: GTC 2016 Opening Keynote

13

IRAY VR

Breakthrough Photoreal VR: Available Starting in June

Rasterize depth buffer at headset eye positions

Reconstruct image for new viewpoint from depth and multiple probes

Pre-render light probes surrounding region of interest

Page 14: GTC 2016 Opening Keynote

14

IRAY VR

Page 15: GTC 2016 Opening Keynote

15

IRAY VR LITE

Available in June

1. Design in 3ds Max
2. Download Iray for 3ds Max Plug-in
3. Download Android Viewer
4. Get VR HMD

Page 16: GTC 2016 Opening Keynote

16

AN AMAZING YEAR IN AI

AlphaGo: Rivals a World Champion

Microsoft & Google: “Superhuman” Image Recognition

Microsoft: “Super Deep Network”

Berkeley’s Brett: One network, everything robotics

Deep Speech 2: One network, 2 languages

A New Computing Model Hits Pop Culture

Page 17: GTC 2016 Opening Keynote

17

A NEW COMPUTING MODEL

Traditional Computer Vision: Experts + Time

Deep Learning Object Detection: DNN + Data + HPC

Deep Learning Achieves “Superhuman” Results

[Chart: ImageNet, 0% to 100%, 2009 to 2016; series: Traditional CV, Deep Learning]

Page 18: GTC 2016 Opening Keynote

18

Page 19: GTC 2016 Opening Keynote

19

$500B OPPORTUNITY OVER 10 YRS

[Chart: Deep Learning Software Revenue by Industry; legend: Ad Service, Technology, Investment, Media, Oil & Gas, Mfg, Retail, Other]

[Chart: Deep Learning Total Revenue by Segment]

IBM: “Cognitive business represents a $2T opportunity”

SOURCE: “Deep Learning for Enterprise Applications,” 4Q 2015, Tractica

Page 20: GTC 2016 Opening Keynote

20

NVIDIA GPU FOR HYPERSCALE

10X Speed up | 20 images/s/W

Cloud Services Powered by AI

TESLA M40 + TESLA M4

Page 21: GTC 2016 Opening Keynote

21

Soumith Chintala, AI Research Engineer, Facebook

“Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks.”

Soumith Chintala, Facebook AI Research; Alec Radford & Luke Metz, indico Research

Page 22: GTC 2016 Opening Keynote

22

UNSUPERVISED LEARNING

Page 23: GTC 2016 Opening Keynote

23

150B XTORS | 5.3TF FP64 | 10.6TF FP32 | 21.2TF FP16 | 14MB SM RF | 4MB L2 Cache

TESLA P100

THE MOST ADVANCED HYPERSCALE DATACENTER GPU EVER BUILT
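The three peak-throughput figures on this slide differ by factors of two. A minimal sketch of that arithmetic follows; the dictionary and function names are illustrative and not part of any NVIDIA API.

```python
# Peak throughput figures quoted on the Tesla P100 slide, in teraflops.
P100_PEAK_TFLOPS = {"FP64": 5.3, "FP32": 10.6, "FP16": 21.2}


def ratio_to_fp64(peak_tflops):
    """Return each precision's peak throughput as a multiple of the FP64 peak."""
    base = peak_tflops["FP64"]
    return {name: round(value / base, 2) for name, value in peak_tflops.items()}


precision_ratios = ratio_to_fp64(P100_PEAK_TFLOPS)
for name, ratio in precision_ratios.items():
    print(f"{name}: {ratio:.1f}x the FP64 peak")
```

Going by the quoted numbers alone, each halving of precision doubles the peak rate.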

Page 24: GTC 2016 Opening Keynote

24

“FIVE MIRACLES”

16nm FinFET | Pascal Architecture | CoWoS with HBM2 | New AI Algorithms | NVLink

Page 25: GTC 2016 Opening Keynote

25

GIANT LEAPS IN EVERYTHING

3x GPU Mem BW | 3x Compute | 5x GPU-GPU BW

[Chart: Teraflops (FP32/FP16), axis 5 to 20; bars: K40, M40, P100 (FP32), P100 (FP16)]

[Chart: Bandwidth (GB/Sec), axis 40 to 160; bars: K40, M40, P100]

[Chart: Bandwidth, axis 1x to 3x; bars: K40, M40, P100]

Page 26: GTC 2016 Opening Keynote

26

“This is a new era of computing. New approaches to the underlying technologies will be required for AI and cognitive. The combination of NVIDIA Pascal GPUs and IBM POWER accelerates Watson’s learning of new skills. Together, IBM and NVIDIA will advance the artificial intelligence industry.”

Dr. John Kelly III, SVP, Cognitive Solutions & IBM Research

“NVIDIA GPU is accelerating progress in AI. As neural nets become larger and larger, we not only need faster GPUs with larger and faster memory, but also much faster GPU-to-GPU communication, as well as hardware that can take advantage of reduced-precision arithmetic. This is precisely what Pascal delivers.”

Yann LeCun, Director of AI Research, Facebook

“Microsoft is developing super deep neural networks that are more than 1000 layers. NVIDIA Tesla P100’s impressive horsepower will enable Microsoft’s CNTK to accelerate AI breakthroughs.”

Xuedong Huang, Chief Speech Scientist, Microsoft Research

“AI computers are like space rockets: The bigger the better. Pascal’s throughput and interconnect will make the biggest rocket we’ve seen yet.”

Andrew Ng, Chief Scientist, Baidu

Page 27: GTC 2016 Opening Keynote

27

TESLA P100 SERVERS

Coming in Q1’17

Page 28: GTC 2016 Opening Keynote

28

GPU-ACCELERATED DL FOR EVERY MARKET

Deep Learning in the Cloud

Deep Learning for Enterprise

[Chart legend: Ad Service, Technology, Investment, Media, Oil & Gas, Mfg, Retail, Other]

IBM: “Cognitive business represents a $2T opportunity”

SOURCE: “Deep Learning for Enterprise Applications,” 4Q 2015, Tractica

Page 29: GTC 2016 Opening Keynote

29

Engineered for deep learning | 170TF FP16 | 8x Tesla P100

NVLink hybrid cube mesh | Accelerates major AI frameworks

NVIDIA DGX-1

WORLD’S FIRST DEEP LEARNING SUPERCOMPUTER
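The 170TF headline can be checked against the per-GPU figure on the Tesla P100 slide (21.2TF FP16). A rough sketch, assuming the system total is simply the sum of the eight GPU peaks; the function name is illustrative.

```python
# Figures taken from the slides; the per-GPU number is the Tesla P100 FP16 peak.
P100_FP16_TFLOPS = 21.2
GPUS_PER_DGX1 = 8


def aggregate_peak_tflops(per_gpu_tflops, gpu_count):
    """Sum of per-GPU peaks; ignores CPUs and any real-world scaling losses."""
    return per_gpu_tflops * gpu_count


dgx1_fp16_tflops = aggregate_peak_tflops(P100_FP16_TFLOPS, GPUS_PER_DGX1)
print(f"{dgx1_fp16_tflops:.1f} TF FP16 from {GPUS_PER_DGX1} GPUs")
```

The sum comes to 169.6 TF, which rounds to the 170TF quoted on the slide.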

Page 30: GTC 2016 Opening Keynote

30

Page 31: GTC 2016 Opening Keynote

31

“250 SERVERS IN-A-BOX”

                        DUAL XEON      DGX-1
FLOPS (CPU + GPU)       3 TF           170 TF
AGGREGATE NODE BW       76 GB/s        768 GB/s
ALEXNET TRAIN TIME      150 HOURS      2 HOURS
TRAIN IN 2 HOURS        >250 NODES*    1 NODE

*Caffe Training on Multi-node Distributed-memory Systems Based on Intel® Xeon® Processor E5 Family (extrapolated). Submitted by Gennady Fedorov (Intel), Vadim P. (Intel) on October 29, 2015. https://software.intel.com/en-us/articles/caffe-training-on-multi-node-distributed-memory-systems-based-on-intel-xeon-processor-e5
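The ratios implied by the dual Xeon versus DGX-1 comparison can be worked out directly from the quoted figures. A small sketch; the row labels and the helper name are illustrative.

```python
# Rows copied from the "250 servers in-a-box" comparison: (dual Xeon, DGX-1).
COMPARISON = {
    "FLOPS (TF)": (3.0, 170.0),
    "Aggregate node bandwidth (GB/s)": (76.0, 768.0),
    "AlexNet train time (hours)": (150.0, 2.0),
}


def advantage(xeon, dgx1, lower_is_better=False):
    """Return how many times better the DGX-1 figure is than the dual-Xeon figure."""
    return xeon / dgx1 if lower_is_better else dgx1 / xeon


advantages = {
    name: round(advantage(xeon, dgx1, lower_is_better="time" in name), 1)
    for name, (xeon, dgx1) in COMPARISON.items()
}
for name, factor in advantages.items():
    print(f"{name}: {factor}x")
```

The training-time ratio is 75x, while the slide quotes more than 250 nodes to reach 2 hours. That gap would be consistent with multi-node scaling that is less than linear, but the slide attributes the node count only to an extrapolation from the cited Intel article, so the sketch does not try to reproduce it.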

Page 32: GTC 2016 Opening Keynote

32

12X SPEED-UP IN ONE YEAR

1.33 billion images/day

GTC 2015: 4 Maxwell GPUs | 25 Hours

GTC 2016: 8 Pascal GPUs | 2 Hours
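The headline numbers on this slide reduce to two small calculations: the ratio of the two training times, and the daily image rate expressed per second. A sketch using only the quoted figures; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures from the slide.
HOURS_GTC_2015 = 25.0   # 4 Maxwell GPUs
HOURS_GTC_2016 = 2.0    # 8 Pascal GPUs
IMAGES_PER_DAY = 1.33e9

speed_up = HOURS_GTC_2015 / HOURS_GTC_2016
images_per_second = IMAGES_PER_DAY / SECONDS_PER_DAY

print(f"speed-up: {speed_up}x")
print(f"throughput: about {images_per_second:,.0f} images/s")
```

The time ratio is 12.5, which the slide states as 12X, and 1.33 billion images/day is roughly 15,400 images per second.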

Page 33: GTC 2016 Opening Keynote

33

Bryan Catanzaro, Senior Researcher, Baidu

Time series input

“Time series output”

GPU0

GPU1

Model Parallel

Data Parallel

Recurrent Neural Nets Model + Data Parallelism
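The slide gives no implementation of the model-plus-data-parallel split, so the following is only a simplified sketch on a single dense layer, in plain Python. It assumes model parallelism means splitting the weight rows across devices and data parallelism means splitting the batch across replicas. All names are hypothetical, and the loops stand in for work that would run on separate GPUs.

```python
def matvec(weights, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def model_parallel_matvec(weights, x, n_shards):
    """Model parallelism: each shard owns a slice of the rows and computes
    the matching slice of the output; the slices are then concatenated."""
    rows_per_shard = -(-len(weights) // n_shards)  # ceiling division
    output = []
    for shard in range(n_shards):
        rows = weights[shard * rows_per_shard:(shard + 1) * rows_per_shard]
        output.extend(matvec(rows, x))  # stands in for one GPU's work
    return output


def hybrid_forward(weights, batch, n_replicas, n_shards):
    """Data parallelism over replicas, model parallelism inside each replica."""
    outputs = [None] * len(batch)
    for replica in range(n_replicas):
        # Each replica takes every n_replicas-th example of the batch.
        for i in range(replica, len(batch), n_replicas):
            outputs[i] = model_parallel_matvec(weights, batch[i], n_shards)
    return outputs


weights = [[1, 2], [3, 4], [5, 6], [7, 8]]
batch = [[1, 0], [0, 1], [1, 1], [2, -1]]

reference = [matvec(weights, x) for x in batch]
hybrid = hybrid_forward(weights, batch, n_replicas=2, n_shards=2)
print(hybrid == reference)
```

The two forms of parallelism compose without changing the result: with 2 replicas and 2 shards, four workers each handle a quarter of the arithmetic. A real recurrent network would also need to exchange activations between shards at every timestep, which this sketch leaves out.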

Page 34: GTC 2016 Opening Keynote

34

Persistent RNNs: weights kept in registers | repeat ~300 times | Peak FLOPs at batch of 8

Add Model Parallelism over NVLINK

Compose with Data Parallelism

[Diagram: GPU0, GPU1, GPU2, GPU3; Data Parallel]

Strong scale to 32X more processors
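The persistent-RNN idea on this slide is that the recurrent weights stay on chip across timesteps instead of being fetched again at each step. Python cannot control GPU registers, so the sketch below only models the number of weight fetches. It assumes "~300" refers to timesteps, and `WeightStore` and both loop functions are hypothetical names.

```python
import math


class WeightStore:
    """Stands in for off-chip memory; counts how often the weights are fetched."""

    def __init__(self, weights):
        self._weights = weights
        self.loads = 0

    def load(self):
        self.loads += 1
        return self._weights


def step(weights, hidden, x):
    """One recurrent update: h_t = tanh(W h_{t-1} + x_t)."""
    return [
        math.tanh(sum(w * h for w, h in zip(row, hidden)) + xi)
        for row, xi in zip(weights, x)
    ]


def rnn_reload_each_step(store, hidden, inputs):
    for x in inputs:
        weights = store.load()  # fetched again on every timestep
        hidden = step(weights, hidden, x)
    return hidden


def rnn_persistent(store, hidden, inputs):
    weights = store.load()  # fetched once, then reused for every timestep
    for x in inputs:
        hidden = step(weights, hidden, x)
    return hidden


W = [[0.5, -0.25], [0.1, 0.3]]
inputs = [[0.01 * t, -0.02 * t] for t in range(300)]

reload_store, persistent_store = WeightStore(W), WeightStore(W)
h_reload = rnn_reload_each_step(reload_store, [0.0, 0.0], inputs)
h_persistent = rnn_persistent(persistent_store, [0.0, 0.0], inputs)

print(reload_store.loads, persistent_store.loads)
```

Both loops compute the same hidden state; only the number of weight fetches differs (300 versus 1 here).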

Page 35: GTC 2016 Opening Keynote

35

Rajat Monga, TensorFlow Technical Lead & Manager, Google

Page 36: GTC 2016 Opening Keynote

36

170TF | “250 servers in-a-box” | nvidia.com/dgx1

$129,000

NVIDIA DGX-1

WORLD’S FIRST DEEP LEARNING SUPERCOMPUTER

Page 37: GTC 2016 Opening Keynote

37

PIONEERS IN AI RESEARCH

Frameworks for Multi-GPU Pascal

Large-scale Deep Learning

Reinforcement Learning

Unsupervised and Transfer Learning

Natural Language Understanding

Autonomous Driving

Medical Applications

Page 38: GTC 2016 Opening Keynote

38

DEEP LEARNING FOR MEDICINE

NVIDIA Founding Technology Partner of MGH Center of Clinical Data Science

10B Medical images on DGX-1 to advance radiology, pathology, genomics

Page 39: GTC 2016 Opening Keynote

39

TESLA FAMILY

Multi-App HPC | Hyperscale HPC | Strong-Scale HPC | Researchers / Early Adopters

M40 + M4 K80

Page 40: GTC 2016 Opening Keynote

40

Uber Enters the Race

Toyota Invests $1B in AI Lab

Volvo Drive Me on Public Roads in 2017

NHTSA: Computer Counts as Driver

Tesla Model 3: 300K pre-orders

AN AMAZING YEAR FOR SELF-DRIVING CARS

Audi, BMW, Daimler Buy HERE

Tesla Model S Auto-pilot

Baidu Enters the Race

Honda, Nissan, Toyota Team Up

GM Buys Cruise

Page 41: GTC 2016 Opening Keynote

41

SELF-DRIVING LOOPS

LOCALIZE | MAP | SEE | DRIVE

Page 42: GTC 2016 Opening Keynote

42

World’s first DL-powered car computing platform

One scalable architecture — from DNN training to cluster, infotainment, ADAS, autonomous driving, and mapping

Open platform

NVIDIA DRIVE PX AI CAR COMPUTER

Training on DGX-1

Driving with DriveWorks

KALDI

LOCALIZATION

MAPPING

DRIVENET

DAVENET

NVIDIA DGX-1 NVIDIA DRIVE PX

Page 43: GTC 2016 Opening Keynote

43

NVIDIA DRIVE PX PERCEPTION


NVIDIA DRIVENET

#1 accuracy score for KITTI car detection

Page 44: GTC 2016 Opening Keynote

44

NVIDIA DRIVE PX PERCEPTION


Page 45: GTC 2016 Opening Keynote

45

NEW END-TO-END HD MAPPING


Page 46: GTC 2016 Opening Keynote

46

BAIDU SELF-DRIVING CAR COMPUTER

Page 47: GTC 2016 Opening Keynote

47

NEW END-TO-END HD MAPPING


Page 48: GTC 2016 Opening Keynote

48

PLATFORM FOR MAPPING THE WORLD

Page 49: GTC 2016 Opening Keynote

49

NEW AI DRIVING


Page 50: GTC 2016 Opening Keynote

50

WORLD’S FIRST AUTONOMOUS RACE CAR

Designed by Daniel Simon | 2,200 lbs | Blazing fast

Page 51: GTC 2016 Opening Keynote

51

WORLD’S FIRST AUTONOMOUS CAR RACE

10 teams, 20 identical cars | DRIVE PX 2: The “brain” of every car | 2016/17 Formula E season

Page 52: GTC 2016 Opening Keynote