THE RISE OF GPU-ACCELERATED DATA SCIENCE

Jul 08, 2020




#1 STRATEGIC IMPERATIVE FOR THE MODERN ENTERPRISE

85%: CIOs investing in AI in the next 3 years¹
90%: Recognize automation will increase speed and accuracy of decisions²
95%: Believe it will transform their industry³
4%: Have invested or deployed AI solutions today⁴

¹https://www.cio.com/article/3198121/it-industry/whats-now-in-digital-transformation.html
²https://www.servicenow.com/content/dam/servicenow-assets/public/en-us/doc-type/resource-center/white-paper/wp-cio-global-pov.pdf
³AI Business Book
⁴https://www.gartner.com/en/newsroom/press-releases/2018-02-13-gartner-says-nearly-half-of-cios-are-planning-to-deploy-artificial-intelligence


FROM BUSINESS INTELLIGENCE TO DATA SCIENCE
Forecasting, Fraud Detection, Recommendation, and More

CONSUMER INTERNET
● Ad Personalization
● Click Through Rate Optimization
● Churn Reduction

FINANCIAL SERVICES
● Claim Fraud
● Customer Service Chatbots/Routing
● Risk Evaluation

MANUFACTURING
● Remaining Useful Life Estimation
● Failure Prediction
● Demand Forecasting

TELECOM
● Detect Network/Security Anomalies
● Forecasting Network Performance
● Network Resource Optimization (SON)

RETAIL
● Supply Chain & Inventory Management
● Price Management / Markdown Optimization
● Promotion Prioritization and Ad Targeting

AUTOMOTIVE
● Personalization & Intelligent Customer Interactions
● Connected Vehicle Predictive Maintenance
● Forecasting, Demand, & Capacity Planning

OIL & GAS
● Sensor Data Tag Mapping
● Anomaly Detection
● Robust Fault Prediction

HEALTHCARE
● Improve Clinical Care
● Drive Operational Efficiency
● Speed Up Drug Discovery


CHALLENGES AFFECTING DATA SCIENCE TODAY
What is limiting Data Science productivity?

INCREASING DATA ONSLAUGHT
● Data sets are continuing to dramatically increase in size
● Multitude of sources
● Different formats, varying quality

SLOW CPU PROCESSING
● End of Moore’s law, CPUs aren’t getting faster
● Many popular data science tools have been CPU-only
● Can only throw so many CPUs at a job

COMPLEX INSTALLATION & MANAGEMENT
● Time consuming to install software
● Nearly impossible to manage all version conflicts
● Updates often break other software


TRADITIONAL INFRASTRUCTURE SETUP

CHALLENGES
● Resource availability depends on job queues
● Slower model iteration process
● Software stack needs IT management and support
● Too expensive for everyday development
● Slow data migration
● Security and privacy concerns


GPU-ACCELERATED DATA SCIENCE WORKFLOW

[Diagram: GPU Accelerated Data Science. Stages shown: Data Sources, ETL, Data Lake, Wrangle Data, Train, Evaluate, Predictions. Phases shown: Data Preparation, Train, Deploy.]


DEVELOPMENT VS. PRODUCTION


NVIDIA DATA SCIENCE PLATFORM


NVIDIA GPU-ACCELERATED DATA SCIENCE
A Solution for Every User and Every Organization

[Diagram. Segments shown: ML EXPERIMENTATION, PRODUCTION DATA CENTER. Audiences shown: Academia, Enterprise. Hardware shown: TITAN RTX PC; QUADRO Workstations; T4 Servers; V100 Servers; DGX Station, DGX-1 / HGX-1, DGX-2 / HGX-2; Cloud.]


NVIDIA-POWERED DATA SCIENCE WORKSTATION

Integrated hardware and software solution for Data Science


POWERED BY NVIDIA QUADRO

RTX 6000/8000
GPU Architecture    Turing
CUDA Cores          4608
RT Cores            72
Tensor Cores        576
Memory BW           Up to 672 GB/s
NVLink              2-way (2 & 3 slot), 100 GB/s bidirectional
Display Support     4x DP + 1x VirtualLink

RTX 8000    48GB / 96GB w/NVLink
RTX 6000    24GB / 48GB w/NVLink
GV100       32GB / 64GB, Double Precision (FP64)


PERFORMANCE

Mortgage Data, Yr 2015-16, 2 parts
Seconds (lower is better)

              CPU       1x RTX8000   2x RTX8000
ETL           280       13.6         9.13
XGBoost       186       40.9         22.1
End-to-end    528.34    78.36        53.56

CPU Gold [email protected] 3.7GHz Turbo (Skylake)
End-to-end time = ETL + Conversion + Training + Validation
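The chart reports raw seconds only, so the speed-up factors have to be worked out by hand. The sketch below does that arithmetic in Python. It is illustrative, not part of the original deck: the assignment of each timing to a configuration is an assumption inferred from the order of the chart's data labels (largest values taken as CPU, smallest as 2x RTX8000), and the names `timings`, `speedup`, and `remainder_seconds` are hypothetical.

```python
# Timings in seconds, read off the chart. Which value belongs to which
# configuration is an ASSUMPTION based on the order of the data labels.
timings = {
    "CPU":        {"ETL": 280.0, "XGBoost": 186.0, "End-to-end": 528.34},
    "1x RTX8000": {"ETL": 13.6,  "XGBoost": 40.9,  "End-to-end": 78.36},
    "2x RTX8000": {"ETL": 9.13,  "XGBoost": 22.1,  "End-to-end": 53.56},
}


def speedup(stage, config, baseline="CPU"):
    """Return how many times faster `config` is than `baseline` for `stage`."""
    return timings[baseline][stage] / timings[config][stage]


def remainder_seconds(config):
    """Seconds of the end-to-end run not covered by the ETL and XGBoost bars.

    The slide defines end-to-end time as ETL + conversion + training +
    validation, so this remainder is whatever the two plotted bars leave out.
    """
    t = timings[config]
    return t["End-to-end"] - t["ETL"] - t["XGBoost"]


for config in ("1x RTX8000", "2x RTX8000"):
    for stage in ("ETL", "XGBoost", "End-to-end"):
        print(f"{config:<11} {stage:<11} {speedup(stage, config):5.1f}x vs CPU")
```

Under this reading of the chart, ETL benefits the most (roughly 20x on one GPU and 30x on two), while the end-to-end run is roughly 7x and 10x faster.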


NVIDIA DATA SCIENCE SOFTWARE STACK

[Diagram: layered software stack.]

WORKFLOWS (Kubeflow, Airflow,...)
Categories shown: DATA PROCESSING, DATA LAKE, ML/DL INFERENCE
Software shown: Dask-cuDF, Dask-cuPY, Spark, Datalogue, TensorFlow, PyTorch, Horovod, XGBoost, Dask-cuML, OmniSci, BlazingSQL, SQreamDB, Kinetica, BrytlytDB, TF Serving, ONNX Runtime, TRTIS
CLUSTER MANAGEMENT/DEPLOYMENT (CONTAINERS)
NGC CONTAINERS
CUDA-X AI (DATA PROCESSING, MACHINE LEARNING, DEEP LEARNING): cuDF, DALI, cuML, cuGRAPH, cuDNN, cuBLAS, NCCL, TensorRT
OPERATING SYSTEMS: Red Hat Linux, Ubuntu
Enterprise Desktop, Enterprise Server


© Copyright 2019 Dell Inc.

NGC: GPU-OPTIMIZED SOFTWARE HUB
Simplifying DL, ML and HPC Workflows

50+ Containers: DL, ML, HPC
Pre-trained Models: NLP, Classification, Object Detection & more
Industry Workflows: Medical Imaging, Intelligent Video Analytics
Model Training Scripts: NLP, Image Classification, Object Detection & more

Innovate Faster
Deploy Anywhere
Simplify Deployments


NVIDIA NGC SUPPORT SERVICES
Minimize Downtime And Maximize System Utilization

Availability
• Exclusively for NGC-Ready workstations
• Availability starting in Q2
• Service agreement between NVIDIA & customer
• Purchase from OEM

Support by NVIDIA’s subject matter experts
24x7 portal, phone and email access to create support cases
Live support during local, regional business hours for technical assistance

Support Coverage
• NGC DL & ML containers
• NVIDIA drivers
• NV-docker
• CUDA


HIGH-PERFORMANCE DATA SCIENCE

Maximized Productivity
• Highly optimized cross-compatible stack of data science libraries
• Faster model design, development and iteration
• Greater flexibility using Python and conda package management

Ease of Use
• Turnkey system for GPU accelerated data science
• End-to-End software stack acceleration from data preparation to visualization
• Orchestration compatible software stack to help scale on clusters

Enterprise Support
• Built for 24x7 reliability and robustness
• Quick and easy deployment using “NGC-Ready” containers and conda
• Tested across GPUs and systems for compatibility and performance


BUSINESS IMPACT

“The NVIDIA-powered data science workstation enables our data scientists to run end-to-end data processing pipelines on large data sets faster than ever. Leveraging RAPIDS to push more of the data processing pipeline to the GPU reduces model development time which leads to faster deployment and business insights.”
- Mike Koelemay, Lockheed Martin Fellow

“Our initial look at the NVIDIA-Powered Lenovo AI workstation showed significant performance gains. Data scientists will appreciate being able to move more quickly through the analytics life cycle, which will allow them to address and support more analytics needs to transform business processes.”
- Gavin Day, Senior Vice President for Technology at SAS

“We have a diverse, multi-disciplinary environment and are looking to couple data science and analytics to a wider range of our technical practices throughout our business. The NVIDIA-powered Data Science Workstation promises to ease the transition and democratize the application of data science. We find it extremely well suited to experimentation, exploration, solution discovery, and early prototyping work. Its combination of well-designed software and highly performant hardware provides a 20x and higher speed-ups in our analytics work and our team found its ease of use liberating.”
- Steve Walker, Associate Director, Arup, Advanced Digital Engineering
