THE RISE OF GPU-ACCELERATED DATA SCIENCE
#1 STRATEGIC IMPERATIVE FOR THE MODERN ENTERPRISE
Recognize automation will increase speed and accuracy of decisions²: 90%
Believe it will transform their industry³: 95%
Have invested or deployed AI solutions today⁴: 4%
CIOs investing in AI in the next 3 years¹: 85%
¹https://www.cio.com/article/3198121/it-industry/whats-now-in-digital-transformation.html
²https://www.servicenow.com/content/dam/servicenow-assets/public/en-us/doc-type/resource-center/white-paper/wp-cio-global-pov.pdf
³AI Business Book
⁴https://www.gartner.com/en/newsroom/press-releases/2018-02-13-gartner-says-nearly-half-of-cios-are-planning-to-deploy-artificial-intelligence
FROM BUSINESS INTELLIGENCE TO DATA SCIENCE
Forecasting, Fraud Detection, Recommendation, and More
CONSUMER INTERNET
● Ad Personalization
● Click Through Rate Optimization
● Churn Reduction
FINANCIAL SERVICES
● Claim Fraud
● Customer Service Chatbots/Routing
● Risk Evaluation
MANUFACTURING
● Remaining Useful Life Estimation
● Failure Prediction
● Demand Forecasting
TELECOM
● Detect Network/Security Anomalies
● Forecasting Network Performance
● Network Resource Optimization (SON)
RETAIL
● Supply Chain & Inventory Management
● Price Management / Markdown Optimization
● Promotion Prioritization And Ad Targeting
AUTOMOTIVE
● Personalization & Intelligent Customer Interactions
● Connected Vehicle Predictive Maintenance
● Forecasting, Demand, & Capacity Planning
OIL & GAS
● Sensor Data Tag Mapping
● Anomaly Detection
● Robust Fault Prediction
HEALTHCARE
● Improve Clinical Care
● Drive Operational Efficiency
● Speed Up Drug Discovery
CHALLENGES AFFECTING DATA SCIENCE TODAY
What is limiting Data Science productivity?
INCREASING DATA ONSLAUGHT
● Data sets are continuing to dramatically increase in size
● Multitude of sources
● Different formats, varying quality
SLOW CPU PROCESSING
● End of Moore’s law, CPUs aren’t getting faster
● Many popular data science tools have been CPU-only
● Can only throw so many CPUs at a job
COMPLEX INSTALLATION & MANAGEMENT
● Time consuming to install software
● Nearly impossible to manage all version conflicts
● Updates often break other software
TRADITIONAL INFRASTRUCTURE SETUP
CHALLENGES
● Resource availability depends on job queues
● Slower model iteration process
● Software stack needs IT management and support
● Too expensive for everyday development
● Slow data migration
● Security and privacy concerns
GPU-ACCELERATED DATA SCIENCE WORKFLOW
[Workflow diagram: GPU Accelerated Data Science. Stages: Data Preparation, Train, Deploy. Boxes: Data Sources, ETL, Data Lake, Wrangle Data, Train, Evaluate Predictions]
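The workflow above chains data preparation, training, and evaluation, and the time a data scientist waits is the sum of those stages. The following is a minimal, CPU-only sketch of that idea in plain Python. Everything in it is hypothetical and for illustration only: the stage functions, the `run_pipeline` helper, and the toy input are assumptions, and in a GPU-accelerated setup each stage would be backed by an accelerated library rather than the placeholder logic shown here.

```python
import time

# Toy input standing in for raw records from a data source (hypothetical).
RAW_LINES = ["1.0,2.1", "2.0,3.9", "bad,row", "3.0,6.2"]


def etl(lines):
    """Extract: split each raw line into string fields."""
    return [line.split(",") for line in lines]


def wrangle(rows):
    """Clean: keep only rows with exactly two numeric fields."""
    clean = []
    for fields in rows:
        if len(fields) != 2:
            continue
        try:
            clean.append((float(fields[0]), float(fields[1])))
        except ValueError:
            continue  # drop rows that do not parse as numbers
    return clean


def train(rows):
    """Fit y = slope * x by least squares through the origin."""
    slope = sum(x * y for x, y in rows) / sum(x * x for x, _ in rows)
    return {"rows": rows, "slope": slope}


def evaluate(state):
    """Report the mean absolute error of the fitted line on the same rows."""
    rows, slope = state["rows"], state["slope"]
    mae = sum(abs(y - slope * x) for x, y in rows) / len(rows)
    return {"slope": slope, "mae": mae}


STAGES = [("ETL", etl), ("Wrangle", wrangle), ("Train", train), ("Evaluate", evaluate)]


def run_pipeline(stages, data):
    """Run stages in order; return the final result and per-stage seconds."""
    timings = {}
    for name, fn in stages:
        start = time.perf_counter()
        data = fn(data)
        timings[name] = time.perf_counter() - start
    timings["end_to_end"] = sum(timings.values())
    return data, timings


if __name__ == "__main__":
    result, timings = run_pipeline(STAGES, RAW_LINES)
    print(result)
    print(timings)
```

Timing each stage separately shows which part of the workflow dominates the wait, which is the part worth accelerating first.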
DEVELOPMENT VS. PRODUCTION
NVIDIA DATA SCIENCE PLATFORM
NVIDIA GPU-ACCELERATED DATA SCIENCE
A Solution for Every User and Every Organization
[Diagram: platforms ranging from ML EXPERIMENTATION to PRODUCTION DATA CENTER, for Academia and Enterprise: TITAN RTX PC; QUADRO Workstations; T4 Servers; V100 Servers; DGX Station, DGX-1 / HGX-1; DGX-2 / HGX-2; Cloud]
NVIDIA-POWERED DATA SCIENCE WORKSTATION
Integrated hardware and software solution for Data Science
POWERED BY NVIDIA QUADRO
RTX 6000/8000
GPU Architecture    Turing
CUDA Cores          4608
RT Cores            72
Tensor Cores        576
Memory BW           Up to 672 GB/s
NVLink              2-way (2 & 3 slot), 100 GB/s bidirectional
Display Support     4x DP + 1x VirtualLink
RTX 8000            48GB / 96GB w/NVLink
RTX 6000            24GB / 48GB w/NVLink
GV100               32GB / 64GB, Double Precision (FP64)
PERFORMANCE
Mortgage Data, Yr 2015-16, 2 parts
Seconds (lower is better)
Configuration    ETL     XGBoost    End-to-end
CPU              280     186        528.34
1x RTX8000       13.6    40.9       78.36
2x RTX8000       9.13    22.1       53.56
CPU Gold [email protected] 3.7GHz Turbo (Skylake)
End-to-end time = ETL + Conversion + Training + Validation
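The comparison in this chart comes down to ratios of wall-clock times. A minimal sketch of that arithmetic follows. It assumes the three groups of chart values map to CPU, 1x RTX8000, and 2x RTX8000 in that order (largest times to the CPU, smallest to two GPUs); the `TIMINGS` dictionary and the `speedups_vs_cpu` helper are illustrative names, not part of any NVIDIA tool.

```python
# Seconds read from the chart. The mapping of each number group to a
# configuration is an assumption: largest group = CPU, smallest = 2x RTX8000.
TIMINGS = {
    "CPU":        {"ETL": 280.0, "XGBoost": 186.0, "End-to-end": 528.34},
    "1x RTX8000": {"ETL": 13.6,  "XGBoost": 40.9,  "End-to-end": 78.36},
    "2x RTX8000": {"ETL": 9.13,  "XGBoost": 22.1,  "End-to-end": 53.56},
}


def speedups_vs_cpu(timings, baseline="CPU"):
    """Return baseline_seconds / config_seconds for every non-baseline config."""
    base = timings[baseline]
    return {
        config: {stage: base[stage] / seconds for stage, seconds in stages.items()}
        for config, stages in timings.items()
        if config != baseline
    }


if __name__ == "__main__":
    for config, stages in speedups_vs_cpu(TIMINGS).items():
        summary = ", ".join(f"{stage}: {ratio:.1f}x" for stage, ratio in stages.items())
        print(f"{config} vs CPU -> {summary}")
```

Under that reading, the end-to-end speedup is smaller than the ETL speedup because end-to-end time also includes conversion, training, and validation, which shrink by different factors.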
NVIDIA DATA SCIENCE SOFTWARE STACK
[Stack diagram layers]
WORKFLOWS (Kubeflow, Airflow,...)
DATA PROCESSING: Dask-cuDF, Dask-cuPY, Spark, Datalogue
ML/DL: TensorFlow, PyTorch, Horovod, XGBoost, Dask-cuML
DATA LAKE: OmniSci, BlazingSQL, SQreamDB, Kinetica, BrytlytDB
INFERENCE: TF Serving, ONNX Runtime, TRTIS
NGC CONTAINERS
CUDA-X AI
● DATA PROCESSING: cuDF, DALI
● MACHINE LEARNING: cuML, cuGRAPH
● DEEP LEARNING: cuDNN, cuBLAS, NCCL, TensorRT
CLUSTER MANAGEMENT/DEPLOYMENT (CONTAINERS)
OPERATING SYSTEMS: Red Hat Linux, Ubuntu
Enterprise Desktop, Enterprise Server
© Copyright 2019 Dell Inc.
NGC: GPU-OPTIMIZED SOFTWARE HUB
Simplifying DL, ML and HPC Workflows
50+ Containers: DL, ML, HPC
Pre-trained Models: NLP, Classification, Object Detection & more
Industry Workflows: Medical Imaging, Intelligent Video Analytics
Model Training Scripts: NLP, Image Classification, Object Detection & more
Innovate Faster
Deploy Anywhere
Simplify Deployments
NGC
NVIDIA NGC SUPPORT SERVICES
Minimize Downtime And Maximize System Utilization
Availability
• Exclusively for NGC-Ready workstations
• Availability starting in Q2
• Service agreement between NVIDIA & customer
• Purchase from OEM
Support by NVIDIA’s subject matter experts
24x7 portal, phone and email access to create support cases
Live support during local, regional business hours for technical assistance
Support Coverage
• NGC DL & ML containers
• NVIDIA drivers
• NV-docker
• CUDA
HIGH-PERFORMANCE DATA SCIENCE
Maximized Productivity
● Highly optimized cross-compatible stack of data science libraries
● Faster model design, development and iteration
● Greater flexibility using Python and conda package management
Ease of Use
● Turnkey system for GPU-accelerated data science
● End-to-end software stack acceleration from data preparation to visualization
● Orchestration-compatible software stack to help scale on clusters
Enterprise Support
● Built for 24x7 reliability and robustness
● Quick and easy deployment using “NGC-Ready” containers and conda
● Tested across GPUs and systems for compatibility and performance
BUSINESS IMPACT
“The NVIDIA-powered data science workstation enables our data scientists to run end-to-end data processing pipelines on large data sets faster than ever. Leveraging RAPIDS to push more of the data processing pipeline to the GPU reduces model development time which leads to faster deployment and business insights.”
- Mike Koelemay, Lockheed Martin Fellow
“Our initial look at the NVIDIA-Powered Lenovo AI workstation showed significant performance gains. Data scientists will appreciate being able to move more quickly through the analytics life cycle, which will allow them to address and support more analytics needs to transform business processes.”
- Gavin Day, Senior Vice President for Technology at SAS
“We have a diverse, multi-disciplinary environment and are looking to couple data science and analytics to a wider range of our technical practices throughout our business. The NVIDIA-powered Data Science Workstation promises to ease the transition and democratize the application of data science. We find it extremely well suited to experimentation, exploration, solution discovery, and early prototyping work. Its combination of well-designed software and highly performant hardware provides a 20x and higher speed-ups in our analytics work and our team found its ease of use liberating.”
- Steve Walker, Associate Director, Arup, Advanced Digital Engineering
LEARN MORE
NVIDIA-POWERED DATA SCIENCE WORKSTATION WEB PAGE
NVIDIA ACCELERATED DATA SCIENCE WEB PAGE
NVIDIA QUADRO PRODUCTS WEB PAGE