
Supercomputing Asia 2020 · 2018. 4. 2. · Cognitive services High performance computing Tools Developer tools DevOps Portal + ... Azure Machine Learning Azure Data Lake SQL Server

Dec 30, 2020


dariahiddleston
Transcript
Page 5

Ask the right question, regardless of scale

Customers use hundreds to thousands of cores to answer business-critical questions they could not have answered before.

Page 6

Trivial to support different use cases

Different RAM ratios, GPU, FPGA, Application/OS needs

Move workloads that don’t fit internally to Cloud

Page 9

#6 – Accelerating answers accelerates people

[Figure: time to decision, in hours]

Baseline: Computing 720 + Analysis 720 + Computing 720 + Analysis 720 = 2880 hours / 120 days to decision

Scalable computing (anticipated benefit): Computing 8 + Analysis 720 + Computing 8 + Analysis 720 = 1456 hours / 60.6 days to decision
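The totals on this slide (2880 hours and 1456 hours) follow from summing the phase durations and dividing by 24. A minimal Python sketch of that arithmetic, assuming the four-phase sequence (computing, analysis, computing, analysis) read off the figure; the function name is illustrative only:

```python
def time_to_decision(phases_hours):
    """Return (total hours, total days) for a list of phase durations in hours."""
    total = sum(phases_hours)
    return total, total / 24

baseline = [720, 720, 720, 720]   # computing, analysis, computing, analysis
scalable = [8, 720, 8, 720]       # each computing phase shrinks to 8 hours

print(time_to_decision(baseline))
print(time_to_decision(scalable))
```

The scalable case works out to roughly 60.67 days, which the slide shows as 60.6. Analysis time is unchanged in this model, so the saving comes entirely from the computing phases.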

Page 10

#6 – Accelerating answers accelerates people

[Figure: time to decision, in hours]

Baseline: Computing 720 + Analysis 720 + Computing 720 + Analysis 720 = 2880 hours / 120 days to decision

Scalable computing, post adoption (agile design process): Computing & Analysis combined, giving higher quality output, iterative analysis, and less context switching

8

Page 11

Old: Shared internal cluster

• Competition for resources
• Waiting in line for compute
• Shared downtime

New: Cluster Per Researcher

• Remove bottlenecks
• Cost controls to manage $
• No waiting = 2x faster users

Page 13

Korea Central

42 Azure regions

US DoD West

US DoD East

Korea South

Page 14

Core infrastructure

Advanced workloads

Tools

Azure Stack + Hybrid

Trusted, Productive, Intelligent, Hybrid

Core infrastructure – Infrastructure-as-a-Service (IaaS)

Compute Storage Networking

Security Management

Advanced workloads – Platform-as-a-Service (PaaS)

Web + Mobile + Media

Internet of Things

Microservices

Containers

Serverless

Identity

Data + Analytics

Artificial intelligence

Cognitive services

High performance computing

Tools

Developer tools

DevOps Portal + scripting

Azure Stack + Hybrid

Page 16

Self-managed Fully-managed

Cluster on the cloud, Cloud burst, HPC as a service

Page 17

[Architecture diagram: hybrid HPC spanning an on-premises environment and Microsoft Azure, linked by ExpressRoute]

Callouts 1 to 10, in the order listed on the slide: End User Infrastructure; On Prem HPC; Connectivity to Azure; HPC Head Node; HPC Compute Nodes; Lustre Parallel File System; RDMA High Speed Networking; Azure Front End Network; Blob storage; Job Submission Web Interface

Microsoft Azure (large scale compute): Azure front-end network; Azure Blob storage for long term data storage; parallel file system servers; management servers; parallel A8/A9 compute node instances; HPC head nodes (D or DS Series head node); RDMA Azure back-end network; Ethernet

On-premises (on-prem environment and on-prem client resources): PBS Pro scheduler servers; LDAP; HPC head nodes; HPC cluster on-prem compute nodes; custom web front end for job scheduler; file server / SAN / NAS / NFS or parallel file system; engineering desktop with pre- and post-processing; web front end accessed via client desktop; private network fabric; corporate network

People: system admins; end user

Page 18

• Up to 16 cores, 3.2 GHz E5-2667 v3 Haswell processor
• Up to 224 GiB DDR4 memory
• FDR InfiniBand (56 Gbps, 2.6 microsecond latency)
• 2 TB of local SSD

• Up to 4 NVIDIA Tesla K80 GPUs
• Up to 24 cores
• Up to 224 GiB memory
• Up to 1440 GiB of local SSD
• FDR InfiniBand

• Up to 4 NVIDIA Tesla M60 GPUs
• Up to 24 cores
• Up to 224 GiB memory
• Up to 1440 GiB of local SSD

• Up to 4 NVIDIA Pascal P40 GPUs
• Up to 24 cores
• Up to 448 GiB memory
• Up to 3 TB of local SSD
• FDR InfiniBand

• Up to 4 NVIDIA Pascal P100 GPUs
• Up to 24 cores
• Up to 448 GiB memory
• Up to 3 TB of local SSD
• FDR InfiniBand

• Up to 72 cores, 3.7 GHz Intel Xeon Scalable (Skylake)
• Up to 144 GiB DDR4 memory
• Accelerated Networking (30 Gbps VM-to-VM)
• 500 GB of local SSD

• Up to 4 NVIDIA Tesla V100 GPUs
• Up to 24 cores
• Up to 448 GiB memory
• Up to 1344 GiB of local SSD
• FDR InfiniBand
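The interconnect figures quoted on this slide (FDR InfiniBand at 56 Gbps with 2.6 microsecond latency) can be put in perspective with the usual first-order model: transfer time = latency + message size / bandwidth. The sketch below is an illustration under that simplifying assumption; it ignores protocol and encoding overhead and congestion, and the function name is hypothetical.

```python
def transfer_time_us(message_bytes, bandwidth_gbps, latency_us):
    """First-order estimate: latency plus serialization time, in microseconds."""
    bits = message_bytes * 8
    serialization_us = bits / (bandwidth_gbps * 1e9) * 1e6
    return latency_us + serialization_us

# FDR InfiniBand figures from the slide: 56 Gbps, 2.6 microsecond latency.
for size in (64, 64 * 1024, 16 * 1024 * 1024):
    t = transfer_time_us(size, bandwidth_gbps=56, latency_us=2.6)
    print(f"{size:>10} bytes: {t:12.2f} us")
```

For small messages the latency term dominates, which is why a low-latency interconnect matters for tightly coupled MPI jobs; for large messages the bandwidth term takes over.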

Page 19

Makes clouds faster: Intel® Xeon® processors for Azure compute and storage

Makes clouds smarter: Intel® Field-Programmable Gate Arrays (FPGA)

Makes clouds safer: Intel® SGX enhances security by encrypting data during computation

Enables the future of AI: Intel® open source machine learning frameworks and libraries

Accelerates networking for more efficiency: Intel® Silicon Photonics 100G PSM4

Maximizes performance across operating systems: Clear Linux* OS for Intel® Architecture

Page 20

High-performance compute
Workloads: High-performance compute workloads; modeling; simulations; genomic research
Processors: Intel® Xeon® processor E5-2667 v3 with DDR4 memory; Intel® Xeon® processor E5-2670
VM family: Azure H and A8-11 Series

Memory optimized
Workloads: Large database workloads; ERP; SAP; data warehousing solutions
Processors: Intel® Xeon® E5-2673 v4 processors
VM family: Azure GS, G, DSv3, Ev3 and DS Series

Compute intensive
Workloads: High CPU-to-memory ratio; massive large-scale computation; deep learning
Processors: Intel® Xeon® Platinum 8168 processor
VM family: Fv2 VM family

SAP workloads
Workloads: SAP applications across Dev/Test and production scenarios; SAP NetWeaver; SAP S/4HANA; SAP BI
Processors: Intel® Xeon® E7-8890 v4 processors
VM family: SAP HANA VM family

Page 21

Analyze large-scale data

Run simulations and financial models

Reduce time to market

Break free from the limitations of on-premises infrastructure

Page 22

Financial workloads

Scientific analysis

Genomics

Geothermal visualization

Deep learning

Ideal for compute-intensive workloads

Fv2-series

Intel® Xeon® Scalable processor for the most high-demand apps

Intel® AVX-512 for workload-optimized performance

Intel® QAT to speed up data compression and cryptography

Intel® Arria® 10 FPGAs for ultra low latencies

Page 23

[Chart: Radioss Crash Simulation code results (lower is better). Y axis: run time in seconds (0 to 12000). X axis: number of cores (1 to 8). Series: Linux RDMA on Azure; bare metal.]

Page 24

[Chart: Nodes with Ethernet vs. A9 run time for crash models/jobs. Y axis: run time in seconds (0 to 12000). X axis: number of cores (1 to 8). Series label: Azure A9 nodes MPI RDMA.]

Page 25

HPC Simulation and Analysis:

Deep Learning and AI Training:

Cloud Rendering:

Cloud Workstation:

Supported OS:

Page 26

Optimization

Provisioning

Cluster

Configuration Monitoring

Internal

Admin

Scope Configure

Run on Cloud Optimize

User

Page 28

Enable applications and algorithms to easily and efficiently run in parallel at scale

Rendering

Media transcoding & pre-/post-processing

Test execution

Monte Carlo simulations

Genomics

Deep Learning

OCR

Data ingestion, processing, ETL

R at scale

Compiled MATLAB

Engineering simulations

Image analysis & processing
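Most of the workloads listed here split into independent tasks that need no communication with each other. As a rough illustration of that pattern, the sketch below splits a Monte Carlo estimate of pi into seeded tasks and sums the results. It uses only the Python standard library and no Azure API; the function names are hypothetical, and the local thread pool merely stands in for a set of worker VMs (threads do not speed up CPU-bound Python code).

```python
import random
from concurrent.futures import ThreadPoolExecutor

def count_hits(task):
    """One independent task: count random points that land inside the unit circle."""
    seed, samples = task
    rng = random.Random(seed)  # per-task seed keeps tasks independent and repeatable
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

def estimate_pi(num_tasks=8, samples_per_task=50_000):
    """Split the sampling into tasks, run them on a pool, and combine the counts."""
    tasks = [(seed, samples_per_task) for seed in range(num_tasks)]
    # The pool is a local stand-in for a set of worker VMs.
    with ThreadPoolExecutor(max_workers=4) as pool:
        hits = sum(pool.map(count_hits, tasks))
    return 4.0 * hits / (num_tasks * samples_per_task)

print(estimate_pi())
```

Because each task carries its own seed and returns a single number, tasks can run in any order on any worker, which is the property that lets this kind of job scale out.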

Page 29

How these services are built in Azure: Using Azure Batch

Get and manage VMs

Start the tasks

Move task input and output

Queue tasks

Install task applications

Scale up and down

Task failure? Task frozen?

Manage and authenticate users

A significant amount of effort is spent managing compute resources, security, data movement, job running, and application lifecycle, none of it related to your actual workload or business.

User application or service

PaaS

Cloud Services

IaaS

Virtual Machines

Hardware

Provided by the cloud platform

Page 30

User application or service

PaaS

Cloud Services

IaaS

Virtual Machines

Hardware

Azure Batch

VM management and job scheduling

App lifecycle, job dependencies, data movement, task rescheduling, user management & authorization

• Don’t worry about the “plumbing”
• Focus on the workload/app
• Access higher-level capabilities
• Minimize the required cloud or Azure experience

Provided by the cloud platform
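To give a feel for the kind of "plumbing" the slide says a managed service takes over, here is a minimal sketch of task rescheduling written in plain Python. It is not Azure Batch code and does not reflect how the service is implemented; the function name and the retry limit are assumptions made for illustration.

```python
def run_with_retries(task, max_attempts=3):
    """Run a callable, retrying on any exception; return (result, attempts used)."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task(), attempt
        except Exception as err:  # a real scheduler would distinguish error types
            last_error = err
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error

# Example: a flaky task that fails twice before succeeding.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("simulated node failure")
    return "done"

print(run_with_retries(flaky))  # ('done', 3)
```

Even this small helper has to decide how many attempts to allow and what to do when they run out; multiplied across queues, data movement, and user management, that is the effort the slide argues is unrelated to the actual workload.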

Page 31

Capacity on demand

Jobs on demand

1 to tens of thousands of VMs

1 to millions of tasks

Scale according to load

Pay by the minute

No charge for Batch; pay for used resources

No head node

Use low-priority VMs
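"Scale according to load" amounts to deriving a target VM count from the amount of queued work. The following is a sketch in plain Python, not the Azure Batch autoscale formula language; the function name, the tasks-per-VM ratio, and the pool limits are all assumptions chosen for illustration.

```python
import math

def target_vm_count(pending_tasks, tasks_per_vm=4, min_vms=0, max_vms=100):
    """Pick a pool size that covers the queued tasks, clamped to [min_vms, max_vms]."""
    if pending_tasks < 0:
        raise ValueError("pending_tasks must be non-negative")
    needed = math.ceil(pending_tasks / tasks_per_vm)
    return max(min_vms, min(max_vms, needed))

for queued in (0, 3, 37, 10_000):
    print(queued, "->", target_vm_count(queued))
```

With a minimum of zero, an idle pool shrinks to no VMs at all, which fits the slide's point about paying only for resources that are used; the upper limit acts as a simple cost control.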

Page 34

https://github.com/Azure/doAzureParallel

Page 35

Autodesk 3ds Max / Maya

Upload assets

Submit job

Return outputs

VM + Renderer (three instances shown)

Integrated Client Plugin

Azure Batch

• Monitoring
• Reporting
• Single bill

Page 37

Your Data (Images, Text, Logs, Time Series…)

+

Training With Scale-Out GPU Clusters on Demand: Azure Batch AI Training (CNTK, TensorFlow, Chainer…; Python, Visual Studio,…)

=

Intelligence In Your Apps and Data Services: Azure Machine Learning, Azure Data Lake, SQL Server

Page 38

Azure Batch AI Training Service

Page 39

https://github.com/Azure/batch-shipyard

Page 40

A revolution in genomic analysis

Genomics acceleration in Azure

“As this type of information is used more often in the clinical setting, the emphasis on speed becomes much stronger.” – Geraldine Van der Auwera, Broad Institute

How

A Microsoft team worked with researchers at the Broad Institute to review the algorithms in the Burrows-Wheeler Aligner (BWA) and the Genome Analysis Toolkit (GATK)

Results

Using Microsoft’s expertise in software development, they discovered how to greatly increase efficiency and speed, without compromising accuracy

Benefits

• Run BWA and GATK analysis up to seven times faster
• Run in parallel, at any scale, with a single line of code
• Leave behind the complexity of managing infrastructure

Solution

A fully-managed service on Azure that enables clinicians and researchers to focus on getting the results they need, faster and reliably

Page 41

Data Sources

On-premises Cloud

Data Insights

Business intelligence

Advanced Analytics & AI

Operational data

Data warehousing

Big data processing

SQL Server

Azure SQL Database

Azure DocumentDB

Data virtualization

SQL Server Data Warehouse

Azure SQL Data Warehouse

SQL Server Data Warehouse

Azure HDInsight

Azure Data Lake

XEON and FPGAs

Data integration

Structured and unstructured

Page 42

Deep-learning platform: Powered by Intel® 12NM Stratix 10 FPGAs

Record-setting performance: Over 130,000 compute operations per cycle

Page 43

INTEL AZURE

Productive

Intel and Microsoft co-engineering to offer differentiated Azure services powered by the latest Intel Xeon processors

Hybrid

Flexible and consistent hybrid cloud solutions with Intel Xeon Scalable processors, from Azure to Azure Stack

Intelligent

Innovative AI, Data, and Analytics services optimized with Intel technologies

Trusted

Unique Security Cloud Services enabled by Intel SGX technology

Page 44

https://azure.microsoft.com/en-us/solutions/high-performance-computing/

Next Steps

https://azure.microsoft.com/en-us/solutions/big-compute/

Got some new ideas?