
©2012 DataDirect Networks. All Rights Reserved.

Accelerating High Performance Computing and Big Data Applications by Breaking Through the I/O Bottleneck

2012/10/26

Philip Zhu

Director, North Asia, HPC


DDN | We Accelerate Information Insight


DDN is a Leader in Massively Scalable Platforms and Solutions for Big Data and Cloud Applications

► Established: 1998

► Main Office: Sunnyvale, California, USA

► Employees: 600+ Worldwide

► Worldwide Presence: 16 Countries

► Installed Base: 1,000+ End Customers; 50+ Countries

► Go To Market: Global Partners, Resellers, Direct

World-Renowned & Award-Winning

6/8/12


DDN | Massive Scalability Focus


HPC: DDN powers organizations to accelerate discovery to find new insights for drug, energy and scientific discovery. The world’s largest and most critical HPC systems depend on DDN.

Web/Cloud: DDN brings cloud-enabled business & social networkers together – allowing them to share and collaborate on data. The world’s largest web and cloud infrastructures depend on DDN.

Security: DDN empowers data-driven organizations to make informed, real-time decisions to protect society. The world’s most secure governments and organizations depend on DDN.

Media: DDN provides the tools for creative organizations to create and broadcast the richest possible content. The world’s most creative and mission critical content organizations depend on DDN.


DDN | Massive Scalability Focus


Media | Web/Cloud | HPC | Security

Life Science
Financial Service
Energy Exploration
Manufacturing
Big Data Analytics
Web 2.0
Social Networking
Service Providers
Cloud Computing
Intelligence
Defense
Surveillance
Homeland Security
Broadcasting
Video Production
Film Animation
Video On Demand


The Broadest Big Data Portfolio


File Storage

► EXAScaler™ (Parallel File Storage): 10Ks of Clients; 1TB/s+
► GRIDScaler™ (Parallel File Storage): 1Ks of Clients; 1TB/s+; Scale-Out NAS
► xSTREAMScaler™ (SAN File Storage): 100s SAN/LAN Clients; HSM Capable; Streaming Optimized
► NAS Scaler™ (Enterprise NAS): 1-16 NAS Servers; Fully-Featured; High Performance

Object Storage

► WOS®: 256 Billion Objects; GeoReplicated; Cloud Foundation; Mobile Cloud Access

Storage Engines (Silicon Storage Architecture, Storage Fusion Architecture)

► 9900: 6GB/s in Real-Time; 1,200 Drives: 2 Racks
► 6620: 2GB/s, 350K IOPS; 120 Drives in 8U
► 10K: 15GB/s, 1.2M IOPS; 1,200 Drives: 2 Racks; Embedded Computing
► 12K: 40GB/s, 1.7M IOPS; 1,680 Drives: 2 Racks; Embedded Computing

Media

► Platforms Support Any Drive Technology to Optimize Data Intense Workflows: SAS, SATA, SSD

Infrastructure Management

► DirectMon: Big Data Management Made Easy


DDN | Storage Fusion Architecture (SFA)


Accelerating Big Data and Cloud, Optimizing TCO

Over 1 Million Lines of S/W Code – First Customer Shipped 2008

Designed Specifically for Big Data and Cloud Workloads

► Parallel State-Machine Design: Maximum Performance, Lowest Latency
► Virtualized Processing: Optimized Environment for Big Data Application Hosting
► Robust Data Protection: Quality of Service and Performance Without Compromise
► Flexible & Massively Scalable: Best-In-Class Scalability and Density

Storage Fusion Architecture™ [Core Storage S/W Engine]

► In-Storage Processing™ Engine & DMA Driver
► DirectMon™: Infrastructure Management
► ‘Scaler File System Family
► Low-Latency Connect: FC, IB, Memory
► Interrupt-Free Storage Processing
► ReACT™ Adaptive Cache Technology
► DirectProtect™ Data Integrity Management
► Quality of Service Engine
► Storage Fusion Fabric™


Smarter Big Data Analytics Infrastructure is A Reality


Eliminate Bottlenecks & Master: Volume + Velocity + Variety

► Volume: Manage More Data Than Traditional Systems
► Velocity: Accelerate Runtimes With The Highest Performance AND Performance Efficiency
► Variety: Best-In-Class Mixed Mode Performance & Optimized for The Right Media For Your Analytics


DirectMon™


DDN | DirectMon™ (new for Q312)

Storage Management Made Simple

► A powerful, intuitive single pane of glass to monitor & manage your environment
► Simplify the administration of DDN SFA and ‘Scaler* environments
► Leverages DDN SFA 1.5 API

*GRIDScaler Support Immediately; EXAScaler and Hadoop in 2H12.

Site Overview


DDN | World-Leading Deployments


HPC | Cloud & Web Infrastructure | Professional Media | Security


DDN | The “Big” In Big Data

98% Fortune 500
► Customers of Online File Service
► PBs of Cloud Storage

768 Thousand
► # of Supercomputer CPUs
► 35PB Storage Processing System

23 Million
► Broadcast Viewers
► 9PB of Real-Time Play-To-Air Systems

“If any company is well poised to take on the challenges of exascale computing and big data, it's DDN, since this is its heritage.” 451 Group


DDN | Solving Big Data Challenges


“We are very pleased to work with DDN on a solution that reduces the complexity of our storage environment and accelerates our research efforts.”
Worldwide Gene Sequencing Leader

“We are very happy with the performance & the flexible way we store the data, and scale. We are talking 1-2 racks as compared to 6-7 racks.”
Worldwide Cloud Storage Leader

“DDN works all the time – there’s never a need to take it offline, reformat it, or fine-tune it. In our business there is no such thing as just good enough.”
Largest East-Coast Post Studio

SFA technology is used for massive internet content analysis. To support this project, DDN partners with Vertica to drive mission-critical, real-time insight.
US Government Intelligence Agency


SFA12K™ | Models

                         SFA12K-20            SFA12K-40            SFA12K-20E
Maximum Drives           1,680¹               1,680¹               1,680
System Interface         FDR IB, 16Gb FC²     FDR IB, 16Gb FC      FDR IB, 10/40GbE
Drive Types              3.5” & 2.5” SSD, SAS & SATA (inter-mixable), all models
System Capacity          6.72PB (w/ 4TB HDDs)¹, all models
Bandwidth                20GB/s (raw I/O)     40GB/s (raw I/O)     20GB/s (file I/O)
Cache IOPS               850K                 1.7M                 850K
Flash IOPS               700K                 1.4M                 700K
In-Storage Processing™   N/A                  N/A                  Yes: ExaScaler, GridScaler, Customer Provided
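The 6.72PB system capacity follows directly from the maximum drive count and the 4TB drive size quoted on this slide. A minimal sketch of that arithmetic, assuming decimal units (1 PB = 1000 TB) and raw capacity before any RAID or file system overhead; the function name is hypothetical:

```python
# Raw-capacity arithmetic for the SFA12K figures on this slide.
# Assumption: decimal units (1 PB = 1000 TB), no RAID or spare overhead.

MAX_DRIVES = 1680   # maximum drives per system (from the slide)
DRIVE_TB = 4        # 4TB HDDs (from the slide)


def raw_capacity_pb(drives: int, drive_tb: float) -> float:
    """Return raw capacity in petabytes for `drives` drives of `drive_tb` TB."""
    return drives * drive_tb / 1000


if __name__ == "__main__":
    print(f"{raw_capacity_pb(MAX_DRIVES, DRIVE_TB):.2f} PB")  # 6.72 PB
```

Usable capacity would be lower once parity and spare drives are accounted for; the slide does not state the protection scheme, so it is left out here.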


SFA12K-40 Benchmark


GRIDScaler Parallel File Storage Appliance

Massively Scalable Parallel File Storage Appliance

► Easy to deploy, All-in-One appliance based on IBM GPFS technology
► Scalable building block architecture
  • 200GB/sec+ and 100,000s of IOPS
► Feature-Rich, Enterprise Grade Quality and High Availability with no single point of failure
► DirectMon centralized configuration and monitoring solution


12KE GRID is a Hit!

- Argonne National Labs - Mira

► Argonne – Mira
  • 48 racks of Blue Gene/Q (BGQ) @ 1024 nodes per rack
    o 1.6 GHz 16-core processor and 16 GB RAM per node
  • 384 I/O nodes
  • Four Mellanox IS5600 QDR switches (GPFS InfiniBand Verbs = main storage I/O protocol)
  • For a total of 768K cores, 768 terabytes of RAM, and a peak performance of 10 petaflops
► Scratch file system (SFS) – 240+ GB/s
  • 16 SFA 12K-20E couplets
  • Each w/ 10 SS7000 enclosures, 560 3TB SATA drives, 32 200GB SSDs
► Home file system (HFS)
  • 3 SFA 12K-20E couplets, each w/ 5 enclosures, 240 3TB SATA drives, 32 200GB SSDs
  • Embedded VMs are dedicated GPFS NSD servers, with GPFS sold and supported by IBM.
► Test and Development file system (T&D)
  • 1 SFA 12K-20E couplet, w/ 10 SS7000 enclosures, 560 3TB SATA drives, 32 200GB SSDs
  • Embedded VMs are dedicated GPFS NSD servers, with GPFS sold and supported by IBM.
► HPSS Cache
  • 1 SFA 12K-40 couplet, w/ 10 SS7000 enclosures, 560 3TB SATA drives, 32 200GB SSDs
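The totals quoted for Mira and its scratch file system can be reproduced from the per-rack and per-couplet figures on this slide. The sketch below is only a sanity check, not a sizing tool: it assumes "K" means 1024 and a terabyte of RAM means 1024 GB (which is what makes the slide's totals come out), uses decimal units for drive capacity, and treats the "240+ GB/s" figure as a lower bound. Variable names are illustrative.

```python
# Figures copied from the Mira slide. Reading "K" as 1024 and a terabyte of
# RAM as 1024 GB is an assumption that reproduces the slide's totals.
RACKS = 48
NODES_PER_RACK = 1024
CORES_PER_NODE = 16
RAM_GB_PER_NODE = 16

nodes = RACKS * NODES_PER_RACK                      # compute nodes
total_cores_k = nodes * CORES_PER_NODE // 1024      # cores, in units of 1024
total_ram_tb = nodes * RAM_GB_PER_NODE // 1024      # RAM, in units of 1024 GB

# Scratch file system: raw SATA capacity (decimal TB/PB, SSDs excluded)
SCRATCH_COUPLETS = 16
SATA_DRIVES_PER_COUPLET = 560
SATA_DRIVE_TB = 3
SCRATCH_GB_PER_S = 240   # the slide says "240+", so this is a lower bound

scratch_raw_pb = SCRATCH_COUPLETS * SATA_DRIVES_PER_COUPLET * SATA_DRIVE_TB / 1000
gb_per_s_per_couplet = SCRATCH_GB_PER_S / SCRATCH_COUPLETS

if __name__ == "__main__":
    print(f"nodes: {nodes}")                                   # 49152
    print(f"cores: {total_cores_k}K")                          # 768K
    print(f"RAM: {total_ram_tb} TB")                           # 768 TB
    print(f"scratch raw capacity: {scratch_raw_pb:.2f} PB")    # 26.88 PB
    print(f"scratch bandwidth per couplet: {gb_per_s_per_couplet} GB/s")  # 15.0 GB/s
```

The per-couplet result of at least 15 GB/s sits below the 20GB/s file I/O rating listed for the SFA12K-20E.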


Integration with Web Object Scaler

► Built for collaboration across geos
► Simulate on GRIDScaler and distribute using WOS
► Ingest using WOS access (NFS and CIFS) and simulate on GRIDScaler
► Back up files safely to the WOS cloud for disaster recovery

[Diagram: CIFS Access; Clustered NFS Access; Simulations using Parallel File Systems]


Reference Customers with GridScaler

Customers that require tremendous client-side bandwidth and/or would like to scale to 1,000s of client nodes.

Customers that require low latency communication for client access with high availability.

Customers that need to scale to multi-petabytes with simplified data management.

Customers that require the world’s leading file storage system for density and energy efficiency.


ExaScaler

• The World's Fastest Parallel File System at 300 GB/s

• Completely Open Source Technology

• Fault-Tolerant architecture to support Mission Critical Operations

• Superior Data and Performance Protection with both S2A and SFA technologies

• Massive Scalability >10PB Volumes

• Allows concurrent file and directory access

• SFA10000E Features ExaScaler Natively Embedded

• Leading Rack, Energy and Disk Performance Efficiency

• Deployed by the HPC Storage Experts

7 Years of HPC

Leadership in


Worldwide Performance Leader

DDN ExaScaler Technology Powers the World's Fastest File Systems:
  • 300GB/s at CEA
  • 250GB/s at US DoD
  • 250GB/s at ORNL

• The World's Fastest Parallel File System
• RDMA Capable: Single Client throughput of 2.5GB/s+
• Ideal for 10,000s of clients concurrently accessing storage
    o Supports full single file concurrency
    o Supports full single directory concurrency
• Network Flexible
    o Writes Natively to IP / IB
• S2A Real-time storage technology provides guaranteed performance
    o Writes are as fast as Reads
    o No degradation in performance due to drive errors, rebuild events or enclosure failures


Designed for Scale

► Scalable Volume and Namespace
  • Scale from a few TBs to greater than 10 PB volumes
► Scalable Bandwidth
  • From a few MB/s to greater than 300 GB/sec of bandwidth
  • Excellent price/performance
► Scalable Connectivity
  • Seamlessly scale from a single client to 20,000 clients concurrently accessing storage
  • Stripe files and directories across multiple OSSs
► Scalable Resiliency
  • No single point of failure
  • Add multiple paths to storage with additional data servers
► Scalable Simplicity
  • ExaScaler monitoring and event notification system
  • Management console with SNMP trap display utility, logging, remote administration and notification utility

IB with RDMA/1GbE/10GbE

Add 1,000s of ExaScaler Clients

Add ExaScaler Building Blocks
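The "stripe files across multiple OSSs" point is what lets bandwidth grow as building blocks are added: consecutive chunks of a file land on different servers, so a large read or write is spread over all of them. The sketch below illustrates generic round-robin striping only. It is not ExaScaler's actual layout code, and the function names, stripe size, and target count are all assumptions chosen for the example.

```python
from typing import List, Tuple


def stripe_layout(file_size: int, stripe_size: int, num_targets: int) -> List[Tuple[int, int, int]]:
    """Split a file into chunks and assign them to storage targets round-robin.

    Returns a list of (target_index, file_offset, length) tuples.
    Hypothetical helper for illustration; not a real ExaScaler API.
    """
    if stripe_size <= 0 or num_targets <= 0:
        raise ValueError("stripe_size and num_targets must be positive")
    chunks = []
    offset = 0
    while offset < file_size:
        length = min(stripe_size, file_size - offset)
        # Chunk k of the file goes to target k mod num_targets.
        target = (offset // stripe_size) % num_targets
        chunks.append((target, offset, length))
        offset += length
    return chunks


def bytes_per_target(file_size: int, stripe_size: int, num_targets: int) -> List[int]:
    """Total bytes each target holds for one striped file."""
    totals = [0] * num_targets
    for target, _, length in stripe_layout(file_size, stripe_size, num_targets):
        totals[target] += length
    return totals


if __name__ == "__main__":
    MIB = 1024 * 1024
    # A 10 MiB file with 1 MiB stripes over 4 targets (illustrative values).
    print([b // MIB for b in bytes_per_target(10 * MIB, 1 * MIB, 4)])  # [3, 3, 2, 2]
```

With this layout, a sequential read of the file touches every target in turn, so aggregate throughput is bounded by the sum of the targets rather than by any single server.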


Sample Customers with ExaScaler

And many more…


Case Studies


Oak Ridge National Laboratory – Jaguar Supercomputer

The HPC storage leader, DDN maintains its advantage through hyper-scale expertise

At 240GB/s, DDN has delivered the world's most scalable bio-informatics storage solution to ORNL

ORNL has over 400GB/s of DDN performance- and capacity-efficient solutions

Today's supercomputers are tomorrow's workgroup clusters; only DDN technology scales up and down to meet and grow with the complex requirements of life sciences

Enabling Petascale Research

Gating Mechanism of Membrane Proteins
PI: Benoit Roux, Argonne National Laboratory/University of Chicago

Physical Basis of Recalcitrance to Hydrolysis of Lignocellulosic Biomass
Principal Investigator: Jeremy Smith, ORNL


Scalability Case Study: Seismic Workflow

The world’s 10th largest company depends on DDN Solutions to scale production operations…

When Total decided they needed to increase their seismic computation by 10X, they turned to DDN's ExaScaler File Storage technology.

With DDN technology, Total can scale 10X or 100X and is now ready for whatever future requirements may arise.


Thank You.
