High Performance Storage in Today’s Critical Applications

March 23, 2014
Andy Walls, IBM Fellow, CTO and Chief Architect, IBM Flash Systems

© 2013 IBM Corporation



Hard Disk Drive History

• RAMAC was the first hard disk drive!

– One of the top technological inventions. . . . EVER!!

• 5MB across 50 HUGE platters

• After 50 years, the capacity increase is incredible.

• As are the reliability increases. . . .

• Performance limited by the rate at which it can spin.

– 15K RPM

• Has not kept up with the speed of CPUs

RAMAC Prototype


Hard Disk Drive History

HDD growth focus: areal density for 50 years

HDD access latency: improved by <10% per year for most of that period

[Chart labels: Cache, Access Latency, Data Rate, Areal Density, Outer Diameter]

Data rate has just topped 100 MB/sec, but RPM is not increasing. New increases will come from linear density improvement.

So: with HDDs, performance improvements have been gained by scaling out high-speed disks and using only a portion of each disk.


Hard Disk Drive Technology Has Not Kept Up With Advances in CPUs or CPU Scaling

As you can see from this database example, which uses rotating disk drives, even well-tuned databases have the opportunity to improve performance and reduce hardware resources

[Chart: percent of CPU time (App %, Sys %, I/O Wait %) plotted against clock time]

Source: Internal IBM performance lab testing

Reducing I/O wait time can allow for higher server utilization


Performance Gap

CPU performance up 10x this last decade

Storage has grown in capacity but has been unable to keep up in performance

Systems are now latency and I/O bound, resulting in a significant performance gap

From 1980 to 2010, CPU performance has grown 60% per year*

…and yet, disk performance has grown ~5% per year during that same period**
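
To see how far apart the two growth rates quoted above drift, the compounding can be worked out directly. This is a minimal sketch that takes the slide's 60% and ~5% per-year figures at face value over the 30 years from 1980 to 2010; the function name `compound_growth` is our own and not from the presentation.

```python
def compound_growth(rate_per_year: float, years: int) -> float:
    """Return the total improvement factor after compounding a yearly rate."""
    return (1.0 + rate_per_year) ** years


YEARS = 2010 - 1980  # 30 years, per the slide

cpu_factor = compound_growth(0.60, YEARS)   # CPU: 60% per year (slide figure)
disk_factor = compound_growth(0.05, YEARS)  # disk: ~5% per year (slide figure)

print(f"CPU improvement over {YEARS} years:  {cpu_factor:,.0f}x")
print(f"Disk improvement over {YEARS} years: {disk_factor:.1f}x")
print(f"Resulting gap: {cpu_factor / disk_factor:,.0f}x")
```

Taken literally, the quoted rates compound to a CPU improvement on the order of a million-fold against a little over 4x for disk, which is the gap this slide is describing.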

IT Infrastructure Challenges


Flash is a powerful accelerator for today’s critical applications

• Big Data – Hadoop, MongoDB, Cassandra

• High Performance Cloud

• Business Analytics

• OLTP

• HPC


How Flash Accelerates Today’s Most Critical Applications

• Latency
– Inherently low read latency
– Systems employ DRAM for buffering, so write latency can be very low

• IOPS
– Very high IOPS
– More importantly, high IOPS with low average response time under load
– More consistent performance: can handle temporary workload spikes

• High Throughput
– Reduced table scan times
– Reduced time for clones and snapshots
– Reduced time for backup coalescence

• Reduction in batch windows
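
The link between IOPS and response time under load can be illustrated with Little's Law (throughput = I/Os in flight / average response time). The sketch below is ours, not part of the presentation: the function name `sustained_iops`, the queue depth, and both response times are assumed values chosen only to show the shape of the relationship.

```python
def sustained_iops(outstanding_ios: int, response_time_us: float) -> float:
    """Little's Law: throughput = concurrency / response time.

    outstanding_ios  -- I/Os kept in flight at once (queue depth)
    response_time_us -- average response time per I/O, in microseconds
    """
    return outstanding_ios * 1_000_000 / response_time_us


# Illustrative numbers only; they are assumptions, not figures from the slides.
QUEUE_DEPTH = 32
disk_iops = sustained_iops(QUEUE_DEPTH, 5_000)  # assumed 5 ms per I/O
flash_iops = sustained_iops(QUEUE_DEPTH, 200)   # assumed 200 us per I/O

print(f"Disk-like device:  {disk_iops:,.0f} IOPS at queue depth {QUEUE_DEPTH}")
print(f"Flash-like device: {flash_iops:,.0f} IOPS at queue depth {QUEUE_DEPTH}")
```

At the same queue depth, lower response time translates directly into proportionally higher sustained IOPS, which is why the response time under load matters more than a peak IOPS figure on its own.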


The Impact of Low Latency on CPU Performance

100 microseconds : 1 second :: 1 second : 2.78 hours
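
The analogy above is a simple scaling: one second is 10,000 times longer than 100 microseconds, so stretching one second by the same factor gives 10,000 seconds. A minimal sketch of that arithmetic follows; the variable names are ours.

```python
MICROSECONDS_PER_SECOND = 1_000_000
SECONDS_PER_HOUR = 3_600

flash_latency_us = 100  # 100 microseconds, from the slide

# How many times longer is one second than a 100-microsecond access?
ratio = MICROSECONDS_PER_SECOND // flash_latency_us

# Apply the same ratio to one second and express the result in hours.
scaled_seconds = 1 * ratio
scaled_hours = scaled_seconds / SECONDS_PER_HOUR

print(f"1 second is {ratio:,} times longer than {flash_latency_us} microseconds")
print(f"{scaled_seconds:,} seconds is about {scaled_hours:.2f} hours")
```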

MicroLatency delivers microsecond response times to accelerate critical applications and achieve competitive advantage

• Faster decision making
• Increase revenue
• Accelerate cost savings
• Eliminate wait time
• Scale performance with capacity

[Chart: time breakdown per operation. Disk-Based: I/O Time, Network Time, CPU Time. FlashSystem: I/O Time, Network Time, CPU Time, Time Recovered.]


The Value of Performance

Extreme performance enables businesses to unleash the power of performance, scale, and insight to drive services and products to market faster

• Improved end-user experience
• Faster insights into critical applications

A 1-SECOND DELAY IN PAGE LOAD TIME =

• 7% LOSS IN CONVERSIONS
• 11% FEWER PAGE VIEWS
• 16% DECREASE IN CUSTOMER SATISFACTION

In dollar terms, this means that if your site typically earns $100,000 a day, this year you could lose $2.5 million in sales.

Source: Aberdeen Group
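
The dollar figure above can be reproduced with a back-of-the-envelope calculation. This sketch assumes, for illustration only, that the 7% conversion loss maps directly onto a 7% revenue loss and that the site earns the same amount on all 365 days; neither assumption is stated in the source.

```python
DAILY_REVENUE_USD = 100_000   # from the slide
CONVERSION_LOSS_PERCENT = 7   # from the slide
DAYS_PER_YEAR = 365           # assumption: revenue is earned every day

# Integer arithmetic keeps the result exact.
annual_loss_usd = DAILY_REVENUE_USD * CONVERSION_LOSS_PERCENT * DAYS_PER_YEAR // 100

print(f"Estimated annual loss: ${annual_loss_usd:,}")
```

Under these assumptions the estimate comes to roughly $2.56 million, in line with the rounded $2.5 million quoted on the slide.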


Much has Changed Around Flash Enabling Technology

• Given the right controller technology, one really does not have to worry about endurance any more
– IBM is a leader in enabling MLC for enterprise applications

• Well-designed all-flash arrays can deliver excellent write performance

• Flash has excellent sequential throughput characteristics
– Not just good random IOPS
– Most workloads have some attributes of each, and Flash excels at both


Flash Offers Other Significant Advantages

• Power reductions
– A key consideration in driving Internet data centers to Flash
– Can be the main driver in Internet data centers and Big Data

• Density
– Incredible densities per rack unit possible with Flash
– Saves rack space, floor space

• Form Factors and Flexibility
– Can be placed in many parts of the infrastructure
– Can go on DIMMs, PCIe slots, attached directly via cables, unique form factors, etc.

4TB Custom Flash Module


High Performance Networked Flash Storage Architectures

• Inside Traditional Storage Systems
– Hybrid or pure storage

• All Flash Arrays
– SAN Attached
– RDMA SAN
   • IB SRP, iSER, RoCE
– SAN “Less”
   • Ethernet, iSCSI
– Building blocks for scale out storage

• Advantages
– Shared!
– High Availability built in
– Advanced storage function like Disaster Recovery
– All flash array is flash optimized from ground up

• Perceived weaknesses
– Network latencies
– Further away from CPU


World Class and Consistent Performance!

[Chart: IBM FlashSystem 840 Random 4K Read/Write Performance. X axis: IOPS, 0 to 1,200,000. Y axis: Response Time (ms), 0.00 to 4.00. One curve per workload mix: 100% rr, then 90% rr / 10% rw through 10% rr / 90% rw in 10% steps, and 100% rw.]


High Performance Direct Attached Flash Storage Architectures

• PCIe Drawers
– Dense and can be attached to 2 servers

• PCIe Cards

• Flash DIMMs

• Advantages
– Attached to lowest latency buses
– Memory bus is snooped
– Uses existing infrastructure for power/cooling

• Perceived weaknesses
– No inherent high availability
– Mirroring more expensive than RAID
– No advanced DR or storage functionality
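
The point that mirroring costs more than RAID comes down to usable capacity. The sketch below compares a two-way mirror with a single-parity layout; the ten-device group size and the function names are hypothetical values picked for illustration, not figures from the presentation.

```python
def usable_fraction_mirror(copies: int = 2) -> float:
    """Fraction of raw capacity that holds unique data when mirroring."""
    return 1 / copies


def usable_fraction_parity(devices: int, parity_devices: int = 1) -> float:
    """Fraction of raw capacity that holds data in a parity-protected group."""
    return (devices - parity_devices) / devices


DEVICES = 10  # hypothetical group size, not from the slides

mirror = usable_fraction_mirror()
parity = usable_fraction_parity(DEVICES)

print(f"Two-way mirror: {mirror:.0%} usable, {1 / mirror:.2f} TB raw per usable TB")
print(f"Single parity:  {parity:.0%} usable, {1 / parity:.2f} TB raw per usable TB")
```

With these assumed numbers, mirroring needs twice the raw flash per usable terabyte, while the parity group needs only about 11% extra.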


Bottlenecks in Flash Storage

• RAID Controllers
– Flash-optimized RAID controllers with hardware assists now exist

• Network HBAs
– Reductions in latency
– RDMA protocols

• OS and Stack Latency!
– Standard driver model adds significant latency and reduces IOPS per core by an order of magnitude
– Fusion-io Atomic Writes
– NVMe and SCSIe
– IBM Power CAPI

• Many Legacy Applications written around HDDs
– Added path length to coalesce, avoid store, etc.


CAPI Attached Flash Value

Concept
• Attach FlashSystem to POWER8 via CAPI coherent attach
• CAPI flash controller operates in user space to eliminate 97% of instruction path length
• Lowest achievable overhead and latency, memory to flash
• Saves up to 10-12 cores per 1M IOPS

* CAPI (Coherent Accelerator Processor Interface)

[Diagram: two I/O stacks compared.]

Legacy Filesystem Stack
• Layers shown: Application (User Space); Kernel: FileSystem, LVM, Disk & Adapter DD
• Standard PCI-express Bus
• Bounce buffering and context switch overheads
• Memory pinning and mapping for DMA overhead

CAPI Flash Stack
• Layers shown: Application, User Space Library (User Space); Shared Memory Work Queue in cache hierarchy
• CAPI Bus
• Lowest achievable latency and overhead from DRAM to Flash
• 20K instructions reduced to <500 (lower core overhead)
• Allows many cores to directly drive IOPS
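
The 97% path-length figure can be checked against the instruction counts quoted on the slide. This is a rough sketch: it treats 500 as an upper bound for the "<500" figure, and it makes no assumption about instructions per second per core, so it stops short of deriving the quoted core savings.

```python
LEGACY_INSTRUCTIONS_PER_IO = 20_000  # slide: "20K instructions"
CAPI_INSTRUCTIONS_PER_IO = 500       # slide says "<500"; 500 used as an upper bound
TARGET_IOPS = 1_000_000              # slide quotes savings per 1M IOPS

# Fraction of the instruction path length removed by the user-space stack.
reduction = (
    LEGACY_INSTRUCTIONS_PER_IO - CAPI_INSTRUCTIONS_PER_IO
) / LEGACY_INSTRUCTIONS_PER_IO

# Instructions the host must execute each second to sustain the target rate.
legacy_per_sec = LEGACY_INSTRUCTIONS_PER_IO * TARGET_IOPS
capi_per_sec = CAPI_INSTRUCTIONS_PER_IO * TARGET_IOPS

print(f"Path length reduction: at least {reduction:.1%}")
print(f"Legacy stack: {legacy_per_sec:,} instructions/s at {TARGET_IOPS:,} IOPS")
print(f"CAPI stack:   {capi_per_sec:,} instructions/s at {TARGET_IOPS:,} IOPS")
```

Going from 20,000 to 500 instructions per I/O is a 97.5% reduction, consistent with the 97% stated in the concept list.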



Workload Optimized Systems and Flash

• Analytics
– Very fast table scans
– Tremendous IOPS capability to identify patterns and relationships

• OLTP
– Credit card, travel reservation, other
– Can share without sacrificing IOPS
– But low response time is key

• Cloud and Big Data
– Either inside servers as hyper-converged, or
– Linear scale out with QoS for grid scale