HUAWEI OceanStor Dorado Series SSD Array Technical White Paper
Issue 1.0
Date 2013-05-24
INTERNAL
HUAWEI TECHNOLOGIES CO., LTD.
Issue 1.0 (2013-05-24) Huawei Proprietary and Confidential
Copyright © Huawei Technologies Co., Ltd.
Copyright © Huawei Technologies Co., Ltd. 2013. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior
written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and
the customer. All or part of the products, services and features described in this document may not be
within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements,
information, and recommendations in this document are provided "AS IS" without warranties, guarantees or
representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.
Huawei Technologies Co., Ltd.
Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China
Website: http://enterprise.huawei.com
Contents
1 Executive Summary
2 Introduction
2.1 Limitation of Traditional Storage Arrays
2.2 Flash Memory
2.2.1 Concept and Principles
2.2.2 Technical Features
2.3 SSD
2.3.1 Address Space Virtualization
2.3.2 Capacity Redundancy
2.3.3 Garbage Collection
2.3.4 Wear Leveling
2.3.5 Bad Block Management
2.3.6 SSD Service Life
3 Solution
3.1 Dorado Series All-Flash-Memory Arrays
3.1.2 Dorado2100
3.1.3 Dorado5100
3.1.4 Dorado2100 G2
3.2 Benefits
3.2.1 Reduced TCO
3.2.2 Improved Customer Service Competitiveness
3.3 Technical Analysis
3.3.1 Problems Caused by SSDs
3.3.2 Design Philosophy
3.4 Reliability, Service Life, and Performance
3.4.1 Reliability
3.4.2 Service Life
3.4.3 Performance
4 Experience
4.1 Application Analysis of All-Flash-Memory Arrays
4.2 Typical Applications in Target Industries
4.3 Typical Cases
4.3.1 OLTP Case
4.3.2 OLAP Case
5 Conclusion
A Acronyms and Abbreviations
1 Executive Summary
Based on in-depth analysis and investigation into customer data centers, Huawei Technologies
Co., Ltd. (Huawei for short) finds that most data centers must cope with the following two
problems.
Problem 1: With the rapid development of cost-effective x86 servers, the virtualization
technology moves from the high-end server market towards common enterprises. Enterprises
of various scales begin to virtualize the infrastructure of their data centers to eliminate various
problems caused by the silo architecture. This virtualized infrastructure improves the
hardware usage of servers, simplifies IT management, and reduces the operating expense
(OPEX) of data centers. However, it also introduces the I/O blender effect. Because the
virtualized infrastructure hides individual application systems from the storage layer, their
I/O streams are blended, and back-end storage arrays receive mostly random I/Os that degrade
their performance. In addition, traditional storage arrays cannot be optimized for upper-layer
application systems. Traditional storage arrays therefore become a performance bottleneck of
the virtualized infrastructure and prevent it from delivering the maximum return on investment
(ROI).
Problem 2: To eliminate the performance bottleneck of traditional storage arrays, many
enterprises increase the IOPS by adding hard disk drives (HDDs). As a result, increased
capital expenditure (CAPEX) is incurred for storage capacity and more OPEX is incurred for
data center space and energy consumption. Traditional storage arrays thus offset a large
proportion of the benefits generated by the virtualized infrastructure. Moreover, stacking HDDs
increases only the IOPS of traditional storage arrays and cannot reduce the I/O response
latency, so service system performance cannot be fully improved.
Some enterprise customers have an in-depth understanding of the problems caused by
infrastructure virtualization and HDD stacking. To resolve these problems, they
install both HDDs and solid-state drives (SSDs) in the same storage array to meet
the requirements of various services. Before 2012, because SSDs had a relatively high unit
price, installing both HDDs and SSDs was the most cost-effective choice for most enterprise
customers. As SSD prices fall and solid state storage technologies develop rapidly,
enterprise customers increasingly tend to use SSD arrays.
To resolve these problems, Huawei has launched the high-performance
OceanStor Dorado series SSD arrays. This document describes the changes in storage technology
brought by solid state storage (especially flash memory), customer concerns, and
application scenarios of Dorado series storage products.
2 Introduction
According to documents issued by the Storage Networking Industry Association (SNIA),
solid state storage means any storage capability that is provided by non-moving memory
technologies rather than moving magnetic or optical media.
According to this definition, the random access memory (RAM), flash memory, and phase
change memory (PCM) are solid state storage. In fact, solid state storage has been used for
storing mission-critical data for a long time. Before using SSDs, some enterprises used RAM
arrays to store core data in real time, meeting the real-time computing requirements.
As flash memories developed and their prices were lowered, some storage array vendors
began to introduce flash memory–based SSDs in their storage arrays to improve storage array
performance in 2008.
After that, the storage industry found that traditional storage arrays were designed around
HDDs. Even though inserting SSDs could instantly improve performance, it could neither
fully exploit the advantages of flash memory nor avoid its disadvantages. In 2010,
various all-flash-memory arrays entered the market, claimed to be storage arrays
designed and developed specifically for flash memory.
As these all-flash-memory arrays entered the market, some vendors of traditional storage
arrays launched storage arrays fully equipped with SSDs and also claimed to have
all-flash-memory arrays. In fact, this claim was a way for the vendors of traditional
storage arrays to expand their market share. As a result, various
all-flash-memory arrays flooded into the market, confusing customers.
All-flash-memory arrays are based on flash memories and SSDs. Only those who understand
flash memory features and SSD design concepts can understand the difference between
all-flash-memory arrays and traditional storage arrays.
This chapter describes flash memories and SSDs.
2.1 Limitation of Traditional Storage Arrays
HDD-based traditional storage arrays have provided reliable storage services for customers.
However, as enterprise-class storage applications become more complex, traditional storage
arrays expose the following limitations:
Reliability
Because HDDs are limited by their mechanical structure, the annual failure
rate (AFR) of each HDD cannot be further reduced, restricting reliability improvement.
Performance
Traditional storage arrays improve the IOPS by stacking a large number of HDDs.
However, this method cannot reduce I/O latency. In addition, as data center virtualization
expands, traditional storage arrays receive more random I/Os, making dedicated
optimization more difficult.
Cost
Enterprises obtain required IOPS by stacking HDDs. This requires the additional costs of
capacity, equipment room space, and power consumption.
Flash memories and their reduced cost make it possible to resolve the previous problems.
2.2 Flash Memory
RAM-, flash memory-, and PCM-based storage modes and technologies are all called solid
state storage.
Currently, because flash memories have advantages in price, capacity, and reliability over
other storage media, they are widely applied in the solid state storage area.
2.2.1 Concept and Principles
A flash memory is an electronic non-volatile computer storage device. Non-volatile means
that the data stored in a flash memory will not be lost if the flash memory is powered off.
Currently, there are two main types of flash memory: Negated OR (NOR) and Negated AND
(NAND). Because of different usage, these two types of flash memory are used in different
areas. The NOR flash is used to store system start programs and commonly used in embedded
devices. The NAND flash is used to store data and commonly used in SSDs.
These two types of flash memory have the same operation principles. A memory cell consists
of three parts: source, drain, and gate. The source-drain current is controlled by the electric
field effect. A floating gate is added between the gate and silicon substrate to hold the
electrons that represent the stored data.
Figure 2-1 Flash memory cell
A flash memory cell shown in Figure 2-1 represents 1-bit data. Charging the floating gate is
logically equivalent to the binary "0" value and discharging is logically equivalent to the
binary "1" value.
The binary "1" value indicates erasing.
The binary "0" value indicates programming.
2.2.2 Technical Features
Because the NAND flash is commonly applied in SSDs and all-flash-memory arrays, the flash
memory in the following sections refers to the NAND flash.
NAND flash structure
Figure 2-2 NAND flash chip structure
− Figure 2-2 shows that each NAND flash chip consists of thousands of identical
blocks, each ranging from several hundred KB to several MB in size.
− Each block is divided into equally sized pages of 4 KB or 8 KB.
Data writing
− Data is written to the flash memory in pages.
− If a page already has data, new data cannot be written to this page until the existing
data on this page is cleared.
− Data is cleared in blocks. Each clear operation erases an entire block.
Clearing is equivalent to erasing the flash memory. After a block is
erased, all bits in this block are set to 1.
− Writing is equivalent to the programming of the flash memory. After data is written
onto a page, specified bits are changed from 1 to 0. In this way, this page saves the
data.
− The flash memory works in program-erase cycles. Each cycle is called one
program/erase (P/E) cycle.
− Each block in the flash memory has a finite number of P/Es. Once the number of P/Es
reaches the threshold, data can no longer be accessed correctly.
− The number of P/Es depends on multiple factors. The number of P/Es in the
following sections is based on the performance of mainstream memories at the point
in time when this document is prepared.
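The writing rules above can be illustrated with a minimal sketch. The class name `FlashBlock`, the tiny page and block sizes, and the in-memory byte model are assumptions made for illustration only and do not describe any real NAND interface.

```python
class FlashBlock:
    """Toy model of one NAND block: erase sets every bit to 1, program can only clear bits."""

    def __init__(self, pages=4, page_bytes=4):
        self.page_bytes = page_bytes
        self.pages = [bytes([0xFF]) * page_bytes for _ in range(pages)]
        self.programmed = [False] * pages
        self.pe_cycles = 0  # number of completed program/erase cycles

    def program(self, page, data):
        """Write one full page; refuse if the page already holds data."""
        if self.programmed[page]:
            raise ValueError("page already holds data; erase the block first")
        if len(data) != self.page_bytes:
            raise ValueError("data must fill exactly one page")
        # Programming can only change bits from 1 to 0, hence the bitwise AND.
        self.pages[page] = bytes(old & new for old, new in zip(self.pages[page], data))
        self.programmed[page] = True

    def erase(self):
        """Erase the whole block: all bits return to 1 and one P/E cycle is consumed."""
        self.pages = [bytes([0xFF]) * self.page_bytes for _ in self.pages]
        self.programmed = [False] * len(self.pages)
        self.pe_cycles += 1


block = FlashBlock()
block.program(0, b"\x12\x34\x56\x78")
try:
    block.program(0, b"\x00\x00\x00\x00")  # second write to the same page is rejected
except ValueError as err:
    rewrite_error = str(err)
block.erase()  # the page becomes writable again only after a block erase
```

The sketch enforces page-granular programming and block-granular erasing, which is the asymmetry that the SSD mechanisms in section 2.3 are built around.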
Data reading
− After a period of time, errors may occur on multiple bits of data stored in the flash
memory. If data read from the page is transferred to upper-layer services, these
services may fail.
− To ensure that data transferred to upper-layer services is correct and valid, the flash
memory reserves space for storing error-correcting codes (ECCs) of service data.
When data is being read, controllers use corresponding ECCs to detect and correct
errors for the data.
− Because controllers have limited computing capability, the error-correcting range of
an ECC is restricted. ECCs are effective only when the number of bit errors does not
exceed the upper threshold. Currently, a typical ECC can correct up to 32 bit errors in each
1 KB. When the number of bit errors is within 32 in each 1 KB of service data or
ECC data, controllers can recover correct and valid service data.
− If the number of bit errors on a page exceeds the correction capability of a controller,
data on this page cannot be read correctly and an uncorrectable (UNC) error is reported.
− A UNC error can be repaired only by a higher-level RAID mechanism.
Categories of the NAND flash
− The NAND flash is divided into single-level cell (SLC), multi-level cell (MLC), and
triple-level cell (TLC).
− Figure 2-1 shows a cell of a NAND flash chip.
− In an SLC, each cell can store only one bit of information: 0 or 1.
− In an MLC, each cell can store multiple bits of information. In practice, each
cell stores only two bits of information: 00, 01, 10, or 11.
− In a TLC, each cell can store three bits of information: 000, 001, 010, 011, 100, 101, 110,
or 111.
− The MLC is further divided into the enterprise MLC (eMLC) and consumer MLC
(cMLC).
− The eMLC and cMLC are the same in essence. During manufacturing, vendors classify
MLCs that are verified to withstand a larger number of P/Es as eMLCs and the remaining
MLCs as cMLCs.
− The MLC is commonly used to represent the cMLC in the industry.
− Various types of the NAND flash vary in capacity, the number of P/Es, and price, as
shown in Table 2-1.
Table 2-1 Comparison among NAND flash types
Type    Capacity per Unit Volume    Number of P/Es    Price per Unit Capacity
SLC     Small                       About 100,000     High
eMLC    Moderate                    About 30,000      Medium
cMLC    Moderate                    5000 to 10,000    Low
TLC     Large                       500 to 1000       Very low
− Currently, the SLC and eMLC are applied in the enterprise-class market and the
cMLC is applied in the consumer market. The TLC has not been applied widely.
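The relationship between bits per cell and the number of distinguishable cell states can be sketched as follows. The helper name `cell_states` is hypothetical; the bit counts follow the SLC, MLC, and TLC descriptions above.

```python
from itertools import product

# Bits stored per cell for each NAND flash category described above.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3}


def cell_states(bits):
    """Return every bit pattern a cell storing `bits` bits must distinguish (2**bits patterns)."""
    return ["".join(pattern) for pattern in product("01", repeat=bits)]


states = {name: cell_states(bits) for name, bits in BITS_PER_CELL.items()}
for name, patterns in states.items():
    print(name, len(patterns), patterns)
```

Each additional bit doubles the number of charge levels a cell must hold apart, which is consistent with the lower P/E figures for denser cell types in Table 2-1.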
2.3 SSD
This document describes flash memory–based SSDs. Figure 2-3 shows the structure of an
SSD.
Figure 2-3 Components of an SSD
An SSD consists of a controller, memory, and flash chips. Most vendors keep the external form,
interface properties, and data access methods of SSDs consistent with those of HDDs.
Therefore, SSDs can be applied to scenarios where HDDs are applied.
The controller provides the ports for connecting external hosts and managing internal flash
memory and uses the embedded CPU to run SSD firmware. The SSD firmware manages the
storage address space that can be perceived by hosts, the flash memory physical space,
garbage collection (GC), and wear leveling (WL). Controllers used by Huawei SSDs are
self-designed ASIC chips with independent intellectual property rights that provide 6 Gbit/s
SAS 2.0 ports.
The memory is used to run SSD firmware and store the entries required by address space
virtualization.
Multiple flash chips are distributed on the circuit board to provide storage space for SSDs.
Compared with an HDD, an SSD does not have a voice coil motor or cantilever. Therefore,
an SSD is highly shock-resistant. In addition, the high concurrency and low latency of an
SSD increase its IOPS by more than two orders of magnitude.
2.3.1 Address Space Virtualization
Section 2.2.2 Technical Features indicates that the number of flash P/Es is limited. If the
number of P/Es in a physical area reaches the upper limit because a large amount of data is
written to hotspot areas, the physical area fails. To resolve this problem and accelerate SSDs'
response to write requests, address space virtualization is designed for SSDs.
What is address space virtualization?
1. The mapping from the logical block address (LBA) to the physical block address (PBA)
of an SSD is changeable.
2. The minimum manageable unit of an SSD is page. Each page has a unique number,
namely the PBA.
3. A mapping table is maintained in an SSD to record mappings between LBAs and PBAs.
4. When data is written to an SSD, one or more clean pages are selected for storing the data.
Meanwhile, the mappings are recorded in the mapping table. Clean pages are
pages that have been erased and not yet programmed.
5. After address space virtualization, if data is written to the same area on an SSD
repeatedly, the data is written to different physical areas on the SSD.
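The five points above can be sketched as a small mapping table. The class name `MappingTable`, its fields, and the first-come page allocation are illustrative assumptions and not the actual firmware design.

```python
class MappingTable:
    """Toy LBA-to-PBA mapping: every write lands on a fresh clean page."""

    def __init__(self, total_pages):
        self.clean_pages = list(range(total_pages))  # erased and not yet programmed
        self.lba_to_pba = {}                         # current logical-to-physical mapping
        self.invalid_pages = set()                   # pages holding superseded data

    def write(self, lba):
        """Store data for `lba` on a clean page and update the mapping."""
        if not self.clean_pages:
            raise RuntimeError("no clean pages left; blocks must be erased first")
        pba = self.clean_pages.pop(0)
        previous = self.lba_to_pba.get(lba)
        if previous is not None:
            # The old page cannot be overwritten in place, so it is only marked invalid.
            self.invalid_pages.add(previous)
        self.lba_to_pba[lba] = pba
        return pba


table = MappingTable(total_pages=8)
first_pba = table.write(100)   # first write to LBA 100
second_pba = table.write(100)  # rewrite of the same LBA goes to a different page
```

Because the mapping changes on every write, repeated writes to one logical address are spread over different physical pages.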
2.3.2 Capacity Redundancy
To prevent the failure of an entire SSD due to faulty flash memories, SSDs are designed to
achieve capacity redundancy. For example, an SSD with a nominal capacity of 100 GB can
provide an actual flash memory–based physical capacity of more than 110 GB.
The ratio of the capacity that exceeds the nominal capacity to the nominal capacity is
called the redundancy ratio. Generally, an SSD with a larger redundancy ratio provides higher
reliability, longer service life, and better performance.
All OceanStor Dorado series SSD arrays use Huawei self-developed SSDs. The redundancy
ratio of these SSDs is up to 28%, meeting the requirements of enterprises. Table 2-2 lists the
nominal capacities and physical capacities of some Huawei self-developed SSDs.
Table 2-2 Redundancy ratio of Huawei self-developed SSDs
Nominal Capacity Physical Capacity Redundancy Ratio
100 GB 128 GB 28%
200 GB 256 GB 28%
400 GB 512 GB 28%
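As a quick check of Table 2-2, the redundancy ratio can be computed from its definition. The function name `redundancy_ratio` is chosen for illustration; the capacities are taken from the table.

```python
def redundancy_ratio(nominal_gb, physical_gb):
    """Capacity beyond the nominal capacity, expressed as a fraction of the nominal capacity."""
    return (physical_gb - nominal_gb) / nominal_gb


# (nominal capacity, physical capacity) pairs from Table 2-2, in GB.
ratios = {nominal: redundancy_ratio(nominal, physical)
          for nominal, physical in [(100, 128), (200, 256), (400, 512)]}
for nominal, ratio in ratios.items():
    print(f"{nominal} GB: {ratio:.0%}")
```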
2.3.3 Garbage Collection
Address space virtualization and capacity redundancy not only prevent the failure of an
SSD due to flash failure, but also provide working space for the GC of SSDs to ensure steady
SSD performance.
Address space virtualization eliminates repeated writes to the same physical
area but produces junk data and junk pages at the same time. The process is described as
follows:
1. A host accesses LBA 100 and writes data AA. In this example, page 401 is used to store
data AA.
2. After a period of time, the host accesses LBA 100 again and writes data BB. Then the
address space virtualization mechanism transfers data to another page. In this example,
data BB is transferred to page 623.
3. After the previous two steps, data AA is stored on page 401 and data BB is stored on
page 623. Data stored on page 401 is invalid while that stored on page 623 is valid.
4. Data stored on page 401 is called junk data.
5. Because old data in flash memory must be erased before new data is written, new data
cannot be directly written to page 401. Therefore, page 401 is called a junk page until
the data on it is erased.
GC is designed to erase junk data on junk pages. After GC, data can be written to these
pages.
For an SSD, GC is a background task that monitors the usage of blocks and pages on the
SSD. When many junk pages exist, GC performs the following steps:
1. Migrate valid data on blocks that contain many junk pages to other clean pages on other
blocks.
2. Erase the blocks that contain no valid data.
3. Place these blocks into the resource pool for new data writes.
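The three GC steps can be sketched on a simplified model in which each block is a list of page states. The function name `collect_garbage`, the `junk_threshold` parameter, and the victim selection rule are assumptions for illustration; real firmware policies are more elaborate.

```python
def collect_garbage(blocks, junk_threshold):
    """Reclaim blocks that hold at least `junk_threshold` junk pages.

    `blocks` maps a block id to a list of page states: "valid", "junk" or "clean".
    Returns (ids of reclaimed blocks, number of valid pages migrated).
    Raises StopIteration if no clean page is available for a migration.
    """
    victims = [b for b, pages in blocks.items() if pages.count("junk") >= junk_threshold]
    reclaimed, migrated = [], 0
    for victim in victims:
        for index, state in enumerate(blocks[victim]):
            if state != "valid":
                continue
            # Step 1: copy the valid page to a clean page on a block that is not a victim.
            target = next(b for b in blocks if b not in victims and "clean" in blocks[b])
            blocks[target][blocks[target].index("clean")] = "valid"
            blocks[victim][index] = "junk"
            migrated += 1
        # Step 2: erase the block, which no longer contains valid data.
        blocks[victim] = ["clean"] * len(blocks[victim])
        # Step 3: hand the block back to the pool used for new writes.
        reclaimed.append(victim)
    return reclaimed, migrated


blocks = {
    0: ["junk", "junk", "valid", "junk"],
    1: ["valid", "valid", "junk", "valid"],
    2: ["clean", "clean", "clean", "clean"],
}
reclaimed, migrated = collect_garbage(blocks, junk_threshold=3)
```

In this run only block 0 is reclaimed, and its single valid page is rewritten to block 2. That extra page write is data the host never asked for.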
The GC process shows that GC not only reclaims space on SSDs but also generates additional
data writes. For this reason, more data is written to flash memory than the host writes. This
phenomenon is known as write amplification.
When the service model is fixed, the ratio of the data amount written to flash memory
to the data amount that a host writes is a fixed value with a small fluctuation range. This
ratio is called the write amplification coefficient.
The write amplification coefficient varies with the redundancy ratio and the algorithms of
SSD firmware. A smaller write amplification coefficient indicates better performance. The write
amplification coefficient of Huawei self-developed SSDs is about 2.5 for entirely small random
I/Os and about 1.1 for large sequential I/Os, reaching the mainstream level in the industry.
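The definition of the write amplification coefficient can be written out as a short sketch. The function name and the counter values are hypothetical and chosen only so that the result matches the 2.5 figure quoted for small random I/Os.

```python
def write_amplification_coefficient(host_bytes_written, flash_bytes_written):
    """Ratio of data written to flash (host writes plus GC migration) to data written by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return flash_bytes_written / host_bytes_written


# Hypothetical counters: the host wrote 400 GB and GC migrated a further 600 GB.
host_gb = 400
gc_migrated_gb = 600
coefficient = write_amplification_coefficient(host_gb, host_gb + gc_migrated_gb)
print(coefficient)
```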
2.3.4 Wear Leveling
Address space virtualization and capacity redundancy alone cannot prevent
some flash blocks from reaching the upper limit of P/Es earlier than the others. To ensure
that all flash blocks are erased and written evenly, WL is introduced.
WL records the number of P/Es on each block and chooses blocks with fewer P/Es for
erasing or data writing. In this way, WL maximizes the service life of SSDs.
WL is divided into dynamic WL and static WL.
Dynamic WL
Dynamic WL refers to the WL triggered by host I/Os.
When a host sends a write request, an SSD needs to find one or more clean pages for
data writing. Then the dynamic WL algorithm is activated to choose blocks with fewer
P/Es to provide clean pages.
Static WL
Static WL refers to the WL started internally by an SSD.
Suppose that an SSD of 100 GB is fully occupied by user data. Of this 100 GB of data, 99 GB is
cold data that has not been updated since it was written and 1 GB is hot data that is updated
frequently.
The cold data, if not handled, occupies at least 99 GB of physical flash memory space
permanently. Only 1 GB of space is available for data erasing and writing. As a result, the
number of P/Es in the 99 GB space is much smaller than that in the 1 GB space, and the
SSD fails prematurely.
Static WL resolves this problem. Static WL records the number of P/Es on
each block, identifies blocks that are lightly worn and have had no junk pages for
a long time (that is, blocks that store cold data), and migrates valid data on these blocks to
blocks that are worn more heavily. In this way, the entire SSD is worn evenly.
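Both WL variants can be sketched as simple selection rules over per-block P/E counters. The function names, the `wear_gap` parameter, and the sample counters are assumptions for illustration and are not taken from the firmware.

```python
def pick_block_for_write(pe_counts, blocks_with_clean_pages):
    """Dynamic WL: among blocks that can supply clean pages, choose the least-worn one."""
    return min(blocks_with_clean_pages, key=lambda block: pe_counts[block])


def static_wl_candidates(pe_counts, cold_blocks, wear_gap):
    """Static WL: cold blocks whose P/E count trails the most-worn block by at least `wear_gap`.

    Valid data on the returned blocks would be migrated to more heavily worn blocks,
    freeing the lightly worn blocks for hot data.
    """
    most_worn = max(pe_counts.values())
    return [block for block in cold_blocks if most_worn - pe_counts[block] >= wear_gap]


# Hypothetical P/E counters per block id.
pe_counts = {0: 120, 1: 15, 2: 980, 3: 40}
chosen = pick_block_for_write(pe_counts, blocks_with_clean_pages=[0, 2, 3])
to_migrate = static_wl_candidates(pe_counts, cold_blocks=[1, 3], wear_gap=950)
```

Here the host write is steered to block 3, the least-worn block with clean pages, and block 1 is flagged for static WL because it lags the most-worn block by 965 P/Es.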
2.3.5 Bad Block Management
Even though various mechanisms and algorithms are used to prolong the service life of SSDs,
flash memory damage is inevitable. Capacity redundancy provides the foundation for
tolerating flash memory damage.
Flash memory damage is measured in pages. A block contains multiple pages, some of which
may be normal while others are damaged.
In practice, if several pages in a block are damaged, the other pages in the block are
likely to become damaged as well. For this reason, SSD firmware manages flash memory damage
in blocks. If the number of pages on which data cannot be read exceeds a threshold in a block,
the block is regarded as a damaged block. Valid data on the block is then migrated to other
available blocks, and the block is marked as damaged and no longer used for storing service
data.
Generally, SSD firmware discovers bad blocks in the following two ways: host I/O
triggering and internal inspection.
This process is called bad block management.
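The block-level decision described above can be sketched as a threshold check. The function name `find_bad_blocks`, the counter dictionary, and the threshold value are hypothetical; the source does not state the threshold that the firmware uses.

```python
def find_bad_blocks(unreadable_pages, threshold):
    """Return ids of blocks whose number of unreadable pages exceeds `threshold`.

    `unreadable_pages` maps a block id to the count of pages that could not be read,
    whether discovered through host I/O or through internal inspection.
    """
    return sorted(block for block, count in unreadable_pages.items() if count > threshold)


# Hypothetical per-block counts of unreadable pages.
unreadable_pages = {0: 0, 1: 5, 2: 2, 3: 3}
bad_blocks = find_bad_blocks(unreadable_pages, threshold=2)
usable_blocks = [block for block in unreadable_pages if block not in bad_blocks]
```

Blocks returned by the check would have their valid data migrated elsewhere and would then be excluded from further use.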
2.3.6 SSD Service Life
Address space virtualization, capacity redundancy, GC, WL, and bad block management
maximize the service life of SSDs.
Generally, the service life of SSDs can be estimated as follows.
Assume that the host service carried on SSDs is an around-the-clock database service with an
IOPS of about 5000, an average I/O size of 8 KB, and a read/write ratio of 40%:60%. The daily
data amount written is calculated as follows:
5000 x 60% x 8 KB x 60 x 60 x 24 ≈ 2 TB
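The arithmetic above can be reproduced in a few lines. The variable names are illustrative, and decimal units (1 KB = 1000 bytes, 1 TB = 10^12 bytes) are an assumption, since the source does not state which convention it uses.

```python
iops = 5000                      # total I/Os per second
write_fraction = 0.60            # 40%:60% read/write ratio
io_size_bytes = 8 * 1000         # 8 KB per I/O, assuming decimal kilobytes
seconds_per_day = 60 * 60 * 24

daily_write_bytes = iops * write_fraction * io_size_bytes * seconds_per_day
daily_write_tb = daily_write_bytes / 1e12
print(f"{daily_write_tb:.2f} TB written per day")
```

Under these assumptions the result is about 2.07 TB per day, which the paper rounds to 2 TB.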
Based on a daily written data amount of 2 TB and a write amplification coefficient of 2.5 for
entirely random I/Os, the service life of SSDs of various types and capacities is calculated as
follows:
Table 2-3 Estimated service life of Huawei self-developed SSDs
SSD             Calculation Result
100 GB SLC      More than 7 years
200 GB SLC      More than 14 years
200 GB eMLC     More than 4 years
400 GB eMLC     More than 8 years
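The per-drive calculations are not reproduced in this copy of Table 2-3. The following sketch assumes one plausible formula: total writable data (physical capacity multiplied by the number of P/Es) divided by the daily flash write volume (daily host writes multiplied by the write amplification coefficient). The formula, the function name, and the use of the physical capacities from Table 2-2 and the P/E figures from Table 2-1 are assumptions. They are used here because they yield figures consistent with the results in the table.

```python
def service_life_years(physical_gb, pe_cycles, daily_host_tb=2.0, write_amplification=2.5):
    """Estimate SSD service life in years under a constant write load (assumed formula)."""
    total_flash_writes_tb = physical_gb / 1000 * pe_cycles       # data the flash can absorb
    daily_flash_writes_tb = daily_host_tb * write_amplification  # host writes plus GC overhead
    return total_flash_writes_tb / daily_flash_writes_tb / 365


# Physical capacities from Table 2-2; P/E counts of about 100,000 (SLC) and 30,000 (eMLC).
estimates = {
    "100 GB SLC": service_life_years(128, 100_000),
    "200 GB SLC": service_life_years(256, 100_000),
    "200 GB eMLC": service_life_years(256, 30_000),
    "400 GB eMLC": service_life_years(512, 30_000),
}
for ssd, years in estimates.items():
    print(f"{ssd}: {years:.1f} years")
```

Under these assumptions the estimates are roughly 7.0, 14.0, 4.2, and 8.4 years.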
Analysis and statistics from mainstream operating system and storage device vendors
show that the average data amount written to each SSD every day is far less than 50 GB in
the enterprise-class market. The daily written data amount of 2 TB used in the preceding
calculation is therefore 40 times the actual amount.
Compared with the calculation results in Table 2-3, the service life of SSDs under actual
service pressure is about 40 times longer, exceeding 100 years.
Therefore, the service life of SSDs can meet the requirements of various enterprise customers.
3 Solution
As the price of flash media per unit capacity declines, all-flash-memory arrays are entering
the market. Now, many enterprises claim to have all-flash-memory arrays. Free from the
limitations of HDDs, all-flash-memory arrays come in various forms, such as an all-in-one box
or a traditional storage array fully equipped with SSDs. In fact, the all-flash-memory arrays
from some enterprises are traditional storage arrays whose HDDs have been replaced with SSDs.
The I/O response latency of SSDs is two orders of magnitude lower than that of HDDs.
Therefore, traditional storage arrays designed for HDDs cannot fully exploit the advantages of
SSDs.
This chapter describes the difference between all-flash-memory arrays and traditional storage
arrays, basic features of HUAWEI Dorado all-flash-memory arrays, and customer benefits
from all-flash-memory arrays.
3.1 Dorado Series All-Flash-Memory Arrays
The Dorado series are all-flash-memory arrays developed by Huawei and contain
self-developed array controllers, software, and SSDs. The Dorado series features
enhanced reliability, high performance, ease of use, and ease of maintenance.
Figure 3-1 Dorado identifier
Figure 3-1 shows a fast-moving dorado, the identifier of Dorado series all-flash-memory
arrays.
The dorado is a kind of fish, and the name indicates that the Dorado series is the fastest
among storage devices. Dorado series all-flash-memory arrays demonstrate their outstanding
IOPS and latency in globally recognized performance benchmark tests.
3.1.2 Dorado2100
The Dorado2100 is the first product of the Dorado series and has been replaced by the
Dorado2100 G2.
Figure 3-2 Front view of a Dorado2100 controller enclosure with an air hood
Table 3-1 lists Dorado2100 specifications.
Table 3-1 Dorado2100 specifications
Form
    The 2 U controller enclosure houses twenty-four 2.5-inch SSDs.
    The controller enclosure is sold fully equipped with SSDs of the same model and does
    not support expansion disk enclosures.
    All active components, including controllers, power modules, and fans, are redundant
    and field replaceable.
SSD
    SLC: 50 GB and 100 GB
    MLC: 100 GB and 200 GB
Capacity
    SLC: 1.2 TB and 2.4 TB
    MLC: 2.4 TB and 4.8 TB
Host Connectivity
    8 Gbit/s Fibre Channel
Performance
    SPC-1 IOPS™: 100,051.99 @ 0.95 ms
3.1.3 Dorado5100
The Dorado5100 is a high-performance all-flash-memory array that features flexible
configuration and wide coverage.
Figure 3-3 Front view of a Dorado5100 controller enclosure with an air hood
Figure 3-4 Rear view of a Dorado5100 controller enclosure
Figure 3-5 Dorado5100 disk enclosure
Table 3-2 lists Dorado5100 specifications.
Table 3-2 Dorado5100 specifications
Form
    The stand-alone 4 U controller enclosure supports multiple interface cards.
    A 2 U disk enclosure houses twenty-four 2.5-inch SSDs.
    A disk enclosure is sold fully equipped with SSDs of the same model.
    A Dorado5100 device supports a maximum of four disk enclosures of the same
    specifications.
    All active components, including controllers, power modules, and fans, are redundant
    and field replaceable.
SSD
    SLC: 100 GB and 200 GB
    eMLC: 200 GB and 400 GB
Capacity
    SLC: 2.4 TB to 19.2 TB
    eMLC: 4.8 TB to 38.4 TB
Host Connectivity
    8 Gbit/s Fibre Channel, 10 Gbit/s Ethernet or iSCSI
Performance
    SPC-1 IOPS™: 600,052.49 @ 1.09 ms
Advanced Feature
    Snapshot and remote replication
3.1.4 Dorado2100 G2
The Dorado2100 G2 replaces the Dorado2100. It delivers much better performance and more
advanced features than the Dorado2100.
Figure 3-6 Front view of a Dorado2100 G2 controller enclosure with an air hood
Figure 3-7 Rear view of a Dorado2100 G2 controller enclosure without an air hood
Figure 3-8 Dorado2100 G2 disk enclosure
The previous figures show that the Dorado2100 G2 controller enclosure accommodates both
controllers and disks and can be connected to disk enclosures. Each controller enclosure
provides 25 disk slots, which facilitates hot spare disk configuration.
Table 3-3 lists Dorado2100 G2 specifications.
Table 3-3 Dorado2100 G2 specifications
Form
    The 2 U controller enclosure houses twenty-five 2.5-inch SSDs.
    A 2 U disk enclosure houses twenty-five 2.5-inch SSDs.
    Both the disk enclosures and the controller enclosure are sold fully equipped with
    SSDs of the same model.
    A Dorado2100 G2 device supports a maximum of three disk enclosures of the same
    specifications.
    All active components, including controllers, power modules, and fans, are redundant
    and field replaceable.
SSD
    SLC: 100 GB and 200 GB
    eMLC: 200 GB and 400 GB
Capacity
    SLC: 2.5 TB to 20.0 TB
    eMLC: 5.0 TB to 40.0 TB
Host Connectivity
    8 Gbit/s Fibre Channel, 10 Gbit/s Ethernet or iSCSI with the TCP offload engine (TOE)
    technology, and 40 Gbit/s InfiniBand
Performance
    SPC-1 IOPS™: 400,587.11 @ 0.75 ms
Advanced Feature
    Thin provisioning, global WL, and VMware VAAI
3.2 Benefits
All-flash-memory arrays are designed to help customers improve data center processing
capability, enhance service competitiveness, reduce the total cost of ownership (TCO), and
cope with the challenges of application scenarios where traditional storage arrays are not
applicable.
3.2.1 Reduced TCO
Currently, the price per unit capacity of all-flash-memory arrays is several times higher than
that of traditional storage arrays. Therefore, some customers assume that all-flash-memory
arrays are simply more expensive than traditional storage arrays. In practice,
all-flash-memory arrays have low latency, high performance, and low space and power
consumption requirements. In addition, they achieve high IOPS without stacking disks and
cost less to maintain. Therefore, the TCO of all-flash-memory arrays is lower than that of
traditional storage arrays.
Table 3-4 describes the TCO comparison between traditional storage arrays H and I and the
Dorado2100 G2.
Table 3-4 TCO comparison between traditional storage arrays and the Dorado2100 G2
Item                                      Traditional Storage Array H    Traditional Storage Array I    Dorado2100 G2
Disk Configuration                        896 x 10k rpm 300 GB SAS HDDs  230 x 15k rpm 300 GB SAS HDDs  100 x 400 GB eMLC SSDs
Physical Capacity                         268,800 GB                     69,000 GB                      40,000 GB
SPC-1 IOPS™                               109,986.41                     82,496.08                      Approximately 250,000
Latency                                   Approximately 0.5 ms           Approximately 7 ms             Approximately 2 ms
Price (Including 3-year Warranty)         $484,985.78                    $361,416.00                    Approximately $310,000
Price/Capacity Ratio ($/GB)               1.80                           5.24                           Approximately 7.75
Price/Performance Ratio ($/SPC-1 IOPS™)   4.41                           4.38                           Approximately 1.24
Rack Space                                3 racks                        16 U                           8 U
Typical Power Consumption                 Approximately 10 kW            Approximately 3.3 kW           Approximately 1.5 kW
First Year's OPEX                         $42,000                        $5,600                         $2,800
Second Year's OPEX                        $42,000                        $5,600                         $2,800
Third Year's OPEX                         $42,000                        $5,600                         $2,800
TCO                                       $610,985.78                    $378,216.00                    Approximately $318,400
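The derived rows of Table 3-4 can be recomputed from the price, capacity, SPC-1 IOPS, and yearly OPEX rows. The following Python sketch assumes that the TCO is the purchase price plus three years of OPEX and treats the "approximately" values as exact; the dictionary keys and the `summarize` helper are illustrative names, not part of any Huawei tool.

```python
# Input figures copied from Table 3-4. Values marked "approximately" in the
# table are treated as exact here, which is a simplification.
arrays = {
    "Traditional Storage Array H": {
        "price": 484985.78, "capacity_gb": 268800, "iops": 109986.41, "opex_per_year": 42000,
    },
    "Traditional Storage Array I": {
        "price": 361416.00, "capacity_gb": 69000, "iops": 82496.08, "opex_per_year": 5600,
    },
    "Dorado2100 G2": {
        "price": 310000.00, "capacity_gb": 40000, "iops": 250000, "opex_per_year": 2800,
    },
}

YEARS = 3  # the table lists OPEX for three years


def summarize(spec):
    """Return price/capacity, price/performance, and TCO for one array."""
    return {
        "price_per_gb": spec["price"] / spec["capacity_gb"],
        "price_per_iops": spec["price"] / spec["iops"],
        # Assumption: TCO = purchase price + OPEX over the period.
        "tco": spec["price"] + YEARS * spec["opex_per_year"],
    }


summary = {name: summarize(spec) for name, spec in arrays.items()}

for name, row in summary.items():
    print(f"{name}: {row['price_per_gb']:.2f} $/GB, "
          f"{row['price_per_iops']:.2f} $/IOPS, TCO ${row['tco']:,.2f}")
```

The sketch makes the trade-off explicit: the all-flash configuration has the highest price per gigabyte but the lowest price per SPC-1 IOPS and the lowest three-year total.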
The data about traditional storage arrays H and I in Table 3-4 is from SPC-1 reports on the
SPC website and from specifications on the official websites of the products. These storage
arrays passed the SPC-1 benchmark test in early 2013, close to the time when this document
was prepared. Therefore, the data from the SPC-1 reports is a current and relevant reference.
To view the SPC-1 test results, go to
http://www.storageperformance.org/results/benchmark_results_spc1.
In Table 3-4, the Dorado2100 G2 is fully equipped with one hundred 400 GB eMLC SSDs
and provides 40 TB physical storage space, meeting the space requirements of various
high-performance storage devices.
The operating expense (OPEX) mainly consists of space rental and electricity costs. As the
OPEX varies with regions, it is computed based on the rack leasing costs of mainstream
telecom carriers in China.
The previous table shows that traditional storage arrays achieve relatively high performance
by stacking disks, which makes customers pay for capacity they do not need. Even so, the
performance of traditional storage arrays is not comparable with that of all-flash-memory
arrays in terms of IOPS or latency. Besides meeting the capacity requirements of customers,
the Dorado2100 G2 reduces customers' TCO. In addition, the decreasing price of flash media
makes all-flash-memory arrays more competitive.
For customers who need high performance but not much capacity, all-flash-memory arrays
reduce both the investment and the TCO.
3.2.2 Improved Customer Service Competitiveness
All-flash-memory arrays can help customers save investment and solve problems that
traditional storage arrays cannot cope with.
During business expansion, a building materials retailer found that the high latency of its
traditional storage arrays kept the I/O wait of the database running on the server
constantly high. This greatly delayed each transaction on the business system and limited
the growth of effective concurrency across the entire system.
The retailer also found that, beyond a certain level of concurrent pressure, adding
traditional storage arrays no longer improved the performance of the system.
To resolve this problem, the IT department of this retailer communicated with the system
integration agent. The agent provided the following two solutions:
Redesign the IT infrastructure, including servers and storage devices.
Use SSD arrays as the primary storage arrays.
After evaluation, the retailer preferred the second solution because the first solution
required many changes and its result was difficult to assess.
After the Dorado5100 was used as the primary storage device in the original storage system,
the retailer found that the waiting time of service processing was reduced to 18.3% of the
original value. In actual operation, the waiting time of each transaction was reduced to 20%
of the original value. In the simulated environment, the maximum number of system users
increased by 20 times. After the simulation, the retailer chose the Dorado5100 to improve the
service processing capability of the entire system and achieve service expansion.
The previous case shows that all-flash-memory arrays can help customers achieve their
business objectives while saving IT investment and reducing system latency. As a result,
customers' service competitiveness is improved.
For details about this case, see section 4.3.1 "OLTP Case."
3.3 Technical Analysis
An all-flash-memory array is a storage array that contains no HDDs and uses only flash
memory for data access.
According to this definition, all-flash-memory arrays in the market can be divided into the
following three types:
Proprietary structure
Various components are placed in an all-in-one box. This kind of all-flash-memory
array resembles a rack-mounted device filled with SSDs, in effect a larger SSD. Apart from
high performance, these arrays have no other storage features and are hard to maintain.
Open structure
This kind of all-flash-memory array looks the same as a traditional storage array.
However, because its system software is developed specifically for flash memory, it
delivers high performance as well as various storage features.
Traditional storage array fully equipped with SSDs
The HDDs of a traditional array are replaced with SSDs. This kind of all-flash-memory
array has ordinary performance but diversified functions and features.
A traditional storage array whose HDDs are replaced with SSDs can be called an
all-flash-memory array. However, such an array cannot take full advantage of flash memory.
Figure 3-9 Comparison among three types of all-flash-memory arrays
Figure 3-9 compares the SPC-1 IOPS™ performance of these three kinds of SSD arrays.
The horizontal axis represents IOPS and the vertical axis represents latency (ms). The figure
shows the latencies of storage arrays under various IOPS pressures. Data in Figure 3-9 is from
the SPC official website.
SPC-1 is an I/O model and test benchmark defined by the Storage Performance Council
(SPC) and used for simulating the I/O features of online transaction processing (OLTP) and
online analytical processing (OLAP). The SPC-1 test benchmark is well known in the storage
industry. Generally, the SPC-1 IOPS™ test index of entry-level and mid-range storage arrays
is lower than 50k, that of mid-range and high-end storage arrays ranges from 50k to 200k, and
that of high-end storage arrays ranges from 200k to 300k.
Figure 3-9 compares the performance of five models of all-flash-memory arrays. These arrays
are categorized as follows:
Proprietary structure: RamSan-630
Open structure: Dorado series
Traditional storage array fully equipped with SSDs: V7000
Figure 3-9 shows the I/O response latencies of all-flash-memory arrays when the IOPS
increases.
The comparison shows that:
1. The Dorado5100, Dorado2100 G2, and RamSan-630 have better performance than
high-end traditional storage arrays.
2. When the IOPS is low, the RamSan-630 has a slightly lower latency than the Dorado
series. When the IOPS is high, the latency of the Dorado series stays low while that of
the RamSan-630 increases greatly.
3. The latency of the V7000 (all SSDs) increases linearly as the IOPS increases. The V7000
does not bring the low latency of flash memory into full play.
4. Among these storage arrays, only the Dorado series inherits the low latency of flash
memory and has better performance than the other kinds of all-flash-memory arrays.
The Dorado series achieves better performance by rewriting system software, inherits the
reliability and maintainability advantages of traditional storage arrays, and implements higher
reliability and maintainability by improving these inherited advantages based on flash
memory.
3.3.1 Problems Caused by SSDs
Problem caused by low latency
The relationship among the IOPS, concurrent I/Os, and I/O latency is as follows:
IOPS = Number of concurrent I/Os/I/O latency
The latency of a high-performance enterprise-class SAS HDD is about 5 ms for 4 KB I/O
random access.
The latency of a SAS SSD is about 0.2 ms for 4 KB I/O random access.
In Figure 3-10, an HDD and an SSD are each connected directly to a host with the same
configuration. Each host delivers one I/O at a time.
Figure 3-10 An HDD and an SSD directly connected to hosts
In the previous figure, the left drive is an HDD and its IOPS is calculated as follows:
1/5 ms = 200 IOPS
The right drive is an SSD and its IOPS is calculated as follows: 1/0.2 ms = 5000 IOPS
Next, a controller is inserted between each host and its drive, as shown in Figure 3-11.
Figure 3-11 An HDD and an SSD connected to hosts through controllers
The controllers add processing latency. Generally, this latency varies with the pressure. It
is about 0.2 ms for single I/Os.
As a result, the IOPS perceived by the hosts changes.
In the previous figure, the IOPS on the left side is calculated as follows:
1/(0.2 ms + 5 ms) = 192 IOPS
The IOPS on the right side is calculated as follows:
1/(0.2 ms + 0.2 ms) = 2500 IOPS
The previous calculation shows that for an HDD, adding a controller increases the latency
and decreases the IOPS only slightly. This is why the IOPS of HDD arrays can be estimated
by multiplying the IOPS per HDD by the number of HDDs. For an SSD, adding a controller
doubles the latency and halves the IOPS. Therefore, the IOPS of an all-flash-memory array
cannot be estimated by simply multiplying the IOPS per SSD by the number of SSDs.
An all-flash-memory array constructed by replacing HDDs with SSDs cannot bring the high
performance of SSDs into full play.
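The calculations in this section can be reproduced with a few lines of Python. This is a minimal sketch that assumes IOPS equals the number of concurrent I/Os divided by the per-I/O latency, which matches the single-I/O figures above; the function and constant names are illustrative.

```python
def iops(latency_ms, concurrency=1):
    """I/Os completed per second with `concurrency` I/Os in flight."""
    return concurrency * 1000.0 / latency_ms


HDD_LATENCY_MS = 5.0         # 4 KB random access, enterprise-class SAS HDD
SSD_LATENCY_MS = 0.2         # 4 KB random access, SAS SSD
CONTROLLER_LATENCY_MS = 0.2  # controller processing latency for single I/Os

results = {}
for name, drive_ms in (("HDD", HDD_LATENCY_MS), ("SSD", SSD_LATENCY_MS)):
    direct = iops(drive_ms)
    behind_controller = iops(drive_ms + CONTROLLER_LATENCY_MS)
    results[name] = (direct, behind_controller)
    drop = 1 - behind_controller / direct
    print(f"{name}: {direct:.0f} IOPS direct, "
          f"{behind_controller:.0f} IOPS behind a controller ({drop:.0%} lower)")
```

With the same 0.2 ms of controller latency, the HDD loses only a few percent of its IOPS while the SSD loses half, which is why controller latency matters far more in an all-flash-memory array.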
Performance Difference for Processing Random and Sequential I/Os
HDDs and SSDs vary greatly in performance when processing various I/Os. Table 3-5 lists
the performance difference for processing random and sequential I/Os.
Table 3-5 Performance difference between an HDD and an SSD for processing random and
sequential I/Os
                   4 KB Random I/O                             512 KB Sequential I/O
Drive  Operation   IOPS                  Bandwidth (MB/s)      IOPS                  Bandwidth (MB/s)
HDD    Read/Write  Approximately 200     Approximately 0.8     Approximately 400     Approximately 200
SSD    Read        Approximately 20,000  Approximately 80      Approximately 500     Approximately 250
SSD    Write       Approximately 60,000  Approximately 240     Approximately 600     Approximately 300
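The bandwidth columns of Table 3-5 follow from the IOPS columns, because bandwidth is the IOPS multiplied by the I/O size. The short Python sketch below checks this for a few rows; it assumes decimal units (1 MB = 1000 KB), which the table does not state, and uses illustrative names.

```python
KB_PER_MB = 1000  # assumption: decimal megabytes


def bandwidth_mb_s(iops, io_size_kb):
    """Bandwidth in MB/s delivered by `iops` I/Os of `io_size_kb` each."""
    return iops * io_size_kb / KB_PER_MB


hdd_random = bandwidth_mb_s(200, 4)            # 4 KB random I/O on an HDD
hdd_sequential = bandwidth_mb_s(400, 512)      # 512 KB sequential I/O on an HDD
ssd_random_read = bandwidth_mb_s(20_000, 4)    # 4 KB random reads on an SSD
ssd_random_write = bandwidth_mb_s(60_000, 4)   # 4 KB random writes on an SSD

print(f"HDD random: {hdd_random:.1f} MB/s, HDD sequential: {hdd_sequential:.1f} MB/s")
print(f"HDD sequential/random bandwidth ratio: {hdd_sequential / hdd_random:.0f}")
print(f"SSD random read: {ssd_random_read:.0f} MB/s, "
      f"SSD random write: {ssd_random_write:.0f} MB/s")
```

For the HDD, the sequential bandwidth comes out at more than 200 times the random bandwidth, which is the gap that HDD-oriented cache algorithms are designed to hide.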
Table 3-5 shows that for an HDD, the bandwidth for sequential I/Os is more than 200 times as
large as that for random I/Os. This is why traditional storage arrays use complicated cache
algorithms to reorganize service data delivered from hosts and then access HDDs sequentially
to improve system performance.
For an SSD, the bandwidth for sequential writes is only about 4 times as large as that for
random writes. The bandwidth for sequential reads is only about 2 times as large as that for
random reads.
Therefore, the cache algorithm and I/O scheduling algorithm of various vendors for HDDs
may not apply to all-flash-memory arrays.
Performance Bottleneck
For a traditional storage array, the HDDs are the performance bottleneck. Therefore, the IOPS
and bandwidth of the entire storage array can be increased by adding HDDs. In addition, the
system software of a traditional storage array is developed to eliminate the performance
bottleneck caused by HDDs.
For an all-flash-memory array, the IOPS of an SSD reaches tens of thousands. The IOPS of a
disk enclosure with 24 slots reaches one million. Therefore, the performance bottleneck of an
all-flash-memory array lies in the controller, including the CPU processing capability, system
bandwidth, and system software designs and algorithms.
Compared with traditional storage arrays, all-flash-memory arrays have different performance
bottlenecks. Therefore, the hardware design of all-flash-memory arrays must be different.
Otherwise, the high performance of these arrays cannot be brought into full play.
Limited Number of P/Es
For flash memory, the number of program/erase (P/E) cycles is limited. Related studies show
that the failure probability of flash memory increases as the number of P/E cycles increases.
Even though the failure rate of SSDs stays low until the P/E cycles reach the maximum number,
reducing P/E cycles and preventing hotspot data from being frequently written to the same
SSDs lowers the failure rate further and prolongs the service life of the entire
all-flash-memory array.
Currently, many all-flash-memory array vendors provide online deduplication and global WL
to reduce P/E cycles and prevent hotspot areas from failing prematurely.
3.3.2 Design Philosophy
The Dorado series are all-flash-memory arrays developed by Huawei. When designing and
developing the Dorado series, R&D personnel follow these principles:
Inherit the advantages of Huawei traditional storage arrays and adapt to customers'
existing habits in using storage arrays. Traditional storage arrays embody accumulated
experience in ensuring system reliability and maintainability. For example, all active
components are redundant and replaceable online. These features of traditional storage
arrays have been inherited.
Fully consider the performance difference between SSDs and HDDs. The performance
of SSDs is two orders of magnitude higher than that of HDDs, which moves the system
performance bottleneck. Therefore, performance designs made for HDDs must be reviewed.
Fully consider the failure modes of SSDs and HDDs. The AFR of SSDs is 0.44% and
that of HDDs is 0.6%. Even though the statistics show that SSDs are more reliable than
HDDs, the failure modes of SSDs and HDDs are different. Therefore, dedicated design
and development for SSDs can further reduce the failure rate of SSDs on an
all-flash-memory array.
3.4 Reliability, Service Life, and Performance
After several years of continuous development, SSDs and SSD arrays have greatly improved
in reliability, service life, and performance, and meet the requirements of various
enterprise-class storage applications.
3.4.1 Reliability
Reliability Basics
Reliability is divided into reliability in the narrow sense and reliability in the broad
sense. In the narrow sense, reliability refers to the ability of a device to run without
failure. In the broad sense, it covers the zero-failure probability of a device, the mean
time to repair a device, and the availability of a device over a long running period.
Reliability in this document refers to reliability in the broad sense.
Zero-failure probability
Mean time between failures (MTBF), failure in time (FIT), and annual failure rate (AFR)
are used to assess the failure probability of devices.
MTBF is a measure of the reliability of a system. It refers to the average time between
consecutive failures of a piece of equipment and is expressed in hours. A larger MTBF
indicates a more reliable device.
FIT is the number of failures per one billion device hours. For example,
1 FIT = 1 failure in 10^9 device hours. The FIT of a device is the sum of the FITs of its
components. A smaller FIT indicates a more reliable device.
AFR is a statistical failure rate based on a large number of samples and expressed as a
percentage. A smaller AFR indicates a more reliable device.
The relationship among MTBF, FIT, and AFR is as follows:
− MTBF = 10^9/FIT
− FIT = (10^9 x AFR)/(365 x 24)
Mean time to repair a device
Mean time to repair (MTTR) is used to measure the mean time to repair a device.
MTTR is a basic measure of the maintainability of repairable items. It is the average
time that a component or device takes to recover from a failure and is expressed in
hours. In essence, it reflects the fault tolerance capability of a device. A smaller MTTR
indicates a stronger fault tolerance capability.
Availability of a device in a long-term running
Availability is the probability that a system will work as required during the period of a
mission. Availability can be calculated by the following formula:
A = MTBF/(MTBF + MTTR)
The formula indicates that increasing MTBF or decreasing MTTR can improve the
availability of a device.
For a device consisting of multiple repairable, replaceable, and standalone components,
its availability is the product of the availabilities of its components:
Availability of a device = Availability of component 1 x Availability of component 2 x
... x Availability of component N
For telecommunications, 99.999% availability corresponds to no more than about 5 minutes
of downtime per year.
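The relationships among AFR, FIT, MTBF, MTTR, and availability described above can be sketched in Python as follows. The function names are illustrative; the example inputs (an AFR of 0.29%, an MTBF of 400,000 hours, and an MTTR of 0.5 hours) are figures quoted elsewhere in this document.

```python
HOURS_PER_YEAR = 365 * 24
MINUTES_PER_YEAR = HOURS_PER_YEAR * 60


def fit_from_afr(afr):
    """FIT (failures per 10^9 device hours) from an AFR given as a fraction."""
    return 1e9 * afr / HOURS_PER_YEAR


def mtbf_from_fit(fit):
    """MTBF in hours from a FIT value."""
    return 1e9 / fit


def availability(mtbf_hours, mttr_hours):
    """A = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series_availability(component_availabilities):
    """Availability of a device whose components must all be available."""
    product = 1.0
    for a in component_availabilities:
        product *= a
    return product


ssd_fit = fit_from_afr(0.0029)                   # AFR of 0.29%
ssd_mtbf = mtbf_from_fit(ssd_fit)
controller_a = availability(400_000, 0.5)        # MTBF and MTTR in hours
downtime_min = (1 - 0.99999) * MINUTES_PER_YEAR  # yearly downtime at 99.999%

print(f"SSD FIT: {ssd_fit:.0f}, SSD MTBF: {ssd_mtbf:,.0f} h")
print(f"Controller availability: {controller_a * 100:.5f}%")
print(f"Downtime allowed at 99.999%: {downtime_min:.2f} minutes per year")
```

Working in fractions rather than percentages keeps the formulas identical to the ones in the text and avoids rounding the small unavailability values too early.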
SSD Reliability
SSDs are the primary components of an all-flash-memory array. Compared with HDDs, SSDs
do not have mechanical components. Therefore, the failure modes of SSDs and HDDs are
different. Generally, the failures of SSDs are easy to predict and manage. This is why the AFR
of SSDs is far lower than that of HDDs.
For details about the reliability of SSDs, refer to HUAWEI SSD Technical White Paper.
Hardware Reliability
The Dorado series all-flash-memory arrays use full hardware redundancy design. All active
components are redundant. This eliminates single points of failure and supports online
replacement.
Figure 3-12 Components in a controller enclosure of Dorado5100
Figure 3-12 shows components in a controller enclosure of Dorado5100. These components
are described as follows:
1: system enclosure
The passive design ensures high system reliability.
2: controllers
The two controllers back each other up and are field replaceable.
3: backup battery units (BBUs)
The four BBUs are effective against unexpected power failures.
4: fans
The three fans with 16-gear speed control ensure smooth heat dissipation.
5: power modules
The four power supplies greatly reduce the possibility of system power failures.
6: interface card slots
The interface card slots support Fibre Channel and SAS interface cards.
Software Reliability
Dorado series all-flash-memory arrays use common measures such as active-active
controllers, RAID protection, global hot spare, and online upgrades to ensure system
reliability. In addition, they take dedicated measures based on SSD features and failure
modes to improve system reliability further.
The statistics and analysis about faulty SSDs from Huawei and vendors show that there are
two reasons for SSD failure:
Flash chip failure
SSD hardware defects
Even though the flash chip failure rate cannot be eliminated, the RAID protection and repair
can reduce SSD failure due to flash chip failure.
The HUAWEI Dorado series uses self-developed SSDs. Each SSD generation improves
capability and eliminates the design and implementation defects of the previous
generation. Currently, the Dorado series uses third-generation SSDs.
The Dorado series all-flash-memory arrays implement the following features to improve
system reliability:
Bad block repair. Failed areas on SSDs are identified and their data is repaired using
RAID.
Global capacity redundancy. The redundant space of the other SSDs is used to cope with
the failure of multiple flash chips on some SSDs of the same storage array.
Global anti-wear leveling. Global WL is used to prolong the service life of all SSDs and
global anti-wear leveling is used to prevent multiple SSDs in the same RAID group from
failing at the same time.
SSD staggered running. Many software errors are caused by overflowing counters and
these errors are hard to discover in the development phase. The Dorado series
implements SSD staggered running, in which the running periods of all SSDs are
staggered. In this way, batch failures caused by overflowing counters are prevented.
The average AFR of SSDs in the industry is about 0.44%. The AFR of Huawei SSDs is only
0.29%, about 65% of the average AFR in the industry.
Availability Analysis
The following analyzes and computes the availability of a Dorado series all-flash-memory
array. Table 3-6 lists the reliability data of components in a Dorado series all-flash-memory
array.
Table 3-6 Reliability data of components in a Dorado series all-flash-memory array
Item                  Component         FIT   MTBF (Hour)  MTTR (Hour)  Availability (%)
Controller enclosure  Controller        2500  400,000      0.5          99.99988
                      Backplane         150   6,666,666.7  4            99.99994
                      Fan               1000  1,000,000    0.1          99.99999
                      Power supply      1000  1,000,000    0.1          99.99999
                      BBU               1000  1,000,000    0.1          99.99999
Disk enclosure        Expansion module  400   2,500,000    0.2          99.99999
                      Backplane         150   6,666,666.7  4            99.99994
                      Fan               1000  1,000,000    0.1          99.99999
                      Power supply      1000  1,000,000    0.1          99.99999
SSD                   Disk unit         331   3,021,148    1            99.99997
Analyzing the availability of a Dorado series all-flash-memory array involves the controller
enclosure, disk enclosures, and RAID groups:
Availability of a Dorado series all-flash-memory array = Availability of the controller
enclosure x Availability of disk enclosures x Availability of RAID groups
For a simplified calculation, the following assumes that all redundant components are in 1+1
redundancy.
Availability of the controller enclosure
Availability of dual controllers (A1) = 1 – (1 – Availability of a single controller) x (1 –
Availability of a single controller) = 99.99999%
Availability of the backplane (A2) = 99.99994%
Availability of dual fans (A3) = 1 – (1 – Availability of a single fan) x (1 – Availability of
a single fan) = 99.99999%
Availability of dual power supplies (A4) = 1 – (1 – Availability of a single power supply)
x (1 – Availability of a single power supply) = 99.99999%
Availability of dual BBUs (A5) = 1 – (1 – Availability of a single BBU) x (1 –
Availability of a single BBU) = 99.99999%
Availability of the controller enclosure = A1 x A2 x A3 x A4 x A5 = 99.99990%
Availability of a disk enclosure
Availability of dual expansion modules (A1) = 1 – (1 – Availability of a single expansion
module) x (1 – Availability of a single expansion module) = 99.99999%
Availability of the backplane (A2) = 99.99994%
Availability of dual fans (A3) = 1 – (1 – Availability of a single fan) x (1 – Availability of
a single fan) = 99.99999%
Availability of dual power supplies (A4) = 1 – (1 – Availability of a single power supply)
x (1 – Availability of a single power supply) = 99.99999%
Availability of a disk enclosure = A1 x A2 x A3 x A4 = 99.99991%
Availability of a RAID group
The following assumes that every six SSDs form a RAID 5 group, which tolerates the failure
of only one SSD:
Availability of a RAID group = 1 – (1 – Availability of a single SSD) x (1 – Availability
of a single SSD) x C(6, 2) = 99.99999%
Availability of a Dorado all-flash-memory array
If a Dorado5100 all-flash-memory array is equipped with 1 controller enclosure, 4 disk
enclosures, 96 SSDs, and 16 RAID groups:
Availability of the Dorado5100 all-flash-memory array = Availability of the controller
enclosure x (Availability of a disk enclosure)^4 x (Availability of a RAID group)^16 =
99.99970%
The previous equations show that the hardware of a Dorado series all-flash-memory array
delivers enterprise-class reliability of 99.999% or more.
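The availability analysis above can be reproduced with the following Python sketch. It takes the MTBF and MTTR values from Table 3-6, assumes 1+1 redundancy for all redundant components as the text does, and models a six-SSD RAID 5 group as unavailable when any two of its SSDs are unavailable at the same time. The helper names are illustrative. Because the sketch carries unrounded values through every step, its intermediate results can differ slightly from the rounded percentages quoted above.

```python
from math import comb


def availability(mtbf_hours, mttr_hours):
    """A = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_pair(a):
    """Availability of two identical components in 1+1 redundancy."""
    return 1 - (1 - a) ** 2


# (MTBF in hours, MTTR in hours) taken from Table 3-6.
controller = availability(400_000, 0.5)
backplane = availability(6_666_666.7, 4)
fan = availability(1_000_000, 0.1)
power_supply = availability(1_000_000, 0.1)
bbu = availability(1_000_000, 0.1)
expansion_module = availability(2_500_000, 0.2)
ssd = availability(3_021_148, 1)

controller_enclosure = (redundant_pair(controller) * backplane * redundant_pair(fan)
                        * redundant_pair(power_supply) * redundant_pair(bbu))
disk_enclosure = (redundant_pair(expansion_module) * backplane
                  * redundant_pair(fan) * redundant_pair(power_supply))
# RAID 5 over six SSDs: any of the C(6, 2) pairs failing together is fatal.
raid_group = 1 - comb(6, 2) * (1 - ssd) ** 2

# Dorado5100 example: 1 controller enclosure, 4 disk enclosures, 16 RAID groups.
array_availability = controller_enclosure * disk_enclosure ** 4 * raid_group ** 16

print(f"Controller enclosure: {controller_enclosure * 100:.5f}%")
print(f"Disk enclosure:       {disk_enclosure * 100:.5f}%")
print(f"RAID group:           {raid_group * 100:.7f}%")
print(f"Array:                {array_availability * 100:.5f}%")
```

In this model the non-redundant backplanes dominate the result: the five backplanes (one per enclosure) account for nearly all of the array's unavailability, while the redundant components and RAID groups contribute very little.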
Network Reliability
Dual-Switch Networking and UltraPath Software
HUAWEI OceanStor Dorado series SSD arrays have an active-active controller architecture
and support dual-switch networking. As shown in Figure 3-13, Dorado series SSD arrays
have four paths that back one another up.
Figure 3-13 Dual-switch networking
The OceanStor Dorado series SSD array uses Huawei self-developed UltraPath as
multipathing software. The UltraPath is installed on each server to provide multiple
paths from each server to the SSD array. The multipath design enables a network to
provide higher reliability and performance.
The UltraPath performs the following functions:
a. Presents physical disks as an integrated unit to the operating system.
b. Automatically switches over services from the active path to a standby path once the
active path fails.
This function is called failover.
c. Automatically switches services back to the active path after the active path
recovers.
This function is called failback. Failover and failback eliminate single points of
failure on paths.
d. Balances I/Os among reachable paths using the shortest queue first algorithm, least
load first algorithm, or round robin scheduling algorithm.
The UltraPath can run on Linux, AIX, and Windows operating systems, and on Hyper-V
and Xen virtual machines.
Snapshot
Snapshot technology generates a copy of the data that is consistent with the source LUN
at a point in time, without interrupting services running on the source LUN. The copy
is available immediately after it is generated, and reading or writing it has no impact
on the source data. Snapshots support online backup, data analysis, and application
testing.
Asynchronous remote replication
Remote replication is a type of data mirroring and can be synchronous or asynchronous.
The HUAWEI OceanStor Dorado series SSD array supports asynchronous remote replication,
which maintains multiple data copies at two or more sites and removes the risk of
single-site data loss. Asynchronous remote replication uses snapshot technology to
capture data instantly and to provide the points in time to which data can be recovered
after a fault.
Figure 3-14 Schematic diagram of asynchronous remote replication
3.4.2 Service Life
The service life of SSDs is described in 2.3.6 SSD Service Life. Table 2-3 shows that the
service life of Huawei SSDs under heavy workloads meets the requirements of enterprises.
The workload of SSDs is defined by an industry standard. The Joint Electron Device
Engineering Council (JEDEC) is an independent semiconductor engineering trade organization
and standardization body, and a global leader in developing open standards and
publications for the microelectronics industry. Its members come from enterprises all
around the world. Currently, JEDEC focuses on developing open standards for solid state
technologies.
JEDEC's JESD218 and JESD219A standards define a workload model for calculating the
service life of SSDs. According to this workload model, the service life of Huawei SSDs
meets the requirements of various enterprise-class applications.
In addition, a Dorado series all-flash-memory array has a series of features that improve the
service life of the entire array.
Global wear leveling (WL). WL is implemented across the entire storage array to prevent
service hotspots from wearing out some SSDs prematurely.
Online deduplication. The amount of data written to flash memory is reduced, which
decreases write amplification (in development).
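The idea behind global WL can be sketched as a placement rule that always sends the next
write to the least-worn SSD in the array. This is a simplified illustration only: the
per-SSD erase counter and the function names are assumptions, and this paper does not
describe the actual Dorado placement algorithm.

```python
def pick_least_worn(erase_counts: dict) -> str:
    """Return the SSD with the lowest accumulated erase count."""
    return min(erase_counts, key=erase_counts.get)


def distribute_writes(erase_counts: dict, writes: int) -> dict:
    """Place each write on the least-worn SSD and return the updated counters."""
    counts = dict(erase_counts)  # copy so the caller's data is not modified
    for _ in range(writes):
        counts[pick_least_worn(counts)] += 1
    return counts


# Three SSDs that start with uneven wear (hypothetical numbers).
start = {"ssd0": 120, "ssd1": 100, "ssd2": 110}
end = distribute_writes(start, writes=50)
print(end, max(end.values()) - min(end.values()))
```

After enough writes the counters converge, so no SSD in this example ends more than one
erase ahead of the others, whatever the initial imbalance.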
3.4.3 Performance
Currently, all models of Dorado all-flash-memory arrays have been tested with the SPC-1
benchmark. Figure 3-15 compares the SPC-1 results of Dorado series all-flash-memory
arrays with those of traditional mid-range and high-end storage arrays from peer vendors.
Figure 3-15 Test result comparison between all-flash-memory arrays and traditional storage arrays
The data in Figure 3-15 is from the SPC official website. The horizontal axis represents IOPS
and the vertical axis represents latency (ms). The figure shows the latencies of storage arrays
under various IOPS pressures.
According to this figure:
1. All-flash-memory arrays are designed and developed for flash memory and achieve high
IOPS through low latency, whereas traditional storage arrays achieve high IOPS by
stacking disks.
2. The Dorado2100 delivers the same IOPS as traditional mid-range storage arrays at
lower latency.
3. A traditional mid-range storage array fully equipped with SSDs can reach relatively high
IOPS, but its latency cannot match that of all-flash-memory arrays.
4. The IOPS of the Dorado5100 and Dorado2100 G2 are much higher than those of traditional
high-end storage arrays. In addition, the latency of the Dorado series is one to two orders
of magnitude lower than that of traditional high-end storage arrays.
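The link between low latency and high IOPS in point 1 follows from Little's law: for a
fixed number of outstanding I/Os, throughput is inversely proportional to latency. The
sketch below illustrates this with hypothetical numbers (64 outstanding I/Os, 0.5 ms
against 5 ms); they are not SPC-1 results.

```python
def iops_at(outstanding_ios: int, latency_ms: float) -> float:
    """Little's law: throughput = concurrency / latency (latency given in milliseconds)."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return outstanding_ios * 1000.0 / latency_ms


# Same host concurrency, different per-I/O latency (hypothetical values).
print(iops_at(64, 0.5))  # flash-class latency
print(iops_at(64, 5.0))  # disk-class latency
```

At the same concurrency, a tenfold reduction in latency yields ten times the IOPS. An
array with higher latency can reach the same IOPS only by serving more I/Os in parallel,
which is what stacking disks does.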
To provide low latency and high IOPS, Dorado series all-flash-memory arrays are designed
for flash memory in the following ways:
Rewriting cache algorithm. Because of the performance difference between SSDs and HDDs
described in section 3.3.1 "Problems Caused by SSDs", cache algorithms designed for HDDs
do not suit SSDs. Dorado series all-flash-memory arrays use page tables to organize the
cache and simplify the data flush and eviction algorithms to reduce time and CPU
usage, achieving lower latency and higher IOPS.
Physically separating the data plane from the management plane. Hardware features are
used to accelerate service data processing, freeing the CPU to handle higher IOPS.
Global GC. The cache holds host service data while each SSD in turn is given a time
window in which it receives no write I/Os. This lets garbage collection (GC) run
smoothly and reduces its impact on the storage array (in development).
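The rotating time window described for global GC can be sketched as a simple schedule. The
sketch assumes that exactly one SSD is shielded from writes in each window and that cached
writes are flushed to the remaining SSDs. Both points are assumptions for illustration,
because the feature is described here only at a high level.

```python
def gc_schedule(ssds, windows):
    """Yield (ssd_in_gc, writable_ssds) for each time window, rotating through the SSDs."""
    for window in range(windows):
        in_gc = ssds[window % len(ssds)]            # this SSD receives no write I/Os
        writable = [s for s in ssds if s != in_gc]  # cached writes are flushed to these
        yield in_gc, writable


for in_gc, writable in gc_schedule(["ssd0", "ssd1", "ssd2"], windows=4):
    print(f"GC window for {in_gc}; writes go to {writable}")
```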
4 Experience
4.1 Application Analysis of All-Flash-Memory Arrays
Item: Traditional storage array equipped with SSDs
Feature: HDDs are suited to sequential reads and writes of large-block I/Os. However, most
application I/Os are random. Traditional storage arrays are designed for HDDs, so their
cache and I/O scheduling algorithms must merge random I/Os into large-block sequential
I/Os, which increases algorithm complexity and latency. A traditional storage array
treats SSDs as HDDs. Therefore, the SSDs' advantage in processing random I/Os cannot be
brought into full play. In addition, the complex algorithms counteract the advantage of
low latency.
Advantage: Tiered storage, reducing the overall cost
Disadvantage: Although applications can be accelerated, the advantages of SSDs cannot be
brought into full play. Because reliability and WL cannot be designed for SSDs, their
reliability and service life cannot match those of solid state storage.
Application Scenario: A few applications requiring tiered storage and acceleration

Item: PCIe SSD
Feature: PCIe SSDs are installed in servers, which keeps the SSDs closest to the CPU. This
method increases the complexity of maintenance and capacity expansion. In addition, the
storage space of these SSDs cannot be shared. The convenience and reliability designs of
external storage cannot be applied to these SSDs.
Advantage: Storage deployment method with the highest performance
Disadvantage: Single points of failure lower reliability. Services must be stopped for
maintenance. Storage capacity cannot be expanded or shared.
Application Scenario: Non-core applications and distributed computing systems

Item: All-flash-memory array
Feature: SSDs are installed in SSD-based all-flash-memory arrays. Their performance
advantages can be brought into full play, and their reliability and service life are
maximally improved. All-flash-memory arrays feature high reliability and maintainability.
Advantage:
1. High performance
2. Robust reliability
3. Low cost in terms of performance
Disadvantage: High cost in terms of capacity
Application Scenario: Core services requiring high performance and reliability, especially
databases
4.2 Typical Applications in Target Industries
Government:
External e-government network: Hosts the databases behind government transparency
and public services. These customers expect to accelerate public queries and service
processing, improve the efficiency of public services, and enhance public service
satisfaction. The Dorado series is used as the primary storage array of a database.
Internal e-government network: Is used for the internal work of governments and
communication between government sectors. These customers expect to improve their
work efficiency. The Dorado series is used as the primary storage array of a database.
Virtual desktop infrastructure (VDI) for work efficiency: These customers expect a
VDI featuring robust performance, high density, low cost, and energy savings. The
Dorado series is used as the primary storage array of the VDI.
Public security bureau:
Population, entry and exit, and vehicle information resource databases: Store
information about population, entry-exit personnel, goods, and vehicle registration,
violation, and annual review. Other applications need to extract required data from these
basic databases. These customers expect to accelerate information access and query and
cope with a large amount of concurrent access. Serving as the primary storage array of
these databases, the Dorado series improves the efficiency of resource database–based
applications and supports a large amount of high-speed access.
Information judgment and analysis data warehouse: a large platform where
information extracted from the basic databases is processed in batches. These customers
expect to reduce information analysis duration and improve analysis efficiency. Serving
as the storage array of the data warehouse, the Dorado series improves analysis
efficiency and reduces analysis duration.
Human resources and social security:
Personnel information management system
Information comparison and inspection service system
Service information comparison and inspection service system
Electronic record management system
Employment management subsystem
Employment service subsystem
Basic social security funds monitoring subsystem
Social security monitoring, inspection, and management subsystem
Social security management subsystem for medical payment
Urban and rural residents social security management subsystem
Employees social security management subsystem
Social security card certification subsystem
These are the information databases of human resources, social security, and employment, and
the various service systems built on them. The human resources and social security
systems of the Ministry of Human Resources and Social Security and of its provincial and
municipal bureaus run both online transaction processing (OLTP) and online analytical
processing (OLAP). OLTP is used to query and access the information databases.
OLAP is used to analyze, report on, and integrate database information. These customers
have the following requirements:
1. Fast public services
2. Support for a large amount of concurrent access
3. Efficient reporting, auditing, batch processing, and data integration
Serving as the primary storage array, the Dorado series accelerates service processing,
increases concurrent access, and shortens reporting and batch processing durations, improving
the customer satisfaction and work efficiency of human resources and social security bureaus.
Finance:
Treasury payment system and budgeting system: Treasury payment is the most
important service in the financial system. The treasury pays for the expenditure of
departments after approval. The budgeting system processes and analyzes various budget
reports. It requires high performance, especially at the beginning and end of every
month and year, to perform batch data processing and prepare reports.
These customers expect to shorten financial and budget report generation and batch
processing. Serving as the primary storage array of the OLAP database, the Dorado
series improves the efficiency of batch processing and report generation.
Tax:
Tax collection system: Is the most important system in the tax industry and is
responsible for tax collection, statistics, analysis, and reporting. These customers expect
to shorten the OLAP time needed for tax collection, statistics, analysis, and reporting.
The Dorado series is used as the primary storage array of the tax collection system.
Customs:
E-customs port system: Is responsible for customs clearance registration and inspection,
customs declaration bills, and customs declaration data integration, analysis, and report.
The customs clearance system uses both the OLTP and OLAP. The OLTP is used to
process customs clearance bills. The OLAP is used to integrate and process data in
batches at night. These customers have the following requirements:
1. Improve customs clearance speed and service efficiency.
2. Shorten data integration so that service data generated during the day can be processed
that night.
Serving as the primary storage array, the Dorado series accelerates the customs clearance
service and shortens data integration.
E-hospital:
Hospital information system (HIS): Processes the medical history, appointments with
doctors, and payment of patients.
Laboratory Information System (LIS): Processes the medical history, test reports, and
medicine use of patients. These customers expect to accelerate the processing of the HIS
and LIS systems.
The Dorado series is used as the primary storage arrays of these two systems.
Education:
Online exam paper marking system: Enables teachers to read and mark electronic
exam papers from a database and then input scores. These customers expect to accelerate
the exam paper marking process to cope with a large number of exam papers after
college entrance examination. The Dorado series is used as the primary storage array of
the system.
VDI teaching: Uses the VDI to deploy teaching systems in schools, especially in
primary and secondary schools. These customers expect a teaching system featuring
robust performance, high density, low cost, and energy savings.
Grid:
Advanced metering infrastructure (AMI): Is an architecture consisting of hardware
and software for automated and two-way communication between a smart utility meter
with an IP address and a utility company. The goal of an AMI is to provide utility
companies with real-time data about power consumption and allow customers to make
informed choices about energy usage based on the price at the time of use. The data is
typical database data. The data amount is small and generated once a day. Gateway
energy meters change frequently. These customers expect to improve system
performance. The Dorado series is used as the primary storage array of a database.
Finance:
Bank card system, loans, and deposits: The main services and bank card management,
loan management, and deposit management systems of a bank require rapid transaction
and high concurrency.
Report, finance, and data warehouse: The report, analysis, data mining, and audit
require the analysis of a large amount of data in a short time.
Customer relationship management (CRM) system: Manages the detailed information,
assets, credits, and bad records of customers. It can also filter, collect statistics on,
analyze, and mine customer information in different dimensions. These customers
require rapid transactions, high concurrency, and short analysis times.
Retail:
Sales system: Manages the sales record, inventory anticipation, and cargo scheduling of
each outlet. The OLTP performance determines the transaction speed and whether the
system can support high access concurrency during peak hours.
Business intelligence (BI): Is used to analyze sales volumes, markets, and customer
behaviors so that business plans can be formulated based on the analysis. The OLAP
capability and data analysis speed determine how efficiently customer cargoes are
scheduled and how quickly the retailer responds to the retail market.
Voyage transportation and logistics:
Voyage transportation management system: Needs the OLAP system to process order
data and generate plans about warehousing, transporting, and routes at night. If these
plans cannot be generated in time, the cost increases while the benefit decreases.
Therefore, the voyage transportation management system has a strict requirement for
batch processing duration at night.
Petroleum:
Seismic data processing system: Requires high-performance computing to
simulate and compute seismic data. A large amount of data needs to be computed and
scheduled. This requires high-performance storage devices.
Oil pipeline Enterprise Resource Planning (ERP): Efficient production depends on an
efficient ERP and production management service running on the OLAP system.
Finished product, wholesale, and retail: The finished products, bulk accounting,
inventory anticipation, cargo scheduling, and market trend analysis require an integrated
marketing system. The OLTP system is combined with the OLAP system to rapidly
generate reports and scheduling plans, saving costs.
4.3 Typical Cases
4.3.1 OLTP Case
The transaction waiting time of a building material retailer is reduced to 20% of the
original. Customer satisfaction is improved. The maximum number of concurrent users is
increased by 20 times. Service growth is smooth.
The retailer is Germany's largest building material retailer. In 2001, the retailer had over
110 stores in seven countries of Western Europe and set up a strategic relationship with
Kingfisher. In 2012, the retailer had more than 200 stores in Eastern Europe, Asia, and
Australia.
Customer Pain Points: Excessive transaction waiting time limits growth in the number of
concurrent users.
In German-speaking countries including Germany, Switzerland, and Austria, the retailer uses
unified data centers for online transactions. The online transaction system SAP is based on the
Oracle 11g database and uses eight IBM 3850 servers, VMware virtual machines (VMs), and
an IBM DS8700. The service system processes an average of 500 transactions per minute
during peak hours. Each transaction takes 10 seconds, and 3.5 persons have to wait at each
cash register on average.
In early 2012, the number of stores increased by 25%. The service system had to process
600 transactions per minute during peak hours. In practice, it could not do so, and the
transaction time increased to 30 seconds. 7.5 persons had to wait at each cash register on
average. Customer complaints increased, and about 8% of customers abandoned their purchases
because of the long waiting time. The monthly loss totaled 10 million euros in sales.
In desperation, the retailer shut down one in six cash registers in each store to ensure
that the service system could process 500 transactions per minute with each transaction
taking 10 seconds. As a result, 5.1 persons had to wait at each cash register on average,
and customer complaints and abandoned purchases persisted.
Huawei Solution: The Dorado5100 is used to store the database of the SAP system.
Each event of the database is analyzed. The result indicates that when the service pressure
increases, the I/O latency of the original storage system increases sharply. During peak hours,
the IOPS reaches 300,000 and the average I/O latency is up to 10 ms. Because of the high I/O
latency, 97% of the database running time is spent waiting for I/O. During I/O waiting, the
CPU is idle. As a result, service processing is greatly prolonged. To improve database
performance, the latency at 300,000 IOPS and above must be reduced.
In July 2012, the Dorado5100 replaced the original IBM DS8700 as the online storage device
of the SAP system, and the IBM DS8700 was reassigned to tests, backup, and office
applications. The Dorado5100 keeps latency within 1 ms while processing 600,000 IOPS.
In the TPC-C tests of the Dorado5100, the service processing waiting time is reduced to
18.3% of the original. In production, the service processing waiting time of each order is
reduced to 2 seconds, 20% of the original. In the TPC-C tests, the number of concurrent
users supported by the Dorado5100 is increased by 20 times. In production, the
Dorado5100 processes 800 orders per minute.
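The size of this improvement can be sanity-checked with a rough Amdahl-style estimate. The
sketch below assumes that I/O wait time scales linearly with I/O latency, that CPU time is
unchanged, and that latency falls from 10 ms to the 1 ms bound quoted above. These are
simplifying assumptions, not measurements from this case.

```python
def relative_runtime(io_wait_fraction: float,
                     latency_before_ms: float,
                     latency_after_ms: float) -> float:
    """Estimate the new runtime as a fraction of the old one.

    Only the I/O wait share shrinks, in proportion to the latency reduction;
    the CPU share is assumed to stay the same.
    """
    cpu_share = 1.0 - io_wait_fraction
    io_share = io_wait_fraction * (latency_after_ms / latency_before_ms)
    return cpu_share + io_share


estimate = relative_runtime(0.97, latency_before_ms=10.0, latency_after_ms=1.0)
print(f"{estimate:.1%}")  # 12.7%
```

The estimate of about 12.7% is a best case under these assumptions. It is of the same order
as the 18.3% measured in the TPC-C tests and the 20% observed in production, where factors
other than storage latency also contribute to the waiting time.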
Benefits: Services are expanded economically, effectively, and conveniently.
The LVM function of AIX enables the data to be migrated within 15 minutes. Currently, the
transaction system can process 800 transactions per minute and each transaction takes 2
seconds. 1.3 persons have to wait at each cash register on average. Customers no longer
complain about waiting time or abandon transactions because of waiting. The retailer no
longer needs to shut down cash registers to keep waiting time acceptable.
In addition, after the Dorado5100 is used, the number of concurrent transactions increases by
85%. The transaction system can process another 85% of transactions without additional
servers, VMs, software licenses, or service and maintenance costs. For the retailer, every
10 new stores save about 350,000 euros in IT costs.
4.3.2 OLAP Case
The batch processing duration of a voyage transportation company is reduced from 155
minutes to 15 minutes (9.6% of the original one). The company successfully expands its
services to North America.
This company is a large voyage transportation company whose services cover Europe,
Africa, North America, and South America. Its services include shipping,
warehousing, land transportation, cargo handling, and ship management.
Customer Pain Point: The batch processing time at night is too long. The service
processing capability is low.
The company uses the self-developed MAP system. Based on the Oracle 11g database, the
MAP system integrates the OLTP and OLAP. Currently, the MAP system uses two IBM P750
midrange computers and its storage system uses IBM SAN Volume Controller (SVC) to
manage one IBM DS300 and one DS4800. In daytime business hours, the MAP system
processes OLTP services and orders, including reservation for land transportation, loading and
unloading, warehouses, containers, ships, customs, and insurance. In non-business hours at
night, OLAP data is consolidated and backed up for the planning and scheduling of land
transportation, warehouses, containers, and ships. The data processing must be completed
before 6:00 a.m. the next day to ensure the smooth running of services.
Almost 100,000 transactions must be handled each day. At 00:15, the MAP
system starts batch processing, such as sorting orders, compiling statistics, and outputting
all kinds of business plans for land transportation, warehousing, and shipment. Batch
processing must be completed within 3 hours. Even though plan delivery and data backup take
additional time, operations at night then have no impact on services the next day.
Currently, batch processing takes 155 minutes, so services can run properly.
In early 2012, this company planned to expand its services to North America. The number of
daily orders was expected to reach 150,000, and the MAP system could not process the data
of 150,000 transactions within 3 hours. If the report output is delayed by one day, the loss
for this company is nearly 100,000 euros.
Solution: HUAWEI Dorado5100 is used for database storage of the MAP system.
Each event of the database is analyzed. The result indicates that when the service pressure
increases, the I/O latency of the original storage system increases sharply. During peak hours,
the IOPS reaches 200,000 and the average I/O latency is up to 8 ms. Because of the high I/O
latency, 80% of the database running time is spent waiting for I/O. During I/O waiting, the
CPU is idle. As a result, batch processing is greatly prolonged. To improve database
performance, the latency at 200,000 IOPS and above must be reduced.
The Dorado5100 is used to replace the original storage system, serving as the online storage
device of the MAP system. The Dorado5100 keeps latency within 1 ms while processing
600,000 IOPS. The batch processing time stayed at 15 minutes on three consecutive days.
Benefits: The batch processing time is reduced to 9.6% of the original one, ensuring
smooth service growth.
The batch processing time of 100,000 transactions is reduced from 155 minutes to 15 minutes,
and that of 150,000 transactions is expected to be less than 30 minutes. The strong data
processing capacity ensures smooth service growth. The company says that the performance
of the Huawei Dorado solution exceeds its expectations. The batch processing time stays
short even if services grow two- or threefold. Efficient batch processing allows the company
to issue service plans in a timely manner and avoid the extra OPEX caused by plan delays.
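The figures in this case can be checked against the 3-hour batch window with simple
arithmetic. The sketch below assumes that batch time grows linearly with the number of
transactions. That is an illustrative assumption, not a statement from the customer or
from Huawei.

```python
BATCH_WINDOW_MIN = 180  # batch processing must be completed within 3 hours


def scaled_batch_minutes(measured_min: float, measured_tx: int, target_tx: int) -> float:
    """Scale a measured batch time to a new daily volume, assuming linear growth."""
    return measured_min * target_tx / measured_tx


original = scaled_batch_minutes(155, 100_000, 150_000)  # original storage system
dorado = scaled_batch_minutes(15, 100_000, 150_000)     # after moving to the Dorado5100

print(original, original <= BATCH_WINDOW_MIN)  # 232.5 False
print(dorado, dorado <= BATCH_WINDOW_MIN)      # 22.5 True
```

Under this assumption, 150,000 transactions would take about 232.5 minutes on the original
storage system, which exceeds the window, and about 22.5 minutes on the Dorado5100, which
is consistent with the expectation of less than 30 minutes.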
5 Conclusion
Huawei is dedicated to providing high-quality storage products and user-friendly services for
customers. Based on this concept, Dorado series all-flash-memory arrays feature low latency,
high IOPS, robust reliability, and enhanced usability to help customers cut the TCO and
maximize service competitiveness.
A Acronyms and Abbreviations
LUN logical unit number
RAID redundant array of independent disks
SCSI Small Computer System Interface
SAS serial attached SCSI
RAM random access memory
PCM phase change memory
SSD solid-state drive
HDD hard disk drive
MTBF mean time between failures
FIT failure in time
AFR annual failure rate
MTTR mean time to repair
OLTP online transaction processing
OLAP online analytical processing
CAPEX capital expenditure
OPEX operating expense
TCO total cost of ownership