Course materials may not be reproduced in whole or in part without the prior written permission of IBM. 5.1
© Copyright IBM Corporation 2009
Unit 3: IBM DS8000 architecture and hardware overview
Unit objectives
After completing this unit, you should be able to:
• Discuss the hardware and architecture of the DS8000
• Use virtualization terminology to describe the configuration of the DS8000 subsystem
• Describe the physical hardware components and resources
• Describe the models and features provided by each model
• Describe the types of disk arrays that can be configured for a DS8000 subsystem
• Explain the cabling between adapters and drive sets
DS8000 highlights
• New processor family: POWER5+ RISC (DS8000 Turbo)
– DS8100 model 931
– DS8300 model 932/9B2 for LPAR
• Significant extensions to enable scalability
– 64K logical volumes (CKD, FB, or mixed)
– Expanded volume sizes, dynamic volume add/delete
• I/O adapters
– Fibre Channel/FICON host adapter (4 ports, 4 Gb/s)
– ESCON host adapter (2 ports, 18 MB/s)
– FC-AL device adapter (4 ports, 2 Gb/s)
• FC-AL disks
– 73 GB, 146 GB, 300 GB, or 450 GB at 15K rpm
– FATA disk drives of 500 GB or 1 TB at 7,200 rpm
– 73 GB or 146 GB solid state disks
DS8000 series models (2107)
• DS8000 models feature:
– High performance
– High-capacity series of disk storage
– Design supporting continuous operations
• Redundancy
• Hot replacement/updates
– IBM POWER5 server technology
• Integrated with the IBM Virtualization Engine technology
• DS8000 models consist of:
– Storage unit
– One or two (recommended) Management Consoles (MC)
• Graphical user interface (GUI) or command line interface (CLI) allows:
– Performing logical configurations and Copy Services management functions
• For high availability, hardware components are redundant
DS8000 R2 highlights
R2: Announcing new features for all models:
• IBM POWER5+ processor: new DS8000 Turbo (93x/9Bx)
• Larger processor memory options for the POWER5+ processor
• 4 Gb FCP/FICON adapter (available on all models 92x/9Ax and 93x/9Bx)
• 500 GB 7,200 rpm FATA drives (available on all models 92x/9Ax and 93x/9Bx)
• Three-site Metro/Global Mirror
• Earthquake resistance kit
• Ethernet adapter pair (for TPC RM support)
• Performance Accelerator (models 932 and 92E only)
• 300 GB 15,000 rpm Fibre Channel drives
• HyperPAV (System z)
DS8000 R3 highlights
R3: Announcing new features for all models:
• Dynamic increase in size of existing volumes
– No need to move data or delete/redefine
– May be used to prevent out-of-space conditions or to migrate from one volume size to another
• Storage pool striping
• Adaptive multi-stream prefetching
• FlashCopy space efficient (SE)
• New Secure Sockets Layer (SSL) option for call home
• New console, the System Storage Productivity Center (SSPC), to manage the full data center from a single point
DS8000 R4 highlights
R4: Announcing new features for all models:
• 450 GB 15,000 rpm DDMs
• 1 TB 7,200 rpm FATA DDMs
• 73 GB and 146 GB SSDs
• RAID 6
• DSCLI and Storage Manager changes
• GUI response time improvement on panel loads
• Disk encryption
• Variable LPAR
• Extended address volumes
• IPv6
• Secure data overwrite service offering
• Laptop HMC
DS8000 R5 highlights
R5: Announcing a completely new model!
• New I/O towers
• New DA cards
• New CECs
• New CPUs (POWER6)
• Increased performance
• Increased stability
• Easier upgrade path
• …
More in a separate presentation!
Web resources: DS8000 microcode information
• Storage support search
– http://www-912.ibm.com/TotalStorageSearch/index.jsp
• Search = “DS8000 Microcode Release Notes” (look under technical documents)
• Search = “DS6000 Microcode Release Notes” (must initiate download process to view)
• DS8000 – General support
– http://www.ibm.com/servers/storage/support/disk/ds8300/
• DS8000 code bundle cross-reference (contents and supported code levels)
– http://www.ibm.com/support/docview.wss?uid=ssg1S1002949
• Microcode bundle release note information
– http://www.ibm.com/support/docview.wss?uid=ssg1S1002835
• DS8000 CLI and Storage Manager client software levels and download
– http://www.ibm.com/support/docview.wss?uid=ssg1S4000420
DS8000 supported operating systems for servers
• IBM:
– System i: OS/400, i5/OS, Linux, and AIX
– System p: AIX and Linux
– System z: z/OS, z/VM, and Linux
• Intel servers:
– Windows, Linux, VMware, and NetWare
• Hewlett-Packard:
– HP-UX
– AlphaServer: Tru64 UNIX
• Sun:
– Solaris
• Apple Macintosh:
– OS X
• SGI Origin servers:
– IRIX
• Fujitsu PRIMEPOWER
Check the System Storage Interoperability Center (SSIC) for complete and updated information:
http://www-03.ibm.com/systems/support/storage/config/ssic/displayesssearchwithoutjs.wss?start_over=yes
Host connectivity: IBM SDD and MPIO
• SDD provides the following functions:
– Enhanced data availability
– Automatic path failover
– Dynamic I/O load balancing across multiple paths
– Path selection policies for the host system
– Concurrent download of licensed machine code
• With DS6000 and DS8000, SDD is supported on the following operating systems:
– Windows
– NetWare
– AIX
– HP-UX
– Sun Solaris
– Linux
• SDD can coexist with RDAC (the DS4000 multipath driver) on most operating systems, as long as they manage separate HBAs.
• SDD cannot be used with most other multipath drivers (for example, Veritas DMP, PV-Links, PowerPath).
Interfaces to manage DS8000 (1 of 2)
• IBM System Storage DS Storage Manager GUI (Web-based GUI)
– Program interface used to perform logical configurations and Copy Services management functions
– Installed via GUI (graphical mode) or unattended (silent mode)
– Accessed through a Web browser
– Offers:
• Simulated configuration (offline)
– Create, modify, and save a logical configuration while disconnected
– Apply it to a network-attached storage unit
• Real-time configuration (online)
– Logical configuration and Copy Services for a network-attached storage unit
• Both
• DS command line interface (DSCLI: script-based)
– Open hosts invoke and manage FlashCopy, Metro Mirror, and Global Mirror functions
• Handle batch processes and scripts
• Check storage unit configuration and perform specific application functions
• For example:
– Check and verify storage unit configuration
– Check the current Copy Services configuration used by the storage unit
– Create new logical storage and Copy Services configuration settings
– Modify or delete logical storage and Copy Services configuration settings
• Available for several operating systems
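Because the DSCLI is script-based, its listings lend themselves to post-processing in batch scripts. The sketch below parses a port listing in the spirit of `lsioport` (which appears on a later foil); the column layout of the sample text is hypothetical, and only the I0XYZ port IDs follow the DS8000 scheme:

```python
# Sketch: post-processing a DSCLI-style port listing in a script.
# SAMPLE mimics an "lsioport" listing; its exact columns are hypothetical.

SAMPLE = """\
ID    State  Type
I0010 Online FC-AL
I0011 Online SCSI-FCP
I0012 Online FICON
"""

def parse_ports(listing):
    """Return (port_id, state, type) tuples from a simple port listing."""
    rows = []
    for line in listing.splitlines()[1:]:   # skip the header row
        port_id, state, port_type = line.split()
        rows.append((port_id, state, port_type))
    return rows

ports = parse_ports(SAMPLE)
print([p for p, _, t in ports if t == "FICON"])   # ['I0012']
```

A real script would capture this text from a DSCLI session rather than a literal string.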
Interfaces to manage DS8000 (2 of 2)
• DS open application programming interface (API)
– Non-proprietary storage management client application supporting:
• Routine LUN management activities (creation, mapping, masking)
• Creation or deletion of RAID volume spaces
• Copy Services functions: FlashCopy, PPRC
– Helps to integrate configuration management support into existing storage resource management (SRM) applications
– Enables automation of configuration management through customer-written applications
– Complements the use of the Web-based DS Storage Manager and the script-based DSCLI
– Implemented through the IBM System Storage Common Information Model (CIM) agent
• Middleware application providing a CIM-compliant interface
– Uses CIM technology to manage proprietary devices as open system devices through storage management applications
– Allows these applications to communicate with a storage unit
– Used by TPC for Disk
DS8000 HMC for management and remote support
DS8000 Storage-Hardware Management Console
• Focal point for:
– Configuration, Copy Services, maintenance
• Dedicated workstation physically installed inside the DS8000
– Similar to the eServer POWER5 HMC
– Also known as the Storage System Management Console (MC)
– Automatically monitors the state of the system
– Notifies the user and IBM when service is required (call home)
– Connected to the customer network
• Enables centralized management through GUI, CLI, or open API
• External management console (optional, feature code 1100)
– For redundancy with high availability
• Internal management console (feature code 1110)
DS8000 Storage-Hardware Management Console
• Provides the following:
– Local service
• Interface for local service personnel
– Remote service
• Call home and call back
– Storage facility configuration
• LPAR management (HMC)
• Supports logical storage configuration via the preinstalled System Storage DS Storage Manager in online mode only
– Network Interface Server for logical configuration and invocation of advanced Copy Services functions
– Connection to the storage facility (DS8000) through redundant private Ethernet networks only
• Service appliance (closed system)
DS8000 HMC and POWER5 HMC
[Figure: POWER5 Hypervisor hosting two AIX partitions, with unassigned resources and LPAR allocation tables held in non-volatile RAM; processors, memory regions, and I/O slots; a service processor with permanent and temporary firmware sides; status, command/response, and virtual consoles; Ethernet-attached HMCs]
P5 HMC features:
• Logical partition configuration
• Dynamic logical partitioning
• Capacity and resource management
• System status
• HMC management
• Service functions (microcode update, …)
• Remote HMC interface
DS8000 HMC and a pair of Ethernet switches
• Every DS8000 base frame comes with a pair of Ethernet switches installed and cabled to the processor complexes.
• The HMC has:
– 3 Ethernet ports
• 2 connect to the private Ethernet switches
• 1 connects to the customer network
– One PCI modem for asynchronous call home support
• The corresponding private Ethernet ports of the external HMC (FC 1110) are plugged into port 2 of the switches, as shown in the next foil.
• To interconnect two DS8000 base frames, FC 1190 provides a pair of 31 m Ethernet cables to connect port 16 of each switch in the second base frame into port 15 of the first frame.
• If a second HMC is installed in the second DS8000, it remains plugged into port 1 of its Ethernet switches.
DS8000 HMC and Ethernet switch plugging
[Figure: switch port assignments for the HMC connections and the PCI modem]
DS8000 HMC: Network configuration
• The HMC network consists of:
– Redundant private Ethernet networks for connection to the storage facility (or facilities)
– A customer network configured to allow access from the HMC to IBM through a secure Virtual Private Network (VPN)
• Call home to IBM Services is possible through dial-up (the PCI modem in the HMC) or Internet-connection VPNs
• Dial-up or Internet-connection VPNs are also available for IBM service to provide remote service and support
• The recommended configuration is to connect the HMC to the customer's public network for support
– Support will use the WebSM GUI for all service actions
– Downloading of problem determination data favors the use of a high-speed network
• Network connectivity and remote support are managed by the HMC
New laptop version of the HMC with R4.2
• The laptop HMC replaces the xSeries HMC
– ThinkPad W500, model 4061-AP1
– Ships on new boxes since 2009
– No field replacement for previous xSeries HMCs
– The laptop HMC is available for internal and external (rack-mounted) HMCs
Laptop HMC mounting tray
This view shows the laptop HMC display opened and ready to use.
Note the USB cables plugged into the side of the laptop HMC and the power cord at the rear. You can also see the power cord transformer strapped to the rear of the tray with Velcro. To open the DVD tray, the laptop HMC can be rotated in a clockwise direction.
Laptop HMC DVD Access
Here the turntable and laptop HMC are fully rotated 90 degrees, and the DVD drive's tray is extended.
Laptop HMC network and modem connections
• eth2 is an RJ-45 (Ethernet) connector and attaches to the customer's network using a standard Ethernet cable
• eth0 and eth3 are USB (Universal Serial Bus) connectors that attach to the private black and grey networks using USB-to-Ethernet cable adapters. These two adapters are held secure in a bracket at the rear of the laptop HMC's mounting tray
• The USB modem connection is also a USB (Universal Serial Bus) connector
DS8000 remote access features
• Call home (outbound connectivity)
– Automatic problem reporting
– The DS8000 is designed with a “call home” function
• In the event of a failure, the call home function generates a trouble ticket with the IBM support organization
• IBM support determines the failing component and dispatches a customer engineer with the replacement part
• Remote service and support (inbound connectivity)
– With remote support enabled, IBM technical support can log into the HMC to troubleshoot a problem and view logs, dumps, and traces interactively
– This can reduce the lag time of sending such information to IBM and can shorten problem determination time
– In the case of complex problems, IBM technical support teams can engage a specialist to resolve the problem as quickly as possible
DS8000 HMC network topology
[Figure: the DS8000 subsystems attach to the HMC over a redundant Ethernet fabric; the HMC, with an integrated firewall/proxy and a modem, reaches the IBM remote support infrastructure through the customer network, an optional customer-provided firewall, a DMZ, and a VPN across the Internet]
HMC: Hardware Management Console; DMZ: Demilitarized Zone; VPN: Virtual Private Network; eth: Ethernet port
How a virtual private network (VPN) operates
• The VPN server is located behind the IBM firewall, which is designed to be secure
• The VPN client is located behind the customer firewall
– The customer has control over opening a connection to access the client
– Neither IBM technical support nor non-authorized personnel can access the client without the customer's permission
• The VPN server security complies with IBM corporate security standard ITCS104
– This is an IBM internal security measure for all IBM secure data
[Figure: a VPN tunnel between the customer site (DS6000 SMC behind a firewall) and the IBM site (VPN gateway behind a firewall, reaching IBM support); call home is outbound, remote support is inbound]
DS8000 HMC: Remote service security
• Server authentication via private/public key:
– Each HMC generates a certificate based on the private key that the HMC will use for Secure Sockets Layer (SSL) based encryption and decryption
– The IBM SSR transmits the certificate for the installed HMC to a database maintained within the IBM secure network
– IBM personnel then retrieve the HMC-specific certificate from the database and use this (public) key to establish the communication session with the HMC needing service
• SSH over VPN for command line access:
– Secure Shell (SSH) is used for command line access from a remote IBM location (for example, a PuTTY SSH session with a public key)
– The SSH daemon on the HMC accepts client connections only if an IBM VPN is up and a Product Engineer is currently logged on to the HMC
– SSH client authentication is done through a private/public key algorithm
Hardware components
DS8000: Primary frame topology
[Figure: front and rear views of the base frame]
• Dense HDD packaging: 16 drives per pack
• Dual FC-AL loop switches: point-to-point isolation, two simultaneous operations per loop
• Storage-Hardware Management Console
• Processor complex: IBM eServer p5 570, dual 2-way or dual 4-way
• 4 I/O enclosure bays; each bay supports 4 host adapters and 2 device adapters
• Standard 19-inch rack mounting space
• Redundant power; BBU: battery backup units
• Host adapter: 4 FCP/FICON ports or 2 ESCON ports
• Device adapter: 4 FC-AL ports
DS8000 terminology
• Storage complex
– A group of DS8000s managed by a single Storage-Hardware Management Console
• Storage unit
– A single DS8000, including expansion frames
• Processor complex
– One p5 570 System p server
– Two processor complexes form a redundant pair
– Divided into one LPAR (models 931 or 932) or two LPARs (model 9B2)
• Storage server
– The software that uses an LPAR:
• Has access to a percentage of the resources available on the processor complex for the LPAR
• At GA, this percentage is 50% (model 9B2) or 100% (models 931 or 932)
• Storage facility image (SFI)
– The union of 2 LPARs, one from each processor complex
– Each LPAR hosts one storage server
DS8000 data flow
• The normal flow of data for a write is the following:
1. Data is written to cache memory in the owning server.
2. Data is written to the NVS memory of the alternate server.
3. The write is reported to the attached host as having been completed.
4. The write is destaged from cache memory to disk.
5. The write is then discarded from the NVS memory of the alternate server.
• Under normal operation, both DS8000 servers are actively processing I/O requests.
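The five-step sequence above can be sketched as a toy model; the class and function names below are invented for illustration and are not DS8000 code:

```python
# Toy model of the DS8000 fast-write sequence (all names are illustrative).

class Server:
    def __init__(self, name):
        self.name = name
        self.cache = {}   # volatile read/write cache
        self.nvs = {}     # battery-protected non-volatile store

def fast_write(owning, alternate, track, data, disk):
    owning.cache[track] = data          # 1. cache the write in the owning server
    alternate.nvs[track] = data         # 2. mirror it to the alternate server's NVS
    ack = "complete"                    # 3. the host sees the write as complete here
    disk[track] = owning.cache[track]   # 4. destage cache to disk (asynchronous in reality)
    del alternate.nvs[track]            # 5. discard the NVS copy after destage
    return ack

disk = {}
server0, server1 = Server("server0"), Server("server1")
print(fast_write(server0, server1, "track1", b"payload", disk))   # complete
```

Note that the host acknowledgement (step 3) comes before the destage; on the real machine, steps 4 and 5 happen asynchronously after the host has moved on.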
DS8000 hardware components detail
[Figure: frame layout with the two processor complexes labeled]
DS8000 processor complex (1 of 3)
[Figure: frame layout with the two processor complexes labeled]
DS8000 processor complex: POWER5 server
CEC enclosures in the Model 921/931 each have one processor card (2-way) CEC enclosures in the Model 922/932 and 9A2/9B2 each have two processor cards (4-way)
CEC: Computer Electronic ComplexCEC enclosures contain components such as the processor cards, cache memory, andCEC hard drives.
IBM eServer p570
• Scales from a 1-way to a 16-way SMP using 4U building blocks
• Dynamic LPAR and micro-partitioning
• Simultaneous multithreading (SMT)
• Self-healing features
– Bit-steering (bit sparing)
– Chipkill ECC (8-bit packet correct)
– ECC on processor cache memories
– L3 cache line deletes
– Memory scrubbing
– Dynamic processor deallocation
• Other RAS attributes
– N+1 power and cooling
– Hot-plug PCI
– In-place service
– First fault data capture
• Optimized for storage
– High I/O bandwidth RIO-2
– Large, robust memories
– 4K memory allocation
[Figures: near-linear SMP scaling from 2-way to 16-way; SMT execution-unit utilization with thread 0, thread 1, or no thread active]
DS8000 processor complex (2 of 3)
• The complex is comprised of IBM eServer System p POWER5 servers (921, 922, and 9A2)
– 2-way 1.5 GHz (3X the ESS 800)
– 4-way 1.9 GHz (6X the ESS 800)
• The new DS8000 Turbo models (931, 932, and 9B2) use POWER5+ processors
– 15% performance improvement
– 2.2 GHz for POWER5+ 2- and 4-way
• The POWER5 processor supports logical partitioning
– The p5 hardware and Hypervisor manage the real-to-virtual memory mapping to provide robust isolation between LPARs
– IBM has been doing LPARs for 20 years in mainframes and 8 years in System p
• At GA, LPARs are split 50-50, so:
– A 4-way has two processors in one LPAR and two processors in the other LPAR
– Post GA, a 25-75 split is possible
– LPARs are only possible in the 4-way p5s (RIO-G cannot be shared in a 2-way)
• Cache memory ranges from 16 GB to 256 GB
• Persistent memory ranges from 1 GB to 8 GB, dependent on cache size
– Battery backed for backup to internal disk (4 GB per server)
DS8300 model 9A2/9B2 4-way with LPARs
Server LPAR concept overview
• An LPAR:
– Uses hardware and firmware to logically partition resources
– Is a subset of logical resources that are capable of supporting an operating system
– Consists of CPUs, memory, and I/O slots that are a subset of the pool of available resources within a system
– Very flexible granularity according to AIX level (5.2, 5.3, and so on)
– No need to conform to physical boundaries of building blocks
• In an LPAR:
– An operating system instance runs with dedicated (AIX 5.2) or shared (AIX 5.3) resources: processors, memory, and I/O slots
– These resources are assigned to the logical partition
– The total amount of assignable resources is limited by the physically installed resources in the system
• LPARs provide:
– Isolation between LPARs to prevent unauthorized access across partition boundaries
– Fault isolation, such that one LPAR's operation does not interfere with the operation of other LPARs
– Support for multiple independent workloads, different operating systems, operating system levels, applications, and so on
LPAR applied to storage facility images (SFI)
• DS8300
– Comprised of two eServer p5 570s (processor complexes)
– Each processor complex supports one or more LPARs
– Currently, each processor complex is divided into two LPARs
• An LPAR in a processor complex
– Is a set of resources to support execution of an operating system
[Figure: Storage Facility Image 1 = LPAR01 + LPAR11; Storage Facility Image 2 = LPAR02 + LPAR12; LPARxy, where x = processor complex number and y = storage facility number]
• Delivered as-is; there is no need to use the HMC to configure it
DS8000 processor complex (3 of 3)
(Persistent memory)
DS8000 persistent memory
• The 2107 does not use NVS cards, NVS batteries, or NVS battery chargers
• Data that would have been stored in the 2105 NVS cards resides in the 2107 CEC cache memory
– A part of the system cache is configured to function as NVS storage
• In case of power failure, if the 2107 has pinned data in cache, it is written to an extra set of two disk drives located in each of the CEC enclosures
• Two disk drives total in each CEC:
– For LIC (LVM-mirrored AIX 5.3 + DS8000 code)
– For pinned data and other CEC functions
• During the recovery process, the pinned data can be restored from the extra set of CEC disk drives, just as it would have been from the NVS cards on the ESS 800
DS8000 I/O enclosure
[Figure: frame layout with the two processor complexes labeled]
RIO-G and I/O enclosures
• Also called I/O drawers
• Contain six PCI-X slots: 3.3 V, 133 MHz, blind-swap, hot-plug
– Four host adapter cards with four ports each:
• FCP or FICON adapter ports
– Two device adapter cards with four ports each:
• Four FC-AL ports per card
• Two FC-AL loops per card
• Access cache via the RIO-G internal bus
• Each adapter has its own PowerPC processor
• Owned by processors in an LPAR
• Use the system power control network (SPCN)
– Controls and monitors the status of the power and cooling within the I/O enclosure
– Cabled as a loop between the different I/O enclosures
DS8000 I/O enclosures (aka I/O drawers)
SPCN: System Power Control Network
DS8000 RIO-G port: Layout example
• Each RIO-G port can operate at 1 GHz in bidirectional mode and is capable of passing data in each direction on each cycle of the port. Maximum data rate per I/O enclosure: 4 GB/s.
• RIO-G is designed as a high-performance, self-healing interconnect. The p5 570 provides two external RIO-G ports, and an adapter card adds two more. Two ports on each processor complex form a loop.
• The figure shows how the RIO-G cabling is laid out in a DS8000 that has eight I/O drawers. This would only occur if an expansion frame were installed. The DS8000 RIO-G cabling varies based on the model.
• Up to four I/O enclosures in the same RIO-G loop
• Up to 20 I/O enclosures per p5 570 system
• Maximum effective bandwidth: 2000 MB/s per RIO-G loop
DS8000 host adapters
[Figure: frame layout with the two processor complexes labeled]
Host adapter with four Fibre Channel ports
• Each port is configured as FCP or FICON
– More FICON logical paths: ESS (1,024) versus DS8000 (2,048)
– One FICON channel addresses 16,384 devices
– One HA card covers all 65,280 devices that a DS8000 supports (64K − 256)
– Up to 16 HAs in a DS8100 or 32 HAs in a DS8300
• 16 FICON channel ports to each single device
– Current System z channel subsystems are limited to eight channel paths per device
• Front end of:
– 128 ports for the DS8300 (8 times the ESS)
– 64 ports for the DS8100 (4 times the ESS)
DS8000 FCP/FICON host adapters
• The DS8000 has four LC 2 Gb or 4 Gb FC ports (two host adapter models)
• Each port auto-negotiates to 1 Gbps, 2 Gbps, or 4 Gbps
– Each port independently auto-negotiates to either 1/2 Gbps link speed on 2 Gb host adapter models or 2/4 Gbps link speed on 4 Gb host adapter models
• Ports can be independently configured for FCP or FICON protocols
– The personality of the port is changeable via the DS storage management tools (GUI or CLI)
• Ports cannot operate as FCP and FICON simultaneously
• FCP ports can be longwave or shortwave
– Shortwave ports support a distance of 300 m (non-repeated)
– Longwave ports support a distance of 10 km (non-repeated)
• Note: For FCP, configure the ports as follows:
– Switched point-to-point for fabric topology
– FC-AL for point-to-point topology
DS8000 FICON/FCP host adapters
[Figure: host adapter block diagram — a PPC750GX processor at 1 GHz with QDR memory and flash, a data protection/data mover ASIC with buffer, and a protocol chipset with two Fibre Channel protocol engines, all on a 64-bit, 133 MHz PCI-X bus]
• Four 2 or 4 Gbps Fibre Channel ports
• New high-function, high-performance ASIC
• Metadata creation/checking
• Configured at the port level as FCP or FICON
• SW or LW
DS8000 4 Gb host adapter performance
The new 4 Gb host adapters are designed to improve single-port throughput performance by 50%.
[Figure: 4 Gb versus 2 Gb HA performance comparison]
DS8000 device adapters
[Figure: frame layout with the two processor complexes labeled]
Fibre Channel device adapters with 2 Gbps ports
• The DA performs the RAID logic
– Offloads the servers of that workload
• Each port has up to five times the throughput of previous SSA-based DA ports
• The DS8000 is configured for array across loops (AAL)
– Eight RAID 5, RAID 6, or RAID 10 DDMs spread over two loops
DS8000 device adapters
• Device adapters support RAID 5, RAID 6, or RAID 10
• FC-AL switched fabric topology
• FC-AL dual-ported drives are connected to the FC switch in the disk enclosure backplane
• Two FC-AL loops connect disk enclosures to device adapters
• Array across loops is the standard configuration option in the DS8000
– Two simultaneous I/O operations per FC-AL connection are possible
– Switched FC-AL or switched bunch of disks (SBOD) used for back-end access
• Device adapters are attached to an FC switch within the enclosure
• Four paths to each drive: 2 FC-AL loops × dual-port access
– (Detailed later with storage enclosure cabling)
DS8000 RAID device adapter
[Figure: device adapter block diagram — a PPC750FX processor at 500 MHz with NVRAM and SDRAM behind a bridge, a RAID data protection/data mover ASIC with buffer, and a protocol chipset with two Fibre Channel protocol engines, all on a 64-bit, 133 MHz PCI-X bus]
• Four 2 Gbps Fibre Channel ports
• New high-function, high-performance ASIC
• Metadata checking
Performance evolution: From the model 800 to the DS8000
[Figure: device adapter performance comparison between the ESS model 800 and the DS8000]
I/O enclosure – slot numbering – rear view
[Figure: rear view of an I/O enclosure — host adapter, host adapter, device adapter, RIO adapter, host adapter, host adapter, device adapter; cards numbered 0, 1, 2, 6, 3, 4, 5 left to right; ports numbered 0–3 top to bottom]
• Cards are counted left to right, starting with 0
• Ports are counted top to bottom, starting with 0
• Naming scheme in DSCLI (lsioport): I0XYZ
– X = enclosure number (0–7)
– Y = card number (0–5)
– Z = port number (0–3)
• For example, I0312:
– I/O enclosure 3 (first rack, bottom, right)
– card 1 (second slot)
– port 2 (third from top)
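The I0XYZ scheme is mechanical enough to decode in a few lines. The helper below is an illustrative sketch following the numbering rules on this foil; it is not part of DSCLI:

```python
# Decode a DSCLI I/O port ID (I0XYZ) into enclosure, card, and port numbers,
# following the DS8000 numbering scheme. Illustrative helper only.

def decode_ioport(port_id):
    """Split an ID like 'I0312' into (enclosure, card, port)."""
    if len(port_id) != 5 or not port_id.startswith("I0"):
        raise ValueError("not a valid I0XYZ port ID: %r" % port_id)
    enclosure, card, port = (int(c) for c in port_id[2:])
    if enclosure > 7 or card > 5 or port > 3:
        raise ValueError("field out of range in %r" % port_id)
    return enclosure, card, port

print(decode_ioport("I0312"))   # (3, 1, 2): enclosure 3, card 1, port 2
```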
I/O port numbering – DSCLI lsioport display – base frame
• Enclosure 1: slot 1 = I0000–I0003, slot 2 = I0010–I0013, slot 4 = I0030–I0033, slot 5 = I0040–I0043
• Enclosure 2: slot 1 = I0100–I0103, slot 2 = I0110–I0113, slot 4 = I0130–I0133, slot 5 = I0140–I0143
• Enclosure 3: slot 1 = I0200–I0203, slot 2 = I0210–I0213, slot 4 = I0230–I0233, slot 5 = I0240–I0243
• Enclosure 4: slot 1 = I0300–I0303, slot 2 = I0310–I0313, slot 4 = I0330–I0333, slot 5 = I0340–I0343
(Within each slot, the last digit is the port number, 0–3 from top to bottom.)
I/O port numbering – DSCLI lsioport display – expansion frame
• Enclosure 5: slot 1 = I0400–I0403, slot 2 = I0410–I0413, slot 4 = I0430–I0433, slot 5 = I0440–I0443
• Enclosure 6: slot 1 = I0500–I0503, slot 2 = I0510–I0513, slot 4 = I0530–I0533, slot 5 = I0540–I0543
• Enclosure 7: slot 1 = I0600–I0603, slot 2 = I0610–I0613, slot 4 = I0630–I0633, slot 5 = I0640–I0643
• Enclosure 8: slot 1 = I0700–I0703, slot 2 = I0710–I0713, slot 4 = I0730–I0733, slot 5 = I0740–I0743
16-drive disk enclosure
• DS8000 disk enclosures are installed in pairs: one in front and one in back
[Figure: disk enclosure with interface cards (FCIC), disk drive modules (DDMs), and backplane; the top half holds 8 DDMs, the bottom half holds 8 DDMs]
ISS loop – architectural overview
• Switched connections to each DDM
• Each DDM is connected to 2 switches
• Each switch is connected to 2 device adapters
• Each device adapter belongs to a different CEC
• 4 connections to each DDM → high redundancy
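The four connections per DDM follow directly from the fan-out described above; a toy enumeration (all names invented for illustration):

```python
# Toy enumeration of access paths to one DDM: the DDM sits behind two
# switches, and each switch is reachable from two device adapters, one per
# CEC. All names are invented for illustration.

switches_per_ddm = ["switch0", "switch1"]
adapters_per_switch = {
    "switch0": ["DA-CEC0", "DA-CEC1"],
    "switch1": ["DA-CEC0", "DA-CEC1"],
}

paths = [(da, sw) for sw in switches_per_ddm for da in adapters_per_switch[sw]]
print(len(paths))   # 4 independent paths to the DDM
```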
DS8000 FC-AL / switched FC-AL
• FC-AL
– A loop supports only one operation at a time
• Arbitration of competing requests
– Intermittent failure issues
– Arbitration time increases as the number of devices grows
• Switched FC-AL
– Drives attached in point-to-point connections
• Faster arbitration message processing
• 200 MB/s external transfer rate
– Improved RAS
• The switch detects individual failures (intermittent or permanent)
Switched FC-AL advantages
• The DS8000 uses switched FC-AL technology to link the device adapter (DA) pairs and the DDMs.
• Switched FC-AL uses the standard FC-AL protocol, but the physical implementation is different.
• The key features of switched FC-AL technology are:
– Standard FC-AL communication protocol from DA to DDMs
– Direct point-to-point links are established between DA and DDM
• No arbitration and no performance degradation
– Isolation capabilities in case of DDM failures provide easy problem determination
– Predictive failure statistics
– Simplified expansion: no cable rerouting required when adding another disk enclosure
• The DS8000 architecture employs dual redundant switched FC-AL access to each of the disk enclosures.
• The key benefits of doing this are:
– Two independent switched networks to access the disk enclosures
– Four access paths to each DDM in the DS8000 architecture (dual switches)
– Each device adapter port operates independently
– Double the bandwidth over traditional FC-AL loop implementations
• Each DDM is attached to two separate Fibre Channel switches.
– This means that with two device adapters, there are four 2 Gb/s effective data paths to each disk
• When a connection is made between the device adapter and a disk, it is a switched connection that uses arbitrated loop protocol.
– This means that a mini-loop is created between the device adapter and the disk
– This results in four simultaneous and independent connections, one from each device adapter port
© Copyright IBM Corporation 2009
DS8000: Storage enclosure and DA cabling
Course materials may not be reproduced in whole or in part without the prior written permission of IBM. 5.1
© Copyright IBM Corporation 2009
Architecture:Major points for performance
© Copyright IBM Corporation 2009
DS8000 frames
• Base frame:
– The base frame contains two processor complexes: eServer p5 570 servers.
• Each contains the processor and memory that drive all functions within the DS8000.
– The base frame can contain up to eight disk enclosures; each can contain up to 16 disk drives.
• In a maximum configuration, the base frame can hold 128 disk drives.
– The base frame contains four I/O enclosures.
• I/O enclosures provide connectivity between the adapters and the processors.
• The adapters contained in the I/O enclosures can be either device or host adapters (DAs or HAs).
– The communication path used for adapter to processor complex communication is the RIO-G loop.
• Expansion frames:
– Each expansion frame can hold up to 16 disk enclosures, which contain the disk drives.
• In a maximum configuration, an expansion frame can hold 256 disk drives.
– Expansion frames can contain four I/O enclosures and adapters if they are the first expansion frame attached to either a model 932 or a model 9B2.
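The frame capacities above are simple multiples of the 16-drive enclosure. A back-of-the-envelope check, using only the numbers stated in the text:

```python
# Assumed values, taken directly from the slide text above
DRIVES_PER_ENCLOSURE = 16
BASE_FRAME_ENCLOSURES = 8        # up to 8 disk enclosures in the base frame
EXPANSION_FRAME_ENCLOSURES = 16  # up to 16 in each expansion frame

base = BASE_FRAME_ENCLOSURES * DRIVES_PER_ENCLOSURE            # 128 drives
expansion = EXPANSION_FRAME_ENCLOSURES * DRIVES_PER_ENCLOSURE  # 256 drives

print(base)                   # 128: maximum drives in the base frame
print(base + 2 * expansion)   # 640: DS8300 with two expansion frames
```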
© Copyright IBM Corporation 2009
IBM System Storage DS8100 (2-way)
Power supplies
Batteries I/O drawers
IBM eServer System p POWER5 servers
Up to 128 disks
HMC
© Copyright IBM Corporation 2009
DS8300 (4-way with two expansion frames)
Up to 640 Disks
I/O drawersBatteries
Power supplies
p5 (POWER5) servers
HMC
© Copyright IBM Corporation 2009
DS8300 (4-way with 4 expansion frames)
(maximum configuration)
© Copyright IBM Corporation 2009
DS8100 (model 921/931) 2-way
• Up to 16 host adapters (HA)
– FCP/FICON HA: Four independent ports
– ESCON HA: Two ports
• Up to 4 device adapter (DA) pairs
– DA pairs 0 / 1 / 2 / 3
– Automatically configured from DDMs
• Maximum configuration (384 DDMs)
– DA pair 0 = 128 DDMs
– DA pair 1 = 64 DDMs
– DA pair 2 = 128 DDMs
– DA pair 3 = 64 DDMs
• Balanced configuration at 256 DDMs: in other words, 64 DDMs per DA pair
• DA (card) plugging order: 2 / 0 / 3 / 1
(Cabling diagram: DA ports 0/1, 1/0, 2/3, and 3/2 cabled across I/O enclosures C0 and C1.)
© Copyright IBM Corporation 2009
DS8300 (models 922/932 and 9A2/9B2) 4-way
• Up to 32 host adapters
– FCP/FICON HA: Four independent ports
– ESCON HA: Two ports
• Up to eight DA pairs
– DA pairs 0 to 7
– Automatically configured from DDMs
• Maximum configuration (640 DDMs)
– DA pairs 1, 3-7 = 64 DDMs each
– DA pairs 0, 2 = 128 DDMs each
• Balanced configuration at 512 DDMs: in other words, 64 DDMs per DA pair
• DA (card) pair plugging order: 2 / 0 / 6 / 4 / 7 / 5 / 3 / 1
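The maximum and balanced DDM counts quoted for both models can be verified by summing the per-DA-pair limits. The dictionaries below just mirror the slide's numbers:

```python
# Maximum DDMs per DA pair, as stated in the text
ds8100 = {0: 128, 1: 64, 2: 128, 3: 64}                        # 4 DA pairs
ds8300 = {p: (128 if p in (0, 2) else 64) for p in range(8)}   # 8 DA pairs

print(sum(ds8100.values()))   # 384: DS8100 maximum configuration
print(sum(ds8300.values()))   # 640: DS8300 maximum configuration

# Balanced configurations put 64 DDMs on every DA pair
print(64 * len(ds8100))   # 256 on the DS8100
print(64 * len(ds8300))   # 512 on the DS8300
```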
(Cabling diagram: DA port pairs 0/1 through 7/6 cabled across I/O enclosures C0 and C1. Spreading a pool across two adapters is a good choice; placing it all on the same adapter is a bad decision.)
© Copyright IBM Corporation 2009
Cabling diagram for third and fourth expansion frames
(Cabling diagram: DA port cabling for the base and first two expansion frames, extended via DA ports 6/4 and 4/6 to the third and fourth expansion frames.)
DS8300 with five frames
Course materials may not be reproduced in whole or in part without the prior written permission of IBM. 5.1
© Copyright IBM Corporation 2009
DS8000 cache management
SARC & AMP
© Copyright IBM Corporation 2009
Sequential prefetching in adaptive replacement cache (SARC)
• SARC basically attempts to determine four things:
– When data is copied into the cache
– Which data is copied into the cache
– Which data is evicted when the cache becomes full
– How the algorithm dynamically adapts to different workloads
• SARC uses:
– Demand paging for all standard disk I/O
– Sequential prefetch for sequential I/O patterns
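The demand-paging/prefetch split above can be sketched as a toy cache front end: random reads are staged on demand, while a detected sequential run triggers prefetch of the tracks ahead. This is an illustration of the idea only, not IBM's SARC algorithm; all names are hypothetical:

```python
class SequentialPrefetcher:
    """Stage tracks on demand, but prefetch ahead once a
    sequential run is detected (an illustrative sketch)."""

    def __init__(self, prefetch_depth=4):
        self.last_track = None
        self.run_length = 0
        self.prefetch_depth = prefetch_depth

    def on_read(self, track):
        """Return the list of tracks to stage into cache."""
        if self.last_track is not None and track == self.last_track + 1:
            self.run_length += 1            # run continues
        else:
            self.run_length = 0             # random access: demand paging only
        self.last_track = track
        if self.run_length >= 2:            # sequential pattern detected
            return [track + i for i in range(self.prefetch_depth + 1)]
        return [track]

p = SequentialPrefetcher()
for t in (10, 11, 12):                      # three consecutive tracks
    staged = p.on_read(t)
print(staged)   # [12, 13, 14, 15, 16]: demand track plus prefetched tracks
```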
© Copyright IBM Corporation 2009
DS8000 caching using SARC
• Best caching algorithms in the industry
• Over 20 years of experience
• Features
– Self-learning algorithms
• Adaptively and dynamically learn what data should be stored in cache, based upon the recent access and frequency needs of the hosts
– Adaptive replacement cache
• Most advanced and sophisticated algorithm to determine what data in cache is removed to accommodate newer data
– Prefetching
• Predictive algorithm that anticipates data prior to a host request and loads it into cache
• Benefits
– Leading performance
• Proven to improve cache hits by up to 100% over previous IBM caching algorithms and improve I/O response time by 25%
– More efficient use of cache
• Intelligent caching algorithm profiles host access patterns to determine what data is stored
• Needs less cache than competitors
(Chart: cache hit ratio versus cache size, 0 to 256 GB, for z/OS and Open workloads.)
Nimrod Megiddo and Dharmendra S. Modha, "Outperforming LRU with an Adaptive Replacement Cache Algorithm," IEEE Computer, pp. 4-11, April 2004.
Benefits of adaptive replacement caching
© Copyright IBM Corporation 2009
What is AMP?
• A breakthrough caching technology from IBM Research called Adaptive Multi-stream Prefetching (AMP)
– Can dramatically improve performance for common sequential and batch processing workloads
• AMP optimizes cache efficiency by incorporating an autonomic, workload-responsive, self-optimizing prefetching technology.
– The algorithm dynamically decides what to prefetch and when to prefetch
– Delivers up to a two-fold increase in the sequential read capacity of RAID 5 arrays
– The bandwidth for a fully configured DS8000 remains unchanged
– May improve sequential read performance for smaller configurations and single arrays
– Reduces the potential for array hot spots due to extreme sequential workload demands
– May significantly reduce elapsed time for sequential read applications constrained by array bandwidth, such as BI and critical batch processing workloads
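The "decides what to prefetch and when" behavior above can be illustrated with a per-stream controller that widens its prefetch when the host outruns it and narrows it otherwise. This is a toy model of the principle, not the actual AMP algorithm; all names and the doubling/decrement policy are assumptions:

```python
class AdaptiveStream:
    """Per-sequential-stream prefetch controller (illustrative only)."""

    def __init__(self, degree=2, max_degree=32):
        self.degree = degree          # how many tracks to prefetch ahead
        self.max_degree = max_degree

    def feedback(self, missed_in_prefetched_range):
        if missed_in_prefetched_range:
            # Prefetch was too shallow or too late: fetch more, earlier
            self.degree = min(self.degree * 2, self.max_degree)
        else:
            # Hit in time: trim gently to avoid polluting the cache
            self.degree = max(self.degree - 1, 1)

s = AdaptiveStream()
for miss in (True, True, False):   # two misses, then a timely hit
    s.feedback(miss)
print(s.degree)   # 2 -> 4 -> 8 -> 7
```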
DS8000 caching using AMP
© Copyright IBM Corporation 2009
AMP doubles sequential read bandwidth for a single RAID 5 array
Course materials may not be reproduced in whole or in part without the prior written permission of IBM. 5.1
© Copyright IBM Corporation 2009
DS8000 RAS features (reliability, availability, and serviceability)
© Copyright IBM Corporation 2009
Processor complex RAS
• The processor complex has the same RAS features as the p5 570, which is an integral part of the DS8000 architecture.
• IBM eServer p5 main RAS features:
– First Failure Data Capture
– Boot process and operating system monitoring
– Environmental monitoring
– Self-healing
– Memory reliability, fault tolerance, and integrity
• Error Checking and Correction (ECC)
• Memory scrubbing and thresholding
– N+1 redundancy
– Resource deallocation
– Concurrent maintenance
© Copyright IBM Corporation 2009
Server RAS (1 of 2)
• The DS8000 provides data integrity when performing write operations and server failover.
– Metadata check: The metadata is checked by various internal components to validate the integrity of the data as it moves through the disk system or is sent back to the host.
– Server failover and failback:
• LSS and server affinity:
– LSSs with even numbers have an affinity with server 0
– LSSs with odd numbers have an affinity with server 1
• When a host operating system issues a write to a logical volume, the DS8000 host adapter directs that write to the server that owns the LSS of which that logical volume is a member.
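The even/odd affinity rule above reduces to the parity of the LSS number. A one-line sketch:

```python
def owning_server(lss: int) -> int:
    """Return the server (0 or 1) with affinity for the given LSS:
    even-numbered LSSs belong to server 0, odd-numbered to server 1."""
    return lss % 2

print(owning_server(0x10))   # 0: even LSS -> server 0
print(owning_server(0x11))   # 1: odd LSS  -> server 1
```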
© Copyright IBM Corporation 2009
Server RAS (2 of 2)
• Under normal operation, both DS8000 servers are actively processing I/O requests.
– Each write is placed into the cache memory of the server owning the volume and also into the NVS memory of the alternate server.
• Failover: If one server fails, the remaining server takes over all of its functions.
– RAID arrays, which are connected to both servers, can be accessed from the device adapters of the remaining server.
– Since the DS8000 now has only one copy of the data, in the cache of the remaining server, it takes the following steps:
• It destages the contents of its NVS to the disk subsystem.
• The NVS and cache of the remaining server are divided in two, half for the odd LSSs and half for the even LSSs.
• The remaining server then begins processing the writes (and reads) for all the LSSs.
• Failback: When the failed server has been repaired, the failback process is activated.
– It starts in less than 8 seconds, finishes in less than 15 minutes, and is invisible to the attached hosts.
© Copyright IBM Corporation 2009
Hypervisor: Storage image independence
(Diagram: physical view of one storage unit with processor, memory, I/O, and RIO-G resources; logical view of two virtual Storage Facility images, each taking part of the physical resources, running its own LIC, and partitioned by the LPAR hypervisor.)
© Copyright IBM Corporation 2009
Server failover• Normal flow of data for a write:
1. Data is written to cache memory in the owning server.
2. Data is written to NVS memory of the alternate server.
3. The write is reported to the attached host as having been completed.
4. The write is destaged from the cache memory to disk.
5. The write is then discarded from the NVS memory of the alternate server.
• After a failover, the remaining server processes all I/Os, with its cache and NVS each divided in two: half for the odd LSSs and half for the even LSSs.
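The five-step write flow above can be sketched with plain dictionaries standing in for cache, NVS, and disk (an illustrative model, not DS8000 microcode):

```python
def fast_write(track, data, owning_server, alternate_server):
    owning_server["cache"][track] = data    # 1. write to the owning server's cache
    alternate_server["nvs"][track] = data   # 2. mirror into the alternate server's NVS
    # 3. at this point the write is reported complete to the host
    owning_server["disk"][track] = owning_server["cache"][track]  # 4. destage to disk
    del alternate_server["nvs"][track]      # 5. discard the NVS copy

server0 = {"cache": {}, "nvs": {}, "disk": {}}
server1 = {"cache": {}, "nvs": {}, "disk": {}}
fast_write(42, b"payload", server0, server1)
print(server0["disk"][42])   # b'payload': destaged to disk
print(server1["nvs"])        # {}: NVS copy discarded after destage
```

Until step 4 completes, the data exists in two places (owning cache and alternate NVS), which is what lets either server recover the write if the other fails.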
(Diagram: before failover, server 0 holds cache memory for even LSSs and NVS for odd LSSs, while server 1 holds cache memory for odd LSSs and NVS for even LSSs; after failover, the remaining server splits its cache and NVS between even and odd LSSs.)
© Copyright IBM Corporation 2009
NVS recovery after complete power loss
• The DS8000 preserves fast writes.
• Battery backup units (BBUs) ensure fast writes are not lost when both power supplies stop.
– Batteries are not used to keep disks spinning.
– Scenario at power-off:
• All HA I/O is blocked.
• Each server copies its NVS data to internal disk; two copies are made per server.
• When the copy process is complete, each server shuts down AIX.
• When AIX shutdown is complete for both servers (or the timeout expires), the DS8000 is powered down.
– Scenario at power-on:
• The processor complexes power on and perform the power-on self-test.
• Each server boots up.
• During boot-up, each server detects NVS data on its disks and destages it to the FC-AL disks.
• When the battery units reach a certain level of charge, the servers come online.
• NVS contents are preserved indefinitely.
• Note: The servers will not come online until the batteries are fully charged.
© Copyright IBM Corporation 2009
Host connection availability
• On the DS8000, host adapters are shared between the servers.
• It is preferable for hosts to have at least two connections to separate host adapters in separate I/O enclosures.
– This configuration allows the host to survive a hardware failure on any component on either path.
– This is also important because during a microcode update, an I/O enclosure may need to be taken offline.
• Multipathing software helps ensure availability.
– Subsystem Device Driver (SDD) is able to manage both path failover and preferred path determination.
• SDD is usable with the ESS800, DS6000, DS8000, or SVC.
© Copyright IBM Corporation 2009
Disk subsystem (1 of 2)
• RAID 5, RAID 6, and RAID 10
– RAID 5 (7+P or 6+P+S) or RAID 10 (2x4 or 2x3 + 2S)
– RAID 6 adds an extra parity drive, the “Q” drive: 6+P+Q versus 7+P, or 5+P+Q+S versus 6+P+S
– The DS8000 does not support non-RAID configurations (JBODs).
• Spare disk creation
– A minimum of one spare is created for each array site defined until the following conditions are met:
• A minimum of four spares per DA pair
• A minimum of four spares of the largest capacity array site on the DA pair
• A minimum of two spares of capacity and RPM greater than or equal to the fastest array site of any given capacity on the DA pair
• Floating spare
– The DS8000 microcode may choose to migrate spare disks to a more optimal position to better balance the spares across the DA pairs, the loops, and the enclosures.
• Useful after a drive replacement in which the replaced drive became a spare
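The three spare-creation minimums listed above can be checked mechanically for one DA pair. The rule set is transcribed from the text; the helper function itself and the `(capacity_GB, rpm)` tuple representation are hypothetical:

```python
def spares_ok(spares, array_sites):
    """Check the DS8000 spare minimums for one DA pair.
    spares and array_sites are lists of (capacity_GB, rpm) tuples."""
    largest = max(cap for cap, _ in array_sites)
    fastest_rpm = {cap: max(r for c, r in array_sites if c == cap)
                   for cap, _ in array_sites}
    rule1 = len(spares) >= 4                                  # >= 4 spares per DA pair
    rule2 = sum(1 for c, _ in spares if c >= largest) >= 4    # >= 4 of largest capacity
    rule3 = all(sum(1 for c, r in spares if c >= cap and r >= rpm) >= 2
                for cap, rpm in fastest_rpm.items())          # >= 2 fast-and-big per capacity
    return rule1 and rule2 and rule3

sites = [(146, 15000), (300, 15000)]
print(spares_ok([(300, 15000)] * 4, sites))   # True: four large, fast spares cover all rules
print(spares_ok([(146, 15000)] * 4, sites))   # False: no spare matches the 300 GB sites
```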
© Copyright IBM Corporation 2009
Disk subsystem (2 of 2)
• Each DDM is attached to two FC switches.
– Each disk has two separate connections on the backplane.
• Each DA is connected to the two switches.
• DDMs are hot-pluggable.
• Incorporates Predictive Failure Analysis (PFA)
– Anticipates failures
• Performs disk scrubbing
– All disk sectors are periodically read and bad bits corrected (including spares)
© Copyright IBM Corporation 2009
Power and cooling
• Completely redundant power and cooling in N+1 mode
• Battery backup units (BBUs)
– Used for NVS (part of the server’s memory)
– Can be replaced concurrently
• Rack power control (RPC) cards
– Two RPC cards for redundancy
– Each card can control the power of an entire DS8000
• Power fluctuation protection
– The DS8000 tolerates a momentary power interruption of approximately 30 ms.
– After that time, the servers start copying the contents of NVS to internal SCSI disks.
© Copyright IBM Corporation 2009
Microcode update
• Concurrent code update
– The HMC can hold six different versions of code
– Each server can hold three different versions of code
• Installation process:
– Internal HMC code update
– New DS8000 LIC downloaded to the internal HMC
– LIC uploaded from the HMC to each DS8000 server's internal storage
– New firmware can be loaded from the HMC directly into each device
• May require a server reboot, with failover of its logical subsystems to the other server
– Update of the servers' operating system and LIC
• Each server is updated one at a time, with failover of its logical subsystems to the other server
– Host adapter firmware update
• Each adapter is impacted for less than 2.5 s, which should not affect connectivity (“Fast Load”)
• A longer interruption is managed by the host's multipathing software
© Copyright IBM Corporation 2009
Storage Hardware Management Console
• Redundant Ethernet switches
– Each switch is used in a separate Ethernet network with non-routable private IP addresses assigned from:
– 172.16/16 and 172.17/16
– 192.168.16.x and 192.168.17.x
– 10.0.16.x and 10.0.17.x
• Redundant HMCs
– Each DS8000 can be connected via the redundant Ethernet switches to both HMCs.
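The paired management networks above are deliberately disjoint private ranges, which can be confirmed with the standard-library `ipaddress` module (shown here for the first pair):

```python
import ipaddress

net_a = ipaddress.ip_network("172.16.0.0/16")
net_b = ipaddress.ip_network("172.17.0.0/16")

print(net_a.overlaps(net_b))                 # False: the two switch networks are disjoint
print(net_a.is_private, net_b.is_private)    # True True: both in RFC 1918 space
```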
© Copyright IBM Corporation 2009
Unit summaryHaving completed this unit, you should be able to:
• Discuss the hardware and architecture of the DS8000
• Use virtualization terminology describing configuration of the DS8000 subsystem
• Describe the physical hardware components and resources
• Describe the models and features provided by each model
• Describe the types of disk arrays that can be configured for a DS8000 subsystem
• Explain the cabling between adapters and drive sets
ESCC – Enterprise Storage Competence Center
DS8000 | Concepts and Architecture | K. Jehnen © 2008 IBM Corporation
DS8000 Models – Overview R2
• 2107-931 – Base rack, 2-way
• 2107-932 – Base rack, 4-way
• 2107-9B2 – Base rack, 4-way, LPAR model
• 242x-9yy – New models which include the warranty in the model number
– x: 1, 2, 3, 4 (years of warranty)
– Applies to all rack models
• 242x-92E – Expansion rack for a 931 or 932
• 242x-9BE – Expansion rack for a 9B2