DS4000 Maintenance Skill

Nov 07, 2014

Page 1: DS4000 Maintenance Skill

© 2009 IBM Corporation

IBM TotalStorage™

DS4000 series skill transfer

Page 2: DS4000 Maintenance Skill

IBM TotalStorage™

© 2009 IBM Corporation, July 1, 2007

Agenda

– DS4000 series hardware introduction
– DS4000 troubleshooting
– DS4000 hardware maintenance
– DS4000 firmware package
– DS4000 data collection
– DS4000 material

Page 3: DS4000 Maintenance Skill

DS4000 hardware introduction

Product list
– DS4300 (FAStT600)
– DS4500/DS4400 (FAStT900/FAStT700)
– DS4700
– DS4800

Page 4: DS4000 Maintenance Skill

DS4300 (FAStT600)

Base mode
– Up to two 2 Gbps hot-swappable RAID controllers with 512 MB of battery-backed cache (256 MB per controller).
– Support for up to three IBM TotalStorage DS4000 EXP700/EXP710 Expansion Units.
– Support for one storage partition in the standard configuration, with an option to expand to 4, 8, or 16 storage partitions.

Turbo mode
– Cache increased from 256 MB per controller on the base DS4300 to 1 GB per controller on Turbo.
– Support for up to seven IBM TotalStorage EXP710 Expansion Units. EXP810 enclosures can also be used behind the DS4300.
– Host interface on the base DS4300 is 2 Gbps. Turbo auto-senses to connect at 1 Gbps or 2 Gbps.
– Eight storage partitions standard, with upgrade to 16 or 64.

Page 5: DS4000 Maintenance Skill

DS4300 (FAStT600)

Green Power LED: This LED indicates that the DC power status is OK.

Amber General-System-Fault LED: When a storage server component fails (such as a disk drive, fan, or power supply), this LED will be on.

Page 6: DS4000 Maintenance Skill

DS4300 (FAStT600)

Host loop LED (green): This LED should be on, which means that the host connection loop is good. If it is off, one of the following problems might have occurred:
– The host loop is down, not turned on, or not connected.
– An SFP has failed, or the host port is not occupied.
– The RAID controller circuitry has failed, or the RAID controller has no power.

Cache activity LED (green): This LED is on when there is data in cache. If it is off, one of the following situations has occurred:
– There is no data in cache.
– The cache option is not selected for the array.
– The cache memory has failed, or the battery has failed.

Battery charged LED (green): Normally, this LED should be on. If it is off, it indicates a battery fault. The LED blinks while the battery is charging or performing a self-test.

Expansion port bypass LED (amber): This LED is on if nothing is plugged into the expansion port, or if the expansion unit is powering off.

Expansion loop link LED (green): This LED is on when the drive-side Fibre Channel loop is operating normally.
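
For quick reference during a site visit, the "LED is off" checks above can be kept in a small lookup table. This is only an illustrative sketch: the dictionary keys and the function name `possible_causes` are invented for this example, and the causes are copied from the LED descriptions on this slide.

```python
# Possible causes when a normally-on DS4300 controller LED is off.
# Causes are taken from the slide; the key names are illustrative only.
LED_OFF_CAUSES = {
    "host_loop": [
        "Host loop is down, not turned on, or not connected",
        "An SFP has failed, or the host port is not occupied",
        "RAID controller circuitry has failed, or the controller has no power",
    ],
    "cache_activity": [
        "There is no data in cache",
        "The cache option is not selected for the array",
        "The cache memory has failed, or the battery has failed",
    ],
    "battery_charged": [
        "Battery fault",
    ],
}


def possible_causes(led_name):
    """Return the listed causes for an LED that is off, or an empty list."""
    return LED_OFF_CAUSES.get(led_name, [])


for cause in possible_causes("host_loop"):
    print("-", cause)
```

An unknown LED name returns an empty list rather than raising an error, so the table can be extended LED by LED.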

Page 7: DS4000 Maintenance Skill

Host side connection

Page 8: DS4000 Maintenance Skill

Drive side connection

Page 9: DS4000 Maintenance Skill

DS4500 (FAStT900)

Dual, redundant 2 Gbps RAID controllers with 2 GB of Rambus cache memory (1 GB per RAID controller). The data in the cache is protected by battery backup for at least seven days.

Supports up to sixteen EXP100 or EXP710 enclosures, or up to fourteen EXP810 enclosures.

Has 16 storage partitions standard, with upgrade option to 64.

Page 10: DS4000 Maintenance Skill

DS4500 (FAStT900)

Page 11: DS4000 Maintenance Skill

DS4500 (FAStT900)

Page 12: DS4000 Maintenance Skill

DS4500 (FAStT900)

Speed LED (green): This LED is on when the selected link speed is 2 Gbps and a link is up. It is off when the DS4500 RAID controller runs at 1 Gbps.

Fault LED (amber): This LED should normally be off. If it is on, it indicates a fault of the mini-hub or one of the SFP modules.

Two bypass LEDs (amber): There is one bypass LED for each SFP module. The LED should normally be off if no SFP module is installed. If an SFP module is present and a link error is detected (for example, no cable, a faulty cable, or a host that is not powered on), the LED goes on.

Loop good LED (green): This LED should normally be on. It might be off if there are link errors.

Page 13: DS4000 Maintenance Skill

Host side connection

Page 14: DS4000 Maintenance Skill

Drive side connection

Only one port on each drive-side mini-hub of the DS4500 is ever used. We recommend removing all SFP modules from mini-hub ports that are not connected to any device.

Page 15: DS4000 Maintenance Skill

DS4700

The IBM System Storage DS4700 Express storage server uses 4 Gbps technology.

Model 70 contains 2 GB of cache memory (1 GB per controller), four 4 Gbps FC host ports (two ports per controller), and four shortwave small form-factor pluggable (SFP) modules.

Model 72 contains 4 GB of cache memory (2 GB per controller), eight 4 Gbps FC host ports (four ports per controller), and six shortwave SFP modules.

Supports up to six EXP810 enclosures.

Both models 70 and 72 have selectable storage partitions up to 128.

Page 16: DS4000 Maintenance Skill

DS4700

Locate LED (white or blue)
– On: Indicates storage subsystem locate.
– Off: This is the normal status.

Service action allowed LED (blue)
– On: The service action can be performed on the component with no adverse consequences.
– Off: This is the normal status.

Service action required LED (amber)
– On: There is a corresponding needs-attention condition flagged by the controller firmware. Some of these conditions might not be hardware related.
– Off: This is the normal status.

Power LED (green)
– On: The subsystem is powered on.
– Off: The subsystem is powered off.

Page 17: DS4000 Maintenance Skill

DS4700

Page 18: DS4000 Maintenance Skill

DS4700

– LED #1-2 (green): Host channel speed
– LED #3 (blue): Service action allowed
– LED #4 (amber): Needs attention
– LED #5 (green): Caching active
– LED #8-11 (amber): Drive channel bypass
– LED #9-10 (green): Drive channel speed
– LED #12 (green/yellow): Numeric display (enclosure ID/diagnostic display)

Page 19: DS4000 Maintenance Skill

DS4700

Service action allowed (blue)
– Off: Normal status.
– On: Safe to remove.

Battery charging (green)
– On: Battery charged and ready.
– Blinking: Battery is charging.
– Off: Battery is faulted, discharged, or missing.

Needs attention or service action required (amber)
– Off: Normal status.
– On: Controller firmware or hardware requires attention.

Page 20: DS4000 Maintenance Skill

DS4700

Power supply fan LED (AC power) (green)
– Off: Power supply fan is not providing AC power.
– On: Power supply fan is providing AC power.

Service action allowed (blue)
– On: Safe to remove.
– Off: Normal status.

Needs attention (amber)
– Off: Normal status.
– On: Power supply fan requires attention.

Power supply fan Direct Current Enabled (DC power) (green)
– Off: Power supply fan is not providing DC power.
– On: Power supply fan is providing DC power.

Page 21: DS4000 Maintenance Skill

Host side connection

Page 22: DS4000 Maintenance Skill

Drive side connection

Page 23: DS4000 Maintenance Skill

DS4800

– The 1815-80A and 1815-82A come with 4 GB of total cache.
– The 1815-84A has 8 GB of total cache.
– The 1815-88A has 16 GB of total cache.
– Supports up to 16 EXP710 FC-only enclosures for a total of 224 disks.
– Supports up to 14 EXP810 enclosures for a total of 224 disks.
– Supports up to 512 host storage partitions.
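
Both enclosure limits give the same disk total. A quick check, assuming 14 drive slots per EXP710 and 16 per EXP810 (these per-enclosure capacities are not stated on the slide and are an assumption of this sketch):

```python
# Drive slots per enclosure: assumed values, not stated on the slide.
SLOTS_PER_ENCLOSURE = {"EXP710": 14, "EXP810": 16}

# Maximum number of enclosures behind a DS4800, as given on the slide.
MAX_ENCLOSURES = {"EXP710": 16, "EXP810": 14}


def max_disks(model):
    """Maximum disk count when only enclosures of one model are attached."""
    return MAX_ENCLOSURES[model] * SLOTS_PER_ENCLOSURE[model]


for model in MAX_ENCLOSURES:
    print(model, max_disks(model))
```

Under these assumptions, 16 x 14 and 14 x 16 both come to 224, which matches the totals quoted above.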

Page 24: DS4000 Maintenance Skill

DS4800

Page 25: DS4000 Maintenance Skill

DS4800

Page 26: DS4000 Maintenance Skill

Host side connection

Page 27: DS4000 Maintenance Skill

Drive side connection

The DS4800 supports four redundant drive channel pairs on which to place expansion enclosures.

– Ports 4 and 3 on controller A are channel group 1.
– Ports 2 and 1 on controller A are channel group 2.
– Ports 1 and 2 on controller B are channel group 3.
– Ports 3 and 4 on controller B are channel group 4.

The two ports on each drive channel group must run at the same speed.

Page 28: DS4000 Maintenance Skill

Drive side connection

The best sequence (Figure 3-52) to populate drive channel pairs is:

1. Controller A, port 4/controller B, port 1 (drive channel pair 1)
2. Controller A, port 2/controller B, port 3 (drive channel pair 3)
3. Controller A, port 3/controller B, port 2 (drive channel pair 2)
4. Controller A, port 1/controller B, port 4 (drive channel pair 4)
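
The population sequence above can be written down as data and used to plan where each new enclosure goes. This is a sketch only: the round-robin distribution and the function name `assign_enclosures` are assumptions made for illustration, and the slide itself only gives the order of the pairs.

```python
# Recommended population order from the slide:
# (controller A port, controller B port, drive channel pair number).
POPULATION_ORDER = [
    (4, 1, 1),
    (2, 3, 3),
    (3, 2, 2),
    (1, 4, 4),
]


def assign_enclosures(enclosures):
    """Spread enclosures round-robin over the pairs in the recommended order.

    Round-robin is an assumption of this sketch, not a rule from the slide.
    """
    plan = {pair: [] for _, _, pair in POPULATION_ORDER}
    for index, enclosure in enumerate(enclosures):
        _, _, pair = POPULATION_ORDER[index % len(POPULATION_ORDER)]
        plan[pair].append(enclosure)
    return plan


# Enclosure names are hypothetical.
for pair, members in assign_enclosures(["enc1", "enc2", "enc3", "enc4", "enc5"]).items():
    print("drive channel pair", pair, members)
```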

Page 29: DS4000 Maintenance Skill

DS4000 troubleshooting

Basic tools
– Recovery Guru
– Major Event Log (MEL)

Other tools
– RLS (Read Link Status)
– etc.

Page 30: DS4000 Maintenance Skill

Recovery Guru

If there is an error condition on your DS4000, the Recovery Guru explains the cause of the problem and provides the actions needed to recover. It guides you through specific actions, depending on the event.

Page 31: DS4000 Maintenance Skill

Major Event Log

The Major Event Log (MEL) is the primary source for troubleshooting a DS4000 storage server. To access the MEL select Advanced → Troubleshooting → View Event Log.

By default only the last 100 critical events are shown, but you can choose how many events you want to have listed. The maximum number you can set is 8191.

If you want to troubleshoot your system, use the full event log, as it includes information about actions that took place before the actual critical event happened, thus giving you the complete history of the problem.
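
To illustrate why the full log matters, the sketch below pulls out each critical event together with the entries logged just before it. The record layout (`seq`, `critical`, `text`) and the sample entries are hypothetical and do not reflect the actual MEL export format.

```python
def history_before_critical(events, window=3):
    """For each critical event, return it with up to `window` preceding events."""
    histories = []
    for index, event in enumerate(events):
        if event["critical"]:
            start = max(0, index - window)
            histories.append(events[start:index + 1])
    return histories


# Hypothetical log entries, oldest first.
sample_log = [
    {"seq": 1, "critical": False, "text": "hypothetical informational event A"},
    {"seq": 2, "critical": False, "text": "hypothetical informational event B"},
    {"seq": 3, "critical": True, "text": "hypothetical critical event"},
]

for history in history_before_critical(sample_log):
    for event in history:
        print(event["seq"], event["text"])
```

A view that shows only critical events would contain entry 3 alone; the two entries before it are the history that the full event log preserves.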

Page 32: DS4000 Maintenance Skill

DS4000 hardware maintenance

– Disk replacement
– Battery replacement
– ……

Remember to take a backup (ASD, profile) before routine maintenance.
Remember to back up data before any maintenance that may be harmful.

Page 33: DS4000 Maintenance Skill

DS4000 disk replacement

Failed drive/bypassed drive
– Check Recovery Guru to verify the problem and the recovery method.
– Replace the drive according to the service guide:
  – Pull out the failed drive (its amber LED is usually on).
  – Wait about 30 seconds.
  – Insert the new drive.
  – Wait for the reconstruction to complete.

Impending drive failure
– Check Recovery Guru to verify the problem and the recovery method.
– Option 1: Wait for the drive to fail.
– Option 2: Replace the drive directly:
  – Unassign the hot spare.
  – Manually fail the drive.
  – Pull out the failed drive (its amber LED is usually on).
  – Wait about 30 seconds.
  – Insert the new drive.
  – Wait for the reconstruction to complete.
  – Reassign the hot spare.
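
The two procedures share the same physical swap and differ only in the steps around it. A minimal sketch of that structure, with state names and the function name `replacement_steps` chosen for this example only:

```python
# Steps shared by both procedures, as listed on the slide.
COMMON_SWAP = [
    "Pull out the failed drive (amber LED is usually on)",
    "Wait about 30 seconds",
    "Insert the new drive",
    "Wait for the reconstruction to complete",
]


def replacement_steps(drive_state):
    """Return the ordered checklist for a drive in the given state.

    State names are illustrative. 'impending_failure' follows Option 2
    (replace directly) from the slide.
    """
    steps = ["Check Recovery Guru: verify the problem and recovery method"]
    if drive_state in ("failed", "bypassed"):
        return steps + COMMON_SWAP
    if drive_state == "impending_failure":
        return (
            steps
            + ["Unassign the hot spare", "Manually fail the drive"]
            + COMMON_SWAP
            + ["Reassign the hot spare"]
        )
    raise ValueError("unknown drive state: " + drive_state)


for number, step in enumerate(replacement_steps("impending_failure"), start=1):
    print(number, step)
```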

Page 34: DS4000 Maintenance Skill

DS4000 disk replacement

If multiple drives fail at almost the same timestamp
– Collect data and wait for L2's action plan.

If reconstruction fails
– Order another replacement drive. If reconstruction fails again, a logical error may have occurred.
– Collect data for L2 review.

Page 35: DS4000 Maintenance Skill

Cache battery replacement

– Check the cache setting in Storage Manager.
– To ensure that there is no data in cache, it is recommended to disable the cache setting.
– Replace the cache battery according to the service guide:
  – DS4300: the controller must be taken offline.
  – DS4400/DS4500/DS4700: the battery can be replaced directly.
  – DS4800: ensure that controller A is optimal.
– Wait for the battery self-test/charge to complete.
– Reset the battery age.
– Restore the cache setting.

Page 36: DS4000 Maintenance Skill

DS4000 battery policy

With firmware above 6.60, the age limit of the DS4000 cache battery is changed to 10 years.

– Replace the battery only if its status is ‘failed’.
– If the battery status is ‘near expiration’, it is recommended to update the firmware to above 6.60. After the upgrade, the warning is cleared.
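
The policy reduces to a short decision rule. The sketch below is illustrative: the function name, the status strings, and the `(major, minor)` version tuple are assumptions, and it reads "above 6.60" literally as strictly greater than 6.60. The slide does not say what to do when a near-expiration warning appears on firmware that is already above 6.60, so that case is reported as not covered.

```python
def battery_action(status, fw_version):
    """Suggest an action for a cache battery.

    status: 'failed', 'near expiration', or any other string (treated as healthy).
    fw_version: (major, minor) tuple, for example (6, 12).
    """
    if status == "failed":
        return "replace battery"
    if status == "near expiration":
        if fw_version > (6, 60):
            # Not addressed by the slide; do not guess.
            return "not covered by this policy"
        return "update firmware to above 6.60"
    return "no action"


print(battery_action("near expiration", (6, 12)))
```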

Page 37: DS4000 Maintenance Skill

DS4000 firmware

The firmware package includes the controller firmware, NVSRAM, ESM code, and DDM code.

– The controller firmware and NVSRAM must match.
– The ESM code and the controller firmware must match.
– Pay attention when the DS4000 is attached to both EXP710 and EXP700/EXP810 enclosures.

Always check the firmware package readme file and the code compatibility before updating.

All hardware errors should be resolved before updating the firmware, except for the JFQ3/JFQ4 issue.

Page 38: DS4000 Maintenance Skill

DS4000 firmware update

The normal process for a DS4000 firmware update is:

1. Update the ESM code.
2. Update the controller/NVSRAM code.

A DDM code update requires host I/O to be stopped.
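
The ordering and the I/O precondition can be expressed as a small planning helper. This is a sketch under assumptions: the component labels and the function name `plan_update` are invented here, and placing the DDM update last is an assumption, since the slide only orders ESM before controller/NVSRAM.

```python
# Update order. ESM before controller/NVSRAM comes from the slide;
# placing DDM last is an assumption of this sketch.
ORDER = {"ESM": 0, "controller/NVSRAM": 1, "DDM": 2}


def plan_update(components, host_io_stopped=False):
    """Return the requested components in update order.

    Refuses to plan a DDM (drive) update unless host I/O has been stopped.
    """
    unknown = set(components) - set(ORDER)
    if unknown:
        raise ValueError("unknown components: " + ", ".join(sorted(unknown)))
    if "DDM" in components and not host_io_stopped:
        raise RuntimeError("stop host I/O before updating DDM code")
    return sorted(components, key=ORDER.get)


print(plan_update(["controller/NVSRAM", "ESM"]))
```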

Page 39: DS4000 Maintenance Skill

Check the DS4000 firmware

Page 40: DS4000 Maintenance Skill

ESM update

Select the appropriate ESM firmware. Each ESM update takes about 5 minutes.

Page 41: DS4000 Maintenance Skill

Controller firmware update

Controller firmware and NVSRAM can be updated at the same time

The update takes about 15-20 minutes.

Page 42: DS4000 Maintenance Skill

DDM drive update

Host I/O must be stopped during the update.

Page 43: DS4000 Maintenance Skill

Case scenarios

The DS4000 has JFQ3/JFQ4 disks and needs a firmware update.

The DS4000 has EXP710 enclosures attached and needs a firmware update from 06.12.16.00 to 06.60.08.00.

While the controller firmware is being updated, Storage Manager (SM) loses its connection to the DS4000.

Page 44: DS4000 Maintenance Skill

DS4000 data collection

All support data

Serial port output

Page 45: DS4000 Maintenance Skill

All support data

– Connect to both controllers through a hub when collecting all support data (ASD).
– If a hub is not available, collect two ASDs, one from controller A and one from controller B.
– To make the drive link statistics more accurate, it is recommended to run the following commands … 15-30 before collecting ASD:

clear allDriveChannels stats;
reset storagesubsystem RLSBaseline;
reset storagesubsystem SOCBaseline;
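
If the baseline reset is done regularly, the three commands can be kept in one place and emitted as a script. The sketch below only assembles the text, using the commands exactly as they appear on the slide. The function name `baseline_script` is hypothetical, and how the script is then executed depends on your Storage Manager setup.

```python
# Commands copied verbatim from the slide.
BASELINE_COMMANDS = [
    "clear allDriveChannels stats;",
    "reset storagesubsystem RLSBaseline;",
    "reset storagesubsystem SOCBaseline;",
]


def baseline_script():
    """Return the baseline-reset commands as script text, one per line."""
    return "\n".join(BASELINE_COMMANDS) + "\n"


print(baseline_script(), end="")
```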

Page 46: DS4000 Maintenance Skill

All support data

Page 47: DS4000 Maintenance Skill

Serial port output

Use PuTTY or a serial cable to connect to the DS4000.

Connection parameters

Page 48: DS4000 Maintenance Skill

Serial port output

Commands whose output should be collected

FW6:
– loadDebug
– moduleList 1
– arrayPrintSummary
– netCfgShow
– inetstatShow
– moduleShow
– cfgUnitList
– vdAll vdShow
– ghsList
– printBatteryAge
– cfgPhyList
– hwLogShow
– excLogShow
– spmShowMaps
– spmShow
– getObjectGraph_MT 1
– getObjectGraph_MT 4
– getObjectGraph_MT 8
– ccmStateAnalyze 8
– fcDevs 1
– i
– fc 111
– ionShow 99
– hdd 5
– fcAll
– socShow
– showEnclosures
– showEnclosuresPage81
– unld "ffs:Debug"

FW7:
– loadDebug
– moduleList 1
– evfShowOwnership
– cmgrShow
– vdmShowDriveList
– vdmShowRAIDVolList
– vdmDrmShowMgr
– vdmShowVGInfo
– evfShowAllVols
– bmgrShow 15
– bidShow 255
– tditnall
– iditnall
– fcnShow
– chall
– luall
– ionShow 12
– fcAll 10
– showSdStatus
– ionShow 99
– discreteLineTableShow
– ssmShowTree 2
– socShow
– showEnclosuresPage81
– excLogShow
– hwLogShow
– spmShowMaps
– spmShow
– fcHosts 3
– getObjectGraph_MT 1
– getObjectGraph_MT 4
– getObjectGraph_MT 8
– ccmShowState
– netCfgShow
– inetstatShow
– dqlist
– taskInfoAll 3
– tpgmShowSummary
– unld "ffs:Debug"

Page 49: DS4000 Maintenance Skill

DS4000 material

– Redbook
– Firmware package

Page 50: DS4000 Maintenance Skill

Q & A