BladeCenter ® Education

Servicing the IBM BladeCenter HS22 Type 7870

Study guide

XW5175

Release 1.00

March 2009

This course is owned and published by Charles Perkins and the IBM System x/BladeCenter WW Service & Support Education Team.

Servicing the IBM BladeCenter HS22 Type 7870 – Preface

March 2009 2 Xw5175r100.pdf

© International Business Machines Corporation, 2009 All rights reserved.

IBM System x Service and Support Education IBM Systems, Department EYGA. Building 203, Post Office Box 12195, Research Triangle Park, North Carolina 27709-2195

IBM reserves the right to change specifications or other product information without notice. This publication could include technical inaccuracies or typographical errors. References herein to IBM products and services do not imply that IBM intends to make them available in other countries. IBM provides this publication as is, without warranty of any kind —either expressed or implied—including the implied warranties of merchantability or fitness for a particular purpose. Some jurisdictions do not allow disclaimer of expressed or implied warranties. Therefore, this disclaimer may not apply to you.

Data on competitive products is obtained from publicly obtained information and is subject to change without notice. Please contact the manufacturer for the most recent information.

The following terms are trademarks or registered trademarks of IBM Corporation in the United States, other countries or both: Active Memory, Active PCI, AT, BladeCenter, the e-business logo, EasyServ, Enterprise X-Architecture, EtherJet, HelpCenter, HelpWare, IBM RXE-100 Remote Expansion Enclosure, IBM XA-32, IBM XA-64, IntelliStation, LANClient Control Manager, Memory ProteXion, NetBAY3, Netfinity, Netfinity Manager, Predictive Failure Analysis, RXE Expansion Port, SecureWay, ServeRAID, ServerProven, ServicePac, SMART Reaction, SMP Expansion Module, SMP Expansion Port, UM Services, Universal Manageability, Update Connector, Wake on LAN, XceL4 Server Accelerator Cache, XpandOnDemand scalability.

IBM Corporation Subsidiaries: Lotus, Lotus Notes, Domino, and SmartSuite are trademarks of Lotus Development Corporation. Tivoli and Planet Tivoli are trademarks of Tivoli Systems, Inc.

LLC, Adobe, and PostScript are trademarks of Adobe Systems, Inc. Intel Celeron, LANDesk®, MMX, Pentium II, Pentium III, Pentium 4, SpeedStep, and Xeon are trademarks or registered trademarks of Intel Corporation. Linux is a trademark of Linus Torvalds. Microsoft Windows® and Windows NT® are trademarks or registered trademarks of Microsoft Corporation. Other company, product, and service names may be trademarks or service marks of others.

For more information, visit: www.ibm.com/legal/copytrade/phtml

Preface

Servicing the IBM BladeCenter HS22 Type 7870, XW5175. This document may not be copied or sold, either in part or in whole, to non-IBM personnel. Please write your name and address below to personalize your copy.

Issued to:

Address:

Current release date: March 2009

Current release level: 1.00

The information contained within this publication is current as of the date of the latest revision and is subject to change at any time without notice. Please forward all comments and suggestions regarding the course material format and content to your local IBM System x service and support education country coordinator or contact.

Table of Contents

Preface

Table of Contents
− Prerequisites
− Objectives

Servicing the IBM BladeCenter HS22 Type 7870
− Overview
− IBM BladeCenter HS22 Type 7870 Front View
− IBM BladeCenter HS22 Type 7870 Rear View
− IBM BladeCenter HS22 Type 7870 Inside View
− Functional Comparison
− Architecture
− Memory Subsystem Overview
− Microprocessor Subsystem Overview
− Problem Determination - LightPath
− Problem Determination - Unified Extensible Firmware Interface (UEFI)
− Problem Determination - Logs
− Problem Determination - Collecting Logs
− Problem Determination - Firmware update
− Problem Determination - Diagnostics
− Problem Determination - Examples
− Tools available for problem determination
− Summary

Prerequisites

You should be Server+ certified or equivalent before taking this course.

You should also have completed the latest release of the self-paced learning courses:

− XW5033 IBM BladeCenter Fundamentals

− XW5173 Unified Extensible Firmware Interface (UEFI) Technology Brief

− XW5180 Integrated Management Module (IMM) Technology Brief

Objectives

Upon completion of this course, you will be able to:

1. Provide a high level overview of the HS22 Type 7870 blade server
2. Provide a high level overview of the HS22 memory subsystem
3. Provide a high level overview of the HS22 microprocessor subsystem
4. Describe how to use LightPath for problem determination
5. Describe how to use logs for problem determination
6. Describe how to collect logs for problem determination
7. Describe the firmware update process
8. Describe the diagnostics available
9. Provide problem determination examples
10. List tools available for problem determination

Servicing the IBM BladeCenter HS22 Type 7870

Overview

IBM BladeCenter HS22 Type 7870

The IBM BladeCenter HS22 Type 7870 is shown in Figure 1.

Figure 1

Features and Specifications - 7870 - BladeCenter HS22

Microprocessor:

Supports up to two Dual Core (DC) or Quad Core (QC) microprocessors

− Standard models ship with one quad core processor installed

Intel chipset

− Intel next generation 5500 series Nehalem EP DC / QC microprocessor

Note: Use the Configuration/Setup Utility program to determine the type and speed of the microprocessors in your blade server. Preboot Dynamic System Analysis (pDSA) and online Dynamic System Analysis (DSA) can also be used to determine processor type and speed.

The Intel Nehalem microprocessors are divided into four categories: performance, volume, value, and low voltage, as shown in Figure 2.

Figure 2

Memory:

12x Very Low Profile (VLP) Double Data Rate 3 (DDR3) Error Correcting Code (ECC) Registered Dual In-Line Memory Module (RDIMM)

RDIMMs of up to 8 GB each can be installed. (8 GB RDIMMs are announced with the HS22 but are not available until June 2009.)

Lower voltage (1.5 V instead of 1.8 V for DDR2)

Lower power (30%) than DDR2 or Fully Buffered DIMM (FBD) for the same speed

Higher speed than DDR2 (spec is 800, 1066, 1333 MHz)

− The more memory slots, the slower the maximum memory speed

− The more chips on the RDIMM, the slower the speed

− 1 RDIMM = 1333MHz, 2 RDIMMs = 1066MHz, 3 RDIMMs = 800MHz

The chart in Figure 3 provides information on DIMM population.

Figure 3
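The population rule above can be sketched as a small lookup. This is only an illustration, not an IBM tool: it assumes the RDIMM counts quoted above are per memory channel, and it assumes the effective speed is the lowest of the population limit, the processor's maximum memory speed, and the RDIMM's rated speed. The function and parameter names are hypothetical.

```python
# Population limit quoted in the text (assumed here to be per memory channel).
SPEED_BY_RDIMM_COUNT = {1: 1333, 2: 1066, 3: 800}


def effective_memory_speed(rdimm_count, cpu_max_mhz, rdimm_rated_mhz):
    """Return the assumed effective memory speed in MHz.

    Assumption: the slowest of the three limits wins.
    """
    if rdimm_count not in SPEED_BY_RDIMM_COUNT:
        raise ValueError("rdimm_count must be 1, 2, or 3")
    return min(SPEED_BY_RDIMM_COUNT[rdimm_count], cpu_max_mhz, rdimm_rated_mhz)


if __name__ == "__main__":
    # Two 1333 MHz RDIMMs behind a processor that supports 1333 MHz memory.
    print(effective_memory_speed(2, 1333, 1333))
```

For actual population guidance, the chart in Figure 3 and the product documentation remain the reference.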

Drives:

Supports up to two hot-swap, Small Form Factor (SFF) Serial Attached SCSI (SAS) hard disk drives.

Integrated Functions:

Expansion card interface

Local service processor: Integrated Management Module (IMM) with Intelligent Platform Management Interface (IPMI) firmware

Integrated Matrox G200eV video controller

LSI 1064E SAS controller

Broadcom BCM5709S dual-port Gigabit Ethernet controller

Concurrent keyboard/video/mouse (cKVM) is supported

Light path diagnostics

RS-485 interface for communication with the management module

Automatic Server Restart (ASR)

Universal Serial Bus (USB) 2.0 for communication with keyboard, mouse and removable media drives (an external USB port is not supported)

Serial over LAN (SOL)

Redundant buses for communication with keyboard, mouse, and removable media drives

Predictive Failure Analysis:

Microprocessor

Memory

Hard disk drives

Electrical input:

12 V DC

Size:

Height: 24.5 cm (9.7 inches) (6U)

Depth: 44.6 cm (17.6 inches)

Width: 2.9 cm (1.14 inches)

Maximum weight: 4.8 kg (10 lb)

Warranty

Warranty - Worldwide

Warranty Period - 3 years

Warranty Service Type - 5

Warranty Service Level - 1

Note: Warranty Service Type 5 – Customer Replaceable Unit (CRU) and On-site Service - At IBM's discretion you will receive CRU service or IBM or your reseller will repair the failing Machine at your location and verify its operation. You must provide suitable working area to allow disassembly and reassembly of the IBM Machine. The area must be clean, well lit and suitable for the purpose.

Note: Warranty Service Level 1 - Next Business Day (NBD), 9X5

IBM BladeCenter HS22 Type 7870 Front View

There are two release levers and a control panel on the front of the IBM BladeCenter HS22 Type 7870 as shown in Figure 4.

Figure 4

Cam levers

The IBM BladeCenter HS22 Type 7870 is a 30mm, vertically mounted frame with two cam lever handles located in the center of the unit. The HS22 7870 is removed from the BladeCenter chassis by opening the release levers outward from the center as shown in Figure 5.

Figure 5

Control Panel

The control panel of the IBM BladeCenter HS22 Type 7870 is shown in Figure 6.

Figure 6

The position of the control panel on the IBM BladeCenter HS22 is shown in Figure 7.

Figure 7

Power-on LED:

This green LED indicates the power status of the blade server in the following manner:

− Flashing rapidly: The Integrated Management Module (IMM) on the blade is initializing.

− Flashing slowly: The blade server has power but is not turned on.

− Lit continuously: The blade server has power and is turned on.

Note: When the HS22 blade server is powered on from a powered off state it takes 2 to 3 minutes for the system to boot to the operating system. The Power LED flashes rapidly while the Integrated Management Module (IMM) code is initialized.
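The three Power-on LED states can be captured as a lookup table, for example in a troubleshooting script or checklist. The sketch below only restates the meanings given above; the enum and function names are hypothetical and do not correspond to any IBM interface.

```python
from enum import Enum


class PowerLed(Enum):
    """Observed behavior of the green Power-on LED."""
    FLASHING_RAPIDLY = "flashing rapidly"
    FLASHING_SLOWLY = "flashing slowly"
    LIT_CONTINUOUSLY = "lit continuously"


# Meanings as described in the text above.
POWER_LED_MEANING = {
    PowerLed.FLASHING_RAPIDLY: "The IMM on the blade is initializing.",
    PowerLed.FLASHING_SLOWLY: "The blade server has power but is not turned on.",
    PowerLed.LIT_CONTINUOUSLY: "The blade server has power and is turned on.",
}


def describe_power_led(state):
    """Return the documented meaning of a Power-on LED state."""
    return POWER_LED_MEANING[state]


if __name__ == "__main__":
    for state in PowerLed:
        print(f"{state.value}: {describe_power_led(state)}")
```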

Activity LED:

When this green LED is lit, it indicates that there is activity on remote drives or Ethernet network (ENET).

Location LED:

The system administrator can remotely turn on this blue LED to aid in visually locating the blade server. When this LED is lit, the location LED on the BladeCenter unit is also lit.

Information LED:

When this amber LED is lit, it indicates that information about a system error in the blade server has been placed in the management-module event log.

Fault LED:

When this amber LED is lit, it indicates that a system error has occurred in the blade server. The blade-error LED turns off only after the error is corrected.

Power-control button:

Press the power-control button to turn the blade server on or off.

Note: The power-control button has effect only if local power control is enabled for the blade server. Local power control is enabled and disabled through the Advanced Management Module (AMM) web interface.

Keyboard/video/mouse (KVM) select button:

Press the Keyboard/video/mouse (KVM) select button to associate the shared BladeCenter unit keyboard port, video port, and mouse port with the blade server. The LED on this button flashes while the request is being processed and then is lit when the ownership of the keyboard, video, and mouse has been transferred to the blade server. It can take approximately 20 seconds to switch the keyboard, video, and mouse control to the blade server.

Using a keyboard that is directly attached to the AMM, you can press keyboard keys in the following sequence to switch KVM control between blade servers instead of using the KVM select button:

− NumLock NumLock blade_server_number Enter

− The blade_server_number is the two-digit number of the blade-server bay in which the blade server is installed. A blade server that occupies more than one blade-server bay is identified by the lowest bay number that it occupies.

− If there is no response when you press the KVM select button, you can use the AMM web interface to determine whether local control has been disabled on the blade server.
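The key sequence above can be illustrated with a short helper that builds the list of keys for a given blade. This is a sketch only: the function name is hypothetical, and the range check simply enforces that the bay number fits in two digits, since the valid bay range depends on the chassis.

```python
def kvm_key_sequence(occupied_bays):
    """Build the keyboard sequence that switches KVM control to a blade.

    occupied_bays: the bay number(s) the blade server occupies. A blade
    that spans several bays is identified by its lowest bay number.
    """
    bays = list(occupied_bays)
    if not bays:
        raise ValueError("at least one bay number is required")
    bay = min(bays)
    if not 1 <= bay <= 99:
        # Assumption: only check that the number fits in two digits.
        raise ValueError("bay number must fit in two digits")
    return ["NumLock", "NumLock", f"{bay:02d}", "Enter"]


if __name__ == "__main__":
    print(kvm_key_sequence([3]))     # single-wide blade in bay 3
    print(kvm_key_sequence([7, 6]))  # blade occupying bays 6 and 7
```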

Note: The operating system in the blade server must provide Universal Serial Bus (USB) support for the blade server to recognize and use the keyboard and mouse, even if the keyboard and mouse have PS/2-style connectors.

Note: If you install a supported Microsoft Windows operating system on the blade server while it is not the current owner of the keyboard, video, and mouse, a delay of up to one minute occurs the first time that you switch the keyboard, video, and mouse to the blade server. All subsequent switching takes place in the normal KVM switching time frame (up to 20 seconds).

Media-tray select button:

Press the Media-tray select button to associate the shared BladeCenter unit media tray (removable-media drives) with the blade server. The LED on the button flashes while the request is being processed and then is lit when the ownership of the media tray has been transferred to the blade server. It can take approximately 20 seconds for the operating system in the blade server to recognize the media tray.

If there is no response when you press the media-tray select button, you can use the AMM web interface to determine whether local control has been disabled on the blade server.

Note: The operating system in the blade server must provide USB support for the blade server to recognize and use the removable-media drives.

IBM BladeCenter HS22 Type 7870 Rear View

Figure 8 shows the rear of an IBM BladeCenter HS22 Type 7870 system.

Figure 8

Note the following locations:

Power connectors

Data connectors

The HS22 connects to the midplane through four connectors – two for power and two for data exchange.

The BladeCenter chassis has built-in redundancy. The customer can enable redundant data paths in the chassis. If this has been done and half of the midplane fails, clients and remotely attached devices can still communicate through the alternate data connector.

Similarly, if a power supply fails, and the provided power redundancy is operational, the HS22 will continue to operate.

IBM BladeCenter HS22 Type 7870 Inside View

Figure 9 shows the locations of major components of the HS22 processor board.

Figure 9

Power-on considerations

When the HS22 blade server is powered on for the first time, or after power has been removed, it may take two to three minutes for the system to reach the operating system desktop. This is because the HS22 has to initialize the code for the Integrated Management Module (IMM). The Power LED flashes rapidly while the IMM code is being loaded.

HS22 Processor Board

The HS22 Processor Board supports up to two Intel Nehalem EP Dual Core (DC) or Quad Core (QC) processors.

The processor board has the following major components:

Two Land Grid Array (LGA) 1366 sockets for up to 2 Nehalem EP processors (a single processor is shipped standard)

Two Enterprise Voltage Regulator-Down (EVRD) regulators. The EVRD supplies both the processor core voltage and L2 cache voltage.

One Intel Tylersburg Input Output Hub (IOH): Host Bridge controller with PCI Express interface.

Twelve Double Data Rate 3 (DDR3) Very Low Profile (VLP) Dual Inline Memory Modules (DIMM) memory sockets

One Intel South Bridge: Intel I/O Controller Hub 10 (ICH10)

One Broadcom BCM5709S Gigabit Ethernet Controller

One LSI 1064E Serial Attached SCSI (SAS) Controller

LightPath diagnostic panel shown in Figure 10

Figure 10

Two SAS connectors for two 2.5” SAS Hard Drives

One 16 MB Flash Read Only Memory (ROM)

One Integrated Management Module (IMM) with Integrated VGA Controller

Two High Density Metric (HDM) Midplane connectors

One Blade Expansion connector

One 1Xe Daughter Card connector

One internal Universal Serial Bus (USB) connector for bootable Flashkey

− The only supported key for this connector is the Hypervisor Key

CPU - Nehalem Microprocessor

Integrated three-channel Double Data Rate 3 (DDR3) memory controller

− Second CPU required to use all 12 DIMMs

Two QuickPath Interconnect (QPI) links

− High-speed serial link between CPUs and chipset

Single-die Dual Core (DC) or Quad Core (QC), depending on CPU type

Shared 4 or 8 MB L3 cache, depending on CPU type

Hyper-Threading

− Hyper-Threading the four cores allows each core to process up to two threads simultaneously, so that a quad core processor appears to the operating system as eight CPUs

− With two quad core processors installed, a total of 16 CPUs is possible in the operating system

High core frequencies (up to 2.93 GHz for server)
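The CPU counts quoted above follow from simple multiplication: installed processors times cores per processor times threads per core. A minimal sketch, with hypothetical function and parameter names:

```python
def logical_cpus(installed_processors, cores_per_processor, hyper_threading=True):
    """Number of CPUs the operating system sees.

    With Hyper-Threading each core handles up to two threads,
    so every core is presented as two logical CPUs.
    """
    if installed_processors not in (1, 2):
        raise ValueError("the HS22 supports one or two processors")
    if cores_per_processor not in (2, 4):
        raise ValueError("expected a dual core or quad core processor")
    threads_per_core = 2 if hyper_threading else 1
    return installed_processors * cores_per_processor * threads_per_core


if __name__ == "__main__":
    print(logical_cpus(1, 4))  # one quad core processor
    print(logical_cpus(2, 4))  # two quad core processors
```

One quad core processor yields 8 logical CPUs and two yield 16, which matches the figures in the list above.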

Microprocessor Option Part Numbers

Note: The microprocessor option comes with a heat sink.

2.93GHz 95W 6.40GT 1333MHz CPU 44T1887

2.80GHz 95W 6.40GT 1333MHz CPU 44T1886

2.67GHz 95W 6.40GT 1333MHz CPU 44T1885

2.53GHz 80W 5.86GT 1066MHz CPU 44T1884

2.40GHz 80W 5.86GT 1066MHz CPU 44T1883

2.26GHz 80W 5.86GT 1066MHz CPU 44T1736

2.13GHz 80W 4.80GT 800MHz CPU 43W5987

2.00GHz 80W 4.80GT 800MHz CPU 44T1712

1.86GHz 80W 4.80GT 800MHz CPU 44T1884

Nehalem Microprocessor and Land Grid Array (LGA) 1366 Socket Considerations

The Nehalem processor in the HS22 is seated on the LGA 1366 socket as shown in Figure 11.

Figure 11

LGA 1366 is also known as Socket B.

− The socket measures 60x82mm

− The LGA 1366 socket is not compatible with earlier processors.

− As with its predecessor, this socket has no holes. Instead, pins in the socket touch contact points on the underside of the CPU.

− The LGA 1366 socket in the HS22 has 1366 contacts.

When installing a Nehalem microprocessor, align the notches on the processor with the notches on the LGA 1366 socket, as shown in Figure 12.

Figure 12

Figure 13 shows the microprocessor dust cover. If no microprocessor is installed in the socket, the dust cover should remain in place. If the dust cover is removed, the pins in the socket can become contaminated with dust, which can prevent the microprocessor contacts from connecting properly with the pins. This problem was more common in older sockets that had holes instead of pins: dust would get into the holes and keep the processor pins from making good contact with the socket. If the customer upgrades to a second microprocessor, they should keep the dust cover in case they need it in the future.

Figure 13

Figure 14 shows the microprocessor filler. If no microprocessor is installed in the microprocessor socket, the microprocessor filler should be installed. The filler serves as an air baffle; removing it changes the airflow in the HS22 blade server and could cause the HS22 to overheat. If the customer upgrades by adding a second microprocessor, they should keep the microprocessor filler in case they need it in the future.

Figure 14

Processor Heat Sink Considerations

The HS22 Processor Option includes the Intel Nehalem processor and the IBM heat sink.

If two processors are installed, they must have the same features and functions but stepping levels can be different.

An HS22 blade using a single CPU must have it installed in the CPU 1 socket.

The system will not boot with an empty CPU 1 socket.

If the CPU 2 socket is empty, it must have a CPU heat sink filler installed to maintain proper airflow throughout the blade.

− If the heat sink filler is removed, this will change the airflow within the blade and may cause components to overheat.

− Under the heat sink filler is a dust cover which prevents dust contamination and should not be removed.

A terminator in the CPU 2 socket is not required, but the CPU heat sink filler is.

Older heat sinks are not compatible with the Intel Nehalem microprocessor.

When you reinstall the heat sink, make sure you alternate tightening the screws.
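The socket rules above lend themselves to a simple pre-service checklist. The following sketch encodes only the two rules stated in the text (CPU 1 must be populated, and an empty CPU 2 socket needs its heat sink filler); the function and argument names are hypothetical.

```python
def check_cpu_population(cpu1_installed, cpu2_installed, cpu2_filler_installed):
    """Return a list of problems with the processor population (empty if none)."""
    problems = []
    if not cpu1_installed:
        problems.append("CPU 1 socket is empty: the system will not boot.")
    if not cpu2_installed and not cpu2_filler_installed:
        problems.append(
            "CPU 2 socket is empty without a heat sink filler: "
            "airflow changes and components may overheat."
        )
    return problems


if __name__ == "__main__":
    # Standard single-processor configuration with the filler in place.
    print(check_cpu_population(True, False, True))
    # Filler was left out after removing the second processor.
    print(check_cpu_population(True, False, False))
```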

Memory Population Requirements

Three-channel Dual Inline Memory Modules (DIMMs):

− 12 Double Data Rate 3 (DDR3) DIMM connectors

Memory option is sold in singles

Each CPU has its own memory DIMM bank

DIMMs do not have to be added in pairs

One processor supports up to 6 DIMM slots

The IBM BladeCenter HS22 Type 7870 Installation and User’s Guide and the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide have detailed information on the various memory modes available and recommended DIMM population guidelines for those modes.
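As a worked example of these rules, the number of usable DIMM slots and the resulting maximum memory can be computed from the number of installed processors. This is a sketch under the figures given in this guide (6 slots per processor, RDIMM sizes of 1, 2, 4, and 8 GB); the names are hypothetical, and it does not model memory modes or the population guidelines in the guides mentioned above.

```python
SLOTS_PER_PROCESSOR = 6
SUPPORTED_RDIMM_SIZES_GB = (1, 2, 4, 8)


def usable_dimm_slots(installed_processors):
    """Each processor drives its own bank of six DIMM slots."""
    if installed_processors not in (1, 2):
        raise ValueError("the HS22 supports one or two processors")
    return SLOTS_PER_PROCESSOR * installed_processors


def max_memory_gb(installed_processors, rdimm_size_gb):
    """Maximum memory when every usable slot holds an RDIMM of the given size."""
    if rdimm_size_gb not in SUPPORTED_RDIMM_SIZES_GB:
        raise ValueError("unsupported RDIMM size")
    return usable_dimm_slots(installed_processors) * rdimm_size_gb


if __name__ == "__main__":
    print(usable_dimm_slots(1))  # slots usable with one processor
    print(max_memory_gb(2, 8))   # two processors, 8 GB RDIMMs
```

With two processors and 8 GB RDIMMs in all twelve slots the result is 96 GB.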

Memory Option Part Numbers

1GB 1R PC3-10600R CL9, VLP Registered Dual Inline Memory Module (RDIMM) 1333MHz 44T1485

2GB 1R PC3-10600R CL9, VLP RDIMM 1333MHz 44T1487

2GB 2R PC3-10600R CL9, VLP RDIMM 1333MHz 44T1486

4GB 2R PC3-10600R CL9, VLP RDIMM 1333MHz 44T1488

8GB 2R PC3-10600R CL9, VLP RDIMM 1066MHz 44T1579

USB Key for Hypervisor

Option Part Number - 41Y8268

Hard Disk Drives Option Part Numbers

IBM 73GB 10K SAS SFF HS 43W7535

IBM 146GB 10K SAS SFF HS 43W7536

IBM 73GB 15K SAS SFF HS 43W7545

IBM 146GB 15K SAS SFF HS 42D0677

IBM 300GB SATA SFF HS 43W7670

IBM 300GB 10K SAS SFF HS 42D0637

IBM 31.4GB SSD HS 43W7648

IBM 146GB 10K 6Gbps SAS 2.5" SFF Slim-HS HDD 42D0632

IBM 73GB 15K 6Gbps SAS 2.5" SFF Slim-HS HDD 42D0637

Note: Part Numbers are subject to change. Make sure you check the IBM support site for the part number.

Functional Comparison

Processor
− HS21: Woodcrest Dual Core (DC) / Clovertown Quad Core (QC); Wolfdale DC / Harpertown QC
− HS21 XM: Woodcrest DC / Clovertown QC; Wolfdale DC / Harpertown QC
− HS22: Nehalem DC and QC

Form Factor
− HS21: All single wide models
− HS21 XM: All single wide models; no 120W CPU support in BladeCenter Enterprise (BCE) chassis
− HS22: All single wide models

Systems Management
− HS21: Supported by Management Module (MM) and Advanced Management Module (AMM)
− HS21 XM: Supported by MM and AMM
− HS22: AMM only

Memory
− HS21: Four Fully Buffered (FB) Dual Inline Memory Modules (DIMMs) up to 4GB
− HS21 XM: 8 FB DIMMs up to 4GB
− HS22: 12 Very Low Profile (VLP) Double Data Rate 3 (DDR3) dual rank DIMMs up to 4GB DIMMs

I/O Support
− HS21: Peripheral Component Interconnect eXtended (PCI-X) daughter card for legacy; Common Form Factor horizontal (CFFh) for High Speed (HS) switches
− HS21 XM: PCI-X daughter card for legacy switches; CFFh daughter card for HS switches
− HS22: Combination Input Output vertical (CIOv) daughter card for legacy switches; CFFh daughter card for High Speed switch

Internal Storage
− HS21: Non hot swap hard drives; two 2.5” Serial Attached SCSI (SAS) drives
− HS21 XM: One 2.5” SAS or two 2.5” solid state SATA
− HS22: Two hot swap hard drives; two 2.5” SAS or two 2.5” solid state

RAID
− HS21: LSI 1064 Redundant Array of Independent Disks (RAID) controller
− HS21 XM: LSI 1064 RAID controller
− HS22: LSI 1064 RAID controller - Down; LSI 1068 Option Card w/ Battery backed Cache

Expansion
− HS21: Blade Storage Expansion-3 (BSE-3) Storage Expansion; three hot swap SAS HDDs; two PCI-X daughter cards for legacy switches
− HS21 XM: BSE-3 Storage Expansion; three hot swap SAS HDDs; two PCI-X daughter cards for legacy switches
− HS22: To Be Determined (TBD)

Ethernet
− HS21: 2 x Broadcom 5708S (two 1Gb Ethernet ports)
− HS21 XM: 2 x 1 Broadcom 5708S (two 1Gb Ethernet ports)
− HS22: One Broadcom 5709 (two 1Gb Ethernet ports)

Security
− HS21: No TPM
− HS21 XM: No TPM
− HS22: TPM 1.2

USB
− HS21: Midplane only
− HS21 XM: Midplane plus on board Universal Serial Bus (USB) modular flash drive
− HS22: Midplane plus internal USB port and internal Ethernet over USB

PCI-Express
− HS21: CFFh and legacy High Speed Daughter Cards (HSDCs) / 8x PCI-Express (PCI-E)
− HS21 XM: CFFh and legacy HSDCs / 8x PCI-E
− HS22: CFFh only HSDC / 16x PCI-E

Architecture

The IBM HS22 processor board provides the major functions of the IBM HS22 blade. A basic block diagram of the HS22 processor board is shown in Figure 15.

Figure 15

Microprocessor – Nehalem EP

The HS22 Processor Board design supports up to two Intel Nehalem-EP Dual Core (DC) or Quad Core (QC) processors. The Nehalem processor supports frequencies of up to 2.93 GHz.

Tylersburg Input Output Hub (IOH)

The Tylersburg IOH provides the interface between the processors and the PCI Express buses that connect to the Intel Controller Hub 10 (ICH10), the high speed daughter card connector, and the Blade Storage Expansion (BSE) connector.

PCI-E

PCI Express (PCI-E) is a high performance, general purpose I/O interconnect used for a variety of computing and communication platforms. It maintains key PCI features, but is a fully serial interface rather than the parallel bus architecture found in conventional PCI. PCI Express can be used for universal connectivity as a chip-to-chip interconnect, an I/O interconnect for adapter cards, or an I/O attach point to other interconnects.

Memory Subsystem

The HS22 blade server memory is contiguous and is shared by both processors. It is Error Correction Code (ECC) protected and supports 1 GB to 96 GB using 1 GB, 2 GB, 4 GB or 8 GB Very Low Profile (VLP) Double Data Rate 3 (DDR3) Registered Dual Inline Memory Modules (RDIMMs) on twelve DIMM connectors. The processors have integrated DDR3 memory controllers and interface directly to the DDR3 RDIMMs. Because of thermal limitations, quad rank RDIMMs are not supported, only single and dual rank RDIMMs.

SAS Subsystem

The Serial Attached SCSI (SAS) subsystem in the HS22 blade server consists of the LSI 1064E SAS controller and two onboard hot-swap Small Form Factor (SFF) SAS hard drives. The LSI 1064E SAS controller contains four SAS ports operating at a burst rate of 1.5 or 3.0 Gb/s. Two of the SAS ports connect to the hard drives through 29-pin SFF SAS connectors.

Ethernet Subsystem

The HS22 blade server CPU board’s Ethernet subsystem consists of a Broadcom BCM5709S Ethernet Controller. The BCM5709S controller provides a Full Duplex (FD) 1000 Mbps Ethernet network connection.

Integrated Management Module (IMM)

The IMM combines multiple functions on a single chip. It combines the functionality of the Baseboard Management Controller (BMC) and the Remote Supervisor Adapter II (RSA-II). The video controller, the Remote Presence/cKVM functionality, and some of the functionality formerly provided by the Super I/O are also part of the IMM.

The IMM has a MIPS® 4KEc® 300 MHz 32-bit processor. Numerous General Purpose Input Outputs (GPIOs), Analog to Digital (A/D) converters, Pulse Width Modulation (PWM) outputs, and I2C buses are available for monitoring server environmental conditions such as temperatures and voltages.

The server’s video function is provided by a Matrox G200eV video core. The memory controller controls the 128 MB of Double Data Rate (DDR) Random Access Memory (RAM) that is shared by the video controller and by the MIPS CPU, which uses it to run its firmware. The digital video compression hardware reduces the amount of video data that must be transmitted when the cKVM feature is running.

The hardware-based 128-bit Advanced Encryption Standard (AES) encryption engine assists with the encryption and decryption of secure data. The Universal Serial Bus (USB) 1.1 and 2.0 peripherals implement the remote keyboard, remote mouse, and remote media functions of the cKVM, and also provide inband access to the IMM. Some features that are normally part of the Super I/O, such as an Advanced Configuration and Power Interface (ACPI) controller and 16550-compatible Universal Asynchronous Receiver/Transmitters (UARTs), are available in this chip as well.

Figure 16 shows the IMM architecture.

Figure 16


System Firmware

The Integrated Management Module (IMM) firmware, Unified Extensible Firmware Interface (UEFI) firmware and Preboot Dynamic System Analysis (pDSA) firmware are all kept on the IMM. Each must be flashed separately. The pDSA and UEFI firmware is specific to the system it is designed for. The IMM firmware is universal and can be installed on any IBM server that has an IMM implemented on the system board.

The IMM, pDSA, and UEFI firmware can be updated locally without the need to reboot the server to initiate the update. The UEFI has 16MB of firmware space. It has 8 MB for primary and 8 MB for backup firmware.

The flash update packages for Windows and Linux provide for the IMM, pDSA, and UEFI firmware. There are no DOS or ZIP packages. The Windows and Linux flash utilities (iflash) can be run locally on the blade. The IMM and UEFI firmware can also be updated remotely through the AMM. You can point the AMM to the update package without the need to extract the firmware update file from the Windows/Linux flash package for flashing. The IMM locates the firmware payload inside the update package.

Note: Due to size limitations, pDSA cannot be updated remotely through the AMM. The inband flash utility must be run to update the pDSA firmware.
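The update paths described above can be summarized as a small lookup. The following Python sketch is illustrative only: the names `UPDATE_PATHS` and `can_update_remotely` are hypothetical and do not correspond to any IBM utility or API.

```python
# Illustrative summary of the HS22 firmware update paths described above.
# All names here are hypothetical; this is not an IBM tool or API.

UPDATE_PATHS = {
    # firmware name: supported update paths
    "IMM": {"inband", "amm"},
    "UEFI": {"inband", "amm"},
    "pDSA": {"inband"},  # too large to be pushed through the AMM
}


def can_update_remotely(firmware: str) -> bool:
    """Return True if the firmware can be flashed through the AMM."""
    return "amm" in UPDATE_PATHS[firmware]


if __name__ == "__main__":
    for name in UPDATE_PATHS:
        path = "AMM or inband" if can_update_remotely(name) else "inband only"
        print(f"{name}: {path}")
```

A technician planning a remote update session could use such a table to confirm that pDSA still needs the inband flash utility run on the blade.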


Memory Subsystem Overview

The IBM HS22 uses Very Low Profile (VLP) Double Data Rate 3 (DDR3) Registered Dual Inline Memory Modules (RDIMMs). The RDIMMs are directly attached to the Nehalem microprocessor. Each Nehalem microprocessor has three independent memory channels. The HS22 memory subsystem supports the following memory modes:

Independent Channel Mode – Best Performance

− Independent channel mode gives a maximum of 48 GB of usable memory with one CPU installed, and 96 GB of usable memory with 2 CPUs installed (using 8 GB RDIMMs). The RDIMMs can be installed without matching sizes.

Memory Mirroring Mode

− The maximum memory available is 32 GB in a single CPU system and 64 GB in a dual CPU system. All three channels must have identical population with regards to size and organization. RDIMMs within a channel do not have to be identical.
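The independent channel mode maximums quoted above follow from simple arithmetic. The sketch below assumes that the twelve DIMM connectors are split evenly, six per CPU, which is consistent with the 48 GB and 96 GB figures; the function name `max_independent_gb` is hypothetical.

```python
# Sketch of the independent channel mode capacity figures quoted above.
# Assumption: the twelve DIMM connectors are split evenly, six per CPU.

CONNECTORS_PER_CPU = 6
SUPPORTED_RDIMM_GB = (1, 2, 4, 8)


def max_independent_gb(cpus: int, rdimm_gb: int = 8) -> int:
    """Return the maximum memory in GB when every connector is populated."""
    if cpus not in (1, 2):
        raise ValueError("the HS22 holds one or two microprocessors")
    if rdimm_gb not in SUPPORTED_RDIMM_GB:
        raise ValueError(f"unsupported RDIMM size: {rdimm_gb} GB")
    return cpus * CONNECTORS_PER_CPU * rdimm_gb


if __name__ == "__main__":
    print(max_independent_gb(1))  # one CPU, 8 GB RDIMMs
    print(max_independent_gb(2))  # two CPUs, 8 GB RDIMMs
```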

The RDIMMs supported in the HS22 are sold in singles. DDR3 RDIMMs operate at 1.5 V, which is lower than the 1.8 V used by DDR2 DIMMs. The HS22 uses registered DIMMs only; unbuffered DIMMs are not supported. Only single and dual rank DIMMs are supported by the HS22. The RDIMMs used in the HS22 have been tested and qualified for 1333 MHz. The actual memory speed in the system depends on the CPU and on the number of RDIMMs installed. Each CPU is rated for a maximum memory speed, and the resulting memory speed is as follows:

1333 MHz CPU with 1 RDIMM per channel runs at 1333 MHz
1333 MHz CPU with 2 RDIMMs per channel runs at 1066 MHz
1066 MHz CPU with 1 RDIMM per channel runs at 1066 MHz
1066 MHz CPU with 2 RDIMMs per channel runs at 1066 MHz
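The four cases listed above can be captured as a lookup table. This is a minimal sketch that covers only the combinations stated in the text; the names `MEMORY_SPEED_MHZ` and `memory_speed_mhz` are hypothetical.

```python
# Lookup of the effective memory speed for the combinations listed above.
# Only the four documented cases are included; anything else is rejected.

MEMORY_SPEED_MHZ = {
    # (CPU rated memory speed in MHz, RDIMMs per channel): resulting MHz
    (1333, 1): 1333,
    (1333, 2): 1066,
    (1066, 1): 1066,
    (1066, 2): 1066,
}


def memory_speed_mhz(cpu_rated_mhz: int, rdimms_per_channel: int) -> int:
    """Return the memory speed for a documented CPU/RDIMM combination."""
    key = (cpu_rated_mhz, rdimms_per_channel)
    if key not in MEMORY_SPEED_MHZ:
        raise ValueError(f"combination not covered by the table: {key}")
    return MEMORY_SPEED_MHZ[key]


if __name__ == "__main__":
    print(memory_speed_mhz(1333, 2))
```

The table form makes the key point visible: adding a second RDIMM to a channel only costs speed on a CPU rated for 1333 MHz.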

HS22 Memory Channels Table

Memory Channel    DIMM Connector
Channel 0         DIMM Connector 1 and 2
Channel 1         DIMM Connector 5 and 6
Channel 2         DIMM Connector 3 and 4
Channel 3         DIMM Connector 7 and 8
Channel 4         DIMM Connector 11 and 12
Channel 5         DIMM Connector 9 and 10


Figure 17 shows the IBM HS22 memory logical connections.

Figure 17


Memory Population Considerations

A CPU must be populated for access to its RDIMMs. To achieve the maximum of 12 RDIMMs, both CPUs must be populated. You always populate the end of the memory channel first.

− The end of the memory channel is always an even numbered memory socket.

Memory performance is better if all channels are occupied and populated with four ranks per channel.

− The best case scenario would be all channels matching.

Microprocessor 1 is installed

− A memory RDIMM module must be installed in memory connectors 2 and 6.

Both microprocessor 1 and 2 are installed

− A memory RDIMM module must be installed in memory connectors 2, 4, 8 and 10.

When installing a microprocessor a memory DIMM or filler must be installed in the RDIMM connector closest to the microprocessor.

− Install a memory RDIMM or filler in the memory connector closest to microprocessor 1 after installing the processor.

− Install a memory RDIMM or filler in the memory connector closest to microprocessor 2 after installing the processor.
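The rule that the end of a memory channel (the even numbered connector) is populated first can be combined with the HS22 Memory Channels Table. The following sketch is illustrative; `CHANNEL_CONNECTORS`, `fill_order`, and `channel_of` are hypothetical names, and the connector pairs are copied from the table in this guide.

```python
# Channel-to-connector pairs from the HS22 Memory Channels Table.
CHANNEL_CONNECTORS = {
    0: (1, 2),
    1: (5, 6),
    2: (3, 4),
    3: (7, 8),
    4: (11, 12),
    5: (9, 10),
}


def fill_order(channel: int) -> tuple:
    """Return a channel's connectors in population order (end first)."""
    connectors = CHANNEL_CONNECTORS[channel]
    # The end of the channel is the even numbered connector, so sort
    # even connectors (remainder 0) ahead of odd ones (remainder 1).
    return tuple(sorted(connectors, key=lambda c: c % 2))


def channel_of(connector: int) -> int:
    """Return the memory channel that a DIMM connector belongs to."""
    for channel, connectors in CHANNEL_CONNECTORS.items():
        if connector in connectors:
            return channel
    raise ValueError(f"no such DIMM connector: {connector}")


if __name__ == "__main__":
    for ch in CHANNEL_CONNECTORS:
        print(f"Channel {ch}: populate connectors in order {fill_order(ch)}")
```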

Memory Removal Considerations

When you remove a microprocessor you must remove the memory RDIMM or memory filler closest to it first.

− Remove the memory RDIMM or filler from the memory socket closest to microprocessor 1 before removing the processor.

− Remove the memory RDIMM or filler from the memory socket closest to microprocessor 2 before removing the processor.


Memory Mirroring Mode Considerations

If you are installing the ServeRAID-MR10ie Combination Input Output vertical (CIOv) storage interface card, the battery for the card can be installed in DIMM socket 7. DIMM socket 8 is still available because it is at the end of the memory channel, which must be populated first, as shown in Figure 18. This does affect Memory Mirroring Mode on that CPU because that memory channel has only one DIMM socket available. For memory mirroring to work, the other mirrored memory channel can have only one DIMM installed.

Figure 18

Removing a Memory Module

To remove a DIMM, complete the following steps.

____1. Before you begin, read the Safety statements and Installation guidelines in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide.


____2. If the blade server is installed in a BladeCenter unit, remove it (see Removing the blade server from the BladeCenter unit in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____3. Remove the blade server cover (see Removing the blade server cover in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____4. If an optional expansion unit is installed, remove the expansion unit (see Removing an optional expansion unit in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____5. Locate the DIMM connectors (see Blade server connectors in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide). Determine which RDIMM you want to remove from the blade server.

Attention: To avoid breaking the retaining clips or damaging the DIMM connectors, handle the clips gently.

____6. Move the retaining clips on the ends of the DIMM connector to the open position by pressing the retaining clips away from the center of the DIMM.

____7. Using your fingers, pull the RDIMM out of the connector.


Note: To access DIMM connector seven through twelve, use your fingers to lift the DIMM access door shown in Figure 19.

Figure 19

____8. Install an RDIMM or DIMM filler in each empty DIMM connector (see Installing a memory module in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

Note: An RDIMM or DIMM filler must occupy each DIMM socket before the blade server is turned on.

____9. If you are instructed to return the RDIMM, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you.


Installing a DIMM Module

To install a DIMM, complete the following steps:

____1. Locate the DIMM connectors (see Blade server connectors in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide). Determine which DIMM connector you will be installing memory into.

____2. If a DIMM filler or another memory module is already installed in the DIMM connector, remove it (see Removing a memory module in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

Note: An RDIMM or DIMM filler must occupy each DIMM socket before the blade server is turned on.

____3. If you are installing an RDIMM in DIMM connector seven through twelve, use your fingers to lift the DIMM access door shown in Figure 20.

____4. Touch the static-protective package that contains the RDIMM to any unpainted metal surface on the BladeCenter unit or any unpainted metal surface on any other grounded rack component in the rack in which you are installing the RDIMM for at least two seconds; then, remove the RDIMM from its package.

____5. To install the RDIMMs, repeat the following steps for each RDIMM that you install:

Attention: To avoid breaking the retaining clips or damaging the DIMM connectors, handle the clips gently.

a. Make sure that the retaining clips are in the open position, away from the center of the DIMM connector.

b. Turn the RDIMM so that the RDIMM keys align correctly with the DIMM connector on the system board.


Note: To access DIMM connector seven through twelve, use your fingers to lift the DIMM access door shown in Figure 20.

Figure 20

c. Press the RDIMM into the DIMM connector. The retaining clips lock the RDIMM into the connector.

d. Make sure that the small tabs on the retaining clips are in the notches on the RDIMM. If there is a gap between the RDIMM and the retaining clips, the RDIMM has not been correctly installed. Press the RDIMM firmly into the connector, and then press the retaining clips toward the RDIMM until the tabs are fully seated. When the RDIMM is correctly installed, the retaining clips are parallel to the sides of the RDIMM.

____6. If the DIMM access door is open, use your fingers to close it.

____7. Install the cover onto the blade server (see Closing the blade server cover in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____8. Install the blade server into the BladeCenter unit (see Installing the blade server in a BladeCenter unit in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).


Microprocessor Subsystem Overview

The HS22 can support up to two Dual Core (DC) or Quad Core (QC) Intel® Xeon® Processor 5500 Series microprocessors. Nehalem EP is the Intel code name for these microprocessors. The Nehalem EP microprocessor supports Hyper-threading, with two software threads per core. This makes the microprocessor well suited to highly threaded applications such as databases, multimedia, and search engines. The microprocessor has a shared, inclusive Level 3 cache of either 4 MB or 8 MB, depending on the microprocessor. The Nehalem microprocessors have integrated Double Data Rate 3 (DDR3) memory controllers and interface directly to the DDR3 Registered Dual Inline Memory Modules (RDIMMs). The HS22 supports CPU throttling. When placed into turbo mode, the microprocessor increases performance by raising the processor frequency when conditions allow. Figure 21 shows the Nehalem microprocessor.

Figure 21
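As a quick illustration of the two software threads per core mentioned above, the number of logical processors that an operating system would see can be estimated as follows. This is a sketch only: the function name is hypothetical, and the `hyperthreading` flag is an assumption that covers the case where Hyper-threading is not available or not enabled on the installed microprocessor.

```python
# Estimate of logical processors from sockets, cores, and Hyper-threading.

THREADS_PER_CORE = 2  # two software threads per core with Hyper-threading


def logical_processors(sockets: int, cores_per_socket: int,
                       hyperthreading: bool = True) -> int:
    """Return the number of logical processors presented to the OS."""
    if sockets not in (1, 2):
        raise ValueError("the HS22 holds one or two microprocessors")
    if cores_per_socket not in (2, 4):
        raise ValueError("HS22 microprocessors are dual core or quad core")
    threads = THREADS_PER_CORE if hyperthreading else 1
    return sockets * cores_per_socket * threads


if __name__ == "__main__":
    print(logical_processors(2, 4))  # two quad core microprocessors
```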


The HS22 can be upgraded by the customer with a second Nehalem microprocessor of the same speed and power. The HS22 ships standard with one of these Nehalem microprocessors:

1.86 GHz clock/4.8 Gbps QPI/800 MHz DDR3/2 core/4 MB cache/80 W
2.00 GHz/4.8 Gbps QPI/800 MHz DDR3/4 core/4 MB cache/80 W
2.13 GHz/4.8 Gbps QPI/800 MHz DDR3/4 core/8 MB cache/80 W
2.26 GHz/5.86 Gbps QPI/1066 MHz DDR3/4 core/8 MB cache/80 W
2.40 GHz/5.86 Gbps QPI/1066 MHz DDR3/4 core/8 MB cache/80 W
2.53 GHz/5.86 Gbps QPI/1066 MHz DDR3/4 core/8 MB cache/80 W
2.67 GHz/6.4 Gbps QPI/1333 MHz DDR3/4 core/8 MB cache/95 W
2.80 GHz/6.4 Gbps QPI/1333 MHz DDR3/4 core/8 MB cache/95 W
2.93 GHz/6.4 Gbps QPI/1333 MHz/4 core/8 MB cache/95 W
2.26 GHz/5.86 Gbps QPI/1066 MHz/4 core/8 MB cache/60 W

The socket used by the Nehalem microprocessor is a Land Grid Array (LGA) 1366 Socket B. It supersedes the LGA 775 Socket T. There are no holes on these sockets. Instead, there are pins that touch contacts on the underside of the microprocessor. Figure 22 shows the LGA 1366 Socket B in an HS22 blade server.

Figure 22


Removing a Microprocessor and Heat Sink

____1. Before you begin, read the Safety statements and Installation guidelines in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide.

____2. If the blade server is installed in a BladeCenter unit, remove it (see Removing the blade server from the BladeCenter unit in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____3. Remove the blade server cover (see Removing the blade server cover in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____4. If an optional expansion unit is installed, remove the expansion unit (see Removing an optional expansion unit in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____5. Locate the microprocessor that will be removed (see Blade server connectors in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____6. Before removing the microprocessor, you must remove the memory module closest to the microprocessor.

a. If you are removing microprocessor 1, remove the memory module or DIMM filler from DIMM connector 1 (see Removing a memory module in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

b. If you are removing microprocessor 2, remove the memory module or DIMM filler from DIMM connector 6 (see Removing a memory module in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____7. Remove the heat sink.

Attention: Do not touch the thermal material on the bottom of the heat sink. Touching the thermal material will contaminate it. If the thermal material on the microprocessor or heat sink becomes contaminated, contact your service technician.

a. Loosen the screw on one side of the heat sink to break the seal with the microprocessor.

b. Use a screwdriver to loosen the screws on the heat sink, rotating each screw two full turns until each screw is loose.


c. Use your fingers to gently pull the heat sink from the microprocessor.

Attention: Do not use any tools or sharp objects to lift the release lever on the microprocessor socket. Doing so might result in permanent damage to the system board.

____8. Rotate the locking lever on the microprocessor socket from its closed and locked position until it stops in the fully open position (approximately a 135° angle). Lift the microprocessor retainer cover upward as shown in Figure 23.

Figure 23

Attention: Do not touch the connectors on the microprocessor or the microprocessor socket.


____9. Use your fingers to pull the microprocessor out of the socket shown in Figure 24.

Figure 24

____10. If you are instructed to return the microprocessor and heat sink, follow all packaging instructions and use any packaging materials for shipping that are supplied to you.


Installing a Microprocessor and Heat Sink

To install a microprocessor and heat sink, complete the following steps.

Attention: Do not use any tools or sharp objects to lift the locking lever on the microprocessor socket. Doing so might result in permanent damage to the system board.

Attention: Do not touch the contacts in the microprocessor socket. Touching these contacts might result in permanent damage to the system board.

____1. Install the microprocessor and heat sink.

a. Rotate the locking lever on the microprocessor socket from its closed and locked position until it stops in the fully open position (approximately a 135° angle).

b. Rotate the microprocessor retainer on the microprocessor socket from its closed position until it stops in the fully open position (approximately a 135° angle).


c. If a dust cover is installed over the microprocessor socket, lift the dust cover from the socket shown in Figure 25.

Figure 25

d. Touch the static-protective package that contains the microprocessor to any unpainted metal surface on the BladeCenter unit or any unpainted metal surface on any other grounded rack component; then, remove the microprocessor from the package.

e. Remove the dust cover from the bottom of the microprocessor.

f. Orient the triangle painted on the corner of the microprocessor with the triangle on the microprocessor socket.

g. Carefully place the microprocessor into the microprocessor socket, using the alignment notches on the microprocessor with the alignment tabs in the microprocessor socket as a guide.


Figure 26 shows the microprocessor being installed in the microprocessor socket.

Figure 26

Attention: Do not press the microprocessor into the socket. Make sure that the microprocessor is oriented and aligned correctly in the socket before you try to close the microprocessor retainer.

h. Carefully close the microprocessor retainer.

i. Rotate the locking lever on the microprocessor socket to the closed and locked position. Make sure that the lever is secured in the locked position by pressing the tab on the microprocessor socket.

____2. Install a heat sink on the microprocessor.


The heat sink being installed on the microprocessor is shown in Figure 27.

Figure 27

Note: The heat sink is not included in the CPU Field Replaceable Unit (FRU). The heat sink must be ordered separately.

Attention: Do not set the heat sink down after you remove the plastic cover. Do not touch the thermal material on the bottom of the heat sink. Touching the thermal material will contaminate it. If the thermal material on the microprocessor or heat sink becomes contaminated, contact your service technician.

a. Remove the plastic protective cover from the bottom of the heat sink.

b. Make sure that the thermal material is still on the bottom of the heat sink, then align and place the heat sink on top of the microprocessor in the retention bracket, thermal material side down. Press firmly on the heat sink.


c. Align the three screws on the heat sink with the holes on the heat sink retention module.

d. Press firmly on the captive screws and tighten them with a screwdriver, alternating among the screws until they are tight. If possible, each screw should be rotated two full rotations at a time. Repeat until the screws are tight. Do not over tighten the screws by using excessive force. If you are using a torque wrench, tighten the screws to 8.5 Newton-meters (Nm) to 13 Nm (6.3 foot-pounds to 9.6 foot-pounds).

____3. Reinstall the memory module or DIMM filler closest to the microprocessor you installed.

a. If you installed microprocessor 1, install the memory module into DIMM connector 1 (see Installing a memory module in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

b. If you installed microprocessor 2, install the memory module into DIMM connector 6 (see Installing a memory module in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____4. If you are using a single microprocessor, make sure that memory modules are installed in DIMM socket 2 and DIMM socket 6. If two microprocessors are installed in the blade server, make sure that memory modules are installed in DIMM socket 2, DIMM socket 6, DIMM socket 8, and DIMM socket 12. (See Installing a memory module in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide for more information on installing a memory module).

____5. Install the optional expansion unit, if you removed one from the blade server (see Installing an optional expansion unit in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide for instructions).

____6. Install the cover onto the blade server (see Closing the blade server cover in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).

____7. Install the blade server into the BladeCenter unit (see Installing the blade server in a BladeCenter unit in the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide).
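The torque range given in the heat sink installation step above is quoted in both Newton-meters and foot-pounds. The following sketch shows how the two sets of figures relate and how a reading could be checked against the range. The function names are hypothetical; the conversion factor is the standard value of one Newton-meter in foot-pounds.

```python
# Unit conversion and range check for the heat sink screw torque.

NM_TO_FT_LB = 0.7375621  # one Newton-meter expressed in foot-pounds
HEAT_SINK_TORQUE_NM = (8.5, 13.0)  # range quoted in this guide


def nm_to_ft_lb(nm: float) -> float:
    """Convert a torque in Newton-meters to foot-pounds."""
    return nm * NM_TO_FT_LB


def within_heat_sink_torque(nm: float) -> bool:
    """Return True if a torque reading falls inside the quoted range."""
    low, high = HEAT_SINK_TORQUE_NM
    return low <= nm <= high


if __name__ == "__main__":
    for value in HEAT_SINK_TORQUE_NM:
        print(f"{value} Nm is about {nm_to_ft_lb(value):.1f} foot-pounds")
```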


Problem Determination - LightPath

LightPath

When the customer has a failed component in a blade server, one of the first indications they receive is a lit fault LED on the front of the BladeCenter chassis. The customer then finds the HS22 blade server with the fault LED lit. Inside the HS22 blade server an error LED will be lit next to the failing component. The components in the HS22 that have error LEDs are the Registered Dual Inline Memory Modules (RDIMMs), microprocessors, Hard Disk Drives (HDDs) and the battery. Figure 28 shows the error LED locations on the HS22 blade server.

Figure 28


The lit error LEDs indicate which component or components have failed. There are additional LightPath error LEDs, shown in Figure 29, on the LightPath diagnostic panel.

Figure 29


These additional LightPath error LEDs are the System board error LED, Non-Maskable Interrupt (NMI) error LED, Over Temperature LED, Microprocessor mismatch error LED, and LightPath diagnostic error LED. The LightPath diagnostic switch is the blue button used to see which error LEDs are lit when the HS22 blade server is not in the BladeCenter chassis and does not have power. Simply press the blue switch, and the error LEDs beside the failing component or components are lit. The LightPath diagnostic panel is located on the system board of the HS22 near microprocessor 1, as shown in Figure 30.

Figure 30

Once the customer has found the fault LED lit on the HS22 blade server there are two situations that must be considered. They either can power off the server or they can't power it off. If they can power off the server all they have to do is remove it from the BladeCenter chassis. Once the HS22 is outside the chassis they remove the cover and press the LightPath diagnostic switch. When this is done any failing component error LEDs are lit.

If the customer reboots the server they can view the status of the LightPath LEDs by using Preboot Dynamic System Analysis (pDSA). To use pDSA perform the following steps:


____1. During the blade server Power-On Self Test (POST), press <F2> Diagnostics when prompted, as shown in Figure 31.

Figure 31


Once in <F2> Diagnostics select the LightPath button shown in Figure 32 to see what error LEDs are lit.

Figure 32

If the customer cannot power off or reboot the HS22 blade server, they can view the component error LEDs by using Online DSA. Viewing the error LED status in Online DSA looks the same as in pDSA. The customer can download Online DSA for the HS22 blade server from the following website when it is available.

https://www.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=SERV-DSA
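The three ways of viewing the LightPath error LEDs described in this section depend on whether the server can be powered off or rebooted. The sketch below summarizes that choice. The function name and the order of the checks are assumptions made for illustration, not an IBM procedure.

```python
# Which method to use for viewing LightPath error LEDs, per this section.

def lightpath_method(can_power_off: bool, can_reboot: bool) -> str:
    """Return the method for viewing the LightPath error LEDs."""
    if can_power_off:
        # Remove the blade, open the cover, press the blue switch.
        return "LightPath diagnostic switch"
    if can_reboot:
        # Press <F2> Diagnostics during POST and select LightPath.
        return "pDSA"
    # The server must stay up: run the tool from the operating system.
    return "Online DSA"


if __name__ == "__main__":
    print(lightpath_method(can_power_off=False, can_reboot=True))
```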


Problem Determination - Unified Extensible Firmware Interface (UEFI)

Pure UEFI does not support legacy operating systems. Legacy compatibility is provided by the Compatibility Support Module (CSM). The CSM provides all the resources required to boot a legacy OS, such as legacy BIOS calls and interrupts, memory maps and tables, and legacy boot mechanisms. To ensure the highest level of compatibility, IBM’s CSM is based on the IBM BIOS that has been shipping for years. The CSM is integrated into the UEFI code and does not require any interaction by the user to invoke.

Note: Currently Microsoft Windows Server 2008 is the only UEFI aware operating system.

The UEFI <F1> Setup utility is very similar in look and feel to the BIOS setup utility that IBM has been shipping for years. The screens are larger and UEFI has more configuration options available to it than legacy BIOS. To access the <F1> Setup utility, perform the following steps:

____1. During the system Power-On Self Test (POST), press <F1> Setup when prompted, as shown in Figure 33.

Figure 33


____2. The System Configuration and Boot Management screen appears, as shown in Figure 34. From this screen you can access system settings and configure them, set the server date and time, select or add start options, change the boot order, access the System Event Log (SEL), configure user security, restore settings, load default settings and save the settings. System Information is selected by default.

Figure 34


____3. Select System Information on the System Configuration and Boot Management screen. The System Information screen appears, as shown in Figure 35.

Figure 35


____4. On the System Information screen select System Summary from the menu. The System Summary screen appears, as shown in Figure 36. This screen summarizes the basic details of the HS22 blade server. Here you can see the speed of the microprocessor installed and the amount of memory installed as well as other information that can be used for problem determination purposes.

Figure 36


____5. Press Esc until you are back at the System Configuration and Boot Management screen. Select System Settings from the menu shown in Figure 37.

Figure 37


____6. The Systems Settings menu appears as shown in Figure 38. Processors is the default selection on the menu. From this screen you can configure the major components of the HS22 blade server. These components are processors, memory, devices and I/O ports, power, legacy support, Integrated Management Module, system security, adapters and Unified Extensible Firmware Interface (UEFI).

Figure 38

____7. To set the HS22 blade server to legacy boot press Esc until you are back at the System Configuration and Boot Management screen.

____8. From the System Configuration and Boot Management screen select Boot Manager.


____9. On the Boot Manager screen select Add Boot Option as shown in Figure 39.

Figure 39


____10. The File Explorer screen appears as shown in Figure 40. From the File Explorer screen choose Legacy Only. If Legacy Only is not in the menu then it has already been added to the Start Options menu.

Figure 40


____11. Press Esc until you get to the System Configuration and Boot Management screen. From this screen choose Start Options. The Start Options screen appears, as shown in Figure 41. Legacy Only now shows at the bottom of the start options.

Figure 41

____12. You can change the boot order from the Boot Manager menu.

Problem Determination - Logs

When a customer has a problem with a blade server in a BladeCenter chassis, the first indication is a lit fault LED on the chassis. The customer then finds the blade server whose fault LED is lit. Inside the blade server, an error LED is lit next to the failing component. The error LED is not the only indication of a problem: an error message is also generated during the Power-On Self Test (POST) of the system. The customer is unlikely to see this message because servers normally do not have displays attached. At this point we need to find the error message or messages generated by the failing component in the logs that are available for problem determination.

There are several ways of gathering more information for problem determination purposes. The first and foremost is the Advanced Management Module (AMM) Event log. This is the supported way to get problem determination information for blade servers in the BladeCenter chassis. The AMM Event log contains system status messages for all the blade servers installed in the BladeCenter chassis. The AMM is not the only way to view system status messages. Both Online Dynamic System Analysis (DSA) and Preboot DSA (pDSA) pull the system status messages from the Integrated Management Module (IMM).

When a customer has a problem with a server, you usually run into one of two situations: the customer can reboot the server, or the customer cannot. In either case you can still access the system status messages generated by the errors that occur.

You can view several logs in pDSA by performing the following steps:

____1. If the customer can reboot the server, you can access pDSA by pressing <F2> Diagnostics when prompted during the POST of the blade server, as shown in Figure 42.

Figure 42

____2. Once you are in <F2> Diagnostics you can view the IMM Log shown in Figure 43. The IMM log consists of the Intelligent Platform Management Interface (IPMI) Event log, the Chassis Event Log (CEL) and the Merged Logs. Here you can view the system status messages that occurred during POST, including the error messages generated by a failed component such as a microprocessor. If one of the cores in the processor failed, several errors would be generated in the IMM Log.

Figure 43

You can view the System Event Log (SEL) in the <F1> Setup utility to see the error messages generated during POST. Perform the following steps to get to the SEL.

____1. On a reboot of the blade server choose <F1> Setup. You are taken to the System Configuration and Boot Management screen shown in Figure 44.

Figure 44

____2. On the System Configuration and Boot Management screen select System Event Logs from the menu. The System Event Logs screen appears as shown in Figure 45.

Figure 45

____3. Select the POST Event Viewer on the System Event Logs screen to view the POST Event Viewer shown in Figure 46.

Figure 46

____4. Select System Event Log on the System Event Logs screen to view the SEL shown in Figure 47.

Figure 47

Note: The SEL holds a limited amount of information. It can be cleared from the System Event Logs screen of the <F1> Setup utility.

Figure 48 is an example of the contents in the SEL.

Figure 48

If the customer cannot power off the system, they can still look at the logs by using Online Dynamic System Analysis (DSA), if it is installed. Online DSA displays the logs in the same manner as pDSA. The logs it pulls are specific to the blade server on which it is installed, so Online DSA installed on the HS22 blade server displays the logs for that server only. Figure 49 shows the Online DSA Integrated Management Module (IMM) log and the logs that comprise it.

Figure 49

The customer can also access the Advanced Management Module (AMM) Event log. The AMM Event log contains system status messages, including errors, gathered from all the blade servers in the chassis. Each message is labeled with the bay in which the blade server is installed. If the customer has the HS22 blade server in bay 14, the system status messages for the HS22 show bay 14. If the customer moves the HS22 to bay 10, new system status messages for that blade server show bay 10, while earlier messages still show bay 14 because they were generated when the HS22 was in that bay. Figure 50 shows the AMM Event log.

Figure 50

The AMM Event log can be saved as a text file. The text file name is asm.elg. An example of the contents of this text file is shown in Figure 51.

Figure 51

Events in the AMM Event Log have three levels of severity: E = Error, W = Warning and I = Informational. Each event also has an Event ID, date stamp, time stamp, source and text string associated with it. Figure 52 breaks down the information provided in an event from the AMM Event log. The event in Figure 52 was taken from the text file created when the AMM Event log was saved to a file. The source of this event, Blade_14, is an HS22 blade server.

Figure 52

Note: The formatting of an event in the text file version of the AMM Event Log is slightly different from the same event when viewed using the web browser interface of the AMM. The content of the event will be the same.
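The fields listed above map naturally onto a small record type. The following is a minimal sketch in Python, assuming the events have already been read into records; the class name AmmEvent, the function errors_by_source and the sample dates, times and wording are hypothetical and are not taken from the asm.elg file format, which is shown only in the figures.

```python
from collections import defaultdict
from dataclasses import dataclass

# Severity letters used in the AMM Event log, as listed in the text.
SEVERITY_NAMES = {"E": "Error", "W": "Warning", "I": "Informational"}


@dataclass(frozen=True)
class AmmEvent:
    """One event with the fields the AMM Event log associates with it."""
    severity: str   # "E", "W" or "I"
    event_id: str   # for example "0x806F010C"
    date: str
    time: str
    source: str     # for example "Blade_14"
    text: str


def errors_by_source(events):
    """Group Error-level events by the source (bay) that reported them."""
    grouped = defaultdict(list)
    for event in events:
        if event.severity == "E":
            grouped[event.source].append(event)
    return dict(grouped)


# Hypothetical sample records; dates, times and wording are illustrative only.
sample = [
    AmmEvent("E", "0x806F010C", "03/02/09", "10:15:00", "Blade_14",
             "uncorrectable ECC memory error"),
    AmmEvent("I", "0x806F000D", "03/02/09", "10:16:00", "Blade_14",
             "Hard drive 1 installed"),
    AmmEvent("W", "0x806F020D", "03/02/09", "10:17:00", "Blade_10",
             "Hard drive 1 predictive failure"),
]

for source, found in errors_by_source(sample).items():
    for event in found:
        print(source, SEVERITY_NAMES[event.severity], event.event_id, event.text)
```

Grouping by source keeps messages from the same bay together, which matters because a blade server that has been moved leaves its older messages under the previous bay number.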

All blades from the P6 (JS22) onward are supported by the same standard IPMI events, which have the same 8-digit hexadecimal Event IDs. An AMM Message Guide is available on the web at: http://publib.boulder.ibm.com/infocenter/systems/scope/bladecenter/index.jsp?topic=/com.ibm.bladecenter.advmgtmod.doc/adv_mgt_mod_product_page.html

HS22 blade server Event IDs

The introduction of the HS22 blade server brings some important changes to Event IDs.

For CPU, DIMM, and PCI errors, the HS22 software may issue the following group events.

In some situations an error applies to all of the CPUs, DIMMs or PCI buses. In that case an event is not issued for each individual component. Instead, a group sensor raises a single group event for all of the components.

In other situations the error cannot be pinned down to a specific component or group of components. Rather than calling out the wrong component, a group sensor issues a group event to let the customer know that something is wrong with one of a group of components.

Note: When the error cannot be pinned down to a specific component or group of components, no error LED is lit next to the components. Instead, the fault LED on the HS22 blade server is lit.

Examples of Group Events

0x806F010C E (HS22 Blade) Group group 1 (One of the DIMMs) uncorrectable ECC memory error

0x806F040C I (HS22 Blade) Group group 1 (One of the DIMMs) memory disabled

0x806F050C E (HS22 Blade) Group group 1 (One of the DIMMs) correctable ECC memory error logging limit reached

0x806F070C E (HS22 Blade) Group group 1 (One of the DIMMs) memory configuration error

0x806F040C I (HS22 Blade) Group group 1 (One of the DIMMs) memory enabled

0x806F010C E (HS22 Blade) Group group 1 (All DIMMs) uncorrectable ECC memory error

0x806F040C I (HS22 Blade) Group group 1 (All DIMMs) memory disabled

0x806F050C E (HS22 Blade) Group group 1 (All DIMMs) correctable ECC memory error logging limit reached

0x806F070C E (HS22 Blade) Group group 1 (All DIMMs) memory configuration error

0x806F040C I (HS22 Blade) Group group 1 (All DIMMs) memory enabled

0x806F0413 E (HS22 Blade) Group group 2 (One of PCI Err) PCI parity error

0x806F0513 E (HS22 Blade) Group group 2 (One of PCI Err) PCI system error

0x806F0413 E (HS22 Blade) Group group 2 (All PCI Err) PCI parity error

0x806F0513 E (HS22 Blade) Group group 2 (All PCI Err) PCI system error

0x806F0007 E (HS22 Blade) Group group 4, processor (One of CPUs) internal error

0x806F0107 E (HS22 Blade) Group group 4, processor (One of CPUs) thermal trip

0x806F0207 E (HS22 Blade) Group group 4, processor (One of CPUs) BIST failure

0x806F0507 E (HS22 Blade) Group group 4, processor (One of CPUs) configuration error

0x806F0607 E (HS22 Blade) Group group 4, processor (One of CPUs) SM BIOS uncorrectable error

0x806F0807 I (HS22 Blade) Group group 4, processor (One of CPUs) disabled

0x806F0A07 I (HS22 Blade) Group group 4, processor (One of CPUs) Event not logged or e-mailed/trapped

0x806F0807 I (HS22 Blade) Group group 4, processor (One of CPUs) enabled

0x806F0007 E (HS22 Blade) Group group 4, processor (all CPUs) internal error

0x806F0107 E (HS22 Blade) Group group 4, processor (all CPUs) thermal trip

0x806F0207 E (HS22 Blade) Group group 4, processor (all CPUs) BIST failure

0x806F0507 E (HS22 Blade) Group group 4, processor (all CPUs) configuration error

0x806F0607 E (HS22 Blade) Group group 4, processor (all CPUs) SM BIOS uncorrectable error

0x806F0807 I (HS22 Blade) Group group 4, processor (all CPUs) disabled

0x806F0A07 I (HS22 Blade) Group group 4, processor (all CPUs) Event not logged or e-mailed/trapped

0x806F0807 I (HS22 Blade) Group group 4, processor (all CPUs) enabled

For some PCI error events, slot numbers are given:

0x806F0413 E (HS22 Blade) Expansion Card 2 (PCI Slot 1) PCI parity error

0x806F0513 E (HS22 Blade) Expansion Card 2 (PCI Slot 1) PCI system error

0x806F0413 I (HS22 Blade) Recovery Expansion Card 2 (PCI Slot 1) PCI parity error

0x806F0513 I (HS22 Blade) Recovery Expansion Card 2 (PCI Slot 1) PCI system error

0x806F0413 E (HS22 Blade) Expansion Card 3 (PCI Slot 2) PCI parity error

0x806F0513 E (HS22 Blade) Expansion Card 3 (PCI Slot 2) PCI system error

0x806F0413 I (HS22 Blade) Recovery Expansion Card 3 (PCI Slot 2) PCI parity error

0x806F0513 I (HS22 Blade) Recovery Expansion Card 3 (PCI Slot 2) PCI system error
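The difference between "one of", "all" and slot-specific events can be read from the second parenthesised name in each line. The sketch below, in Python, is only an illustration based on the example lines listed above; the function name event_scope and its return labels are hypothetical, and it assumes the blade name itself contains no parentheses.

```python
import re


def event_scope(line):
    """Classify an event line by its second parenthesised name.

    Returns "one of a group", "all of a group", "specific component",
    or "unknown" when the line does not contain two parenthesised names.
    """
    names = re.findall(r"\(([^()]*)\)", line)
    if len(names) < 2:
        return "unknown"
    sensor_name = names[1].strip().lower()
    if sensor_name.startswith("one of"):
        return "one of a group"
    if sensor_name.startswith("all "):
        return "all of a group"
    return "specific component"


# Example lines copied from the lists in this section.
examples = [
    "0x806F010C E (HS22 Blade) Group group 1 (One of the DIMMs) uncorrectable ECC memory error",
    "0x806F0007 E (HS22 Blade) Group group 4, processor (all CPUs) internal error",
    "0x806F0413 E (HS22 Blade) Expansion Card 2 (PCI Slot 1) PCI parity error",
]

for line in examples:
    print(event_scope(line), "<-", line)
```

A "one of a group" result is the case in which no error LED is lit next to an individual component, so the fault LED on the blade server is the only visual indication.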

Pass Through Messages from BIOS

With the HS22 blade, the Unified Extensible Firmware Interface (UEFI) can no longer pass messages through to the AMM Event Log. For example, in the past you may have seen messages like the following. These were generated by BIOS, which simply used the Baseboard Management Controller (BMC) to pass them through to the AMM.

109 ERR BLADE_10 02/15/07 09:57:51 (MTGSMOSL148) POSTBIOS: 184 The power-on password has become invalid.

110 ERR BLADE_10 02/15/07 09:57:14 (MTGSMOSL148) POSTBIOS: 162 Configuration Error

bc4: BLADE_06 E 11/07/06 14:13:16 POSTBIOS: 289 Board 1 DIMM Pair 2 Double Bit Error.

With the HS22 blade, the IMM is informed of all errors detected by UEFI and they are handled just like errors detected by the IMM. The standard IPMI events result in a standard blade event log entry in the AMM Event Log.
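When reviewing a saved log that mixes older blades with an HS22, it can help to pick out the legacy pass-through messages. The following Python sketch is an illustration only: it relies on the "POSTBIOS: <number> <text>" pattern visible in the three sample messages above, and the function name parse_postbios is hypothetical.

```python
import re

# Pattern inferred from the sample messages above: "POSTBIOS:", a numeric
# POST code, then the message text.
_POSTBIOS = re.compile(r"POSTBIOS:\s*(\d+)\s+(.+)")


def parse_postbios(line):
    """Return (code, message) for a legacy BIOS pass-through line, else None."""
    match = _POSTBIOS.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


# Two legacy sample lines and one HS22-style event line from this section.
lines = [
    "109 ERR BLADE_10 02/15/07 09:57:51 (MTGSMOSL148) POSTBIOS: 184 The power-on password has become invalid.",
    "bc4: BLADE_06 E 11/07/06 14:13:16 POSTBIOS: 289 Board 1 DIMM Pair 2 Double Bit Error.",
    "0x806F010C E (HS22 Blade) Group group 1 (One of the DIMMs) uncorrectable ECC memory error",
]

for line in lines:
    print(parse_postbios(line))
```

Lines from an HS22 return None because its errors arrive as standard IPMI events rather than as BIOS pass-through text.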

New Hard Drive Event IDs

Event On - The event has occurred

0x806F000D I (HS22 Blade) Hard drive 1 (Drive 0 Status) installed

0x806F010D E (HS22 Blade) Hard drive 1 (Drive 0 Status) fault

0x806F020D W (HS22 Blade) Hard drive 1 (Drive 0 Status) predictive failure

0x806F050D E (HS22 Blade) Hard drive 1 (Drive 0 Status) in critical array

0x806F060D E (HS22 Blade) Hard drive 1 (Drive 0 Status) in failed array

0x806F070D I (HS22 Blade) Hard drive 1 (Drive 0 Status) rebuild in progress

0x806F000D I (HS22 Blade) Hard drive 2 (Drive 1 Status) installed

0x806F010D E (HS22 Blade) Hard drive 2 (Drive 1 Status) fault

0x806F020D W (HS22 Blade) Hard drive 2 (Drive 1 Status) predictive failure

0x806F050D E (HS22 Blade) Hard drive 2 (Drive 1 Status) in critical array

0x806F060D E (HS22 Blade) Hard drive 2 (Drive 1 Status) in failed array

0x806F070D I (HS22 Blade) Hard drive 2 (Drive 1 Status) rebuild in progress

Event Off - The event has been recovered from

0x806F000D I (HS22 Blade) Hard drive 1 (Drive 0 Status) removed

0x806F010D I (HS22 Blade) Recovery Hard drive 1 (Drive 0 Status) fault

0x806F020D I (HS22 Blade) Recovery Hard drive 1 (Drive 0 Status) predictive failure

0x806F050D I (HS22 Blade) Recovery Hard drive 1 (Drive 0 Status) in critical array

0x806F060D I (HS22 Blade) Recovery Hard drive 1 (Drive 0 Status) in failed array

0x806F070D I (HS22 Blade) Hard drive 1 (Drive 0 Status) rebuild complete

0x806F000D I (HS22 Blade) Hard drive 2 (Drive 1 Status) removed

0x806F010D I (HS22 Blade) Recovery Hard drive 2 (Drive 1 Status) fault

0x806F020D I (HS22 Blade) Recovery Hard drive 2 (Drive 1 Status) predictive failure

0x806F050D I (HS22 Blade) Recovery Hard drive 2 (Drive 1 Status) in critical array

0x806F060D I (HS22 Blade) Recovery Hard drive 2 (Drive 1 Status) in failed array

0x806F070D I (HS22 Blade) Hard drive 2 (Drive 1 Status) rebuild complete

New Memory DIMM Events

Event On - The event has occurred

0x80070100 W (HS22 Blade) Memory device 1, temperature (DIMM 1 Temp) warning

0x80070200 E (HS22 Blade) Memory device 1, temperature (DIMM 1 Temp) critical

0x80070100 W (HS22 Blade) Memory device 2, temperature (DIMM 2 Temp) warning

0x80070200 E (HS22 Blade) Memory device 2, temperature (DIMM 2 Temp) critical

0x80070100 W (HS22 Blade) Memory device 3, temperature (DIMM 3 Temp) warning

0x80070200 E (HS22 Blade) Memory device 3, temperature (DIMM 3 Temp) critical

0x80070100 W (HS22 Blade) Memory device 4, temperature (DIMM 4 Temp) warning

0x80070200 E (HS22 Blade) Memory device 4, temperature (DIMM 4 Temp) critical

0x80070100 W (HS22 Blade) Memory device 5, temperature (DIMM 5 Temp) warning

0x80070200 E (HS22 Blade) Memory device 5, temperature (DIMM 5 Temp) critical

0x80070100 W (HS22 Blade) Memory device 6, temperature (DIMM 6 Temp) warning

0x80070200 E (HS22 Blade) Memory device 6, temperature (DIMM 6 Temp) critical

0x80070100 W (HS22 Blade) Memory device 7, temperature (DIMM 7 Temp) warning

0x80070200 E (HS22 Blade) Memory device 7, temperature (DIMM 7 Temp) critical

0x80070100 W (HS22 Blade) Memory device 8, temperature (DIMM 8 Temp) warning

0x80070200 E (HS22 Blade) Memory device 8, temperature (DIMM 8 Temp) critical

0x80070100 W (HS22 Blade) Memory device 9, temperature (DIMM 9 Temp) warning

0x80070200 E (HS22 Blade) Memory device 9, temperature (DIMM 9 Temp) critical

0x80070100 W (HS22 Blade) Memory device 10, temperature (DIMM 10 Temp) warning

0x80070200 E (HS22 Blade) Memory device 10, temperature (DIMM 10 Temp) critical

0x80070100 W (HS22 Blade) Memory device 11, temperature (DIMM 11 Temp) warning

0x80070200 E (HS22 Blade) Memory device 11, temperature (DIMM 11 Temp) critical

0x80070100 W (HS22 Blade) Memory device 12, temperature (DIMM 12 Temp) warning

0x80070200 E (HS22 Blade) Memory device 12, temperature (DIMM 12 Temp) critical

Event Off - The event has been recovered from

0x80070100 I (HS22 Blade) Recovery Memory device 1, temperature (DIMM 1 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 1, temperature (DIMM 1 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 2, temperature (DIMM 2 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 2, temperature (DIMM 2 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 3, temperature (DIMM 3 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 3, temperature (DIMM 3 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 4, temperature (DIMM 4 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 4, temperature (DIMM 4 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 5, temperature (DIMM 5 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 5, temperature (DIMM 5 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 6, temperature (DIMM 6 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 6, temperature (DIMM 6 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 7, temperature (DIMM 7 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 7, temperature (DIMM 7 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 8, temperature (DIMM 8 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 8, temperature (DIMM 8 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 9, temperature (DIMM 9 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 9, temperature (DIMM 9 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 10, temperature (DIMM 10 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 10, temperature (DIMM 10 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 11, temperature (DIMM 11 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 11, temperature (DIMM 11 Temp) critical

0x80070100 I (HS22 Blade) Recovery Memory device 12, temperature (DIMM 12 Temp) warning

0x80070200 I (HS22 Blade) Recovery Memory device 12, temperature (DIMM 12 Temp) critical

HS22 blade server Event ID list text file

These Event IDs are the possible events that the HS22 blade server can send. Each event contains the 8-digit hexadecimal Event ID that identifies it; the ID is sent in Simple Network Management Protocol (SNMP) alerts and to IBM Director. The list includes the Event On and, where applicable, the Event Off form of every event. For threshold events, the reading and threshold values are only sample values, since these are not "live" events; they show what the event will look like. Event Off events are always informational and always have the same Event ID as the corresponding Event On events.

For Error and Warning level events, the Event Off has the same description string as the Event On, preceded by the word Recovery. For Informational events, the Event Off has a custom string that is the opposite or negative of the Event On string. The first name in parentheses, (HS22 Blade) in this example, is the blade name, which the customer can assign through the AMM. The second name in parentheses is the 16-byte Sensor Data Record (SDR) string, which is part of the Sensor Data Records owned by the IMM. The list includes all the duplicates for the multiple instances of the hardware (2 CPUs, 12 RDIMMs, 2 hard drives). The sensor numbers listed are not significant: they are unique values that the IMM assigns arbitrarily to each SDR for identification. They appear only in the output of this debug command, where they help organize the events from the IMM, and are not presented to the customer.
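The Event On and Event Off convention makes it possible to work out which Error and Warning conditions are still outstanding in a sequence of events. The Python sketch below is one way to do that, assuming lines in the same layout as the lists in this section (Event ID, severity letter, blade name in parentheses, description). The function name outstanding_conditions is hypothetical, and Informational events are skipped because their Event Off uses a custom string rather than the Recovery prefix.

```python
import re

# Layout assumed from the event lists in this section:
# <Event ID> <severity> (<blade name>) <description>
_EVENT = re.compile(r"^(0x[0-9A-Fa-f]{8})\s+([EWI])\s+\(([^)]*)\)\s+(.*)$")
_RECOVERY = "Recovery "


def outstanding_conditions(lines):
    """Return {(event_id, description): severity} for unrecovered E/W events."""
    active = {}
    for line in lines:
        match = _EVENT.match(line.strip())
        if match is None:
            continue  # not an event line in the assumed layout
        event_id, severity, _blade_name, text = match.groups()
        if text.startswith(_RECOVERY):
            # Event Off: same Event ID, description preceded by "Recovery".
            active.pop((event_id, text[len(_RECOVERY):]), None)
        elif severity in ("E", "W"):
            active[(event_id, text)] = severity
    return active


# Lines copied from the lists in this section, in a made-up order.
history = [
    "0x80070100 W (HS22 Blade) Memory device 1, temperature (DIMM 1 Temp) warning",
    "0x80070200 E (HS22 Blade) Memory device 1, temperature (DIMM 1 Temp) critical",
    "0x806F000D I (HS22 Blade) Hard drive 1 (Drive 0 Status) installed",
    "0x80070100 I (HS22 Blade) Recovery Memory device 1, temperature (DIMM 1 Temp) warning",
]

for (event_id, text), severity in outstanding_conditions(history).items():
    print(severity, event_id, text)
```

Matching on both the Event ID and the description is needed because the same Event ID is reused across hardware instances, for example 0x80070100 for every DIMM temperature warning.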

Click on the file below to access the AMMmessages.txt file.

AMMmessages.txt

Problem Determination - Collecting Logs

The logs provided by Preboot Dynamic System Analysis (pDSA), Online Dynamic System Analysis (DSA), and the Advanced Management Module (AMM) Event Log provide valuable problem determination information that can help solve the customer's problem. These logs should be gathered. pDSA and Online DSA give you the flexibility to get the logs whether or not the system can be rebooted.

The pDSA and Online DSA logs can tell you whether the customer is downlevel on firmware and whether device drivers are at the current level. No component should be replaced while it is downlevel on firmware or is not using the latest level of device drivers. When a component is replaced, it is sent back to IBM so that the cause of its failure can be determined. If the component is found to be downlevel on firmware and drivers, it is updated to the latest levels. If this fixes the problem, the component is labeled No Defect Found (NDF). In that case the customer's problem was fixed, but it was fixed the wrong way, and the problem can occur again. The correct fix is to update the component to the latest firmware and device drivers. Ask customers whether they are at the latest levels, but rely on the logs, which tell you the exact levels of most components in the system.
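The rule above, update before you replace, can be expressed as a simple check. This Python sketch is illustrative only: the function names and the level strings are hypothetical, and it assumes the installed levels have been read from the DSA logs by hand and the latest levels looked up separately. It compares levels for equality only and does not try to order them.

```python
def downlevel_components(installed, latest):
    """Return the components whose installed level differs from the latest level.

    Components with no known latest level are left out, since nothing can be
    concluded about them.
    """
    return sorted(
        name for name, level in installed.items()
        if latest.get(name) not in (None, level)
    )


def ready_for_replacement(installed, latest):
    """A part should be considered for replacement only when nothing is downlevel."""
    return not downlevel_components(installed, latest)


# Hypothetical levels for illustration; real values come from the logs.
installed_levels = {"UEFI": "1.00", "IMM": "1.02", "pDSA": "1.00"}
latest_levels = {"UEFI": "1.01", "IMM": "1.02", "pDSA": "1.00"}

print(downlevel_components(installed_levels, latest_levels))
print(ready_for_replacement(installed_levels, latest_levels))
```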

The Integrated Management Module (IMM) logs provided by Online DSA and pDSA contain the system status messages that have occurred during the Power-On Self Test (POST) of the HS22 blade server. These system status messages include any errors that have occurred.

DSA

DSA uses the following system data collection providers:

Network Settings

− Hostname, physical network port info, global settings

Hardware Inventory

− Processor, memory, disk info, monitor info, system card info, devices – SCSI, Universal Serial Bus (USB), optical, other

PCI Information

− Devices, bridges, slots

Firmware/Vital Product Data (VPD)

− Network, Service Processor (SP), Basic Input Output System (BIOS), other VPD

SP Configurations

− Settings – general, TCP/IP, Simple Network Management Protocol (SNMP), dial-out, dial-in

LSI Controller

− Controller info, physical & logical drive info

Lightpath – LED settings

Built In Self Test (BIST) results

Intelligent Platform Management Interface (IPMI) Event logs – IMM Chassis Event Log (CEL) and IPMI

Merged devices

Memory diagnostics log

DSA Error log

To collect the Online DSA and pDSA logs on the system perform the following steps:

____1. To collect the pDSA logs press <F2> Diagnostics when prompted during the POST of the HS22 blade server.

____2. A quick test of the memory is performed and once completed pDSA will load.

____3. The Command Line Interface (CLI) of pDSA appears as shown in Figure 53. From here, enter the command gui to access the Graphical User Interface (GUI) of pDSA.

Figure 53

____4. The pDSA GUI appears as shown in Figure 54. Select Send System Information to IBM from the Hardware Diagnostic Utility screen.

Figure 54

____5. At the Collect and Send System Information screen shown in Figure 55 follow the four steps provided on the screen.

Figure 55

____6. Figure 56 shows the logs being collected. In this example, HTML was specified as the output format.

Figure 56

____7. Click the View Result button as shown in Figure 57 to see the results before sending to IBM.

Figure 57

____8. Figure 58 shows the results from the log collection on an HS22 blade server.

Figure 58

____9. Click the OK button on the System Overview screen to return to the Collect and Send System Information screen.

____10. Once the log has been collected and retrieved, click the Home button to return to the pDSA Hardware Diagnostic Utility screen.

____11. Click Exit on the Hardware Diagnostic Utility screen to boot the HS22 into the installed operating system.

AMM

The AMM is accessed remotely using its Internet Protocol (IP) address and a web browser. Once the AMM is accessed you can view the AMM Event Log. The AMM Event Log contains the system status messages for all the blade servers in the BladeCenter chassis. This includes any errors that have occurred.

To retrieve the AMM Event Log to a text file perform the following steps:

____1. Use the IP address of the AMM with a web browser to access its web interface and choose Event Log from the System Status menu shown in Figure 59. Then click the Save log to text file button.

Figure 59

____2. Have the customer send the AMM Event Log text file to IBM Support.

There are other tools available for the customer to get service data. These tools do not directly help the IBM service provider but do help indirectly. They are accessed from the AMM Service Tools menu shown in Figure 60.

Figure 60

One of those tools is Service Advisor. This tool resides on the AMM and monitors the BladeCenter chassis for hardware events. If a hardware event is detected, it captures the event, error logs and service data and can automatically send them to either IBM or an approved service provider, depending on the customer's service agreement. A service ticket is opened and a follow-up call is issued for each hardware event that is detected. For both IBM Support and an approved service provider, an FTP site must be specified to send the file to. To use Service Advisor, the customer must enable and configure it.

Problem Determination - Firmware update

Once you have the logs, you can see the component firmware levels. Updating firmware is very important. A component should not be replaced until its firmware has been updated to the latest level to see whether that fixes the problem. If you are unsure how to update the firmware of a component, check the Installation and User's Guide for that component and check with the next level of support.

This part of the course covers updating the firmware specifically for the Unified Extensible Firmware Interface (UEFI), Integrated Management Module (IMM), and Preboot Dynamic System Analysis (pDSA). It is very important for the HS22 to be at current levels of firmware because firmware fixes problems and adds functionality to the system. An HS22 system board should never be replaced until the firmware has been updated on the system to see if that fixes the problem.

ISO/IMG versions of individual firmware updates will no longer be posted to the web for systems based on Intel’s Nehalem-EP platform. The change is necessary because IMG/ISO files are bootable DOS images. Increased system complexity has necessitated the replacement of both Basic Input Output System (BIOS) and DOS support starting with systems based on Intel’s Nehalem-EP platform. DOS is unable to access modern hardware interfaces such as Universal Serial Bus (USB). The Nehalem-EP platform systems such as the HS22 blade server will ship without floppy disk or CD/DVD ROM drives. Keep in mind that the firmware for the IMM, UEFI and pDSA must be updated separately. There is no firmware package that will update them all at once.

Update strategy for Nehalem-EP platform systems

You can do Bare Metal updates using the IMM and the Advanced Management Module (AMM). The term Bare Metal update means updating the system's firmware with no operating system installed. Individual firmware updates for UEFI and the IMM can be applied using the AMM web interface. To do this you browse to the *.exe or *.bin files using the AMM web interface firmware update page. The same packages can also be executed online if an operating system has been installed. pDSA can only be updated by executing the update package file in the operating system. This is due to the size of pDSA. Figure 61 shows Bare Metal updates using the IMM.

Figure 61
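The update routes described in this paragraph can be summarized as a small lookup. The Python sketch below encodes only the two routes named in the paragraph (the AMM web interface firmware update page and running the package in an installed operating system); the constant and function names are hypothetical.

```python
AMM_WEB = "AMM web interface firmware update page"
IN_OS = "update package executed in the operating system"

# Routes taken from the paragraph above: UEFI and IMM can use either route,
# pDSA can only be updated from within the operating system.
UPDATE_METHODS = {
    "UEFI": (AMM_WEB, IN_OS),
    "IMM": (AMM_WEB, IN_OS),
    "pDSA": (IN_OS,),
}


def available_methods(component, os_installed):
    """Return the update routes usable for a component on this blade server."""
    return tuple(
        method for method in UPDATE_METHODS[component]
        if os_installed or method != IN_OS
    )


for name in UPDATE_METHODS:
    print(name, "bare metal:", available_methods(name, os_installed=False))
    print(name, "with OS:   ", available_methods(name, os_installed=True))
```

An empty result for pDSA with no operating system reflects the statement that pDSA can only be updated by executing the update package in the operating system.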

The following readme files for UEFI, IMM and pDSA firmware updates are current at the time of writing but should be used only as examples. These files could change before the product ships, so use the files released with the shipped product when working with a customer.

Linux

ibm_fw_uefi_p9e121u_linux_32-64.txt

Note: The UEFI firmware update cannot be performed with DC power off, or while POST is running.

ibm_fw_imm_yuoo18d_linux_32-64.txt

ibm_fw_dsa_p9yt36a_linux_i386.txt

Note: The pDSA firmware cannot be updated while pDSA is running (that is, after F2 has been pressed during boot-up).

Windows

ibm_fw_uefi_p9e121t_windows_32-64.txt

Note: The UEFI firmware update cannot be performed with DC power off, or while POST is running.

ibm_fw_imm_yuoo18d_windows_32-64.txt

oem_fw_dsa_p9yt36a_windows_i386.txt

Note: The pDSA firmware cannot be updated while pDSA is running (that is, after F2 has been pressed during boot-up).
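The readme file names listed above follow a visible pattern. The Python sketch below splits a name into its parts; the field names (prefix, component, build, os, arch) are this sketch's own labels, inferred from the six names listed, and are not taken from an IBM naming specification.

```python
import re

# Pattern inferred from the six readme names listed above.
_README = re.compile(
    r"^(?P<prefix>[a-z]+)_fw_(?P<component>[a-z]+)_(?P<build>[a-z0-9]+)"
    r"_(?P<os>linux|windows)_(?P<arch>[a-z0-9-]+)\.txt$"
)


def parse_readme_name(name):
    """Split a firmware readme file name into its parts, or return None."""
    match = _README.match(name)
    return match.groupdict() if match else None


readme_names = [
    "ibm_fw_uefi_p9e121u_linux_32-64.txt",
    "ibm_fw_imm_yuoo18d_linux_32-64.txt",
    "ibm_fw_dsa_p9yt36a_linux_i386.txt",
    "ibm_fw_uefi_p9e121t_windows_32-64.txt",
    "ibm_fw_imm_yuoo18d_windows_32-64.txt",
    "oem_fw_dsa_p9yt36a_windows_i386.txt",
]

for name in readme_names:
    print(parse_readme_name(name))
```

Splitting the names this way makes it easy to confirm that a package matches both the component being updated and the operating system it will be run under.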

Bootable Media Creator (BoMC)

Bare Metal updates can also be done using bootable media. The new Bootable Media Creator tool allows the customer to create bootable media. You can use the tool to burn a CD or DVD, create a bootable ISO image, or create a bootable USB key image. Figure 62 shows the BoMC.

Figure 62

Online updates and UpdateXpress System Packs (UXSP)

UpdateXpress will continue to support remote, scriptable updates under Windows, Linux, and VMware ESX Service Console. The BoMC can create a bootable UXSP for firmware only.

Note: The Bare Metal firmware update strategy builds on UXSP Installer 2.0 Bootable Media Support, providing the capability to build bootable media dynamically based on online Linux updates. The UpdateXpress System Packs deliver the tested baseline of firmware and device drivers. Individual updates are hot fixes to a UXSP.

Note: DOS ISO/IMGs will continue to be posted for legacy systems.

Problem Determination – Diagnostics

When the customer is having problems they may have to run diagnostics.

Preboot Dynamic System Analysis (pDSA) has the following diagnostic tests available for the HS22:

Memory Test

− Runs in standalone mode

Baseboard Management Controller (BMC) I2C Test

Ethernet Test

− Control Registers

− EEPROM

− Internal Memory

− Interrupt

− LEDs

− MAC Loopback

− PHY Loopback

− MII Registers

Stress Tests

− CPU Stress Test

− Memory Stress Test

Hard Disk Drive (HDD) Test

To access the pDSA diagnostics perform the following steps:

____1. Reboot the HS22 blade server and press <F2> Diagnostics when prompted during the Power-On Self Test (POST) of the system.

____2. A quick memory test runs. Once it is complete, exit to start pDSA automatically. pDSA takes a minute or so to start, so be patient.

____3. At the Hardware Diagnostic Utility screen shown in Figure 63 select Diagnostics.

Figure 63

____4. On the Add Tests screen select one of the add test options offered as shown in Figure 64 and click Next.

Figure 64

Add the tests you want to run using the Diagnostic Test Selection Wizard shown in Figure 65. In Figure 65, memory has been chosen for a stress test. You could also test the microprocessors, HDD, eth0, eth1 and optical devices.

Figure 65

Once the diagnostics have been run, click Finish.

You can view the results in the Diagnostic Log shown in Figure 66 by clicking the Diagnostic Log button.

Figure 66

Note: Online DSA can only test Optical devices and Tape devices.

Problem Determination - Examples

Whenever you encounter a memory problem on a system with the new Intel processors, which have the memory controller embedded in the CPU, you are required to follow the specific problem determination path listed below. Perform these steps in order until the issue is resolved:

1. Ensure that there is no known BIOS or code issue.

a. Check for RETAIN tips.

b. Check with the next level of support. You do not work alone. There are people in place to help you solve the problem.

2. Ensure that all memory modules are installed in the correct slots.

3. Check that all memory is correctly seated.

4. Replace the memory.

5. Replace the CPU.

6. Replace the system board (if you reach the conclusion that the memory socket may be damaged).
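The ordered path above can be expressed as a simple list that is walked one step at a time. This is a minimal sketch for tracking progress through the steps; the list name and function name are illustrative and are not part of any IBM tool.

```python
from typing import Optional

# The memory problem determination path from the text, in the required order.
MEMORY_PD_STEPS = [
    "Check for known BIOS or code issues (RETAIN tips, next level of support)",
    "Verify that all memory modules are installed in the correct slots",
    "Check that all memory is correctly seated",
    "Replace memory",
    "Replace the CPU",
    "Replace the system board (if the memory socket may be damaged)",
]


def next_step(steps_completed: int) -> Optional[str]:
    """Return the next step to perform, or None when the path is exhausted."""
    if steps_completed < 0:
        raise ValueError("steps_completed cannot be negative")
    if steps_completed >= len(MEMORY_PD_STEPS):
        return None
    return MEMORY_PD_STEPS[steps_completed]
```

Keeping the steps in a list makes the ordering explicit: a part replacement is never suggested before the code-level and seating checks have been completed.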

The following four problems are presented as examples of problem determination on the HS22 blade server. They are not the only problems the customer may encounter; this course cannot list or anticipate every problem that could occur with the HS22 blade server.

Problem #1: The customer calls in and states that the Windows operating system is not seeing all the microprocessors in their system.

Action 1: If the customer can reboot the system, have them go into <F2> Diagnostics. If the customer cannot reboot the system because it is still in production, proceed to Actions 7 and 8.

Action 2: Have the customer check the IMM log in Preboot Dynamic System Analysis (pDSA). The customer sees a system status message Microprocessor halted in the IMM log.

Note: The Microprocessor halted message is not a catastrophic event, because the microprocessors in the HS22 blade server have more than one processor core. In an older system with only one processor core this would have been catastrophic. The HS22 can have up to two Dual Core (DC) or Quad Core (QC) processors, so it continues to operate in a degraded state. This can be verified in Microsoft Windows by checking how many microprocessors the operating system sees.

Action 3: Remove the HS22 blade server from the BladeCenter chassis and then reinstall it in the chassis.

Action 4: If Action 3 does not work then the microprocessor should be reseated by a trained technician.

Action 5: If Action 4 does not work then replace the microprocessor first.

Action 6: If Action 5 does not resolve the problem then replace the system board assembly.

Note: Before replacing the system board in an HS22 you must make sure you check for RETAIN tips and check with the next level of support for workarounds and unresolved issues.

Action 7: If the customer cannot reboot their system, they can view the logs using Online DSA or the Advanced Management Module (AMM). If neither of these is a possibility, then a maintenance window needs to be scheduled with the customer to reboot the system into pDSA. The customer sees the system status message Microprocessor halted in the IMM log.

Note: Many customers do not support connecting to the AMM through its web interface. In this case the customer should be encouraged to download Online DSA.

Action 8: Repeat Actions 2 through 6 until the problem is resolved.
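The check in Problem #1, comparing how many processors the operating system sees against how many it should see, can be sketched with the Python standard library. The expected count is an input that the technician supplies from the known hardware configuration; the function name is illustrative. Note that `os.cpu_count()` reports logical processors, so the expected value must be stated in the same terms.

```python
import os
from typing import Optional


def missing_processors(expected: int, seen: Optional[int] = None) -> int:
    """Return how many expected logical processors the OS does not see.

    `expected` comes from the known configuration (an input, not detected).
    If `seen` is omitted, the count reported by the operating system is used.
    """
    if seen is None:
        # os.cpu_count() can return None when the count cannot be determined.
        seen = os.cpu_count() or 0
    return max(expected - seen, 0)
```

A result greater than zero on a running blade is consistent with the degraded state described in the note above and is a reason to check the IMM log for the Microprocessor halted message.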

Problem #2: The customer calls in with an Over Temperature condition on their HS22 blade server.

Action 1: If the customer can reboot the system have them go into <F2> Diagnostics. If the customer cannot reboot the system, because it is still in production, proceed to Actions 12 and 13.

Action 2: Have the customer check the IMM log in <F2> Diagnostics. The customer sees the following system status messages in the IMM log:

0x80070100 W (HS22 Blade) Memory device 1, temperature (DIMM 1 Temp) warning

0x80070200 E (HS22 Blade) Memory device 1, temperature (DIMM 1 Temp) critical

Action 3: Have the customer check to see if any fillers or air baffles have been removed from the system.

Note: The air flow of the HS22 is designed to cool the system with DIMM fillers, microprocessor fillers and other air baffles in place. If any of these are removed, the air flow in the server changes. Air always follows the path of least resistance, and that may not be the best path to cool the server. If the customer has removed any air baffles, a maintenance window has to be scheduled so that the missing air baffles can be reinstalled. If the air baffles have been thrown away, new ones will have to be obtained.

Action 4: Replace the missing fillers or air baffles. If this fixes the problem the following system status messages are seen in the IMM Event Log.

0x80070100 I Recovery Memory device 1, temperature (DIMM 1 Temp) warning

0x80070200 I Recovery Memory device 1, temperature (DIMM 1 Temp) critical

Action 5: If no air baffles are missing then have the customer check the front bezel on the HS22 blade server to make sure it is not blocked.

Action 6: If the front bezel of the HS22 blade server is not blocked, then check with the next level of support for any known BIOS or microcode issues and check for any relevant RETAIN tips.

Action 7: Check to make sure the memory is installed in the correct population order.

Action 8: Check to make sure that the memory in DIMM slot 1 is correctly seated.

Action 9: If Action 8 does not solve the problem replace the memory DIMM or DIMMs in question.

Action 10: If Action 9 does not work then check to make sure the DIMM socket is not damaged.

Action 11: If the memory socket is damaged replace the system board assembly of the HS22 blade server.

Action 12: If the customer cannot reboot the HS22 server, they can review the logs using Online DSA or the AMM. If neither of these is a possibility, then a maintenance window needs to be scheduled with the customer to reboot the system into pDSA.

Note: Many customers do not support connecting to the AMM through its web interface. In this case the customer should be encouraged to download Online DSA.

Action 13: Repeat Actions 2 through 11.
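The IMM log entries quoted in Problem #2 can be checked programmatically to see which temperature events are still open. The sketch below assumes the line layout of the samples in this course (an event ID, a one-letter severity, then the message text) and assumes that a recovery entry carries the same event ID as the entry it clears, as the samples suggest. Reading W, E and I as warning, error and informational is also an assumption.

```python
import re
from typing import List

# Layout inferred from the sample log lines in this course (an assumption):
# <event id> <severity letter> <message text>
LINE_RE = re.compile(
    r"^(?P<event_id>0x[0-9A-Fa-f]{8})\s+(?P<severity>[WEI])\s+(?P<text>.+)$"
)


def open_events(lines: List[str]) -> List[str]:
    """Return event IDs that were raised (W or E) and not yet recovered."""
    active: List[str] = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        event_id = match["event_id"]
        if match["severity"] in ("W", "E"):
            if event_id not in active:
                active.append(event_id)
        elif match["text"].startswith("Recovery") and event_id in active:
            active.remove(event_id)
    return active
```

Fed the two warning and critical lines from Action 2, the function reports both event IDs as open. After the two Recovery lines from Action 4 are appended, it reports none, which corresponds to the over-temperature condition having cleared.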

Problem #3: The customer calls in stating that they configured for memory mirroring mode but their configuration is showing Independent mode.

Action 1: If the customer can reboot the system, have them reboot into <F2> Diagnostics and check the IMM log. The customer sees the system status message Invalid DIMM population for memory mode. If the customer cannot reboot the system because it is still in production, proceed to Actions 6 and 7.

Action 2: Check to see if the customer installed the ServeRAID-MR10ie Combination Input Output vertical (CIOv) storage interface card and the battery card.

Action 3: Check to see if the battery card for the MR10ie CIOv card is installed in DIMM socket 7.

Action 4: Check to see which Dual Inline Memory Module (DIMM) sockets are populated with DIMMs. The customer states that a 4 GB DIMM is installed in DIMM sockets 8 and 11. DIMM socket 12 is empty.

Action 5: Have the customer remove the 4GB DIMM in DIMM socket 11.

Note: The DIMM is removed from DIMM socket 11 because DIMM socket 7 is mirrored with DIMM socket 11. The battery card in DIMM socket 7 is not seen as a DIMM, so a DIMM installed in DIMM socket 11 has no mirror partner. This makes the customer's memory mirroring configuration invalid. The system still boots, because when the HS22 blade server discovers that the memory mirroring configuration is invalid it automatically changes to independent memory mode.

Note: The DIMM should be removed following the Removing a memory module section of the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide.

Note: The memory DIMMs being mirrored must be the same type and speed as each other in the same bank.

Action 6: If the customer cannot reboot the HS22 server, they can review the logs using Online DSA or the AMM. If neither of these is a possibility, then a maintenance window needs to be scheduled with the customer to reboot the system into pDSA.

Note: Many customers do not support connecting to the AMM through its web interface. In this case the customer should be encouraged to download Online DSA.

Action 7: Repeat Actions 2 through 5.

Problem #4: The customer calls in stating that they configured for Memory Mirroring mode but their configuration is showing Independent mode.

Action 1: If the customer can reboot the system, have them reboot into <F2> Diagnostics and check the IMM log. The customer sees the system status message Invalid DIMM population for memory mode. If the customer cannot reboot the system because it is still in production, proceed to Actions 7 and 8.

Action 2: Check to see if the customer installed the ServeRAID-MR10ie Combination Input Output vertical (CIOv) storage interface card and the battery card.

Action 3: Check to see if the battery card for the MR10ie CIOv card is installed in DIMM socket 7.

Action 4: Check to see which Dual Inline Memory Module (DIMM) sockets are populated with DIMMs. A 4 GB DIMM is installed in DIMM sockets 8 and 11. DIMM socket 12 is empty.

Action 5: Remove the DIMM installed in DIMM socket 11.

Note: Remove the DIMM following the Removing a memory module section of the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide.

Action 6: Reinstall the DIMM into DIMM socket 12.

Note: Install the DIMM following the instructions in the Installing a memory module section of the IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide.

Note: The memory DIMMs being mirrored must be the same type and speed as each other in the same bank.

Action 7: If the customer cannot reboot the HS22 server, they can review the logs using Online DSA or the AMM. If neither of these is a possibility, then a maintenance window needs to be scheduled with the customer to reboot the system into pDSA. The customer sees the system status message Invalid DIMM population for memory mode.

Note: Many customers do not support connecting to the AMM through its web interface. In this case the customer should be encouraged to download Online DSA.

Action 8: Repeat Actions 2 through 6.
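The population rule behind Problems #3 and #4 can be sketched as a check on mirror partners. The only pairing this course states is that DIMM socket 7 is mirrored with DIMM socket 11, so the table below contains just that pair; it is deliberately incomplete, and the full pairing table should be taken from the Problem Determination and Service Guide. The names used are illustrative.

```python
from typing import List, Set

# Mirror partner sockets. Only the 7 <-> 11 pairing is stated in this course;
# all other pairings are omitted here and must come from the PDSG.
MIRROR_PAIRS = {7: 11, 11: 7}


def invalid_mirror_sockets(dimm_sockets: Set[int]) -> List[int]:
    """Return populated sockets whose known mirror partner holds no DIMM.

    `dimm_sockets` lists sockets that hold real DIMMs. A socket holding the
    MR10ie battery card is not counted, because it is not seen as a DIMM.
    """
    return sorted(
        socket
        for socket in dimm_sockets
        if socket in MIRROR_PAIRS and MIRROR_PAIRS[socket] not in dimm_sockets
    )
```

With the battery card in socket 7 and DIMMs in sockets 8 and 11, the function flags socket 11. After the DIMM is removed (Problem #3) or moved to socket 12 (Problem #4), nothing is flagged against the one pairing known here.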

Tools available for problem determination

The following tools are available for problem determination with the HS22 blade server.

Advanced Management Module (AMM)

The AMM provides valuable problem determination information. The web interface of the AMM can be accessed by entering its Internet Protocol (IP) address into a web browser. Figure 67 shows the AMM web interface with the HS22 blade server in the System Status page under the Monitors section.

Figure 67

You can use the AMM web interface to view the AMM Event Log discussed earlier in this course. You can view the LEDs on the HS22 blade server by selecting LEDs under the Monitors section shown in Figure 68.

Figure 68

In the Hardware Vital Product Data (VPD) page of the Monitors section you can see the hardware VPD of the HS22 blade server, as shown in Figure 69.

Figure 69

Selecting Firmware VPD in the Monitors section of the AMM web interface displays the firmware VPD of the HS22 blade server, as shown in Figure 70.

Figure 70

The web interface can also be used to power off or restart the blade server. It can be used to remote control the HS22, update the blade server's firmware, configure the HS22, configure Serial Over LAN (SOL) and configure Open Fabric Manager.

Preboot Dynamic System Analysis (pDSA) and Online DSA

DSA collects system information and transmits it to the IBM service and support organization. This information includes configuration, status, logs, and diagnostics. DSA allows the customer to view the collected data before sending it to IBM, and it produces a CIM-XML file to send to IBM. Customers can also use the collected data as input to their internal tools. Typical usage of DSA is shown in Figure 71.

Figure 71

Note: Web links are subject to change. If a web link in this course is not working please contact the owner of the course and report the broken link.

DSA can be downloaded from the Software download matrix - IBM Dynamic System Analysis (DSA) webpage:

https://www.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=SERV-DSA

Windows Event Viewer

The Windows Event Viewer can be checked for errors with the system. The Event Viewer is divided into the Application, Security and System logs. These logs contain information that can help solve the customer's problem. They contain error messages issued by the operating system.

Linux Messages File

The Linux Messages File is for Red Hat/SuSE Linux systems. The Linux messages file is the main source of information for device driver problems and events. The path for the Linux messages file is:

/var/log/messages
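A quick way to review the messages file is to filter it for lines that mention a problem. The sketch below is one possible approach: the keyword list is an illustrative choice, not an IBM recommendation, and reading `/var/log/messages` normally requires root privileges.

```python
from typing import Iterable, List, Tuple

# Keywords to look for. This list is an assumption chosen for illustration.
KEYWORDS: Tuple[str, ...] = ("error", "fail", "warn")


def suspicious_lines(
    lines: Iterable[str], keywords: Tuple[str, ...] = KEYWORDS
) -> List[str]:
    """Return the lines that contain any keyword, ignoring case."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(keyword in line.lower() for keyword in keywords)
    ]


def scan_messages(path: str = "/var/log/messages") -> List[str]:
    """Filter the Linux messages file for lines worth a closer look."""
    with open(path, errors="replace") as handle:
        return suspicious_lines(handle)
```

The filtering is kept separate from the file access so that it can also be applied to a copy of the log that the customer sends in.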

HS22 Publications

The following guides can be used to do problem determination on the HS22 blade server:

IBM BladeCenter HS22 Type 7870 Problem Determination and Service Guide, also known as the HS22 PDSG. The HS22 PDSG can be used to do problem determination on the blade server. It contains information on how to remove and replace the main components of the system. The PDSG also contains information on error messages and what action to take when they are received.

IBM BladeCenter HS22 Type 7870 Installation and User’s Guide contains information on how to remove and replace the main components of the HS22 blade server.

FRU and CRU Part Number listing

Cover (all models) Tier 1 Customer Replaceable Unit (CRU) 46C7201

Heat sink, microprocessor (all models) - Field Replaceable Unit (FRU) 46C3545

Microprocessor 1.86 GHz/800MHz-4MB 80W (dual core) FRU 46D1272

Microprocessor 2.0 GHz/800MHz-4MB 80W (quad core) FRU 46D1271

Microprocessor 2.13 GHz/800MHz-4MB 80W (quad core) FRU 46D1270

Microprocessor 2.26 GHz/1066MHz-8MB 60W (quad core) FRU 46D1269

Microprocessor 2.26 GHz/1066MHz-8MB 80W (quad core) FRU 46D1267

Microprocessor 2.40 GHz/1066MHz-8MB 80W (quad core) FRU 46D1266

Microprocessor 2.53 GHz/1066MHz-8MB 80W (quad core) FRU 46D1265

Microprocessor 2.67 GHz/1333MHz-8MB 95W (quad core) FRU 46D1264

Microprocessor 2.80 GHz/1333MHz-8MB 95W (quad core) FRU 46D1263

Microprocessor 2.93 GHz/1333MHz-8MB 95W (quad core) FRU 46D1262

Memory, DIMM filler Tier CRU 60H2962

Memory, 1 GB VLP DDR3 Tier 1 CRU 44T1495

Memory, 2 GB VLP DDR3 Tier 1 CRU 44T1497

Memory, 4 GB VLP DDR3 Tier 1 CRU 44T1498

Memory, 8 GB VLP DDR3 Tier 1 Customer Replaceable Unit (CRU) 44T1580

Front bezel Field Replaceable Unit (FRU) 46D1141

Control panel (front bezel) Field Replaceable Unit (FRU) 46D1120

Hard disk drive, 2.5 inch hot-swap SAS 73 GB, 10 K, (option) Tier 1 Customer Replaceable Unit (CRU) 43W7537

Hard disk drive, 2.5 inch hot-swap SAS 73 GB, 15 K, (option) Tier 1 Customer Replaceable Unit (CRU) 43W7573

Hard disk drive, 2.5 inch hot-swap SAS 36 GB, 15 K (option) Tier 1 Customer Replaceable Unit (CRU) 43W7546

Hard disk drive, 2.5 inch hot-swap SAS 146 GB, 10 K (option) Tier 1 Customer Replaceable Unit (CRU) 43W7538

Hard disk drive, 2.5 inch hot-swap SAS 300 GB, 10 K (option) Tier 1 Customer Replaceable Unit (CRU) 43W7673

Solid state drive, 2.5 inch hot-swap 31.4 GB (option) Tier 1 Customer Replaceable Unit (CRU) 43W7651

Heat sink filler Field Replaceable Unit (FRU) 46C3548

Hot-swap storage-bay filler Tier 1 Customer Replaceable Unit (CRU) 44T2248

Blade server base assembly Tier 1 Customer Replaceable Unit (CRU) 44T1805

Label, system service Tier 1 Customer Replaceable Unit (CRU) 46C3473
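A part listing like the one above lends itself to a lookup table keyed by part number. The sketch below holds only a few entries copied from the list, as an illustration of the structure; it is not a complete or authoritative parts catalog, and the function name is hypothetical.

```python
from typing import Dict, Tuple

# A few entries copied from the listing above: part number -> (description, type).
# This is a partial, illustrative table, not the full parts list.
PARTS: Dict[str, Tuple[str, str]] = {
    "46C7201": ("Cover (all models)", "Tier 1 CRU"),
    "46C3545": ("Heat sink, microprocessor (all models)", "FRU"),
    "44T1498": ("Memory, 4 GB VLP DDR3", "Tier 1 CRU"),
    "46D1141": ("Front bezel", "FRU"),
}


def is_customer_replaceable(part_number: str) -> bool:
    """Return True if the part is listed as a Customer Replaceable Unit."""
    _description, part_type = PARTS[part_number]
    return part_type.endswith("CRU")
```

For example, `is_customer_replaceable("44T1498")` returns `True` for the 4 GB DIMM, while the front bezel (46D1141) is a FRU and returns `False`.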

Summary

This course has enabled you to:

1. Provide a high level overview of the HS22 Type 7870 blade server

2. Provide a high level overview of the HS22 memory subsystem

3. Provide a high level overview of the HS22 microprocessor subsystem

4. Describe how to use LightPath for problem determination

5. Describe how to use logs for problem determination

6. Describe how to collect logs for problem determination

7. Describe the firmware update process

8. Describe the diagnostics available

9. Provide problem determination examples

10. List tools available for problem determination
