POWERHA Implementation Overview
Andrew Lanczi, Certified Consulting I/T Specialist ([email protected])
© IBM Corporation 2008
Page 1: Ha Overview

© IBM Corporation 2008

Andrew Lanczi, Certified Consulting I/T Specialist

[email protected]

POWERHA Implementation Overview

Page 2: Ha Overview

Objectives

Understand the implementation of HACMP
- Planning
- How HACMP works
- Configuration options
- HACMP/XD

Page 3: Ha Overview

The Americas: HACMP Customers

Air Reservations    Telco Billings
Manufacturing Plant Floor
Trading Systems    Telco Directory
Credit Verification    Entertainment
File Servers    NIC Servers
Claims Processing    Financial
Inventory Management    Retail ISPs

Europe, Africa, Asia/Pacific: HACMP Customers

Police & Fire    Fleet Services
Publishing    Security Services
Banking ATMs    PCB Process Mfg.
Trading Systems    Fleet Management
Process Control    Cellular Phone Srvc
Air Traffic    Credit Processing
Bond Trading    Health & Hospital

Over 60,000 Licenses Worldwide

Page 4: Ha Overview

HACMP Version Summary

Version   Availability   End of Support
V4.5      June 2002      Sept 2005
V5.1      July 2003      Sept 2006
V5.2      July 2004      Sept 2007
V5.3      July 2005      Sept 2008
V5.4      July 2006      Sept 2009 (est)

Green – supported; Red – no longer supported

Page 5: Ha Overview

HACMP V5.4 – Key Features

Faster Failure Detection using First Failure Data Capture (FFDC)

HACMP on Linux (for Power)

Performance improvements to HACMP/XD GLVM
- Can have up to 4 data mirroring networks
- Support for Enhanced Concurrent Volume Groups

GPFS 2.3 Support

IPAT support on Geographic Networks (for HACMP/XD)

Page 6: Ha Overview

HACMP Cluster

A typical cluster consists of nodes, networks, shared storage, and clients
- HACMP supports 2 - 32 nodes per cluster

[Diagram: a client connects over the network(s) to Node A and Node B, which both attach to a shared disk]

Page 7: Ha Overview

Hardware

pSeries Servers / POWER
- No integrated serial ports for heartbeat on P5 servers
- Announcement letter for limitations - #5765-F62

Network/SAN
- 2-port Async. RS-232 - FC 5723
- 2 Gb FC PCI-X - FC 5716
- 10 Gb FC 5718 and FC 5719
- IBM, Cisco, McData, Brocade, etc.
- TotalStorage SAN Volume Controller Software

Page 8: Ha Overview

Additional Hardware

TotalStorage Products
- DS8000 - online firmware update is now supported (code level 6.0.0.324 R10h.9b050406) and higher
- DS6000 - online firmware update is not supported in HACMP clusters
- DS4100 and DS4000 EXP1000 Serial ATA Hardware
  - DS4100 does not support multi-path I/O, so no multi-path fallover
  - CSPOC cannot be used to add a DS4100 disk to AIX
- TotalStorage DS4000 EXP710 FC Storage Expansion unit (1740-710)
- TotalStorage ESS (2105-F20, 2105-800)
- OEM per CSA, EMC, HDS...

Page 9: Ha Overview

Online Planning Worksheets (OLPW)

Stand-alone tool for planning a cluster
- Can be used to configure a cluster with the cl_opsconfig utility
- Does not monitor or manage a cluster
- Usable on AIX or Windows 2000

Installation Requirements
- Java Runtime Environment version 1.3.0 or higher
  - AIX already has this level of JRE
  - Windows needs it installed
- www.ibm.com/developerworks/webservices/sdk

Page 10: Ha Overview

ASCII Based Cluster Configuration

Well-defined and documented XML file structure using Document Type Definition (DTD) and XML schema

Configuration file can be passed to XML editors
- Allows customers to modify cluster configs that can then be passed to multiple clusters
- cl_exportdefinition - creates a worksheet from an existing cluster

External DTD: /usr/es/sbin/cluster/worksheets/hacmp-v5300.dtd

External XML Schema: /usr/es/sbin/cluster/worksheets/hacmp-v5300.xsd
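
The idea of editing an exported definition and passing it to another cluster can be sketched in Python. This is only an illustration: the element names (`cluster`, `name`, `node`) and the sample values are hypothetical placeholders, not the structure defined in hacmp-v5300.dtd, which should be consulted for the real layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a file produced by cl_exportdefinition.
# The element names are illustrative, not the real HACMP schema.
SAMPLE_DEFINITION = """<cluster>
  <name>prod_cluster</name>
  <node>node1</node>
  <node>node2</node>
</cluster>"""


def duplicate_definition(xml_text, new_name, node_map):
    """Return a copy of the definition with a new cluster name and renamed nodes."""
    root = ET.fromstring(xml_text)
    root.find("name").text = new_name
    for node in root.findall("node"):
        # Nodes without a replacement in node_map are left untouched.
        node.text = node_map.get(node.text, node.text)
    return ET.tostring(root, encoding="unicode")


updated = duplicate_definition(
    SAMPLE_DEFINITION, "test_cluster", {"node1": "testa", "node2": "testb"}
)
print(updated)
```

In practice the edited file would then be handed to cl_opsconfig; a validating XML editor working against the DTD or schema is a safer way to make the changes than ad hoc scripting.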

Page 11: Ha Overview

ASCII Based Cluster Configuration

[Diagram: an existing HACMP cluster exports a definition file for OLPW (XML file); the XML file is edited in an XML editor to duplicate the cluster with changes; the updated XML file is applied with cl_opsconfig to create a new HACMP cluster]

Page 12: Ha Overview

Web-based Cluster Management - WebSMIT

The Main page after login

Page 13: Ha Overview

Web-based Cluster Management - WebSMIT

Requirements
- Any "Apache-compliant" web server
  - IBM HTTP Server
  - Apache
- /usr/es/sbin/cluster/wsm/README tells how to install Apache from RPMs
- Fileset cluster.es.client.wsm
- (Optionally) documentation filesets:
  - cluster.doc.en_US.es.pdf
  - cluster.doc.en_US.es.html

Page 14: Ha Overview

HACMP Components

RSCT (Reliable Scalable Cluster Technology)
- RMC subsystem

Cluster Manager (clstrmgr)
- Recovery driver
- SNMP services

clcomd - cluster communications daemon

clinfo - (cluster information services, optional)

Page 15: Ha Overview

How HACMP Works

Heartbeat is used to monitor health

[Diagram: two nodes exchange heartbeats over an IP network, an RS-232 link, and shared disks]

Page 16: Ha Overview

Networks for an HACMP cluster

At least two networks are recommended
- One physical network with multiple logical IP subnets
- One non-IP based network
  - RS-232
  - Disk heartbeat

The goal is to avoid a partitioned cluster: both nodes always get the latest information

Decide on the mechanism to provide availability of service addresses
- IPAT via Replacement
- IPAT via Aliasing - the default

Other requirements
- Persistent IP addresses

Page 17: Ha Overview

Which IPAT?

"IPAT via Aliasing": IP address takeover performed by moving an IP alias address from one interface to another, without changing the base address of the interface
- IP aliasing allows multiple resource groups to be configured using the same adapters
- IP aliased networks use boot and service labels
  - Boot label on standby adapters as well
  - No Hardware Address Takeover on Service IP

"IPAT via IP Replacement": IP address takeover performed by swapping an interface's boot-time address with a service IP address
- Only one address can be active on an interface at any time
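
The difference between the two takeover styles can be sketched with a toy model in Python. This is a simplified illustration, not how HACMP implements takeover: the `Interface` class, the function names, and the interface names (`en0@sysa` and so on) are hypothetical, and the addresses are only example values.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    """Toy model of an interface: one base address plus optional aliases."""
    name: str
    base: str
    aliases: list = field(default_factory=list)


def takeover_via_aliasing(source, target, service_ip):
    """Move the service alias to the target; base addresses stay unchanged."""
    if service_ip in source.aliases:
        source.aliases.remove(service_ip)
    target.aliases.append(service_ip)


def takeover_via_replacement(target, service_ip):
    """Swap the target's boot-time address for the service address.

    Only one address is active on the interface, so the previous base
    address is returned for the caller to restore later.
    """
    previous, target.base = target.base, service_ip
    return previous


# Aliasing: the service alias moves from node A's interface to node B's.
en0_a = Interface("en0@sysa", "10.10.20.1", ["192.37.56.1"])
en0_b = Interface("en0@sysb", "10.10.20.10")
takeover_via_aliasing(en0_a, en0_b, "192.37.56.1")

# Replacement: the standby interface gives up its own address.
standby_b = Interface("en1@sysb", "10.10.20.2")
old_base = takeover_via_replacement(standby_b, "192.37.56.1")
```

After the aliasing takeover the target still answers on its boot address as well as the service address, which is why several resource groups can share the same adapters.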

Page 18: Ha Overview

Network IPAT Connection options

IPAT via Replacement
  a_boot     192.37.56.10    b_svc      192.37.56.20
  a_svc      192.37.56.1
  a_standby  10.10.20.1      b_standby  10.10.20.2

IPAT via Aliasing
  a_boot1    10.10.20.1      b_boot3    10.10.20.10
  a_svc      192.37.56.1
  a_boot2    10.10.30.1      b_boot4    10.10.30.2

[Diagram: nodes sysa and sysb and a client attached to ETHERNET1 (subnet mask 255.255.255.0) and Ethernet2]
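
The aliasing addresses on this slide follow a pattern: each boot interface on a node sits on its own subnet and the service address uses a third subnet. A short Python sketch can check an address plan against that pattern. Treating the pattern as a rule is an assumption read off the example, and the function names are hypothetical.

```python
import ipaddress

MASK = "255.255.255.0"  # subnet mask shown on the slide

# Labels and addresses from the "IPAT via Aliasing" example.
NODE_A = {"a_boot1": "10.10.20.1", "a_boot2": "10.10.30.1"}
NODE_B = {"b_boot3": "10.10.20.10", "b_boot4": "10.10.30.2"}
SERVICE = {"a_svc": "192.37.56.1"}


def subnet_of(address, mask=MASK):
    """Return the network an address belongs to under the given mask."""
    return ipaddress.ip_interface(f"{address}/{mask}").network


def check_aliasing_layout(boot_labels, service_labels):
    """Return a list of problems; an empty list means the layout passes."""
    problems = []
    boot_nets = [subnet_of(a) for a in boot_labels.values()]
    if len(set(boot_nets)) != len(boot_nets):
        problems.append("boot interfaces share a subnet")
    for label, address in service_labels.items():
        if subnet_of(address) in boot_nets:
            problems.append(f"{label} is on a boot subnet")
    return problems


print(check_aliasing_layout(NODE_A, SERVICE))
print(check_aliasing_layout(NODE_B, SERVICE))
```

Running the same check over a planning worksheet before configuring the cluster catches overlapping subnets early; cluster verification remains the authoritative check.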

Page 19: Ha Overview

Networking with switches

One Layer 3 VLAN with multiple logical subnets

Do not place intelligent network equipment that does not transparently pass through UDP broadcasts and other packets to all cluster nodes. If such equipment is placed in the paths between cluster nodes and clients, use a $PING_CLIENT_LIST (in clinfo.rc

Page 20: Ha Overview

CISCO Example with HACMP

Assume that the customer is using the standard Cisco switch product line of 3550, 3750, 49xx, 6500, etc. At the L3 level, one VLAN is all that is needed to satisfy the HACMP setup. You just define multiple IP addresses as secondary addresses on this VLAN interface for the multiple subnets. For example: the three subnets are 1.1.1.0/24, 1.1.2.0/24, and 1.1.3.0/24 as primary, standby, alias HA networks respectively, all using vlan 50. On a Cisco L3 switch/router using IOS, you would code the following:

Switch# config t
Switch(config)# vlan 50
Switch(config-vlan)# name HACMP_Setup
Switch(config-vlan)# exit
Switch(config)# int vlan 50
Switch(config-if)# ip address 1.1.1.254 255.255.255.0
Switch(config-if)# ip address 1.1.2.254 255.255.255.0 secondary ( Alias IP address net)
Switch(config-if)# ip address 1.1.3.254 255.255.255.0 secondary ( Standby IP address net)
Switch(config-if)# no shut
Switch(config-if)# exit
Switch(config)# exit
Switch#

You now have vlan 50 customized with three different IP address identities (one for each of the subnets), and all of them pingable. Alias and standby/boot are tagged as secondary.

Page 21: Ha Overview

Persistent Labels

An IP alias that is always available if a service or boot interface is active

Intended to provide administrators access to a node

Only one persistent label per node per network is allowed

Once synchronized, they are always available
- Can be used for HATivoli oserv process IP

Page 22: Ha Overview

Heartbeat Over RS232

A point-to-point non-IP serial network
- Usually implemented using an async adapter and a null modem cable connection
- On some pSeries servers that have 3 or 4 built-in serial ports, ports 2, 3 or 4 can be used for this connection
  - Built-in serial port 1 is not supported for HACMP
- In LPAR mode it is better to allocate a PCI slot for the async adapter on each server if using an RS-232 serial network
- Check documentation to make sure the port is supported prior to implementation (p5 server = no!)

Page 23: Ha Overview

Heartbeat Over Disk (diskhb)

Provides users with:
- A point-to-point network type that is easy to configure as a volume group
- Additional protection against cluster partitioning
- A serial network that can use any disk type

Doesn't require additional hardware.

For customers that consider rs232, tmssa, or tmscsi too costly or complex to set up

Requires an enhanced concurrent VG
- Configured via the "Extended Configuration path"
- Uses a disk sector formerly reserved for clvmd

May not be a good alternative for a disk with heavy I/O

Page 24: Ha Overview

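Heartbeat Over Disk (diskhb)

Any disk in an enhanced concurrent VG can be used
- Point-to-point networks

[Diagram: enhanced concurrent volume group enhconcvg with hdisk1, hdisk2, and hdisk3; the point-to-point disk heartbeat networks disknet1, disknet2, and disknet3 each connect a pair of nodes through one of these disks]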

Page 25: Ha Overview

Cluster Communication Daemon

clcomd provides a secure transport layer
- Caches ODMs for performance
  - /var/hacmp/odmcache - about 1 MB per node
- Managed by SRC and started by init; the inittab entry is:

clcomdES:2:once:startsrc -s clcomdES > /dev/console 2>&1

Source addresses are checked against
- /usr/sbin/cluster/etc/rhosts
- HACMPadapter ODM
- HACMPnode ODM (communication paths)

Page 26: Ha Overview

Cluster Communication Daemon

Security strategies
- The default is autodiscovery
- AIX Cluster security - CtSec
- Use a VPN tunnel
  - Set up persistent IP labels on the same subnet
  - chgsrc to add the -p to clcomd
  - Specify port 6191 (clcomd entry in /etc/services)
  - Use the extended VPN configuration screen to secure traffic for other cluster services

If there is an unresolvable label in /usr/es/sbin/cluster/etc/rhosts, all connections will be denied

Log files
- /var/hacmp/clcomd/clcomd.log[.0] - up to 1 MB each
- /var/hacmp/clcomd/clcomddiag.log[.0] - up to 9 MB each
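
The rhosts behaviour described for clcomd (source addresses are checked, and one unresolvable label denies everything) can be sketched in Python. This is a simplified model under stated assumptions: it covers only the rhosts check, not the HACMPadapter or HACMPnode ODM lookups or autodiscovery with an empty file, and the function name, host table, and addresses are hypothetical.

```python
def is_connection_allowed(source_ip, rhosts_labels, resolve):
    """Decide whether a connection from source_ip would be accepted.

    rhosts_labels: entries as they might appear in the rhosts file.
    resolve: callable mapping a label to an IP string, or None if unresolvable.
    """
    resolved = []
    for label in rhosts_labels:
        address = resolve(label)
        if address is None:
            # One unresolvable label denies every connection.
            return False
        resolved.append(address)
    return source_ip in resolved


# Hypothetical name table standing in for /etc/hosts or DNS.
HOSTS = {"nodea": "10.10.20.1", "nodeb": "10.10.20.10"}
lookup = HOSTS.get

print(is_connection_allowed("10.10.20.1", ["nodea", "nodeb"], lookup))
print(is_connection_allowed("10.10.20.1", ["nodea", "ghost"], lookup))
```

The second call shows why a stale or mistyped label in rhosts is worth checking first when cluster communication suddenly fails on every node.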

Page 27: Ha Overview

LVM and Disks

Use mirrored logical volumes
- Including mirrored jfslogs
- Consider quorum issues

Use multiple connections from the servers to the disk subsystem(s)

[Diagram: volume group DBVG with logical volumes jfsloglv, dblv, and db2lv, and its mirror copy DBVG' with jfsloglv', dblv', and db2lv']

Page 28: Ha Overview

Fast Disk Takeover

Requires Enhanced Concurrent Volume Groups
- Provides a significant performance gain for takeover of volume groups consisting of a large number of disks
- Requires Enhanced Concurrent Volume Groups in non-concurrent resource groups

Uses RSCT for communication
- HACMP coordinates activity between nodes - active vs passive varyon, etc.
- bos.clvm.enh
- If migrating, shared VGs must be converted
  - System Management (C-SPOC) - recommended
  - or chvg -C on ALL cluster nodes

Page 29: Ha Overview

Cross-Site LVM Mirroring

Management feature to simplify the configuration of LVM mirroring between two sites.
- Provides automatic LVM mirror synchronization after disk failure when a node/disk becomes available in a SAN network.
- Maintains data integrity by eliminating manual LVM operations.
- Cluster verification enhancements to ensure high availability of the data.
- Keeping the data in different locations eliminates the possibility of data loss upon a disk block failure.
- For high availability, each mirror copy should be located on a separate physical disk, in a separate disk enclosure, at a separate physical location.
- LVM mirroring allows up to three data copies.
- Mirror synchronization is required for stale partitions.

Page 30: Ha Overview

Cross-Site LVM Mirroring

Two sites connected using a SAN network

[Diagram: Site A (Node A, Node B) and Site B (Node C, Node D) connected through FC Switch 1 and FC Switch 2; physical volumes PV1 through PV6 are spread across the two sites]

Page 31: Ha Overview

VIO Server

VIOS owns physical disk resources
- LVM based storage on VIO Server

LPARs see disks as vSCSI (Virtual SCSI) devices
- Virtual SCSI devices added to partition via HMC
- LUNs on VIOS accessed as vSCSI disk

[Diagram: the hypervisor hosts a VIO Server, an AIX1 partition (AIX1VG), and an AIX2 partition (AIX2VG); hdisk1 and hdisk2 are served from the SAN storage subsystem, and the partitions share an Ethernet connection]

Page 32: Ha Overview

VIO Server Implementation

Single VIO Server configuration has exposures
- The VIO Server partition is shut down or fails
- Network connectivity through the VIO Server
- Disk failure
- System failure

Page 33: Ha Overview

High Availability with Dual VIO Servers

Virtual Ethernet
- Partition to partition communication
- Requires AIX 5L V5.3 and POWER5

VLAN - Virtual LAN
- Provide ability for adapter to be on multiple subnets
- Provide isolation of communication to VLAN members
- Allows a single adapter to support multiple subnets

IEEE VLANs
- Up to 4096 VLANs
- Up to 65533 vENET adapters
- 21 VLANs per vENET adapter

Shared Ethernet Adapter
- Provides access to outside world
- Uses physical adapter in the Virtual I/O Server

[Diagram: partitions AIX1 and AIX2 connect through the POWER Hypervisor's virtual Ethernet switch (VLAN 1 and VLAN 2) and vSCSI to two VIO Servers; each VIO Server's Shared Ethernet Adapter links to external servers]

Page 34: Ha Overview

VIO Server with HACMP Cluster

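Available via Advanced POWER Virtualization

Issues:
- Network connectivity?
- Shared disk access?
- SPOFs

[Diagram: the hypervisor hosts a VIO Server and two AIX partitions (AIX1 and AIX2) running HACMP; volume groups AIX1VG and AIX2VG sit on hdisk1 and hdisk2 from the SAN storage subsystem, and the partitions share an Ethernet connection]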

Page 35: Ha Overview

Custom Resource Groups

A collection of resources is a resource group. Resources can be:
- Applications
- Volume groups, disks, filesystems
- IP addresses

Custom Resource Groups
- Users explicitly specify the desired startup, fallover, and fallback behaviors
- Can be configured using the standard or extended path
- Settling and fallback timers provide further granularity
- Dynamic node priority can provide even further granularity in a multi-node cluster.

Page 36: Ha Overview

Custom Resource Groups

Startup Preferences
- Online On Home Node Only (OHN)
- Online On First Available Node (OFAN)
- Online Using Distribution Policy (OUDP)
- Online On All Available Nodes (concurrent) (OAAN)

Fallover Preferences
- Fallover To Next Priority Node In The List (FNPN)
- Fallover Using Dynamic Node Priority (FUDNP)
- Bring Offline (On Error Node Only) (BO)
  - This is most appropriate for concurrent type RGs

Fallback Preferences
- Fallback To Higher Priority Node (FHPN)
- Never Fallback (NFB)

Page 37: Ha Overview

Resource Distribution Policies

Control of IP via aliasing service labels
- Collocation - all resources of this type will be on the same physical resource
- Anti-collocation - all resources of this type are allocated to the first physical resource that is not already serving a resource - default

HACMP Extended Resources Configuration
  Configure Resource Distribution Policies

Configure Service IP Labels/Address Distribution Preference

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[Entry Fields]
* Network Name                net_ether_01
* Distribution Preference     Anti-Collocation +
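
The two distribution preferences can be sketched as a small placement function in Python. This is an illustration of the definitions given on this slide, not the logic of the HACMP event scripts: the function name, the interface names, and the choice to fall back to the least-loaded interface once every interface is busy are assumptions.

```python
def place_service_labels(labels, interfaces, policy="anti-collocation"):
    """Assign service IP labels to interfaces under a distribution preference.

    Returns a dict mapping each interface name to the labels it carries.
    """
    placement = {name: [] for name in interfaces}
    for label in labels:
        if policy == "collocation":
            target = interfaces[0]  # everything lands on one interface
        elif policy == "anti-collocation":
            # First interface not yet serving a label; if all are busy,
            # use the least-loaded one (an assumption of this sketch).
            free = [n for n in interfaces if not placement[n]]
            if free:
                target = free[0]
            else:
                target = min(interfaces, key=lambda n: len(placement[n]))
        else:
            raise ValueError(f"unknown policy: {policy}")
        placement[target].append(label)
    return placement


print(place_service_labels(["svc1", "svc2"], ["en0", "en1"]))
print(place_service_labels(["svc1", "svc2"], ["en0", "en1"], "collocation"))
```

Anti-collocation spreads aliases so that the loss of one adapter affects as few service addresses as possible, which fits its role as the default.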

Page 38: Ha Overview

Resource Distribution Policies

All policies are exercised by cluster event scripts
- acquire_service_addr
- acquire_takeover_addr
- cl_configure_persistent_address

Additional policies
- Collocation with persistent
- Anti-collocation with persistent

Feature is available in all versions of HACMP V5
- HA 5.1 requires APAR IY63515
- HA 5.2 requires APAR IY63516

Page 39: Ha Overview

Dependent Resource Groups

Used for multi-tiered architectures that require ordered resource group processing

Allows the implementer to specify cluster-wide dependencies between resource groups
- Parent - child dependency type
- Option to display dependencies
  - clRGinfo -a

[Diagram: Resource Group B (parent resource group), Resource Group C (parent resource group), and Resource Group A (child resource group), linked by dependency arrows]
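
Ordered processing of dependent resource groups amounts to a topological sort in which every parent comes before its children. A minimal Python sketch follows; the group names and the assumption that Resource Group A depends on both B and C are illustrative, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Maps each child resource group to the parents it depends on.
# That RG_A depends on both RG_B and RG_C is an assumption for illustration.
DEPENDENCIES = {"RG_A": {"RG_B", "RG_C"}}


def startup_order(dependencies):
    """Return an order in which every parent precedes its children."""
    return list(TopologicalSorter(dependencies).static_order())


order = startup_order(DEPENDENCIES)
print(order)
```

Reversing the list gives a plausible stop order, with the child released before the parents it relies on.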

Page 40: Ha Overview

Custom Resources

Three-node cluster with one resource group configured for Online On Home Node Only priority at startup
- Sysa is the current owner of the resource group

GROUPA: a_svc 1.1.1.1, dbvg, dbapp

[Diagram: nodes sysa, sysb, and sysc with adapters a_svc/a_stdby, b_svc/b_stdby, and c_svc/c_stdby; sysa holds a_svc and dbvg]

Page 41: Ha Overview

Custom Resources after fallover

[Diagram: sysa is down, showing only a_stdby; sysb now holds a_svc and dbvg alongside b_svc; sysc has c_svc and c_stdby]

Sysa has crashed!
- Fallover To Next Priority Node In The List
- If sysb was not available, sysc would have acquired GROUPA

GROUPA: a_svc 1.1.1.1, dbvg, dbapp

Presenter notes: In a cascading resources failover, the service address from sysa ends up on the standby adapter in sysb. The reason is that sysb might have its own mission-critical application (mutual takeover), and we don't want to interfere with that application.
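
The fallover shown on this slide can be sketched as a walk down the node priority list in Python. This is a simplified reading of "Fallover To Next Priority Node In The List": the function name is hypothetical, and picking the highest-priority surviving node is an assumption that matches the sysa, sysb, sysc example.

```python
def next_priority_node(priority_list, failed_node, available):
    """Pick the highest-priority surviving node that can host the group.

    priority_list: nodes in priority order, highest first.
    available: set of nodes currently able to host the resource group.
    """
    for node in priority_list:
        if node != failed_node and node in available:
            return node
    return None  # no takeover node left


PRIORITY = ["sysa", "sysb", "sysc"]

print(next_priority_node(PRIORITY, "sysa", {"sysb", "sysc"}))
print(next_priority_node(PRIORITY, "sysa", {"sysc"}))
```

The second call mirrors the note on the slide: with sysb unavailable, sysc acquires GROUPA.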
Page 42: Ha Overview

Custom Resources - reintegration

Sysa is repaired and HACMP is restarted on sysa
- Fallback To Higher Priority Node

GROUPA: a_svc 1.1.1.1, dbvg, dbapp

[Diagram: nodes sysa, sysb, and sysc with adapters a_svc/a_stdby, b_svc/b_stdby, and c_svc/c_stdby; sysa again holds a_svc and dbvg]

Presenter notes: In HACMP 4.4, we can configure cascading without fallback, if we want to change this behavior.
Page 43: Ha Overview

Custom Resources after fallover

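[Diagram: sysa is down, showing only a_stdby; the remaining labels are b_boot and b_svc on sysb and c_svc and c_boot on sysc, with dbvg on the takeover node]

Sysa has crashed!
- Fallover Using Dynamic Node Priority
- Destination determined by DNP rules - lowest CPU usage

GROUPA: a_svc 1.1.1.1, dbvg, dbapp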
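
The rule named on this slide, lowest CPU usage, can be sketched in a few lines of Python. The function name and the utilisation figures are hypothetical, and only this one rule is modelled; how HACMP gathers the figures is outside the sketch.

```python
def dnp_target(candidates, cpu_usage):
    """Pick the candidate node with the lowest CPU usage (ties: first listed)."""
    if not candidates:
        return None
    return min(candidates, key=lambda node: cpu_usage[node])


# Hypothetical utilisation figures, in percent.
CPU = {"sysb": 72.0, "sysc": 18.5}

print(dnp_target(["sysb", "sysc"], CPU))
```

Unlike the static priority list, the destination here depends on the state of the cluster at the moment of failure, so the same failure can land on different nodes at different times.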

Page 44: Ha Overview

Online on All Nodes

Up to 32 nodes access the data simultaneously
- All systems have the resource group
- All systems read and write to the database
- LCMP

GROUPA: dbvg, ORAC

[Diagram: sysa, sysb, and sysc all access dbvg]

Presenter notes: In this configuration, both systems write to the same disks at the same time. This implies that a lock manager is used to control access. This configuration requires the HACMP concurrent resources feature code and an application that has incorporated the HACMP lock manager or a lock manager that will work in this environment. Oracle Parallel Server is currently the only packaged product that supports this environment. If a system fails, there is no disk takeover required for concurrent resources. The surviving system already has the mission-critical database varied on.
Page 45: Ha Overview

Configuration Settling Time

Implementers can configure a cluster settling time to minimize RG fallback activity when multiple nodes are started at the same time
- Controls how long to wait for a higher priority node to join the cluster before bringing a resource group online
- One time per cluster - preference = Online On First Available Node

Can be found under "Configure Resource Group Run-Time Policies" in SMIT

Configure Settling Time for Resource Groups

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[Entry Fields]
* Settling Time (in Seconds)        [0] #
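
The effect of the settling time can be sketched as a small decision function in Python. This is a simplified model, not HACMP's implementation: it assumes the timer starts when the first node joins, that the group starts at once if the highest-priority node arrives within the window, and that otherwise it starts at the deadline on the best node present. The function name and the join times are hypothetical.

```python
def startup_decision(priority_list, join_times, settling_time):
    """Return (node, time) at which the resource group is brought online.

    priority_list: nodes in priority order, highest first.
    join_times: dict of node -> seconds at which it joined (must not be empty).
    settling_time: seconds to wait for a higher priority node.
    """
    first_join = min(join_times.values())
    deadline = first_join + settling_time
    top = priority_list[0]
    if top in join_times and join_times[top] <= deadline:
        # The highest priority node arrived in time: start there right away.
        return top, join_times[top]
    # Otherwise start at the deadline on the best node that has joined by then.
    joined = [n for n in priority_list
              if join_times.get(n, float("inf")) <= deadline]
    return joined[0], deadline


NODES = ["sysa", "sysb", "sysc"]
print(startup_decision(NODES, {"sysc": 0, "sysa": 30}, 60))
print(startup_decision(NODES, {"sysc": 0, "sysb": 20, "sysa": 90}, 60))
```

With a settling time of 0 the group starts on whichever node joins first, which is the case the settling time is meant to avoid when nodes boot close together.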

Page 46: Ha Overview

Custom Resource Groups

Configuration Fallback Timers
- Fallback timers allow the implementer to control when an RG fallback will occur - off peak, weekends, etc.
- Can only be configured in the extended path

Configure Specific Date Fallback Timer Policy

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[Entry Fields]
* Name of the Fallback Policy        [ ]
* YEAR                               [ ] #
* MONTH (Jan - Dec)                  [ ] +
* Day of Month (1 - 31)              [ ] +#
* HOUR (0 - 23)                      [ ] +#
* MINUTES (0 - 59)                   [ ] +#

Page 47: Ha Overview

Application Servers - Standard Path

An Application Server is a label with an associated start and stop script
- Start/Stop = absolute path to the executable scripts

Add an Application Server

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[Entry Fields]
* Server Name         [myapp]
* Start Script        [/usr/local/app/start_app]
* Stop Script         [/usr/local/app/stop_app]

F1=Help F2=Refresh F3=Cancel F4=List

F5=Reset F6=Command F7=Edit F8=Image

F9=Shell F10=Exit Enter=Do

Page 48: Ha Overview

Application Monitoring

HACMP supports multiple monitors per application server
- Configured via SMIT - Extended Configuration

# smitty hacmp
  Extended Configuration
    Extended Resource Configuration
      HACMP Extended Resources Configuration
        Configure HACMP Applications
          Add an Application Server

Add Application Server

[Entry Fields]
* Server Name                     [appsrv]
* Start Script                    [/app/startserver]
* Stop Script                     [/app/stopserver]
  Application Monitor Name(s)     monitor1 monitor2 +

Page 49: Ha Overview

Application Monitoring

Add a Process Application Monitor

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[Entry Fields]
* Monitor Name                          [ ]
* Application Server(s) to Monitor          +
* Monitor Mode                          [Long-running monitori> +
* Processes to Monitor                  [ ]
* Process Owner                         [ ]
  Instance Count                        [ ] #
* Stabilization Interval                [ ] #
* Restart Count                         [ ] #
  Restart Interval                      [ ] #
* Action on Application Failure         [notify] +
  Notify Method                         [ ]
  Cleanup Method                        [ ]
  Restart Method                        [ ]

Page 50: Ha Overview

User Interface

SMIT flow in HACMP

"Standard" configuration path allows users to easily configure the most common options
- IPAT via Aliasing networks
- Shared service IP labels
- Volume groups and filesystems
- Application servers

"Extended" path is used for fine-tuning a configuration and configuring less common features
- Configure all network types
- Configure all resource types
- Less common options
  - Site support
  - Application monitoring
  - Performance tuning parameters

easy as cake

Page 51: Ha Overview

User Interface

Topology configuration in the "Standard" path is carried out automatically
- Configuration discovery is automatic
- Node names are set by discovering the host names
- IP network topology is set based on physical connectivity and netmasks

Why do I need the "Extended" path?
- Specify sites, global networks, specific network attributes
- Application monitoring
- Tape resources
- Custom disk methods and resource recovery
- Extended event configuration
- Extended performance tuning
- Security and users
- Snapshot configuration

Page 52: Ha Overview

SMITTY HACMP

User Interface

HACMP for AIX

Move cursor to desired item and press Enter.

Initialization and Standard Configuration

Extended Configuration

System Management (C-SPOC)

Problem Determination Tools

F1=Help F2=Refresh F3=Cancel F8=Image

F9=Shell F10=Exit Enter=Do

Page 53: Ha Overview

Standard Path Installation

Standard Configuration Menu

Initialization and Standard Configuration

Move cursor to desired item and press Enter.

  Two-Node Cluster Configuration Assistant
  Add Nodes to an HACMP Cluster
  Configure Resources to Make Highly Available
  Configure HACMP Resource Groups
  Verify and Synchronize HACMP Configuration
  HACMP Cluster Test Tool
  Display HACMP Configuration

F1=Help F2=Refresh F3=Cancel F8=Image F9=Shell F10=Exit Enter=Do

Page 54: Ha Overview

Two-Node Configuration Assistant

An advanced automation infrastructure for On-Demand operating environments

intended for pre-existing application environments that wish to add high availability

Automatically configures a simple two-node cluster based on the following input:
- Communication path to the remote node
- Application server name
- Application start/stop scripts
- Service IP label

Will automatically copy the start/stop scripts to the remote node

Page 55: Ha Overview

Two-Node Configuration Assistant

Users must configure the topology and resources at the AIX level before using the Configuration Assistant

Before you start, complete the following tasks:
- Connect and configure all IP network interfaces.
- Install and configure the application to be made highly available.
- Add the application's service IP label to /etc/hosts on all nodes.
- Configure the volume groups that contain the application's shared data on disks that are attached to both nodes.

Have the following ready:
- An active communication path to the takeover node.
- A unique name to identify the application to be made highly available.
- The full path to the application's start and stop scripts.
- The application's service IP label.

Page 56: Ha Overview

Two-Node Cluster Configuration Assistant

# smitty hacmp
  Initialization and Standard Configuration
    Two-Node Cluster Configuration Assistant

Two-Node Cluster Configuration Assistant

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[Entry Fields]
* Communication Path to Takeover Node      [ ] +
* Application Server Name                  [ ]
* Application Server Start Script          [ ]
* Application Server Stop Script           [ ]
* Service IP Label                         [ ] +

Page 57: Ha Overview

Two-Node Configuration Assistant

Will create a cluster with the following characteristics
- IPAT via IP aliasing
- Local node is the highest priority
- Startup on highest priority node
- Fallover to the remote node
- Never Fallback
- Contains one application server
- Contains one service IP label
- Contains all shareable volume groups

Activity is logged to /var/hacmp/log/clconfigassist

Will synchronize and verify (clverify)
- Can be set to auto-correct - default is no

Page 58: Ha Overview

Cluster Start

# smitty clstart
  System Management (C-SPOC)
    Manage HACMP Services
      Start Cluster Services

Start Cluster Services

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[Entry Fields]
* Start now, on system restart or both          [both]
  Start Cluster Services on these nodes         []
  BROADCAST message at startup?                 true +
  Startup Cluster Information Daemon            true +
  Reacquire after forced down                   false +
  Ignore verification errors?                   false
  Automatically correct errors found during     Interactively
  cluster start?

Page 59: Ha Overview

clverify logfiles

clverify collects and archives the data
- /var/hacmp/clverify/current/ - stores data used during the current verification attempt. This should not exist unless verification is running or was aborted
- /var/hacmp/clverify/aborted/ - stores data from the most recent aborted verification attempt
- /var/hacmp/clverify/fail/ - stores data from the most recent failed verification attempt
- /var/hacmp/clverify/pass/ - stores data from the most recent passed verification attempt
- /var/hacmp/clverify/pass.prev/ - stores data from the second most recent passed attempt

Page 60: Ha Overview

Standard Path Installation

Standard Topology Configuration
- Users must specify a communication path (IP address, IP label, or FQDN)
- HACMP will contact the nodes using the specified comm paths and automatically configure the base IP topology

Configure Nodes to an HACMP Cluster (standard)

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[Entry Fields]
* Cluster Name                                     [andrews_cluster]
  New Nodes (via selected communication paths)     [node1 node2] +
  Currently Configured Node(s)

Page 61: Ha Overview

Standard Path Installation

Standard Resource Configuration
- Users may only configure the most common resource types
- NOTE: Service IP labels/addresses are now configured as resources
- Configuring a service label is required for the "Standard Path"

Configure Resources to Make Highly Available

Move cursor to desired item and press Enter.

  Configure Service IP Labels/Addresses
  Configure Application Servers
  Configure Volume Groups, Logical Volumes, and Filesystems
  Configure Concurrent Volume Groups and Logical Volumes

Page 62: Ha Overview

Cluster Test Tool

Simplifies cluster validation
- Automates testing of an HACMP cluster
- Tests are carried out in sequence and analyzed by the cluster
- Log file /var/hacmp/log/cl_testtool.log
- Custom test plans can be created

Cluster test tool runs the following tests by default
- NODE_UP - Start one or more nodes
- NODE_DOWN_FORCED - Stop a node forced
- NODE_DOWN_GRACEFUL - Stop one or more nodes
- NODE_DOWN_TAKEOVER - Stop a node with takeover
- CLSTRMGR_KILL - Catastrophic failure
- NETWORK_DOWN_LOCAL - Stop a network on a node
- NETWORK_UP_LOCAL - Restart a network on a node
- SERVER_DOWN - Stop an application server

Page 63: Ha Overview

SMS Text Messaging - HACMP

Allows alerts of cluster events to be sent to cell phones and pagers
- Easily customizable using SMIT
- Messages may be sent through an SMS gateway
  - [email protected]
  - [email protected]

GSM Modem (Global System for Mobile Communications)
- A wireless modem that connects to a cellular network, allowing a computer to connect to the net wirelessly

Page 64: Ha Overview

SMS Text Messaging - HACMP

- A ";" in the number will result in an alphanumeric page being sent - 18005552222;437-1881
- The "@" character will send an SMS message using /usr/bin/mail
- The "#" will cause an SMS message to be sent wirelessly via GSM modem - 437-1881#

Add a Custom Remote Notification Method

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[Entry Fields]
* Method Name                              [SMS_Notify]
  Description                              [Node Down]
* Nodename(s)                              [NodeA] +
* Number to dial or cell phone address     [[email protected]]
* Filename                                 [/usr/es/sbin/cluster/samples/pager/sample.txt]
* Cluster Event(s)                         node_down +
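
The three special characters described for the destination field can be sketched as a small classifier in Python. This is an illustration only: the function name and the returned category strings are hypothetical, the order in which the characters are tested is an assumption, and treating a plain number as a numeric page is also an assumption not stated on the slide.

```python
def classify_destination(number):
    """Classify a notification destination by its special character."""
    if "@" in number:
        return "sms-via-mail"        # handed to /usr/bin/mail
    if "#" in number:
        return "sms-via-gsm-modem"   # sent wirelessly through a GSM modem
    if ";" in number:
        return "alphanumeric-page"
    return "numeric-page"            # assumption: plain numbers page numerically


for sample in ("18005552222;437-1881", "437-1881#", "oncall@example.com"):
    print(sample, "->", classify_destination(sample))
```

A check like this is handy in a planning script to confirm that each entry will take the delivery path the administrator expects before the notification method is added in SMIT.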

Page 65: Ha Overview

Page 66: Ha Overview

Smart Assists

WebSphere 6.0 standalone and ND
- N+1 and hot standby

Oracle "cold failover cluster" (CFC)
- Oracle app server 10g (9.0.4) (AS 10g)
- Two node - hot standby

DB2 - UDB Enterprise Server Edition (v8.1 and 8.2)
- N+1 and hot standby
- DB2 software must not be installed on the shared storage

Page 67: Ha Overview

Uses site support for PPRC
- Provides support for ESS 2105-F20 and 2105-800

HACMP/XD for eRCMF (Enterprise Remote Copy Management Facility)
- Requires ESS eRCMF version 2.0

Uses site support for GLVM
- Supports cross-site data replication with no distance limitation
- Synchronous
- A maximum of 2 sites

Page 68: Ha Overview

HACMP/XD: PPRC Support

Peer-to-Peer Remote Copy
- HACMP coordinates Sharks to ensure failover of the environment

[Diagram: Site 1 and Site 2 connected by a WAN; ESS PPRC provides hardware-based data mirroring between Shark 1 and Shark 2]

Page 69: Ha Overview

Two Site HACMP Cluster With Geographic LVM

[Diagram: Node A and Node B at two sites, Boston and Austin, connected by a TCP/IP WAN. Each node sees four physical volumes, PV1 to PV4: one node maps them to hdisk7 through hdisk10 and the other to hdisk5 through hdisk8. The copies are labelled Real Copy #1, Real Copy #2, Virtual Copy #1, and Virtual Copy #2]

One volume group actually spans both sites. Each site contains a copy of mission-critical data. Instead of extremely long disk cables, a TCP/IP network and the RPV device driver are used for remote disk access.

Page 70: Ha Overview

More Information

- HACMP System Administration I: Planning and Implementation
- HACMP System Administration II: Administration and Problem Determination
- HACMP System Administration III: Virtualization and Disaster Recovery
- HACMP Problem Determination and Recovery
- HACMP Certification Workshop (2 days)
- AM050 HACMP High Availability Products for pSeries Overview (2 days)

[email protected] - Comments and questions about HACMP
www-1.ibm.com/servers/eserver/pseries/ha
http://www.ibm.com/servers/aix/library
http://www-1.ibm.com/servers/eserver/pseries/library/hacmp_docs.html