Best practice [ETERNUS DX S3 Storage Cluster] FUJITSU
CONFIDENTIAL
Page 1 of 27 fujitsu.com/eternus
Best practice FUJITSU Storage ETERNUS DX S3 Storage Cluster
Technical Info
This document provides technical information about the new ETERNUS
Storage Cluster feature. It helps you understand how to configure,
manage and use this function, which gives an ETERNUS DX S3 storage
system high availability by connecting two ETERNUS DX S3 storage
devices.
Content
Introduction
Overview
Requirements
Software
Licenses
Storage Cluster setup and configuration
Storage Cluster configuration
Storage Cluster allocating Business Volumes
Storage Cluster Controller setup
Storage Cluster processing
Storage Cluster bi-directional information and setup
Recovery procedure caused by defect RAID Group
Preconditions
1. Step
2. Step
3. Step
4. Step
5. Step
6. Step
Appendix
Fibre Channel Switch read-only discovery
Status of TFO Group Information
Recommendations
TFO Checklist
Abbreviations
Introduction
The ETERNUS Storage Cluster is a high-availability feature of the
ETERNUS DX S3 family of storage devices. Volumes assigned to a Storage
Cluster configuration, called Transparent Failover Volumes (TFOV), are
paired and mirrored from the Primary storage system to the Secondary
storage system by Remote Equivalent Copy (array-based replication
using the synchronous REC mode of the Advanced Copy function). In
normal state, the Fibre Channel (FC) ports configured on the Primary
site are linked up and the ones on the Secondary site are linked down,
so that the business servers issue I/O to the Primary storage system.
The Storage Cluster Controller is connected by LAN to both ETERNUS DX
S3 storage devices for heartbeat monitoring. It is responsible for
avoiding any kind of split-brain scenario in a Storage Cluster
configuration set up in automatic failover mode. If the Primary
storage system crashes, the Storage Cluster Controller is required to
decide on switching operation over to the Secondary storage system
(automatic failover). If the Storage Cluster Controller is not
connected, a user needs to perform a manual failover to the Secondary
storage system.
When the failover is invoked, the Fibre Channel (FC) ports configured
on the Primary storage system link down and the ones on the Secondary
storage system link up, taking over the volume information including
the WWN/WWPN of the Primary site, so that the business servers issue
I/O to the Secondary storage system. To achieve this functionality, a
user needs to configure the Storage Cluster feature using ETERNUS SF.
Overview
The ETERNUS
Storage Cluster is a function that gives the storage system high
availability by connecting two ETERNUS DX S3 storage systems. One of
them is the Primary storage system and the other is the Secondary
storage system. If the Primary (active) storage system is no longer
available due to a hardware failure or an unexpected disaster, the I/O
paths (host connections) of the working business servers are switched
to the mirrored Secondary (standby) storage system. In Auto Mode
configuration this failover is transparent for both servers and
applications and ensures uninterrupted operations.
Additionally, a user can initiate a manual failover from the Primary
(active) storage system to the Secondary (standby) storage system at
any time. This may be necessary when a RAID Group hosting the volumes
(TFOV) used in the Storage Cluster configuration is destroyed due to
multiple disk failures while the ETERNUS DX S3 storage system itself
is still up and running. Another reason for a manual failover is
planned storage system downtime for hardware maintenance or firmware
upgrades.
The picture above illustrates the functional design of the
Storage Cluster feature in a single-sided Transparent Failover
(TFO) configuration.
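The choice between automatic and manual failover described above can be summarized in a small Python sketch. It is only a simplified model for illustration and not the logic of the ETERNUS firmware or of the Storage Cluster Controller; the names ClusterState, Action and decide are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "keep I/O on the Primary storage system"
    AUTOMATIC_FAILOVER = "automatic failover to the Secondary storage system"
    MANUAL_FAILOVER = "a user has to trigger a manual failover"


@dataclass
class ClusterState:
    primary_available: bool      # Primary (active) system still serves I/O
    failover_mode_auto: bool     # TFO Group configured in automatic failover mode
    controller_connected: bool   # Storage Cluster Controller reachable via LAN


def decide(state: ClusterState) -> Action:
    """Return the action needed to keep the business servers online.

    A planned manual failover (maintenance, firmware upgrade) is always
    possible and is not modeled here.
    """
    if state.primary_available:
        return Action.NONE
    # Automatic failover needs both the Auto mode and the controller,
    # because the controller makes the switchover decision.
    if state.failover_mode_auto and state.controller_connected:
        return Action.AUTOMATIC_FAILOVER
    return Action.MANUAL_FAILOVER


if __name__ == "__main__":
    print(decide(ClusterState(False, True, False)).value)
```

The model makes one point explicit: without a connected Storage Cluster Controller, a crash of the Primary storage system always ends in a manual failover, even if the TFO Group is configured in automatic mode.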
Requirements
For the Storage Cluster feature you need to connect two ETERNUS DX S3
storage systems that are used as a pair. Each of them can be an
ETERNUS DX100 S3, ETERNUS DX200 S3, ETERNUS DX500 S3 or ETERNUS DX600
S3. The Storage Cluster feature requires firmware version V10L20-000
or later. In addition, you need a server running the ETERNUS SF V16.1
Manager software. The operating system of that server can be Windows,
Linux or Solaris. Read the ETERNUS SF Express V16 / Storage Cruiser
V16 / AdvancedCopy Manager V16 Installation and Setup Guide for
details about the supported versions of each operating system.
It is strongly recommended to use a dedicated server for running the
Storage Cluster Monitoring (ETERNUS SF V16.1 Storage Cruiser Agent)
software.
Note: The operating system of this server must be Windows based. This
might change in the future.
The zoning at the Fibre Channel (FC) switches for the business server
connections to the ETERNUS DX S3 storage systems must be WWPN-based
Fibre Channel (FC) zoning only.
Note: Fibre Channel (FC) ports used by the Storage Cluster feature
must not be members of any Port Group on either ETERNUS DX S3 storage
system and should have exactly the same settings (speed, topology
etc.) on the Primary (active) and the Secondary (standby) storage
system. In addition, Host Affinity must be enabled for these Fibre
Channel (FC) ports. This can be checked within ETERNUS SF V16.1 for
each ETERNUS DX S3 storage system under Connectivity -> FC Port.
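The port requirements from the note above can be written down as a simple checklist function. The following Python sketch is an illustration only: the FcPort structure and its field values are hypothetical and have to be filled in by hand from what ETERNUS SF shows under Connectivity -> FC Port; the sketch does not query a storage system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FcPort:
    name: str                   # e.g. "CM#0 CA#0 Port#1"
    speed: str                  # hypothetical notation, e.g. "8G"
    topology: str               # hypothetical notation, e.g. "Fabric"
    host_affinity: bool         # Host Affinity enabled for this port
    port_group: Optional[str]   # None if the port is in no Port Group


def check_port_pair(primary: FcPort, secondary: FcPort) -> List[str]:
    """Return a list of violated requirements (empty list = pair is usable)."""
    problems = []
    for role, port in (("Primary", primary), ("Secondary", secondary)):
        if port.port_group is not None:
            problems.append(f"{role} {port.name}: member of Port Group {port.port_group}")
        if not port.host_affinity:
            problems.append(f"{role} {port.name}: Host Affinity is not enabled")
    if (primary.speed, primary.topology) != (secondary.speed, secondary.topology):
        problems.append("Speed/topology differ between Primary and Secondary port")
    return problems


if __name__ == "__main__":
    pri = FcPort("CM#0 CA#0 Port#1", "8G", "Fabric", True, None)
    sec = FcPort("CM#0 CA#0 Port#1", "8G", "Fabric", False, None)
    for line in check_port_pair(pri, sec):
        print(line)
```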
The volumes (TFOV) used by the Storage Cluster feature must be created
with identical size, and the host LUN numbers used in each LUN Group
must be identical on both ETERNUS DX S3 storage systems. Make sure
that nobody holds a lock on either of the two ETERNUS DX S3 storage
systems involved in the Storage Cluster feature (that is, nobody is
working with the ETERNUS DX S3 HW-GUI) while you configure the Storage
Cluster functionality.
Ports used for the REC Path of the Storage Cluster feature can be
configured as RA or CA/RA. The latter is not recommended because it
affects the performance of the Storage Cluster volumes (TFOV) used by
the business servers. In addition, you cannot attach business servers
that use the Storage Cluster feature to CA/RA ports. These business
server connections need dedicated CA ports. The REC Path must be
configured using ETERNUS SF V16.1 or the ETERNUS DX S3 HW-GUI;
otherwise the Storage Cluster setup cannot be configured.
Note: Do not remove LUNs from a LUN Group used by a TFO Group which is
in Phase = Maintenance!
Software
The Storage
Cluster functionality can be set up, configured, managed and checked
through the Web Console of the ETERNUS SF V16.1 Manager software.
There are two options for initiating a failover from the Primary
(active) storage system to the Secondary (standby) storage system:
- Automatic Failover
- Manual Failover
The Storage Cluster Monitoring function is provided by the ETERNUS SF
V16.1 Storage Cruiser Agent software.
Licenses
For each discovered ETERNUS DX S3 storage system used for the Storage
Cluster feature you need to purchase and register the following
licenses:
- ETERNUS SF Storage Cruiser V16 Standard License
- ETERNUS SF Storage Cruiser V16 Storage Cluster Option
- ETERNUS SF AdvancedCopy Manager V16 Remote Copy License
Storage Cluster setup and configuration
Install the ETERNUS SF V16.1 Manager software on one of your servers.
Additional information related to the installation can be found in the
ETERNUS SF Express V16 / Storage Cruiser V16 / AdvancedCopy Manager
V16 Installation and Setup Guide. Afterwards, discover your two
ETERNUS DX S3 storage systems and register all licenses required by
the Storage Cluster feature for both of them.
It is strongly recommended to discover the Fibre Channel (FC) switches
as well, so that SNMP traps are received in case of problems at the
switches. This can be done in a read-only way so that ETERNUS SF V16.1
is not able to modify any switch configuration. See the Appendix for
details about the read-only discovery of Fibre Channel (FC) switches.
It is also recommended to install the ETERNUS SF V16 Storage Cruiser
Agent software on your business servers and to discover these servers
in ETERNUS SF V16.1 as well. This enables the graphical end-to-end
correlation view within the ETERNUS SF V16 Manager GUI for these
servers.
Set up the WWPN zoning between the ETERNUS DX S3 storage systems and
your business servers at the Fibre Channel (FC) switches first.
Afterwards, start the setup of the Storage Cluster functionality. The
setup and configuration of the Storage Cluster feature must be done in
the ETERNUS SF V16.1 Manager GUI. All settings needed for the Storage
Cluster setup and configuration can be found under the Connectivity
and the Storage Cluster selection in the category pane of a discovered
ETERNUS DX S3 storage system in the ETERNUS SF V16.1 Manager GUI.
As a rule of thumb you should use self-explanatory names for all
configuration elements used by the Storage Cluster functionality, such
as FC Hosts (e.g. PRI_SRV01_HBA0 and SEC_SRV01_HBA0), LUN Groups (e.g.
PRI_SRV01_LG and SEC_SRV01_LG) and TFO Groups (e.g. DX600_to_DX500 or
DX500#1_DX500#2). Please be aware that any name you use for the
configuration of the Storage Cluster feature should not exceed 16
characters. In addition, you should always start the setup of the
Storage Cluster feature at the Primary (active) ETERNUS DX S3 storage
system within the ETERNUS SF V16.1 Manager GUI.
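The naming rule can be checked before anything is typed into the GUI. The following Python sketch assumes the PRI_/SEC_ prefix convention from the examples above; the helper names and the over-long example name are hypothetical.

```python
MAX_NAME_LENGTH = 16  # limit stated above for Storage Cluster related names


def too_long(names):
    """Return all names that exceed the 16 character limit."""
    return [name for name in names if len(name) > MAX_NAME_LENGTH]


def secondary_name(primary_name):
    """Derive the Secondary name from a Primary name (PRI_ -> SEC_)."""
    if not primary_name.startswith("PRI_"):
        raise ValueError(f"{primary_name!r} does not follow the PRI_ convention")
    return "SEC_" + primary_name[len("PRI_"):]


if __name__ == "__main__":
    planned = [
        "PRI_SRV01_HBA0",
        "PRI_SRV01_LG",
        "DX600_to_DX500",
        "DX500#1_DX500#2",
        "PRIMARY_SERVER01_HBA0",  # hypothetical name that breaks the rule
    ]
    print("Too long:", too_long(planned))
    print("Secondary name:", secondary_name("PRI_SRV01_HBA0"))
```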
Storage Cluster configuration
Switch to the Primary (active) ETERNUS DX S3 storage system within the
ETERNUS SF V16.1 Manager GUI and enter the Storage Cluster section.
There you will find everything related to the Storage Cluster TFO
Group and an entry point for creating a REC Path used by the Storage
Cluster feature.
Start by creating a REC Path. The REC Path is configured with the
standard REC Path configuration wizard. Additional information about
the creation of an ETERNUS DX S3 REC Path is available in the ETERNUS
SF V16 documentation. The protocols supported for the REC Path of the
Storage Cluster feature are FC and iSCSI. It is strongly recommended
to use at least one port of each CM of the two ETERNUS DX S3 storage
systems for the REC Path configuration.
Note: Because the REC configuration always runs in synchronous mode,
you should change the Priority Level on each ETERNUS DX S3 storage
system involved in the Storage Cluster functionality to the highest
number. This setting can be changed using the ETERNUS DX S3 HW-GUI
only. Go to Advanced Copy -> Settings -> Copy Path -> Modify REC
Multiplicity to modify the Priority Level.
After the REC Path configuration is done, create your Storage Cluster
TFO Group. The Set button in the Action pane is used both to create a
new Storage Cluster TFO Group and to modify an existing, selected one.
Select the Remote Disk Array from the list of available storage
systems. Because the creation of the Storage Cluster TFO Group was
started at the Primary (active) ETERNUS DX S3 storage system, the
Local option must be selected for the Primary Disk Array. Enter the
name of this TFO Group and choose your Failover Mode.
The Split Mode settings are related to the status of the REC Path. To
achieve application consistency in any case of automatic failover to
the Secondary (standby) ETERNUS DX S3 storage system, you may select
Read as the Split Mode.
Note: If you select the Read option, the business servers will get an
I/O error for write requests if the REC Path is broken.
Finally, select the Fibre Channel (FC) port pairs used by the Storage
Cluster feature.
Note: You cannot use CA/RA ports for the creation of the Fibre Channel
(FC) port pairs used by the Storage Cluster feature.
Storage Cluster allocating Business Volumes
The next step is to register all WWNs of your business servers' Host
Bus Adapters (HBAs). Open the Connectivity category, select Host and
press the Add FC Host button to set up the FC Host.
You should find all WWNs of your business servers' HBAs that are
already connected to Fibre Channel (FC) ports (e.g. CM#0 CA#0 Port#3)
of the Primary (active) ETERNUS DX S3 storage system. Identify the
Channel Adapter (CA) port on that ETERNUS DX S3 storage system and
register a name for each WWN. As mentioned above, use dedicated names
(e.g. PRI_SRV01_HBA0) to identify these FC Hosts for future reference.
Note down the WWNs, because each FC Host has to be registered manually
at the Secondary (standby) ETERNUS DX S3 storage system later on.
Enter the name of this FC Host, select the Host Response, press the
Next button and confirm your settings on the next screen.
Repeat these steps for all WWNs of your business servers' Host Bus
Adapters (HBAs) connected to the Primary (active) ETERNUS DX S3
storage system. After you have finished this part, set up the LUN
Group including the volumes for your business servers. Select
Affinity/LUN Group in the Connectivity category of the Primary
(active) ETERNUS DX S3 storage system and create the LUN Group.
Again, choose a self-explanatory name for the LUN Group used by the
Storage Cluster feature. Enter the Host LUN Number for the selected
volumes and add them to the list of Assigned Volumes. Press the Next
button and confirm your settings on the next screen.
Note: Write down the LUN No. and the Capacity of each volume added to
the list of Assigned Volumes. You will need them to create the
corresponding LUN Group at the Secondary (standby) ETERNUS DX S3
storage system later on.
Note: After adding the volumes to the LUN Group you need to check the
reservation status of each volume using the ETERNUS DX S3 HW-GUI. If
persistent reservations are left over, remove them from each volume
first. Select the volume and use the Release Reservation Action button
for this purpose. The picture below shows details about Reservation.
After finishing this part of the Storage Cluster feature setup, create
the Host Affinity for your business servers. Switch to the Host
Affinity section in the Connectivity category pane of the Primary
(active) ETERNUS DX S3 storage system.
Press the Create button and configure the Host Affinity using the
created FC Host and the associated LUN Group attached to the Fibre
Channel (FC) port of the Primary (active) ETERNUS DX S3 storage
system. Repeat this process for each WWN of your business servers'
Host Bus Adapters (HBAs).
Note: You cannot create the Host Affinity using Host Group, Port Group
and LUN Group in the ETERNUS DX S3 HW-GUI. This will not work with the
Storage Cluster feature.
Important Note: After removing a TFO volume from the LUN Group at the
Secondary (standby) ETERNUS DX S3 storage system you must change the
unique identifier (UID) of that volume before you use it as a Standard
Volume. For this purpose you can use the ETERNUS DX HW CLI set volume
command with the -uid parameter. (See the ETERNUS CLI User's Guide for
details.)
The Storage Cluster feature configuration is now nearly done at the
Primary (active) ETERNUS DX S3 storage system. Next, set up the
corresponding settings at the Secondary (standby) ETERNUS DX S3
storage system. Manually register the WWNs of each Host Bus Adapter
(HBA) belonging to your business servers, which you noted down while
creating each Host at the Primary (active) ETERNUS DX S3 storage
system. Use self-explanatory names (e.g. SEC_SRV01_HBA0,
SEC_SRV01_HBA1) for this process.
Enter all needed information in the input fields of each FC Host and
add it to the list. Press the Next button to confirm your settings.
Afterwards, create the corresponding Affinity/LUN Group and all Host
Affinities of your business servers' Host Bus Adapters (HBAs) at the
Secondary (standby) ETERNUS DX S3 storage system. Keep in mind that
the corresponding Affinity/LUN Group (e.g. SEC_SRV01_LG) must contain
the same number of volumes, with exactly the same LUN No. and exactly
the same Capacity for each volume added to the list of Assigned
Volumes. The procedure for all of these tasks is the same as at the
Primary (active) ETERNUS DX S3 storage system.
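The LUN No. and Capacity values noted down for the Primary LUN Group can be compared with the Secondary LUN Group before the TFO Group is put into operation. The following Python sketch works on values entered by hand as a mapping of host LUN number to capacity in MB; the function name and the example capacities are hypothetical.

```python
def compare_lun_groups(primary, secondary):
    """Compare two LUN Groups given as {host LUN number: capacity in MB}.

    Returns a list of differences; an empty list means both groups match.
    """
    problems = []
    for lun in sorted(primary.keys() - secondary.keys()):
        problems.append(f"LUN {lun} missing on Secondary")
    for lun in sorted(secondary.keys() - primary.keys()):
        problems.append(f"LUN {lun} missing on Primary")
    for lun in sorted(primary.keys() & secondary.keys()):
        if primary[lun] != secondary[lun]:
            problems.append(
                f"LUN {lun}: capacity {primary[lun]} MB vs {secondary[lun]} MB"
            )
    return problems


if __name__ == "__main__":
    pri_srv01_lg = {0: 30720, 1: 30720}
    sec_srv01_lg = {0: 30720, 1: 20480, 2: 10240}
    for line in compare_lun_groups(pri_srv01_lg, sec_srv01_lg):
        print(line)
```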
Storage Cluster Controller setup
As mentioned above, you should install the ETERNUS SF V16.1 Storage
Cruiser Agent software used for Storage Cluster Monitoring on a
dedicated server. Information about the installation of that software
can be found in the ETERNUS SF Express V16 / Storage Cruiser V16 /
AdvancedCopy Manager V16 Installation and Setup Guide. After the
installation has succeeded, you need to modify two configuration
files. This enables the ETERNUS SF V16.1 Storage Cruiser Agent to act
as the Storage Cluster Controller for your environment. The default
installation directory of the ETERNUS SF V16.1 Storage Cruiser Agent
software is C:\ETERNUS_SF. With the default installation the two files
(Correlation.ini and TFOConfig.ini) are located in the
C:\ETERNUS_SF\ESC\Agent\etc directory. Add the following lines at the
end of the Correlation.ini file:
#----------------
# Storage Cluster Controller Server configuration
#----------------
StorageClusterController=ON
The TFOConfig.ini file identifies the two ETERNUS DX S3 storage
systems used for the Storage Cluster functionality. Add the Master IP
address of each ETERNUS DX S3 storage system to that file. The
following example shows what the input should look like:
IP=192.168.100.60
IP=192.168.200.50
After both files have been modified, restart the ETERNUS SF V16.1
Storage Cruiser Agent to apply the settings. Additional details can be
found in the ETERNUS SF Storage Cruiser V16 Operation Guide. In
addition, you should discover the Storage Cluster Controller (using
the Storage Cruiser Agent functionality) within the ETERNUS SF V16
Manager software.
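If several Storage Cluster Controllers have to be prepared, the two file changes can be scripted. The following Python sketch is an illustration under assumptions: it uses only the lines shown above, assumes the files are plain text in the platform default encoding, and leaves open whether TFOConfig.ini may contain other entries (it only builds the IP lines). The function names are hypothetical, and the agent restart is not covered.

```python
import ipaddress
from pathlib import Path

CONTROLLER_BLOCK = (
    "#----------------\n"
    "# Storage Cluster Controller Server configuration\n"
    "#----------------\n"
    "StorageClusterController=ON\n"
)


def enable_controller(correlation_ini: Path) -> bool:
    """Append the controller block to Correlation.ini unless already present.

    Returns True if the file was changed.
    """
    text = correlation_ini.read_text() if correlation_ini.exists() else ""
    if "StorageClusterController=ON" in text:
        return False
    if text and not text.endswith("\n"):
        text += "\n"  # keep the appended block on its own lines
    correlation_ini.write_text(text + CONTROLLER_BLOCK)
    return True


def tfo_config_lines(master_ips) -> str:
    """Build the IP= lines for TFOConfig.ini, validating each address."""
    lines = []
    for ip in master_ips:
        ipaddress.ip_address(ip)  # raises ValueError for malformed addresses
        lines.append(f"IP={ip}\n")
    return "".join(lines)


if __name__ == "__main__":
    print(tfo_config_lines(["192.168.100.60", "192.168.200.50"]), end="")
```

Work on a copy of the files first and compare the result with the examples above before restarting the agent.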
Storage Cluster processing
Management of a TFO Group (such as changing the TFO Group Name,
Failover Mode or Split Mode) can be done at either the Primary
(active) or the Secondary (standby) ETERNUS DX S3 storage system.
Select the TFO Group and press the Set button in the Action pane for
this purpose. Modifications of LUN Groups used by the Storage Cluster
feature (such as adding or removing volumes) should always be started
at the Primary (active) ETERNUS DX S3 storage system. After this is
done, modify the corresponding LUN Group at the Secondary (standby)
ETERNUS DX S3 storage system.
Check the status of your Storage Cluster configuration at the Storage
Cluster Controller. If you used the default installation of the
ETERNUS SF V16.1 Storage Cruiser Agent software, you will find the CLI
script in C:\ETERNUS_SF\ESC\Agent\bin. The following is an example
output of the agtpatrol.bat CLI script:
C:\ETERNUS_SF\ESC\Agent\bin> agtpatrol.bat
--------------------------------------------------------------------------------
INTERVAL=1000
TARGET IP: 192.168.100.60 192.168.200.50
--------------------------------------------------------------------------------
TARGET TFO GROUP:
IP ADDRESS=192.168.100.60
GROUP NAME=DX600_to_DX500
TYPE=Primary
PAIR IP ADDRESS=192.168.200.50
PAIR GROUP NAME=DX600_to_DX500
STATUS=Normal
INTERVAL=1000
UPDATE TIME=Mon Jun 02 10:54:26 CEST 2014

IP ADDRESS=192.168.200.50
GROUP NAME=DX600_to_DX500
TYPE=Secondary
PAIR IP ADDRESS=192.168.100.60
PAIR GROUP NAME=DX600_to_DX500
STATUS=Normal
INTERVAL=1000
UPDATE TIME=Mon Jun 02 10:54:26 CEST 2014
Note: INTERVAL is the heartbeat rate in milliseconds configured on
each ETERNUS DX S3 storage system.
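The agtpatrol.bat output lends itself to a simple automated check. The following Python sketch parses captured output and reports every TFO Group whose STATUS is not Normal. It assumes that the script prints one KEY=VALUE pair per line and that each group starts with an IP ADDRESS line; both are assumptions about the output layout, and the function names are hypothetical.

```python
def parse_agtpatrol(text):
    """Parse agtpatrol output into one dict per TFO Group entry."""
    groups = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if "=" not in line:
            continue  # separators and headings such as "TARGET TFO GROUP:"
        key, _, value = line.partition("=")
        key = key.strip()
        if key == "IP ADDRESS":
            current = {}  # assumed start of a new group entry
            groups.append(current)
        if current is not None:
            current[key] = value.strip()
    return groups


def not_normal(groups):
    """Return all group entries whose STATUS differs from Normal."""
    return [g for g in groups if g.get("STATUS") != "Normal"]


SAMPLE = """\
INTERVAL=1000
TARGET IP: 192.168.100.60 192.168.200.50
TARGET TFO GROUP:
IP ADDRESS=192.168.100.60
GROUP NAME=DX600_to_DX500
TYPE=Primary
PAIR IP ADDRESS=192.168.200.50
STATUS=Normal
IP ADDRESS=192.168.200.50
GROUP NAME=DX600_to_DX500
TYPE=Secondary
PAIR IP ADDRESS=192.168.100.60
STATUS=Normal
"""

if __name__ == "__main__":
    for group in parse_agtpatrol(SAMPLE):
        print(group["IP ADDRESS"], group["TYPE"], group["STATUS"])
```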
Always use the Refresh button in the Action pane to update the TFO
Group Status so that you see the current status of your TFO Groups.
This creates a job running in the background that updates the TFO
Group Status.
A Manual Failover can only be triggered from the Storage Cluster
section of the Primary (active) ETERNUS DX S3 storage system. You will
not be able to press the Failover or Force-Failover button at the
Secondary (standby) ETERNUS DX S3 storage system.
After a failover has taken place, whether manual or automatic, you
will want to switch the business servers' host connections back to the
original ETERNUS DX S3 storage system. Before you can start this
operation you need to check some preconditions. Make sure that the
Primary (active) ETERNUS DX S3 storage system is up and running
without any hardware related issues. The REC Path connection between
the two ETERNUS DX S3 storage systems must be available and the
volumes used by the Storage Cluster feature must be in sync
(Equivalent). The latter must be checked in the details of the
associated TFO Group. There you can verify the status of the REC copy
process for each volume belonging to this TFO Group (switch the view
from Ports to Volumes). If everything is ready for switching the
business servers' host connections back to the Primary (active)
ETERNUS DX S3 storage system (Status = Active and Phase = Equivalent),
you can start the failback. As you can see in the picture below, this
action is not available at the Primary (active) ETERNUS DX S3 storage
system.
Therefore you need to switch the ETERNUS SF V16.1 GUI to the Secondary
(standby) ETERNUS DX S3 storage system and start the failback action
for the associated TFO Group from there.
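The failback preconditions listed above can be collected in a small checklist function. The following Python sketch is an illustration only: all inputs are entered by hand from what the GUI shows, and the function name is hypothetical.

```python
def failback_blockers(primary_healthy, rec_path_available, volumes):
    """Return the reasons why a failback should not be started yet.

    volumes: list of (volume name, Status, Phase) as shown in the
    Volumes view of the TFO Group details.
    """
    blockers = []
    if not primary_healthy:
        blockers.append("Primary storage system has hardware related issues")
    if not rec_path_available:
        blockers.append("REC Path between the storage systems is not available")
    for name, status, phase in volumes:
        if status != "Active" or phase != "Equivalent":
            blockers.append(f"{name}: Status={status}, Phase={phase}")
    return blockers


if __name__ == "__main__":
    volumes = [
        ("RM_TFO_VOL00", "Active", "Copying"),
        ("RM_TFO_VOL05", "Active", "Equivalent"),
    ]
    for reason in failback_blockers(True, True, volumes):
        print(reason)
```

An empty result means that all preconditions named in this section are met; it does not replace the check in the GUI itself.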
Storage Cluster bi-directional information and setup
The Storage Cluster feature can also be configured bi-directionally.
There is no need to create a new REC Path configuration for this
purpose. The existing REC Path between the two ETERNUS DX S3 systems
can be shared. However, you must use dedicated Fibre Channel (FC)
ports for the second TFO Group on each ETERNUS DX S3 system. You
cannot share Fibre Channel (FC) ports among TFO Groups. As a rule of
thumb you should use dedicated RAID Groups for active and passive TFO
Volumes on each ETERNUS DX S3 system involved in a bi-directional
Storage Cluster setup. The additional TFO Group, including all
required resources (Volumes, LUN Groups, FC Hosts and Host Affinity),
must be set up in the same way as described for the single-sided
configuration. The picture below gives an example of what such a
configuration could look like.
Note: The status of the two TFO Groups above differs. To always see
the latest status, press the Refresh button of the TFO Group Status,
which creates a job that updates all your TFO Groups. This needs to be
done from time to time.
Recovery procedure caused by defect RAID Group
If a Transparent Failover (automatic or manual) took place due to a
broken RAID Group at the Primary (active) ETERNUS DX S3 storage
system, special treatment is needed to recover the Storage Cluster
configuration after the broken RAID Group has been repaired. All
related configuration steps must be done at the Secondary (passive)
ETERNUS DX S3 storage system. First of all you must log in to the CLI
with a User Account of the Maintainer Role, because all these steps
must be executed using CLI commands of that ETERNUS DX S3 storage
system.
Preconditions
If your TFO Group is configured with Failover Mode = Manual, you need
to start the failover with Failover Force in advance. Identify your
TFO Group and then start the manual failover. In any case you must
make sure that all TFO Volumes are hosted by the Secondary (passive)
ETERNUS DX S3 storage system. (The Status of the Secondary TFO Group
must be Active.)
CLI> show tfo-groups
TFO Group No.     [0]
TFO Group Name    [DX600_to_DX500]
Type              [Secondary]
Status            [Standby]
Phase             [Maintenance]
Condition         [Normal]
Failover Mode     [Manual]
Split Mode        [Read/Write]
Monitor Interval  [-]
Pair Box ID       [00ETERNUSDXMS3ET603SAU####OF4621352001##]
Own Pair Port     [CM#0 CA#0 Port#1 CM#0 CA#0 Port#1]
                  [CM#1 CA#0 Port#1 CM#1 CA#0 Port#1]
CLI> forced tfo-group-activate -tfog-number 0 -active-mode manual-failover
CLI> show tfo-groups
TFO Group No.     [0]
TFO Group Name    [DX600_to_DX500]
Type              [Secondary]
Status            [Active]
Phase             [Maintenance]
Condition         [Normal]
Failover Mode     [Manual]
Split Mode        [Read/Write]
Monitor Interval  [-]
Pair Box ID       [00ETERNUSDXMS3ET603SAU####OF4621352001##]
Own Pair Port     [CM#0 CA#0 Port#1 CM#0 CA#0 Port#1]
                  [CM#1 CA#0 Port#1 CM#1 CA#0 Port#1]
Afterwards, stop the Transparent Failover Replication of the TFO
Volumes (Status = Error Suspend) located on the broken RAID Group of
the Primary (active) ETERNUS DX S3 storage system. Follow these steps
to fulfill this requirement:
1. Step
Identify volumes located on the broken RAID Group which are in Error
Suspend status.
CLI> show tfo-pair -tfog-number 0
TFO Group Name [DX600_to_DX500]
Host No. [8] Host Name [SEC_SRV01_HBA0]
Own Volume                             Pair Volume SID   Status        Phase            Error
No.   Name                             No.                                              Code
----- -------------------------------- ----------- ----- ------------- ---------------- -----
   11 RM_TFO_VOL00                               9    13 Error Suspend Equivalent       0x00
   16 RM_TFO_VOL05                              14     1 Active        Equivalent       0x00
Host No. [10] Host Name [SEC_SRV01_HBA1]
Own Volume                             Pair Volume SID   Status        Phase            Error
No.   Name                             No.                                              Code
----- -------------------------------- ----------- ----- ------------- ---------------- -----
   11 RM_TFO_VOL00                               9    13 Error Suspend Equivalent       0x00
   16 RM_TFO_VOL05                              14     1 Active        Equivalent       0x00
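With many hosts and volumes the listing from Step 1 becomes long. The following Python sketch filters captured output of show tfo-pair -tfog-number for rows in Error Suspend status and collects their session IDs (SID). It assumes whitespace-separated rows in the column order shown above and a single-word Phase value; the function name is hypothetical.

```python
import re

HOST = re.compile(r"Host No\.\s*\[(\d+)\]\s*Host Name\s*\[([^\]]*)\]")
# Own Volume No., Name, Pair Volume No., SID, Status (may contain a space),
# Phase, Error Code
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(.+?)\s+(\S+)\s+(0x[0-9A-Fa-f]+)\s*$"
)


def parse_tfo_pairs(text):
    """Return one dict per volume row, tagged with the preceding Host No."""
    rows = []
    host_no = None
    for line in text.splitlines():
        host = HOST.search(line)
        if host:
            host_no = int(host.group(1))
            continue
        row = ROW.match(line)
        if row:
            rows.append({
                "host_no": host_no,
                "volume_no": int(row.group(1)),
                "volume_name": row.group(2),
                "pair_volume_no": int(row.group(3)),
                "sid": int(row.group(4)),
                "status": row.group(5),
                "phase": row.group(6),
                "error_code": row.group(7),
            })
    return rows


SAMPLE = """\
TFO Group Name [DX600_to_DX500]
Host No. [8] Host Name [SEC_SRV01_HBA0]
11 RM_TFO_VOL00 9 13 Error Suspend Equivalent 0x00
16 RM_TFO_VOL05 14 1 Active Equivalent 0x00
Host No. [10] Host Name [SEC_SRV01_HBA1]
11 RM_TFO_VOL00 9 13 Error Suspend Equivalent 0x00
16 RM_TFO_VOL05 14 1 Active Equivalent 0x00
"""

suspended = [r for r in parse_tfo_pairs(SAMPLE) if r["status"] == "Error Suspend"]
session_ids = sorted({r["sid"] for r in suspended})

if __name__ == "__main__":
    print("Sessions in Error Suspend:", session_ids)
```

The collected session IDs are the values to pass to show tfo-pair -session-id when inspecting the affected copy sessions.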
2. Step
Get detailed information about the TFO Copy session with Status =
Error Suspend.
CLI> show tfo-pair -session-id 13
Own Volume No.              [11]
Own Volume Name             [RM_TFO_VOL00]
Pair Volume No.             [9]
Status                      [Error Suspend]
Phase                       [Equivalent]
Error Code                  [0x26]
Source Block Address        [0x0000000000000000LBA]
Destination Block Address   [0x0000000000000000LBA]
Total Data Size             [30720MB]
Copied Data Size            [29184MB]
Direction                   [From Local/To Remote]
Sync                        [Sync]
Recovery Mode               [Automatic]
Split Mode                  [Automatic]
Remote Session-ID           [13]
Remote Box-ID               [00ETERNUSDXMS3ET603SAU####OF4621352001##]
Time Stamp                  [2014-08-25 16:45:29]
Elapsed Time                [31 day 7 hour 36 min 30 sec]
Copy Range                  [Totally]
Secondary Access Permission [Read Only at Equivalency]
Concurrent Suspend Status   [Normal]
3. Step
Release the copy sessions of TFO Volumes which have the Status = Error
Suspend.
CLI> release tfo-pair -port 001 -host-number 8 -volume-number 11
4. Step
Restore the broken RAID Group and the associated volumes used as TFO
Volumes at the Primary (active) ETERNUS DX S3 storage system. Please
see the maintenance manual for RAID Group recovery. There are two
possibilities for recovering the failed RAID Group:
- RAID Forced Recovery
- Recovery by [DISK Hot Maintenance]
You can use the RAID Forced Recovery option if you think the disks are
still OK and the RAID Group failure was caused by another event, e.g.
a DE failure. If you think the disks are really broken, choose
Recovery by [DISK Hot Maintenance]. The next screenshots are examples
of Recovery by [DISK Hot Maintenance] from the ETERNUS DX HW-GUI.
Identify and exchange the broken disks.
After the broken disks have been exchanged, the status of the RAID
Group is Available, but the volumes are in status Readying. The next
screenshot is an example of how this information is shown in the
ETERNUS DX HW-GUI.
All volumes which are in status Readying must be formatted first.
Note: The volumes must be formatted using the ETERNUS CLI or the
ETERNUS SF V16.x Manager. If you try to format them using the ETERNUS
HW-GUI, you will get the following error message:
5. Step
Go back to the CLI of the Secondary (passive) ETERNUS DX S3 storage
system using a User Account of the Maintainer Role and restart the
Transparent Failover Replication of the TFO Volumes.
CLI> recover tfo-pair -port 001 -host-number 8 -volume-number 11 -recovery-target primary
This starts a new initial copy of the TFO Volumes hosted by the
formerly broken RAID Group. Once the copy has succeeded, you can
switch the host access back (Failback) to the Primary (active) ETERNUS
DX S3 storage system.
6. Step
Afterwards you may want to check the status of the TFO Group and the
associated TFO Volumes belonging to that TFO Group.
CLI> show tfo-pair -tfog-number 0
TFO Group Name [DX600_to_DX500]
Host No. [8] Host Name [SEC_SRV01_HBA0]
Own Volume                             Pair Volume SID   Status        Phase            Error
No.   Name                             No.                                              Code
----- -------------------------------- ----------- ----- ------------- ---------------- -----
   11 RM_TFO_VOL00                               9     2 Copying       Equivalent       0x00
   16 RM_TFO_VOL05                              14     1 Active        Equivalent       0x00
Host No. [10] Host Name [SEC_SRV01_HBA1]
Own Volume                             Pair Volume SID   Status        Phase            Error
No.   Name                             No.                                              Code
----- -------------------------------- ----------- ----- ------------- ---------------- -----
   11 RM_TFO_VOL00                               9     2 Copying       Equivalent       0x00
   16 RM_TFO_VOL05                              14     1 Active        Equivalent       0x00
CLI> show tfo-pair -session-id 2
Own Volume No.              [11]
Own Volume Name             [RM_TFO_VOL00]
Pair Volume No.             [9]
Status                      [Active]
Phase                       [Copying]
Error Code                  [0x00]
Source Block Address        [0x0000000000000000LBA]
Destination Block Address   [0x0000000000000000LBA]
Total Data Size             [30720MB]
Copied Data Size            [6144MB]
Direction                   [From Local/To Remote]
Sync                        [Sync]
Recovery Mode               [Automatic]
Split Mode                  [Automatic]
Remote Session-ID           [6]
Remote Box-ID               [00ETERNUSDXMS3ET603SAU####OF4621352001##]
Time Stamp                  [0000-00-00 00:00:00]
Elapsed Time                [0 day 0 hour 1 min 51 sec]
Copy Range                  [Totally]
Secondary Access Permission [Read Only at Equivalency]
Concurrent Suspend Status   [Normal]
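The progress of the new initial copy can be derived from the Total Data Size and Copied Data Size fields of the session details. The following Python sketch parses captured output of show tfo-pair -session-id, assuming one "Name [value]" field per line and sizes given in MB as in the example above; the function names are hypothetical.

```python
import re

FIELD = re.compile(r"^\s*(.+?)\s*\[(.*)\]\s*$")
SIZE_MB = re.compile(r"(\d+)MB")


def parse_fields(text):
    """Turn 'Name [value]' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def copy_progress_percent(fields):
    """Return the copied share of the session in percent."""
    total = int(SIZE_MB.fullmatch(fields["Total Data Size"]).group(1))
    copied = int(SIZE_MB.fullmatch(fields["Copied Data Size"]).group(1))
    return 100 * copied / total


SAMPLE = """\
Own Volume Name             [RM_TFO_VOL00]
Status                      [Active]
Phase                       [Copying]
Total Data Size             [30720MB]
Copied Data Size            [6144MB]
"""

if __name__ == "__main__":
    fields = parse_fields(SAMPLE)
    print(f"{fields['Own Volume Name']}: {copy_progress_percent(fields):.1f} % copied")
```

For the session shown in Step 6, 6144 MB of 30720 MB have been copied, which corresponds to 20 %.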
Appendix
This appendix collects some helpful hints for working with the
Storage Cluster feature of ETERNUS DX S3 storage systems. You
should examine the zone configuration of your Fibre Channel (FC)
switches from time to time. In particular, check the switch ports
involved in the Storage Cluster functionality for issues such as
"Duplicate Port WWN detected". In the following example, port 16
shows this condition:

Switch01:admin> switchshow
switchName:     Switch01
switchType:     66.1
switchState:    Online
switchMode:     Native
switchRole:     Subordinate
switchDomain:   169
switchId:       fffca9
switchWwn:      10:00:00:05:1e:83:12:aa
zoning:         ON (My_Fabric2)
switchBeacon:   OFF
FC Router:      OFF
FC Router BB Fabric ID: 1
Address Mode:   0
Fabric Name:    My_Fabric1

Index Port Address Media Speed State     Proto
==================================================
  0   0   a90000   id    8G   Online    FC  F-Port  10:00:00:90:fa:50:34:52
  1   1   a90100   id    8G   Online    FC  F-Port  10:00:00:90:fa:50:3e:60
  2   2   a90200   id    8G   Online    FC  F-Port  21:00:00:24:ff:53:36:6f
  3   3   a90300   id    8G   No_Sync   FC  Disabled (Persistent)
  4   4   a90400   id    8G   No_Sync   FC  Disabled (Persistent)
  5   5   a90500   id    8G   Online    FC  F-Port  21:00:00:24:ff:53:36:71
  6   6   a90600   id    8G   No_Sync   FC  Disabled (Persistent)
  7   7   a90700   id    8G   In_Sync   FC  Disabled (Persistent)
  8   8   a90800   id    8G   No_Light  FC  Disabled (Persistent)
  9   9   a90900   id    8G   In_Sync   FC  Disabled (Persistent)
 10  10   a90a00   id    8G   Online    FC  F-Port  50:00:00:e0:da:80:68:20
 11  11   a90b00   id    8G   Online    FC  F-Port  50:00:00:e0:da:80:43:20
 12  12   a90c00   id    8G   No_Light  FC
 13  13   a90d00   id    8G   Online    FC  F-Port  10:00:00:90:fa:50:34:1d
 14  14   a90e00   id    8G   No_Sync   FC  Disabled
 15  15   a90f00   id    8G   Online    FC  F-Port  50:00:00:e0:da:80:43:23
 16  16   a91000   id    8G   No_Sync   FC  Disabled (Persistent) (Duplicate Port WWN detected)
 17  17   a91100   --    8G   No_Module FC
 18  18   a91200   --    8G   No_Module FC
 19  19   a91300   --    8G   No_Module FC
 20  20   a91400   --    8G   No_Module FC
 21  21   a91500   --    8G   No_Module FC
 22  22   a91600   --    8G   No_Module FC
 23  23   a91700   --    8G   No_Module FC
 24  24   a91800   id    N8   Online    FC  F-Port  50:00:00:e0:d4:00:01:91
 25  25   a91900   id    N8   Online    FC  F-Port  50:00:00:e0:d4:00:01:92
 26  26   a91a00   id    8G   No_Light  FC
 27  27   a91b00   id    8G   No_Light  FC
 28  28   a91c00   --    8G   No_Module FC
 29  29   a91d00   --    8G   No_Module FC
 30  30   a91e00   --    8G   No_Module FC
 31  31   a91f00   --    8G   No_Module FC
 32  32   a92000   id    8G   No_Light  FC
 33  33   a92100   id    8G   No_Light  FC
 34  34   a92200   id    N8   Online    FC  E-Port  10:00:00:27:f8:3d:bb:a7 "Switch99" (upstream)(Trunk master)
 35  35   a92300   id    N8   Online    FC  E-Port  (Trunk port, master is Port 34 )
 36  36   a92400   id    8G   Online    FC  F-Port  21:00:00:24:ff:53:36:58
 37  37   a92500   id    8G   Online    FC  F-Port  21:00:00:24:ff:53:37:2a
 38  38   a92600   id    N8   Online    FC  F-Port  50:00:00:e0:d4:00:00:90
 39  39   a92700   id    8G   No_Light  FC
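Checking a long switchshow listing by eye is error-prone, so the periodic check described above can be scripted. The following Python snippet is a minimal sketch that assumes the switchshow output has been saved as text with one port per line; the regular expression is derived only from the example listing in this document, and the function name duplicate_wwn_ports is hypothetical.

```python
import re

# Three port lines in the layout of the switchshow example above.
SAMPLE = """\
 14  14   a90e00   id    8G   No_Sync   FC  Disabled
 15  15   a90f00   id    8G   Online    FC  F-Port  50:00:00:e0:da:80:43:23
 16  16   a91000   id    8G   No_Sync   FC  Disabled (Persistent) (Duplicate Port WWN detected)
"""

# Index, port, address, media, speed, state, protocol, then free-form remarks.
PORT_LINE = re.compile(
    r"^\s*(?P<index>\d+)\s+(?P<port>\d+)\s+(?P<address>[0-9a-f]{6})\s+"
    r"(?P<media>\S+)\s+(?P<speed>\S+)\s+(?P<state>\S+)\s+(?P<proto>\S+)"
    r"\s*(?P<rest>.*)$"
)


def duplicate_wwn_ports(switchshow_text):
    """Return (port, state) for every port flagged with a duplicate WWN."""
    hits = []
    for line in switchshow_text.splitlines():
        match = PORT_LINE.match(line)
        if match and "Duplicate Port WWN detected" in match.group("rest"):
            hits.append((int(match.group("port")), match.group("state")))
    return hits


print(duplicate_wwn_ports(SAMPLE))  # prints: [(16, 'No_Sync')]
```

Header lines such as switchName or the column titles do not start with a port index, so the pattern skips them without special handling.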
As already mentioned above, use the Refresh button in the Storage
Cluster Overview section of the ETERNUS SF V16.1 Manager GUI to
update the status of your TFO Groups. The Set action can be used to
create a new TFO Group or to modify an existing one, so check the
settings carefully before you press the Set button.
Fibre Channel Switch read-only discovery
First you need to configure your Fibre Channel (FC) switch. Log in
to the switch using an administrator account and create the user
account that ETERNUS SF V16 will use later on. If you are using the
CLI of the switch, the command for creating the user looks like
this:
MySwitch:admin> userconfig --add ETSFuser -r user
[Syntax: userconfig --add -r user]
Afterwards you need to set a password for this user. The CLI
command for this is:
MySwitch:admin> passwd ETSFuser
[Syntax: passwd ]
The last configuration step on the Fibre Channel (FC) switch is to
create a read-only SNMP community. Again you can use the CLI of the
switch to create the dedicated read-only SNMP community that
ETERNUS SF V16 will use later on. ETERNUS SF V16 requires an SNMPv1
community. You can modify the well-known read-only community public
and change it to, e.g., ETSFsnmp for this purpose. The CLI command
for this is:
MySwitch:admin> snmpconfig --set snmpv1
Community (rw): [Secret C0de]
Trap Recipient's IP address : [0.0.0.0]
Community (rw): [OrigEquipMfr]
Trap Recipient's IP address : [0.0.0.0]
Community (rw): [private]
Trap Recipient's IP address : [0.0.0.0]
Community (ro): [public] ETSFsnmp
Trap Recipient's IP address : [0.0.0.0]
Community (ro): [common]
Trap Recipient's IP address : [0.0.0.0]
Community (ro): [FibreChannel]
Trap Recipient's IP address : [0.0.0.0]
You might need to call snmpconfig --set accessControl to set or
change access-control-related parameters afterwards.
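To double-check which community names result from an interactive snmpconfig --set snmpv1 session, a saved transcript of the dialogue can be evaluated offline. The following Python snippet is a rough sketch under two assumptions: each prompt sits on its own line, and pressing Enter without typing keeps the value shown in brackets. The function name resulting_communities is made up for this illustration.

```python
import re

# Part of a transcript in the layout of the dialogue above.
# Text typed after the bracketed value is the operator's new entry.
TRANSCRIPT = """\
Community (rw): [private]
Trap Recipient's IP address : [0.0.0.0]
Community (ro): [public] ETSFsnmp
Trap Recipient's IP address : [0.0.0.0]
Community (ro): [common]
Trap Recipient's IP address : [0.0.0.0]
"""

PROMPT = re.compile(r"^Community \((rw|ro)\): \[([^\]]*)\][ \t]*(.*)$")


def resulting_communities(transcript):
    """Return (access, name) pairs as they stand after the dialogue."""
    result = []
    for line in transcript.splitlines():
        match = PROMPT.match(line.strip())
        if match:
            access, current, typed = match.groups()
            # Assumption: empty input keeps the current (bracketed) value.
            result.append((access, typed.strip() or current))
    return result


read_only = [name for access, name in resulting_communities(TRANSCRIPT)
             if access == "ro"]
print(read_only)  # prints: ['ETSFsnmp', 'common']
```

The read-only list should contain the community name that is entered later in ETERNUS SF V16 for the switch discovery.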
Open the ETERNUS SF V16 Manager GUI and discover the Fibre Channel
(FC) switch using the user account and the SNMP community just
created on that switch.
Status of TFO Group Information
Active/Standby
*1: "Unknown" has a meaning common to all the statuses, so it is
omitted hereinafter.
Phase
Status
Halt Factor
Recommendations
Notes on the multipath settings of your Business Servers:
Linux:
no_path_retry
Specifies the number of retries before queueing is disabled, or
"fail" for immediate failure (no queueing), or "queue" to never
stop queueing. Default is 0.
For the Storage Cluster function with an ETERNUS DX S3 storage
system you need to specify: "no_path_retry 10"
fast_io_fail_tmo
The fast_io_fail_tmo setting for an FC remote port, in seconds. If
an rport has vanished from the fabric, all I/O to the devices on
that port is terminated after this timeout. It should be smaller
than the dev_loss_tmo setting. Default is 5.
Value given in the (Fibre Channel/FCoE/iSCSI/SAS) for Linux
device-mapper multipath document:
"fast_io_fail_tmo 1"
Windows:
Windows Server 2012 R2 / Windows Server 2012 / Windows Server 2008
R2 / Windows Server 2008 Standard Multipath Driver (msdsm) Notes
Various settings, such as the load balance policy and retry count,
can be adjusted by using the standard multipath drivers (msdsm) for
Windows Server 2012 R2, Windows Server 2012, Windows Server 2008 R2
or Windows Server 2008. However, the following settings should not
be changed from their default values.

Screen name | Parameters that may not be changed
MPIO tab of Multi-Path Disk Device properties | Load balance policy, [Details] button, [Edit] button
Details of DSM | Timer counter (path checking period, enable path checking, number of retries, retry interval, PDO deletion period)
Details of MPIO paths | Path status
Notes on Host Response settings:
Do not use different Host Response settings (Active-Active (A-A) or
Active-Active Preferred (A-A/P)) for the Primary (active) and the
Secondary (passive) ETERNUS DX S3 storage system. Both systems must
use the same setting.
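For the Linux multipath values recommended in this section (no_path_retry 10 and fast_io_fail_tmo 1), a small script can flag deviations in a multipath.conf file. The following Python snippet is a minimal sketch: the sample configuration is invented for illustration, the function name find_deviations is hypothetical, and the check deliberately ignores which section of the file a setting appears in (the last occurrence wins).

```python
import re

# Values recommended in this section for Storage Cluster.
RECOMMENDED = {"no_path_retry": "10", "fast_io_fail_tmo": "1"}

# Invented example content of a multipath.conf file.
SAMPLE_CONF = """\
defaults {
    user_friendly_names yes
    no_path_retry       10
    fast_io_fail_tmo    5
}
"""

# A setting line is a keyword followed by an optionally quoted value.
SETTING = re.compile(r'^\s*(\w+)\s+"?([^"#\s]+)"?')


def find_deviations(conf_text, recommended=RECOMMENDED):
    """Return {key: (found, expected)} for missing or differing keys."""
    found = {}
    for line in conf_text.splitlines():
        match = SETTING.match(line)
        if match and match.group(1) in recommended:
            found[match.group(1)] = match.group(2)
    return {
        key: (found.get(key), expected)
        for key, expected in recommended.items()
        if found.get(key) != expected
    }


print(find_deviations(SAMPLE_CONF))  # prints: {'fast_io_fail_tmo': ('5', '1')}
```

An empty result means that both keywords are present with the recommended values. A key reported with None as the found value is missing from the file, so the default applies.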
TFO Checklist
Quick Checklist for TFO Configurations

Step | Action | Important Note
1 | Check the firmware of both ETERNUS DX S3 systems | A minimum of V10L20-0000 is required
2 | Check that the latest version of ETERNUS SF Manager, including all latest patches, is installed | ETERNUS SF V16.1 or higher
3 | Discover both ETERNUS DX S3 systems in ETERNUS SF Manager |
4 | Discover the FC Switches | Read-Only Mode is recommended
5 | Check the licenses for both ETERNUS DX S3 systems | ETERNUS SF Storage Cruiser V16 Standard License, ETERNUS SF Storage Cruiser V16 Storage Cluster Option, ETERNUS SF AdvancedCopy Manager V16 Remote Copy License
6 | Configure FC Zoning or Direct Cabling for the REC Path | A minimum of 1 path per CM is recommended
7 | Configure the REC Path with ETERNUS SF Manager or the ETERNUS DX S3 HW-GUI | RA only Ports are recommended
8 | Configure a Storage Cluster TFO Group | Use Split Mode --> Read to achieve Application Consistency. Only CA Ports can be used.
9 | Configure FC Zoning from the Business Server(s) to the Primary ETERNUS DX S3 system | Only WWPN based zoning is supported for TFO
10 | Create the Business LUNs on both ETERNUS DX S3 systems | Be sure that the LUNs on both ETERNUS DX S3 systems have the same size
11 | Register the HBAs of the Business Server(s) on the Primary ETERNUS DX S3 | Be sure to use the same Host Response settings on both ETERNUS DX S3 arrays. Note down the used Host WWPNs for later usage.
12 | Create an Affinity/LUN Group on the Primary ETERNUS DX S3 and add the Business LUNs | Note down the used Host LUN Numbers for later usage
13 | Check the reservation status of each volume using the ETERNUS DX S3 HW-GUI | Remove existing reservations from each volume
14 | Create a Host Affinity for the registered HBAs, the ports and the created LUN Group on the Primary ETERNUS DX S3 | It is not possible to use Host Group, Port Groups, LUN Group mechanism from the HW-GUI in TFO configurations
15 | Register the HBAs of the Business Server(s) on the Secondary ETERNUS DX S3 | Be sure to use the same Host Response settings on both ETERNUS DX S3 arrays. Add the Host WWPNs manually (info from step 11)
16 | Create an Affinity/LUN Group on the Secondary ETERNUS DX S3 and add the Business LUNs | Be sure to use the same Host LUN Numbers as configured for the Primary ETERNUS DX S3
17 | Create a Host Affinity for the registered HBAs, the ports and the created LUN Group on the Secondary ETERNUS DX S3 | It is not possible to use Port Groups in TFO configurations. Be sure to use the Standby Ports for this Host Affinity.
18 | Check the TFO Group and TFO Volume Status |
19 | Install the ETERNUS SF Storage Cruiser Agent as Monitoring instance (Storage Cluster Controller) | Only Windows OS is supported
20 | Modify the ETERNUS SF Storage Cruiser Agent Configuration files | Correlation.ini and TFOConfig.ini
21 | Restart the ETERNUS SF Storage Cruiser Agent Service |
22 | Discover the Storage Cluster Controller Server in the ETERNUS SF Manager |
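Several notes in the checklist are symmetry rules between the two arrays (steps 1, 10, 11, 15 and 16), which makes them easy to verify from an inventory before going live. The following Python snippet is a sketch under stated assumptions: the Array class, the example values and the function names are hypothetical, the data would have to be collected by hand or by your own tooling, and firmware strings are assumed to compare by their numeric parts.

```python
import re
from dataclasses import dataclass

MIN_FIRMWARE = "V10L20-0000"  # minimum firmware from step 1


def firmware_key(version):
    """Split a string such as 'V10L20-0000' into a comparable tuple.

    Assumption: versions order by the three numeric parts.
    """
    match = re.fullmatch(r"V(\d+)L(\d+)-(\d+)", version)
    if not match:
        raise ValueError(f"unexpected firmware string: {version!r}")
    return tuple(int(part) for part in match.groups())


@dataclass
class Array:
    name: str
    firmware: str
    host_response: str
    luns: dict  # Host LUN number -> volume size in MB


def check_pair(primary, secondary):
    """Return a list of findings; an empty list means no rule is violated."""
    problems = []
    for array in (primary, secondary):
        if firmware_key(array.firmware) < firmware_key(MIN_FIRMWARE):
            problems.append(f"{array.name}: firmware {array.firmware} "
                            f"is below {MIN_FIRMWARE}")
    if primary.host_response != secondary.host_response:
        problems.append("Host Response settings differ")
    if set(primary.luns) != set(secondary.luns):
        problems.append("Host LUN numbers differ")
    for lun in sorted(set(primary.luns) & set(secondary.luns)):
        if primary.luns[lun] != secondary.luns[lun]:
            problems.append(f"Host LUN {lun}: volume sizes differ")
    return problems


# Hypothetical inventory of a Primary and a Secondary system.
primary = Array("DX600", "V10L20-0000", "A-A", {0: 30720, 1: 30720})
secondary = Array("DX500", "V10L16-0000", "A-A", {0: 30720, 1: 20480})
for finding in check_pair(primary, secondary):
    print(finding)
```

With the example values, the sketch reports the outdated firmware of the second system and the size mismatch on Host LUN 1. The remaining checklist steps still need to be verified in the GUI.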
Abbreviations

Shortcut | Description
CA | Abbreviation of ETERNUS DX Channel Adapter
CM | Abbreviation of ETERNUS DX Controller Module
LUN | Abbreviation of Logical Unit Number
TFO | Abbreviation of Transparent Failover. In the Storage Cluster feature, it means that failover is performed transparently for the operation server.
TFOV | Abbreviation of TFO Volume. A volume assigned in a Storage Cluster configuration.
TFO Group | A group managing connection configuration, policies, states and maintenance for failover. It includes one or more Fibre Channel (FC) CA ports and the volumes that may be accessed through these CA ports. The state of a TFO Group is Active (accessible from the operation server) or Standby (not accessible from the operation server).
CA Port Pair | The Storage Cluster feature performs failover by sharing a common WWN/WWPN between the Fibre Channel (FC) CA ports of two ETERNUS DX S3 storage systems and by controlling the link state of each Fibre Channel (FC) CA port. This operation is called CA Port Pairing, and a pair of Fibre Channel (FC) CA ports sharing a common WWN/WWPN is called a CA Port Pair.
WWN / WWPN | Abbreviation of World Wide Name / World Wide Port Name
The diagram below illustrates the different components used by a
TFO Group of the Storage Cluster feature, such as the TFO Group
including TFOVs, Affinity Groups and CA Port Pairs.
Contact
FUJITSU Limited
Address: Shiodome City Center, 5-2, Higashi-shimbashi 1-Chome,
Minato-ku, Tokyo 105-7123, Japan
Website: www.fujitsu.com/eternus
2014 Fujitsu, the Fujitsu logo, [other Fujitsu trademarks
/registered trademarks] are trademarks or registered trademarks of
Fujitsu Limited in Japan and other countries. Other company,
product and service names may be trademarks or registered
trademarks of their respective owners. Technical data subject to
modification and delivery subject to availability. Any liability
that the data and illustrations are complete, actual or correct is
excluded. Designations may be trademarks and/or copyrights of the
respective manufacturer, the use of which by third parties for
their own purposes may infringe the rights of such owner.
[Diagram: TFO Group components. The Primary storage and the
Secondary storage each contain CA #0 and CA #1, TFOV#0 to TFOV#2,
and Affinity Group #0 and Affinity Group #1. The CA ports of the two
systems form CA Port Pairs. Each system holds a TFO Group; the two
TFO Groups correspond to each other, one being Active and the other
Standby.]