MQSeries Auto Channel Recovery William Hao Communications Middleware Worldwide Technical Operations July 11, 2002


Contents

• Overview of Worldspan’s Current MQ Connectivity

• Summary of MQ Channel Issues

• Solutions to MQ Channel Issues

• Conclusion/Benefits


Overview of Worldspan’s Current MQ Connectivity

[Diagram: TPF Loosely-Coupled Complex; UNIX MQ Hub; WIN Servers; Remote MQ Connections (OS/390, UNIX, UNISYS, others)]


Summary of MQ Channel Issues


Summary of MQ Channel Issues

• Automated Channel Retry in TPF (PJ28758) was not yet available at the time.

[Diagram: TPF, channel restart]


Summary of MQ Channel Issues (cont.)

• Message sequence numbering between sender and receiver channel pair gets out of sync.

Sequence numbers are generated at the sending end of the channel and are incremented before being used, which means that the current sequence number is that of the last message sent. The number is filed for the last message transferred in each batch and is used during channel start-up to ensure that both ends agree on which messages have been transferred successfully.

[Diagram: two MQ servers, a SENDER and a RECEIVER, whose message sequence numbers disagree (Msg seq = 00123 vs. Msg seq = 00113)]
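The start-up check described above can be sketched in a few lines. This is an illustration only, not MQSeries code: the function names are hypothetical, and the wrap value of 999999999 is an assumption made for the example.

```python
def next_seq(last_seq: int, wrap: int = 999_999_999) -> int:
    """Increment before use, so the result is the number of the message
    about to be sent. The wrap value is an assumption for this sketch."""
    return 1 if last_seq >= wrap else last_seq + 1


def channel_in_sync(sender_last_seq: int, receiver_last_seq: int) -> bool:
    """At channel start-up both ends compare the number filed for the last
    message of the last batch; any difference means they are out of sync."""
    return sender_last_seq == receiver_last_seq


# The two values shown on the slide disagree, so the channel would not start.
slide_values_agree = channel_in_sync(123, 113)
```

When the two numbers disagree, the channel has to be RESET before it can start again, which is what the automated functions later in this deck do.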


Summary of MQ Channel Issues (cont.)

• Sender channels go into INDOUBT status.

In MQ, messages are always transferred individually; however, they are committed or backed out as a batch. When MQ commits a batch, it syncpoints a logical unit of work (LUW). If this syncpoint procedure is interrupted, an INDOUBT channel condition may occur.
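A simplified model of the sender side may help picture where the INDOUBT window sits. The class and method names below are hypothetical and the model is a sketch, not MQ's internal implementation: it assumes the sender is in doubt from the moment it asks the receiver to confirm a batch until the confirmation arrives.

```python
from enum import Enum


class BatchState(Enum):
    SENDING = "sending"
    INDOUBT = "indoubt"
    COMMITTED = "committed"


class SenderBatch:
    """Hypothetical model of one batch on a sender channel."""

    def __init__(self) -> None:
        self.messages = []
        self.state = BatchState.SENDING

    def send(self, msg: str) -> None:
        # Messages travel one at a time while the batch is still open.
        if self.state is not BatchState.SENDING:
            raise RuntimeError("batch is already closed")
        self.messages.append(msg)

    def request_confirmation(self) -> None:
        # End of batch: the sender asks the receiver to confirm.
        # Until the answer arrives, the sender cannot know the outcome.
        self.state = BatchState.INDOUBT

    def confirmation_received(self) -> None:
        # The syncpoint completed; the whole batch is committed.
        self.state = BatchState.COMMITTED


batch = SenderBatch()
batch.send("msg-1")
batch.send("msg-2")
batch.request_confirmation()
state_if_link_drops_now = batch.state  # the channel would stay INDOUBT
```

If the connection is lost while the batch is in the INDOUBT state, the sender has to be told whether to commit or back out the batch.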


Summary of MQ Channel Issues (cont.)

• TPF rcvr chl shows READY but partner sdr chl in UNIX cannot establish channel connection.

• UNIX rcvr chl shows RUNNING but partner sdr chl in TPF cannot establish channel connection.

[Diagram: two MQ servers; one issues "start chl" while the partner channel shows "ready / running"]


Solutions to MQ Channel Issues


Automated Channel Recovery Function in TPF

Cycle to NORM activates a time-initiated auto-chl recovery function which has the following features:

• First time around, START all sdr chls.

• CRETs to itself every minute.

• Check status of all sdr chls and perform necessary action.

• Can be activated or deactivated via functional entry.
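The features listed above can be sketched as a small driver. This is an illustration in Python, not the TPF implementation: the class, its attributes, and the `start` and `check` callbacks are hypothetical, and the sketch assumes the first pass only issues START and leaves status checks to later passes.

```python
from typing import Callable, Iterable


class AutoChannelRecovery:
    """Hypothetical once-a-minute recovery driver for sender channels."""

    def __init__(self, channels: Iterable[str],
                 start: Callable[[str], None],
                 check: Callable[[str], None]) -> None:
        self.channels = list(channels)
        self.start = start        # issues START for one channel
        self.check = check        # checks status and takes action
        self.active = True        # toggled by an operator entry
        self.first_pass = True

    def tick(self) -> None:
        """One invocation; a timer is assumed to call this every minute."""
        if not self.active:
            return
        if self.first_pass:
            # First time around: START all sender channels.
            for name in self.channels:
                self.start(name)
            self.first_pass = False
            return
        # Later passes: check each sender channel and act as needed.
        for name in self.channels:
            self.check(name)


log = []
recovery = AutoChannelRecovery(
    ["CHL.A", "CHL.B"],
    start=lambda name: log.append(("start", name)),
    check=lambda name: log.append(("check", name)),
)
recovery.tick()  # first pass
recovery.tick()  # second pass
```

Keeping the activation flag inside the driver mirrors the slide's point that the function can be switched on or off without touching the schedule.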


Automated Channel Reset for TPF

Is the sender chl status neither READY nor INDOUBT?

YES: RESET and START the sender channel


Automated Channel Resolve for TPF

The sdr chl goes into INDOUBT status if it is in doubt with the partner rcvr chl about which msgs have been sent and received. In this situation, the sdr chl has to be told whether to COMMIT or BACKOUT these msgs. Although this condition rarely occurs, it requires manual intervention to resynchronize the channels via functional entry.


Automated Channel Resolve for TPF

Is the sender chl status INDOUBT?

YES: RESOLVE, RESET and START the sender channel
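Taken together, the TPF reset and resolve flowcharts amount to a small decision table. The function below is a sketch of that table; the function name and the plain-string statuses and actions are assumptions for illustration, not TPF entries.

```python
from typing import List


def tpf_actions(status: str) -> List[str]:
    """Map a TPF sender channel status to the recovery actions on the slides."""
    if status == "INDOUBT":
        # Resolve flowchart: RESOLVE, then RESET and START.
        return ["RESOLVE", "RESET", "START"]
    if status != "READY":
        # Reset flowchart: neither READY nor INDOUBT.
        return ["RESET", "START"]
    # READY: the channel is healthy, nothing to do.
    return []
```

For example, `tpf_actions("INDOUBT")` yields all three actions in order, while `tpf_actions("READY")` yields an empty list.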


Automated Channel Retry for UNIX

MQSeries v5.2 on UNIX has a built-in channel retry mechanism, which may be used in conjunction with the following channel attributes:

• SHORTRTY – Short retry count is the maximum number of times the sdr chl will try to allocate a session to its partner during short retry mode (set at 60).

• SHORTTMR – Short retry timer is the interval in seconds that the sdr chl waits before retrying to establish a chl connection during short retry mode (set at 60 sec).

• LONGRTY – Long retry count applies after the SHORTRTY attempts are used up (set at 999999999).


Automated Channel Retry for UNIX (cont.)

• LONGTMR – Long retry timer is the interval in seconds between retries during long retry mode (set at 1200 sec, i.e. 20 min).

• HBINT – Heartbeat interval is the time in seconds between heartbeat flows that the sending MCA sends when there are no messages on the xmitq; the flows unblock the receiving MCA so that it can disconnect the channel.

• DISCINT – Disconnect interval is the time in seconds that the sdr chl waits for new messages after the xmitq becomes empty before it disconnects.

Note: Setting these channel attributes works only when the Queue Manager of the partner channel supports them.
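To see what the retry settings above mean in elapsed time, the arithmetic can be sketched as follows. The function is hypothetical and makes two simplifying assumptions: each failed attempt takes no time, and the retry timer runs before every retry.

```python
from typing import Optional


def seconds_until_retry(n: int, shortrty: int = 60, shorttmr: int = 60,
                        longrty: int = 999_999_999,
                        longtmr: int = 1200) -> Optional[int]:
    """Seconds from the initial failure until retry number n (1-based).

    Returns None once both retry counts are used up.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n > shortrty + longrty:
        return None
    short_tries = min(n, shortrty)        # retries spaced SHORTTMR apart
    long_tries = max(0, n - shortrty)     # retries spaced LONGTMR apart
    return short_tries * shorttmr + long_tries * longtmr


last_short_retry = seconds_until_retry(60)   # 60 * 60 sec
first_long_retry = seconds_until_retry(61)   # plus one 1200 sec wait
```

With the values on the slides, short retry mode covers the first hour after a failure; after that the channel retries every 20 minutes.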


Automated Channel Recovery Function for UNIX

The crontab runs a script which has the following features:

• Activated once every minute.

• Check status of all sdr chls.


Automated Channel Resolve for UNIX

Is the sender chl status INDOUBT?

YES: RESOLVE and RESET the sender channel


Automated Channel Reset for UNIX

Is the sender channel in RETRYING mode?

YES: RESET the sender channel


Automated Channel Reset for UNIX

Is there no chl status for the sender channel?

YES: RESET the sender channel
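The three UNIX flowcharts can be summarized as one decision function. This is a sketch only: the function name is hypothetical, a missing status is represented as None, and the actual script run from cron is not shown in the slides.

```python
from typing import List, Optional


def unix_actions(status: Optional[str]) -> List[str]:
    """Map a UNIX sender channel status to the actions on the slides."""
    if status is None:
        # No channel status is available for the sender channel.
        return ["RESET"]
    if status == "INDOUBT":
        return ["RESOLVE", "RESET"]
    if status == "RETRYING":
        return ["RESET"]
    # Any other status: leave the channel alone.
    return []
```

Unlike the TPF flowcharts, the UNIX slides list no START action; the sketch follows the slides and leaves the restart to the built-in retry mechanism described earlier.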


Using TCP KeepAlive

• TCP KeepAlive knows nothing about MQSeries channels. It works on the TCP socket level.

• It sends a KeepAlive msg to the socket partner.

• If it detects that the partner is no longer available, it will disconnect the socket.
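Because KeepAlive is a property of the socket rather than of the channel, it is switched on with an ordinary socket option. The snippet below is a generic Python illustration of that option, not the way MQSeries or TPF enables it; the helper name is hypothetical.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the TCP stack to probe an idle connection's partner."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    enable_keepalive(s)
    # Read the option back to confirm the stack accepted it.
    keepalive_enabled = s.getsockopt(socket.SOL_SOCKET,
                                     socket.SO_KEEPALIVE) != 0
```

How often probes are sent, and how many may go unanswered before the socket is closed, is controlled by the operating system's TCP settings rather than by the application.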


Using TCP KeepAlive (cont.)

• Alleviates the problem where the rcvr chl shows READY or RUNNING but the partner sdr chl is retrying to establish a new connection.

[Diagram: two MQ servers; "start chl" is answered with "chl started"]


Using TCP KeepAlive (cont.)

• For TPF native stack, PJ28289 (PUT16 APAR) enables the KeepAlive option of the socket used by MQ rcvr chls.

• For TPF native stack, the socket sweeper checks if a socket has the KeepAlive option and sends a KeepAlive msg. Currently, the socket sweeper activates every 2 minutes.

• For UNIX, the KeepAlive interval is currently set to 1 minute.


Conclusion

Automated channel restart mechanism for TPF

Automated RESET mechanism for TPF

Automated RESOLVE mechanism for TPF

Automated RESET mechanism for UNIX

Automated RESOLVE mechanism for UNIX

Automated channel resolution between TPF and UNIX


Benefits

• Eliminates manual intervention from staff

• Faster MQ channel recovery times

• Increased uptime = $$$ more company revenues