Case Study: Overloaded Chpids - IBM · 2019. 1. 11. · IBM Systems & Technology Group © 2007 IBM Corporation Case Study: Overloaded Chpids Revision 2008-07-29 BKW IBM z/VM Performance

IBM Systems & Technology Group

© 2007 IBM Corporation

Case Study: Overloaded Chpids

Revision 2008-07-29 BKW

IBM z/VM Performance Evaluation

Brian Wade [email protected]



Trademarks

The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml: AS/400, DBE, e-business logo, ESCO, eServer, FICON, IBM, IBM Logo, iSeries, MVS, OS/390, pSeries, RS/6000, S/390, VM/ESA, VSE/ESA, Websphere, xSeries, z/OS, zSeries, z/VM

The following are trademarks or registered trademarks of other companies

Lotus, Notes, and Domino are trademarks or registered trademarks of Lotus Development Corporation

Java and all Java-related trademarks and logos are trademarks of Sun Microsystems, Inc., in the United States and other countries

LINUX is a registered trademark of Linus Torvalds

UNIX is a registered trademark of The Open Group in the United States and other countries.

Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation.

SET and Secure Electronic Transaction are trademarks owned by SET Secure Electronic Transaction LLC.

Intel is a registered trademark of Intel Corporation

* All other products may be trademarks or registered trademarks of their respective companies.

NOTES:

Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.

IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.

All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.

This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.

All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.

References in this document to IBM products or services do not imply that IBM intends to make them available in every country.

Any proposed use of claims in this presentation outside of the United States must be reviewed by local IBM country counsel prior to such use.

The information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.



Customer Configuration

3 z/VM partitions on z900

Each partition has about 25 3390-3 in a 2105-F20
– One 2105-F20 serves all three partitions
– Separate LCUs for each partition
– Each partition has its own four ESCON chpids to the 2105

Customer wants to back up all of these 3390-3 to a 2107 once per week
– N guests concurrently, each running one DDR

Only FICON is available to the 2107

How much FICON capacity is required?



ESCON vs. FICON Express

                 ESCON               FICON Express
Link speed       About 135 Mb/sec    1 Gb/sec (7.5x)
Open exchanges   One I/O at a time   32 I/Os at a time (32x)
IOP              Fast                Faster



Support Staff Claim is….

FICON is so much faster than ESCON, and…

FICON can do >1 I/O at a time, and…

We don’t have so many FICON ports on our switches, so…

Let’s give him only one FICON chpid, and…

He can come back if he thinks he needs more

Ways to check:
– FCX161 LCHANNEL
– FCX108 DEVICE
– FCX232 IOPROCLG
– Open exchange analysis



FCX161 LCHANNEL

FCX161 Run 2008/07/25 17:54:09 LCHANNEL

Channel Load and Channel Busy Distribution

From 2008/07/20 04:10:55

To 2008/07/20 05:18:55

For 4080 Secs 01:08:00 Result of xxxxxxxx Run

_____________________________________________________________________________________________

CHPID Chan-Group

(Hex) Descr Qual Shrd Cur Ave 0-10 11-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 91-100

CB ESCON 00 No 33 39 0 0 15 56 22 0 0 1 0 6

BC ESCON 00 No 33 38 0 0 19 54 19 0 0 1 0 6

DA ESCON 00 No 32 38 0 0 22 56 15 0 0 1 0 6

E9 ESCON 00 No 32 38 0 0 22 56 15 0 0 1 0 6

1A FICON 00 Yes 23 24 0 9 84 3 4 0 0 0 0 0

0C OSE 00 Yes 0 0 100 0 0 0 0 0 0 0 0 0

0D OSE 00 Yes 0 0 100 0 0 0 0 0 0 0 0 0

AD ESCON 00 No 0 0 100 0 0 0 0 0 0 0 0 0

These numbers are CPU BUSY ON THE CHANNEL ADAPTER.

- NOT fiber saturation

- NOT I/O concurrency saturation

- So far it doesn’t look too bad.



FCX108 DEVICE

FCX108 Run 2008/07/28 12:10:44 DEVICE

General I/O Device Load and Performance

Mdisk Pa- Req.

Addr Type Label/ID Links ths I/O Avoid Pend Disc Conn Serv Resp CUWt Qued Busy READ

>> All DASD DDR00013 0 4 11.9 .0 5.1 .3 4.1 9.5 9.5 .0 .0 11 100

C800 3390 >DDR00001 0 4 11.8 .0 5.2 .1 4.1 9.4 9.4 .0 .0 11 100

C802 3390 >DDR00003 0 4 11.9 .0 5.2 .1 4.1 9.4 9.4 .0 .0 11 100

C804 3390 >DDR00005 0 4 11.9 .0 5.2 .1 4.1 9.4 9.4 .0 .0 11 100

C806 3390 >DDR00007 0 4 11.9 .0 5.2 .1 4.1 9.4 9.4 .0 .0 11 100

... some sample 2107/FICON targets

D01A 3390 >DDR00027 0 1 11.8 .0 24.6 40.9 9.6 75.1 75.1 .0 .0 89 0

D000 3390 >DDR00001 0 1 11.8 .0 24.6 40.9 9.5 75.0 75.0 .0 .0 89 0

D009 3390 >DDR00010 0 1 11.8 .0 24.4 40.9 9.7 75.0 75.0 .0 .0 89 0

D00E 3390 >DDR00015 0 1 11.8 .0 24.8 41.0 9.1 74.9 74.9 .0 .0 89 0

D008 3390 >DDR00009 0 1 11.9 .0 24.7 40.8 9.2 74.7 74.7 .0 .0 89 0

Uh, this doesn’t look so good…



What FCX108 is Telling Us

ESCON sources
– Possible ESCON contention: PEND ~ 5 msec
– 2105 is doing fine: low DISC
– CONN is indicative of data transfer size (1 I/O at a time)

FICON targets
– FICON looks in trouble: PEND ~ 25 msec
– 2107 is hurting: high DISC time
– Longer CONN than ESCON? But chpid is faster!

Pop quiz
– Why is “Req. Qued” = 0?



FCX232 IOPROCLG

FCX232 Run 2008/07/28 12:10:44 IOPROCLG

I/O Processor Activity by Time

From 2008/07/20 04:10:55

To 2008/07/20 05:18:55

For 4080 Secs 01:08:00 Result of WEB0720 Run

_______________________________________________________________________________

Interval Proc Proc

End Time Number Beg_SSCH I/O_Int %Busy Channel Switch CU Device

>>Mean>> 0 603.0 604.5 21.4 3037 .0 .0 .0

>>Mean>> 1 1815 1814 72.6 2261 .0 .0 .0

>>Mean>> 2 .1 .1 .0 18.1 .0 .0 .0

04:11:55 0 905.1 898.8 100.0 7516 .0 .0 .0

04:11:55 1 2529 2535 100.0 2285 .0 .0 .0

04:11:55 2 .1 .1 .0 .0 .0 .0 .0

04:12:55 0 1017 1011 100.0 7470 .0 .0 .0

04:12:55 1 2876 2882 100.0 2140 .0 .0 .0

04:12:55 2 .2 .4 .0 .0 .0 .0 .0

04:13:55 0 988.9 983.0 100.0 7675 .0 .0 .0

04:13:55 1 2851 2857 100.0 2159 .0 .0 .0

04:13:55 2 .6 .7 .0 .0 .0 .0 .0

You have to know how to interpret this report.

These are not percentages. PERFKIT is misleading.

PERFKIT is telling us about busy conditions the SAPs (channel subsystem) are encountering, per SSCH they handle.

Divide the reported value by 100 to get busy indications per SSCH.

IOP0 is seeing 30.37 channel busy situations per SSCH.

Yes, we are fixing PERFKIT.
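The divide-by-100 reading can be sketched as a small conversion, using the ">>Mean>>" Channel values from the report above. The function name `busy_per_ssch` is made up for this illustration; it is not part of PERFKIT.

```python
# "Channel" column values from the ">>Mean>>" rows of the FCX232 report above,
# keyed by I/O processor number.
MEAN_CHANNEL = {0: 3037.0, 1: 2261.0, 2: 18.1}

def busy_per_ssch(reported):
    """Convert a reported busy figure to busy indications per SSCH (divide by 100)."""
    return reported / 100.0

for iop, reported in MEAN_CHANNEL.items():
    print(f"IOP{iop}: {busy_per_ssch(reported):.2f} channel-busy indications per SSCH")
```

For IOP0 this gives the 30.37 quoted on the slide.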



Lifetime of an I/O

t = cwait + irwait + disc + conn
– cwait = time IOP spends waiting for a chpid
– irwait = time spent waiting for initial response from controller, once a chpid is obtained
– disc = time the controller spends apart from the channel subsystem (usually cache miss)
– conn = time the controller spends in session with channel subsystem (usually data transfer)

Our old friend PEND = cwait + irwait
– This is a bit unfortunate because it includes two very different kinds of waits
– But as we will see in this workload, this isn’t much of a hindrance to the analysis

“Exchange time” EXCH = irwait + disc + conn
– Time the operation is open on the chpid
– Analyzing this can be very useful
– Perfkit doesn’t report it directly
– But it’s in the MONWRITE data, for the intrepid or desperate
  • Yet another reason why we ask for raw MONWRITE data
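As a minimal sketch of how these definitions fit together, the class below derives PEND, EXCH, and total time from the four components. The class name `IoTimes` and the sample values are hypothetical, chosen only to make the arithmetic easy to follow; they are not measurements from this workload.

```python
from dataclasses import dataclass

@dataclass
class IoTimes:
    """Per-I/O time components in msec, named as on the slide."""
    cwait: float   # waiting for a chpid
    irwait: float  # waiting for the controller's initial response
    disc: float    # controller disconnected from the channel subsystem
    conn: float    # controller connected (usually data transfer)

    @property
    def pend(self):
        # PEND lumps together the two waits that precede the controller's work
        return self.cwait + self.irwait

    @property
    def exch(self):
        # Exchange time: how long the operation is open on the chpid
        return self.irwait + self.disc + self.conn

    @property
    def total(self):
        return self.cwait + self.irwait + self.disc + self.conn

# Hypothetical values, for illustration only
io = IoTimes(cwait=20.0, irwait=1.0, disc=5.0, conn=4.0)
print(f"PEND={io.pend} EXCH={io.exch} total={io.total}")
```

Note that total = cwait + EXCH, which is why a large PEND with a small irwait points at chpid queuing rather than at the controller.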



ESCON EXCH times typical in this workload

Consolidated device C800 240RES

__________________________WEB0720__________________________

__Time__ ___IOR___ __IRPIO__ __DDPIO__ __CNPIO__ __EXPIO__ __EXPS___

08:12:00 32.033 0.000 0.143 4.124 4.267 136.687

08:13:00 30.850 0.000 0.119 4.136 4.255 131.262

08:14:00 30.817 0.000 0.167 4.130 4.296 132.397

08:15:00 30.667 0.000 0.176 4.120 4.296 131.748

08:16:00 21.900 0.000 0.107 4.114 4.221 92.437

From FCX168 DEVLOG C800:

Interval Mdisk Pa- Req.

End Time Type Label/ID Links ths I/O Avoid Pend Disc Conn Serv Resp CUWt Qued Busy READ

04:11:55 3390 >DDR00001 0 4 32.1 .0 24.1 .1 4.1 28.3 28.9 .0 .0 91 100

04:12:55 3390 >DDR00001 0 4 30.9 .0 24.6 .1 4.1 28.8 28.8 .0 .0 89 100

04:13:55 3390 >DDR00001 0 4 30.8 .0 25.0 .2 4.1 29.3 29.9 .0 .0 90 100

04:14:55 3390 >DDR00001 0 4 31.3 .0 24.0 .2 4.1 28.3 28.3 .0 .0 88 100

04:15:55 3390 >DDR00001 0 4 22.5 .0 18.1 .1 4.1 22.3 22.3 .0 .0 50 100

We see almost no IRPIO but ~ 25 msec of PEND.

Therefore PEND is almost all CWAIT.

These ESCON chpids are too busy.



FICON EXCH times typical in this workload

Consolidated device D01A WBK026

__________________________WEB0720__________________________

__Time__ ___IOR___ __IRPIO__ __DDPIO__ __CNPIO__ __EXPIO__ __EXPS___

08:18:00 13.000 0.857 21.950 12.011 34.818 452.634

08:19:00 12.717 0.956 22.792 11.514 35.262 448.418

08:20:00 13.417 0.993 20.710 11.731 33.434 448.576

08:21:00 12.883 0.900 23.369 11.040 35.309 454.901

From FCX168 DEVLOG for D01A

Interval Mdisk Pa- Req.

End Time Type Label/ID Links ths I/O Avoid Pend Disc Conn Serv Resp CUWt Qued Busy READ

04:18:00 3390 >DDR00027 0 1 13.0 .0 37.1 21.9 12.0 71.0 71.0 .0 .0 92 0

04:19:00 3390 >DDR00027 0 1 12.7 .0 38.9 22.8 11.5 73.2 73.2 .0 .0 93 0

04:20:00 3390 >DDR00027 0 1 13.4 .0 36.5 20.7 11.7 68.9 70.4 .0 .0 92 0

04:20:59 3390 >DDR00027 0 1 12.9 .0 36.7 23.4 11.0 71.1 71.1 .0 .0 92 0

irwait ~ 1 msec => rest of PEND is cwait => too few FICON chpids.
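The subtraction behind this conclusion can be sketched with the D01A figures above. Pairing rows from the two reports is an assumption; the matching I/O rates (13.0, 12.7, 13.4, 12.9) suggest they cover the same intervals. IRPIO is taken here to be irwait per I/O, and `estimate_cwait` is a name invented for this example.

```python
# (interval end, PEND msec from FCX168 DEVLOG, IRPIO msec from the exchange report)
# for device D01A. Row pairing across the two reports is an assumption.
SAMPLES = [
    ("04:18:00", 37.1, 0.857),
    ("04:19:00", 38.9, 0.956),
    ("04:20:00", 36.5, 0.993),
    ("04:20:59", 36.7, 0.900),
]

def estimate_cwait(pend_msec, irwait_msec):
    """PEND = cwait + irwait, so cwait = PEND - irwait."""
    return pend_msec - irwait_msec

# For each interval: estimated cwait and its share of PEND
results = [(t, estimate_cwait(p, ir), estimate_cwait(p, ir) / p) for t, p, ir in SAMPLES]
for t, cwait, share in results:
    print(f"{t}: cwait ~ {cwait:.1f} msec ({share:.0%} of PEND)")
```

In every interval, nearly all of PEND is time spent waiting for a chpid.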



What We See So Far

Channel subsystem is really queuing on FICON chpid

Channel subsystem is queuing on ESCON chpids

This queuing is elongating I/O response time

Can we see how things are going on the chpids themselves?



Open Exchange Analysis

For each device,
– EXCHtime/sec = IOs/sec * EXCHtime/io
– EXCHtime/io = IRtime/io + DISC/io + CONN/io

Sum over all devices reachable over the chpids

Divide by number of chpids to get e = EXCHtime/chpid/sec
– On ESCON, 0
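The calculation can be sketched in a few lines. The function names are invented for this illustration. The division by 1000 assumes per-I/O times in msec, which is consistent with the reports in this deck, where Total / chpids / 1000 = OEX. The single device used below is D01A at 08:18:00 from the FICON exchange report.

```python
def exch_per_io(irwait_msec, disc_msec, conn_msec):
    """EXCHtime/io = IRtime/io + DISC/io + CONN/io (all in msec)."""
    return irwait_msec + disc_msec + conn_msec

def open_exchange_per_chpid(devices, n_chpids):
    """Return e = EXCHtime/chpid/sec.

    devices: iterable of (ios_per_sec, exch_msec_per_io), one entry per device
    reachable over the chpids. Times are assumed to be in msec, so the sum is
    divided by 1000 to express e as seconds of open exchange per second.
    """
    total_msec_per_sec = sum(rate * exch for rate, exch in devices)
    return total_msec_per_sec / n_chpids / 1000.0

# One device only: D01A at 08:18:00 (IOR 13.000, IRPIO 0.857, DDPIO 21.950, CNPIO 12.011)
d01a_exch = exch_per_io(0.857, 21.950, 12.011)
e_one_device = open_exchange_per_chpid([(13.000, d01a_exch)], n_chpids=1)
print(f"EXCH/io = {d01a_exch:.3f} msec, e = {e_one_device:.3f}")
```

A full analysis would pass every device reachable over the chpids, not just one, so this single-device value understates e for the real workload.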



ESCON e, typical for this workload

Open Exchange Report, assuming 4 chpids

__Time__ _WEB0720__ __Total___ ___OEX____

08:11:00 186.652 186.652 0.047

08:12:00 4017.602 4017.602 1.004

08:13:00 3944.960 3944.960 0.986

08:14:00 3953.350 3953.350 0.988

08:15:00 3867.998 3867.998 0.967

08:16:00 2753.180 2753.180 0.688

08:17:00 1646.332 1646.332 0.412

08:18:00 1664.512 1664.512 0.416

08:19:00 1625.777 1625.777 0.406

08:20:00 1631.189 1631.189 0.408

08:21:00 1601.634 1601.634 0.400

08:22:00 1630.931 1630.931 0.408

08:23:00 1630.526 1630.526 0.408



FICON e for this workload

Open Exchange Report, assuming 1 chpids

__Time__ _WEB0720__ _GRN0720__ _YEL0720__ __Total___ ___OEX____

08:19:00 13619.693 27151.697 10842.976 51614.366 51.614

08:20:00 13259.164 26382.097 10587.415 50228.676 50.229

08:21:00 13732.412 27513.826 10948.452 52194.690 52.195

08:22:00 13682.709 27134.276 10962.487 51779.473 51.779

08:23:00 13869.775 27277.918 10929.429 52077.122 52.077

08:24:00 13816.064 27432.410 10977.666 52226.140 52.226

08:25:00 13803.868 27340.420 11164.580 52308.868 52.309

08:26:00 14067.490 27749.820 11181.547 52998.857 52.999

08:27:00 14320.785 28666.660 11508.521 54495.966 54.496

08:28:00 14776.981 29448.911 11734.622 55960.514 55.961

08:29:00 14651.085 29432.937 11657.303 55741.325 55.741

08:30:00 13937.165 28424.352 11233.331 53594.848 53.595

08:31:00 14817.231 29654.477 11737.690 56209.397 56.209

08:32:00 16785.233 35103.435 13485.180 65373.847 65.374

08:33:00 21109.491 40675.031 16819.157 78603.680 78.604

Uniformly awful.

Usually don’t want to drive this past about 4 to 6.

PEND is correspondingly awful throughout the run.
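To put the FICON numbers next to the "about 4 to 6" guideline, here is a rough sketch that recomputes OEX from a few Total values in the report above and compares it with the upper end of that range. The helper `oex` and the chosen intervals are illustrative only, and the guideline values are simply the ones quoted on this slide.

```python
GUIDELINE_LOW, GUIDELINE_HIGH = 4.0, 6.0  # "about 4 to 6", as quoted on the slide

# (time, total exchange msec per second) for three intervals of the FICON report
INTERVALS = [
    ("08:19:00", 51614.366),
    ("08:26:00", 52998.857),
    ("08:33:00", 78603.680),
]
N_CHPIDS = 1  # the report assumes a single FICON chpid

def oex(total_msec_per_sec, n_chpids):
    """Open exchange per chpid: total exchange time divided by chpids, msec to sec."""
    return total_msec_per_sec / n_chpids / 1000.0

for t, total in INTERVALS:
    e = oex(total, N_CHPIDS)
    ratio = e / GUIDELINE_HIGH
    print(f"{t}: e = {e:.1f}, about {ratio:.1f}x the upper guideline")
```

Even measured against the upper end of the guideline, every interval shown is over by a wide margin.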



ESCON CONN < FICON CONN?

I thought CONN was time spent sending data

Shouldn’t FICON time be less? The fiber’s faster.

ESCON: one I/O, then next I/O, then next I/O…

FICON: I/Os are interleaved, like IP packets

CONN is time from beginning of first frame to end of last frame

Even though fiber is faster, FICON CONN is longer

Too much interleaving

Get more chpids



Very Rough Crack at FICON Interleaving Level

ESCON conn per I/O was 4.1 msec

FICON conn per I/O was 11.5 msec

FICON fiber is about 7.5x as fast as ESCON fiber

(11.5 / (4.1 / 7.5)) = 21 interleaving factor

Either way you look at this, the FICON interleaving factor is just way too high in this workload

Use caution with this technique. Because of the prevalence of FICON, controllers often do not disconnect anymore, except for a cache miss.
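The arithmetic above can be written out as a short sketch. It assumes, as the slide implicitly does, that each I/O moves about the same amount of data on the ESCON and FICON sides, so that an uncontended FICON transfer would take the ESCON CONN time divided by the link speed ratio. The variable names are invented for this example.

```python
ESCON_CONN_MSEC = 4.1   # observed ESCON CONN per I/O
FICON_CONN_MSEC = 11.5  # observed FICON CONN per I/O
SPEED_RATIO = 7.5       # FICON fiber is about 7.5x as fast as ESCON fiber

# CONN a FICON I/O would need if it had the fiber to itself (assumes the same
# transfer size per I/O on both sides)
expected_ficon_conn = ESCON_CONN_MSEC / SPEED_RATIO

# How many times longer the observed CONN is than that uncontended estimate
interleaving_factor = FICON_CONN_MSEC / expected_ficon_conn

print(f"expected uncontended FICON CONN ~ {expected_ficon_conn:.2f} msec")
print(f"interleaving factor ~ {interleaving_factor:.0f}")
```

The result is only as good as the assumption that CONN is pure data transfer time, which is the caution the slide raises.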



Remediation Possibilities for This Workload

More FICON chpids to 2107
– Data suggests drop from 70 msec/IO to 35 msec/IO if chpid limitations were remediated
– We don’t know whether this would just make the 2107 DISC situation worse

Spread FICON target volumes over >1 2107
– Would certainly help with DISC time

More ESCON chpids to 2105
– Probably least urgent, because PEND settled down after a few minutes

Consider Metro Mirror aka Synchronous PPRC?



How Do You Know When You Have Enough?

No cwait means you have enough chpids
– Probably PEND is a good approximator of this

FCX232 IOPROCLG shows no channel busy hits

CONN ratio is in line with fiber speed ratios
– Be careful of non-data-transfer-time charged to CONN

DISC time will come down as you add controller cache
– Spread data across multiple controllers

Acceptable application response time



Summary

LCHANNEL, aka “channel busy” report, is probably the least useful indicator of “busy-ness”

PEND time is generally indicative of channel subsystem contention, even though it contains both cwait and irwait

FCX232 IOPROCLG does show us busy retries, if we know how to look at it

Open exchange analysis reveals how bad the situation is on the fiber itself
