SHARE August 2011
Getting Started in (z/OS) Capacity Planning (Topics in Capacity Planning)
Ray Wicks 561-236-5846 [email protected]
Biography
Ray has spent most of his career at IBM in the performance analysis and capacity planning end of the business in Poughkeepsie, London, and now at the Washington Systems Center. He is the major contributor to IBM’s internal PA & CP tool zCP3000. This tool is used extensively by the IBM services and technical support staff worldwide to analyze existing zSeries configurations (processor, storage, and I/O) and make projections for capacity expectations.
Ray has given classes and lectures worldwide. He was a visiting scholar at the University of Maryland where he taught part time at the Honors College.
He won the prestigious Computer Measurement Group’s A.A. Michelson award in 2000. His recent virtual workshop “Getting Started in Performance Analysis & Capacity Planning”, held for attendees in China and India, was well received.
Trade Marks, Copyrights & Stuff
Many terms are trademarks of different companies and are owned by them.
Some foils that appear in this presentation are not in the handout. This is to prevent you from looking ahead and spoiling my jokes and surprises. Also, some foils were added after I made the handouts.
Abstract
This tutorial is a two part introductory level session designed to introduce the student to the concepts required for Performance Analysis and Capacity Planning.
Emphasis is placed on large processor systems and examples will be largely drawn from z/OS, but the concepts apply to all operating systems and hardware. Topics:
• Conceptual and Perceptual structures for performance analysis and capacity planning
• Using the Forced Flow law in PA & CP
• Performance Analysis queries for capacity planning
• Processor performance data (ITRRs & MIPS)
• Resource Metrics for use in the Balanced System model
• Using the utilization growth process in capacity planning
• A simple view of analytic modeling in CP
Philosophy
In this presentation I have stressed the principles involved in performance analysis and capacity planning rather than the specific development of formulae and hard details in the hardware and software architectures and implementations. The formulae can be found in many text books. It is the intuitions behind the formulae that are important. There is a difference between the formula and understanding what the formula is telling you.
The architectures and implementations will change over time. It is more important to know how to look at and evaluate a system; that is what counts. If you want hardware and software details, you look to the vendors for that.
Capacity Planning
Capacity Planning ensures that Adequate resources are available for the Workload to complete in an Appropriate time.
Performance Analysis is Short Term (3-7 Days)
Capacity Planning is Long Term (6-24 Months)
• What's the Workload using now?
• What’s Appropriate? Adequate? Service Level Objective or Service Level Agreement
• Things are ordered by Priority or Importance
• Discretionary Workloads?
CP Questions
Easy
• Do I have enough resource (CPU, I/O, Storage, ...) to do the job today?
• If not, who’s suffering?
• If I get more, who will be helped? How much?
• If I need more, when will it be? How much more?
• Can I use specialized Processing Units?
• What variables should I track?
• Do I have any latent demand?
Harder
• Do I want faster or more CPs?
• How do I establish my growth?
• How do I size a new application?
• What tools should I use?
• Which interval do I model?
• If I reduce the #CPs & keep the MIPS the same, will there be a problem?
Conceptual Framework (Model, Paradigm) What's it supposed to mean? How do things connect?
Perceptual Framework What's it supposed to look like? How do things connect?
A Plumbing Problem
[Figure: three pumps connected by pipes, with fluid flow ΣF entering and ΣF leaving.]
Fluid Flow = Forced Flow Law
Conceptual Framework: Forced Flow Law
[Figure: three service centers (Service Center 1, 2, and 3), each drawn as a waiting queue feeding a server. Flows of 5/minute and 15/minute are marked where transactions cross the boundaries.]
If the model is divided in two and the number of transactions crossing the boundaries is as indicated, what would the forced flow law say?
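One way to make the question concrete is the textbook statement of the forced flow law: the throughput at a service center equals the system throughput multiplied by the number of visits each transaction makes to that center. The sketch below assumes, purely for illustration, that 5/minute is the rate crossing the outer (system) boundary and 15/minute is the rate crossing an inner boundary; the figure itself does not say which is which, and the function names are hypothetical.

```python
def center_throughput(system_throughput, visits_per_transaction):
    """Forced flow law: X_k = V_k * X."""
    return visits_per_transaction * system_throughput


def visit_ratio(center_rate, system_rate):
    """Visits per transaction implied by two measured flow rates."""
    return center_rate / system_rate


# Assumed reading of the slide: 5/minute at the system boundary,
# 15/minute at the inner boundary.
system_rate = 5.0
inner_rate = 15.0

visits = visit_ratio(inner_rate, system_rate)
print(f"Each transaction crosses the inner boundary {visits:.0f} times")
print(f"Check: {center_throughput(system_rate, visits):.0f} per minute")
```

Under that reading, the flows are forced to stay in a fixed ratio: if the system rate doubles, the inner rate must double too.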
Conceptual Framework
[Figure: three service centers, each a waiting queue feeding a server, with total times of 1 second, 15 minutes, and 1 second.]
If the total time in the service center is as indicated, where would the users/transactions be?
B.S. Conceptual Framework
[Figure: Balanced System (B.S.) model with nodes Thinking, CPU, Memory, Disk, Disk, and GKWE.]
Conceptual Framework Implications
• Users distribute themselves among nodes in proportion to the time spent at each node.
• The capacity of the System is determined by the slowest node (server).
• The resource usage (transaction rates) at the various nodes is in proportion.
[Figure: Balanced System model with nodes Thinking, CPU, Memory, DASD, DASD, and GKWE.]
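The first implication can be checked with a few lines of arithmetic. The sketch below takes the 1 second, 15 minute, 1 second times from the earlier foil and spreads a population across the nodes in proportion to time spent. The node names and the population of 902 users are hypothetical, chosen only so the numbers come out round.

```python
def distribute_users(total_users, seconds_at_node):
    """Spread users across nodes in proportion to the time spent at each node."""
    total_time = sum(seconds_at_node.values())
    return {
        node: total_users * seconds / total_time
        for node, seconds in seconds_at_node.items()
    }


# Times from the foil; node names A, B, C are placeholders.
seconds_at_node = {"A": 1.0, "B": 15 * 60.0, "C": 1.0}

# 902 users is an assumed population for illustration.
for node, users in distribute_users(902, seconds_at_node).items():
    print(f"node {node}: {users:.0f} users")
```

Almost everyone is found at the 15 minute node, which is also the node that limits the capacity of the system.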
2010 z/OS B.S. Metrics
• MIPS used
• S = SSCH rate
• DG = DASD gigabytes. The computation is nominal in that it is 2.83/act
• PS = Central Storage configured
If the model is valid, the forced flow law would prescribe that the variation of CPU service would be proportional to the I/O service. In this graph that translates into a linear relationship (R² > 0.7?).
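A quick way to test for that linear relationship is to fit a line through the origin and look at R². The sketch below is one simple way to do it, assuming R² is computed as 1 - SSres/SStot about the mean of y; spreadsheet trend lines may use a different definition for through-origin fits. The sample intervals are made up for illustration.

```python
def fit_through_origin(x, y):
    """Least-squares fit of y = slope * x, returning (slope, r_squared)."""
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, 1.0 - ss_res / ss_tot


# Hypothetical interval samples: CPU% and DASD I/O rate.
cpu_pct = [20.0, 35.0, 50.0, 65.0, 80.0]
io_rate = [610.0, 1010.0, 1540.0, 1900.0, 2450.0]

slope, r2 = fit_through_origin(cpu_pct, io_rate)
print(f"I/O = {slope:.2f} * CPU%, R2 = {r2:.3f}")
print("Balanced?" , "plausibly" if r2 > 0.7 else "look closer")
```

The 0.7 cutoff is the rule of thumb from the foil, not a statistical law.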
When It Doesn’t Work
[Figure: scatter plot of DASD I/O (0 to 4000) against CPU% (0 to 100), with fitted line y = 37.324x, R² = 0.113.]
If the model doesn’t work, you would see a non-linear relationship.
RIOC = F(Workload Composition)
[Figure, two panels: (1) RIOC (0 to 0.14) against time of day; (2) stacked CPU% (0 to 100) against time of day by workload: BATDEV2, BATDEV1, BATPROD, STCLO, STCMD, STCHI, SYSSTC, SYSTEM.]
Compare those intervals in the RIOC plot where the RIOC is stable or unstable with the workload mixture for the same intervals in the CPU% plot. The workload mixture often explains the RIOC variation.
But... it may cause you to see Framework (model) interactions that just aren't there!
Another Example
[Figure, two panels against time of day across two days: (1) Partition GCP MIPS (0 to 25000) for CEC7/SYSJ, CEC7/SYSG, CEC4/SYSK, CEC4/SYSR, CEC4/SYSF, CEC6/SYSE, CEC6/SYSH, CEC6/SYSD; (2) zIIP MIPS (0 to 1400) for CEC7/SYSJ, CEC7/SYSG, CEC4/SYSR, CEC4/SYSF, CEC6/SYSH, CEC6/SYSD.]
CEC 6 Util
[Figure, two panels against time of day across two days: (1) CEC6 GCP Util (0 to 120) for SYSD, SYSH, SYSE; (2) CEC 6 zIIP Util (0 to 160) for Potential, SYSD, SYSH.]
SYSD zIIP = F(GCP)?
[Figure: scatter plot with zIIP MIPS on the vertical axis, R² = 0.02.]
Workload zIIP = F(GCP)?
Fuzzy Thinkers Do Not Trust Perfection
[Figure: zIIPs (0 to 14) against GCPs (0 to 7), with fitted line y = 1.9259x, R² = 1.]
SYSD Workload zIIP = F(GCP)?
[Figure: zIIPs (0 to 0.6) against GCPs (0 to 7), with a linear fit y = 0.0844x, R² = 0.9047, and a power fit y = 0.1037x^0.9262, R² = 0.9805.]
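The second chart compares a straight line with a power curve of the form y = a·x^b. A common way to get such a curve, sketched below, is ordinary least squares on the logarithms of both variables. Whether the chart was produced this way is an assumption, and the sample points are synthetic.

```python
import math


def fit_power(x, y):
    """Fit y = a * x**b by least squares on log(x), log(y); needs positive data."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    b = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y)) / sum(
        (u - mean_x) ** 2 for u in log_x
    )
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic points lying exactly on y = 0.1 * x**0.9 (not the chart's data).
gcps = [1.0, 2.0, 4.0, 6.0]
ziips = [0.1 * g ** 0.9 for g in gcps]

a, b = fit_power(gcps, ziips)
print(f"zIIPs = {a:.4f} * GCPs^{b:.4f}")
```

An exponent b below 1 says the zIIP usage grows a little more slowly than the GCP usage.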
CEC 7 Util
[Figure, two panels against time of day across two days: (1) CEC 7 GCP Util (0 to 120) for SYSG, SYSJ; (2) CEC7 zIIP Util (0 to 450) for Potential, SYSG, SYSJ.]
Another Example
[Figure: scatter plot of I/O (0 to 16000) against MIPS (0 to 4500), with fitted line y = 2.4545x, R² = 0.3726.]
Same Data
[Figure: IO/MIPS (0 to 5) against time of day (0:00 to 23:00).]
Conceptual Framework
[Figure: Balanced System model with nodes Thinking, CPU, Memory, DASD, DASD, and GKWE.]
Maybe we need a big processor? More engines?
Future Conceptual Framework?
[Figure: Balanced System model with nodes Thinking, CPU, Non-Volatile Memory, DASD, and GKWE.]
Some Capacity Concerns at a Service Center
• Type of server
• Number of servers
• Speed of servers
• Number in queue
• Order in queue
• Service demand
• Type of service
• Population
• Arrival pattern
Z900 Structure
[Figure: z900 MCM and I/O structure. Two clusters (Cluster 0 and Cluster 1) hold PU00 through PU13, each PU with its own L1 cache; each cluster has a 16 MB L2 shared cache (cache control chip and cache data chips). Also shown: MBA0 through MBA3 with their STIs, memory cards 0 through 3, Crypto 0, Crypto 1, Clock, and ETR. 35 logic chips in total on a 20 PU MCM.
I/O: Passat I/O cage (optional) and Cargo I/O cage, with 333 MByte STIs and 1 GByte STIs. Attachments listed: Parallel 3/4 Port, OSA-2 TR, OSA-2 FDDI, ESCON 4 Port, ESCON 16 Port, FICON 2 Port, OSA-E Gb Ethernet, OSA-E Fast Ethernet, OSA-E ATM, ISC-3 1-4 Port, PCI-CC 2 engines, ICB-2 333 MByte, ICB-3 1 GByte.]
Ref. SG24-5975
Connections
[Figure labels: Micro Processor (Adaptor), Buffer, Link, SubChannels.]
Every line connecting two boxes in a diagram implies microprocessors on each end to do the talking. (What happens if they speak different languages?) Data is moved from a buffer to a microprocessor buffer, onto the link, into a microprocessor buffer, and into a storage buffer.
Z10 Memory – a simple view
[Figure: two books, each with Memory, an L2 cache, and CPUs that each have L1 and L1.5 caches, together with PR/SM. The memory hierarchy is labeled "The Nest". Annotations: "Memory Hierarchy or Nest" and "Instruction Complexity - Microprocessor design".]
Reference: John Burg’s presentation at SHARE 3/3/2011
Note that these are initial values and may change.
HiperDispatch
The problem of cache misses is alleviated by controlling the dispatch of LCPs on specific RCPs.
• Keep LCP-RCP on the same book.
• Minimize PR/SM dispatching.
• Keep LCP on the same RCP.
• Re-dispatch work units on the same processor subset.
zPCR
zPCR
Establish Power value in MIPS?
Processor Power: zPCR
Configuration Input
Manual
RMF Listing
EDF file (CP3KEXTR)
Processor Power: zPCR
Maximum: MIPS available if other partitions are idle, given the logical configuration.
Minimum: MIPS entitled to if other partitions are demanding their fair share (Weight) given the logical configuration.
2097-E56 summary with this logical configuration
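The Minimum and Maximum ideas can be illustrated with a deliberately crude sketch. It assumes MIPS scale linearly with the number of CPs, which real machines do not do, and it is not how zPCR computes its values; zPCR should be used for real numbers. The partition names, weights, and MIPS figures are hypothetical.

```python
def partition_mips_bounds(pool_mips, physical_cps, weights, logical_cps, name):
    """Rough (minimum, maximum) MIPS for one partition.

    Assumption: every CP delivers pool_mips / physical_cps (no MP effect).
    """
    mips_per_cp = pool_mips / physical_cps
    # Maximum: others idle, so only the partition's logical CPs limit it.
    maximum = min(logical_cps[name], physical_cps) * mips_per_cp
    # Minimum: everyone demands their weight share of the pool.
    share = weights[name] / sum(weights.values())
    minimum = min(share * pool_mips, maximum)
    return minimum, maximum


# Hypothetical two-partition configuration on a 4-way, 1000 MIPS pool.
weights = {"PRODA": 600, "TESTB": 400}
logical_cps = {"PRODA": 4, "TESTB": 2}

for lpar in weights:
    low, high = partition_mips_bounds(1000.0, 4, weights, logical_cps, lpar)
    print(f"{lpar}: minimum {low:.0f} MIPS, maximum {high:.0f} MIPS")
```

Even in this simplified form the point of the foil survives: a partition's capacity is a range set by its weight and its logical configuration, not a single number.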
Capacity Planning
[Figure: Resource Usage against Time, growing from t0 until it reaches the SLO line, where CP Actions are taken.]
Starting from a single point at t0, we project growth until some threshold is reached: a Service Level Objective or Agreement (SLO, SLA).
Then we take action.
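The projection described above can be sketched as a compound-growth calculation: how many months until today's usage reaches the threshold? The sketch assumes growth compounds smoothly at a fixed annual rate, and the 60%, 90%, and 30% figures are illustrative, not taken from the presentation's data.

```python
import math


def months_until_threshold(current_usage, threshold, annual_growth):
    """Months until usage growing at annual_growth (e.g. 0.30) hits threshold."""
    if current_usage >= threshold:
        return 0.0
    return 12.0 * math.log(threshold / current_usage) / math.log(1.0 + annual_growth)


# Hypothetical: 60% busy today, act at 90%, growing 30% per annum.
months = months_until_threshold(60.0, 90.0, 0.30)
print(f"Threshold reached in about {months:.1f} months")
```

The answer is only as good as the starting point and the growth rate that go into it.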
SLO or SLA for a Workload Group
Interactive
• For some number of users
• At some threshold transaction rate
• At some threshold Response Time
• At some amount of power & I/O
Batch
• For some number of Jobs
• At some rate
• At some amount of power & I/O
• At some turn-around threshold
Capacity Planning Actions
• Upgrade Hardware: Add CPs (PUs), New Model, Add Another CPC
• Move Workload to another image
• Split Workload and move a piece
• Tune it?
• Continue to Suffer
How Accurate Is It?
[Figure: Prediction against Time, starting at t0.]
Starting from an initial point of maybe dubious accuracy, we apply a growth rate (also dubious) and then recommend actions costing lots of money.
Accuracy
[Figure: two plots of Prediction against Time from t0; the second draws the prediction as a thicker line.]
Accuracy is found in values that are close to the expected curve. This closeness implies an expected bound or variation in reality. So a thicker line makes sense.
Fuzzy Patches
[Figure: two plots of Prediction against Time from t0, marking the prediction p at time t.]
At time t, is the prediction a precise point p or a fuzzy patch?
Fuzzy Factors
Basis for prediction is a single sample taken from a set of samples with some distribution.
Growth Factor applied may be just better than fiction.
Prediction compounds the fuzz and is itself fuzzy.
Niels Bohr: “Prediction is very hard to do. Especially about the future.”
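To see how the fuzz compounds, one can carry a low and a high estimate through the projection instead of a single value. The sketch below assumes a range for the starting utilization and a range for the annual growth rate, all of them invented for illustration, and shows the band widening with time.

```python
def projection_band(base_low, base_high, growth_low, growth_high, months):
    """Return (low, high) projected usage after the given number of months."""
    years = months / 12.0
    low = base_low * (1.0 + growth_low) ** years
    high = base_high * (1.0 + growth_high) ** years
    return low, high


# Hypothetical: today's sample is 60% +/- 5, growth is somewhere in 20%..40%.
for month in (0, 6, 12, 24):
    low, high = projection_band(55.0, 65.0, 0.20, 0.40, month)
    print(f"month {month:2d}: {low:5.1f}% to {high:5.1f}% (width {high - low:4.1f})")
```

The patch starts 10 points wide and keeps growing, which is the fuzzy patch of the previous foil in numbers.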
Analytic Example: Erlang’s M/M/c
[Figure: a queue (Q) feeding the servers (S).]
c = number of CPs
U = utilization
T = traffic, or c*U
C(c,T) is Erlang's C formula
E[S] is expected service time
From any queuing theory book: Arnold Allen or Jain, for example.
\[ E[RT] = E[S] + E[Q] \]
\[ E[RT] = E[S] + \frac{C(c,T)\,E[S]}{c(1-U)} \]
\[ E[RT] = E[S]\left(1 + \frac{C(c,T)}{c(1-U)}\right) \]
Read the term C(c,T)/(c(1-U)) as the Contention Factor (CF). When CF = 0, E[RT] = E[S]. When CF = 1, E[RT] = 2*E[S].
M/M/1
With c = 1, Erlang's C formula reduces to C(1,T) = U, so
\[ E[RT] = E[S]\left(1 + \frac{U}{1-U}\right) = \frac{E[S]}{1-U} \]
If E[S] = 30 ms and U = 80%, then E[RT] = 30/(1-0.8) = 150 ms.
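The formulas above are short enough to try out directly. The sketch below uses the standard textbook form of Erlang's C formula and the response time expression with the contention factor; the function names are made up for this example, and the M/M/1 case reproduces the 150 ms result.

```python
import math


def erlang_c(c, traffic):
    """Erlang's C formula: probability that an arrival must wait (M/M/c)."""
    if not 0 <= traffic < c:
        raise ValueError("traffic must satisfy 0 <= T < c")
    below_c = sum(traffic ** k / math.factorial(k) for k in range(c))
    at_c = traffic ** c / math.factorial(c) * c / (c - traffic)
    return at_c / (below_c + at_c)


def expected_response_time(service_time, c, utilization):
    """E[RT] = E[S] * (1 + C(c,T) / (c * (1 - U))) with T = c * U."""
    traffic = c * utilization
    contention_factor = erlang_c(c, traffic) / (c * (1.0 - utilization))
    return service_time * (1.0 + contention_factor)


# The foil's M/M/1 case: E[S] = 30 ms at 80% busy.
print(f"1 CP : {expected_response_time(30.0, 1, 0.80):.1f} ms")
# Same utilization spread over more CPs (illustrative).
for cps in (2, 4, 8):
    print(f"{cps} CPs: {expected_response_time(30.0, cps, 0.80):.1f} ms")
```

At the same utilization the contention factor shrinks as c grows, which is why the number of CPs matters as well as the total MIPS.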
M/M/1 Exercise
Workload   Service Time   Perceived Utilization   Workload Utilization
Hi         0.05 sec       32%                     32%
Medium     0.25 sec       77%                     45%
Low        1.32 Min       89%                     12%
Assume M/M/1.
(1) What is the expected RT for Medium?
(2) At what effective utilization would the response time for Medium exceed 1.2 seconds?
Answers:
(1) RT = ST / (1-U) = 0.25 / (1-0.77) ≈ 1.1 seconds
(2) RT = ST / (1-U), so 1.2 = 0.25 / (1-U), which gives U ≈ 79%
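The two answers can be checked in a few lines. This is only the M/M/1 formula from the exercise rearranged, using the Medium workload's 0.25 second service time and 77% perceived utilization; the function names are illustrative.

```python
def mm1_response_time(service_time, utilization):
    """M/M/1 response time: RT = ST / (1 - U)."""
    return service_time / (1.0 - utilization)


def utilization_for_response_time(service_time, target_rt):
    """Solve RT = ST / (1 - U) for U."""
    return 1.0 - service_time / target_rt


rt_medium = mm1_response_time(0.25, 0.77)
u_limit = utilization_for_response_time(0.25, 1.2)
print(f"(1) RT for Medium: {rt_medium:.2f} seconds")
print(f"(2) RT exceeds 1.2 seconds above U = {u_limit:.1%}")
```

Note how close the two numbers are: Medium sits at 77% and crosses 1.2 seconds at about 79%.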
Simple Capacity Plan
Problem: For the following workload, find the future workload utilization by month if the per annum growth rate is 30%.
            Base    Target
# LCPs      4       6
MIPS        1000    1200
MIPS/LCP    250     200
Base → Target: More MIPS, More Engines, Slower Engines
Migration Options
• Migrate to same MIPS and more CPs.
• Migrate to same MIPS and fewer CPs.
• Migrate to more MIPS and fewer CPs.
• Migrate to more MIPS and more CPs.
For each period, Base and Target have the same transaction rate.
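A month-by-month table for this problem might be built as sketched below. Two things are assumed that the foil does not state: a starting utilization of 60% on the Base machine, and that at the same transaction rate utilization scales inversely with configured MIPS (1000 for Base, 1200 for Target). The 30% per annum growth is converted to a monthly factor by taking the twelfth root.

```python
def monthly_utilization(start_util, annual_growth, months, base_mips, target_mips):
    """Return rows of (month, base_util, target_util) under compound growth."""
    monthly_factor = (1.0 + annual_growth) ** (1.0 / 12.0)
    rows = []
    for month in range(months + 1):
        base_util = start_util * monthly_factor ** month
        # Same work on a bigger machine: scale by the MIPS ratio (assumption).
        target_util = base_util * base_mips / target_mips
        rows.append((month, base_util, target_util))
    return rows


# 60% is a hypothetical starting point; MIPS values are from the foil.
print("month  base%  target%")
for month, base_util, target_util in monthly_utilization(60.0, 0.30, 12, 1000.0, 1200.0):
    print(f"{month:5d}  {base_util:5.1f}  {target_util:7.1f}")
```

Utilization alone does not settle the comparison, because the Target has more but slower engines; response time has to be compared as well.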
How to Compare?
[Figure: Response (ms, 0 to 70) against CPU% (70 to 100) for LowBase and LowTarg, with annotations "Same utilization" and "Same TransRate".]
Modeling Restrictions
• At CPU contention points, WLM and IRD in z/OS can restructure the physical & logical configuration.
• At I/O contention points in z/OS, PAVs can be moved around.
• Model arrival rates and service distributions can vary significantly.
• Priority ordering varies under WLM control.
• In Sysplex, transactions could be routed to other partitions.
• Software serialization points are usually not modeled.
• Model output results strictly mimic the model. If you miss something important, the results may not reflect your environment.
• Various models differ greatly in detail. Don’t miss an important service center.
CP Alternatives
[Figure: ROTs, Trending, Analytic Modeling, Simulation, and Benchmark placed on axes of Difficulty (Time) & Cost ($$$) against Accuracy.]
Rules of Thumb
ROTs (thresholds) are often quite adequate for CP
Useful for Health Check
Rules of Thumb
• Honor your Father & Mother
• Do unto others as you would have them do unto you.
• Do unto others before they do unto you.
• Keep your CPU%<90%
• Don’t swim soon after eating.
• It is better to give than receive.
Philosophical Remark
We understand a Rule by trying to break it.
Or
Learn the rules so you know how to break them correctly.
All swans are white ≡ There does not exist a swan which is not white
Balanced System ROTs
[Figure: Balanced System model with nodes Thinking, CPU, Memory, DASD, DASD, and GKWE.]
MIPS used, memory used, and I/O used should be in some proportion.
Expected Bounds
The resource ratio is shown as a bar. If the bar is above the 90%ile line, it means that the value was in the top 10% of the samples reviewed. Similarly, if the bar is below the 10%ile line, the value is in the bottom 10%. Neither is good or bad; it’s a flag to examine the amount of resource available.
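The same check can be sketched with the standard library. The example below computes the 10th and 90th percentile of a set of reference ratios and flags a new value against them; the reference samples and the choice of ratio are invented for illustration, and the percentile method is Python's default rather than whatever the original tool used.

```python
import statistics


def flag_ratio(reference_samples, value):
    """Compare a resource ratio with the 10th/90th percentile of the samples."""
    deciles = statistics.quantiles(reference_samples, n=10)
    p10, p90 = deciles[0], deciles[-1]
    if value > p90:
        return "above 90%ile: examine the resource"
    if value < p10:
        return "below 10%ile: examine the resource"
    return "within expected bounds"


# Hypothetical reference population of a ratio such as I/O rate per MIPS.
reference = [0.5 + 0.05 * i for i in range(100)]

for ratio in (0.6, 3.0, 5.4):
    print(f"ratio {ratio}: {flag_ratio(reference, ratio)}")
```

As the foil says, a flag is a prompt to look, not a verdict.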
Trending
Trending predicts the future if the future looks like the past.
Time Series Trending is complicated.
Trending can answer overall CP questions.
Analytic Modeling
[Figure: Balanced System model with nodes Thinking, CPU, Memory, DASD, DASD, and GKWE.]
• Pre-built packages can be fast to solve and relatively easy to use.
• Flow is statistically driven and usually predefined.
• Accuracy? Utilization within 5%; response times within 30%.
• Data acquisition is key.
• Calibration can be tough.
• Custom analytic models are really tough.
• Requires technical staff.
• Services are available.
ROTs & Analytic Modeling
[Figure, two panels: (1) Response Time against System Utilization for Low, Middle, and Hi priority workloads; (2) CPU% (0 to 120) by month, Jun-07 to Apr-08, for Low, Middle, and Hi.]
The relationship between utilization and server response is sensitive to the priority of the workload. The utilization used in the response time formula is the “perceived utilization”. Watch out for logical vs. physical utilization and single-task workloads.
Simulation
• Pre-built packages are slower to solve and can be relatively easy to use.
• Flow is statistically driven and usually predefined but can be customized. (Application modeling.)
• Accuracy? Utilization within 5%; response times within 30%.
• Data acquisition is key.
• Calibration can be tough.
• Custom models are built from service center building blocks.
• Simulation languages do exist.
• Specialized staff.
• Services exist.
[Figure: Balanced System model with nodes Thinking, CPU, Memory, DASD, DASD, and GKWE.]
Benchmark
• A lot of work in preparation: Hardware/Software, Workload.
• Lots of time.
• It does mimic the running environment the best: software flow & queuing, software usage.
• It's expensive.
• Variations limited by resources.
• Given the resources, the benchmark can be complicated.
• Tests the environment: does it work?
CP Alternatives
[Figure: ROTs, Trending, Analytic Model, Simulation, and Benchmark plotted by Difficulty against Accuracy.]
• What questions do you have?
• What questions must you answer?
• Cost of the answer?
• Cost of getting it wrong?
• Time line?
• What happens if you get it wrong?
Things to Remember
• Be aware of the exceptions to the rule.
• A framework helps but it can make you see things that just aren't there.
• Impeccable mathematics does not replace knowledge of the facts.
• Protect yourself.
• Business decisions can override technical issues.
• Sometimes being understood is more important than being very accurate.
• Being "very" accurate may be a luxury of the idle.
• Other than the technicalities, there may be a hidden agenda.
Bibliography - I
• The Art of Computer Systems Performance Analysis, by Raj Jain, Wiley. I like this one. It is thorough and complete. A very good reference.
• Capacity Planning for Web Performance, by Daniel A. Menascé and Virgilio A. F. Almeida, Prentice Hall. A good book on network structure and terminology and an introduction to the topic.
• Probability, Statistics, and Queuing Theory, by Arnold O. Allen, Academic Press Inc. This is the classic in queuing theory.
• Performance by Design: Computer Capacity Planning by Example, by Daniel A. Menascé, Virgilio A. F. Almeida, and L. W. Dowdy. The web site http://cs.gmu.edu/~menasce/perfbyd/ has a lot of .xls modeling worksheets.
• MVS I/O Subsystems, by Gilbert E. Houtekamer and H. Pat Artis, Performance Associates. More than you want to know about the I/O subsystem. A definitive source but a little out of date. It is available from Intellimagic or perfassoc.com.
• Exploring IBM S/390 Computers, by Jim Hoskins and George Coleman, Maximum Press. A general introduction to S/390 hardware and architecture. (with IBM G326-3006-06)
• Statistical Concepts and Methods, by Gouri Bhattacharyya and Richard A. Johnson, John Wiley & Sons.
• The Practical Performance Analyst, by Neil J. Gunther, Authors Choice Press. A very good book.
• Almost any volume of the Computer Measurement Group (CMG) Proceedings is worth looking at for performance and capacity planning articles. Web Site: http://www.cmg.org/measureit/
Bibliography - II
• GC28-1761 MVS™ Planning: Workload Management. A guide to WLM.
• SC28-1950 Resource Measurement Facility Report Analysis. A guide to report reading.
• SC28-1951 Resource Measurement Facility Performance Management Guide. A good tutorial to get started.
• SG24-5975 IBM zSeries 900 Technical Guide. A good hardware architecture and implementation Red Book.
• LY28-1042 RMF™ Support for LPAR Management Time. Want to know how LPAR works?
• SC28-1187 Large Systems Performance Reference, by John Fitch. John goes into detail about the LSPR data.
• SG24-4356 System/390® MVS Parallel Sysplex Performance. A good Red Book on Parallel Sysplex RMF reports and data.
• SG24-4680 System/390 MVS Parallel Sysplex Capacity Planning. A good Red Book on the function and capacity of Parallel Sysplex.
A great URL for z/Series documents in general: http://www-1.ibm.com/servers/eserver/zseries/zos/bkserv/
RMF in particular: http://www-1.ibm.com/servers/eserver/zseries/zos/bkserv/r4pdf/rmf.html
For zPCR, search www.ibm.com for “zPCR” & “SoftCap”
Bibliography - III
EXCEL:
• Applied Statistics For Engineers and Scientists Using Excel and MINITAB, by David Levine, Patricia Ramsey, Robert Smidt, Prentice Hall. This comes with a CD containing handy Excel Add-Ins.
• Excel Data Analysis, by Jinjer Simon, Wiley. Nice basic reference concentrating on data presentation.
Other Good Stuff:
• The Black Swan: The Impact of the Highly Improbable, by Nassim Nicholas Taleb, Random House. This is an informative and entertaining approach to statistical analysis among other things.
• Statistics as Principled Argument, by Robert Abelson, Erlbaum Assoc. Publishers, 1995. A good discussion of the use of statistics without the ugly formulae.
• Judgment under Uncertainty: Heuristics and Biases, by Kahneman, Slovic, & Tversky, Cambridge University Press. The first chapter alone is worth reading. It’s a summary of the pitfalls of intuitive thinking.
Ray Wicks’ Monographs
The CPS document for this presentation and other interesting monographs can be obtained from your favorite IBMer. Go to: ftp://cpstools.washington.ibm.com/zcp3000/win
Look for: Getting Started In CP.
These have been published in cmg.org/measureit.