-
Session 03 - Measuring Current State
Pat Hammett, University of Michigan 1
Six Sigma Measure Phase
Measuring the Current State of a Process
Case Study: Scanner Mfg
Problem: Weld defects between Mylar motor and attachment bracket (ultrasonic weld operation)
Key Output Variables (Ys)
  Weld Shear Force (from destructive test)
    Specification: Shear Force > 13 lbs
  Visual Weld Inspection (binary: pass/fail)
Process Variables (Xs)
  Material (melt flow index)
  Surface condition
  Press force
  Clamping force
  Temperature
Topics
I. Review Types of Data
II. Review of Exploring Data Patterns and Descriptive
Statistics
III. Six Sigma Metric Calculations*
  Yield
  Defects Per Million (DPM)
  Defects Per Million Opportunities (DPMO)
  DPM based on Variable (Numerical) Data
* Note: Other metrics will be discussed in future lectures
I. Types of Data Variables
Selection of analysis method/tool depends on type of data.
Discrete/Continuous Variables (Numerical/Quantitative Data)
  Discrete variables - vary by whole units (e.g., # of customers)
  Continuous variables - vary to any degree, limited only by precision of measurement system
    Time to complete a task
    Manufactured hole diameter: measurement may be 10 mm, 10.0 mm, 10.01 mm, 10.008 mm
Qualitative (Categorical) Variables (Attribute Data)
  Binary (pass/fail; defective/not defective)
  Ordinal (ordered classification system such as survey rating systems)
  Nominal (non-ordered groups or classifications)
Qualitative (Categorical) Data
To analyze qualitative data, we typically assign discrete numerical values and/or use them to stratify or group other numerical data by categories.
Some examples are:
Binary Variables - assign discrete binary outcome (0/1)
  Examples: On Time Delivery, Service Quality
  Binary Attribute: On Time (0) / Late (1); OK (0) / NOK (1)
Ordinal Variables - assign discrete ordinal scale to classify responses
  Ordinal Attribute - natural order is implied between categories, but the magnitude of difference is unknown
  Example 1: Variable = Size: Small, Medium, and Large
  Example 2: Variable = Survey Response to Question (with ordinal attribute scale): Strongly Disagree (1), Disagree (2), Neutral (3), ... Strongly Agree (5)
Nominal (Categorical or Grouping) Variables - use to stratify or group data
  Variable Example: Distribution Center
  Nominal Attributes: Northeast, Southwest, Central
  Other Examples: Shift (e.g., Day or Night); Plant; Department; Model Type
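The coding ideas above can be sketched in a few lines of Python. This is only an illustration: the records are invented, the field names are hypothetical, and the code "Agree" = 4 is an assumption (the slide elides it between Neutral and Strongly Agree).

```python
from collections import defaultdict
from statistics import mean

# Coding schemes from the slide; "Agree": 4 is assumed (elided in the slide).
BINARY_CODES = {"On Time": 0, "Late": 1}
ORDINAL_CODES = {
    "Strongly Disagree": 1, "Disagree": 2, "Neutral": 3,
    "Agree": 4, "Strongly Agree": 5,
}

# Hypothetical records, invented purely for illustration.
records = [
    {"center": "Northeast", "delivery": "On Time", "rating": "Agree", "days": 3.0},
    {"center": "Northeast", "delivery": "Late", "rating": "Disagree", "days": 7.0},
    {"center": "Central", "delivery": "On Time", "rating": "Strongly Agree", "days": 2.0},
    {"center": "Southwest", "delivery": "Late", "rating": "Neutral", "days": 6.0},
]

# Binary and ordinal variables become discrete numeric codes.
late_flags = [BINARY_CODES[r["delivery"]] for r in records]
rating_scores = [ORDINAL_CODES[r["rating"]] for r in records]
fraction_late = mean(late_flags)

# The nominal variable is used only to stratify the numerical data.
days_by_center = defaultdict(list)
for r in records:
    days_by_center[r["center"]].append(r["days"])
mean_days_by_center = {c: mean(d) for c, d in days_by_center.items()}

print(fraction_late, rating_scores, mean_days_by_center)
```

Note that the ordinal codes preserve order only; averaging them treats the gaps between categories as equal, which the slide cautions is unknown.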
II. Review of Exploring Data Patterns and Descriptive Statistics
To characterize a variable, we typically observe a Sample from a Population and run statistical analysis (e.g., compute Statistics).
Some common statistical analyses/tools to characterize a variable include:
A. Data patterns regardless of time order
  Common Tools (Sample size, N > 30): Histogram, Box Plot
  If small sample size (e.g., N < 30): use Dot Plot
B. Data patterns in time order (i.e., to evaluate process stability over time)
  Run Chart (also known as trend chart or time series plot)
  Statistical Process Control (SPC) Chart (refer to SPC lecture)
C. Descriptive Statistics Summary Table - common statistics to report include:
  Sample Size, N
  Location Statistics: Mean and Median
  Dispersion (Variation) Statistics: St Dev, Variance, Range (with Min and Max)
  Symmetry and Peakedness of Distribution Shape: Skewness and Kurtosis
  Additional Statistics: Trimmed Mean, Quartiles, or Percentiles
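As a rough sketch of item C, the summary table could be computed with Python's standard library as shown below. The sample values are hypothetical, and the skewness formula (adjusted Fisher-Pearson) is an assumption about which definition to use; other software may apply a different one.

```python
from statistics import mean, median, stdev, variance

def describe(sample):
    """Return a small descriptive-statistics table for a numeric sample."""
    n = len(sample)
    avg = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 denominator)
    # Adjusted Fisher-Pearson skewness (needs n >= 3 and s > 0).
    skew = n / ((n - 1) * (n - 2)) * sum(((x - avg) / s) ** 3 for x in sample)
    return {
        "N": n, "mean": avg, "median": median(sample),
        "stdev": s, "variance": variance(sample),
        "min": min(sample), "max": max(sample),
        "range": max(sample) - min(sample), "skewness": skew,
    }

# Hypothetical measurements, not the course data set.
summary = describe([2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0])
for name, value in summary.items():
    print(name, value)
```

In this sample the mean (6.0) sits above the median (5.0) and the skewness is positive, the usual signature of a longer right tail.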
A. Histogram Example
Typical Y-Axis: frequency or relative frequency
  May use relative frequency (%) if sample size is large
May create using Excel or Minitab
Minitab Commands: >> Graph >> Histogram >> Select Variable: ShearForce
[Figure: Histogram of ShearForce. X-axis: ShearForce (0 to 24); Y-axis: Frequency (0 to 16)]
Note: Requirement is Shear Force >= 13 (Lower Specification
Limit (LSL) = 13)
Normal Vs. Skewed Data
Does shear force data appear normally distributed or another shape (e.g., skewed right, skewed left, or bi-modal)? Is this likely a natural phenomenon?
[Figure: example distribution shapes (Normal, Skewed Right, Skewed Left, Bi-Modal) shown with the Histogram of ShearForce]
Statistical Test - Normality
We may use Minitab to test for Normality.
  Null Hypothesis (Ho): Data are Normal; Alternative Hypothesis (Ha): Data are not Normal
  Test Conclusion: p-value is ~0.000 (note: if p-value < alpha, reject Ho). Here, p-value < 0.05, so we reject Ho.
Minitab Commands: >> Stat >> Basic Statistics >> Normality Test >> Select Variable: ShearForce
  Note: Selected Anderson Darling Test
  Default: alpha error = 0.05
[Figure: Normal probability plot of ShearForce. X-axis: ShearForce (0 to 40); Y-axis: Percent]
Box Plot Calculations
Q3 - 75th Percentile (Third quartile)
Median - 50th Percentile
Q1 - 25th Percentile (First quartile)
fs = Q3 - Q1
Upper Limit: Q3 + 1.5 fs
Lower Limit: Q1 - 1.5 fs
Upper Whisker: Highest value within upper limit
Lower Whisker: Lowest value within lower limit
Mild Outlier(s): Q1 - 3.0 fs < value < Q1 - 1.5 fs, or Q3 + 1.5 fs < value < Q3 + 3.0 fs
Extreme Outlier(s): value < Q1 - 3.0 fs, or value > Q3 + 3.0 fs
Excel Command (e.g., Q3): =PERCENTILE(data array, 0.75)
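The box plot calculations above can be sketched in Python as follows. The sample is hypothetical, and the function names are illustrative only. The quartiles use the standard library's "inclusive" method, which interpolates in the same way as Excel's PERCENTILE; other packages may compute quartiles slightly differently.

```python
from statistics import quantiles

def box_plot_limits(sample):
    """Quartiles, fourth spread (fs), and mild/extreme limits for a box plot."""
    q1, med, q3 = quantiles(sample, n=4, method="inclusive")
    fs = q3 - q1
    return {
        "q1": q1, "median": med, "q3": q3, "fs": fs,
        "lower_limit": q1 - 1.5 * fs, "upper_limit": q3 + 1.5 * fs,
        "lower_extreme": q1 - 3.0 * fs, "upper_extreme": q3 + 3.0 * fs,
    }

def classify(value, lim):
    """Label a value as 'extreme', 'mild', or 'ok' using the limits above."""
    if value < lim["lower_extreme"] or value > lim["upper_extreme"]:
        return "extreme"
    if value < lim["lower_limit"] or value > lim["upper_limit"]:
        return "mild"
    return "ok"

# Hypothetical sample, chosen so the quartiles land on data values.
sample = [1.0, 10.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 40.0]
lim = box_plot_limits(sample)
labels = {x: classify(x, lim) for x in sample}

# Whiskers extend to the most extreme values still inside the limits.
inside = [x for x in sample if labels[x] == "ok"]
whiskers = (min(inside), max(inside))
print(lim, whiskers, labels)
```

Here Q1 = 12 and Q3 = 16, so fs = 4; the value 1.0 falls between the 3.0 fs and 1.5 fs limits (mild) and 40.0 falls beyond Q3 + 3.0 fs (extreme).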
Box Plot - Shear Force
What does this box plot suggest?
Minitab Commands: >> Graph >> Boxplot >> Select Variable: Y = ShearForce
[Figure: Boxplot of ShearForce. Y-axis: ShearForce (0 to 30)]
Histogram Vs. Box Plot
Box plots provide a similar representation of distribution as a histogram (for Normal, skewed right, skewed left).
  Exception: must show multi-modal with histogram
[Figure: Histogram of ShearForce (x-axis: ShearForce, y-axis: Frequency) beside Boxplot of ShearForce (y-axis: ShearForce)]
Outlier Analysis (Extreme Values)
Box plots provide an effective tool to identify possible outliers.
Outliers are non-representative values in a data set and generally result from:
  measurement or data entry error (e.g., recording using wrong units)
  an observation being obtained under a different set of circumstances (e.g., special cause)
  data recorded during peak volume versus typical conditions
Outliers may significantly affect descriptive statistics such as mean/standard deviation and other statistics (e.g., correlation between two variables).
Outliers: Good Or Bad?
A common data analysis trap is to automatically exclude outliers.
  Outliers may suggest that a better set of operating conditions is available.
Unfortunately, deciding whether to include or exclude outliers is an experience-developed skill.
  Try to understand the source of outliers before discarding them.
If you decide to remove an outlier, some typical strategies are:
  With a large sample size, remove the entire observation.
  For smaller samples (N < 100) where you collect data on several variables, you may want to keep the sample. Here, we typically replace the outlier value with the median value for that variable. Why Median?
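One way to see why the median is a reasonable replacement value is the minimal sketch below. It assumes the 1.5 fs box plot rule is used to flag outliers, and the data are hypothetical; the slide does not prescribe a specific detection rule for this step.

```python
from statistics import mean, median, quantiles

def replace_outliers_with_median(values):
    """Replace values outside Q1 - 1.5 fs .. Q3 + 1.5 fs with the median."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    fs = q3 - q1
    low, high = q1 - 1.5 * fs, q3 + 1.5 * fs
    center = median(values)
    return [x if low <= x <= high else center for x in values]

# Hypothetical variable with one suspect reading (100.0).
raw = [10.0, 11.0, 12.0, 13.0, 14.0, 100.0]
cleaned = replace_outliers_with_median(raw)

# The mean is pulled strongly toward the outlier; the median barely moves.
print(mean(raw), median(raw))
print(cleaned, mean(cleaned))
```

Because the median is itself insensitive to the suspect value, substituting it keeps the rest of the observation usable without letting the outlier influence the replacement.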
Multiple Box Plots
During the analyze phase, we often stratify Box Plot results for a Y output by grouping variables (e.g., Nominal Variables).
  Is shear force consistent across all batches of incoming material?
Minitab Commands: >> Graph >> Boxplot >> Select Graph Variable: Y = ShearForce; X = Batch
[Figure: Boxplot of ShearForce vs Production Batch*. X-axis: Production Batch* (P1, P2, P3); Y-axis: ShearForce (0 to 30)]
B. Run Chart (Time Series Plot)
If time sequence is available, we often like to examine data by time (look for time trends).
Minitab Commands: >> Graph >> Time Series Plot >> Select Graph Variable: Y = ShearForce
[Figure: Time Series Plot of Shear Force (lb). X-axis: Index (1 to 60); Y-axis: Shear Force (lb) (0 to 30)]
C. Minitab - Descriptive Statistics
Another common analysis to perform during the measure phase is to compute descriptive statistics for Y (if Y may be evaluated as a continuous variable).
Minitab Command: >> Stat >> Basic Statistics
Descriptive Statistics: Shear Force (lb)
Variable          N   N*  Mean    SE Mean  StDev  Minimum  Q1      Median
Shear Force (lb)  60  0   17.670  0.883    6.841  1.400    11.350  20.200

Variable          Q3      Maximum  Skewness  Kurtosis
Shear Force (lb)  23.275  26.900   -0.75     -0.53
Or, use Excel to create a table with: N, Mean, StDev, Min, Max, Range, Skew
Questions: What does a skewness of -0.75 suggest? Why does the median differ from the mean for these data?
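As a quick consistency check on the table above, the derived statistics can be recomputed from the reported values. This sketch uses only numbers printed in the Minitab output; the variable names are illustrative.

```python
from math import sqrt

# Values copied from the Minitab table above.
n, st_dev = 60, 6.841
minimum, q1, q3, maximum = 1.400, 11.350, 23.275, 26.900

se_mean = st_dev / sqrt(n)        # standard error of the mean = StDev / sqrt(N)
data_range = maximum - minimum    # Range = Max - Min
fourth_spread = q3 - q1           # fs = Q3 - Q1 (box height in the box plot)

print(round(se_mean, 3), round(data_range, 3), round(fourth_spread, 3))
```

The recomputed standard error rounds to the 0.883 reported by Minitab.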
Stratification Analysis of Descriptive Statistics
May wish to stratify an output by an X variable.
Minitab Command: >> Stat >> Basic Statistics >> By Variable: Batch
What do these data suggest?
Descriptive Statistics: ShearForce
Production
Batch*  N   Mean    TrMean  StDev  Minimum  Median  Maximum
P1      20  22.170  22.272  2.859  16.200   22.450  26.300
P2      20  16.30   16.47   7.07   2.60     18.05   26.90
P3      20  14.55   14.71   7.32   1.40     12.30   24.70
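The stratified table ties back to the overall statistics: the overall mean is the N-weighted average of the batch means, and the overall minimum and maximum are the extremes across batches. A small sketch using only the values in the table (names are illustrative):

```python
# (N, mean, minimum, maximum) per batch, copied from the table above.
batches = {
    "P1": (20, 22.170, 16.200, 26.300),
    "P2": (20, 16.30, 2.60, 26.90),
    "P3": (20, 14.55, 1.40, 24.70),
}

total_n = sum(n for n, _, _, _ in batches.values())
# The overall mean is the N-weighted average of the batch means.
overall_mean = sum(n * m for n, m, _, _ in batches.values()) / total_n
overall_min = min(lo for _, _, lo, _ in batches.values())
overall_max = max(hi for _, _, _, hi in batches.values())

print(total_n, round(overall_mean, 2), overall_min, overall_max)
```

The result (N = 60, mean of about 17.67, minimum 1.40, maximum 26.90) agrees with the unstratified descriptive statistics, while the batch rows show that the mean and spread differ considerably by batch.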
III. Six Sigma Metric Calculations
1. Yield (e.g., Simple Quality Yield)
2. Defects Per Million (DPM) (Attribute Data)
   Note: DPM is also known as PPM, for parts per million defective
3. Defects Per Million Opportunities (DPMO)
4. Defects per Million (Observed DPM)
5. Defects per Million (Expected DPM)
Note: Other Six Sigma metrics are covered later in the course: Process Capability, Reliability, Rolled Throughput Yield
Specifications
To calculate Yield (or % defective, DPM, DPMO), we need standards or specification limits.
  LSL - Lower Specification Limit; USL - Upper Specification Limit
Specification limits identify acceptance levels.
Unilateral Specification Limit Examples
  Process time (upper limit only); Shear Force >= 13 lbs (lower limit only)
Bilateral Specification Limit Examples
  30 +/- 5 days (Nominal = 30; LSL = 25; USL = 35)
  Width: 1000 +/- 0.5 mm
1. Quality Yield (% Acceptable)
Quality Yield = (# Good Units) / (Total # Units) x 100%
  Unit: part, service, customer, document, procedure, etc.
Or, Yield = (1 - Fraction Defective) x 100%
  Where Fraction Defective = # Defective / Total # Units
  # Defective is a binary assessment (e.g., 0 = not late; 1 = late)
    Typical convention for binary: let defect = 1
Example: Suppose 232 of 1034 bills are late (802 are on-time); calculate the Quality Yield.
  Quality Yield = 802/1034 = 77.6%
DPM and DPMO Methods
Depending on type of data, we often convert Yield to defects per million (DPM) or defects per million opportunities (DPMO).
  Method used varies based on type of data/assumptions.
2. Defective Method for DPM
Suppose you have a process where each unit is classified as defective or not defective.
  DPM = Fraction Defective x 1 Million
  Note: Yield = 1 - Fraction Defective
  Fraction Defective = Total # Defective / Total # Units Inspected
Suppose you fabricate 4000 welds and find that 35 are defective. What is the DPM?
  Fraction Defective: 35 / 4000 = 0.008750
  DPM = 0.008750 x 1 Million = 8,750
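The yield and DPM formulas for attribute data can be written as small helper functions. This is a sketch with illustrative function names, applied to the late-bills and weld examples from the slides.

```python
def fraction_defective(n_defective, n_units):
    """Fraction Defective = # Defective / Total # Units."""
    return n_defective / n_units

def quality_yield_pct(n_defective, n_units):
    """Yield = (1 - Fraction Defective) x 100%."""
    return (1 - fraction_defective(n_defective, n_units)) * 100

def dpm(n_defective, n_units):
    """DPM = Fraction Defective x 1 Million."""
    return n_defective * 1_000_000 / n_units

bills_yield = quality_yield_pct(232, 1034)   # late bills example
weld_dpm = dpm(35, 4000)                     # defective welds example

print(round(bills_yield, 1), weld_dpm)
```

For the weld example, 35 defective units out of 4000 gives 8,750 DPM, and the bills example reproduces the 77.6% yield.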
3. # Defects per Unit Opportunity Method (DPMO)
Use if a particular inspection unit or part may have 1 or more defects (multiple opportunities).
Example: Suppose we visually inspect a weld manufacturing process for various conditions:
  A: Excess part deflection after welding
  B: Poor weld penetration
  C: Poor weld appearance (e.g., excess flash)
Note: each weld (unit) could have 0 - 3 defects
Defects per Million Opportunity (DPMO)
Here, we use opportunities to summarize the total number of possible chances for error (i.e., defects) in the system.
  DPMO = [Total # Defects / Total Opportunities (TOP)] x 1 Million
Where:
  Total # Defects = Total # of defects across all units
  Total Opportunities (TOP) = sum over defect categories i of (# Opportunities_i)
DPMO Example
Given the following data set of three features per unit:
  Part Feature  Defects
  A             22
  B             19
  C             18
  Total         59
Suppose you have 1,000 welds (TOP = 3 x 1,000 = 3,000)
  Fraction nonconforming = 59/3,000 = 1.967%
  DPMO = 19,667
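The weld DPMO example can be reproduced with a short sketch. The function and variable names are illustrative, and the calculation assumes every weld is inspected for all three features.

```python
def dpmo(total_defects, total_opportunities):
    """DPMO = (Total # Defects / Total Opportunities) x 1 Million."""
    return total_defects * 1_000_000 / total_opportunities

defects_by_feature = {"A": 22, "B": 19, "C": 18}
units = 1_000

# Every unit has one opportunity per feature.
top = len(defects_by_feature) * units
total_defects = sum(defects_by_feature.values())
weld_dpmo = dpmo(total_defects, top)

print(total_defects, top, round(weld_dpmo))
```

With 59 defects over 3,000 opportunities, the result rounds to the 19,667 DPMO shown on the slide.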
DPMO Hotel Survey Example - Varying Opportunities per Unit
The number of opportunities may vary by unit (customer).
  In the hotel example below, not all guests may use the hotel meal service.
  Here, the total opportunities is obtained by summing the opportunities for each category.
Given the following data set, what is the DPMO?
  Concern                     Guests  Defects (Not Satisfied)  Opportunities
  Poor Meal Service*          447     111                      447
  Poor House Keeping          1000    82                       1000
  Problems with Reservations  1000    34                       1000
  Long Check In               1000    96                       1000
  Long Check Out              1000    58                       1000
  Total                               381 (# defects)          4447 (TOP)
* Note: not all guests used a hotel meal service
Overall DPMO For Multiple Groups (Facilities)
DPMO also may be used to summarize multiple groups (e.g., departments, facilities).
  Note: Opportunities per group also provide a measure of complexity.
  For example, perhaps one of the hotels does not offer any meal services.
Defects by hotel:
  Hotel  Poor Meal Service*  Poor House Keeping  Problems with Reservations  Long Check In  Long Check Out  Total Defects  TOP
  A      111                 82                  34                          96             58              381            4447
  B      120                 89                  37                          102            62              410            5114
  C      n/a                 75                  28                          90             70              263            4225
  TOTAL                                                                                                     1054           13786
Overall DPMO = 1054/13786 x 1M = 76,454
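The hotel calculations, for a single hotel and for all three combined, can be sketched as follows. The dictionary layout and names are illustrative; the defect counts and TOP values are taken from the hotel tables.

```python
def dpmo(total_defects, total_opportunities):
    """DPMO = (Total # Defects / Total Opportunities) x 1 Million."""
    return total_defects * 1_000_000 / total_opportunities

# Defects per concern and total opportunities (TOP), from the hotel tables.
hotels = {
    "A": {"defects": [111, 82, 34, 96, 58], "top": 4447},
    "B": {"defects": [120, 89, 37, 102, 62], "top": 5114},
    "C": {"defects": [75, 28, 90, 70], "top": 4225},  # meal service is n/a
}

per_hotel = {name: dpmo(sum(h["defects"]), h["top"]) for name, h in hotels.items()}

# Overall DPMO pools defects and opportunities before dividing.
total_defects = sum(sum(h["defects"]) for h in hotels.values())
total_top = sum(h["top"] for h in hotels.values())
overall = dpmo(total_defects, total_top)

print({k: round(v) for k, v in per_hotel.items()}, round(overall))
```

Pooling weights each hotel by its number of opportunities, so the overall figure is generally not the simple average of the per-hotel DPMO values.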
DPMO - The Denominator Game
Suppose we measure 200,000 units with 1 feature per unit. What happens to the DPMO as the # of features (concerns) with NO defects increases?
  Feature  # Defects  # Opportunities  DPMO
  A        3          200,000          15
  B        0          200,000          0
  C        0          200,000          0
  D        0          200,000          0
  E        0          200,000          0
  Total Defects: 3; Total Opportunities: 1,000,000
  Combined DPMO = 3 (3 / 1M)
NOTE: Features MUST BE customer related and should not just be added to improve DPMO.
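The effect of the denominator game can be seen by recomputing DPMO as defect-free features are added one at a time. A minimal sketch using the numbers from the table (names are illustrative):

```python
defects = 3          # all 3 defects occur on feature A
units = 200_000      # each added feature contributes 200,000 opportunities

# DPMO when 1, 2, ..., 5 features are counted as opportunities.
dpmo_by_feature_count = {
    k: defects * 1_000_000 / (k * units) for k in range(1, 6)
}

for k, value in dpmo_by_feature_count.items():
    print(k, value)
```

The DPMO falls from 15 with one feature to 3 with five, even though the number of defects has not changed.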
Denominator Game Example
Suppose you have a hole specification
Could you have one defect opportunity for oversized and another
for undersized?
What if we added the category missing weld to our example? How
might we include that in determining total opportunities?
4. Variable Data Method for Observed DPM
If you collect numerical measurements for a characteristic (dimension) of each unit, you may convert each observation to a binary result based on the specification limits of the characteristic and then compute DPM.
  Either In-Specification or Out-of-Specification (Defect)
Here, fraction defective = # units observed out-of-specification / total # units
  DPM = Fraction Defective x 1 Million
  Also known as parts per million (PPM) defective
DPM Example: Shear Force (based on Observed Out-of-Specification)
Specifications: OK if shear force >= 13
To compute DPM, need to convert each observed measurement to a binary output (0 = within specification, 1 = outside specification or a defect).
Note: Observed DPM also may be obtained using Minitab with the Process Capability Summary Analysis Tool.
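The conversion from measurements to a binary outcome can be sketched as below. The readings are hypothetical (the course data set is not reproduced here); only the LSL of 13 comes from the slides.

```python
LSL = 13.0  # lower specification limit: OK if shear force >= 13

# Hypothetical shear force readings (lb), not the course data set.
shear_force = [14.2, 12.9, 20.1, 8.4, 13.0, 25.3, 11.7, 18.6, 22.0, 16.5]

# 0 = within specification, 1 = outside specification (defect).
defect_flags = [0 if x >= LSL else 1 for x in shear_force]

observed_dpm = sum(defect_flags) * 1_000_000 / len(defect_flags)
print(defect_flags, observed_dpm)
```

A reading exactly at the limit (13.0) counts as within specification because the requirement is "greater than or equal to".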
5. Variable Data Method for Expected DPM
Used when collecting variable data and data may be reasonably assumed to follow a known or assumed distribution (e.g., normal).
Use software to fit data to a statistical distribution (e.g., Normal Distribution) and estimate the probability (Pr) of a defect based on the distribution and its properties.
  Expected (Predicted) DPM = Pr (Defect) x 1 Million
[Figure: Normal distribution example with a bilateral tolerance; the tail areas below the Lower Specification and above the Upper Specification are defects]
Expected DPM Using Minitab
Capability Analysis: Minitab will compute expected DPM (based on assumed distribution). (Note: we will examine non-normal distributions in a later module, or see appendix.)
Suppose we assume Normality.
Note: Menu will vary based on Minitab version used (Version 14 shown).
Minitab Process Capability Analysis (excellent all-in-one analysis tool**)
Minitab (Version 14) Command: Stat >> Quality Tools >> Capability Analysis (Normal)
  Variable: ShearForce; Subgroup Size ~ 1; LSL = 13
  (Minitab assumptions: unbiasing constants, average moving range method with length = 2)
Does the Normality assumption matter in this example?
[Figure: Process Capability of ShearForce. Histogram (x-axis 0 to 30) with LSL marked and Within and Overall fitted curves]
Process Data
  LSL             13
  Target          *
  USL             *
  Sample Mean     17.67
  Sample N        60
  StDev(Within)   4.56185
  StDev(Overall)  6.86963
Potential (Within) Capability
  Cp    *
  CPL   0.34
  CPU   *
  Cpk   0.34
  CCpk  0.34
Overall Capability
  Pp    *
  PPL   0.23
  PPU   *
  Ppk   0.23
  Cpm   *
Observed Performance
  PPM < LSL  316666.67
  PPM > USL  *
  PPM Total  316666.67
Exp. Within Performance
  PPM < LSL  152986.54
  PPM > USL  *
  PPM Total  152986.54
Exp. Overall Performance
  PPM < LSL  248314.53
  PPM > USL  *
  PPM Total  248314.53
Observed DPM: 316,667
Expected (Predicted) DPM: 248,314
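The expected PPM values in the capability output can be approximated directly from the reported mean and standard deviations under the Normal assumption. In this sketch the function name is illustrative, the count of 19 observed defects is inferred from 316,666.67 PPM x 60 units rather than stated on the slide, and small differences from Minitab are possible because the printed mean is rounded.

```python
from statistics import NormalDist

# Values copied from the Minitab Process Data block.
lsl, sample_mean, n = 13.0, 17.67, 60
st_dev_within, st_dev_overall = 4.56185, 6.86963

def expected_ppm_below(limit, mean, st_dev):
    """Expected PPM = Pr(X < limit) x 1 Million for a Normal distribution."""
    return NormalDist(mean, st_dev).cdf(limit) * 1_000_000

exp_overall = expected_ppm_below(lsl, sample_mean, st_dev_overall)
exp_within = expected_ppm_below(lsl, sample_mean, st_dev_within)

# One-sided capability indices against the lower specification limit.
ppl = (sample_mean - lsl) / (3 * st_dev_overall)
cpl = (sample_mean - lsl) / (3 * st_dev_within)

# 19 observed defects is inferred from 316666.67 PPM x 60 units.
observed_ppm = 19 * 1_000_000 / n

print(round(exp_overall), round(exp_within), round(ppl, 2), round(cpl, 2))
```

The gap between the observed PPM (about 316,667) and the expected overall PPM (about 248,314) is one indication that the Normal assumption matters for these data.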
Observed Vs. Expected DPM
If we collect variable data (e.g., continuous) and have specifications, we may always convert to a binary outcome and compute Observed DPM.
Or, we can predict the DPM (Expected DPM) by fitting sample data to a distribution and then determining the probability of a defect x 1M.
Of note: neither is wrong; ultimately you want to use the most representative estimate. Rule of thumb:
  If data reasonably fit a distribution shape (e.g., Normal or Weibull), report the Expected (Predicted) DPM, particularly if data are from a smaller sample size (e.g., 30-100).
  If data do not reasonably fit a distribution and a large sample size is available (> 200), use Observed DPM.
  If not sure, report them both in the current state.
    Note: data often are not normal when assessing the current state during the measure phase, as some problems create non-normality.
Summary
In the measure phase, we typically include:
  Histogram and/or Box Plot of raw data (if continuous data)
    May include Normality Test or Distribution ID Probability Plot Analysis (see appendix)
  Run Chart (or SPC Chart) to show any time series trends
  Summary Statistics (if continuous data)
    N, mean, median, standard deviation, variance, min, max, range, skew
  Estimate of Current State in terms of: Yield, DPM, or DPMO
    Calculations vary depending on type of data, best fit distribution, defect opportunity classification, # opportunities for defect per unit, etc.
    For numerical variables, use Expected DPM for smaller sample sizes (< 100), particularly if data reasonably fit a known distribution. For larger sample sizes, may use Observed DPM and/or Expected DPM (if good distribution fit).
    When identifying opportunities for DPMO, they should be important to the customer and independent of other categories (avoid the denominator game).
Appendix: Distribution ID Plot
Minitab has a tool to help determine best distribution fit.
  Stat >> Reliability/Survival >> Distribution Analysis (Right Censoring) >> Distribution ID Plot
  Choose distribution with highest correlation coefficient / lowest AD score
Common Distribution Options: Weibull (best result), Exponential, Lognormal, Normal; others available
Shear Force Results
Best Result - look for:
  lowest AD score based on maximum likelihood estimation
  highest correlation coefficient based on Least Squares Estimation
Here, we do not have a good distribution fit for any of the options (recall, bi-modal)!
[Figure: Probability Plot for ShearForce, LSXY Estimates - Complete Data. Four panels (Weibull, Lognormal, Exponential, Normal); x-axis: ShearForce; y-axis: Percent]
Correlation Coefficient
  Weibull      0.948
  Lognormal    0.865
  Exponential  *
  Normal       0.954
Use Best Fit Distribution to Estimate DPM
Note: topic covered in process capability analysis module
Select Desired Distribution