Gauging Gage Part 1: Is 10 Parts Enough?
"You take 10 parts and have 3 operators measure each 2 times."
This standard approach to a Gage R&R experiment is so common, so accepted, so ubiquitous
that few people ever question whether it is effective. Obviously one could look at whether 3 is
an adequate number of operators or 2 an adequate number of replicates, but in this first of a
series of posts about "Gauging Gage," I want to look at 10. Just 10 parts. How accurately can
you assess your measurement system with 10 parts?
Assessing a Measurement System with 10 Parts
I'm going to use a simple scenario as an example. I'm going to simulate the results of 1,000
Gage R&R Studies with the following underlying characteristics:
1. There are no operator-to-operator differences, and no operator*part interaction.
2. The measurement system variance and part-to-part variance used would result in a
%Contribution of 5.88%, which falls between the popular guidelines of <1%
(excellent) and >9% (poor).
So, no looking ahead here: based on my 1,000 simulated Gage studies, what do you think the
distribution of %Contribution looks like across all studies? Specifically, do you think it is
centered near the true value (5.88%), or do you think the distribution is skewed, and if so, how
much do you think the estimates vary?
Go ahead and think about it...I'll just wait here for a minute.
Okay, ready?
Here is the distribution, with the guidelines and true value indicated:
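If you'd like to reproduce the experiment yourself, here is a minimal sketch of the simulation. The specific variance components (1 for the gage, 16 for parts, giving 1/(1+16) = 5.88%) are my assumption; the post only states the resulting %Contribution. With no operator effect, the 3 operators x 2 replicates collapse to 6 pooled measurements per part, and a one-way random-effects ANOVA recovers the components:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed parameterization: sigma^2_gage = 1 and sigma^2_part = 16 give
# %Contribution = 1 / (1 + 16) = 5.88%.
SIGMA2_GAGE, SIGMA2_PART = 1.0, 16.0
N_PARTS, N_MEAS = 10, 6  # 3 operators x 2 replicates, no operator effect

def simulate_study():
    """One simulated Gage R&R study; returns the estimated %Contribution."""
    part_means = rng.normal(0.0, np.sqrt(SIGMA2_PART), N_PARTS)
    data = part_means[:, None] + rng.normal(
        0.0, np.sqrt(SIGMA2_GAGE), (N_PARTS, N_MEAS))
    # One-way random-effects ANOVA estimates of the variance components
    ms_error = data.var(axis=1, ddof=1).mean()        # repeatability MS
    ms_part = N_MEAS * data.mean(axis=1).var(ddof=1)  # between-part MS
    s2_part = max((ms_part - ms_error) / N_MEAS, 0.0)
    return 100.0 * ms_error / (ms_error + s2_part)

results = np.array([simulate_study() for _ in range(1000)])
print(f"median estimate: {np.median(results):.2f}%  (true value 5.88%)")
print(f"middle 90% of estimates: {np.percentile(results, [5, 95]).round(2)}")
```

Running this shows the key feature of the plot: the estimates are right-skewed and spread surprisingly far around the true 5.88%.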
Gauging Gage Part 2: Are 3 Operators or 2 Replicates Enough?
In Part 1 of Gauging Gage, I looked at how adequate a sampling of 10 parts is for a Gage
R&R Study and provided some advice based on the results.
Now I want to turn my attention to the other two factors in the standard Gage experiment: 3
operators and 2 replicates. Specifically, what if instead of increasing the number of parts in the
experiment (my previous post demonstrated you would need an unfeasible increase in parts),
you increased the number of operators or number of replicates?
In this study, we are only interested in the effect on our estimate of overall Gage variation.
Obviously, increasing operators would give you a better estimate of the operator term and
reproducibility, and increasing replicates would give you a better estimate of repeatability. But I
want to look at the overall impact on your assessment of the measurement system.
Operators
First we will look at operators. Using the same simulation engine I described in Part 1, this time
I did two different simulations. In one, I increased the number of operators to 4 and continued
using 10 parts and 2 replicates (for a total of 80 runs); in the other, I increased to 4 operators
and still used 2 replicates, but decreased the number of parts to 8 to get back close to the
original experiment size (64 runs compared to the original 60).
Here is a comparison of the standard experiment and each scenario laid out here:
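The comparison can be sketched by rerunning the same simulation for each design and looking at how widely the estimates spread. As before, the variance components (1 and 16, for a true %Contribution of 5.88%) are my assumed values, not ones stated in the post:

```python
import numpy as np

rng = np.random.default_rng(2)
# Assumed variance components: gage = 1, part = 16 -> true %Contribution 5.88%
SIGMA2_GAGE, SIGMA2_PART = 1.0, 16.0

def estimate(n_parts, n_ops, n_reps):
    """Estimated %Contribution for one simulated study of the given design."""
    n = n_ops * n_reps  # no operator effect, so measurements pool per part
    means = rng.normal(0.0, np.sqrt(SIGMA2_PART), n_parts)
    data = means[:, None] + rng.normal(0.0, np.sqrt(SIGMA2_GAGE), (n_parts, n))
    ms_error = data.var(axis=1, ddof=1).mean()
    s2_part = max((n * data.mean(axis=1).var(ddof=1) - ms_error) / n, 0.0)
    return 100.0 * ms_error / (ms_error + s2_part)

designs = {"10 parts, 3 operators (60 runs)": (10, 3, 2),
           "10 parts, 4 operators (80 runs)": (10, 4, 2),
           " 8 parts, 4 operators (64 runs)": (8, 4, 2)}
spreads = {}
for name, d in designs.items():
    est = np.array([estimate(*d) for _ in range(1000)])
    spreads[name] = np.percentile(est, 95) - np.percentile(est, 5)
    print(f"{name}: 5th-95th percentile spread = {spreads[name]:.1f}")
```

The spread of the estimates, rather than their center, is what distinguishes the designs here.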
Gauging Gage Part 3: How to Sample Parts
In Parts 1 and 2 of Gauging Gage we looked at the numbers of parts, operators, and
replicates used in a Gage R&R Study and how accurately we could estimate %Contribution based
on the choice for each. In doing so, I hoped to provide you with valuable and interesting
information, but mostly I hoped to make you like me. I mean like me so much that if I told you
that you were doing something flat-out wrong and had been for years and probably screwed
some things up, you would hear me out and hopefully just revert to being indifferent
towards me.
For the third (and maybe final) installment, I want to talk about something that drives me
crazy. It really gets under my skin. I see it all of the time, maybe more often than not. You
might even do it. If you do, I'm going to try to convince you that you are very, very wrong. If
you're an instructor, you may even have to contact past students with groveling apologies and
admit you steered them wrong. And that's the best-case scenario. Maybe instead of admitting
error, you will post scathing comments on this post insisting I am wrong and maybe even
insulting me despite the evidence I provide here that I am, in fact, right.
Let me ask you a question:
When you choose parts to use in a Gage R&R Study, how do you choose them?
If your answer to that question required any more than a few words (and it can be done in
one word), then I'm afraid you may have been making a very popular but very bad decision. If
you're in that group, I bet you're already reciting your rebuttal in your head now, without even
hearing what I have to say. You've had this argument before, haven't you? Consider whether
your response was some variation on the following popular schemes:
1. Sample parts at regular intervals across the range of measurements typically seen
2. Sample parts at regular intervals across the process tolerance (lower spec to upper
spec)
3. Sample randomly but pull a part from outside of either spec
#1 is wrong. #2 is wrong. #3 is wrong.
You see, the statistics you use to qualify your measurement system are all reported relative to
the part-to-part variation, and all of the schemes I just listed fail to accurately estimate your
true part-to-part variation. The answer to the question that would have provided the most
accurate assessment is a single word: randomly.
But enough with the small talk —this is a statistics blog, so let's see what the statistics say.
In Part 1 I described a simulated Gage R&R experiment, which I will repeat here using the
standard design of 10 parts, 3 operators, and 2 replicates. The difference is that in only one set
of 1,000 simulations will I randomly pull parts, and we'll consider that our baseline. The other
schemes I will simulate are as follows:
1. An "exact" sampling - while not practical in real life, this pulls parts corresponding
to the 5th, 15th, 25th, ..., and 95th percentiles of the underlying normal distribution
and forms a (nearly) "exact" normal distribution as a means of seeing how much
the randomness of sampling affects our estimates.
2. Parts are selected uniformly (at equal intervals) across a typical range of parts seen
in production (from the 5th to the 95th percentile).
3. Parts are selected uniformly (at equal intervals) across the range of the specs, in
this case assuming the process is centered with a Ppk of 1.
4. 8 of the 10 parts are selected randomly, and then one part each is used that lies
one-half of a standard deviation outside of the specs.
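To see why the non-random schemes mislead, it's enough to look at the part-to-part variance each one produces. The sketch below assumes a part sigma of 4 (so gage variance 1 gives the true 5.88% %Contribution) and specs at plus or minus 3 sigma for Ppk = 1; both numbers are my assumptions for illustration:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(3)
PART_SD = 4.0        # assumed part-to-part sigma (gage sigma 1 -> 5.88%)
SPEC = 3 * PART_SD   # centered process with Ppk = 1 -> specs at +/-3 sigma
dist = NormalDist(0.0, PART_SD)

pcts = [p / 100 for p in range(5, 100, 10)]  # 5th, 15th, ..., 95th
schemes = {
    "random":        rng.normal(0.0, PART_SD, 10),
    "exact":         np.array([dist.inv_cdf(p) for p in pcts]),
    "uniform 5-95":  np.linspace(dist.inv_cdf(0.05), dist.inv_cdf(0.95), 10),
    "across specs":  np.linspace(-SPEC, SPEC, 10),
    "outside specs": np.append(rng.normal(0.0, PART_SD, 8),
                               [-(SPEC + PART_SD / 2), SPEC + PART_SD / 2]),
}
variances = {name: float(parts.var(ddof=1)) for name, parts in schemes.items()}
for name, s2 in variances.items():
    # Inflating the apparent part variance deflates %Contribution,
    # making the gage look better than it really is.
    print(f"{name:>13}: part variance = {s2:6.1f}, "
          f"implied %Contribution = {100 * 1 / (1 + s2):.1f}")
```

The true part variance is 16; schemes 2 through 4 all inflate it, which is exactly why they flatter the measurement system.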
Keep in mind that we know with absolute certainty that the underlying %Contribution is
5.88325%.
Random Sampling for Gage
Let's use "random" as the default to compare to, which, as you recall from Parts 1 and 2,
already does not provide a particularly accurate estimate:
On several occasions I've had people tell me that you can't just sample randomly because you
might get parts that don't really match the underlying distribution.
Sample 10 Parts that Match the Distribution
So let's compare the results of random sampling from above with our results if we could
magically pull 10 parts that follow the underlying part distribution almost perfectly, thereby
eliminating the effect of sampling error.
To really assess a measurement system, I advise performing both a Bias and Linearity Study as
well as a Gage R&R.
Which Sampling Scheme to Use?
In the beginning I suggested that a random scheme be used but then clearly illustrated that the
"exact" method provides even better results. Using an exact method requires you to know the
underlying distribution from having enough previous data (somewhat reasonable although
existing data include measurement error) as well as to be able to measure those parts
accurately enough to ensure you're pulling the right parts (not too feasible...if you know you can
measure accurately, why are you doing a Gage R&R?). In other words, it isn't very realistic.
So for the majority of cases, the best we can do is to sample randomly. But we can do a reality
check after the fact by looking at the average measurement for each of the parts chosen and
verifying that the distribution seems reasonable. If you have a process that typically shows
normality and your sample shows unusually high skewness, there's a chance you pulled an
unusual sample and may want to pull some additional parts to supplement the original sample.
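That reality check takes only a few lines. Here is a sketch using sample skewness of the per-part averages; the data values are hypothetical, chosen so one part is a clear outlier:

```python
import numpy as np

def sample_skewness(x):
    """Sample skewness (g1) of a set of values."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float((z ** 3).mean())

# Hypothetical per-part averages from a completed study; the last part
# is a clear outlier relative to the rest.
part_averages = [9.8, 10.1, 10.4, 9.6, 10.0, 10.2, 9.9, 10.3, 9.7, 12.9]
g1 = sample_skewness(part_averages)
print(f"skewness of part averages: {g1:.2f}")
# For a process that usually looks symmetric, |g1| well above ~1 with
# only 10 parts hints that the sample may be unrepresentative.
```

A strongly positive (or negative) value like this one is the signal to pull a few more parts before trusting the study.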