
Pocket guide to statistical analysis techniques

Pocket guide to statistical analysis techniques for use with tightening tools

Chapter

1. Introduction

2. Basic statistics
2.1 Variation
2.2 Distribution
2.3 Histogram
2.4 Mean value
2.5 Standard deviation
2.6 Estimation of a normal distribution
Sample mean and standard deviation

3. Accuracy requirements
3.1 Mean shift and combined scatter
Example

4. Understanding processes

5. Capability
5.1 Cp
5.2 Cpk
5.3 When is a process capable?
5.4 Machine capability indices
5.5 What else is there to think about?

6. Control charts
6.1 X-bar charts
6.2 The subgroup
6.3 Alarms
6.4 Range charts
6.5 Control charts conclusion

Summary

Appendix
A1. Example of statistics calculation
A2. Example of capability calculation
A3. Example of control chart calculation
A4. Analysis of assembly tool performance – ISO 5393 Calculation


1. Introduction

The purpose of this guide is to explain the basics of statistics and how statistics can be used in production. You will learn that with the aid of statistics we can compare tools with each other, we can tell whether a tool is good enough for a specific application, and by using Statistical Process Control (SPC) we can see how a production process develops over time. Our hope is that you, after reading this guide, will have a general knowledge and understanding of the potential of using statistics as a tool in production.


2. Basic statistics

2.1 Variation

Understanding statistics is much about understanding variation. Variation is present everywhere, in nature as well as in industrial processes. In industrial processes, even a slight deviation from the target value, a dimension for instance, may have a strong influence on the functionality of the finished product. This means it is important to understand, and in some cases control, variation.

There are two different kinds of variation. Random variations are predictable, always present and with many contributing causes. Examples of random variations are small variations in hole diameter, inconsistent friction, operator influence and variations in air pressure. It is hard to isolate one of these causes. The variations are tackled by improvement of the process. Random variations are natural and depend on the process and its environment. They are also called common causes.

Systematic variations are sporadic and isolated. They are not predictable but it is often easy to pinpoint the cause. They are tackled by controlling the process. Systematic variation has a determined cause and can often be identified and eliminated. Examples are machine adjustments, wear of tools and human error. They are also called special causes.

A great deal of importance has been placed on the use of statistical analysis techniques to control the quality of the assembly process. The traditional method of using these techniques is to analyze what has already occurred and, when a problem is identified, to adjust the process accordingly. It is now becoming increasingly common to use statistical techniques to predict how the process will perform in the future and to identify systematic variations and adjust the process before we end up with faulty products.

Figure 1. Variations in air pressure and operator influence are examples of random variations.

Figure 2. Human errors like missing washers and using wrong screws are examples of systematic variations.


2.2 Distribution

Consider a tightening process where we measure the torque applied to a bolt. As you know, we would not achieve the same readings for all tightenings. Suppose we collect enough readings to create a plot of the frequency (the number of times a particular reading occurred) against the actual torque readings. The result would be a plot similar to the one in figure 3 below. In statistical analysis this curve is known as a “distribution”. There are many different types of distribution, but the one that best describes this example (and others like it) is called the Normal or Gaussian distribution.

A normal distribution is always symmetrical and determined by the mean and the standard deviation. A normal distribution only occurs when random variations affect the result.

2.3 Histogram

A histogram is created by dividing the results into categories (for example, all results between 20 and 21 Nm), counting the number of results in every category and plotting the counts in a diagram. This makes it possible to visualize the distribution with a fairly limited number of results.
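The counting step can be sketched in a few lines of Python. This is only an illustration: the torque readings and the 1 Nm category width are made-up values, and the function name `histogram` is hypothetical.

```python
import math
from collections import Counter

def histogram(readings, bin_width=1.0):
    """Count how many readings fall in each category [lower, lower + bin_width)."""
    counts = Counter(math.floor(r / bin_width) * bin_width for r in readings)
    return dict(sorted(counts.items()))

# Hypothetical torque readings in Nm (illustrative values only).
torques = [19.6, 20.2, 20.4, 20.7, 20.9, 21.1, 21.3, 20.5, 19.9, 20.1, 21.8, 20.6]

# Print a simple text histogram, one row per 1 Nm category.
for lower, count in histogram(torques).items():
    print(f"{lower:4.0f} - {lower + 1:2.0f} Nm | {'#' * count}")
```

With more readings and narrower categories, the rows of such a diagram start to trace out the shape of the distribution.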

2.4 Mean value

A normal distribution can be found everywhere, both in nature and in industrial processes. If we have a big sample of measurements, e.g. we have made 1000 tightenings with one tool, we can make a histogram. The more tightenings we have, the better the curve we get. If we were to measure the height of all Swedish men, we would get an average (mean value) of 1.80 m. The mean value is the most common value in a normal distribution. There are not that many men that are really tall or really short. Another example could be when you cut off a stick. The target value is 20.00 cm and this would probably also be the mean value. However, some parts become only 19.90 cm and others 20.10 cm, which is due to the natural variation of the process and is normal.

Figure 3. Histogram.

Figure 4. Normal distributions can be found everywhere. The height of people is one example. Another example can be if you try to cut off sticks to the same length.


2.5 Standard deviation

If a tool is used for a very large number of tightenings at a set torque of e.g. 30 Nm, it is unlikely that every single tightening will reach this torque value exactly. This will be the case even if the tool is run on the same screw joint, a test fixture. Random factors, such as material wear and different handling of the tool, may cause the applied torque to exceed or fall below the intended torque. The readings are said to deviate from the mean and we measure this with what is known as standard deviation.

It is not essential to fully understand the formula, which is presented later. But it is helpful if you know how to calculate it, and it is crucial that you understand what it is! The standard deviation is a measure of how much a typical reading deviates from the average.

What is the practical use of standard deviation? We have already said that the mean tells us the average value of the distribution (all different tightenings) and standard deviation indicates the scatter. We can use it to estimate how many of our values will come within a certain range. The standard deviation may be more accurately described as a calculation of how far a known percentage of the distribution lies from the mean.

σ is a letter in the Greek alphabet and it is used to symbolize the standard deviation of any distribution, i.e. the deviation from the mean (average). For a business or manufacturing process, the σ value indicates how well that process is performing. A low σ value indicates that most of the values are close to the target. A high σ value indicates that the spread is big and that the values deviate more from the target value.

If you have 20 values of a population, you are able to group them as shown in the figure. We make the assumption that they belong to a normal distribution. This is in fact the “area” within which you will get the next tightening. There is a 100% probability of getting inside the entire range. It is mathematically proven that, for a normal distribution, about
• 68% of all data lie within +/– σ of the mean,
• 95% of all data lie within +/– 2σ of the mean, and
• 99.7% of all data lie within +/– 3σ of the mean.

It is an important characteristic of the normal distribution that the standard deviation is symmetrical around the mean, and always covers the same percentage of the distribution. This is a mathematical law.
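These percentages can be checked numerically. The sketch below uses the standard relation between the normal distribution and the error function, P(|x − µ| < kσ) = erf(k/√2); the function name `coverage` is only chosen for this illustration.

```python
import math

def coverage(k):
    """Share of a normal distribution lying within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"+/- {k} sigma covers {100 * coverage(k):.2f} % of the values")
```

The results are slightly more precise than the rounded figures usually quoted, but the rounded 68 / 95 / 99.7 values are what is normally used in practice.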

Figure 5. We always know how many percent of our values we will have within a certain range.


This now brings us to something very useful. Now that we know the percentage of the values that will end up within a certain σ boundary, we can predict how the process will behave in the future. Do you remember the discussion about random and systematic variation? We said that for a normal distribution all systematic variations are eliminated and only random variation is present. We now also know that 99.7% of all values are within 6σ (or +/– 3σ). This enables us to make an important assumption: even though 0.3% of all tightenings will fall outside the 6σ limits for a normal distribution, we assume that all tightenings outside these limits happen because of systematic variations in the process. This means that something new has entered the process and it is not under control any more.

To make things clearer, we assume that as long as we have tightenings within the 6σ limits, the process is only affected by random variations and is under control. When we have tightenings outside the 6σ limits, the process is affected by systematic variation and is not under control. When this happens, it means that something new and strange has started to affect the tightening process and we need to find the reason for this and eliminate it. The following graphs show a comparison of two different normal distributions.

Figure 6. The first picture shows two curves with the same average, but different deviation. The second picture shows two curves with the same deviation but different averages.


2.6 Estimation of a normal distribution

When we talk about measurements or readings on an application, we can calculate an average and a standard deviation. If we were to measure an infinite number of tightenings, we would know for sure that we have the true value of the mean and the standard deviation. This is the population mean and the population standard deviation. But in reality this is not possible and we have to rely on a limited number of tightenings. In statistics we talk about a sample; in the tightening business we talk about a subgroup or a batch. This means that we cannot really know for sure that our calculations (mean and standard deviation) are correct, since they are only based on a limited number of tightenings. In fact, what we have is an estimate of the real values. The more tightenings we have on which to base our calculations, the more sure we can be that we are close to the population mean and standard deviation.
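A small simulation can make this tangible. The sketch below draws simulated tightenings from a normal distribution with an assumed "true" mean of 30 Nm and standard deviation of 1 Nm (both values, the seed and the function name `estimate` are assumptions made for this illustration) and shows how the estimates behave as the number of tightenings grows.

```python
import random
import statistics

def estimate(n, mu=30.0, sigma=1.0, seed=1):
    """Simulate n tightenings and return the estimated mean and standard deviation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    return statistics.mean(sample), statistics.stdev(sample)

for n in (5, 25, 1000, 10000):
    mean, std = estimate(n)
    print(f"n = {n:5d}: mean = {mean:.3f} Nm, standard deviation = {std:.3f} Nm")
```

With only a handful of tightenings the estimates can be noticeably off; with thousands they settle close to the assumed true values.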

We say that the average value of the distribution is the population mean (µ) and the scatter is represented by the population standard deviation (σ). The population mean (µ) is calculated by:

$$\mu = \frac{\sum_{i=1}^{n} x_i}{n}$$

that is, Σx, the sum of all tightenings, divided by the total number of tightenings (n).

The population standard deviation (σ) is calculated by:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$$

Where:
$x_i$ is the value of each individual occurrence, the i-th measurement of variable x
$n$ is the total number of occurrences in the population
$\sum_{i=1}^{n} x_i$ is the value of all occurrences added together (the sum)
$\sum_{i=1}^{n} (x_i - \mu)^2$ is the sum of all values of $(x_i - \mu)^2$

We take the value of each individual occurrence minus µ, the mean, and square this new value. Then we add all these squared values together and divide the sum by the number of tightenings. Finally, we need to take the square root of this value, as we have (Nm)² and need Nm, and we get the population standard deviation. The square and the root are only there so that positive and negative deviations from the mean do not cancel each other out.

Figure 7. It is impossible to measure the entire population. We have to rely on a limited number of values, a sample or a batch.

However, in practice it is very rare that we can measure every occurrence of the data. In fact, n would then have to be infinite, which of course is impossible. Instead we use a representative sample to predict the mean and standard deviation of the population.

Sample mean and standard deviation

We calculate the sample mean ($\bar{x}$) in the same way as the population mean (µ):

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

The calculation for the sample standard deviation (s) differs slightly from the population standard deviation (σ):

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Where:
$x_i$ is the value of each individual occurrence in the sample
$n$ is the total number of occurrences in the sample
$\sum_{i=1}^{n} x_i$ is the sum of the values of all occurrences in the sample
$\sum_{i=1}^{n} (x_i - \bar{x})^2$ is the sum of all values of $(x_i - \bar{x})^2$

The use of (n – 1) instead of (n) gives a more accurate estimate of the population standard deviation, σ, and is very important when small sample sizes are used. So remember that we can never use the total population in our calculations; that is impossible. We have to use smaller samples and calculate estimates of the real average and the real standard deviation.

Thus, the sample mean ($\bar{x}$) is an estimation of the population mean (µ). The sample standard deviation (s) is an estimation of the population standard deviation (σ).
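The difference between dividing by n and by (n – 1) can be seen in a minimal Python sketch. The five readings are hypothetical, and the function names are chosen only for this illustration.

```python
import math

def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)

def population_std(values):
    """Standard deviation with n in the denominator (whole population known)."""
    mu = mean(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def sample_std(values):
    """Standard deviation with (n - 1) in the denominator (estimate from a sample)."""
    x_bar = mean(values)
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (len(values) - 1))

# Hypothetical torque readings in Nm (illustrative values only).
readings = [29.0, 30.0, 31.0, 30.0, 30.0]

print(f"mean            = {mean(readings):.3f} Nm")
print(f"divided by n    = {population_std(readings):.3f} Nm")
print(f"divided by n-1  = {sample_std(readings):.3f} Nm")
```

For a small sample like this one the (n – 1) version is clearly larger; as the sample grows the two values move closer together.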


3. Accuracy requirements

In a tightening application there are often accuracy requirements of the tools. Accuracy requirements are written as a target torque +/– a maximal acceptable deviation from the target, for example +/– 10%. The accuracy of a tool is often calculated as 50% of the natural variation (3σ) divided by the target value. This makes it possible to compare different tools at a certain target value, without relating them to a certain application (tolerances). As you will notice in the next chapter, the accuracy calculations are similar to some capability calculations (in accuracy calculations we compare the natural variation to the mean value, in capability calculations we compare the natural variation to tolerance demands in the application)!

If the accuracy requirements are 40 Nm +/– 10%, we have to check that 3σ is within 10% of the average, i.e. that 100 * 3σ/Ave is less than 10%. Assume that we test the tool and achieve a mean value of 40 Nm, and a standard deviation of 1.2 Nm. Then we calculate the accuracy: 100 * (3 * 1.2 / 40) = 9%. We now see that the tool is accurate enough to do the job.
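The same check can be written as a short Python sketch, using the figures from the example above. The function name `accuracy_percent` is hypothetical.

```python
def accuracy_percent(average, sigma):
    """Accuracy as 3 sigma expressed in percent of the average value."""
    return 100 * 3 * sigma / average

requirement = 10.0                        # +/- 10 % accuracy requirement
accuracy = accuracy_percent(40.0, 1.2)    # test result: 40 Nm average, sigma 1.2 Nm

print(f"accuracy = {accuracy:.1f} %")
print("tool is accurate enough" if accuracy < requirement else "tool is not accurate enough")
```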

3.1 Mean shift and combined scatter

Mean shift is what occurs when you run a tool on both hard and soft joints. You will most probably get two different mean values, normally a higher value for the hard joint, with two different distributions. The difference between these two mean values is the mean shift. We want to find the limits (comparable to the normal distribution) where the probability of getting a torque inside these limits is 99.7% on the hard or soft joint. This is the combined scatter and corresponds to 6σ on the normal distribution. Once we have the combined scatter we can relate this to the combined average. This gives us something that is often referred to as the “accuracy”.

Written as a formula it will look like this:

Accuracy = 100 * 0.5 * ((Ave_hard + 3σ_hard) – (Ave_soft – 3σ_soft)) / Ave

Where Ave = (Ave_soft + Ave_hard) / 2 (the combined average).

Figure 8. The mean shift is the difference between the mean values of the hard and the soft joints.

Figure 9. Combined average and combined scatter.


This is normally true, but we cannot know for sure that the distribution will look like this. We can, for example, have a negative mean shift. We need to check which of the limits are the outermost.

Adjusted, the formula would look like this:

Accuracy = 100 * 0.5 * Deviation / Ave

Where
Deviation = max(Ave_hard + 3σ_hard, Ave_soft + 3σ_soft) – min(Ave_soft – 3σ_soft, Ave_hard – 3σ_hard)
Ave = (Ave_soft + Ave_hard) / 2 (the combined average)

Example:
Tests on a hard joint (30 degrees) and a soft joint (800 degrees) produced the following data.

Hard joint: Ave = 61 Nm and σ = 1.2 Nm
Soft joint: Ave = 60.2 Nm and σ = 1.0 Nm

Deviation = max(61 + 3*1.2, 60.2 + 3*1.0) – min(61 – 3*1.2, 60.2 – 3*1.0) = 7.4 Nm
Ave = (61 + 60.2) / 2 = 60.6 Nm
Accuracy = 100 * 0.5 * 7.4 / 60.6 = 6.1%
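The adjusted formula lends itself to a small helper function. The sketch below, with the hypothetical name `combined_accuracy`, follows the formula term by term and is run on the hard and soft joint data from the example.

```python
def combined_accuracy(ave_hard, sigma_hard, ave_soft, sigma_soft):
    """Accuracy in percent based on combined scatter and combined average."""
    # Outermost 3-sigma limits, whichever joint they come from.
    upper = max(ave_hard + 3 * sigma_hard, ave_soft + 3 * sigma_soft)
    lower = min(ave_hard - 3 * sigma_hard, ave_soft - 3 * sigma_soft)
    deviation = upper - lower               # the combined scatter
    ave = (ave_hard + ave_soft) / 2         # the combined average
    return 100 * 0.5 * deviation / ave

result = combined_accuracy(ave_hard=61.0, sigma_hard=1.2, ave_soft=60.2, sigma_soft=1.0)
print(f"accuracy = {result:.1f} %")
```

Because the function takes the maximum and minimum over both joints, it gives the same answer if the mean shift happens to be negative.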

It is hard to give an estimate of the accuracy of tools because of:
• Different accuracy on hard, soft and combined applications.
• Different accuracy if the tool is used high up in the torque range or in the lower part.


4. Understanding processes

Every organization produces something, whether it be products or activities, and this is done in many different ways. But what all organizations have in common is that the way they work can be described as methods and activities.

A process is simply a structured set of activities designed to produce a specified output for a particular customer or market. It has a beginning, an end, and clearly identified inputs and outputs. A process is therefore a structure for action, for how work is done. Within the quality area, the process concept is defined as “a set of activities, which are repeated in time, for the purpose of creating value for a customer”. As you now understand, the process approach implies adopting the customer point of view. Processes also have performance dimensions, such as cost, time, output quality and customer satisfaction. Bear in mind that all of these dimensions can be measured and improved.

In a modern car plant, the production line is a typical “operative process”, i.e. it creates value for the person buying the car. Along the line, the cars are assembled with different kinds of nutrunners, all with different functionality, performance and reliability. In the assembly process there are a lot of things that affect the outcome of the tightening. The operators, the screws, the holes and many other things affect the tightenings. All this contributes to the total process variation for each application. Remember the discussion about variation in chapter 2.

The dimensions with which we measure the performance of the nutrunners are torque and sometimes angle. By using statistics, we can analyze the performance of the process (tightenings) and we can monitor, control and improve the assembly process. This means, in the long run, more accurate tightenings, better and safer cars and better value for the customers.

Figure 10. A process is a set of activities designed to produce an output for a customer or a market.

Figure 11. Industrial production is an operative process. A lot of things contribute to the process variation.


5. Capability

Earlier in this pocket guide we talked about statistics and accuracy. The accuracy of a tool tells us something about the performance, but this is not enough. The important aspect for our customers is how the tool performs in an application, on the production line. So, somehow we have to relate the accuracy of the tool to the application. Every joint has a target value, but also some tolerance that is acceptable for the customer. By relating the mean and the standard deviation to the target value and the tolerance limits of an application we can tell how a tool is performing where it really matters, in its application. This is possible thanks to different capability indices.

There are many different capability indices, some of them quite simple and some of them more intricate. This pocket guide deals with the most commonly used ones, the ones our customers use.

We know from before that a normal distribution is defined by its mean and its standard deviation. We also remember our assumption that all values, when the process is under control, are within the 6σ limits, although only 99.7% really are. This is called the process natural variation.

5.1 Cp

The first, and most commonly used, capability index is called Cp. The formula for the Cp is:

Cp = Tolerance interval / 6σ = (HI – LO) / 6σ

If you look at the formula, you can see that it simply relates the tolerance interval (HI – LO) to the process natural variation! If we have a tool with a big spread, and an application with very high demands (narrow tolerance limits), we get a low Cp value. Conversely, if we have a tool with very small spread (small σ), but very wide tolerance limits, we get a high Cp. Of course this is what we want, because the smaller the variation in relation to the tolerance limits, the lower the risk of tightenings outside the tolerances. The Cp requirements vary. The most common is that Cp has to be greater than 1.33. This indicates that 6 times the standard deviation covers no more than 75% of the tolerance interval.
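The 75% figure follows directly from rearranging the Cp formula, as this short worked step shows:

$$C_p = \frac{HI - LO}{6\sigma} \geq 1.33 \quad\Longleftrightarrow\quad 6\sigma \leq \frac{HI - LO}{1.33} \approx 0.75\,(HI - LO)$$

In other words, the higher the required Cp, the smaller the share of the tolerance interval that the natural variation is allowed to occupy.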

Figure 12. When calculating Cp, the tolerance interval is related to the 6σ.

But is this enough for us to tell if the tool is good or bad for a specific application? Do we need something more? Yes. The Cp does not consider whether the mean of the distribution is close to the target value or not. This index does not guarantee that the distribution lies in the middle of the tolerance interval. In the picture below you can see the same tool on the same application, but before and after torque adjustment. In both cases we would have the same Cp. If we are off target, it is possible that the tightenings are outside one of the tolerance limits, even if the scatter is small in relation to the tolerance interval (high Cp). So we need something more that also relates the distribution to the target value.

5.2 Cpk

The Cpk also relates the mean of the distribution to the target value of the application. The way to do this is to divide the distribution and the application into two different parts and make one calculation for each side. The formula looks like this:

Cpk = min [(HI – AVE) / 3σ , (AVE – LO) / 3σ]

First we relate the difference between the upper tolerance limit and the average to half the natural variation (3σ). Then we make another calculation, relating the difference between the average and the lower tolerance limit to 3σ. We now have two potentially different values, and the LOWER of the two is the Cpk. If you think this is difficult, just take a few minutes to think about this. If the average is higher than the target value, then the difference between the upper tolerance limit and the average is smaller than the difference between the average and the lower tolerance limit. If this is the case, the “upper calculation” will give us the Cpk, because we are closer to the upper tolerance limit.

Figure 13. High Cp does not guarantee that we are close to the target value.

Figure 14. When calculating Cpk also the target value is considered.

What happens to the Cpk if we are right on target? Well, in this case we are as close to the upper tolerance limit as to the lower, and both calculations will give us the same result.

In this case, we can also see that the Cpk has the same value as the Cp.
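A short derivation shows why. It assumes that the target lies in the middle of the tolerance interval, so that being on target means AVE = (HI + LO)/2:

$$HI - AVE = AVE - LO = \frac{HI - LO}{2} \quad\Longrightarrow\quad C_{pk} = \frac{(HI - LO)/2}{3\sigma} = \frac{HI - LO}{6\sigma} = C_p$$

As soon as the average moves away from the middle, one of the two distances becomes shorter than (HI – LO)/2 and the Cpk drops below the Cp.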

Now we have introduced the Cp and the Cpk. By studying the formulas it is easy to see that Cp only relates the tolerance interval to the process 6σ. Cpk also considers the target value. We want both Cp and Cpk to be higher than 1.33. If our average is right on target, the Cp and Cpk are the same. The more off target we are, the bigger the difference between Cp and Cpk. Obviously Cpk can never be higher than Cp.

5.3 When is a process capable?

The question of “how good is capable?” has still not been definitively answered. Since Cp was first used, a Cp value of 1.33 has become the most commonly acceptable criterion as a lower boundary. The Cpk requirements vary. The most common is that Cpk has to be greater than 1.33. A process that has a Cpk lower than 1.00 is never capable.

It is very important that you understand why we use both the Cp and the Cpk. If we only use the Cp, we do not know whether we are on target or not. If we only use the Cpk, we cannot know whether a good or bad Cpk value is because of the centering of the process or because of the spread. So we have to use both. Together they can give us a very good indication of how well a specific tool is performing in a specific application. They are also the perfect way to compare different tools.

Figure 15. The relation between Cp and Cpk:
• Bad Cp and bad Cpk: Process not capable. Change tool or adjust for good accuracy.
• Good Cp and bad Cpk: Process capable but average needs to be adjusted.
• Bad Cp and good Cpk: Not possible.
• Good Cp and good Cpk: Process capable and well adjusted.
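The four combinations in figure 15 can be expressed as a small decision function. This is only a sketch: the function name `classify` is hypothetical, and the default requirement of 1.33 is the commonly used value mentioned in the text, not a universal rule.

```python
def classify(cp, cpk, required=1.33):
    """Return the figure 15 verdict for a given Cp and Cpk.

    'required' is the capability demand; 1.33 is only the most common choice.
    """
    cp_good = cp > required
    cpk_good = cpk > required
    if cp_good and cpk_good:
        return "Process capable and well adjusted."
    if cp_good:
        return "Process capable but average needs to be adjusted."
    if cpk_good:
        return "Not possible."  # Cpk can never be higher than Cp
    return "Process not capable. Change tool or adjust for good accuracy."

print(classify(cp=2.0, cpk=1.7))
print(classify(cp=2.0, cpk=1.0))
print(classify(cp=1.0, cpk=0.8))
```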


Look at the following dartboards:

The first dartboard shows a poorly centered process, but with a low spread (high accuracy). In this case the Cp is high and the Cpk low. On the second dartboard, the darts are spread randomly around the bull’s eye, but the spread is quite large related to the tolerances. Cp is probably not so good, but if the “mean value” is on target, the Cpk has the same value as the Cp. The third dartboard shows a well centered process, with high accuracy. This means that both the Cp and Cpk are high; the process is capable.

An example:
A joint should be tightened at 70 Nm ± 10 %. A tool is tested and we get an average of 71 Nm and a σ of 1.2 Nm.

Cp = (77 – 63) / (6 * 1.2) = 1.94
Cpk = min [(77 – 71) / (3 * 1.2), (71 – 63) / (3 * 1.2)] = min [1.67, 2.22] = 1.67

Both the Cp and Cpk values are greater than 1.33 and the process is capable and does not need to be adjusted.
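The example above can be reproduced with two small functions that mirror the Cp and Cpk formulas. The function names and the helper variables are chosen for this illustration only.

```python
def cp(hi, lo, sigma):
    """Tolerance interval divided by the natural variation (6 sigma)."""
    return (hi - lo) / (6 * sigma)

def cpk(hi, lo, ave, sigma):
    """The lower of the 'upper' and 'lower' calculations, each related to 3 sigma."""
    return min((hi - ave) / (3 * sigma), (ave - lo) / (3 * sigma))

target = 70.0                       # Nm
hi, lo = target * 1.10, target * 0.90   # +/- 10 % tolerance limits
ave, sigma = 71.0, 1.2              # test result from the example

print(f"Cp  = {cp(hi, lo, sigma):.2f}")
print(f"Cpk = {cpk(hi, lo, ave, sigma):.2f}")
print("capable" if min(cp(hi, lo, sigma), cpk(hi, lo, ave, sigma)) > 1.33 else "not capable")
```

Changing `ave` to 70.0 in this sketch makes the two indices equal, which illustrates that the difference between them comes only from the centering of the process.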

Figure 16. Dartboard 1: High Cp and low Cpk. Dartboard 2: Low Cp and low Cpk. Dartboard 3: High Cp and high Cpk.


5.4 Machine capability indices

As you now know, Cp and Cpk are process capability indices. Everything that affects the process affects these indices. But if we take away all variation affecting the assembly process, except the variation in the tool itself, we get what are called Machine Capability indices. This must be done under very controlled circumstances, preferably in a tool crib. The tests should be carried out on the same joint and by the same operator (or even better, place the tool in a fixture in order to get rid of all the operator influence). The calculations are the same for Cm as for Cp, and the same for Cmk as for Cpk.

So remember, Cp and Cpk determine whether the process is capable. The Cm and Cmk determine whether the machine (tool) is capable.

5.5 What else is there to think about?

When you analyze the capability of a tool, the sample size is of great importance in order to obtain reliable mean and standard deviation calculations. A sample size of at least 25 is strongly recommended.

And remember that if someone says something like “I have a tool that can always live up to a Cpk demand of 2.0”, there are two alternatives:

1. He does not know what he is talking about, because it is meaningless to talk about capability indices without relating the tool performance to an application with customer demands (tolerance limits)!

2. He knows what he is talking about and is trying to make the tool look better than it really is.


6. Control charts

We have talked about statistics and accuracy, about processes and capability. Now we are going to learn about control charts. Statistics, tool performance and a production environment (process variation) are important elements in understanding this.

The control chart is an important tool within Statistical Process Control. The idea is to repeatedly collect a number of observations (samples) with a certain interval from the process. With help from these observations (measurements) we want to calculate some kind of quality indicator and plot it in a diagram. The indicator normally used in the tightening industry is subgroup mean and/or subgroup range.

Do you remember the difference between special and random variation? If not, do go back and read the section again, because this is very important. If the plotted quality indicator is within the 6σ limits, we say that the process is under statistical control; only random variation affects the tightenings. When we use these limits in control charts, they are called control limits. We also have an “ideal level”, a target value marked between the control limits, and of course it should be the same as our target value in the assembly process. If some special variation enters the process, it can affect the tightenings in two different ways; it can affect the average of the tightenings, the spread or both.

We have the following requirements on a control chart:
• It should be possible to quickly detect systematic changes in the process, enabling us to find sources of variation.
• It should be easy to use.
• The chance of getting a “false alarm” should be very small (if we use the 6σ limits as control limits, the chance is 0.3 %).
• It should be possible to know when the change started to affect the process.
• It should prove that the process has been under control.
• It should be motivating and constantly bring attention to variations in the process and to quality related issues.


6.1 X-bar charts

First we introduce a control chart for controlling the average level of a certain unit. It can be the diameter of a bolt, or the torque applied to a joint. It is called an x̄-chart (X-bar chart), and when using it we plot the average of the observations (measurements) in the diagram. At pre-defined intervals we collect a number of measurements, a subgroup, from the process. We then calculate the mean for each subgroup and use this value as our quality indicator.

We know that the tightening applications can be described as a normal distribution. We know that the mean and the standard deviation help us to do that. We also know that all processes vary over time, due to different kinds of variation, e.g. material differences, operator influence etc. The 6σ limit makes it possible to tell whether the process variation is due to random or special causes, so the control limits are normally based on the 6σ limits, the natural variation of the process. The procedure for plotting these charts is straightforward: the relevant variable (in our case torque or angle) is measured at regular intervals (maybe once every hour or once a day), and typically a group of 5 consecutive readings is taken each time.

When the control limits are set, the x̄-values from each group of readings can be plotted on the charts. When the assembly process is under control (only random variation affects the tightenings), the subgroup averages will spread randomly around the overall mean (x̿).

Figure 17. We collect a number of measurements, a subgroup, from the process, and plot the averages in the diagram.


6.2 The subgroup Assume that the quality variable (in our case the tightenings)we want to control has the average µ and standard deviationσ when the process is under control. Remember that our qua-lity indicator is the subgroup mean, . Ideally the individualmeasurements and the subgroup averages have the samemean value (see picture). But we can also see that the spreadbetween the individual measurements (σ) is bigger than bet-ween the subgroup averages, which in fact is σ/√n, where nis the number of measurements in each subgroup. So thechance of detecting a deviation from µ is greater when westudy subgroups instead of individual measurements. So, infact, the control limits are normally set to (the subgroup 6σ-limits):

UCL = µ + 3σ/√n
LCL = µ – 3σ/√n

But how big does the subgroup need to be? If you look at Figure 19 you see that as we increase the size of the subgroups (n), the standard deviation of the subgroup averages decreases only slightly once n goes over 4 or 5. This explains why 4, 5 or 6 are very common choices of subgroup size. Historically, a subgroup of 5 is a very common choice.
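The diminishing return from larger subgroups can be checked numerically. The short Python sketch below prints σ/√n for n = 1 to 10. The value σ = 1 and the function name `subgroup_sigma` are illustrative assumptions, not taken from this guide.

```python
import math


def subgroup_sigma(sigma, n):
    """Standard deviation of the average of a subgroup of n measurements."""
    return sigma / math.sqrt(n)


sigma = 1.0  # assumed process standard deviation (illustrative value only)
for n in range(1, 11):
    print(f"n = {n:2d}   sigma/sqrt(n) = {subgroup_sigma(sigma, n):.3f}")
```

Going from n = 1 to n = 5 cuts the spread of the averages by more than half (from 1 to about 0.45), whereas doubling the subgroup from 5 to 10 only brings it down to about 0.32.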

Figure 18. The spread between individual measurements is bigger than between subgroup averages.

Figure 19. Using a subgroup size of 5 is very common in the industry.

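In practice µ and σ are estimated by the overall mean x̿ and the sample standard deviation s, so the control limits are estimated by:

UCL = x̿ + 3s/√n
LCL = x̿ – 3s/√n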


6.3 Alarms

Now to the good stuff: what happens if something non-random starts affecting the tightenings? What if the quality of the screws suddenly deteriorates? Well, maybe it will affect the mean of the subgroups. Maybe it will affect the spread within the subgroups. Maybe the torque applied to the joints will slowly decrease. All this can now be detected. The beauty of control charts is that the quality engineer, or quite often the operator, can pick up potential problems at an early stage, before we get tightenings outside the tolerances and before faulty assemblies are made.

The easiest way to detect that something non-random has started to affect the process is when we get values outside the control limits. This is an ALARM and we have to find out what has happened immediately, before we get tightenings outside the tolerance limits!

Figure 20 shows what a control chart CAN look like when special variation starts affecting the assembly process. The first two cases show “trend alarms”. Production can continue during investigation. The fourth case is when the overall mean (x̿) starts to deviate from the target value. We have to find out why this has happened, but maybe an adjustment of the tool is enough; this depends on the reason for the change.
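The two kinds of alarm described above can be sketched in a few lines of Python. This is only an illustration: the function name `find_alarms`, the sample averages, and the trend rule (seven steadily rising or falling averages in a row) are assumptions made for the example, since this guide does not define a specific trend rule.

```python
def find_alarms(subgroup_means, ucl, lcl, run_length=7):
    """Return (out_of_limits, trends) for a sequence of subgroup averages.

    out_of_limits: indices of averages outside the control limits.
    trends: start indices of runs of `run_length` steadily rising or
            falling averages (run_length is an assumed rule of thumb).
    """
    out_of_limits = [i for i, m in enumerate(subgroup_means)
                     if m > ucl or m < lcl]
    trends = []
    for start in range(len(subgroup_means) - run_length + 1):
        window = subgroup_means[start:start + run_length]
        pairs = list(zip(window, window[1:]))
        rising = all(a < b for a, b in pairs)
        falling = all(a > b for a, b in pairs)
        if rising or falling:
            trends.append(start)
    return out_of_limits, trends


# Hypothetical subgroup averages (Nm): a slow downward drift that
# finally ends up below the lower control limit.
means = [15.30, 15.28, 15.31, 15.27, 15.24, 15.20, 15.17, 15.13, 15.09, 15.04]
alarms, trends = find_alarms(means, ucl=15.50, lcl=15.05)
print("outside control limits at index:", alarms)
print("trend runs start at index:", trends)
```

In this made-up series the trend is flagged several subgroups before the first average falls outside the control limits, which is the early warning the text describes.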

6.4 Range charts

To control the spread in the process we can use either the standard deviation or the range within the subgroups. The range (R) is the difference between the biggest and the smallest value of each subgroup. The standard deviation is of course based on all values within the subgroup, whereas the range is only based on two. This means that the s-chart is more reliable and gives us more information about the spread. However, the range is easier to calculate, and even though we now have very good tools that calculate everything for us, the R-chart is still the most popular chart to use.

Figure 20. Examples of what control charts can look like when systematic variation has entered the process.


The range R helps us to estimate the spread of the subgroup. This can be done with the aid of different divisors, which can be found in manuals for statistical process control. If you want the centerline to be R̄ (the average of the subgroup ranges), the control limits for the control chart will be:

UCL = D4 * R̄
LCL = D3 * R̄

The R-chart indicates how the spread within the subgroups develops. It makes it possible to detect when a systematic change in the process affects the subgroup spread.
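As a rough illustration, the R-chart limits can be computed as in the Python sketch below. It uses the four subgroups from appendix A3 and the constants D3 = 0 and D4 = 2.114 commonly tabulated for a subgroup size of 5. The constants and the function name `r_chart_limits` are assumptions for this example; check them against the SPC manual you use.

```python
# Control chart constants for subgroup size n = 5, as commonly tabulated
# in SPC manuals (assumed values; verify against your own reference).
D3 = 0.0
D4 = 2.114


def r_chart_limits(subgroups):
    """Return (r_bar, lcl, ucl) for a list of equally sized subgroups."""
    ranges = [max(group) - min(group) for group in subgroups]
    r_bar = sum(ranges) / len(ranges)  # centerline: average subgroup range
    return r_bar, D3 * r_bar, D4 * r_bar


# Torque readings (Nm) of the four subgroups in appendix A3.
subgroups = [
    [15.4, 15.6, 15.4, 15.1, 15.1],
    [15.5, 15.0, 15.3, 15.2, 15.1],
    [15.5, 15.3, 15.4, 15.3, 15.3],
    [15.1, 15.2, 15.4, 15.1, 15.2],
]
r_bar, lcl, ucl = r_chart_limits(subgroups)
print(f"R-bar = {r_bar:.3f} Nm, LCL = {lcl:.3f} Nm, UCL = {ucl:.3f} Nm")
```

With these readings the average range is about 0.375 Nm and the upper control limit about 0.79 Nm, so all four subgroup ranges (0.5, 0.5, 0.2 and 0.3 Nm) lie within the limits.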

6.5 Control chart conclusion

The control limits should be based on a large and reliable number of tightenings, and they should be re-calculated at regular intervals, using the actual production results, in order to obtain reliable charts.

This chapter is only intended as an introduction to process control charts and does not cover all aspects of these charts.

Summary

This guide explains the basics of statistics such as the distribution, mean value and standard deviation. It also describes how this can be related to an application by capability calculations. The process can be monitored and controlled by using SPC, and this is also described and explained.

This pocket guide does not explain all aspects and the full potential of statistics. It is an introduction to the subject, and if there is a need for further studies we recommend that you refer to specialist literature.

The different product offerings that Atlas Copco can supply to help customers utilise the potential of statistics in production are not explained in this guide. If you need to discuss Atlas Copco’s product range, please contact your local Atlas Copco sales representative.


Appendix

A1. Example of basic statistics calculation

The following example will help you to understand the basics of statistics. In this example we compare the torque levels of two different tools. You might then obtain the torque values shown below. Target torque is 10 Nm.

| Atlas Copco tool (Nm) | Other tool (Nm) |
|---|---|
| 10 | 10 |
| 10.1 | 11 |
| 10.2 | 9 |
| 9.7 | 8 |
| 10.0 | 12 |
| 10.2 | 10 |
| 10.1 | 9 |
| 9.7 | 12 |
| 9.8 | 8 |
| 10.2 | 11 |

Which one of these tools is the most accurate? To answer this, we first calculate the mean value of the two series. The mean value gives us an average of all values received from the different tightenings, and we use the symbol x̄. The mean value is calculated by adding all tightening data, x, and dividing by the number of tightenings, n.

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Mean value, x̄: Atlas Copco tool 10 Nm; other tool 10 Nm.
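The mean calculation can be reproduced with a few lines of Python. This is a minimal sketch using the two torque series from the table above; the names `atlas_copco`, `other_tool` and `mean_value` are chosen for the example only.

```python
# Torque readings (Nm) from the table in this example.
atlas_copco = [10, 10.1, 10.2, 9.7, 10.0, 10.2, 10.1, 9.7, 9.8, 10.2]
other_tool = [10, 11, 9, 8, 12, 10, 9, 12, 8, 11]


def mean_value(values):
    """Add all tightening values and divide by the number of tightenings."""
    return sum(values) / len(values)


print(f"Atlas Copco tool: {mean_value(atlas_copco):.1f} Nm")
print(f"Other tool:       {mean_value(other_tool):.1f} Nm")
```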


Both tools have a mean value of 10. If one tool were to have a mean value of 15, we would know that that tool is not as good as the one hitting the target torque. Do both tools have the same accuracy? Accuracy tells us how accurate a tool is, i.e. how well it hits the target. It is the degree to which an indicated value matches the actual value of a measured variable.

How do we now see the difference? Let us look at the range of the values of the two tools. The range, R, tells us between which values we have received our tightenings, and is calculated as the difference between the highest and the lowest value in the series:

R = xmax – xmin

Range, R: Atlas Copco tool 0.5 Nm; other tool 4 Nm.

With the Atlas Copco tool, our tightening values differ by 0.5 Nm between the highest and lowest value, while the other tool has a range of 4 Nm. But if you perform 1000 tightenings with the Atlas Copco tool and get one value totally out of the range, e.g. 5, you get a range for the Atlas Copco tool of 5.2 (10.2 – 5). Then the Atlas Copco tool becomes the bad one. We have to find a function that reduces the influence of that one tightening.
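How strongly a single stray value affects the range can be sketched in Python. The 1000 tightenings below are an assumption made for the example (the ten table values repeated 100 times, with one value replaced by 5), and the sample standard deviation from Python's `statistics` module is shown alongside for comparison.

```python
from statistics import stdev

# Atlas Copco tool readings (Nm) from the table in this example.
base = [10, 10.1, 10.2, 9.7, 10.0, 10.2, 10.1, 9.7, 9.8, 10.2]


def value_range(values):
    """Range R = highest value minus lowest value."""
    return max(values) - min(values)


tightenings = base * 100          # assumed: 1000 tightenings with the same pattern
with_outlier = tightenings.copy()
with_outlier[0] = 5               # one tightening totally out of the range

for label, data in [("without outlier", tightenings),
                    ("with outlier", with_outlier)]:
    print(f"{label:16s} R = {value_range(data):.1f} Nm, "
          f"s = {stdev(data):.2f} Nm")
```

In this constructed data set the single stray value makes the range roughly ten times larger, while the standard deviation grows far less.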


The standard deviation is a statistical index of variability, which describes the deviation and tells us the average difference between the value of a specific variable and some desired value, usually a process set point. Let us calculate the deviation for each value received and sum them up.


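| Atlas Copco tool: torque xi | xi – x̄ | Other tool: torque xi | xi – x̄ |
|---|---|---|---|
| 10 | 0 | 10 | 0 |
| 10.1 | 0.1 | 11 | 1 |
| 10.2 | 0.2 | 9 | -1 |
| 9.7 | -0.3 | 8 | -2 |
| 10.0 | 0 | 12 | 2 |
| 10.2 | 0.2 | 10 | 0 |
| 10.1 | 0.1 | 9 | -1 |
| 9.7 | -0.3 | 12 | 2 |
| 9.8 | -0.2 | 8 | -2 |
| 10.2 | 0.2 | 11 | 1 |
| x̄ = 10 | Σ(xi – x̄) = 0 | x̄ = 10 | Σ(xi – x̄) = 0 |

The result is 0 for both tools. What is it that causes a problem in this case? We have both positive and negative values. We need to take away the minus sign, to get the absolute value of each deviation. To take away the minus sign mathematically, we can square each value.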



| Atlas Copco tool: torque xi | xi – x̄ | (xi – x̄)² | Other tool: torque xi | xi – x̄ | (xi – x̄)² |
|---|---|---|---|---|---|
| 10 | 0 | 0 | 10 | 0 | 0 |
| 10.1 | 0.1 | 0.01 | 11 | 1 | 1 |
| 10.2 | 0.2 | 0.04 | 9 | -1 | 1 |
| 9.7 | -0.3 | 0.09 | 8 | -2 | 4 |
| 10.0 | 0 | 0 | 12 | 2 | 4 |
| 10.2 | 0.2 | 0.04 | 10 | 0 | 0 |
| 10.1 | 0.1 | 0.01 | 9 | -1 | 1 |
| 9.7 | -0.3 | 0.09 | 12 | 2 | 4 |
| 9.8 | -0.2 | 0.04 | 8 | -2 | 4 |
| 10.2 | 0.2 | 0.04 | 11 | 1 | 1 |
| x̄ = 10 | Σ(xi – x̄) = 0 | Σ(xi – x̄)² = 0.36 | x̄ = 10 | Σ(xi – x̄) = 0 | Σ(xi – x̄)² = 20 |

Now we have a value in Nm² to compare with. But what does this value tell us? It tells us something about deviation, but it depends on the number of tightenings. What we do is to divide this sum by the number of tightenings minus 1 to get an average. We then take the square root of the result to get the value back to Nm:

$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}$$

Standard deviation, s: Atlas Copco tool √(0.36 / 9) = 0.2 Nm; other tool √(20 / 9) ≈ 1.5 Nm.

What we have now done is to calculate the sample standard deviation. Standard deviation is a way of measuring how well the tool performs, how close we are to the expected value. Now we can see a clear difference. The Atlas Copco tool has a standard deviation of 0.2 Nm, while the other tool has a standard deviation of about 1.5 Nm.

So what this example tells us is that although both tools have the same mean value, the first tool is more accurate. Its tightenings are closer to the target value, and the standard deviation is a way for us to prove this.
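The steps of this example (deviation from the mean, squaring, dividing by n – 1, square root) can be followed in a short Python sketch. The function name `sample_std` is chosen for the example; Python's built-in `statistics.stdev` uses the same n – 1 divisor.

```python
import math

# Torque readings (Nm) from the table in this example.
atlas_copco = [10, 10.1, 10.2, 9.7, 10.0, 10.2, 10.1, 9.7, 9.8, 10.2]
other_tool = [10, 11, 9, 8, 12, 10, 9, 12, 8, 11]


def sample_std(values):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(values)
    x_bar = sum(values) / n
    sum_of_squares = sum((x - x_bar) ** 2 for x in values)  # in Nm^2
    return math.sqrt(sum_of_squares / (n - 1))              # back to Nm


print(f"Atlas Copco tool: s = {sample_std(atlas_copco):.2f} Nm")
print(f"Other tool:       s = {sample_std(other_tool):.2f} Nm")
```

With the n – 1 divisor the first tool comes out at 0.20 Nm and the second at about 1.49 Nm.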


A2. Example of capability calculation

We know that the capability of a tool is how the tool is performing in a specific application. So what we do when calculating capability indices is to relate the tool accuracy (mean value and standard deviation) to the demands on the application (target value and tolerance limits).

Let us assume that we have an application with target value 15 Nm and tolerances of ±8 %. This means that the upper tolerance limit is 16.2 Nm and the lower limit is 13.8 Nm. We have collected 20 tightening results from one tool on the production line:

15.4 15.6 15.4 15.1 15.1
15.5 15.0 15.3 15.2 15.1
15.5 15.3 15.4 15.3 15.3
15.1 15.2 15.4 15.1 15.2


It is now easy to calculate the mean value and standard deviation:

Mean value (AVE) = 15.275 Nm
Standard deviation (σ) = 0.165 Nm

It is now easy to calculate Cp and Cpk:

Cp = (HI – LO) / 6σ = (16.2 – 13.8) / (6*0.165) = 2.42

Cpk = min [(HI – AVE) / 3σ , (AVE – LO) / 3σ] = min [(16.2 – 15.275) / (3*0.165) , (15.275 – 13.8) / (3*0.165)] = min [1.87 , 2.98] = 1.87

Both the Cp and Cpk values are greater than 1.33, so the process is capable and does not necessarily need to be adjusted, even though the average is slightly off target.
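The capability calculation above can be reproduced with the following Python sketch. The function name `capability` is chosen for the example, and the standard deviation is the sample standard deviation (n – 1 divisor) of the 20 readings.

```python
from statistics import mean, stdev

# The 20 tightening results (Nm) from this example.
torques = [15.4, 15.6, 15.4, 15.1, 15.1, 15.5, 15.0, 15.3, 15.2, 15.1,
           15.5, 15.3, 15.4, 15.3, 15.3, 15.1, 15.2, 15.4, 15.1, 15.2]

target = 15.0                  # Nm
tolerance = 0.08               # +/- 8 %
lo = target * (1 - tolerance)  # lower tolerance limit, 13.8 Nm
hi = target * (1 + tolerance)  # upper tolerance limit, 16.2 Nm


def capability(values, lo, hi):
    """Return (Cp, Cpk) for the given readings and tolerance limits."""
    ave = mean(values)
    sigma = stdev(values)
    cp = (hi - lo) / (6 * sigma)
    cpk = min((hi - ave) / (3 * sigma), (ave - lo) / (3 * sigma))
    return cp, cpk


cp, cpk = capability(torques, lo, hi)
print(f"mean = {mean(torques):.3f} Nm, s = {stdev(torques):.3f} Nm")
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}, capable: {min(cp, cpk) > 1.33}")
```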

A3. Example of control chart calculation

Now we want to create a control chart from the same tightenings as in the previous example. Let us assume that we are starting up a production process after it has been stopped for some time. Then we do not really know the mean value µ and the standard deviation σ. In order to calculate the control limits for the control chart, the calculations must be based on a reliable number of tightenings. A good rule of thumb is to collect at least 20-25 subgroups before calculating the control limits for a control chart. The reason for this is that at least 20 subgroups are needed for us to be able to say whether the process is under control or not. However, in this example we have simplified things and collected only 4 subgroups.



Let us assume that we have collected these results on 4 different occasions. We have set the subgroup size to 5, so we have collected 5 results on each occasion:

Day 1: 15.4 15.6 15.4 15.1 15.1
Day 2: 15.5 15.0 15.3 15.2 15.1
Day 3: 15.5 15.3 15.4 15.3 15.3
Day 4: 15.1 15.2 15.4 15.1 15.2


The first thing we do is to calculate the average for every subgroup:

x̄1 = 15.32
x̄2 = 15.22
x̄3 = 15.36
x̄4 = 15.20

When the production process is under control, the target value is the same as the overall average value. It is easy to calculate the overall average: x̿ = 15.275. We know from before that the control limits are based on the natural variation between the average values of the subgroups:

UCL = x̿ + 3s / √n = 15.275 + (3*0.165 / √5) = 15.275 + 0.22 = 15.50
LCL = x̿ – 3s / √n = 15.275 – (3*0.165 / √5) = 15.275 – 0.22 = 15.05

Now we can create our control chart. We use the overall average as centerline and also mark the control limits in the chart. Now we can plot the subgroup averages in the chart. As we can see, they are all within the control limits and the production is under control (even though some of the individual tightening values, such as 15.6 and 15.0, are outside the limits). Remember that the limits are based on the variation between the subgroup averages, not the individual tightenings.

From now on it is easy to plot a new subgroup average in the chart every day. As long as the plotted values are spread randomly around the centerline, the process is under control.

Figure 21. The process is under control when the subgroup averages spread randomly around the overall mean.
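The control chart calculation of this example can be sketched in Python as follows. The variable names are chosen for the example, and s is the sample standard deviation of all 20 readings, as in appendix A2.

```python
import math
from statistics import mean, stdev

# The four subgroups (Nm) collected in this example.
subgroups = {
    "Day 1": [15.4, 15.6, 15.4, 15.1, 15.1],
    "Day 2": [15.5, 15.0, 15.3, 15.2, 15.1],
    "Day 3": [15.5, 15.3, 15.4, 15.3, 15.3],
    "Day 4": [15.1, 15.2, 15.4, 15.1, 15.2],
}
n = 5  # subgroup size

subgroup_means = {day: mean(values) for day, values in subgroups.items()}
all_values = [x for values in subgroups.values() for x in values]
overall_mean = mean(all_values)  # centerline
s = stdev(all_values)            # sample standard deviation of all readings

ucl = overall_mean + 3 * s / math.sqrt(n)
lcl = overall_mean - 3 * s / math.sqrt(n)
in_control = all(lcl <= m <= ucl for m in subgroup_means.values())

for day, m in subgroup_means.items():
    print(f"{day}: subgroup average = {m:.2f} Nm")
print(f"centerline = {overall_mean:.3f}, LCL = {lcl:.2f}, UCL = {ucl:.2f}")
print("all subgroup averages within the control limits:", in_control)
```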


A4. Analysis of assembly tool performance – ISO 5393 calculation

To allow us to assess the performance of different tools and to compare one tool with another, there is an international standard (ISO 5393), which sets out a basic test procedure and analysis of results. Based on this, many motor vehicle manufacturers have developed their own certification standards.

As an example we will assume we have tested a tool according to the procedure stated in ISO 5393. On a hard joint with the tool set at the highest torque setting, the following results were obtained (in Nm).

31.5 33.2 32.6 33.7 31.4
32.5 33.1 31.2 33.5 32.6
33.1 31.0 32.3 33.2 32.4
31.5 33.5 33.3 31.5 32.6
31.3 33.7 33.0 31.8 33.0

We now calculate the values required to analyze the tool’s tightening accuracy as described in ISO 5393, for the data on the hard joint at the highest torque setting.

Mean torque (x̄) = (31.5 + 33.2 + 32.6 + 33.7 + ... + 33.0) / 25 = 32.5 Nm

Range = 33.7 – 31.0 = 2.7 Nm

Standard deviation (s) = 0.863 Nm

6 sigma (6s) torque scatter = 6 x 0.863 = 5.18 Nm

6s scatter as a percentage of mean torque = (5.18 / 32.5) x 100 = 15.9 %
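The figures for the hard joint at the highest torque setting can be checked with a short Python sketch. It only reproduces the arithmetic shown above; it is not an implementation of the ISO 5393 test procedure itself, and the variable names are chosen for the example.

```python
from statistics import mean, stdev

# The 25 torque readings (Nm) on the hard joint, highest torque setting.
torques = [31.5, 33.2, 32.6, 33.7, 31.4, 32.5, 33.1, 31.2, 33.5, 32.6,
           33.1, 31.0, 32.3, 33.2, 32.4, 31.5, 33.5, 33.3, 31.5, 32.6,
           31.3, 33.7, 33.0, 31.8, 33.0]

x_bar = mean(torques)                       # mean torque
torque_range = max(torques) - min(torques)  # range
s = stdev(torques)                          # sample standard deviation
scatter_6s = 6 * s                          # 6s torque scatter
scatter_pct = scatter_6s / x_bar * 100      # scatter as % of mean torque

print(f"mean = {x_bar:.1f} Nm, range = {torque_range:.1f} Nm, s = {s:.3f} Nm")
print(f"6s scatter = {scatter_6s:.2f} Nm ({scatter_pct:.1f} % of mean torque)")
```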


Now let us assume that for the same tool we calculated the following values for the data collected at the other torque settings and joint conditions described in ISO 5393.

For the higher torque setting on the soft joint: a mean of 31.95 and a standard deviation of 0.795.
For the lower torque setting on the hard joint: a mean of 23.72 and a standard deviation of 0.892.
For the lower torque setting on the soft joint: a mean of 22.87 and a standard deviation of 0.801.

We can now make the following calculations for the higher torque setting:

a = Mean hard joint + 3s hard joint
b = Mean soft joint + 3s soft joint
c = Mean hard joint – 3s hard joint
d = Mean soft joint – 3s soft joint

a = 32.50 + (3 x 0.863) = 35.09
b = 31.95 + (3 x 0.795) = 34.34
c = 32.50 – (3 x 0.863) = 29.91
d = 31.95 – (3 x 0.795) = 29.56

Combined mean torque = (35.09 + 29.56) / 2 = 32.33 Nm

Mean shift = 32.5 – 31.95 = 0.55 Nm

Combined torque scatter = 35.09 – 29.56 = 5.53 Nm

Combined torque scatter as a % of combined mean = (5.53 / 32.33) x 100 = 17.1 %

Lower torque setting:

a = Mean hard joint + 3s hard joint
b = Mean soft joint + 3s soft joint
c = Mean hard joint – 3s hard joint
d = Mean soft joint – 3s soft joint

a = 23.72 + (3 x 0.892) = 26.40
b = 22.87 + (3 x 0.801) = 25.27
c = 23.72 – (3 x 0.892) = 21.04
d = 22.87 – (3 x 0.801) = 20.47

Combined mean torque = (26.40 + 20.47) / 2 = 23.44 Nm

Mean shift = 23.72 – 22.87 = 0.85 Nm

Combined torque scatter = 26.40 – 20.47 = 5.93 Nm

Combined torque scatter as a % of combined mean = (5.93 / 23.44) x 100 = 25.3 %

Tool capability is 25.3 %, as the greatest torque scatter was at the lower torque setting.

This particular tool will tighten 99.7 % of all practical joints to within ± 13 % of its pre-set torque value (i.e. 99.7 % of results will fall within ± 3s of the mean).
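The combined scatter calculation for both torque settings can be sketched in Python as below. The function name `combined_scatter` and the returned dictionary keys are chosen for the example. As an assumption, the sketch takes the larger of a and b and the smaller of c and d as the extremes; for the data in this example that gives the same result as using a and d directly.

```python
def combined_scatter(mean_hard, s_hard, mean_soft, s_soft):
    """Combine hard- and soft-joint results for one torque setting."""
    a = mean_hard + 3 * s_hard
    b = mean_soft + 3 * s_soft
    c = mean_hard - 3 * s_hard
    d = mean_soft - 3 * s_soft
    upper = max(a, b)  # assumption: highest +3s value (a in this example)
    lower = min(c, d)  # assumption: lowest -3s value (d in this example)
    combined_mean = (upper + lower) / 2
    scatter = upper - lower
    return {
        "combined_mean": combined_mean,
        "mean_shift": abs(mean_hard - mean_soft),
        "scatter": scatter,
        "scatter_pct": scatter / combined_mean * 100,
    }


higher = combined_scatter(32.50, 0.863, 31.95, 0.795)
lower = combined_scatter(23.72, 0.892, 22.87, 0.801)

for label, result in [("higher setting", higher), ("lower setting", lower)]:
    print(f"{label}: combined mean = {result['combined_mean']:.2f} Nm, "
          f"mean shift = {result['mean_shift']:.2f} Nm, "
          f"scatter = {result['scatter']:.2f} Nm "
          f"({result['scatter_pct']:.1f} %)")

# The tool capability is the larger of the two scatter percentages.
tool_capability = max(higher["scatter_pct"], lower["scatter_pct"])
print(f"tool capability = {tool_capability:.1f} %")
```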

Atlas Copco Pocket Guides

| Title | Ordering No. |
|---|---|
| Air line distribution | 9833 1266 01 |
| Air motors | 9833 9067 01 |
| Drilling with hand-held machines | 9833 8554 01 |
| Grinding | 9833 8641 01 |
| Percussive tools | 9833 1003 01 |
| Pulse tools | 9833 1225 01 |
| Riveting technique | 9833 1124 01 |
| Screwdriving | 9833 1007 01 |
| Statistical analysis technique | 9833 8637 01 |
| The art of ergonomics | 9833 8587 01 |
| Tightening technique | 9833 8648 01 |
| Vibrations in grinders | 9833 9017 01 |


www.atlascopco.com

9833 8637 01. Recyclable paper. Jetlag 2003:1. Printed in Sweden
