Curso Bosch Diseño de experimentos

Apr 06, 2018

EstebanMarcos
  • 8/2/2019 Curso Bosch Diseo de experimentos

    1/104

    Edition 08.1993

    1993 Robert Bosch GmbH

    Table of Contents:

    1. System-Analytical Approach
    1.1 One-Factor-at-a-Time Method
    1.2 Two-Factor Method
    1.3 General Case (Numerous Influence Factors)

    2. Industrial Experimentation Methodology and System Theory
    2.1 Hints on System Analysis
    2.2 Short Description of the System-Theoretical Procedure
    2.2.1 Global System Matrix (i.e. without quoting any levels)
    2.2.2 Local System Consideration
    2.2.3 Local System Matrix
    2.3 Summary

    3. Probability Plot
    3.1 Probability Plot of Small-Size Samples
    3.2 Probability Paper

    4. Comparison of Samples Means
    4.1 t Test
    4.2 Minimum Sample Size

    5. F Test

    6. Analysis of Variance (ANOVA)
    6.1 Deriving the Test Statistic
    6.2 Equality Test of Several Variances (According to Levene)

    7. Design of Experiments with Orthogonal Arrays and Evaluating such Experiments
    7.1 Representing the Results of Measurement
    7.2 Calculating the Effects
    7.3 Regression Analysis
    7.4 Factorial Designs
    7.4.1 Design Matrix
    7.4.2 Evaluation Matrix
    7.4.3 Confounding
    7.4.4 Fractional Factorial Designs
    7.5 Designs for Three-Level Factors
    7.6 Central Composite Designs
    7.7 Screening Designs According to Plackett and Burman

    8. Statistical Evaluation Procedures for Factorial Designs
    8.1 One-Way Analysis of Variance
    8.2 Factorial Analysis of Variance
    8.3 Factorial Analysis of Variance with Respect to Variation
    8.4 Computer Support
    8.4.1 Evaluation of an Experiment using the FKM Program
    8.4.2 Evaluation with the Help of SAV Program

    9. Hints on Practical Design of Experiments
    9.1 Task and Target Formulation
    9.2 System Analysis
    9.3 Stipulating an Experimental Strategy
    9.4 Executing and Documenting an Experiment

    10. Shainin Method

    11. List of References

    12. Tables

    Index

    Within the framework of quality assurance and for the effective new and further development of Bosch products, careful design of experiments is not only indispensable but is also required by our customers.

    In this connection, the commonly used term "Statistical Experimental Design" is not exactly defined, and labels such as "Design of Experiments (DOE)", "Industrial Experimentation Methodology", "Taguchi Method" and "Shainin Method(s)" are often used interchangeably.

    This pamphlet is based on a seminar manuscript on "Industrielle Versuchsmethodik 1" (Industrial Experimentation Methodology) and is intended to clarify vital terms and processes of statistical experimental design for the interested user.

    1. System-Analytical Approach

    Investigation of a system must often begin with the description of a particular system's state. A basic requisite that we impose on an experiment is reproducibility, i.e. under definite conditions the result of an experiment must always be the same. Since there can't be absolute equality ("one cannot swim upstream twice"), the reproducibility of an experiment is a relative term. One can use statistical terms to define the term "reproducibility". The term "self-control" can be interpreted as a generalization of the term "reproducibility". It is also possible to limit oneself to the statement that a (quantitative) result of an experiment must always lie within a specific bandwidth.

    Variation in the results of repeated experiments (e.g. process variation) can, in certain situations, be a vital parameter. Standard deviation, under certain circumstances, can serve as a measure of the variation. If one wishes to evaluate the variation quantitatively, one needs a sufficiently large sample size, i.e. sufficiently many repetitions of the experiment (see Chapter 4.2). Similar statements are valid for the mean position.
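As a small illustration of the two views of reproducibility described above (a bandwidth for the results, and standard deviation as a measure of variation), the following sketch summarizes a series of repeated measurements. The measurement values and the band limits are invented for illustration only; the function name `summarize_repetitions` is hypothetical.

```python
import statistics


def summarize_repetitions(results, lower, upper):
    """Return mean, sample standard deviation and a bandwidth check."""
    mean = statistics.mean(results)
    # statistics.stdev uses the (n - 1) divisor, i.e. the sample standard deviation.
    spread = statistics.stdev(results)
    within_band = all(lower <= r <= upper for r in results)
    return mean, spread, within_band


# Hypothetical results of six repetitions of the same experiment.
measurements = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2]
mean, spread, within_band = summarize_repetitions(measurements, 9.5, 10.5)
print(f"mean={mean:.3f}, s={spread:.3f}, within band: {within_band}")
```

With only six repetitions the estimate of the standard deviation is itself quite uncertain, which is why the text refers to a sufficiently large sample size.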

    1.1 One-Factor-at-a-Time Method

    If one wishes to investigate the influence of a factor within a system, one varies this factor but leaves the other factors in the system unchanged. In general, this ensures that other factors, which are not the subject of the investigation, neither falsify the results nor restrict the statements deduced from them. (That is obviously easier said than done.) This approach is convenient and logical, and should be regarded as a fundamental experimental strategy. The only restriction: the nature of the influence depends (possibly very strongly) upon the position of the other factors.

    Strangely, numerous textbook authors consider the one-factor-at-a-time method to be inefficient. A practical person can, nonetheless, confidently ignore these objections. It is evident that one-factor-at-a-time experiments must be carefully designed, executed and evaluated.

    Systematic Approach

    We differentiate between variable and discrete influence parameters. Before one determines experiments or experimental series, one must think about what type of influence the variable factor has. When preparing one-factor-at-a-time experiments we will become acquainted with terms which are later of prime importance when investigating the general n-dimensional case.

    a) The simplest type of influence is the linear influence.

    Increasing the influence quantity by a fixed amount always brings about the same effect, independent of the chosen levels (see Chapter 7). Many known natural laws of physics or chemistry are linear (examples?).

    Differential calculus linearizes (nearly) arbitrary functions. However, one should not assume that a fact to be investigated can be linearized just as a matter of simplicity.

    The statement that every problem can be linearized when the difference between the steps of the influence parameters is small enough may be correct, though this is of little practical value, since what is considered "small enough" must then be clarified.

    For instance, a temperature difference of 1 °C can be small in many problems, but in other problems this increment may be large.

    Extrapolation beyond the investigated region is only permissible if the function is known. The same restriction applies to interpolation.

    Because a system generally exhibits significant background noise, erroneous interpretations of experimental results easily occur even though linearity is ensured.
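The remark that "small enough" must be clarified can be made concrete with a minimal sketch. It assumes a purely hypothetical exponential dependence of a response on temperature (not taken from the pamphlet) and compares the tangent-line (linearized) prediction with the true value for different step sizes.

```python
import math


def response(t):
    """Hypothetical nonlinear response as a function of temperature t."""
    return math.exp(0.1 * t)


def response_slope(t):
    """Derivative of the hypothetical response."""
    return 0.1 * math.exp(0.1 * t)


def linearization_error(x0, step):
    """Absolute error of the tangent-line approximation at x0 + step."""
    linear_estimate = response(x0) + response_slope(x0) * step
    return abs(response(x0 + step) - linear_estimate)


# The same operating point, three different step sizes (in degrees).
errors = {step: linearization_error(20.0, step) for step in (0.1, 1.0, 10.0)}
for step, err in errors.items():
    print(f"step {step:5.1f}: linearization error {err:.5f}")
```

Whether a given error is acceptable depends on the problem and on the system noise, so the step size at which linearization is justified cannot be stated in general.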

    b) A further generalization of the linear influence is the monotonic influence (synonyms: tendency, directional factor).

    A monotonic influence is present when the input quantity can influence the output quantity in only one direction.

    More precisely: a monotonic influence is present when an increase of the input quantity invariably causes either an increase or a decrease of the output quantity.

    Figure: Monotonic influence parameter. The choice of the steps influences the size of the effect.
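The statement that the choice of the steps influences the size of the effect can be illustrated with a short sketch. The square-root response used here is only an assumed example of a monotonic (but nonlinear) characteristic; the level pairs are arbitrary.

```python
import math


def effect(f, low, high):
    """Effect of moving the input quantity from level `low` to level `high`."""
    return f(high) - f(low)


# Hypothetical monotonically increasing characteristic.
characteristic = math.sqrt

level_pairs = [(1, 2), (4, 5), (1, 9)]
effects = {pair: effect(characteristic, *pair) for pair in level_pairs}
for (low, high), value in effects.items():
    print(f"levels {low} -> {high}: effect {value:.3f}")
```

All three effects have the same sign, as monotonicity requires, but equal step widths (1 to 2 versus 4 to 5) no longer give equal effects, in contrast to the linear case.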

    Task 1

    When doing a one-factor-at-a-time experiment one should make sure that factors not constituting the object under investigation neither falsify the results nor restrict the conclusions drawn. One should discuss, using concrete cases, how this basic principle can be realized (e.g. by randomization).

    Task 2

    A glass of water is put inside a freezer and the time required for the water to freeze is recorded.

    The initial temperature of the water (between 10 °C and 100 °C) should be determined so that the time interval up to freezing is as long as possible (optimization problem).

    How do you investigate the process empirically?

    Task 3

    a) Assume that a process is definitely linear.
       How many supporting points does one need to represent the natural law explicitly?
       How should the system noise be considered?
       How does one select the supporting points?

    b) How can one invalidate linearity empirically?

    Task 4

    Electron-impact experiment done by Franck and Hertz: the anode current flowing to the exciting anode as a function of the anode voltage.

    a) How can the process represented in the adjacent figure be investigated empirically?

    b) What could a physicist conclude who only performed tests at 5 V, 10 V and 15 V?

    Summary:

    From the considerations discussed in this chapter, it is clear that when investigating the influence of a single factor, the given situation is decisive for how many measurements must be made, how many repetitions must be undertaken and where the measurement points have to be placed.

    There is therefore no strict recipe for conducting empirical investigations. Thus, it is not appropriate to teach recipes.

    Scheme

    Target quantity(ies):

    Influence variable(s):

    Other factors influencing the target quantity (which are nonetheless not the objective of the investigation):

    How are the quantities considered?

    Prior knowledge:

    Number of the steps:

    Reason:

    Number of repetitions:

    Reason:

    Additional points to be considered:

    1.2 Two-Factor Method

    If one wants to investigate the influence of two factors within a system, phenomena have to be observed that don't arise during one-factor-at-a-time investigations. Since these phenomena are symptomatic of the general n-dimensional case, a thorough investigation is beneficial.

    Before one determines an experimental arrangement (incl. the size of the experiment), what is known and unknown regarding the two factors (and what is then to be investigated empirically) must be systematically established.

    At first, a cognitive investigation takes place in principle, i.e. a knowledge-based description of the two-factor system. As in the one-factor-at-a-time method, it is helpful to differentiate between discrete and variable influence parameters.

    There are 3 cases to be differentiated.

    Case 1: Both influence factors are discrete

    Influence factor A with k levels: $A_1, A_2, \ldots, A_k$

    Influence factor B with l levels: $B_1, B_2, \ldots, B_l$

    There are $k \cdot l$ system states.

    Example:

    Target quantity: Yield

    Plants: $A_1, A_2$

    Pesticides: $B_1, B_2, B_3$

    Remark:

    It is clear that in general one cannot derive other system states $A_i B_j$ from the knowledge of an empirical result.
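The k · l system states of Case 1 can be listed mechanically. The following sketch does this for the plant/pesticide example, using the level labels from the text as plain strings.

```python
from itertools import product

plants = ["A1", "A2"]            # discrete factor A with k = 2 levels
pesticides = ["B1", "B2", "B3"]  # discrete factor B with l = 3 levels

# Every combination of one level of A with one level of B is a system state.
system_states = list(product(plants, pesticides))

for plant, pesticide in system_states:
    print(plant, pesticide)
print("number of system states:", len(system_states))
```

Each of these states has to be investigated on its own, since, as remarked above, the result for one state generally says nothing about the others.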

    Case 2:

    One discrete influence factor A: $A_1, \ldots, A_k$

    One variable influence factor.

    Example:

    System: Solution

    Target quantity: Solubility

    A: Chemical substance

    B: Temperature

    It is possible, in principle, to describe the system by k characteristic lines (a family of characteristics). In general, one deals with k different one-factor problems. Is it possible to make a general deduction from one characteristic line to another?

    When the characteristic curves are merely shifted up or down in parallel (depending on the discrete factor), one speaks of an interaction-free system.

    Figure: Solubility of several inorganic substances as a function of temperature
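The notion of an interaction-free system (characteristic curves that are parallel shifts of each other) can be checked numerically on sampled curves. The sketch below is an idealized, noise-free illustration; the function name `is_interaction_free`, the tolerance and the sample values are assumptions made for this example and are not solubility data.

```python
def is_interaction_free(curves, tol=1e-9):
    """Check whether all curves are parallel shifts of the first one.

    `curves` maps each level of the discrete factor to the target values
    measured at the same positions of the variable factor.
    """
    reference, *others = curves.values()
    for curve in others:
        offsets = [y - y_ref for y, y_ref in zip(curve, reference)]
        # A parallel shift means the offset is the same at every position.
        if max(offsets) - min(offsets) > tol:
            return False
    return True


# Hypothetical families of characteristics at three positions of factor B.
parallel = {"A1": [1.0, 2.0, 4.0], "A2": [3.0, 4.0, 6.0]}
crossing = {"A1": [1.0, 2.0, 4.0], "A2": [3.0, 2.5, 1.0]}

print(is_interaction_free(parallel))
print(is_interaction_free(crossing))
```

With real measurements the tolerance would have to be chosen in relation to the system noise, which this sketch does not attempt.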

    Case 3: Both influence parameters are variable.

    The information can be represented in 3-dimensions (see Figure).

    Complex Motronic ignition map (ignition angle as a function of load and engine speed)

    Here the values of both variable influence parameters (in this example, load and engine speed) constitute the coordinates of points in a plane. The function is then represented by a "mountain" above this plane (see Figure 7.1).

    The experimenter must now specify the region in which empirical investigations are to be performed. How many experimental points should be planned depends entirely upon the physical question.

    The idea that the experimental scope can be reduced by means of some combinatorial magic is simply erroneous. The scope can only be reduced through precise task formulation and use of the knowledge already verified.

    Task:

    System: Cake

    Target quantity: Height of cake

    Influence parameters: Yeast, water

    How can one investigate the system empirically?

    What does an array of characteristic curves look like in principle?

    [Figure axes: ignition angle; load; rotational speed]

    Summary:

    Two influence factors are handled just as in the one-dimensional case: the combinatorial arrangement of experimental points, the number of repetitions per point etc. depend entirely upon how the question is formulated. Generally binding rules, in an algorithmic sense, cannot exist.

    The case differentiation

    discrete - discrete
    discrete - variable
    variable - variable

    is helpful.

    1.3 General Case (Numerous Influence Factors)

    A complex system, with numerous influence factors, poses a challenge. It is clear that the time needed for an investigation increases with the number of factors to be considered. It would be good if one could reduce the time needed for an experiment through some combinatorial magic.

    Unfortunately, this is not so. The only way to reduce experimental expenses is by applying the existing knowledge in a systematic way. This systematic approach must help the practitioner but must not force him to employ terminology he does not understand (nor is expected to know). The practitioner must be able to present his knowledge or his presumptions in a simple and rational manner. (Furthermore, the assumptions must be system-based and plausible!)
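How quickly the effort grows with the number of factors can be seen from a simple count. The sketch below assumes that every combination of levels is to be tested with a fixed number of repetitions; this counting rule is an assumption made for illustration, not a recommendation from the text.

```python
import math


def full_combination_runs(levels_per_factor, repetitions=1):
    """Number of runs if every combination of levels is tested."""
    return repetitions * math.prod(levels_per_factor)


run_counts = {
    "3 factors, 2 levels": full_combination_runs([2] * 3),
    "5 factors, 3 levels": full_combination_runs([3] * 5),
    "10 factors, 2 levels": full_combination_runs([2] * 10),
    "2x2x3 levels, 3 repetitions": full_combination_runs([2, 2, 3], repetitions=3),
}
for description, runs in run_counts.items():
    print(f"{description}: {runs} runs")
```

Any reduction below these numbers leaves some system states untested, which is only defensible if existing knowledge about the system justifies it.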

    We differentiate between variable and discrete influence parameters.

    The description of the type of influence of the individual quantities belongs to the description of the system in principle. In view of the fact that we may have to differentiate among numerous input-quantities, a careful description of the influence of the individual input-quantities is especially important.

    Naturally, the influence of an individual quantity depends upon the position of the other input-quantities. For this reason, it must be determined whether the physical-chemical character of the individual quantities permits making statements in principle about the type of influence, independent of the other quantities.

    With the systematic approach, it is preferable to begin by considering the discrete influence parameters. If, for instance, A is a discrete influence parameter with the levels A1, A2 (e.g. metal type), then the following should be asked:

    Is one of the two levels, with respect to the target quantity, better in principle than the other or not?

    If this is not the case, then the answer to the question depends upon the position of the other factors.

    Example:

    [Figure: two panels, "A definite" and "A ambiguous"]

    Remark:

    It is usually preferred to begin by investigating the discrete influence parameters, basically because the different steps often represent the system states to be differentiated. (Don't compare apples with oranges!)

    Variable Influence

    Before determining the experiments or experimental series, the overall influence should be described (i.e. without determining the levels). The relevant terminology is known to us (see 1.1 and 1.2).

    Black Box

    If, after a careful analysis, all of the influence parameters are ambiguous or if the character of the influence is unknown, then the matter cannot be investigated empirically. If one nevertheless wants to conduct the experiment, all the strategies then become nearly equivalent (all cats are grey at night).

    Trial Task:

    System: A board fixed on one side.

    Target quantity: Lowering of the free end.

    Influence quantities: Types of wood $H_1, H_2, H_3$, length, breadth, height, force F

    I. a) Perform a global system analysis with the help of the system matrix!

    b) If appropriate, draw an array of characteristic lines!

    Length Breadth Height Force

    Linear

    Monotonic

    Non-monotonic

    Unknown

    II. Given are

    Length: 1.5 m

    Breadth: 20 cm

    Height: 4 cm

    Force: 20 N

    All 4 quantities can be reduced by up to 10%. The target is a board which is lowered as little as possible. Perform a local analysis!

    Which experiments or experimental series would you perform?

    Factors Length Breadth Height Force

    Definite

    Ambiguous

    Unknown
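One possible starting point for the local analysis is sketched below. It is not part of the pamphlet and rests on a loud assumption: that the board behaves like an ideal cantilever beam, for which elementary beam theory gives a deflection proportional to F · L³ / (b · h³) for a fixed type of wood. Whether this model is adequate for a real board is exactly the kind of question the system analysis has to answer. The sketch reduces each quantity in turn by 10 % and reports the deflection relative to the nominal state.

```python
def relative_deflection(length=1.0, breadth=1.0, height=1.0, force=1.0):
    """Deflection relative to the nominal state.

    ASSUMPTION (not from the source): ideal cantilever beam, so that the
    deflection is proportional to force * length**3 / (breadth * height**3).
    All arguments are scale factors relative to the nominal values.
    """
    return force * length**3 / (breadth * height**3)


# Reduce one quantity at a time by 10 % and keep the others at nominal.
ratios = {
    name: relative_deflection(**{name: 0.9})
    for name in ("length", "breadth", "height", "force")
}
for name, ratio in ratios.items():
    print(f"{name} reduced by 10 %: deflection x {ratio:.3f}")
```

Under this assumed model all four quantities act in one definite direction within the permitted range, and height and length have the strongest leverage. Experiments would still be needed to confirm that the real board follows the assumed law.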

    Trial Task:

    System: Greenhouse

    Target quantity: Yield of useful plants

    Influence quantities:

    Types of plants $P_1, P_2, P_3$

    Types of soil $B_1, B_2$

    Chemicals $C_1, C_2, C_3$

    Water quantity (irrigation)

    Light

    Temperature

    1. Perform a detailed system analysis (with system matrix)!

    2. Draw arrays of characteristic lines!

    What can be said about interactions?

    3. Which experimental strategy is recommendable?

    Trial Task:

    a) What does the optimization strategy of a monotonic system look like?

    Global System Matrix:

    Factor A B C D

    Monotonic

    Non-monotonic

    Unknown

    b) What does the optimization strategy of the following system look like?

    Factors A B C D E F

    Levels A1 A2 B1 B2 C1 C2 D1 D2 E1 E2 F1 F2

    Definite X X

    Ambiguous X

    Unknown X X X

    2. Industrial Experimentation Methodology and System Theory

    The terminology or key words summarized under D.O.E., Statistical Experimental Design, Taguchi and Shainin methods, as mentioned earlier, are either required or initiated by customers and are also used in specialized literature.

    With respect to the practical relevance of the methods mentioned above, reference is made to the following:

    Taguchi Method

    The Taguchi method is characterized by, among other things, the use of so-called orthogonal arrays to reduce the required extent of the experiment. The use of the method depends upon the negligibility of interactions or, in exceptional cases, the predictability of interactions. These assumptions are controversial; nevertheless, successful examples are often quoted in the literature. These successes are not verifiable and usually not rationally comprehensible. What has been confirmed is that substantial misstatements can be demonstrated with orthogonal arrays.
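One way such a misstatement can arise is sketched below. The example uses a four-run, two-level array for three factors in -1/+1 coding, in which the third column equals the product of the first two (an assumed textbook-style construction, not quoted from the pamphlet), together with an invented response in which factor C has no influence at all but A and B interact.

```python
# Four-run two-level array for factors A, B, C in -1/+1 coding.
# ASSUMPTION: column C is the product of columns A and B.
ARRAY = [(-1, -1, +1), (-1, +1, -1), (+1, -1, -1), (+1, +1, +1)]


def true_response(a, b, c):
    """Hypothetical system: C has no influence, but A and B interact."""
    return 10 + 2 * a + 3 * b + 4 * a * b


def main_effect(runs, responses, column):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for run, y in zip(runs, responses) if run[column] > 0]
    low = [y for run, y in zip(runs, responses) if run[column] < 0]
    return sum(high) / len(high) - sum(low) / len(low)


responses = [true_response(*run) for run in ARRAY]
effects = {name: main_effect(ARRAY, responses, i) for i, name in enumerate("ABC")}
print(effects)
```

The evaluation attributes the largest effect to C, although C does nothing in the assumed system: the A-B interaction appears in the C column. The array alone gives no hint that this has happened.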

    D.O.E. (= Design of Experiments)

    Anybody who has ever thought about performing an experiment has practiced experimental design. Thus, the question of whether one is for or against experimental design can never arise. With regard to the contents of textbooks on the subject of D.O.E., however, there are some reservations, for instance:

    All algorithmic approaches are based on models, i.e. a mathematically quantitative model is suggested to represent the reality to be investigated. All subsequent procedures (experimental designs, evaluations etc.) are only reasonable if the model adequately describes the reality.

    The difficulty of selecting the right model is fundamental in nature.

    From the structure of the results, it is not possible to recognize whether the model is adequate (i.e. verification is possible neither a priori nor a posteriori).

    A way out of this difficulty is only possible via a system-theoretical approach.
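The reservation that the results do not reveal whether the model is adequate can be illustrated with a deliberately simple case. The sketch assumes a hypothetical curved system that is investigated at only two levels and evaluated with a straight-line model; all names and numbers are invented for illustration.

```python
def true_system(x):
    """Hypothetical system, unknown to the experimenter, with a maximum at x = 1."""
    return 5.0 - (x - 1.0) ** 2


def fit_line(x1, y1, x2, y2):
    """Straight line through two experimental points: returns (slope, intercept)."""
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


low, high = 0.0, 2.0  # the two levels chosen for the experiment
slope, intercept = fit_line(low, true_system(low), high, true_system(high))

# The linear model reproduces both measured points exactly ...
residuals = [true_system(x) - (slope * x + intercept) for x in (low, high)]
# ... but is wrong between them.
midpoint_error = true_system(1.0) - (slope * 1.0 + intercept)

print("slope:", slope, "residuals:", residuals, "error at x = 1:", midpoint_error)
```

The fit is perfect at the design points and suggests that the factor has no influence, while the system in fact has its optimum exactly between the two levels. Nothing in the two results indicates this; only knowledge about the system can.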

    Shainin Method

    For Shainin method see Chapter 10 and [11].

    2.1 Hints on System Analysis

    The prerequisite for a reasonable experimental design is a system analysis. The purpose of a system analysis is, among other things, to present existing knowledge or lack of knowledge about the system to be investigated with the help of elementary terms. Theoretical DOE terms are to be avoided at this stage for various reasons. After executing the system analysis, a decision about the appropriate experimental strategy can be made, and to some extent deduced. Automation in the sense of a strict recipe is not appropriate and therefore not to be pursued. Terminology from General Systems Theory is used.

    Generally it may be assumed that the system to be investigated does not represent a "black box". (It is self-evident that a real black box cannot be investigated with formal procedures.) Hence the specialist will be able to make statements in principle about the input-output situation of the system. Explanations in principle, i.e. qualitatively correct explanations, are preferred to precise quantitative statements that are, for various reasons, often false (better to be approximately right than exactly wrong).

    2.2 Short Description of the System-Theoretical Procedure

    System analysis begins with system definition. This includes listing all relevant target quantities (output) as well as relevant influence parameters (input).

    Here, for instance, flow charts and cause-and-effect diagrams can be helpful. When dealing with input-quantities, care should be taken regarding, e.g., their independence, their susceptibility and the possibility of setting them definitely.

    Subsequent to completion of the system definition, the system characteristics are to be described. System analysis is a recursive process. In the ideal case, all relevant system characteristics are known and investigating the system via experiments becomes unnecessary.

    A statement about system noise belongs to the description of the system characteristics, i.e. the description of the behaviour of the output-quantities when given input-quantities are kept constant.

    The knowledge of system noise has vital consequences for the type and scope of impending investigations. Describing the functional input-output situation is important within the scope of information about system characteristics. In view of the fact that normally several input-quantities exist, describing the influence of the individual input-quantity is especially important. Naturally, the influence of an individual quantity depends upon the position of the other input-quantities, and for this reason it is especially important whether the physical-chemical character of the individual quantity permits making statements in principle about its type of influence, independent of the other quantities.

    Here the following formulation of terms can help further:

    Global description

    Linear influence (as a special case of the monotonic influence):

    A linear influence exists if the functions $f(A, \ldots)$ are always linear (linear influence factors are certainly exceptional cases).

    Monotonic influence:

    A monotonic influence exists if the input-quantity can only influence the output-quantity in one direction.

    Non-monotonic influence factor:

    A non-monotonic (dichotomous) influence exists if the input-quantity can influence the output-quantity in both directions (i.e. both upwards and downwards). Here also the characteristic of the influence factor depends upon the position of the other influence factors. It is generally assumed, however, that the type of the dichotomy is an invariant of the influence factor, i.e. the dichotomy is independent of the position of the other factors.

    Figures: Characteristics of a linear influence factor; characteristics of a monotonic influence factor; characteristics of a dichotomous influence factor.
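The three types of influence can be told apart on a sampled characteristic by looking at the slopes between neighbouring points. The sketch below is a noise-free idealization for one fixed position of the other factors; the function name `classify_influence`, the tolerance and the sample values are assumptions made for illustration. The definitions in the text require the property to hold for every position of the other factors, which a single curve cannot establish.

```python
def classify_influence(xs, ys, tol=1e-9):
    """Classify a sampled characteristic as linear, monotonic or dichotomous."""
    points = list(zip(xs, ys))
    slopes = [
        (y1 - y0) / (x1 - x0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    if max(slopes) - min(slopes) <= tol:
        return "linear"        # same slope everywhere
    if all(s >= -tol for s in slopes) or all(s <= tol for s in slopes):
        return "monotonic"     # slope never changes sign
    return "dichotomous"       # output moves both upwards and downwards


levels = [0, 1, 2, 3]
print(classify_influence(levels, [1, 3, 5, 7]))
print(classify_influence(levels, [0, 1, 4, 9]))
print(classify_influence(levels, [0, 3, 4, 3]))
```

With noisy measurements the tolerance would have to be related to the system noise, and repetitions would be needed before any such classification is trusted.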

    2.2.1 Global System Matrix (i.e. without quoting any levels)

    Considering the special role of discrete input-quantities, every single quantity is then specified according to how someone conversant with the system determines the influence character (without quantification). Reference is made here to the type classification given above.

    The results are summarized in the global system matrix:

    Factors A B C Z

    Linear

    Monotonic

    Dichotomous

    Unknown

    A completed global system matrix can already indicate a sensible experimental strategy.

    Example:

    If all influence factors are monotonic, then it is simple to optimize the system, and the only question that needs to be asked is which influence factors are decisive for the optimum. Here reference can be made to the Shainin method.

    2.2.2 Local System Consideration

    Often, an experimental strategy follows directly from the global system consideration.

    Because the global characteristics array, especially that of the dichotomous influence factors, is often very complex, the system consideration must be localized; i.e., the levels of the influence factors must be prescribed and the properties of the system considered relative to the prescribed levels.

    For the special case of two levels, the following case differentiation is to be made:

    1. Univalent Influence Factor (univalent = definite)

    If the target quantity is only moved in one direction with a change from $A_1$ to $A_2$, i.e. always $f(A_1) - f(A_2) > 0$ or always $f(A_1) - f(A_2) < 0$, then a univalent factor exists.

    Hint: Because of localization, a dichotomous factor can be univalent. To some extent, however, there exists some correspondence between univalent and monotonic factors.

    2. Bivalent Influence Factors (bivalent = ambiguous)

    Bivalent factors, according to definition, are factors which are not univalent. That means that the factor, depending on the position of the other factors, influences the target quantity both upwards and downwards when the level of the influence factor is changed as prescribed. The behaviour of a bivalent factor is, as such, synergetic or antagonistic. It is of special importance to find out which of the other factors cause the changes.
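The distinction between univalent and bivalent factors amounts to checking whether the sign of f(A1) - f(A2) stays the same for every position of the other factors. The sketch below does this for two invented model functions; the names `classify_factor`, `additive` and `interacting` are hypothetical, and in practice the function values would come from knowledge of the system or from experiments.

```python
def classify_factor(f, levels, other_positions):
    """Return 'univalent' if f(A1) - f(A2) has the same strict sign everywhere."""
    a1, a2 = levels
    signs = set()
    for others in other_positions:
        diff = f(a1, *others) - f(a2, *others)
        signs.add((diff > 0) - (diff < 0))  # +1, 0 or -1
    return "univalent" if signs in ({1}, {-1}) else "bivalent"


def additive(a, b):
    """Hypothetical system without interaction between A and B."""
    return a + b


def interacting(a, b):
    """Hypothetical system in which B reverses the influence of A."""
    return a * b


positions_of_b = [(-1,), (1,)]
print(classify_factor(additive, (1, 2), positions_of_b))
print(classify_factor(interacting, (1, 2), positions_of_b))
```

In the second system, changing A from level 1 to level 2 raises the target quantity at one position of B and lowers it at the other, which is the bivalent behaviour described above, with B as the factor causing the change.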

    2.2.3 Local System Matrix

    (depending upon the selected levels, i.e. there exists not only one local system matrix).

    The results of the local system consideration are summarized in the local system matrix.

    Example:

    Factors A B C Z

    Levels A1 A2 B1 B2 C1 C2 Z1 Z2

    Univalent

    Bivalent

    Unknown

    A completed local system matrix gives an indication of the complexity of localized problems. The simplest case exists when all influence factors are univalent. Then the experimental strategy is obvious. The most difficult case exists when all influence factors are bivalent or when the character of the influence is unknown.

    In this case, a simple experimental strategy is (without further information) impossible. In particular, reasonable optimization with a small experimental series is not attainable.

    2.3 Summary

    The statement made in QS-Info 1/90 that "there is no alternative to statistical design of experiments" is only correct if, by "statistical design of experiments", one understands the systematic, i.e. the system-theoretical, design of experiments with consideration of the statistical points of view.

    If, however, by "statistical design of experiments" one understands the contents of the textbooks about statistical design of experiments (from Fisher via Box to Taguchi), then it is assumed that these contents are not, or are only seldom, transferable to real life. Similar reservations are made with respect to commercial software packages. In particular, every polemic against the so-called conventional methods is uncalled-for. A consistent application of the system-theoretical attitude will often lead to conventional types of investigation; in other cases, however, it can lead to the formal approaches being seen as promising. Holding stubbornly to schools of thought is certainly detrimental in the long term.


    3. Probability Plot

    When one speaks about a normal distribution, one mostly associates this concept with the Gaussian bell-shaped curve. The Gaussian bell-shaped curve is a representation of the probability density function $f(x)$ of the normal distribution:

    $$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}.$$

    This function and its graphic representation are printed on the 10 DM bank note beside the portrait, in honour of the mathematician Carl Friedrich Gauss.

    The distribution function of the normal distribution assigns to every value $x$ the probability that a random variable $X$ takes a value between $-\infty$ and $x$. One obtains this distribution function $F(x)$ by integrating the density function given above:

    $$F(x) = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}\left(\frac{v-\mu}{\sigma}\right)^2} \, dv.$$

    $F(x)$ corresponds to the area under the Gaussian bell-shaped curve up to the value $x$.

    The graphical representation of this function is s-shaped. Strictly speaking, it is this curve that one should think of whenever a normal distribution is concerned.

    If the y-axis of this representation is now distorted such that the s-shaped curve becomes a straight line, a new coordinate system emerges: the probability paper. The x-axis remains unchanged. Because of this relationship, a normal distribution is always portrayed as a straight line on the probability paper.

    One uses this fact to check graphically whether a given data set is normally distributed. Provided the number of measured values is large enough, one creates a histogram of these values and thus determines the relative frequencies of the values within the classes of a grouping. If the cumulative relative frequencies are then plotted over the right-hand class limits on the probability paper and the resulting points lie approximately on a straight line, it can be inferred that the values of the data set are approximately normally distributed.

    Remark: The recording of measurement values, or of groups of measurement values ordered according to the factor levels, on probability paper is a component of the SAV program (see Chapter 8.4, Computer aid, and [9]).

    Hint: In German, two different terms are used in this context. Wahrscheinlichkeitsnetz stands for the coordinate system in which the data are plotted, and Wahrscheinlichkeitspapier denotes the form (sheet) with the pre-printed coordinate system (see Section 3.2), whereas English textbooks use the term probability paper for both.


    3.1 Probability Plot of Small-Size Samples

    The size of a sample is often not sufficient for creating a histogram or calculating relative frequencies, so that representation on the probability paper by the method described above is not possible. There is a way out of this dilemma, which is explained below.

    The procedure can be understood easily by means of a computer simulation.

    One takes a sample of size $n$: $x_1, x_2, \ldots, x_n$ from a standard normally distributed population ($\mu = 0$, $\sigma = 1$) and arranges the values in order of magnitude:

    $$x_{(1)} \le x_{(2)} \le \ldots \le x_{(n)}.$$

    The number assigned to each sample value in this increasing sequence is called its rank. The smallest value $x_{(1)}$ therefore has the rank 1, the greatest value $x_{(n)}$ the rank $n$.

    Then, for every $x_{(i)}$ ($i = 1, 2, \ldots, n$), one determines the value $F_i = F(x_{(i)})$ from the table of the standard normal distribution.

    If this process is repeated many times, the cumulative frequency $H_i(n)$ for every rank $i$ ensues as a mean value of the $F_i$ (strictly speaking, the median is used).

    For every sample size $6 \le n \le 50$, these cumulative frequencies $H_i(n)$ are given for each rank $i$ in Table 1 (Section 12).

    As an example, we now consider a sample of size 10 that is to be tested for normal distribution:

    2.1 2.9 2.4 2.5 2.5 2.8 1.9 2.7 2.7 2.3.

    The values are sorted according to magnitude:

    1.9 2.1 2.3 2.4 2.5 2.5 2.7 2.7 2.8 2.9.

    The value 1.9 has rank 1, the value 2.9 rank 10. In the table in the appendix (sample size

    n = 10) one finds the cumulative frequencies (in percentage) for every rank i :

    6.2 15.9 25.5 35.2 45.2 54.8 64.8 74.5 84.1 93.8.

    Finally, one chooses a suitable scale for the x-axis of the probability paper, covering the values 1.9 to 2.9, and enters each cumulative frequency against the corresponding sorted sample value on the probability paper. In the example considered above, one therefore marks the following points:

    (1.9; 6.2), (2.1; 15.9), (2.3; 25.5), ...

    ..., (2.7; 74.5), (2.8; 84.1), (2.9; 93.8).

    Because these points are well approximated by an eye-fitted straight line, it can be assumed that the sample values are approximately normally distributed.
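    The graphical check of this example can be mimicked numerically. The following minimal sketch (Python, standard library only) assumes that the distorted y-axis of the probability paper corresponds to the inverse of the standard normal distribution function. The names `probability_plot_points` and `straightness` are illustrative and not part of the original text, and the correlation coefficient is used here only as a rough stand-in for the eye-fitted line.

```python
from statistics import NormalDist, mean

# Sample from the text and the tabulated cumulative frequencies (%) for n = 10
sample = [2.1, 2.9, 2.4, 2.5, 2.5, 2.8, 1.9, 2.7, 2.7, 2.3]
cum_freq_percent = [6.2, 15.9, 25.5, 35.2, 45.2, 54.8, 64.8, 74.5, 84.1, 93.8]


def probability_plot_points(values, freq_percent):
    """Pair the sorted values with their cumulative frequencies,
    mapped onto a normal-quantile axis (assumed model of the paper)."""
    std_normal = NormalDist()
    z = [std_normal.inv_cdf(h / 100.0) for h in freq_percent]
    return list(zip(sorted(values), z))


def straightness(points):
    """Correlation coefficient of the plotted points (1.0 = exactly on a line)."""
    xs, zs = zip(*points)
    mx, mz = mean(xs), mean(zs)
    sxz = sum((x - mx) * (z - mz) for x, z in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    szz = sum((z - mz) ** 2 for z in zs)
    return sxz / (sxx * szz) ** 0.5


points = probability_plot_points(sample, cum_freq_percent)
r = straightness(points)
print(f"correlation of the plotted points: {r:.3f}")
```

    A value of `r` close to 1 indicates that the points lie nearly on a straight line; what counts as close enough remains a judgment call, just as with the eye-fitted line.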


    3.2 Probability Paper

    Plotting the points described above is simplified if the so-called probability paper is used. This is a special form on which horizontal lines are drawn at the positions of the cumulative relative frequencies that correspond to the ranks $i$.

    The probability paper for the sample size $n = 10$ therefore exhibits horizontal lines for the values:

    6.2% 15.9% 25.5% ... 74.5% 84.1% 93.8%.

    Hint:

    The cumulative frequency $H_i(n)$ for the rank $i$ can also be calculated with the following approximation formulas:

    $$H_i(n) = \frac{i - 0.5}{n} \qquad \text{and} \qquad H_i(n) = \frac{i - 0.3}{n + 0.4}.$$

    The deviation from the exact value in the table is insignificant.

    Approximating values for n = 10:

    5% 15% 25% 35% 45% 55% 65% 75% 85% 95%
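    The two approximation formulas of the hint, read here as $(i - 0.5)/n$ and $(i - 0.3)/(n + 0.4)$, are easy to evaluate for any sample size. The short sketch below uses illustrative function names; it reproduces the approximating values listed above for $n = 10$ with the first formula and shows the second formula for comparison.

```python
def cum_freq_simple(i, n):
    """H_i(n) ~ (i - 0.5) / n, returned in percent."""
    return 100.0 * (i - 0.5) / n


def cum_freq_refined(i, n):
    """H_i(n) ~ (i - 0.3) / (n + 0.4), returned in percent."""
    return 100.0 * (i - 0.3) / (n + 0.4)


n = 10
simple = [round(cum_freq_simple(i, n), 1) for i in range(1, n + 1)]
refined = [round(cum_freq_refined(i, n), 1) for i in range(1, n + 1)]
print("simple :", simple)
print("refined:", refined)
```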


    4. Comparison of Sample Means

    4.1 t Test

    The t test is a statistical method with which one can decide whether the mean values of two samples are significantly different. In order to clarify how the t test works, we will perform the following thought experiment:

    From a normally distributed population $N(\mu, \sigma)$ we take two samples, each of size $n$, calculate the mean values $\bar{y}_1$ and $\bar{y}_2$ as well as the standard deviations $s_1$ and $s_2$ (or the variances $s_1^2$ and $s_2^2$), and finally compute the value

    $$t = \sqrt{n} \cdot \frac{|\bar{y}_1 - \bar{y}_2|}{\sqrt{s_1^2 + s_2^2}}.$$

    $t$ can take values between $0$ and $+\infty$. If we repeat this process very often, we expect that mainly values near zero occur and that very large values are rarely found.

    This thought experiment was performed by computer simulation. For $n = 10$ and 3,000 sample pairs ($t$ values), the result was the histogram represented in Fig. 4.1.

    Fig. 4.1


    If one lets the number of samples approach infinity and, at the same time, the class width approach zero, the histogram will approach more and more closely the curve that represents the density function of the t distribution.

    The upper limit of the 99% random variation range (percentage point) is, in this example, $t_{18;\,0.99} = 2.88$, i.e. only in 1% of all cases do values greater than 2.88 occur by chance.

    Percentage points of the t distribution are tabulated for different error probabilities as a function of the number of degrees of freedom $f = 2\,(n - 1)$ (Table 2). The t test approach is based on the relationship represented above.
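    The thought experiment can be repeated with a few lines of code. The following sketch is not the original simulation; it is a minimal reconstruction under the stated conditions ($n = 10$, 3,000 sample pairs from one standard normal population), with an arbitrarily chosen random seed and illustrative function names. It counts how often the value 2.88 is exceeded, which should happen in roughly 1% of all cases.

```python
import random
from statistics import mean, variance


def t_statistic(sample1, sample2):
    """t = sqrt(n) * |mean1 - mean2| / sqrt(s1^2 + s2^2) for equal sample sizes n."""
    n = len(sample1)
    if n != len(sample2):
        raise ValueError("this simple form requires equal sample sizes")
    diff = abs(mean(sample1) - mean(sample2))
    return n ** 0.5 * diff / (variance(sample1) + variance(sample2)) ** 0.5


def simulate_t_values(n=10, pairs=3000, seed=1):
    """Draw sample pairs from one standard normal population and collect t."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    values = []
    for _ in range(pairs):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        values.append(t_statistic(a, b))
    return values


t_values = simulate_t_values()
share_above = sum(t > 2.88 for t in t_values) / len(t_values)
print(f"share of t values above 2.88: {share_above:.2%}")
```

    Because only 3,000 pairs are drawn, the observed share scatters around 1% from run to run when the seed is changed.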

    A decision is to be made whether or not the arithmetic mean values of two existing series of measurements (each of size $n$) can belong to one and the same population. As the so-called null hypothesis, it is therefore assumed that the mean values of the respective populations are equal.

    The test statistic is then calculated from the two mean values $\bar{y}_1$ and $\bar{y}_2$ and the variances $s_1^2$ and $s_2^2$:

    $$t = \sqrt{n} \cdot \frac{|\bar{y}_1 - \bar{y}_2|}{\sqrt{s_1^2 + s_2^2}} \qquad \text{for } n_1 = n_2 = n.$$

    If the result is $t > t_{2(n-1);\,0.99}$, i.e. $t$ lies outside the 99% random variation range, the null hypothesis is rejected.
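    As a small worked case, the test decision can be written out for summary statistics. The numbers below (means 2.48 and 2.80, standard deviations 0.30 and 0.25, $n = 10$) are hypothetical and chosen only for illustration; the critical value 2.88 is the percentage point $t_{18;\,0.99}$ quoted in the text for $n = 10$.

```python
def t_from_summary(n, mean1, mean2, s1, s2):
    """Test statistic for two series of equal size n, from means and standard deviations."""
    return n ** 0.5 * abs(mean1 - mean2) / (s1 ** 2 + s2 ** 2) ** 0.5


T_CRITICAL_18_99 = 2.88  # t(18; 0.99), two-sided, as quoted in the text for n = 10

# Hypothetical summary statistics of two measurement series
t_value = t_from_summary(n=10, mean1=2.48, mean2=2.80, s1=0.30, s2=0.25)
reject_null = t_value > T_CRITICAL_18_99

print(f"t = {t_value:.3f}, reject null hypothesis: {reject_null}")
```

    With these assumed values, $t$ stays below 2.88, so the difference of the means would not be judged significant at the 1% level.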

    Hint: The expression for the test statistic $t$ is applicable in this simple form only when both the variances of the populations and the sample sizes are assumed to be equal ($\sigma_1^2 = \sigma_2^2$ and $n_1 = n_2 = n$). The prerequisite of equal variances can be tested with the help of an F test (see Section 5).

    The t test, in the form represented here, tests the null hypothesis $\mu_1 = \mu_2$ against the alternative $\mu_1 \ne \mu_2$. The question is therefore two-sided. For this reason, the expression for $t$ contains the absolute value of the difference of the means.

    $t$ can hence only assume values $\ge 0$, so that the distribution depicted in Figure 4.1 results.

    Table 2 in Section 12 gives the 95%, 99%, and 99.9% percentage points of the t distribution for the two-sided question. They correspond to the one-sided percentage points 97.5%, 99.5% and 99.95%.


    4.2 Minimum Sample Size

    In the preceding Section 4.1 it was explained how one can decide, by means of a t test, whether or not the mean values of two samples are significantly different.

    This decision is frequently the goal of experiments in which the change of a target characteristic as a function of two system states or two settings of an influence factor is to be determined. The subsequent intention, with a view to system optimization, is to choose the better of the two selected settings.

    This applies especially to experiments which use orthogonal arrays, in which several influence factors are varied concurrently on two levels (see Chapter 7).

    The factorial analysis of variance carried out when evaluating such experiments (see 8.2) is, in principle, nothing other than a comparison of the mean values of all experimental results obtained for the two settings (levels) of an influence factor, taking the experimental noise into account.

    In the preparatory phase of such experimental investigations, the experimenter often asks two questions: which minimum mean value difference is actually of interest in view of the target (system optimization, production simplification, cost reduction), and which minimum sample size $n$ must be chosen so that this minimum mean value difference, if it actually exists, is identified as significant in the evaluation of the experiment.

    From the expression for the test statistic $t$ (see Section 4.1)

    $$t = \sqrt{n} \cdot \frac{|\bar{y}_1 - \bar{y}_2|}{\sqrt{s_1^2 + s_2^2}}$$

    it is apparent that, for a significant test result, $n$ must be the greater, the smaller the mean value difference $|\bar{y}_1 - \bar{y}_2|$ is and the greater the variances $s_1^2$ and $s_2^2$ of the two series to be compared are. Note that the table value $t_{\text{Table}}$ becomes smaller as the number of degrees of freedom $f = 2\,(n - 1)$ increases.

    A small difference of mean values combined with a large variance of the distributions means that the two groups of values are indistinguishable, or hardly distinguishable, in a graphical representation of the two measurement series.

    Based on the previous discussion, it is possible to estimate the minimum sample size $n$ roughly by expressing the mean value difference as a multiple of the square root of the mean variance

    $$\frac{s_1^2 + s_2^2}{2}$$

    and comparing, for different $n$, the calculated test statistic $t$ with $t_{\text{Table}}$ (observe the degrees of freedom and the significance level!).


    Besides this trial method, however, there is an exact method for deriving the minimum sample size from the statistical point of view, which we only sketch roughly at this point (derivation in [1] and [7]).

    When the mean values of two series of measurements are compared and the corresponding test decision is made, two types of error are possible.

    In the first case, both series of measurements originate from the same population, i.e. there is no real difference. If one decides here, on the basis of a t test, that a difference between the two mean values exists, then an error of the first kind ($\alpha$) is made. Its probability corresponds to the significance level of the t test (for example $\alpha = 1\%$).

    If, in the second case, a difference of the mean values actually exists, i.e. the measured series originate from two different populations, then this will not be indicated with absolute certainty by the test. The test result can, by chance, indicate that this difference does not exist. In this case one speaks of an error of the second kind ($\beta$).

    For the person performing the experiment, both types of error are unpleasant: either further expensive investigations, or even changes in the production process, are initiated because of the apparently significant effect of an influence factor (error of the first kind; type I error), or the chance to make possible process improvements is missed because an actually significant effect is not identified (error of the second kind; type II error).

    The minimum sample size $n$ which is required in order to identify a real mean value difference depends, in line with the above plausibility consideration, upon the distance of the mean values given in units of the standard deviation,

    $$D = \frac{\mu_2 - \mu_1}{\sigma},$$

    and upon the error probabilities $\alpha$ and $\beta$:

    $$n = 2 \left( \frac{u_\alpha + u_\beta}{D} \right)^2.$$

    In the concrete case of comparing two series of measurements, the mean values $\mu_1$ and $\mu_2$ as well as the standard deviation $\sigma$ of the population (and consequently also $D$) are not known. They are estimated by the empirical values $\bar{y}_1$, $\bar{y}_2$ and $s$. For this reason, the t distribution must be taken as a basis when calculating $n$ according to the given formula.

    Accordingly, $u_\alpha$ and $u_\beta$ are the abscissa values $u$ at which the t distribution assumes the values $\alpha$ (two-sided) and $\beta$ (one-sided).

    Smaller error probabilities, i.e. smaller probabilities of type I ($\alpha$) and type II ($\beta$) errors, mean that the two distributions to be compared, and thus also the distributions of the mean values, may overlap only marginally. For this, with a given distance $D$ of the mean values, the sample size $n$ must be chosen adequately large.
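    A rough numerical impression of the required sample size can be obtained as follows. This sketch rests on two assumptions that go beyond the text: the formula is read as $n = 2\,((u_\alpha + u_\beta)/D)^2$, and the abscissa values are taken from the standard normal distribution instead of the t distribution, which tends to give somewhat smaller values of $n$ than the exact procedure. The default error probabilities ($\alpha = 1\%$, $\beta = 10\%$) are hypothetical choices.

```python
from math import ceil
from statistics import NormalDist


def minimum_sample_size(d, alpha=0.01, beta=0.10):
    """Approximate n = 2 * ((u_alpha + u_beta) / D)^2, rounded up.

    u_alpha: two-sided normal quantile for the type I error probability alpha
    u_beta:  one-sided normal quantile for the type II error probability beta
    (normal approximation; the text bases the exact values on the t distribution)
    """
    std_normal = NormalDist()
    u_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    u_beta = std_normal.inv_cdf(1.0 - beta)
    return ceil(2.0 * ((u_alpha + u_beta) / d) ** 2)


for d in (0.5, 1.0, 2.0):
    print(f"D = {d}: n >= {minimum_sample_size(d)}")
```

    The output illustrates the statement above: halving the distance $D$ roughly quadruples the required sample size.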


    Figure panels: stronger effect, medium effect, weaker effect.


    5. F Test

    The F test is a statistical method with which it can be decided whether the variances of two samples are significantly different.

    Just as in the case of the t test, the way the test works can be explained using the result of a computer simulation.

    We take two samples of sizes $n_1$ and $n_2$ from a normally distributed population $N(\mu, \sigma)$, calculate the sample variances $s_1^2$ and $s_2^2$, and from these finally calculate the quantity

    $$F = \frac{s_1^2}{s_2^2}.$$

    $F$ can take values between $0$ and $+\infty$. It is plausible that, when this procedure is repeated frequently, values near zero and very large values result only very rarely.

    The results of a computer simulation in which the $F$ values were determined for $N = 3{,}000$ sample pairs with sample sizes $n_1 = n_2 = n = 9$ are represented as a histogram in the following figure.

    Figure 5.1


    If one lets the number of samples approach infinity and, at the same time, the class width approach zero, the histogram will approximate the line in Fig. 5.1 (density function of the F distribution).

    The shape of the histogram depends upon the sample sizes $n_1$ and $n_2$ of the investigated sample pairs; the shape of the density function of the F distribution correspondingly depends upon the degrees of freedom $f_1 = n_1 - 1$ and $f_2 = n_2 - 1$.

    The upper limit of the 99% random variation range (percentage point) in the calculated example is $F_{8;\,8;\,0.99} = 6.03$, i.e. only in 1% of all cases (error probability) does

    $$\frac{s_1^2}{s_2^2} \ge 6.03$$

    occur by chance.

    The percentage points of the F distribution are tabulated in the appendix for different error probabilities as a function of the degrees of freedom $f_1$ and $f_2$.
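    The simulation described for the F test can be sketched in the same way as for the t test. The code below is a minimal reconstruction, not the original program: it draws 3,000 pairs of samples of size 9 from one standard normal population (with an arbitrary seed) and counts how often the variance ratio reaches the percentage point 6.03 quoted above.

```python
import random
from statistics import variance


def simulate_f_values(n1=9, n2=9, pairs=3000, seed=1):
    """Variance ratios s1^2 / s2^2 for sample pairs from one normal population."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    values = []
    for _ in range(pairs):
        a = [rng.gauss(0.0, 1.0) for _ in range(n1)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n2)]
        values.append(variance(a) / variance(b))
    return values


f_values = simulate_f_values()
share_above = sum(f >= 6.03 for f in f_values) / len(f_values)
print(f"share of F values >= 6.03: {share_above:.2%}")
```

    The share should lie near 1%, with some scatter because of the limited number of sample pairs.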

    The relationship represented above makes the F test approach understandable.

    It is to be decided whether or not two series of measurements, with sizes $n_1$ and $n_2$, originate from two normally distributed populations with the same variance (the mean values do not need to be known).

    As the null hypothesis, it is assumed that the variances of the respective populations are equal: $\sigma_1^2 = \sigma_2^2$.

    The test statistic

    $$F = \frac{s_1^2}{s_2^2}$$

    is then calculated from the variances $s_1^2$ and $s_2^2$ of the two measurement series and compared with the percentage point of the F distribution. If the result is $F > F_{n_1 - 1;\, n_2 - 1;\, 0.99}$, i.e. $F$ lies outside the 99% random variation range, the null hypothesis is rejected.

    Remark:

    The alternative hypothesis is $\sigma_1^2 > \sigma_2^2$; the question is one-sided.

    If, as a matter of principle, one writes the greater of the two variances $s_1^2$ and $s_2^2$ above the fraction line, then $F$ can only assume values greater than 1; the question is now two-sided. If an error probability of $\alpha = 1\%$ is chosen, the 99.5% percentage point must then be used.


    6. Analysis of Variance (ANOVA)

    With the help of the t test (Section 4.1) it is determined whether the mean values of two series of measurements are significantly different. The series of measurements to be compared can be considered formally as the experimental results for the two levels 1 (e.g. material A) and 2 (material B) of an individual influence factor (material).

    If one expands the one-factor-at-a-time experiment to more than two levels (in general: $k$ levels), it is no longer possible to compare the mean values using the t test. In this case, the evaluation can be carried out by means of the analysis of variance.

    If the factor A has no influence upon the measurement results, then all individual results $y_{ij}$ can be seen as originating from the same population. The $y_{ij}$, and thus also the mean values $\bar{y}_i$, are then only subject to random deviations (experimental noise) from the common mean value $\mu$.

    In the other case, in which the factor A has a significant influence upon the result of measurement, the mean values $\mu_1, \ldots, \mu_k$ of the distributions belonging to the levels $A_1, \ldots, A_k$ of the factor A will be different.

    In the analysis of variance, one presupposes $k$ independent, normally distributed populations with the same variance and formulates the null hypothesis:

    All measured values originate from populations with the same mean value, $\mu_1 = \mu_2 = \ldots = \mu_k$. (Remark: since identical variances were presupposed, the null hypothesis means that all measured values originate from one and the same population.)

    One then calculates the mean variance within the experimental rows (levels of A)

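    $$s_2^2 = \overline{s_y^2} = \frac{1}{k} \sum_{i=1}^{k} s_i^2$$

    (where $s_i^2$ is the variance of the results within row $i$) as well as the variance between the experimental rows (levels of A), $s_1^2 = s_{\bar{y}}^2$.

    $\overline{s_y^2}$ is a measure of the experimental noise; $s_{\bar{y}}^2$ is the variance of the mean values $\bar{y}_i$.

    If the null hypothesis is correct, both quantities yield estimates of the variance of the underlying population:

    $$\hat{\sigma}_1^2 = n \cdot s_{\bar{y}}^2, \qquad \hat{\sigma}_2^2 = \overline{s_y^2}.$$

    The factor $n$ has to be included because of the relationship

    $$\sigma_{\bar{y}} = \frac{\sigma_y}{\sqrt{n}}.$$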


    Finally, one conducts an F test with the test statistic

    $$F = \frac{n \cdot s_{\bar{y}}^2}{\overline{s_y^2}}$$

    (comparison of both estimates) and rejects the null hypothesis formulated above if

    $$F > F_{k-1;\; k(n-1);\; 0.99}$$

    (percentage points of $F$ in the appendix).

    Rejection of the null hypothesis means: a significant difference exists between the mean values $\bar{y}_i$ of the results of measurement for the levels of factor A, or: factor A has a significant influence upon the result of measurement.
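    The calculation of this test statistic can be sketched in a few lines. The function name `anova_f` and the data are hypothetical; the sketch assumes $k$ rows with the same number $n$ of results each and returns $F$ together with the two degrees of freedom needed to look up the percentage point in the table.

```python
from statistics import mean, variance


def anova_f(rows):
    """One-way analysis of variance for k rows of n values each.

    Returns (F, f1, f2) with F = n * s_ybar^2 / mean(s_i^2),
    f1 = k - 1 and f2 = k * (n - 1).
    """
    k = len(rows)
    n = len(rows[0])
    if any(len(row) != n for row in rows):
        raise ValueError("all rows must contain the same number of values")
    row_means = [mean(row) for row in rows]
    var_of_means = variance(row_means)                    # variance of the row means
    mean_of_vars = mean([variance(row) for row in rows])  # mean variance within the rows
    return n * var_of_means / mean_of_vars, k - 1, k * (n - 1)


# Hypothetical results for three levels of a factor A
rows = [[10.0, 12.0, 11.0], [14.0, 15.0, 16.0], [20.0, 19.0, 21.0]]
f_value, f1, f2 = anova_f(rows)
print(f"F = {f_value:.1f} with f1 = {f1}, f2 = {f2}")
```

    The resulting $F$ is then compared with the tabulated percentage point for $f_1$ and $f_2$ degrees of freedom.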

    Figure 6.1

    Figure 6.2


    Figures 6.1 and 6.2 should illustrate the importance of this fact. Along the diagonals, the

    density functions of normal distributions with equal variance are represented respectively.

    In the corners of the figures, the density functions of the mixture of distributions (top left)

    and of the distribution of the mean values (bottom right) are represented.

    The distributions in Figure 6.1 are subject to only small fluctuations of their mean values, and the mixture of distributions is nearly normally distributed.

    The variances of the distribution of mean values and of the original distributions hardly differ, so that an F test does not reject the null hypothesis (identical mean values). In comparison, the mean values of the seven distributions in Figure 6.2 show greater fluctuations; the variance of the mean-value distribution is substantially (significantly) greater than that of the single distributions.

    Accordingly, the null hypothesis, that is, the assumption of identical mean values, will in this case be rejected within the scope of an analysis of variance.

    6.1 Deriving the Test Statistic

    The term analysis of variance is based on the decomposition of the variation of all measured values into two parts: random variation (experimental noise) and systematic deviation of the mean values, in accordance with the formalism represented above.

    This decomposition is described as follows. When $k$ represents the number of rows and $n$ the number of measured values (experiments) per row, then the overall variance of all $n \cdot k$ measured values is given by

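    $$s^2 = \frac{1}{n \cdot k - 1} \sum_{i=1}^{n} \sum_{j=1}^{k} (y_{ij} - \bar{y})^2.$$

    The quantity $Q = (n \cdot k - 1) \cdot s^2$ is called the sum of squares (SS).

    $$Q = \sum_{i=1}^{n} \sum_{j=1}^{k} (y_{ij} - \bar{y})^2$$

    $$Q = \sum_{i=1}^{n} \sum_{j=1}^{k} (y_{ij} - \bar{y}_j + \bar{y}_j - \bar{y})^2 \qquad \text{(expansion with zero)}$$

    $$Q = \sum_{i=1}^{n} \sum_{j=1}^{k} \left[ (y_{ij} - \bar{y}_j)^2 + 2\,(y_{ij} - \bar{y}_j)(\bar{y}_j - \bar{y}) + (\bar{y}_j - \bar{y})^2 \right]$$

    Here $\bar{y}_j$ is the mean value of row $j$ and $\bar{y}$ the mean value of all measured values. We first consider the middle term:

    $$\sum_{i=1}^{n} \sum_{j=1}^{k} (y_{ij} - \bar{y}_j)(\bar{y}_j - \bar{y}) = \sum_{i=1}^{n} \sum_{j=1}^{k} (y_{ij} - \bar{y}_j)\,\bar{y}_j - \bar{y} \sum_{i=1}^{n} \sum_{j=1}^{k} (y_{ij} - \bar{y}_j)$$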


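    $$= \sum_{j=1}^{k} \bar{y}_j \sum_{i=1}^{n} (y_{ij} - \bar{y}_j) - \bar{y} \sum_{i=1}^{n} \sum_{j=1}^{k} y_{ij} + \bar{y} \sum_{i=1}^{n} \sum_{j=1}^{k} \bar{y}_j$$

    $$= \sum_{j=1}^{k} \bar{y}_j \sum_{i=1}^{n} (y_{ij} - \bar{y}_j) - n \cdot k \cdot \bar{y}^2 + n \cdot k \cdot \bar{y}^2$$

    $$= \sum_{j=1}^{k} \bar{y}_j \, (n \cdot \bar{y}_j - n \cdot \bar{y}_j) = 0$$

    Therefore:

    $$Q = \sum_{i=1}^{n} \sum_{j=1}^{k} (y_{ij} - \bar{y}_j)^2 + \sum_{i=1}^{n} \sum_{j=1}^{k} (\bar{y}_j - \bar{y})^2$$

    $$Q = \sum_{j=1}^{k} (n - 1) \cdot s_j^2 + \sum_{i=1}^{n} (k - 1) \cdot s_{\bar{y}}^2$$

    $$(n \cdot k - 1) \cdot s^2 = k \cdot (n - 1) \cdot \overline{s_y^2} + n \cdot (k - 1) \cdot s_{\bar{y}}^2$$

    $$Q = Q_1 + Q_2$$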

    Overall variation = experimental noise + variation of mean values

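    Degrees of freedom of $Q_1$: $f_1 = k \cdot (n - 1)$
    Degrees of freedom of $Q_2$: $f_2 = k - 1$
    Degrees of freedom of $Q$: $f = n \cdot k - 1$

    Equation of the number of degrees of freedom:

    $$f = f_2 + f_1$$

    $$n \cdot k - 1 = k - 1 + k \cdot (n - 1)$$

    $$n \cdot k - 1 = n \cdot k - 1$$

    Test statistic:

    $$F = \frac{Q_2 / f_2}{Q_1 / f_1} = \frac{\dfrac{n \cdot (k - 1)}{k - 1} \cdot s_{\bar{y}}^2}{\dfrac{k \cdot (n - 1)}{k \cdot (n - 1)} \cdot \overline{s_y^2}} = \frac{n \cdot s_{\bar{y}}^2}{\overline{s_y^2}}$$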
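    The decomposition $Q = Q_1 + Q_2$ can be checked numerically. The following sketch uses hypothetical data and an illustrative function name; it computes the total sum of squares, the noise part $Q_1$ (within the rows) and the part $Q_2$ due to the row means, and forms the test statistic from the sums of squares and their degrees of freedom.

```python
from statistics import mean


def sum_of_squares_split(rows):
    """Return (Q, Q1, Q2) for k rows of n values each.

    Q  : total sum of squares about the overall mean
    Q1 : sum of squares within the rows (experimental noise)
    Q2 : sum of squares of the row means about the overall mean
    """
    n = len(rows[0])
    all_values = [y for row in rows for y in row]
    overall_mean = mean(all_values)
    row_means = [mean(row) for row in rows]
    q = sum((y - overall_mean) ** 2 for y in all_values)
    q1 = sum((y - m) ** 2 for row, m in zip(rows, row_means) for y in row)
    q2 = n * sum((m - overall_mean) ** 2 for m in row_means)
    return q, q1, q2


# Hypothetical results: k = 3 rows with n = 3 values each
rows = [[10.0, 12.0, 11.0], [14.0, 15.0, 16.0], [20.0, 19.0, 21.0]]
k, n = len(rows), len(rows[0])
q, q1, q2 = sum_of_squares_split(rows)
f_value = (q2 / (k - 1)) / (q1 / (k * (n - 1)))
print(f"Q = {q:.1f}, Q1 = {q1:.1f}, Q2 = {q2:.1f}, F = {f_value:.1f}")
```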


    6.2 Equality Test of Several Variances (According to Levene)

    With the one-way analysis of variance, it is investigated whether a factor A has a significant influence upon the result of measurement. It is thus determined whether the mean values $\mu_1, \ldots, \mu_k$ of the measurement results which belong to the levels $A_1, \ldots, A_k$ are significantly different.

    Frequently, the aim of the experiments in this case is to maximise or to minimise a target quantity.

    In connection with investigating disturbance-insensitive (robust) designs, it can be of interest to find parameter settings at which the experimental results exhibit as little variation (variance) as possible.

    For this reason, it is sensible to check first whether the variances of the results in the individual experimental rows are significantly different.

    Experiment No.   Results                              Mean          Variance
    1                $x_{11}, x_{12}, \ldots, x_{1n}$     $\bar{x}_1$   $s_1^2$
    2                $x_{21}, x_{22}, \ldots, x_{2n}$     $\bar{x}_2$   $s_2^2$
    k                $x_{k1}, x_{k2}, \ldots, x_{kn}$     $\bar{x}_k$   $s_k^2$

    Deviating from our previous notation, we designate the measured values with $x$ and calculate the row mean values $\bar{x}_i$ as well as the variances within the rows $s_i^2$.

    To test the equality of these variances $s_i^2$, Levene proposes the following method:

    0. Formulate the null hypothesis: All results of measurement originate from populations with equal variance: $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_k^2$.

    1. Calculate the absolute deviations of the results of measurement $x_{ij}$ from the mean values $\bar{x}_i$. This corresponds to a transformation according to the equation

    $$y_{ij} = |x_{ij} - \bar{x}_i|.$$

    The transformed values $y_{ij}$ are entered in the evaluating scheme. Further calculation is done exclusively with the transformed values $y_{ij}$.


    Experiment No.   Results                              Mean          Variance
    1                $y_{11}, y_{12}, \ldots, y_{1n}$     $\bar{y}_1$   $s_1^2$
    2                $y_{21}, y_{22}, \ldots, y_{2n}$     $\bar{y}_2$   $s_2^2$
    k                $y_{k1}, y_{k2}, \ldots, y_{kn}$     $\bar{y}_k$   $s_k^2$

    2. Calculate the mean values $\bar{y}_i$ and the variances $s_i^2$ of the transformed values in each row.

    3. Calculate the mean value $\overline{s_y^2}$ of the variances.

    4. Calculate the variance $s_{\bar{y}}^2$ of the mean values $\bar{y}_i$.

    5. Conduct an F test with the test statistic

    $$F = \frac{n \cdot s_{\bar{y}}^2}{\overline{s_y^2}}$$

    Degrees of freedom: $f_1 = k - 1$, $f_2 = (n - 1) \cdot k$.

    If $F$, for example, is greater than the percentage point $F_{k-1;\; k(n-1);\; 0.99}$, then the null hypothesis will be rejected with an error probability of 1%.
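    Steps 1 to 5 can be sketched as follows. The function name `levene_f` and the data are hypothetical; the sketch assumes $k$ rows with the same number $n$ of results and follows the procedure described here (absolute deviations from the row means, then the F statistic of the one-way analysis of variance applied to the transformed values).

```python
from statistics import mean, variance


def levene_f(rows):
    """Test statistic for the equality of several variances (procedure as in the text).

    Returns (F, f1, f2) with f1 = k - 1 and f2 = k * (n - 1).
    """
    k = len(rows)
    n = len(rows[0])
    if any(len(row) != n for row in rows):
        raise ValueError("all rows must contain the same number of values")
    # Step 1: absolute deviations from the row means, y_ij = |x_ij - mean(x_i)|
    transformed = [[abs(x - mean(row)) for x in row] for row in rows]
    # Step 2: row means and row variances of the transformed values
    row_means = [mean(row) for row in transformed]
    row_vars = [variance(row) for row in transformed]
    # Step 3: mean of the variances; step 4: variance of the means
    mean_of_vars = mean(row_vars)
    var_of_means = variance(row_means)
    # Step 5: test statistic
    return n * var_of_means / mean_of_vars, k - 1, k * (n - 1)


# Hypothetical results with equal row means but visibly different scatter
rows = [[10.0, 11.0, 12.0, 13.0], [8.0, 11.0, 12.0, 15.0], [2.0, 10.0, 13.0, 21.0]]
f_value, f1, f2 = levene_f(rows)
print(f"F = {f_value:.3f} with f1 = {f1}, f2 = {f2}")
```

    The value of $F$ is then compared with the tabulated percentage point for $f_1$ and $f_2$ degrees of freedom.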


    7. Design of Experiments with Orthogonal Arrays and Evaluating such Experiments

    In this section, two simple examples are used to show how orthogonal arrays are applied:

    Example 1: One-factor-at-a-time method

    The change in length of an alloy is to be determined by experiment. Two experiments are performed.

    Experiment 1: length at $T_1 = 25\,^{\circ}\text{C}$
    Experiment 2: length at $T_2 = 100\,^{\circ}\text{C}$

    $L_1(25\,^{\circ}\text{C}) = 100.04\ \text{cm}$, $L_2(100\,^{\circ}\text{C}) = 100.16\ \text{cm}$

    One starts from the assumption that a linear relationship exists between expansion and temperature and therefore wants to calculate the equation of the straight line in order to determine arbitrary intermediate values.

    Equation of the straight line: $L = A_0 + A_1 \cdot T$

    By means of a coordinate transformation, represented schematically in Figure 7.0.1 by the second x-axis, the pair of values ($T_1$, $T_2$) is formally transformed into ($-1$, $+1$).

    Figure 7.0.1


    The transformation equation is

    $$x = \frac{T - \dfrac{T_2 + T_1}{2}}{\dfrac{T_2 - T_1}{2}}.$$

    Remark: By a simple rearrangement, this equation can be written in the form given in 7.1:

    $$x = \frac{2}{T_2 - T_1} \cdot (T - T_2) + 1.$$

    Substituting the values $T_1 = 25\,^{\circ}\text{C}$ and $T_2 = 100\,^{\circ}\text{C}$ gives

    $$x = \frac{T - 62.5}{37.5}.$$

    For $T = T_2$ it follows that $x = +1$. For $T = T_1$ it follows that $x = -1$.

    In the transformed coordinate system, the equation of the straight line is $L = a_0 + a_1 \cdot x$.

    From this it follows
    for $x = +1$: $100.16 = a_0 + a_1$,
    for $x = -1$: $100.04 = a_0 - a_1$.

    At this point the reason for the coordinate transformation becomes clear: the coefficients $a_0$ and $a_1$ are easy to calculate by adding or subtracting the two equations:

    $$a_0 = \frac{100.16 + 100.04}{2} = 100.1, \qquad a_1 = \frac{100.16 - 100.04}{2} = 0.06.$$

    The coefficient $a_0$ is the mean value of both lengths: $a_0 = \frac{L_2 + L_1}{2}$.

    The coefficient $a_1$ is half the effect (see Figure 7.0.1): $a_1 = \frac{L_2 - L_1}{2}$.

    Thus, in the transformed system the equation of the straight line is

    $$L = \frac{L_2 + L_1}{2} + \frac{L_2 - L_1}{2} \cdot x$$

    $$L = 100.1 + 0.06 \cdot x.$$

    The equation of the straight line in the original system is found by reverse transformation:

    $$L = 100.1 + 0.06 \cdot \frac{T - 62.5}{37.5}$$

    $$L = 100 + 0.0016 \cdot T.$$
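    The calculation of Example 1 can be retraced with a short sketch. The function name `coded_line` is illustrative; the code maps the two temperatures onto $-1$ and $+1$, determines $a_0$ and $a_1$ as mean value and half effect, and evaluates the straight line at arbitrary temperatures.

```python
def coded_line(t1, l1, t2, l2):
    """Straight line L = a0 + a1 * x through two points, with x = -1 at t1 and x = +1 at t2.

    Returns (a0, a1, length), where length(t) evaluates the line at temperature t.
    """
    a0 = (l2 + l1) / 2.0          # mean value of both lengths
    a1 = (l2 - l1) / 2.0          # half effect
    centre = (t2 + t1) / 2.0
    half_range = (t2 - t1) / 2.0

    def length(t):
        x = (t - centre) / half_range   # coordinate transformation
        return a0 + a1 * x

    return a0, a1, length


a0, a1, length = coded_line(25.0, 100.04, 100.0, 100.16)
print(f"a0 = {a0:.2f}, a1 = {a1:.2f}")
print(f"L(25) = {length(25.0):.2f} cm, L(62.5) = {length(62.5):.2f} cm, L(0) = {length(0.0):.2f} cm")
```

    Evaluating the line at $T = 0$ gives the constant term 100 of the equation in the original system.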


    Example 2: Two-Factor Design

    This example is intended to clarify the mathematical procedure followed when evaluating experiments that use orthogonal arrays, by applying it to a known and analytically exact physical fact: Ohm's law.

    We put ourselves in the position of an experimenter who does not know the relationship between voltage, current and resistance and wants to investigate it with the help of a simple experiment.

    We assume that he has conducted four individual experiments according to Figure 7.0.2, and we ignore experimental repetitions and measuring errors.

    $R_1 = 20\ \Omega$, $R_2 = 60\ \Omega$, $I_1 = 4\ \text{A}$, $I_2 = 12\ \text{A}$

    Sought: $U = f(R, I)$

    Transformation:

    $$x_1 = \frac{R - \dfrac{R_2 + R_1}{2}}{\dfrac{R_2 - R_1}{2}} = \frac{R - 40}{20}$$

    $$x_2 = \frac{I - \dfrac{I_2 + I_1}{2}}{\dfrac{I_2 - I_1}{2}} = \frac{I - 8}{4}$$

    Figure 7.0.2


    Multilinear formulation of the solution:

    $$U = a_0 + a_1 x_1 + a_2 x_2 + a_{12} x_1 x_2$$

    1. $x_1 = -1$, $x_2 = -1$: $a_0 - a_1 - a_2 + a_{12} = 80$
    2. $x_1 = +1$, $x_2 = -1$: $a_0 + a_1 - a_2 - a_{12} = 240$
    3. $x_1 = -1$, $x_2 = +1$: $a_0 - a_1 + a_2 - a_{12} = 240$
    4. $x_1 = +1$, $x_2 = +1$: $a_0 + a_1 + a_2 + a_{12} = 720$

    On the right-hand side are the voltages $U$ determined in the individual experiment combinations.

    $$a_0 = \frac{80 + 240 + 240 + 720}{4} = 320$$

    $$a_1 = \frac{720 + 240}{4} - \frac{240 + 80}{4} = 160$$

    $$a_2 = \frac{720 + 240}{4} - \frac{240 + 80}{4} = 160$$

    $$a_{12} = \frac{720 + 80}{4} - \frac{240 + 240}{4} = 80$$

    Substituted in the formulated solution, one gets: $U = 320 + 160\,x_1 + 160\,x_2 + 80\,x_1 x_2$.

    Reverse transformation:

    $$U = 320 + 160 \cdot \frac{R - 40}{20} + 160 \cdot \frac{I - 8}{4} + 80 \cdot \frac{R - 40}{20} \cdot \frac{I - 8}{4}$$

    $$U = R \cdot I$$

    Remark:

    In this example, the right solution (Ohm's law) is bound to come out because the multilinear form $U = a_0 + a_1 x_1 + a_2 x_2 + a_{12} x_1 x_2$ was just the right formulation. A more complex functional relationship with quotients or exponentials of the influence factors would be described by this formulation only approximately, or not at all (see 7.3).
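    The evaluation of Example 2 can be retraced numerically. In the sketch below, the function names are illustrative; the coefficients are computed from the four voltages by the sign pattern of the four experiment combinations, and the multilinear form is then evaluated at an intermediate setting to confirm that it reproduces $U = R \cdot I$.

```python
def multilinear_coefficients(y1, y2, y3, y4):
    """Coefficients of y = a0 + a1*x1 + a2*x2 + a12*x1*x2.

    Order of the results: (x1, x2) = (-1, -1), (+1, -1), (-1, +1), (+1, +1).
    """
    a0 = (y1 + y2 + y3 + y4) / 4.0
    a1 = ((y2 + y4) - (y1 + y3)) / 4.0
    a2 = ((y3 + y4) - (y1 + y2)) / 4.0
    a12 = ((y1 + y4) - (y2 + y3)) / 4.0
    return a0, a1, a2, a12


def voltage(r, i, coeffs, r_levels=(20.0, 60.0), i_levels=(4.0, 12.0)):
    """Evaluate the multilinear form at resistance r and current i."""
    a0, a1, a2, a12 = coeffs
    x1 = (r - sum(r_levels) / 2.0) / ((r_levels[1] - r_levels[0]) / 2.0)
    x2 = (i - sum(i_levels) / 2.0) / ((i_levels[1] - i_levels[0]) / 2.0)
    return a0 + a1 * x1 + a2 * x2 + a12 * x1 * x2


# Voltages of the four experiments from the text
coeffs = multilinear_coefficients(80.0, 240.0, 240.0, 720.0)
print("a0, a1, a2, a12 =", coeffs)
print("U(R = 30, I = 5) =", voltage(30.0, 5.0, coeffs))
```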


    Generalization:

    For two factors and two levels, the equation of the multilinear form in the transformed system is:

    $$y = a_0 + a_1 x_1 + a_2 x_2 + a_{12} x_1 x_2.$$

    The coefficients can easily be determined with the following matrix. This matrix is called an orthogonal arrangement or an orthogonal array (see 7.4). Put simply, the term orthogonality means in this context that both levels (-) and (+) appear equally often in each column (see also the general formulation scheme in 7.4.1). Orthogonality is explained in [1] by means of mathematical orthogonality conditions.

    I    x1   x2   x1x2   y

    +    -    -    +      y1

    +    +    -    -      y2

    +    -    +    -      y3

    +    +    +    +      y4

    $$a_0 = \frac{y_1 + y_2 + y_3 + y_4}{4}$$

    $$a_1 = \frac{(y_2 + y_4) - (y_3 + y_1)}{4}$$

    $$a_2 = \frac{(y_3 + y_4) - (y_1 + y_2)}{4}$$

    $$a_{12} = \frac{(y_1 + y_4) - (y_2 + y_3)}{4}$$

    The coefficient $a_0$ is the mean value of all measurement results. The coefficient $a_1$ is half the mean effect of a change of $x_1$ from -1 to +1:

    $$a_1 = \frac{\mathrm{Effect}(x_1)}{2}$$
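
    The matrix scheme lends itself to a short numerical illustration. The sketch below is an illustration added here, not the booklet's own procedure; the names ARRAY, column, dot and coefficients are hypothetical. It treats each coefficient as the signed column sum of the results divided by the number of runs and also checks the balance and mutual orthogonality of the columns. The data are the four voltages of the preceding example.

```python
# Orthogonal array of the 2^2 design; columns: I, x1, x2, x1*x2.
ARRAY = [
    (+1, -1, -1, +1),
    (+1, +1, -1, -1),
    (+1, -1, +1, -1),
    (+1, +1, +1, +1),
]


def column(j):
    """Return column j of the array as a list of signs."""
    return [row[j] for row in ARRAY]


def dot(u, v):
    """Sum of the element-wise products of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def coefficients(y):
    """Return (a0, a1, a2, a12): signed column sums divided by the run count."""
    n = len(ARRAY)
    return tuple(dot(column(j), y) / n for j in range(4))


# Every factor column is balanced, and distinct columns have a zero dot product.
balanced = all(sum(column(j)) == 0 for j in (1, 2, 3))
orthogonal = all(dot(column(i), column(j)) == 0
                 for i in range(4) for j in range(i + 1, 4))

print(balanced, orthogonal)               # True True
print(coefficients([80, 240, 240, 720]))  # (320.0, 160.0, 160.0, 80.0)
```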


    The representation in Figure 7.1.3 shows the contours of a hill (see Figure 7.1.2) as they are found on topographic charts. In the example shown, a jump from one line to the neighbouring line corresponds to a height difference of 10 m.

    Closely neighbouring contours represent a steep ascent in the direction perpendicular to the contours. If one remains on a closed contour, then one moves, pictorially speaking, at a constant height around the hill.

    We now set aside the picture of a hill and consider, instead of the height, a general function y that depends upon the parameters A and B: $y = f(A, B)$.

    Figure 7.1.2

    Figure 7.1.3


    y is a target characteristic whose value is determined by the setting of the factors A and B. Each setting (A, B) then corresponds to a point in the A-B plane, and this point in turn to a value $y = f(A, B)$.

    One finds for instance the following results:

    A B y

    6 12 43

    12 12 62

    6 20 78

    12 20 113

    The four points (A,B) form a rectangle in Figure 7.1.3.

    They are drawn in Figure 7.1.2 over the A-B plane with y as the third coordinate, which corresponds to the height above this plane.

    From this representation, it is just as apparent as in Figure 7.1.1, that when dealing with

    factorial designs at two levels, a linear model (straight line, plane) is taken as a basis, in

    order to approximate the unknown, in general, curved response surface.

    Figure 7.1.4 shows a further way to represent these results. The target characteristic y is plotted as a function of A with B as a fixed parameter.

    In Figure 7.1.3, an attempt is made to illustrate the three-dimensional surface $y = f(A, B)$ (it corresponds to the surface of the hill) two-dimensionally as a function of both factors A and B.

    The dotted curves in Figure 7.1.4, in contrast, each represent the function y for a fixed B: $y = f(A, B = \text{const.})$

    They are thus the intersection lines of a vertical cut through the surface of the hill along which B is constant (see Figure 7.1.2).

    Figure 7.1.4


    Analogously, the dotted lines in Figure 7.1.5 represent the function y when A is constant.

    These facts are illustrated by the following figures, as further examples.

    Figure 7.1.5

    Figure 7.1.6


    In principle, one can also use these methods for representing results of experiments.

    The above scheme can be simplified by transforming the factor levels $A_1 = 6$, $A_2 = 12$, $B_1 = 12$, $B_2 = 20$ according to the following rule:

    $$X^* = \frac{2}{X_2 - X_1}\,(X - X_2) + 1.$$

    Example:

    $$A_1^* = \frac{2}{A_2 - A_1}\,(A_1 - A_2) + 1 = -1$$

    $$A_2^* = \frac{2}{A_2 - A_1}\,(A_2 - A_2) + 1 = +1$$

    $$B_1^* = \frac{2}{B_2 - B_1}\,(B_1 - B_2) + 1 = -1$$

    $$B_2^* = \frac{2}{B_2 - B_1}\,(B_2 - B_2) + 1 = +1$$
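
    The coding rule can be tried out directly. The following sketch is an illustration only; the function names coded and decoded are not taken from the source, and decoded is simply the algebraic inverse of the rule. It applies the transformation to the levels of A and B used above and to the midpoint of the range of A.

```python
def coded(x, x1, x2):
    """Coding rule X* = 2 * (x - x2) / (x2 - x1) + 1, so x1 -> -1 and x2 -> +1."""
    return 2 * (x - x2) / (x2 - x1) + 1


def decoded(x_star, x1, x2):
    """Inverse mapping from the coded value back to the physical setting."""
    return x2 + (x_star - 1) * (x2 - x1) / 2


print(coded(6, 6, 12), coded(12, 6, 12))     # -1.0 1.0   (factor A)
print(coded(12, 12, 20), coded(20, 12, 20))  # -1.0 1.0   (factor B)
print(coded(9, 6, 12))                       # 0.0        (middle of the range of A)
print(decoded(0.0, 12, 20))                  # 16.0       (middle of the range of B)
```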

    If one considers only the resulting signs, then after the coordinate transformation one obtains, instead of the above scheme, the following design matrix for the two-factor design with two levels.

    No. A B y

    1 - - y1

    2 + - y2

    3 - + y3

    4 + + y4

    The second row accordingly corresponds to an experiment in which factor A is set to the upper level (+) and factor B to the lower level (-). Instead of $A_1$, $A_2$ for the settings of factor A, one frequently writes $A_-$ and $A_+$.

    The column y contains the results $y_1, \ldots, y_4$ of the four experimental rows. They can be represented in the following form.


    This form of representation is also applicable when one (or several) of the investigated factors is not a quantitative, adjustable variable but a qualitative variable with fixed levels (e.g. material 1 - material 2). Naturally, an interpolation of intermediate values is not reasonable in this case.

    The results for three influence factors can be represented graphically by expanding Figure 7.1.10 into the form of a cube. Each corner point then corresponds to a combination of levels of the factors A, B and C. When dealing with more than three factors, only two- or three-dimensional projections of the n-dimensional experimental space can be represented.

    Figure 7.1.10

    Figure 7.1.11


    Fig. 7.1.13

    Fig. 7.1.14

    Fig. 7.1.15


    These representations clearly show the basic appearance of a surface described by a multilinear form. The linearity with respect to both coordinates is obvious. In addition, it can be seen that the minimum and the maximum along every straight line considered lie on the boundary of the experimental space.

    7.2 Calculating the Effects

    The effect of a factor gives the change of the target characteristic y when the factor is changed from the - level to the + level, averaged over the settings of all other factors. Naturally, the effect depends upon the explicit choice of the levels.

    A graph of the effects, for the example of the two-factor design, is shown in Fig 7.2.1.

    As long as the factors behave in an additive manner, both lines are parallel (see Figure 7.1.11). If, on the contrary, the effect of one factor depends upon the setting (level) of another, then an interaction of these factors exists, since they do not behave in an additive manner.

    The evaluation matrix of the two-factor design contains, in addition to the columns for the factors A and B, a column AB for the interaction of these factors.

    No.   A   B   AB   y

    1     -   -   +    y1

    2     +   -   -    y2

    3     -   +   -    y3

    4     +   +   +    y4

    Fig. 7.2.1


    The effect of factor X is calculated as the difference between the mean value of all y obtained when X is at the + level and the mean value of all y obtained when X is at the - level. This calculation rule applies analogously to interactions and may be used generally for orthogonal designs with m factors.

    For this example the following is valid:

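    $$\mathrm{Effect}(A) = \frac{\sum y(A_+)}{2^{m-1}} - \frac{\sum y(A_-)}{2^{m-1}} = \frac{y_2 + y_4}{2} - \frac{y_1 + y_3}{2}$$

    $$\mathrm{Effect}(B) = \frac{\sum y(B_+)}{2^{m-1}} - \frac{\sum y(B_-)}{2^{m-1}} = \frac{y_3 + y_4}{2} - \frac{y_1 + y_2}{2}$$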

    Fig. 7.2.2

    Fig. 7.2.3


    $$\mathrm{Effect}(AB) = \frac{\sum y(AB_+)}{2^{m-1}} - \frac{\sum y(AB_-)}{2^{m-1}} = \frac{y_1 + y_4}{2} - \frac{y_2 + y_3}{2}.$$

    Here, designating the factor levels with + and - rather than with the frequently used notation 1 and 2 proves advantageous, since the signs of the $y_i$ on the right-hand side of these equations can be read directly from the evaluation matrix for A, B and AB. Furthermore, the column AB of the evaluation matrix can be determined sign by sign as the product of the columns A and B (for example, $(-1) \cdot (-1) = +1$).
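
    The calculation rule can be written down in a few lines. The sketch below is an illustration under stated assumptions: the function name effect is hypothetical, and the numerical data are the four results from the table in Section 7.1 (y = 43, 62, 78, 113), for which the source itself does not list the effects. The interaction column is formed as the sign-wise product of the columns A and B.

```python
def effect(signs, y):
    """Mean of y at the + level minus mean of y at the - level."""
    plus = [yi for s, yi in zip(signs, y) if s > 0]
    minus = [yi for s, yi in zip(signs, y) if s < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


A = [-1, +1, -1, +1]
B = [-1, -1, +1, +1]
AB = [a * b for a, b in zip(A, B)]  # sign-wise product of columns A and B

y = [43, 62, 78, 113]               # results listed in Section 7.1

print(AB)              # [1, -1, -1, 1]
print(effect(A, y))    # 27.0
print(effect(B, y))    # 43.0
print(effect(AB, y))   # 8.0
```

    The non-zero value for AB indicates that, for these data, the two factors do not behave in a purely additive manner.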

    When dealing with fractional factorial designs, confounding of factors with interactions can occur. The effects of confounded quantities can then no longer be calculated separately.

    Hint:

    The calculation of mean effects is given here only for the sake of completeness. Figures 7.1.6 - 7.1.9 make it easy to see that, if a strong interaction AB exists, the mean effects of both factors A and B can become zero although each factor exhibits large total effects.


    7.3 Regression Analysis

    From the effects of the factors, the coefficients of the multilinear form (regression polynomial) can be calculated by using the coordinate transformation that converts the setting values of the factors into the coded form (+ level, - level). The sought coefficients correspond to half of the effects.

    Consider, as an example, the function: $y = 3 + 4 x_1 - 2 x_2 + 5 x_1 x_2$.

    The four experiments with the settings

    $A_- = 5$, $A_+ = 10$, $B_- = 6$, $B_+ = 12$

    would accordingly deliver the following results if experimental noise is disregarded:

    $y_1 = 3 + 4 \cdot 5 - 2 \cdot 6 + 5 \cdot 5 \cdot 6 = 161$

    $y_2 = 3 + 4 \cdot 10 - 2 \cdot 6 + 5 \cdot 10 \cdot 6 = 331$

    $y_3 = 3 + 4 \cdot 5 - 2 \cdot 12 + 5 \cdot 5 \cdot 12 = 299$

    $y_4 = 3 + 4 \cdot 10 - 2 \cdot 12 + 5 \cdot 10 \cdot 12 = 619$.

    We now proceed as though the above initial polynomial were unknown and try to derive the coefficients from the experimental data (see 7.2).

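    $$\mathrm{Effect}(A) = \frac{y_2 + y_4}{2} - \frac{y_1 + y_3}{2} = 245$$

    $$\mathrm{Effect}(B) = \frac{y_3 + y_4}{2} - \frac{y_1 + y_2}{2} = 213$$

    $$\mathrm{Effect}(AB) = \frac{y_1 + y_4}{2} - \frac{y_2 + y_3}{2} = 75$$

    $$\text{Constant term} = \frac{y_1 + y_2 + y_3 + y_4}{4} = 352.5.$$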

    If one now substitutes half of the effects as coefficients into the polynomial (model)

    $$y = a_0 + a_1 x_1 + a_2 x_2 + a_{12} x_1 x_2$$

    and applies the coordinate transformation (see Section 7.1)

    $$X^* = \frac{2}{X_2 - X_1}\,(X - X_2) + 1,$$

    then this results in

    $$y = 352.5 + \frac{245}{2}\left(\frac{2}{10 - 5}(A - 10) + 1\right) + \frac{213}{2}\left(\frac{2}{12 - 6}(B - 12) + 1\right) + \frac{75}{2}\left(\frac{2}{10 - 5}(A - 10) + 1\right)\left(\frac{2}{12 - 6}(B - 12) + 1\right)$$

    and, after simplifying this expression,

    $$y = 3 + 4A - 2B + 5AB.$$
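
    The back-calculation can be checked with exact arithmetic. The following sketch is an illustration, not the source's own procedure; the function names half_effects and decode_model are hypothetical. It uses Python's fractions module, writes the coding rule as X* = p X + q with p = 2/(X2 - X1) and q = -(X2 + X1)/(X2 - X1), and multiplies out the product terms by hand.

```python
from fractions import Fraction


def half_effects(y1, y2, y3, y4):
    """Return (a0, a1, a2, a12) for runs ordered (-,-), (+,-), (-,+), (+,+)."""
    a0 = Fraction(y1 + y2 + y3 + y4, 4)
    a1 = Fraction((y2 + y4) - (y1 + y3), 4)
    a2 = Fraction((y3 + y4) - (y1 + y2), 4)
    a12 = Fraction((y1 + y4) - (y2 + y3), 4)
    return a0, a1, a2, a12


def decode_model(coeffs, a_levels, b_levels):
    """Rewrite the coded model in the physical variables A and B.

    Returns the constant and the coefficients of A, B and A*B.
    """
    a0, a1, a2, a12 = coeffs
    pa = Fraction(2, a_levels[1] - a_levels[0])
    qa = Fraction(-(a_levels[1] + a_levels[0]), a_levels[1] - a_levels[0])
    pb = Fraction(2, b_levels[1] - b_levels[0])
    qb = Fraction(-(b_levels[1] + b_levels[0]), b_levels[1] - b_levels[0])
    const = a0 + a1 * qa + a2 * qb + a12 * qa * qb
    coef_a = a1 * pa + a12 * pa * qb
    coef_b = a2 * pb + a12 * qa * pb
    coef_ab = a12 * pa * pb
    return const, coef_a, coef_b, coef_ab


coded_coeffs = half_effects(161, 331, 299, 619)
physical = decode_model(coded_coeffs, (5, 10), (6, 12))

print([float(c) for c in coded_coeffs])  # [352.5, 122.5, 106.5, 37.5]
print([str(c) for c in physical])        # ['3', '4', '-2', '5']
```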

    It is therefore possible to calculate, from the results of the experiment, the coefficients of the regression polynomial that was chosen as the model underlying the experimental design.

    It is therefore also possible to determine interpolated values within the experimental space. If one or several additional experiments are conducted in the center of the experimental space (e.g. the rectangle in Figure 7.1.3; design with center point), then comparing the results for this point with the corresponding interpolated values gives information about the adequacy of the underlying model, i.e. about the quality of the fit.

    If greater deviations occur between the results of the additional experiments and the values interpolated with the help of the regression polynomial, then this shows that the chosen model describes reality insufficiently, or is entirely wrong.

    Here the whole crux of DOE with orthogonal arrays shows itself: right results can only be attained with the right model.
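
    A center-point check of this kind can be sketched as follows. The example is hypothetical and not taken from the source: it assumes a process that follows Ohm's law solved for the current, I = U / R, which contains a quotient of the factors, and the levels are chosen freely for illustration. At the center point all coded variables are zero, so the multilinear model predicts the constant term, i.e. the mean of the four corner results.

```python
def true_current(U, R):
    """Stand-in for the real process: Ohm's law solved for the current."""
    return U / R


# Hypothetical levels, chosen for this illustration only.
U_LOW, U_HIGH = 12.0, 36.0  # volts
R_LOW, R_HIGH = 2.0, 6.0    # ohms

corner_runs = [
    true_current(U_LOW, R_LOW),
    true_current(U_HIGH, R_LOW),
    true_current(U_LOW, R_HIGH),
    true_current(U_HIGH, R_HIGH),
]

# Model prediction at the center: the constant term a0 (mean of the corners).
predicted_center = sum(corner_runs) / len(corner_runs)

# Additional experiment conducted at the center of the experimental space.
measured_center = true_current((U_LOW + U_HIGH) / 2, (R_LOW + R_HIGH) / 2)

print(predicted_center)                    # 8.0
print(measured_center)                     # 6.0
print(predicted_center - measured_center)  # 2.0
```

    The deviation of 2 A at the center indicates that, for this assumed process, the multilinear model is only an approximation inside the experimental space although it reproduces all four corner results exactly.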

    7.4 Factorial Designs

    7.4.1 Design Matrix

    In Section 7.1, the creation of a simple scheme for a $2^2$ design is shown by means of a coordinate transformation:

    No. A B

    1 - -

    2 + -

    3 - +

    4 + +

    Strictly speaking, one can interpret the first two rows of the design as a one-factor-at-a-time experiment in which factor A is set to the lower (-) or upper (+) level while factor B remains at the - level.

    In rows 3 and 4, A is again set to the - and + levels, while B is held fixed at the + level.

    This scheme is the basis for a general rule of factorial designs that is made clear by means

    of the following representation.

    Experiment   A   B   C   D   E

         1       -   -   -   -   -
         2       +   -   -   -   -
         3       -   +   -   -   -
         4       +   +   -   -   -     (rows 1-4: 2^2 design)

         5       -   -   +   -   -
         6       +   -   +   -   -
         7       -   +   +   -   -
         8       +   +   +   -   -     (rows 1-8: 2^3 design)

         9       -   -   -   +   -
        10       +   -   -   +   -
        11       -   +   -   +   -
        12       +   +   -   +   -
        13       -   -   +   +   -
        14       +   -   +   +   -
        15       -   +   +   +   -
        16       +   +   +   +   -     (rows 1-16: 2^4 design)

        17       -   -   -   -   +
        18       +   -   -   -   +
        19       -   +   -   -   +
        20       +   +   -   -   +
        21       -   -   +   -   +
        22       +   -   +   -   +
        23       -   +   +   -   +
        24       +   +   +   -   +
        25       -   -   -   +   +
        26       +   -   -   +   +
        27       -   +   -   +   +
        28       +   +   -   +   +
        29       -   -   +   +   +
        30       +   -   +   +   +
        31       -   +   +   +   +
        32       +   +   +   +   +     (rows 1-32: 2^5 design)

    Scheme for illustrating the general rule for factorial designs (see [1] p. 53).
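
    The pattern of the scheme (the first factor alternates every run, the second every two runs, the third every four runs, and so on) can be generated programmatically. The sketch below is an illustration only; the function names full_factorial and show are hypothetical and not taken from the source or from reference [1].

```python
def full_factorial(k):
    """Return the 2^k design in standard order as a list of sign tuples.

    Bit j of the run index decides the level of factor j, so the first
    factor alternates every run, the second every 2 runs, and so on.
    """
    return [
        tuple(+1 if (run >> j) & 1 else -1 for j in range(k))
        for run in range(2 ** k)
    ]


def show(design):
    """Print the design with run numbers and +/- signs."""
    for number, row in enumerate(design, start=1):
        print(number, " ".join("+" if s > 0 else "-" for s in row))


design5 = full_factorial(5)
show(design5[:4])  # first four runs of the 2^5 design; columns A to E

# The first 2^3 rows, restricted to the first 3 columns, form the 2^3 design.
nested = [row[:3] for row in design5[:8]] == full_factorial(3)
print(nested)      # True
```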


    Remark:

    In this model, the designations $x_1$, $x_2$ and $x_3$ are used instead of the names A, B and C for the three factors. Correspondingly, $a_{12}$, for example, is the coefficient of the interaction AB.

    The columns of the evaluation matrix assigned to the interactions can be calculated sign by sign as products of the columns of the related factors (for example, $(-1) \cdot (-1) = +1$). The column for the interaction AC, for instance, results when one multiplies the columns of the factors A and C with each other.

    7.4.3 Confounding

    If all 8 experiments of a $2^3$ design are conducted, the effects, and thus the coefficients of the model, can be calculated separately for all factors and interactions. Mathematically considered, calculating the coefficients means solving a system of 8 equations with 8 unknowns (see model, design and evaluation matrix).

    $y_1 = a_0 - a_1 - a_2 + a_{12} - a_3 + a_{13} + a_{23} - a_{123}$

    $y_2 = a_0 + a_1 - a_2 - a_{12} - a_3 - a_{13} + a_{23} + a_{123}$

    $y_3 = a_0 - a_1 + a_2 - a_{12} - a_3 + a_{13} - a_{23} + a_{123}$

    $y_4 = a_0 + a_1 + a_2 + a_{12} - a_3 - a_{13} - a_{23} - a_{123}$

    $y_5 = a_0 - a_1 - a_2 + a_{12} + a_3 - a_{13} - a_{23} + a_{123}$

    $y_6 = a_0 + a_1 - a_2 - a_{12} + a_3 + a_{13} - a_{23} - a_{123}$

    $y_7 = a_0 - a_1 + a_2 - a_{12} + a_3 - a_{13} + a_{23} - a_{123}$

    $y_8 = a_0 + a_1 + a_2 + a_{12} + a_3 + a_{13} + a_{23} + a_{123}$

    The coefficients of this system of equations are easy to calculate owing to its simple structure. For example, the constant $a_0$ can be determined by adding all rows and dividing the sum by 8 (mean of all results $y_i$, see 7.3, Regression Analysis).

    Owing to the balanced nature of the system of equations (in front of every coefficient a plus sign appears as frequently as a minus sign), all terms on the right-hand side except $a_0$ cancel each other out in this addition. In order to calculate $a_1$, rows 1, 3, 5 and 7 are each multiplied by -1 and then all 8 rows are added together. Again, all terms on the right-hand side apart from $a_1$ cancel each other out, leaving $8 a_1$. The calculation of all remaining coefficients is analogous. If one compares this procedure with the equations in Section 7.2, it becomes evident that calculating the coefficients of the system of equations and calculating the half effects of the factors are identical processes.
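
    The procedure of multiplying rows by the appropriate signs and adding them can be sketched as follows. The code is an illustration only: the names TERMS, RUNS, sign and solve are hypothetical, and the coefficient values used to generate the test data are invented for the example. The term order follows the system of equations above.

```python
from itertools import product
from math import prod

# Term order as in the equations: a0, a1, a2, a12, a3, a13, a23, a123.
# Each term is described by the indices of the factors it contains.
TERMS = [(), (0,), (1,), (0, 1), (2,), (0, 2), (1, 2), (0, 1, 2)]

# Runs in standard order: x1 changes fastest, x3 slowest.
RUNS = [(x1, x2, x3) for x3, x2, x1 in product((-1, +1), repeat=3)]


def sign(run, term):
    """Sign of a model term in a given run (product of the factor signs)."""
    return prod(run[i] for i in term)


def solve(y):
    """Recover all eight coefficients: signed row sums divided by 8."""
    return [sum(sign(run, t) * yi for run, yi in zip(RUNS, y)) / 8
            for t in TERMS]


# Hypothetical coefficients, used only to generate consistent test data.
true_a = [10, 3, -2, 1, 4, 0, -1, 2]
y = [sum(a * sign(run, t) for a, t in zip(true_a, TERMS)) for run in RUNS]

print([sign(run, (0,)) for run in RUNS])  # [-1, 1, -1, 1, -1, 1, -1, 1]
print(solve(y))  # [10.0, 3.0, -2.0, 1.0, 4.0, 0.0, -1.0, 2.0]
```

    The first printed line shows that the sign of a1 is negative in rows 1, 3, 5 and 7, which is why exactly these rows are multiplied by -1 before adding.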

    Because a plus sign appears in front of $a_0$ in every row of the equation system, the evaluation matrix is often given a leading column consisting exclusively of plus signs, which is designated I (for identity) or 0.


    If fewer than 8 experiments are conducted, then it is clearly no longer possible to determine the coefficients separately. So-called confounding occurs. This is explained by means of the example of the $2^{3-1}$ fractional factorial design, in which three factors are to be investigated but only 4 experiments are conducted.

    Design matrix of the $2^{3-1}$ design (see [9]):

    A B C

    1 - - +

    2 + - -

    3 - + -

    4 + + +

    We now consider what the interaction columns AB, AC and BC of the related evaluation matrix look like. They can be calculated as products of the corresponding columns of the design matrix.

    If one compares these columns with the columns of the design matrix, then it is evident that AB is identical to C, AC to B and BC to A. Thus, the columns A and BC, B and AC, C and AB in the evaluation matrix are not distinguishable at all. One says that the factor A is confounded with the interaction BC, the factor B with the interaction AC and the factor C with the interaction AB.

    Interaction columns (products of the design matrix columns):

    No.   BC   AC   AB

    1     -    -    +

    2     +    -    -

    3     -    +    -

    4     +    +    +

    No.   A    B    C
          BC   AC   AB

    1     -    -    +

    2     +    -    -

    3     -    +    -

    4     +    +    +

    Evaluation matrix of the $2^{3-1}$ fractional factorial design


    The occurrence of confounded factors becomes somewhat clearer still if one directly considers the incomplete system of equations corresponding to the $2^{3-1}$ design:

    $y_1 = a_0 - a_1 - a_2 + a_{12} + a_3 - a_{13} - a_{23} + a_{123}$

    $y_2 = a_0 + a_1 - a_2 - a_{12} - a_3 - a_{13} + a_{23} + a_{123}$

    $y_3 = a_0 - a_1 + a_2 - a_{12} - a_3 + a_{13} - a_{23} + a_{123}$

    $y_4 = a_0 + a_1 + a_2 + a_{12} + a_3 + a_{13} + a_{23} + a_{123}$.

    If, in this case, the first and third equations are multiplied by -1 and all four equations are subsequently added together, then all terms on the right-hand side apart from $a_1$ and $a_{23}$ cancel out. These are the coefficients assigned to the factor A and to the interaction BC. Therefore A and BC are confounded. The remaining confoundings follow analogously.
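
    The confounding can be made tangible with a small numerical experiment. The sketch below is an illustration only; the names DESIGN, times and response are hypothetical, and the two parameter sets are invented. It shows that the interaction columns coincide with the factor columns and that two models which split the sum a1 + a23 differently produce identical results in all four runs.

```python
# Design matrix of the 2^(3-1) design as listed above (columns A, B, C).
DESIGN = [(-1, -1, +1), (+1, -1, -1), (-1, +1, -1), (+1, +1, +1)]

A = [row[0] for row in DESIGN]
B = [row[1] for row in DESIGN]
C = [row[2] for row in DESIGN]


def times(u, v):
    """Sign-wise product of two columns."""
    return [a * b for a, b in zip(u, v)]


print(times(B, C) == A, times(A, C) == B, times(A, B) == C)  # True True True


def response(run, a0, a1, a2, a12, a3, a13, a23, a123):
    """Evaluate the full three-factor multilinear model for one run."""
    x1, x2, x3 = run
    return (a0 + a1 * x1 + a2 * x2 + a12 * x1 * x2 + a3 * x3
            + a13 * x1 * x3 + a23 * x2 * x3 + a123 * x1 * x2 * x3)


# Two hypothetical parameter sets that differ only in how a1 + a23 is split.
y_first = [response(run, 5, 3, 0, 0, 0, 0, 1, 0) for run in DESIGN]
y_second = [response(run, 5, 1, 0, 0, 0, 0, 3, 0) for run in DESIGN]

print(y_first == y_second)                           # True
print(sum(s * y for s, y in zip(A, y_first)) / 4)    # 4.0, i.e. a1 + a23
```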

    Remark:

    Strictly speaking, one should list extra columns in the evaluation matrix for the identity (column for the constant term $a_0$) and for the three-factor interaction ABC. They are omitted for the sake of simplicity.

    It is therefore not possible in the preceding example, for instance, to calculate the effect of factor A separately from the effect of interaction BC.

    Here one encounters a rather strange line of reasoning, which is found in most of the literature on the subject of DOE. The effect of factor A can be determined if one assumes that the interaction BC doesn't exist. This means that one must be sure that the factors B and C behave purely additively. If this is known, however, then it is sufficient to investigate B and C with a one-factor-at-a-time experiment.

    In textbooks on DOE, it is often assumed that three-factor and higher interactions are improbable, and this assumption is exploited in order to formulate fractional factorial designs of the type $2^{m-1}$.


    We investigate the evaluation matrix of the $2^{4-1}$ design as an example.

    No.   A   B   AB   C   AC   BC   D
                  CD       BD   AD   ABC

    1     -   -   +    -   +    +    -

    2     +   -   -    -   -    +    +

    3     -   +   -    -   +    -    +

    4     +   +   +    -   -    -    -

    5     -   -   +    +   -    -    +

    6     +   -   -    +   +    -    -

    7     -   +   -    +   -    +    -

    8     +   +   +    +   +    +    +

    Instead of the $2^4 = 16$ experiments that would be necessary to investigate four factors at two levels each with the full factorial design, only 8 experiments are conducted here. If one determines the column of the interaction ABC, one sees that it is identical to the column of factor D. Factor D is therefore confounded with a three-factor interaction. When applying this design, it is assumed that the three-factor interaction ABC does not exist. If this assumption is false, then a false effect results for D.
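
    The structure of this half fraction can be reproduced in a few lines. The sketch below is an illustration only; the names FULL, HALF and col are hypothetical. It builds the design by setting D equal to the product of A, B and C, checks which two-factor interaction columns coincide, and lists where each run of the half fraction is located in the complete design with 16 runs in standard order.

```python
from itertools import product

# Full 2^4 design in standard order (A changes fastest, D slowest).
FULL = [(a, b, c, d) for d, c, b, a in product((-1, +1), repeat=4)]

# Half fraction: full 2^3 design in A, B, C with D set to the product A*B*C.
HALF = [(a, b, c, a * b * c) for c, b, a in product((-1, +1), repeat=3)]


def col(design, *factors):
    """Sign column of a factor (one index) or interaction (several indices)."""
    out = []
    for row in design:
        s = 1
        for f in factors:
            s *= row[f]
        out.append(s)
    return out


A, B, C, D = 0, 1, 2, 3
print(col(HALF, A, B) == col(HALF, C, D))  # True: AB confounded with CD
print(col(HALF, A, C) == col(HALF, B, D))  # True: AC confounded with BD
print(col(HALF, B, C) == col(HALF, A, D))  # True: BC confounded with AD

# Position of each half-fraction run within the full design (1-based).
positions = [FULL.index(row) + 1 for row in HALF]
print(positions)  # [1, 10, 11, 4, 13, 6, 7, 16]
```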

    In addition, the effects of two-factor interactions cannot be calculated separately. If, for instance, the third column proves highly significant in the column-wise evaluation (factorial analysis of variance), then it cannot be determined whether this is due to the interaction AB or CD. Conversely, AB and CD can compensate each other (equal, counteracting effects), which the evaluation cannot reveal. The reduction in the extent of experimentation is therefore bought at the risk of a faulty result as well as a loss of information.

    This statement is especially valid for fractional factorial designs in which the experimental extent is reduced by more than a factor of 0.5 (Taguchi method, see [10]).

    Rows 1-8 of the $2^{4-1}$ design correspond to rows 1, 10, 11, 4, 13, 6, 7, 16 of the complete $2^4$ design (see [9], Appendix). An experiment based on the $2^{4-1}$ design can therefore still be rescued, if necessary, by adding the missing (complementary) eight rows. The c