Planejamento e Otimização de Experimentos The Analysis of Variance Prof. Dr. Anselmo E de Oliveira anselmo.quimica.ufg.br [email protected]

Mar 18, 2020

Page 1

Planejamento e Otimização

de Experimentos

The Analysis of Variance

Prof. Dr. Anselmo E de Oliveira

anselmo.quimica.ufg.br

[email protected]

Page 2

An Example

• Integrated circuits
  – Wafers
  – A plasma etching process is employed to remove unwanted material
  – Energy is supplied by a radiofrequency (RF) generator, causing plasma to be generated in the gap between the electrodes
  – Relationship of interest: RF power setting and etch rate
• C2F6 gas and a 0.80 cm gap
• RF power: 160, 180, 200, 220 W (four levels)
• Five wafers at each RF level

Page 3

• Single-factor experiment
  – a = 4 levels of the factor
  – n = 5 replicates
  – 20 runs
• Random order
  – How to generate the run order?
    » Ex: assign random numbers with a spreadsheet random function, then sort by that number
  – Randomization guards against the effects of unknown nuisance variables
    » Nonrandomized order (5 × 160 W, 5 × 180 W, ...)
      • Ex: the etching tool exhibits a warm-up effect: the longer it is on, the lower the observed etch rate readings will be
      • The warm-up effect would then be confounded with the power setting, destroying the validity of the experiment
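The spreadsheet recipe above can also be sketched in Octave. This is only an illustration: the variable names (`levels`, `order`, `run_sheet`) are hypothetical, and `randperm` stands in for "draw random numbers and sort by them".

```octave
% Hypothetical run sheet: 4 power levels x 5 replicates = 20 runs.
power  = [160 180 200 220];
n      = 5;
levels = repmat(power, n, 1);     % 5x4, one column per power level
levels = levels(:);               % 20x1 list in standard (nonrandom) order

order     = randperm(numel(levels));   % random permutation of 1..20
run_sheet = levels(order);             % power setting to use at run 1, 2, ..., 20

disp([(1:numel(levels))', run_sheet]); % columns: run number, power (W)
```

Each execution gives a different order; every power level still appears exactly five times.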

Page 4

Etch rate data (Å/min)

Power (W) | Obs. 1 | Obs. 2 | Obs. 3 | Obs. 4 | Obs. 5 | Totals | Averages
160 | 575 | 542 | 530 | 539 | 570 | 2756 | 551.2
180 | 565 | 593 | 590 | 579 | 610 | 2937 | 587.4
200 | 600 | 651 | 610 | 637 | 629 | 3127 | 625.4
220 | 725 | 700 | 715 | 685 | 710 | 3535 | 707.0

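> load etch.oc
> whos
> etch
> boxplot(etch)
> % label the four boxes with the power levels (tics() is not available in recent Octave versions)
> set(gca, "xtick", 1:4, "xticklabel", {"160", "180", "200", "220"});
> xlabel("Power (W)"); ylabel("Etch rate (A/min)");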

Page 5

• Etch rate increases as the power setting increases.
• There is no strong evidence to suggest that the variability in etch rate around the average depends on the power setting.
• Suppose that we wish to test for differences between the mean etch rates:
  – A t-test for each of the six possible pairs of means
    • It takes a lot of effort
    • It inflates the type I error (rejecting the null hypothesis when it is in fact true)
  – Analysis of variance
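To see how quickly the type I error inflates, here is a rough calculation. It assumes the six tests are independent, which pairwise t-tests on shared data are not, so the result is only an approximation of the familywise error rate.

```octave
a     = 4;                    % number of treatment means
alpha = 0.05;                 % per-test type I error rate
m     = nchoosek(a, 2);       % number of pairwise comparisons
fwer  = 1 - (1 - alpha)^m;    % chance of at least one false rejection,
                              % ASSUMING the m tests are independent
printf("pairs = %d, familywise error = %.3f\n", m, fwer);
```

Under that assumption, six tests at the 0.05 level carry roughly a 26% chance of at least one false rejection.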

Page 6

The Analysis of Variance

• $a$ treatments or different levels of a single factor

Treatment (Level) | Observations | Totals | Averages
1 | $y_{11}$  $y_{12}$  ...  $y_{1n}$ | $y_{1.}$ | $\bar{y}_{1.}$
2 | $y_{21}$  $y_{22}$  ...  $y_{2n}$ | $y_{2.}$ | $\bar{y}_{2.}$
⋮ | ⋮ | ⋮ | ⋮
$a$ | $y_{a1}$  $y_{a2}$  ...  $y_{an}$ | $y_{a.}$ | $\bar{y}_{a.}$
 | | $y_{..}$ | $\bar{y}_{..}$

Page 7

• (Linear Statistical) Models for the Data
  – Means model
    $y_{ij} = \mu_i + \epsilon_{ij}, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$
    $y_{ij}$ is the $ij$th observation
    $\mu_i$ is the mean of the $i$th factor level or treatment
    $\epsilon_{ij}$ is the random error
    It is convenient to think of the errors as having mean zero, so that $E(y_{ij}) = \mu_i$
  – Effects model
    $\mu_i = \mu + \tau_i, \quad i = 1, 2, \ldots, a$
    $y_{ij} = \mu + \tau_i + \epsilon_{ij}, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$
    $\mu$ is the overall mean
    $\tau_i$ is the $i$th treatment effect
  – One-way or single-factor analysis of variance (ANOVA)

Page 8

• The experimental design is a completely randomized design
  – The experiments must be performed in random order
  – The environment in which the treatments are applied (the experimental units) is as uniform as possible
  – Hypothesis test
    • $y_{ij} \sim N(\mu + \tau_i, \sigma^2)$
    • Observations are mutually independent

Page 9

The Analysis of the Fixed Effects Model

– $a$ treatments
– Test hypotheses about the treatment means; the conclusions will apply only to the factor levels considered in the analysis

$y_{i.} = \sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{i.} = \frac{y_{i.}}{n}, \qquad i = 1, 2, \ldots, a$

$y_{..} = \sum_{i=1}^{a} \sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{..} = \frac{y_{..}}{N}$

$N = an$ is the total number of observations

Page 10

– Testing the equality of the $a$ treatment means

$E(y_{ij}) = \mu + \tau_i = \mu_i, \quad i = 1, 2, \ldots, a$

$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$
$H_1: \mu_i \neq \mu_j$ for at least one pair $(i, j)$

$\mu = \frac{\sum_{i=1}^{a} \mu_i}{a} \;\Longrightarrow\; \sum_{i=1}^{a} \tau_i = 0$

The treatment or factor effects can be thought of as deviations from the overall mean

$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$
$H_1: \tau_i \neq 0$ for at least one $i$

– Thus, we speak of testing the equality of treatment means or testing that the treatment effects (the $\tau_i$) are zero

Page 11

• Decomposition of the Total Sum of Squares
  – The name "analysis of variance" is derived from a partitioning of total variability into its component parts

$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$

$SS_T = SS_{Treatments} + SS_E$

$SS_{Treatments}$: sum of squares of the differences between the treatment averages and the grand mean
$SS_E$: sum of squares of the differences of observations within treatments from the treatment average
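The step the slide leaves out is why no cross-product term survives in the partition. A short derivation, using only the totals and averages defined earlier:

```latex
\begin{align*}
y_{ij} - \bar{y}_{..} &= (\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.}) \\
\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2
  &= n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2
   + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2
   + 2\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..}) \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.}) \\
\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.}) &= y_{i.} - n\,\bar{y}_{i.} = 0
\end{align*}
```

Because the inner sum in the last term is zero for every $i$, the cross-product vanishes and only the two sums of squares remain.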

Page 12

– Degrees of freedom
  $SS_T$: $N - 1$
  $SS_{Treatments}$: $a - 1$
  $SS_E$: $a(n - 1) = an - a = N - a$
– Mean squares
  $MS_{Treatments} = \frac{SS_{Treatments}}{a - 1}$
  $MS_E = \frac{SS_E}{N - a}$
– Expected values
  $E(MS_E) = \sigma^2$
  $E(MS_{Treatments}) = \sigma^2 + \frac{n \sum_{i=1}^{a} \tau_i^2}{a - 1}$
  If there are no differences in treatment means ($\tau_i = 0$ for all $i$), $MS_{Treatments}$ also estimates $\sigma^2$

Page 13

• Statistical Analysis
  – If the null hypothesis of no difference in treatment means is true, the ratio

$F_0 = \frac{SS_{Treatments}/(a - 1)}{SS_E/(N - a)} = \frac{MS_{Treatments}}{MS_E}$

    is distributed as $F$ with $a - 1$ and $N - a$ degrees of freedom.
  – Reject $H_0$ if $F_0 > F_{\alpha, a-1, N-a}$

Page 14

• ANOVA Table

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | $F_0$
Between treatments | $SS_{Treatments} = \frac{1}{n} \sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{N}$ | $a - 1$ | $MS_{Treatments}$ | $F_0 = \frac{MS_{Treatments}}{MS_E}$
Error (within treatments) | $SS_E = SS_T - SS_{Treatments}$ | $N - a$ | $MS_E$ |
Total | $SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$ | $N - 1$ | |

Page 15

Power (W) | Obs. 1 | Obs. 2 | Obs. 3 | Obs. 4 | Obs. 5 | Totals $y_{i.}$ | Averages $\bar{y}_{i.}$
160 | 575 | 542 | 530 | 539 | 570 | 2756 | 551.2
180 | 565 | 593 | 590 | 579 | 610 | 2937 | 587.4
200 | 600 | 651 | 610 | 637 | 629 | 3127 | 625.4
220 | 725 | 700 | 715 | 685 | 710 | 3535 | 707.0
 | | | | | | $y_{..} = 12{,}355$ | $\bar{y}_{..} = 617.75$

$SS_T = \sum_{i=1}^{4} \sum_{j=1}^{5} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{4} \sum_{j=1}^{5} y_{ij}^2 - \frac{y_{..}^2}{N} = (575)^2 + (542)^2 + \cdots + (710)^2 - \frac{(12{,}355)^2}{20} = 72{,}209.75$

$SS_{Treatments} = \frac{1}{n} \sum_{i=1}^{4} y_{i.}^2 - \frac{y_{..}^2}{N} = \frac{1}{5} \left[ (2756)^2 + \cdots + (3535)^2 \right] - \frac{(12{,}355)^2}{20} = 66{,}870.55$

$SS_E = SS_T - SS_{Treatments} = 72{,}209.75 - 66{,}870.55 = 5339.20$
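The hand calculation can be reproduced in a few lines of Octave. This is a sketch under one assumption: the data are typed in directly (rather than loaded from `etch.oc`) with rows as replicates and columns as power levels. The variable names are illustrative.

```octave
% Etch rate data from the slides.
% ASSUMED layout: rows = replicates, columns = power levels 160, 180, 200, 220 W.
etch = [575 565 600 725;
        542 593 651 700;
        530 590 610 715;
        539 579 637 685;
        570 610 629 710];
[n, a] = size(etch);
N = a * n;

y_i  = sum(etch);       % treatment totals  y_i.
yb_i = y_i / n;         % treatment averages
y_t  = sum(y_i);        % grand total  y..

SS_T   = sum(etch(:).^2) - y_t^2 / N;    % total sum of squares
SS_Trt = sum(y_i.^2) / n  - y_t^2 / N;   % between-treatment sum of squares
SS_E   = SS_T - SS_Trt;                  % error sum of squares by subtraction

% Cross-check: SS_E computed directly from its definition
SS_E_direct = sum(sum((etch - repmat(yb_i, n, 1)).^2));

printf("SS_T = %.2f, SS_Treatments = %.2f, SS_E = %.2f\n", SS_T, SS_Trt, SS_E);
```

Computing `SS_E` both by subtraction and from its definition is a cheap guard against typing errors in the data.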

Page 16

• ANOVA Table

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | $F_0$ | p-Value
RF Power | 66,870.55 | 3 | 22,290.18 | 66.80 | <0.01
Error | 5339.20 | 16 | 333.70 | |
Total | 72,209.75 | 19 | | |

  – The RF power (between-treatment) mean square is much larger than the error (within-treatment) mean square
    • It is unlikely that the treatment means are equal
  – $F_{0.05,3,16} = 3.24$ (F table)
    • Because $F_0 = 66.80 > 3.24$, we reject $H_0$ and conclude that the treatment means differ

The RF power setting significantly affects the mean etch rate

Page 17

> anova (etch)
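For readers without the `anova` helper, the mean squares and the F ratio can be checked by hand. The sketch below starts from the sums of squares reported on the slides. The critical value 3.24 is the tabulated $F_{0.05,3,16}$ quoted there and is not computed here.

```octave
% Sums of squares and sizes taken from the ANOVA table on the slides.
SS_Trt = 66870.55;
SS_E   = 5339.20;
a = 4;  N = 20;

MS_Trt = SS_Trt / (a - 1);   % between-treatment mean square
MS_E   = SS_E / (N - a);     % error mean square
F0     = MS_Trt / MS_E;      % test statistic

F_crit = 3.24;               % F_{0.05,3,16}, read from an F table
reject = F0 > F_crit;

printf("MS_Treatments = %.2f, MS_E = %.2f, F0 = %.2f, reject H0: %d\n", ...
       MS_Trt, MS_E, F0, reject);
```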

Page 18

• Estimation of the Model Parameters
  $y_{ij} = \mu + \tau_i + \epsilon_{ij}$

  $\hat{\mu} = \bar{y}_{..}, \qquad \hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}, \qquad \hat{y}_{ij} = \bar{y}_{i.}$

A $100(1 - \alpha)$ percent confidence interval on the $i$th treatment mean $\mu_i$:

$\bar{y}_{i.} - t_{\alpha/2, N-a} \sqrt{\frac{MS_E}{n}} \le \mu_i \le \bar{y}_{i.} + t_{\alpha/2, N-a} \sqrt{\frac{MS_E}{n}}$

A $100(1 - \alpha)$ percent confidence interval on the difference in any two treatment means:

$\bar{y}_{i.} - \bar{y}_{j.} - t_{\alpha/2, N-a} \sqrt{\frac{2 MS_E}{n}} \le \mu_i - \mu_j \le \bar{y}_{i.} - \bar{y}_{j.} + t_{\alpha/2, N-a} \sqrt{\frac{2 MS_E}{n}}$
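As an illustration, the two interval formulas can be applied to the etching data. The sketch assumes a 95% level and takes $t_{0.025,16} = 2.120$ from a t table. That value does not appear on the slides, so treat it as an assumption.

```octave
% Values from the slides: MS_E with N - a = 16 degrees of freedom, n = 5 per level.
MS_E = 333.70;
n    = 5;
ybar = [551.2 587.4 625.4 707.0];   % treatment averages for 160, 180, 200, 220 W

% ASSUMPTION: t_{0.025,16} = 2.120 is read from a t table (not given on the slides).
% With the statistics package loaded, tinv(0.975, 16) should give the same value.
t_crit = 2.120;

half_mean = t_crit * sqrt(MS_E / n);          % half-width for a single mean
ci_mean   = [ybar' - half_mean, ybar' + half_mean];

half_diff = t_crit * sqrt(2 * MS_E / n);      % half-width for a difference of two means
d_12      = ybar(1) - ybar(2);
ci_diff12 = [d_12 - half_diff, d_12 + half_diff];

disp(ci_mean);      % one row per power level: lower and upper limit
disp(ci_diff12);    % interval for mu_1 - mu_2
```

The interval for $\mu_1 - \mu_2$ lies entirely below zero, which agrees with the ANOVA conclusion that the means differ.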

Page 19

Model Adequacy Checking

• The Normality Assumption
  – Normal probability plot of the residuals

cd octavedocs
load etch.oc
etch
m = mean(etch)          % treatment averages, one per column
M = repmat(m, 5, 1)     % fitted values: each column filled with its treatment average
e = etch - M            % residuals
normplot(e(:))

Treatment averages: $\bar{y}_{i.}$
Fitted values: $\hat{y}_{ij} = \bar{y}_{i.}$
Residuals: $e_{ij} = y_{ij} - \hat{y}_{ij} = y_{ij} - \bar{y}_{i.}$

Page 20

Page 21

• Standardized residuals
  – Used to detect outliers

$d_{ij} = \frac{e_{ij}}{\sqrt{MS_E}}$

If the errors are $N(0, \sigma^2)$, the standardized residuals should be approximately normal with mean zero and unit variance
  • ~68% of the $d_{ij}$ should fall within the limits $\pm 1$
  • ~95% within $\pm 2$
  • virtually all within $\pm 3$

Using the largest residual in absolute value:

sort(e(:))
d = 25.6/sqrt(333.7)

$d = 1.40$ should cause no concern
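The same check can be run over all twenty residuals rather than only the largest one. This is a minimal sketch with the data typed in directly (assumed layout: rows are replicates, columns are power levels). Absolute values are used so the result does not depend on the sign convention of the residuals.

```octave
% Etch rate data; ASSUMED layout: rows = replicates, columns = power levels.
etch = [575 565 600 725;
        542 593 651 700;
        530 590 610 715;
        539 579 637 685;
        570 610 629 710];
[n, a] = size(etch);

e    = etch - repmat(mean(etch), n, 1);   % residuals about the treatment averages
MS_E = sum(e(:).^2) / (n * a - a);        % error mean square, N - a degrees of freedom
d    = e / sqrt(MS_E);                    % standardized residuals

d_max = max(abs(d(:)));                   % largest standardized residual in magnitude
frac1 = mean(abs(d(:)) <= 1);             % fraction within +/- 1
frac2 = mean(abs(d(:)) <= 2);             % fraction within +/- 2

printf("max |d| = %.3f, within 1: %.2f, within 2: %.2f\n", d_max, frac1, frac2);
```

With only 20 observations the fractions are coarse, so they should be read as a rough screen for outliers and not as a formal test.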

Page 22

• Plot of Residuals in Time Sequence

Plotting the residuals in time order of data collection is helpful in detecting strong correlation between the residuals

  – A tendency to have runs of positive and negative residuals indicates positive correlation, which would mean that the independence assumption on the errors has been violated

Page 23

plot(etch_run(:),e(:),"*")

xlabel ("Run order or time"); ylabel("Residuals");

Page 24

• Plot of Residuals Versus Fitted Values

If the model is correct and the assumptions are satisfied, the residuals should be structureless; in particular, they should be unrelated to any other variable, including the predicted response

  – Plasma etching experiment
    $\hat{y}_{ij} = \bar{y}_{i.}$
    $e_{ij} = y_{ij} - \bar{y}_{i.}$

Page 25

plot(M(:),e(:),"*")

xlabel ("Predicted"); ylabel("Residuals");

The plot shows no sign of unequal variance: the spread of the residuals does not increase as the magnitude of the fitted value increases

Page 26

• Statistical Test for Equality of Variance (in each treatment)
  $H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_a^2$
  $H_1$: the above is not true for at least one $\sigma_i^2$

  – Residual plots
  – Bartlett's test
    • $\chi^2$ distribution with $a - 1$ degrees of freedom
    • $a$ random samples from independent normal populations

$\chi_0^2 = 2.3026 \, \frac{q}{c}$

$q = (N - a) \log_{10} S_p^2 - \sum_{i=1}^{a} (n_i - 1) \log_{10} S_i^2$

$c = 1 + \frac{1}{3(a - 1)} \left( \sum_{i=1}^{a} (n_i - 1)^{-1} - (N - a)^{-1} \right)$

$S_p^2 = \frac{\sum_{i=1}^{a} (n_i - 1) S_i^2}{N - a}$

and $S_i^2$ is the sample variance of the $i$th population.

• The quantity $q$ is large when the sample variances $S_i^2$ differ greatly and is equal to zero when all $S_i^2$ are equal.
• Reject $H_0$ if $\chi_0^2 > \chi^2_{\alpha, a-1}$

Page 27

var_p = sum(4/(20-4)*var(etch))                     % pooled variance S_p^2
q = (20-4)*log10(var_p) - 4*sum(log10(var(etch)))
c = 1 + 1/(3*(4-1))*(4*1/4 - 1/(20-4))
chi_0 = 2.3026*q/c                                  % Bartlett test statistic
1 - chi2cdf(chi_0, 3)                               % p-value, a - 1 = 3 degrees of freedom
ans = 0.93324

• The test returns a p-value that summarizes its outcome.
• Assuming that the null hypothesis is true, the p-value is the probability of obtaining a test statistic at least as extreme as the one observed.
• A large p-value means the data are consistent with the null hypothesis.
• Conventionally, the null hypothesis is not rejected if the p-value exceeds 0.05.

We cannot reject the null hypothesis: there is no evidence to counter the claim that all four variances are the same (in agreement with the plot of residuals versus fitted values)

Page 28

Practical Interpretation of Results

After conducting the experiment, performing the statistical analysis, and investigating the underlying assumptions, the experimenter is ready to draw practical conclusions about the problem he or she is studying

Page 29

• A Regression Model
  – The factors involved in an experiment can be either quantitative or qualitative
  – Empirical model of the process
    • An interpolation equation for the response variable in the experiment
    • Regression analysis
      – Method of least squares

Page 30

[Figure panels: linear model; quadratic model]
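A minimal sketch of how linear and quadratic least-squares fits of etch rate on power could be obtained with `polyfit`. The data layout (rows are replicates, columns are power levels) is an assumption, and the fitted coefficients are whatever `polyfit` returns for these data, not values quoted from the slides.

```octave
% Etch rate data; ASSUMED layout: rows = replicates, columns = power levels.
etch  = [575 565 600 725;
         542 593 651 700;
         530 590 610 715;
         539 579 637 685;
         570 610 629 710];
power = [160 180 200 220];

x = repmat(power, size(etch, 1), 1);   % power setting for each observation
x = x(:);
y = etch(:);

p1 = polyfit(x, y, 1);    % least squares line:  [slope, intercept]
p2 = polyfit(x, y, 2);    % least squares parabola: [quadratic, linear, constant]

printf("linear:    y = %.2f %+.4f x\n", p1(2), p1(1));
printf("quadratic: y = %.2f %+.4f x %+.6f x^2\n", p2(3), p2(2), p2(1));

% Interpolation at a power setting that was not run (190 W, inside the tested range)
yhat_190 = polyval(p2, 190);
```

Both models are interpolation equations: they should only be trusted between 160 W and 220 W, the range actually covered by the experiment.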

Page 31

• Comparing Pairs of Treatment Means
  – Tukey's Test
    • Used following an ANOVA in which the null hypothesis of equal treatment means was rejected
    • Tests all pairwise mean comparisons (all $i \neq j$)
      $H_0: \mu_i = \mu_j$
      $H_1: \mu_i \neq \mu_j$
    • Equal or unequal (Tukey-Kramer procedure) sample sizes
    • Based on the distribution of the studentized range statistic ($q$ table)

$T_\alpha = q_\alpha(a, f) \sqrt{\frac{MS_E}{n}}$

Page 32

• Plasma etching experiment
  $a = 4$, $\alpha = 0.05$, $f = N - a = 16$ degrees of freedom, $MS_E = 333.7$, $n = 5$
  $q_{0.05}(4, 16) = 4.046$ (from the $q$ table)

$T_{0.05} = 4.046 \sqrt{\frac{333.7}{5}} = 33.1$

Any pair of treatment averages that differ in absolute value by more than 33.1 implies that the corresponding pair of population means are significantly different

$\bar{y}_{1.} = 551.2, \quad \bar{y}_{2.} = 587.4, \quad \bar{y}_{3.} = 625.4, \quad \bar{y}_{4.} = 707.0$

$\bar{y}_{1.} - \bar{y}_{2.} = -36.2, \quad \bar{y}_{1.} - \bar{y}_{3.} = -74.2, \quad$ …

The Tukey procedure indicates that all pairs of means differ
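The six comparisons can be listed explicitly. The sketch below uses the treatment averages, $MS_E$, and the tabulated $q_{0.05}(4,16) = 4.046$ from the slides. The loop and variable names are illustrative.

```octave
% Treatment averages, MS_E and q value as given on the slides.
ybar   = [551.2 587.4 625.4 707.0];
MS_E   = 333.7;
n      = 5;
q_crit = 4.046;                      % q_{0.05}(4,16) from the q table

T = q_crit * sqrt(MS_E / n);         % Tukey critical difference
a = numel(ybar);
n_sig = 0;                           % number of significant pairs

for i = 1:a-1
  for j = i+1:a
    d = ybar(i) - ybar(j);
    if abs(d) > T
      n_sig = n_sig + 1;
      verdict = "differ";
    else
      verdict = "no significant difference";
    end
    printf("ybar_%d - ybar_%d = %7.1f   %s\n", i, j, d, verdict);
  end
end
printf("T = %.1f, significant pairs: %d of %d\n", T, n_sig, nchoosek(a, 2));
```

The smallest absolute difference (36.2, between 160 W and 180 W) still exceeds the critical difference, which is why every pair is flagged.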

Page 33

Nonparametric Methods in the Analysis of Variance

• The Kruskal-Wallis Test
  – Used in situations where the normality assumption is unjustified
  – Reject $H_0$ if $H > \chi^2_{\alpha, a-1}$

[pval, chisq, df] = kruskal_wallis_test(etch(:,1),etch(:,2),etch(:,3),etch(:,4))
pval = 7.3856e-004
chisq = 16.907
df = 3

We reject the null hypothesis. This is the same conclusion as given by the usual analysis of variance F test
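To show what the test statistic measures, the sketch below computes $H$ from the ranks directly. It uses one common tie-corrected form of the statistic, which is an assumption on my part since the slides do not state the formula, with the data typed in by hand (rows are replicates, columns are power levels). The critical value 7.81 is the tabulated $\chi^2_{0.05,3}$.

```octave
% Etch rate data; ASSUMED layout: rows = replicates, columns = power levels.
etch = [575 565 600 725;
        542 593 651 700;
        530 590 610 715;
        539 579 637 685;
        570 610 629 710];
[n, a] = size(etch);
N = n * a;
y = etch(:);

% Midranks: tied observations share the average of the ranks they occupy
r = zeros(N, 1);
for k = 1:N
  r(k) = sum(y < y(k)) + (sum(y == y(k)) + 1) / 2;
end
R = sum(reshape(r, n, a));          % rank sum of each treatment (column)

% Tie-corrected Kruskal-Wallis statistic (equal sample sizes n)
S2 = (sum(r.^2) - N * (N + 1)^2 / 4) / (N - 1);
H  = (sum(R.^2) / n - N * (N + 1)^2 / 4) / S2;

chi2_crit = 7.81;                   % chi-square(0.05, 3), read from a table
printf("H = %.3f, reject H0: %d\n", H, H > chi2_crit);
```

The rank sums increase steadily with power, and the computed $H$ agrees with the `chisq` value returned by `kruskal_wallis_test`.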