Comparing Mean Vectors for Two Populations
• Independent random samples, one sample from each of two
populations
• Randomized experiment: n1 units are randomly allocated to
treatment 1 and n2 units are randomly allocated to treatment
2. Sample sizes need not be equal.
• Measure p outcomes (or variables or traits) on each unit
Comparing Mean Vectors for Two Populations

• Data vectors from population 1:

  x_{1j} = \begin{bmatrix} x_{1j1} \\ x_{1j2} \\ \vdots \\ x_{1jp} \end{bmatrix}, \quad j = 1, 2, \ldots, n_1

• Data vectors from population 2:

  x_{2j} = \begin{bmatrix} x_{2j1} \\ x_{2j2} \\ \vdots \\ x_{2jp} \end{bmatrix}, \quad j = 1, 2, \ldots, n_2
• We use x̄1 and x̄2 to denote the sample mean vectors, and S1 and S2 to denote the estimated covariance matrices.
Comparing two mean vectors (cont’d)
• If n1, n2 are large, the following assumptions are all we need
to make inferences about the difference between treatments
• When sample sizes are small, we also need that both
distributions are multivariate normal.
• We will first examine the situation where the population
covariance matrices are the same: Σ1 = Σ2. This is a strong
assumption because it says that all p(p + 1)/2 variances and
covariances are the same in both populations.
Pooled estimate of the covariance matrix
• If Σ1 = Σ2 = Σ, then

  \sum_{j=1}^{n_1} (x_{1j} - \bar{x}_1)(x_{1j} - \bar{x}_1)' \quad \text{and} \quad \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_2)(x_{2j} - \bar{x}_2)'

are estimates of (n_1 - 1)\Sigma and of (n_2 - 1)\Sigma, respectively.
Then, we can pool or average the information from the two
samples to obtain an estimate of the common covariance
matrix:

  S_{\text{pool}} = \frac{\sum_{j=1}^{n_1} (x_{1j} - \bar{x}_1)(x_{1j} - \bar{x}_1)' + \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_2)(x_{2j} - \bar{x}_2)'}{n_1 + n_2 - 2} = \frac{n_1 - 1}{n_1 + n_2 - 2} S_1 + \frac{n_2 - 1}{n_1 + n_2 - 2} S_2.
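The identity above is easy to verify numerically. A minimal sketch in Python with NumPy (the sample sizes and simulated data are hypothetical) computes S_pool both from the raw centered cross-product sums and as the weighted average of S1 and S2:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, p = 12, 9, 3  # hypothetical sample sizes and dimension
x1 = rng.normal(size=(n1, p))  # sample from population 1
x2 = rng.normal(size=(n2, p))  # sample from population 2

xbar1, xbar2 = x1.mean(axis=0), x2.mean(axis=0)
# Sample covariance matrices (divisor n_i - 1), as computed by np.cov
S1 = np.cov(x1, rowvar=False)
S2 = np.cov(x2, rowvar=False)

# Form 1: pool the centered cross-product sums
W1 = (x1 - xbar1).T @ (x1 - xbar1)
W2 = (x2 - xbar2).T @ (x2 - xbar2)
S_pool_sums = (W1 + W2) / (n1 + n2 - 2)

# Form 2: weighted average of S1 and S2
S_pool_avg = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)

assert np.allclose(S_pool_sums, S_pool_avg)
```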
Testing hypotheses
• Consider testing H_0 : \mu_1 - \mu_2 = \delta_0, where \delta_0 is some fixed p \times 1 vector. Often, \delta_0 = 0.

• An estimate of \mu_1 - \mu_2 is \bar{X}_1 - \bar{X}_2. Since units from treatment 1 are independent of units from treatment 2, Cov(\bar{X}_1, \bar{X}_2) = 0, and hence

  Cov(\bar{X}_1 - \bar{X}_2) = Cov(\bar{X}_1) + Cov(\bar{X}_2) = \frac{1}{n_1}\Sigma + \frac{1}{n_2}\Sigma.

• An estimate of the covariance matrix of the difference between the sample mean vectors is given by

  \frac{1}{n_1} S_{\text{pool}} + \frac{1}{n_2} S_{\text{pool}} = \left( \frac{1}{n_1} + \frac{1}{n_2} \right) S_{\text{pool}}.
Testing hypotheses (cont’d)
• We reject H_0 : \mu_1 - \mu_2 = \delta_0 at level \alpha if

  T^2 = (\bar{x}_1 - \bar{x}_2 - \delta_0)' \left[ \left( \frac{1}{n_1} + \frac{1}{n_2} \right) S_{\text{pool}} \right]^{-1} (\bar{x}_1 - \bar{x}_2 - \delta_0) > c^2,

where

  c^2 = \frac{(n_1 + n_2 - 2)p}{n_1 + n_2 - p - 1} F_{p,\,n_1+n_2-p-1}(\alpha).

• A 100(1 - \alpha)% CR for \mu_1 - \mu_2 is given by all values of \mu_1 - \mu_2 that satisfy

  (\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2))' \left[ \left( \frac{1}{n_1} + \frac{1}{n_2} \right) S_{\text{pool}} \right]^{-1} (\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)) \le c^2.
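As a sketch of the test in code (Python with NumPy/SciPy; the function name `two_sample_T2` and the simulated data are illustrative, not from the slides):

```python
import numpy as np
from scipy.stats import f

def two_sample_T2(x1, x2, delta0=None, alpha=0.05):
    """Two-sample Hotelling T^2 test assuming Sigma1 = Sigma2."""
    n1, p = x1.shape
    n2 = x2.shape[0]
    if delta0 is None:
        delta0 = np.zeros(p)
    d = x1.mean(axis=0) - x2.mean(axis=0) - delta0
    S_pool = ((n1 - 1) * np.cov(x1, rowvar=False)
              + (n2 - 1) * np.cov(x2, rowvar=False)) / (n1 + n2 - 2)
    # T^2 = d' [(1/n1 + 1/n2) S_pool]^{-1} d, via a linear solve
    T2 = d @ np.linalg.solve((1 / n1 + 1 / n2) * S_pool, d)
    c2 = (n1 + n2 - 2) * p / (n1 + n2 - p - 1) * f.ppf(1 - alpha, p, n1 + n2 - p - 1)
    return T2, c2  # reject H0 if T2 > c2

rng = np.random.default_rng(1)
x1 = rng.normal(0.0, 1.0, size=(40, 2))
x2 = rng.normal(2.0, 1.0, size=(35, 2))  # clearly shifted mean: expect rejection
T2, c2 = two_sample_T2(x1, x2)
```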
Example 6.3: Two Processes for Manufacturing Soap
• The objective was to compare two processes for manufacturing
soap. Outcome measures were X_1 = lather and X_2 = mildness,
and n_1 = n_2 = 50.

• Sample statistics for sample 1 were

  \bar{x}_1 = \begin{bmatrix} 8.3 \\ 4.1 \end{bmatrix}, \quad S_1 = \begin{bmatrix} 2 & 1 \\ 1 & 6 \end{bmatrix},

and for sample 2:

  \bar{x}_2 = \begin{bmatrix} 10.2 \\ 3.9 \end{bmatrix}, \quad S_2 = \begin{bmatrix} 2 & 1 \\ 1 & 4 \end{bmatrix}.
Example 6.3: Two methods for manufacturing soap

• The pooled estimate of the common covariance matrix and the difference in sample mean vectors are

  S_{\text{pool}} = \frac{49}{98} S_1 + \frac{49}{98} S_2 = \begin{bmatrix} 2 & 1 \\ 1 & 5 \end{bmatrix}, \quad \bar{x}_1 - \bar{x}_2 = \begin{bmatrix} -1.9 \\ 0.2 \end{bmatrix}.

• We reject H_0 : \mu_1 - \mu_2 = 0 at level \alpha = .05 because

  T^2 = (\bar{x}_1 - \bar{x}_2 - 0)' \left[ \left( \frac{1}{n_1} + \frac{1}{n_2} \right) S_{\text{pool}} \right]^{-1} (\bar{x}_1 - \bar{x}_2 - 0) = 52.47

is larger than

  c^2 = \frac{(50 + 50 - 2)(2)}{50 + 50 - 2 - 1} F_{2,97}(.05) = 6.26.
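The numbers in this example can be rechecked directly from the summary statistics; a small sketch in Python with NumPy/SciPy:

```python
import numpy as np
from scipy.stats import f

n1 = n2 = 50
p = 2
d = np.array([-1.9, 0.2])                    # xbar1 - xbar2
S_pool = np.array([[2.0, 1.0], [1.0, 5.0]])

# T^2 = d' [(1/n1 + 1/n2) S_pool]^{-1} d
T2 = d @ np.linalg.solve((1 / n1 + 1 / n2) * S_pool, d)

# c^2 = (n1 + n2 - 2) p / (n1 + n2 - p - 1) * F_{p, n1+n2-p-1}(alpha)
c2 = (n1 + n2 - 2) * p / (n1 + n2 - p - 1) * f.ppf(0.95, p, n1 + n2 - p - 1)

# T2 comes out near 52.5 and c2 near 6.25, so H0 is rejected
print(round(T2, 2), round(c2, 2))
```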
Example 6.3: Two methods for manufacturing soap

• Eigenvalues and eigenvectors of the pooled covariance matrix
are given by

  \lambda = \begin{bmatrix} 5.303 \\ 1.697 \end{bmatrix}, \quad E = [e_1, e_2] = \begin{bmatrix} 0.290 & 0.957 \\ 0.957 & -0.290 \end{bmatrix}.
Two methods for manufacturing soap (cont’d)
• A 95% confidence ellipse for the difference between the two
means is centered at x̄1 − x̄2.
• Since

  \left( \frac{1}{n_1} + \frac{1}{n_2} \right) c^2 = \left( \frac{1}{n_1} + \frac{1}{n_2} \right) \frac{(n_1 + n_2 - 2)p}{n_1 + n_2 - p - 1} F_{p,\,n_1+n_2-p-1}(\alpha) = \left( \frac{1}{50} + \frac{1}{50} \right) \frac{(98)(2)}{97} F_{2,97}(0.05) = 0.25,

we know that the ellipse extends \sqrt{5.303} \sqrt{0.25} = 1.15 and
\sqrt{1.697} \sqrt{0.25} = 0.65 units in the e_1 and e_2 directions,
respectively.
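These axis half-lengths follow from the eigendecomposition of S_pool; a quick check in Python with NumPy/SciPy:

```python
import numpy as np
from scipy.stats import f

n1 = n2 = 50
p = 2
S_pool = np.array([[2.0, 1.0], [1.0, 5.0]])

c2 = (n1 + n2 - 2) * p / (n1 + n2 - p - 1) * f.ppf(0.95, p, n1 + n2 - p - 1)
scale = (1 / n1 + 1 / n2) * c2               # about 0.25

# np.linalg.eigh returns eigenvalues in ascending order: [1.697, 5.303]
eigvals, eigvecs = np.linalg.eigh(S_pool)
half_lengths = np.sqrt(eigvals * scale)
# half-length along e1 (lambda = 5.303) is about 1.15,
# along e2 (lambda = 1.697) about 0.65
```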
Two methods for manufacturing soap (cont’d)
• Since µ1 − µ2 = 0 is not inside the ellipse, we conclude that
the two processes produce different results. There appears
to be no big difference in mildness (X2), but soaps made
with the second process produce more lather.
Confidence Intervals
• As before, we can obtain simultaneous confidence intervals for any linear combination of the components of \mu_1 - \mu_2.

• In particular, in the case of p variables, we might be interested in

  a_i'(\mu_1 - \mu_2) = \begin{bmatrix} 0 & 0 & \cdots & 1 & \cdots & 0 \end{bmatrix} \begin{bmatrix} \mu_{11} - \mu_{21} \\ \mu_{12} - \mu_{22} \\ \vdots \\ \mu_{1p} - \mu_{2p} \end{bmatrix} = \mu_{1i} - \mu_{2i},

where the vector a_i has zeros everywhere except for the one in the ith position.

• Typically, we would be interested in p such comparisons.
Confidence Intervals (cont’d)

• In general,

  a'(\bar{X}_1 - \bar{X}_2) \pm \sqrt{\frac{(n_1 + n_2 - 2)p}{n_1 + n_2 - p - 1} F_{p,\,n_1+n_2-p-1}(\alpha)} \sqrt{a' \left( \frac{1}{n_1} + \frac{1}{n_2} \right) S_{\text{pool}} \, a}

will simultaneously cover the true values of a'(\mu_1 - \mu_2) with probability at least 1 - \alpha.

• One-at-a-time t intervals would be computed as

  a'(\bar{X}_1 - \bar{X}_2) \pm t_{n_1+n_2-2}(\alpha/2) \sqrt{a' \left( \frac{1}{n_1} + \frac{1}{n_2} \right) S_{\text{pool}} \, a}

and have simultaneous coverage probability less than 1 - \alpha unless only one comparison is made. To apply the Bonferroni method, divide \alpha by the number m of comparisons of interest.
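For the soap example, the component-wise simultaneous T² intervals and their Bonferroni counterparts can be sketched as follows (Python with NumPy/SciPy; variable names are illustrative):

```python
import numpy as np
from scipy.stats import f, t

n1 = n2 = 50
p = 2
alpha = 0.05
dbar = np.array([-1.9, 0.2])                 # xbar1 - xbar2 (lather, mildness)
S_pool = np.array([[2.0, 1.0], [1.0, 5.0]])

# With a = unit vectors, a' (1/n1 + 1/n2) S_pool a picks off diagonal entries
se = np.sqrt((1 / n1 + 1 / n2) * np.diag(S_pool))

c2 = (n1 + n2 - 2) * p / (n1 + n2 - p - 1) * f.ppf(1 - alpha, p, n1 + n2 - p - 1)
t2_half = np.sqrt(c2) * se                   # simultaneous T^2 half-widths
bonf_half = t.ppf(1 - alpha / (2 * p), n1 + n2 - 2) * se  # Bonferroni, m = p

lather = (dbar[0] - t2_half[0], dbar[0] + t2_half[0])     # excludes 0
mildness = (dbar[1] - t2_half[1], dbar[1] + t2_half[1])   # includes 0
```

The Bonferroni half-widths are narrower than the T² half-widths here, as is typical when only m = p comparisons are of interest.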
Heterogeneous covariance matrices
• Life gets difficult when we cannot assume that Σ1 = Σ2.
• We must modify the standardized distance measure, and the modified statistic will not be exactly distributed as a multiple of an F distribution when the null hypothesis of equal mean vectors is true.
• How to decide whether the assumption of equal covariance matrices is reasonable? Tests such as Bartlett’s test are sensitive to departures from normality.
• A crude rule of thumb is the following: if \sigma_{1,ii} \ge 4\sigma_{2,ii} or \sigma_{2,ii} \ge 4\sigma_{1,ii}, then it is likely that \Sigma_1 \ne \Sigma_2.
Bartlett’s test for equality of k covariance matrices
• Tests for testing homogeneity of variances are touchy in that
they are sensitive to the assumption of multivariate normality.
They tend to reject homogeneity of covariance matrices too
often when the samples are not selected from multivariate
normal distributions.
• Bartlett’s test is a likelihood ratio test (independent random
samples from multivariate normal distributions) for testing

  H_0 : \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k = \Sigma

versus the alternative where at least two \Sigma_i are different.
Bartlett’s test (cont’d)
• The test statistic is a generalization of a likelihood ratio
statistic:

  M = \sum_{i=1}^{k} (n_i - 1) \ln |S_{\text{pool}}| - \sum_{i=1}^{k} (n_i - 1) \ln |S_i|,

where |S_{\text{pool}}| is the determinant of the pooled estimate of
the covariance matrix, |S_i| is the determinant of the sample
covariance matrix for the ith treatment group, and k is the
number of treatments (or populations).
• In this part of the course we focus on the case k = 2 but this
test is more general.
Bartlett’s test (cont’d)
• When all k samples come from multivariate normal populations
and when n_i - p is relatively large for all i = 1, \ldots, k, it
has been shown that

  M C^{-1} \sim \chi^2_{\text{df}}, \quad \text{df} = \frac{1}{2}(k - 1)p(p + 1),

where the scale factor C^{-1} is given by

  C^{-1} = 1 - \frac{2p^2 + 3p - 1}{6(p + 1)(k - 1)} \left[ \sum_{i} \frac{1}{n_i - 1} - \frac{1}{\sum_i (n_i - 1)} \right].
• The null hypothesis H_0 : \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k = \Sigma is rejected
at level \alpha if M C^{-1} \ge \chi^2_{\text{df}}(\alpha), with degrees of freedom as
defined above.
Example: soap manufacturing
• We test the hypothesis H_0 : \Sigma_1 = \Sigma_2 = \Sigma. Here, k = 2 and p = 2.
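Carrying Bartlett’s test through for the soap data gives a small numeric sketch (Python with NumPy/SciPy; S1 and S2 are the sample covariance matrices reported earlier in the example):

```python
import numpy as np
from scipy.stats import chi2

k, p = 2, 2
ns = [50, 50]
S1 = np.array([[2.0, 1.0], [1.0, 6.0]])
S2 = np.array([[2.0, 1.0], [1.0, 4.0]])
Ss = [S1, S2]

dfs = [n - 1 for n in ns]
S_pool = sum(df * S for df, S in zip(dfs, Ss)) / sum(dfs)

# M = [sum_i (n_i - 1)] ln|S_pool| - sum_i (n_i - 1) ln|S_i|
M = sum(dfs) * np.log(np.linalg.det(S_pool)) \
    - sum(df * np.log(np.linalg.det(S)) for df, S in zip(dfs, Ss))

# Scale factor C^{-1} from the chi-square approximation
C_inv = 1 - (2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (k - 1)) * (
    sum(1 / df for df in dfs) - 1 / sum(dfs))

stat = M * C_inv
df_chi = (k - 1) * p * (p + 1) / 2
# stat is about 2.4, well below chi2.ppf(0.95, 3) = 7.81, so
# H0 of equal covariance matrices is not rejected at alpha = .05
```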