17 - 1
Module 17: Two-Sample t-tests, with equal variances for the two populations
This module describes one of the most utilized statistical tests, the two-sample t-test conducted under the assumption that the two populations from which the two samples were selected have the same variance.
Reviewed 11 May 05 /MODULE 17
17 - 2
The General Situation
Up to this point, the focus has been on a single population, for which the observations had a normal distribution with population mean µ and standard deviation σ. From this population, a random sample of size n provided the sample statistics x̄ and s as estimates of µ and σ, respectively.
We created confidence intervals and tested hypotheses concerning the population mean µ, using the normal distribution when the value of σ was available and the t distribution when it was not and we therefore used the estimate s from the sample. This circumstance is often described as the one-sample situation.
17 - 3
Two Sample Hypotheses
Clearly, we are often faced with making judgments for circumstances that involve more than one population and sample. For the moment, we will focus on the so-called two-sample situation. That is, we consider two populations.
Question:
Do you believe the two populations have the same mean?

          City A    City B
Mean      µA        µB
SD        σA        σB

H0: µA = µB versus H1: µA ≠ µB
or equivalently
H0: Δ = µA - µB = 0 versus H1: Δ = µA - µB ≠ 0.
17 - 4
Parameters vs. Estimates

Populations of individual values
            Population 1             Population 2
            Parameter   Estimate     Parameter   Estimate
Mean        µ1          x̄1           µ2          x̄2
Variance    σ1²         s1²          σ2²         s2²
SD          σ1          s1           σ2          s2

Populations of means, samples of size n1 and n2
            Population 1             Population 2
            Parameter   Estimate     Parameter   Estimate
Mean        µ1          x̄1           µ2          x̄2
Variance    σ1²/n1      s1²/n1       σ2²/n2      s2²/n2
SD          σ1/√n1      s1/√n1       σ2/√n2      s2/√n2
17 - 5
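We are interested in
Δ = µ1 - µ2,
which we estimate by
d = x̄1 - x̄2.
If the samples are independent, then
$$\mathrm{Var}(\bar{x}_1 - \bar{x}_2) = \mathrm{Var}(\bar{x}_1) + \mathrm{Var}(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.$$
When σ1² = σ2² = σ²,
$$\mathrm{Var}(\bar{x}_1 - \bar{x}_2) = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right).$$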
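The variance result on this slide can be checked numerically. The following is a minimal simulation sketch, assuming two normal populations with a common standard deviation; the values of the mean, sigma, n1, n2, the seed, and the number of replications are illustrative choices and do not come from the module. It compares the simulated variance of x̄1 - x̄2 with σ²(1/n1 + 1/n2).

```python
import random
import statistics


def simulate_var_of_difference(n1, n2, sigma, reps=20000, seed=0):
    """Simulated variance of (mean of sample 1 - mean of sample 2).

    Both samples are drawn independently from normal populations that share
    the same standard deviation `sigma` (an illustrative assumption).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        x1 = [rng.gauss(100.0, sigma) for _ in range(n1)]
        x2 = [rng.gauss(100.0, sigma) for _ in range(n2)]
        diffs.append(statistics.fmean(x1) - statistics.fmean(x2))
    return statistics.variance(diffs)


n1, n2, sigma = 10, 15, 7.0  # hypothetical values, not from the module
theoretical = sigma**2 * (1 / n1 + 1 / n2)
simulated = simulate_var_of_difference(n1, n2, sigma)
print(f"theoretical: {theoretical:.3f}, simulated: {simulated:.3f}")
```

With this many replications the two numbers should agree to within a few percent; unequal sample sizes were chosen on purpose so that both 1/n1 and 1/n2 visibly contribute.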
17 - 6
Estimating σ²
When σ1² = σ2² = σ², we have two estimates of σ²: one from sample 1, namely s1², and one from sample 2, namely s2². How can we best use these two estimates of the same thing? One obvious answer is to use the average of the two; however, it may be desirable to take into account that the two samples may not be the same size. If they are not the same size, then we may want the larger one to count more.
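17 - 7
Pooled Average
Hence, we use the weighted average of the two sample variances, with the weighting done according to sample size: each variance is weighted by its degrees of freedom, n1 - 1 and n2 - 1. This weighted average is called the pooled estimate:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$$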
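As a small illustration of the pooled estimate, the sketch below computes the weighted average with weights n1 - 1 and n2 - 1. The function name and the input values are hypothetical and chosen only to show how the weighting behaves.

```python
def pooled_variance(n1, s1_sq, n2, s2_sq):
    """Weighted average of two sample variances, with weights n1 - 1 and n2 - 1."""
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * s1_sq + df2 * s2_sq) / (df1 + df2)


# Equal sample sizes: the pooled estimate is the simple average of 4.0 and 8.0.
equal_n = pooled_variance(10, 4.0, 10, 8.0)

# Unequal sample sizes: the larger sample pulls the estimate toward its variance.
unequal_n = pooled_variance(5, 4.0, 21, 8.0)

print(equal_n, unequal_n)
```

When the two samples are the same size the result is the plain average; when they differ, the estimate moves toward the variance of the larger sample, which is the behavior the slide asks for.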
17 - 8
Estimate of Var(x̄1 - x̄2)
To estimate Var(x̄1 - x̄2), we can use
$$s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$
17 - 9
Example 1: Blood Pressures of Children
To investigate the question of whether the children of city A and city B have the same mean systolic blood pressure, a random sample of n = 10 children was selected from each city and their blood pressures measured. These samples provided the following data:

Statistic       City A    City B
n               10        10
x̄ (mm Hg)       105.8     97.2
s² (mm Hg)²     78.62     22.40
s (mm Hg)       8.87      4.73
17 - 10
We are interested in the difference:
Δ = µA - µB
and we have x̄A as an estimate of µA and x̄B as an estimate of µB; hence it is reasonable to use:
d = x̄A - x̄B = 105.8 - 97.2 = 8.6 (mm Hg)
as an estimate of Δ = µA - µB.
17 - 11
We can then ask whether this observed difference of 8.6 mm Hg is sufficiently large for us to question whether the two population means could be the same, that is, µA = µB.
Clearly, if the two population means are truly equal, that is, if µA = µB is true, then we would expect the two sample means also to be equal, that is, x̄A = x̄B, except for the random error that occurs as a consequence of using random samples to represent the entire populations. The question before us is whether this observed difference of 8.6 mm Hg is larger than could reasonably be attributed to this random error and thus reflects a true difference between the population means.
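17 - 12
Confidence Interval for µA - µB, using sp
$$df = (n_A - 1) + (n_B - 1) = 18$$
$$\bar{x}_A - \bar{x}_B = 8.6, \qquad t_{0.975}(18) = 2.1009$$
$$C\left[(\bar{x}_A - \bar{x}_B) - t_{0.975}\, s_p \sqrt{\frac{1}{n_A} + \frac{1}{n_B}} \le \mu_A - \mu_B \le (\bar{x}_A - \bar{x}_B) + t_{0.975}\, s_p \sqrt{\frac{1}{n_A} + \frac{1}{n_B}}\right] = 0.95$$
$$s_p^2 = \frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{(n_A - 1) + (n_B - 1)} = \frac{9(78.62) + 9(22.4)}{18} = 50.51$$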
17 - 13
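$$s_p = \sqrt{50.51} = 7.11$$
$$C\left[8.6 - 2.1009(7.11)\sqrt{\frac{1}{10} + \frac{1}{10}} \le \mu_A - \mu_B \le 8.6 + 2.1009(7.11)\sqrt{\frac{1}{10} + \frac{1}{10}}\right] = 0.95$$
$$C\left[1.92 \le \mu_A - \mu_B \le 15.27\right] = 0.95$$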
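The interval for Example 1 can be reproduced from the summary statistics alone. The sketch below is one way to do it, not necessarily how the module's author computed it; the function name is hypothetical, and the critical value 2.1009 is taken from the slide rather than computed.

```python
import math


def pooled_ci(mean_a, mean_b, var_a, var_b, n_a, n_b, t_crit):
    """Confidence interval for mu_A - mu_B using the pooled standard deviation.

    `t_crit` is the t quantile for (n_a - 1) + (n_b - 1) degrees of freedom,
    supplied by the caller (for example, read from a t table).
    """
    df = (n_a - 1) + (n_b - 1)
    sp = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / df)
    half_width = t_crit * sp * math.sqrt(1 / n_a + 1 / n_b)
    d = mean_a - mean_b
    return d - half_width, d + half_width


# Example 1 summary statistics; 2.1009 is t_0.975(18) as quoted on the slide.
lower, upper = pooled_ci(105.8, 97.2, 78.62, 22.40, 10, 10, t_crit=2.1009)
print(f"95% CI for mu_A - mu_B: ({lower:.2f}, {upper:.2f})")
```

Carried through without intermediate rounding, the upper limit comes out at about 15.28 rather than the slide's 15.27; the difference is rounding only. Because the interval excludes 0, it is consistent with questioning whether µA = µB.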
17 - 14
Example 2: AJPH, April 1994; 84:p644
n = 31, n = 223
17 - 15
Example 2 (contd.)

         1            2
         OCCP Prog    Non OCCP Prog
n        31           223
mean     4.1          3.4
SD       1.2          1.5
s²       1.44         2.25
17 - 16
Example 2 (contd.)
1. The hypothesis: H0: µ1 = µ2 vs. H1: µ1 ≠ µ2
2. The assumptions: Independent random samples from normal distributions, σ1² = σ2² = σ²
3. The level: α = 0.05
4. The test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
5. The critical region: Reject H0 if t is not between ± t0.975(252) = ± 1.97
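17 - 17
6. Test result:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{30(1.44) + 222(2.25)}{30 + 222} = \frac{542.7}{252} = 2.154$$
$$s_p = \sqrt{2.154} = 1.47$$
$$\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{\frac{1}{31} + \frac{1}{223}} = 0.19$$
$$t = \frac{4.1 - 3.4}{1.47 \times 0.19} = 2.5$$
7. The Conclusion: Reject H0 since t = 2.5 is not between ± 1.97; 0.01 < p < 0.02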
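The steps of Example 2 can be sketched as a short function that works directly from the reported means, standard deviations, and sample sizes. The function name is hypothetical, and the critical value 1.97 is taken from the slide rather than computed.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for the equal-variance two-sample t-test."""
    df = (n1 - 1) + (n2 - 1)
    sp_sq = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df  # pooled variance
    se_diff = math.sqrt(sp_sq * (1 / n1 + 1 / n2))        # SE of the difference in means
    return (mean1 - mean2) / se_diff, df


t, df = pooled_t_statistic(4.1, 1.2, 31, 3.4, 1.5, 223)
T_CRIT = 1.97  # t_0.975(252) as quoted on the slide
reject_h0 = abs(t) > T_CRIT
print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```

Without intermediate rounding the statistic is about 2.49, which agrees with the slide's 2.5 and leads to the same decision.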
17 - 18
Example 3: AJPH July 1994; 89:1068
17 - 19
Example 3 (contd.)

                Mainland Puerto Ricans    Cuban Americans
n               1,383                     357
mean            3.3                       2.4
SE              0.2                       0.7
s = SE × √n     (0.2)√1383 = 7.44         (0.7)√357 = 13.23
s²              55.35                     175.03

Source: AJPH, July 1994; 89:1068
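Because the article reports standard errors rather than standard deviations, the table first converts each SE back to s using s = SE × √n. A minimal sketch of that conversion follows; the function name is hypothetical, and the group labels are as read from the table.

```python
import math


def sd_from_se(se, n):
    """Recover the sample standard deviation from the standard error of the mean."""
    return se * math.sqrt(n)


groups = [("Mainland Puerto Ricans", 0.2, 1383), ("Cuban Americans", 0.7, 357)]
for label, se, n in groups:
    s = sd_from_se(se, n)
    print(f"{label}: s = {s:.2f}, s^2 = {s * s:.2f}")
```

The standard deviations match the table (7.44 and 13.23). The variances printed here differ slightly from 55.35 and 175.03 because the table squares the already rounded values of s.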
17 - 20
1. The hypothesis: H0: µ1 = µ2 vs. H1: µ1 ≠ µ2
2. The assumptions: Independent random samples from normal distributions, σ1² = σ2² = σ²
3. The level: α = 0.05
4. The test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
17 - 21
5. The critical region: Reject H0 if t is not between ± t0.975(1738) = ± 1.96
6. The Result:
$$s_p^2 = \frac{1382(55.35) + 356(175.03)}{1382 + 356} = 79.86, \qquad s_p = \sqrt{79.86} = 8.94$$
$$\sqrt{\frac{1}{1383} + \frac{1}{357}} = 0.059$$
$$t = \frac{3.3 - 2.4}{8.94(0.059)} = \frac{0.90}{0.527} = 1.71$$
7. Conclusion: Accept H0: µ1 = µ2, since p > 0.05; 0.05 < p < 0.10
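With 1738 degrees of freedom the t critical value on the slide, 1.96, is the same as the standard normal one, so the p-value bracket can be checked with a normal approximation. The sketch below assumes that approximation is acceptable here; it is not the exact t-distribution p-value, and the function name is hypothetical.

```python
import math


def two_sided_p_normal(t):
    """Two-sided p-value using the standard normal in place of the t distribution."""
    phi = 0.5 * (1.0 + math.erf(abs(t) / math.sqrt(2.0)))  # normal CDF at |t|
    return 2.0 * (1.0 - phi)


p = two_sided_p_normal(1.71)  # t as reported on the slide, df = 1738
print(f"approximate p = {p:.3f}")
```

The approximate p-value falls between 0.05 and 0.10, in line with the bracket given in the conclusion.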
17 - 22
Example 4: AJPH July 1994; 89:1068
17 - 23
1. The hypothesis: H0: µSSS = µNHS vs. H1: µSSS ≠ µNHS
2. The level: α = 0.05
3. The assumptions: Independent Samples, Normal Distribution, σSSS² = σNHS²
4. The test statistic:
$$t = \frac{\bar{X}_{SSS} - \bar{X}_{NHS}}{S_p\sqrt{\dfrac{1}{n_{SSS}} + \dfrac{1}{n_{NHS}}}}$$
5. The critical region: Reject H0 if t is not between ± 2.1315
17 - 24
6. The result :
7. The conclusion: Reject H0: µSSS = µNHS; 0.01 < p < 0.02