The t-distribution

William Gosset lived from 1876 to 1937.

Gosset invented the t-test to handle small samples for quality control in brewing. He wrote under the name "Student".

t test: origin

Founder: W. S. Gosset

Wrote under the pseudonym "Student"

Mostly worked in tea (t) time

Hence known as Student's t test.

Certainly applicable if n < 30

Is there a difference?

between you…means,

who is meaner?

Types

One sample: compare with population

Unpaired: compare with control

Paired: same subjects, pre-post

T-test

1. Test for single mean: Is the sample mean equal to the predefined population mean?

2. Test for difference in means: Is the CD4 level of patients taking treatment A equal to the CD4 level of patients taking treatment B?

3. Test for paired observations: Did the treatment confer any significant benefit?

Test direction

One-tailed t test

Two-tailed t test

Developing the Pooled-Variance t Test (Part 1)

• Setting Up the Hypothesis:

Two Tail: H0: µ1 = µ2, H1: µ1 ≠ µ2 OR H0: µ1 - µ2 = 0, H1: µ1 - µ2 ≠ 0

Right Tail: H0: µ1 ≤ µ2, H1: µ1 > µ2 OR H0: µ1 - µ2 ≤ 0, H1: µ1 - µ2 > 0

Left Tail: H0: µ1 ≥ µ2, H1: µ1 < µ2 OR H0: µ1 - µ2 ≥ 0, H1: µ1 - µ2 < 0

One-tailed: Mean systolic BP in nephritis is significantly higher than that of a normal person.

[Figure: distribution over systolic BP from 100 to 140, with an area of 0.05 in the upper tail]

Two-tailed: Mean systolic BP in nephritis is significantly different from that of a normal person.

[Figure: distribution over systolic BP from 100 to 140, with an area of 0.025 in each tail]

Statistical Analysis

Is there a difference between the control group mean and the treatment group mean?

What does difference mean?

[Figure: control and treatment distributions shown with medium, high, and low variability]

The mean difference is the same for all three cases.

Which one shows the greatest difference? The low-variability case.

A statistical difference is a function of the difference between means relative to the variability. A small difference between means with large variability could be due to chance. It is like a signal-to-noise ratio.

So we estimate:

$$ t\text{-value} = \frac{\text{signal}}{\text{noise}} = \frac{\text{difference between group means}}{\text{variability of groups}} = \frac{\bar{X}_T - \bar{X}_C}{SE(\bar{X}_T - \bar{X}_C)} $$
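The signal-to-noise idea can be sketched in a few lines of Python. This is only an illustration: the data are hypothetical, the function name `t_value` is made up here, and the standard error uses the simple unpooled form rather than any particular formula from these slides.

```python
import math
import statistics

def t_value(treatment, control):
    """Signal-to-noise ratio: difference of means over its standard error."""
    n_t, n_c = len(treatment), len(control)
    # Signal: difference between the two group means.
    signal = statistics.mean(treatment) - statistics.mean(control)
    # Noise: standard error of that difference (unpooled form, an
    # assumption made for this illustration only).
    noise = math.sqrt(statistics.variance(treatment) / n_t
                      + statistics.variance(control) / n_c)
    return signal / noise

# Hypothetical data: in both cases the group means differ by exactly 2.
low_t, low_c = [11.9, 12.0, 12.1, 12.0], [9.9, 10.0, 10.1, 10.0]
high_t, high_c = [8.0, 16.0, 10.0, 14.0], [6.0, 14.0, 8.0, 12.0]

t_low = t_value(low_t, low_c)
t_high = t_value(high_t, high_c)
print(f"low variability:  t = {t_low:.2f}")
print(f"high variability: t = {t_high:.2f}")
```

The same mean difference gives a much larger t when the groups vary little, which is the point of the three-panel comparison.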

Two sample t-test

The t value from the t-test depends on:

Difference between means

Sample size

Variability of data

Probability - p

With t we check the probability.

Reject or do not reject the null hypothesis.

You reject if p < 0.05 or still less.

The difference between means (groups) is more and more significant as p gets smaller.

[Figure: standard normal (Z) curve with critical values ±1.96 (area = .025 in each tail) and ±2.575 (area = .005 in each tail)]

Determining the p-Value

[Figure: density f(t) against t, centred at 0, with area .95 between -1.96 and 1.96 and area .025 in each tail; red area = rejection region for 2-sided test]
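As a rough sketch of how a two-sided p-value relates to the tail areas, the Student t density can be integrated numerically using only the Python standard library. The function names `t_pdf` and `two_sided_p` and the use of Simpson's rule are choices made for this illustration; in practice a statistics package or a printed table would normally be used.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t_obs, df, steps=2000):
    """Two-sided p-value: area outside [-|t|, |t|], by Simpson's rule."""
    a = abs(t_obs)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so the central area is twice this value.
    return 1 - 2 * area_zero_to_t

# A tabulated critical value should give a p-value close to 0.05.
print(round(two_sided_p(2.086, 20), 3))
```

A one-sided p-value is half the two-sided value when the observed t lies in the direction stated by the alternative hypothesis.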

Assumptions

Normal distribution

Equal variance

Random sampling

t-Statistic

$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$

When the sampled population is normally distributed, the t statistic is Student t distributed with n-1 degrees of freedom.

T-test for single mean

The following are the weights (mg) of each of 20 rats drawn at random from a large stock. Is it likely that the mean weight of these 20 rats is similar to the mean weight (24 mg) of the whole stock?

 9  18  21  26
14  18  22  27
15  19  22  29
15  19  24  30
16  20  24  32

Steps for test for single mean

1. Question to be answered: Is the mean weight of the sample of 20 rats consistent with the stock mean of 24 mg?

n = 20, x̄ = 21.0 mg, sd = 5.91, µ = 24.0 mg

2. Null hypothesis: The mean weight of rats is 24 mg. That is, the sample mean is equal to the population mean.

3. Test statistic: t with (n-1) df

$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$

4. Comparison with theoretical value: if tab t (n-1) < cal t (n-1), reject H0; if tab t (n-1) > cal t (n-1), do not reject H0.

5. Inference

t-test for single mean: test statistic

n = 20, x̄ = 21.0 mg, sd = 5.91, µ = 24.0 mg

$$ |t| = \frac{|21.0 - 24|}{5.91 / \sqrt{20}} = 2.27 $$

Tabulated value: t(.05, 19) = 2.093. Do not reject H0 if |t| < 2.093; reject H0 if |t| >= 2.093.

Inference:

Since 2.27 >= 2.093, H0 is rejected at the 0.05 level: the sample mean differs significantly from the stock mean weight of 24 mg.
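The single-mean calculation can be checked from the raw rat weights with the Python standard library. This is a minimal sketch: the variable names are illustrative, and the critical value 2.093 is simply copied from the slide rather than computed.

```python
import math
import statistics

# Weights (mg) of the 20 rats from the example.
weights = [9, 18, 21, 26, 14, 18, 22, 27, 15, 19,
           22, 29, 15, 19, 24, 30, 16, 20, 24, 32]
mu0 = 24.0       # stock mean under the null hypothesis
T_CRIT = 2.093   # tabulated t for alpha = 0.05 (two-sided), df = 19, from the slide

n = len(weights)
mean = statistics.mean(weights)
sd = statistics.stdev(weights)               # sample SD, n - 1 in the denominator
t_stat = (mean - mu0) / (sd / math.sqrt(n))  # t = (x-bar - mu) / (s / sqrt(n))
reject_h0 = abs(t_stat) >= T_CRIT

print(f"mean = {mean}, sd = {sd:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

The sign of t is negative because the sample mean lies below 24 mg; for a two-sided test only its magnitude, about 2.27, is compared with the tabulated value.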

T-test for difference in means

Given below are the 24-hour total energy expenditures (MJ/day) in groups of lean and obese women. Examine whether the obese women's mean energy expenditure is significantly higher.

Lean: 6.1 7.0 7.5 7.5 5.5 7.6 7.9 8.1 8.1 8.1 8.4 10.2 10.9

Obese: 8.8 9.2 9.2 9.7 9.7 10.0 11.5 11.8 12.8

Null Hypothesis

Obese women's mean energy expenditure is equal to the lean women's mean energy expenditure.

Data Summary

          Lean    Obese
N         13      9
Mean      8.10    10.30
S         1.38    1.25

Solution

H0: µ1 - µ2 = 0 (µ1 = µ2)

H1: µ1 - µ2 ≠ 0 (µ1 ≠ µ2)

α = 0.05

df = 13 + 9 - 2 = 20

Critical value(s): t = ±2.086

[Figure: t distribution centred at 0; reject H0 below -2.086 or above 2.086, with area .025 in each tail]

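Calculating the Test Statistic:

• Compute the Test Statistic:

$$ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}, \qquad df = n_1 + n_2 - 2 $$

Here µ1 - µ2 is the hypothesized difference (usually zero when testing for equal means), and

$$ S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} $$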

Developing the Pooled-Variance t Test

• Calculate the Pooled Sample Variance as an Estimate of the Common Population Variance:

$$ S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} $$

$S_p^2$ = Pooled variance

$S_1^2$ = Variance of sample 1

$S_2^2$ = Variance of sample 2

$n_1$ = Size of sample 1

$n_2$ = Size of sample 2

First, estimate the common variance as a weighted average of the two sample variances, using the degrees of freedom as weights:

$$ S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(13 - 1)(1.38)^2 + (9 - 1)(1.25)^2}{(13 - 1) + (9 - 1)} = 1.765 $$

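Calculating the Test Statistic:

$$ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{(10.3 - 8.1) - 0}{\sqrt{1.765 \left( \frac{1}{9} + \frac{1}{13} \right)}} = 3.82 $$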

T-test for difference in means

Tabulated t with 9 + 13 - 2 = 20 df: t(0.05, 20) = 2.086

Inference: The calculated t (3.82) is higher than the tabulated t at 0.05 with 20 df (2.086). This implies that there is evidence that the mean energy expenditure in the obese group is significantly (p < 0.05) higher than that of the lean group.
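The pooled-variance calculation can be reproduced from the summary statistics alone. The sketch below uses only the standard library; the function name `pooled_t` and its parameter names are illustrative, and the critical value 2.086 is taken from the slide, not computed.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic from summary statistics.

    Returns (t, df, pooled_variance). Assumes equal population variances.
    """
    df = n1 + n2 - 2
    # Weighted average of the sample variances, weights = degrees of freedom.
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - hypothesized_diff) / se
    return t, df, sp2

# Sample 1 = obese women, sample 2 = lean women (values from the data summary).
t_stat, df, sp2 = pooled_t(10.3, 1.25, 9, 8.1, 1.38, 13)
T_CRIT = 2.086  # tabulated t for alpha = 0.05 (two-sided), df = 20, from the slide
reject_h0 = abs(t_stat) > T_CRIT

print(f"Sp^2 = {sp2:.3f}, df = {df}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Unrounded, the pooled variance comes out slightly above the slide's 1.765, but the t statistic still rounds to 3.82.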

Example

Suppose we want to test the effectiveness of a program designed to increase scores on the quantitative section of the Graduate Record Exam (GRE). We test the program on a group of 8 students. Prior to entering the program, each student takes a practice quantitative GRE; after completing the program, each student takes another practice exam. Based on their performance, was the program effective?

Each subject contributes 2 scores: repeated measures design

Student   Before Program   After Program
1         520              555
2         490              510
3         600              585
4         620              645
5         580              630
6         560              550
7         610              645
8         480              520

Can represent each student with a single score: the difference (D) between the scores

Student   Before Program   After Program   D
1         520              555             35
2         490              510             20
3         600              585             -15
4         620              645             25
5         580              630             50
6         560              550             -10
7         610              645             35
8         480              520             40

Approach: test the effectiveness of the program by testing the significance of D

Null hypothesis: There is no difference between the scores before and after the program

Alternative hypothesis: the program is effective → scores after the program will be higher than scores before the program → average D will be greater than zero

H0: µD = 0

H1: µD > 0

So, need to know ∑D and ∑D²:

Student   Before Program   After Program   D      D²
1         520              555             35     1225
2         490              510             20     400
3         600              585             -15    225
4         620              645             25     625
5         580              630             50     2500
6         560              550             -10    100
7         610              645             35     1225
8         480              520             40     1600

∑D = 180, ∑D² = 7900

Recall that for single samples:

$$ t_{obt} = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{\text{score} - \text{mean}}{\text{standard error}} $$

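For related samples:

$$ t_{obt} = \frac{\bar{D} - \mu_D}{s_{\bar{D}}} $$

where:

$$ s_{\bar{D}} = \frac{s_D}{\sqrt{N}} $$

and

$$ s_D = \sqrt{\frac{\sum D^2 - \frac{(\sum D)^2}{N}}{N - 1}} $$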

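Standard deviation of D:

$$ s_D = \sqrt{\frac{\sum D^2 - \frac{(\sum D)^2}{N}}{N - 1}} = \sqrt{\frac{7900 - \frac{(180)^2}{8}}{8 - 1}} = 23.45 $$

Mean of D:

$$ \bar{D} = \frac{\sum D}{N} = \frac{180}{8} = 22.5 $$

Standard error:

$$ s_{\bar{D}} = \frac{s_D}{\sqrt{N}} = \frac{23.45}{\sqrt{8}} = 8.2908 $$

Test statistic:

$$ t_{obt} = \frac{\bar{D} - \mu_D}{s_{\bar{D}}} $$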

Under H0, µD = 0, so:

$$ t_{obt} = \frac{\bar{D}}{s_{\bar{D}}} = \frac{22.5}{8.2908} = 2.714 $$

From Table B.2: for α = 0.05, one-tailed, with df = 7,

t critical = 1.895

2.714 > 1.895 → reject H0

The program is effective.
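The paired calculation can be verified directly from the before and after scores. This is a minimal standard-library sketch; the variable names are illustrative, and the one-tailed critical value 1.895 is copied from the text above rather than computed.

```python
import math
import statistics

before = [520, 490, 600, 620, 580, 560, 610, 480]
after = [555, 510, 585, 645, 630, 550, 645, 520]

# One difference score per student: D = after - before.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

mean_d = statistics.mean(diffs)   # D-bar
sd_d = statistics.stdev(diffs)    # s_D, with N - 1 in the denominator
se_d = sd_d / math.sqrt(n)        # standard error of D-bar
t_obt = mean_d / se_d             # under H0, mu_D = 0

T_CRIT = 1.895  # alpha = 0.05, one-tailed, df = 7 (value quoted in the text)
reject_h0 = t_obt > T_CRIT

print(f"D-bar = {mean_d}, s_D = {sd_d:.2f}, SE = {se_d:.4f}, "
      f"t = {t_obt:.3f}, reject H0: {reject_h0}")
```

Working on the differences turns the paired problem into a single-sample test of whether the mean of D is zero.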

t-Value

t is a measure of: How difficult is it to believe the null hypothesis?

High t: Difficult to believe the null hypothesis - accept that there is a real difference.

Low t: Easy to believe the null hypothesis - have not proved any difference.

In Conclusion!

Student's t-test will be used when the sample size is small, and for the following situations:

(1) to compare a single sample mean with the population mean

(2) to compare the sample means of two independent samples

(3) to compare the sample means of paired samples
