CIVE2602 - Engineering Mathematics 2.2 (20 credits) Statistics and Probability Lecture 9 Hypothesis testing –Examples Undertaking experiments t-test for two sample means • P-values • F-tests Dr D Borman ©Claudio Nunez 2010, sourced from http://commons.wikimedia.org/wiki/File:2010_Chile_earthquake_-_Building_destroyed_in_ Concepci%C3%B3n.jpg?uselang=en-gb Available under creative commons license

Apr 01, 2015


Desirae Twyford
Transcript
Page 1:

CIVE2602 - Engineering Mathematics 2.2 (20 credits)

Statistics and Probability

Lecture 9

• Hypothesis testing – Examples
• Undertaking experiments
• t-test for two sample means
• P-values
• F-tests

Dr D Borman

©Claudio Nunez 2010, sourced from http://commons.wikimedia.org/wiki/File:2010_Chile_earthquake_-_Building_destroyed_in_Concepci%C3%B3n.jpg?uselang=en-gb Available under creative commons license

Page 2:

1) Testing a single sample mean

• Does one distribution come from a certain population?
• How certain are we that it comes from a different population?

[Figure: a sample compared with a population]

2) Two sample tests on the mean

• Are two distributions different?
• Do they come from the same population?
• How different?
• How certain are we that they are different?

Page 3:

Hypotheses Testing – 2 sample tests

We might want to set up an experiment to test for…

• Is there a difference in the strength of our concrete depending on whether we use chemical X or chemical Y? (test a sample of each type)

• Does our concrete get stronger if we leave it an additional week to set? (e.g. sample 1 this week, sample 2 the week after)

• Has the pollution level in a river changed since a new factory was built? (before/after sample)

• Has there been a reduction in vibrations to a bridge after modifications?

©Terraplanner 2007, sourced from http://www.flickr.com/photos/halonfury/1977813099/

Available under creative commons license

Page 4:

Summary of when to use z-test and t-test

Standard Deviation (σ) | Testing X against Y (where Y is population) | Testing X against Y (where Y is another sample)
σ is known or large sample (σx ~ σy) | z-test: use σ (or Sx) as the population standard deviation | 2 Sample t-test or 2 Sample z-test
σ is unknown, small samples (or σx ≠ σy) | t-test: use σ = Sx | 2 Sample t-test

[Figure: a sample compared with a population]

Page 5:

Page 6:
Page 7:

‘Student’ t test

• The t-test was developed by W. S. Gosset, who joined Guinness in 1899 and published the test in 1908

• Employed at the Guinness brewery in Dublin

• He was responsible for developing procedures for ensuring the similarity of batches of Guinness

• Had to publish anonymously so did so under the name ‘Student’

©Sami Keinanen 2005, sourced from http://commons.wikimedia.org/wiki/File:Guinness.jpg

Available under creative commons license

Page 8:

Designing a statistical experiment

Ronald Fisher (famous statistician): what he said rings even more true today than when he uttered it, back in 1938:

"To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of."

Page 9:

http://www.youtube.com/watch?v=BX9iMIC6mcg

Hypotheses Testing (2 sample t-test)

Setting up and performing an experiment

Page 10:

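Formula B: - not equal variances

$$S_x^2 = \frac{\sum (X - \bar{X})^2}{n_x - 1}, \qquad S_y^2 = \frac{\sum (Y - \bar{Y})^2}{n_y - 1}$$

Formula A: - assumes equal variances

$$S^2 = S_x^2 = S_y^2 = \frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_x + n_y - 2}$$

Test statistic:

$$t = \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}}$$

$$t = \frac{(301.92 - 265.83) - 0}{\sqrt{\dfrac{1161.3}{10} + \dfrac{1161.3}{13}}} = \frac{36.09}{14.33} = 2.518$$

P-value: the tail probability read from the t-tables for the calculated t. For a two-tailed test (2TT) we double it before comparing it with the significance level, because the tables give the probability for one tail only.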
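The arithmetic in the worked line above can be checked with a few lines of Python. This is only a sketch: the function names `pooled_variance` and `t_statistic` are illustrative and do not come from the lecture, and the inputs (difference in means 36.09, pooled variance 1161.3, sample sizes 10 and 13) are read off the slide.

```python
import math

def pooled_variance(ss_x, ss_y, n_x, n_y):
    """Formula A: pooled variance from the two sums of squared deviations."""
    return (ss_x + ss_y) / (n_x + n_y - 2)

def t_statistic(diff_in_means, var_x, var_y, n_x, n_y, hypothesised_diff=0.0):
    """t = (observed difference - hypothesised difference) / standard error."""
    standard_error = math.sqrt(var_x / n_x + var_y / n_y)
    return (diff_in_means - hypothesised_diff) / standard_error

# With Formula A the same pooled variance is used for both samples.
t = t_statistic(36.09, 1161.3, 1161.3, n_x=10, n_y=13)
print(round(t, 3))
```

For Formula B the same `t_statistic` call would take the two separate sample variances instead of one pooled value.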

Page 11:

[Figure: t-distribution showing the acceptance region for H0, the rejection regions for H0, and the critical level tcrit (from tables)]

Page 12:

3 ways to check if H0 should be rejected (all effectively the same thing):

- If P-value < significance level (e.g. < 5%) (P-value found from tables, but doubled for 2TT)

- If value from tables < ½ significance level (2TT)

- If the magnitude of the t statistic, |t|, > tcrit

tcrit = t-value for a particular significance level, e.g. 5% (depends on whether the test is 1TT or 2TT)

[Figure: t-distribution showing the acceptance region for H0, the rejection regions for H0, and the critical level tcrit (from tables)]

http://www.youtube.com/watch?v=ZFXy_UdlQJg&NR=1
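The three checks can be compared side by side in code. The sketch below is an illustration rather than the lecture's method: instead of reading tables it integrates the Student t density numerically with Simpson's rule, and the helper names are made up. The inputs are assumptions too: t = 2.518 is the worked value from the earlier slide, df = 10 + 13 - 2 = 21 is inferred from its sample sizes, and 2.080 is the standard table value of tcrit for 21 df at 5% (2TT).

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def one_tail_probability(t, df, steps=2000):
    """P(T > |t|): 0.5 minus the Simpson's-rule integral of the density on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 0.5
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def three_checks(t, df, t_crit, alpha=0.05):
    """Return the three two-tailed rejection checks as booleans."""
    one_tail = one_tail_probability(t, df)
    return (
        2 * one_tail < alpha,   # doubled P-value against the significance level
        one_tail < alpha / 2,   # table (one-tail) value against half the level
        abs(t) > t_crit,        # t statistic against the critical value
    )

one_tail = one_tail_probability(2.518, 21)
checks = three_checks(t=2.518, df=21, t_crit=2.080)
print(round(2 * one_tail, 3), checks)
```

All three entries agree for these inputs, which is the point of the slide: they are the same comparison expressed on different scales.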

Page 13:

A simple example

• The breaking stress of 66 Oak beams to be used in houses around Yorkshire was measured

• We want to
– Describe the data
– Test for differences between Red Oak and Yellow Oak

Page 14:

Good to look at your data before analysis

[Figure: box plot (median, IQR) and scatter plot of breaking strength by Oak type (Red, Yellow)]

Page 16:

The frequency distribution of the breaking stress of Oak beams in 66 Yorkshire houses

[Figure: histogram of breaking stress (x-axis: breaking stress, 130 to 225; y-axis: frequency, 0 to 10). Mean = 174.9, Std. Dev = 19.51, N = 66]

Approximately normal distribution, more or less symmetrical (skewness = 0)

Page 17:

How do I analyse my data?

• Think about analysis before you collect the data

• What is the form of the data?

• How do I describe the data?

• Hypothesis testing

• What test do I need?

• What do the results mean?

Page 18:

Hypothesis testing

• Think of an interesting/appropriate question

• Phrase as a hypothesis: e.g. “strength of beams varies with Oak type”

• Rephrase as a Null hypothesis, H0: e.g. “beam strength does not vary with Oak type”

• Test whether data are consistent with H0

Page 19:

Types of null hypothesis

• data follow a given distribution

• a and b are not associated

• means do not differ

• x does not influence y

Page 20:

What test do we need?

• Breaking strength of Oak Beam -Single continuous dependent variable

• Oak Type (Red/Yellow) -Single categorical predictor variable

=> t-test (or ANOVA if more than two)

Page 21:

Could draw a graph showing 95% Confidence Intervals of the sample means

• Graph should reflect analysis, e.g. if analysing means, present means

• Figure legend should be understandable without reference to text

[Figure: 95% CI of breaking strength by type of Oak (Red, Yellow)]

Page 22:

Hypothesis testing

• Choose the null hypothesis
– H0: breaking strengths of Red and Yellow are equal

• Calculate a test statistic
– Mean for Red (x1) = 181, Yellow (x2) = 170
– Calculate t = (x1 - x2)/SE = 11/4.6 (use formula)
– t = 2.37

$$t = \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}}$$

- Null hypothesis allows us to predict the distribution of the test statistic

- Test the observed value against the predicted distribution assuming the null hypothesis is true

- Is this significant, i.e. larger or smaller than we would expect by chance?

Page 23:

• If H0 is true, the predicted distribution of t for 64 df is:

[Figure: t-distribution for 64 df with tcrit = -2.00 and tcrit = 2.00 marked (tcrit-value for 5% significance, from tables) and the observed t = 2.37 in the upper rejection region]

• Remember n = 66, df = v = n - 2 = 64

• If H0 is true, 5% of the time we see t > 2.00 or t < -2.00

• If H0 is true, the probability of a t-value at least as extreme as the observed t = 2.37 is 2.1% (P-value = 0.021, found from tables using t-statistic = 2.37)

Therefore, reject H0 and accept the alternative hypothesis => there is a difference between Red and Yellow Oak

Page 24:

Data presentation (what do we need to report)

• The frequency distribution of Oak breaking strength did not differ significantly from normal distribution.
– Make it clear that your data meets any assumptions of the tests used

• The mean (±SD) Breaking strength was 181 kN (±20) for Red Oak (n=30), and 170 kN (±18) for Yellow Oak (n=36)
– Present appropriate descriptive statistics

• There was a significant difference in Breaking strength between Red Oak and Yellow Oak (t=2.37, df=64, P=0.021)…
– Present the test statistic, degrees of freedom and exact P-value


Page 25:

Point to note

P-value measures strength of evidence against null hypothesis, not magnitude of effect

[Figure: two scatter plots of HEIGHT against SEX, one labelled P<0.0001 and the other P=0.021]

Page 26:

Interpreting Non-Significant results

• Non-significant result does not mean no effect (could be good to take larger samples!)

• Very hard to prove no effect

• Other techniques can be used

Page 27:

F-test – test on the variance

Page 28:

Example- Comparing Two means with small sample

Five cubes can be made from one mix of concrete. They are tested after 5 days and give the following compressive strengths: 23 17 22 33 25 (say, X)

while another 5 cubes from the same mix are tested after 6 days and give

26 19 29 37 24 (say, Y)

Is there a significant difference between the means for the two samples?

1) Test variances (not done this yet): assume underlying variances are the same

(assume underlying population is Normally distributed)

Solution

We have small samples with an unknown σ (population standard deviation)

Use the t-test

nx = 5, ny = 5

Find sample means: $\bar{X} = 24$, $\bar{Y} = 27$

Find sample standard deviation:

Formula B:

$$S_x^2 = \frac{\sum (X - \bar{X})^2}{n_x - 1}, \qquad S_y^2 = \frac{\sum (Y - \bar{Y})^2}{n_y - 1}$$

Formula A:

$$S^2 = S_x^2 = S_y^2 = \frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_x + n_y - 2} = 39.25$$

Degrees of freedom, v = nx + ny - 2 = 5 + 5 - 2 = 8
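As a check on the numbers in this example, the following sketch recomputes the means, the pooled variance (Formula A) and the resulting t statistic from the raw cube strengths. The final comparison is an assumption added for illustration: 2.306 is the standard t-table value for 8 df at 5% (2TT), and the slide itself stops before stating a conclusion.

```python
import math
from statistics import mean

x = [23, 17, 22, 33, 25]  # strengths after 5 days
y = [26, 19, 29, 37, 24]  # strengths after 6 days

def sum_sq_dev(values):
    """Sum of squared deviations from the sample mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

n_x, n_y = len(x), len(y)
df = n_x + n_y - 2

# Formula A: pooled variance, assuming equal underlying variances.
pooled_var = (sum_sq_dev(x) + sum_sq_dev(y)) / df

# H0: the population means are equal, so the hypothesised difference is 0.
t = (mean(x) - mean(y)) / math.sqrt(pooled_var / n_x + pooled_var / n_y)

T_CRIT = 2.306  # assumed table lookup: 8 df, 5% significance, two-tailed
significant = abs(t) > T_CRIT

print(mean(x), mean(y), pooled_var, round(t, 3), significant)
```

Under these assumptions |t| is well below the critical value, so the data would not show a significant difference between the 5-day and 6-day means.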

Page 29:

F-test – test on the variance

So far we have only considered testing a sample mean against a population mean (μ).

Here we look at tests on the sample variance to see if two samples have significantly different variances (normally done before comparing means).

Distribution of Variances

If X and Y are normally distributed with identical variances and if samples of size nx and ny respectively are drawn from each, then the ratio

$$\frac{S_x^2}{S_y^2}$$

satisfies an F distribution with V1 = nx-1 and V2 = ny-1 degrees of freedom.

Remember $S_x^2$ and $S_y^2$ are the unbiased estimators of the X and Y population variances given by:

$$S_x^2 = \frac{1}{n_x - 1} \sum (X - \bar{X})^2, \qquad S_y^2 = \frac{1}{n_y - 1} \sum (Y - \bar{Y})^2$$

Page 30:


F-distribution

Page 31:

F-test

Example – we have two small samples, X and Y

$$n_x = 18, \quad n_y = 16, \quad S_x^2 = 200, \quad S_y^2 = 50$$

Calculate

$$F = \frac{S_x^2}{S_y^2} = \frac{200}{50} = 4$$

Degrees of freedom: V1 = nx-1 = 17 and V2 = ny-1 = 15

Find F at 5% significance:

$$F_{17,15} = 2.373$$

Since our calculated value of F = 4 exceeds the value at the 5% significance level, we conclude that the two sample variances are different
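The same comparison can be written as a short function. This is a sketch with an illustrative name (`f_ratio`); the critical value 2.373 is the table value quoted on the slide, not something the code computes.

```python
def f_ratio(var_a, n_a, var_b, n_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

F, v1, v2 = f_ratio(200, 18, 50, 16)

F_CRIT = 2.373  # table value from the slide for (17, 15) df at 5% significance
variances_differ = F > F_CRIT

print(F, (v1, v2), variances_differ)
```

Putting the larger variance in the numerator keeps F at or above 1, so only the upper tail of the F table is needed.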

Page 32:

F-Test summary

The F-test is used to test for differences among sample variances. Like the Student's t, one calculates an F and compares this to a table value.

The formula for F is simply

$$F = \frac{S_x^2}{S_y^2}$$

The variances are arranged so that F > 1, that is, the larger variance goes on top ($s_1^2 > s_2^2$).

We use the F-test in the same way as the Student's t-test, only we are testing for significant differences in the variances.

1) State the null hypothesis that the two variances we are comparing are from the same population (i.e. they are not statistically different)

2) Calculate the F value (the ratio of the two variances)

3) Look up the table value of F for the degrees of freedom used to calculate both variances and for a given confidence level.

4) If the calculated F is greater than the table value, reject the null hypothesis. Otherwise, the two could have come from the same population of measurements.

Page 33:

T-statistic

T-statistic = Difference in the means / ‘Sum’ of the standard errors

$$t = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{\dfrac{S_A^2}{n_A} + \dfrac{S_B^2}{n_B}}}$$

Page 34:

Assumptions about the T-statistic

You have normally distributed data
– You might not have.
– You can’t use this test if you don’t.

(there are ways to investigate whether a population is likely to be normal, but they are beyond this course)

- Important to state the assumptions you are making

Page 35:

Confidence

• Need something to compare it with

• How certain do you need to be that the means are the same?

• 90% - 1 in 10 you’ll be wrong

• 95% - 1 in 20 you’ll be wrong

• 99%– 1 in 100 you’ll be wrong

Page 36:

What about if more than 2 samples?

- ANOVA The ANalysis Of VAriance (or ANOVA) is a powerful and common statistical procedure

The t-test tells us if the variation between two groups is "significant". Why not just do t-tests for all the pairs of samples?

Multiple t-tests are not the answer because as the number of groups grows, the number of needed pair comparisons grows quickly. For 7 groups there are 21 pairs.

ANOVA puts all the data into one number and gives us one P for the null hypothesis.

Page 37:

http://www.youtube.com/watch?v=BX9iMIC6mcg

Hypotheses Testing (2 sample t-test)

Setting up and performing an experiment - maybe need explanation of p-value (perhaps cover here!)

Page 38:

Hypothesis testing considerations

Do two samples come from the same population?

Z-test (for very large samples or when σ known)

t-test (for small samples when σ not known) Population should be Normally distributed

Page 39:

Example

Testing if strength of concrete changes if kept in a humidity chamber for a week

We want to find out if leaving concrete in a humidity chamber for a week changes the breaking strength of the concrete.

We have 5 samples to put in the chamber and 6 to leave out in normal drying conditions. Following the week of drying the 11 samples are strength tested.

The breaking strengths (in kN) are as follows:

23 17 22 33 25 (say, X)
26 19 29 37 24 32 (say, Y)

Is there a significant difference between the means for the two samples?

(assume underlying population is Normally distributed)

Page 40:

The breaking strengths are as follows:

21 17 21 33 23 (say, X)
27 25 29 37 24 29 (say, Y)

Is there a significant difference between the means for the two samples? (don’t assume same variances)

Solution

We have small samples with an unknown σ (population standard deviation)

=> Use the t-test

nx = 5, ny = 6

1) Find sample means: $\bar{X} = 23$, $\bar{Y} = 27$

2) Find sample standard deviation: Use Formula B (i.e. don’t assume same variances)

$$S_x^2 = \frac{\sum (X - \bar{X})^2}{n_x - 1}, \qquad S_y^2 = \frac{\sum (Y - \bar{Y})^2}{n_y - 1}$$

e.g.

$$S_x^2 = \frac{(21 - 23)^2 + (17 - 23)^2 + \dots}{5 - 1} = 36$$

$$S_y^2 = \frac{\sum (Y - \bar{Y})^2}{6 - 1} = 33.2$$

3) Degrees of freedom, v = nx + ny - 2 = 5 + 6 - 2 = 9

Which test statistic to use?

Page 41:

Is there a significant difference between the means for the two samples? (don’t assume same variances)

Use the t-test

nx = 5, ny = 6, $S_x^2 = 36.0$, $S_y^2 = 33.2$, degrees of freedom v = 9

H0: μx = μy (the population means are the same)

H1: μx ≠ μy (the population mean for X is not the same as the population mean for Y)

2TT: we are looking for any difference (at 5% significance, 2.5% in each tail)

$$t = \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}} = \frac{(24 - 27) - 0.0}{\sqrt{\dfrac{36.0}{5} + \dfrac{33.2}{6}}} = \frac{-3}{3.56} = -0.84$$

(μx - μy = 0: H0 assumes the means are equal)

[Figure: t-distribution showing the acceptance region for H0, the rejection regions for H0, and the critical level (from tables)]

From t tables (t-stat = 0.84): the one-tail probability is about 0.2, so P-value ≈ 0.2 × 2 = 40%

As the P-value is higher than the 5% significance level (i.e. 40% > 5%), do not reject H0 -> conclude that there is no significant difference in strength between the concrete that has been in the chamber and the concrete that has not (although might want to try with bigger samples!)

Could also have compared the probability of 0.2 from tables with 2.5% (i.e. 20% > 2.5%) and again would not have rejected H0 (this is just the same)

Page 42:

CIVE2602 - Engineering Mathematics 2.2

Lecture 9- Summary

Hypothesis testing is a very important technique, used very widely

We have looked at testing to see if there is a difference between 2 means (other properties can be tested as well)

When testing the means, perform an F-test on the variance first

For more than 2 samples we use ANOVA

Page 43:

Differences of Sample Means: known population variances (or estimated using very large samples)

If X and Y are normally distributed with means μx and μy and population standard deviations σx and σy, and independent random samples of size nx and ny are drawn from X and Y, then the difference between the sample means $\bar{X} - \bar{Y}$ is normally distributed with mean:

$$E(\bar{X} - \bar{Y}) = \mu_x - \mu_y$$

and variance

$$Var(\bar{X} - \bar{Y}) = var(\bar{X}) + var(\bar{Y}) = \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}$$

or standard deviation

$$SD(\bar{X} - \bar{Y}) = \sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}$$

Compare with:

$$SD(\bar{X}) = \sqrt{\frac{\sigma_x^2}{n_x}}$$

Page 44:

Choosing the null hypothesis

• “x does not affect y”
• “distribution is not different from normal”
• “means of 2 groups do not differ”
• could be “means of 2 groups differ by 3.1”
• NOT “means of 2 groups are different”

Allows a prediction to be made about expected value of a statistic

Page 45:

Summary of Tables in Notes

• P101 of notes - Normal tables -finds probabilities from z-values

• P102 of notes - Reverse Normal tables -finds z-statistics from probabilities.

• P103 of notes - t-tables (reverse) -finds t-statistic from probabilities (can use to look up probabilities from t-stat as well).

• P104 – F-tables, used for F-test on variances (not covered this yet)

Find probability for z = 2.12: prob = 0.0170
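The table lookups described above can be reproduced with Python's standard library, which is a convenient way to check a value read from the printed tables. This is a sketch; the function names are illustrative, and it assumes the Normal table in the notes gives the upper-tail probability, which matches the quoted value for z = 2.12.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def upper_tail(z):
    """Normal tables: probability of exceeding z."""
    return 1 - standard_normal.cdf(z)

def z_for_upper_tail(prob):
    """Reverse Normal tables: z-value with the given upper-tail probability."""
    return standard_normal.inv_cdf(1 - prob)

prob = upper_tail(2.12)
z_crit = z_for_upper_tail(0.025)  # 2.5% in each tail for a 5% two-tailed test

print(round(prob, 4), round(z_crit, 2))
```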

Page 46:

What are the assumptions of an independent samples t test?

• Assumptions.

For the equal-variance t test (Formula A):
- the observations should be independent, random samples from normal distributions with the same population variance.

For the unequal-variance t test (Formula B):
- the observations should be independent, random samples from normal distributions.

Page 47:

Hypothesis test of two sample means is also referred to as the ‘Student’ T test

• The t-test was developed by W. S. Gosset, who joined Guinness in 1899 and published the test in 1908

• Employed at the Guinness brewery in Dublin

• He was responsible for developing procedures for ensuring the similarity of batches of Guinness

• Had to publish anonymously so did so under the name ‘Student’

• T-test is sometimes referred to as "Student's" t-test. http://www.youtube.com/watch?v=snrajeiZLK0&feature=related

©Sami Keinanen 2005, sourced from http://commons.wikimedia.org/wiki/File:Guinness.jpg

Available under creative commons license

Page 48:

We want to find out if a new factory built near the river Aire has changed the level of pollution in the river.

Before the factory is built we take 30 samples and find the mean pollution level of these samples to be 21.51 and the standard deviation to be 4.3.

After the factory has been built a further 40 samples are taken from the river; the mean level is found to be 19.51 and the sample standard deviation is 3.10.

Q: Can we say at the 5% significance level that the level of pollution in the river has changed?

i.e. Is there a significant difference between the underlying pollution levels of the river on these two occasions?

WORK IN GROUPS OF 2-3
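One possible way to set up this exercise is sketched below. It assumes that 30 and 40 samples count as "large", so that the sample standard deviations stand in for σ and a two-sample z-test applies; that choice is an assumption, not the lecture's stated solution, and a t-test with Formula B would follow the same steps with t-tables instead.

```python
import math
from statistics import NormalDist

# Summary statistics from the question.
n_before, mean_before, sd_before = 30, 21.51, 4.3
n_after, mean_after, sd_after = 40, 19.51, 3.10

# H0: no change in the underlying pollution level (difference of means = 0).
standard_error = math.sqrt(sd_before ** 2 / n_before + sd_after ** 2 / n_after)
z = (mean_before - mean_after) / standard_error

# Two-tailed test: double the one-tail probability before comparing with 5%.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = p_two_tailed < 0.05

print(round(z, 2), round(p_two_tailed, 3), reject_h0)
```

Under these assumptions z is a little above the two-tailed 5% critical value of 1.96, so H0 would be rejected, though not by a wide margin.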

Page 49:

Summary of Procedure

• Compare your t-statistic with the one in the table

• Need to decide on
– the degrees of freedom
– the percentile you are trying to check within
– whether it is a 1 tail or a 2 tail comparison (usually use 2 tails)

Page 50:

T-Statistic Tables

Page 51:

3 variants of t-test

• Comparing 2 independent datasets
– Equal variances (use Formula A)
– Unequal variances (use Formula B)

• Matched pair comparison

Page 52:

Comparing 2 independent samples

• You have two different datasets looking at the same thing. Are the means significantly different?

e.g. Is patient recovery rate faster with drug A or drug B

e.g. Is the fuel economy better with Car Type A or Car Type B

e.g. Is Concrete with Add

Page 53:

Matched pair comparison

• You have two datasets looking at the same thing, with pairs that are linked. Are the means significantly different?

e.g. Do patients' temperatures change after you give them drug A?

Page 54:

Procedure for independent samples

• Find the mean of each group
• Find the number of observations in each group
• Find the standard deviation in each group
• Put the numbers into the formula
• Calculate the t-statistic
• Compare the t-statistic with the ones in the table
• Is your value higher than the critical one (you can reject the null) or lower (you can’t)?

$$t = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{\dfrac{S_A^2}{n_A} + \dfrac{S_B^2}{n_B}}}$$

Page 55:

Analysis of lots of data

• Measure size of leaves on maple trees from 2 different sites

• Are the means the same at all the sites?

• Perform a t-test

Page 56:

Analysis of lots of data

• Measure size of leaves on maple trees from 10 different sites

• Could perform t-tests between all the combinations

• 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8,

1-9, 1-10

• 2-3, 2-4, 2-5, 2-6,2-7…..

• 3-4, 3-5, 3-6, 3-7,3-8,

Page 57:

Analysis of lots of data

• 45 different combinations

• Testing at 95% confidence, you will find significant differences between the samples just because of the number of tests you are doing.

• Not the best way of doing this!!!!
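The scale of the problem is easy to quantify. The sketch below counts the pairwise comparisons and estimates the chance of at least one false positive; the second function assumes the tests are independent, which pairwise tests sharing samples are not, so its output is a rough guide rather than an exact figure. The function names are illustrative.

```python
from math import comb

def pairwise_tests(groups):
    """Number of distinct pairs of groups, i.e. t-tests needed."""
    return comb(groups, 2)

def chance_of_false_positive(n_tests, alpha=0.05):
    """P(at least one 'significant' result when every H0 is true), assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests

tests_for_10_sites = pairwise_tests(10)
risk = chance_of_false_positive(tests_for_10_sites)

print(tests_for_10_sites, round(risk, 2))
```

With 45 tests at the 5% level the estimated chance of at least one spurious "significant" difference is around 90%, which is why a single ANOVA is preferred.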

Page 58:

Fisher

Sir Ronald Aylmer Fisher

Geneticist

‘Fisher was a genius who almost single-handedly created the foundations for modern statistical science’ Hald, Anders (1998). A History of Mathematical Statistics. New York: Wiley.

"I occasionally meet geneticists who ask me whether it is true that the great geneticist R.A. Fisher was also an important statistician" (Annals of Statistics, 1976).

Fisher didn’t believe that smoking caused lung cancer.

Page 59:

Fisher’s analysis

• If we have several samples taken from identical populations we can estimate the standard deviation in 2 ways

1. The spread of values within the samples

2. The spread of the means of the samples

• Providing the samples are from identical populations, the two ways should give approximately the same result
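The two estimates can be sketched in a few lines. The code works with variances (take the square root for a standard deviation), assumes equal-sized samples, and uses made-up data purely for illustration; the function name is hypothetical. The between-samples estimate uses the fact that the variance of a sample mean is σ²/n, so n times the variance of the means estimates σ².

```python
from statistics import mean, variance

def two_variance_estimates(samples):
    """Return (within, between) estimates of the population variance.

    Assumes every sample has the same size n.
    """
    n = len(samples[0])
    within = mean([variance(s) for s in samples])          # spread inside each sample
    between = n * variance([mean(s) for s in samples])     # spread of the sample means
    return within, between

# Hypothetical data: three samples of three observations each.
samples = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
within, between = two_variance_estimates(samples)
ratio = between / within

print(within, between, ratio)
```

If the samples really come from identical populations the ratio should be close to 1; a ratio well above 1 suggests the sample means differ by more than chance alone would produce, and that ratio is what ANOVA compares against the F tables.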

Page 60:

http://www.youtube.com/watch?v=abjHpJ36pIE

http://www.youtube.com/watch?v=snrajeiZLK0&feature=related

T-test information

http://www.youtube.com/watch?v=ZFXy_UdlQJg&feature=related

http://www.youtube.com/watch?v=N4aHi_g0vgQ&feature=related

http://www.youtube.com/watch?v=BX9iMIC6mcg

http://www.youtube.com/watch?v=ZFXy_UdlQJg&NR=1

P-value-use

Hypothesis test-use