Introduction to the t Statistic

Introduction to the t Statistic. What were the formulae used for the z statistic? z = (M – μ) / σ_m, σ_m = σ/√n, σ = √(SS/n)

Mar 26, 2015

Introduction to the t Statistic

What were the formulae used for the z statistic?

z = (M – μ) / σ_m

σ_m = σ/√n

σ = √(SS/n)
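The three z formulae above can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers in the usage lines are hypothetical, not taken from the slides.

```python
import math

def population_sd(scores):
    """sigma = sqrt(SS / n), treating the scores as the whole population."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)  # sum of squared deviations
    return math.sqrt(ss / n)

def z_statistic(sample_mean, mu, sigma, n):
    """z = (M - mu) / sigma_m, where sigma_m = sigma / sqrt(n)."""
    sigma_m = sigma / math.sqrt(n)  # standard error of the mean
    return (sample_mean - mu) / sigma_m

# Hypothetical values: M = 10, mu = 15, sigma = 3, n = 9
print(z_statistic(10, 15, 3, 9))    # sigma_m = 1, so z = -5.0
print(population_sd([2, 4, 6]))     # sqrt(8 / 3)
```

Note that both functions need a population value (σ, or the full set of population scores), which is exactly what is usually missing in practice.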

What Do You Notice About These Formulae?

They are based on population parameters
How often do you think these population parameters are known?
Well, we have said that the sample mean is usually a good estimate of the population mean
Therefore finding μ is not generally a problem we worry about
What about σ and σ_m?
These are usually unknown, so we cannot simply plug them into the z formula

So, What Do We Do?

When we do not know the population variability, we use t tests

The Story…

The t statistic was introduced by William Sealy Gosset for cheaply monitoring the quality of beer brews. "Student" was his pen name. Gosset was a statistician for the Guinness brewery in Dublin, Ireland, and was hired due to Claude Guinness's innovative policy of recruiting the best graduates from Oxford and Cambridge to apply biochemistry and statistics to Guinness' industrial processes. Gosset published the t test in Biometrika in 1908, but was forced to use a pen name by his employer, who regarded the fact that they were using statistics as a trade secret. In fact, Gosset's identity was known not only to fellow statisticians but to his employer; the company insisted on the pseudonym so that it could turn a blind eye to the breach of its rules.

From Wikipedia

… more on Gosset…

Gosset was a chemist and was responsible for developing procedures for ensuring the similarity of batches of Guinness. The t-test was developed as a way of measuring how closely the yeast content of a particular batch of beer corresponded to the brewery's standard.

From http://ccnmtl.columbia.edu/projects/qmss/t_about.html

Why t-test?

Student's distribution arises when (as in nearly all practical statistical work) the population standard deviation is unknown and has to be estimated from the data.

Textbook problems treating the standard deviation as if it were known are of two kinds:

(1) those in which the sample size is so large that one may treat a data-based estimate of the variance as if it were certain, and

(2) those that illustrate mathematical reasoning, in which the problem of estimating the standard deviation is temporarily ignored because that is not the point that the author or instructor is then explaining.

From Wikipedia

The t Statistic

The t statistic is used to test hypotheses about an unknown population mean (μ) when the value of σ is unknown. The formula for the t statistic has the same structure as the z-score formula, except that the t statistic uses the estimated standard error in the denominator.

t = (M – μ)/s_m

s_m = s/√n = √(s²/n)

s = √[SS/(n-1)] = √(SS/df)
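As a minimal sketch, the three t formulae above translate directly into Python. The function names and the tiny data set in the usage line are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
import math

def sum_of_squares(scores):
    """SS: sum of squared deviations from the sample mean M."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def estimated_standard_error(scores):
    """s_m = s / sqrt(n), with s = sqrt(SS / df) and df = n - 1."""
    n = len(scores)
    df = n - 1
    s = math.sqrt(sum_of_squares(scores) / df)  # sample standard deviation
    return s / math.sqrt(n)

def t_statistic(scores, mu):
    """t = (M - mu) / s_m."""
    mean = sum(scores) / len(scores)
    return (mean - mu) / estimated_standard_error(scores)

# Hypothetical sample: M = 4, SS = 8, s = 2, s_m = 2 / sqrt(3)
print(t_statistic([2, 4, 6], mu=3))  # (4 - 3) / (2 / sqrt(3)) = sqrt(3) / 2
```

The only change from the z computation is the denominator: the sample supplies s (dividing SS by df rather than n), so no population value of σ is needed.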

What is the Estimated Standard Error (s_m)?

The estimated standard error (s_m) is used as an estimate of the real standard error (σ_m) when the value of σ is unknown. It is computed from the sample variance or sample standard deviation and provides an estimate of the standard distance between a sample mean M and the population mean μ.

What Are The Degrees of Freedom (df)?

Degrees of freedom describe the number of scores in a sample that are independent and free to vary. Because the sample mean places a restriction on the value of one score in the sample, there are n-1 degrees of freedom for the sample.
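One way to see the restriction is to fix the sample mean and let n-1 scores vary: the last score is then forced. The sketch below uses a hypothetical helper and made-up numbers purely for illustration.

```python
def forced_last_score(free_scores, sample_mean, n):
    """Given n - 1 freely chosen scores and the sample mean M,
    return the one remaining score that the mean dictates."""
    if len(free_scores) != n - 1:
        raise ValueError("expected exactly n - 1 free scores")
    # All n scores must sum to n * M, so the last score is whatever is left.
    return n * sample_mean - sum(free_scores)

# Hypothetical sample of n = 3 with M = 4: once 2 and 4 are chosen,
# the third score has to be 6.
print(forced_last_score([2, 4], sample_mean=4, n=3))  # 6
```

Only n-1 of the scores carry independent information about variability, which is why SS is divided by df = n-1 when computing s.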

Describe the Shape of the t Distribution

The t distribution is leptokurtic, but as df gets larger it more closely resembles the normal curve
This is due to the fact that s_m more closely estimates σ_m when the df gets very large
Once df is sufficiently large, t is distributed approximately as z
What is the two-tailed critical value (t_crit) for α = .05 and df = 6?
• 2.447
One tailed?
• 1.943
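Critical values are normally read from a t table. As a rough check, they can also be approximated numerically. The sketch below is one possible approach, not the method used in the slides: it integrates the t density with Simpson's rule and finds the cutoff by bisection, using only the standard library. All function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(alpha, df, tails=2):
    """Positive critical value, located by bisection on the CDF.
    Assumes the answer lies below 100, which holds for ordinary alpha and df."""
    target = 1 - alpha / tails
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"{t_critical(0.05, 6):.3f}")           # two-tailed, df = 6: 2.447
print(f"{t_critical(0.05, 6, tails=1):.3f}")  # one-tailed, df = 6: 1.943
```

Trying a large df (for example 1000) gives a two-tailed value close to the familiar z cutoff of 1.96, which illustrates the convergence toward the normal curve.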

How Did Gosset Use His Test?

He had to find out if the beer that was brewed met the brewery standards for the yeast content
First, he would take samples of the beer from each vat and determine the yeast content
With these data he would know the desired yeast content (μ) as set by factory standards, the mean yeast content for the samples (M), and the sample standard deviation (s) for yeast content

Let's See What Gosset Might Have Seen…

What if there are usually 15 grams of yeast per bottle of Guinness?
We take nine samples of beer from a vat and we get readings of {7, 12, 11, 15, 7, 8, 15, 9, 6}
Does this vat have a significantly different (α = .05) level of yeast than what Guinness wants?

Step 1: State Your Hypotheses

Null: H0: μ = 15
Alternative: H1: μ ≠ 15
State your alpha: α = .05

Step 2: Find t_crit

First find the df
• df = n – 1 = 9 – 1 = 8
Find the two-tailed critical t value for df = 8 and α = .05
• t_crit = ±2.306

Step 3: Sample Data and Test Statistics

Mean = 10
SS = 94
s² = 11.75
s = 3.43
s_m = 1.14

t = (M – μ)/s_m = (10 – 15)/1.14 = -4.39
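The Step 3 values can be reproduced from the nine readings with Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n-1. This is a sketch for checking the arithmetic; the variable names are arbitrary. Carried at full precision, t comes out near -4.38; the slide's -4.39 follows from dividing by the rounded s_m = 1.14.

```python
import math
import statistics

readings = [7, 12, 11, 15, 7, 8, 15, 9, 6]  # yeast readings from the vat
mu = 15                                     # value under the null hypothesis

n = len(readings)
mean = statistics.mean(readings)              # M
ss = sum((x - mean) ** 2 for x in readings)   # SS
variance = statistics.variance(readings)      # s^2 = SS / (n - 1)
s = statistics.stdev(readings)                # s = sqrt(s^2)
s_m = s / math.sqrt(n)                        # estimated standard error
t_obs = (mean - mu) / s_m                     # observed t, unrounded

print(f"M = {mean}, SS = {ss}, s^2 = {variance}")
print(f"s = {s:.2f}, s_m = {s_m:.2f}, t = {t_obs:.2f}")
```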

Step 4: Make a Decision

Is our observed t (t_obs) greater than, or less than, the critical value for t (t_crit)?
Therefore we make the decision
• Reject H0: t(8) = -4.39, p < .05
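The two-tailed decision rule compares the size of the observed t with the critical value, ignoring sign. A minimal sketch follows; the function name is hypothetical, and the critical value 2.306 (df = 8, α = .05, two-tailed) is taken from a standard t table.

```python
def two_tailed_decision(t_obs, t_crit):
    """Reject H0 when the observed t falls in either tail beyond t_crit."""
    return "reject H0" if abs(t_obs) > t_crit else "fail to reject H0"

T_CRIT_DF8 = 2.306  # table value for df = 8, alpha = .05, two-tailed

print(two_tailed_decision(-4.39, T_CRIT_DF8))  # reject H0
```

Because |-4.39| is larger than 2.306, the vat's mean yeast content differs significantly from 15.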

Measuring Effect Size

How did we measure effect size before?
• Mean difference over standard deviation
Therefore…
• Here, estimated d = mean difference / sample standard deviation

Measuring Effect Size (Take Two!)

We can measure effect size by looking at the proportion of variance accounted for
This is sometimes called PRE, or Proportional Reduction in Error
Two ways of calculating this
1. Variability accounted for / total variability
2. r² = t²/(t² + df)

Effect Size

Cohen’s d = mean difference/standard deviation
• 5/3.43 = 1.46
r² = Variability accounted for / total variability
• r² = t²/(t² + df)
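Both effect-size measures can be checked with a short sketch. The function names are hypothetical; the inputs are the rounded values reported in the slides (M = 10, μ = 15, s = 3.43, t = -4.39, df = 8). The slides do not state the resulting r², so the value printed here is simply what the formula yields for those inputs.

```python
def estimated_d(sample_mean, mu, s):
    """Estimated Cohen's d: size of the mean difference in units of s.
    The absolute value is used, matching the slide's 5 / 3.43."""
    return abs(sample_mean - mu) / s

def r_squared(t, df):
    """Proportion of variance accounted for: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

print(f"d   = {estimated_d(10, 15, 3.43):.2f}")  # 1.46
print(f"r^2 = {r_squared(-4.39, 8):.2f}")        # 0.71
```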

Confidence Intervals

Point Estimate
Interval Estimate
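The slide gives only the two labels, so the following is a sketch under assumptions rather than the slide's own worked example. It uses the standard interval for a mean when σ is unknown, M ± t_crit × s_m, applied to the nine yeast readings, with the table value 2.306 for 95% confidence at df = 8. The function name is hypothetical.

```python
import math
import statistics

def confidence_interval(scores, t_crit):
    """Interval estimate of mu: M plus or minus t_crit * s_m."""
    n = len(scores)
    m = statistics.mean(scores)                     # point estimate of mu
    s_m = statistics.stdev(scores) / math.sqrt(n)   # estimated standard error
    margin = t_crit * s_m
    return m - margin, m + margin

readings = [7, 12, 11, 15, 7, 8, 15, 9, 6]
low, high = confidence_interval(readings, t_crit=2.306)  # 95%, df = 8
print(f"point estimate: {statistics.mean(readings)}")
print(f"95% interval estimate: {low:.2f} to {high:.2f}")
```

The interval does not contain 15, which agrees with rejecting H0: μ = 15 in the two-tailed test at α = .05.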

Directional Hypotheses

When is a directional hypothesis justified?
• When there is clear theoretical support for a one-tailed test.
• This is done through a literature review of past findings, not simply well thought out logic
What are examples of directional hypotheses?
How do we use directional hypotheses?
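As one possible illustration of how a directional hypothesis changes the decision rule, the sketch below places the whole rejection region in a single tail. The function name and the example hypothesis (H1: μ < 15) are assumptions made for illustration; 1.860 is the one-tailed table value for df = 8 and α = .05.

```python
def one_tailed_decision(t_obs, t_crit, direction):
    """Decision for a directional test.
    direction = "less"    tests H1: mu < mu0 (reject when t_obs < -t_crit)
    direction = "greater" tests H1: mu > mu0 (reject when t_obs > t_crit)
    """
    if direction == "less":
        reject = t_obs < -t_crit
    elif direction == "greater":
        reject = t_obs > t_crit
    else:
        raise ValueError("direction must be 'less' or 'greater'")
    return "reject H0" if reject else "fail to reject H0"

T_CRIT_ONE_TAILED_DF8 = 1.860  # table value for df = 8, alpha = .05, one-tailed

# Hypothetical directional claim: the vat has LESS yeast than the standard.
print(one_tailed_decision(-4.39, T_CRIT_ONE_TAILED_DF8, "less"))     # reject H0
# The same t offers no support for the opposite direction.
print(one_tailed_decision(-4.39, T_CRIT_ONE_TAILED_DF8, "greater"))  # fail to reject H0
```

The one-tailed critical value is smaller than the two-tailed one, so a directional test rejects more easily in the predicted direction and cannot reject at all in the other, which is why the direction must be justified before looking at the data.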