The T Distribution ©Dr. B. C. Paul 2005

Apr 01, 2015


Haylie Bruff
Transcript
Page 1: The T Distribution

©Dr. B. C. Paul 2005

Page 2: Wasn’t the Herby Assembly Line Problem Fun?

But there is one little problem.

We knew that our mean value could have been all over the map relative to the real true mean.

We calculated our standard deviation from the same sample.

How come our mean could be anything, and yet our standard deviation is God’s own value for the standard deviation?

Page 3: It Isn’t

When our value for the standard deviation is just an estimate, we have another chance for things to be way out in the tails.

Sadisticians (whoops, I mean statisticians) figured out the probability distribution for what would happen then.

They called it the T distribution.

It was first published in 1908 and perfected in 1926.

We look up values for areas under the curve of a T distribution just like we did with a normal distribution.

Page 4: Let’s Redo Herby’s Problem Right This Time

We will use the T distribution.

$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

Here $\bar{X}$ is the sample mean, $\mu$ is the true mean being tested, and $n$ is the number of samples.

$s$ is the estimated standard deviation.

The test statistic has a T distribution (assuming the underlying population really is normally distributed).

The distribution has n-1 degrees of freedom.
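As a rough illustration of the test statistic described above, here is a minimal Python sketch that computes t = (sample mean - hypothesized mean) / (s / sqrt(n)) from raw measurements. The function name `t_statistic`, the `readings` list, and the hypothesized mean of 4.5 are assumptions made up for this example; they are not Herby's data.

```python
import math
import statistics


def t_statistic(sample, hypothesized_mean):
    """Return (t, degrees_of_freedom) for a one-sample t statistic."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # statistics.stdev divides by n - 1, so it is the *estimated* standard deviation.
    s = statistics.stdev(sample)
    t = (sample_mean - hypothesized_mean) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements, invented for illustration only.
readings = [4.0, 3.5, 4.5, 3.0, 4.0]
t, dof = t_statistic(readings, hypothesized_mean=4.5)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```

The function returns the degrees of freedom alongside t because both are needed to read the T table.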

Page 5: Degrees of Freedom! What are you talking about? This isn’t an Amnesty International class

Consider the number of equations and the number of unknowns.

To uniquely solve for 3 unknowns you need 3 independent equations.

Each sample is like an equation.

If I have one sample, I first use it as an estimate of the mean.

I can’t calculate a standard deviation; I don’t have enough data.

If I have two samples, I can estimate the standard deviation and still have one degree of freedom to measure something else.

That something else happens to be the mean.

How much extra data do I have above the bare minimum?
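The counting argument can be checked with a small Python sketch. It assumes the standard library's `statistics.stdev`, which uses the n-1 divisor and refuses to run on a single data point. The helper name `describe` and the sample values are hypothetical.

```python
import statistics


def describe(sample):
    """Summarize a sample, reporting None when no standard deviation can be estimated."""
    n = len(sample)
    try:
        s = statistics.stdev(sample)  # needs at least two data points
    except statistics.StatisticsError:
        s = None
    return {
        "n": n,
        "mean": statistics.mean(sample),
        "stdev": s,
        "degrees_of_freedom": n - 1,
    }


# Hypothetical values, invented for illustration only.
one = describe([4.0])        # one sample: a mean, but no standard deviation
two = describe([4.0, 3.6])   # two samples: one degree of freedom
print(one)
print(two)
```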

Page 6: So How Do I Use This? (I have a really bad feeling you’re going to tell me)

Note that this table is set up differently from the table of Z values for the normal distribution.

The area under the curve comes from the top line.

The degrees of freedom come from the side.

The value in the middle is the T value (equivalent to the Z value).

Remember, in the normal table the Z value was on the edge and the area under the curve was in the middle of the table.

Page 7: Let’s Do the Problem

$$\bar{X} \pm t \cdot \frac{s}{\sqrt{n}}$$

$\bar{X} = 3.8$, $s = 0.73$, $n = 7$

OK, so what is t?

Page 8: Finding t

If we do this as a two tailed test (i.e., we would be concerned if our balls were too hard or too soft), we can only have 2.5% in each tail.

Pick 97.5.

We have 7 samples, hence n-1 or 6 degrees of freedom.

Read into the table: 2.45
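For readers without a printed table, the lookup can be approximated numerically. The sketch below is not how the table in the slides was produced. It is one possible approach, assuming Simpson's rule on the Student's t density plus bisection, using only the Python standard library. The function names `t_pdf`, `t_cdf`, and `t_value` are hypothetical.

```python
import math


def t_pdf(x, dof):
    """Density of Student's t distribution with `dof` degrees of freedom."""
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1) / 2 * math.log1p(x * x / dof))


def t_cdf(x, dof, steps=2000):
    """Area under the t curve to the left of x (for x >= 0), by Simpson's rule."""
    # By symmetry the area left of 0 is 0.5; integrate the rest from 0 to x.
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, dof) + t_pdf(x, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return 0.5 + total * h / 3


def t_value(area, dof):
    """T value whose cumulative area equals `area` (0.5 <= area < 1), by bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, dof) < area:  # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, dof) < area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Two tailed test at 5%: 97.5% column, 7 samples so 6 degrees of freedom.
print(round(t_value(0.975, 6), 2))
```

The `area` argument plays the role of the column along the top of the table and `dof` the row down the side.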

Page 9: Plug and Chug

$$\text{Upper Limit} = 3.8 + 2.45 \cdot \frac{0.73}{\sqrt{7}} = 4.48$$

We can still reject the null hypothesis with an alpha level of 5%, but it is now much closer than before.
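The arithmetic can be reproduced in a few lines of Python. The inputs (mean 3.8, s = 0.73, n = 7, t = 2.45) are the values from the slides. The variable names, and the lower limit computed for symmetry, are additions for illustration.

```python
import math

sample_mean = 3.8  # X-bar from the slides
s = 0.73           # estimated standard deviation
n = 7              # number of samples
t = 2.45           # table value for 97.5% and 6 degrees of freedom

standard_error = s / math.sqrt(n)
margin = t * standard_error

upper_limit = sample_mean + margin
lower_limit = sample_mean - margin  # not shown on the slide; the other side of the interval

print(f"{lower_limit:.2f} to {upper_limit:.2f}")
```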

Page 10: Some Observations About Degrees of Freedom and the T Statistic

95% of a normal distribution is within 1.96 standard deviations of the mean.

95% of a T distribution is within 2.45 estimated standard deviations of the mean if the standard deviation estimate came from 7 samples.

With 20 samples it is 2.09 estimated standard deviation units.

With 50 samples it is 2.01.

With 100 samples it is 1.98.

With 500 samples it is 1.96.

Note that as the number of samples increases, the T distribution converges to a normal distribution.
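To see the convergence in numbers, the sketch below compares the T values quoted above with the normal value obtained from Python's `statistics.NormalDist`. The T values are copied from the slides rather than computed. The "percent wider" measure is an illustrative choice for this example, not something from the lecture.

```python
from statistics import NormalDist

# Z value that leaves 2.5% in the upper tail of a standard normal distribution.
z = NormalDist().inv_cdf(0.975)

# T values quoted in the slides, keyed by number of samples.
t_by_samples = {7: 2.45, 20: 2.09, 50: 2.01, 100: 1.98, 500: 1.96}

widening = []
for n, t in t_by_samples.items():
    extra = (t / z - 1) * 100  # how much wider the interval is than the normal one
    widening.append(extra)
    print(f"n = {n:3d}: t = {t:.2f}, about {extra:.0f}% wider than normal")
```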

Page 11: So When Do I Use a T Distribution?

The underlying population must be realistic to model as having a normal distribution.

The standard deviation of the population must have been estimated from a standard deviation calculation using a sample of the population.

You can get out of using the T distribution and pretend that God gave you the standard deviation if you used about 100 or more samples to calculate your estimate of the standard deviation.

People with a lot of experience with a distribution often ignore the T distribution completely because they have seen results from hundreds of samples.

They are not “doing it wrong” using a simple normal distribution if they have that kind of data supporting their standard deviation value.

Page 12: Why Did You Do a Two Tailed Test?

Herby was going bananas because he thought the line might be putting out soft balls.

That sounds to me like he is only concerned about 1 side of the distribution.

We may be upset about one particular thing, but that doesn’t mean nothing else is important.

One problem with things that are too hard is that they are often brittle.

Premature ball failure could be due to the balls being too soft or breaking up because they are too hard.

We have to ask our own case-specific question about what we are concerned about. You plan a one tailed test only if you are concerned about events on just one tail.
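Once that choice is made, it fixes which column of the T table to read. A minimal sketch, assuming the table is indexed by cumulative area as in the earlier lookup (97.5 for a two tailed test at 5%); the function name `area_to_look_up` is hypothetical.

```python
def area_to_look_up(alpha, two_tailed):
    """Cumulative area (table column) for a given alpha level and number of tails."""
    tails = 2 if two_tailed else 1
    # A two tailed test splits alpha evenly between the two tails.
    return 1 - alpha / tails


print(area_to_look_up(0.05, two_tailed=True))   # column for a two tailed test at 5%
print(area_to_look_up(0.05, two_tailed=False))  # column for a one tailed test at 5%
```

The decision between the two calls belongs in the test plan, before any data is collected.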

Page 13: Common Cheating on Random Samples

Experiments should be planned before we look at the data.

If we look at the data and then decide what the experiment should have been, we are “political spin doctors”, not scientists.

A spin doctor looks at a result and then tries to make it say what he wants.

A scientist sets up the test and lets the truth be whatever it is.

Often we had a theory that made us want to look deeper.

Many theories are based on observations.

But the scientific method causes you to then plan an experiment and go out and get the data you need to test the theory.

It’s a subtle difference, but it’s often ignored.

The doctrine of “political correctness” is causing us all to lose our integrity.

Page 14: Back to Herby and the Two Tailed Test

If it is true that hard balls make no difference, only soft ones, then the test should have been set up as one tailed only.

If the concern was the line being out of spec and that causing unhappy customers, we could not know the sample would come out below 4.5 unless we peeked first.

If at that point we decided we only cared about soft balls, we distort the reliability of our analysis.

The data would have not only determined what the values of the test statistics were; it would have determined the test.

Normal distribution theory only accounts for the data determining the test statistic.

We in fact do not have good models for exactly what the consequences are if we let the data set up the test. We can say we are taking a chance of something bad happening.

Page 15: My Choice

So why did I do this example as a two tailed test?

1- Because that sample size analysis I did is nastier to explain if I’m only working on one side.

2- Because it sets up a great discussion on random samples and peeking and cherry picking data.

3- Because it allowed me to discuss when I should run one and two tailed tests.

The story problem as told is inconclusive about whether Herby was vulnerable to the line being out of spec on one side only or on both sides.

Page 16: Look at the Problems We Have Run So Far

We looked at a storm washing out the drainage system in a subdivision.

Only too much rain would create the disaster. We really only were worried about rain events that were too big.

(And we ran a one tailed test on the upper side.)

We looked at a mine and the amount of ore below cut-off grade that would go to the dump.

We aren’t going to dump our high grade ore. We really only care about how much stuff is on the lower end.

(And we ran a one tailed test on the lower side.)

We looked at tolerance on a machined part.

The spec said we had to be plus or minus, so our customer would be upset if the pegs were too big or too little.

(And we ran a two tailed test.)

Determine whether to run a one or two tailed test based on the concerns for the process or design you are working on, not from peeking at the data.