▶ We explored the fundamental idea of Frequentist statistics, namely that we live in an uncertain world, and each measurement we make of it is drawn randomly from some (unspecified) Probability Distribution Function, or PDF.
▶ If we know the shape of a PDF, we can compute ways of characterising it – for example, by computing its mean and median, or standard deviation and interquartile range.
HT 2018 Statistics Lecture 2 — Introduction 3
However...
▶ When we do experiments, we make one or more measurements of an unknown quantity. We don’t know what the PDF of the unknown quantity looks like (otherwise there would be no point in doing the experiment!)
▶ As we repeat the experiment more and more times, we are drawing samples at random from the underlying PDF. (This is often referred to as “simple random sampling”.)
▶ We want to infer as much as we can about the properties of the underlying distribution as a whole based on this sample.
▶ Things are complicated by the fact that there are, in general, infinitely many distributions that the data could have come from!
An example
Let’s consider the height of people in the UK. Population data shows that, ignoring sex, on average our height is normally distributed (with µ = 1686 mm, σ = 98.89 mm):
Imagine I pick five people at random from this room, measure them, and obtain their heights as xi = 1589, 1565, 1529, 1823, 1694 mm. I’d like to estimate the population mean µ (and ideally the standard deviation σ) from these five numbers. It turns out that the best I can do is to estimate them using the constructs
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\]
where n is the number of samples I have taken (here, 5).
What happens if I ask more people to stand up, and measure them? Or what if I tell those people to sit down, and measure another five instead? My values for x̄ and s will change. Let’s do this a few times and make a histogram of the values of x̄. This histogram is an estimate of what is known as the sampling distribution of the mean (and a histogram of s likewise estimates the sampling distribution of the standard deviation).
If I continue doing this, I get an idea of the distribution of the sample mean when I measure the height of five people: here’s the plot with 200 lots of samples of 5:
Now, clearly I’ve done this a bit strangely: if I measure the height of 5 × 20 000 people, I’d probably be much better off computing x̄ of all of them! What happens if I repeat the above, but draw samples containing 30 people instead of 5?
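The repeated-sampling procedure described above can be sketched numerically. This is a minimal simulation (in Python with numpy, standing in for whatever the lecture used; the seed and the exact checks are arbitrary choices, not from the slides):

```python
import numpy as np

# Population parameters from the slide: UK heights ~ N(1686, 98.89^2) mm
mu, sigma = 1686.0, 98.89
n, trials = 5, 200              # 200 lots of samples of 5, as in the plot

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice
samples = rng.normal(mu, sigma, size=(trials, n))

xbars = samples.mean(axis=1)          # one sample mean per lot of 5
ss = samples.std(axis=1, ddof=1)      # one sample SD per lot (n - 1 divisor)

# A histogram of xbars approximates the sampling distribution of the mean;
# its spread should be roughly sigma / sqrt(n)
print(xbars.std(ddof=1), sigma / np.sqrt(n))
```

Repeating with `n = 30` shows the sampling distribution of the mean tightening, exactly as the later plots suggest.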
William Sealy Gosset (New College, graduated in 1899, read chemistry and maths) was employed by the Guinness, Son and Co. brewery in Dublin straight out of university, initially doing something that we would perhaps regard as industrial biochemistry – systematically optimising beer quality given variable starting products and conditions.
The t-distribution
Guinness had a policy of employing Oxbridge graduates, as they “found before them an almost unexplored field lying open to investigation. A great mass of data was available or could easily be collected which would throw light on the relations, hitherto undetermined or only guessed at in an empirical way, between the quality of the raw materials of beer, such as barley and hops, the conditions of production and the quality of the finished article.”
Biometrika, Volume 30, Issue 3-4, 1 January 1939, pp. 210–250
After two years of training to be a Brewer, he set his mind to work trying to improve the production process, and was specifically interested in the sugar composition of malted barley, which affected the alcohol content of the final product (and hence the tax bill!).
He, and others in the firm, had difficulty coming to firm conclusions on matters such as whether or not the soil nitrogen content on a barley farm mattered, due to a lot of variation in the measurements and the fact that the sample sizes were necessarily low (owing to the limited availability of comparable barley farms in Ireland).
Naturally, Gosset tried to work out the shape of the sampling distributions I showed you above – and wrote an internal memo to the other brewers in Guinness, entitled “The application of the law of error to the work of the Brewery” (1904), detailing some of his progress.
He published the results in 1908, under the pseudonym “Student”.
Biometrika, Volume 6, Issue 1, 1 March 1908, pp. 1–25.
The t-distribution
▶ The trouble with investigating x̄ and s is that they depend on the problem at hand.
▶ One way to make all problems “look the same” is to standardise them, by computing the quantity
\[
\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}.
\]
If we know µ and σ, then this quantity follows a normal distribution of mean 0 and variance 1.
▶ This was known before Student, and most people assumed that s was a very good approximation for σ. This is true with “enough” samples.
▶ Student showed that if we don’t know σ, but know s, and the data really are sampled from a normal distribution, then the quantity
\[
Z = \frac{\bar{x} - \mu}{s/\sqrt{n}}
\]
follows a different distribution, which has since become known as the t-distribution. NB: some authors call this T or t, and use Z for the case where n → ∞. To try and be concise, I’m calling it Z regardless of n.
▶ (Quantities like Z have a specific name – formally, Z is known as a test statistic.)
▶ t depends on a parameter, known as the number of degrees of freedom, ν, which here is n − 1. As ν → ∞, the t-distribution becomes the normal distribution with mean 0 and variance 1.
▶ For a small number of degrees of freedom, t is “broader” than the corresponding Gaussian, and has fatter tails.
▶ The full analytic form for t is mildly hairy, but computers are very good at providing numbers from it should we need them:
\[
p_t(x, \nu) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)}
\left(1 + \frac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}},
\qquad \text{where } \Gamma(\nu) = \int_0^\infty x^{\nu-1} e^{-x}\, dx
\]
The t-distribution with ν degrees of freedom looks like this:

[Figure: probability density of the t-distribution for ν = 1, 2, 3, 40, and ∞; the curves broaden and gain fatter tails as ν decreases.]
[Figure: for ν = 1, histograms of the sample mean height (mm), the sample standard deviation of height (mm), and Z, with the t-distribution shown over −10 < Z < 10.]
[Figure: for ν = 4, histograms of the sample mean height (mm), the sample standard deviation of height (mm), and Z, with the t-distribution shown over −10 < Z < 10.]
[Figure: for ν = 40, histograms of the sample mean height (mm), the sample standard deviation of height (mm), and Z, with the t-distribution shown over −10 < Z < 10.]
Now, the thing about zα/2 is that it’s just a number, chosen to divide the area under the curve as shown, so that most of it lies within a particular region. Specifically, for some number α known as the “significance level” (which is typically chosen to be 5%), we want
\[
\int_{-z_{\alpha/2}}^{z_{\alpha/2}} p(t, \nu)\, dt = 1 - \alpha
\]
We can’t do this integral by hand very easily, but a computer can. In R, zα/2 is given by qt(1-alpha/2, df=n-1), where you fill in n and alpha to taste.
This means that if I repeated everything again and again, 95% of the time the population mean would lie within this interval. In other words, a (1 − α) × 100% confidence interval is an interval calculated using a procedure such that it will contain the true value (1 − α) × 100% of the times you use it; the rest of the time you will be unlucky.
In R, it’s easy to generate a whatever-precision-you-like confidence limit, e.g. for the upper 95% limit:
mean(data) + qt(0.975, df=length(data)-1) * sd(data) / sqrt(length(data))
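An equivalent sketch in Python (scipy’s t quantile function standing in for R’s qt; the data are the five height measurements from the earlier example):

```python
import numpy as np
from scipy import stats

data = np.array([1589, 1565, 1529, 1823, 1694], dtype=float)  # mm, from the slide
n = len(data)

alpha = 0.05
z = stats.t.ppf(1 - alpha / 2, df=n - 1)   # plays the role of qt(1-alpha/2, df=n-1)
sem = data.std(ddof=1) / np.sqrt(n)        # estimated standard error of the mean

lower = data.mean() - z * sem
upper = data.mean() + z * sem
print(lower, upper)   # the 95% confidence interval for the population mean
```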
If we set zα/2 to one, we obtain an estimator for the standard error on the mean, or SEM, which is the standard deviation of the mean’s sampling distribution.
▶ In the language of statistics, sex – here a variable relevant to the quantity at hand – is called a factor, and its levels are “male” and “female”. Colloquially we may refer to them as groups.
▶ Note that we can’t “rank” or “order” these levels; consequently sex is called a categorical variable.
▶ (Sometimes we do have categorical variables that we might be able to rank, like the classic ‘Strongly agree, Agree, ..., Strongly disagree’ scale that you might have seen before. These are known as ordinal variables, as they can be ordered.)
Factors
Because we often have many factors that may influence a particular experiment, it’s much more common to see factors plotted on an x axis, e.g. in a box plot:
[Figure: annotated example box plot, labelling the largest non-extreme value (typically within 1.5 × IQR), upper quartile, median, lower quartile, smallest non-extreme value, and an extreme value plotted as an individual point.]
[Figure: box plots of height (mm) for the levels M and F of the factor sex.]
FactorsBecause we often have many factors that may influence aparticular experiment, it’s much more common to see factorsplotted on an x axis, e.g. in a box plot:
Med
ian
Thyr
otro
pin
Leve
l(m
U/li
ter)
3.0
2.5
1.5
1.0
2.0
0.5
0.0No Thyroxine
TreatmentThyroxineTreatmentwith Low
ThyrotropinLevel
Treatment withThyroxine
andOmeprazole
Treatment withHigher-Dose
Thyroxineand
Omeprazole
HT 2018 Statistics Lecture 2 — t-tests 38
Statistical tests
A common question (asked by Student and many other people since!) is as follows:
H1: Given that I have measured a set of samples in both groups, is there evidence that there is a difference in the population means of the two groups?
H0: Or could my samples all come from a single underlying distribution? (It’s always important to consider the case where nothing happens!)
[Figure: counts of people versus height (mm) for a handful of measurements, coloured by sex (M, F).]
One-sample Student’s t-test
▶ The way we deal with this is by going back to the expression for Z we had earlier – we know that (x̄ − µ)/(s/√n) is t-distributed with n − 1 degrees of freedom.
▶ So, if we want to test the hypothesis that x̄ is equal to some specified value µ, we just compute
\[
\frac{\bar{x} - \mu}{s/\sqrt{n}}.
\]
Since we know that this quantity is t-distributed, and we have the ability to look up values of zα, we can see how likely this is – i.e. obtain a value p that represents the probability of observing a value at least as extreme as the one observed.
▶ More “extreme” values of observed Z are larger in magnitude, and are less likely to occur.
▶ Here, “more extreme” means having a Z-value at least as great in magnitude (at least as far from zero) as the observed Z-value. This means this is called a two-tailed test, as I’m interested in both tails of the t-distribution.
▶ If, a priori, I have a good reason to know that an effect can only possibly exist in one direction, I can do a one-tailed test. This is discouraged.
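The “both tails” idea in code – a sketch with scipy, where the observed Z value and degrees of freedom are made-up numbers purely for illustration:

```python
from scipy import stats

z_obs, nu = 2.5, 9   # hypothetical observed test statistic and degrees of freedom

# Two-tailed: probability of a value at least as far from zero, in either direction
p_two = 2 * stats.t.sf(abs(z_obs), df=nu)

# One-tailed: only the tail in the observed direction (use with caution!)
p_one = stats.t.sf(z_obs, df=nu)

print(p_two, p_one)
```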
An example
▶ Suppose I measure the plasma iron concentration in five people with a particular SNP in the gene called HBB, which codes for a protein whose absence or reduction is known to cause thalassemia, a form of anaemia that arises because blood cells are destroyed.
▶ The “reference range” for a normal healthy adult is 11–32 µmol l−1, reflecting the fact that plasma iron can change due to physiological reasons in different people.
▶ I measure their plasma iron concentration as being 42, 34, 48, 45, and 55 µmol l−1.
▶ Is this different from the known population maximum value of µ = 32 µmol l−1?
▶ So, I obtain x̄ = 44.8 and s = 7.73 µmol l−1 as estimators for the population mean (µ1) and SD (σ1) of the iron concentration.
▶ I can then formally state the hypotheses that I am testing:

H0: The sample is drawn from the healthy population: µ1 = µ
H1: They’re different: µ1 ≠ µ

▶ I then compute
\[
Z = \frac{\bar{x} - 32}{s/\sqrt{n}} \approx 3.70
\]
▶ Looking this up in either a big table of t values (or using a computer) for the distribution with 4 degrees of freedom (i.e. 5 − 1), I find that this corresponds to a p value of 0.021.
▶ This is less than the commonly-used significance threshold of 0.05 – in other words, the sample mean is likely to be different from the population mean.
▶ I can therefore state that, as p < 0.05, we reject the null hypothesis at the 5% level and conclude that those people with the SNP in question are likely to have higher plasma iron levels than the reference range.
▶ In R, we’d do this more concisely as: t.test(data, mu=32).
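A Python equivalent of the R call (scipy’s one-sample t-test, run on the plasma iron data from the slide):

```python
import numpy as np
from scipy import stats

iron = np.array([42, 34, 48, 45, 55], dtype=float)   # µmol/l, from the slide

t_stat, p = stats.ttest_1samp(iron, popmean=32.0)    # like t.test(data, mu=32) in R
print(t_stat, p)   # Z ≈ 3.70, p ≈ 0.021, matching the values above
```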
Spoilers ahead
If you have already covered this material before, and are waiting for me to start talking about all the assumptions and problems of t-tests, and say words like ‘Type I error’, that’s the subject of the next lecture.
For now, let’s extend this machinery to compare two different groups of samples, and ask what evidence there is that their means are different.
Two-sample Student’s t-tests
[Figure: counts of people versus height (mm) for a handful of measurements, coloured by sex (M, F).]
▶ The general approach is similar, but things are more complicated because we don’t know either mean exactly.
▶ Moreover, we also don’t know either standard deviation!
Group A   Parameter                   Group B
x̄A        Sample mean                 x̄B
sA        Sample standard deviation   sB
nA        Number of samples           nB
Two-sample Student’s t-test
▶ It turns out that there are two fundamentally different ways of interpreting the two different estimates for the population variance that the two groups give us.
▶ We can either assume that both groups have the same variance or, unsurprisingly, different variances.
Equal variance two-sample Student’s t-test
If the two groups have the same mean, then the difference of x̄A and x̄B should be, on average, zero.
It turns out that we can construct a pooled estimate of the standard deviation, if we assume that it’s common to both groups. This estimate is
\[
s_p = \sqrt{\frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A + n_B - 2}}
\]
We therefore construct the quantity
\[
\frac{\bar{x}_A - \bar{x}_B}{s_p\sqrt{\dfrac{1}{n_A} + \dfrac{1}{n_B}}},
\]
which is very much like before – it’s t-distributed, but with nA + nB − 2 degrees of freedom.
To test H1: the groups A and B have different population means, we plug the numbers in and compare the value we get to the t-distribution.
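A sketch of the pooled calculation in Python (the two small groups are invented numbers for illustration; scipy’s equal-variance test is used to cross-check the hand computation):

```python
import numpy as np
from scipy import stats

# Hypothetical heights (mm) for two groups -- invented for illustration
a = np.array([1700.0, 1650.0, 1720.0, 1690.0])
b = np.array([1580.0, 1610.0, 1560.0])
na, nb = len(a), len(b)

# Pooled standard deviation, as defined above
sp = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2))

# The test statistic, t-distributed with na + nb - 2 degrees of freedom
z = (a.mean() - b.mean()) / (sp * np.sqrt(1 / na + 1 / nb))

# Cross-check against scipy's equal-variance two-sample t-test
t_stat, p = stats.ttest_ind(a, b, equal_var=True)
print(z, t_stat, p)
```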
HT 2018 Statistics Lecture 2 — t-tests 51
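The statistic and its degrees of freedom can be sketched in Python as follows (the function name and example values are made up for illustration):

```python
import math

def students_t(xbar_a, xbar_b, s_a, s_b, n_a, n_b):
    # Equal-variance two-sample t: pool the SDs, then standardise the
    # difference of the sample means by its estimated standard error.
    sp = math.sqrt(((n_a - 1) * s_a**2 + (n_b - 1) * s_b**2)
                   / (n_a + n_b - 2))
    t = (xbar_a - xbar_b) / (sp * math.sqrt(1 / n_a + 1 / n_b))
    df = n_a + n_b - 2
    return t, df

# Illustrative numbers: means 5.0 vs 3.0, both SDs 2.0, groups of 10
t, df = students_t(5.0, 3.0, 2.0, 2.0, 10, 10)
```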
Unequal variance two-sample Welch's t-test
If we take the different s's as acting as estimators for fundamentally different population variances, then the picture is more complex. (It was originally described by B. L. Welch in 1947, in Biometrika, 34, 28–35.)
Here, the quantity in question is

Z = \frac{\bar{x}_A - \bar{x}_B}{\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}}

HT 2018 Statistics Lecture 2 — t-tests 52
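The Welch statistic itself is simple arithmetic; a sketch (names are mine, not the lecture's):

```python
import math

def welch_z(xbar_a, xbar_b, s_a, s_b, n_a, n_b):
    # Welch's statistic: each group keeps its own variance estimate,
    # so nothing is pooled.
    return (xbar_a - xbar_b) / math.sqrt(s_a**2 / n_a + s_b**2 / n_b)
```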
Unequal variance two-sample Welch's t-test
There's a catch, however – this isn't t-distributed "nicely".
It's t-distributed with

\nu = \frac{\left( \frac{s_A^2}{n_A} + \frac{s_B^2}{n_B} \right)^2}{\frac{(s_A^2 / n_A)^2}{n_A - 1} + \frac{(s_B^2 / n_B)^2}{n_B - 1}}

degrees of freedom (!)
HT 2018 Statistics Lecture 2 — t-tests 53
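Unwieldy as it looks, this Welch–Satterthwaite expression is also just arithmetic. A sketch (the function name is mine):

```python
def welch_df(s_a, s_b, n_a, n_b):
    # Welch-Satterthwaite degrees of freedom; v_a and v_b are the
    # per-group contributions s^2 / n to the variance of the difference.
    v_a, v_b = s_a**2 / n_a, s_b**2 / n_b
    return (v_a + v_b) ** 2 / (v_a**2 / (n_a - 1) + v_b**2 / (n_b - 1))

# Sanity check: with equal SDs and group sizes this recovers the
# Student value n_A + n_B - 2.
df = welch_df(2.0, 2.0, 10, 10)
```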
Nevertheless...
▶ This is just a quantity that can be calculated, and the theoretical t-value compared to the observed t-value.
▶ We can therefore perform a hypothesis test as before, and choose to reject (or not) the null hypothesis that the population means are the same at some chosen significance level.
HT 2018 Statistics Lecture 2 — t-tests 54
Paired t-tests
One other very powerful trick is to perform repeated experiments on the same subject – for example, measuring a quantity with and without administration of a drug within a number of patients.
We're then interested in changes, and whether the mean difference between the groups is distinct from zero. As we obtain data in pairs, this is known as a paired test – and it can have more statistical power.
HT 2018 Statistics Lecture 2 — t-tests 55
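A paired test reduces to a one-sample t-test on the within-subject differences. A sketch with invented before/after data (the measurements here are made up, purely for illustration):

```python
import math
import statistics

# Hypothetical before/after measurements on the same five patients
before = [5.1, 4.8, 6.0, 5.5, 5.2]
after = [4.6, 4.5, 5.4, 5.3, 4.9]

# Work with the within-subject differences, then do a one-sample t-test
# of "mean difference = 0", with n - 1 degrees of freedom.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
```

Pairing removes the between-subject variation from the denominator, which is where the extra statistical power comes from.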
In practice...
[Figure: histogram of heights (mm), roughly 1400–1800 mm, with counts of people coloured by sex (M/F)]
HT 2018 Statistics Lecture 2 — t-tests 56
In practice...
> t.test(x=males, y=females)

        Welch Two Sample t-test

data:  males and females
t = 3.893, df = 9.9081, p-value = 0.003047
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  53.23589 196.14649
sample estimates:
mean of x mean of y
 1751.374  1626.683
HT 2018 Statistics Lecture 2 — t-tests 57
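The same Welch test can be reproduced from scratch using the formulas above. A sketch with the Python standard library (the height samples here are invented, so the numbers will not match the R output on the slide):

```python
import math
import statistics

# Invented height samples (mm) -- not the lecture's actual data
males = [1800, 1750, 1720, 1780, 1760, 1700]
females = [1620, 1650, 1600, 1640, 1610, 1590]

def welch_t_test(a, b):
    # Welch's t statistic and degrees of freedom, as computed by
    # R's t.test(), whose default does NOT assume equal variances.
    v_a = statistics.variance(a) / len(a)
    v_b = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(v_a + v_b)
    df = (v_a + v_b) ** 2 / (v_a**2 / (len(a) - 1) + v_b**2 / (len(b) - 1))
    return t, df

t, df = welch_t_test(males, females)
```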
Quick summary & Spoilers
▶ The sample mean and standard deviation provide estimates for the population mean and standard deviation.
▶ The standardised (or "Studentised") constructs I've shown you all have the same distribution – a t-distribution if the data are drawn from a normal distribution.
▶ We can use this to infer whether or not two groups of measurements are likely to have been drawn from one population with one mean.
▶ Next time, I'll talk a lot about the perils of the t-test, and what happens if your data are not normally distributed.
HT 2018 Statistics Lecture 2 — t-tests 58