▶ We explored the fundamental idea of Frequentist statistics, namely that we live in an uncertain world, and each measurement we make of it is drawn randomly from some (unspecified) Probability Distribution Function, or PDF.
▶ If we know the shape of a PDF, we can compute ways of characterising it – for example, by computing its mean and median, or standard deviation and interquartile range.
HT 2018 Statistics Lecture 2 — Introduction 3
However...
▶ When we do experiments, we make one or more measurements of an unknown quantity. We don’t know what the PDF of the unknown quantity looks like (otherwise there would be no point in doing the experiment!)
▶ As we repeat the experiment more and more times, we are drawing samples at random from the underlying PDF. (This is often referred to as “simple random sampling”.)
▶ We want to infer as much as we can about the properties of the underlying distribution as a whole based on this sample.
▶ Things are complicated by the fact that there are, in general, infinitely many distributions that the data could have come from!
An example
Let’s consider the height of people in the UK. Population data shows that, ignoring sex, on average our height is normally distributed (with µ = 1686 mm, σ = 98.89 mm):
Imagine I pick five people at random from this room, measure them, and obtain their heights as xi = 1589, 1565, 1529, 1823, 1694 mm. I’d like to estimate the population mean µ (and ideally the standard deviation σ) from these five numbers. It turns out that the best I can do is to estimate them using the constructs
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\]
where n is the number of samples I have taken (here, 5).
What happens if I ask more people to stand up, and measure them? Or what if I tell those people to sit down, and measure another five instead? My values for x̄ and s will change. Let’s do this a few times and make a histogram of the values of x̄. This histogram is an estimate of what is known as the sampling distribution of the mean (and a histogram of s likewise estimates the sampling distribution of the standard deviation).
If I continue doing this, I get an idea of the distribution of the sample mean when I measure the height of five people: here’s the plot with 200 lots of samples of 5:
Now, clearly I’ve done this a bit strangely: if I measure the height of 5 × 20 000 people, I’d probably be much better off computing x̄ of all of them! What happens if I repeat the above, but draw samples containing 30 people instead of 5?
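The repeated-sampling procedure described above can be sketched numerically. This is a minimal simulation (in Python with numpy, standing in for whatever the lecture used; the seed and the exact checks are arbitrary choices, not from the slides):

```python
import numpy as np

# Population parameters from the slide: UK heights ~ N(1686, 98.89^2) mm
mu, sigma = 1686.0, 98.89
n, trials = 5, 200              # 200 lots of samples of 5, as in the plot

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice
samples = rng.normal(mu, sigma, size=(trials, n))

xbars = samples.mean(axis=1)          # one sample mean per lot of 5
ss = samples.std(axis=1, ddof=1)      # one sample SD per lot (n - 1 divisor)

# A histogram of xbars approximates the sampling distribution of the mean;
# its spread should be roughly sigma / sqrt(n)
print(xbars.std(ddof=1), sigma / np.sqrt(n))
```

Repeating with `n = 30` shows the sampling distribution of the mean tightening, exactly as the later plots suggest.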
William Sealy Gosset (New College, graduated in 1899, read chemistry and maths) was employed by the Guinness, Son and Co. brewery in Dublin straight out of university, initially doing something that we would perhaps regard as industrial biochemistry – systematically optimising beer quality given variable starting products and conditions.
The t-distribution
Guinness had a policy of employing Oxbridge graduates, as they “found before them an almost unexplored field lying open to investigation. A great mass of data was available or could easily be collected which would throw light on the relations, hitherto undetermined or only guessed at in an empirical way, between the quality of the raw materials of beer, such as barley and hops, the conditions of production and the quality of the finished article.”
Biometrika, Volume 30, Issue 3-4, 1 January 1939, pp. 210–250
After two years of training to be a Brewer, he set his mind to work trying to improve the production process, and was specifically interested in the sugar composition of malted barley, which affected the alcohol content of the final product (and hence the tax bill!).
He, and others in the firm, had difficulty coming to firm conclusions on matters such as whether or not the soil nitrogen content on a barley farm mattered, due to a lot of variation in the measurements and the fact that the sample sizes were necessarily low (owing to the limited availability of comparable barley farms in Ireland).
Naturally, Gosset tried to work out the shape of the sampling distributions I showed you above – and wrote an internal memo to the other brewers in Guinness, entitled “The application of the law of error to the work of the Brewery” (1904), detailing some of his progress.
He published the results in 1908, under the pseudonym “Student”.
Biometrika, Volume 6, Issue 1, 1 March 1908, pp. 1–25.
The t-distribution
▶ The trouble with investigating x̄ and s is that they depend on the problem at hand.
▶ One way to make all problems “look the same” is to standardise them, by computing the quantity
\[
\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}.
\]
If we know µ and σ, then this quantity follows a normal distribution of mean 0 and variance 1.
▶ This was known before Student, and most people assumed that s was a very good approximation for σ. This is true with “enough” samples.
▶ Student showed that if we don’t know σ, but know s, and the data really are sampled from a normal distribution, then the quantity
\[
Z = \frac{\bar{x} - \mu}{s/\sqrt{n}}
\]
follows a different distribution, which has since become known as the t-distribution. NB: some authors call this T or t, and use Z for the case where n → ∞. To try and be concise, I’m calling it Z regardless of n.
▶ (Quantities like Z have a specific name – formally, Z is known as a test statistic.)
▶ t depends on a parameter, known as the number of degrees of freedom, ν, which here is n − 1. As ν → ∞, the t-distribution becomes the normal distribution with mean 0 and variance 1.
▶ For a small number of degrees of freedom, t is “broader” than the corresponding Gaussian, and has fatter tails.
▶ The full analytic form for t is mildly hairy, but computers are very good at providing numbers from it should we need them:
\[
p_t(x, \nu) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)}
\left(1 + \frac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}},
\qquad \text{where } \Gamma(\nu) = \int_0^\infty x^{\nu-1} e^{-x}\, dx
\]
The t-distribution with ν degrees of freedom looks like this:

[Figure: probability density of the t-distribution for ν = 1, 2, 3, 40, and ∞; the curves broaden and gain fatter tails as ν decreases.]
[Figure: for ν = 1, histograms of the sample mean height (mm), the sample standard deviation of height (mm), and Z, with the t-distribution shown over −10 < Z < 10.]
[Figure: for ν = 4, histograms of the sample mean height (mm), the sample standard deviation of height (mm), and Z, with the t-distribution shown over −10 < Z < 10.]
[Figure: for ν = 40, histograms of the sample mean height (mm), the sample standard deviation of height (mm), and Z, with the t-distribution shown over −10 < Z < 10.]
Now, the thing about zα/2 is that it’s just a number, chosen to divide the area under the curve as shown, so that most of it lies within a particular region. Specifically, for some number α known as the “significance level” (which is typically chosen to be 5%), we want
\[
\int_{-z_{\alpha/2}}^{z_{\alpha/2}} p(t, \nu)\, dt = 1 - \alpha
\]
We can’t do this integral by hand very easily, but a computer can. In R, zα/2 is given by qt(1-alpha/2, df=n-1), where you fill in n and alpha to taste.
This means that if I repeated everything again and again, 95% of the time the population mean would lie within this interval. In other words, a (1 − α) × 100% confidence interval is an interval calculated using a procedure such that it will contain the true value (1 − α) × 100% of the times you use it; the rest of the time you will be unlucky.
In R, it’s easy to generate a whatever-precision-you-like confidence limit, e.g. for the upper 95% limit:
mean(data) + qt(0.975, df=length(data)-1) * sd(data) / sqrt(length(data))
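An equivalent sketch in Python (scipy’s t quantile function standing in for R’s qt; the data are the five height measurements from the earlier example):

```python
import numpy as np
from scipy import stats

data = np.array([1589, 1565, 1529, 1823, 1694], dtype=float)  # mm, from the slide
n = len(data)

alpha = 0.05
z = stats.t.ppf(1 - alpha / 2, df=n - 1)   # plays the role of qt(1-alpha/2, df=n-1)
sem = data.std(ddof=1) / np.sqrt(n)        # estimated standard error of the mean

lower = data.mean() - z * sem
upper = data.mean() + z * sem
print(lower, upper)   # the 95% confidence interval for the population mean
```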
If we set zα/2 to one, we obtain an estimator for the standard error on the mean, or SEM, which is the standard deviation of the mean’s sampling distribution.
▶ In the language of statistics, sex – here a variable relevant to the quantity at hand – is called a factor, and its levels are “male” and “female”. Colloquially we may refer to them as groups.
▶ Note that we can’t “rank” or “order” these levels; consequently sex is called a categorical variable.
▶ (Sometimes we do have categorical variables that we might be able to rank, like the classic ‘Strongly agree, Agree, ..., Strongly disagree’ scale that you might have seen before. These are known as ordinal variables, as they can be ordered.)
Factors
Because we often have many factors that may influence a particular experiment, it’s much more common to see factors plotted on an x axis, e.g. in a box plot:
[Figure: annotated example box plot, labelling the largest non-extreme value (typically within 1.5 × IQR), upper quartile, median, lower quartile, smallest non-extreme value, and an extreme value plotted as an individual point.]
[Figure: box plots of height (mm) for the levels M and F of the factor sex.]
FactorsBecause we often have many factors that may influence aparticular experiment, it’s much more common to see factorsplotted on an x axis, e.g. in a box plot:
Med
ian
Thyr
otro
pin
Leve
l(m
U/li
ter)
3.0
2.5
1.5
1.0
2.0
0.5
0.0No Thyroxine
TreatmentThyroxineTreatmentwith Low
ThyrotropinLevel
Treatment withThyroxine
andOmeprazole
Treatment withHigher-Dose
Thyroxineand
Omeprazole
HT 2018 Statistics Lecture 2 — t-tests 38
Statistical tests
A common question (asked by Student and many other people since!) is as follows:
H1: Given that I have measured a set of samples in both groups, is there evidence that there is a difference in the population means of the two groups?
H0: Or could my samples all come from a single underlying distribution? (It’s always important to consider the case where nothing happens!)
[Figure: counts of people versus height (mm) for a handful of measurements, coloured by sex (M, F).]
One-sample Student’s t-test
▶ The way we deal with this is by going back to the expression for Z we had earlier – we know that (x̄ − µ)/(s/√n) is t-distributed with n − 1 degrees of freedom.
▶ So, if we want to test the hypothesis that x̄ is equal to some specified value µ, we just compute
\[
\frac{\bar{x} - \mu}{s/\sqrt{n}}.
\]
Since we know that this quantity is t-distributed, and we have the ability to look up values of zα, we can see how likely this is – i.e. obtain a value p that represents the probability of observing a value at least as extreme as the one observed.
▶ More “extreme” values of observed Z are larger in magnitude, and are less likely to occur.
▶ Here, “more extreme” means having a Z-value at least as great in magnitude (at least as far from zero) as the observed Z-value. This means this is called a two-tailed test, as I’m interested in both tails of the t-distribution.
▶ If, a priori, I have a good reason to know that an effect can only possibly exist in one direction, I can do a one-tailed test. This is discouraged.
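The “both tails” idea in code – a sketch with scipy, where the observed Z value and degrees of freedom are made-up numbers purely for illustration:

```python
from scipy import stats

z_obs, nu = 2.5, 9   # hypothetical observed test statistic and degrees of freedom

# Two-tailed: probability of a value at least as far from zero, in either direction
p_two = 2 * stats.t.sf(abs(z_obs), df=nu)

# One-tailed: only the tail in the observed direction (use with caution!)
p_one = stats.t.sf(z_obs, df=nu)

print(p_two, p_one)
```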
An example
▶ Suppose I measure the plasma iron concentration in five people with a particular SNP in the gene called HBB, which codes for a protein whose absence or reduction is known to cause thalassemia, a form of anaemia that arises because blood cells are destroyed.
▶ The “reference range” for a normal healthy adult is 11–32 µmol l−1, reflecting the fact that plasma iron can change due to physiological reasons in different people.
▶ I measure their plasma iron concentration as being 42, 34, 48, 45, and 55 µmol l−1.
▶ Is this different from the known population maximum value of µ = 32 µmol l−1?
▶ So, I obtain x̄ = 44.8 and s = 7.73 µmol l−1 as estimators for the population mean (µ1) and SD (σ1) of the iron concentration.
▶ I can then formally state the hypotheses that I am testing:

H0: The sample is drawn from the healthy population: µ1 = µ
H1: They’re different: µ1 ≠ µ

▶ I then compute
\[
Z = \frac{\bar{x} - 32}{s/\sqrt{n}} \approx 3.70
\]
▶ Looking this up in either a big table of t values (or using a computer) for the distribution with 4 degrees of freedom (i.e. 5 − 1), I find that this corresponds to a p value of 0.021.
▶ This is less than the commonly-used significance threshold of 0.05 – in other words, the sample mean is likely to be different from the population mean.
▶ I can therefore state that, as p < 0.05, we reject the null hypothesis at the 5% level and conclude that those people with the SNP in question are likely to have higher plasma iron levels than the reference range.
▶ In R, we’d do this more concisely as: t.test(data, mu=32).
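A Python equivalent of the R call (scipy’s one-sample t-test, run on the plasma iron data from the slide):

```python
import numpy as np
from scipy import stats

iron = np.array([42, 34, 48, 45, 55], dtype=float)   # µmol/l, from the slide

t_stat, p = stats.ttest_1samp(iron, popmean=32.0)    # like t.test(data, mu=32) in R
print(t_stat, p)   # Z ≈ 3.70, p ≈ 0.021, matching the values above
```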
Spoilers ahead
If you have already covered this material before, and are waiting for me to start talking about all the assumptions and problems of t-tests, and say words like ‘Type I error’, that’s the subject of the next lecture.
For now, let’s extend this machinery to compare two different groups of samples, and ask what evidence there is that their means are different.
Two-sample Student’s t-tests
[Figure: counts of people versus height (mm) for a handful of measurements, coloured by sex (M, F).]
▶ The general approach is similar, but things are more complicated because we don’t know either mean exactly.
▶ Moreover, we also don’t know either standard deviation!
Group A   Parameter                   Group B
x̄A        Sample mean                 x̄B
sA        Sample standard deviation   sB
nA        Number of samples           nB
Two-sample Student’s t-test
▶ It turns out that there are two fundamentally different ways of interpreting the two different estimates for the population variance that the two groups give us.
▶ We can either assume that both groups have the same variance or, unsurprisingly, different variances.
Equal variance two-sample Student’s t-test
If the two groups have the same mean, then the difference of x̄A and x̄B should be, on average, zero.
It turns out that we can construct a pooled estimate of the standard deviation, if we assume that it’s common to both groups. This estimate is
\[
s_p = \sqrt{\frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A + n_B - 2}}
\]
We therefore construct the quantity
\[
\frac{\bar{x}_A - \bar{x}_B}{s_p\sqrt{\dfrac{1}{n_A} + \dfrac{1}{n_B}}},
\]
which is very much like before – it’s t-distributed, but with nA + nB − 2 degrees of freedom.
To test H1: the groups A and B have different population means, we plug the numbers in and compare the value we get to the t-distribution.
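A sketch of the pooled calculation in Python (the two small groups are invented numbers for illustration; scipy’s equal-variance test is used to cross-check the hand computation):

```python
import numpy as np
from scipy import stats

# Hypothetical heights (mm) for two groups -- invented for illustration
a = np.array([1700.0, 1650.0, 1720.0, 1690.0])
b = np.array([1580.0, 1610.0, 1560.0])
na, nb = len(a), len(b)

# Pooled standard deviation, as defined above
sp = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2))

# The test statistic, t-distributed with na + nb - 2 degrees of freedom
z = (a.mean() - b.mean()) / (sp * np.sqrt(1 / na + 1 / nb))

# Cross-check against scipy's equal-variance two-sample t-test
t_stat, p = stats.ttest_ind(a, b, equal_var=True)
print(z, t_stat, p)
```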
HT 2018 Statistics Lecture 2 — t-tests 51
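The statistic and its degrees of freedom can be sketched in Python as follows (the function name and example values are made up for illustration):

```python
import math

def students_t(xbar_a, xbar_b, s_a, s_b, n_a, n_b):
    # Equal-variance two-sample t: pool the SDs, then standardise the
    # difference of the sample means by its estimated standard error.
    sp = math.sqrt(((n_a - 1) * s_a**2 + (n_b - 1) * s_b**2)
                   / (n_a + n_b - 2))
    t = (xbar_a - xbar_b) / (sp * math.sqrt(1 / n_a + 1 / n_b))
    df = n_a + n_b - 2
    return t, df

# Illustrative numbers: means 5.0 vs 3.0, both SDs 2.0, groups of 10
t, df = students_t(5.0, 3.0, 2.0, 2.0, 10, 10)
```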
Unequal variance two-sample Welch's t-test
If we take the different s's as acting as estimators for fundamentally different population variances, then the picture is more complex. (It was originally described by B. L. Welch in 1947, in Biometrika, 34, 28–35.)
Here, the quantity in question is

Z = \frac{\bar{x}_A - \bar{x}_B}{\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}}

HT 2018 Statistics Lecture 2 — t-tests 52
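The Welch statistic itself is simple arithmetic; a sketch (names are mine, not the lecture's):

```python
import math

def welch_z(xbar_a, xbar_b, s_a, s_b, n_a, n_b):
    # Welch's statistic: each group keeps its own variance estimate,
    # so nothing is pooled.
    return (xbar_a - xbar_b) / math.sqrt(s_a**2 / n_a + s_b**2 / n_b)
```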
Unequal variance two-sample Welch's t-test
There's a catch, however – this isn't t-distributed "nicely".
It's t-distributed with

\nu = \frac{\left( \frac{s_A^2}{n_A} + \frac{s_B^2}{n_B} \right)^2}{\frac{(s_A^2 / n_A)^2}{n_A - 1} + \frac{(s_B^2 / n_B)^2}{n_B - 1}}

degrees of freedom (!)
HT 2018 Statistics Lecture 2 — t-tests 53
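Unwieldy as it looks, this Welch–Satterthwaite expression is also just arithmetic. A sketch (the function name is mine):

```python
def welch_df(s_a, s_b, n_a, n_b):
    # Welch-Satterthwaite degrees of freedom; v_a and v_b are the
    # per-group contributions s^2 / n to the variance of the difference.
    v_a, v_b = s_a**2 / n_a, s_b**2 / n_b
    return (v_a + v_b) ** 2 / (v_a**2 / (n_a - 1) + v_b**2 / (n_b - 1))

# Sanity check: with equal SDs and group sizes this recovers the
# Student value n_A + n_B - 2.
df = welch_df(2.0, 2.0, 10, 10)
```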
Nevertheless...
▶ This is just a quantity that can be calculated, and the theoretical t-value compared to the observed t-value.
▶ We can therefore perform a hypothesis test as before, and choose to reject (or not) the null hypothesis that the population means are the same at some chosen significance level.
HT 2018 Statistics Lecture 2 — t-tests 54
Paired t-tests
One other very powerful trick is to perform repeated experiments on the same subject – for example, measuring a quantity with and without administration of a drug within a number of patients.
We're then interested in changes, and whether the mean difference between the groups is distinct from zero. As we obtain data in pairs, this is known as a paired test – and it can have more statistical power.
HT 2018 Statistics Lecture 2 — t-tests 55
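A paired test reduces to a one-sample t-test on the within-subject differences. A sketch with invented before/after data (the measurements here are made up, purely for illustration):

```python
import math
import statistics

# Hypothetical before/after measurements on the same five patients
before = [5.1, 4.8, 6.0, 5.5, 5.2]
after = [4.6, 4.5, 5.4, 5.3, 4.9]

# Work with the within-subject differences, then do a one-sample t-test
# of "mean difference = 0", with n - 1 degrees of freedom.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
```

Pairing removes the between-subject variation from the denominator, which is where the extra statistical power comes from.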
In practice...
[Figure: histogram of heights (mm), roughly 1400–1800 mm, with counts of people coloured by sex (M/F)]
HT 2018 Statistics Lecture 2 — t-tests 56
In practice...
> t.test(x=males, y=females)

        Welch Two Sample t-test

data:  males and females
t = 3.893, df = 9.9081, p-value = 0.003047
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  53.23589 196.14649
sample estimates:
mean of x mean of y
 1751.374  1626.683
HT 2018 Statistics Lecture 2 — t-tests 57
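The same Welch test can be reproduced from scratch using the formulas above. A sketch with the Python standard library (the height samples here are invented, so the numbers will not match the R output on the slide):

```python
import math
import statistics

# Invented height samples (mm) -- not the lecture's actual data
males = [1800, 1750, 1720, 1780, 1760, 1700]
females = [1620, 1650, 1600, 1640, 1610, 1590]

def welch_t_test(a, b):
    # Welch's t statistic and degrees of freedom, as computed by
    # R's t.test(), whose default does NOT assume equal variances.
    v_a = statistics.variance(a) / len(a)
    v_b = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(v_a + v_b)
    df = (v_a + v_b) ** 2 / (v_a**2 / (len(a) - 1) + v_b**2 / (len(b) - 1))
    return t, df

t, df = welch_t_test(males, females)
```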
Quick summary & Spoilers
▶ The sample mean and standard deviation provide estimates for the population mean and standard deviation.
▶ The standardised (or "Studentised") constructs I've shown you all have the same distribution – a t-distribution if the data are drawn from a normal distribution.
▶ We can use this to infer whether or not two groups of measurements are likely to have been drawn from one population with one mean.
▶ Next time, I'll talk a lot about the perils of the t-test, and what happens if your data are not normally distributed.
HT 2018 Statistics Lecture 2 — t-tests 58