BASIC STATISTICAL INFERENCE

A. COMPARE BETWEEN TWO MEANS OF POPULATIONS
B. COMPARE BETWEEN TWO VARIANCES OF POPULATIONS

PARAMETRIC TESTS (QUANTITATIVE DATA)

t-distribution, z-distribution: tests on $\mu$

F-distribution (Fisher's distribution): tests on $\sigma$

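TEST THE NULL HYPOTHESIS $H_0: \mu = \mu_0$

We shall consider here three forms for the alternative hypothesis:

$H_1: \mu > \mu_0$, $H_1: \mu < \mu_0$, $H_1: \mu \neq \mu_0$

TEST THE NULL HYPOTHESIS $H_0: \sigma = \sigma_0$

with the same three forms for the alternative hypothesis:

$H_1: \sigma > \sigma_0$, $H_1: \sigma < \sigma_0$, $H_1: \sigma \neq \sigma_0$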

One tailed test

[Figure: Distribution showing the 0.05 significance level in a one-tailed test. A single tail of area $\alpha = 0.05$ is the rejection region; the remaining area $(1 - \alpha) = 0.95$ is not significant. Used for $H_1: \mu > \mu_0$, $H_1: \mu < \mu_0$, $H_1: \sigma > \sigma_0$ and $H_1: \sigma < \sigma_0$.]

Two tailed test

[Figure: Distribution showing the 0.05 significance level in a two-tailed test. Each tail has area $\alpha/2 = 0.05/2 = 0.025$; the central area $(1 - \alpha)$ is not significant. Used for $H_1: \mu \neq \mu_0$ and $H_1: \sigma \neq \sigma_0$.]

[Figure: Calculated t compared with tabulated t. P > 0.05: insignificant difference, accept H0. P < 0.05, P < 0.01, P < 0.001: significant difference, reject H0.]
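The way the 0.05 significance level is placed in one tail or split across two tails can be illustrated with critical values of the standard normal (z) distribution. This is only a sketch: the helper name `critical_z` is hypothetical, and the z-distribution is used for illustration because Python's standard library provides it; the same idea applies to tabulated t values.

```python
from statistics import NormalDist

def critical_z(alpha: float, two_tailed: bool) -> float:
    """Return the positive critical z value for significance level alpha.

    One-tailed: the whole of alpha sits in a single tail.
    Two-tailed: alpha is split into alpha/2 in each tail.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

one_tailed = critical_z(0.05, two_tailed=False)  # tail area 0.05
two_tailed = critical_z(0.05, two_tailed=True)   # tail area 0.025 per tail

print(f"one-tailed critical z at alpha = 0.05: {one_tailed:.3f}")
print(f"two-tailed critical z at alpha = 0.05: {two_tailed:.3f}")
```

Because the two-tailed test leaves only 0.025 in each tail, its critical value is larger than the one-tailed value at the same significance level.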

HYPOTHESIS TESTS ON THE MEAN (LARGE SAMPLES >30)

$$Z_s = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$

$Z_s$: calculated z
$\bar{X}$: sample mean
$\mu_0$: a given fixed value to be tested
$\sigma$: population standard deviation
$n$: sample size (>30)
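A minimal sketch of the large-sample z statistic follows. The function name `z_test` and the input numbers (sample mean 52, tested value 50, population standard deviation 6, n = 36) are hypothetical and chosen only to illustrate the formula; they do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean: float, mu0: float, sigma: float, n: int):
    """Return the calculated z and its two-tailed P value."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# Hypothetical data, for illustration only.
z, p = z_test(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)

decision = "reject H0" if p < 0.05 else "accept H0"
print(f"z = {z:.3f}, two-tailed P = {p:.4f}: {decision}")
```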

HYPOTHESIS TESTS ON THE MEAN (SMALL SAMPLES <30)

$$t_s = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}$$

$t_s$: calculated t
$\bar{X}$: sample mean
$\mu_0$: a given fixed value to be tested
$S$: sample standard deviation
$n$: sample size (<30)

One sample t-distribution

o Used to decide if a sample mean is different from a hypothesized population mean.
o You have calculated the mean value and standard deviation for the group, assuming you have measurement data, where the standard score (t) is:

$$t_{calculated} = \frac{\bar{X} - \mu}{S.E.}$$

with $S.E. = S / \sqrt{n}$, the standard error of the mean.

t-distribution

$t_{tabulated}$: (d.f., p-value), with degrees of freedom d.f. = n - 1

o The percentile values of the t-distribution ($t_p$) are tabulated for a range of values of d.f. and several values of p.

One sample t-DISTRIBUTION

Example

The mean concentration of cadmium in a water sample was 4 ppm for a sample size of 7, with a standard deviation of 0.9 ppm. The allowable limit for this metal is 2 ppm. Test whether or not the cadmium level in the water sample is at the allowable limit.

Solution

$$t_{cal} = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} = \frac{4 - 2}{0.9 / \sqrt{7}} = 5.88, \qquad d.f. = 7 - 1 = 6$$

t cal (5.88) > t tab (2.447) at P = 0.05: reject the null hypothesis

t cal (5.88) > t tab (3.707) at P = 0.01: reject the null hypothesis

Decision: $P < 0.01$

Thus the cadmium level in water is not at the allowable limit.
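The cadmium example can be checked with a short sketch. The function name `one_sample_t` is hypothetical; the inputs (mean 4 ppm, limit 2 ppm, standard deviation 0.9 ppm, n = 7) and the tabulated values 2.447 and 3.707 are taken from the example, with the tabulated values assumed to be the two-tailed critical values at d.f. = 6. Computing from these figures gives a calculated t of about 5.88.

```python
from math import sqrt

def one_sample_t(sample_mean: float, mu0: float, s: float, n: int) -> float:
    """Return the calculated t for a one-sample test of the mean."""
    standard_error = s / sqrt(n)
    return (sample_mean - mu0) / standard_error

t_cal = one_sample_t(sample_mean=4.0, mu0=2.0, s=0.9, n=7)
df = 7 - 1

# Tabulated t values quoted in the example (assumed two-tailed, d.f. = 6).
t_tab = {0.05: 2.447, 0.01: 3.707}

decisions = {}
for p_level, critical in t_tab.items():
    decisions[p_level] = "reject H0" if abs(t_cal) > critical else "accept H0"
    print(f"P = {p_level}: t cal {t_cal:.2f} vs t tab {critical}: {decisions[p_level]}")
```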

Two sample INDEPENDENT t-DISTRIBUTION

Example: In New Zealand, was the average mass of male turtles in Location A significantly higher than in Location B?

                  Location A    Location B
n                 25            36
Mean              38            35
Variance ($S^2$)  16            9

$H_a: \mu_A > \mu_B$

$$S_P^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} = \frac{(25 - 1) \times 16 + (36 - 1) \times 9}{25 + 36 - 2} = \frac{699}{59} = 11.85$$

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_P^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{38 - 35}{\sqrt{11.85 \left( \frac{1}{25} + \frac{1}{36} \right)}} = \frac{3}{\sqrt{0.80}} = 3.35$$

d.f. = n1 + n2 - 2 = 25 + 36 - 2 = 59

Tabulated t at d.f. 59 = 1.671. Thus, t observed (3.35) > t tabulated (1.671) at α = 0.05.

The mass of male turtles in Location A is significantly higher than that in Location B (reject H0), P < 0.05.
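The pooled-variance calculation in the turtle example can be sketched as below. The function name `pooled_t` is hypothetical. The second sample size is taken as 36, an assumption made because it is the value that reproduces the intermediate results shown in the example (699/59 = 11.85 and d.f. = 59); the variances 16 and 9 are the values used in the worked calculation.

```python
from math import sqrt

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, pooled variance, d.f.) for two independent samples."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, pooled_var, df

# Location A: n = 25, mean = 38, variance = 16
# Location B: n = 36 (assumed, see above), mean = 35, variance = 9
t_obs, sp2, df = pooled_t(38, 16, 25, 35, 9, 36)

t_tab = 1.671  # tabulated value quoted in the example
decision = "reject H0" if t_obs > t_tab else "accept H0"
print(f"Sp^2 = {sp2:.2f}, d.f. = {df}, t = {t_obs:.2f}: {decision}")
```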

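         Control (X1)    X1²      Pb (X2)    X2²
         79              6241     63         3969
         83              6889     71         5041
         68              4624     46         2116
         59              3481     57         3249
         81              6561     53         2809
         76              5776     46         2116
         80              6400     57         3249
         74              5476     76         5776
         58              3364     52         2704
         49              2401     68         4624
         68              4624     73         5329
Total    775             55837    662        40982
Mean     70.45                    60.18

$$SS_1 = \sum X_1^2 - \frac{\left( \sum X_1 \right)^2}{n_1} = 55837 - \frac{(775)^2}{11} = 1234.73$$

$$SS_2 = \sum X_2^2 - \frac{\left( \sum X_2 \right)^2}{n_2} = 40982 - \frac{(662)^2}{11} = 1141.64$$

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left( \frac{SS_1 + SS_2}{(n_1 + n_2) - 2} \right) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$

d.f. = n1 + n2 - 2 = 11 + 11 - 2 = 20

t calculated (2.209) > t tabulated (2.086) at d.f. 20, so reject H0 (P < 0.05).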
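The sums-of-squares form of the two-sample t can be sketched from the column totals alone. The function names `sum_of_squares` and `t_from_sums` are hypothetical; the totals (775 and 55837 for the control, 662 and 40982 for Pb, n = 11 each) are the ones given in the example.

```python
from math import sqrt

def sum_of_squares(sum_x: float, sum_x2: float, n: int) -> float:
    """SS = sum(X^2) - (sum X)^2 / n."""
    return sum_x2 - sum_x ** 2 / n

def t_from_sums(sum1, sumsq1, n1, sum2, sumsq2, n2):
    """Return (t, SS1, SS2, d.f.) for two independent samples."""
    ss1 = sum_of_squares(sum1, sumsq1, n1)
    ss2 = sum_of_squares(sum2, sumsq2, n2)
    df = n1 + n2 - 2
    standard_error = sqrt((ss1 + ss2) / df * (1 / n1 + 1 / n2))
    t = (sum1 / n1 - sum2 / n2) / standard_error
    return t, ss1, ss2, df

t_cal, ss1, ss2, df = t_from_sums(775, 55837, 11, 662, 40982, 11)

t_tab = 2.086  # tabulated value quoted in the example for d.f. 20
decision = "reject H0" if abs(t_cal) > t_tab else "accept H0"
print(f"SS1 = {ss1:.2f}, SS2 = {ss2:.2f}, d.f. = {df}, t = {t_cal:.3f}: {decision}")
```

Working from totals avoids re-entering the raw measurements and makes it easy to confirm each intermediate value against the worked solution.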

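TESTING THE DIFFERENCE BETWEEN TWO MEANS OF DEPENDENT SAMPLES

Two sample DEPENDENT t-DISTRIBUTION

$$t = \frac{\sum D}{\sqrt{\dfrac{n \sum D^2 - \left( \sum D \right)^2}{n - 1}}}, \qquad D = X_2 - X_1, \qquad d.f. = n - 1$$

where X1 is the measurement before and X2 the measurement after.

         D       D²
         -14     196
         -6      36
         -1      1
         5       25
         -3      9
         -3      9
         -5      25
         -4      16
         -5      25
         -3      9
Total    -39     351

$$t = \frac{-39}{\sqrt{\dfrac{10 \times 351 - (-39)^2}{10 - 1}}} = \frac{-39}{14.87} = -2.623$$

d.f. = 10 - 1 = 9

t tabulated at d.f. 9 = 1.833 (one-tailed, α = 0.05). Since |t calculated| (2.623) > t tabulated (1.833), reject H0 (P < 0.05).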
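The paired (dependent) t calculation can be sketched as follows. The function name `paired_t` is hypothetical, and the ten differences are an assumed reading of the difference column on the slide; they are consistent with the stated totals ΣD = -39 and ΣD² = 351. The tabulated value 1.833 is the one quoted in the example and corresponds to d.f. = 9.

```python
from math import sqrt

def paired_t(differences):
    """Return (t, d.f.) for paired samples, with D = X2 - X1."""
    n = len(differences)
    sum_d = sum(differences)
    sum_d2 = sum(d * d for d in differences)
    t = sum_d / sqrt((n * sum_d2 - sum_d ** 2) / (n - 1))
    return t, n - 1

# Differences (after - before); assumed reading of the slide's D column.
d_values = [-14, -6, -1, 5, -3, -3, -5, -4, -5, -3]

t_cal, df = paired_t(d_values)

t_tab = 1.833  # tabulated value quoted in the example
decision = "reject H0" if abs(t_cal) > t_tab else "accept H0"
print(f"sum D = {sum(d_values)}, d.f. = {df}, t = {t_cal:.3f}: {decision}")
```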

Fisher's F-distribution

1) To compare the variances of two populations, you test the null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ against an appropriate alternate hypothesis, for example $H_a: \sigma_1^2 \neq \sigma_2^2$.

2) You calculate the F-value as the ratio of the two variances:

$$F_{calculated} = \frac{s_1^2}{s_2^2}$$

where $s_1^2 \geq s_2^2$, so that $F \geq 1$.

The degrees of freedom for the numerator and denominator are $n_1 - 1$ and $n_2 - 1$, respectively.

Compare $F_{calc}$ to a tabulated value $F_{tab}$ to see if you should accept or reject the null hypothesis.

Example: Assume we want to see if Method 1 for measuring the arsenic concentration in soil is significantly more precise than Method 2. Each method was tested ten times, yielding the following values:

Methods     Mean (ppm)    S.D. (ppm)
Method 1    6.7           0.8
Method 2    8.2           1.2

So we want to test the null hypothesis $H_0: \sigma_2^2 = \sigma_1^2$ against the alternate hypothesis $H_A: \sigma_2^2 > \sigma_1^2$.

Solution:

The larger variance (Method 2) goes in the numerator:

$$F_{calculated} = \frac{s_2^2}{s_1^2} = \frac{1.2^2}{0.8^2} = 2.25$$

d.f. = 10 - 1 = 9 for each method.

o The tabulated value for d.f. = 9 in each case, for a 1-tailed test at the 95% confidence level, is $F_{9,9} = 3.179$.

o In this case, $F_{calc}$ (2.25) < $F_{9,9}$ tabulated (3.179), so we accept H0 that the two standard deviations are equal; P > 0.05.

o We use a 1-tailed test in this case because the only information we are interested in is whether Method 1 is more precise than Method 2.
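The F comparison in the arsenic example can be sketched as below. The function name `f_ratio` is hypothetical; the standard deviations (0.8 and 1.2 ppm) and the tabulated value 3.179 are taken from the example.

```python
def f_ratio(sd_a: float, sd_b: float) -> float:
    """Return F as larger variance / smaller variance, so that F >= 1."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)

f_cal = f_ratio(0.8, 1.2)   # Method 1 S.D., Method 2 S.D.
df_numerator = 10 - 1       # each method was tested ten times
df_denominator = 10 - 1

f_tab = 3.179  # tabulated F(9, 9), 1-tailed, 95%, as quoted in the example
decision = "reject H0" if f_cal > f_tab else "accept H0"
print(f"F = {f_cal:.2f} with d.f. ({df_numerator}, {df_denominator}): {decision}")
```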
