Statistical Inference: making decisions regarding the population based on a sample

Statistical Inference

Jan 24, 2016


zoe

Statistical Inference. Making decisions regarding the population based on a sample. Decision Types. Estimation. Deciding on the value of an unknown parameter. Hypothesis Testing. Deciding whether a statement regarding an unknown parameter is true or false. - PowerPoint PPT Presentation
Transcript
Page 1: Statistical Inference

Statistical Inference

Making decisions regarding the population based on a sample

Page 2: Statistical Inference

Decision Types

• Estimation

– Deciding on the value of an unknown parameter

• Hypothesis Testing

– Deciding whether a statement regarding an unknown parameter is true or false

• All decisions will be based on the values of statistics

Page 3: Statistical Inference

Estimation

• Definitions

– An estimator of an unknown parameter is a sample statistic used to estimate that parameter

– An estimate is the value of the estimator after the data is collected

• The performance of an estimator is assessed by determining its sampling distribution and measuring its closeness to the parameter being estimated

Page 4: Statistical Inference

Examples of Estimators

Page 5: Statistical Inference

The Sample Proportion

Let p = population proportion of interest or binomial probability of success.

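Let

\[ \hat{p} = \frac{X}{n} = \frac{\text{no. of successes}}{\text{no. of binomial trials}} \]

= sample proportion or proportion of successes.

Then the sampling distribution of $\hat{p}$ is (approximately, for large n) a normal distribution with

\[ \text{mean } \mu_{\hat{p}} = p \quad \text{and standard deviation} \quad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]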

Page 6: Statistical Inference

[Figure: Sampling distribution of $\hat{p}$, centred at $\mu_{\hat{p}} = p$; horizontal axis from 0 to 1.]

Page 7: Statistical Inference

The Sample Mean

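Let x1, x2, x3, …, xn denote a sample of size n from a normal distribution with mean $\mu$ and standard deviation $\sigma$.

Let

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \text{sample mean} \]

Then the sampling distribution of $\bar{x}$ is a normal distribution with

\[ \text{mean } \mu_{\bar{x}} = \mu \quad \text{and standard deviation} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]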

Page 8: Statistical Inference

[Figure: Sampling distribution of $\bar{x}$ for n = 5, n = 10, n = 15 and n = 20, shown together with the population distribution; horizontal axis from 80 to 120.]

Page 9: Statistical Inference

Confidence Intervals

Page 10: Statistical Inference

Estimation by Confidence Intervals

• Definition

– A 100P% confidence interval for an unknown parameter is a pair of sample statistics (t1 and t2) having the following properties:

1. P[t1 < t2] = 1. That is, t1 is always smaller than t2.

2. P[the unknown parameter lies between t1 and t2] = P.

• The statistics t1 and t2 are random variables.

• Property 2 states that the probability that the unknown parameter is bounded by the two statistics t1 and t2 is P.

Page 11: Statistical Inference

Critical values for a distribution

• The upper $\alpha$ critical value for any distribution is the point $x_\alpha$ underneath the distribution such that $P[X > x_\alpha] = \alpha$

Page 12: Statistical Inference

Critical values for the standard Normal distribution

$P[Z > z_\alpha] = \alpha$

Page 13: Statistical Inference

Critical values for the standard Normal distribution

$P[Z > z_\alpha] = \alpha$
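The critical values defined above can be looked up numerically rather than read from a table. The following is a minimal sketch using Python's standard library; the helper name `upper_critical_value` is an assumption made for illustration and does not come from the slides.

```python
from statistics import NormalDist


def upper_critical_value(alpha: float) -> float:
    """Return z_alpha such that P[Z > z_alpha] = alpha for a standard normal Z."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # P[Z > z] = alpha is the same as P[Z <= z] = 1 - alpha.
    return NormalDist().inv_cdf(1.0 - alpha)


z_025 = upper_critical_value(0.025)  # used for two-sided 95% statements
z_05 = upper_critical_value(0.05)    # used for one-sided 5% tests
print(f"z_0.025 = {z_025:.3f}")
print(f"z_0.05  = {z_05:.3f}")
```

The two values printed here are the ones used later in the slides for 95% confidence intervals and for one-tailed tests at the 5% level.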

Page 14: Statistical Inference

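Confidence Intervals for a proportion p

Let

\[ t_1 = \hat{p} - z_{\alpha/2}\,\sigma_{\hat{p}} = \hat{p} - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \approx \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

and

\[ t_2 = \hat{p} + z_{\alpha/2}\,\sigma_{\hat{p}} = \hat{p} + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \approx \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

Then t1 to t2 is a (1 – $\alpha$)100% = P100% confidence interval for p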

Page 15: Statistical Inference

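Logic:

\[ z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} \]

has a Standard Normal distribution

Then

\[ P[-z_{\alpha/2} \le z \le z_{\alpha/2}] = 1 - \alpha = P \]

and

\[ P\left[-z_{\alpha/2} \le \frac{\hat{p} - p}{\sigma_{\hat{p}}} \le z_{\alpha/2}\right] = 1 - \alpha \]

\[ P[-z_{\alpha/2}\,\sigma_{\hat{p}} \le \hat{p} - p \le z_{\alpha/2}\,\sigma_{\hat{p}}] = 1 - \alpha \]

\[ P[-z_{\alpha/2}\,\sigma_{\hat{p}} \le p - \hat{p} \le z_{\alpha/2}\,\sigma_{\hat{p}}] = 1 - \alpha \]

Hence

\[ P[\hat{p} - z_{\alpha/2}\,\sigma_{\hat{p}} \le p \le \hat{p} + z_{\alpha/2}\,\sigma_{\hat{p}}] = 1 - \alpha \]

\[ P[t_1 \le p \le t_2] = 1 - \alpha \]

Thus t1 to t2 is a (1 – $\alpha$)100% = P100% confidence interval for p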

Page 16: Statistical Inference

Example

• Suppose we are interested in determining the success rate of a new drug for reducing Blood Pressure

• The new drug is given to n = 70 patients with abnormally high Blood Pressure

• Of these patients, X = 63 were able to reduce the abnormally high level of Blood Pressure

• The proportion of patients able to reduce the abnormally high level of Blood Pressure was

\[ \hat{p} = \frac{X}{n} = \frac{63}{70} = 0.900 \]

Page 17: Statistical Inference

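If P = 1 – $\alpha$ = 0.95 then $\alpha/2$ = .025 and $z_{\alpha/2}$ = 1.960

Then

\[ t_1 = \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.90 - (1.960)\sqrt{\frac{(0.90)(0.10)}{70}} = 0.90 - 0.0703 = 0.8297 \]

and

\[ t_2 = \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.90 + (1.960)\sqrt{\frac{(0.90)(0.10)}{70}} = 0.90 + 0.0703 = 0.9703 \]

Thus a 95% confidence interval for p is 0.8297 to 0.9703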
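The blood pressure calculation can be reproduced in a few lines. This is a minimal sketch, assuming the large-sample normal approximation used in the slides; the function name `proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Large-sample (normal approximation) confidence interval for a proportion."""
    p_hat = x / n
    # Upper alpha/2 critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Drug example: X = 63 successes among n = 70 patients.
t1, t2 = proportion_ci(63, 70)
print(f"95% CI for p: {t1:.4f} to {t2:.4f}")
```

With X = 63 and n = 70 the limits agree with the 0.8297 and 0.9703 obtained by hand.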

Page 18: Statistical Inference

Confidence Interval for a Proportion

100P% Confidence Interval for the population proportion:

\[ \hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} \]

where

\[ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \approx \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

and $z_{\alpha/2}$ = upper $\alpha/2$ critical point of the standard normal distribution

Interpretation: For about 100P% of all randomly selected samples from the population, the confidence interval computed in this manner captures the population proportion.
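The interpretation above can be checked by simulation: draw many samples from a population with a known p, build the interval each time, and count how often it captures p. The sketch below is illustrative only; the values p = 0.5, n = 400, 2000 repetitions and the fixed seed are assumptions chosen for the demonstration, not figures from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(p: float, n: int, confidence: float = 0.95,
             reps: int = 2000, seed: int = 0) -> float:
    """Fraction of simulated samples whose interval captures the true p."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(reps):
        # Simulate n success-failure trials with success probability p.
        x = sum(rng.random() < p for _ in range(n))
        p_hat = x / n
        half_width = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - half_width <= p <= p_hat + half_width:
            hits += 1
    return hits / reps


rate = coverage(p=0.5, n=400)
print(f"Observed coverage: {rate:.3f}")
```

The observed fraction should land near 0.95. For small n, or p near 0 or 1, the normal approximation is rougher and the coverage can fall noticeably short of the nominal level.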

Page 19: Statistical Inference

Error Bound

For a (1 – $\alpha$)100% confidence level, the approximate margin of error in a sample proportion is

\[ \text{Error Bound} = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

Page 20: Statistical Inference

Factors that Determine the Error Bound

1. The sample size, n. When sample size increases, margin of error decreases.

2. The sample proportion, $\hat{p}$. If the proportion is close to either 1 or 0, most individuals have the same trait or opinion, so there is little natural variability and the margin of error is smaller than if the proportion is near 0.5.

3. The “multiplier” $z_{\alpha/2}$. Connected to the “(1 – $\alpha$)100%” level of confidence of the Error Bound. The value of $z_{\alpha/2}$ for a 95% level of confidence is 1.96. This value is changed to change the level of confidence.

Page 21: Statistical Inference

Determination of Sample Size

In almost all research situations the researcher is interested in the question:

How large should the sample be?

Page 22: Statistical Inference

Answer:

Depends on:

• How accurate you want the answer.

Accuracy is specified by:

• Specifying the magnitude of the error bound

• Level of confidence

Page 23: Statistical Inference

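Error Bound:

\[ B = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \approx z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

• If we have specified the level of confidence then the value of $z_{\alpha/2}$ will be known.

• If we have specified the magnitude of B, it will also be known

Solving for n we get:

\[ n = \frac{z_{\alpha/2}^2\,p(1-p)}{B^2} \approx \frac{z_{\alpha/2}^2\,p^*(1-p^*)}{B^2} \]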

Page 24: Statistical Inference

Summarizing:

The sample size that will estimate p with an Error Bound B and level of confidence P = 1 – $\alpha$ is:

\[ n = \frac{z_{\alpha/2}^2\,p^*(1-p^*)}{B^2} \]

where:

• B is the desired Error Bound

• $z_{\alpha/2}$ is the $\alpha/2$ critical value for the standard normal distribution

• p* is some preliminary estimate of p.

If you do not have a preliminary estimate of p, use p* = 0.50

Page 25: Statistical Inference

Reason

For p* = 0.50,

\[ n = \frac{z_{\alpha/2}^2\,p^*(1-p^*)}{B^2} \]

will take on the largest value.

[Figure: n plotted against p* for p* from 0 to 1; vertical axis from 0 to 3000, with the largest n at p* = 0.50.]

Thus using p* = 0.50, n may be larger than required if p is not 0.50, but will give the desired accuracy or better for all values of p.

Page 26: Statistical Inference

Example

• Suppose that I want to conduct a survey and want to estimate p = proportion of voters who favour a downtown location for a casino.

• I know that the approximate value of p is p* = 0.50. This is also a good choice for p* if one has no preliminary estimate of its value.

• I want the survey to estimate p with an error bound B = 0.01 (1 percentage point)

• I want the level of confidence to be 95% (i.e. $\alpha$ = 0.05 and $z_{\alpha/2} = z_{0.025}$ = 1.960)

Then

\[ n = \frac{(1.960)^2 (0.50)(0.50)}{(0.01)^2} = 9604 \]
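The sample size formula for a proportion is easy to wrap in a small function. This is a minimal sketch; the name `sample_size_for_proportion` is hypothetical. It rounds up to the next whole observation and uses the unrounded critical value (about 1.95996) rather than 1.960, which gives the same answer for the casino survey.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(error_bound: float, confidence: float = 0.95,
                               p_star: float = 0.5) -> int:
    """Smallest n with z^2 * p*(1 - p*) / B^2 <= n.

    p_star is a preliminary guess for p; 0.5 is the conservative default
    because p*(1 - p*) is largest there.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p_star * (1 - p_star) / error_bound ** 2)


# Casino survey: B = 0.01, 95% confidence, p* = 0.50.
n_casino = sample_size_for_proportion(0.01)
# With a preliminary estimate far from 0.5, a smaller sample suffices.
n_if_rare = sample_size_for_proportion(0.01, p_star=0.1)
print(n_casino, n_if_rare)
```

Comparing the two results shows why p* = 0.50 is the safe choice when nothing is known about p: any other preliminary value asks for fewer observations.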

Page 27: Statistical Inference

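Confidence Intervals for the mean of a Normal Population, $\mu$

Let

\[ t_1 = \bar{x} - z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

and

\[ t_2 = \bar{x} + z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

Then t1 to t2 is a (1 – $\alpha$)100% = P100% confidence interval for $\mu$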

Page 28: Statistical Inference

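Logic:

\[ z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \]

has a Standard Normal distribution

Then

\[ P[-z_{\alpha/2} \le z \le z_{\alpha/2}] = 1 - \alpha = P \]

and

\[ P\left[-z_{\alpha/2} \le \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \le z_{\alpha/2}\right] = 1 - \alpha \]

Hence

\[ P[\bar{x} - z_{\alpha/2}\,\sigma_{\bar{x}} \le \mu \le \bar{x} + z_{\alpha/2}\,\sigma_{\bar{x}}] = 1 - \alpha \]

\[ P[t_1 \le \mu \le t_2] = 1 - \alpha \]

Thus t1 to t2 is a (1 – $\alpha$)100% = P100% confidence interval for $\mu$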

Page 29: Statistical Inference

Example

• Suppose we are interested in the average Bone Mass Density (BMD) for women aged 70-75

• A sample of n = 100 women aged 70-75 is selected and BMD is measured for each individual in the sample.

• The average BMD for these individuals is: $\bar{x} = 25.63$

• The standard deviation (s) of BMD for these individuals is: $s = 7.82$

Page 30: Statistical Inference

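If P = 1 – $\alpha$ = 0.95 then $\alpha/2$ = .025 and $z_{\alpha/2}$ = 1.960

Then

\[ t_1 = \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \approx \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} = 25.63 - (1.960)\frac{7.82}{\sqrt{100}} = 25.63 - 1.53 = 24.10 \]

and

\[ t_2 = \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \approx \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} = 25.63 + (1.960)\frac{7.82}{\sqrt{100}} = 25.63 + 1.53 = 27.16 \]

Thus a 95% confidence interval for $\mu$ is 24.10 to 27.16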
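The BMD interval can be recomputed from the summary statistics alone. The sketch below follows the slides in substituting the sample standard deviation s for the population standard deviation, which is reasonable for a sample as large as n = 100; the function name `mean_ci` is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(x_bar: float, s: float, n: int,
            confidence: float = 0.95) -> tuple[float, float]:
    """z-based confidence interval for a mean, with s standing in for sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# BMD example: x-bar = 25.63, s = 7.82, n = 100.
t1, t2 = mean_ci(25.63, 7.82, 100)
print(f"95% CI for the mean: {t1:.2f} to {t2:.2f}")
```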

Page 31: Statistical Inference

Determination of Sample Size

Again a question to be asked:

How large should the sample be?

Page 32: Statistical Inference

Answer:

Depends on:

• How accurate you want the answer.

Accuracy is specified by:

• Specifying the magnitude of the error bound

• Level of confidence

Page 33: Statistical Inference

Error Bound:

\[ B = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

• If we have specified the level of confidence then the value of $z_{\alpha/2}$ will be known.

• If we have specified the magnitude of B, it will also be known

Solving for n we get:

\[ n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2} \approx \frac{z_{\alpha/2}^2\,(s^*)^2}{B^2} \]

Page 34: Statistical Inference

Summarizing:

The sample size that will estimate $\mu$ with an Error Bound B and level of confidence P = 1 – $\alpha$ is:

\[ n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2} \approx \frac{z_{\alpha/2}^2\,(s^*)^2}{B^2} \]

where:

• B is the desired Error Bound

• $z_{\alpha/2}$ is the $\alpha/2$ critical value for the standard normal distribution

• s* is some preliminary estimate of s.

Page 35: Statistical Inference

Notes:

• n increases as B, the desired Error Bound, decreases

– Larger sample size required for higher level of accuracy

• n increases as the level of confidence, (1 – $\alpha$), increases

– $z_{\alpha/2}$ increases as $\alpha/2$ becomes closer to zero.

– Larger sample size required for higher level of confidence

• n increases as the standard deviation, $\sigma$, of the population increases.

– If the population is more variable then a larger sample size is required

\[ n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2} \approx \frac{z_{\alpha/2}^2\,(s^*)^2}{B^2} \]

Page 36: Statistical Inference

Summary:

The sample size n depends on:

• Desired level of accuracy

• Desired level of confidence

• Variability of the population

Page 37: Statistical Inference

Example

• Suppose that one is interested in estimating the average number of grams of fat ($\mu$) in one kilogram of lean beef hamburger.

This will be estimated by:

• randomly selecting one kilogram samples, then

• measuring the fat content for each sample.

• Preliminary estimates of $\mu$ and $\sigma$ indicate:

– that $\mu$ and $\sigma$ are approximately 220 and 40 respectively.

• I want the study to estimate $\mu$ with an error bound 5 and

• a level of confidence to be 95% (i.e. $\alpha$ = 0.05 and $z_{\alpha/2} = z_{0.025}$ = 1.960)

Page 38: Statistical Inference

Solution

\[ n = \frac{(1.960)^2 (40)^2}{5^2} = 245.9 \approx 246 \]

Hence n = 246 one kilogram samples are required to estimate $\mu$ within B = 5 gms with a 95% level of confidence.
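The same calculation can be written as a function of the error bound, the confidence level and a preliminary standard deviation. This is a minimal sketch with a hypothetical function name; it rounds up, since a fractional sample is not possible.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(error_bound: float, s_star: float,
                         confidence: float = 0.95) -> int:
    """Smallest n with z^2 * (s*)^2 / B^2 <= n, where s* is a preliminary
    estimate of the population standard deviation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * s_star ** 2 / error_bound ** 2)


# Hamburger example: B = 5 g, preliminary standard deviation 40 g.
n_beef = sample_size_for_mean(5, 40)
# Halving the error bound roughly quadruples the required sample size.
n_tighter = sample_size_for_mean(2.5, 40)
print(n_beef, n_tighter)
```

The second call illustrates the note made earlier in the slides: n grows as the desired Error Bound shrinks, and it grows with the square of the tightening.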

Page 39: Statistical Inference

Confidence Intervals

Page 40: Statistical Inference

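Confidence Interval for a Proportion

\[ \hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} \]

where

\[ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \approx \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

and $z_{\alpha/2}$ = upper $\alpha/2$ critical point of the standard normal distribution

Error Bound

\[ B = z_{\alpha/2}\,\sigma_{\hat{p}} = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \approx z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]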

Page 41: Statistical Inference

Determination of Sample Size

The sample size that will estimate p with an Error Bound B and level of confidence P = 1 – $\alpha$ is:

\[ n = \frac{z_{\alpha/2}^2\,p^*(1-p^*)}{B^2} \]

where:

• B is the desired Error Bound

• $z_{\alpha/2}$ is the $\alpha/2$ critical value for the standard normal distribution

• p* is some preliminary estimate of p.

Page 42: Statistical Inference

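Confidence Intervals for the mean of a Normal Population, $\mu$

\[ \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} \quad \text{or} \quad \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad \text{or} \quad \bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}} \]

where $\bar{x}$ = sample mean, s = sample standard deviation, and $z_{\alpha/2}$ = upper $\alpha/2$ critical point of the standard normal distribution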

Page 43: Statistical Inference

Determination of Sample Size

The sample size that will estimate $\mu$ with an Error Bound B and level of confidence P = 1 – $\alpha$ is:

\[ n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2} \approx \frac{z_{\alpha/2}^2\,(s^*)^2}{B^2} \]

where:

• B is the desired Error Bound

• $z_{\alpha/2}$ is the $\alpha/2$ critical value for the standard normal distribution

• s* is some preliminary estimate of s.

Page 44: Statistical Inference

Hypothesis Testing

An important area of statistical inference

Page 45: Statistical Inference

Definition

Hypothesis (H)

– Statement about the parameters of the population

• In hypothesis testing there are two hypotheses of interest.

– The null hypothesis (H0)

– The alternative hypothesis (HA)

Page 46: Statistical Inference

Either

– null hypothesis (H0) is true or

– the alternative hypothesis (HA) is true.

But not both

We say that H0 and HA are mutually exclusive and exhaustive.

Page 47: Statistical Inference

One has to make a decision

– either to accept the null hypothesis (equivalent to rejecting HA)

or

– to reject the null hypothesis (equivalent to accepting HA)

Page 48: Statistical Inference

There are two possible errors that can be made.

1. Rejecting the null hypothesis when it is true. (type I error)

2. Accepting the null hypothesis when it is false. (type II error)

Page 49: Statistical Inference

An analogy – a jury trial

The two possible decisions are

– Declare the accused innocent.

– Declare the accused guilty.

Page 50: Statistical Inference

The null hypothesis (H0) – the accused is innocent

The alternative hypothesis (HA) – the accused is guilty

Page 51: Statistical Inference

The two possible errors that can be made:

– Declaring an innocent person guilty.(type I error)

– Declaring a guilty person innocent.(type II error)

Note: in this case one type of error may be considered more serious

Page 52: Statistical Inference

Decision Table showing types of Error

              H0 is True          H0 is False

Accept H0     Correct Decision    Type II Error

Reject H0     Type I Error        Correct Decision

Page 53: Statistical Inference

To define a statistical Test we

1. Choose a statistic (called the test statistic)

2. Divide the range of possible values for the test statistic into two parts

• The Acceptance Region

• The Critical Region

Page 54: Statistical Inference

To perform a statistical Test we

1. Collect the data.

2. Compute the value of the test statistic.

3. Make the Decision:

• If the value of the test statistic is in the Acceptance Region we decide to accept H0 .

• If the value of the test statistic is in the Critical Region we decide to reject H0 .

Page 55: Statistical Inference

Example

We are interested in determining if a coin is fair.

i.e. H0 : p = probability of tossing a head = ½.

To test this we will toss the coin n = 10 times.

The test statistic is x = the number of heads.

This statistic will have a binomial distribution with p = ½ and n = 10 if the null hypothesis is true.

Page 56: Statistical Inference

[Figure: Sampling distribution of x when H0 is true (binomial with n = 10 and p = ½); horizontal axis x = 0 to 10, vertical axis probability from 0 to 0.3.]

Page 57: Statistical Inference

Note

We would expect the test statistic x to be around 5 if H0 : p = ½ is true.

Acceptance Region = {3, 4, 5, 6, 7}.

Critical Region = {0, 1, 2, 8, 9, 10}.

The reason for the choice of the Acceptance region:

Contains the values that we would expect for x if the null hypothesis is true.

Page 58: Statistical Inference

Definitions: For any statistical testing procedure define

$\alpha$ = P[Rejecting the null hypothesis when it is true] = P[type I error]

$\beta$ = P[accepting the null hypothesis when it is false] = P[type II error]

Page 59: Statistical Inference

In the last example

$\alpha$ = P[type I error] = p(0) + p(1) + p(2) + p(8) + p(9) + p(10) = 0.109, where p(x) are binomial probabilities with p = ½ and n = 10.

$\beta$ = P[type II error] = p(3) + p(4) + p(5) + p(6) + p(7), where p(x) are binomial probabilities with p (not equal to ½) and n = 10. Note: these will depend on the value of p.

Page 60: Statistical Inference

Table: Probability of a Type II error, $\beta$, vs. p

p      $\beta$
0.1    0.070
0.2    0.322
0.3    0.616
0.4    0.820
0.6    0.820
0.7    0.616
0.8    0.322
0.9    0.070

Note: the magnitude of $\beta$ increases as p gets closer to ½.
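The type I and type II error probabilities for the coin example can be computed directly from the binomial distribution. This is a minimal sketch; the helper names are assumptions introduced for illustration.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability of exactly x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def type_i_error(acceptance: set[int], n: int, p0: float) -> float:
    """P[x falls outside the acceptance region] when H0: p = p0 is true."""
    return sum(binom_pmf(x, n, p0) for x in range(n + 1) if x not in acceptance)


def type_ii_error(acceptance: set[int], n: int, p: float) -> float:
    """P[x falls inside the acceptance region] when the true probability is p."""
    return sum(binom_pmf(x, n, p) for x in acceptance)


acceptance = {3, 4, 5, 6, 7}
alpha = type_i_error(acceptance, n=10, p0=0.5)
betas = {p: type_ii_error(acceptance, n=10, p=p) for p in (0.1, 0.2, 0.3, 0.4)}

print(f"alpha = {alpha:.3f}")
for p, beta in betas.items():
    print(f"p = {p}: beta = {beta:.3f}")
```

By the symmetry of the test around ½, the values for p = 0.6 to 0.9 mirror those for p = 0.4 to 0.1, which is why only four alternatives are evaluated here.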

Page 61: Statistical Inference

Comments:

1. You can control $\alpha$ = P[type I error] and $\beta$ = P[type II error] by widening or narrowing the acceptance region.

2. Widening the acceptance region decreases $\alpha$ = P[type I error] but increases $\beta$ = P[type II error].

3. Narrowing the acceptance region increases $\alpha$ = P[type I error] but decreases $\beta$ = P[type II error].

Page 62: Statistical Inference

Example – Widening the Acceptance Region

1. Suppose the Acceptance Region includes, in addition to its previous values, 2 and 8. Then $\alpha$ = P[type I error] = p(0) + p(1) + p(9) + p(10) = 0.021, where again p(x) are binomial probabilities with p = ½ and n = 10.

$\beta$ = P[type II error] = p(2) + p(3) + p(4) + p(5) + p(6) + p(7) + p(8). Tabled values of $\beta$ are given on the next page.

Page 63: Statistical Inference

Table: Probability of a Type II error, $\beta$, vs. p

p      $\beta$
0.1    0.264
0.2    0.624
0.3    0.851
0.4    0.952
0.6    0.952
0.7    0.851
0.8    0.624
0.9    0.264

Note: Compare these values with those for the previous definition of the Acceptance Region. They have increased.

Page 64: Statistical Inference

Example – Narrowing the Acceptance Region

1. Suppose the original Acceptance Region excludes the values 3 and 7. That is, the Acceptance Region is {4,5,6}. Then $\alpha$ = P[type I error] = p(0) + p(1) + p(2) + p(3) + p(7) + p(8) + p(9) + p(10) = 0.344.

$\beta$ = P[type II error] = p(4) + p(5) + p(6). Tabled values of $\beta$ are given on the next page.

Page 65: Statistical Inference

Table: Probability of a Type II error, $\beta$, vs. p

p      $\beta$
0.1    0.013
0.2    0.120
0.3    0.340
0.4    0.563
0.6    0.563
0.7    0.340
0.8    0.120
0.9    0.013

Note: Compare these values with those for the original definition of the Acceptance Region. They have decreased.

Page 66: Statistical Inference

Acceptance Region    {4,5,6}    {3,4,5,6,7}    {2,3,4,5,6,7,8}

$\alpha$             0.344      0.109          0.021

$\beta$ at p = 0.1   0.013      0.070          0.264
$\beta$ at p = 0.2   0.120      0.322          0.624
$\beta$ at p = 0.3   0.340      0.616          0.851
$\beta$ at p = 0.4   0.563      0.820          0.952
$\beta$ at p = 0.6   0.563      0.820          0.952
$\beta$ at p = 0.7   0.340      0.616          0.851
$\beta$ at p = 0.8   0.120      0.322          0.624
$\beta$ at p = 0.9   0.013      0.070          0.264
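The trade-off between the two error probabilities can be seen by evaluating the three acceptance regions side by side. The sketch below fixes one alternative, p = 0.3, purely as an illustrative assumption; the function name `region_errors` is also hypothetical.

```python
from math import comb


def region_errors(acceptance, n=10, p0=0.5, p_alt=0.3):
    """Return (alpha, beta) for an acceptance region of a binomial test.

    alpha is computed under H0: p = p0, beta under the alternative p = p_alt.
    """
    def pmf(x, p):
        return comb(n, x) * p ** x * (1 - p) ** (n - x)

    alpha = 1 - sum(pmf(x, p0) for x in acceptance)
    beta = sum(pmf(x, p_alt) for x in acceptance)
    return alpha, beta


# Narrow, original and wide acceptance regions from the coin example.
regions = [range(4, 7), range(3, 8), range(2, 9)]
results = [region_errors(r) for r in regions]

for region, (alpha, beta) in zip(regions, results):
    print(f"{list(region)}: alpha = {alpha:.3f}, beta(p=0.3) = {beta:.3f}")
```

Reading down the output, alpha falls while beta rises as the region widens, so with n fixed one error probability can only be reduced at the expense of the other.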

Page 67: Statistical Inference

The Approach in Statistical Testing is:

• Set up the Acceptance Region so that $\alpha$ is close to some predetermined value (the usual values are 0.05 or 0.01)

• The predetermined value of $\alpha$ (0.05 or 0.01) is called the significance level of the test.

• The significance level of the test is $\alpha$ = P[test makes a type I error]

Page 68: Statistical Inference

The z-test for Proportions

Testing the probability of success in a binomial experiment

Page 69: Statistical Inference

Situation

• A success-failure experiment has been repeated n times

• The probability of success p is unknown. We want to test

– H0: $p = p_0$ (some specified value of p)

Against

– HA: $p \ne p_0$

Page 70: Statistical Inference

The Data

• The success-failure experiment has been repeated n times

• The number of successes x is observed.

\[ \hat{p} = \frac{x}{n} = \text{the proportion of successes} \]

• Obviously if this proportion is close to p0 the Null Hypothesis should be accepted; otherwise the Null Hypothesis should be rejected.

Page 71: Statistical Inference

The Test Statistic

• To decide to accept or reject the Null Hypothesis (H0) we will use the test statistic

\[ z = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}} = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} \]

• If H0 is true we should expect the test statistic z to be close to zero.

• If H0 is true we should expect the test statistic z to have a standard normal distribution.

• If HA is true we should expect the test statistic z to be different from zero.

Page 72: Statistical Inference

The sampling distribution of z when H0 is true:

The Standard Normal distribution

[Figure: standard normal curve for z, centred at 0; H0 is accepted in the central region and rejected in both tails.]

Page 73: Statistical Inference

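The Acceptance region:

\[ P[\text{Accept } H_0 \text{ when true}] = P[-z_{\alpha/2} \le z \le z_{\alpha/2}] = 1 - \alpha \]

\[ P[\text{Reject } H_0 \text{ when true}] = P[z < -z_{\alpha/2} \text{ or } z > z_{\alpha/2}] = \alpha \]

[Figure: standard normal curve; H0 is accepted between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and rejected in the two tails, each of area $\alpha/2$.]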

Page 74: Statistical Inference

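• Acceptance Region

– Accept H0 if: $-z_{\alpha/2} \le z \le z_{\alpha/2}$

• Critical Region

– Reject H0 if: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

• With this Choice

\[ P[\text{Type I Error}] = P[\text{Reject } H_0 \text{ when true}] = P[z < -z_{\alpha/2} \text{ or } z > z_{\alpha/2}] = \alpha \]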

Page 75: Statistical Inference

Summary

To Test for a binomial probability p

H0: $p = p_0$ (some specified value of p)

Against

HA: $p \ne p_0$

we

1. Decide on $\alpha$ = P[Type I Error] = the significance level of the test (usual choices 0.05 or 0.01)

Page 76: Statistical Inference

2. Collect the data

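3. Compute the test statistic

\[ z = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}} = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} \]

4. Make the Decision

• Accept H0 if: $-z_{\alpha/2} \le z \le z_{\alpha/2}$

• Reject H0 if: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$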

Page 77: Statistical Inference

Example

• In the last election the proportion of the voters who voted for the Liberal party was 0.08 (8 %)

• The party is interested in determining if that percentage has changed

• A sample of n = 800 voters are surveyed

Page 78: Statistical Inference

We want to test

– H0: p = 0.08 (8%)

Against

– HA: $p \ne 0.08$ (8%)

Page 79: Statistical Inference

Summary

1. Decide on $\alpha$ = P[Type I Error] = the significance level of the test

Choose ($\alpha$ = 0.05)

2. Collect the data

• The number in the sample that support the Liberal party is x = 92

\[ \hat{p} = \frac{x}{n} = \frac{92}{800} = 0.115 \quad (11.5\%) \]

Page 80: Statistical Inference

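3. Compute the test statistic

\[ z = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}} = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \frac{0.115 - 0.08}{\sqrt{\dfrac{0.08(1-0.08)}{800}}} = 3.649 \]

4. Make the Decision, using $z_{\alpha/2} = z_{0.025} = 1.960$

• Accept H0 if: $-1.960 \le z \le 1.960$

• Reject H0 if: $z < -1.960$ or $z > 1.960$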

Page 81: Statistical Inference

Since the test statistic is in the Critical region we decide to Reject H0

Conclude that H0: p = 0.08 (8%) is false

There is a significant difference ($\alpha$ = 5%) between the proportion of voters supporting the Liberal party now and the proportion in the last election
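The four steps of the two-sided z-test for a proportion can be collected into one function. This is a minimal sketch under the same normal approximation the slides use; the name `proportion_z_test` and its return layout are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x: int, n: int, p0: float, alpha: float = 0.05):
    """Two-sided z-test of H0: p = p0 against HA: p != p0.

    Returns the test statistic, the critical value z_{alpha/2},
    and True when H0 is rejected.
    """
    p_hat = x / n
    # The standard error is evaluated at p0 because it is computed under H0.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit


# Voter example: x = 92 supporters among n = 800, H0: p = 0.08.
z, z_crit, reject = proportion_z_test(92, 800, 0.08)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

For the survey data the statistic falls well beyond the critical value, matching the decision to reject H0.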

Page 82: Statistical Inference

The one tailed z-test

• A success-failure experiment has been repeated n times

• The probability of success p is unknown. We want to test

– H0: $p \le p_0$ (some specified value of p)

Against

– HA: $p > p_0$

• The alternative hypothesis is in this case called a one-sided alternative

Page 83: Statistical Inference

The Test Statistic

• To decide to accept or reject the Null Hypothesis (H0) we will use the test statistic

\[ z = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}} = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} \]

• If H0 is true we should expect the test statistic z to be close to zero or negative

• If p = p0 we should expect the test statistic z to have a standard normal distribution.

• If HA is true we should expect the test statistic z to be a positive number.

Page 84: Statistical Inference

The sampling distribution of z when p = p0 :

The Standard Normal distribution

[Figure: standard normal curve for z, centred at 0; H0 is accepted to the left and rejected in the upper tail.]

Page 85: Statistical Inference

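The Acceptance and Critical region:

\[ P[\text{Accept } H_0 \text{ when true}] = P[z \le z_{\alpha}] = 1 - \alpha \]

\[ P[\text{Reject } H_0 \text{ when true}] = P[z > z_{\alpha}] = \alpha \]

[Figure: standard normal curve; H0 is accepted to the left of $z_{\alpha}$ and rejected in the upper tail of area $\alpha$.]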

Page 86: Statistical Inference

• Acceptance Region

– Accept H0 if: $z \le z_{\alpha}$

• Critical Region

– Reject H0 if: $z > z_{\alpha}$

• The Critical Region is called one-tailed

• With this Choice

\[ P[\text{Type I Error}] = P[\text{Reject } H_0 \text{ when true}] = P[z > z_{\alpha}] = \alpha \]

Page 87: Statistical Inference

Example

• A new surgical procedure is developed for correcting heart defects in infants before the age of one month.

• Previously the procedure was used on infants that were older than one month and the success rate was 91%

• A study is conducted to determine if the success rate of the new procedure is greater than 91% (n = 200)

Page 88: Statistical Inference

We want to test

– H0: $p \le 0.91$ (91%)

Against

– HA: $p > 0.91$ (91%)

where p = the success rate of the new procedure

Page 89: Statistical Inference

Summary

1. Decide on $\alpha$ = P[Type I Error] = the significance level of the test

Choose ($\alpha$ = 0.05)

2. Collect the data

• The number of successful operations in the sample of 200 cases is x = 187

\[ \hat{p} = \frac{x}{n} = \frac{187}{200} = 0.935 \quad (93.5\%) \]

Page 90: Statistical Inference

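3. Compute the test statistic

\[ z = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}} = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \frac{0.935 - 0.91}{\sqrt{\dfrac{0.91(1-0.91)}{200}}} = 1.235 \]

4. Make the Decision, using $z_{\alpha} = z_{0.05} = 1.645$

• Accept H0 if: $z \le 1.645$

• Reject H0 if: $z > 1.645$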

Page 91: Statistical Inference

Since the test statistic is in the Acceptance region we decide to Accept H0

Conclude that H0: $p \le 0.91$ (91%) is true

There is no significant ($\alpha$ = 5%) increase in the success rate of the new procedure over the older procedure
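The one-tailed version differs from the two-sided test only in the critical value and the rejection rule. This is a minimal sketch for the upper-tailed alternative used in the surgery example; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_proportion_z_test(x: int, n: int, p0: float, alpha: float = 0.05):
    """Upper-tailed z-test of H0: p <= p0 against HA: p > p0.

    Returns the test statistic, the critical value z_alpha,
    and True when H0 is rejected.
    """
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # All of alpha sits in the upper tail, so z_alpha replaces z_{alpha/2}.
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return z, z_crit, z > z_crit


# Surgery example: x = 187 successes among n = 200, H0: p <= 0.91.
z, z_crit, reject = one_tailed_proportion_z_test(187, 200, 0.91)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

Here the statistic stays below the critical value, so H0 is not rejected at the 5% level.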

Page 92: Statistical Inference

Comments

• When the decision is made to accept H0, it should not be concluded that we have proven H0.

• This is because when setting up the test we have not controlled $\beta$ = P[type II error] = P[accepting H0 when H0 is FALSE]

• Whenever H0 is accepted there is a possibility that a type II error has been made.

Page 93: Statistical Inference

In the last example

The conclusion that there is no significant ($\alpha$ = 5%) increase in the success rate of the new procedure over the older procedure should be interpreted:

We have been unable to prove that the new procedure is better than the old procedure

Page 94: Statistical Inference

An analogy – a jury trial

The two possible decisions are

– Declare the accused innocent.

– Declare the accused guilty.

Page 95: Statistical Inference

The null hypothesis (H0) – the accused is innocent

The alternative hypothesis (HA) – the accused is guilty

Page 96: Statistical Inference

The two possible errors that can be made:

– Declaring an innocent person guilty.(type I error)

– Declaring a guilty person innocent.(type II error)

Note: in this case one type of error may be considered more serious

Page 97: Statistical Inference

Requiring all 12 jurors to support a guilty verdict :

– Ensures that the probability of a type I error (Declaring an innocent person guilty) is small.

– However the probability of a type II error (Declaring a guilty person innocent) could be large.

Page 98: Statistical Inference

Hence: When decision of innocence is made:

– It is not concluded that innocence has been proven

but that

– we have been unable to disprove innocence

Page 99: Statistical Inference

The z-test for the Mean of a Normal Population

Let $\mu$ denote the mean of a normal population. We want to test a hypothesis about $\mu$.

Page 100: Statistical Inference

Situation

• A sample of size n has been drawn from a normal population

• The mean $\mu$ of the population is unknown. We want to test

– H0: $\mu = \mu_0$ (some specified value of $\mu$)

Against

– HA: $\mu \ne \mu_0$

Page 101: Statistical Inference

The Data

• Let x1, x2, x3, …, xn denote a sample from a normal population with mean $\mu$ and standard deviation $\sigma$.

• Let

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \text{the sample mean} \]

• We want to test if the mean, $\mu$, is equal to some given value $\mu_0$.

• Obviously if the sample mean is close to $\mu_0$ the Null Hypothesis should be accepted; otherwise the Null Hypothesis should be rejected.

Page 102: Statistical Inference

The Test Statistic

• To decide to accept or reject the Null Hypothesis (H0) we will use the test statistic

\[ z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{\sigma} \approx \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{s} \]

• If H0 is true we should expect the test statistic z to be close to zero.

• If H0 is true we should expect the test statistic z to have a standard normal distribution.

• If HA is true we should expect the test statistic z to be different from zero.

Page 103: Statistical Inference

The sampling distribution of z when H0 is true:

The Standard Normal distribution

[Figure: standard normal curve for z, centred at 0; H0 is accepted in the central region and rejected in both tails.]

Page 104: Statistical Inference

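The Acceptance region:

\[ P[\text{Accept } H_0 \text{ when true}] = P[-z_{\alpha/2} \le z \le z_{\alpha/2}] = 1 - \alpha \]

\[ P[\text{Reject } H_0 \text{ when true}] = P[z < -z_{\alpha/2} \text{ or } z > z_{\alpha/2}] = \alpha \]

[Figure: standard normal curve; H0 is accepted between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and rejected in the two tails, each of area $\alpha/2$.]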

Page 105: Statistical Inference

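• Acceptance Region

– Accept H0 if: $-z_{\alpha/2} \le z \le z_{\alpha/2}$

• Critical Region

– Reject H0 if: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

• With this Choice

\[ P[\text{Type I Error}] = P[\text{Reject } H_0 \text{ when true}] = P[z < -z_{\alpha/2} \text{ or } z > z_{\alpha/2}] = \alpha \]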

Page 106: Statistical Inference

Summary

To test for the mean $\mu$ of a normal population

H0: $\mu = \mu_0$ (some specified value of $\mu$)

Against

HA: $\mu \ne \mu_0$

1. Decide on $\alpha$ = P[Type I Error] = the significance level of the test (usual choices 0.05 or 0.01)

Page 107: Statistical Inference

2. Collect the data

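3. Compute the test statistic

\[ z = \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{\sigma} \approx \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{s} \]

4. Make the Decision

• Accept H0 if: $-z_{\alpha/2} \le z \le z_{\alpha/2}$

• Reject H0 if: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$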

Page 108: Statistical Inference

Example

A manufacturer of glucosamine capsules claims that each capsule contains on average:

• 500 mg of glucosamine

To test this claim n = 40 capsules were selected and the amount of glucosamine (X) was measured in each capsule.

Summary statistics:

\[ \bar{x} = 496.3 \quad \text{and} \quad s = 8.5 \]

Page 109: Statistical Inference

We want to test:

H0: $\mu = 500$ (the manufacturer's claim is correct)

against

HA: $\mu \ne 500$ (the manufacturer's claim is not correct)

Page 110: Statistical Inference

The Test Statistic

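\[ z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{\sigma} \approx \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{s} = \frac{\sqrt{40}\,(496.3 - 500)}{8.5} = -2.75 \]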

Page 111: Statistical Inference

The Critical Region and Acceptance Region

Using $\alpha$ = 0.05

$z_{\alpha/2} = z_{0.025} = 1.960$

We accept H0 if $-1.960 \le z \le 1.960$

and reject H0 if $z < -1.960$ or $z > 1.960$

Page 112: Statistical Inference

The Decision

Since z = -2.75 < -1.960

We reject H0

Conclude: the manufacturer's claim is incorrect.
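The glucosamine test can be reproduced from the summary statistics. The sketch below follows the slides in using the sample standard deviation s in place of the population standard deviation; the function name `mean_z_test` is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def mean_z_test(x_bar: float, s: float, n: int, mu0: float, alpha: float = 0.05):
    """Two-sided z-test of H0: mu = mu0 against HA: mu != mu0.

    Returns the test statistic, the critical value z_{alpha/2},
    and True when H0 is rejected.
    """
    z = sqrt(n) * (x_bar - mu0) / s
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit


# Glucosamine example: x-bar = 496.3, s = 8.5, n = 40, claimed mean 500 mg.
z, z_crit, reject = mean_z_test(496.3, 8.5, 40, 500)
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```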