Parametric Statistical Inference

Instructor: Ron S. Kenett
Email: ron@kpa.co.il
Course Website: www.kpa.co.il/biostat
Course textbook: MODERN INDUSTRIAL STATISTICS, Kenett and Zacks, Duxbury Press, 1998

(c) 2001, Ron S. Kenett, Ph.D.

Mar 26, 2015


Jose Parsons


Course Syllabus

• Understanding Variability
• Variability in Several Dimensions
• Basic Models of Probability
• Sampling for Estimation of Population Quantities
• Parametric Statistical Inference
• Computer Intensive Techniques
• Multiple Linear Regression
• Statistical Process Control
• Design of Experiments


Definitions

Null Hypothesis H0: Put here what is typical of the population, a term that characterizes “business as usual” where nothing out of the ordinary occurs.

Alternative Hypothesis H1: Put here what is the challenge, the view of some characteristic of the population that, if it were true, would trigger some new action, some change in procedures that had previously defined “business as usual.”


The Logic of Hypothesis Testing

Step 1. A claim is made.

A new claim is asserted that challenges existing thoughts about a population characteristic.

Suggestion: Form the alternative hypothesis first, since it embodies the challenge.


The Logic of Hypothesis Testing

Step 2. How much error are you willing to accept?

Select the maximum acceptable error, α. The decision maker must elect how much error he/she is willing to accept in making an inference about the population. The significance level of the test is the maximum probability that the null hypothesis will be rejected incorrectly, a Type I error.


The Logic of Hypothesis Testing

Step 3. If the null hypothesis were true, what would you expect to see?

Assume the null hypothesis is true. This is a very powerful statement. The test is always referenced to the null hypothesis. Form the rejection region, the areas in which the decision maker is willing to reject the presumption of the null hypothesis.


The Logic of Hypothesis Testing

Step 4. What did you actually see?

Compute the sample statistic. The sample provides a set of data that serves as a window to the population. The decision maker computes the sample statistic and calculates how far the sample statistic differs from the presumed distribution that is established by the null hypothesis.


The Logic of Hypothesis Testing

Step 5. Make the decision.

The decision is a conclusion supported by evidence. The decision maker will:
• reject the null hypothesis if the sample evidence is so strong, the sample statistic so unlikely, that the decision maker is convinced H1 must be true.
• fail to reject the null hypothesis if the sample statistic falls in the nonrejection region. In this case, the decision maker is not concluding the null hypothesis is true, only that there is insufficient evidence to dispute it based on this sample.


The Logic of Hypothesis Testing

Step 6. What are the implications of the decision for future actions?

State what the decision means in terms of the research program. The decision maker must draw out the implications of the decision. Is there some action triggered, some change implied? What recommendations might be extended for future attempts to test similar hypotheses?
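The six steps can be sketched in code. This is a minimal illustration, assuming a two-tailed z-test of a mean with known σ; the function name `z_test_two_tailed` and all numbers are hypothetical and not taken from the course material.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_tailed(sample_mean, mu0, sigma, n, alpha=0.05):
    """Walk through the six steps for H0: mu = mu0 versus H1: mu != mu0."""
    # Step 1: the claim (H1: mu != mu0) is fixed by the choice of this test.
    # Step 2: alpha is the maximum acceptable Type I error probability.
    # Step 3: assuming H0 is true, the rejection region is |z| > z_crit.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 4: compute the sample statistic.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 5: make the decision.
    reject = abs(z) > z_crit
    # Step 6: state what the decision implies.
    if reject:
        action = "reject H0: the evidence supports a change"
    else:
        action = "fail to reject H0: insufficient evidence to change"
    return z, z_crit, reject, action


# Hypothetical numbers, chosen only for illustration.
z, z_crit, reject, action = z_test_two_tailed(
    sample_mean=10.4, mu0=10.0, sigma=1.0, n=36
)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, {action}")
```

With these assumed inputs the statistic is z = 2.4, which lies beyond the critical value of about 1.96, so the sketch rejects H0.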


Two Types of Errors

Type I Error: Saying you reject H0 when it really is true. Rejecting a true H0.

Type II Error: Saying you do not reject H0 when it really is false. Failing to reject a false H0.


What are acceptable error levels?

Decision makers frequently use a 5% significance level. Use α = 0.05. An α-error means that we will decide to adjust the machine when it does not need adjustment.

This means, in the case of the robot welder, if the machine is running properly, there is only a 0.05 probability of our making the mistake of concluding that the robot requires adjustment when it really does not.
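The meaning of α = 0.05 can be checked by simulation. The sketch below assumes a process that really is running properly (H0 true), draws many samples from it, and counts how often a two-tailed z-test raises a false alarm. The process mean, σ, sample size, and the function name `false_alarm_rate` are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def false_alarm_rate(alpha=0.05, mu0=0.0, sigma=1.0, n=25,
                     trials=20000, seed=1):
    """Estimate how often a true H0 is rejected (a Type I error)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The sample comes from the H0 distribution: nothing is wrong.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate = false_alarm_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to 0.05, which is the sense in which α bounds the chance of adjusting a machine that needs no adjustment.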


Three Types of Tests

Nondirectional, two-tail test: H1: pop parameter ≠ value

Directional, right-tail test: H1: pop parameter > value

Directional, left-tail test: H1: pop parameter < value

Always put hypotheses in terms of population parameters and have H0: pop parameter = value
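The three kinds of test differ only in where the rejection region sits. A minimal sketch, assuming a z statistic and a hypothetical helper named `rejects_h0`:

```python
from statistics import NormalDist


def rejects_h0(z, alpha=0.05, tail="two"):
    """Return True if z falls in the rejection region for the given tail."""
    std_normal = NormalDist()
    if tail == "two":    # H1: parameter != value, alpha split over both tails
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":  # H1: parameter > value, all of alpha in the right tail
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "left":   # H1: parameter < value, all of alpha in the left tail
        return z < std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right', or 'left'")


# The same statistic can lead to different decisions under different H1.
for tail in ("two", "right", "left"):
    print(tail, rejects_h0(1.80, tail=tail))
```

For z = 1.80 and α = 0.05, only the right-tail test rejects, because its critical value (about 1.645) is smaller than the two-tail value (about 1.96).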


Two-tailed test

H0: pop parameter = value
H1: pop parameter ≠ value

[Figure: sampling distribution with "Reject H0" regions in both tails, beyond –z and +z, and "Do Not Reject H0" between them.]


Right-tailed test

H0: pop parameter ≤ value
H1: pop parameter > value

[Figure: sampling distribution with a "Reject H0" region in the right tail, beyond +z, and "Do Not Reject H0" to its left.]


Left-tailed test

H0: pop parameter ≥ value
H1: pop parameter < value

[Figure: sampling distribution with a "Reject H0" region in the left tail, beyond –z, and "Do Not Reject H0" to its right.]


                      True state: H0     True state: H1
Decision: H0          OK                 Type II Error
Decision: H1          Type I Error       OK


What Test to Apply?

Ask the following questions:
• Are the data the result of a measurement (a continuous variable) or a count (a discrete variable)?
• Is σ known?
• What shape is the distribution of the population?
• What is the sample size?
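These questions can be read as a small decision procedure. The sketch below encodes the rules of thumb used in this deck (n ≥ 30 for means, nπ ≥ 5 and n(1 – π) ≥ 5 for proportions). The function name `choose_test`, its parameters, and the returned labels are hypothetical.

```python
def choose_test(data_type, n, sigma_known=False, population_normal=False,
                pi0=None):
    """Suggest a one-sample test from the answers to the four questions."""
    if data_type == "count":
        if pi0 is None:
            raise ValueError("pi0 is required for a test of a proportion")
        # Large enough sample: normal approximation to the binomial.
        if n * pi0 >= 5 and n * (1 - pi0) >= 5:
            return "z-test of a proportion"
        return "exact binomial test"
    if data_type == "measurement":
        # Small sample from a non-normal population: no z or t test.
        if not population_normal and n < 30:
            return "distribution-free test"
        if sigma_known:
            return "z-test of a mean"
        return "t-test of a mean (df = n - 1)"
    raise ValueError("data_type must be 'measurement' or 'count'")


print(choose_test("measurement", n=12, population_normal=True))
print(choose_test("measurement", n=20))
print(choose_test("count", n=40, pi0=0.10))
```

The order of the checks mirrors the questions: measurement versus count first, then whether σ is known, then distribution shape and sample size.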


Test of µ, σ Known, Population Normally Distributed

Test Statistic:

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

where
• x̄ is the sample statistic.
• µ0 is the value identified in the null hypothesis.
• σ is known.
• n is the sample size.


Test of µ, σ Known, Population Not Normally Distributed

If n ≥ 30, Test Statistic:

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

If n < 30, use a distribution-free test.


Test of µ, σ Unknown, Population Normally Distributed

Test Statistic:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

where
• x̄ is the sample statistic.
• µ0 is the value identified in the null hypothesis.
• σ is unknown.
• n is the sample size.
• degrees of freedom on t are n – 1.
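The one-sample statistic t = (x̄ – µ0)/(s/√n) with n – 1 degrees of freedom can be sketched as follows. The data values and the function name `one_sample_t` are hypothetical, and the critical value still has to be read from a t table for the chosen α.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return the t statistic and its degrees of freedom for H0: mu = mu0."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation, divisor n - 1
    t = (x_bar - mu0) / (s / sqrt(n))
    return t, n - 1


# Hypothetical measurements, for illustration only.
sample = [9.8, 10.2, 10.4, 10.1, 9.9, 10.6, 10.3, 10.0, 10.5]
t, df = one_sample_t(sample, mu0=10.0)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

Here x̄ = 10.2 and s² = 0.075, so t is about 2.19 on 8 degrees of freedom.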


Test of µ, σ Unknown, Population Not Normally Distributed

If n ≥ 30, Test Statistic:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

If n < 30, use a distribution-free test.


Test of π, Sample Sufficiently Large

If both nπ ≥ 5 and n(1 – π) ≥ 5, Test Statistic:

$$z = \frac{p - \pi_0}{\sqrt{\dfrac{\pi_0 (1 - \pi_0)}{n}}}$$

where
• p = sample proportion.
• π0 is the value identified in the null hypothesis.
• n is the sample size.
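The large-sample test of a proportion, z = (p – π0)/√(π0(1 – π0)/n), can be sketched as below. The counts and the function name `proportion_z` are hypothetical; the size check uses π0 because the test is referenced to the null hypothesis.

```python
from math import sqrt


def proportion_z(successes, n, pi0):
    """Return the sample proportion and z statistic for H0: pi = pi0."""
    # Rule of thumb from the slide: both expected counts must be at least 5.
    if n * pi0 < 5 or n * (1 - pi0) < 5:
        raise ValueError("sample not sufficiently large for the z-test")
    p = successes / n
    z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)
    return p, z


# Hypothetical counts: 18 defectives in 100 items, H0: pi = 0.10.
p, z = proportion_z(successes=18, n=100, pi0=0.10)
print(f"p = {p:.2f}, z = {z:.3f}")
```

With these assumed numbers the standard error is 0.03 and z is about 2.67.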


Test of π, Sample Not Sufficiently Large

If either nπ < 5 or n(1 – π) < 5, convert the proportion to the underlying binomial distribution.

Note there is no t-test on a population proportion.
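Converting to the underlying binomial distribution means computing the tail probability of the observed count directly. A minimal sketch for a right-tailed test, with hypothetical counts and a hypothetical function name `binomial_right_tail_p`:

```python
from math import comb


def binomial_right_tail_p(successes, n, pi0):
    """P(X >= successes) when X ~ Binomial(n, pi0), i.e. under H0."""
    return sum(
        comb(n, k) * pi0 ** k * (1 - pi0) ** (n - k)
        for k in range(successes, n + 1)
    )


# Hypothetical case: 3 defectives in 20 items, H0: pi = 0.05.
# Here n * pi0 = 1 < 5, so the normal approximation is not used.
p_value = binomial_right_tail_p(successes=3, n=20, pi0=0.05)
print(f"exact right-tail p-value: {p_value:.4f}")
```

For these assumed numbers the exact probability is about 0.0755, so H0 would not be rejected at α = 0.05.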


Observed Significance Levels

A p-Value is:
• the exact level of significance of the test statistic.
• the smallest value α can be and still allow us to reject the null hypothesis.
• the amount of area left in the tail beyond the test statistic for a one-tailed hypothesis test, or twice the amount of area left in the tail beyond the test statistic for a two-tailed test.
• the probability, assuming the null hypothesis is true, of getting a test statistic from another sample that is at least as far from the hypothesized mean as this sample statistic is.
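The "area in the tail" description translates directly into code. A minimal sketch for a z statistic, with a hypothetical function name `p_value_from_z`:

```python
from statistics import NormalDist


def p_value_from_z(z, tail="two"):
    """Observed significance level of a z statistic."""
    std_normal = NormalDist()
    if tail == "right":
        return 1 - std_normal.cdf(z)             # area to the right of z
    if tail == "left":
        return std_normal.cdf(z)                 # area to the left of z
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))  # twice the one-tail area
    raise ValueError("tail must be 'two', 'right', or 'left'")


print(f"two-tailed p for z = 1.96: {p_value_from_z(1.96):.4f}")
print(f"right-tailed p for z = 1.96: {p_value_from_z(1.96, 'right'):.4f}")
```

The decision rule is then to reject H0 whenever the p-value is smaller than α, which is equivalent to checking whether the statistic falls in the rejection region.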



Several Samples

Independent Samples: Testing a company’s claim that its peanut butter contains less fat than that produced by a competitor.

Dependent Samples: Testing the relative fuel efficiency of 10 trucks that run the same route twice, once with the current air filter installed and once with the new filter.


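Test of (µ1 – µ2), σ1 = σ2, Populations Normal

Test Statistic

$$t = \frac{[\bar{x}_1 - \bar{x}_2] - [\mu_1 - \mu_2]_0}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$

where

$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$

and degrees of freedom on t = n1 + n2 – 2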
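The pooled-variance statistic can be sketched as follows: first pool the two sample variances into s_p², then divide the difference in sample means by √(s_p²(1/n1 + 1/n2)). The data and the function name `pooled_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2, delta0=0.0):
    """t statistic and df for H0: mu1 - mu2 = delta0, equal variances."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (
        n1 + n2 - 2
    )
    t = ((mean(sample1) - mean(sample2)) - delta0) / sqrt(
        sp2 * (1 / n1 + 1 / n2)
    )
    return t, n1 + n2 - 2


# Hypothetical fat content (grams per serving) for two brands.
brand_a = [16.0, 15.5, 16.5, 15.0, 17.0]
brand_b = [17.0, 18.0, 16.5, 17.5, 18.5, 17.5]
t, df = pooled_t(brand_a, brand_b)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

For these assumed samples s_p² = 5/9 and t is about -3.32 on 9 degrees of freedom.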


Example: Comparing Two populations

Hypothesis: The mean of population 1 is equal to the mean of population 2

H0: µpop1 = µpop2
H1: µpop1 ≠ µpop2

Assumption: (1) Both distributions are normal, σ1 = σ2

Test Statistic: t distribution with df = n1 + n2 – 2

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right) \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}}$$


Example: Comparing Two populations

[Figure: density t(x; nu) of the t distribution for x from -5 to 5, with curves for nu = 5 and nu = 50 and the rejection regions marked in both tails.]

Test Statistic: t distribution with df = n1 + n2 – 2

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right) \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}}$$


Test of (µ1 – µ2), σ1 ≠ σ2, Populations Normal, large n

Test Statistic

$$z = \frac{[\bar{x}_1 - \bar{x}_2] - [\mu_1 - \mu_2]_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

with s1² and s2² as estimates for σ1² and σ2²
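When the variances are not assumed equal and both samples are large, each sample variance enters the standard error separately. A minimal sketch from summary statistics; the numbers and the function name `two_sample_z` are hypothetical.

```python
from math import sqrt


def two_sample_z(x_bar1, s1, n1, x_bar2, s2, n2, delta0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = delta0."""
    # s1 and s2 are sample standard deviations used as estimates of sigma.
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((x_bar1 - x_bar2) - delta0) / standard_error


# Hypothetical summary statistics for two large samples.
z = two_sample_z(x_bar1=52.0, s1=6.0, n1=36, x_bar2=49.0, s2=8.0, n2=64)
print(f"z = {z:.3f}")
```

With these assumed inputs each variance term equals 1, so the standard error is √2 and z is about 2.12.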


Test of Dependent Samples (µ1 – µ2) = µd

Test Statistic

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$$

where d = (x1 – x2)

d̄ = Σd/n, the average difference

n = the number of pairs of observations

sd = the standard deviation of d

df = n – 1
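For dependent samples the test reduces to a one-sample t-test on the differences. The sketch below uses made-up fuel-efficiency readings for 10 trucks, in the spirit of the air filter example; the values and the function name `paired_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x1, x2, mu_d0=0.0):
    """t statistic and df for H0: mu_d = mu_d0 on paired observations."""
    d = [a - b for a, b in zip(x1, x2)]  # one difference per pair
    n = len(d)
    t = (mean(d) - mu_d0) / (stdev(d) / sqrt(n))
    return t, n - 1


# Hypothetical miles per gallon for the same 10 trucks on the same route.
new_filter = [21, 22, 21, 25, 20, 24, 22, 25, 23, 24]
current_filter = [20, 22, 19, 24, 21, 23, 20, 25, 22, 21]
t, df = paired_t(new_filter, current_filter)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The assumed differences average 1.0 with variance 12/9, giving t of about 2.74 on 9 degrees of freedom.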


Test of (π1 – π2), where n1p1 ≥ 5, n1(1 – p1) ≥ 5, n2p2 ≥ 5, and n2(1 – p2) ≥ 5

Test Statistic

$$z = \frac{p_1 - p_2}{\sqrt{\bar{p} (1 - \bar{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}, \qquad \bar{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$

where p1 = observed proportion, sample 1

p2 = observed proportion, sample 2

n1 = sample size, sample 1

n2 = sample size, sample 2
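The two-proportion statistic pools both samples into a single estimate p̄ before computing the standard error. A minimal sketch with hypothetical counts and a hypothetical function name `two_proportion_z`:

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """z statistic and pooled proportion for H0: pi1 = pi2."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion: total successes over total observations.
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    z = (p1 - p2) / sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return z, p_bar


# Hypothetical counts: 30 of 200 versus 20 of 200.
z, p_bar = two_proportion_z(x1=30, n1=200, x2=20, n2=200)
print(f"pooled proportion = {p_bar:.3f}, z = {z:.3f}")
```

With these assumed counts p̄ = 0.125 and z is about 1.51, short of the two-tailed 5% critical value of about 1.96.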


Test of Equal Variances

Pooled-variances t-test assumes the two population variances are equal.

The F-test can be used to test that assumption.

The F-distribution is the sampling distribution of s1²/s2² that would result if two samples were repeatedly drawn from a single normally distributed population.


Test of σ1² = σ2²

If σ1² = σ2², then σ1²/σ2² = 1. So the hypotheses can be worded either way.

Test Statistic:

$$F = \frac{s_1^2}{s_2^2} \quad \text{or} \quad F = \frac{s_2^2}{s_1^2}, \quad \text{whichever is larger}$$

The critical value of the F will be F(α/2, ν1, ν2) where
• α = the specified level of significance
• ν1 = (n – 1), where n is the size of the sample with the larger variance
• ν2 = (n – 1), where n is the size of the sample with the smaller variance
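Putting the larger sample variance in the numerator, and tracking which sample supplies which degrees of freedom, can be sketched as below. The data and the function name `f_statistic` are hypothetical; the critical value F(α/2, ν1, ν2) still comes from an F table.

```python
from statistics import variance


def f_statistic(sample1, sample2):
    """Return (F, nu1, nu2) with the larger sample variance on top."""
    v1, v2 = variance(sample1), variance(sample2)
    n1, n2 = len(sample1), len(sample2)
    if v1 >= v2:
        return v1 / v2, n1 - 1, n2 - 1
    # Swap roles so that F >= 1 and nu1 belongs to the larger variance.
    return v2 / v1, n2 - 1, n1 - 1


# Hypothetical samples, for illustration only.
sample_a = [16.0, 15.5, 16.5, 15.0, 17.0]
sample_b = [17.0, 18.0, 16.5, 17.5, 18.5, 17.5]
F, nu1, nu2 = f_statistic(sample_a, sample_b)
print(f"F = {F:.3f} with ({nu1}, {nu2}) degrees of freedom")
```

Here the sample variances are 0.625 and 0.5, so F = 1.25 with (4, 5) degrees of freedom. Passing the samples in the opposite order gives the same result.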


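Confidence Interval for (µ1 – µ2)

The 100(1 – α)% confidence interval for the difference in two means:

Equal variances, populations normal

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}$$

Unequal variances, large samples

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$$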
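Both intervals have the form (difference in sample means) ± (critical value) × (standard error). A minimal sketch from summary statistics; the function names and numbers are hypothetical, and the t critical value is passed in as a table lookup because the standard library has no t quantile function.

```python
from math import sqrt
from statistics import NormalDist


def ci_pooled(x_bar1, s1, n1, x_bar2, s2, n2, t_crit):
    """Equal variances, populations normal; t_crit has n1 + n2 - 2 df."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    half_width = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = x_bar1 - x_bar2
    return diff - half_width, diff + half_width


def ci_large_samples(x_bar1, s1, n1, x_bar2, s2, n2, alpha=0.05):
    """Unequal variances, large samples."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z_crit * sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x_bar1 - x_bar2
    return diff - half_width, diff + half_width


# Hypothetical summary statistics; 2.101 is the table value t(0.025, 18).
print(ci_pooled(12.0, 2.0, 10, 10.0, 2.0, 10, t_crit=2.101))
print(ci_large_samples(52.0, 6.0, 36, 49.0, 8.0, 64))
```

An interval that excludes zero leads to the same conclusion as rejecting H0: µ1 = µ2 in the corresponding two-tailed test at the same α.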


Confidence Interval for (π1 – π2)

The 100(1 – α)% confidence interval for the difference in two proportions:

$$(p_1 - p_2) \pm z_{\alpha/2} \sqrt{\dfrac{p_1 (1 - p_1)}{n_1} + \dfrac{p_2 (1 - p_2)}{n_2}}$$

when sample sizes are sufficiently large.
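Unlike the two-proportion test, this interval does not pool the samples: each proportion contributes its own variance term. A minimal sketch with hypothetical counts and a hypothetical function name `ci_two_proportions`:

```python
from math import sqrt
from statistics import NormalDist


def ci_two_proportions(x1, n1, x2, n2, alpha=0.05):
    """Large-sample confidence interval for pi1 - pi2."""
    p1, p2 = x1 / n1, x2 / n2
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Unpooled standard error: one variance term per sample.
    half_width = z_crit * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - half_width, diff + half_width


# Hypothetical counts: 30 of 200 versus 20 of 200.
low, high = ci_two_proportions(x1=30, n1=200, x2=20, n2=200)
print(f"95% CI for the difference: ({low:.4f}, {high:.4f})")
```

For these assumed counts the interval runs from about -0.015 to about 0.115 and therefore includes zero.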


Summary

Hypothesis: The mean of population 1 is equal to the mean of population 2
Assumption: (1) Both distributions are normal, σ1 = σ2
Test Statistic: t distribution with df = n1 + n2 – 2

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right) \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}}$$

Hypothesis: The standard deviation of population 1 is equal to the standard deviation of population 2
Assumption: Both distributions are normal
Test Statistic: F distribution with df1 = n1 – 1 and df2 = n2 – 1

$$F = \frac{s_1^2}{s_2^2}$$

Hypothesis: The proportion of errors in population 1 is equal to the proportion of errors in population 2
Assumption: n1p1 and n2p2 > 5 (approximation by normal distribution)
Test Statistic: Z - Normal distribution

$$Z = \frac{X_1 / n_1 - X_2 / n_2}{\sqrt{p_{avg} (1 - p_{avg}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}, \qquad p_{avg} = \frac{X_1 + X_2}{n_1 + n_2}$$