Lecture (11, 12): Parameter Estimation of PDF and Fitting a Distribution Function

Jan 10, 2016


Brooke

Page 1: Lecture (11 ,12)

Lecture (11, 12)

Parameter Estimation of PDF and Fitting a Distribution Function

Page 2: Lecture (11 ,12)

How can we specify a distribution from the data?

Two-step procedure:

1. Decide which family to use (Normal, Log-normal, exponential, etc.). This step is done by:
- Guessing the family by looking at the observations.
- Using the Chi-square goodness-of-fit test to test our guess.

2. Decide which member of the chosen family to use. This means specifying the values of the parameters.

This is done by producing estimates of the parameters based on the observations in the sample.

Page 3: Lecture (11 ,12)

Estimation

Estimation has to do with the second step.

2. Decide which member of the chosen family to use. This means specifying the values of the parameters.

Page 4: Lecture (11 ,12)

General Concept of Modelling

Page 5: Lecture (11 ,12)

Point Estimates

A point estimate of an unknown parameter is a number which, to the best of our knowledge, represents the parameter value.

Each random sample gives a different estimate, so the estimator is regarded as a random variable.

A good estimator has the following properties:
1. It gives a good result: not always too big or always too small.
2. It is unbiased: the expected value of the estimator equals the true value of the parameter.
3. Its variance is small.

Page 6: Lecture (11 ,12)

Unbiased Estimators

$\bar{x}$ is an unbiased estimator for $\mu$ if

$$E\{\bar{x}\} = \mu$$

$s^2$ is an unbiased estimator for $\sigma^2$ if

$$E\{s^2\} = \sigma^2$$
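The two unbiasedness conditions can be checked numerically. The following is a minimal simulation sketch, assuming a Normal population with hypothetical parameters (mu = 10, sigma = 2) and small samples of n = 5; the function name, seed and parameter values are illustrative and not taken from the lecture. It also compares the n - 1 divisor with the n divisor for the sample variance, because only the n - 1 version averages out to sigma squared.

```python
import random

def average_estimates(mu=10.0, sigma=2.0, n=5, trials=10000, seed=1):
    """Average x-bar and two variance estimates over many random samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sum_mean = sum_var_n1 = sum_var_n = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
        sum_mean += xbar
        sum_var_n1 += ss / (n - 1)  # divisor n - 1: unbiased for sigma^2
        sum_var_n += ss / n         # divisor n: too small by the factor (n - 1)/n
    return sum_mean / trials, sum_var_n1 / trials, sum_var_n / trials

mean_of_means, mean_s2_unbiased, mean_s2_biased = average_estimates()
print(mean_of_means, mean_s2_unbiased, mean_s2_biased)
```

With these assumed values the first two averages should land close to 10 and 4, while the third should sit near 3.2, that is (n - 1)/n times sigma squared.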

Page 7: Lecture (11 ,12)

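Method of Moments

If the random variable $x$ has a Normal distribution with unknown parameters $\mu$ and $\sigma^2$, the method of moments is simple:

$$\mu = \text{first moment of } x, \qquad \bar{x} = \frac{1}{n}\sum_{j=1}^{r} f_j x_j$$

$$\sigma^2 = \text{second central moment of } x, \qquad s^2 = \frac{1}{n}\sum_{j=1}^{r} f_j (x_j - \bar{x})^2$$

Here $f_j$ is the observed frequency of class $j$, $x_j$ is its class mark, $r$ is the number of classes and $n$ is the number of observations.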
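As a worked illustration of the moment formulas for grouped data, here is a short sketch. It borrows the class marks and observed frequencies of the shale-thickness table from Example 2 later in this lecture; the function name `grouped_moments` is hypothetical, and the variance uses the divisor n (the plain second central moment), which is an assumption about the slide's formula.

```python
def grouped_moments(class_marks, frequencies):
    """First moment and second central moment of grouped data."""
    n = sum(frequencies)  # total number of observations
    mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n
    variance = sum(f * (x - mean) ** 2
                   for f, x in zip(frequencies, class_marks)) / n
    return mean, variance

# Class marks and observed frequencies from the Example 2 table.
class_marks = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]
observed = [14, 18, 26, 18, 20, 18, 24, 22, 16, 24]

mu_hat, sigma2_hat = grouped_moments(class_marks, observed)
print(round(mu_hat, 2), round(sigma2_hat, 4))  # 5.18 7.9176
```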

Page 8: Lecture (11 ,12)

Method of Moments (Cont.)

We use the following notation:

$$\hat{\mu} = \bar{x}, \qquad \hat{\sigma}^2 = s^2$$

The "hat" indicates that the given value is only an estimate.

$\bar{x}$ and $s^2$ are random variables.

$\hat{\mu}$ and $\hat{\sigma}^2$ are also random variables.

Page 9: Lecture (11 ,12)

Mean of the Means

$$E\{\bar{x}\} = \frac{1}{n}\left[E\{x_1\} + E\{x_2\} + E\{x_3\} + \dots + E\{x_n\}\right]$$

$$E\{\bar{x}\} = \frac{1}{n}\left[\mu + \mu + \dots + \mu\right] = \mu$$

Page 10: Lecture (11 ,12)

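Standard Deviation of the Means

$$\sigma^2_{X_1+X_2} = \sigma^2_{X_1} + \sigma^2_{X_2} + 2\rho_{X_1,X_2}\,\sigma_{X_1}\sigma_{X_2}$$

For independent observations $\rho_{X_1,X_2} = 0$, so

$$\sigma^2_{X_1+X_2} = \sigma^2_{X_1} + \sigma^2_{X_2} = 2\sigma_X^2$$

By induction,

$$\sigma^2_{X_1+X_2+\dots+X_n} = n\sigma_X^2$$

$$\sigma^2_{\bar{x}} = \frac{n\sigma_X^2}{n^2} = \frac{\sigma_X^2}{n}, \qquad \sigma_{\bar{x}} = \frac{\sigma_X}{\sqrt{n}}$$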
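The result that the standard deviation of the sample mean equals the population standard deviation divided by the square root of n can be checked by simulation. This is a rough sketch under stated assumptions: the population is taken to be exponential with mean 2 and standard deviation 2 (a deliberately non-Normal choice), and the function name, seed and number of trials are illustrative only.

```python
import math
import random

def sd_of_sample_means(n, trials=10000, seed=2):
    """Empirical standard deviation of the means of many samples of size n."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        # Exponential population with rate 0.5: mean 2, standard deviation 2.
        sample = [rng.expovariate(0.5) for _ in range(n)]
        means.append(sum(sample) / n)
    grand_mean = sum(means) / trials
    var = sum((m - grand_mean) ** 2 for m in means) / (trials - 1)
    return math.sqrt(var)

sigma = 2.0
for n in (4, 16):
    # Simulated value next to the theoretical sigma / sqrt(n).
    print(n, sd_of_sample_means(n), sigma / math.sqrt(n))
```

Quadrupling the sample size from 4 to 16 should roughly halve the spread of the sample means.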

Page 11: Lecture (11 ,12)

Confidence Intervals

The random variable $x$ in the population has

$$E\{x\} = \mu \quad \text{and} \quad \sigma^2(x) = \sigma^2$$

Irrespective of the sample size and the distribution of $x$, the following holds:

$$E\{\bar{x}\} = \mu \quad \text{and} \quad \sigma^2(\bar{x}) = \frac{\sigma^2}{n}, \qquad E\{s^2\} = \sigma^2$$

When we estimate a parameter from a sample, the estimate can differ from sample to sample. It is therefore better to indicate the reliability of the estimate. This can be done by giving a confidence interval for the result.

Page 12: Lecture (11 ,12)

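Confidence Intervals for the mean

How to construct a confidence interval for $\mu$?

1. Estimate $\hat{\mu} = \bar{x}$.

2. For a 90% confidence interval we can derive from the table of N(0,1), by cutting 5% from both ends:

$$P(-1.645 < u < 1.645) = 0.90$$

3. With $u = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$:

$$P\left(-1.645 < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < 1.645\right) = 0.90$$

$$P\left(\bar{x} - 1.645\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.645\frac{\sigma}{\sqrt{n}}\right) = 0.90$$

[Figure: distribution of $\bar{x}$ centred on $\mu$, with limits $\mu - z_{\alpha/2}\,\sigma/\sqrt{n}$ and $\mu + z_{\alpha/2}\,\sigma/\sqrt{n}$, a margin $E$ on each side of $\mu$, and an area $\alpha/2$ in each tail.]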

Page 13: Lecture (11 ,12)

Confidence Interval For the Mean (cont.)

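• A general expression for a 100(1-α)% confidence interval for the mean is given by:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

[Figure: distribution of $\bar{x}$ centred on $\mu$, with limits $\mu - z_{\alpha/2}\,\sigma/\sqrt{n}$ and $\mu + z_{\alpha/2}\,\sigma/\sqrt{n}$, a margin $E$ on each side of $\mu$, and an area $\alpha/2$ in each tail.]

$$E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$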

Page 14: Lecture (11 ,12)

Confidence Interval For the Mean (Cont.)

• According to the above formula we have:
• 90%: $\bar{x} \pm 1.645\,\sigma/\sqrt{n}$
• 95%: $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$
• 98%: $\bar{x} \pm 2.33\,\sigma/\sqrt{n}$
• 99%: $\bar{x} \pm 2.576\,\sigma/\sqrt{n}$

• These formulae apply for any population as long as the sample size is sufficiently large for the central limit theorem to hold.
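The general expression for the interval can be written as a small helper. The sketch below uses `statistics.NormalDist` from the Python standard library to look up the z multiplier instead of a printed N(0,1) table; the function name `mean_confidence_interval` and its argument names are hypothetical, and it assumes that the population standard deviation sigma is known.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(xbar, sigma, n, confidence=0.90):
    """Two-sided confidence interval for the mean, sigma assumed known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # cuts alpha/2 from each tail
    margin = z * sigma / sqrt(n)                 # E = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Multipliers for the four confidence levels listed on the slide.
for level in (0.90, 0.95, 0.98, 0.99):
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    print(level, round(z, 3))
```

The loop reproduces the 1.645, 1.96 and 2.576 multipliers; for 98% it gives about 2.326, which the slide rounds to 2.33.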

Page 15: Lecture (11 ,12)

Statistical Inference for The population Variance

• For normal populations, statistical inference procedures are available for the population variance.

• The sample variance $S^2$ is an unbiased estimator of $\sigma^2$.

• We assume we have a random sample of n observations from a normal population with unknown variance $\sigma^2$.

Page 16: Lecture (11 ,12)

The Chi Square Distribution

• If the population is Normal with variance $\sigma^2$, then the statistic

$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$$

has a Chi Square distribution with (n-1) degrees of freedom.

Page 17: Lecture (11 ,12)

Confidence Region For The Variance

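• Using this result, a confidence interval for $\sigma^2$ is given by the interval:

$$\left( \frac{(n-1)S^2}{\chi^2_{\alpha/2;(n-1)}},\ \frac{(n-1)S^2}{\chi^2_{1-\alpha/2;(n-1)}} \right)$$

Here $\chi^2_{p;(n-1)}$ denotes the value that is exceeded with probability $p$, so $\chi^2_{\alpha/2;(n-1)}$ is the larger of the two critical values.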

Page 18: Lecture (11 ,12)

Confidence Region For The Variance

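[Figure: chi-square density with (n-1) d.f.; the critical values $\chi^2_{1-\alpha/2;(n-1)}$ and $\chi^2_{\alpha/2;(n-1)}$ each cut off a tail area of $\alpha/2$.]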

Page 19: Lecture (11 ,12)

Estimation of the Confidence Intervals of the variance

How to construct a confidence interval for $\sigma^2$?

1. Estimate $\hat{\sigma}^2 = s^2$.

2. For a 90% confidence interval we use the $\chi^2$-distribution:

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}, \qquad \text{degrees of freedom: } df = n - 1$$

$$P(\chi^2_{0.95} < \chi^2 < \chi^2_{0.05}) = 0.90$$

$$P(a < \chi^2 < b) = 0.90$$

3. We determine $a$ and $b$ from the table of $\chi^2$ in such a way that we cut off 5% from both tails. This gives (for $n = 30$):

$$a = 17.7 \quad \text{and} \quad b = 42.6$$

$$P(17.7 < \chi^2 < 42.6) = 0.90$$

$$P\left(17.7 < \frac{(n-1)s^2}{\sigma^2} < 42.6\right) = 0.90$$

$$P\left(\frac{(n-1)s^2}{42.6} < \sigma^2 < \frac{(n-1)s^2}{17.7}\right) = 0.90$$
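The last step of this derivation is easy to turn into a small calculation. In the sketch below the two chi-square table values are passed in by hand (17.7 and 42.6 for n = 30, as read from the table above), so no statistics library is needed; the function name is hypothetical and the sample variance s2 = 4.0 is a made-up number used only for illustration.

```python
def variance_confidence_interval(s2, n, chi2_low, chi2_high):
    """Confidence interval for sigma^2 from tabulated chi-square values.

    chi2_low and chi2_high are the lower and upper critical values for
    n - 1 degrees of freedom, looked up in a chi-square table.
    """
    lower = (n - 1) * s2 / chi2_high  # dividing by the larger value gives the lower limit
    upper = (n - 1) * s2 / chi2_low   # dividing by the smaller value gives the upper limit
    return lower, upper

# n = 30 with the 90% table values from the lecture; s2 = 4.0 is hypothetical.
low, high = variance_confidence_interval(s2=4.0, n=30, chi2_low=17.7, chi2_high=42.6)
print(round(low, 2), round(high, 2))
```

Note that the interval is not symmetric around s2, because the chi-square distribution itself is not symmetric.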

Page 20: Lecture (11 ,12)

Fitting a Distribution Function

Page 21: Lecture (11 ,12)

A goodness-of-fit test is an inferential procedure used to determine whether a frequency distribution follows a claimed distribution.

Goodness-of-Fit Test

Page 22: Lecture (11 ,12)

Hypothesis Testing

• Hypothesis:
– A statement which can be proven false
• Null hypothesis (Ho):
– “There is no difference”
• Alternative hypothesis (H1):
– “There is a difference…”
• In statistical testing, we try to “reject the null hypothesis”
– If the null hypothesis is false, it is likely that our alternative hypothesis is true
– “False”: there is only a small probability that the results we observed could have occurred by chance

Page 23: Lecture (11 ,12)

Application of Testing hypothesis on Goodness of Fit

Testing Hypothesis:
Ho: the null hypothesis is that the distribution function is a good fit to the empirical distribution.
H1: the alternative hypothesis is that the distribution function is not a good fit to the empirical distribution.

Testing of hypothesis is a procedure for deciding whether to accept or reject the hypothesis.

The Chi-squared test can be used to test if the fit is satisfactory.

Page 24: Lecture (11 ,12)

Testing Goodness of Fit of a Distribution Function to an Empirical Distribution

| Unknown real situation | Decision: Accept Ho | Decision: Reject Ho |
|---|---|---|
| Ho is True | Correct Decision | Type I Error (probability α) |
| Ho is False | Type II Error (probability β) | Correct Decision |

Page 25: Lecture (11 ,12)

Common Values for Significance Levels

Significance level of the test (risk level):

$\alpha = P(\text{type I error})$

Common values for $\alpha$ are 1%, 2.5%, 5% or 10%

$\beta = P(\text{type II error})$

Common values for $\beta$ are 1%, 2.5%

Page 26: Lecture (11 ,12)

1. It is not symmetric.

2. The shape of the chi-square distribution depends upon the degrees of freedom.

3. As the number of degrees of freedom increases, the chi-square distribution becomes more symmetric as is illustrated in the figure.

4. The values are non-negative. That is, the values of $\chi^2$ are greater than or equal to 0.

The Chi-Square Distribution

Page 27: Lecture (11 ,12)

Chi2 Degrees of Freedom

• All statistical tests require the computation of degrees of freedom

• Chi2 df = (No. classes -1)

Page 28: Lecture (11 ,12)

Critical Values of Chi2

Significance Level

| df | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 |
|---|---|---|---|---|---|
| 1 | 2.7055 | 3.8415 | 5.0239 | 6.6349 | 7.8794 |
| 2 | 4.6052 | 5.9915 | 7.3778 | 9.2104 | 10.5965 |
| 3 | 6.2514 | 7.8147 | 9.3484 | 11.3449 | 12.8381 |

Page 29: Lecture (11 ,12)

Chi-Square Table

Page 30: Lecture (11 ,12)

Step 1: A claim is made regarding a distribution. The claim is used to determine the null and alternative hypotheses.

Ho: the random variable follows the claimed distribution

H1: the random variable does not follow the claimed distribution

Procedure for Chi Square Test

Page 31: Lecture (11 ,12)

Step 2: Calculate the expected frequencies for each of the k classes. The expected frequencies are Ei , i = 1, 2, …, k assuming the null hypothesis is true.

Procedure for Chi Square Test (cont.)

Page 32: Lecture (11 ,12)

Step 3: Verify that the requirements for the goodness-of-fit test are satisfied:

(1) all expected frequencies are greater than or equal to 1 (all Ei ≥ 1)

(2) no more than 20% of the expected frequencies are less than 5.

Procedure for Chi Square Test (cont.)
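The two requirements of Step 3 can be checked mechanically before the test statistic is computed. This is a minimal sketch; the function name `requirements_met` is hypothetical and the input is simply the list of expected frequencies Ei.

```python
def requirements_met(expected):
    """Check the two goodness-of-fit requirements on expected frequencies."""
    # (1) every expected frequency is at least 1
    all_at_least_one = all(e >= 1 for e in expected)
    # (2) at most 20% of the expected frequencies are below 5
    share_below_five = sum(1 for e in expected if e < 5) / len(expected)
    return all_at_least_one and share_below_five <= 0.20

print(requirements_met([20] * 10))           # ten classes with Ei = 20
print(requirements_met([4, 4, 10, 10, 10]))  # 40% of the classes are below 5
```

When the check fails, a common remedy is to merge neighbouring classes until both conditions hold.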

Page 33: Lecture (11 ,12)

Procedure for Chi Square Test (cont.)

Page 34: Lecture (11 ,12)

Procedure for Chi Square Test (cont.)

Page 35: Lecture (11 ,12)

Procedure for Chi Square Test (cont.)

Page 36: Lecture (11 ,12)

Example 1 (Discrete Variable)

Page 37: Lecture (11 ,12)

Example 1 (cont.)

• Observed frequency ($f_o$)
– The obtained frequency for each category.

Page 38: Lecture (11 ,12)

Example 1 (cont.)

• State the research hypothesis.
– Is the rat’s behavior random?

• State the statistical hypotheses.

Page 39: Lecture (11 ,12)

Example 1 (cont.)

If picked by chance, each of the four choices has probability .25:

$$H_0: P_A = .25,\ P_B = .25,\ P_C = .25,\ P_D = .25$$

$$H_A: H_0 \text{ is false.}$$

Page 40: Lecture (11 ,12)

Example 1 (cont.)

• Expected frequency ($f_e$)
– The hypothesized frequency for each distribution, given the null hypothesis is true.
– Expected proportion multiplied by number of observations:

$$f_e = .25(32) = 8$$

Page 41: Lecture (11 ,12)

Example 1 (cont.)

• Set the decision rule.

Page 42: Lecture (11 ,12)

Example 1 (cont.)

• Set the decision rule.
– Degrees of freedom
• Number of categories − 1
• (C) − 1

$$df = 4 - 1 = 3$$

Page 43: Lecture (11 ,12)

Example 1 (cont.)

• Set the decision rule.

$$\alpha = .05, \qquad df = 3, \qquad \chi^2_{crit} = 7.81$$

Page 44: Lecture (11 ,12)

Example 1 (cont.)

• Calculate the test statistic.

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

Page 45: Lecture (11 ,12)

Example 1 (cont.)

• Calculate the test statistic.

Page 46: Lecture (11 ,12)

Example 1 (cont.)

• Decide if your result is significant.
– Reject H0, 9.25 > 7.81

• Interpret your results.
– The rat’s behavior was not random.

[Figure: $\chi^2$ distribution with the critical value 7.81 marked; "Do Not Reject H0" region to the left of it, "Reject H0" region to the right.]

Page 47: Lecture (11 ,12)

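Example 2 (Continuous Variable)

| Class number | Class interval | Class mark | Observed frequency | Expected frequency |
|---|---|---|---|---|
| 1 | 0.00-1.00 | 0.5 | 14 | 20 |
| 2 | 1.00-2.00 | 1.5 | 18 | 20 |
| 3 | 2.00-3.00 | 2.5 | 26 | 20 |
| 4 | 3.00-4.00 | 3.5 | 18 | 20 |
| 5 | 4.00-5.00 | 4.5 | 20 | 20 |
| 6 | 5.00-6.00 | 5.5 | 18 | 20 |
| 7 | 6.00-7.00 | 6.5 | 24 | 20 |
| 8 | 7.00-8.00 | 7.5 | 22 | 20 |
| 9 | 8.00-9.00 | 8.5 | 16 | 20 |
| 10 | 9.00-10.0 | 9.5 | 24 | 20 |

Observed frequency of 10 size classes of shale thicknesses.

For a uniform distribution on the interval from $a = 0$ to $b = 10$:

$$f(x) = \frac{1}{b-a} = \frac{1}{10-0} = \frac{1}{10}$$

$$E_{f_j} = f(x)\cdot N\cdot \Delta x = 0.1 \times 200 \times 1 = 20$$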

Page 48: Lecture (11 ,12)

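Example (Cont.)

$$\chi^2 = \sum_{j=1}^{k} \frac{(E_{f_j} - O_{f_j})^2}{E_{f_j}}$$

$$\chi^2 = \frac{(20-14)^2}{20} + \frac{(20-18)^2}{20} + \frac{(20-26)^2}{20} + \frac{(20-18)^2}{20} + \frac{(20-20)^2}{20} + \frac{(20-18)^2}{20} + \frac{(20-24)^2}{20} + \frac{(20-22)^2}{20} + \frac{(20-16)^2}{20} + \frac{(20-24)^2}{20} = 6.8$$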
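The sum above can be reproduced in a few lines. The sketch below takes the observed frequencies from the shale-thickness table and the expected frequency of 20 per class under the uniform hypothesis; the function name `chi_square_statistic` is illustrative.

```python
def chi_square_statistic(observed, expected):
    """Sum of (E - O)^2 / E over all classes."""
    return sum((e - o) ** 2 / e for o, e in zip(observed, expected))

# Observed frequencies of the 10 shale-thickness classes.
observed = [14, 18, 26, 18, 20, 18, 24, 22, 16, 24]
# Uniform hypothesis: 0.1 * 200 * 1 = 20 expected observations per class.
expected = [20.0] * 10

chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 2))  # 6.8
```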

Page 49: Lecture (11 ,12)

Example (Cont.)

For significance level (risk level) $\alpha = 0.05$:

degrees of freedom = (n - 1) = 10 - 1 = 9, where n is the number of classes

the table gives $\chi^2_{critical} = 16.92$

$\chi^2 < \chi^2_{critical}$ (6.8 < 16.92), so we accept the hypothesis that the distribution is uniform, because we do not have sufficient evidence to reject it at the chosen level of significance.
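The decision rule used in both examples is a single comparison between the computed statistic and the tabulated critical value. A minimal sketch follows, with a hypothetical function name; the critical values are the ones quoted in the lecture and are entered by hand rather than computed.

```python
def goodness_of_fit_decision(chi2, chi2_critical):
    """Reject H0 only when the statistic exceeds the critical value."""
    return "reject H0" if chi2 > chi2_critical else "do not reject H0"

# Example 1: statistic 9.25 against the critical value 7.81 (df = 3).
print(goodness_of_fit_decision(9.25, 7.81))   # reject H0
# Example 2: statistic 6.8 against the critical value 16.92 (df = 9).
print(goodness_of_fit_decision(6.8, 16.92))   # do not reject H0
```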

Page 50: Lecture (11 ,12)

Chi2 Graph

[Figure: $\chi^2$ distribution with the critical value 16.92 marked.]

Page 51: Lecture (11 ,12)

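Exercise

In exercise 1, test the hypothesis that the distribution is Normal for $\alpha = 0.05$.

$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

$$E_{f_j} = f(x)\cdot N\cdot \Delta x$$

where,

$N$ = number of observations,

$\Delta x$ = class interval.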
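The expected-frequency formula for a Normal hypothesis can be sketched as follows. Since the data of the exercise are not reproduced in this transcript, the sketch borrows the class marks of the shale-thickness table from Example 2 purely for illustration, together with the values 5.18 and 7.9176 for the mean and variance (obtained from that table with the method-of-moments formulas). The function names are hypothetical. The density is evaluated at each class mark and multiplied by N and the class interval, as in the formula above, which is an approximation to the exact class probability.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Normal density f(x; mu, sigma)."""
    return exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * sqrt(2.0 * pi))

def normal_expected_frequencies(class_marks, n_obs, width, mu, sigma):
    """Expected frequency per class: f(x_j) * N * delta_x."""
    return [normal_pdf(x, mu, sigma) * n_obs * width for x in class_marks]

# Illustrative inputs: class marks 0.5, 1.5, ..., 9.5 and N = 200.
class_marks = [0.5 + j for j in range(10)]
mu_hat, sigma_hat = 5.18, sqrt(7.9176)  # assumed moment estimates, see text above

expected = normal_expected_frequencies(class_marks, n_obs=200, width=1.0,
                                       mu=mu_hat, sigma=sigma_hat)
print([round(e, 1) for e in expected])
```

The expected frequencies add up to less than N because part of the fitted Normal density lies outside the range covered by the classes.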

Page 52: Lecture (11 ,12)

Other Statistical Tests

• The Chi2 and independent T-test are very useful.

• A variety of other tests are available for other research designs.

• Parametric examples follow:
• T-tests are used to compare 2 groups.
• F tests (Analysis of Variance tests) are used to compare more than 2 groups.

Page 53: Lecture (11 ,12)

Common Statistical Tests

| Question | Test |
|---|---|
| Does a single observation belong to a population of values? | Z-test (Standard Normal Dist) |
| Are two (or more) populations of numbers different? | T-test, F-test (ANOVA) |
| Is there a relationship between x and y? | Regression |
| Is there a trend in the data (special case of above)? | Regression |

Page 54: Lecture (11 ,12)

SPSS and Computer Applications

• Most actual analysis is done by computer.
• Simple tests are easily done in Excel.
• Sophisticated programs (such as SPSS) are used for more complicated designs.