Parameter estimation, maximum likelihood and least squares techniques
Jorge Andre Swieca School, Campos do Jordão, January 2003 (third lecture)

Transcript
Page 1: Parameter estimation, maximum likelihood and least squares techniques

Parameter estimation, maximum likelihood and least squares techniques

Jorge Andre Swieca School
Campos do Jordão, January 2003
Third lecture

Page 2: Parameter estimation, maximum likelihood and least squares techniques

References

• Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences, R. Barlow, J. Wiley & Sons, 1989

• Statistical Data Analysis, G. Cowan, Oxford, 1998

• Particle Data Group (PDG), Review of Particle Physics, 2002 electronic edition

• Data Analysis: Statistical and Computational Methods for Scientists and Engineers, S. Brandt, 3rd ed., Springer, 1999

Page 3: Parameter estimation, maximum likelihood and least squares techniques

Likelihood

"Verisimilitude (…) is often the whole truth." (Bento's conclusion)
Machado de Assis

"Whoever heard her would take it all for truth, such was the sincere tone, the sweetness of the words, and the verisimilitude of the details."
Quincas Borba, Machado de Assis

Page 4: Parameter estimation, maximum likelihood and least squares techniques

Parameter estimation

p.d.f. $f(x)$; sample space: all possible values of x.

Sample of size n: independent observations $\vec x = (x_1, x_2, \dots, x_n)$.

Joint p.d.f.: $f_{\rm sam}(x_1,\dots,x_n) = f(x_1)\,f(x_2)\cdots f(x_n)$

Central problem of statistics: from n measurements of x, infer the properties of $f(x;\vec\theta)$, $\vec\theta = (\theta_1, \theta_2, \dots, \theta_m)$.

A statistic: a function of the observed $\vec x$.

To estimate a property of the p.d.f. (mean, variance, ...): an estimator.

An estimator $\hat\theta$ for $\theta$ is consistent if

$\lim_{n\to\infty} P(|\hat\theta - \theta| \ge \varepsilon) = 0$

(large-sample or asymptotic limit).

Page 5: Parameter estimation, maximum likelihood and least squares techniques

Parameter estimation

$\hat\theta(x_1,\dots,x_n)$ is a random variable distributed as $g(\hat\theta;\theta)$ (the sampling distribution):

$E[\hat\theta(\vec x)] = \int \hat\theta\, g(\hat\theta;\theta)\, d\hat\theta = \int \hat\theta(\vec x)\, f(x_1;\theta)\cdots f(x_n;\theta)\, dx_1\cdots dx_n$

i.e. an average over an infinite number of similar experiments of size n.

Bias: $b = E[\hat\theta] - \theta$. It depends on:
• the sample size
• the functional form of the estimator
• the true properties of the p.d.f.
If b = 0 independent of n, $\hat\theta$ is unbiased.

Important to combine results of two or more experiments.

Page 6: Parameter estimation, maximum likelihood and least squares techniques

Parameter estimation

Mean square error:

$MSE = E[(\hat\theta - \theta)^2] = E[(\hat\theta - E[\hat\theta])^2] + (E[\hat\theta] - \theta)^2$

$MSE = V[\hat\theta] + b^2$

Classical statistics: there is no unique method for building estimators; given an estimator, one can evaluate its properties.

Sample mean

Suppose $\vec x = (x_1, x_2, \dots, x_n)$ comes from an unknown p.d.f. $f(x)$.

Estimator for $E[x] = \mu$ (population mean); one possibility:

$\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i$

Page 7: Parameter estimation, maximum likelihood and least squares techniques

Parameter estimation

Important property: the weak law of large numbers.

If $V[x]$ exists, $\bar x$ is a consistent estimator for $\mu$: as $n\to\infty$, $\bar x \to \mu$ in the sense of probability.

$E[\bar x] = E\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right] = \frac{1}{n}\sum_{i=1}^{n} E[x_i] = \mu$

with

$E[x_i] = \int\cdots\int x_i\, f(x_1)\cdots f(x_n)\, dx_1\cdots dx_n = \mu$

$\bar x$ is an unbiased estimator for the population mean $\mu$.
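The weak law of large numbers is easy to see numerically; a minimal sketch (the exponential p.d.f., seed and sample sizes are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 2.0  # true mean of the sampled p.d.f. (assumed for this demo)

# sample means for increasing sample sizes: convergence to mu
for n in (10, 1000, 100_000):
    x = rng.exponential(mu, size=n)  # exponential has E[x] = mu and finite V[x]
    print(n, x.mean())
```

The spread of $\bar x$ about $\mu$ shrinks as the sample size grows, as the law requires.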

Page 8: Parameter estimation, maximum likelihood and least squares techniques

Parameter estimation

Sample variance:

$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar x)^2 = \frac{n}{n-1}\left(\overline{x^2} - \bar x^2\right)$

$E[s^2] = \sigma^2$: $s^2$ is an unbiased estimator for $V[x]$.

If $\mu$ is known:

$S^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2$

$S^2$ is an unbiased estimator for $\sigma^2$.
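The unbiasedness of $s^2$, and the bias of the 1/n variant, can be checked by simulation; a sketch assuming a Gaussian p.d.f. with true variance 4.0 (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
sigma2 = 4.0          # true variance of the sampled p.d.f. (assumed)
n, n_exp = 10, 200_000

# many experiments of size n
x = rng.normal(0.0, np.sqrt(sigma2), size=(n_exp, n))

s2 = x.var(axis=1, ddof=1)         # 1/(n-1) definition: the sample variance
s2_biased = x.var(axis=1, ddof=0)  # 1/n definition

print(s2.mean())         # close to sigma2 = 4.0: unbiased
print(s2_biased.mean())  # close to (n-1)/n * sigma2 = 3.6: biased low
```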

Page 9: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

Technique for estimating parameters $\theta$ given a finite sample of data $\vec x = (x_1, x_2, \dots, x_n)$.

Suppose the functional form of $f(x;\theta)$ is known.

The probability for $x_1$ to be in $[x_1, x_1 + dx_1]$ is $f(x_1;\theta)\,dx_1$; the probability for $x_i$ to be in $[x_i, x_i + dx_i]$ for all i is

$\prod_{i=1}^{n} f(x_i;\theta)\,dx_i$

If the parameters are correct, we expect a high probability for the data.

$L(\theta) = \prod_{i=1}^{n} f(x_i;\theta)$, the likelihood function:
• the joint probability density, but viewed with
• the $\theta$ as variables and
• the $\vec x$ as (fixed) parameters

ML estimators for $\theta$: the values that maximize the likelihood function,

$\frac{\partial L}{\partial\theta_i} = 0, \quad i = 1, \dots, m \quad\Rightarrow\quad \hat{\vec\theta} = (\hat\theta_1, \dots, \hat\theta_m)$

Page 10: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

Page 11: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

n decay times for unstable particles: $t_1, \dots, t_n$.

Hypothesis: the distribution is an exponential p.d.f. with mean $\tau$,

$f(t;\tau) = \frac{1}{\tau}\, e^{-t/\tau}$

$\log L(\tau) = \sum_{i=1}^{n} \log f(t_i;\tau) = \sum_{i=1}^{n} \left(\log\frac{1}{\tau} - \frac{t_i}{\tau}\right)$

$\frac{\partial \log L(\tau)}{\partial\tau} = 0 \quad\Rightarrow\quad \hat\tau = \frac{1}{n}\sum_{i=1}^{n} t_i$

$E[\hat\tau(t_1,\dots,t_n)] = \int\cdots\int \hat\tau(t_1,\dots,t_n)\, f_{\rm joint}(t_1,\dots,t_n;\tau)\, dt_1\cdots dt_n$

$= \int\cdots\int \left(\frac{1}{n}\sum_{i=1}^{n} t_i\right) \frac{1}{\tau}e^{-t_1/\tau}\cdots\frac{1}{\tau}e^{-t_n/\tau}\, dt_1\cdots dt_n$

$= \frac{1}{n}\sum_{i=1}^{n} \int \frac{t_i}{\tau}\,e^{-t_i/\tau}\,dt_i \prod_{j\neq i} \int \frac{1}{\tau}\,e^{-t_j/\tau}\,dt_j = \frac{1}{n}\,n\,\tau = \tau$

so $\hat\tau$ is unbiased.
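As a quick illustration of this result, the ML estimate $\hat\tau$ is just the sample mean of the decay times; a sketch with simulated data (true τ, seed and sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
tau_true = 1.0
t = rng.exponential(tau_true, size=50)  # 50 simulated decay times

tau_hat = t.mean()  # ML estimator for the exponential mean
print(tau_hat)      # scatters around tau_true with spread ~ tau/sqrt(50)
```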

Page 12: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

(Figure: simulated sample and fitted exponential.) 50 decay times generated with $\tau = 1.0$; the ML estimate is $\hat\tau = 1.062$.

Page 13: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

What about $\lambda = 1/\tau$?

In general, for a function $a(\theta)$ with $\partial a/\partial\theta \neq 0$:

$\frac{\partial L}{\partial a} = \frac{\partial L}{\partial\theta}\,\frac{\partial\theta}{\partial a} = 0 \quad\Rightarrow\quad \hat a = a(\hat\theta)$

so

$\hat\lambda = \frac{1}{\hat\tau} = \left(\frac{1}{n}\sum_{i=1}^{n} t_i\right)^{-1}$

$E[\hat\lambda] = \frac{n}{n-1}\,\lambda$

$\hat\lambda$ is biased for finite n; it is an unbiased estimator for $\lambda$ only when $n \to \infty$.
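The n/(n-1) bias of $\hat\lambda$ can be made visible by simulating many small-n experiments; a sketch (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
lam_true, n, n_exp = 1.0, 5, 400_000  # small n makes the bias visible

t = rng.exponential(1.0 / lam_true, size=(n_exp, n))
lam_hats = 1.0 / t.mean(axis=1)       # lambda_hat = 1 / tau_hat per experiment

print(lam_hats.mean())  # close to n/(n-1) * lambda = 1.25, not 1.0
```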

Page 14: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

n measurements of x assumed to come from a Gaussian:

$f(x;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

$\log L(\mu,\sigma^2) = \sum_{i=1}^{n} \log f(x_i;\mu,\sigma^2) = \sum_{i=1}^{n} \left(-\frac{1}{2}\log 2\pi - \frac{1}{2}\log\sigma^2 - \frac{(x_i-\mu)^2}{2\sigma^2}\right)$

$\frac{\partial\log L}{\partial\mu} = 0 \quad\Rightarrow\quad \hat\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad E[\hat\mu] = \mu$ (unbiased)

$\frac{\partial\log L}{\partial\sigma^2} = 0 \quad\Rightarrow\quad \widehat{\sigma^2} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat\mu)^2, \qquad E[\widehat{\sigma^2}] = \frac{n-1}{n}\,\sigma^2$

i.e. $\widehat{\sigma^2}$ is unbiased only for large n.

Page 15: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

We showed that $s^2$ is an unbiased estimator for the variance of any p.d.f., so

$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \hat\mu)^2$

is an unbiased estimator for $\sigma^2$.

Page 16: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

Variance of ML estimators

Many experiments (same n): what is the spread of $\hat\tau$? Analytically (exponential), with $\hat\tau = \frac{1}{n}\sum_{i=1}^{n} t_i$:

$V[\hat\tau] = E[\hat\tau^2] - (E[\hat\tau])^2$

$E[\hat\tau^2] = \int\cdots\int \left(\frac{1}{n}\sum_{i=1}^{n} t_i\right)^{2} \frac{1}{\tau}e^{-t_1/\tau}\cdots\frac{1}{\tau}e^{-t_n/\tau}\, dt_1\cdots dt_n$

$E[\hat\tau] = \int\cdots\int \left(\frac{1}{n}\sum_{i=1}^{n} t_i\right) \frac{1}{\tau}e^{-t_1/\tau}\cdots\frac{1}{\tau}e^{-t_n/\tau}\, dt_1\cdots dt_n = \tau$

$V[\hat\tau] = \frac{\tau^2}{n}$

By the transformation invariance of ML estimators, the ML estimate of the variance is

$\widehat{\sigma^2_{\hat\tau}} = \frac{\hat\tau^2}{n}, \qquad \hat\sigma_{\hat\tau} = \frac{\hat\tau}{\sqrt{n}}$

Page 17: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

If the experiment were repeated many times (with the same n), the standard deviation of the estimates $\hat\tau$ would be 0.43.

• this is one possible interpretation of the quoted error
• it is not the standard one when the distribution of the estimator is not Gaussian (the standard is a 68.3% central confidence interval, which coincides with $\pm$ one standard deviation only if the p.d.f. for the estimator is Gaussian)
• in the large-sample limit, ML estimates are distributed according to a Gaussian p.d.f.
• then the two procedures lead to the same result

Page 18: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

Variance: Monte Carlo method

For cases too difficult to solve analytically, use the MC method:

• simulate a large number of experiments
• compute the ML estimate each time
• look at the distribution of the resulting values

$s^2$ is an unbiased estimator for the variance of a p.d.f., so s from the MC experiments estimates the statistical error of the parameter estimated from the real measurement.

Asymptotic normality is a general property of ML estimators for large samples.
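The three MC steps above can be sketched as follows for the exponential example (true τ, seed and experiment counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
tau_true, n, n_exp = 1.0, 50, 1000

# step 1: simulate many experiments, each with n decay times
t = rng.exponential(tau_true, size=(n_exp, n))

# step 2: compute the ML estimate for each experiment
tau_hats = t.mean(axis=1)

# step 3: the spread of the estimates is the MC error estimate
s = tau_hats.std(ddof=1)
print(s)  # close to the analytic tau/sqrt(n) = 0.141
```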

Page 19: Parameter estimation, maximum likelihood and least squares techniques

Maximum likelihood

(Figure: distribution of the ML estimates $\hat\tau$.)

1000 experiments, 50 observations/experiment. Sample standard deviation of the estimates: s = 0.151, consistent with the analytic

$\hat\sigma_{\hat\tau} = \frac{\hat\tau}{\sqrt{n}} = \frac{1.062}{\sqrt{50}} = 0.150$

Page 20: Parameter estimation, maximum likelihood and least squares techniques

RCF bound

A way to estimate the variance of any estimator without analytical calculations or MC:

$V[\hat\theta] \ge \frac{\left(1 + \frac{\partial b}{\partial\theta}\right)^2}{E\left[-\frac{\partial^2\log L}{\partial\theta^2}\right]}$ (Rao-Cramer-Frechet)

Equality (minimum variance): the estimator is efficient. If efficient estimators exist for a problem, the ML method will find them. ML estimators are always efficient in the large-sample limit.

Ex.: exponential, $f(t;\tau) = \frac{1}{\tau}\,e^{-t/\tau}$:

$-\frac{\partial^2\log L}{\partial\tau^2} = \frac{n}{\tau^2}\left(\frac{2}{n\tau}\sum_{i=1}^{n} t_i - 1\right)$

$E\left[-\frac{\partial^2\log L}{\partial\tau^2}\right] = \frac{n}{\tau^2}\left(\frac{2}{n\tau}\,n\tau - 1\right) = \frac{n}{\tau^2}$

With b = 0:

$V[\hat\tau] \ge \frac{\tau^2}{n}$

equal to the exact result: $\hat\tau$ is an efficient estimator.

Page 21: Parameter estimation, maximum likelihood and least squares techniques

RCF bound

$\vec\theta = (\theta_1,\dots,\theta_m)$: assume efficiency and zero bias; then $V_{ij} = \mathrm{cov}[\hat\theta_i,\hat\theta_j]$ with

$(V^{-1})_{ij} = E\left[-\frac{\partial^2\log L}{\partial\theta_i\,\partial\theta_j}\right] = -\int\cdots\int \left(\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\sum_{k=1}^{n}\log f(x_k;\theta)\right) \prod_{l=1}^{n} f(x_l;\theta)\, dx_1\cdots dx_n$

$= n\int \frac{\partial\log f(x;\theta)}{\partial\theta_i}\,\frac{\partial\log f(x;\theta)}{\partial\theta_j}\, f(x;\theta)\, dx$

$V^{-1} \propto n$: the statistical errors decrease as $\frac{1}{\sqrt{n}}$.

Page 22: Parameter estimation, maximum likelihood and least squares techniques

RCF bound

For a large data sample: evaluate the second derivatives with the measured data at the ML estimates,

$(\widehat{V^{-1}})_{ij} = -\left.\frac{\partial^2\log L}{\partial\theta_i\,\partial\theta_j}\right|_{\vec\theta = \hat{\vec\theta}}$

and, for a single parameter,

$\widehat{\sigma^2_{\hat\theta}} = \left(-\left.\frac{\partial^2\log L}{\partial\theta^2}\right|_{\hat\theta}\right)^{-1}$

This is the usual method for estimating the covariance matrix when the likelihood function is maximized numerically.

Ex.: MINUIT (CERN Program Library):
• second derivatives by finite differences
• invert the matrix to get $V_{ij}$
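A minimal sketch of this procedure for the one-parameter exponential example, using a finite-difference second derivative as MINUIT does (this is not MINUIT itself; the data and step size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
t = rng.exponential(1.0, size=50)  # simulated decay times (illustrative)

def neg_log_l(tau):
    # -log L for the exponential model f(t; tau) = exp(-t/tau)/tau
    return len(t) * np.log(tau) + t.sum() / tau

tau_hat = t.mean()  # analytic ML estimate (minimum of neg_log_l)
h = 1e-4            # finite-difference step

# second derivative of -log L at the minimum (the 1x1 "Hessian")
d2 = (neg_log_l(tau_hat + h) - 2 * neg_log_l(tau_hat)
      + neg_log_l(tau_hat - h)) / h**2

sigma_tau = 1.0 / np.sqrt(d2)  # "invert the matrix" in one dimension
print(sigma_tau, tau_hat / np.sqrt(len(t)))  # the two error estimates agree
```

For the exponential, $-\partial^2\log L/\partial\tau^2$ at $\hat\tau$ equals $n/\hat\tau^2$, so the numerical error reproduces $\hat\tau/\sqrt{n}$.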

Page 23: Parameter estimation, maximum likelihood and least squares techniques

Graphical method

Single parameter $\theta$: expand $\log L$ about its maximum,

$\log L(\theta) = \log L(\hat\theta) + \left.\frac{\partial\log L}{\partial\theta}\right|_{\hat\theta}(\theta-\hat\theta) + \frac{1}{2!}\left.\frac{\partial^2\log L}{\partial\theta^2}\right|_{\hat\theta}(\theta-\hat\theta)^2 + \dots$

The first derivative vanishes at $\hat\theta$, so

$\log L(\theta) \simeq \log L_{\max} - \frac{(\theta-\hat\theta)^2}{2\,\widehat{\sigma^2_{\hat\theta}}}$

$\log L(\hat\theta \pm \hat\sigma_{\hat\theta}) = \log L_{\max} - \frac{1}{2}$

i.e. move $\theta$ away from $\hat\theta$ until $\log L$ drops by 1/2 from $\log L_{\max}$. As shown later, $[\hat\theta - \hat\sigma_{\hat\theta},\, \hat\theta + \hat\sigma_{\hat\theta}]$ is a 68.3% central confidence interval.
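The graphical method can be sketched as a simple scan of log L (exponential example; the data and scan range are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
t = rng.exponential(1.0, size=50)  # simulated decay times (illustrative)
n = len(t)

def log_l(tau):
    return -n * np.log(tau) - t.sum() / tau

tau_hat = t.mean()
taus = np.linspace(0.5 * tau_hat, 2.0 * tau_hat, 20001)
dlog = log_l(taus) - log_l(tau_hat)

# interval where log L is within 1/2 of its maximum
inside = taus[dlog >= -0.5]
lo, hi = inside[0], inside[-1]
print(tau_hat - lo, hi - tau_hat)  # each roughly tau_hat/sqrt(n), slightly asymmetric
```

The slight asymmetry of the interval reflects the non-Gaussian shape of the exact likelihood at finite n.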

Page 24: Parameter estimation, maximum likelihood and least squares techniques

ML with two parameters

angular distribution for the scattering angles θ (x=cosθ) in a particle reaction.

32

2

1 2

xxxf ),;( normalized -1≤ x ≤+1

realistic measurements only in xmin ≤ x ≤ xmax

)()()(),;(

minmaxminmaxminmax3322

2

32

1

xxxxxx

xxxf

Page 25: Parameter estimation, maximum likelihood and least squares techniques

ML with two parameters

50.

(Figure: fitted distribution.) True values $\alpha = 0.5$, $\beta = 0.5$; with 2000 events the fit gives

$\hat\alpha = 0.508 \pm 0.052, \qquad \hat\beta = 0.466 \pm 0.108$

Page 26: Parameter estimation, maximum likelihood and least squares techniques

ML with two parameters

500 experiments, 2000 events/experiment. Both marginal p.d.f.'s are approximately Gaussian:

mean $\hat\alpha = 0.499$, $s_{\hat\alpha} = 0.051$
mean $\hat\beta = 0.498$, $s_{\hat\beta} = 0.111$
correlation coefficient $r = 0.42$

Page 27: Parameter estimation, maximum likelihood and least squares techniques

Least squares

Each measured value $y_i$ is a Gaussian random variable centered about the quantity's true value $\lambda(x_i;\vec\theta)$, for measurements $(x_i, y_i)$, $i = 1,\dots,n$:

$L(y_1,\dots,y_n;\lambda_1,\dots,\lambda_n,\sigma_1,\dots,\sigma_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left(-\frac{(y_i - \lambda_i)^2}{2\sigma_i^2}\right)$

To estimate the $\vec\theta$:

$\log L(\vec\theta) = -\frac{1}{2}\sum_{i=1}^{n} \frac{(y_i - \lambda(x_i;\vec\theta))^2}{\sigma_i^2} + \text{const}$

$L$ is maximized by the $\vec\theta$ that minimize

$\chi^2(\vec\theta) = \sum_{i=1}^{n} \frac{(y_i - \lambda(x_i;\vec\theta))^2}{\sigma_i^2}$

Page 28: Parameter estimation, maximum likelihood and least squares techniques

Least squares

This $\chi^2$ is used to define the procedure even if the $y_i$ are not Gaussian. If the measurements are not independent but described by an n-dimensional Gaussian p.d.f. with known covariance matrix V and unknown mean values:

$\log L(\vec\theta) = -\frac{1}{2}\sum_{i,j=1}^{n} (y_i - \lambda(x_i;\vec\theta))\,(V^{-1})_{ij}\,(y_j - \lambda(x_j;\vec\theta)) + \text{const}$

$\chi^2(\vec\theta) = \sum_{i,j=1}^{n} (y_i - \lambda(x_i;\vec\theta))\,(V^{-1})_{ij}\,(y_j - \lambda(x_j;\vec\theta))$

Minimizing it gives the LS estimators $\hat\theta_1, \dots, \hat\theta_m$.

Page 29: Parameter estimation, maximum likelihood and least squares techniques

Least squares

Linear case:

$\lambda(x;\vec\theta) = \sum_{j=1}^{m} a_j(x)\,\theta_j$, with the $a_j(x)$ linearly independent.

• the estimators and their variances can be found analytically
• the estimators have zero bias and minimum variance

$\lambda(x_i;\vec\theta) = \sum_{j=1}^{m} a_j(x_i)\,\theta_j = \sum_{j=1}^{m} A_{ij}\,\theta_j$

$\chi^2(\vec\theta) = (\vec y - A\vec\theta)^T V^{-1} (\vec y - A\vec\theta)$

At the minimum:

$\nabla_{\vec\theta}\,\chi^2 = -2\left(A^T V^{-1}\vec y - A^T V^{-1} A\,\vec\theta\right) = 0$

$\hat{\vec\theta} = (A^T V^{-1} A)^{-1} A^T V^{-1}\,\vec y \equiv B\,\vec y$
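A minimal sketch of the matrix solution for a straight-line model $\lambda(x;\vec\theta) = \theta_0 + \theta_1 x$ (the data points and errors are made up for illustration):

```python
import numpy as np

# illustrative straight-line data with known measurement errors
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])
sigma = np.full(5, 0.1)

A = np.column_stack([np.ones_like(x), x])  # A_ij = a_j(x_i): a_0 = 1, a_1 = x
Vinv = np.diag(1.0 / sigma**2)             # independent measurements

# theta_hat = (A^T V^-1 A)^-1 A^T V^-1 y
U = np.linalg.inv(A.T @ Vinv @ A)          # also the covariance of theta_hat
theta_hat = U @ A.T @ Vinv @ y

print(theta_hat)            # intercept and slope
print(np.sqrt(np.diag(U)))  # their standard errors
```

Note that the same matrix $(A^T V^{-1} A)^{-1}$ gives both the estimates and their covariance, as the next slide states.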

Page 30: Parameter estimation, maximum likelihood and least squares techniques

Least squares

Covariance matrix for the estimators: $U_{ij} = \mathrm{cov}[\hat\theta_i,\hat\theta_j]$,

$U = B V B^T = (A^T V^{-1} A)^{-1}$

or equivalently

$(U^{-1})_{ij} = \frac{1}{2}\left.\frac{\partial^2\chi^2}{\partial\theta_i\,\partial\theta_j}\right|_{\hat{\vec\theta}}$

This coincides with the RCF bound for the inverse covariance matrix if the $y_i$ are Gaussian distributed, since then $\chi^2 = -2\log L + \text{const}$.

Page 31: Parameter estimation, maximum likelihood and least squares techniques

Least squares

$\lambda(x;\vec\theta)$ linear in $\vec\theta$ means $\chi^2$ is quadratic in $\vec\theta$:

$\chi^2(\vec\theta) = \chi^2(\hat{\vec\theta}) + \frac{1}{2}\sum_{i,j=1}^{m} \left.\frac{\partial^2\chi^2}{\partial\theta_i\,\partial\theta_j}\right|_{\hat{\vec\theta}} (\theta_i - \hat\theta_i)(\theta_j - \hat\theta_j)$

To interpret this, take a single parameter $\theta$:

$\chi^2(\theta) = \chi^2(\hat\theta) + \frac{(\theta - \hat\theta)^2}{\widehat{\sigma^2_{\hat\theta}}}$

$\chi^2(\hat\theta \pm \hat\sigma_{\hat\theta}) = \chi^2_{\min} + 1$

Page 32: Parameter estimation, maximum likelihood and least squares techniques

Chi-squared distribution

$f(z;n) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\, z^{n/2-1}\, e^{-z/2}, \qquad n = 1, 2, \dots \text{ (degrees of freedom)}, \quad 0 \le z < \infty$

with $\Gamma(n) = (n-1)!$ and $\Gamma(x+1) = x\,\Gamma(x)$.

$E[z] = \int_0^\infty z\, f(z;n)\, dz = n, \qquad V[z] = \int_0^\infty (z-n)^2\, f(z;n)\, dz = 2n$

For n independent Gaussian random variables $x_i$ with known $\mu_i$ and $\sigma_i^2$,

$z = \sum_{i=1}^{n} \frac{(x_i - \mu_i)^2}{\sigma_i^2}$

is distributed as a $\chi^2$ for n degrees of freedom.
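This defining property, and the moments $E[z] = n$, $V[z] = 2n$, can be checked by simulation; a sketch with illustrative means and sigmas:

```python
import numpy as np

rng = np.random.default_rng(11)
n, n_exp = 5, 200_000

# n independent Gaussians with (illustrative) means and sigmas
mu = np.array([0.0, 1.0, -2.0, 0.5, 3.0])
sigma = np.array([1.0, 0.5, 2.0, 1.0, 0.3])

x = rng.normal(mu, sigma, size=(n_exp, n))
z = (((x - mu) / sigma) ** 2).sum(axis=1)  # chi-squared sum for n dof

print(z.mean())  # close to n = 5
print(z.var())   # close to 2n = 10
```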

Page 33: Parameter estimation, maximum likelihood and least squares techniques

Chi-squared distribution

Page 34: Parameter estimation, maximum likelihood and least squares techniques

Chi-squared distribution

Page 35: Parameter estimation, maximum likelihood and least squares techniques

Chi-squared distribution