Environmental Data Analysis with MatLab Lecture 6: The Principle of Least Squares

Mar 31, 2015

Colby Bellows
Transcript
Page 1: Environmental Data Analysis with MatLab Lecture 6: The Principle of Least Squares.

Environmental Data Analysis with MatLab

Lecture 6: The Principle of Least Squares

Page 2

SYLLABUS

Lecture 01 Using MatLab
Lecture 02 Looking At Data
Lecture 03 Probability and Measurement Error
Lecture 04 Multivariate Distributions
Lecture 05 Linear Models
Lecture 06 The Principle of Least Squares
Lecture 07 Prior Information
Lecture 08 Solving Generalized Least Squares Problems
Lecture 09 Fourier Series
Lecture 10 Complex Fourier Series
Lecture 11 Lessons Learned from the Fourier Transform
Lecture 12 Power Spectra
Lecture 13 Filter Theory
Lecture 14 Applications of Filters
Lecture 15 Factor Analysis
Lecture 16 Orthogonal functions
Lecture 17 Covariance and Autocorrelation
Lecture 18 Cross-correlation
Lecture 19 Smoothing, Correlation and Spectra
Lecture 20 Coherence; Tapering and Spectral Analysis
Lecture 21 Interpolation
Lecture 22 Hypothesis testing
Lecture 23 Hypothesis Testing continued; F-Tests
Lecture 24 Confidence Limits of Spectra, Bootstraps

Page 3

purpose of the lecture

estimate model parameters using the principle of least-squares

Page 4

part 1

the least squares estimation of model parameters and their covariance

Page 5

the prediction error

$e_i = d_i^{obs} - d_i^{pre}$

motivates us to define an error vector, e

$\mathbf{e} = \mathbf{d}^{obs} - \mathbf{d}^{pre} = \mathbf{d}^{obs} - \mathbf{G}\mathbf{m}$

Page 6

prediction error in straight line case

[Figure: plot of linedata01.txt, data d versus auxiliary variable x (x from -6 to 6, d from -15 to 15), marking for one point the observed datum $d_i^{obs}$, the predicted datum $d_i^{pre}$, and the error $e_i$ between them]

Page 7

total error: single number summarizing the error

$E = \sum_{i=1}^{N} e_i^2 = \mathbf{e}^T\mathbf{e}$

sum of squares of individual errors
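As a small illustration of the total error, the sketch below evaluates the individual errors and their sum of squares for one trial straight line. The data values and the trial parameters are hypothetical; they are not taken from linedata01.txt.

```matlab
% Hypothetical straight-line data (not the contents of linedata01.txt)
x    = [-4; -2; 0; 2; 4];            % auxiliary variable
dobs = [-7.1; -3.2; 1.1; 4.8; 9.3];  % observed data

% Trial model parameters: intercept m(1) and slope m(2)
m = [1; 2];

% Predicted data for the straight line d = m1 + m2*x
dpre = m(1) + m(2)*x;

% Individual prediction errors and their sum of squares
e = dobs - dpre;
E = e'*e;                            % same as sum(e.^2)

fprintf('total error E = %.4f\n', E);
```

Changing the trial values in `m` and re-running shows how E responds to the choice of model parameters, which is the quantity that least squares minimizes.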

Page 8

principle of least-squares

the best estimate of the model parameters is the $\mathbf{m}^{est}$ that minimizes the total error $E(\mathbf{m}) = \mathbf{e}^T\mathbf{e}$

Page 9

least-squares and probability

suppose that each observation has a Normal p.d.f. with mean $\bar{d}_i$ and variance $\sigma^2$

$p(d_i) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(d_i - \bar{d}_i)^2}{2\sigma^2} \right\}$

Page 10

for uncorrelated data the joint p.d.f. is just the product of the individual p.d.f.'s

$p(\mathbf{d}) = \prod_{i=1}^{N} p(d_i) \propto \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{N} (d_i - \bar{d}_i)^2 \right\}$

the sum of squares in the exponent has the form of the least-squares formula for E, which suggests a link between probability and least-squares

Page 11

now assume that Gm predicts the mean of d

(Gm substituted for $\bar{\mathbf{d}}$ in the joint p.d.f.)

minimizing E(m) is equivalent to maximizing p(d)
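The substitution can be written out as a short worked equation. This is a sketch under the assumptions already stated (uncorrelated Normal data with a common variance $\sigma^2$); the index names i and j are chosen here for illustration.

```latex
% Joint p.d.f. of the observations with the mean replaced by the prediction Gm
p(\mathbf{d}^{obs}) \propto
  \exp\left\{ -\frac{1}{2\sigma^2}
  \sum_{i=1}^{N} \Big( d_i^{obs} - \sum_{j=1}^{M} G_{ij} m_j \Big)^2 \right\}
  = \exp\left\{ -\frac{E(\mathbf{m})}{2\sigma^2} \right\}

% Taking the logarithm: the only dependence on m is through -E(m)
\ln p(\mathbf{d}^{obs}) = \text{constant} - \frac{E(\mathbf{m})}{2\sigma^2}
```

Because $\sigma^2$ is positive and does not depend on m, the m that makes $\ln p$ largest is the m that makes E smallest.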

Page 12

the principle of least-squares determines the m that makes the observations "most probable" in the sense of maximizing $p(\mathbf{d}^{obs})$

Page 13

the principle of least-squares determines the model parameters that make the observations "most probable"

(provided that the data are Normal)

this is the principle of maximum likelihood

Page 14

a formula for $\mathbf{m}^{est}$

at the point of minimum error, E,

$\partial E / \partial m_i = 0$

so solve this equation for $\mathbf{m}^{est}$

Page 15

Result

$\mathbf{m}^{est} = [\mathbf{G}^T\mathbf{G}]^{-1}\mathbf{G}^T\mathbf{d}$

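Page 16

where the result comes from

$E = \sum_{i=1}^{N} \Big( d_i - \sum_{j=1}^{M} G_{ij} m_j \Big)^2$

so

$0 = \frac{\partial E}{\partial m_k} = 2 \sum_{i=1}^{N} \Big( d_i - \sum_{j=1}^{M} G_{ij} m_j \Big) \Big( -\sum_{j=1}^{M} G_{ij} \frac{\partial m_j}{\partial m_k} \Big)$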

Page 17

use the chain rule

$\partial m_j / \partial m_k$ is unity when k=j and zero when k≠j, since the m's are independent

so just delete the sum over j and replace j with k
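The "delete the sum" step can be sketched in index notation as follows. The Kronecker delta $\delta_{jk}$ is notation introduced here for the "unity when k=j, zero when k≠j" rule; it does not appear in the original slides.

```latex
% The derivative of one model parameter with respect to another
\frac{\partial m_j}{\partial m_k} = \delta_{jk}
\quad\Longrightarrow\quad
\sum_{j=1}^{M} G_{ij}\,\frac{\partial m_j}{\partial m_k}
  = \sum_{j=1}^{M} G_{ij}\,\delta_{jk} = G_{ik}

% The condition for the minimum then becomes
0 = \frac{\partial E}{\partial m_k}
  = -2 \sum_{i=1}^{N} G_{ik} \Big( d_i - \sum_{j=1}^{M} G_{ij} m_j \Big)

% Rearranged: one linear equation for each k = 1, ..., M
\sum_{j=1}^{M} \Big[ \sum_{i=1}^{N} G_{ik} G_{ij} \Big] m_j
  = \sum_{i=1}^{N} G_{ik}\, d_i
```

The bracketed sum is the (k, j) element of $\mathbf{G}^T\mathbf{G}$ and the right-hand side is the k-th element of $\mathbf{G}^T\mathbf{d}$.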

Page 18

which gives

$\mathbf{G}^T\mathbf{G}\,\mathbf{m}^{est} = \mathbf{G}^T\mathbf{d}$, or $\mathbf{m}^{est} = [\mathbf{G}^T\mathbf{G}]^{-1}\mathbf{G}^T\mathbf{d}$

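Page 19

covariance of $\mathbf{m}^{est}$

$\mathbf{m}^{est}$ is a linear function of d of the form $\mathbf{m}^{est} = \mathbf{M}\mathbf{d}$

so $\mathbf{C}_m = \mathbf{M}\mathbf{C}_d\mathbf{M}^T$, with $\mathbf{M} = [\mathbf{G}^T\mathbf{G}]^{-1}\mathbf{G}^T$

assume the data are uncorrelated with uniform variance $\sigma_d^2$, so that $\mathbf{C}_d = \sigma_d^2\mathbf{I}$

then

$\mathbf{C}_m = \sigma_d^2 [\mathbf{G}^T\mathbf{G}]^{-1}$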
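The intermediate algebra can be sketched as follows, assuming $\mathbf{C}_d = \sigma_d^2\mathbf{I}$ with $\mathbf{I}$ the N×N identity matrix (a symbol introduced here for the uncorrelated, uniform-variance case).

```latex
\mathbf{C}_m
  = \mathbf{M}\,\mathbf{C}_d\,\mathbf{M}^T
  = [\mathbf{G}^T\mathbf{G}]^{-1}\mathbf{G}^T
    \,(\sigma_d^2 \mathbf{I})\,
    \mathbf{G}\,[\mathbf{G}^T\mathbf{G}]^{-1T}

% G^T G is symmetric, so its inverse is symmetric too
  = \sigma_d^2\,[\mathbf{G}^T\mathbf{G}]^{-1}
    \,[\mathbf{G}^T\mathbf{G}]\,
    [\mathbf{G}^T\mathbf{G}]^{-1}
  = \sigma_d^2\,[\mathbf{G}^T\mathbf{G}]^{-1}
```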

Page 20

two methods of estimating the variance of the data

posterior estimate: use prediction error

prior estimate: use knowledge of measurement technique

the ruler has 1mm tic marks, so σd≈½mm

Page 21

posterior estimates are overestimates when the model is poor

$\sigma_d^2 \approx \frac{E}{N-M} = \frac{1}{N-M} \sum_{i=1}^{N} e_i^2$

reduce N by M since an M-parameter model can exactly fit M data

Page 22

confidence intervals for the estimated model parameters

(assuming uncorrelated data of equal variance)

$\mathbf{C}_m = \sigma_d^2 [\mathbf{G}^T\mathbf{G}]^{-1}$

so

$\sigma_{m_i} = \sqrt{[\mathbf{C}_m]_{ii}}$

and

$m_i = m_i^{est} \pm 2\sigma_{m_i}$ (95% confidence)

Page 23

MatLab script for least squares solution

mest = (G'*G)\(G'*d);     % least-squares estimate of the model parameters
Cm = sd2 * inv(G'*G);     % covariance of the estimate; sd2 is the variance of the data
sm = sqrt(diag(Cm));      % standard deviation of each model parameter

Page 24

part 2

exemplary least squares problems

Page 25

Example 1: the mean of data

the model is $d_i = m_1$, so G is a column of N ones

the constant $m_1$ will turn out to be the mean

Page 26

$m_1^{est} = [\mathbf{G}^T\mathbf{G}]^{-1}\mathbf{G}^T\mathbf{d} = \frac{1}{N} \sum_{i=1}^{N} d_i$

usual formula for the mean

$\mathbf{C}_m = \sigma_d^2 [\mathbf{G}^T\mathbf{G}]^{-1} = \frac{\sigma_d^2}{N}$

variance decreases with number of data

Page 27

combining the formula for the mean and the formula for the covariance into confidence limits

$m_1^{est} = \bar{d} \pm \frac{2\sigma_d}{\sqrt{N}}$ (95% confidence)
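A minimal sketch of this example, assuming a handful of made-up data values and a posterior estimate of the variance: the least-squares solution with a column of ones as the data kernel is compared with the ordinary mean.

```matlab
% Hypothetical data values chosen for illustration
dobs = [2.1; 1.9; 2.4; 1.6; 2.0];
N = length(dobs);

% Data kernel for the model d_i = m1: a single column of ones
G = ones(N,1);

% Least-squares estimate; should coincide with the arithmetic mean
mest = (G'*G)\(G'*dobs);

% Posterior variance of the data (M = 1 model parameter)
e = dobs - G*mest;
sigmad2 = (e'*e)/(N-1);

% Standard deviation of the estimated mean, sqrt(sigmad2/N)
sigmam = sqrt(sigmad2 * inv(G'*G));

fprintf('least squares: %.3f, mean(): %.3f\n', mest, mean(dobs));
fprintf('95%% confidence: +/- %.3f\n', 2*sigmam);
```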

Page 28

Example 2: fitting a straight line

$d_i = m_1 + m_2 x_i$

with intercept $m_1$ and slope $m_2$

Page 29

Page 30

Page 31

$[\mathbf{G}^T\mathbf{G}]^{-1}$ = (uses the rule for the inverse of a 2×2 matrix)

Page 32

Page 33

intercept and slope are uncorrelated when the mean of x is zero
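The matrices for the straight-line case can be written out explicitly. What follows is a reconstruction from the straight-line model $d_i = m_1 + m_2 x_i$ and the standard rule for inverting a 2×2 matrix, not a copy of the original slides; p, q, r, s are generic matrix elements introduced here.

```latex
% Data kernel and the two products needed for the solution
\mathbf{G} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{bmatrix},
\qquad
\mathbf{G}^T\mathbf{G} = \begin{bmatrix} N & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{bmatrix},
\qquad
\mathbf{G}^T\mathbf{d} = \begin{bmatrix} \sum_i d_i \\ \sum_i x_i d_i \end{bmatrix}

% Rule for the inverse of a 2x2 matrix
\begin{bmatrix} p & q \\ r & s \end{bmatrix}^{-1}
  = \frac{1}{ps - qr} \begin{bmatrix} s & -q \\ -r & p \end{bmatrix}

% Applied to G^T G
[\mathbf{G}^T\mathbf{G}]^{-1}
  = \frac{1}{N \sum_i x_i^2 - \big( \sum_i x_i \big)^2}
    \begin{bmatrix} \sum_i x_i^2 & -\sum_i x_i \\ -\sum_i x_i & N \end{bmatrix}

% Covariance of intercept and slope
\mathbf{C}_m = \sigma_d^2\,[\mathbf{G}^T\mathbf{G}]^{-1}
```

The off-diagonal elements of $\mathbf{C}_m$ are proportional to $-\sum_i x_i$, so they vanish when the x's sum to zero, that is, when the mean of x is zero.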

Page 34

keep in mind that none of this algebraic manipulation is needed if we just compute

$\mathbf{m}^{est} = [\mathbf{G}^T\mathbf{G}]^{-1}\mathbf{G}^T\mathbf{d}$

using MatLab
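To illustrate the point, the sketch below fits a straight line to a few made-up data values twice: once numerically with the backslash operator and once with formulas obtained by hand from the 2×2 inverse. The variable names `Sx`, `Sxx`, `Sd`, `Sxd`, `D` and `mhand` are introduced here for illustration.

```matlab
% Hypothetical straight-line data (values are illustrative only)
x    = [-2; -1; 0; 1; 3];
dobs = [-3.0; -1.2; 1.1; 2.9; 7.2];
N = length(x);

% Data kernel: column of ones (intercept) and column of x (slope)
G = [ones(N,1), x];

% Numerical solution of the least-squares problem
mest = (G'*G)\(G'*dobs);

% Hand-derived formulas from the 2x2 inverse, for comparison
Sx = sum(x);  Sxx = sum(x.^2);  Sd = sum(dobs);  Sxd = sum(x.*dobs);
D = N*Sxx - Sx^2;                          % determinant of G'*G
mhand = [Sxx*Sd - Sx*Sxd; N*Sxd - Sx*Sd] / D;

disp([mest, mhand]);   % the two columns should agree to rounding error
```

The numerical route needs only the data kernel, so it carries over unchanged to models with more parameters, where hand algebra quickly becomes impractical.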

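Page 35

Generic MatLab script for least-squares problems

mest = (G'*G)\(G'*dobs);        % least-squares estimate
dpre = G*mest;                  % predicted data
e = dobs-dpre;                  % prediction error
E = e'*e;                       % total error
sigmad2 = E / (N-M);            % posterior variance of the data
covm = sigmad2 * inv(G'*G);     % covariance of the model parameters
sigmam = sqrt(diag(covm));      % standard deviations of the model parameters
mlow95 = mest - 2*sigmam;       % lower 95% confidence limit
mhigh95 = mest + 2*sigmam;      % upper 95% confidence limit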

Page 36

Example 3: modeling long-term trend and annual cycle in Black Rock Forest temperature data

[Figure: three panels plotted against time t (days, 0 to 5000), each with a vertical axis from -40 to 40 deg C: observed temperature d(t)obs, predicted temperature d(t)pre, and error e(t)]

Page 37

the model:

$d_i = m_1 + m_2 t_i + m_3 \cos(2\pi t_i / T_y) + m_4 \sin(2\pi t_i / T_y)$

the first two terms are the long-term trend; the last two are the annual cycle, with period $T_y$

Page 38

MatLab script to create the data kernel

Ty=365.25;
G=zeros(N,4);
G(:,1)=1;
G(:,2)=t;
G(:,3)=cos(2*pi*t/Ty);
G(:,4)=sin(2*pi*t/Ty);
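The data kernel can be exercised end to end on synthetic data. The sketch below is an assumption-laden stand-in, not the Black Rock Forest analysis: the "true" parameters `mtrue`, the noise level of 2 deg C and the series length are all invented for illustration.

```matlab
% Synthetic temperature-like series (an assumption; not the Black Rock Forest data)
N  = 2000;
t  = (1:N)';                    % time in days
Ty = 365.25;                    % period of the annual cycle in days
mtrue = [10; -0.001; 8; 3];     % hypothetical true model parameters
rng(1);                         % fix the random seed for repeatability

% Data kernel: constant, linear trend, cosine and sine of the annual cycle
G = zeros(N,4);
G(:,1) = 1;
G(:,2) = t;
G(:,3) = cos(2*pi*t/Ty);
G(:,4) = sin(2*pi*t/Ty);

% Synthetic observations: model prediction plus Normal noise with sigma = 2
dobs = G*mtrue + 2*randn(N,1);

% Least-squares fit, posterior variance and 95% confidence half-widths
mest = (G'*G)\(G'*dobs);
e = dobs - G*mest;
sigmad2 = (e'*e)/(N-4);
sigmam = sqrt(diag(sigmad2*inv(G'*G)));

disp([mtrue, mest, 2*sigmam]);  % columns: true, estimated, 95% half-width
```

Because the synthetic data really do follow the model, the posterior standard deviation `sqrt(sigmad2)` should come out close to the noise level that was put in.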

Page 39

prior variance of data, based on accuracy of thermometer: σd = 0.01 deg C

posterior variance of data, based on error of fit: σd = 5.60 deg C

huge difference, since the model does not include the diurnal cycle or weather patterns

Page 40

long-term slope

95% confidence limits based on prior variance: m2 = -0.03 ± 0.00002 deg C / yr

95% confidence limits based on posterior variance: m2 = -0.03 ± 0.00460 deg C / yr

in both cases, the cooling trend is significant, in the sense that the confidence intervals do not include zero or positive slopes.

Page 41

However

The fit to the data is poor, so the results should be used with caution. More effort needs to be put into developing a better model.

Page 42

part 3

covariance and the shape of the error surface

Page 43

[Figure: the error E(m) in the (m1, m2) plane, both axes from 0 to 4, with the minimum at mest = (m1est, m2est); the region of low error spans a large range of m1 and a small range of m2]

solutions within the region of low error are almost as good as mest

[Figure: E(m) versus mi, with its minimum at miest]

near the minimum the error is shaped like a parabola. The curvature of the parabola controls the width of the region of low error

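Page 44

near the minimum, the Taylor series for the error is:

$E(\mathbf{m}) \approx E(\mathbf{m}^{est}) + \frac{1}{2} \sum_{i=1}^{M} \sum_{j=1}^{M} (m_i - m_i^{est})(m_j - m_j^{est}) \left. \frac{\partial^2 E}{\partial m_i \partial m_j} \right|_{\mathbf{m}^{est}}$

the first-derivative term is absent because $\partial E / \partial m_i = 0$ at the minimum; the second derivative is the curvature of the error surface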

Page 45

starting with the formula for error

$E = (\mathbf{d} - \mathbf{G}\mathbf{m})^T (\mathbf{d} - \mathbf{G}\mathbf{m})$

we compute its 2nd derivative

$\frac{\partial^2 E}{\partial m_i \partial m_j} = 2 [\mathbf{G}^T\mathbf{G}]_{ij}$

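Page 46

but

$\mathbf{C}_m = \sigma_d^2 [\mathbf{G}^T\mathbf{G}]^{-1}$

so

$\mathbf{C}_m = \sigma_d^2 \left[ \frac{1}{2} \frac{\partial^2 E}{\partial \mathbf{m}\, \partial \mathbf{m}} \right]^{-1}$

the left-hand side is the covariance of the model parameters; the bracketed term on the right is the curvature of the error surface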
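The link between curvature and covariance can be checked numerically. The sketch below is one way to do it, under assumptions made here: a small made-up straight-line data set, an assumed data variance `sigmad2`, and central finite differences (with step `h`) to estimate the second derivatives of E at the minimum.

```matlab
% Hypothetical straight-line problem used to probe the error surface
x    = [-2; -1; 0; 1; 3];
dobs = [-3.0; -1.2; 1.1; 2.9; 7.2];
G = [ones(length(x),1), x];
sigmad2 = 0.25;                       % assumed variance of the data

% Total error as a function of the model parameters
Efun = @(m) (dobs - G*m)'*(dobs - G*m);

% Second derivatives of E at mest by central finite differences
mest = (G'*G)\(G'*dobs);
h = 1e-3;  M = 2;  D = zeros(M,M);  I = eye(M);
for i = 1:M
    for j = 1:M
        D(i,j) = ( Efun(mest + h*I(:,i) + h*I(:,j)) ...
                 - Efun(mest + h*I(:,i) - h*I(:,j)) ...
                 - Efun(mest - h*I(:,i) + h*I(:,j)) ...
                 + Efun(mest - h*I(:,i) - h*I(:,j)) ) / (4*h^2);
    end
end

% Compare with the analytic curvature 2*G'*G
disp(D - 2*(G'*G));                   % should be near zero

% Covariance computed from the curvature of the error surface
Cm = sigmad2 * inv(0.5*D);
disp(Cm - sigmad2*inv(G'*G));         % should also be near zero
```

Since E is exactly quadratic in m for a linear model, the finite-difference curvature matches `2*G'*G` up to rounding error rather than only approximately.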

Page 47

the covariance of the least squares solution is expressed in the shape of the error surface

[Figure: two sketches of E(m) versus mi near miest: a broad minimum labeled "large variance" and a narrow minimum labeled "small variance"]