Error Squares
Michael Edwards
Jun 03, 2018
Transcript
Page 1: Error Squares


Page 3: Error Squares


Linear Least Squares

It is the line of best fit for a group of points.

It seeks to minimize, over all data points, the sum of the squared differences between the function value and the data value.

It is the earliest form of linear regression.

Page 4: Error Squares


Gauss and Legendre

The method of least squares was first published by Legendre in 1805 and by Gauss in 1809.

Although Legendre's work was published earlier, Gauss claimed to have had the method since 1795. Both mathematicians applied the method to determine the orbits of bodies about the sun. Gauss went on to publish further development of the method in 1821.

Page 5: Error Squares


Example

Consider the points (1, 2.1), (2, 2.9), (5, 6.1), and (7, 8.3) with the best fit line f(x) = 0.9x + 1.4

The squared errors are:
x₁ = 1   f(1) = 2.3   y₁ = 2.1   e₁ = (2.3 – 2.1)² = 0.04
x₂ = 2   f(2) = 3.2   y₂ = 2.9   e₂ = (3.2 – 2.9)² = 0.09
x₃ = 5   f(5) = 5.9   y₃ = 6.1   e₃ = (5.9 – 6.1)² = 0.04
x₄ = 7   f(7) = 7.7   y₄ = 8.3   e₄ = (7.7 – 8.3)² = 0.36

So the total squared error is 0.04 + 0.09 + 0.04 + 0.36 = 0.53

By finding better coefficients of the best fit line, we can make this error smaller…
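The arithmetic above is easy to script. The following is a minimal sketch in Mathematica (the system named later in the presentation); the names pts, f, and errors are mine, not the slides':

    pts = {{1, 2.1}, {2, 2.9}, {5, 6.1}, {7, 8.3}};  (* the example data points *)
    f[x_] := 0.9 x + 1.4                             (* the candidate best fit line *)
    errors = (f[First[#]] - Last[#])^2 & /@ pts      (* {0.04, 0.09, 0.04, 0.36} *)
    Total[errors]                                    (* 0.53, the total squared error *)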

Page 6: Error Squares


We want to minimize the vertical distance between each point and the line.

• E = (d₁)² + (d₂)² + (d₃)² + … + (dₙ)² for n data points
• E = [f(x₁) – y₁]² + [f(x₂) – y₂]² + … + [f(xₙ) – yₙ]²
• E = [mx₁ + b – y₁]² + [mx₂ + b – y₂]² + … + [mxₙ + b – yₙ]²
• E = ∑(mxᵢ + b – yᵢ)²
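Viewed as a function of the unknowns m and b, E can be evaluated for any candidate line. A small sketch continuing the earlier example (the names e and pts are mine):

    pts = {{1, 2.1}, {2, 2.9}, {5, 6.1}, {7, 8.3}};
    e[m_, b_] := Total[(m #[[1]] + b - #[[2]])^2 & /@ pts]   (* E = Sum[(m xi + b - yi)^2] *)
    e[0.9, 1.4]   (* 0.53, the total squared error computed on the previous slide *)
    e[1.0, 1.0]   (* 0.12, so the coefficients of the example line can indeed be improved *)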

Page 7: Error Squares


E must be MINIMIZED!

How do we do this?

E = ∑(mxᵢ + b – yᵢ)²

Treat the xᵢ and yᵢ as constants, since we are trying to find m and b. So… PARTIALS!

∂E/∂m = 0 and ∂E/∂b = 0

But how do we know whether this will yield maxima, minima, or saddle points?
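(The next slides address that question.) Meanwhile, the two equations ∂E/∂m = 0 and ∂E/∂b = 0 can be formed and solved mechanically; a minimal sketch with the example data (e and pts are my placeholder names):

    pts = {{1, 2.1}, {2, 2.9}, {5, 6.1}, {7, 8.3}};
    e[m_, b_] := Total[(m #[[1]] + b - #[[2]])^2 & /@ pts]
    NSolve[{D[e[m, b], m] == 0, D[e[m, b], b] == 0}, {m, b}]
    (* {{m -> 1.04396, b -> 0.935165}}, approximately the slope and intercept that minimize E *)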

Page 8: Error Squares


[Figure: example surfaces illustrating a minimum point, a maximum point, and a saddle point]

Page 9: Error Squares


Minimum!

Since the expression E is a sum of squares and is therefore nonnegative (i.e., it looks like an upward-opening paraboloid), we know the solution must be a minimum. We can prove this by using the Second Partials Test.

Page 10: Error Squares


2nd Partials Test

Suppose the gradient of f at (x₀, y₀) is 0. (An instance of this is ∂E/∂m = ∂E/∂b = 0.)

We set

A = ∂²f/∂x²,  B = ∂²f/∂y∂x,  C = ∂²f/∂y²

and form the discriminant D = AC – B².

1) If D < 0, then (x₀, y₀) is a saddle point.
2) If D > 0, then f takes on
   a local minimum at (x₀, y₀) if A > 0,
   a local maximum at (x₀, y₀) if A < 0.

Page 11: Error Squares


Calculating the Discriminant

A = ∂²E/∂m²
  = ∂²/∂m² [ ∑(mxᵢ + b – yᵢ)² ]
  = ∂/∂m [ ∑ 2xᵢ(mxᵢ + b – yᵢ) ]
  = 2∑xᵢ²

B = ∂²E/∂b∂m
  = ∂²/∂b∂m [ ∑(mxᵢ + b – yᵢ)² ]
  = ∂/∂b [ ∑ 2xᵢ(mxᵢ + b – yᵢ) ]
  = 2∑xᵢ

C = ∂²E/∂b²
  = ∂²/∂b² [ ∑(mxᵢ + b – yᵢ)² ]
  = ∂/∂b [ ∑ 2(mxᵢ + b – yᵢ) ]
  = 2∑1 = 2n

D = AC – B² = 4(∑1)(∑xᵢ²) – 4(∑xᵢ)² = 4[ n∑xᵢ² – (∑xᵢ)² ]
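As a numerical check of these formulas, A, B, C, and D can be evaluated for the example data; a sketch (the lowercase names are mine, since C and D are reserved symbols in Mathematica):

    xs = {1, 2, 5, 7};          (* x-values of the example points *)
    aa = 2 Total[xs^2];         (* A = 2 Sum[xi^2] = 158 *)
    bb = 2 Total[xs];           (* B = 2 Sum[xi]   = 30  *)
    cc = 2 Length[xs];          (* C = 2 n         = 8   *)
    aa cc - bb^2                (* D = AC - B^2 = 364 > 0, and A > 0, so E has a minimum here *)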

Page 12: Error Squares


1) If D < 0, then (x₀, y₀) is a saddle point.
2) If D > 0, then f takes on
   a local minimum at (x₀, y₀) if A > 0,
   a local maximum at (x₀, y₀) if A < 0.

Now D > 0 by an inductive proof showing that

n·∑xᵢ² > (∑xᵢ)²   (where the sums run from i = 1 to n, and the xᵢ do not all have the same value)

Those details are not covered in this presentation. (The inequality is also an instance of the Cauchy–Schwarz inequality applied to (x₁, …, xₙ) and (1, …, 1).)

We know A > 0 since A = 2∑xᵢ² is always positive (when not all x's have the same value).

D = AC – B² = 4[ n∑xᵢ² – (∑xᵢ)² ]

Page 13: Error Squares


Therefore…

Setting ∂E/∂m and ∂E/∂b equal to zero will yield two minimizing equations for E, the sum of the squares of the error.

Thus, the linear least squares algorithm (as presented) is valid and we can continue.

Page 14: Error Squares


E = ∑(mxᵢ + b – yᵢ)² is minimized (as just shown) when the partial derivatives with respect to each of the variables are zero, i.e. ∂E/∂m = 0 and ∂E/∂b = 0.

∂E/∂b = ∑2(mxᵢ + b – yᵢ) = 0        set equal to 0
m∑xᵢ + ∑b = ∑yᵢ
mSx + bn = Sy

∂E/∂m = ∑2xᵢ(mxᵢ + b – yᵢ) = 2∑(mxᵢ² + bxᵢ – xᵢyᵢ) = 0
m∑xᵢ² + b∑xᵢ = ∑xᵢyᵢ
mSxx + bSx = Sxy

NOTE:  ∑xᵢ = Sx   ∑yᵢ = Sy   ∑xᵢ² = Sxx   ∑xᵢyᵢ = Sxy
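In code, the four sums in the NOTE are one-liners; a sketch with the example data (sx, sy, sxx, sxy are my names for Sx, Sy, Sxx, Sxy):

    pts = {{1, 2.1}, {2, 2.9}, {5, 6.1}, {7, 8.3}};
    {xs, ys} = Transpose[pts];
    sx = Total[xs]        (* Sx  = 15   *)
    sy = Total[ys]        (* Sy  = 19.4 *)
    sxx = Total[xs^2]     (* Sxx = 79   *)
    sxy = Total[xs ys]    (* Sxy = 96.5 *)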

Page 15: Error Squares


Next we will solve the system of equations for the unknowns m and b:

mSxx + bSx = Sxy
mSx + bn = Sy

Solving for m…

nmSxx + bnSx = nSxy             Multiply by n
mSxSx + bnSx = SySx             Multiply by Sx
nmSxx – mSxSx = nSxy – SySx     Subtract
m(nSxx – SxSx) = nSxy – SySx    Factor m

m = (nSxy – SySx) / (nSxx – SxSx)

Page 16: Error Squares


Next we will solve the system of equations for the unknowns m and b:

mSxx + bSx = Sxy
mSx + bn = Sy

Solving for b…

mSxSxx + bSxSx = SxSxy            Multiply by Sx
mSxSxx + bnSxx = SySxx            Multiply by Sxx
bSxSx – bnSxx = SxySx – SySxx     Subtract
b(SxSx – nSxx) = SxySx – SySxx    Solve for b

b = (SxxSy – SxySx) / (nSxx – SxSx)
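The elimination on these two slides can be double-checked by solving the 2x2 system symbolically; a sketch (Sx, Sy, Sxx, Sxy, and n are left as symbols):

    Solve[{m Sxx + b Sx == Sxy, m Sx + b n == Sy}, {m, b}]
    (* m -> (n Sxy - Sy Sx)/(n Sxx - Sx^2), b -> (Sxx Sy - Sxy Sx)/(n Sxx - Sx^2), up to rearrangement *)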

Page 17: Error Squares


Example: Find the linear least squares approximation to the data: (1,1), (2,4), (3,8)

Use these formulas:
m = (nSxy – SySx) / (nSxx – SxSx)
b = (SxxSy – SxySx) / (nSxx – SxSx)

Sx = 1 + 2 + 3 = 6
Sxx = 1² + 2² + 3² = 14
Sy = 1 + 4 + 8 = 13
Sxy = 1(1) + 2(4) + 3(8) = 33
n = number of points = 3

m = (3(33) – 6(13)) / (3(14) – 6(6)) = 21/6 = 3.5
b = (14(13) – 33(6)) / (3(14) – 6(6)) = –16/6 = –2.667

The line of best fit is y = 3.5x – 2.667
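The same numbers come straight out of a few lines of code; a sketch of this worked example:

    pts = {{1, 1}, {2, 4}, {3, 8}};
    {xs, ys} = Transpose[pts];
    n = Length[pts]; sx = Total[xs]; sy = Total[ys];
    sxx = Total[xs^2]; sxy = Total[xs ys];
    m = (n sxy - sy sx)/(n sxx - sx sx)      (* 21/6 = 7/2 = 3.5 *)
    b = (sxx sy - sxy sx)/(n sxx - sx sx)    (* -16/6 = -8/3, about -2.667 *)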

Page 18: Error Squares


Line of best fit: y = 3.5x – 2.667

[Figure: plot of the data points and the line of best fit, with x from –1 to 5 and y from –5 to 15]

Page 19: Error Squares


THE ALGORITHM

in Mathematica
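The Mathematica code shown on the next few slides is not reproduced in this transcript. The sketch below is my reconstruction of the algorithm as the presentation describes it, packaged as one function (the name leastSquaresLine is mine):

    (* returns {m, b} for the least squares line y = m x + b through the given points *)
    leastSquaresLine[pts_List] := Module[{xs, ys, n, sx, sy, sxx, sxy},
      {xs, ys} = Transpose[pts];
      n = Length[pts];
      sx = Total[xs];  sy = Total[ys];
      sxx = Total[xs^2];  sxy = Total[xs ys];
      {(n sxy - sy sx)/(n sxx - sx sx), (sxx sy - sxy sx)/(n sxx - sx sx)}]

    leastSquaresLine[{{1, 1}, {2, 4}, {3, 8}}]
    (* {7/2, -8/3}, i.e. y = 3.5 x - 2.667, matching the worked example *)

The built-in Fit[data, {1, x}, x] returns the same line and can be used to cross-check the hand-rolled version.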

Page 20: Error Squares


Page 21: Error Squares


Page 22: Error Squares


Activity

For this activity we are going to use the linear least squares approximation in a real-life situation. You are going to be given a box score from either a baseball or softball game. With the box score you are given, you are going to write out the points (with the x coordinate being the number of hits that player had in the game and the y coordinate being the number of at-bats that player had in the game).

After doing that, you are going to use the linear least squares approximation to find the best fitting line. The slope of the best fitting line you find will be the team's batting average for that game.

Page 23: Error Squares


In Conclusion…

E = ∑(mxᵢ + b – yᵢ)² is the sum of the squared error between the set of data points {(x₁,y₁), …, (xᵢ,yᵢ), …, (xₙ,yₙ)} and the line approximating the data, f(x) = mx + b. By minimizing the error by calculus methods, we get equations for m and b that yield the least squared error:

m = (nSxy – SySx) / (nSxx – SxSx)
b = (SxxSy – SxySx) / (nSxx – SxSx)

Page 24: Error Squares


Advantages

Many common methods of approximating data seek to minimize some measure of difference between the approximating function and the given data points.

Advantages of using the squares of the differences at each point, rather than just the difference, the absolute value of the difference, or other measures of error, include:
– Positive differences do not cancel negative differences (see the sketch after this list)
– Differentiation is not difficult
– Small differences become smaller and large differences become larger
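A tiny illustration of the first advantage, using made-up signed differences (the values below are hypothetical):

    diffs = {1.0, -1.0, 0.5, -0.5};  (* hypothetical signed differences f(xi) - yi *)
    Total[diffs]                     (* 0.  -- positive and negative differences cancel out *)
    Total[diffs^2]                   (* 2.5 -- the squared differences do not cancel *)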