Linear Least Squares Approximation
By Kristen Bauer, Renee Metzger, Holly Soper, Amanda Unklesbay
Linear Least Squares
• Is the line of best fit for a group of points.
• It seeks to minimize, over all data points, the sum of the squared differences between the function value and the data value.
• It is the earliest form of linear regression.
Gauss and Legendre
The method of least squares was first published by Legendre in 1805 and by Gauss in 1809. Although Legendre's work was published earlier, Gauss claims he had the method since 1795. Both mathematicians applied the method to determine the orbits of bodies about the sun. Gauss went on to publish further development of the method in 1821.
Example
Consider the points (1, 2.1), (2, 2.9), (5, 6.1), and (7, 8.3) with the best fit line f(x) = 0.9x + 1.4.
The squared errors are:
x₁ = 1   f(1) = 2.3   y₁ = 2.1   e₁ = (2.3 − 2.1)² = .04
x₂ = 2   f(2) = 3.2   y₂ = 2.9   e₂ = (3.2 − 2.9)² = .09
x₃ = 5   f(5) = 5.9   y₃ = 6.1   e₃ = (5.9 − 6.1)² = .04
x₄ = 7   f(7) = 7.7   y₄ = 8.3   e₄ = (7.7 − 8.3)² = .36
So the total squared error is .04 + .09 + .04 + .36 = .53.
By finding better coefficients for the best fit line, we can make this error smaller…
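As a quick check of the arithmetic above (a sketch, not part of the original slides), a few lines of Python reproduce the squared errors:

```python
# Reproduce the squared-error computation for the example data points.
points = [(1, 2.1), (2, 2.9), (5, 6.1), (7, 8.3)]

def f(x):
    return 0.9 * x + 1.4  # the candidate best fit line from the slide

errors = [(f(x) - y) ** 2 for x, y in points]
print([round(e, 2) for e in errors])  # [0.04, 0.09, 0.04, 0.36]
print(round(sum(errors), 2))          # 0.53
```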
We want to minimize the vertical distances between the points and the line.
• E = (d₁)² + (d₂)² + (d₃)² + … + (dₙ)² for n data points
• E = [f(x₁) − y₁]² + [f(x₂) − y₂]² + … + [f(xₙ) − yₙ]²
• E = [mx₁ + b − y₁]² + [mx₂ + b − y₂]² + … + [mxₙ + b − yₙ]²
• E = ∑(mxᵢ + b − yᵢ)²
E must be MINIMIZED!
How do we do this?
E = ∑(mxᵢ + b − yᵢ)²
Treat xᵢ and yᵢ as constants, since we are trying to find m and b. So… PARTIALS!
∂E/∂m = 0 and ∂E/∂b = 0
But how do we know whether this will yield maxima, minima, or saddle points?
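As an illustration (assuming the sympy library is available; this is not from the original slides), the partials can be taken symbolically and set to zero for the earlier four-point example:

```python
# Symbolically differentiate E(m, b) and solve dE/dm = dE/db = 0 with sympy.
import sympy as sp

m, b = sp.symbols('m b')
points = [(1, 2.1), (2, 2.9), (5, 6.1), (7, 8.3)]
E = sum((m * x + b - y) ** 2 for x, y in points)

solution = sp.solve([sp.diff(E, m), sp.diff(E, b)], [m, b])
print(solution)  # the (m, b) pair that makes both partials vanish
```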
[Figure: three surfaces illustrating a minimum point, a maximum point, and a saddle point.]
Minimum!
Since the expression E is a sum of squares and is therefore non-negative (its graph is an upward-opening paraboloid in m and b), we expect the solution to be a minimum. We can prove this using the 2nd Partials Test.
2nd Partials Test
Suppose the gradient of f at (x₀, y₀) is 0. (An instance of this is ∂E/∂m = ∂E/∂b = 0.)
We set
A = ∂²f/∂x²,  B = ∂²f/∂y∂x,  C = ∂²f/∂y²
and form the discriminant D = AC − B².
1) If D < 0, then (x₀, y₀) is a saddle point.
2) If D > 0, then f takes on
   a local minimum at (x₀, y₀) if A > 0
   a local maximum at (x₀, y₀) if A < 0
Calculating the Discriminant

A = ∂²E/∂m² = ∂²/∂m² ∑(mxᵢ + b − yᵢ)²
  = ∂/∂m ∑2xᵢ(mxᵢ + b − yᵢ)
  = ∑2xᵢ²
  = 2∑xᵢ²

B = ∂²E/∂b∂m = ∂²/∂b∂m ∑(mxᵢ + b − yᵢ)²
  = ∂/∂b ∑2xᵢ(mxᵢ + b − yᵢ)
  = ∑2xᵢ
  = 2∑xᵢ

C = ∂²E/∂b² = ∂²/∂b² ∑(mxᵢ + b − yᵢ)²
  = ∂/∂b ∑2(mxᵢ + b − yᵢ)
  = 2∑1
  = 2n

D = AC − B² = (2∑xᵢ²)(2n) − (2∑xᵢ)² = 4n∑xᵢ² − 4(∑xᵢ)²
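A small numeric check (not in the original slides) evaluates A, B, C, and D for the earlier four-point example:

```python
# Evaluate the second partials A = 2*sum(x^2), B = 2*sum(x), C = 2n, and D = AC - B^2.
xs = [1, 2, 5, 7]  # x-values from the earlier example
n = len(xs)

A = 2 * sum(x * x for x in xs)
B = 2 * sum(xs)
C = 2 * n
D = A * C - B ** 2  # equals 4n*sum(x^2) - 4*(sum(x))^2
print(A, B, C, D)   # A > 0 and D > 0, so the critical point is a minimum
```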
Recall: 1) if D < 0, then (x₀, y₀) is a saddle point; 2) if D > 0, then f takes on a local minimum at (x₀, y₀) if A > 0 and a local maximum at (x₀, y₀) if A < 0.
Here D = AC − B² = 4n∑xᵢ² − 4(∑xᵢ)².
Now D > 0 by an inductive proof showing that
n∑xᵢ² ≥ (∑xᵢ)²   (sums running from i = 1 to n),
with strict inequality when the xᵢ do not all have the same value (an instance of the Cauchy–Schwarz inequality). Those details are not covered in this presentation.
We know A > 0 since A = 2∑xᵢ² is positive whenever the xᵢ are not all zero.
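A quick random spot-check of the inequality (not a proof, and not from the original slides):

```python
# Spot-check n*sum(x_i^2) >= (sum(x_i))^2 on random data sets.
import random

for _ in range(1000):
    xs = [random.uniform(-10, 10) for _ in range(random.randint(1, 20))]
    n = len(xs)
    assert n * sum(x * x for x in xs) >= sum(xs) ** 2 - 1e-9  # small float tolerance
print("inequality held on all random samples")
```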
Therefore…
Setting ∂E/∂m and ∂E/∂b equal to zero yields two equations whose solution minimizes E, the sum of the squares of the errors.
Thus, the linear least squares algorithm (as presented) is valid and we can continue.
E = ∑(mxᵢ + b − yᵢ)² is minimized (as just shown) when the partial derivatives with respect to each of the variables are zero, i.e. ∂E/∂m = 0 and ∂E/∂b = 0.
∂E/∂b = ∑2(mxᵢ + b − yᵢ) = 0   (set equal to 0)
m∑xᵢ + ∑b = ∑yᵢ
mSx + bn = Sy
∂E/∂m = ∑2xᵢ(mxᵢ + b − yᵢ) = 2∑(mxᵢ² + bxᵢ − xᵢyᵢ) = 0
m∑xᵢ² + b∑xᵢ = ∑xᵢyᵢ
mSxx + bSx = Sxy
NOTE: ∑xᵢ = Sx,  ∑yᵢ = Sy,  ∑xᵢ² = Sxx,  ∑xᵢyᵢ = Sxy
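For instance (a sketch, not from the original slides), the S-sums for the first example's data are easy to accumulate:

```python
# Accumulate the S-sums for the data (1, 2.1), (2, 2.9), (5, 6.1), (7, 8.3).
points = [(1, 2.1), (2, 2.9), (5, 6.1), (7, 8.3)]
n = len(points)
Sx = sum(x for x, _ in points)       # 15
Sy = sum(y for _, y in points)       # ~19.4
Sxx = sum(x * x for x, _ in points)  # 79
Sxy = sum(x * y for x, y in points)  # ~96.5
print(n, Sx, Sy, Sxx, Sxy)
```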
Next we will solve the system of equations for the unknowns m and b:
mSxx + bSx = Sxy
mSx + bn = Sy
nmSxx + bnSx = nSxy   (multiply by n)
mSxSx + bnSx = SySx   (multiply by Sx)
nmSxx − mSxSx = nSxy − SySx   (subtract)
m(nSxx − SxSx) = nSxy − SySx   (factor m)
Solving for m…
m = (nSxy − SySx) / (nSxx − SxSx)
Next we will solve the system of equations for the unknowns m and b:
mSxx + bSx = Sxy
mSx + bn = Sy
mSxSxx + bSxSx = SxSxy   (multiply by Sx)
mSxSxx + bnSxx = SySxx   (multiply by Sxx)
bSxSx − bnSxx = SxySx − SySxx   (subtract)
b(SxSx − nSxx) = SxySx − SySxx   (solve for b)
Solving for b…
b = (SxxSy − SxySx) / (nSxx − SxSx)
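Putting both closed-form formulas together (a minimal sketch; the helper name linear_least_squares is ours, not from the slides):

```python
# Least squares slope and intercept from the closed-form solutions above.
def linear_least_squares(points):
    n = len(points)
    Sx = sum(x for x, _ in points)
    Sy = sum(y for _, y in points)
    Sxx = sum(x * x for x, _ in points)
    Sxy = sum(x * y for x, y in points)
    denom = n * Sxx - Sx * Sx  # zero only when all x-values coincide
    m = (n * Sxy - Sy * Sx) / denom
    b = (Sxx * Sy - Sxy * Sx) / denom
    return m, b

print(linear_least_squares([(1, 2.1), (2, 2.9), (5, 6.1), (7, 8.3)]))
```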
Example: Find the linear least squares approximation to the data: (1, 1), (2, 4), (3, 8).
Use these formulas:
m = (nSxy − SySx) / (nSxx − SxSx)
b = (SxxSy − SxySx) / (nSxx − SxSx)
Sx = 1 + 2 + 3 = 6
Sxx = 1² + 2² + 3² = 14
Sy = 1 + 4 + 8 = 13
Sxy = 1(1) + 2(4) + 3(8) = 33
n = number of points = 3
m = (3(33) − (13)(6)) / (3(14) − (6)(6)) = 21/6 = 3.5
b = ((14)(13) − (33)(6)) / (3(14) − (6)(6)) = −16/6 ≈ −2.667
The line of best fit is y = 3.5x − 2.667
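The arithmetic checks out numerically (again a sketch, not part of the slides):

```python
# Plug the S-sums for (1,1), (2,4), (3,8) into the formulas for m and b.
n, Sx, Sy, Sxx, Sxy = 3, 6, 13, 14, 33
m = (n * Sxy - Sy * Sx) / (n * Sxx - Sx * Sx)    # 21/6
b = (Sxx * Sy - Sxy * Sx) / (n * Sxx - Sx * Sx)  # -16/6
print(m, b)  # 3.5 -2.666...
```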
Line of best fit: y = 3.5x – 2.667
[Figure: plot of the three data points and the line of best fit y = 3.5x − 2.667.]
THE ALGORITHM
in Mathematica
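The Mathematica listing itself is not reproduced here; as a stand-in sketch, numpy's built-in degree-1 polynomial fit computes the same least squares line:

```python
# Cross-check the example fit with numpy's least squares polynomial fit.
import numpy as np

xs = np.array([1.0, 2.0, 3.0])
ys = np.array([1.0, 4.0, 8.0])

m, b = np.polyfit(xs, ys, 1)  # degree 1 -> least squares line
print(m, b)  # approximately 3.5 and -2.667
```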
Activity
For this activity we are going to use the linear least squares approximation in a real-life situation.
You will be given a box score from either a baseball or softball game.
From the box score, write out the points, with the x coordinate being the number of at-bats a player had in the game and the y coordinate being the number of hits that player had.
Then use the linear least squares approximation to find the best fitting line.
The slope of the best fitting line you find approximates the team's batting average for that game (batting average = hits ÷ at-bats).
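A sketch of the activity's computation, with an invented box score purely for illustration (the numbers are hypothetical); it prints both the fitted slope and the exact team average for comparison:

```python
# (at-bats, hits) per player -- hypothetical numbers for illustration only.
players = [(4, 1), (4, 2), (3, 1), (5, 2), (4, 0), (3, 1), (4, 2), (3, 0)]
n = len(players)
Sx = sum(ab for ab, _ in players)
Sy = sum(h for _, h in players)
Sxx = sum(ab * ab for ab, _ in players)
Sxy = sum(ab * h for ab, h in players)
m = (n * Sxy - Sy * Sx) / (n * Sxx - Sx * Sx)
print(round(m, 3))        # slope of the least squares line
print(round(Sy / Sx, 3))  # exact team average: total hits / total at-bats
```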
In Conclusion…
E = ∑(mxᵢ + b − yᵢ)² is the sum of the squared error between the set of data points {(x₁,y₁), …, (xᵢ,yᵢ), …, (xₙ,yₙ)} and the line approximating the data, f(x) = mx + b. By minimizing the error by calculus methods, we get equations for m and b that yield the least squared error:
m = (nSxy − SySx) / (nSxx − SxSx)
b = (SxxSy − SxySx) / (nSxx − SxSx)
Advantages
Many common methods of approximating data seek to minimize some measure of difference between the approximating function and the given data points.
Advantages of using the squares of the differences at each point, rather than the raw difference, the absolute value of the difference, or other measures of error, include:
– Positive differences do not cancel negative differences.
– Differentiation is not difficult.
– Small differences become smaller and large differences become larger (e.g. 0.1² = 0.01 while 3² = 9), so large errors are penalized more heavily.
Disadvantages
The algorithm will fail if the data points fall in a vertical line (all xᵢ equal, so the denominator nSxx − SxSx is zero).
Linear least squares will not be the best fit for data that is not linear.
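The vertical-line failure is visible directly in the denominator of the formulas (a sketch, not from the slides):

```python
# Points on a vertical line make the shared denominator n*Sxx - Sx^2 vanish.
xs = [2.0, 2.0, 2.0]  # all x-values equal
n = len(xs)
denominator = n * sum(x * x for x in xs) - sum(xs) ** 2
print(denominator)  # 0.0 -> the formulas for m and b would divide by zero
```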
The End