Correlation and Regression Analysis
ITCS 4133/5133: Numerical Comp. Methods

Jul 18, 2020


Correlation and Regression Analysis

• Definitions

• Regression

• Linear Least Squares

• Continuous Least Squares

Regression

• Frequently, in engineering problems, estimates of variables with minimal error are needed.

• Simply using a function of aggregate measures (mean, standard deviation) is insufficient.

• Building a model to relate a random variable to measured values of other variables is termed optimization or model calibration.

• This uses concepts of calculus and numerical analysis to determine the unknowns of a function by minimizing or maximizing some objective function.

Statistical Optimization

Consists of the following elements:

• Objective Function: the function to be optimized

• Criterion Variable: the dependent variable

• Predictor Variables: the independent variables

• Regression Coefficients: unknowns to be determined

• Mathematical Model: the function that is used to estimate values of the criterion variable

• Dataset: measured values of criterion and predictor variables

Optimization requires (1) an objective function, (2) the mathematical model, and (3) a dataset.

Statistical Optimization (contd)

• Bivariate Model:

E = aT + b

where E, the evaporation, is related to T, the temperature, via a linear model parameterized by two constants a and b.

• Multivariate Model:

Y = a1 X1 + a2 X2 + · · · + ap Xp + b

where p is the number of predictor variables, Xi is the ith predictor variable, b is the intercept coefficient, and the ai are partial regression coefficients.

• Models may be linear, quadratic, cubic, etc.

Principle of Least Squares: Motivation

• Oil Reservoir Model: Data relates the equilibrium constant of a reaction to pressure.

Principle of Least Squares: Motivation (contd)

Least Squares Approximation

• The ai coefficients can be determined using the Principle of Least Squares.

• Least Squares is an example of a regression method.

• The principle of least squares is used to regress Y on the Xi so as to bring the expected value of the random variable towards the mean of the set.

Least Squares Approximation: Procedure

• Error (or residual) is defined as

ei = Ŷi − Yi

where ei, Ŷi, Yi are the ith error, predicted value, and measured value respectively.

• Objective Function:

F = min Σ_{i=1}^{n} (Ŷi − Yi)^2

• Minimize the objective function by taking the derivative w.r.t. each unknown and setting it to zero.

• Solve the resulting set of equations.

Linear LSQ Example: Bivariate Model

F = min Σ_{i=1}^{n} (Ŷi − Yi)^2
  = min Σ_{i=1}^{n} (a Xi + b − Yi)^2

Taking derivatives,

∂/∂a [ Σ_{i=1}^{n} (Ŷi − Yi)^2 ] = 2 Σ_{i=1}^{n} (a Xi + b − Yi)(Xi) = 0

∂/∂b [ Σ_{i=1}^{n} (Ŷi − Yi)^2 ] = 2 Σ_{i=1}^{n} (a Xi + b − Yi) = 0

Linear LSQ Example: Bivariate Model (contd)

• Upon simplification these become

a Σ_{i=1}^{n} Xi^2 + b Σ_{i=1}^{n} Xi = Σ_{i=1}^{n} Xi Yi

a Σ_{i=1}^{n} Xi + b n = Σ_{i=1}^{n} Yi

• Let Sx = Σ_{i=1}^{n} Xi, Sxx = Σ_{i=1}^{n} Xi^2, etc. The equations become

a Sxx + b Sx = Sxy

a Sx + b n = Sy

Linear LSQ Example: Bivariate Model (contd)

a Sxx + b Sx = Sxy

a Sx + b n = Sy

We can use Cramer's rule for small systems:

a = det(A1)/det(A), b = det(A2)/det(A)

which gives

a = (n Sxy − Sx Sy) / (n Sxx − Sx Sx)

b = (Sxx Sy − Sxy Sx) / (n Sxx − Sx Sx)

Linear LSQ Approximation: Algorithm
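The algorithm figure on this slide is not captured in the transcript. As a sketch only, the bivariate procedure derived above (accumulate the S-sums, then apply the closed-form solution) can be written in stdlib Python; the function name linear_lsq is my own:

```python
def linear_lsq(xs, ys):
    """Fit y ≈ a*x + b by least squares.

    Solves the normal equations
        a*Sxx + b*Sx = Sxy
        a*Sx  + b*n  = Sy
    using the closed-form (Cramer's rule) expressions.
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # det(A); zero only if all x-values coincide
    a = (n * sxy - sx * sy) / det
    b = (sxx * sy - sxy * sx) / det
    return a, b
```

For data lying exactly on a line the fit recovers it; e.g. the points (0, 1), (1, 3), (2, 5) give a = 2, b = 1.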

Linear LSQ Approximation: Example: Noisy Data

Quadratic LSQ Approximation

• Approximate the function f(x) with a quadratic,

f(x) = a x^2 + b x + c

• The error function is

E = [a x1^2 + b x1 + c − y1]^2 + · · · + [a xn^2 + b xn + c − yn]^2

• Partially differentiate w.r.t. a, b and c and equate to zero.

• The normal equations are given by

a Σ_{i=1}^{n} xi^4 + b Σ_{i=1}^{n} xi^3 + c Σ_{i=1}^{n} xi^2 = Σ_{i=1}^{n} xi^2 yi

a Σ_{i=1}^{n} xi^3 + b Σ_{i=1}^{n} xi^2 + c Σ_{i=1}^{n} xi = Σ_{i=1}^{n} xi yi

a Σ_{i=1}^{n} xi^2 + b Σ_{i=1}^{n} xi + c n = Σ_{i=1}^{n} yi

Quadratic LSQ Approximation: Algorithm
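Again the algorithm figure is not in the transcript; as a minimal stdlib-Python sketch (my own structure, not necessarily the slide's), the three normal equations above can be assembled from power sums and solved by Gaussian elimination:

```python
def quadratic_lsq(xs, ys):
    """Fit y ≈ a*x^2 + b*x + c by solving the three normal equations."""
    # Power sums S[k] = Σ x_i^k (so S[0] = n) and moments T[k] = Σ x_i^k y_i
    S = [sum(x ** k for x in xs) for k in range(5)]
    T = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix of the normal equations (rows: ∂E/∂a, ∂E/∂b, ∂E/∂c)
    A = [[S[4], S[3], S[2], T[2]],
         [S[3], S[2], S[1], T[1]],
         [S[2], S[1], S[0], T[0]]]
    # Forward elimination with partial pivoting
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for k in range(col, 4):
                A[r][k] -= f * A[col][k]
    # Back substitution
    coeffs = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        coeffs[r] = (A[r][3] - sum(A[r][k] * coeffs[k]
                                   for k in range(r + 1, 3))) / A[r][r]
    return tuple(coeffs)  # (a, b, c)
```

Data lying exactly on a quadratic, e.g. points of y = x^2 − x + 2, recover the coefficients (1, −1, 2) up to rounding.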

Quadratic LSQ Approximation: Example

General LSQ Approximation

• Similarly, LSQ approximation can be extended to fit cubics; this results in 4 equations in 4 unknowns.

• More generally, data can be approximated by a function that is a linear combination of a fixed set of functions, also known as basis functions.

◦ Linear case: g1(x) = 1, g2(x) = x.
◦ Quadratic case: g1(x) = 1, g2(x) = x, g3(x) = x^2.

• For the general case (here with four basis functions),

f(x) = a1 g1(x) + a2 g2(x) + a3 g3(x) + a4 g4(x)

where gi(x), i ∈ {1, 2, 3, 4}, are the basis functions, and the error E is

E = Σ_{i=1}^{n} [f(xi) − yi]^2

General LSQ Approximation (contd)

• For n data points,

E = [a1 g1(x1) + a2 g2(x1) + a3 g3(x1) + a4 g4(x1) − y1]^2 + · · · + [a1 g1(xn) + a2 g2(xn) + a3 g3(xn) + a4 g4(xn) − yn]^2

• Setting ∂E/∂a1 = 0,

a1 Σ_{i=1}^{n} g1(xi) g1(xi) + a2 Σ_{i=1}^{n} g1(xi) g2(xi) + a3 Σ_{i=1}^{n} g1(xi) g3(xi) + a4 Σ_{i=1}^{n} g1(xi) g4(xi) = Σ_{i=1}^{n} g1(xi) yi

and similarly for ∂E/∂a2 = 0, etc.
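The same normal-equations pattern works for any basis: the sums Σ g_k(xi) g_j(xi) form the coefficient matrix and Σ g_k(xi) yi the right-hand side. As an illustrative sketch (general_lsq and its argument names are my own), in stdlib Python:

```python
def general_lsq(xs, ys, basis):
    """Least-squares fit f(x) = Σ_j a_j g_j(x) for an arbitrary list of basis
    functions g_j, by building and solving the normal equations
        Σ_j a_j Σ_i g_k(x_i) g_j(x_i) = Σ_i g_k(x_i) y_i   for each k.
    """
    m = len(basis)
    # Augmented matrix [G | rhs]: one row per ∂E/∂a_k = 0 equation
    A = [[sum(gk(x) * gj(x) for x in xs) for gj in basis]
         + [sum(gk(x) * y for x, y in zip(xs, ys))]
         for gk in basis]
    # Gaussian elimination with partial pivoting
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for k in range(col, m + 1):
                A[r][k] -= f * A[col][k]
    # Back substitution
    a = [0.0] * m
    for r in range(m - 1, -1, -1):
        a[r] = (A[r][m] - sum(A[r][k] * a[k] for k in range(r + 1, m))) / A[r][r]
    return a
```

With the linear basis g1(x) = 1, g2(x) = x this reproduces the bivariate fit.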

Continuous Least-Squares Approximation

• Extension to continuously defined functions.

• Define the problem as fitting a function to a function s(x) defined over an interval, say [0, 1].

• Summations are replaced by integrals:

E = ∫_0^1 [a x^2 + b x + c − s(x)]^2 dx

resulting in

a/5 + b/4 + c/3 = ∫_0^1 x^2 s(x) dx

a/4 + b/3 + c/2 = ∫_0^1 x s(x) dx

a/3 + b/2 + c = ∫_0^1 s(x) dx
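As a concrete check of these equations, here is a small Python sketch (my own worked example, not from the slides) that solves the 3x3 system exactly with rational arithmetic for s(x) = x^3 on [0, 1]:

```python
from fractions import Fraction as F

def continuous_lsq_quadratic(m0, m1, m2):
    """Solve the continuous least-squares normal equations on [0, 1]:

        a/5 + b/4 + c/3 = m2 = ∫ x^2 s(x) dx
        a/4 + b/3 + c/2 = m1 = ∫ x   s(x) dx
        a/3 + b/2 + c   = m0 = ∫     s(x) dx

    for the quadratic a*x^2 + b*x + c, by Gaussian elimination over rationals.
    """
    A = [[F(1, 5), F(1, 4), F(1, 3), m2],
         [F(1, 4), F(1, 3), F(1, 2), m1],
         [F(1, 3), F(1, 2), F(1, 1), m0]]
    # Forward elimination (pivots stay nonzero for this fixed matrix)
    for col in range(3):
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for k in range(col, 4):
                A[r][k] -= f * A[col][k]
    # Back substitution
    x = [F(0)] * 3
    for r in (2, 1, 0):
        x[r] = (A[r][3] - sum(A[r][k] * x[k] for k in range(r + 1, 3))) / A[r][r]
    return tuple(x)  # (a, b, c)

# Example: s(x) = x^3, whose moments on [0, 1] are
# ∫ x^2 · x^3 dx = 1/6,  ∫ x · x^3 dx = 1/5,  ∫ x^3 dx = 1/4.
a, b, c = continuous_lsq_quadratic(F(1, 4), F(1, 5), F(1, 6))
```

The best quadratic approximation of x^3 on [0, 1] comes out to (3/2) x^2 − (3/5) x + 1/20.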
