Gu Yuxian Wang Weinan Beijing National Day School Research Project For Linear Regression
Transcript
Page 1: Gu Yuxian Wang Weinan Beijing National Day School.

Gu Yuxian Wang Weinan
Beijing National Day School

Research Project For Linear Regression

Page 2: Gu Yuxian Wang Weinan Beijing National Day School.

Part 1 The Simple Linear Regression

• Given two variables X and Y.
• $x_1, x_2, \dots, x_n$ are measured without error.
• $y_1, y_2, \dots, y_n$ are measured with error.
• So we can let $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.
• We can use the least squares estimators and the maximum likelihood estimator to estimate the parameters $\beta_0$ and $\beta_1$.

Page 3: Gu Yuxian Wang Weinan Beijing National Day School.

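The Least Squares Estimators

• Let $\Delta = \sum_{i=1}^{n}(y_i-\hat y_i)^2 = \sum_{i=1}^{n}\left[y_i-(\beta_0+\beta_1 x_i)\right]^2$.
• All we need to do is to minimize $\Delta$.
• Let $\partial\Delta/\partial\beta_0 = 0$ and $\partial\Delta/\partial\beta_1 = 0$.
• Solve the equations:

$$\hat\beta_0 = \bar y - \frac{S_{XY}}{S_{XX}}\,\bar x, \qquad \hat\beta_1 = \frac{S_{XY}}{S_{XX}}$$

where

$$S_{XX} = \sum_{i=1}^{n}(x_i-\bar x)^2, \qquad S_{YY} = \sum_{i=1}^{n}(y_i-\bar y)^2, \qquad S_{XY} = \sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)$$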
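As a small illustration of the closed-form estimators on this slide, the sketch below computes $S_{XX}$, $S_{XY}$ and the two estimates with NumPy. The function name `least_squares_fit` and the sample data are made up for the example; they are not part of the original slides.

```python
import numpy as np

def least_squares_fit(x, y):
    """Closed-form simple linear regression via S_XY / S_XX."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    s_xx = np.sum((x - x_bar) ** 2)            # S_XX
    s_xy = np.sum((x - x_bar) * (y - y_bar))   # S_XY
    beta1 = s_xy / s_xx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

# Made-up, noise-free data lying on the line y = 2 + 3x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 3.0 * x
beta0_hat, beta1_hat = least_squares_fit(x, y)
print(beta0_hat, beta1_hat)
```

With noise-free data the fit should recover the intercept and slope of the generating line up to rounding.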

Page 4: Gu Yuxian Wang Weinan Beijing National Day School.

The Maximum Likelihood Estimator

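• Assume that $\varepsilon_i \sim N(0, \sigma^2)$.

• So

$$F_{Y_i}(y) = P(Y_i \le y) = P(\beta_0+\beta_1 x_i+\varepsilon_i \le y) = P(\varepsilon_i \le y-\beta_0-\beta_1 x_i) = F_{\varepsilon_i}(y-\beta_0-\beta_1 x_i)$$

$$f_{Y_i}(y) = f_{\varepsilon_i}(y-\beta_0-\beta_1 x_i) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(y-\beta_0-\beta_1 x_i)^2}{2\sigma^2}}$$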

Page 5: Gu Yuxian Wang Weinan Beijing National Day School.

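• The likelihood function is $L = \prod_{i=1}^{n} f_{Y_i}(y_i)$, so

$$\ln L = -n\ln\left(\sqrt{2\pi}\,\sigma\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-\beta_0-\beta_1 x_i)^2$$

• Compute $\partial \ln L/\partial\beta_0$ and $\partial \ln L/\partial\beta_1$.

• Solve $\partial \ln L/\partial\beta_0 = 0$ and $\partial \ln L/\partial\beta_1 = 0$.

• We get

$$\hat\beta_0 = \bar y - \frac{S_{XY}}{S_{XX}}\,\bar x, \qquad \hat\beta_1 = \frac{S_{XY}}{S_{XX}}$$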
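The claim that the maximum likelihood estimates coincide with the least squares ones can be checked numerically. The sketch below is only an illustration under assumed values (simulated data, a fixed $\sigma = 1$, and a hypothetical helper `log_likelihood`): it evaluates the normal log-likelihood at the least squares estimates and at a few nearby parameter values.

```python
import numpy as np

def log_likelihood(beta0, beta1, sigma, x, y):
    """Normal log-likelihood of the simple linear regression model."""
    resid = y - beta0 - beta1 * x
    n = x.size
    return -n * np.log(np.sqrt(2 * np.pi) * sigma) - np.sum(resid ** 2) / (2 * sigma ** 2)

# Simulated data (assumed true line y = 1 + 0.5 x, sigma = 1).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, size=x.size)

# Least squares estimates from S_XY / S_XX.
s_xx = np.sum((x - x.mean()) ** 2)
s_xy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = s_xy / s_xx
b0 = y.mean() - b1 * x.mean()

best = log_likelihood(b0, b1, 1.0, x, y)
# Log-likelihood on a small grid of perturbed parameters around (b0, b1).
perturbed = [
    log_likelihood(b0 + d0, b1 + d1, 1.0, x, y)
    for d0 in (-0.1, 0.0, 0.1)
    for d1 in (-0.05, 0.0, 0.05)
]
print(best, max(perturbed))
```

For any fixed $\sigma$, maximizing $\ln L$ over $\beta_0, \beta_1$ is the same as minimizing the sum of squared residuals, which is why no perturbed point should beat the least squares estimates.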

Page 6: Gu Yuxian Wang Weinan Beijing National Day School.

Efficiency Analysis

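• They are unbiased: $E(\hat\beta_1) = \beta_1$ and $E(\hat\beta_0) = \beta_0$.

$$E(\hat\beta_1) = E\left[\frac{\sum_{i=1}^{n}(x_i-\bar x)Y_i}{S_{XX}}\right] = \frac{\sum_{i=1}^{n}(x_i-\bar x)E(Y_i)}{S_{XX}} = \frac{\sum_{i=1}^{n}(x_i-\bar x)(\beta_0+\beta_1 x_i)}{S_{XX}} = \frac{\beta_1\sum_{i=1}^{n}(x_i-\bar x)x_i}{S_{XX}} = \beta_1$$

$$E(\hat\beta_0) = E(\bar Y) - \bar x\,E(\hat\beta_1) = \frac{1}{n}\sum_{i=1}^{n}(\beta_0+\beta_1 x_i) - \beta_1\bar x = \beta_0$$

Here we used $\sum_{i=1}^{n}(x_i-\bar x) = 0$ and $\sum_{i=1}^{n}(x_i-\bar x)x_i = S_{XX}$.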
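Unbiasedness can also be illustrated by simulation. The sketch below is a Monte Carlo check under assumed values ($\beta_0 = 1$, $\beta_1 = 2$, $\sigma = 1$, 20 fixed design points, 20000 replications); none of these numbers come from the slides.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 20)           # fixed, error-free design points
beta0, beta1, sigma = 1.0, 2.0, 1.0      # assumed true parameters
reps = 20000

# Each row of Y is one simulated sample Y_i = beta0 + beta1 * x_i + eps_i.
eps = rng.normal(0.0, sigma, size=(reps, x.size))
Y = beta0 + beta1 * x + eps

x_bar = x.mean()
s_xx = np.sum((x - x_bar) ** 2)
# S_XY = sum (x_i - x_bar) Y_i, because sum (x_i - x_bar) = 0.
b1 = ((x - x_bar) * Y).sum(axis=1) / s_xx
b0 = Y.mean(axis=1) - b1 * x_bar

print(b0.mean(), b1.mean())
```

The averages of the simulated estimates should land close to the assumed true values, which is what $E(\hat\beta_0) = \beta_0$ and $E(\hat\beta_1) = \beta_1$ predict.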

Page 7: Gu Yuxian Wang Weinan Beijing National Day School.

Part 2 Errors-in-Variables (EIV) Regression Model

• Used when the measurements of X are not accurate either.
• There are two ways to measure the errors:
• the orthogonal regression and the geometric mean regression.

Page 8: Gu Yuxian Wang Weinan Beijing National Day School.

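The Orthogonal Regression (OR)

• The distances between the regression line and the points are

$$d_i = \frac{|y_i-\beta_0-\beta_1 x_i|}{\sqrt{1+\beta_1^2}}$$

• To minimize

$$\Delta = \sum_{i=1}^{n}\frac{(y_i-\beta_0-\beta_1 x_i)^2}{1+\beta_1^2}$$

compute $\partial\Delta/\partial\beta_0$ and $\partial\Delta/\partial\beta_1$ and solve $\partial\Delta/\partial\beta_0 = 0$, $\partial\Delta/\partial\beta_1 = 0$.

• We get

$$\hat\beta_1 = \frac{S_{YY}-S_{XX}+\sqrt{(S_{YY}-S_{XX})^2+4S_{XY}^2}}{2S_{XY}}, \qquad \hat\beta_0 = \bar y - \hat\beta_1\bar x$$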
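The orthogonal regression slope can be sketched in a few lines. The helper name `orthogonal_regression` and the five data points are invented for illustration, and the sketch assumes $S_{XY} \neq 0$. The orthogonal regression line is known to coincide with the first principal axis of the data, which gives an independent way to cross-check the closed form.

```python
import numpy as np

def orthogonal_regression(x, y):
    """Closed-form orthogonal regression line; assumes S_XY != 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    s_xx = np.sum((x - x.mean()) ** 2)
    s_yy = np.sum((y - y.mean()) ** 2)
    s_xy = np.sum((x - x.mean()) * (y - y.mean()))
    diff = s_yy - s_xx
    beta1 = (diff + np.sqrt(diff ** 2 + 4.0 * s_xy ** 2)) / (2.0 * s_xy)
    beta0 = y.mean() - beta1 * x.mean()
    return beta0, beta1

# Made-up data scattered around the line y = x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.1])
beta0_or, beta1_or = orthogonal_regression(x, y)

# Cross-check: slope of the leading eigenvector of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(np.vstack([x, y])))
principal = eigvecs[:, -1]               # eigh sorts eigenvalues ascending
pca_slope = principal[1] / principal[0]
print(beta1_or, pca_slope)
```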

Page 9: Gu Yuxian Wang Weinan Beijing National Day School.

The Geometric Mean Regression(GMR)

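• The area of the triangle formed by the vertical and horizontal distances from a point to the line is

$$\frac{(y_i-\beta_0-\beta_1 x_i)^2}{2|\beta_1|}$$

• To minimize

$$\Delta = \sum_{i=1}^{n}\frac{(y_i-\beta_0-\beta_1 x_i)^2}{2|\beta_1|}$$

compute $\partial\Delta/\partial\beta_0$ and $\partial\Delta/\partial\beta_1$ and solve $\partial\Delta/\partial\beta_0 = 0$, $\partial\Delta/\partial\beta_1 = 0$;

we get

$$\hat\beta_1 = \sqrt{\frac{S_{YY}}{S_{XX}}}, \qquad \hat\beta_0 = \bar y - \hat\beta_1\bar x$$

(for $S_{XY} > 0$; in general the slope takes the sign of $S_{XY}$).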
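A minimal sketch of the geometric mean regression follows, assuming the triangle-area criterion $\sum (y_i-\beta_0-\beta_1 x_i)^2 / (2|\beta_1|)$ and taking the slope's sign from $S_{XY}$. The function names and the data are hypothetical. The sketch also evaluates the criterion slightly to either side of the estimate as a sanity check that it sits at a minimum.

```python
import numpy as np

def geometric_mean_regression(x, y):
    """GMR line: slope sign(S_XY) * sqrt(S_YY / S_XX)."""
    s_xx = np.sum((x - x.mean()) ** 2)
    s_yy = np.sum((y - y.mean()) ** 2)
    s_xy = np.sum((x - x.mean()) * (y - y.mean()))
    beta1 = np.sign(s_xy) * np.sqrt(s_yy / s_xx)
    beta0 = y.mean() - beta1 * x.mean()
    return beta0, beta1

def triangle_area_sum(beta1, x, y):
    """Sum of triangle areas, with beta0 profiled out as y_bar - beta1 * x_bar."""
    beta0 = y.mean() - beta1 * x.mean()
    resid = y - beta0 - beta1 * x
    return np.sum(resid ** 2) / (2.0 * abs(beta1))

# Made-up data scattered around the line y = x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.1])
beta0_gmr, beta1_gmr = geometric_mean_regression(x, y)

area_at_estimate = triangle_area_sum(beta1_gmr, x, y)
area_nearby = [triangle_area_sum(beta1_gmr * f, x, y) for f in (0.99, 1.01)]
print(beta1_gmr, area_at_estimate, area_nearby)
```

The name "geometric mean" reflects that the slope is the geometric mean of the two ordinary regression slopes $S_{XY}/S_{XX}$ (y on x) and $S_{YY}/S_{XY}$ (x on y, inverted).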

Page 10: Gu Yuxian Wang Weinan Beijing National Day School.

Parametric Method

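• Assume

$$X = \xi + \delta, \qquad Y = \beta_0 + \beta_1\xi + \varepsilon$$

$$\xi \sim N(\mu, \tau^2), \qquad \delta \sim N(0, \sigma_\delta^2), \qquad \varepsilon \sim N(0, \sigma_\varepsilon^2)$$

Then X and Y follow a bivariate normal distribution.

• We use the moment generating function (mgf) to derive the distribution of X and Y: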

Page 11: Gu Yuxian Wang Weinan Beijing National Day School.

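• Since $\xi$, $\delta$ and $\varepsilon$ are independent, we can separate the mgf:

$$M_{X,Y}(t_1,t_2) = E\left(e^{t_1X+t_2Y}\right) = E\left(e^{t_1(\xi+\delta)+t_2(\beta_0+\beta_1\xi+\varepsilon)}\right) = e^{\beta_0t_2}\,M_\xi(t_1+\beta_1t_2)\,M_\delta(t_1)\,M_\varepsilon(t_2)$$

$$= e^{\beta_0t_2}\; e^{\mu(t_1+\beta_1t_2)+\frac{1}{2}\tau^2(t_1+\beta_1t_2)^2}\; e^{\frac{1}{2}\sigma_\delta^2t_1^2}\; e^{\frac{1}{2}\sigma_\varepsilon^2t_2^2}$$

$$= \exp\left\{\mu t_1+(\beta_0+\beta_1\mu)t_2+\frac{1}{2}\left[(\tau^2+\sigma_\delta^2)t_1^2+2\beta_1\tau^2t_1t_2+(\beta_1^2\tau^2+\sigma_\varepsilon^2)t_2^2\right]\right\}$$

• This is the mgf of a bivariate normal distribution, so

$$\begin{pmatrix}X\\ Y\end{pmatrix} \sim N\left(\begin{pmatrix}\mu\\ \beta_0+\beta_1\mu\end{pmatrix},\ \begin{pmatrix}\tau^2+\sigma_\delta^2 & \beta_1\tau^2\\ \beta_1\tau^2 & \beta_1^2\tau^2+\sigma_\varepsilon^2\end{pmatrix}\right)$$

• The parameters are estimated with the method of moments estimator (MOME).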

Page 12: Gu Yuxian Wang Weinan Beijing National Day School.

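$$\begin{aligned}
E(X) &= \mu = \bar x = \frac{1}{n}\sum_{i=1}^{n}x_i\\
E(Y) &= \beta_0+\beta_1\mu = \bar y = \frac{1}{n}\sum_{i=1}^{n}y_i\\
D(X) &= \tau^2+\sigma_\delta^2 = \frac{1}{n}\sum_{i=1}^{n}x_i^2-\left(\frac{1}{n}\sum_{i=1}^{n}x_i\right)^2 = \frac{S_{XX}}{n}\\
D(Y) &= \beta_1^2\tau^2+\sigma_\varepsilon^2 = \frac{1}{n}\sum_{i=1}^{n}y_i^2-\left(\frac{1}{n}\sum_{i=1}^{n}y_i\right)^2 = \frac{S_{YY}}{n}\\
\mathrm{Cov}(X,Y) &= E(XY)-E(X)E(Y) = \beta_1\tau^2 = \frac{1}{n}\sum_{i=1}^{n}x_iy_i-\frac{1}{n}\sum_{i=1}^{n}x_i\cdot\frac{1}{n}\sum_{i=1}^{n}y_i = \frac{S_{XY}}{n}
\end{aligned}$$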
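The step from the five moment equations to a slope estimate is worth spelling out. The block below is a sketch under stated assumptions: the model is written as $X = \xi + \delta$, $Y = \beta_0 + \beta_1\xi + \varepsilon$ with $\mathrm{Var}(\xi) = \tau^2$, and the error-variance ratio $\lambda = \sigma_\varepsilon^2/\sigma_\delta^2$ is treated as known (six unknowns cannot be pinned down by five equations otherwise). The symbol names are a notational choice made here, since the transcript lost the original Greek letters.

```latex
% From Cov(X,Y) = beta_1 tau^2 = S_XY / n:
\tau^2 = \frac{S_{XY}}{n\beta_1}
% Substituting into D(X) and D(Y):
\sigma_\delta^2 = \frac{1}{n}\left(S_{XX} - \frac{S_{XY}}{\beta_1}\right),
\qquad
\sigma_\varepsilon^2 = \frac{1}{n}\left(S_{YY} - \beta_1 S_{XY}\right)
% Imposing sigma_eps^2 = lambda * sigma_delta^2 and multiplying by beta_1:
S_{XY}\,\beta_1^2 - (S_{YY} - \lambda S_{XX})\,\beta_1 - \lambda S_{XY} = 0
% The root with the same sign as S_XY (needed for tau^2 > 0):
\hat\beta_1 = \frac{S_{YY} - \lambda S_{XX}
  + \sqrt{(S_{YY} - \lambda S_{XX})^2 + 4\lambda S_{XY}^2}}{2 S_{XY}}
```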

Page 13: Gu Yuxian Wang Weinan Beijing National Day School.

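• We get, with the variance ratio $\lambda = \sigma_\varepsilon^2/\sigma_\delta^2$ taken as known:

$$\hat\beta_1 = \frac{S_{YY}-\lambda S_{XX}+\sqrt{(S_{YY}-\lambda S_{XX})^2+4\lambda S_{XY}^2}}{2S_{XY}}, \qquad \hat\beta_0 = \bar y - \hat\beta_1\bar x$$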
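To see how the slope estimate behaves as the assumed variance ratio changes, here is a short numerical sketch. It assumes the estimator has the form $\hat\beta_1(\lambda) = \left[S_{YY}-\lambda S_{XX}+\sqrt{(S_{YY}-\lambda S_{XX})^2+4\lambda S_{XY}^2}\right]/(2S_{XY})$ with $\lambda$ the ratio of the error variance in Y to the error variance in X; the function name `eiv_slope` and the three sums are made up.

```python
import numpy as np

def eiv_slope(s_xx, s_yy, s_xy, lam):
    """Errors-in-variables slope for a known variance ratio lam (assumed form)."""
    d = s_yy - lam * s_xx
    return (d + np.sqrt(d * d + 4.0 * lam * s_xy ** 2)) / (2.0 * s_xy)

# Made-up sums of squares with S_XY > 0 and S_XY^2 <= S_XX * S_YY.
s_xx, s_yy, s_xy = 10.0, 12.0, 8.0

slope_or = eiv_slope(s_xx, s_yy, s_xy, 1.0)             # lam = 1: orthogonal regression
slope_gmr = eiv_slope(s_xx, s_yy, s_xy, s_yy / s_xx)    # lam = S_YY/S_XX: geometric mean
slope_y_exact = eiv_slope(s_xx, s_yy, s_xy, 0.0)        # lam = 0: Y has no error
slope_x_exact = eiv_slope(s_xx, s_yy, s_xy, 1e6)        # large lam: X has almost no error

print(slope_y_exact, slope_or, slope_gmr, slope_x_exact)
```

With $S_{XY} > 0$ the slopes run from $S_{YY}/S_{XY}$ at $\lambda = 0$ down towards $S_{XY}/S_{XX}$ as $\lambda$ grows, with the orthogonal and geometric mean slopes in between.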

Page 14: Gu Yuxian Wang Weinan Beijing National Day School.

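Special Situation for MLE

• The Orthogonal Regression (OR): take $\lambda = 1$ in

$$\hat\beta_1 = \frac{S_{YY}-\lambda S_{XX}+\sqrt{(S_{YY}-\lambda S_{XX})^2+4\lambda S_{XY}^2}}{2S_{XY}}$$

to get

$$\hat\beta_1 = \frac{S_{YY}-S_{XX}+\sqrt{(S_{YY}-S_{XX})^2+4S_{XY}^2}}{2S_{XY}}$$

• The Geometric Mean Regression (GMR): take $\lambda = S_{YY}/S_{XX}$ to get

$$\hat\beta_1 = \frac{S_{YY}-\lambda S_{XX}+\sqrt{(S_{YY}-\lambda S_{XX})^2+4\lambda S_{XY}^2}}{2S_{XY}} = \frac{|S_{XY}|}{S_{XY}}\sqrt{\frac{S_{YY}}{S_{XX}}}$$

which equals $\sqrt{S_{YY}/S_{XX}}$ when $S_{XY} > 0$.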

Page 15: Gu Yuxian Wang Weinan Beijing National Day School.

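$$\lambda = \frac{\sigma_\varepsilon^2}{\sigma_\delta^2} = 0: \qquad \hat\beta_1 = \frac{S_{YY}}{S_{XY}}$$

–This is when Y has no error.

$$\lambda = \frac{\sigma_\varepsilon^2}{\sigma_\delta^2} \to \infty: \qquad \hat\beta_1 = \frac{S_{XY}}{S_{XX}}$$

–This is when X has no error, so we get the same answer as our first discussion.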
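The two limiting cases follow from a short algebraic rewriting. The block below is a worked sketch assuming the slope has the form $\hat\beta_1(\lambda) = \left[S_{YY}-\lambda S_{XX}+\sqrt{(S_{YY}-\lambda S_{XX})^2+4\lambda S_{XY}^2}\right]/(2S_{XY})$, with $\lambda$ the ratio of the error variance in Y to the error variance in X.

```latex
% Multiply numerator and denominator by the conjugate
% sqrt(...) - (S_YY - lambda S_XX):
\hat\beta_1(\lambda)
  = \frac{2\lambda S_{XY}}
         {\lambda S_{XX} - S_{YY}
          + \sqrt{(S_{YY}-\lambda S_{XX})^2 + 4\lambda S_{XY}^2}}

% lambda -> infinity: divide through by lambda; the square root over lambda tends to S_XX
\lim_{\lambda\to\infty}\hat\beta_1(\lambda)
  = \frac{2 S_{XY}}{S_{XX} + S_{XX}} = \frac{S_{XY}}{S_{XX}}

% lambda = 0 in the original form: the square root reduces to S_YY
\hat\beta_1(0) = \frac{S_{YY} + S_{YY}}{2 S_{XY}} = \frac{S_{YY}}{S_{XY}}
```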

Page 16: Gu Yuxian Wang Weinan Beijing National Day School.

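Another Estimator

• We want an estimator that (1) covers all values of $\lambda$ (like the MLE) and (2) needs no distributional assumptions (like OR and GMR).

• Calculate, for a weight $c \in [0,1]$,

$$\Delta = \sum_{i=1}^{n}\left[(1-c)(y_i-\hat y_i)^2+c\,(x_i-\hat x_i)^2\right] = \left(1-c+\frac{c}{\beta_1^2}\right)\sum_{i=1}^{n}(y_i-\beta_0-\beta_1x_i)^2$$

where $\hat y_i = \beta_0+\beta_1x_i$ and $\hat x_i = (y_i-\beta_0)/\beta_1$. Setting $\partial\Delta/\partial\beta_0 = 0$ and $\partial\Delta/\partial\beta_1 = 0$ gives $\beta_0 = \bar y-\beta_1\bar x$ and

$$(1-c)S_{XX}\beta_1^4-(1-c)S_{XY}\beta_1^3+cS_{XY}\beta_1-cS_{YY} = 0$$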
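The quartic has no convenient closed-form solution, but it is easy to solve numerically. The sketch below assumes the equation $(1-c)S_{XX}\beta_1^4-(1-c)S_{XY}\beta_1^3+cS_{XY}\beta_1-cS_{YY}=0$ with $S_{XY} > 0$, and uses `numpy.roots` to pick the real root lying between $S_{XY}/S_{XX}$ and $S_{YY}/S_{XY}$. The function name `weighted_slope` and the sums are made up.

```python
import numpy as np

def weighted_slope(s_xx, s_yy, s_xy, c):
    """Real root of the quartic inside [S_XY/S_XX, S_YY/S_XY]; assumes S_XY > 0."""
    coeffs = [(1 - c) * s_xx, -(1 - c) * s_xy, 0.0, c * s_xy, -c * s_yy]
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    lo, hi = s_xy / s_xx, s_yy / s_xy
    inside = real[(real >= lo - 1e-9) & (real <= hi + 1e-9)]
    if inside.size != 1:
        raise ValueError("expected exactly one root in the interval")
    return float(inside[0])

# Made-up sums of squares with S_XY > 0.
s_xx, s_yy, s_xy = 10.0, 12.0, 8.0
slope_c0 = weighted_slope(s_xx, s_yy, s_xy, 0.0)    # only vertical residuals
slope_mid = weighted_slope(s_xx, s_yy, s_xy, 0.5)   # equal weights
slope_c1 = weighted_slope(s_xx, s_yy, s_xy, 1.0)    # only horizontal residuals
print(slope_c0, slope_mid, slope_c1)
```

At $c = 0$ the criterion uses only vertical residuals and at $c = 1$ only horizontal ones, so the two ends reproduce the ordinary regression slopes of y on x and of x on y (inverted).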

Page 17: Gu Yuxian Wang Weinan Beijing National Day School.

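Prove $\lambda$ is 1-1 to $\beta_1$

Let

$$\beta_1(\lambda) = \frac{S_{YY}-\lambda S_{XX}+\sqrt{(S_{YY}-\lambda S_{XX})^2+4\lambda S_{XY}^2}}{2S_{XY}}$$

Then

$$\frac{d\beta_1}{d\lambda} = \frac{1}{2S_{XY}}\left[\frac{2S_{XY}^2-S_{XX}(S_{YY}-\lambda S_{XX})}{\sqrt{(S_{YY}-\lambda S_{XX})^2+4\lambda S_{XY}^2}}-S_{XX}\right]$$

The bracket is never positive, because

$$\left[2S_{XY}^2-S_{XX}(S_{YY}-\lambda S_{XX})\right]^2-S_{XX}^2\left[(S_{YY}-\lambda S_{XX})^2+4\lambda S_{XY}^2\right] = 4S_{XY}^2\left(S_{XY}^2-S_{XX}S_{YY}\right) \le 0$$

So $\beta_1(\lambda)$ is monotonic in $\lambda$ (decreasing when $S_{XY} > 0$), and

$$\beta_1(0) = \frac{S_{YY}}{S_{XY}}, \qquad \lim_{\lambda\to\infty}\beta_1(\lambda) = \frac{S_{XY}}{S_{XX}}$$

We get

$$\lambda \in [0,\infty) \ \overset{1\text{-}1}{\longleftrightarrow}\ \beta_1 \in \left(\frac{S_{XY}}{S_{XX}},\ \frac{S_{YY}}{S_{XY}}\right]$$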

Page 18: Gu Yuxian Wang Weinan Beijing National Day School.

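Prove $c$ is 1-1 to $\beta_1$

Let

$$f(x) = (1-c)S_{XX}x^4-(1-c)S_{XY}x^3+cS_{XY}x-cS_{YY}, \qquad c \in [0,1]$$

We have

$$f\left(\frac{S_{XY}}{S_{XX}}\right) = \frac{c\left(S_{XY}^2-S_{XX}S_{YY}\right)}{S_{XX}} \le 0, \qquad f\left(\frac{S_{YY}}{S_{XY}}\right) = \frac{(1-c)\,S_{YY}^3\left(S_{XX}S_{YY}-S_{XY}^2\right)}{S_{XY}^4} \ge 0$$

So there is at least one root for

$$(1-c)S_{XX}\beta_1^4-(1-c)S_{XY}\beta_1^3+cS_{XY}\beta_1-cS_{YY} = 0$$

in the interval $\left[\frac{S_{XY}}{S_{XX}},\ \frac{S_{YY}}{S_{XY}}\right]$.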

Page 19: Gu Yuxian Wang Weinan Beijing National Day School.

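We can prove $f'(\beta_1) > 0$ when $\beta_1 \in \left[\frac{S_{XY}}{S_{XX}},\ \frac{S_{YY}}{S_{XY}}\right]$ and $S_{XY} > 0$: here $f'(x) = (1-c)x^2(4S_{XX}x-3S_{XY})+cS_{XY}$, and $4S_{XX}x-3S_{XY} \ge S_{XY} > 0$ on this interval.

So there is ONLY one root for

$$(1-c)S_{XX}\beta_1^4-(1-c)S_{XY}\beta_1^3+cS_{XY}\beta_1-cS_{YY} = 0$$

in this interval.

And when $c = 0$, $\beta_1 = \frac{S_{XY}}{S_{XX}}$; when $c = 1$, $\beta_1 = \frac{S_{YY}}{S_{XY}}$.

Then we have

$$c \in [0,1] \ \overset{1\text{-}1}{\longleftrightarrow}\ \beta_1 \in \left[\frac{S_{XY}}{S_{XX}},\ \frac{S_{YY}}{S_{XY}}\right]$$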

Page 20: Gu Yuxian Wang Weinan Beijing National Day School.

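Another Estimator Again

• The angle $\theta$: measure the distance from each point to the line along a fixed direction that makes the angle $\theta$ with the x-axis,

$$d_i = \frac{|y_i-\beta_0-\beta_1x_i|}{|\sin\theta-\beta_1\cos\theta|}$$

• Let

$$\Delta = \sum_{i=1}^{n}d_i^2 = \sum_{i=1}^{n}\frac{\left[y_i-(\beta_0+\beta_1x_i)\right]^2}{(\sin\theta-\beta_1\cos\theta)^2}$$

Compute $\partial\Delta/\partial\beta_0$ and $\partial\Delta/\partial\beta_1$ & solve $\partial\Delta/\partial\beta_0 = 0$, $\partial\Delta/\partial\beta_1 = 0$.

• We get

$$\hat\beta_1 = \frac{S_{XY}-S_{YY}\cot\theta}{S_{XX}-S_{XY}\cot\theta}, \qquad \hat\beta_0 = \bar y-\hat\beta_1\bar x$$

• ***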

cot1̂
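The angle-based estimator can be sketched as follows, assuming the criterion $\sum [y_i-(\beta_0+\beta_1x_i)]^2/(\sin\theta-\beta_1\cos\theta)^2$ and the resulting slope $(S_{XY}-S_{YY}\cot\theta)/(S_{XX}-S_{XY}\cot\theta)$. The code multiplies numerator and denominator by $\sin\theta$ so that $\theta = 0$ causes no division by zero. The function names and the sums are invented for the example.

```python
import numpy as np

def angle_slope(s_xx, s_yy, s_xy, theta):
    """Slope when distances are measured along direction theta (assumed criterion)."""
    s, c = np.sin(theta), np.cos(theta)
    return (s * s_xy - c * s_yy) / (s * s_xx - c * s_xy)

def angle_criterion(beta1, s_xx, s_yy, s_xy, theta):
    """Criterion with beta0 profiled out as y_bar - beta1 * x_bar."""
    rss = s_yy - 2.0 * beta1 * s_xy + beta1 ** 2 * s_xx
    return rss / (np.sin(theta) - beta1 * np.cos(theta)) ** 2

# Made-up sums of squares.
s_xx, s_yy, s_xy = 10.0, 12.0, 8.0

slope_vertical = angle_slope(s_xx, s_yy, s_xy, np.pi / 2)   # vertical distances
slope_horizontal = angle_slope(s_xx, s_yy, s_xy, 0.0)       # horizontal distances

theta = np.pi / 3
slope_60 = angle_slope(s_xx, s_yy, s_xy, theta)
value_at_estimate = angle_criterion(slope_60, s_xx, s_yy, s_xy, theta)
values_nearby = [angle_criterion(slope_60 + d, s_xx, s_yy, s_xy, theta) for d in (-0.01, 0.01)]
print(slope_vertical, slope_horizontal, slope_60)
```

Vertical distances ($\theta = 90^\circ$) reproduce the ordinary least squares slope $S_{XY}/S_{XX}$, and horizontal distances ($\theta = 0$) give $S_{YY}/S_{XY}$.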

Page 21: Gu Yuxian Wang Weinan Beijing National Day School.

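Part 3 Multiple Linear Regression

The Least Squares Estimators

• Similar to simple linear regression, with $n$ observations and $p$ explanatory variables:

$$Y_i = \beta_0+\beta_1x_i^{(1)}+\beta_2x_i^{(2)}+\dots+\beta_px_i^{(p)}+\varepsilon_i$$

$$\Delta = \sum_{i=1}^{n}\left[y_i-\left(\beta_0+\beta_1x_i^{(1)}+\beta_2x_i^{(2)}+\dots+\beta_px_i^{(p)}\right)\right]^2$$

• Compute $\partial\Delta/\partial\beta_0 = 0,\ \partial\Delta/\partial\beta_1 = 0,\ \dots,\ \partial\Delta/\partial\beta_p = 0$.

• We will get a group of equations: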

Page 22: Gu Yuxian Wang Weinan Beijing National Day School.

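$$\begin{aligned}
n\beta_0+\beta_1\sum_{i=1}^{n}x_i^{(1)}+\beta_2\sum_{i=1}^{n}x_i^{(2)}+\dots+\beta_p\sum_{i=1}^{n}x_i^{(p)} &= \sum_{i=1}^{n}y_i\\
\beta_0\sum_{i=1}^{n}x_i^{(1)}+\beta_1\sum_{i=1}^{n}x_i^{(1)}x_i^{(1)}+\beta_2\sum_{i=1}^{n}x_i^{(1)}x_i^{(2)}+\dots+\beta_p\sum_{i=1}^{n}x_i^{(1)}x_i^{(p)} &= \sum_{i=1}^{n}x_i^{(1)}y_i\\
\beta_0\sum_{i=1}^{n}x_i^{(2)}+\beta_1\sum_{i=1}^{n}x_i^{(2)}x_i^{(1)}+\beta_2\sum_{i=1}^{n}x_i^{(2)}x_i^{(2)}+\dots+\beta_p\sum_{i=1}^{n}x_i^{(2)}x_i^{(p)} &= \sum_{i=1}^{n}x_i^{(2)}y_i\\
&\ \ \vdots\\
\beta_0\sum_{i=1}^{n}x_i^{(p)}+\beta_1\sum_{i=1}^{n}x_i^{(p)}x_i^{(1)}+\beta_2\sum_{i=1}^{n}x_i^{(p)}x_i^{(2)}+\dots+\beta_p\sum_{i=1}^{n}x_i^{(p)}x_i^{(p)} &= \sum_{i=1}^{n}x_i^{(p)}y_i
\end{aligned}$$

Assume its coefficient matrix is $A = (a_{ij})_{(p+1)\times(p+1)}$.

The solution is

$$\begin{pmatrix}\hat\beta_0\\ \hat\beta_1\\ \vdots\\ \hat\beta_p\end{pmatrix} = (a_{ij})^{-1}_{(p+1)\times(p+1)}\begin{pmatrix}\sum_{i=1}^{n}y_i\\ \sum_{i=1}^{n}x_i^{(1)}y_i\\ \vdots\\ \sum_{i=1}^{n}x_i^{(p)}y_i\end{pmatrix}$$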
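The group of normal equations can be assembled and solved directly. The sketch below builds the coefficient matrix from a design matrix with a leading column of ones and solves the system with `numpy.linalg.solve` instead of forming the inverse explicitly, a common numerical choice rather than something stated on the slide. The function name `normal_equations_fit` and the simulated data are assumptions for illustration.

```python
import numpy as np

def normal_equations_fit(X, y):
    """Solve the least squares normal equations A beta = b."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])   # column of ones carries beta_0
    A = design.T @ design                        # coefficient matrix (a_ij)
    b = design.T @ y                             # right-hand side: sums of x^(j) * y
    return np.linalg.solve(A, b)

# Made-up, noise-free data: y = 1 + 2 x1 - 3 x2.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
beta_hat = normal_equations_fit(X, y)
print(beta_hat)
```

The entry in the top-left corner of `A` is the number of observations, and the remaining entries are the sums and cross-product sums that appear in the equations above.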

Page 23: Gu Yuxian Wang Weinan Beijing National Day School.

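Errors-in-Variables (EIV) Regression Model (Two Variables)

$$Z_i = \beta_0+\beta_1X_i+\beta_2Y_i$$

• The Orthogonal Regression (OR): minimize

$$\sum_{i=1}^{n}\frac{\left[z_i-(\beta_0+\beta_1x_i+\beta_2y_i)\right]^2}{1+\beta_1^2+\beta_2^2}$$

• The Geometric Mean Regression (GMR1) (the volume): minimize

$$\sum_{i=1}^{n}\frac{\left|z_i-(\beta_0+\beta_1x_i+\beta_2y_i)\right|^3}{6\,|\beta_1\beta_2|}$$

• The Geometric Mean Regression (GMR2) (the sum of area): minimize

$$\sum_{i=1}^{n}\frac{1}{2}\left(\frac{1}{|\beta_1|}+\frac{1}{|\beta_2|}+\frac{1}{|\beta_1\beta_2|}\right)\left[z_i-(\beta_0+\beta_1x_i+\beta_2y_i)\right]^2$$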
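For the orthogonal regression with two explanatory variables, the slides only state the criterion. One standard way to minimize the sum of squared perpendicular distances to a plane is an eigen-decomposition of the scatter matrix: the plane passes through the centroid and its normal is the eigenvector with the smallest eigenvalue. The sketch below uses that technique (it is not taken from the slides); `orthogonal_plane` and the data are hypothetical, and the plane is assumed not to be vertical.

```python
import numpy as np

def orthogonal_plane(x, y, z):
    """Fit z = b0 + b1*x + b2*y by minimizing perpendicular distances."""
    points = np.column_stack([x, y, z]).astype(float)
    centroid = points.mean(axis=0)
    centered = points - centroid
    scatter = centered.T @ centered
    eigvals, eigvecs = np.linalg.eigh(scatter)   # eigenvalues in ascending order
    a, b, c = eigvecs[:, 0]                      # normal vector of the best plane
    beta1, beta2 = -a / c, -b / c                # assumes c != 0 (non-vertical plane)
    beta0 = centroid[2] - beta1 * centroid[0] - beta2 * centroid[1]
    return beta0, beta1, beta2

# Made-up, noise-free points on the plane z = 1 + 2x - y.
rng = np.random.default_rng(7)
x = rng.normal(size=25)
y = rng.normal(size=25)
z = 1.0 + 2.0 * x - y
plane = orthogonal_plane(x, y, z)
print(plane)
```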

Page 24: Gu Yuxian Wang Weinan Beijing National Day School.

Thanks!!!!!!