Alternative Statistical Models: Weighted Least Squares and Generalized Least Squares
Xingye Qiao and Dr. Jim Crooks
SAMSI/CRSC Undergraduate Workshop at NCSU
May 22, 2007
Outline
1 Introduction
  Recall of Ordinary Least-Squares Regression
  Current Model
2 Improved Model
  Weighted Least-Squares
  Generalized Least-Squares
1 Introduction

Recall of Ordinary Least-Squares Regression

Least-Squares Regression
- Linear: "linear" refers to the parameter(s), e.g. $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.
- Non-linear: "non-linear" refers to the parameter(s), e.g. $y_i = \exp(-\beta_1 x_i) + \alpha \cos(\beta_2 x_i) + \varepsilon_i$.
- Summary: $y_i = \eta(x_i; \beta) + \varepsilon_i$, where $\eta(x; \beta)$ is a deterministic function of $x$ with parameter $\beta$.
- Goal: estimate the parameter $\beta$.
OLS Estimation
Find $\beta$ to minimize
\[
\sum_{i=1}^{m} \left( y_i - \eta(x_i; \beta) \right)^2 ,
\]
giving $\hat{\beta}_{\mathrm{OLS}}$.
Standard statistical assumptions:
- The mean of $\varepsilon_i$ is 0 for all $i$.
- The variance of $\varepsilon_i$ is constant for all $i$, equal to $\sigma^2$.
- $\varepsilon_i$ and $\varepsilon_j$ are independent of each other for all $i \neq j$.
OLS Estimation (Cont.)
Properties of the OLS estimator:
- $\hat{\beta}_{\mathrm{OLS}}$ converges to $\beta$ as the sample size increases.
- It makes efficient use of the data, i.e. it has a small standard error.
These properties hold only when the model is the right model; to be more specific, when the standard statistical assumptions hold.
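To make the OLS recipe concrete, here is a minimal Python sketch (not part of the original slides) that fits the non-linear example model above. The simulated data, true parameter values, and starting point are invented for the demonstration; non-linear least squares can be sensitive to the choice of starting value.

```python
# A minimal sketch of nonlinear OLS for the example model
# y_i = exp(-b1 * x_i) + a * cos(b2 * x_i) + eps_i,
# using scipy.optimize.least_squares on simulated data.
import numpy as np
from scipy.optimize import least_squares

def eta(x, theta):
    a, b1, b2 = theta                       # parameters (alpha, beta1, beta2)
    return np.exp(-b1 * x) + a * np.cos(b2 * x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
theta_true = np.array([0.5, 0.3, 2.0])      # invented "true" parameters
y = eta(x, theta_true) + rng.normal(0.0, 0.05, size=x.size)  # iid N(0, sigma^2) errors

# least_squares minimizes the sum of squared residuals.
res = least_squares(lambda th: y - eta(x, th), x0=np.array([1.0, 1.0, 1.0]))
print("OLS estimate:", res.x)
```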
Current Model

Inverse Problem
Spring model:
\[
\frac{d^2 y(t)}{dt^2} + C \frac{dy(t)}{dt} + K y(t) = 0
\]
- For each given $C$ and $K$, the differential equation has a unique solution given initial values, denoted $y(t; C, K)$.
- Target: estimate $C$ and $K$ based on the observed $y_i$.
- Minimize the cost function
\[
L(C, K) = \sum_{i=1}^{m} \left( y_i - y(t_i; C, K) \right)^2 .
\]
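A hedged sketch of how this inverse problem might be solved numerically, assuming initial conditions $y(0) = 1$, $y'(0) = 0$ and invented true values $C = 0.5$, $K = 4$. The ODE solver, optimizer, and noise level are illustrative choices, not the workshop's actual code.

```python
# Fit C and K by minimizing L(C, K) over ODE solutions.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def spring(t, state, C, K):
    # First-order system for y'' + C y' + K y = 0, with state = [y, y'].
    y, v = state
    return [v, -C * v - K * y]

def y_model(t, C, K):
    # Solve the ODE with assumed initial conditions y(0)=1, y'(0)=0.
    sol = solve_ivp(spring, (t[0], t[-1]), [1.0, 0.0], t_eval=t, args=(C, K))
    return sol.y[0]

rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 100)
y_obs = y_model(t, 0.5, 4.0) + rng.normal(0.0, 0.02, size=t.size)  # simulated data

def cost(params):
    C, K = params
    return np.sum((y_obs - y_model(t, C, K)) ** 2)   # the cost L(C, K)

fit = minimize(cost, x0=[1.0, 1.0], method="Nelder-Mead")
print("Estimated [C, K]:", fit.x)
```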
Underlying Statistical Models
The above model can be viewed as a regression model
\[
y_i = y(t_i; C, K) + \varepsilon_i .
\]
Here the $\varepsilon_i$ are i.i.d. (independent and identically distributed) from $N(0, \sigma^2)$; that is, we suppose the statistical assumptions hold.
But is this model the right model?
Violation of Statistical Assumptions
1 Is the variance of $\varepsilon_i$ constant across the time range?
2 Are the errors independent?
3 Are the errors from $N(0, \sigma^2)$?
Implication:
- The standard statistical assumptions don't hold.
- $[\hat{C}, \hat{K}]$ are no longer good estimators for $[C, K]$.
- We should find a way to remedy this problem.
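One simple way to probe the first two questions (an illustrative addition, not from the slides) is to inspect the residuals: compare their spread across the time range, and compute a lag-1 autocorrelation. The stand-in residuals below are simulated with decaying variance, as one might see for a damped spring.

```python
# A quick, generic residual check: compare early/late spread and
# estimate lag-1 autocorrelation as a crude test of independence.
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.0, 10.0, 100)
# Stand-in residuals whose spread decays with time.
r = np.exp(-0.3 * t) * rng.normal(0.0, 1.0, size=t.size)

half = t.size // 2
print("SD early half:", r[:half].std(), " SD late half:", r[half:].std())

r_c = r - r.mean()
lag1 = np.sum(r_c[1:] * r_c[:-1]) / np.sum(r_c ** 2)
print("Lag-1 autocorrelation:", lag1)
```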
2 Improved Model
Weighted Least-Squares

Assumption
Instead of the constant-variance assumption, we deal with nonconstant variance here. Assume
\[
\mathrm{Var}(\varepsilon_i) = \frac{\sigma^2}{w_i}, \quad i = 1, \ldots, m,
\]
for known $w_i$.
What does it mean for $(y_i, t_i)$ if $w_i$ is large?
⇔ This observation is of high quality.
⇔ This observation is of importance.
Solving Weighted Least Squares in the Linear Case
Consider the linear context,
\[
y_i = x_i^T \beta + \varepsilon_i .
\]
Denote
\[
y_i^* = \sqrt{w_i}\, y_i, \qquad x_i^* = \sqrt{w_i}\, x_i .
\]
Then
\[
y_i^* = x_i^{*T} \beta + \sqrt{w_i}\, \varepsilon_i ,
\]
where $\mathrm{Var}(\sqrt{w_i}\, \varepsilon_i) = w_i \, \mathrm{Var}(\varepsilon_i) = \sigma^2$.
Solving Weighted Least Squares in the Linear Case (Cont.)
Minimizing the weighted sum of squared errors
\[
S = \sum_{i=1}^{n} w_i \left( y_i - x_i^T \beta \right)^2
\]
is the same as minimizing the ordinary sum of squared errors
\[
S = \sum_{i=1}^{n} \left( y_i^* - x_i^{*T} \beta \right)^2 .
\]
In matrix notation, the weighted least squares estimator of $\beta$ is
\[
\hat{\beta} = (X^{*T} X^*)^{-1} X^{*T} Y^* = (X^T W X)^{-1} X^T W Y, \qquad W = \mathrm{diag}\{w_1, \ldots, w_n\} .
\]
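A minimal numerical check (synthetic data, invented weights) that the closed form $(X^T W X)^{-1} X^T W Y$ and OLS on the starred variables give the same estimate:

```python
# Two equivalent routes to the WLS estimate on simulated linear data.
import numpy as np

rng = np.random.default_rng(3)
n = 50
X = np.column_stack([np.ones(n), rng.uniform(0.0, 5.0, n)])  # intercept + slope
w = rng.uniform(0.5, 2.0, n)                                 # known weights
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(0.0, 1.0 / np.sqrt(w))        # Var(eps_i) = sigma^2 / w_i

# Closed form: (X^T W X)^{-1} X^T W Y.
W = np.diag(w)
beta_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Same estimate from OLS on the starred (sqrt(w)-rescaled) variables.
Xs = np.sqrt(w)[:, None] * X
ys = np.sqrt(w) * y
beta_star, *_ = np.linalg.lstsq(Xs, ys, rcond=None)

print(beta_wls, beta_star)   # the two estimates agree
```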
Estimation
Instead of minimizing $\sum_{i=1}^{m} (y_i - y(t_i; C, K))^2$ as in OLS, here we minimize
\[
L(C, K) = \sum_{i=1}^{m} w_i \left( y_i - y(t_i; C, K) \right)^2 ,
\]
to give $\hat{C}$ and $\hat{K}$.
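Relative to the earlier inverse-problem sketch, only the cost function changes. A hedged fragment, reusing `t`, `y_obs`, and `y_model` from that sketch and treating the weights `w` as known:

```python
# Weighted version of the earlier cost; reuses t, y_obs, y_model from
# the inverse-problem sketch above, with weights w assumed known.
import numpy as np
from scipy.optimize import minimize

w = np.ones_like(y_obs)          # placeholder: replace with the known weights

def weighted_cost(params):
    C, K = params
    return np.sum(w * (y_obs - y_model(t, C, K)) ** 2)   # L(C, K)

fit_wls = minimize(weighted_cost, x0=[1.0, 1.0], method="Nelder-Mead")
print("WLS estimate of [C, K]:", fit_wls.x)
```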
Value of $w_i$
In practice, we don't know the $w_i$. Several ways to estimate them:
1 Estimate $\mathrm{Var}(\varepsilon_i)$ as $\sigma_i^2$ from repeated measurements at time $t_i$:
\[
w_i = \frac{\sigma^2}{\sigma_i^2} .
\]
2 If the error is larger for larger $|y_i|$, simply let $w_i = 1 / y_i^2$.
3 Or, alternatively, assume that $w_i = 1 / y^2(t_i; C, K)$.
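Option 3 makes the weights depend on the unknown parameters, so in practice one might iterate: fit, recompute the weights at the current fit, and refit. This iterative reweighting loop is an illustrative addition, not something the slides prescribe; it again reuses `t`, `y_obs`, and `y_model` from the inverse-problem sketch.

```python
# Iterative reweighting with w_i = 1 / y(t_i; C, K)^2 (illustrative).
import numpy as np
from scipy.optimize import minimize

params = np.array([1.0, 1.0])                 # starting [C, K]
for _ in range(5):                            # a few reweighting passes
    y_hat = y_model(t, *params)
    w = 1.0 / np.maximum(y_hat ** 2, 1e-6)    # guard against division by zero
    cost = lambda p: np.sum(w * (y_obs - y_model(t, *p)) ** 2)
    params = minimize(cost, x0=params, method="Nelder-Mead").x
print("Iteratively reweighted estimate:", params)
```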
Generalized Least-Squares

Assumption
More generally, we now deal with correlated observations and nonconstant variance (weighted least squares only deals with nonconstant variance):
Let $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_m)^T$, and assume
\[
\mathrm{Cov}(\varepsilon) = \sigma^2 V, \quad \text{for known matrix } V .
\]
Let $W = V^{-1}$. Remember that if $V$ is a diagonal matrix, then this is the weighted least squares case, with $W = \mathrm{diag}\{w_1, w_2, \ldots, w_m\}$.
Estimation
The generalized least squares estimator minimizes
\[
L(C, K) = \{ y - y(t; C, K) \}^T W \{ y - y(t; C, K) \} ,
\]
to give $\hat{C}$ and $\hat{K}$.
If the proposed covariance model holds, then the estimators have good properties.
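A minimal sketch of the GLS objective, again reusing `t`, `y_obs`, and `y_model` from the inverse-problem sketch. The AR(1) covariance $V_{ij} = \rho^{|i-j|}$ is an invented example of correlated errors, chosen only to make $V$ concrete; the slides do not specify a particular $V$.

```python
# GLS fit with an assumed AR(1) covariance structure (illustrative).
import numpy as np
from scipy.optimize import minimize

rho = 0.5                                        # assumed AR(1) correlation
idx = np.arange(t.size)
V = rho ** np.abs(idx[:, None] - idx[None, :])   # V_ij = rho^|i-j|
W = np.linalg.inv(V)                             # W = V^{-1}

def gls_cost(params):
    r = y_obs - y_model(t, *params)
    return r @ W @ r                             # {y - y(t;C,K)}^T W {y - y(t;C,K)}

fit_gls = minimize(gls_cost, x0=[1.0, 1.0], method="Nelder-Mead")
print("GLS estimate of [C, K]:", fit_gls.x)
```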
Anything More?
- Does this improved model work better?
- If not, what might be the main problem?
Let Jim take over.