Contents: Generals, What Regularization Parameter λ is Optimal?, Generalized Cross-Validation, Discussion
Generalized Cross Validation
Mårten Marcus
26 November 2009
Mårten Marcus Generalized Cross Validation
Plan
1 Generals
2 What Regularization Parameter λ is Optimal?
Examples
Defining the Optimal λ
3 Generalized Cross-Validation
Cross Validation
Generalized Cross Validation (GCV)
Convergence Result
4 Discussion
References
Grace Wahba, Spline Models for Observational Data (1990), Chapter 4
Behzad Shahraray and David Anderson, Optimal Estimation of Contour Properties by Cross-Validated Regularization (1989)
Peter Craven and Grace Wahba, Smoothing Noisy Data with Spline Functions (1979)
Conditional Expectations
From probability theory, for $X \in L^2(\Omega, \mathcal{F}, P)$ we have
$$E[X] = \operatorname*{arg\,min}_{\theta \in \mathbb{R}} E\big[(X - \theta)^2\big] \tag{1}$$
The least squares estimate is therefore a discretized (sample) estimate of θ, and the squared error is a natural norm to choose when fitting data disturbed by white noise.
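As a quick numerical check of (1), the following minimal sketch compares the grid minimizer of the empirical mean squared deviation with the sample mean. The Gaussian sample, the seed and the grid of candidate θ values are illustrative assumptions, not part of the talk.

```python
import numpy as np

# Illustrative sample (assumption): 10,000 Gaussian draws with mean 3.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=10_000)

def mean_squared_deviation(theta, sample):
    """Empirical counterpart of E[(X - theta)^2]."""
    return np.mean((sample - theta) ** 2)

# Evaluate the criterion on a grid of candidate theta values.
thetas = np.linspace(0.0, 6.0, 601)
losses = np.array([mean_squared_deviation(th, x) for th in thetas])
theta_best = thetas[np.argmin(losses)]

print("grid minimizer:", theta_best)
print("sample mean:   ", x.mean())
```

Up to the grid resolution, the minimizer coincides with the sample mean, which is the discretized version of the statement above.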
Ill-posedness
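Consider the model $y_i = g(t_i) + \varepsilon_i$, $i = 1, 2, \dots, n$, $t_i \in [0, 1]$, where
$$g \in W_2^{(m)} = \{ f \mid f', f'', \dots, f^{(m-1)} \text{ abs. cont.},\ f^{(m)} \in L^2[0, 1] \}$$
and $\varepsilon_i \sim WN(0, \sigma^2)$, where σ is unknown.
The least squares estimate gives
$$\min_{g \in W_2^{(m)}} \sum_{i=1}^{n} (y_i - g(t_i))^2 = 0 \quad \forall \{y_i\}_{i=1}^{n} \tag{2}$$
The minimum value does not depend on the data: for distinct $t_i$, any function in $W_2^{(m)}$ that interpolates the data attains it. The minimizer is therefore not unique and the problem is clearly ill-posed.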
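The non-uniqueness behind (2) can be illustrated with a small sketch: two different smooth functions that both pass through every data point, so both reach a residual sum of zero. The six sample points, the noise level and the factor 50 are arbitrary choices made for illustration only.

```python
import numpy as np

# Illustrative data (assumption): 6 noisy samples of a sine on [0, 1].
rng = np.random.default_rng(1)
n = 6
t = np.linspace(0.0, 1.0, n)                      # distinct sample points
y = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(n)

p = np.polyfit(t, y, deg=n - 1)    # degree n-1 polynomial through all n points
bump = np.poly(t)                  # monic polynomial that vanishes at every t_i
q = np.polyadd(p, 50.0 * bump)     # a second, different interpolant

def residual_sum(coeffs):
    """Sum of squared residuals of the polynomial at the sample points."""
    return np.sum((y - np.polyval(coeffs, t)) ** 2)

t_mid = 0.5 * (t[0] + t[1])        # a point between two samples
gap = abs(np.polyval(p, t_mid) - np.polyval(q, t_mid))

print("residual sums:", residual_sum(p), residual_sum(q))
print("difference between the two fits at t_mid:", gap)
```

Both residual sums are zero up to rounding, while the two fitted functions differ between the sample points.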
Regularization
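We may choose to regularize, or smooth, the data as
$$g_{n,\lambda} = \operatorname*{arg\,min}_{g \in W_2^{(m)}} \sum_{i=1}^{n} (y_i - g(t_i))^2 + \lambda \int_0^1 (g^{(m)}(t))^2 \, dt, \qquad \lambda \ge 0 \tag{3}$$
which has a unique solution when λ > 0 and there are at least m distinct sample points.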
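A discrete analogue of (3) can be sketched in a few lines. Here the integral of the squared m-th derivative is replaced by the sum of squared m-th differences on an equally spaced grid (a Whittaker-type smoother). This is a stand-in chosen for brevity, not the spline estimator of the talk, and the helper names `smooth` and `objective` are hypothetical.

```python
import numpy as np

def smooth(y, lam, m=2):
    """Minimize sum (y_i - f_i)^2 + lam * sum (m-th difference of f)^2."""
    n = len(y)
    D = np.diff(np.eye(n), n=m, axis=0)           # m-th difference operator
    # The normal equations (I + lam * D^T D) f = y have a unique solution.
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

def objective(f, y, lam, m=2):
    """Value of the penalized least squares criterion at f."""
    return np.sum((y - f) ** 2) + lam * np.sum(np.diff(f, n=m) ** 2)

# Illustrative data (assumption): noisy sine on an equally spaced grid.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * t) + 0.2 * rng.standard_normal(t.size)

lam = 10.0
f_hat = smooth(y, lam)
best = objective(f_hat, y, lam)

# Random perturbations of the solution should all give a larger criterion.
worse = [objective(f_hat + 0.01 * rng.standard_normal(t.size), y, lam)
         for _ in range(5)]
print(best, min(worse))
```

Because the criterion is a strictly convex quadratic in the fitted values, every perturbation increases it, which mirrors the uniqueness statement above.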
Properties of gn,λ
The reason for the term smoothing spline is the following:
- For λ = 0, $g_{n,\lambda}$ can be seen as an interpolating spline.
- For λ = ∞, $g_{n,\lambda}$ is a single polynomial of degree at most m − 1 (the optimal one in the least squares sense).
- For 0 < λ < ∞, it can be shown that $g_{n,\lambda}$ is composed of polynomials of degree at most 2m − 1 on the intervals $[t_i, t_{i+1}]$, $i \in \{1, 2, \dots, n-1\}$, such that the function and its derivatives up to and including order 2m − 2 are continuous at the knots, and $g_{n,\lambda}^{(k)}(t_1) = g_{n,\lambda}^{(k)}(t_n) = 0$ for k = m, m + 1, …, 2m − 2, i.e. natural conditions at the end points.
The quality of the result therefore depends strongly on a good choice of λ.
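The two limiting cases can be checked numerically on a discrete analogue of the smoothing problem, in which the derivative penalty is replaced by squared m-th differences on an equally spaced grid. This substitution, the data and the values 1e-8 and 1e8 standing in for λ → 0 and λ → ∞ are assumptions of this sketch.

```python
import numpy as np

def smooth(y, lam, m=2):
    """Discrete smoother: solve (I + lam * D^T D) f = y, D = m-th differences."""
    n = len(y)
    D = np.diff(np.eye(n), n=m, axis=0)
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

# Illustrative data (assumption): noisy sine on an equally spaced grid.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * t) + 0.2 * rng.standard_normal(t.size)

f_small = smooth(y, 1e-8)   # almost no smoothing: the fit reproduces the data
f_large = smooth(y, 1e8)    # heavy smoothing: the fit approaches a straight line

# Least squares polynomial of degree m - 1 = 1 for comparison.
line = np.polyval(np.polyfit(t, y, deg=1), t)

print("max |f_small - y|   :", np.max(np.abs(f_small - y)))
print("max |f_large - line|:", np.max(np.abs(f_large - line)))
```

With m = 2, the heavily smoothed fit agrees with the least squares line, the discrete counterpart of the degree m − 1 polynomial in the second item above.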
Examples
Example
Impact of different λ on $g_{n,\lambda}$ and $g'_{n,\lambda}$ for $y_i = g(t_i) + \varepsilon_i$, where $g(t) = \sin(\pi t / 180)$, $t \in [0, 360]$.
Defining the Optimal λ
True Mean Square Error R(λ) and Optimal λ
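Using the notation above, we define the true mean square error, R(λ), as
$$R(\lambda) := \frac{1}{n} \sum_{i=1}^{n} (g_{n,\lambda}(t_i) - g(t_i))^2 \tag{4}$$
The optimal λ is then defined as
$$\lambda^* = \operatorname*{arg\,min}_{\lambda \in \mathbb{R}^+} R(\lambda) \tag{5}$$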
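When the true function is known, as in a simulation, R(λ) can be evaluated directly and λ* found by a grid search. The sketch below does this for the example function g(t) = sin(πt/180) on [0, 360], using a discrete difference-penalty smoother in place of the smoothing spline. The smoother, the noise level σ = 0.3, the sample size and the λ grid are all assumptions made for illustration.

```python
import numpy as np

def influence_matrix(n, lam, m=2):
    """A(lam) of the discrete smoother: (I + lam * D^T D)^{-1}."""
    D = np.diff(np.eye(n), n=m, axis=0)
    return np.linalg.inv(np.eye(n) + lam * D.T @ D)

def true_mse(g_true, y, lam):
    """R(lam): mean squared distance between the fit and the true function."""
    fit = influence_matrix(len(y), lam) @ y
    return np.mean((fit - g_true) ** 2)

# Simulated data (assumption): noise level and sample size are arbitrary.
rng = np.random.default_rng(4)
n, sigma = 100, 0.3
t = np.linspace(0.0, 360.0, n)
g_true = np.sin(np.pi * t / 180.0)
y = g_true + sigma * rng.standard_normal(n)

# Grid search for the minimizer of R over a logarithmic grid.
lams = np.logspace(-2, 6, 41)
R = np.array([true_mse(g_true, y, lam) for lam in lams])
lam_star = lams[np.argmin(R)]

print(f"lambda* on the grid: {lam_star:.3g}, R(lambda*) = {R.min():.4f}")
print(f"R at the smallest and largest lambda: {R[0]:.4f}, {R[-1]:.4f}")
```

In practice g is unknown, so R(λ) cannot be computed from data alone, which is what motivates the cross-validation criteria.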
Cross Validation
Intuition
Given our sample of n measurements, we leave one measurement out and predict that data point using the n − 1 remaining measurements.
The idea is that the best model for the measurements is the one that best predicts each measurement as a function of the others.
The Ordinary Cross-Validation (OCV)
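Definition
Let $g_{n,\lambda}^{[k]}$, $k \in \{1, 2, \dots, n\}$, be defined as
$$g_{n,\lambda}^{[k]} = \operatorname*{arg\,min}_{g \in W_2^{(m)}} \sum_{\substack{i=1 \\ i \ne k}}^{n} (y_i - g(t_i))^2 + \lambda \int_0^1 (g^{(m)}(t))^2 \, dt, \qquad \lambda \ge 0 \tag{6}$$
Then we define the OCV mean square error as
$$V_0(\lambda) = \frac{1}{n} \sum_{i=1}^{n} (g_{n,\lambda}^{[i]}(t_i) - y_i)^2 \tag{7}$$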
Definition
Let $V_0(\lambda)$ be defined as in the previous definition. The Ordinary Cross-Validation Estimate of λ is
$$\lambda_0 = \operatorname*{arg\,min}_{\lambda \in \mathbb{R}^+} V_0(\lambda) \tag{8}$$
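The OCV definitions can be followed literally by refitting n times, each time with one observation left out. The sketch below does so for a discrete difference-penalty smoother, used here as a stand-in for the smoothing spline. The smoother, the data and the λ grid are assumptions, and `loo_fit_at_k` and `ocv` are hypothetical helper names.

```python
import numpy as np

def loo_fit_at_k(y, lam, k, m=2):
    """Fit without observation k, then return the fitted value at index k."""
    n = len(y)
    D = np.diff(np.eye(n), n=m, axis=0)
    W = np.eye(n)
    W[k, k] = 0.0                       # drop the k-th term of the data sum
    f = np.linalg.solve(W + lam * D.T @ D, W @ y)
    return f[k]

def ocv(y, lam):
    """V_0(lam): mean squared leave-one-out prediction error."""
    preds = np.array([loo_fit_at_k(y, lam, k) for k in range(len(y))])
    return np.mean((preds - y) ** 2)

# Illustrative data (assumption): noisy sine on an equally spaced grid.
rng = np.random.default_rng(5)
n = 30
t = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * t) + 0.2 * rng.standard_normal(n)

# Grid search for the OCV estimate lambda_0.
lams = np.logspace(-1, 4, 21)
V0 = np.array([ocv(y, lam) for lam in lams])
lam_0 = lams[np.argmin(V0)]
print(f"OCV estimate on the grid: {lam_0:.3g}")
```

This brute-force version needs n linear solves per value of λ, which is the cost the influence-matrix formulation avoids.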
Why we need more
The OCV estimate has shown good accuracy for certain periodic functions, but in general it lacks a feature illustrated by the following:
Consider a periodic function $g \in W_2^{(m)}$, $g(t) = g(t + 1)$, sampled equidistantly, e.g. $t_j = j/n$, $j \in \{1, 2, \dots, n\}$, with noise of the same variance added at every point.
In this case all data points are treated symmetrically. Once the conditions of periodicity and equidistant sampling are dropped, the points interact differently, which creates the need to weight the samples differently.
Generalized Cross Validation (GCV)
Rewriting V0
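There exists an n × n matrix A(λ), the influence matrix, with the property
$$\begin{pmatrix} g_{n,\lambda}(t_1) \\ g_{n,\lambda}(t_2) \\ \vdots \\ g_{n,\lambda}(t_n) \end{pmatrix} = A(\lambda) y \tag{9}$$
such that $V_0(\lambda)$ can be rewritten as
$$V_0(\lambda) = \frac{1}{n} \sum_{k=1}^{n} \frac{\left( \sum_{j=1}^{n} a_{kj} y_j - y_k \right)^2}{(1 - a_{kk})^2} \tag{10}$$
where $a_{kj}$, $k, j \in \{1, 2, \dots, n\}$, is element (k, j) of A(λ).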
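The rewriting in (10) can be verified numerically: the leave-one-out residuals obtained by refitting n times should match the ordinary residuals divided by 1 − a_kk. The sketch uses a discrete difference-penalty smoother, for which the same identity holds, as a stand-in for the smoothing spline. The smoother, the data and λ = 5 are assumptions of this example.

```python
import numpy as np

def penalty(n, m=2):
    """Penalty matrix D^T D built from m-th differences."""
    D = np.diff(np.eye(n), n=m, axis=0)
    return D.T @ D

def ocv_brute_force(y, lam):
    """V_0 from n separate fits, each with one observation left out."""
    n = len(y)
    P = penalty(n)
    errs = np.empty(n)
    for k in range(n):
        W = np.eye(n)
        W[k, k] = 0.0                               # leave observation k out
        f = np.linalg.solve(W + lam * P, W @ y)
        errs[k] = f[k] - y[k]
    return np.mean(errs ** 2)

def ocv_shortcut(y, lam):
    """V_0 from a single fit via the diagonal of the influence matrix."""
    n = len(y)
    A = np.linalg.inv(np.eye(n) + lam * penalty(n))
    resid = A @ y - y
    return np.mean((resid / (1.0 - np.diag(A))) ** 2)

# Illustrative data (assumption).
rng = np.random.default_rng(6)
n = 25
y = np.sin(2 * np.pi * np.linspace(0.0, 1.0, n)) + 0.2 * rng.standard_normal(n)

v_slow = ocv_brute_force(y, 5.0)
v_fast = ocv_shortcut(y, 5.0)
print(v_slow, v_fast)
```

The two values agree to rounding error, so a single fit and the diagonal of A(λ) are enough to evaluate the OCV function.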
The Generalized Cross Validation (GCV)
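Definition
Let A(λ) be the influence matrix defined above. The GCV function is then defined as
$$V(\lambda) = \frac{\frac{1}{n} \| (I - A(\lambda)) y \|^2}{\left[ \frac{1}{n} \operatorname{tr}(I - A(\lambda)) \right]^2} \tag{11}$$
We say that the Generalized Cross-Validation Estimate of λ is
$$\hat{\lambda} = \operatorname*{arg\,min}_{\lambda \in \mathbb{R}^+} V(\lambda) \tag{12}$$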
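A minimal sketch of the GCV function and a grid search for its minimizer is given below. It uses a discrete difference-penalty smoother as a stand-in for the smoothing spline, so the influence matrix here is A(λ) = (I + λ DᵀD)⁻¹. The smoother, the simulated data and the λ grid are assumptions, and `gcv` is a hypothetical helper name.

```python
import numpy as np

def gcv(y, lam, m=2):
    """GCV function: mean squared residual over (tr(I - A) / n)^2."""
    n = len(y)
    D = np.diff(np.eye(n), n=m, axis=0)
    A = np.linalg.inv(np.eye(n) + lam * D.T @ D)    # influence matrix
    resid = y - A @ y                               # (I - A) y
    return np.mean(resid ** 2) / (np.trace(np.eye(n) - A) / n) ** 2

# Simulated data (assumption): only y is used by the criterion.
rng = np.random.default_rng(7)
n, sigma = 100, 0.3
t = np.linspace(0.0, 360.0, n)
y = np.sin(np.pi * t / 180.0) + sigma * rng.standard_normal(n)

# Grid search for the GCV estimate of lambda.
lams = np.logspace(-2, 6, 41)
V = np.array([gcv(y, lam) for lam in lams])
lam_hat = lams[np.argmin(V)]
print(f"GCV estimate on the grid: {lam_hat:.3g}")
```

Note that neither the true function nor σ enters the computation: the criterion is evaluated from the data and the influence matrix alone.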
Similarity to OCV
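By rewriting V(λ) we get
$$V(\lambda) = \frac{1}{n} \sum_{k=1}^{n} (g_{n,\lambda}^{[k]}(t_k) - y_k)^2 \, w_k(\lambda) \tag{13}$$
where the weights $w_k(\lambda)$ are given by
$$w_k(\lambda) = \left( \frac{1 - a_{kk}(\lambda)}{\frac{1}{n} \operatorname{tr}(I - A(\lambda))} \right)^2 \tag{14}$$
The weights $w_k(\lambda)$ adjust $V_0(\lambda)$ for non-periodicity and non-equidistant samples. In the periodic, equidistant case $a_{kk}(\lambda)$ is the same for every k, so that $w_k(\lambda) = 1$ and $V(\lambda) = V_0(\lambda)$.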
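The weighted form of V(λ) and the behaviour of the weights can be checked on two discrete smoothers: one with an ordinary (open) second-difference penalty and one with a circular penalty that mimics the periodic, equidistant case. Both smoothers, the data and λ = 20 are assumptions of this sketch.

```python
import numpy as np

def gcv_weights(A):
    """Weights w_k: ((1 - a_kk) / (tr(I - A) / n))^2."""
    n = A.shape[0]
    return ((1.0 - np.diag(A)) / (np.trace(np.eye(n) - A) / n)) ** 2

# Illustrative data (assumption).
n, lam = 40, 20.0
rng = np.random.default_rng(8)
y = np.sin(2 * np.pi * np.arange(n) / n) + 0.2 * rng.standard_normal(n)
I = np.eye(n)

# Non-periodic case: second differences of an open sequence.
D = np.diff(I, n=2, axis=0)
A = np.linalg.inv(I + lam * D.T @ D)
w = gcv_weights(A)
loo_err = (A @ y - y) / (1.0 - np.diag(A))    # leave-one-out residuals
V_weighted = np.mean(loo_err ** 2 * w)        # weighted OCV form
V_direct = np.mean((y - A @ y) ** 2) / (np.trace(I - A) / n) ** 2

# Periodic case: circular second differences make A circulant.
C = 2.0 * I - np.roll(I, 1, axis=1) - np.roll(I, -1, axis=1)
A_per = np.linalg.inv(I + lam * C.T @ C)
w_per = gcv_weights(A_per)

print("weighted form vs direct GCV:", V_weighted, V_direct)
print("weight range, non-periodic :", w.min(), w.max())
print("weight range, periodic     :", w_per.min(), w_per.max())
```

In the circular case every diagonal element of the influence matrix is the same, so all weights equal one and GCV reduces to OCV. In the open case the points near the ends get different weights from those in the interior.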
Comparison
Comparison of the true mean square error R(λ) with the GCV function V(λ) for the earlier example, $y_i = g(t_i) + \varepsilon_i$, where $g(t) = \sin(\pi t / 180)$, $t \in [0, 360]$.
Convergence Result
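Theorem
Let $g(\cdot) \in W_2^{(m)}$ and let $\{t_i\}_{i=1}^{n}$ satisfy $\int_0^{t_i} w(u) \, du = \frac{i}{n}$, where w(u) is a strictly positive continuous weight function. Then there exist sequences $\{\tilde{\lambda}_n\}_{n=1}^{\infty}$ and $\{\lambda_n^*\}_{n=1}^{\infty}$ of minimizers of E[V(λ)] and E[R(λ)], respectively, such that the expectation inefficiency $I^*$,
$$I^* = \frac{E[R(\tilde{\lambda}_n)]}{E[R(\lambda_n^*)]} \to 1, \quad \text{as } n \to \infty \tag{15}$$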
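For a linear smoother with a symmetric influence matrix, both E[R(λ)] and E[V(λ)] have closed forms, so an expectation inefficiency can be computed without simulation. The sketch below evaluates the ratio E[R(λ̃_n)] / E[R(λ*_n)], where λ̃_n minimizes E[V] and λ*_n minimizes E[R] over a grid, for a discrete difference-penalty smoother. The smoother, the test function sin(2πt), σ = 0.3 and the λ grid are assumptions. It illustrates the behaviour described by the theorem and is not a proof or a reproduction of the spline setting.

```python
import numpy as np

def expected_curves(g, sigma, lams, m=2):
    """Closed-form E[R(lam)] and E[V(lam)] for A = (I + lam * D^T D)^{-1}."""
    n = len(g)
    D = np.diff(np.eye(n), n=m, axis=0)
    s, U = np.linalg.eigh(D.T @ D)          # eigen-decomposition of the penalty
    s = np.clip(s, 0.0, None)               # remove tiny negative rounding errors
    c2 = (U.T @ g) ** 2                     # squared coefficients of g
    ER = np.empty(len(lams))
    EV = np.empty(len(lams))
    for idx, lam in enumerate(lams):
        shrink = lam * s / (1.0 + lam * s)  # eigenvalues of I - A(lam)
        bias2 = np.sum(shrink ** 2 * c2) / n
        # E[R] = squared bias + sigma^2 * tr(A^2) / n
        ER[idx] = bias2 + sigma ** 2 * np.sum((1.0 - shrink) ** 2) / n
        # E[V] = (squared bias + sigma^2 * tr((I - A)^2) / n) / (tr(I - A) / n)^2
        EV[idx] = (bias2 + sigma ** 2 * np.sum(shrink ** 2) / n) \
            / (np.sum(shrink) / n) ** 2
    return ER, EV

def inefficiency(n, sigma=0.3):
    """E[R] at the minimizer of E[V], divided by the minimum of E[R]."""
    t = np.linspace(0.0, 1.0, n)
    g = np.sin(2 * np.pi * t)
    lams = np.logspace(-2, 9, 111)
    ER, EV = expected_curves(g, sigma, lams)
    return ER[np.argmin(EV)] / ER.min()

for n in (50, 100, 200, 400):
    print(n, inefficiency(n))
```

By construction the ratio is at least one. How close it comes to one for a given n depends on the assumed function, noise level and grid.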
Discussion
- GCV rests on an implicit assumption that the true function g(t) is smooth.
- GCV has been applied effectively to problems such as:
  - finding the right order of splines in regression
  - regularized solution of Fredholm integral equations of the first kind
- It provides an estimate of the regularization parameter from general assumptions on the data.