Data mining and statistical learning - lecture 6
Dec 18, 2015

Overview

• Basis expansion

• Splines

• (Natural) cubic splines

• Smoothing splines

• Nonparametric logistic regression

• Multidimensional splines

• Wavelets

Linear basis expansion (1)

Linear regression

True model: $y = f(x) = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$

Question: How to find $\beta_1, \beta_2, \beta_3$?

Answer: Solve a system of linear equations to obtain $\hat\beta_1, \hat\beta_2, \hat\beta_3$

x1   x2   x3   y
1    -3   6    12
…    …    …    …

Linear basis expansion (2)

Nonlinear model

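True model: $y = \beta_1 x_1 x_2 + \beta_2 x_2 e^{x_3} + \beta_3 \sin x_3 + \beta_4 x_1^2$

Question: How to find $\beta_1, \ldots, \beta_4$?

Answer: A) Introduce new variables

$u_1 = x_1 x_2, \quad u_2 = x_2 e^{x_3}, \quad u_3 = \sin x_3, \quad u_4 = x_1^2$

x1   x2   x3   y
1    -3   -1   12
…    …    …    …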

Linear basis expansion (3)

Nonlinear model

B) Transform the data set

u1   u2     u3      u4   y
-3   -1.1   -0.84   1    12
…    …      …       …    …

True model: $y = \beta_1 u_1 + \beta_2 u_2 + \beta_3 u_3 + \beta_4 u_4$

C) Apply linear regression to obtain $\hat\beta_1, \hat\beta_2, \hat\beta_3, \hat\beta_4$
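
Steps A) to C) can be sketched in a few lines of NumPy. This is only an illustration: the synthetic inputs, the coefficient values in `beta_true`, and the helper name `expand` are assumptions made for the example, not part of the lecture.

```python
import numpy as np

def expand(x):
    """Map raw inputs (x1, x2, x3) to the new variables (u1, u2, u3, u4)."""
    x1, x2, x3 = x[:, 0], x[:, 1], x[:, 2]
    return np.column_stack([x1 * x2, x2 * np.exp(x3), np.sin(x3), x1 ** 2])

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))                 # synthetic inputs (assumption)
beta_true = np.array([2.0, -1.0, 0.5, 3.0])   # made-up coefficients (assumption)
y = expand(x) @ beta_true                     # noise-free responses

u = expand(x)                                 # B) transform the data set
beta_hat, *_ = np.linalg.lstsq(u, y, rcond=None)  # C) ordinary linear regression
```

Applied to the single row shown on the slides, `expand(np.array([[1.0, -3.0, -1.0]]))` reproduces the transformed values -3, -1.1, -0.84, 1 after rounding.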

Linear basis expansion (4)

Conclusion:

We can easily fit any model of the type

$f(X) = \sum_{m=1}^{M} \beta_m h_m(X)$

i.e., we can easily undertake a linear basis expansion in X

Example: If the model is known to be nonlinear, but the exact form is unknown, we can try to introduce interaction terms

$f(X) = \beta_1 X_1 + \cdots + \beta_p X_p + \beta_{11} X_1^2 + \beta_{12} X_1 X_2 + \cdots$

Piecewise polynomial functions

Assume X is one-dimensional

Def. Assume the domain [a, b] of X is split into intervals [a, ξ1], [ξ1, ξ2], ..., [ξn, b]. Then f(X) is said to be piecewise polynomial if f(X) is represented by separate polynomials in the different intervals.

Note. The points ξ1, ..., ξn are called knots

Piecewise polynomials

Example. Continuous piecewise linear function

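Alternative A. Introduce linear functions on each interval and a set of constraints (4 free parameters)

$y_1 = \alpha_1 + \beta_1 x, \quad y_2 = \alpha_2 + \beta_2 x, \quad y_3 = \alpha_3 + \beta_3 x$

$y_1(\xi_1) = y_2(\xi_1), \quad y_2(\xi_2) = y_3(\xi_2)$

[Figure 5.1, lower left]

Alternative B. Use a basis expansion (4 free parameters)

$h_1(X) = 1, \quad h_2(X) = X, \quad h_3(X) = (X - \xi_1)_+, \quad h_4(X) = (X - \xi_2)_+$

Theorem. The given formulations are equivalent.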
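
A small numerical check of Alternative B, as a sketch: the knots at 1 and 2, the target function, and the helper name `piecewise_linear_basis` are assumptions chosen for the illustration. The target is a continuous piecewise linear function with slopes 1, -1 and 2, which the four basis functions represent exactly.

```python
import numpy as np

def piecewise_linear_basis(x, knot1, knot2):
    """Columns h1 = 1, h2 = X, h3 = (X - knot1)_+, h4 = (X - knot2)_+."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([
        np.ones_like(x),
        x,
        np.maximum(x - knot1, 0.0),
        np.maximum(x - knot2, 0.0),
    ])

x = np.linspace(0.0, 3.0, 31)
# Continuous piecewise linear target: x on [0,1], 2 - x on [1,2], 2x - 4 on [2,3]
y = np.where(x < 1, x, np.where(x < 2, 2 - x, 2 * x - 4))

H = piecewise_linear_basis(x, 1.0, 2.0)
coef, *_ = np.linalg.lstsq(H, y, rcond=None)
```

The fitted coefficients are the intercept, the initial slope, and the change in slope at each knot, so continuity at the knots holds by construction and no explicit constraints are needed.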

Splines

Definition. A piecewise polynomial of order M (degree M-1) is called an order-M spline if it has continuous derivatives up to order M-2 at the knots.

Alternative definition. An order-M spline is a function which can be represented by the basis functions (K = #knots)

$h_j(X) = X^{j-1}, \quad j = 1, \ldots, M$

$h_{M+l}(X) = (X - \xi_l)_+^{M-1}, \quad l = 1, \ldots, K$

Theorem. The definitions above are equivalent.

Terminology. An order-4 spline is called a cubic spline [Figure 5.2, lower right]

(look at basis and compare #free parameters)

Note. Cubic splines: the discontinuity at the knots is not visible
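
The truncated power basis of the alternative definition can be built directly. The following is a minimal sketch; the function name `truncated_power_basis`, the evaluation grid and the knot positions are assumptions for the example. It is restricted to order M >= 2 because the order-1 case needs indicator functions instead of powers.

```python
import numpy as np

def truncated_power_basis(x, knots, order=4):
    """Basis matrix with M + K columns for an order-M spline with K knots."""
    if order < 2:
        raise ValueError("this sketch handles order >= 2 only")
    x = np.asarray(x, dtype=float)
    # h_j(X) = X^(j-1), j = 1..M
    powers = [x ** j for j in range(order)]
    # h_{M+l}(X) = (X - xi_l)_+^(M-1), l = 1..K
    truncated = [np.maximum(x - k, 0.0) ** (order - 1) for k in knots]
    return np.column_stack(powers + truncated)

x = np.linspace(0.0, 1.0, 50)
H = truncated_power_basis(x, knots=[0.3, 0.7], order=4)   # cubic spline basis
```

For a cubic spline with two knots this gives 4 + 2 = 6 columns, which matches the count of free parameters.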

Variance of spline estimators – boundary effects

INSERT FIG 5.3

Natural cubic spline

Def. A cubic spline f is called a natural cubic spline if its 2nd and 3rd derivatives are zero at a and b

Note. This implies that f is linear on the extreme intervals

Basis functions of natural cubic splines

$N_1(X) = 1, \quad N_2(X) = X, \quad N_{k+2}(X) = d_k(X) - d_{K-1}(X), \quad k = 1, \ldots, K-2$

where

$d_k(X) = \dfrac{(X - \xi_k)_+^3 - (X - \xi_K)_+^3}{\xi_K - \xi_k}$
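
The natural cubic spline basis above translates into code almost term by term. This sketch assumes sorted, distinct knots; the function name `natural_cubic_basis` and the example knots are made up for the illustration.

```python
import numpy as np

def natural_cubic_basis(x, knots):
    """Basis N_1..N_K of a natural cubic spline with K knots (K >= 3)."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    K = len(knots)

    def d(k):
        # d_k(X) with a 0-based knot index k
        num = np.maximum(x - knots[k], 0.0) ** 3 - np.maximum(x - knots[K - 1], 0.0) ** 3
        return num / (knots[K - 1] - knots[k])

    cols = [np.ones_like(x), x]          # N_1 = 1, N_2 = X
    d_last = d(K - 2)                    # d_{K-1} in the 1-based notation
    for k in range(K - 2):
        cols.append(d(k) - d_last)       # N_{k+2} = d_k - d_{K-1}
    return np.column_stack(cols)

knots = [0.0, 1.0, 2.0, 3.0]
x_right = np.linspace(3.5, 5.0, 7)       # points beyond the last knot
B_right = natural_cubic_basis(x_right, knots)
```

Beyond the last knot the cubic and quadratic terms of `d(k) - d_last` cancel, so every column is linear there; the second differences of `B_right` on the equispaced grid are zero up to rounding.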

Fitting smooth functions to data

Minimize a penalized sum of squared residuals

$RSS(f, \lambda) = \sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda \int \{f''(t)\}^2 \, dt$

where λ is the smoothing parameter.

λ = 0: any function interpolating the data

λ = +∞: least squares line fit

Optimality of smoothing splines

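Theorem. The function f minimizing RSS for a given λ is a natural cubic spline with knots at all unique values of $x_i$ (NOTE: N knots!)

The optimal spline can be computed as follows.

$f(x) = \sum_{j=1}^{N} N_j(x)\, \theta_j$

$RSS(\theta, \lambda) = (\mathbf{y} - \mathbf{N}\theta)^T (\mathbf{y} - \mathbf{N}\theta) + \lambda\, \theta^T \mathbf{\Omega}_N \theta$

$\{\mathbf{N}\}_{ij} = N_j(x_i), \quad \{\mathbf{\Omega}_N\}_{jk} = \int N_j''(t)\, N_k''(t) \, dt$

$\hat\theta = (\mathbf{N}^T \mathbf{N} + \lambda \mathbf{\Omega}_N)^{-1} \mathbf{N}^T \mathbf{y}$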

A smoothing spline is a linear smoother

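The fitted function

$\hat{\mathbf{f}} = \mathbf{N} (\mathbf{N}^T \mathbf{N} + \lambda \mathbf{\Omega}_N)^{-1} \mathbf{N}^T \mathbf{y} = \mathbf{S}_\lambda \mathbf{y}$

is linear in the response values.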

Degrees of freedom of smoothing splines

The effective degrees of freedom are defined as

$df_\lambda = \operatorname{trace}(\mathbf{S}_\lambda)$

i.e., the sum of the diagonal elements of $\mathbf{S}_\lambda$.

Smoothing splines and eigenvectors

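It can be shown that

$\mathbf{S}_\lambda = (\mathbf{I} + \lambda \mathbf{K})^{-1}$

where K is the so-called penalty matrix

Furthermore, the eigen-decomposition is

$\mathbf{S}_\lambda = \sum_{k=1}^{N} \rho_k(\lambda)\, \mathbf{u}_k \mathbf{u}_k^T, \quad \rho_k(\lambda) = \dfrac{1}{1 + \lambda d_k}$

Note: $d_k$ and $\mathbf{u}_k$ are the eigenvalues and eigenvectors, respectively, of K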

Smoothing splines and shrinkage

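• A smoothing spline decomposes the vector y with respect to the basis of eigenvectors and shrinks the respective contributions

$\mathbf{S}_\lambda \mathbf{y} = \sum_{k=1}^{N} \mathbf{u}_k\, \rho_k(\lambda)\, \langle \mathbf{u}_k, \mathbf{y} \rangle$

• The eigenvectors, ordered by decreasing $\rho_k(\lambda)$, increase in complexity. The higher the complexity, the more the contribution is shrunk.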
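
The shrinkage view can be checked numerically. In the sketch below the penalty matrix K is a stand-in: a discrete second-difference penalty on an equispaced grid, not the exact smoothing-spline penalty matrix. The grid size, the value of `lam` and the synthetic response are likewise assumptions.

```python
import numpy as np

n = 30
D = np.diff(np.eye(n), n=2, axis=0)    # second-difference operator, shape (n-2, n)
K = D.T @ D                            # stand-in penalty matrix (assumption)
lam = 5.0

d, U = np.linalg.eigh(K)               # eigenvalues d_k (ascending), eigenvectors u_k
rho = 1.0 / (1.0 + lam * d)            # shrinkage factors rho_k(lambda)

rng = np.random.default_rng(1)
y = np.sin(np.linspace(0.0, 3.0, n)) + 0.1 * rng.normal(size=n)

# Decompose y in the eigenbasis, shrink each contribution, and recombine
fitted_eig = U @ (rho * (U.T @ y))
# Same fit from the matrix form S_lambda y = (I + lambda K)^{-1} y
fitted_direct = np.linalg.solve(np.eye(n) + lam * K, y)

df = rho.sum()                         # effective degrees of freedom, trace(S_lambda)
```

The two smallest eigenvalues of this K are zero (constant and linear vectors), so those components pass through unshrunk, while components with large `d` are shrunk towards zero.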

Smoothing splines and local curve fitting

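• The eigenvalues $\rho_k(\lambda) = 1/(1 + \lambda d_k)$ are decreasing functions of λ. The higher λ, the stronger the penalization.

$df_\lambda = \operatorname{trace}(\mathbf{S}_\lambda) = \sum_{k=1}^{N} \dfrac{1}{1 + \lambda d_k}$

• The smoother matrix has a banded nature -> local fitting method

• INSERT fig 5.8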

Fitting smoothing splines in practice (1)

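Reinsch form:

$\mathbf{S}_\lambda = (\mathbf{I} + \lambda \mathbf{K})^{-1}$

Theorem. If f is a natural cubic spline with values $\mathbf{f}$ at the knots and second derivatives $\boldsymbol{\gamma}$ at the knots, then

$\mathbf{Q}^T \mathbf{f} = \mathbf{R} \boldsymbol{\gamma}$

where Q and R are band matrices, dependent on ξ only.

Theorem.

$\mathbf{K} = \mathbf{Q} \mathbf{R}^{-1} \mathbf{Q}^T$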

Fitting smoothing splines in practice (2)

Reinsch algorithm

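• Evaluate $\mathbf{Q}^T \mathbf{y}$

• Compute $\mathbf{R} + \lambda \mathbf{Q}^T \mathbf{Q}$ and find its Cholesky decomposition (in linear time!)

• Solve the matrix equation $(\mathbf{R} + \lambda \mathbf{Q}^T \mathbf{Q}) \boldsymbol{\gamma} = \mathbf{Q}^T \mathbf{y}$ (in linear time!)

• Obtain $\mathbf{f} = \mathbf{y} - \lambda \mathbf{Q} \boldsymbol{\gamma}$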
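
The steps of the Reinsch algorithm can be sketched as follows. The slides do not spell out Q and R; the tridiagonal definitions used here are the standard ones from the smoothing-spline literature and should be read as an assumption, as should the function name and the synthetic data. For brevity the sketch uses a dense solver, so it does not achieve the linear running time that a banded Cholesky factorization would give.

```python
import numpy as np

def reinsch_smoothing_spline(x, y, lam):
    """Fitted values of a cubic smoothing spline at sorted, distinct x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    h = np.diff(x)                           # knot spacings
    Q = np.zeros((n, n - 2))
    R = np.zeros((n - 2, n - 2))
    for j in range(n - 2):
        Q[j, j] = 1.0 / h[j]
        Q[j + 1, j] = -1.0 / h[j] - 1.0 / h[j + 1]
        Q[j + 2, j] = 1.0 / h[j + 1]
        R[j, j] = (h[j] + h[j + 1]) / 3.0
        if j + 1 < n - 2:
            R[j, j + 1] = R[j + 1, j] = h[j + 1] / 6.0
    # Solve (R + lam Q^T Q) gamma = Q^T y; dense solve stands in for banded Cholesky
    gamma = np.linalg.solve(R + lam * Q.T @ Q, Q.T @ y)
    f = y - lam * Q @ gamma                  # fitted values at the knots
    return f, gamma, Q, R

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 25)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=x.size)
lam = 1e-3
f_hat, gamma, Q, R = reinsch_smoothing_spline(x, y, lam)
```

The result agrees with the direct form $(\mathbf{I} + \lambda \mathbf{K})^{-1} \mathbf{y}$ with $\mathbf{K} = \mathbf{Q} \mathbf{R}^{-1} \mathbf{Q}^T$, and data that lie exactly on a straight line are returned unchanged because $\mathbf{Q}^T$ annihilates linear vectors.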

Automated selection of smoothing parameters (1)

What can be selected:

Regression splines

• Degree of spline

• Placement of knots

->MARS procedure

Smoothing spline

• Penalization parameter

Automated selection of smoothing parameters (2)

Fixing the degrees of freedom

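• If we fix $df_\lambda$ then we can find λ by solving the following equation numerically

$df_\lambda = \operatorname{trace}(\mathbf{S}_\lambda) = \sum_{k=1}^{N} \dfrac{1}{1 + \lambda d_k}$

• One could try two different $df_\lambda$ and choose one based on F-tests, residual plots etc.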
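
Because $df_\lambda$ decreases monotonically in λ, the equation can be solved by bisection. The sketch below works on a log scale; the eigenvalues in `d`, the target value and the search bracket are invented for the example.

```python
import numpy as np

def df_lambda(lam, d):
    """Effective degrees of freedom: sum over k of 1 / (1 + lam * d_k)."""
    return np.sum(1.0 / (1.0 + lam * d))

def lambda_for_df(target, d, lo=1e-10, hi=1e10, iters=200):
    """Bisection on a log scale; assumes df(lo) > target > df(hi)."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)           # geometric midpoint
        if df_lambda(mid, d) > target:
            lo = mid                     # too many degrees of freedom: increase lam
        else:
            hi = mid
    return np.sqrt(lo * hi)

# Hypothetical penalty eigenvalues: two zeros (unpenalized part) plus 18 positive ones
d = np.concatenate([[0.0, 0.0], np.logspace(-3, 3, 18)])
lam_hat = lambda_for_df(6.0, d)
```

With two zero eigenvalues the attainable range is 2 < $df_\lambda$ < N, so a target outside that range has no solution.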

Automated selection of smoothing parameters (3)

The bias-variance trade off

INSERT FIG. 5.9

EPE: integrated squared prediction error, CV: cross-validation

$df_\lambda = \operatorname{trace}(\mathbf{S}_\lambda) = \sum_{k=1}^{N} \dfrac{1}{1 + \lambda d_k}$

Nonparametric logistic regression

Logistic regression model

$\log \dfrac{\Pr(Y = 1 \mid X = x)}{\Pr(Y = 0 \mid X = x)} = f(x)$

Note: X is one-dimensional

What is f:

• Linear -> ordinary logistic regression (Chapter 4)

• Sufficiently smooth -> nonparametric logistic regression (splines + others)

• Other choices are possible

Nonparametric logistic regression

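Problem formulation:

Minimize the penalized negative log-likelihood

$\min_f \; l_p(f, \lambda) = l_u(f) + \tfrac{1}{2} \lambda \int \{f''(t)\}^2 \, dt$

where $l_u(f)$ is the unpenalized negative log-likelihood (equivalently, maximize the log-likelihood minus the penalty).

Good news: The solution is still a natural cubic spline.

Bad news: There is no analytic expression for that spline function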

Nonparametric logistic regression

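How to proceed?

Use Newton-Raphson to compute the spline numerically, i.e.

• Compute (analytically) the derivatives with respect to the spline coefficients θ

$\dfrac{\partial l_p}{\partial \theta}, \quad \dfrac{\partial^2 l_p}{\partial \theta \, \partial \theta^T}$

1. Compute the Newton direction using the current value of the parameter and the derivative information

2. Compute the new value of the parameter using the old value and the update formula

$\theta^{new} = \theta^{old} - \left( \dfrac{\partial^2 l_p}{\partial \theta \, \partial \theta^T} \right)^{-1} \dfrac{\partial l_p}{\partial \theta}$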
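
The Newton-Raphson iteration can be sketched for a generic basis matrix N and penalty matrix Ω. Several things here are assumptions made for illustration: $l_p$ is taken as the negative log-likelihood plus the penalty $\tfrac{1}{2} \lambda \theta^T \Omega \theta$, the basis is a cubic polynomial instead of a natural cubic spline, and `Omega` is a simple diagonal stand-in.

```python
import numpy as np

def fit_penalized_logistic(N, y, Omega, lam, iters=50, tol=1e-10):
    """Newton-Raphson for the penalized negative log-likelihood l_p(theta)."""
    theta = np.zeros(N.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(N @ theta)))           # fitted probabilities
        grad = -N.T @ (y - p) + lam * Omega @ theta      # first derivative of l_p
        W = p * (1.0 - p)                                # logistic weights
        hess = N.T @ (N * W[:, None]) + lam * Omega      # second derivative of l_p
        step = np.linalg.solve(hess, grad)               # Newton direction
        theta = theta - step                             # update formula
        if np.linalg.norm(step) < tol:
            break
    return theta

rng = np.random.default_rng(3)
x = rng.uniform(-1.0, 1.0, size=200)
prob = 1.0 / (1.0 + np.exp(-1.5 * x))                    # made-up true model
y = (rng.uniform(size=200) < prob).astype(float)

N = np.column_stack([np.ones_like(x), x, x ** 2, x ** 3])  # stand-in basis
Omega = np.diag([0.0, 0.0, 1.0, 1.0])                      # stand-in penalty
lam = 1.0
theta_hat = fit_penalized_logistic(N, y, Omega, lam)
```

Plain Newton steps without a line search are used, which usually suffices for data that are not linearly separable; a safeguarded step would be a sensible addition in practice.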

Multidimensional splines

How to fit data smoothly in higher dimensions?

A) Use a basis of one-dimensional functions and produce a basis by tensor product

$g_{jk}(X) = h_{1j}(X_1)\, h_{2k}(X_2)$

$g(X) = \sum_{j,k} \theta_{jk}\, g_{jk}(X)$

Problem: Exponential growth of the basis with the dimension [FIG. 6.10]
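
A tensor product basis is a set of pairwise products of the one-dimensional basis columns. In this sketch the function name `tensor_product_basis` and the two small polynomial bases are assumptions for the example.

```python
import numpy as np

def tensor_product_basis(H1, H2):
    """Columns g_jk = h_1j(X1) * h_2k(X2); column index is j * M2 + k."""
    n = H1.shape[0]
    return np.einsum("ij,ik->ijk", H1, H2).reshape(n, -1)

rng = np.random.default_rng(4)
X = rng.uniform(size=(40, 2))
H1 = np.column_stack([np.ones(40), X[:, 0], X[:, 0] ** 2])   # M1 = 3 functions of X1
H2 = np.column_stack([np.ones(40), X[:, 1], X[:, 1] ** 2,
                      X[:, 1] ** 3])                         # M2 = 4 functions of X2
G = tensor_product_basis(H1, H2)                             # M1 * M2 = 12 columns
```

With M basis functions per coordinate and p coordinates the product basis has M^p columns, which is the exponential growth noted above.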

Multidimensional splines

How to fit data smoothly in higher dimensions?

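B) Formulate a new problem

$\min_f \sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda J[f]$

• The solution is a thin-plate spline

• The properties for λ = 0 are similar to the one-dimensional case (the solution interpolates the data)

• The solution in 2 dimensions is essentially a sum of radial basis functions

$f(x) = \beta_0 + \beta^T x + \sum_{j} \alpha_j\, \eta(\lVert x - x_j \rVert)$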
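
A two-dimensional thin-plate spline can be fitted by solving one linear system. The sketch rests on assumptions that are not stated on the slides: the radial function is taken as η(r) = r² log r, the coefficients satisfy the usual side conditions (the α sum to zero and are orthogonal to the inputs), and `lam` matches the slide's λ only up to a constant factor. All function names and the synthetic data are invented for the example.

```python
import numpy as np

def tps_kernel(r):
    """eta(r) = r^2 log r, with eta(0) = 0."""
    out = np.zeros_like(r)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out

def fit_thin_plate(X, y, lam=0.0):
    """Solve for the radial weights alpha and the affine part (beta0, beta)."""
    n = X.shape[0]
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    E = tps_kernel(r)
    T = np.column_stack([np.ones(n), X])                 # affine terms 1, x1, x2
    A = np.block([[E + lam * np.eye(n), T],
                  [T.T, np.zeros((3, 3))]])
    sol = np.linalg.solve(A, np.concatenate([y, np.zeros(3)]))
    return sol[:n], sol[n:]

def predict_thin_plate(Xnew, X, alpha, coef):
    r = np.linalg.norm(Xnew[:, None, :] - X[None, :, :], axis=2)
    return coef[0] + Xnew @ coef[1:] + tps_kernel(r) @ alpha

rng = np.random.default_rng(5)
X = rng.uniform(size=(20, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
alpha, coef = fit_thin_plate(X, y, lam=0.0)
fitted = predict_thin_plate(X, X, alpha, coef)
```

With `lam=0.0` the fitted surface passes through the data points; a positive `lam` trades that interpolation for a smoother surface.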

Wavelets

Introduction

• The idea: fit a bumpy function while removing noise

• Application area: Signal processing, compression

• How it works: The function is represented in a basis of bumpy functions. The small coefficients are filtered out.
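
The filtering idea can be illustrated with the Haar basis. This is a minimal sketch using hard thresholding of the detail coefficients; the function names, the threshold value and the test signal are assumptions, and the signal length must be a power of two.

```python
import numpy as np

def haar_forward(signal):
    """Orthonormal Haar transform; returns the coarsest average and the details."""
    a = np.asarray(signal, dtype=float).copy()
    if len(a) == 0 or len(a) & (len(a) - 1):
        raise ValueError("signal length must be a power of two")
    details = []
    while len(a) > 1:
        avg = (a[0::2] + a[1::2]) / np.sqrt(2.0)
        diff = (a[0::2] - a[1::2]) / np.sqrt(2.0)
        details.append(diff)               # stored from fine to coarse
        a = avg
    return a, details

def haar_inverse(a, details):
    a = np.asarray(a, dtype=float).copy()
    for diff in reversed(details):         # rebuild from coarse to fine
        out = np.empty(2 * len(a))
        out[0::2] = (a + diff) / np.sqrt(2.0)
        out[1::2] = (a - diff) / np.sqrt(2.0)
        a = out
    return a

def haar_denoise(signal, threshold):
    """Set detail coefficients with magnitude <= threshold to zero."""
    a, details = haar_forward(signal)
    kept = [np.where(np.abs(dk) > threshold, dk, 0.0) for dk in details]
    return haar_inverse(a, kept)

rng = np.random.default_rng(6)
t = np.linspace(0.0, 1.0, 64)
clean = np.where(t < 0.5, 1.0, -1.0)       # bumpy (piecewise constant) signal
noisy = clean + 0.1 * rng.normal(size=t.size)
smoothed = haar_denoise(noisy, threshold=0.3)
```

Without thresholding the transform reconstructs the signal exactly, and with an infinite threshold only the overall mean survives.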

Wavelets

Basis functions (Haar Wavelets, Symmlet-8 Wavelets)

INSERT FIG 5.13

Wavelets

Example

Insert FIG 5.14