Introduction to Nonparametric Regression Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 1
Source: users.stat.umn.edu/~helwig/notes/npreg-Notes.pdf
Introduction to Nonparametric Regression
Nathaniel E. Helwig
Assistant Professor of Psychology and Statistics
University of Minnesota (Twin Cities)
Updated 04-Jan-2017
Need for Nonparametric Regression Nonparametric Regression
Parametric versus Nonparametric Regression
The general linear model is a form of parametric regression, where the relationship between X and Y has some predetermined form.
Parameterizes the relationship between X and Y, e.g., $Y = \beta_0 + \beta_1 X$
Then estimates the specified parameters, e.g., $\beta_0$ and $\beta_1$
Great if you know the form of the relationship (e.g., linear)
In contrast, nonparametric regression tries to estimate the form of the relationship between X and Y.
No predetermined form for the relationship between X and Y
Great for discovering relationships and building prediction models
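A minimal sketch of the parametric side of this contrast, assuming simulated data: it estimates the two parameters of a straight line by least squares, once when the true relationship is linear and once when it is a sine curve. The function names `fit_line` and `line_mse`, the seed, the sample size, and the noise level are illustrative choices, not part of the original slides.

```python
import math
import random


def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the parametric model Y = b0 + b1*X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def line_mse(x, y):
    """Mean squared residual of the fitted straight line."""
    b0, b1 = fit_line(x, y)
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)


rng = random.Random(0)  # fixed seed (arbitrary) so the sketch is reproducible
x = sorted(rng.uniform(0.0, 1.0) for _ in range(200))
noise = [rng.gauss(0.0, 0.1) for _ in x]

# Case 1: the truth really is linear, so the assumed parametric form is right.
y_linear = [1.0 + 2.0 * xi + e for xi, e in zip(x, noise)]
# Case 2: the truth is a sine curve, so a straight line is the wrong form.
y_curved = [math.sin(2 * math.pi * xi) + e for xi, e in zip(x, noise)]

print("estimates when truth is linear:", fit_line(x, y_linear))
print("line MSE, linear truth:", line_mse(x, y_linear))
print("line MSE, curved truth:", line_mse(x, y_curved))
```

When the assumed form matches the truth, the residual error should sit near the noise variance. When it does not, the error stays large however well the two parameters are estimated, which is the gap nonparametric regression is meant to close.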
Problem of Interest
Smoothers (aka nonparametric regression) try to estimate functions from noisy data.
Suppose we have n pairs of points $(x_i, y_i)$ for $i \in \{1, \ldots, n\}$, and WLOG assume that $x_1 \le x_2 \le \cdots \le x_n$.
Also, suppose the following assumptions hold:
(A1) There is a functional relationship between x and y of the form
$$y_i = \eta(x_i) + \varepsilon_i, \quad i \in \{1, \ldots, n\}$$
(A2) The $\varepsilon_i$ are iid from some distribution $f(x)$ with zero mean
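As a rough illustration of assumptions (A1) and (A2), the sketch below simulates data from the model. The helper `simulate`, the sine choice for η, the uniform design on [0, 1], and the Gaussian errors are all assumptions made for the example; (A2) itself only requires iid errors with zero mean.

```python
import math
import random


def simulate(n, eta, sigma=0.3, seed=0):
    """Draw n pairs (x_i, y_i) with y_i = eta(x_i) + eps_i and x sorted ascending."""
    rng = random.Random(seed)
    # Sorting the x values mirrors the WLOG ordering x_1 <= ... <= x_n.
    x = sorted(rng.uniform(0.0, 1.0) for _ in range(n))
    # (A2): iid zero-mean errors; the Gaussian shape is an assumption here.
    eps = [rng.gauss(0.0, sigma) for _ in range(n)]
    # (A1): the response is the unknown function plus noise.
    y = [eta(xi) + ei for xi, ei in zip(x, eps)]
    return x, y


def eta(t):
    """Hypothetical true function, used only for this illustration."""
    return math.sin(2 * math.pi * t)


x, y = simulate(100, eta)
print(list(zip(x, y))[:3])
```

A smoother only sees the pairs returned by `simulate`; its job is to recover `eta` without being told its form.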
Local Averaging
Local Averaging Overview
Friedman’s (1984) Local Averaging
To estimate η at the point $x_i$, we could calculate the average of the $y_j$ values corresponding to $x_j$ values that are “near” $x_i$.
Friedman (1984) defined “near” as the smallest symmetric window around $x_i$ that contains s observations.
Note that s is called the span
Size of averaging window can differ for each $x_i$
But always use s points in averaging window
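The local averaging rule described above can be sketched as follows. This is one possible reading, not necessarily Friedman's exact procedure: it takes the smallest symmetric window containing s observations to mean the s observations closest to $x_i$ in absolute distance (the point itself included), with ties in distance broken by original order. The function name `local_average` is hypothetical, and s is a count of observations here.

```python
def local_average(x, y, s):
    """Average the y values of the s observations nearest to each x_i.

    Assumption: "near" means smallest absolute distance |x_j - x_i|,
    with ties broken by original order (sorted() is stable).
    """
    n = len(x)
    if not 1 <= s <= n:
        raise ValueError("span s must satisfy 1 <= s <= n")
    fitted = []
    for i in range(n):
        # Indices of the s nearest observations; x_i itself has distance 0.
        nearest = sorted(range(n), key=lambda j: abs(x[j] - x[i]))[:s]
        fitted.append(sum(y[j] for j in nearest) / s)
    return fitted


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 10.0, 20.0, 30.0, 40.0]
print(local_average(x, y, 3))  # [10.0, 10.0, 20.0, 30.0, 30.0]
```

In the interior the window is centred on $x_i$, while at the two ends it becomes one-sided. That is why the fitted values at the boundaries (10 and 30) are pulled toward the middle of the data.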
Selecting the Span
Friedman proposed using a cross-validation approach to select span s.
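For a given span s, leave-one-out cross-validation:
Let $\hat{y}_{(i)}$ denote the local averaging estimate of η at the point $x_i$ obtained by holding out the i-th pair $(x_i, y_i)$
Define CV residuals $e_i(s) = y_i - \hat{y}_{(i)}$; note that each residual is a function of s
Select the span that minimizes the mean squared CV residual over the candidate set S:
$$\hat{s} = \arg\min_{s \in S} \frac{1}{n} \sum_{i=1}^{n} e_i^2(s)$$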
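A minimal sketch of this leave-one-out selection, under the same assumed reading of “near” as the s closest observations by absolute distance (ties broken by original order). The names `loo_estimate`, `cv_score`, and `select_span` are hypothetical, and the span is a count of observations rather than a proportion of n.

```python
def loo_estimate(x, y, s, i):
    """Local average at x_i using the s nearest points, with pair i held out."""
    others = [j for j in range(len(x)) if j != i]
    if not 1 <= s <= len(others):
        raise ValueError("span s must satisfy 1 <= s <= n - 1")
    nearest = sorted(others, key=lambda j: abs(x[j] - x[i]))[:s]
    return sum(y[j] for j in nearest) / s


def cv_score(x, y, s):
    """Mean of the squared leave-one-out residuals e_i(s) = y_i - yhat_(i)."""
    n = len(x)
    return sum((y[i] - loo_estimate(x, y, s, i)) ** 2 for i in range(n)) / n


def select_span(x, y, spans):
    """Return the candidate span with the smallest CV score."""
    return min(spans, key=lambda s: cv_score(x, y, s))


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 10.0, 20.0, 30.0, 40.0]
print(cv_score(x, y, 2))          # 90.0
print(cv_score(x, y, 4))          # 312.5
print(select_span(x, y, [2, 4]))  # 2
```

On this small noise-free line the interior points are predicted exactly with s = 2, so all of the CV error comes from the two boundary points. The larger span averages over points farther away and is penalised for it.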
Local Averaging Examples
Local Averaging Example 1: sunspots data
[Figure: four panels plotted against year, 1750 to 1950. Panel "Raw Data" shows sunspots versus years; panels "span=0.01", "span=0.05", and "span=cv" show the local averaging fits, sunlocavg$y versus sunlocavg$x.]