Functional Data Analysis with PACE

Kehui Chen

Department of Statistics,

University of California, Davis

JSM, 2012

Outline

• General introduction of PACE

• Illustrative examples for various functional regression programs

Overview of PACE

• Implements various methods of Functional Data Analysis (FDA).

• Provides analysis for sparsely or densely sampled random trajectories and time courses.

• The core program is based on the Principal Analysis by Conditional Expectation (PACE) algorithm.

• The most recent version is PACE 2.15, written in Matlab; an R version is in development.

Development of PACE

• Supported by various NSF grants.

• Coordinated by Hans-Georg Müller and Jane-Ling Wang.

• PACE 1.0 was written by Fang Yao in 2005, and subsequent major improvements were made by Bitao Liu.

• Contributors and developers include (alphabetical order):

Dong Chen, Kehui Chen, Jeng-Min Chiou, Joel Dubin, Andrew Farris, Andrea Gottlieb, Jinjiang He, Ci-Ren Jiang, Yu-Ru Su, Rona Tang, Wenwen Tao, Shuang Wu, Cong Xu, Matt Yang, Wenjing Yang, Xiaoke Zhang.

Functional Principal Component Analysis

• $X(t)$ is a second-order random process with mean function $\mu(t) \in L^2(T)$ and continuous covariance function $G(s,t) = \mathrm{cov}(X(s), X(t))$.

• $G(s,t) = \sum_{k=1}^{\infty} \lambda_k \phi_k(s)\phi_k(t)$, with eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k \ge \cdots \ge 0$; the eigenfunctions $\phi_k(t)$ form an orthogonal basis.

• Karhunen-Loève expansion:

$$X(t) = \mu(t) + \sum_{k=1}^{\infty} \xi_k \phi_k(t)$$

• Best linear expansion with $p$ components:

$$X(t) \approx \mu(t) + \sum_{k=1}^{p} \xi_k \phi_k(t).$$
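
One common way to choose the number of components p is the fraction of variance explained (FVE). If the eigenfunctions are normalized to unit norm and the scores are uncorrelated with variance equal to the corresponding eigenvalue, the mean integrated squared error of the p-term expansion equals the sum of the discarded eigenvalues. The sketch below assumes the eigenvalues have already been estimated; the function name, the threshold, and the numbers are illustrative and are not taken from PACE.

```python
def choose_num_components(eigenvalues, fve_threshold=0.95):
    """Smallest p whose leading eigenvalues explain at least
    fve_threshold of the total variance (eigenvalues sorted decreasingly)."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    explained = 0.0
    for p, lam in enumerate(eigenvalues, start=1):
        explained += lam
        if explained / total >= fve_threshold:
            return p
    return len(eigenvalues)


# Hypothetical eigenvalues lambda_1 >= lambda_2 >= ... (not real data).
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]

p = choose_num_components(eigenvalues, fve_threshold=0.85)

# Mean integrated squared error of the p-term expansion:
# the sum of the eigenvalues that were left out.
truncation_error = sum(eigenvalues[p:])
```

A higher threshold keeps more components and leaves a smaller truncation error, at the cost of estimating more scores per subject.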

Dense and Sparse Designs

• Very densely and regularly observed data: empirical mean and covariance, and $\xi_k = \int_T (X(t) - \mu(t))\phi_k(t)\,dt$.

• Densely recorded but irregular design, or contaminated with error: pre-smoothing for individual curves.

• Sparse random design (longitudinal data): pre-smoothing is problematic.

• PACE works for both dense and sparse data.
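
For the dense, regular case the score integral can be approximated directly by a quadrature rule. The following is a minimal sketch using the trapezoidal rule on a common grid; the mean function, the two sine eigenfunctions, and the score values are made up for illustration, and the eigenfunctions are assumed to have unit norm on [0, 1].

```python
import math


def trapezoid(values, grid):
    """Trapezoidal approximation of the integral of values over grid."""
    return sum(
        0.5 * (values[j] + values[j + 1]) * (grid[j + 1] - grid[j])
        for j in range(len(grid) - 1)
    )


def fpc_score(x, mu, phi, grid):
    """Approximate xi_k = integral of (X(t) - mu(t)) * phi_k(t) dt."""
    integrand = [(xj - mj) * pj for xj, mj, pj in zip(x, mu, phi)]
    return trapezoid(integrand, grid)


# Dense regular grid on T = [0, 1].
grid = [j / 200 for j in range(201)]

# Hypothetical mean function and two orthonormal eigenfunctions.
mu = [t for t in grid]
phi1 = [math.sqrt(2) * math.sin(math.pi * t) for t in grid]
phi2 = [math.sqrt(2) * math.sin(2 * math.pi * t) for t in grid]

# One noise-free curve built with scores 1.5 and -0.7.
x = [m + 1.5 * p1 - 0.7 * p2 for m, p1, p2 in zip(mu, phi1, phi2)]

score1 = fpc_score(x, mu, phi1, grid)  # close to 1.5
score2 = fpc_score(x, mu, phi2, grid)  # close to -0.7
```

With only a handful of noisy observations per curve this integral cannot be approximated reliably, which is the situation the conditional expectation approach is designed for.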

The Core Program FPCA

• Pool all the observations $Y_{ij} = X_i(t_{ij}) + \varepsilon_{ij}$, $1 \le i \le n$, $1 \le j \le m_i$, and estimate the mean and covariance by local linear smoothing. The estimates attain the one- (two-) dimensional nonparametric rate for sparse data, and the $\sqrt{n}$ rate for dense data.

• Use the conditional expectation method to estimate the components $\xi_{ik}$. For the sparse case this is the best linear unbiased prediction; for dense data it is asymptotically equivalent to the numerical approximation of $\xi_{ik} = \int_T (X_i(t) - \mu(t))\phi_k(t)\,dt$.

• Yao et al. (2005), Hall et al. (2006), Li and Hsing (2010), Cai and Yuan (2010).

Local Linear Smoothing Estimators

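• Mean function is given by $\mu(t) = a_0$, where

$$(a_0, a_1) = \arg\min \sum_{i=1}^{n} \sum_{j=1}^{m_i} \left\{ [Y_{ij} - a_0 - a_1(t_{ij} - t)]^2 \times K_h(t_{ij} - t) \right\}.$$

• Covariance function is given by $G(t_1, t_2) = a_0$, where

$$(a_0, a_1, a_2) = \arg\min \sum_{i=1}^{n} \sum_{j \ne l} \left\{ [Y^c_{ij} Y^c_{il} - a_0 - a_1(t_{ij} - t_1) - a_2(t_{il} - t_2)]^2 \times K_b(t_{ij} - t_1) K_b(t_{il} - t_2) \right\}.$$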
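
The mean estimator is a weighted least squares fit of a line around each target point t, applied to the observations pooled over all subjects. The sketch below solves the 2x2 normal equations in closed form. The Epanechnikov kernel, the bandwidth, and the synthetic sparse design are assumptions made for illustration; this is not PACE's implementation or its bandwidth selection.

```python
def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1] (an illustrative kernel choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_linear_mean(t, times, values, h):
    """Return a0 from the kernel-weighted least squares fit of
    y ~ a0 + a1 * (t_ij - t), using bandwidth h."""
    s0 = s1 = s2 = r0 = r1 = 0.0
    for tij, yij in zip(times, values):
        d = tij - t
        w = epanechnikov(d / h) / h  # K_h(t_ij - t)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        r0 += w * yij
        r1 += w * d * yij
    det = s0 * s2 - s1 * s1
    if det <= 1e-12:
        raise ValueError("too few observations inside the kernel window")
    # Solve [[s0, s1], [s1, s2]] (a0, a1)^T = (r0, r1)^T for a0.
    return (s2 * r0 - s1 * r1) / det


# Synthetic sparse design: 20 subjects with 3 observations each,
# generated from the noise-free line mu(t) = 2 + 3t.
subjects = []
for i in range(20):
    t_i = [((7 * i + 13 * j) % 50) / 50 for j in range(3)]
    y_i = [2.0 + 3.0 * t for t in t_i]
    subjects.append((t_i, y_i))

# Pool all observations across subjects before smoothing.
pooled_t = [t for t_i, _ in subjects for t in t_i]
pooled_y = [y for _, y_i in subjects for y in y_i]

mu_half = local_linear_mean(0.5, pooled_t, pooled_y, h=0.2)
```

Because a local linear fit reproduces linear functions exactly, the estimate at t = 0.5 recovers 2 + 3(0.5) in this noise-free example. The covariance smoother follows the same pattern with two time arguments, a product kernel, and the products of centered observations as responses, leaving out the pairs with j = l.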

Covariance Estimation

[Figure: covariance surface $G(s,t)$ over $(s,t)$, with the diagonal marked $G(t,t) + \sigma^2$.]

Principal Analysis by Conditional Expectation

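• Let $X_i = (X_i(t_{i1}), \ldots, X_i(t_{im_i}))^T$, $Y_i = (Y_{i1}, \ldots, Y_{im_i})^T$, $\mu_i = (\mu(t_{i1}), \ldots, \mu(t_{im_i}))^T$, and $\phi_{ik} = (\phi_k(t_{i1}), \ldots, \phi_k(t_{im_i}))^T$. By Gaussianity,

$$E[\xi_{ik} \mid Y_i] = \lambda_k \phi_{ik}^T \Sigma_{Y_i}^{-1} (Y_i - \mu_i),$$

where $\Sigma_{Y_i} = \mathrm{cov}(Y_i, Y_i) = \mathrm{cov}(X_i, X_i) + \sigma^2 I_{m_i}$.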

• The method is robust and works well for non-Gaussian data.
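
The conditional expectation formula can be evaluated for one subject once the mean, eigenvalues, eigenfunctions, and error variance are available. The sketch below plugs in known values rather than estimates, builds the covariance of X_i from a truncated eigen-expansion, and uses a small Gaussian elimination routine so that it needs only the standard library. All names and numbers are illustrative assumptions, not PACE's code.

```python
import math


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def pace_scores(y, mu, phis, lambdas, sigma2):
    """E[xi_k | Y_i] = lambda_k * phi_ik^T * Sigma_Yi^{-1} * (Y_i - mu_i)."""
    m_i = len(y)
    # Sigma_Yi = sum_k lambda_k phi_ik phi_ik^T + sigma^2 I
    sigma_y = [
        [
            sum(lam * phi[a] * phi[b] for lam, phi in zip(lambdas, phis))
            + (sigma2 if a == b else 0.0)
            for b in range(m_i)
        ]
        for a in range(m_i)
    ]
    z = solve(sigma_y, [yi - mi for yi, mi in zip(y, mu)])
    return [
        lam * sum(pa * za for pa, za in zip(phi, z))
        for lam, phi in zip(lambdas, phis)
    ]


# One sparsely observed subject with three measurement times.
times = [0.1, 0.5, 0.8]
phi1 = [math.sqrt(2) * math.sin(math.pi * t) for t in times]
phi2 = [math.sqrt(2) * math.sin(2 * math.pi * t) for t in times]
y = [0.4, 1.9, 1.0]
mu = [0.0, 0.0, 0.0]

scores = pace_scores(y, mu, [phi1, phi2], [2.0, 0.5], sigma2=0.25)
```

The predicted scores are shrunk toward zero relative to a least squares fit of the eigenfunctions to the three observations, and the shrinkage grows with the error variance.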

Functional Regression in PACE

• Linear regression and diagnostics

• Quadratic (Polynomial) regression

• Additive modeling

• Generalized responses

• Quantile and conditional distribution modeling

• Function to scalar; function to function

Illustrative Example: Meat Spectral Data

• FPCreg, FPCdiag: let $X^c(t) = X(t) - \mu(t)$.

$$E(Y \mid X) = \alpha + \int X^c(t)\beta(t)\,dt$$

• FPCQuadReg (Yao and Müller, 2010; Horváth and Reeder, 2012):

$$E(Y \mid X) = \alpha + \int X^c(t)\beta(t)\,dt + \iint \gamma(s,t) X^c(s) X^c(t)\,ds\,dt$$

• FPCquantile (Chen and Müller, 2012, JRSSB):

$$P(Y \le y \mid X) = E(I(Y \le y) \mid X) = g^{-1}\left(\alpha(y) + \int X^c(t)\beta(y,t)\,dt\right)$$
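
To see how such models connect to the FPCA output, one can expand the coefficient function in the eigenbasis. The derivation below is a sketch for the linear model only. The coefficients $b_k$ are notation introduced here; it assumes the eigenfunctions are normalized to unit norm and the regression error is uncorrelated with the predictor scores.

```latex
\begin{align*}
\beta(t) &= \sum_{k=1}^{\infty} b_k \phi_k(t), \qquad
X^c(t) = \sum_{k=1}^{\infty} \xi_k \phi_k(t), \\
E(Y \mid X) &= \alpha + \int X^c(t)\beta(t)\,dt
             = \alpha + \sum_{k=1}^{\infty} b_k \xi_k, \\
\mathrm{cov}(\xi_k, Y) &= b_k \lambda_k
  \quad\Longrightarrow\quad
  b_k = \frac{\mathrm{cov}(\xi_k, Y)}{\lambda_k}.
\end{align*}
```

The functional regression therefore reduces to a regression of the scalar response on the scores, truncated at a finite number of components in practice.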




Predictor Functions: Spectral Data

[Figure: predictor functions for the spectral data; x-axis: Spectrum Channel (850 to 1050), y-axis: Absorbance (2 to 5.5).]

Coefficient of Linear Regression

[Figure: "Confidence bands for Beta", showing the coefficient function $\beta(t)$ of the model $E(Y \mid X) = \alpha + \int X^c(t)\beta(t)\,dt$; horizontal axis from 850 to 1050, vertical axis from −800 to 1200.]

Residual Plot for Linear Regression

[Figure: residuals versus fitted values for the linear regression; x-axis: Fitted (0 to 60), y-axis: Residual (−10 to 10).]

Coefficients of Quadratic Regression

[Figure: coefficients of the quadratic regression model; one panel over 850 to 1050 and one surface panel with both axes running from 850 to 1050.]



Residual Plot for Quadratic Regression

[Figure: residuals versus fitted values for the quadratic regression; x-axis: Fitted (0 to 55), y-axis: Residual (−5 to 5).]

Quantiles

[Figure: predicted quantiles versus fat content; x-axis: Fat Content (0 to 50), y-axis: Predicted Quantiles (0 to 50); legend: true, median, 0.1th, 0.9th.]

Illustrative Example: Traffic Data

Velocity on I-880

[Figure: four panels of velocity along the highway at times 10:25:26, 12:29:56, 14:15:41, and 16:33:50; x-axis: Postmile (21 to 27), y-axis: Velocity (mph), 10 to 70.]

Prediction for Response Functions

• Y and X are both functions

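• FPCfam: $E(Y(t) \mid X) = \mu_Y(t) + \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} f_{jk}(\xi_k)\psi_j(t)$

• FPCpredBands (Chen and Müller, 2012): global prediction bands for $Y$ conditional on $X$

• For a Gaussian process: $E(Y \mid X)$ and $\mathrm{cov}(Y \mid X)$

• Common principal component assumption and additive assumption:

$$\mathrm{cov}(Y(t_1), Y(t_2) \mid X) = G_{YY}(t_1, t_2) + \sum_{j=1}^{\infty} \Big\{ \sum_{k=1}^{\infty} g_{jk}(\xi_k) - \Big( \sum_{k=1}^{\infty} f_{jk}(\xi_k) \Big)^2 \Big\} \psi_j(t_1)\psi_j(t_2)$$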
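
The pointwise conditional variance, which determines the width of a prediction band at each t, follows from the conditional covariance by setting both time arguments equal. This is only the algebraic special case of the formula as stated on the slide; no additional modeling assumption is introduced.

```latex
\begin{equation*}
\mathrm{var}(Y(t) \mid X)
  = G_{YY}(t, t)
  + \sum_{j=1}^{\infty}
    \Big\{ \sum_{k=1}^{\infty} g_{jk}(\xi_k)
         - \Big( \sum_{k=1}^{\infty} f_{jk}(\xi_k) \Big)^2 \Big\}
    \psi_j(t)^2 .
\end{equation*}
```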









Modeling the Prediction Bands

• Global prediction bands for Gaussian case:

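$$P(\mu(t) - D_X(t) \le Y_X(t) \le \mu(t) + D_X(t) \mid X) \ge 1 - \alpha,$$

where $D_X(t) = C_\alpha \{\mathrm{var}(Y(t) \mid X)\}^{1/2}$.

• For more general random processes:

$$E\{P(L_X(t) \le Y_X(t) \le U_X(t) \mid X)\} \ge 1 - \alpha$$

• Find $C_\alpha$ by the empirical coverage.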
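
One way to read the empirical coverage step: for each curve in a sample, compute the smallest multiplier that keeps the whole curve inside its band, then take the appropriate empirical quantile of those multipliers. The sketch below assumes that centered residual curves and conditional standard deviations are already available on a common grid; the function names and the data are hypothetical and do not reproduce FPCpredBands.

```python
import math


def required_multipliers(residuals, sds):
    """For each curve, the smallest C with |residual(t)| <= C * sd(t) for all t."""
    return [
        max(abs(r) / s for r, s in zip(curve, sd))
        for curve, sd in zip(residuals, sds)
    ]


def find_c_alpha(residuals, sds, alpha):
    """Smallest C such that at least a fraction 1 - alpha of the curves
    lie entirely inside their bands."""
    needed = sorted(required_multipliers(residuals, sds))
    # Small tolerance guards against floating-point error in (1 - alpha) * n.
    k = math.ceil((1.0 - alpha) * len(needed) - 1e-9)
    return needed[max(k, 1) - 1]


def empirical_coverage(c, residuals, sds):
    """Fraction of curves that lie entirely inside the band with multiplier c."""
    inside = sum(1 for m in required_multipliers(residuals, sds) if m <= c)
    return inside / len(residuals)


# Hypothetical residual curves Y_i(t) - E(Y_i(t) | X_i) on a two-point grid,
# with the matching conditional standard deviations.
residuals = [[0.5, -0.2], [1.0, 0.3], [-3.0, 0.2], [0.4, -2.0], [3.0, 1.0]]
sds = [[1.0, 1.0], [1.0, 1.0], [2.0, 2.0], [1.0, 1.0], [1.0, 1.0]]

c_alpha = find_c_alpha(residuals, sds, alpha=0.2)
coverage = empirical_coverage(c_alpha, residuals, sds)
```

Because a curve counts as covered only if it stays inside the band at every grid point, the resulting multiplier targets simultaneous coverage and is typically larger than a pointwise normal quantile.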

‘Mobile Century’ Data

• Joint UC Berkeley - Nokia project (Herrera et al., 2010)

• Students were hired to drive on a segment of highway I-880 and send data (time, location, and speed) back through GPS-enabled mobile phones.

• The follow-up project ‘Mobile Millennium’ is generating more data.

Estimated 90% Prediction Regions

[Figure: estimated 90% prediction regions, six panels; x-axis: Time (sec), 0 to 300; y-axis: Relative Speed (mph), −80 to 20.]

Other Important Tools in PACE

• Modeling of derivatives (linear and nonlinear empirical dynamics)

• Modeling of functional errors (variance processes, volatility processes)

• Time-synchronization based on pairwise warping

• Functional manifold analysis

• Modeling of functional correlations

• Distance-based methods (curve clustering)

• Stringing method

Get Started with PACE

• User Friendly: help files, examples, documentation, references.

• >> p = setOptions()
  >> p2 = setOptions('bwmu', 3)

• Various options for bandwidth selection, number of components, different designs, errors, pre-binning options.

• The code and descriptions can be downloaded from http://anson.ucdavis.edu/~mueller/data/programs.html.

THANK YOU!

• Yao, F., Müller, H.G., Wang, J.L. (2005), Functional data analysis for sparse longitudinal data. J. American Statistical Association, 100, 577-590.

• Yao, F., Müller, H.G., Wang, J.L. (2005), Functional linear regression analysis for longitudinal data. Annals of Statistics, 33, 2873-2903.

• Chiou, J., Müller, H.G. (2007), Diagnostics for functional regression via residual processes. Computational Statistics and Data Analysis, 51, 4849-4863.

• Yao, F., Müller, H.G. (2010), Functional quadratic regression. Biometrika, 97, 49-64.

• Müller, H.G., Yao, F. (2008), Functional additive models. J. American Statistical Association, 103, 1534-1544.

• Müller, H.G., Stadtmüller, U. (2005), Generalized functional linear models. Annals of Statistics, 33, 774-805.

• Chen, K., Müller, H.G. (2012), Conditional quantile analysis when covariates are functions, with application to growth data. J. Royal Statistical Society: Series B, 74, 67-89.

• Liu, B., Müller, H.G. (2009), Estimating derivatives for samples of sparsely observed functions, with application to on-line auction dynamics. J. American Statistical Association, 104, 704-717.

• Müller, H.G., Yao, F. (2010), Empirical dynamics for longitudinal data. Annals of Statistics, 38, 3458-3486.

• Müller, H.G., Stadtmüller, U., Yao, F. (2006), Functional variance processes. J. American Statistical Association, 101, 1007-1018.

• Müller, H.G., Sen, R., Stadtmüller, U. (2011), Functional data analysis for volatility. J. Econometrics, 165, 233-245.

• Tang, R., Müller, H.G. (2008), Pairwise curve synchronization for high-dimensional data. Biometrika, 95, 875-889.

• Chen, D., Müller, H.G. (2012), Nonlinear manifold representations for functional data. Annals of Statistics, 40, 1-29.

• Yang, W., Müller, H.G., Stadtmüller, U. (2011), Functional singular component analysis. J. Royal Statistical Society: Series B, 73, 303-324.

• Dubin, J., Müller, H.G. (2005), Dynamical correlation for multivariate longitudinal data. J. American Statistical Association, 100, 872-881.

• Peng, J., Müller, H.G. (2008), Distance-based clustering of sparsely observed stochastic processes, with applications to online auctions. Annals of Applied Statistics, 2, 1056-1077.

• Chen, K., Chen, K., Müller, H.G., Wang, J.L. (2011), Stringing high-dimensional data for functional analysis. J. American Statistical Association, 106, 275-284.
