
Functional Data Analysis
Lectures 4 & 5

Mathematical foundation & Exploratory Data Analysis

May 8, 2018


Usual linear regression model

y = Xα + ε

Corresponding assumptions: ε ∼ N(0, Σ) (major); mostly, we assume that Σ = I.

In FDA, the data is usually a realisation of a stochastic process, as opposed to a random variable. Therefore, we are mostly interested in

y(t) = f(t) + ε(t),

where we wish to estimate f(t), given the observation y(t).
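As a concrete illustration (a minimal sketch, not from the lecture), this model can be simulated and fitted in a few lines of NumPy; the true f, the noise level, and the Fourier basis of size K are all illustrative assumptions.

```python
import numpy as np

# Simulate y(t) = f(t) + eps(t) on a grid and estimate f by least squares
# on a small Fourier basis. f, the noise scale and K are illustrative choices.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
f_true = np.sin(2 * np.pi * t) + 0.5 * np.cos(4 * np.pi * t)
y = f_true + rng.normal(scale=0.3, size=t.size)      # eps(t): iid N(0, 0.3^2)

# Design matrix: 1, sin(2*pi*k*t), cos(2*pi*k*t) for k = 1..K
K = 3
B = np.column_stack(
    [np.ones_like(t)]
    + [np.sin(2 * np.pi * k * t) for k in range(1, K + 1)]
    + [np.cos(2 * np.pi * k * t) for k in range(1, K + 1)]
)
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
f_hat = B @ coef                                      # estimated f on the grid
print("RMSE of f_hat vs true f:", np.sqrt(np.mean((f_hat - f_true) ** 2)))
```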


Comparing the finite and the infinite dimensional models

Usual regression: mostly linear, or the functional form of the dependence is known. Estimation of the coefficients relies on the distributional assumptions on the noise.

FDA: the functional form of the dependence is not known, so we need some assumptions: the functional space and the basis. What should be the distribution of the noise?

We know how to define the mean and (co)variance of random vectors, but now we need to define the same for infinite-dimensional random elements.



The simplest model for the data space is a Hilbert space: a collection of elements with infinitely many (but countably many) basis elements, on which one can define an inner product between any pair of elements.

For example,

ℓ² = {(a₁, a₂, a₃, …) : aᵢ ∈ ℝ and ∑_{i≥1} aᵢ² < ∞}.

This looks similar to ℝⁿ, and we know how to define the standard normal on ℝⁿ. So what about a standard normal on ℓ²?

Let us recall how to characterise the standard normal distribution on ℝⁿ:

- the density is given by (2π)^{−n/2} exp(−‖x‖²/2); or,
- all linear combinations are normal with the appropriate mean and variance.
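The second characterisation is easy to check numerically in finite dimensions (an illustrative sketch; the dimension, the vector a, and the sample size are arbitrary choices): for X ∼ N(0, Iₙ), the projection ⟨a, X⟩ should be N(0, ‖a‖²).

```python
import numpy as np

# Monte Carlo check: if X ~ N(0, I_n) on R^n, then <a, X> ~ N(0, ||a||^2).
rng = np.random.default_rng(1)
n, n_samples = 10, 100_000
a = rng.normal(size=n)                # an arbitrary direction
X = rng.normal(size=(n_samples, n))   # rows: iid draws from N(0, I_n)
proj = X @ a                          # <a, X> for each draw
print("empirical variance:", proj.var())
print("||a||^2:           ", a @ a)
```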


Density?

Density is always defined with respect to the Lebesgue measure, which does not exist when the dimension of the space goes to infinity.


Linear combinations?

This characterisation adapts even to infinite dimensions; we just need to know what kind of linear combinations should be admissible.

One can define a Gaussian distribution on ℓ² with mean m and covariance C as long as

m ∈ ℓ²,

and C is a linear operator¹ on ℓ² such that

∑_{i≥1} ⟨Ceᵢ, eᵢ⟩ < ∞,

where {eᵢ} is an orthonormal basis of ℓ². But why?

¹Discuss. Compare with matrices.
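The condition says that the diagonal sums of C must converge. A numerical sketch (the eigenvalue sequence 1/i² below is an illustrative assumption): for C = I every term ⟨Ceᵢ, eᵢ⟩ equals 1 and the sum diverges, while summable eigenvalues give a trace-class operator.

```python
import numpy as np

# Partial sums of sum_{i>=1} <C e_i, e_i> for two diagonal covariances on l^2.
# C = I: every term is 1, so the sum diverges with the truncation level.
# lambda_i = 1/i^2 (illustrative): the sum converges to pi^2/6 (trace class).
i = np.arange(1, 100_001, dtype=float)
print("C = I, first 1e5 terms:       ", np.sum(np.ones_like(i)))
print("lambda_i = 1/i^2, partial sum:", np.sum(1.0 / i**2))
print("pi^2 / 6:                     ", np.pi**2 / 6)
```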


A quick explanation

Let us consider X = (X₁, X₂, …), an ℓ²-valued random variable distributed as standard Gaussian, meaning that {Xᵢ}_{i≥1} are i.i.d. standard normal; this implies that C = I on ℓ². Clearly,

∑_{i≥1} ⟨Ieᵢ, eᵢ⟩ = ∞.

What is the magnitude/size of such a Gaussian element?

E(‖X‖²) = E(∑_{i≥1} Xᵢ²) = ∞.

What about ‖X‖² in general? ‖X‖² = ∞ almost surely. One can actually show that ‖X‖² < ∞ almost surely whenever C is trace class.
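A simulation makes this visible (an illustrative sketch; the truncation levels and the trace-class eigenvalues 1/i² are assumptions): under C = I the truncated ‖X‖² grows without bound, while under a trace-class C it stabilises.

```python
import numpy as np

# Truncated squared norms of a Gaussian element of l^2 under two covariances.
# C = I: the sum of the first n squared coordinates grows like n (norm diverges).
# Trace-class C with lambda_i = 1/i^2: coordinates are sqrt(lambda_i) * Z_i,
# so the truncated norm stabilises as n grows.
rng = np.random.default_rng(2)
for n in (10, 1_000, 100_000):
    z = rng.normal(size=n)
    lam = 1.0 / np.arange(1, n + 1) ** 2
    print(f"n={n:>6}   C=I: {np.sum(z**2):10.1f}   trace class: {np.sum(lam * z**2):6.3f}")
```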


Covariance between various linear combinations

In finite dimensions: on ℝⁿ, let Y ∼ N(µ, Σ); then ⟨a, Y⟩ and ⟨b, Y⟩ are both normally distributed, with covariance ⟨Σa, b⟩.

In infinite dimensions: on ℓ², let X ∼ N(m, C); then ⟨a, X⟩ and ⟨b, X⟩ are both normally distributed, with covariance ⟨Ca, b⟩, i.e. (taking m = 0),

⟨Ca, b⟩ = E[⟨a, X⟩⟨b, X⟩].

After some analysis, one can conclude that

C = ∑_{i≥1} λᵢ φᵢ ⊗ φᵢ (Mercer's Theorem)

for some ONB {φᵢ}. (Compare with matrices, and discuss the L² representation.)
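In finite dimensions both claims are easy to verify numerically (a sketch; the particular Σ, a and b are illustrative), and the eigendecomposition of Σ is exactly the matrix analogue of the Mercer representation:

```python
import numpy as np

# 1) Monte Carlo check: cov(<a,Y>, <b,Y>) = <Sigma a, b> for Y ~ N(0, Sigma).
# 2) Matrix analogue of Mercer: Sigma = sum_i lambda_i phi_i phi_i^T.
rng = np.random.default_rng(3)
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 0.7]])          # an illustrative covariance
a, b = np.array([1.0, -1.0, 2.0]), np.array([0.5, 0.0, 1.0])

Y = rng.multivariate_normal(np.zeros(3), Sigma, size=200_000)
print("empirical E[<a,Y><b,Y>]:", np.mean((Y @ a) * (Y @ b)))
print("<Sigma a, b>:           ", a @ Sigma @ b)

lam, phi = np.linalg.eigh(Sigma)             # eigenpairs (lambda_i, phi_i)
recon = sum(l * np.outer(v, v) for l, v in zip(lam, phi.T))
print("max reconstruction error:", np.abs(Sigma - recon).max())
```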


In fact, there exists a sequence {ξᵢ}_{i≥1} of zero-mean, uncorrelated random variables such that E(ξᵢ²) = λᵢ, and

Y = ∑_{i≥1} ξᵢ φᵢ (Karhunen–Loève expansion)

Special case

In case the functional data is a realisation of a certain stochastic process, the mean m is a mean function, m(t) = E(X(t)), and the covariance operator becomes a covariance kernel, C(s, t) = cov(X(s), X(t)).
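A classical illustration (a sketch using a standard textbook example, not taken from the lecture): Brownian motion on [0, 1] has kernel C(s, t) = min(s, t), with known eigenpairs λᵢ = 1/((i − ½)²π²) and φᵢ(t) = √2 sin((i − ½)πt), so truncating the Karhunen–Loève expansion gives approximate Brownian paths.

```python
import numpy as np

# Simulate Brownian motion on [0, 1] via a truncated Karhunen-Loeve expansion.
# Eigenpairs of C(s, t) = min(s, t):
#   lambda_i = 1 / ((i - 1/2)^2 pi^2),  phi_i(t) = sqrt(2) sin((i - 1/2) pi t)
rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 500)
n_terms, n_paths = 200, 2_000

idx = np.arange(1, n_terms + 1) - 0.5                     # i - 1/2
lam = 1.0 / (idx**2 * np.pi**2)                           # eigenvalues
phi = np.sqrt(2.0) * np.sin(np.pi * np.outer(t, idx))     # phi_i(t) on the grid

xi = rng.normal(size=(n_terms, n_paths)) * np.sqrt(lam)[:, None]  # xi_i ~ N(0, lambda_i)
paths = phi @ xi                                          # one column per simulated path
print("empirical Var(X(1)):", paths[-1].var(), "(theory: 1)")
```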


Outline

- sample mean
- sample covariance
- functional PCA (with Karhunen–Loève, using probe functions)
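A minimal empirical sketch of these three steps (illustrative throughout; the simulated curves and the grid are assumptions, and the eigendecomposition of the discretised sample covariance stands in for the Karhunen–Loève step):

```python
import numpy as np

# Sample mean, sample covariance and functional PCA on a common grid.
rng = np.random.default_rng(5)
t = np.linspace(0.0, 1.0, 100)
n = 50
X = (rng.normal(size=(n, 1)) * np.sin(2 * np.pi * t)              # simulated curves,
     + rng.normal(scale=0.5, size=(n, 1)) * np.cos(2 * np.pi * t)
     + rng.normal(scale=0.1, size=(n, t.size)))                   # plus observation noise

mean_fn = X.mean(axis=0)                       # sample mean function
Xc = X - mean_fn
C_hat = (Xc.T @ Xc) / (n - 1)                  # sample covariance kernel on the grid

dt = t[1] - t[0]
lam, phi = np.linalg.eigh(C_hat * dt)          # discretised eigenproblem of the operator
lam, phi = lam[::-1], phi[:, ::-1] / np.sqrt(dt)   # sort decreasing; L2-normalise phi_i
scores = (Xc @ phi[:, :2]) * dt                # first two KL scores xi_i = <X - mean, phi_i>
print("share of variance (PC1, PC2):", lam[:2] / lam[lam > 0].sum())
```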
