Page 1:

Scalable Gaussian Process Methods

James Hensman

Oxford, February 2015

Page 2:

Joint work with

Nicolo Fusi, Alex Matthews, Neil Lawrence, Zoubin Ghahramani

Page 3:

Overview

Background: Gaussian processes

Sparse Gaussian Processes

Stochastic Variational Inference

Gaussian Likelihoods

Non-Gaussian likelihoods

Bonus: Deep GPs

Page 4:

GPs

[Figure: function values f against t ∈ (0, 15)]

Page 5:

GPs - prior over functions

f ∼ GP(0, k(t, t′))

[Figure: samples from the GP prior, f against t]
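Sampling from this prior makes the idea concrete: evaluate k on a grid of inputs and draw from the resulting multivariate Gaussian. A minimal numpy sketch, assuming a squared-exponential kernel (the slides do not fix a particular k):

```python
import numpy as np

def k(t, tp, variance=1.0, lengthscale=1.0):
    """Squared-exponential covariance k(t, t') -- an assumed choice of kernel."""
    sqdist = (t[:, None] - tp[None, :]) ** 2
    return variance * np.exp(-0.5 * sqdist / lengthscale ** 2)

# f ~ GP(0, k): any finite set of function values is jointly Gaussian,
# so draws on a grid come from N(0, K).
t = np.linspace(0.0, 15.0, 100)
K = k(t, t) + 1e-8 * np.eye(len(t))  # jitter for numerical stability
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(np.zeros(len(t)), K, size=3)
```

Each row of `samples` is one function drawn from the prior; the variance and lengthscale are the covariance parameters discussed on a later slide.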

Page 6:

GPs - posterior over functions

[Figure: posterior over functions, f against t]

Page 7:

GPs - analytical solution

f | y ∼ GP( k(x, X) K⁻¹ y , k(x, x′) − k(x, X) K⁻¹ k(X, x′) )

[Figure: the analytical posterior, f against t]

Page 8:

What about noise?

Gaussian noise is tractable - additional parameter σn

[Figure: GP regression with Gaussian noise, f against t]
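With Gaussian noise the posterior stays analytical: σn² is simply added to the diagonal of the training covariance. A sketch of the resulting regression equations on toy data (the RBF kernel and the data here are illustrative):

```python
import numpy as np

def rbf(a, b, lengthscale=2.0):
    """Squared-exponential covariance -- an assumed choice of kernel."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 15.0, 30))        # training inputs
y = np.sin(t) + 0.1 * rng.standard_normal(30)  # noisy observations (toy data)
ts = np.linspace(0.0, 15.0, 200)               # test inputs
sigma_n = 0.1                                  # the additional noise parameter

# f | y ~ GP( k(x, X)(K + sigma_n^2 I)^{-1} y, ... ): noise enters on the diagonal.
Ky = rbf(t, t) + sigma_n ** 2 * np.eye(len(t))
Ksn = rbf(ts, t)
mean = Ksn @ np.linalg.solve(Ky, y)
cov = rbf(ts, ts) - Ksn @ np.linalg.solve(Ky, Ksn.T)
```

The O(n³) cost quoted later comes from the solve against the n × n matrix Ky.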

Page 9:

Parameters

[Figure: four panels illustrating the effect of the covariance parameters; f ∈ (−4, 4) against t ∈ (0, 15)]

Page 10:

Overview

Background: Gaussian processes

Sparse Gaussian Processes

Stochastic Variational Inference

Gaussian Likelihoods

Non-Gaussian likelihoods

Bonus: Deep GPs

Page 11:

Motivation

Inference in a GP has the following demands:

Complexity: O(n³), storage: O(n²)

Inference in a sparse GP has the following demands:

Complexity: O(nm²), storage: O(nm)

where we get to pick m!

Page 12:

Still not good enough!

Big Data

- In parametric models, stochastic optimisation is used.
- This allows for application to Big Data.

This work

- Show how to use Stochastic Variational Inference in GPs
- Stochastic optimisation scheme: each step requires O(m³)

Page 13:

Overview

Background: Gaussian processes

Sparse Gaussian Processes

Stochastic Variational Inference

Gaussian Likelihoods

Non-Gaussian likelihoods

Bonus: Deep GPs

Page 14:

Incomplete bibliography

(With apologies for 27,000 omissions)

- Csato and Opper, 2002
- Seeger 2003; Lawrence and Seeger 2003
- Snelson and Ghahramani 2006
- Quinonero-Candela and Rasmussen 2005
- Titsias 2009
- Alvarez and Lawrence 2011

Page 15:

Computational savings

Knn ≈ Qnn = Knm Kmm⁻¹ Kmn

Instead of inverting Knn, we make a low-rank (or Nyström) approximation, and invert Kmm instead.
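The point of the approximation is that solves against Qnn + σ²I reduce, via the Woodbury identity, to an m × m solve: no n × n matrix is ever formed or inverted. A sketch on toy data (the grid-placed inducing inputs z are an assumption for illustration):

```python
import numpy as np

def rbf(a, b):
    """Unit-lengthscale squared-exponential kernel -- an assumed choice."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

rng = np.random.default_rng(0)
n, m = 500, 20
x = rng.uniform(0.0, 15.0, n)
z = np.linspace(0.0, 15.0, m)                 # inducing inputs (illustrative)
y = np.sin(x) + 0.1 * rng.standard_normal(n)  # toy observations
sigma2 = 0.01

Knm = rbf(x, z)                               # n x m cross-covariance
Kmm = rbf(z, z) + 1e-8 * np.eye(m)            # m x m (jittered)

# Woodbury identity: (Q_nn + sigma2 I)^{-1} y using only an m x m solve.
A = sigma2 * Kmm + Knm.T @ Knm
alpha = (y - Knm @ np.linalg.solve(A, Knm.T @ y)) / sigma2
```

The cost is dominated by forming Knmᵀ Knm, which is O(nm²), matching the complexity quoted earlier.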

Page 16:

Information capture

Everything we want to do with a GP involves marginalising f

- Predictions
- Marginal likelihood
- Estimating covariance parameters

The posterior of f is the central object. This means inverting Knn.

Page 17:

X, y

[axes: input space (X) against function values]

Page 18:

X, y

f (x) ∼ GP

[axes: input space (X) against function values]

Page 19:

X, y

f (x) ∼ GP

p(f) = N (0,Knn)

[axes: input space (X) against function values]

Page 20:

X, y

f (x) ∼ GP

p(f) = N (0,Knn)

p(f |y,X)

[axes: input space (X) against function values]

Page 21:

Introducing u

Take an extra M points on the function, u = f(Z).

p(y, f,u) = p(y | f)p(f |u)p(u)

Page 22:

Introducing u

Page 23:

Introducing u

Take an extra M points on the function, u = f(Z).

p(y, f, u) = p(y | f) p(f | u) p(u)

p(y | f) = N(y | f, σ²I)
p(f | u) = N(f | Knm Kmm⁻¹ u, Knn − Knm Kmm⁻¹ Kmn)
p(u) = N(u | 0, Kmm)

Page 24:

X, y

f (x) ∼ GP

p(f) = N (0,Knn)

p(f |y,X)

Z, u

p(u) = N(0, Kmm)

[axes: input space (X) against function values]

Page 25:

X, y

f (x) ∼ GP

p(f) = N (0,Knn)

p(f |y,X)

p(u) = N (0,Kmm)

p(u |y,X)

[axes: input space (X) against function values]

Page 26:

The alternative posterior

Instead of doing

p(f | y, X) = p(y | f) p(f | X) / ∫ p(y | f) p(f | X) df

We'll do

p(u | y, Z) = p(y | u) p(u | Z) / ∫ p(y | u) p(u | Z) du

but p(y | u) involves inverting Knn

Page 28:

Variational marginalisation of f

p(y | u) = p(y | f) p(f | u) / p(f | y, u)

ln p(y | u) = ln p(y | f) + ln [ p(f | u) / p(f | y, u) ]

ln p(y | u) = E_p(f | u)[ ln p(y | f) ] + E_p(f | u)[ ln ( p(f | u) / p(f | y, u) ) ]

ln p(y | u) = ln p̃(y | u) + KL[ p(f | u) ‖ p(f | y, u) ],  where ln p̃(y | u) = E_p(f | u)[ ln p(y | f) ]

No inversion of Knn required

Page 32:

An approximate likelihood

p(y | u) = ∏_{i=1}^{n} N( yᵢ | kmnᵀ Kmm⁻¹ u, σ² ) exp{ −(knn − kmnᵀ Kmm⁻¹ kmn) / (2σ²) }

A straightforward likelihood approximation, and a penalty term

Page 33:

Now we can marginalise u

p(u | y, Z) = p(y | u) p(u | Z) / ∫ p(y | u) p(u | Z) du

- Computing the (approximate) posterior costs O(nm²)
- We also get a lower bound of the marginal likelihood
- This is the standard variational sparse GP (Titsias 2009).

Page 34:

Overview

Background: Gaussian processes

Sparse Gaussian Processes

Stochastic Variational Inference

Gaussian Likelihoods

Non-Gaussian likelihoods

Bonus: Deep GPs

Page 35:

Stochastic Variational Inference

- Combine the ideas of stochastic optimisation with variational inference
- Example: apply Latent Dirichlet Allocation to Project Gutenberg
- Can apply variational techniques to Big Data
- How could this work in GPs?

Page 36:

But GPs are not factorizing models?

- The variational marginalisation of f introduced factorisation across the datapoints (conditioned on u)
- Marginalising u re-introduced dependencies between the data
- Solution: a variational treatment of u

Page 37:

Variational Bayes

p(u | y) = p(y | u) p(u) / p(y)

p(y) = p(y | u) p(u) / p(u | y)

ln p(y) = ln [ p(y | u) p(u) / q(u) ] + ln [ q(u) / p(u | y) ]

ln p(y) = E_q(u)[ ln ( p(y | u) p(u) / q(u) ) ] + E_q(u)[ ln ( q(u) / p(u | y) ) ]

ln p(y) = L + KL( q(u) ‖ p(u | y) )

Page 42:

The objective

L = E_q(f)[ log p(y | f) ] − KL( q(u) ‖ p(u) )

- Tractable for Gaussian likelihoods
- Numerical integration (1D) for intractable likelihoods (classification)
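Writing q(u) = N(m, S), both q(u) and p(u) = N(0, Kmm) are Gaussian, so the KL penalty in L is available in closed form (the standard multivariate-Gaussian KL; the function name here is illustrative):

```python
import numpy as np

def kl_q_p(m, S, Kmm):
    """KL( N(m, S) || N(0, Kmm) ), the penalty term of the bound L."""
    M = len(m)
    trace_term = np.trace(np.linalg.solve(Kmm, S))  # tr(Kmm^{-1} S)
    mahalanobis = m @ np.linalg.solve(Kmm, m)       # m^T Kmm^{-1} m
    _, logdet_K = np.linalg.slogdet(Kmm)
    _, logdet_S = np.linalg.slogdet(S)
    return 0.5 * (trace_term + mahalanobis - M + logdet_K - logdet_S)
```

It vanishes exactly when q(u) equals the prior, and otherwise penalises the fit term for moving q(u) away from p(u).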

Page 43:

Optimisation

The variational objective L3 is a function of

- the parameters of the covariance function
- the parameters of q(u)
- the inducing inputs, Z

Original strategy: set Z. Take the data in small minibatches, take stochastic gradient steps in the covariance function parameters, and stochastic natural gradient steps in the parameters of q(u).

New strategy: represent S as LLᵀ (unconstrained). Throw m, L, Z, θ at Adagrad.
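The S = LLᵀ representation keeps the optimisation unconstrained: any lower-triangular L with positive diagonal yields a valid covariance. One common way to enforce the positive diagonal is a softplus transform (an assumption here; the talk does not specify the exact parameterisation):

```python
import numpy as np

def cov_from_unconstrained(theta, M):
    """Build S = L L^T from an unconstrained vector theta of length M(M+1)/2."""
    L = np.zeros((M, M))
    L[np.tril_indices(M)] = theta          # fill the lower triangle row by row
    d = np.arange(M)
    L[d, d] = np.log1p(np.exp(L[d, d]))    # softplus keeps the diagonal positive
    return L @ L.T                          # always positive definite
```

An optimiser such as Adagrad can then take gradient steps in theta (alongside m, Z, and θ) without ever leaving the set of valid covariance matrices.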

Page 45:

Overview

Background: Gaussian processes

Sparse Gaussian Processes

Stochastic Variational Inference

Gaussian Likelihoods

Non-Gaussian likelihoods

Bonus: Deep GPs

Page 46: [figure]

Page 47:

UK apartment prices

- Monthly price paid data for February to October 2012 (England and Wales)
- from http://data.gov.uk/dataset/land-registry-monthly-price-paid-data/
- 75,000 entries
- Cross referenced against a postcode database to get latitude and longitude
- Regressed the normalised logarithm of the apartment prices

Page 48: [figure]

Page 49: [figure]

Page 50:

Airline data

- Flight delays for every commercial flight in the USA from January to April 2008.
- Average delay was 30 minutes.
- We randomly selected 800,000 datapoints (we have limited memory!)
- 700,000 train, 100,000 test

[Figure: learned inverse lengthscales for Month, DayOfMonth, DayOfWeek, DepTime, ArrTime, AirTime, Distance, PlaneAge]

Page 51:

[Figure: RMSE of GPs trained on subsets (N = 800, 1000, 1200) compared with the SVI GP over 1200 iterations; RMSE between 32 and 37]

Page 52: [figure]

Page 53:

Overview

Background: Gaussian processes

Sparse Gaussian Processes

Stochastic Variational Inference

Gaussian Likelihoods

Non-Gaussian likelihoods

Bonus: Deep GPs

Page 54:

What about Classification?

- Some of the above integrals are intractable, because of the non-Gaussian likelihood
- Since the likelihood factorizes, we need to do N one-dimensional integrals
- Gauss-Hermite quadrature works well! For derivatives also
- Paper in AISTATS 2015
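Each of the N integrals has the form E_{N(f | m, v)}[log p(y | f)], which Gauss-Hermite quadrature handles after the change of variables f = m + √(2v) x. A sketch using numpy's hermgauss (the logistic likelihood is an illustrative choice):

```python
import numpy as np

# Nodes x_i and weights w_i for integrals against exp(-x^2).
x, w = np.polynomial.hermite.hermgauss(20)

def gauss_expectation(g, m, v):
    """E_{N(f | m, v)}[ g(f) ] via the substitution f = m + sqrt(2 v) x."""
    return (w * g(m + np.sqrt(2.0 * v) * x)).sum() / np.sqrt(np.pi)

def log_logistic(f):
    return -np.log1p(np.exp(-f))  # log sigmoid(f): a classification log-likelihood

# One cheap 1-D quadrature per datapoint replaces the intractable integral.
ell = gauss_expectation(log_logistic, m=0.5, v=1.0)
```

The same nodes can be reused for the derivatives of the expectation with respect to m and v, which is why quadrature also works "for derivatives".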

Page 55:

Does it work?

[Figure: KLSp, mean-field, and FITC approximations compared for M = 4, 8, 16, 32, 64 and the full model]

Page 56:

Is it competitive?

[Figure: holdout negative log probability against time (seconds) for KLSp, MFSp, and EPFitc at M = 4, 50, 200]

Page 57:

Big data?

[Figure: error (%) and −log p(y) against time (seconds)]

Page 58:

Overview

Background: Gaussian processes

Sparse Gaussian Processes

Stochastic Variational Inference

Gaussian Likelihoods

Non-Gaussian likelihoods

Bonus: Deep GPs

Page 59:

Bonus! Deep GPs

Deep models using compound functions:

y = f1( f2(... fn(x)))
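The composition itself is plain function composition; a trivial sketch, with fixed nonlinearities standing in for the GP-distributed layers (in a deep GP each f_i would carry its own GP prior and inducing variables):

```python
import numpy as np

def compose(layers, x):
    """y = f1(f2(... fn(x))): apply the innermost function f_n first."""
    h = x
    for f in reversed(layers):
        h = f(h)
    return h

layers = [np.tanh, lambda h: np.sin(3.0 * h)]  # f1, f2 -- illustrative choices
x = np.linspace(-1.0, 2.0, 50)
y = compose(layers, x)
```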

Page 60:

[Figure: graphical model of a deep GP: inputs xₙ map through hidden layers h⁽¹⁾ₙ, h⁽²⁾ₙ with inducing variables u⁽¹⁾, u⁽²⁾, u⁽³⁾ at Z₀, Z₁, Z₂ to outputs yₙ; plates over d = 1...D and n = 1...N]

Pages 61-65: [figure-only slides]