Page 1:

In the Head of Bayes: A Tutorial on Bayesian Learning for Signal Processing

Antoine Deleforge
Antoine.Deleforge@FAU.de
LNT Seminar - 13.11.15

Page 2:

THANKS

Prof. Dr.-Ing. Walter Kellermann, head of the LMS audio group, EARS project coordinator
Dr. Roland Maas, research and development engineer at Amazon
Prof. Radu Horaud, director of research at Inria Grenoble (France), head of the PERCEPTION team
Prof. Florence Forbes, director of research at Inria Grenoble (France), head of the MISTIS team
BSc. Boris Belusov, former LMS student
MSc. Christian Hümmer, PhD candidate at LMS
Dr. Sileye Ba, researcher at Inria Grenoble (France), PERCEPTION team
MSc. Vincent Drouard, PhD candidate at Inria Grenoble (France), PERCEPTION team

Page 3:

OUTLINE

• What is Bayesian inference?
– Overview
– Classical vs. Bayesian approach
– Bayes' Theorem & Example
– General Methodology

• Bayesian inference by examples
– Direct inference
– The Expectation-Maximization algorithm
– Variational Bayes methods

• Applications
– Multichannel audio signal processing
– High-dimensional regression

• Conclusions


Page 5:

What is Bayesian Inference?

Overview

Ingredients: observations + a model.

Inference: combining the two.

Goals:

• Estimation: quantitative deductions on the causes or consequences of the observations, i.e., find the underlying model parameters.
Example: I observed a certain number of raindrops forming on my window in the last minute. What is the current rainfall in millimeters?

• Prediction: from the inferred model, predict what missing or future observations should be.
Example: How many more raindrops will form on my window in the next hour?

• Decision: take a decision out of a discrete set of choices.
Example: Is it safe to open my window for 1 minute to get some fresh air?

Page 6:

What is Bayesian Inference?

Classical vs. Bayesian

Statistics: inference from real-world observations of a random phenomenon using probability theory.

Classical/frequentist statistics:
• Definition: the model parameters to be estimated are considered unknown constants.
• Features: inference based entirely on the observed data and frequentist arguments. Useful when little prior knowledge exists about the underlying random process.
• Tools: linear estimators, first- and second-order statistics, and a lot of …

Bayesian statistics:
• Definition: the model parameters are considered hidden random variables, following a hypothetical probabilistic model.
• Features: incorporates prior knowledge about the hidden variables in the form of a generative probabilistic model. Useful when reasonable probability density functions (PDFs) can be assumed.
• Tools: Bayes' theorem, explicit PDFs, and a lot of …

Page 7:

What is Bayesian Inference?

Bayes' Theorem & Example

• Bayes' Theorem:

P(Z | X) = P(X | Z) P(Z) / P(X)

where X denotes the observed variables and Z the hidden variables; P(Z | X) is the posterior, P(X | Z) the likelihood, P(Z) the prior, and P(X) the « observed data » or « marginal » likelihood.

Remark 1: Bayes does not « forbid » model parameters!
• There is no formal difference between a parameter and a hidden variable with a constant prior.
• Prior distributions often have parameters of their own, called « hyperparameters ».

Remark 2: Why hidden variables?
• Formally not needed: P(X) can be obtained by marginalizing out the hidden variables.
• A convenient and powerful viewpoint which makes inference possible in complex scenarios through a variety of methods.

Page 8:

What is Bayesian Inference?

Bayes' Theorem & Example

An example:

Hypotheses:
Z = 0: it's not raining
Z = 1: it's raining rain
Z = 2: it's raining cats and dogs

Observations:
Xd: I see drops on my window
Xc: I see a cat or a dog at my window

Likelihoods P(X | Z):

        Xd     Xc
Z = 0   0.1    0.1
Z = 1   0.99   0.1
Z = 2   0.1    0.99

• Equal likelihood!
• Add priors:
• Bayes' theorem:
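The likelihood table above can be plugged straight into Bayes' theorem. A minimal sketch in Python; the prior values are illustrative assumptions, since the slide's priors are not in the transcript:

```python
# Likelihoods P(observation | Z), read off the table above.
likelihood = {
    "Xd": [0.10, 0.99, 0.10],  # drops on the window
    "Xc": [0.10, 0.10, 0.99],  # a cat or a dog at the window
}
prior = [0.70, 0.25, 0.05]  # assumed P(Z): no rain, rain, cats and dogs

def posterior(obs):
    """P(Z | obs) via Bayes' theorem: posterior ∝ likelihood × prior."""
    unnorm = [l * p for l, p in zip(likelihood[obs], prior)]
    evidence = sum(unnorm)  # marginal likelihood P(obs)
    return [u / evidence for u in unnorm]

print(posterior("Xd"))  # posterior over Z given drops on the window
```

With these (assumed) priors, seeing drops makes Z = 1 the maximum a posteriori hypothesis, because the prior reweights the likelihood column.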

Page 9:

What is Bayesian Inference?

General Methodology

Modeling:
• What is hidden? What is observed? What are the dependencies? → graphical model
• Choice of prior and conditional PDFs
• Together, these define the joint PDF.

Inference:
• Apply Bayes' theorem → posterior PDF
• Choice of method: exact / approximate, direct / iterative
• Use the posterior for estimation (MAP, posterior mean, …), prediction, and decision.


Pages 13-19:

Bayesian Inference: Examples

Direct Inference

[Comic-strip introduction, told in pictures. Page 18: someone is making a joke…]

Page 20:

Bayesian Inference: Examples

Direct Inference

Modeling:
• Observed variables: …
• Hidden variable: the guilty house number?
• Graphical model: …
• Conditionals: …
• Priors: … (Grandma Jane), (student house), (family with kids)

Page 21:

Bayesian Inference: Examples

Direct Inference

Inference:
• Bayes' theorem: direct computation
• Estimation: maximum a posteriori (MAP) → …, the student house
• Decision: these pranksters will hear from me at the Uni!


Pages 24-26:

Bayesian Inference: Examples

EM algorithm

[Comic-strip continuation, told in pictures: « They are on the roof! »]

Page 27:

Bayesian Inference: Examples

EM algorithm

Modeling:
• Observed variables: …
• Hidden variables: …
• Graphical model: …
• Conditional: …
• Priors: …
• Parameters: …

Page 28:

Bayesian Inference: Examples

EM algorithm

Inference:
• Bayes' theorem: … where …
• Simple form, but … is unknown => maximum likelihood? Non-convex, combinatorial, intractable.
• The joint probability has a much simpler form than the marginal.
• Z is a hidden variable, and cannot be estimated without knowing …
=> Expectation-Maximization (EM) algorithm

Page 29:

Bayesian Inference: Examples

EM algorithm

• E-step: compute the posterior expectation of the complete-data log-likelihood.
• M-step: …

Proof of correctness:
• … where … is the conditional cross-entropy of Z given X for the distribution … In particular: …
• According to Gibbs' inequality, we have … (Bayes' theorem)
• Therefore: the likelihood can only increase at each step!
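The proof sketched on this slide can be written out in standard notation (a reconstruction; the labels θ, Q and H are ours):

```latex
% Standard EM decomposition (reconstruction; theta denotes the model parameters)
\log p(X;\theta)
  = \underbrace{\mathbb{E}_{Z \sim p(Z \mid X;\theta^{\mathrm{old}})}
      \big[\log p(X, Z;\theta)\big]}_{Q(\theta,\,\theta^{\mathrm{old}})}
  \;\underbrace{-\;\mathbb{E}_{Z \sim p(Z \mid X;\theta^{\mathrm{old}})}
      \big[\log p(Z \mid X;\theta)\big]}_{H(\theta,\,\theta^{\mathrm{old}})}
```

The E-step computes Q(θ, θ_old) and the M-step maximizes it over θ. Gibbs' inequality gives H(θ, θ_old) ≥ H(θ_old, θ_old), so any θ with Q(θ, θ_old) ≥ Q(θ_old, θ_old) satisfies log p(X; θ) ≥ log p(X; θ_old): the likelihood can only increase.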

Page 30:

Bayesian Inference: Examples

EM algorithm

Derivations for the Gaussian mixture model:
• E-step: compute the current posterior probabilities … We deduce: …
• M-step: maximize … by finding the zeros of the derivative: … , … , …
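As a concrete companion to the E- and M-step formulas (which are images in the original slides), here is a minimal 1-D Gaussian-mixture EM sketch; the quantile-based initialization and the fixed iteration count are our own choices:

```python
import numpy as np

def em_gmm(x, K, n_iter=100):
    """Minimal EM for a 1-D Gaussian mixture model (illustrative sketch)."""
    n = len(x)
    pi = np.full(K, 1.0 / K)                       # mixture weights
    mu = np.quantile(x, (np.arange(K) + 0.5) / K)  # spread-out initial means
    var = np.full(K, np.var(x))                    # broad initial variances
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] = p(z_i = k | x_i; current params)
        logp = (-0.5 * (x[:, None] - mu) ** 2 / var
                - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        logp -= logp.max(axis=1, keepdims=True)    # for numerical stability
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: closed-form updates (zeros of the derivative of Q)
        Nk = r.sum(axis=0)
        pi = Nk / n
        mu = (r * x[:, None]).sum(axis=0) / Nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
    return pi, mu, var
```

On two well-separated clusters this recovers the component means; in practice one would also monitor the log-likelihood to decide convergence rather than iterate a fixed number of times.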

Page 31:

Bayesian Inference: Examples

EM algorithm

Inference in action:
• Initialization: random « guess » for …
• E-step
• M-step
• … (iterate)
• Convergence

Decision: minus 10 points for Mr. Green, minus 5 points for the others!


Pages 34-36:

Bayesian Inference: Examples

Variational methods

[Comic strip: the next day… « I will show them what Bayes is capable of… »]

Page 37:

Bayesian Inference: Examples

Variational methods

Modeling: a « fully Bayesian » model
• GMM (same as before)
• Priors on all parameters. Note: these are the conjugate priors for the normal and the multinomial distributions, i.e., they are such that … and …
• Graphical model: …
• Choice of hyperparameters: low values will allow Gaussian weights to be close to 0.

Page 38:

Bayesian Inference: Examples

Variational methods

Inference:
• The posterior distribution is intractable.
• Technique: use a variational approximation q, where q is restricted to a family of distributions with a simpler form than the true posterior.
• The variational distribution is typically assumed to factorize over some partition of the latent variables.
• Here we use a factorization over the assignments Z and the parameters W (cf. the E-Z and E-W steps below). Remarkably, this is the only assumption needed to obtain a tractable EM-like inference procedure.
• Such procedures are referred to as Variational Bayesian EM (VB-EM) algorithms:
– E-Z step: …
– E-W step: …
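The E-Z and E-W steps are instances of the generic mean-field coordinate updates; a standard reconstruction in our notation (X the observations, Z the assignments, W the parameters treated as random variables):

```latex
% Mean-field updates under q(Z, W) = q(Z) q(W) (standard reconstruction)
\log q^{*}(Z) = \mathbb{E}_{q(W)}\big[\log p(X, Z, W)\big] + \mathrm{const}
\qquad
\log q^{*}(W) = \mathbb{E}_{q(Z)}\big[\log p(X, Z, W)\big] + \mathrm{const}
```

Each update holds the other factor fixed, which is why the procedure alternates like EM.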

Page 39:

Bayesian Inference: Examples

Variational methods

Inference: proof of correctness
• Using similar reasoning as for EM, one can show that VB-EM iteratively minimizes the Kullback-Leibler divergence between the true posterior and its variational approximation:
– E-Z step: …
– E-W step: …

Page 40:

Bayesian Inference: Examples

Variational methods

Inference: derivations for the Bayesian mixture of Gaussians
• E-Λμπ step: using the decomposition … (see the E-Z step), with … This leads to the factorization: … , where: …

Page 41:

Bayesian Inference: Examples

Variational methods

Inference: derivations for the Bayesian mixture of Gaussians
• E-Z step: … where … It follows that … where … Finally, … can be expressed as a function of the parameters calculated in the previous step: … where … denotes the digamma function.

Page 42:

Bayesian Inference: Examples

Variational methods

VB-EM in action:
• Initialization: random means + a GMM E-step for …
• E-Λμπ step
• E-Z step
• … (iterate)
• Convergence

Conclusions on GMM VB-EM:
• Similar computational time to GMM-EM (though slightly more iterations)
• Priors on the Gaussian weights automatically handle degenerate or unused clusters
• Determination of …
• Works even for very small data samples


Page 46:

Multichannel Audio

Modeling

Page 47:

Multichannel Audio

Modeling

Short-time Fourier transform: …

Problem: how to optimally recover … from a random observation of … given …?

Page 48:

Multichannel Audio

Classical Signal Processing View

• Vector … , matrix …
• Solutions: multichannel Wiener filter, LCMV beamforming, …

Page 49:

Multichannel Audio

Classical Signal Processing View

• Multichannel Wiener filter: …
• LCMV beamformer: …

Page 50:

Multichannel Audio

Bayesian View

• Graphical model:
– Hidden source signals: …
– Observed microphone signals: …
– Hidden noise signals: …
• Generative model:
– Conditional: …
– Prior: …
• Circular complex normal distribution: …

Page 51:

Multichannel Audio

Bayesian View

• Bayes' theorem: …
• Knowing that this density is complex Gaussian and identifying linear and quadratic forms of … : … , where …
• The maximum a posteriori (MAP) estimator of … is therefore: …
• Using the matrix inversion lemma, we can rewrite it as: …
• Besides, using … and the linear transform of Gaussians: … where …
• We finally get: … where …

The Wiener filter yields the MAP estimate of the sources for complex Gaussian signals.
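The derivation on this slide can be reconstructed under the model the slide describes; a sketch in our own (assumed) notation, with x the stacked microphone signals, s the sources, n the noise, x = A s + n, s ~ CN(0, Σ_s) and n ~ CN(0, Σ_n):

```latex
% Reconstruction sketch; the notation is ours, not the slide's.
\hat{s}_{\mathrm{MAP}}
  = \arg\max_{s}\, p(s \mid x)
  = \Sigma_s A^{H} \left( A \Sigma_s A^{H} + \Sigma_n \right)^{-1} x
  = \left( A^{H} \Sigma_n^{-1} A + \Sigma_s^{-1} \right)^{-1} A^{H} \Sigma_n^{-1} x
```

The first form is the multichannel Wiener filter; the last equality is the matrix-inversion-lemma step mentioned in the slide.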

Page 52:

Multichannel Audio

Bayesian View

• Maximum likelihood estimate: suppose the prior is constant. By Bayes' theorem we have: …
=> the MAP and the ML estimates coincide for constant priors.
• A constant prior can be asymptotically modeled by a Gaussian with infinite variance, i.e., … This leads to …
• Using the previous result, we obtain: …
• Using … and the matrix inversion lemma twice: … where …

The LCMV beamformer is the ML estimate of the sources for complex Gaussian signals.
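Continuing the reconstruction from the previous slide (same assumed notation), letting the source prior become flat, i.e. Σ_s⁻¹ → 0, in the MAP estimator gives:

```latex
% Reconstruction sketch; the notation is ours.
\hat{s}_{\mathrm{ML}}
  = \left( A^{H} \Sigma_n^{-1} A \right)^{-1} A^{H} \Sigma_n^{-1} x
```

This is the LCMV beamformer; the corresponding filter W = Σ_n⁻¹ A (A^H Σ_n⁻¹ A)⁻¹ satisfies the distortionless constraint W^H A = I.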


Page 55:

High-Dimensional Regression

Gaussian Locally-Linear Mapping (GLLiM)

Challenges:
• High- to low-dimensional regression x = f(y) is hard because f has a high-dimensional support: f(y1, y2, …, yD).
• Solution: learn the regression the other way around, y = g(x), and use the inverse function f = g^-1 to map y to x.
• Problem: the mapping is non-linear, but locally linear.
• Solution: a mixture of locally-linear regression models.

Page 56:

High-Dimensional Regression

Gaussian Locally-Linear Mapping (GLLiM)

The GLLiM model:
• Generative model: …
• Closed-form EM algorithm:
– E-step: posterior update (assign points to regions)
– M-step: parameter update (calculate transformations)
– Iterate until convergence

AD., F. Forbes, R. Horaud, « High-dimensional regression with Gaussian mixtures and partially-latent response variables », Springer Statistics and Computing, 2015.

Page 57:

High-Dimensional Regression

The GLLiM Model

• The forward conditional density: …
• The inverse conditional density (Bayes' inversion): …
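The two densities on this slide are images in the transcript; they can be reconstructed from the GLLiM paper cited on the previous slide, with assumed notation: K affine components A_k x + b_k, Gaussian gates N(x; c_k, Γ_k) with weights π_k, and noise covariances Σ_k:

```latex
% Forward conditional density (reconstruction from the cited GLLiM paper):
p(y \mid x; \theta)
  = \sum_{k=1}^{K}
    \frac{\pi_k \,\mathcal{N}(x;\, c_k, \Gamma_k)}
         {\sum_{j=1}^{K} \pi_j \,\mathcal{N}(x;\, c_j, \Gamma_j)}
    \;\mathcal{N}\!\left(y;\, A_k x + b_k,\, \Sigma_k\right)
```

By Bayes' inversion, the inverse density p(x | y; θ) has the same functional form with the roles of x and y exchanged, and its parameters are obtained in closed form from θ.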

Page 58:

High-Dimensional Regression

The GLLiM Model

Regression: low-to-high or high-to-low?
• Example: … , with isotropic and equal noise covariances
• Low-to-high regression (…) model size: …
• High-to-low regression (…) model size: … , and requires the inversion of 1000 x 1000 matrices
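The slide's exact model sizes are images, but the comparison is easy to redo. A sketch with assumed dimensions (D = 1000 follows from the matrix-inversion remark; L = 2 and K = 10 are illustrative, and so is this particular parameterization):

```python
def gllim_param_count(d_in, d_out, K):
    """Parameters of a K-component mixture of affine maps R^d_in -> R^d_out,
    with a full Gaussian covariance on the input side and an isotropic,
    shared noise variance on the output side (as assumed on the slide)."""
    per_component = (
        d_in                       # Gaussian center c_k
        + d_in * (d_in + 1) // 2   # full input covariance Gamma_k
        + d_out * d_in             # affine matrix A_k
        + d_out                    # offset b_k
    )
    return K * per_component + (K - 1) + 1  # + mixture weights + noise variance

D, L, K = 1000, 2, 10  # D from the slide; L and K are illustrative assumptions
low_to_high = gllim_param_count(L, D, K)   # learn y = g(x), x low-dimensional
high_to_low = gllim_param_count(D, L, K)   # learn x = f(y) directly

# The direct high-to-low model is dominated by the K full D x D covariances,
# which is also why it would require inverting 1000 x 1000 matrices.
print(low_to_high, high_to_low)
```

With these assumptions the direct high-to-low model has two orders of magnitude more parameters, which is the point of learning in the low-to-high direction and inverting with Bayes.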

Page 59:

High-Dimensional Regression

Applications

Application 1: estimation of head pose in images
• Training: images of faces annotated with 3D pose (yaw, pitch, roll)
• Prima dataset (2004)
• Features: histograms of gradients (HoG) → GLLiM

Page 60:

High-Dimensional Regression

Applications

Application 1: estimation of head pose in images (testing)

Drouard, Ba, Evangelidis, AD., Horaud, « Head pose estimation via probabilistic high-dimensional regression », IEEE ICIP 2015 (best paper).

Videos: https://team.inria.fr/perception/research/head-pose/

Page 61:

High-Dimensional Regression

Applications

Application 2: mapping sounds onto images (training)
• Loudspeaker emitting white noise
• Visual target (chessboard pattern)
• 432 positions in the camera field of view

AD., Drouard, Girin, Horaud, « Mapping sounds onto images using binaural spectrograms », EUSIPCO 2014.

Page 62:

High-Dimensional Regression

Applications

Application 2: mapping sounds onto images (training)
• ILD spectrogram (frequency x time) → temporal mean
• Acoustic space sampling

Page 63:

High-Dimensional Regression

Applications

Application 2: mapping sounds onto images (testing)
• Speech spectrogram, speech ILD spectrogram
• Extension of GLLiM to time series of mixed, partially missing data: …

Page 64:

High-Dimensional Regression

Applications

Application 2: mapping sounds onto images (testing): 1 source, 2 sources, NAO

AD., Drouard, Girin, Horaud, « Mapping sounds onto images using binaural spectrograms », EUSIPCO 2014.

AD., Horaud, Schechner, Girin, « Co-localization of audio sources in images using binaural features and locally-linear regression », IEEE Trans. Audio, Speech and Lang. Proc. 2015.

Czech, Mittal, AD., Sanchez-Riera, Alameda-Pineda, Horaud, « Active-speaker detection and localization with microphones and cameras embedded into a robotic head », HUMANOIDS 2013.

Videos and more: https://team.inria.fr/perception/alumni/deleforge/


Application 3: Hyperspectral imaging of planet Mars surface

Mars Express, OMEGA instrument (2004) [http://geops.geol.u-psud.fr/]

[Figure: hyperspectral image cube with two spatial dimensions and one spectral dimension (wavelength); each pixel's spectrum relates to chemical composition, granularity, physical state and texture, but how?]


Training

• Training data pairs synthesized using a radiative transfer model

• Extensions of GLLiM with:
– Spatial Markov dependencies (neighboring pixels are dependent)
– A partially-latent output model (atmospheric conditions, temperature, …)


Testing

• Two views of Mars' south polar cap (orbits 41 and 61)

• No ground-truth available

• GLLiM results:
– Consistency between the two orbits
– Complementarity of the estimated proportions
– Higher concentration of dust on the edge of the glacier
– Smoothness?

Deleforge, Forbes, Horaud, "Hyper-spectral image analysis with partially-latent regression", IEEE EUSIPCO 2014.


Testing: adding spatial Markov dependencies

[Figure: smoothing effect of the spatial prior]

Deleforge, Forbes, Ba, Horaud, "Hyperspectral image analysis using locally-linear regression and spatial Markov dependencies", IEEE JSTSP 2015.


OUTLINE

[email protected] LNT Seminar - 13.11.15

• What is Bayesian inference?
– Overview
– Classical vs. Bayesian approach
– Bayes Theorem & Example
– General Methodology

• Bayesian inference by examples
– Direct inference
– The Expectation-Maximization algorithm
– Variational Bayes methods

• Applications
– Multichannel audio signal processing
– High-dimensional regression

• Conclusions




Conclusions

[email protected] LNT Seminar - 13.11.15 71/74

Further Reading

• Many results of classical signal processing that are obtained by seeking linear filters can be reproduced without assuming linearity, using Bayesian inference and Gaussian signal assumptions

• How to include the estimation of the parameters (…)?

Example: R. Maas et al., "A Bayesian network view on linear and nonlinear acoustic echo cancellation", IEEE ChinaSIP 2014.

• Bayesian interpretation of NLMS
• Optimal step-size estimation
• Generalization to non-linear systems
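For context, here is a minimal sketch of the classical NLMS filter that the paper reinterprets; this is plain NLMS with a fixed step size, not the Bayesian variant (in the Bayesian view, the step size emerges from posterior uncertainties):

```python
import numpy as np

def nlms(x, d, num_taps=8, mu=0.5, eps=1e-8):
    """Plain NLMS adaptive filter: identify a system from input x and
    desired output d. Returns the error signal and the final weights."""
    w = np.zeros(num_taps)
    e = np.zeros(len(x))
    for n in range(num_taps, len(x)):
        u = x[n - num_taps:n][::-1]           # most recent samples first
        e[n] = d[n] - w @ u                   # a-priori error
        w += mu * e[n] * u / (u @ u + eps)    # normalized gradient step
    return e, w
```

With a noiseless FIR system and white input, the weights converge to the true impulse response; the Bayesian reading replaces the hand-tuned mu by an optimally estimated, time-varying quantity.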

Another example: N. Q. K. Duong, E. Vincent, and R. Gribonval, "Under-determined reverberant audio source separation using a full-rank spatial covariance model", IEEE Transactions on Audio, Speech, and Language Processing, 2010.

• Complex Gaussian models, STFT observations
• Blind source separation using the EM algorithm to estimate all the parameters
• Generalization using full-rank spatial covariance matrices for the sources

Open: Many adaptive SP algorithms may be interpreted in terms of EM/VEM procedures.

Open: Bayesian priors on parameters?


• Extensions to non-Gaussian PDFs

Example: A. Liutkus and R. Badeau, "Generalized Wiener filtering with fractional power spectrograms", IEEE ICASSP 2015.
• Complex-Gaussian model generalized by an alpha-stable harmonizable process
• Additivity of fractional powers of the magnitude
• Generalized single-channel Wiener filter
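A sketch of the resulting fractional-power separation masks (alpha = 2 recovers the classical single-channel Wiener gain; the function name and shapes below are my own, not from the paper):

```python
import numpy as np

def alpha_wiener_masks(source_mags, alpha=1.0, eps=1e-12):
    """Fractional-power ("alpha-Wiener") soft masks.

    source_mags: (J, F, T) non-negative magnitude estimates for J sources.
    Each source estimate is mask_j * mixture_STFT.
    """
    p = np.power(source_mags, alpha)          # fractional power spectrograms
    return p / (p.sum(axis=0, keepdims=True) + eps)
```

Setting alpha = 2 gives the familiar |S_j|^2 / sum_k |S_k|^2 Wiener gain, while alpha = 1 corresponds to magnitude ("soft") masking.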

Open: Beamforming/ MWF with alpha-stable priors?


• The "Bayesian view"
– Two ingredients: observations + a model
– Three goals: estimation, prediction, decision
– Incorporate beliefs or priors using PDFs
– A general, principled methodology

• Bayesian inference
– A number of tools (EM, VEM, Monte Carlo sampling, …)
– Well-established theoretical results
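As a reminder of what the EM tool looks like in its simplest form, here is a minimal EM for a one-dimensional Gaussian mixture (an illustrative sketch only; the quantile-based initialization is a simplification of my own):

```python
import numpy as np

def em_gmm_1d(x, K=2, n_iter=100):
    """Minimal EM for a 1-D Gaussian mixture model."""
    pi = np.full(K, 1.0 / K)                          # mixture weights
    mu = np.quantile(x, (np.arange(K) + 0.5) / K)     # spread initial means
    var = np.full(K, x.var())
    for _ in range(n_iter):
        # E-step: responsibilities r[n, k] = p(z_n = k | x_n, parameters)
        log_p = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        r = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: closed-form re-estimation from weighted statistics
        Nk = r.sum(axis=0)
        pi = Nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / Nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
    return pi, mu, var
```

Each iteration alternates a posterior computation (E-step) with closed-form parameter updates (M-step), and the data log-likelihood never decreases.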

• Using Bayesian models in signal processing is a recent, exciting and fast-growing area of research

• Now it’s your turn!

Last Words


That’s all folks!

Questions?

Thank you.