Lecture 2: Statistical learning primer for biologists Alan Qi Purdue Statistics and CS Jan. 15, 2009

Jan 21, 2016

Lecture 2: Statistical learning primer for biologists

Alan Qi, Purdue Statistics and CS

Jan. 15, 2009

Outline

• Basics of probability

• Regression

• Graphical models: Bayesian networks and Markov random fields

• Unsupervised learning: K-means and expectation maximization

Probability Theory

• Sum Rule

• Product Rule

The Rules of Probability

• Sum Rule: $p(X) = \sum_Y p(X, Y)$

• Product Rule: $p(X, Y) = p(Y \mid X)\, p(X)$
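
As a quick numerical check of the two rules, the sketch below uses a small made-up joint table. The values and the names `joint`, `marginal_x`, and `conditional_y_given_x` are illustrative assumptions, not part of the lecture. It marginalizes with the sum rule and then multiplies back with the product rule.

```python
# Hypothetical joint distribution p(X, Y) over X in {0, 1} and Y in {0, 1, 2}.
# The numbers are made up for illustration; they only need to sum to 1.
joint = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.30, (1, 1): 0.05, (1, 2): 0.25,
}


def marginal_x(joint):
    """Sum rule: p(X) = sum over Y of p(X, Y)."""
    p_x = {}
    for (x, _y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
    return p_x


def conditional_y_given_x(joint):
    """p(Y | X) = p(X, Y) / p(X), defined wherever p(X) > 0."""
    p_x = marginal_x(joint)
    return {(x, y): p / p_x[x] for (x, y), p in joint.items()}


p_x = marginal_x(joint)
p_y_given_x = conditional_y_given_x(joint)

# Product rule: p(X, Y) = p(Y | X) p(X) recovers every entry of the joint.
reconstructed = {(x, y): p_y_given_x[(x, y)] * p_x[x] for (x, y) in joint}
```

Here `p_x` holds the two marginals, and `reconstructed` agrees with `joint` up to floating-point rounding.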

Bayes’ Theorem

$p(Y \mid X) = \dfrac{p(X \mid Y)\, p(Y)}{p(X)}$, where $p(X) = \sum_Y p(X \mid Y)\, p(Y)$

posterior ∝ likelihood × prior

Probability Density & Cumulative Distribution Functions

Expectations

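Conditional Expectation (discrete): $\mathbb{E}_x[f \mid y] = \sum_x p(x \mid y)\, f(x)$

Approximate Expectation (discrete and continuous): $\mathbb{E}[f] \simeq \frac{1}{N} \sum_{n=1}^{N} f(x_n)$, with the $x_n$ drawn from $p(x)$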
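
The approximate expectation replaces the sum or integral by an average over samples. A minimal sketch follows, assuming a standard normal $p(x)$ and $f(x) = x^2$ as the test case (both chosen here for illustration because the exact answer, 1, is known); the function name `approximate_expectation` is likewise hypothetical.

```python
import random


def approximate_expectation(f, sampler, n_samples):
    """Approximate E[f] by (1/N) * sum_n f(x_n), with x_n drawn by `sampler`."""
    return sum(f(sampler()) for _ in range(n_samples)) / n_samples


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed test case: x ~ N(0, 1) and f(x) = x^2, whose exact expectation is 1.
estimate = approximate_expectation(
    lambda x: x * x,
    lambda: rng.gauss(0.0, 1.0),
    100_000,
)
```

The estimate fluctuates around the exact value, with an error that shrinks roughly as $1/\sqrt{N}$.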

Variances and Covariances

The Gaussian Distribution

Gaussian Mean and Variance

The Multivariate Gaussian

Gaussian Parameter Estimation

Likelihood function

Maximum (Log) Likelihood

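Properties of $\mu_{\mathrm{ML}}$ and $\sigma^2_{\mathrm{ML}}$

Unbiased: $\mathbb{E}[\mu_{\mathrm{ML}}] = \mu$

Biased: $\mathbb{E}[\sigma^2_{\mathrm{ML}}] = \frac{N-1}{N}\, \sigma^2$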
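
The bias of the maximum likelihood variance can be seen in a small simulation. The sketch below assumes a true mean of 2, a true variance of 4, and N = 5 points per data set (all arbitrary choices for illustration), and averages the ML estimates over many simulated data sets.

```python
import numpy as np


def gaussian_ml(x):
    """ML estimates for a univariate Gaussian: sample mean and 1/N variance."""
    mu_ml = x.mean()
    var_ml = ((x - mu_ml) ** 2).mean()  # divides by N, not N - 1
    return mu_ml, var_ml


rng = np.random.default_rng(0)
true_mu, true_var, n = 2.0, 4.0, 5  # assumed values for the demonstration

# Repeat the experiment many times and average the estimates.
estimates = np.array([
    gaussian_ml(rng.normal(true_mu, np.sqrt(true_var), size=n))
    for _ in range(20_000)
])
mean_mu_ml, mean_var_ml = estimates.mean(axis=0)

# Theory: E[var_ML] = (N - 1) / N * sigma^2, which is 3.2 here rather than 4.
expected_var_ml = (n - 1) / n * true_var
```

The averaged mean estimate sits near the true mean, while the averaged variance estimate sits near `expected_var_ml`, below the true variance.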

Curve Fitting Re-visited

Maximum Likelihood

Determine $\mathbf{w}_{\mathrm{ML}}$ by minimizing the sum-of-squares error, $E(\mathbf{w})$.
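
Under Gaussian noise, maximizing the likelihood over the weights is the same as least squares. The following is a minimal sketch for a polynomial model, assuming toy data drawn from a noisy sine curve and a cubic polynomial; the data, the degree, and the helper names are illustrative assumptions rather than the lecture's own setup.

```python
import numpy as np


def design_matrix(x, degree):
    """Columns are x^0, x^1, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)


def fit_least_squares(x, t, degree):
    """Return w minimizing E(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2."""
    w, *_ = np.linalg.lstsq(design_matrix(x, degree), t, rcond=None)
    return w


def sum_of_squares_error(w, x, t):
    """E(w) for polynomial weights w on inputs x and targets t."""
    residual = design_matrix(x, len(w) - 1) @ w - t
    return 0.5 * float(residual @ residual)


# Assumed toy data: noisy samples of sin(2*pi*x).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, size=x.shape)

w_ml = fit_least_squares(x, t, degree=3)
```

Any other weight vector of the same degree gives an error at least as large as `sum_of_squares_error(w_ml, x, t)`.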

Predictive Distribution

MAP: A Step towards Bayes

Determine $\mathbf{w}_{\mathrm{MAP}}$ by minimizing the regularized sum-of-squares error, $\widetilde{E}(\mathbf{w})$.
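
With a zero-mean Gaussian prior on the weights, the MAP estimate coincides with ridge regression, which has a closed form. The sketch below assumes a degree-9 polynomial, the same kind of noisy sine toy data, and two arbitrary values of the regularization strength `lam`; none of these choices come from the lecture.

```python
import numpy as np


def fit_regularized(x, t, degree, lam):
    """Minimize 1/2 * sum_n (y(x_n, w) - t_n)^2 + lam/2 * ||w||^2 in closed form."""
    phi = np.vander(x, degree + 1, increasing=True)
    a = phi.T @ phi + lam * np.eye(degree + 1)
    return np.linalg.solve(a, phi.T @ t)


# Assumed toy data: noisy samples of sin(2*pi*x).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, size=x.shape)

w_weak = fit_regularized(x, t, degree=9, lam=1e-6)    # nearly unregularized
w_strong = fit_regularized(x, t, degree=9, lam=1e-2)  # stronger shrinkage
```

Increasing `lam` shrinks the weight vector, which is how the prior suppresses the large oscillating coefficients of an over-flexible polynomial.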

Bayesian Curve Fitting

Bayesian Networks

• Directed Acyclic Graph (DAG)

Bayesian Networks

General Factorization

Generative Models

• Causal process for generating images

Discrete Variables (1)

• General joint distribution: $K^2 - 1$ parameters

• Independent joint distribution: $2(K - 1)$ parameters

Discrete Variables (2)

General joint distribution over $M$ variables: $K^M - 1$ parameters

$M$-node Markov chain: $K - 1 + (M - 1)K(K - 1)$ parameters
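
The parameter counts above are easy to tabulate. The helper functions below are hypothetical names introduced for illustration; they simply evaluate the three formulas so the growth rates can be compared.

```python
def params_general_joint(k, m):
    """Unconstrained joint over M K-state variables: K^M - 1 free parameters."""
    return k ** m - 1


def params_independent(k, m):
    """Fully factorized joint: each of the M variables needs K - 1 parameters."""
    return m * (k - 1)


def params_markov_chain(k, m):
    """Chain x1 -> x2 -> ... -> xM: K - 1 for p(x1), K(K - 1) per transition."""
    return (k - 1) + (m - 1) * k * (k - 1)


# Example: ten binary variables.
general = params_general_joint(2, 10)   # 1023
independent = params_independent(2, 10)  # 10
chain = params_markov_chain(2, 10)       # 19
```

The general joint grows exponentially in M, while the chain grows only linearly.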

Discrete Variables: Bayesian Parameters (1)

Discrete Variables: Bayesian Parameters (2)

Shared prior

Parameterized Conditional Distributions

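If $x_1, \dots, x_M$ are discrete, $K$-state variables, $p(y = 1 \mid x_1, \dots, x_M)$ in general has $O(K^M)$ parameters.

The parameterized form

$p(y = 1 \mid x_1, \dots, x_M) = \sigma\!\left(w_0 + \sum_{i=1}^{M} w_i x_i\right)$

requires only $M + 1$ parameters.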

Conditional Independence

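• a is independent of b given c: $p(a \mid b, c) = p(a \mid c)$

• Equivalently: $p(a, b \mid c) = p(a \mid b, c)\, p(b \mid c) = p(a \mid c)\, p(b \mid c)$

• Notation: $a \perp\!\!\!\perp b \mid c$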

Conditional Independence: Example 1

Conditional Independence: Example 1

Conditional Independence: Example 2

Conditional Independence: Example 2

Conditional Independence: Example 3

• Note: this is the opposite of Example 1, with c unobserved.

Conditional Independence: Example 3

Note: this is the opposite of Example 1, with c observed.

“Am I out of fuel?”

B = Battery (0=flat, 1=fully charged)

F = Fuel Tank (0=empty, 1=full)

G = Fuel Gauge Reading (0=empty, 1=full)

And hence

“Am I out of fuel?”

Probability of an empty tank increased by observing G = 0.

“Am I out of fuel?”

Probability of an empty tank reduced by observing B = 0. This is referred to as “explaining away”.
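
The two effects can be reproduced numerically. The probability tables did not survive in this transcript, so the sketch below assumes the values commonly quoted for this textbook example: p(B=1) = p(F=1) = 0.9, and a gauge that reads full with probability 0.8, 0.2, 0.2, 0.1 for (B, F) = (1,1), (1,0), (0,1), (0,0). Treat those numbers, and the function names, as assumptions.

```python
# Assumed probability tables (not present in the transcript).
P_B = {1: 0.9, 0: 0.1}                      # battery charged / flat
P_F = {1: 0.9, 0: 0.1}                      # tank full / empty
P_G1_GIVEN_BF = {(1, 1): 0.8, (1, 0): 0.2,  # p(G=1 | B, F), keyed by (B, F)
                 (0, 1): 0.2, (0, 0): 0.1}


def p_g(g, b, f):
    """p(G=g | B=b, F=f)."""
    p1 = P_G1_GIVEN_BF[(b, f)]
    return p1 if g == 1 else 1.0 - p1


def posterior_empty_tank(g, b=None):
    """p(F=0 | G=g) if b is None, else p(F=0 | G=g, B=b), by Bayes' theorem."""
    b_values = [0, 1] if b is None else [b]

    def joint(f):
        # p(F=f, G=g), or p(F=f, G=g, B=b) when the battery state is observed.
        return sum(P_B[bv] * P_F[f] * p_g(g, bv, f) for bv in b_values)

    return joint(0) / (joint(0) + joint(1))


prior = P_F[0]                                             # 0.1
after_gauge = posterior_empty_tank(g=0)                    # about 0.257
after_gauge_and_battery = posterior_empty_tank(g=0, b=0)   # about 0.111
```

Under these assumed tables, seeing the gauge at empty raises the probability of an empty tank from 0.1 to about 0.257, and then learning that the battery is flat lowers it to about 0.111, because the flat battery already explains the gauge reading.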

The Markov Blanket

Factors independent of $x_i$ cancel between numerator and denominator.
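
In a directed graph, the factors that survive this cancellation involve the node's parents, its children, and the other parents of its children (co-parents); together these form the Markov blanket. A minimal sketch follows, assuming the graph is given as a dictionary from each node to its list of parents; the representation and the example graph are hypothetical.

```python
def markov_blanket(parents, node):
    """Parents, children, and co-parents of `node` in a DAG.

    `parents` maps every node to the list of its parents.
    """
    children = [c for c, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child])  # co-parents share a child with `node`
    blanket.discard(node)
    return blanket


# Hypothetical DAG: a -> c <- b, c -> d <- e, d -> f
dag = {"a": [], "b": [], "e": [], "c": ["a", "b"], "d": ["c", "e"], "f": ["d"]}
blanket_of_c = markov_blanket(dag, "c")
```

For node `c` the blanket contains its parents `a` and `b`, its child `d`, and the co-parent `e`.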

Cliques and Maximal Cliques

Clique

Maximal Clique

Joint Distribution

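• $p(\mathbf{x}) = \frac{1}{Z} \prod_C \psi_C(\mathbf{x}_C)$, where $\psi_C(\mathbf{x}_C)$ is the potential over clique $C$ and

• $Z = \sum_{\mathbf{x}} \prod_C \psi_C(\mathbf{x}_C)$ is the normalization coefficient; note: $M$ $K$-state variables → $K^M$ terms in $Z$.

• Energies and the Boltzmann distribution: $\psi_C(\mathbf{x}_C) = \exp\{-E(\mathbf{x}_C)\}$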

Illustration: Image De-Noising (1)

[Figure: Original Image; Noisy Image]

Illustration: Image De-Noising (2)

Illustration: Image De-Noising (3)

[Figure: Noisy Image; Restored Image (ICM)]
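
Iterated conditional modes (ICM) can be sketched for a binary image with pixels in {-1, +1}. The energy function is not preserved in this transcript, so the code assumes the usual Ising-style form $E(\mathbf{x}, \mathbf{y}) = h \sum_i x_i - \beta \sum_{\{i,j\}} x_i x_j - \eta \sum_i x_i y_i$ over 4-connected neighbors, with assumed coefficients h = 0, β = 1.0, η = 2.1 and a synthetic test image.

```python
import numpy as np


def energy(x, y, h=0.0, beta=1.0, eta=2.1):
    """E(x, y) = h*sum_i x_i - beta*sum_{i~j} x_i x_j - eta*sum_i x_i y_i."""
    pairs = np.sum(x[:, :-1] * x[:, 1:]) + np.sum(x[:-1, :] * x[1:, :])
    return h * np.sum(x) - beta * pairs - eta * np.sum(x * y)


def icm_denoise(y, h=0.0, beta=1.0, eta=2.1, max_sweeps=10):
    """Set each pixel in turn to its lower-energy state, holding the rest fixed."""
    x = y.copy()
    rows, cols = x.shape
    for _ in range(max_sweeps):
        changed = False
        for i in range(rows):
            for j in range(cols):
                neighbors = 0
                if i > 0:
                    neighbors += x[i - 1, j]
                if i < rows - 1:
                    neighbors += x[i + 1, j]
                if j > 0:
                    neighbors += x[i, j - 1]
                if j < cols - 1:
                    neighbors += x[i, j + 1]
                # The terms of E that involve x[i, j] equal x[i, j] * field.
                field = h - beta * neighbors - eta * y[i, j]
                best = -1 if field > 0 else (1 if field < 0 else x[i, j])
                if best != x[i, j]:
                    x[i, j] = best
                    changed = True
        if not changed:  # a full sweep with no flips: local minimum reached
            break
    return x


# Assumed toy image: a +1 square on a -1 background, with about 10% of pixels flipped.
rng = np.random.default_rng(0)
clean = -np.ones((20, 20), dtype=int)
clean[5:15, 5:15] = 1
flip = rng.random(clean.shape) < 0.1
noisy = np.where(flip, -clean, clean)
restored = icm_denoise(noisy)
```

Each pixel update can only lower or preserve the energy, so ICM converges to a local minimum; it is not guaranteed to find the global one.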

Converting Directed to Undirected Graphs (1)

Converting Directed to Undirected Graphs (2)

• Additional links: “marrying parents”, i.e., moralization
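
Moralization can be written in a few lines. The sketch below assumes the directed graph is given as a dictionary from each node to its list of parents (a representation chosen here for illustration) and returns the undirected edges of the moral graph.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edge set of the moral graph of a DAG.

    `parents` maps each node to a list of its parents.
    """
    edges = set()
    for child, ps in parents.items():
        for p in ps:
            edges.add(frozenset((p, child)))  # keep the link, drop its direction
        for p, q in combinations(ps, 2):
            edges.add(frozenset((p, q)))      # "marry" every pair of co-parents
    return edges


# Hypothetical DAG: x1, x2, and x3 are all parents of x4.
dag = {"x1": [], "x2": [], "x3": [], "x4": ["x1", "x2", "x3"]}
moral_edges = moralize(dag)
```

For this example the three parents become pairwise connected, so the moral graph is fully connected over the four nodes.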

Directed vs. Undirected Graphs (2)

Inference on a Chain

With naive summation over the full joint distribution, computational time increases exponentially with N.

Inference on a Chain
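
Passing messages along the chain avoids the exponential cost. The sketch below assumes an undirected chain with pairwise potentials stored as K × K arrays (random positive values, chosen only for the demonstration) and compares the message-passing marginal with a brute-force sum.

```python
import itertools

import numpy as np


def chain_marginal(potentials, n):
    """Marginal p(x_n) on a chain via forward/backward messages, cost O(N K^2).

    `potentials[i]` is a K x K array psi(x_i, x_{i+1}); nodes are 0-indexed.
    """
    k = potentials[0].shape[0]
    alpha = np.ones(k)                   # message arriving from the left
    for psi in potentials[:n]:
        alpha = alpha @ psi              # sum out the node on the left
    beta = np.ones(k)                    # message arriving from the right
    for psi in reversed(potentials[n:]):
        beta = psi @ beta                # sum out the node on the right
    unnormalized = alpha * beta
    return unnormalized / unnormalized.sum()


def brute_force_marginal(potentials, n):
    """Same marginal by summing over all K^N joint configurations."""
    k = potentials[0].shape[0]
    num_nodes = len(potentials) + 1
    marginal = np.zeros(k)
    for states in itertools.product(range(k), repeat=num_nodes):
        weight = 1.0
        for i, psi in enumerate(potentials):
            weight *= psi[states[i], states[i + 1]]
        marginal[states[n]] += weight
    return marginal / marginal.sum()


# Assumed example: N = 6 nodes with K = 3 states each.
rng = np.random.default_rng(0)
potentials = [rng.random((3, 3)) + 0.1 for _ in range(5)]
fast = chain_marginal(potentials, 2)
slow = brute_force_marginal(potentials, 2)
```

Both functions return the same distribution; only the brute-force version has to visit all $K^N$ configurations.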

Supervised Learning

• Supervised learning: learning with examples or labels, e.g., classification and regression

• Linear regression (the example we just gave), generalized linear models (e.g., probit classification), support vector machines, Gaussian process classification, etc.

• Take CS590M (Machine Learning) in fall 2009.

Unsupervised Learning

• Supervised learning: learning with examples or labels, e.g., classification and regression

• Unsupervised learning: learning without examples or labels, e.g., clustering, mixture models, PCA, non-negative matrix factorization

K-means Clustering: Goal

Cost Function

Two Stage Updates

Optimizing Cluster Assignment

Optimizing Cluster Centers

Convergence of Iterative Updates
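
The two-stage updates and their convergence can be sketched as follows. The code assumes the usual cost, the sum of squared distances from each point to its assigned center, with centers initialized at randomly chosen data points; the initialization, the stopping rule, and the synthetic data are illustrative choices.

```python
import numpy as np


def kmeans(x, k, n_iters=100, seed=0):
    """Alternate assignment and center updates; return centers, labels, cost history."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    history = []
    for _ in range(n_iters):
        # Stage 1: assign each point to its nearest center.
        dist2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist2.argmin(axis=1)
        # Stage 2: move each center to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):  # leave the center of an empty cluster alone
                centers[j] = x[labels == j].mean(axis=0)
        cost = float(((x - centers[labels]) ** 2).sum())
        converged = bool(history) and np.isclose(cost, history[-1])
        history.append(cost)
        if converged:
            break
    return centers, labels, history


# Assumed toy data: two well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(5.0, 0.5, (50, 2))])
centers, labels, history = kmeans(data, k=2)
```

Neither stage can increase the cost, so `history` is non-increasing and the loop stops once the cost no longer changes. The result is a local minimum that depends on the initialization.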

Example of K-Means Clustering

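Mixture of Gaussians

• Mixture of Gaussians: $p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$

• Introduce latent variables: a 1-of-$K$ binary vector $\mathbf{z}$ with $p(z_k = 1) = \pi_k$

• Marginal distribution: $p(\mathbf{x}) = \sum_{\mathbf{z}} p(\mathbf{z})\, p(\mathbf{x} \mid \mathbf{z})$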

Conditional Probability

• $\gamma(z_k) = p(z_k = 1 \mid \mathbf{x})$: the responsibility that component $k$ takes for explaining the observation $\mathbf{x}$.

Maximum Likelihood

• Maximize the log likelihood function

Maximum Likelihood Conditions (1)

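• Setting the derivatives of the log likelihood $\ln p(\mathbf{X} \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma})$ to zero: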

Maximum Likelihood Conditions (2)

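• Setting the derivative of the log likelihood $\ln p(\mathbf{X} \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma})$ to zero: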

Maximum Likelihood Conditions (3)

• Lagrange function:

• Setting its derivative to zero and using the normalization constraint, we obtain:

Expectation Maximization for Mixtures of Gaussians

• Although the previous conditions do not provide closed-form solutions, we can use them to construct iterative updates:

• E step: Compute the responsibilities $\gamma(z_{nk})$.

• M step: Compute the new means $\boldsymbol{\mu}_k$, covariances $\boldsymbol{\Sigma}_k$, and mixing coefficients $\pi_k$.

• Loop over the E and M steps until the log likelihood stops increasing.
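
The loop can be sketched for the simplest case. To stay short, the code below assumes one-dimensional data (so each covariance is a scalar variance), initializes the means at randomly chosen data points, and adds no safeguard against a component collapsing onto a single point; these are simplifications made here, not the lecture's implementation.

```python
import numpy as np


def normal_pdf(x, mu, var):
    """Univariate Gaussian density N(x | mu, var)."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def em_gmm_1d(x, k, n_iters=200, tol=1e-8, seed=0):
    """EM for a 1-D mixture of K Gaussians; returns pi, mu, var, log-likelihood trace."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=k, replace=False).astype(float)
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    trace = []
    for _ in range(n_iters):
        # E step: gamma[n, k] = pi_k N(x_n | mu_k, var_k) / sum_j pi_j N(x_n | mu_j, var_j)
        weighted = pi * normal_pdf(x[:, None], mu, var)       # shape (N, K)
        trace.append(float(np.log(weighted.sum(axis=1)).sum()))  # log likelihood so far
        gamma = weighted / weighted.sum(axis=1, keepdims=True)
        # M step: re-estimate the parameters from the responsibilities.
        nk = gamma.sum(axis=0)
        mu = (gamma * x[:, None]).sum(axis=0) / nk
        var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
        if len(trace) > 1 and trace[-1] - trace[-2] < tol:
            break
    return pi, mu, var, trace


# Assumed toy data: two overlapping 1-D Gaussian clusters.
rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(-2.0, 0.5, 150), rng.normal(3.0, 1.0, 150)])
pi, mu, var, trace = em_gmm_1d(data, k=2)
```

The log-likelihood values in `trace` never decrease from one iteration to the next, which is the property the stopping rule relies on.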

Example

• EM on the Old Faithful data set.

General EM Algorithm

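EM as a Lower-Bounding Method

• Goal: maximize $p(\mathbf{X} \mid \boldsymbol{\theta}) = \sum_{\mathbf{Z}} p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\theta})$

• Define: $\mathcal{L}(q, \boldsymbol{\theta}) = \sum_{\mathbf{Z}} q(\mathbf{Z}) \ln \frac{p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\theta})}{q(\mathbf{Z})}$ and $\mathrm{KL}(q \,\|\, p) = -\sum_{\mathbf{Z}} q(\mathbf{Z}) \ln \frac{p(\mathbf{Z} \mid \mathbf{X}, \boldsymbol{\theta})}{q(\mathbf{Z})}$

• We have $\ln p(\mathbf{X} \mid \boldsymbol{\theta}) = \mathcal{L}(q, \boldsymbol{\theta}) + \mathrm{KL}(q \,\|\, p)$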

Lower Bound

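• $\mathcal{L}(q, \boldsymbol{\theta})$ is a functional of the distribution $q(\mathbf{Z})$.

• Since $\mathrm{KL}(q \,\|\, p) \ge 0$ and $\ln p(\mathbf{X} \mid \boldsymbol{\theta}) = \mathcal{L}(q, \boldsymbol{\theta}) + \mathrm{KL}(q \,\|\, p)$,

• $\mathcal{L}(q, \boldsymbol{\theta})$ is a lower bound of the log likelihood function $\ln p(\mathbf{X} \mid \boldsymbol{\theta})$.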
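
The bound can be checked numerically on a tiny model. The sketch below assumes a single observation and a latent variable with three states, with made-up values for the joint $p(x, z \mid \theta)$; it evaluates the lower bound, the KL divergence to the posterior, and the log likelihood for an arbitrary $q$ and for $q$ equal to the posterior.

```python
import math


def lower_bound_and_kl(joint_xz, q):
    """For one observed x: joint_xz[z] = p(x, z | theta) and q[z] = q(z).

    Returns (lower bound L, KL(q || p(z | x, theta)), ln p(x | theta)).
    """
    evidence = sum(joint_xz)                      # p(x | theta) = sum_z p(x, z | theta)
    posterior = [p / evidence for p in joint_xz]  # p(z | x, theta)
    bound = sum(qz * math.log(p / qz) for p, qz in zip(joint_xz, q) if qz > 0)
    kl = -sum(qz * math.log(pz / qz) for pz, qz in zip(posterior, q) if qz > 0)
    return bound, kl, math.log(evidence)


joint_xz = [0.02, 0.10, 0.08]   # hypothetical p(x, z | theta) for z = 0, 1, 2
q_arbitrary = [0.5, 0.3, 0.2]   # any distribution over z
bound, kl, log_evidence = lower_bound_and_kl(joint_xz, q_arbitrary)

# Choosing q(z) = p(z | x, theta) drives the KL term to zero and makes the bound tight.
q_posterior = [p / sum(joint_xz) for p in joint_xz]
tight_bound, tight_kl, _ = lower_bound_and_kl(joint_xz, q_posterior)
```

For the arbitrary `q` the bound lies strictly below the log likelihood and the gap equals the KL term; with the posterior as `q` the gap closes.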

Illustration of Lower Bound

Lower Bound Perspective of EM

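• Expectation Step: maximize the functional lower bound $\mathcal{L}(q, \boldsymbol{\theta})$ over the distribution $q(\mathbf{Z})$.

• Maximization Step: maximize the lower bound $\mathcal{L}(q, \boldsymbol{\theta})$ over the parameters $\boldsymbol{\theta}$.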

Illustration of EM Updates