Bayesian Network
Chao Lan
Recap: Markov Model
- Assume a Markov chain on the observations (a strong assumption in reality).
Recap: Hidden Markov Model
- Assume each observation is generated from one latent discrete variable.
- Assume a Markov chain on the latent variables (a weaker assumption in reality).
Recap: Probability Factorization in HMM
- Model the joint probability of both observed and latent variables.
Recap: Applications of HMM
Today
- Introduction to Bayesian Network
- Probability Factorization based on BN
- Various Forms of BN
- Sampling based on BN
- Conditional Independence (D-separation)
Probabilistic Graphical Model
Represent variable dependency in a graph!
- each node is a variable
- each link specifies a dependency between its connected nodes
- a link can be directed or undirected
- a directed link identifies a parent node and a child node
Example: [1] c depends on (a, b); [2] a depends on nothing; [3] b depends on a.
Why Probabilistic Graphical Model?
- Visualize the structure of a complex probabilistic model.
- Assist the design of, and motivate, new models.
- Provide insights into the probabilistic model (e.g., conditional independence).
Special Probabilistic Graphical Models
- Bayesian Network (directed graphical model): all links have directions.
- Directed Acyclic Graph (DAG): a directed graph with no directed cycles.
Application of Bayesian Network
The diagnosis of a patient can be {tuberculosis, lung cancer, bronchitis}.
This example is from the lecture slides of “Some Applications of Bayesian Networks” by Jiri Vomlel.
Application of Bayesian Network
Initially we know nothing about the patient. Then we learn the patient smokes.
… and complains about dyspnoea
Application of Bayesian Network
evidence: smokes
… and his X-ray is positive
Application of Bayesian Network
evidence: smokes + dyspnoea
… and he visited Asia recently
Application of Bayesian Network
evidence: smokes + dyspnoea + positive X-ray
Factorization based on Bayesian Network
Let a, b, c be three random variables whose dependencies are specified by the graph on the right. How can we factorize p(a,b,c)?
p(a,b,c) = p(a) * p(b|a) * p(c|a,b)
A general factorization form: p(x1, …, xK) = ∏k p(xk | pak), where pak denotes the set of parents of xk in the graph.
Exercise
Factorize the following joint probability based on the Bayesian network on the right.
p(x1, x2, x3, x4, x5, x6, x7) = ?
Solution:
p(x1, x2, x3, x4, x5, x6, x7) = p(x1) * p(x2) * p(x3)
  * p(x4 | x1, x2, x3)
  * p(x5 | x1, x3)
  * p(x6 | x4)
  * p(x7 | x4, x5)
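The factorization above can be checked numerically. Below is a minimal sketch with made-up conditional probability tables (CPTs) for binary x1, …, x7 (the numbers are illustrative assumptions, not from the slides): a product of valid CPTs in the factorized form must sum to 1 over all joint assignments.

```python
import itertools

# Illustrative CPTs for binary variables, matching the factorization
# p(x1) p(x2) p(x3) p(x4|x1,x2,x3) p(x5|x1,x3) p(x6|x4) p(x7|x4,x5).
p_root = {1: 0.6, 2: 0.3, 3: 0.8}             # p(xk = 1) for the roots x1, x2, x3

def bern(p, v):                                # p(x = v) for a Bernoulli(p) variable
    return p if v == 1 else 1 - p

# Arbitrary illustrative conditionals: p(child = 1 | parents)
def p4(x1, x2, x3): return 0.1 + 0.2*x1 + 0.3*x2 + 0.3*x3
def p5(x1, x3):     return 0.2 + 0.3*x1 + 0.4*x3
def p6(x4):         return 0.7 if x4 else 0.2
def p7(x4, x5):     return 0.1 + 0.4*x4 + 0.4*x5

def joint(x1, x2, x3, x4, x5, x6, x7):
    return (bern(p_root[1], x1) * bern(p_root[2], x2) * bern(p_root[3], x3)
            * bern(p4(x1, x2, x3), x4)
            * bern(p5(x1, x3), x5)
            * bern(p6(x4), x6)
            * bern(p7(x4, x5), x7))

# A valid factorization must sum to 1 over all 2^7 joint assignments.
total = sum(joint(*xs) for xs in itertools.product([0, 1], repeat=7))
print(round(total, 10))  # → 1.0
```

Any choice of valid CPTs would pass this check; the point is that the factorized product is itself a proper joint distribution.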
Bayesian Networks: Various Forms
Consider a simple regression model:
- a set of observations x1, x2, …, xn
- a model with an unknown parameter w
- the model input is an observed xi; its output is the unknown label ti
- assume ti | xi, w, σ ~ N(w·xi, σ²)
- assume w ~ N(0, α⁻¹)
[graph: w → ti ← xi]
Example:
- xi is the sensor signal of individual i
- ti is the activity of individual i
- the model predicts activity from the sensor signal
We are mainly interested in the joint distribution of the unknown variables t and w (where t = [t1, …, tn]). By the network's factorization,
p(t, w) = p(w) * ∏i p(ti | xi, w, σ)
We can use a more compact graph (plate notation) instead of drawing every xi and ti separately.
Now add the other (deterministic) parameters σ and α to the graph.
Since t is observed during learning, shade its node (draw it solid).
Given a new observation, predict its label.
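The generative story of this regression BN can be sketched as ancestral sampling: first draw the parent w from its prior, then draw each child ti given its parents xi and w. The hyperparameter values below are assumptions for illustration.

```python
import random

# Sketch of ancestral sampling from the regression BN above
# (alpha and sigma are assumed values): first draw w ~ N(0, 1/alpha),
# then draw each ti ~ N(w * xi, sigma^2).
random.seed(0)
alpha, sigma = 2.0, 0.5                        # deterministic parameters
xs = [0.1 * i for i in range(10)]              # observed inputs x1..xn

w = random.gauss(0.0, (1.0 / alpha) ** 0.5)    # sample the parent node w
ts = [random.gauss(w * x, sigma) for x in xs]  # then sample each child ti

print(len(ts))  # one sampled label per observation
```

Sampling order matters: w must be drawn before any ti, because the ti nodes are its children in the graph.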
Sampling based on Bayesian Network
Sampling is the process of drawing examples from a probability distribution.
Suppose the outcome of a coin flip is X ~ Bernoulli(0.4), i.e., H occurs with probability 0.4 and T with probability 0.6. We can sample X by actually flipping the coin:
- x1 = H (1st example)
- x2 = T (2nd example)
- x3 = T (3rd example)
- x4 = ...
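Instead of flipping a physical coin, the same sampling can be sketched in software: draw u ~ Uniform(0, 1) and report H whenever u < 0.4.

```python
import random

# Minimal sketch of sampling X ~ Bernoulli(0.4):
# draw a uniform number in [0, 1) and report H when it falls below 0.4.
random.seed(0)

def flip(p_heads=0.4):
    return 'H' if random.random() < p_heads else 'T'

samples = [flip() for _ in range(10000)]
frac_heads = samples.count('H') / len(samples)
print(round(frac_heads, 2))  # close to 0.4 for a large number of flips
```

By the law of large numbers, the empirical fraction of H approaches 0.4 as the number of samples grows.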
Sampling based on Bayesian Network
The graph can guide the data sampling process. E.g., to sample a point from p(x4 | pa4):
- first identify parent set pa4 = {x1, x2, x3}
- x1 has no parent, sample a point s1 from p(x1)
- x2 has no parent, sample a point s2 from p(x2)
- x3 has no parent, sample a point s3 from p(x3)
- sample a point from p(x4|x1=s1,x2=s2,x3=s3)
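The four steps above can be sketched directly, using made-up CPTs for binary variables (the probabilities are illustrative assumptions): sample each parentless node first, then sample x4 from its conditional given the sampled parent values.

```python
import random

# Ancestral sampling of x4 in the network above, with assumed CPTs.
random.seed(1)

def bern(p):
    return 1 if random.random() < p else 0

def sample_x4():
    s1 = bern(0.6)                       # x1 has no parent: sample from p(x1)
    s2 = bern(0.3)                       # x2 has no parent: sample from p(x2)
    s3 = bern(0.8)                       # x3 has no parent: sample from p(x3)
    p = 0.1 + 0.2*s1 + 0.3*s2 + 0.3*s3   # p(x4 = 1 | x1=s1, x2=s2, x3=s3)
    return bern(p)                       # sample from p(x4 | x1=s1, x2=s2, x3=s3)

print(sample_x4())  # 0 or 1
```

The same recursion answers the exercise for x7: sample every ancestor of x7 in topological order, then sample x7 from its conditional.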
Exercise
How to sample a point from p(x7|pa7)?
Sampling based on Bayesian Network
How to sample a point from the marginal distribution p(x2, x4)?
- sample a full point (s1, …, s7) from the joint distribution p(x1, …, x7)
- keep the values of x2 and x4 and discard the rest
Conditional Independence in BN
The Bayesian network can tell us whether two variables are conditionally independent.
Two random variables x, y are independent conditioned on z if
p(x, y | z) = p(x | z) p(y | z)
The graph on the right instructs the factorization
p(a, b | c) = p(a | c) p(b | c)
This implies that a and b are conditionally independent given c.
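This can be verified numerically for a tail-to-tail graph a ← c → b, with made-up CPTs (the numbers are illustrative assumptions): a and b are dependent marginally, but independent once we condition on c.

```python
import itertools

# Tail-to-tail structure a <- c -> b with assumed CPTs:
# the joint factorizes as p(a,b,c) = p(c) p(a|c) p(b|c).
pc = {0: 0.4, 1: 0.6}
pa_c = {0: 0.9, 1: 0.2}   # p(a=1 | c)
pb_c = {0: 0.7, 1: 0.1}   # p(b=1 | c)

def bern(p, v): return p if v == 1 else 1 - p

def joint(a, b, c):
    return pc[c] * bern(pa_c[c], a) * bern(pb_c[c], b)

# marginals over c
p_ab = {(a, b): sum(joint(a, b, c) for c in (0, 1)) for a in (0, 1) for b in (0, 1)}
p_a = {a: p_ab[(a, 0)] + p_ab[(a, 1)] for a in (0, 1)}
p_b = {b: p_ab[(0, b)] + p_ab[(1, b)] for b in (0, 1)}

# marginally dependent: p(a,b) != p(a) p(b)
print(abs(p_ab[(1, 1)] - p_a[1] * p_b[1]) > 1e-9)  # True

# conditionally independent: p(a,b|c) == p(a|c) p(b|c) for every a, b, c
ok = all(abs(joint(a, b, c) / pc[c] - bern(pa_c[c], a) * bern(pb_c[c], b)) < 1e-12
         for a, b, c in itertools.product((0, 1), repeat=3))
print(ok)  # True
```

Intuitively, c is a common cause of a and b; once its value is known, a carries no further information about b.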
D-Separation
Sometimes it is not obvious whether two variables are conditionally independent. In such cases, we can use the d-separation criterion.
D-Separation
Rule 1 (Unconditional Separation)
- x and y are d-connected if there is an unblocked path between them
Rule 2 (Blocking by Conditioning)
- x and y are d-connected, conditioned on a set Z of nodes, if there is a collider-free path between x and y that traverses no member of Z.
Rule 3 (Conditioning on Colliders)
- If a collider is a member of the conditioning set Z, or has a descendant in Z, then it no longer blocks any path that traces this collider.
Rule 1: Unconditional Separation
Rule 1: x and y are d-connected if there is an unblocked path between them.
- a path is any consecutive sequence of links, regardless of their directions
- a path is unblocked if it contains no collider (a node where two arrows meet head-to-head)
- otherwise (if every path between x and y is blocked), we say x and y are d-separated
Three Types of Connection Node
- Serial (head-to-tail)
- Diverging (tail-to-tail)
- Converging (head-to-head; a collider)
Rule 1: Unconditional Separation
Rule 1: x and y are d-connected if there is an unblocked path between them.
- Q: which pairs of variables in the graph on the right are d-connected?
Rule 2: Blocking by Conditioning
Rule 2: x and y are d-connected, conditioned on a set Z of nodes, if there is a collider-free path between x and y that traverses no member of Z.
- let Z = {r, v}. Which pairs of variables are d-connected conditioned on Z?
Rule 3: Conditioning on Colliders
Rule 3: If a collider is a member of the conditioning set Z, or has a descendant in Z, then it no longer blocks any path that traces this collider.
- let Z = {r, p}. Which pairs of variables are d-connected conditioned on Z?
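Rule 3's effect ("explaining away") can be checked numerically on the collider a → c ← b, again with made-up CPTs as assumptions: a and b are independent marginally, yet become dependent once we condition on the collider c.

```python
import itertools

# Collider structure a -> c <- b with assumed CPTs:
# the joint factorizes as p(a,b,c) = p(a) p(b) p(c|a,b).
pa, pb = 0.3, 0.5
def pc_ab(a, b): return 0.1 + 0.4*a + 0.4*b   # p(c=1 | a, b)
def bern(p, v): return p if v == 1 else 1 - p

def joint(a, b, c):
    return bern(pa, a) * bern(pb, b) * bern(pc_ab(a, b), c)

# p(a,b | c=1), p(a | c=1), p(b | c=1), all computed from the joint
pc1 = sum(joint(a, b, 1) for a, b in itertools.product((0, 1), repeat=2))
p_ab_c1 = joint(1, 1, 1) / pc1
p_a_c1 = sum(joint(1, b, 1) for b in (0, 1)) / pc1
p_b_c1 = sum(joint(a, 1, 1) for a in (0, 1)) / pc1

print(abs(p_ab_c1 - p_a_c1 * p_b_c1) > 1e-9)  # True: dependent given c
```

This is why conditioning on a collider (or its descendant) opens, rather than blocks, a path: observing a common effect makes its independent causes informative about each other.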
Today
- Introduction to Bayesian Network
- Probability Factorization based on BN
- Various Forms of BN
- Sampling based on BN
- Conditional Independence (D-separation)