Page 1: Graphical Models of Probability for Causal Reasoning

Thursday 07 November 2002

(revised 09 December 2003)

William H. Hsu

Laboratory for Knowledge Discovery in Databases

Department of Computing and Information Sciences

Kansas State University

http://www.kddresearch.org

This presentation is available at:

http://www.kddresearch.org/KSU/CIS/BN-Math-20021107.ppt

KSU Math Department Colloquium

Page 2: Overview

• Graphical Models of Probability
– Markov graphs
– Bayesian (belief) networks
– Causal semantics
– Direction-dependent separation (d-separation) property

• Learning and Reasoning: Problems, Algorithms
– Inference: exact and approximate
• Junction tree – Lauritzen and Spiegelhalter (1988)
• (Bounded) loop cutset conditioning – Horvitz and Cooper (1989)
• Variable elimination – Dechter (1996)
– Structure learning
• K2 algorithm – Cooper and Herskovits (1992)
• Variable ordering problem – Larrañaga (1996), Hsu et al. (2002)

• Probabilistic Reasoning in Machine Learning, Data Mining
• Current Research and Open Problems

Page 3: Stages of Data Mining and Knowledge Discovery in Databases

Adapted from Fayyad, Piatetsky-Shapiro, and Smyth (1996)

Page 4: Graphical Models Overview [1]: Bayesian Networks

P(20s, Female, Low, Non-Smoker, No-Cancer, Negative, Negative)
= P(20s) · P(Female) · P(Low | 20s) · P(Non-Smoker | 20s, Female) · P(No-Cancer | Low, Non-Smoker) · P(Negative | No-Cancer) · P(Negative | No-Cancer)

• Conditional Independence
– X is conditionally independent (CI) of Y given Z (sometimes written X ⫫ Y | Z) iff

P(X | Y, Z) = P(X | Z) for all values of X, Y, and Z

– Example: P(Thunder | Rain, Lightning) = P(Thunder | Lightning), i.e., Thunder ⫫ Rain | Lightning

• Bayesian (Belief) Network
– Acyclic directed graph model B = (V, E, Θ) representing CI assertions over a set of random variables
– Vertices (nodes) V: denote events (each a random variable)

– Edges (arcs, links) E: denote conditional dependencies

• Markov Condition for BBNs (Chain Rule):

P(X1, X2, …, Xn) = ∏_{i=1..n} P(Xi | parents(Xi))

• Example BBN (a code sketch of this factorization follows the figure below)

[Figure: example BBN with X1 = Age, X2 = Gender, X3 = Exposure-To-Toxins, X4 = Smoking, X5 = Cancer, X6 = Serum Calcium, X7 = Lung Tumor; the diagram highlights the parents, descendants, and non-descendants of the Cancer node.]
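The factorization above is mechanical once the CPTs are fixed. A minimal Python sketch for the example network (the variable names and CPT encoding are illustrative assumptions, not from the slides):

parents = {
    "Age": (), "Gender": (),
    "Exposure": ("Age",),
    "Smoking": ("Age", "Gender"),
    "Cancer": ("Exposure", "Smoking"),
    "SerumCalcium": ("Cancer",),
    "LungTumor": ("Cancer",),
}

def joint(assignment, cpt):
    """P(x1, ..., xn) = product over i of P(xi | parents(xi)).
    cpt[var] maps (value, parent value, ...) tuples to probabilities."""
    p = 1.0
    for var, pas in parents.items():
        key = (assignment[var],) + tuple(assignment[pa] for pa in pas)
        p *= cpt[var][key]
    return p

Any joint entry, such as the P(20s, Female, …) example above, is then a product of seven CPT lookups.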

Page 5: Graphical Models Overview [2]: Markov Blankets and d-Separation Property

[Figure: the three canonical cases of a path from X to Y through an intermediate node Z, relative to an evidence set E — (1) serial, (2) diverging, (3) converging connections.]

From S. Russell & P. Norvig (1995)

Adapted from J. Schlabach (1996)

Motivation: The conditional independence status of nodes within a BBN might change as the availability of evidence E changes. Direction-dependent separation (d-separation) is a technique used to determine conditional independence of nodes as evidence changes.

Definition: A set of evidence nodes E d-separates two sets of nodes X and Y if every undirected path from a node in X to a node in Y is blocked given E.

A path is blocked given E if one of three conditions holds for some node Z on the path:

(1) Z is in E and the path passes through Z in a serial (head-to-tail) connection;

(2) Z is in E and the path passes through Z in a diverging (tail-to-tail) connection;

(3) the path passes through Z in a converging (head-to-head) connection, and neither Z nor any descendant of Z is in E.
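These blocking conditions translate directly into a path-checking procedure. A sketch in Python (an assumption-laden illustration: it enumerates all undirected simple paths, so it suits only small example networks):

def d_separated(children, x, y, evidence):
    """True iff every undirected path from x to y is blocked by the
    set `evidence`. `children` maps each node to a list of its children."""
    nodes = set(children) | {c for cs in children.values() for c in cs}
    parents = {v: set() for v in nodes}
    for u in children:
        for c in children[u]:
            parents[c].add(u)

    def desc(v):                       # all descendants of v
        out, stack = set(), [v]
        while stack:
            u = stack.pop()
            for c in children.get(u, ()):
                if c not in out:
                    out.add(c)
                    stack.append(c)
        return out

    def blocked(path):
        for i in range(1, len(path) - 1):
            a, z, b = path[i - 1], path[i], path[i + 1]
            if a in parents[z] and b in parents[z]:        # converging, case (3)
                if z not in evidence and not (desc(z) & evidence):
                    return True
            elif z in evidence:                            # serial/diverging, cases (1)-(2)
                return True
        return False

    def paths(u, visited):             # all simple undirected paths to y
        if u == y:
            yield list(visited)
            return
        for n in set(children.get(u, ())) | parents[u]:
            if n not in visited:
                visited.append(n)
                yield from paths(n, visited)
                visited.pop()

    return all(blocked(p) for p in paths(x, [x]))

# e.g. children = {"Cancer": ["SerumCalcium", "LungTumor"]}
# d_separated(children, "SerumCalcium", "LungTumor", {"Cancer"})  ->  True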

Page 6: Graphical Models Overview [3]: Inference Problem

Adapted from slides by S. Russell, UC Berkeley http://aima.cs.berkeley.edu/

Multiply-connected case: exact and approximate inference are #P-complete

Page 7: Other Topics in Graphical Models [1]: Temporal Probabilistic Reasoning

• Goal: Estimate P(Xt | y1, …, yr), the state distribution at time t given the observations through time r

• Filtering: r = t

– Intuition: infer current state from observations

– Applications: signal identification

– Variation: Viterbi algorithm

• Prediction: r < t

– Intuition: infer future state

– Applications: prognostics

• Smoothing: r > t

– Intuition: infer past hidden state

– Applications: signal enhancement

• CF Tasks

– Plan recognition by smoothing

– Prediction cf. WebCANVAS – Cadez et al. (2000)


Adapted from Murphy (2001), Guo (2002)
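For the filtering case (r = t) on a discrete hidden Markov model, the standard forward recursion alternates a one-step prediction with conditioning on each new observation. A minimal numpy sketch (the matrix conventions are assumptions for illustration):

import numpy as np

def forward_filter(T, O, prior, observations):
    """P(X_t | y_1..t) for a discrete HMM.
    T[i, j] = P(X_t = j | X_{t-1} = i); O[j, y] = P(y | X_t = j);
    observations is a sequence of integer observation indices."""
    belief = prior.copy()
    for y in observations:
        belief = O[:, y] * (T.T @ belief)   # predict, then weight by evidence
        belief /= belief.sum()              # normalize
    return belief

Smoothing (r > t) adds a symmetric backward pass, and the Viterbi variation replaces the sums with maximizations.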

Page 8: Other Topics in Graphical Models [2]: Learning Structure from Data

• General-Case BBN Structure Learning: Use Inference to Compute Scores

• Optimal Strategy: Bayesian Model Averaging

– Assumption: models h ∈ H are mutually exclusive and exhaustive

– Combine predictions of models in proportion to marginal likelihood

• Compute conditional probability of hypothesis h given observed data D

• i.e., compute expectation over unknown h for unseen cases

• Let h ≡ structure, Θ ≡ parameters (CPTs)

P(x_{m+1} | D) = P(x_{m+1} | x1, x2, …, xm) = Σ_{h ∈ H} P(x_{m+1} | D, h) · P(h | D)

P(h | D) ∝ P(D | h) · P(h)
[Posterior Score ∝ Marginal Likelihood × Prior over Structures]

P(D | h) = ∫ P(D | Θ, h) · P(Θ | h) dΘ
[Marginal Likelihood = ∫ Likelihood × Prior over Parameters]
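The averaging step itself is short. A sketch, assuming each candidate structure exposes a prior, a marginal-likelihood routine, and a predictive routine (hypothetical callables standing in for real structure scores):

import numpy as np

def bma_predict(models, data, x_new):
    """P(x_new | D) = sum over h of P(x_new | D, h) P(h | D),
    with P(h | D) proportional to P(D | h) P(h).
    models: list of (prior, marginal_likelihood, predict) triples."""
    w = np.array([prior * ml(data) for prior, ml, _ in models])
    w = w / w.sum()                           # posterior model weights
    return sum(wi * pred(x_new, data)
               for wi, (_, _, pred) in zip(w, models))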


Page 9: Propagation Algorithm in Singly-Connected Bayesian Networks – Pearl (1983)

[Figure: singly-connected network of nodes C1–C6. Upward (child-to-parent) λ messages: λ′(Ci) is modified during the message-passing phase. Downward (parent-to-child) messages: P′(Ci) is computed during the message-passing phase.]

Adapted from Neapolitan (1990), Guo (2000)

Multiply-connected case: exact and approximate inference are #P-complete

(the counting problem is #P-complete iff the corresponding decision problem is NP-complete)
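For intuition, a minimal sketch of the λ-π combination at a single node X with one parent U and one child (a chain fragment only, not the full polytree algorithm):

import numpy as np

def node_belief(pi_parent, cpt, lam_child):
    """cpt[u, x] = P(X = x | U = u). π(x) = Σ_u P(x|u) π_U(u);
    BEL(x) ∝ π(x) λ(x), with λ(x) the diagnostic message from the child."""
    pi_x = cpt.T @ pi_parent        # predictive (causal) support from parent
    bel = pi_x * lam_child          # combine with diagnostic support
    return bel / bel.sum()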

Page 10: Inference by Clustering [1]: Graph Operations (Moralization, Triangulation, Maximal Cliques)

Adapted from Neapolitan (1990), Guo (2000)

[Figure: an eight-node Bayesian network (acyclic digraph) over A–H is first moralized, then triangulated under the node ordering A1, B2, E3, C4, G5, F6, H7, D8, and finally its maximal cliques are found: Clq1 = {A, B}, Clq2 = {B, E, C}, Clq3 = {E, C, G}, Clq4 = {E, G, F}, Clq5 = {C, G, H}, Clq6 = {C, D}.]
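Of the three operations, moralization is the simplest to state in code: connect ("marry") all co-parents of each node, then drop edge directions. A sketch, assuming the DAG is given as a child-list dictionary:

def moralize(children):
    """Return the moral graph as a dict: node -> set of neighbors."""
    nodes = set(children) | {c for cs in children.values() for c in cs}
    und = {v: set() for v in nodes}
    parents = {v: set() for v in nodes}
    for u, cs in children.items():
        for c in cs:
            und[u].add(c)
            und[c].add(u)
            parents[c].add(u)
    for v in nodes:                  # marry every pair of co-parents
        ps = list(parents[v])
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                und[ps[i]].add(ps[j])
                und[ps[j]].add(ps[i])
    return und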

Page 11: Inference by Clustering [2]: Junction Tree – Lauritzen & Spiegelhalter (1988)

Input: list of cliques of triangulated, moralized graph Gu

Output:

Tree of cliques

Separator nodes Si, residual nodes Ri, and potential probability ψ(Clqi) for all cliques

Algorithm:

1. Si = Clqi ∩ (Clq1 ∪ Clq2 ∪ … ∪ Clqi−1)

2. Ri = Clqi − Si

3. If i > 1, identify a j < i such that Clqj is a parent of Clqi (i.e., Si ⊆ Clqj)

4. Assign each node v to a unique clique Clqi such that {v} ∪ c(v) ⊆ Clqi, where c(v) denotes the parents of v

5. Compute ψ(Clqi) = ∏ P(v | c(v)) over all v assigned to Clqi (1 if no v is assigned to Clqi)

6. Store Clqi, Ri, Si, and ψ(Clqi) at each vertex in the tree of cliques

Adapted from Neapolitan (1990), Guo (2000)
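Steps 1-2 depend only on the clique ordering. A sketch (assuming the cliques are listed in a rank order consistent with the triangulation):

def separators_residuals(cliques):
    """cliques: list of sets. Returns (S, R) with
    S[i] = Clq_i ∩ (Clq_1 ∪ ... ∪ Clq_{i-1}) and R[i] = Clq_i − S[i]."""
    seen, S, R = set(), [], []
    for clq in cliques:
        s = clq & seen
        S.append(s)
        R.append(clq - s)
        seen |= clq
    return S, R

# With the cliques from this slide,
# [{"A","B"}, {"B","E","C"}, {"E","C","G"}, {"E","G","F"}, {"C","G","H"}, {"C","D"}],
# this reproduces the Si and Ri shown on the next slide.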

Page 12: Inference by Clustering [3]: Clique-Tree Operations

[Figure: clique tree for the example network, with separator sets on the edges: Clq1 —{B}— Clq2 —{E,C}— Clq3, which connects to Clq4 via {E,G} and to Clq5 via {C,G}; Clq6 attaches via separator {C}.]

Clq1 = {A, B}, R1 = {A, B}, S1 = {}, ψ(Clq1) = P(B|A) P(A)
Clq2 = {B, E, C}, R2 = {C, E}, S2 = {B}, ψ(Clq2) = P(C|B,E)
Clq3 = {E, C, G}, R3 = {G}, S3 = {E, C}, ψ(Clq3) = 1
Clq4 = {E, G, F}, R4 = {F}, S4 = {E, G}, ψ(Clq4) = P(E|F) P(G|F) P(F)
Clq5 = {C, G, H}, R5 = {H}, S5 = {C, G}, ψ(Clq5) = P(H|C,G)
Clq6 = {C, D}, R6 = {D}, S6 = {C}, ψ(Clq6) = P(D|C)

Ri: residual nodes; Si: separator nodes; ψ(Clqi): potential probability of clique i

Adapted from Neapolitan (1990), Guo (2000)

Page 13: Inference by Loop Cutset Conditioning

Split vertex in undirected cycle;

condition upon each of its state values

Number of network instantiations: product of the arities of the nodes in the minimal loop cutset (a conditioning-loop sketch follows below)

Posterior: marginal conditioned upon cutset variable values

[Figure: the cutset-conditioned network. The Age node X1 is split into one instantiation per state value: X1,1 with Age = [0, 10), X1,2 with Age = [10, 20), …, X1,10 with Age = [100, ∞). The remaining nodes are X2 = Gender, X3 = Exposure-To-Toxins, X4 = Smoking, X5 = Cancer, X6 = Serum Calcium, X7 = Lung Tumor.]

• Deciding Optimal Cutset: NP-hard

• Current Open Problems
– Bounded cutset conditioning: ordering heuristics
– Finding randomized algorithms for loop cutset optimization
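The conditioning loop itself is short. A sketch, assuming hypothetical callables weight(c) = P(c, e) and query(c) = P(X | e, c), each computed by a polytree inference routine on the network split at the cutset:

from itertools import product

def cutset_condition(cutset_domains, weight, query):
    """Posterior P(X | e) = Σ_c P(X | e, c) P(c | e).
    cutset_domains: dict mapping each cutset variable to its state values;
    the loop runs once per instantiation (product of the arities)."""
    total, mix = 0.0, 0.0
    for values in product(*cutset_domains.values()):
        c = dict(zip(cutset_domains.keys(), values))
        w = weight(c)            # P(c, e)
        total += w
        mix += w * query(c)      # accumulate P(X | e, c) P(c, e)
    return mix / total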

Page 14: Inference by Variable Elimination [1]: Intuition

Adapted from slides by S. Russell, UC Berkeley http://aima.cs.berkeley.edu/

Page 15: Inference by Variable Elimination [2]: Factoring Operations

Adapted from slides by S. Russell, UC Berkeley http://aima.cs.berkeley.edu/

Page 16: Inference by Variable Elimination [3]: Example

[Figure: Bayesian network with A = Season, B = Sprinkler, C = Rain, D = Manual Watering, F = Wet, G = Slippery.]

Query: P(A | G = 1) = ?

Elimination ordering: d = <A, C, B, F, D, G>

Initial factors: P(A), P(B|A), P(C|A), P(D|B,A), P(F|B,C), P(G|F)

Buckets are processed in the reverse of d (G, D, F, B, C); eliminating G under the evidence G = 1 replaces P(G|F) with the message λG(F) = ΣG=1 P(G|F). A code sketch of these factor operations follows.

Adapted from Dechter (1996), Joehanes (2002)
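A compact sketch of the factor operations behind this example (binary variables assumed; evidence such as G = 1 is applied to the factor tables before elimination):

from itertools import product

class Factor:
    def __init__(self, vars, table):
        self.vars = list(vars)    # variable names, in order
        self.table = table        # dict: assignment tuple -> probability

def multiply(a, b):
    vs = list(dict.fromkeys(a.vars + b.vars))
    table = {}
    for asg in product((0, 1), repeat=len(vs)):
        env = dict(zip(vs, asg))
        table[asg] = (a.table[tuple(env[v] for v in a.vars)] *
                      b.table[tuple(env[v] for v in b.vars)])
    return Factor(vs, table)

def sum_out(f, var):
    vs = [v for v in f.vars if v != var]
    table = {}
    for asg, p in f.table.items():
        key = tuple(x for v, x in zip(f.vars, asg) if v != var)
        table[key] = table.get(key, 0.0) + p
    return Factor(vs, table)

def eliminate(factors, order):
    """Bucket elimination: sum out each variable in `order` in turn;
    the result is unnormalized P(remaining variables, evidence)."""
    for var in order:
        bucket = [f for f in factors if var in f.vars]
        if not bucket:
            continue
        factors = [f for f in factors if var not in f.vars]
        prod = bucket[0]
        for f in bucket[1:]:
            prod = multiply(prod, f)
        factors.append(sum_out(prod, var))
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# For P(A | G = 1): restrict P(G|F) to G = 1 (yielding λG(F)), then call
# eliminate(factors, ["D", "F", "B", "C"]) and normalize the result over A.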

Page 17: Genetic Algorithms for Parameter Tuning in Bayesian Network Structure Learning

[Figure: genetic wrapper for change of representation and inductive bias control. Module [1], a genetic algorithm, proposes a candidate representation α. Module [2], a representation evaluator for learning problems, takes the training data D and an inference specification, splits D into Dtrain (inductive learning) and Dval (inference), and returns the representation fitness f(α). The loop outputs an optimized representation α̂.]
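A sketch of the wrapper loop, assuming candidate representations α are variable orderings (as in the variable ordering problem cited earlier) and a hypothetical fitness function that learns on Dtrain and scores inferential loss on Dval:

import random

def mutate(order, rate):
    """Swap two positions of a candidate variable ordering."""
    order = list(order)
    if random.random() < rate and len(order) > 1:
        i, j = random.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order

def genetic_wrapper(population, fitness, generations=50, rate=0.2):
    """Keep the fitter half each generation; refill by mutating elites."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[: max(1, len(ranked) // 2)]
        population = elite + [mutate(random.choice(elite), rate)
                              for _ in range(len(ranked) - len(elite))]
    return max(population, key=fitness)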

Page 18: Computational Genomics and Microarray Gene Expression Modeling

[Figure: microarray experiment pipeline. Treatment 1 (control) and Treatment 2 (pathogen) yield messenger RNA (mRNA) extracts 1 and 2; each extract is reverse-transcribed to cDNA, which is read out on a DNA hybridization microarray (under LASER).]

Adapted from Friedman et al. (2000) http://www.cs.huji.ac.il/labs/compbio/

[Figure: gene-expression modeling loop. From the data D (user and microarray), module [A] performs structure learning of a graph G = (V, E) over genes G1–G5; module [B] performs parameter estimation to produce B = (V, E, Θ); Dval supports model validation by inference, and the specification fitness (inferential loss) feeds back into the learning environment.]

Page 19: DESCRIBER: An Experimental Intelligent Filter

[Figure: DESCRIBER architecture. Domain-specific workflow repositories hold workflows (transactional, objective views) and workflow components (data sources, transformations, and other services), indexed by a data entity, service, and component repository index for bioinformatics experimental research. Learning over workflow instances and use cases (historical user requirements) produces decision support models; use case and query/evaluation data drive a personalized interface providing domain-specific collaborative recommendation. Users of the scientific workflow repository submit queries and evaluations through interface(s) to the distributed repository.]

Example Queries:

• What experiments have found cell cycle-regulated metabolic pathways in Saccharomyces?

• What codes and microarray data were used? How and why?

Page 20: Relational Graphical Models in DESCRIBER

[Figure: relational graphical models (RGMs) in DESCRIBER. Module 1 is the collaborative recommendation front-end (personalized interface), exchanging user queries and recommendations/evaluations (before and after use) with users. Module 2 performs learning and validation of RGMs for experimental workflows and components, taking training data (structure and data) from workflow logs, instances, templates, and components (services, data sources) and producing complete, data-oriented RGMs of workflows. Module 3 estimates RGM parameters from the workflow and component database. Module 4 performs learning and validation of RGMs for user requirements, producing complete RGMs of user queries. Module 5 estimates RGM parameters from user query data.]

Page 21: Tools for Building Graphical Models

• Commercial Tools: Ergo, Netica, TETRAD, Hugin
• Bayes Net Toolbox (BNT) – Murphy (1997-present)

– Distribution page http://http.cs.berkeley.edu/~murphyk/Bayes/bnt.html

– Development group http://groups.yahoo.com/group/BayesNetToolbox

• Bayesian Network tools in Java (BNJ) – Hsu et al. (1999-present)
– Distribution page

http://bndev.sourceforge.net

– Development group http://groups.yahoo.com/group/bndev

– Current (re)implementation projects for KSU KDD Lab

• Continuous state: Minka (2002) – Hsu, Guo, Perry, Boddhireddy

• Formats: XML BNIF (MSBN), Netica – Guo, Hsu

• Space-efficient DBN inference – Joehanes

• Bounded cutset conditioning – Chandak

Page 22: References [1]: Graphical Models and Inference Algorithms

• Graphical Models
– Bayesian (Belief) Networks tutorial – Murphy (2001): http://www.cs.berkeley.edu/~murphyk/Bayes/bayes.html
– Learning Bayesian Networks – Heckerman (1996, 1999): http://research.microsoft.com/~heckerman

• Inference Algorithms
– Junction Tree (Join Tree, L-S, Hugin): Lauritzen & Spiegelhalter (1988): http://citeseer.nj.nec.com/huang94inference.html
– (Bounded) Loop Cutset Conditioning: Horvitz & Cooper (1989): http://citeseer.nj.nec.com/shachter94global.html
– Variable Elimination (Bucket Elimination, ElimBel): Dechter (1996): http://citeseer.nj.nec.com/dechter96bucket.html
– Recommended Books
• Neapolitan (1990) – out of print; see Pearl (1988), Jensen (2001)
• Castillo, Gutierrez, Hadi (1997)
• Cowell, Dawid, Lauritzen, Spiegelhalter (1999)
– Stochastic Approximation: http://citeseer.nj.nec.com/cheng00aisbn.html

Page 23: References [2]: Machine Learning, KDD, and Bioinformatics

• Machine Learning, Data Mining, and Knowledge Discovery
– K-State KDD Lab: literature survey and resource catalog (2002): http://www.kddresearch.org/Resources
– Bayesian Network tools in Java (BNJ): Hsu, Guo, Joehanes, Perry, Thornton (2002): http://bndev.sourceforge.net
– Machine Learning in Java (MLJ): Hsu, Louis, Plummer (2002): http://mldev.sourceforge.net
– NCSA Data to Knowledge (D2K): Welge, Redman, Auvil, Tcheng, Hsu: http://alg.ncsa.uiuc.edu

• Bioinformatics
– European Bioinformatics Institute Tutorial: Brazma et al. (2001): http://www.ebi.ac.uk/microarray/biology_intro.htm
– Hebrew University: Friedman, Pe'er, et al. (1999, 2000, 2002): http://www.cs.huji.ac.il/labs/compbio/
– K-State BMI Group: literature survey and resource catalog (2002): http://www.kddresearch.org/Groups/Bioinformatics

Page 24: Acknowledgements

• Kansas State University Lab for Knowledge Discovery in Databases
– Graduate research assistants: Haipeng Guo, Roby Joehanes
– Other grad students: Prashanth Boddhireddy, Siddharth Chandak, Ben B. Perry, Rengakrishnan Subramanian
– Undergraduate programmers: James W. Plummer, Julie A. Thornton

• Joint Work with
– KSU Bioinformatics and Medical Informatics (BMI) group: Sanjoy Das (EECE), Judith L. Roe (Biology), Stephen M. Welch (Agronomy)
– KSU Microarray group: Scot Hulbert (Plant Pathology), J. Clare Nelson (Plant Pathology), Jan Leach (Plant Pathology)
– Kansas Geological Survey, Kansas Biological Survey, KU EECS

• Other Research Partners
– NCSA Automated Learning Group (Michael Welge, Tom Redman, David Clutter, Lisa Gatzke)
– The Institute for Genomic Research (John Quackenbush, Alex Saeed)
– University of Manchester (Carole Goble, Robert Stevens)
– International Rice Research Institute (Richard Bruskiewich)