Inference in DBNs with non-disjoint clusters · perso.crans.org/~genest/CFF.pdf · 2015. 10. 1.

Inference in DBNs with non-disjoint clusters

Matthieu Pichené

Introduction

[Figure: overview of the method. The biological system (apoptosis pathway, Mcl1) is turned into a mathematical formalism by an (approximate) abstraction of the low-level biochemical model, followed by simulations and analysis.]

DBNs

S + E <—> ES —> P + E (rate constants k+1, k-1, k+2)

[Figure: the species S, ES, E and P at time points t0, t1, t2, t3; each species takes a value in {1, 2, 3, 4, 5}.]

Every species at time point t is a random variable over a discrete number of values.

Number of configurations at each time point: $\mathrm{Values}^{\mathrm{Species}}$

DBNs

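S + E <—> ES —> P + E (rate constants k+1, k-1, k+2)

[Figure: the species S, ES, E and P at time points t0, t1, t2, t3, with the dependency graph between S, ES, E and P and one CPT per species.]

CPT for S:

S'  S  E  ES  Pr
1   1  1  1   0.1
1   2  1  2   0.2
2   2  3  3   0.1
…

CPT for ES: columns ES', S, E, ES, P, Pr (first-column entries 1, 1, 2, …)

CPT for E: columns E', S, E, ES, Pr (first-column entries 1, 1, 2, …)

CPT for P: columns P', ES, P, Pr (first-column entries 1, 1, 2, …)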

DBNs

Complexity of exact inference: at least $\mathrm{Values}^{\mathrm{Species}}$

DBNs

• We need an approximation: express the probability of a configuration as a product of probabilities

• Simplest idea: consider all species independent (Factored Frontier)

Factored Frontier

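S + E <—> ES —> P + E (rate constants k+1, k-1, k+2)

[Figure: the species S, ES, E and P at time points t0, t1, t2, t3.]

Hypothesis: all species are independent

$P_{t_2}(P = h) = f(P_{t_1}(P), P_{t_1}(ES), \mathrm{CPT})$

Complexity of FF inference: $\mathrm{Species} \times \mathrm{Values}^{\mathrm{NbPar}+1}$

Low accuracy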
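A minimal sketch of one Factored Frontier step for a single node may help make the function f concrete. It assumes a two-level discretisation (`l`, `h`) and a CPT for P whose numbers are purely illustrative; the names `ff_update` and `cpt_P` are hypothetical and do not come from the slides.

```python
from itertools import product

VALUES = ("l", "h")  # assumed two-level discretisation (illustration only)

def ff_update(parent_marginals, cpt):
    """One Factored Frontier step for a single node.

    parent_marginals: list of dicts {value: prob}, one per parent at time t.
    cpt: dict {parent configuration tuple: {child value: prob}}.
    Returns the marginal of the child at time t+1.
    """
    child = {v: 0.0 for v in VALUES}
    for config in product(VALUES, repeat=len(parent_marginals)):
        # Independence hypothesis: the weight of a parent configuration
        # is the product of the parents' marginals.
        weight = 1.0
        for marginal, value in zip(parent_marginals, config):
            weight *= marginal[value]
        for v in VALUES:
            child[v] += weight * cpt[config][v]
    return child

# Hypothetical CPT for P with parents (P, ES); the numbers are made up.
cpt_P = {
    ("l", "l"): {"l": 0.9, "h": 0.1},
    ("l", "h"): {"l": 0.4, "h": 0.6},
    ("h", "l"): {"l": 0.1, "h": 0.9},
    ("h", "h"): {"l": 0.0, "h": 1.0},
}
p_t1 = {"l": 0.8, "h": 0.2}   # marginal of P at t1
es_t1 = {"l": 0.5, "h": 0.5}  # marginal of ES at t1
p_t2 = ff_update([p_t1, es_t1], cpt_P)
print(p_t2)
```

The two nested loops visit every parent configuration and every child value, which is where the exponent NbPar+1 in the per-species cost comes from.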

Clustered Factored Frontier

• Use clusters containing the species that share the most mutual information

• Clusters may vary over time

• All combinations of states of the species in a cluster are computed (this limits the size of clusters)

• Use information theory (Eric) to obtain the important relations

• We (Eric) chose the tree to minimize distance

• A tree implies clusters of size 2
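One standard way to pick such a tree is to weight each pair of species by its mutual information and keep a maximum-weight spanning tree (the Chow-Liu construction). The sketch below assumes that approach; the slides do not say which procedure was actually used, and the edge weights in the example are invented.

```python
import math

def mutual_information(joint):
    """Mutual information (in bits) of a pairwise table {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def max_spanning_tree(nodes, weights):
    """Kruskal's algorithm, taking edges by decreasing weight.

    weights: dict {(u, v): weight}. Returns the list of kept edges.
    """
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    tree = []
    for (u, v), _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components: keep it
            parent[root_u] = root_v
            tree.append((u, v))
    return tree

# Hypothetical mutual-information weights between the four species.
weights = {("S", "ES"): 0.9, ("ES", "E"): 0.8, ("ES", "P"): 0.7,
           ("S", "E"): 0.3, ("S", "P"): 0.2, ("E", "P"): 0.1}
tree = max_spanning_tree(["S", "ES", "E", "P"], weights)
print(tree)
```

Each edge kept in the tree is a cluster of size 2, and two clusters that share a node are non-disjoint.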

[Figure: two heat maps of pairwise mutual information between the species of the apoptosis pathway (color scale from 0 to 3), titled "Mutual information on the whole graph" and "Mutual Information on the Tree Approximation".]

Species correlations (Eric)

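Hypothesis:

$$\Pr(S_t = h, ES_t = l, E_t = m, P_t = h) = \frac{\Pr(S_t = h, ES_t = l)\,\Pr(ES_t = l, E_t = m)\,\Pr(ES_t = l, P_t = h)}{\Pr(ES_t = l)^2}$$

[Figure: tree in which ES is linked to S, E and P.]

Clustered Factored Frontier

We assume that relations not in the tree are irrelevant.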
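The factorisation can be checked numerically on a small case. The sketch below rebuilds the joint over (S, ES, E, P) from the three pairwise clusters that share ES; the two-level value set and the cluster tables are assumptions chosen only so that the three tables agree on the marginal of ES.

```python
from itertools import product

VALUES = ("l", "h")  # assumed two-level discretisation

def tree_joint(p_s_es, p_es_e, p_es_p):
    """Joint over (S, ES, E, P) rebuilt from the three pairwise clusters.

    Each argument is a dict {(a, b): prob}; ES is the shared node.
    """
    # Marginal of the shared node ES, read off one of the clusters.
    p_es = {es: sum(p_s_es[(s, es)] for s in VALUES) for es in VALUES}
    joint = {}
    for s, es, e, p in product(VALUES, repeat=4):
        if p_es[es] == 0.0:
            joint[(s, es, e, p)] = 0.0
        else:
            joint[(s, es, e, p)] = (p_s_es[(s, es)] * p_es_e[(es, e)]
                                    * p_es_p[(es, p)] / p_es[es] ** 2)
    return joint

# Hypothetical cluster tables; all three give P(ES=l) = 0.6.
p_s_es = {("l", "l"): 0.1, ("h", "l"): 0.5, ("l", "h"): 0.3, ("h", "h"): 0.1}
p_es_e = {("l", "l"): 0.2, ("l", "h"): 0.4, ("h", "l"): 0.3, ("h", "h"): 0.1}
p_es_p = {("l", "l"): 0.5, ("l", "h"): 0.1, ("h", "l"): 0.1, ("h", "h"): 0.3}

joint = tree_joint(p_s_es, p_es_e, p_es_p)
total = sum(joint.values())
# Summing out E and P should give back the (S, ES) cluster.
back_s_es = {(s, es): sum(joint[(s, es, e, p)] for e in VALUES for p in VALUES)
             for s in VALUES for es in VALUES}
print(total, back_s_es)
```

ES appears in three clusters, so its marginal is counted three times in the numerator and divided out twice.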

Apoptosis pathway

[Figure: two-dimensional plot (both axes from −1.5 to 1.5) of the 57 numbered species of the apoptosis pathway.]

1 R, 2 R*, 3 flip, 4 pC8, 5 C8, 6 Bar, 7 pC3, 8 C3, 9 pC6, 10 C6,
11 XIAP, 12 PARP, 13 cPARP, 14 Bid, 15 tBid, 16 Mcl1, 17 Bax, 18 Bax*, 19 Bax*m, 20 Bax2,
21 Bax4, 22 Bcl2, 23 Pore, 24 Pore*, 25 CyCm, 26 CyCr, 27 CyC, 28 Smacm, 29 Smacr, 30 Smac,
31 Apaf, 32 Apaf*, 33 pC9, 34 Apop, 35 C3U, 36 L:R, 37 R*:flip, 38 R*:pC8, 39 C8:Bar, 40 C8:pC3,
41 C3:pC6, 42 C6:pC8, 43 C3:XIAP, 44 C3:PARP, 45 C8:Bid, 46 tBid:Mcl1, 47 tBid:Bax, 48 Bax*m:Bcl2, 49 Bax2:Bcl2, 50 Bax4:Bcl2,
51 Bax4:M, 52 M*:CyCm, 53 M*:Smacm, 54 CyC:Apaf, 55 Apop:pC3, 56 Apop:XIAP, 57 Smac:XIAP


[Figure: the species S, ES, E and P at time points t0, t1, t2, t3 with their CPTs, for S + E <—> ES —> P + E (rate constants k+1, k-1, k+2).]

$$P_{t_1}(s', es') = \sum_{s, es, e} P_{t_0}(s, es, e)\,\mathrm{CPT}(s, es, e, s')\,\mathrm{CPT}(s, es, e, es')$$
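The sum above maps directly onto a loop over parent configurations. The following sketch assumes a two-level value set and made-up CPTs; `cluster_update`, `cpt_s` and `cpt_es` are hypothetical names used only for this illustration.

```python
from itertools import product

VALUES = ("l", "h")  # assumed two-level discretisation

def cluster_update(p_parents, cpt_s, cpt_es):
    """Joint of the cluster (S, ES) at t1 from the joint of its parents at t0.

    p_parents: dict {(s, es, e): prob} at t0.
    cpt_s, cpt_es: dict {(s, es, e): {next value: prob}}.
    """
    pairs = list(product(VALUES, repeat=2))
    out = {pair: 0.0 for pair in pairs}
    for config, weight in p_parents.items():
        for s_next, es_next in pairs:
            out[(s_next, es_next)] += (weight * cpt_s[config][s_next]
                                       * cpt_es[config][es_next])
    return out

# Hypothetical inputs: uniform parents and invented CPTs.
configs = list(product(VALUES, repeat=3))
p_parents = {c: 1.0 / len(configs) for c in configs}
cpt_s = {c: ({"h": 0.9, "l": 0.1} if c[0] == "h" else {"h": 0.2, "l": 0.8})
         for c in configs}
cpt_es = {c: ({"h": 0.5, "l": 0.5} if c[0] == "h" and c[2] == "h"
              else {"h": 0.1, "l": 0.9})
          for c in configs}

p_cluster = cluster_update(p_parents, cpt_s, cpt_es)
print(p_cluster)
```

Given a parent configuration, the next values of S and ES are drawn independently from their CPTs, but the resulting cluster table is a genuine joint and need not factorise.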

How our algorithm works

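Hypothesis:

$$\Pr(S_t = h, ES_t = l, E_t = m, P_t = h) = \frac{\Pr(S_t = h, ES_t = l)\,\Pr(ES_t = l, E_t = m)\,\Pr(ES_t = l, P_t = h)}{\Pr(ES_t = l)^2}$$

[Figure: tree in which ES is linked to S, E and P.]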

How to compute P(parents(Cluster))

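Proposition:

$$P(X_p = v_p, X_L = v_L, X_R = v_R) = \frac{P(X_p = v_p, X_L = v_L)\,P(X_p = v_p, X_R = v_R)}{P(X_p = v_p)}$$

[Figure: tree with a parent node p and two children L and R.]

Parent_Cluster = set of nodes necessary to use the CPTs.

Independence between trees

Complexity: $\mathrm{Species} \times \mathrm{Values}^{\mathrm{Parents\_Cluster}+1}$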
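As an illustration of how the proposition can be used, the sketch below derives the joint of two nodes that do not share a cluster (L and R) from the two clusters that contain their common neighbour p. The value set and the tables are assumptions; `joint_of_siblings` is a hypothetical helper name.

```python
VALUES = ("l", "h")  # assumed two-level discretisation

def joint_of_siblings(p_pl, p_pr):
    """P(X_L, X_R) from the clusters (p, L) and (p, R).

    Uses P(p, L, R) = P(p, L) * P(p, R) / P(p), then sums out p.
    """
    p_p = {vp: sum(p_pl[(vp, vl)] for vl in VALUES) for vp in VALUES}
    out = {}
    for vl in VALUES:
        for vr in VALUES:
            out[(vl, vr)] = sum(
                p_pl[(vp, vl)] * p_pr[(vp, vr)] / p_p[vp]
                for vp in VALUES if p_p[vp] > 0
            )
    return out

# Hypothetical cluster tables that agree on P(p=l) = 0.6.
p_pl = {("l", "l"): 0.5, ("l", "h"): 0.1, ("h", "l"): 0.1, ("h", "h"): 0.3}
p_pr = {("l", "l"): 0.2, ("l", "h"): 0.4, ("h", "l"): 0.3, ("h", "h"): 0.1}

p_lr = joint_of_siblings(p_pl, p_pr)
print(p_lr)
```

Applying the same step repeatedly along a path in the tree would give the joint of any set of parents needed by a CPT, at a cost that grows with the number of nodes in Parent_Cluster.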

Algorithm comparison

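Algorithm            Complexity                                                                   Accuracy
FF                   $\mathrm{Species} \times \mathrm{Values}^{\mathrm{NbParents}+1}$             Low
Clustered FF         $\mathrm{Species} \times \mathrm{Values}^{\mathrm{Parents\_Cluster}+1}$      ? but better than FF
Exact computation    $> \mathrm{Values}^{\mathrm{Species}}$                                       Exact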
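To get a feel for the orders of magnitude, the three cost formulas can be evaluated for the apoptosis pathway. The sketch takes 57 species (the numbering used for the pathway) and 5 values per species (the set {1, 2, 3, 4, 5} from the DBN slide), and uses the exponent NbPar+1 from the Factored Frontier slide; the parent counts (3 for FF, 6 for a parent cluster) are assumptions chosen only for illustration.

```python
def ff_cost(species, values, nb_parents):
    # Species x Values^(NbPar+1)
    return species * values ** (nb_parents + 1)

def clustered_ff_cost(species, values, parents_cluster):
    # Species x Values^(Parents_Cluster+1)
    return species * values ** (parents_cluster + 1)

def exact_cost(species, values):
    # Lower bound: Values^Species
    return values ** species

species, values = 57, 5
nb_parents = 3        # hypothetical number of parents per node
parents_cluster = 6   # hypothetical size of a parent cluster

print("FF:          ", ff_cost(species, values, nb_parents))
print("Clustered FF:", clustered_ff_cost(species, values, parents_cluster))
print("Exact (min): ", exact_cost(species, values))
```

With these assumed parent counts the clustered variant costs 125 times more than FF, while the exact bound $5^{57}$ is out of reach.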

Conclusion

• Our program is still being written. The results will tell whether its accuracy is good.

• Once the first results are obtained, we will extend it to accept bigger clusters and non-tree graphs.






Order S x N


• For each time point T, the groups of clusters are found

• The most efficient path to compute each cluster is found

• Probabilities are computed using the CPTs

• Results are saved; cluster probabilities are kept in memory


A <—> A*

CPT:
98% : A = h, A* = l —> A = h
2% : A = h, A* = l —> A = l
2% : A = l, A* = h —> A = h
98% : A = l, A* = h —> A = l
2% : A = h, A* = l —> A* = h
98% : A = h, A* = l —> A* = l
98% : A = l, A* = h —> A* = h
2% : A = l, A* = h —> A* = l

Initial distribution: 50% A = h, A* = l; 50% A = l, A* = h

After one step, starting from A = h, A* = l:
96.04% A = h, A* = l
0.04% A = l, A* = h
1.96% A = h, A* = h
1.96% A = l, A* = l

After one step, starting from A = l, A* = h:
0.04% A = h, A* = l
96.04% A = l, A* = h
1.96% A = h, A* = h
1.96% A = l, A* = l

After one step, starting from the initial distribution:
48.04% A = h, A* = l
48.04% A = l, A* = h
1.96% A = h, A* = h
1.96% A = l, A* = l
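The one-step distribution for this example can be recomputed from the CPT entries listed above. The sketch below encodes only the two parent configurations given on the slide and propagates the 50/50 initial distribution exactly, then compares the result with the product of its marginals, which is what an all-independent (Factored Frontier) representation would store. The function name `step` is hypothetical.

```python
from itertools import product

VALUES = ("h", "l")

# CPT entries from the slide; keys are (A, A*) at the previous time point.
cpt_A = {("h", "l"): {"h": 0.98, "l": 0.02},
         ("l", "h"): {"h": 0.02, "l": 0.98}}
cpt_Astar = {("h", "l"): {"h": 0.02, "l": 0.98},
             ("l", "h"): {"h": 0.98, "l": 0.02}}

def step(prior):
    """Exact one-step propagation of a joint over (A, A*)."""
    out = {pair: 0.0 for pair in product(VALUES, repeat=2)}
    for config, weight in prior.items():
        for a, a_star in product(VALUES, repeat=2):
            out[(a, a_star)] += (weight * cpt_A[config][a]
                                 * cpt_Astar[config][a_star])
    return out

prior = {("h", "l"): 0.5, ("l", "h"): 0.5}
joint = step(prior)

# Marginals of the exact joint, and what their product would predict.
p_A_h = joint[("h", "h")] + joint[("h", "l")]
p_Astar_l = joint[("h", "l")] + joint[("l", "l")]
independent_estimate = p_A_h * p_Astar_l

print(joint)
print(independent_estimate)
```

The exact probability of (A = h, A* = l) is 0.4804, whereas the product of the marginals gives 0.25: the strong anti-correlation between A and A* is lost when the two species are treated as independent, and keeping them in one cluster preserves it.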
