Ferhat Ay, Tamer Kahveci & Valerie de-Crecy Lagard
06/23/22
www.cise.ufl.edu/~fay
Jan 20, 2016
04/21/23 1Ferhat Ay
Metabolic Pathways
What and Why?
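Metabolic Pathway Alignment
Finding a mapping of the entities of the pathways
[Figure: two pathways to be aligned. Pathway 1: compounds C1, C2, C3, C4, C5, reactions R1, R2, enzymes E1, E2. Pathway 2: compounds C1, C2, C4, C5, reactions R1, R2, enzymes E1, E2]
Applications
○ Drug Target Identification
○ Metabolic Reconstruction
○ Phylogeny Prediction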
Challenges
Pathway Alignment is hard!
○ Graph Alignment: even after abstraction, the metabolic pathway alignment problem is NP-complete!
○ Existing Algorithms: Heymans et al. (2003), Clemente et al. (2005), Pinter et al. (2005), Singh et al. (2007), ...
Abstraction is a problem!
[Figure: a pathway with enzymes E1, E2, E3, E4 and compounds C1, C2, C3, C4, and a pathway E1 C1 E2 C3 E3; after abstraction only the enzymes remain (E1, E2, E3, E4 and E1, E2, E3)]
- Where are the compounds?
- E1 C1 E2 or E1 C2 E2?
Outline
○ Graph Model of Pathways
○ Consistency of an Alignment
○ Homological & Topological Similarities
○ Eigenvalue Problem
○ Similarity Score
○ Experimental Results
Non-Redundant Graph Model
[Figure: non-redundant graph model of a pathway fragment. Compounds: Pyruv., Lip-E, ThPP, S-Ac, 2-ThP, A-CoA, Di-hy. Reactions: R0014, R7618, R3270, R2569. Enzymes: 1.2.4.1, 2.3.1.12, 1.8.1.4]
Consistency
1- Align only the entities of the same type (compatible)
[Figure: example with reactions R1, R2 and compounds C1, C2]
2- The overall mapping should be 1-1
[Figure: example with reactions R1, R2, R3]
Consistency
3- Align two entities u_i, v_i only if there exists an aligned entity pair u_j, v_j such that u_j and v_j are on the reachability paths of u_i and v_i, respectively.
[Figure: the two pathways (compounds C1, C2, C3, C4, C5 with reactions R1, R2; compounds C1, C2, C4, C5 with reactions R1, R2). Legend: Aligned Entities, Backward Reachability Path, Forward Reachability Path]
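The three consistency rules can be checked mechanically. The sketch below is one possible reading, not the authors' implementation: pathways are assumed to be directed graphs stored as adjacency dictionaries, `kind` maps each entity to its type, and rule 3 is interpreted as "some other aligned pair lies on the forward paths of both entities, or on the backward paths of both". All names (`is_consistent`, `reachable`, `pathway`, `kind`) are hypothetical.

```python
from collections import deque

def reachable(graph, start):
    """Return every node reachable from start by following directed edges."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def reverse(graph):
    """Return the graph with every edge flipped (used for backward paths)."""
    rev = {}
    for u, targets in graph.items():
        for v in targets:
            rev.setdefault(v, []).append(u)
    return rev

def is_consistent(pairs, kind1, kind2, g1, g2):
    # Rule 1: aligned entities must have the same type.
    if any(kind1[u] != kind2[v] for u, v in pairs):
        return False
    # Rule 2: the mapping must be 1-1.
    left = [u for u, _ in pairs]
    right = [v for _, v in pairs]
    if len(set(left)) != len(left) or len(set(right)) != len(right):
        return False
    # Rule 3 (assumed reading): every pair is supported by another aligned
    # pair on the forward paths of both entities or the backward paths of both.
    r1, r2 = reverse(g1), reverse(g2)
    for u, v in pairs:
        fwd_u, fwd_v = reachable(g1, u), reachable(g2, v)
        bwd_u, bwd_v = reachable(r1, u), reachable(r2, v)
        supported = any(
            (a, b) != (u, v)
            and ((a in fwd_u and b in fwd_v) or (a in bwd_u and b in bwd_v))
            for a, b in pairs
        )
        if not supported:
            return False
    return True

# Toy pathway: C1 -> R1 -> C2 -> R2 -> C3, aligned with a copy of itself.
pathway = {"C1": ["R1"], "R1": ["C2"], "C2": ["R2"], "R2": ["C3"]}
kind = {"C1": "compound", "C2": "compound", "C3": "compound",
        "R1": "reaction", "R2": "reaction"}
aligned = [("C1", "C1"), ("R1", "R1"), ("C2", "C2")]
print(is_consistent(aligned, kind, kind, pathway, pathway))
```

Under this reading an alignment with a single pair is rejected by rule 3, because no second pair exists to support it.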
Problem Statement
Given a pair of metabolic pathways, our aim is to
find the consistent alignment (mapping) of the
entities (enzymes, reactions, compounds)
such that the similarity between the pathways
(SimP score) is maximized.
Pairwise Similarities (Homology of Entities)
Pairwise Similarities (Homology)
Enzyme Similarity (SimE)
○ Hierarchical Enzyme Similarity - Webb EC. (2002)
○ Information-Content Enzyme Similarity - Pinter et al. (2005)
Compound Similarity (SimC)
○ Identity Score for compounds
○ SIMCOMP Compound Similarity - Hattori et al. (2003)
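To make the two simplest scores concrete, here is a minimal sketch. It assumes the hierarchical enzyme similarity rewards each leading EC field that two enzymes share in steps of 0.25; that step size is an assumption for illustration and is not stated on the slide. The identity score for compounds is taken to be 1 for equal identifiers and 0 otherwise. Function names are hypothetical.

```python
def hierarchical_enzyme_similarity(ec_a, ec_b):
    """Score two EC numbers by the number of leading fields they share.

    Assumed scale: 0, 0.25, 0.5, 0.75 or 1.0 for 0 to 4 shared fields.
    """
    shared = 0
    for field_a, field_b in zip(ec_a.split("."), ec_b.split(".")):
        if field_a != field_b:
            break
        shared += 1
    return shared / 4

def identity_compound_similarity(compound_a, compound_b):
    """Identity score: 1.0 for the same compound identifier, else 0.0."""
    return 1.0 if compound_a == compound_b else 0.0

print(hierarchical_enzyme_similarity("1.2.4.1", "1.8.1.4"))   # share only the class field
print(hierarchical_enzyme_similarity("1.2.4.1", "2.3.1.12"))  # share nothing
```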
Pairwise Similarities
Reaction Similarity (SimR)
[Figure: reaction R1 with enzymes E1, E2, input compounds C1, C2 and output compound C3; reaction R2 with enzyme E3, input compound C4 and output compounds C5, C6, C7]
SimR (R1,R2) =
    max ( SimE (E1,E3) , SimE (E2,E3) )                      [Enzymes]
  + max ( SimC (C1,C4) , SimC (C2,C4) )                      [Input Compounds]
  + max ( SimC (C3,C5) , SimC (C3,C6) , SimC (C3,C7) )       [Output Compounds]
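The SimR formula generalizes to any pair of reactions: take the best enzyme pair, the best input-compound pair and the best output-compound pair, then add the three. A minimal sketch follows, assuming reactions are plain dictionaries and that the three terms are summed without weights, as written on the slide. The dictionary layout, the function names and the 0/1 scores in the example are illustrative assumptions.

```python
def best_pair_score(items_a, items_b, sim):
    """Highest similarity over all cross pairs; 0.0 if either side is empty."""
    return max((sim(a, b) for a in items_a for b in items_b), default=0.0)

def reaction_similarity(r1, r2, sim_e, sim_c):
    """Sum of the best enzyme, input-compound and output-compound scores."""
    return (best_pair_score(r1["enzymes"], r2["enzymes"], sim_e)
            + best_pair_score(r1["inputs"], r2["inputs"], sim_c)
            + best_pair_score(r1["outputs"], r2["outputs"], sim_c))

def identity(a, b):
    """Placeholder similarity used only for this example."""
    return 1.0 if a == b else 0.0

# Entity identifiers below are made up for illustration.
r1 = {"enzymes": ["E1", "E2"], "inputs": ["C1", "C2"], "outputs": ["C3"]}
r2 = {"enzymes": ["E2"], "inputs": ["C2"], "outputs": ["C5", "C6", "C7"]}
print(reaction_similarity(r1, r2, identity, identity))  # enzymes and inputs match, outputs do not
```

With three unweighted terms that each lie in [0, 1], this sum lies in [0, 3]; a normalized variant would divide by 3 or weight the terms.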
Topological Similarity (Topology of Pathways)
Neighborhood Graphs
[Figure: a pathway with enzymes E1, E2, E3, reactions R1, R2, R3, R4 and compounds C1 to C9, and the three neighborhood graphs derived from it: Enzymes (E1, E2, E3), Reactions (R1, R2, R3, R4), Compounds (C1 to C9)]
Topological Similarities
[Figure: reaction neighborhood graphs of two pathways; the first has reactions R1, R2, R3, R4 and the second has reactions R1, R3, R4, R5]
First pathway:  |R| = 4, BN (R3) = {R1,R2}, FN (R3) = {R4}
Second pathway: |R| = 4, BN (R3) = {R1}, FN (R3) = {R4,R5}
AR matrix: (|R| |R|) x (|R| |R|) = 16 x 16

         R1-R1  ...  R2-R1  ...  R4-R4  ...  R4-R5
R3-R3     1/4    0    1/4    0    1/4    0    1/4

AR [R3,R3][R2,R1] = 1 / (2*1 + 1*2) = 1/4
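The R3-R3 row can be reproduced from the backward and forward neighbor sets alone. The sketch below generalizes the single worked entry on the slide: each pair of backward neighbors and each pair of forward neighbors gets the weight 1 / (|BN(u)| |BN(v)| + |FN(u)| |FN(v)|). That generalization, the function name and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction

def topology_row(u, v, bn1, fn1, bn2, fn2):
    """Nonzero entries of the A matrix row for the candidate pair (u, v)."""
    denom = len(bn1[u]) * len(bn2[v]) + len(fn1[u]) * len(fn2[v])
    row = {}
    if denom == 0:
        return row  # (u, v) has no neighbor pairs to draw support from
    weight = Fraction(1, denom)
    for a in bn1[u]:          # backward neighbor with backward neighbor
        for b in bn2[v]:
            row[(a, b)] = weight
    for a in fn1[u]:          # forward neighbor with forward neighbor
        for b in fn2[v]:
            row[(a, b)] = weight
    return row

# Neighbor sets of R3 in the two pathways, as given on the slide.
bn1, fn1 = {"R3": ["R1", "R2"]}, {"R3": ["R4"]}
bn2, fn2 = {"R3": ["R1"]}, {"R3": ["R4", "R5"]}
row = topology_row("R3", "R3", bn1, fn1, bn2, fn2)
for pair, weight in sorted(row.items()):
    print(pair, weight)
```

The four nonzero entries are each 1/4, so the row sums to 1.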
Problem Formulation
[Figure: reaction neighborhood graphs of two pathways; the first has reactions R1 to R8 and the second has reactions R1, R2, R3, R5, R7, R8]
Focus on R3 - R3 matching
Iteration 0: Only pairwise similarity of R3 and R3
Iteration 1: Support of aligned first degree neighbors added
Iteration 2: Support of aligned second degree neighbors added
Iteration 3: Support of aligned third degree neighbors added
$H_R^{k+1} = \alpha A_R H_R^{k} + (1-\alpha) H_R^{0}$
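The iteration mixes topological support with the initial pairwise similarities at every step. A minimal pure-Python sketch follows, assuming the matrix is a list of rows and the similarity vector is a flat list; any normalization step the authors may apply is not shown. The function name and the 2 x 2 example values are made up for illustration.

```python
def power_iterate(a, h0, alpha, iterations=100):
    """Repeat H <- alpha * A * H + (1 - alpha) * H0 a fixed number of times."""
    h = list(h0)
    for _ in range(iterations):
        # Matrix-vector product A * H, one row at a time.
        ah = [sum(a_ij * h_j for a_ij, h_j in zip(row, h)) for row in a]
        h = [alpha * x + (1 - alpha) * y for x, y in zip(ah, h0)]
    return h

# Toy example: two candidate pairs that support each other equally.
a = [[0.5, 0.5],
     [0.5, 0.5]]
h0 = [1.0, 0.0]
print(power_iterate(a, h0, alpha=0.7))  # topology pulls the two scores together
print(power_iterate(a, h0, alpha=0.0))  # alpha = 0 leaves the initial scores unchanged
```

For this toy input the iteration settles at (0.65, 0.35): the second pair starts at 0 but gains score from the support of the first.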
Problem Formulation
$H_R^{k+1} = \alpha A_R H_R^{k} + (1-\alpha) H_R^{0}$
Initial Reaction Similarity Matrix (HR0 Vector)
0.1  1.0  0.2  0.9
0.3  0.5  0.8  0.8
0.5  1.0  0.4  0.3
Power Method Iterations
Final Reaction Similarity Matrix (HRs Vector)
0.2  1.0  0.4  0.6
0.2  0.3  0.6  0.9
0.6  0.9  0.5  0.5
Max Weight Bipartite Matching
Six Possible Orderings - ONLY 3 ARE UNIQUE
○ Reactions First
○ Enzymes First
○ Compounds First
R First Pruning
[Figure: bipartite graphs between the entities of the two pathways: reactions R1, R2, R3 on both sides, compounds C1 to C4 against C1 to C3, enzymes E1 to E3 against E1, E2. Legend: Weighted Edges, Aligned Entities, Inconsistent Edges]
Consistency Assured!
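Maximum weight bipartite matching picks one column per row of a similarity matrix so that the total weight is largest and no column is reused. The sketch below uses exhaustive search over column assignments, which is only practical for tiny matrices and is not the authors' algorithm; it also leaves out the consistency pruning between entity types. The example weights are the 3 x 4 final reaction similarity matrix shown on the earlier slide; the function name is hypothetical.

```python
from itertools import permutations

def max_weight_matching(weights):
    """Brute-force maximum weight matching; needs rows <= columns."""
    n_rows, n_cols = len(weights), len(weights[0])
    best_score, best_pairs = float("-inf"), []
    # Each permutation assigns a distinct column to every row.
    for cols in permutations(range(n_cols), n_rows):
        score = sum(weights[r][c] for r, c in enumerate(cols))
        if score > best_score:
            best_score, best_pairs = score, list(enumerate(cols))
    return best_score, best_pairs

weights = [[0.2, 1.0, 0.4, 0.6],
           [0.2, 0.3, 0.6, 0.9],
           [0.6, 0.9, 0.5, 0.5]]
score, pairs = max_weight_matching(weights)
print(score, pairs)
```

For realistic pathway sizes a polynomial-time method such as the Hungarian algorithm would replace the exhaustive loop.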
Alignment Score ( SimP )
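[Figure: the aligned pathways (compounds C1, C2, C3, C4, C5 with reactions R1, R2; compounds C1, C2, C4, C5 with reactions R1, R2), with enzymes E1-E1 and E2-E2 aligned]
0 <= SimP <= 1
SimP = 1 for identical pathways
$\mathrm{SimP} = \beta \, \dfrac{\mathrm{Sim}(C_1) + \mathrm{Sim}(C_2) + \mathrm{Sim}(C_4)}{3} + (1 - \beta) \, \dfrac{\mathrm{Sim}(E_1) + \mathrm{Sim}(E_2)}{2}$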
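The alignment score on this slide is a weighted combination of two averages: the mean similarity of the aligned compounds and the mean similarity of the aligned enzymes. A minimal sketch, assuming a single mixing weight in [0, 1] (called `beta` here, a name chosen for illustration) and that each list holds the similarity of one aligned pair; the example values are made up.

```python
def alignment_score(compound_sims, enzyme_sims, beta=0.5):
    """Weighted mean of aligned-compound and aligned-enzyme similarities."""
    compound_part = sum(compound_sims) / len(compound_sims)
    enzyme_part = sum(enzyme_sims) / len(enzyme_sims)
    return beta * compound_part + (1 - beta) * enzyme_part

# Three aligned compounds (C1, C2, C4) and two aligned enzymes (E1, E2).
print(alignment_score([1.0, 1.0, 1.0], [1.0, 1.0]))   # identical pathways
print(alignment_score([1.0, 0.5, 0.0], [1.0, 0.0]))   # partially similar pathways
```

Because both averages lie in [0, 1] when every pairwise similarity does, the score also stays in [0, 1] and equals 1 when every aligned pair is identical.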
Outline
○ Graph Model of Pathways
○ Consistency of an Alignment
○ Homological & Topological Similarities
○ Eigenvalue Problem
○ Similarity Score
○ Experimental Results
Impact of Alpha
α = 0: Only pairwise similarities of entities - No iterations
α = 1: Only topology of the graphs
α = 0.7 is good!
Alternative Entities & Paths
Kim J. et al. (2007)
Eukaryotes (e.g. H. sapiens): Mevalonate Path
Bacteria (e.g. E. coli): Non-Mevalonate Path
Kuzuyama T. et al. (2006)
Phylogeny Prediction
[Figure: phylogenetic trees, NCBI Taxonomy compared with Our Prediction; labeled groups include Thermoprotei, Eukaryota, Archaea and Deuterostomia]
Effect Of Consistency Restriction
Running Time
For source code and more information:
www.cise.ufl.edu/~fay
Error Tolerance
Phylogenetic Reconstruction
Effect Of Consistency Restriction
Z-Score Calculation
Challenges
Abstraction is a Problem!
[Figure: with NO Abstraction, Pathway 1 (E1, C1, C2, E2, C3, C4, E3, E4) and Pathway 2 (E1, C1, E2, C3, E3); with Abstraction, Pathway 1 Abstracted (E1, E2, E3, E4) and Pathway 2 Abstracted (E1, E2, E3)]
- Where are the compounds?
- E1 C1 E2 or E1 C2 E2?
Alignment Problem is NP-complete!