Common Intervals in Sequences, Trees, and Graphs Steffen Heber and Jiangtian Li

Dec 19, 2015

Page 1: Common Intervals in Sequences, Trees, and Graphs Steffen Heber and Jiangtian Li.

Common Intervals in Sequences, Trees, and Graphs

Steffen Heber and Jiangtian Li

Page 2:

Genome Comparison of Bacteria

[Kim et al., Nat. Biotechnol., 2004]

Page 3:

Gene Order & Function in Bacteria

• Gene order in bacteria is weakly conserved. [Gene order is not conserved in bacterial evolution. Mushegian, Koonin; Trends Genet. 1996]

• Some genes cluster together even in unrelated species.

• Genes inside a cluster are functionally associated. [Conserved clusters of functionally related genes in two bacterial genomes. Tamames et al.; J Mol Evol. 1997]

Page 4:

Gene Order & Function in Bacteria

Page 5:

Gene Order & Function in Bacteria

Page 6:

Formalization of Gene Clusters

Genomes: permutations π1, π2 ,…, πk

Genes: numbers 1,…,n

π1 = 1 2 3 4 5 6 7 8

π2 = 8 7 6 4 5 2 1 3

π3 = 3 1 2 5 8 7 6 4

π4 = 6 7 4 2 1 3 8 5

Page 7:

Intervals

• For a permutation π of [n] = {1, 2, …, n}, an interval (= gene cluster) is a set {π(i), π(i+1), …, π(j)} for 1 ≤ i < j ≤ n.

• Any permutation of [n] has n(n-1)/2 intervals.

1 3 5 4 2 6 7
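
As a quick check of the count above, the following minimal Python sketch enumerates the intervals of the example permutation by brute force. The function name `intervals` is an illustrative choice and does not come from the slides.

```python
def intervals(perm):
    """Return every interval {perm[i], ..., perm[j]} with i < j as a frozenset."""
    n = len(perm)
    return [frozenset(perm[i:j + 1]) for i in range(n) for j in range(i + 1, n)]

pi = [1, 3, 5, 4, 2, 6, 7]
n = len(pi)
ivs = intervals(pi)

# One interval per index pair i < j, hence n(n-1)/2 intervals in total.
print(len(ivs), n * (n - 1) // 2)
```

For n = 7 both numbers are 21. The set {5, 4} is an interval here because 5 and 4 are neighbours in the permutation, while {1, 2} is not.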

Page 8:

Common Intervals

• For a family F = (π0, π1, …, πk-1) of permutations, a common interval of F (= conserved gene cluster) is a subset S ⊆ [n] that is an interval in every πi.

• We say S ∈ C_F.

π0 = 1 3 5 4 2 6 7, π1 = 2 4 5 1 3 7 6
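
The definition can be tried out directly on the two example permutations. The sketch below intersects the interval sets of all permutations; it is a brute-force illustration with made-up function names, not the algorithm analysed later in the talk.

```python
def intervals(perm):
    """All intervals of one permutation, as a set of frozensets."""
    n = len(perm)
    return {frozenset(perm[i:j + 1]) for i in range(n) for j in range(i + 1, n)}

def common_intervals(perms):
    """Subsets of [n] that are intervals in every permutation of the family."""
    common = intervals(perms[0])
    for p in perms[1:]:
        common &= intervals(p)
    return common

pi0 = [1, 3, 5, 4, 2, 6, 7]
pi1 = [2, 4, 5, 1, 3, 7, 6]
C_F = common_intervals([pi0, pi1])

for s in sorted(C_F, key=lambda s: (len(s), sorted(s))):
    print(sorted(s))
```

For this pair the family has nine common intervals, among them {6, 7}, {4, 5} and the full set [7].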

Page 12:

Lemma

Let F = (π0, π1, …, πk-1) and c, d ∈ C_F.

• If c ∩ d ≠ ∅, then c ∪ d ∈ C_F.

• We call c ∪ d reducible.

π0 = 1 3 5 4 2 6 7, π1 = 2 4 5 1 3 7 6

[Figure: a reducible and an irreducible interval marked in π0 and π1]
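
The lemma can be checked on the example by brute force. The sketch below also collects the reducible intervals, reading "reducible" as "union of two overlapping common intervals, neither of which contains the other"; that extra condition is an assumption, since the slide does not spell it out.

```python
def intervals(perm):
    n = len(perm)
    return {frozenset(perm[i:j + 1]) for i in range(n) for j in range(i + 1, n)}

pi0 = [1, 3, 5, 4, 2, 6, 7]
pi1 = [2, 4, 5, 1, 3, 7, 6]
common = intervals(pi0) & intervals(pi1)

reducible = set()
for c in common:
    for d in common:
        if c & d:
            # Lemma: overlapping common intervals have a common union.
            assert (c | d) in common
            # Assumed reading: only unions of non-nested pairs count as reducible.
            if not (c <= d or d <= c):
                reducible.add(c | d)

irreducible = common - reducible
print(sorted(sorted(s) for s in irreducible))
```

Under this reading, {2, 4, 5} = {2, 4} ∪ {4, 5} is reducible, while six of the nine common intervals are irreducible.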

Page 13:

Analysis

• We have K ≤ n(n-1)/2 common intervals and I < n irreducible intervals.

• Find all K common intervals of k ≥ 2 permutations of [n]: O(kn + K) time & O(n) space
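
For comparison with these bounds, here is a simple quadratic-time baseline for two permutations. It is explicitly not the O(kn + K) algorithm quoted on the slide; it only uses the standard observation that a set of positions is contiguous exactly when the largest minus the smallest position equals the number of elements minus one. The function name is hypothetical.

```python
def common_intervals_pair(p0, p1):
    """Index pairs (i, j), i < j, such that p0[i..j] is also an interval of p1.

    Naive O(n^2) baseline, not the O(kn + K) method from the slide.
    """
    pos1 = {gene: idx for idx, gene in enumerate(p1)}
    q = [pos1[gene] for gene in p0]  # positions in p1, listed in p0 order
    n = len(q)
    found = []
    for i in range(n):
        lo = hi = q[i]
        for j in range(i + 1, n):
            lo = min(lo, q[j])
            hi = max(hi, q[j])
            if hi - lo == j - i:  # the j - i + 1 positions are contiguous in p1
                found.append((i, j))
    return found

pairs = common_intervals_pair([1, 3, 5, 4, 2, 6, 7], [2, 4, 5, 1, 3, 7, 6])
print(len(pairs))
```

The running minimum and maximum make each extension of the window O(1), which is what brings the brute force down to quadratic time.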

Page 14:

Common Intervals of Trees

Let T,T1,…,Tk be trees with vertex set [n].

Definition:

• S ⊆ [n] is an interval of T iff T[S] is connected and |S| > 1

• S ⊆ [n] is a common interval of T1,…,Tk iff S is an interval in all trees.

• Tree intervals generalize intervals of permutations.
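
To illustrate the last point, the sketch below tests the tree-interval definition by checking connectivity of the induced subgraph, and confirms that a path whose vertices are listed in permutation order has exactly the intervals of that permutation. All names are illustrative assumptions.

```python
from itertools import combinations

def is_tree_interval(edges, subset):
    """True iff |subset| > 1 and the subgraph induced by subset is connected."""
    s = set(subset)
    if len(s) < 2:
        return False
    adj = {v: [] for v in s}
    for u, v in edges:
        if u in s and v in s:
            adj[u].append(v)
            adj[v].append(u)
    start = next(iter(s))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == s

perm = [1, 3, 5, 4, 2, 6, 7]
n = len(perm)
path_edges = list(zip(perm, perm[1:]))  # path 1-3-5-4-2-6-7

tree_ivs = {frozenset(c) for r in range(2, n + 1)
            for c in combinations(perm, r) if is_tree_interval(path_edges, c)}
perm_ivs = {frozenset(perm[i:j + 1]) for i in range(n) for j in range(i + 1, n)}
print(tree_ivs == perm_ivs)
```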

Page 15:

Miscellaneous

Example:

common intervals of T1, T2: { [2], [3], [4], [5] }

• (Common) Intervals in trees are induced subtrees.

[Figure: the trees T1 and T2 on the vertices 1, …, 5]

Page 16:

Structure of Tree Intervals

• Tree intervals have the Helly property, i.e. for any family of tree intervals (Ti), i ∈ I, the assumption Tp ∩ Tq ≠ ∅ for every p, q ∈ I implies that the intersection of all Ti, i ∈ I, is non-empty.

Page 17:

Extreme Cases

n-vertex stars Sn-1

# non-trivial induced subtrees: 2^(n-1) - 1
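
The count for stars can be confirmed for small n by enumerating all vertex subsets of size at least two and keeping the connected ones. This is a brute-force sketch with hypothetical helper names; the path is included only for contrast.

```python
from itertools import combinations

def is_connected(edges, subset):
    """True iff the subgraph induced by subset is connected."""
    s = set(subset)
    adj = {v: [] for v in s}
    for u, v in edges:
        if u in s and v in s:
            adj[u].append(v)
            adj[v].append(u)
    start = next(iter(s))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == s

def count_tree_intervals(n, edges):
    """Number of vertex subsets of size >= 2 that induce a subtree."""
    return sum(is_connected(edges, c)
               for r in range(2, n + 1)
               for c in combinations(range(1, n + 1), r))

n = 5
star = [(1, leaf) for leaf in range(2, n + 1)]  # centre 1, leaves 2..n
path = [(v, v + 1) for v in range(1, n)]

print(count_tree_intervals(n, star), 2 ** (n - 1) - 1)
print(count_tree_intervals(n, path), n * (n - 1) // 2)
```

Every interval of the star consists of the centre plus a non-empty set of leaves, which gives the 2^(n-1) - 1 count; the path has only n(n-1)/2 intervals.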

Page 18:

The Common Interval Graph

• Given T = (T1,…,Tk) and the corresponding set of common intervals C_T, the common interval graph G_T = (V,E) is the graph with

V = C_T

E = {(c,d) | c,d ∈ C_T, c ∩ d ≠ ∅, c ≠ d}

Page 19:

Example

• V=[n], T=(Pn, Sn-1)

• We have C_T = { [2], [3], …, [n] }, G_T = K(C_T).

[Figure: the trees Pn and Sn-1 and the graph G_T with vertices [2], [3], [4], …, [n]]
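
The example can be reproduced for a small n. The sketch below assumes that the path is 1-2-…-n and that the star is centred at vertex 1 (the slide does not state this explicitly), computes the common intervals by brute force, and builds the common interval graph from its definition. Function names are illustrative.

```python
from itertools import combinations

def is_connected(edges, s):
    adj = {v: [] for v in s}
    for u, v in edges:
        if u in s and v in s:
            adj[u].append(v)
            adj[v].append(u)
    start = next(iter(s))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == s

def common_tree_intervals(n, trees):
    """Subsets of size >= 2 that induce a connected subgraph in every tree."""
    return [frozenset(c) for r in range(2, n + 1)
            for c in combinations(range(1, n + 1), r)
            if all(is_connected(t, set(c)) for t in trees)]

def common_interval_graph(ivs):
    """Edges join distinct common intervals with a non-empty intersection."""
    return {(c, d) for c, d in combinations(ivs, 2) if c & d}

n = 5
path = [(v, v + 1) for v in range(1, n)]        # assumed: P_n = 1-2-...-n
star = [(1, leaf) for leaf in range(2, n + 1)]  # assumed: centre is vertex 1
C_T = common_tree_intervals(n, [path, star])
E = common_interval_graph(C_T)
print(len(C_T), len(E))
```

With these assumptions C_T consists of the n - 1 prefixes [2], …, [n], which pairwise intersect, so G_T is a complete graph.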

Page 20:

Common Interval Graphs cont’d

A graph is called chordal, if it does not contain an induced cycle Cn on n>3 vertices.

Proposition: Common interval graphs of trees are chordal graphs.

Page 21:

Irreducible Common Intervals

For a common interval c ∈ C_T and a subset V ⊆ C_T we say that V generates c iff

i. d ⊊ c for each d ∈ V,

ii. c is the union of all d ∈ V,

iii. G_T[V] is connected.

If there is no such V, then c is irreducible.

The irreducible intervals generate all common intervals.

[Figure: example on the vertices 1, …, 7]
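
The three conditions can be tested without trying every subset V: if some V generates c, then V lies inside one connected component of G_T restricted to the proper sub-intervals of c, and that whole component also generates c. The sketch below uses this observation. The input list is a hypothetical example (two identical paths 1-2-3-4), not the figure from the slide.

```python
def is_irreducible(c, common):
    """True iff no set V of proper sub-intervals of c generates c."""
    unvisited = {d for d in common if d < c}  # proper sub-intervals of c
    while unvisited:
        # Grow one connected component of the intersection graph.
        stack = [unvisited.pop()]
        union = set()
        while stack:
            d = stack.pop()
            union |= d
            for e in [e for e in unvisited if e & d]:
                unvisited.discard(e)
                stack.append(e)
        if union == set(c):
            return False  # this component is a generating set V
    return True

# Hypothetical input: common intervals of two identical paths 1-2-3-4.
common = [frozenset(s) for s in
          ({1, 2}, {2, 3}, {3, 4}, {1, 2, 3}, {2, 3, 4}, {1, 2, 3, 4})]
irreducible = [c for c in common if is_irreducible(c, common)]
print(sorted(sorted(c) for c in irreducible))
```

Here only the three edges {1, 2}, {2, 3} and {3, 4} are irreducible, in line with the bound I < n.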

Page 22:

Finding Irreducible Intervals

• We have K < 2^(n-1) common intervals and I < n irreducible intervals.

• Find all irreducible common intervals of k trees on n vertices: O(kn^2) time & O(kn) space

Page 23:

Finding Irreducible Intervals

• Irreducible intervals are minimal common intervals containing an adjacent vertex pair.

[Figure: example trees on the vertices x, y, z, l, m]
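
One way to compute the minimal common interval containing a given vertex pair is to alternate between the trees: in each tree, extend the current set to the smallest subtree spanning it, and stop when no tree adds a vertex. The sketch below does this with a leaf-pruning helper. It is a straightforward fixed-point iteration under assumed names, with no claim to the O(kn^2) bound from the previous slide.

```python
def tree_hull(n, edges, s):
    """Vertices of the smallest subtree containing s (prune leaves outside s)."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(adj)
    leaves = [v for v in alive if len(adj[v]) <= 1 and v not in s]
    while leaves:
        v = leaves.pop()
        alive.discard(v)
        for w in adj[v]:
            adj[w].discard(v)
            if w in alive and w not in s and len(adj[w]) <= 1:
                leaves.append(w)
        adj[v] = set()
    return alive

def minimal_common_interval(n, trees, pair):
    """Smallest set containing pair that is connected in every tree."""
    s = set(pair)
    while True:
        t = set(s)
        for edges in trees:
            t = tree_hull(n, edges, t)
        if t == s:
            return frozenset(s)
        s = t

n = 5
path = [(v, v + 1) for v in range(1, n)]        # hypothetical tree 1
star = [(1, leaf) for leaf in range(2, n + 1)]  # hypothetical tree 2
candidates = {minimal_common_interval(n, [path, star], e) for e in path + star}
print(sorted(sorted(c) for c in candidates))
```

For this path and star the adjacent pairs yield the four nested sets [2], [3], [4], [5], which are exactly the common intervals of the two trees.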

Page 24:

Graph Intervals

G=(V,E), undirected, connected graph, V=[n]

S ⊆ V is an interval (convex) iff the induced subgraph G[S] is connected and includes every shortest path with end-vertices in S.

[Figure: two examples on the vertices 1, 2, 3, 4, one convex and one not convex]
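
Convexity can be tested with breadth-first-search distances: a vertex w lies on a shortest path between u and v exactly when d(u,w) + d(w,v) = d(u,v). The sketch below applies this test to a hypothetical 4-cycle, which is not necessarily the graph drawn on the slide. It assumes a connected graph, in which case containing all shortest paths already implies that G[S] is connected.

```python
from collections import deque

def bfs_dist(adj, src):
    """Unweighted shortest-path distances from src."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def is_convex(edges, subset):
    """True iff subset contains every shortest path between its members."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {v: bfs_dist(adj, v) for v in adj}
    s = set(subset)
    for u in s:
        for v in s:
            for w in adj:
                if w not in s and dist[u][w] + dist[w][v] == dist[u][v]:
                    return False
    return True

# Hypothetical example: the 4-cycle 1-2-4-3-1.
cycle = [(1, 2), (2, 4), (4, 3), (3, 1)]
print(is_convex(cycle, {1, 2}), is_convex(cycle, {1, 2, 3}))
```

In the 4-cycle, {1, 2, 3} is connected but not convex, because the shortest path 2-4-3 leaves the set.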

Page 25:

Common Intervals of Graphs

Let G = (G1,…,Gk) be a family of connected undirected graphs with vertex set [n].

Definition: S ⊆ [n] is a common interval of G iff S is an interval in all graphs.

• Graph intervals generalize tree intervals.

[Figure: example graphs G0 and G1 on the vertices 1, 2, 3, 4]

Page 26:

Some Differences

• The union of convex sets is NOT always convex.

Page 27:

Some Differences

• The common convex hull of an adjacent vertex pair is NOT always irreducible.

[Figure: example graphs G1 and G2 on the vertices 1, 2, 3]

Page 28:

Finding Irreducible Graph Intervals

Sketch: Given G = (G0, G1, …, Gk-1)

For each edge (i,j) ∈ Ei* do

    S(i,j) := {i,j}

    For each (k,l) ∈ S(i,j)

        Add vertices ‘between’ k and l to S(i,j)

Remove reducible intervals
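
The inner loop of the sketch can be illustrated as follows, reading "between k and l" as "on some shortest path from k to l in one of the graphs" (an assumption based on the convexity definition). The code computes the common convex hull of a single vertex pair and omits the final step of removing reducible intervals. The two graphs and all function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def all_pairs_dist(edges):
    """BFS distances between all vertex pairs of a connected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {}
    for src in adj:
        d = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in d:
                    d[w] = d[u] + 1
                    queue.append(w)
        dist[src] = d
    return dist

def common_hull(graphs, pair):
    """Smallest set containing pair that is convex in every graph."""
    dists = [all_pairs_dist(edges) for edges in graphs]
    s = set(pair)
    changed = True
    while changed:
        changed = False
        for dist in dists:
            for k, l in combinations(sorted(s), 2):
                for w in dist:
                    # w is 'between' k and l if it lies on a shortest k-l path.
                    if w not in s and dist[k][w] + dist[w][l] == dist[k][l]:
                        s.add(w)
                        changed = True
    return frozenset(s)

g0 = [(1, 2), (2, 4), (4, 3), (3, 1)]  # hypothetical 4-cycle
g1 = [(1, 2), (2, 3), (3, 4)]          # hypothetical path
print(sorted(common_hull([g0, g1], (1, 2))), sorted(common_hull([g0, g1], (1, 3))))
```

The pair (1, 2) is already convex in both graphs, whereas the pair (1, 3) first picks up vertex 2 from the path and then vertex 4 from the cycle.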

Page 29:

Extreme Cases

Permutations (identical permutations):

• C ≤ n(n-1)/2, I < n

Trees (identical star-trees):

• C < 2^(n-1), I < n

Graphs (complete graphs):

• C < 2^n, I ≤ n(n-1)/2

Page 30:

Example: InterDom

Database of protein domain interactions.

• Gene fusions

• Protein-protein interactions (DIP & BIND)

• Protein complexes (PDB)

Page 31:

Comparing Two Networks

Page 32:

Comparing Three Networks

G: Gene fusion, P: PDB, B: BIND, D: DIP

Page 33:

Irreducible Intervals

[Figure: plot with axis label "size of irreducible interval"]

Page 34:

Biologically Meaningful?

RAS family domain protein kinase

ankyrin repeat

PH domain

regulator of chromosome condensation

Page 35:

THANK YOU!!!