Suffix Trees
ALGGEN: Algorithmics and genetics group
Dep. Llenguatges i Sistemes Informàtics
Universitat Politècnica de Catalunya
Dr. Xavier Messeguer
http://www.lsi.upc.es/~alggen
Jan 29, 2016
Suffix trees
Given the string ababaas, its suffixes are:
1: ababaas
2: babaas
3: abaas
4: baas
5: aas
6: as
7: s
[Figure: suffix tree of ababaas, with each leaf labelled by the start position (1 to 7) of the suffix it spells.]
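The suffix list for ababaas can be reproduced with a few lines of Python. This is only an illustrative sketch; the function name `suffixes` is an assumption made here and does not come from the slides.

```python
def suffixes(text):
    """Return (position, suffix) pairs, with positions counted from 1 as on the slide."""
    return [(i + 1, text[i:]) for i in range(len(text))]


# Print the suffixes of the example string in the slide's "position: suffix" layout.
for position, suffix in suffixes("ababaas"):
    print(f"{position}: {suffix}")
```

A suffix tree stores exactly these strings as root-to-leaf paths, merging shared prefixes.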
What kind of queries?
Queries on Suffix trees
[Figure: the suffix tree of ababaas.]
• Does the sequence ababaas contain any occurrence of the patterns abab, aab, and ab?
• Find repeats within the sequence ababaas.
…………………………
…………………………
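Both queries above can be tried out with a small sketch. For brevity it uses an uncompressed suffix trie (one node per character) rather than the compact suffix tree drawn on the slides, so it needs quadratic space and is only suitable for short strings. The names `build_suffix_trie`, `occurrences` and `repeats` are hypothetical.

```python
def build_suffix_trie(text):
    """Uncompressed suffix trie; each node records the start positions passing through it."""
    root = {"next": {}, "starts": []}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node["next"].setdefault(ch, {"next": {}, "starts": []})
            node["starts"].append(start + 1)  # 1-based, as on the slides
    return root


def occurrences(trie, pattern):
    """Walk the pattern down from the root; return the start positions of its occurrences."""
    node = trie
    for ch in pattern:
        node = node["next"].get(ch)
        if node is None:
            return []
    return node["starts"]


def repeats(trie, prefix=""):
    """Substrings that occur at least twice: nodes reached by two or more suffixes."""
    found = []
    for ch, child in trie["next"].items():
        if len(child["starts"]) >= 2:
            found.append(prefix + ch)
            found.extend(repeats(child, prefix + ch))
    return found


trie = build_suffix_trie("ababaas")
for pattern in ("abab", "aab", "ab"):
    print(pattern, occurrences(trie, pattern))  # abab [1], aab [], ab [1, 3]
print(sorted(repeats(trie)))  # ['a', 'ab', 'aba', 'b', 'ba']
```

A pattern query costs time proportional to the pattern length, independent of the length of the indexed string.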
Quadratic Insertion algorithm
Given the string ababaabbs, the suffixes are inserted into an initially empty tree one at a time, from suffix 1 (ababaabbs) to suffix 9 (s).
Each insertion follows the matching path down from the root. At the first mismatch it splits the current edge, if needed, and hangs a new leaf labelled with the rest of the suffix and its start position.
[Animation: the tree after each insertion. For example, ababaabbs,1 and babaabbs,2 start as single leaves; inserting suffix 3 (abaabbs) splits the first edge into aba followed by the leaves baabbs,1 and abbs,3; inserting suffix 4 (baabbs) splits the second edge into ba followed by the leaves baabbs,2 and abbs,4.]
Each of the n suffixes may be compared character by character against the tree, so the construction takes O(n²) time in the worst case.
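The insertion procedure illustrated for ababaabbs can be sketched as follows. This is a minimal version under stated assumptions, not the author's implementation: the names `Node`, `build_suffix_tree` and `leaves` are hypothetical, edge labels are stored as plain substrings, and the text is assumed to end in a character that occurs nowhere else (s in the example).

```python
class Node:
    def __init__(self, start=None):
        self.children = {}  # first character of edge label -> (edge label, child Node)
        self.start = start  # 1-based suffix position for leaves, None for internal nodes


def build_suffix_tree(text):
    """Insert the suffixes one by one from the root (quadratic time in the worst case)."""
    if not text or text.count(text[-1]) != 1:
        raise ValueError("text must end in a character that occurs only once")
    root = Node()
    for i in range(len(text)):
        node, rest = root, text[i:]
        while True:
            entry = node.children.get(rest[0])
            if entry is None:
                # No edge starts with the next character: hang a new leaf here.
                node.children[rest[0]] = (rest, Node(i + 1))
                break
            label, child = entry
            k = 0
            while k < len(label) and k < len(rest) and label[k] == rest[k]:
                k += 1
            if k == len(label):
                # Whole edge matched: continue from the child node.
                node, rest = child, rest[k:]
                continue
            # Mismatch inside the edge: split it and add the new leaf.
            middle = Node()
            middle.children[label[k]] = (label[k:], child)
            middle.children[rest[k]] = (rest[k:], Node(i + 1))
            node.children[label[0]] = (label[:k], middle)
            break
    return root


def leaves(node, prefix=""):
    """Map each root-to-leaf string to the suffix position stored at its leaf."""
    if node.start is not None:
        return {prefix: node.start}
    found = {}
    for label, child in node.children.values():
        found.update(leaves(child, prefix + label))
    return found


tree = build_suffix_tree("ababaabbs")
print(sorted(leaves(tree).items(), key=lambda item: item[1]))
```

Storing each edge as a pair of indices into the text instead of a copied substring would keep the space linear; substrings are used here only to keep the sketch readable.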
Definition of MUM
… a a t g….c t g...
… c g t g….c c c ...
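MUM (Maximal Unique Matching): a substring that is
• Matching: it occurs in both sequences,
• Unique: it occurs exactly once in each sequence,
• Maximal: it cannot be extended to the left or to the right and still match.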
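The three conditions of the definition can be checked directly, without any tree. The sketch below is a brute-force illustration; the helper names `count_occurrences` and `is_mum` are assumptions introduced here, and the test strings are the two used later in the deck.

```python
def count_occurrences(text, word):
    """Count (possibly overlapping) occurrences of word in text."""
    return sum(text.startswith(word, i) for i in range(len(text) - len(word) + 1))


def is_mum(s, t, word):
    """True if word is matching, unique and maximal with respect to s and t."""
    # Matching and unique: exactly one occurrence in each string.
    if not word or count_occurrences(s, word) != 1 or count_occurrences(t, word) != 1:
        return False
    i, j = s.find(word), t.find(word)
    end_i, end_j = i + len(word), j + len(word)
    # Maximal: the neighbouring characters must differ (or be missing) on both sides.
    extends_left = i > 0 and j > 0 and s[i - 1] == t[j - 1]
    extends_right = end_i < len(s) and end_j < len(t) and s[end_i] == t[end_j]
    return not (extends_left or extends_right)


print(is_mum("ababaabs", "aabaat", "abaa"))  # True
print(is_mum("ababaabs", "aabaat", "baa"))   # False: extends to the left with a
print(is_mum("ababaabs", "aabaat", "aa"))    # False: occurs twice in aabaat
```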
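Search for MUMs
Given strings ababaabs and aabaat:
[Figure: generalized suffix tree of ababaabs and aabaat; each leaf is labelled with the start position of its suffix within its own string.]
1st: Bottom-up traversal (through the tree), keeping the internal nodes that have exactly one leaf from each string below them.
List of UM: aab, abaa, baa.
2nd: Search for maximals (through the list of UM).
MUMs: aab, abaa. (baa is discarded because it can be extended to the left by a in both strings, giving abaa.)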
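The two steps can be sketched in Python. As an assumption made for brevity, the sketch uses an uncompressed generalized suffix trie instead of a compact tree, and it requires each string to end in its own unique terminator (s and t in the example). The names `build_generalized_trie`, `collect_unique_matches` and `find_mums` are hypothetical.

```python
def build_generalized_trie(s, t):
    """Insert every suffix of both strings; leaves are stored under the key None."""
    root = {}
    for which, text in enumerate((s, t)):
        for start in range(len(text)):
            node = root
            for ch in text[start:]:
                node = node.setdefault(ch, {})
            node[None] = (which, start)  # (string index, 0-based start position)
    return root


def collect_unique_matches(node, word, out):
    """Bottom-up pass: return the leaves below node and record the unique matches."""
    if None in node:
        return [node[None]]
    below = []
    for ch, child in node.items():
        below += collect_unique_matches(child, word + ch, out)
    # A branching node with exactly one leaf from each string is a unique match.
    if word and len(node) >= 2 and sorted(which for which, _ in below) == [0, 1]:
        positions = dict(below)
        out.append((word, positions[0], positions[1]))
    return below


def find_mums(s, t):
    if s[-1] in s[:-1] + t or t[-1] in t[:-1] + s:
        raise ValueError("each string must end in its own unique terminator")
    unique_matches = []
    collect_unique_matches(build_generalized_trie(s, t), "", unique_matches)
    # 2nd step: keep the matches that cannot be extended to the left.
    mums = [w for w, i, j in unique_matches if i == 0 or j == 0 or s[i - 1] != t[j - 1]]
    return sorted(w for w, _, _ in unique_matches), sorted(mums)


print(find_mums("ababaabs", "aabaat"))  # (['aab', 'abaa', 'baa'], ['aab', 'abaa'])
```

Only the left side needs checking in the second step, because a branching node is by construction not extendable to the right.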