Page 1
Evaluating alignments using motif detection
• Let’s evaluate alignments by searching for motifs
• If alignment X reveals more functional motifs than alignment Y under technique Z, then X is better than Y with respect to Z
• Motifs could be functional sites in proteins or functional regions in non-coding DNA
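The comparison rule above can be sketched in a few lines. This is only an illustration under stated assumptions: motifs are represented as inclusive (start, end) residue ranges, known functional sites as residue numbers, and the names `sites_recovered` and `better_alignment` as well as all data values are hypothetical, not taken from the slides.

```python
def sites_recovered(predicted_regions, functional_sites):
    """Count known functional sites that fall inside any predicted motif region."""
    covered = {
        site
        for site in functional_sites
        for start, end in predicted_regions
        if start <= site <= end
    }
    return len(covered)


def better_alignment(regions_x, regions_y, functional_sites):
    """Return "X", "Y", or "tie" depending on which alignment recovers more sites."""
    x = sites_recovered(regions_x, functional_sites)
    y = sites_recovered(regions_y, functional_sites)
    if x == y:
        return "tie"
    return "X" if x > y else "Y"


# Hypothetical data: motifs found by the same technique Z on two alignments.
functional_sites = [12, 45, 90]
regions_x = [(10, 15), (40, 50)]   # motifs detected from alignment X
regions_y = [(88, 92)]             # motifs detected from alignment Y
verdict = better_alignment(regions_x, regions_y, functional_sites)
print(verdict)
```

Counting recovered sites is just one possible reading of "reveals more functional motifs"; a real evaluation might also penalize false positives.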
Page 2
Protein Functional Site Prediction
• The identification of protein regions responsible for stability and function is an especially important post-genomic problem
• With the explosion of genomic data from recent sequencing efforts, predicting protein functional sites from sequence alone is an increasingly important bioinformatics task
Page 3
What is a “Functional Site”?
• Defining what constitutes a “functional site” is not trivial
• Residues that carry out a known function, and residues that cluster around them, are clear candidates for functional sites
• We define a functional site as catalytic residues, binding sites, and the regions that cluster around them
Page 6
Functional Sites (FS)
Page 7
Regions that Cluster Around FS
Page 8
Phylogenetic motifs
• PMs are short sequence fragments that conserve the overall familial phylogeny
• Are they functional?
• How do we detect them?
Page 9
Phylogenetic motifs
• PMs are short sequence fragments that conserve the overall familial phylogeny
• Are they functional?
• How do we detect them?
• First we design a simple heuristic to find them
• Then we see if the detected sites are functional
Page 10
Scan for Similar Trees
2DBL DVVMTQIPLSLPVNLGDQASISCRSSQSLIHSNGNTYLHWYLQKPGQSPKLLMYKVSNRF
1NCA DIVMTQSPKFMSTSVGDRVTITCKASQ-----DVSTAVVWYQQKPGQSPKLLIYWASTRH
2JEL DVLMTQTPLSLPVSLGDQASISCRSSQSIVHGNGNTYLEWYLQKPGQSPKLLIYKISNRF
2IGF DVLMTQTPLSLPVSLGDQASISCRSNQTILLSDGDTYLEWYLQKPGQSPKLLIYKVSNRF
3HFM DIVLTQSPATLSVTPGNSVSLSCRASQS-----IGNNLHWYQQKSHESPRLLIKYASQSI
3HFL DIVLTQSPAIMSASPGEKVTMTCSASSS------VNYMYWYQQKSGTSPKRWIYDTSKLA
1NGP QAVVTQES-ALTTSPGETVTLTCRSSTG--AVTTSNYANWVQEKPDHLFTGLIGGTNNRA

2DBL YGVPDRFSGSGSGTDFTLKISRVEAEDLGIYFCSQSSHVPPTFGGGTKLEIK-RADAAPT
1NCA IGVPDRFAGSGSGTDYTLTISSVQAEDLALYYCQQHYSPPWTFGGGTKLEIK-RADAAPT
2JEL SGVPDRFSGSGSGTDFTLKISRVEAEDLGVYYCFQGSHVPYTFGGGTKLEIK-RADAAPT
2IGF SGVPDRFSGSGSGTDFTLKISRVEAEDLGVYYCFQGSHVPPTFGGGTKLEIK-RADAAPT
3HFM SGIPSRFSGSGSGTDFTLSINSVETEDFGMYFCQQSNSWPYTFGGGTKLEIK-RADAAPT
3HFL SGVPVRFSGSGSGTSYSLTISSMETEDAATYYCQQWGRNP-TFGGGTKLEIK-RADAAPT
1NGP PGVPARFSGSLIGDKAALTITGAQTEDEAIYFCALWYSNHWVFGGGTKLTVLGQPKSSPS

2DBL MSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNR---QIQLVQSGPELKKPGETVKI
1NCA MSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNRNECQIQLVQSGPELKKPGETVKI
2JEL MSSTLTLTKDEYERHNSYTCEATHKTSDSPIVKSFNRN--QVQLAQSGPELVRPGVSVKI
2IGF MSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNRNECEVQLVESGGDLVKPGGSLKL
3HFM MSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNRNECDVQLQESGPSLVKPSQTLSL
3HFL MSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNRNECXVQLQQSGAELMKPGASVKI
1NGP ASSYLTLTARAWERHSSYSCQVTHEGHT--VEKSLSR---QVQLQQPGAELVKPGASVKL
Whole Tree
Page 11
Scan for Similar Trees
Whole Tree
Page 12
Scan for Similar Trees
Windowed Tree Whole Tree
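The scan slides a fixed-width window along the alignment and builds one tree per window. A minimal sketch of the window extraction step follows; the window width, the function name `sliding_windows`, and the three-sequence excerpt are illustrative assumptions (the slides do not state a width), and tree building itself is left out.

```python
def sliding_windows(alignment, width):
    """Yield (start_column, sub_alignment) for every window of `width` columns."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    for start in range(length - width + 1):
        yield start, {
            name: seq[start:start + width] for name, seq in alignment.items()
        }


# Short excerpt of the first 15 columns for three of the sequences.
alignment = {
    "2DBL": "DVVMTQIPLSLPVNL",
    "1NCA": "DIVMTQSPKFMSTSV",
    "2JEL": "DVLMTQTPLSLPVSL",
}
windows = list(sliding_windows(alignment, width=5))
print(len(windows), windows[0])
```

Each sub-alignment would then be passed to whatever tree-building method is used for the whole alignment, so that windowed and whole trees are comparable.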
Page 13
Scan for Similar Trees
Partition Metric Score: 6
Windowed Tree Whole Tree
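The partition metric counts the bipartitions (splits) that appear in one tree but not the other, so 0 means identical topologies. For fully resolved unrooted trees on n leaves the maximum is 2(n - 3), which is 8 for the seven sequences here. The sketch below assumes trees are given as nested tuples of leaf names; that representation and the helper names are assumptions made for illustration, and the five-leaf trees are toy examples.

```python
def leaf_names(tree):
    """Return all leaf names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaf_names(child)]


def splits(tree):
    """Return the set of non-trivial bipartitions induced by the tree's edges."""
    leaves = frozenset(leaf_names(tree))
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        other = leaves - below
        if len(below) > 1 and len(other) > 1:
            # An unordered pair of sides, so A|B and B|A are the same split.
            found.add(frozenset((below, other)))
        return below

    walk(tree)
    return found


def partition_metric(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


whole = ((("A", "B"), "C"), ("D", "E"))
windowed = ((("A", "C"), "B"), ("D", "E"))
rerooted = ("A", ("B", ("C", ("D", "E"))))  # same unrooted topology as `whole`
print(partition_metric(whole, windowed), partition_metric(whole, rerooted))
```

Because splits are compared rather than rooted clades, the score does not depend on where a tree happens to be rooted.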
Page 14
Scan for Similar Trees
Partition Metric Score: 8
Windowed Tree Whole Tree
Page 15
Scan for Similar Trees
Partition Metric Score: 4
Windowed Tree Whole Tree
Page 16
Scan for Similar Trees
Partition Metric Score: 6
Windowed Tree Whole Tree
Page 17
Scan for Similar Trees
Partition Metric Score: 8
Windowed Tree Whole Tree
Page 18
Scan for Similar Trees
Partition Metric Score: 6
Windowed Tree Whole Tree
Page 19
Scan for Similar Trees
Partition Metric Score: 6
Windowed Tree Whole Tree
Page 20
Scan for Similar Trees
Partition Metric Score: 0
Windowed Tree Whole Tree
Page 21
Scan for Similar Trees
Partition Metric Score: 6
Windowed Tree Whole Tree
Page 22
Scan for Similar Trees
Partition Metric Score: 6
Windowed Tree Whole Tree
Page 23
Scan for Similar Trees
Partition Metric Score: 8
Windowed Tree Whole Tree
Page 24
Scan for Similar Trees
Partition Metric Score: 0
Windowed Tree Whole Tree
Page 25
Scan for Similar Trees
Partition Metric Score: 6
Windowed Tree Whole Tree
Page 26
Scan for Similar Trees
Partition Metric Score: 6
Windowed Tree Whole Tree
Page 27
Scan for Similar Trees
Partition Metric Score: 6
Windowed Tree Whole Tree
Page 28
Phylogenetic Motif Identification
• Compare all windowed trees with whole tree and keep track of the partition metric scores
• Normalize all partition metric scores by calculating z-scores
• Call these normalized scores Phylogenetic Similarity Z-scores (PSZ)
• Set a PSZ threshold for identifying windows that represent phylogenetic motifs
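The normalization and thresholding steps can be sketched as follows, using the partition metric scores in the order they appear on the preceding slides. Several details are assumptions rather than anything the slides specify: the population standard deviation is used, a lower (more negative) PSZ is taken to mean a window tree closer to the whole tree, and the threshold value of -1.5 is purely illustrative.

```python
from statistics import mean, pstdev

# Partition metric scores as shown on the "Scan for Similar Trees" slides.
scores = [6, 8, 4, 6, 8, 6, 6, 0, 6, 6, 8, 0, 6, 6, 6]


def psz_scores(raw_scores):
    """Convert raw partition metric scores to z-scores (PSZ)."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("all scores are identical; z-scores are undefined")
    return [(score - mu) / sigma for score in raw_scores]


def motif_windows(raw_scores, threshold):
    """Indices of windows whose PSZ is at or below the threshold."""
    return [i for i, z in enumerate(psz_scores(raw_scores)) if z <= threshold]


psz = psz_scores(scores)
selected = motif_windows(scores, threshold=-1.5)  # hypothetical threshold
print(selected)
```

With these scores, only the two windows whose trees match the whole tree exactly (score 0) fall below the assumed threshold.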
Page 29
Set PSZ Threshold
Page 31
Map PMs to the Structure
Page 32
Map PMs to the Structure
Set PSZ Threshold
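The thresholding step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes PSZ is a phylogenetic similarity z-score computed per sliding window from a tree distance to the full-family tree, that lower scores mean greater similarity, and that overlapping or adjacent windows passing the threshold are merged into one PM. The function names, the window size, and the default threshold of -1.5 are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(distances):
    """Convert raw per-window tree distances to z-scores."""
    mu = mean(distances)
    sigma = pstdev(distances)
    if sigma == 0:
        # All windows score the same, so none stands out.
        return [0.0 for _ in distances]
    return [(d - mu) / sigma for d in distances]

def call_motifs(distances, window, threshold=-1.5):
    """Return merged (start, end) column ranges, end exclusive.

    distances[i] is the (assumed) tree distance for the window that
    starts at alignment column i. A window is kept when its z-score
    is at or below the threshold.
    """
    motifs = []
    for start, z in enumerate(z_scores(distances)):
        if z > threshold:
            continue
        end = start + window
        if motifs and start <= motifs[-1][1]:
            # Overlaps or touches the previous motif: extend it.
            motifs[-1] = (motifs[-1][0], max(motifs[-1][1], end))
        else:
            motifs.append((start, end))
    return motifs

if __name__ == "__main__":
    # Made-up distances, for illustration only.
    example = [10, 10, 2, 2, 10, 10, 10, 10, 1, 10]
    print(call_motifs(example, window=3, threshold=-1.0))
```

A stricter (more negative) threshold returns fewer, more confident motifs; a looser one returns more candidates at the cost of more false positives.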
Page 35
PMs in Various Structures
Page 36
PMs and Traditional Motifs
Page 37
TIM (triosephosphate isomerase)
Plot labels: Phylogenetic Similarity; False Positive Expectation
Page 41
Cytochrome P450
Plot labels: Phylogenetic Similarity; False Positive Expectation
Page 43
Enolase
Plot labels: Phylogenetic Similarity; False Positive Expectation
Page 44
Glycerol Kinase
Plot labels: Phylogenetic Similarity; False Positive Expectation
Page 46
Myoglobin
Plot labels: Phylogenetic Similarity; False Positive Expectation
Page 48
Evaluating alignments
• For a given alignment, compute the PMs
• Determine how many of those PMs are functional
• Alignments that yield more functional PMs are classified as better
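The comparison above can be sketched as a small ranking routine. This is an illustrative sketch, not the authors' code: it assumes each PM is a (start, end) residue range on a common reference sequence with the end exclusive, and that a PM counts as functional when it contains at least one residue from a known set of functional-site positions. The function names, alignment labels, and all positions below are hypothetical.

```python
def count_functional_pms(pms, functional_residues):
    """Count PMs that contain at least one functional residue.

    pms: list of (start, end) residue ranges, end exclusive.
    functional_residues: set of residue positions known to be functional.
    """
    return sum(
        1
        for start, end in pms
        if any(start <= r < end for r in functional_residues)
    )

def rank_alignments(pms_by_alignment, functional_residues):
    """Return (alignment name, functional PM count) pairs, best first."""
    counts = {
        name: count_functional_pms(pms, functional_residues)
        for name, pms in pms_by_alignment.items()
    }
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # Made-up PM ranges and functional positions, for illustration only.
    pms_by_alignment = {
        "alignment_X": [(10, 15), (40, 46), (90, 95)],
        "alignment_Y": [(12, 17), (60, 66)],
    }
    functional = {12, 43, 200}
    for name, count in rank_alignments(pms_by_alignment, functional):
        print(name, count)
```

Counting only PMs that hit a functional residue, rather than all PMs, keeps an alignment from being rewarded for producing many motifs that fall outside known functional sites.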
Page 51
Functional PMs
PAl = blue, MUSCLE = red, Both = green
(a) = enolase, (b) = ammonia channel, (c) = tri-isomerase, (d) = permease, (e) = cytochrome