Centre for Integrative Bioinformatics VU. Walter Pirovano, 24 Oct 2007. Genome analysis, Lecture 11: literature discussion

Mar 31, 2015



Centre for Integrative Bioinformatics VU

Walter Pirovano
24 Oct 2007

Genome analysis

Lecture 11: literature discussion


Papers

• Consensus sequences improve PSI-BLAST through mimicking profile-profile alignments
  Dariusz Przybylski and Burkhard Rost
  Nucleic Acids Research 2007

• Heads or Tails: A Simple Reliability Check for Multiple Sequence Alignments
  Giddy Landan and Dan Graur
  Molecular Biology and Evolution 2007


1st paper


BLAST and PSI-BLAST

• BLAST is a sequence-sequence method:
  Sequence (query) – Sequence (nr database)

• PSI-BLAST is a profile-sequence method:
  RUN 1: just like normal BLAST
  RUN 2: Profile (query) – Sequence (nr database)
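The profile-sequence idea of RUN 2 can be illustrated with a minimal sketch. It assumes the hits from RUN 1 are already aligned, ungapped and of equal length, and it uses a uniform background and a flat pseudocount. Real PSI-BLAST uses sequence weighting, substitution-matrix-based pseudocounts and gapped alignment, so the function names and parameters here are illustrative only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(aligned_hits, pseudocount=1.0):
    """Build per-column log-odds scores from equal-length, ungapped hits."""
    length = len(aligned_hits[0])
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(aligned_hits) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_hits)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile

def score_sequence(profile, sequence):
    """Sum the column scores of a candidate sequence (no gaps allowed)."""
    return sum(column[aa] for column, aa in zip(profile, sequence))

# Toy hits, invented for illustration
hits = ["FATNMG", "FVTNMN", "FATNMN"]
profile = build_profile(hits)
conserved = score_sequence(profile, "FATNMN")
unrelated = score_sequence(profile, "WWWWWW")
print(conserved > unrelated)
```

A database sequence that follows the conserved columns scores higher than an unrelated one, which is what lets the profile find more remote relatives than a single query sequence.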


Accuracy vs. Speed
the usual dilemma …

Sequence – Sequence

Profile – Sequence

Profile – Profile

(Diagram labels: ACCURACY, SPEED)


Consensus sequences - 1

• “1-D simplification of the sequence profile”

• Compromise between accuracy and speed

Profile (rows A, C, D, ..., Y)

Sequence 1:  F A T N M G T S D P P T
Sequence 2:  F V T N M N N S D G P T
Consensus:   F * T N M * * S D * P T
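The toy example on this slide can be reproduced with a short sketch. It keeps a residue only where all sequences agree and writes `*` elsewhere, which is an assumption that mirrors the slide's illustration; the paper derives its consensus from the full profile, so this is not the authors' procedure. The function name `strict_consensus` is made up for this example.

```python
def strict_consensus(sequences, wildcard="*"):
    """Keep a residue where every sequence agrees, else emit the wildcard."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    return "".join(
        column[0] if len(set(column)) == 1 else wildcard
        for column in zip(*sequences)
    )

seq1 = "FATNMGTSDPPT"
seq2 = "FVTNMNNSDGPT"
consensus = strict_consensus([seq1, seq2])
print(consensus)  # F*TNM**SD*PT
```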


Consensus sequences - 2

• How can we display consensus sequences?
  • Replace the complete sequence by the consensus sequence (100%)
  • Replace only local parts by consensus segments (top 50% & low 50%)

• Tests on:
  • Sequence – Consensus
  • Consensus – Consensus
  • Profile – Consensus


Method


Evaluation of results

• Ability to identify functionally related proteins

• Correctly align them based on structural alignments

• Function is more conserved than Structure


Functional evaluation: SCOP

SCOP hierarchy levels: classes, folds, superfamilies, families
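A small sketch of how a hit could be judged against the SCOP hierarchy. It assumes each protein domain carries a SCOP sccs identifier of the form `class.fold.superfamily.family` (for example `a.1.1.2`) and reports the deepest level two domains share. The function name is hypothetical, and which level counts as a "correct" hit is a choice the slide does not specify.

```python
def shared_scop_level(sccs_a, sccs_b):
    """Return the deepest SCOP level two sccs identifiers share, or None."""
    levels = ["class", "fold", "superfamily", "family"]
    shared = None
    for level, part_a, part_b in zip(levels, sccs_a.split("."), sccs_b.split(".")):
        if part_a != part_b:
            break
        shared = level
    return shared

print(shared_scop_level("a.1.1.2", "a.1.1.3"))  # superfamily
print(shared_scop_level("a.1.1.2", "b.1.1.2"))  # None
```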


Structural evaluation: 3D model quality

query:    MAGFWIL
template: MLGKSLL

• Making the model: simply copy coordinates

• Test model quality through LGA superposition (query model with query structure)

• ‘Gold standard’: structural alignment of known structure of query & template with MAMMOTH
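The "simply copy coordinates" step can be sketched as follows. The sketch assumes the alignment is given as two equal-length strings with `-` for gaps and that the template structure is reduced to one coordinate triple per residue (for instance the C-alpha atom). The coordinates below are made up, and the superposition with LGA is not shown.

```python
def copy_coordinates(query_aln, template_aln, template_coords, gap="-"):
    """Map query residue index -> template coordinates for aligned columns."""
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned strings must have equal length")
    model = {}
    q_idx = t_idx = 0
    for q_char, t_char in zip(query_aln, template_aln):
        if q_char != gap and t_char != gap:
            model[q_idx] = template_coords[t_idx]  # residue inherits template position
        if q_char != gap:
            q_idx += 1
        if t_char != gap:
            t_idx += 1
    return model

# Slide example: ungapped alignment, invented coordinates
template_coords = [(float(i), 0.0, 0.0) for i in range(7)]
model = copy_coordinates("MAGFWIL", "MLGKSLL", template_coords)

# Gapped toy alignment: query residue 1 (A) has no template partner
gapped_model = copy_coordinates("MA-GF", "M-LGF", template_coords[:4])
print(sorted(gapped_model))  # [0, 2, 3]
```

Query residues that fall opposite a gap get no coordinates, so a poorer alignment yields a model that covers less of the query or places residues wrongly, which is what the later superposition measures.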


Final sets for alignment test

• Set 1: most related, non-trivial pairs (no. = 1647)

• Set 2: more difficult, most diverged (no. = 5551)


Results functional analysis


Results structural analysis


2nd paper


There are quite a few multiple alignment methods ...

PRALINE

... but what about accuracy?


Benchmarking: usually done on structural alignments

• There are several alignment benchmarks, such as BAliBASE, HOMSTRAD or SABMARK

• But they can only tell us the alignment quality on their predefined sets

• Alignment methods need to define quality and consistency criteria.


Heads-or-Tails method

?