Sequence Similarity
Sequence Similarity · •Homologues: similar sequences –homology –homologous •Orthologs: a similar gene appears in two different organisms where –several other such similarities

Jul 04, 2020

dariahiddleston
Sequence Similarity

Why study sequence similarity?

• Possible indication of common ancestry

• Similarity of structure implies similar biological function – even among apparently distant organisms

• Example context: establishing possible causal relationship between wide use of antibiotics in agriculture and spread of antibiotic resistant bacteria

Antibiotic resistant bacteria

• have evolved rapidly

• can thrive when antibiotics kill non-resistant bugs

• horizontal gene transfer can speed development of antibiotic resistance

Source: http://textbookofbacteriology.net/themicrobialworld/bactresanti.html

Figure 3.2: Vertical and horizontal gene transfer

Figure 3.3: How exposure to antibiotics selects for the survival of resistant cells in a population of bacteria

Figure 3.4: A plasmid carrying an antibiotic-resistance gene can be transferred to a new cell by conjugation

Antibiotic resistant bacteria

• Widespread use of antibiotics means non-resistant strains die, leaving resistant strains to survive and multiply; phenomenon observed in hospitals, care centers, etc.

• Once some bacteria in an environment are resistant, horizontal gene transfer (HGT) can spread resistance faster than it would otherwise spread (through mutation alone)

Antibiotic resistant bacteria

• Use of antibiotics common in agriculture

• Presence in human pathogens of resistance genes that are highly similar to genes found in animals would provide evidence that HGT has occurred

Gene similarity

• Homologues: similar sequences

– homology

– homologous

• Orthologs: a similar gene appears in two different organisms where

– several other such similarities occur

– organisms have common evolutionary ancestry

• Xenologs: similar gene found in organisms that have little else in common – evidence of HGT

Similarity: how close is close?

• Proteins considered homologous if at least 25% of residues are identical

• DNA considered homologous with at least 70% identity

• Threshold level for HGT: 95% identity
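
The thresholds above can be checked with a few lines of code. This is a minimal sketch, assuming the two sequences are already aligned to equal length with "-" marking gaps, and that identity is counted over ungapped columns only (other conventions exist). The names `percent_identity`, `meets_threshold`, and `THRESHOLDS` are illustrative, not part of any BLAST tool.

```python
# Cutoffs taken from the slide; the dictionary keys are hypothetical labels.
THRESHOLDS = {"protein": 25.0, "dna": 70.0, "hgt": 95.0}


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns where both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(identity: float, kind: str) -> bool:
    """True if the identity reaches the slide's cutoff for the given kind."""
    return identity >= THRESHOLDS[kind]
```

For example, `percent_identity("ACGT", "ACGA")` gives 75.0, which passes the DNA homology cutoff but not the 95% level used as the HGT threshold.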

Establishing homology: alignment

• Match sequences in meaningful way

• Account for differences in sequence length due to indels:

– insertions

– deletions

• Scoring system based on closeness of match
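
To make the idea of alignment with indels and a scoring system concrete, here is a small sketch of global alignment by dynamic programming (the Needleman-Wunsch approach). It is not what BLAST does internally, since BLAST performs local alignment with heuristics; the score values (match +1, mismatch -1, gap -2) and the function name `global_align` are assumptions chosen only for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a simple global alignment."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # match or mismatch
                score[i - 1][j] + gap,      # gap in b (deletion relative to a)
                score[i][j - 1] + gap,      # gap in a (insertion relative to a)
            )
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Aligning "ACGT" with "AGT" under these scores places a single gap opposite the C, which is how a deletion shows up in an alignment.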

BLAST: Basic Local Alignment Search Tool

• Versions exist to compare

– protein – protein

• blastp: use when you want to learn about function of protein

– protein – nucleotide

• tblastn: used to compare protein with DNA to discover new genes encoding simple proteins

– nucleotide – nucleotide

• blastn: we’ll use this to look for HGT evidence

BLAST servers

• Home server at NCBI

• Other servers available worldwide

– BLAST servers very popular (and busy)

– Japan is sleeping when it’s morning in the USA

– Europe is sleeping when it’s afternoon in the USA

Using blastn

• Start with query sequence – nucleotide sequence you want to investigate

• BLAST compares query with every GenBank sequence

– performs alignment

– reports matches with high degree of similarity
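
The search idea can be illustrated with a toy version: compare a query against every sequence in a small in-memory "database" and report the best matching region in each. This is a deliberately simplified sketch, not the BLAST algorithm: it seeds on exact words and extends only while bases keep matching, with no mismatches, gaps, or statistics. The function names, the word size of 4, and the sample data in the usage note are all assumptions.

```python
def best_ungapped_hit(query, subject, word=4):
    """Return (length, query_start, subject_start) of the longest exact shared run."""
    # Index every word of the query by its start position(s).
    seeds = {}
    for i in range(len(query) - word + 1):
        seeds.setdefault(query[i:i + word], []).append(i)
    best = (0, -1, -1)
    for j in range(len(subject) - word + 1):
        for i in seeds.get(subject[j:j + word], []):
            # Extend the seed to the left while the bases still match.
            qs, ss = i, j
            while qs > 0 and ss > 0 and query[qs - 1] == subject[ss - 1]:
                qs -= 1
                ss -= 1
            # Extend the seed to the right while the bases still match.
            qe, se = i + word, j + word
            while qe < len(query) and se < len(subject) and query[qe] == subject[se]:
                qe += 1
                se += 1
            if qe - qs > best[0]:
                best = (qe - qs, qs, ss)
    return best


def search(query, database, word=4):
    """Compare the query with every database entry; best hits first."""
    hits = []
    for name, subject in database.items():
        length, qs, ss = best_ungapped_hit(query, subject, word)
        if length:
            hits.append((name, length, qs, ss))
    return sorted(hits, key=lambda h: h[1], reverse=True)
```

With a made-up database `{"s1": "TTTACGTACGGA", "s2": "GGGGCCCC", "s3": "ACGTAC"}` and the query "ACGTACGG", `search` lists s1 first (the whole query matches), then s3, and omits s2, which shares no word with the query.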

Using blastn

• Point browser to NCBI website

– choose BLAST on home page

– scroll down to Basic BLAST and choose nucleotide

Using blastn

• Paste your query sequence in the window, as shown:

Using blastn

• Scroll down to the next box on the page, and select the database to be searched (Nucleotide, in this case)

Using blastn

• Scroll down to the BLAST button and click it

• Then wait …

• Eventually, you’ll see a screen like this:

BLAST results

• Graphical summary

– query sequence at top

– each bar represents portion of another sequence similar to query

• red: most similar – homologous to query

• pink: not as good

• green: borderline

• blue/black: “twilight zone”
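
The color bands correspond to ranges of alignment score. The sketch below assumes the cutoffs of the classic NCBI color key (200, 80, 50, 40); these numbers are not stated on the slide and may differ between BLAST versions, so treat them as an assumption. The function name `score_color` is hypothetical.

```python
def score_color(alignment_score: float) -> str:
    """Map an alignment score to a bar color in the graphical summary.

    Cutoffs are assumed from the classic NCBI color key, not from the slide.
    """
    if alignment_score >= 200:
        return "red"    # most similar
    if alignment_score >= 80:
        return "pink"   # not as good
    if alignment_score >= 50:
        return "green"  # borderline
    if alignment_score >= 40:
        return "blue"   # "twilight zone"
    return "black"      # "twilight zone"
```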

BLAST results: graphics section

BLAST results: description section

BLAST results: description section

• Accession: database entry’s GenBank accession number

• Description: usually identifies organism, some characteristics of sequence

• Scores: based on number of matches in alignment

• E-value: statistical significance of score

E-value

• Estimate of the number of matches at least this good that would be expected by chance in a database of this size

• The lower the E-value, the greater the significance:

– greater similarity between query & target

– greater confidence of homology

– identical sequences have an E-value of 0; anything above 0.001 is considered insignificant

• E-values are written in scientific notation (for example, 3e-52 means 3 × 10^-52)
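
A short sketch of working with E-values in code. The first function applies the 0.001 cutoff from the slide to an E-value written in scientific notation. The second evaluates the standard Karlin-Altschul relation E = K·m·n·e^(−λS), where m and n are the query and database lengths and S is the raw score; K and λ depend on the scoring system, so they are left as required arguments rather than given default values. Function names are illustrative.

```python
import math

SIGNIFICANCE_CUTOFF = 1e-3  # from the slide: above 0.001 is insignificant


def is_significant(e_value_text: str, cutoff: float = SIGNIFICANCE_CUTOFF) -> bool:
    """Parse an E-value such as '3e-52' and compare it with the cutoff."""
    return float(e_value_text) <= cutoff


def expected_hits(raw_score: float, query_len: int, db_len: int,
                  k: float, lam: float) -> float:
    """Karlin-Altschul estimate: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def rank_by_e_value(hits):
    """Sort (name, e_value_text) pairs so the most significant hit comes first."""
    return sorted(hits, key=lambda hit: float(hit[1]))
```

Because the score appears in a negative exponent, a higher alignment score drives the E-value down quickly, which is why strong hits show values such as 3e-52 or 0.0.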

Alignment section