© Wiley Publishing. 2007. All Rights Reserved. How Most People Use Bioinformatics

Dec 16, 2015


Nigel Summers
Transcript
Page 1: © Wiley Publishing. 2007. All Rights Reserved. How Most People Use Bioinformatics.

How Most People Use Bioinformatics

Page 2:

Learning Objectives

Get an overview of the most basic tools used in bioinformatics
Get an overview of Medline, the virtual library
Get an idea of what the rest of the course will be about

Page 3:

Outline

Retrieving scientific information with Medline
Fetching the protein or the DNA sequence you need
Searching a database with BLAST
Making a ClustalW alignment

Page 4:

PubMed/Medline

PubMed is a database that indexes most of the recent scientific publications in biology and medicine

PubMed is free

You can search PubMed using any keyword you are interested in.

Page 5:

Searching PubMed Rapidly

Open www.ncbi.nlm.nih.gov/pubmed
Type your favorite keywords
Press Return or Enter

Page 6:

Searching PubMed Precisely

Click the Limits tab
Check the boxes you are interested in, such as
• Review
• English
• AIDS

Page 7:

Searching PubMed Very Precisely

Restrict the search with fields
• [AU] Author
• [SO] Source (journal)
• [TI] Title
• [AD] Address
• [MH] Keywords (MeSH terms)

The words will be searched only in the corresponding fields
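
The field tags above can also be attached to search terms programmatically before pasting the result into the PubMed search box. The following is a minimal sketch, assuming terms are simply joined with AND; the helper name `pubmed_query` and the example terms are hypothetical and not part of the original slides.

```python
def pubmed_query(terms):
    """Join (text, field_tag) pairs into one PubMed query string.

    A tag of None leaves the term unrestricted (searched in all fields).
    """
    parts = []
    for text, tag in terms:
        # Append the tag in square brackets, as in "Smith TF[AU]".
        parts.append(f"{text}[{tag}]" if tag else text)
    return " AND ".join(parts)


# Hypothetical example: one author, one title word, one free keyword.
query = pubmed_query([("Smith TF", "AU"), ("alignment", "TI"), ("review", None)])
print(query)
```

The printed string can be typed directly into the PubMed search box; swapping " AND " for " OR " would broaden the search instead of narrowing it.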

Page 8:

Tips for Searching Medline

Use OR and AND to refine your queries
Add the initials of the paper’s author, as in Smith TF
Save the PMID of your papers
• Very precise
• Very short
Medline contains only papers published after 1965
Use no more than 10 names for papers before 1995

Page 9:

Retrieving Protein Sequences in Swiss-Prot

Swiss-Prot is a database containing all the proteins with known functions
Swiss-Prot is available from the ExPASy server at www.expasy.ch/sprot/
ExPASy: Expert Protein Analysis System
ExPASy contains many useful online tools

Page 10:

The Swiss-Prot Entry

Each Swiss-Prot entry is dedicated to a protein

A Swiss-Prot entry summarizes everything that is known about a given protein

The entry contains functional information and links to other databases mentioning this protein

Page 11:

Typical Swiss-Prot Entry

Protein name
Protein function
Bibliography
Links to other databases
• Structure
• Domains
• Function
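
When an entry is downloaded as plain text, the parts listed above can be pulled apart by the two-letter code that starts each line. The sketch below assumes the flat-file layout in which DE lines carry the protein name and DR lines carry the links to other databases. The sample entry is invented and heavily shortened for illustration; it is not a real record.

```python
from collections import defaultdict

# Invented, shortened entry for illustration only; not a real Swiss-Prot record.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN           Reviewed;         120 AA.
AC   P00000;
DE   RecName: Full=Hypothetical example protein;
DR   PDB; 0XXX; X-ray; 2.00 A; A=1-120.
DR   Pfam; PF00000; Example_domain; 1.
//
"""


def group_by_line_code(entry_text):
    """Group the lines of one flat-file entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in entry_text.splitlines():
        if line.startswith("//"):  # "//" marks the end of an entry
            break
        if not line.strip():
            continue
        # Assumed layout: 2-letter code, 3 spaces, then the content.
        fields[line[:2]].append(line[5:].strip())
    return dict(fields)


fields = group_by_line_code(SAMPLE_ENTRY)
# The first item of each DR line names the linked database.
linked_databases = [line.split(";")[0] for line in fields.get("DR", [])]
print(fields["DE"][0])
print(linked_databases)
```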

Page 12:

Looking for DNA Sequences

There are many types of DNA sequences
The most common are
• Regulatory regions, often before genes
• Untranslated regions, often around the genes
• Protein-coding regions
• Intergenic regions (between the genes)

All these sequences can be found in GenBank

Page 13:

Fetching a DNA Sequence at the NCBI

Navigate to www.ncbi.nlm.nih.gov/Genbank/
Type in a keyword.
Press Return or Enter.
You get a list of entries matching your keyword.
Point, click, and explore…

Page 14:

Searching for Your Sequence with BLAST

BLAST: Basic Local Alignment Search Tool

Compares your sequence with all other sequences in your favorite database

Returns the most similar sequences

Page 15:

Choosing Your BLAST Program

Navigate to the NCBI BLAST site at www.ncbi.nlm.nih.gov/BLAST
Select your BLAST program:
• Blastn: DNA
• Blastp: Protein
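
The choice between Blastn and Blastp depends only on whether the query is DNA or protein, so it can be guessed from the letters in the sequence. This is a rough sketch under an assumed rule of thumb: if at least 90% of the letters are nucleotide codes, treat the sequence as DNA. The function name, the threshold, and the example sequences are hypothetical.

```python
# Letters treated as nucleotide codes (U and N included for RNA and unknown bases).
DNA_LETTERS = set("ACGTUN")


def choose_blast_program(sequence, threshold=0.9):
    """Guess 'blastn' (DNA) or 'blastp' (protein) from the sequence letters."""
    letters = [ch for ch in sequence.upper() if ch.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")
    dna_fraction = sum(ch in DNA_LETTERS for ch in letters) / len(letters)
    return "blastn" if dna_fraction >= threshold else "blastp"


# Made-up example sequences.
print(choose_blast_program("ATGGCGTACGTTAGC"))
print(choose_blast_program("MKTAYIAKQRQISFVKSHFSRQ"))
```

A short peptide made only of A, C, G, and T residues would fool this heuristic, so the guess is worth a glance before submitting.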

Page 16:

Blasting a Protein Sequence

Cut and paste your sequence
Click the BLAST button at the bottom of the screen
Wait

Page 17:

Reading a BLAST Output

Every line is a hit
Best hits come first
Low E-value = good hit
E-value > 1 = bad hit

G means a link to a complete genome

U means a link to UniGene, the transcript database
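
The E-value rule of thumb above can be applied to a list of hits once they have been copied out of the results page. The sketch below assumes each hit has been reduced to a name and an E-value; the `Hit` record and the sample values are hypothetical, not real BLAST output.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical minimal record for one line of a BLAST hit list."""
    name: str
    e_value: float


def rank_hits(hits, max_e_value=1.0):
    """Drop hits with an E-value above the cutoff and put the best hits first."""
    kept = [hit for hit in hits if hit.e_value <= max_e_value]
    # Lower E-value means a better hit, so sort in ascending order.
    return sorted(kept, key=lambda hit: hit.e_value)


# Made-up hits for illustration.
hits = [Hit("seqA", 3e-50), Hit("seqB", 2.4), Hit("seqC", 1e-5)]
for hit in rank_hits(hits):
    print(hit.name, hit.e_value)
```

In practice a much stricter cutoff than 1 is often chosen; the default here only mirrors the slide.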

Page 18:

Multiple-Sequence Alignments (MSAs)

Multiple alignments reveal common features between sequences
Multiple alignments are useful for
• Comparing very different sequences
• Making phylogenetic trees
• Making structure predictions

Multiple-sequence alignments are abbreviated as MSAs

Page 19:

Making a Multiple-Sequence Alignment

Identify a set of related sequences
Do a BLAST of your favorite sequence
Choose a method:
• ClustalW (popular): www.ebi.ac.uk/clustalw
• T-Coffee (accurate): www.tcoffee.org
• Muscle (fast): www.drive5.com/muscle/
• M-Coffee (consensus): www.tcoffee.org

Page 20:

Making an MSA with M-Coffee

Open www.tcoffee.org
Click MCoffee::Regular
Cut and paste your sequences
Submit your MSA

Page 21:

Making Sense of Your MSA

Positions are marked:
• Completely conserved = asterisk (*)
• Highly conserved = colon (:)
• Conserved = period (.)

Look for highly conserved blocks:
• The red box on this slide shows a highly conserved block.
• These blocks are often functionally important positions.
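
The asterisk marking and the search for conserved blocks can be reproduced on a small alignment. This is a simplified sketch: it marks only completely conserved columns with an asterisk and does not attempt the colon or period categories, which depend on amino-acid similarity groups. The function names, the minimum block length, and the three aligned sequences are made up for illustration.

```python
def conserved_columns(aligned):
    """Return a marker string with '*' where every sequence has the same residue."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    markers = []
    for column in zip(*aligned):
        residues = set(column)
        # A column counts as conserved only if it holds one residue and no gap.
        markers.append("*" if len(residues) == 1 and "-" not in residues else " ")
    return "".join(markers)


def conserved_blocks(markers, min_length=3):
    """Return (start, end) pairs, end exclusive, for runs of '*' of min_length or more."""
    blocks, start = [], None
    # The trailing space closes a run that reaches the end of the alignment.
    for i, mark in enumerate(markers + " "):
        if mark == "*" and start is None:
            start = i
        elif mark != "*" and start is not None:
            if i - start >= min_length:
                blocks.append((start, i))
            start = None
    return blocks


# Made-up aligned sequences ('-' is a gap).
alignment = ["MKTAYIAK", "MKSAYIGK", "MKTAYI-K"]
markers = conserved_columns(alignment)
print(markers)
print(conserved_blocks(markers))
```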

Page 22:

Going Farther

Bioinformatics is all about getting knowledge without having to perform real-world experiments.

Many more details in later chapters:
• Databases: Chapters 3 and 4
• BLAST: Chapter 7
• Multiple-sequence alignments: Chapters 9 and 10