An Introduction to Bioinformatics Molecular Biology Databases

Dec 23, 2015

Merryl Bradley
Page 1: An Introduction to Bioinformatics Molecular Biology Databases.

An Introduction to Bioinformatics

Molecular Biology Databases

Page 2: An Introduction to Bioinformatics Molecular Biology Databases.

AIMS

• To introduce the major databases (nucleotide and protein)

• To explain how to search the appropriate databases

• To explain how to retrieve information from databases

OBJECTIVES

• Choose appropriate databases for information retrieval

• Use Boolean operators to search databases

• Retrieve nucleotide and protein sequence files

Page 3: An Introduction to Bioinformatics Molecular Biology Databases.

Introduction

• Hundreds!

• Databases of databases!

• Acronym rich!

• Subcomponents
– organisms
– structure
– metabolism …

• Searched
– text, sequences

Page 4: An Introduction to Bioinformatics Molecular Biology Databases.

Historically

• 1960s
– Margaret Dayhoff: protein sequences

(Eck, R. V., and M. O. Dayhoff. 1966. Atlas of Protein Sequence and Structure 1966. National Biomedical Research Foundation, Silver Spring, Maryland.)

• 1980s: explosion in DNA sequences
– EMBL (European Molecular Biology Laboratory)
– GenBank (NIH, National Institutes of Health)
– DDBJ (DNA Data Bank of Japan)

• 1988
– agreed on international collaboration

Page 6: An Introduction to Bioinformatics Molecular Biology Databases.

Primary Databases

• Experimentally determined nucleotide sequence
• Inferred protein sequence

– Nucleotides: EMBL, GenBank, DDBJ
– Proteins: GenPept, PIR (Protein Identification Resource), SWISS-PROT

• Which to choose?

Page 7: An Introduction to Bioinformatics Molecular Biology Databases.

Composite Databases

SWISS-PROT + PIR + GenPept +

SWISS-PROT, Swissnew, TrEMBL, TrEMBLnew, GenBank, PIR, Wormpep and PDB

Page 8: An Introduction to Bioinformatics Molecular Biology Databases.

Secondary Databases

• Analytical results of primary databases

• Searching for related patterns

– Prosite
– Pfam

More on these later

Page 9: An Introduction to Bioinformatics Molecular Biology Databases.

Sub-Databases

• EST - Expressed Sequence Tags

• STS - Sequence Tagged Sites

• SNP - Single Nucleotide Polymorphisms

• OMIM - Online Mendelian Inheritance in Man

Page 10: An Introduction to Bioinformatics Molecular Biology Databases.

Searching and Retrieval

• Entrez - National Center for Biotechnology Information

• SRS - European Bioinformatics Institute

• DBGET - Japan’s GenomeNet.

All three can retrieve a specific nucleotide or protein sequence and provide links to additional related information.

Page 11: An Introduction to Bioinformatics Molecular Biology Databases.

Entrez

Page 12: An Introduction to Bioinformatics Molecular Biology Databases.

Entrez Tutorial

• Search for penicillin-binding genes
• Search for Mycobacterium tuberculosis
• Combine the searches
• Scan the output

Q: Are there any genes that code for penicillin binding in the Mycobacterium genome?

This is an example of a text-based search to identify genes that have already been annotated.

Page 17: An Introduction to Bioinformatics Molecular Biology Databases.

#1 AND #2
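
The combined search (#1 AND #2) can also be written as a single query string. The sketch below is a minimal illustration, assuming NCBI's E-utilities ESearch endpoint rather than the web form shown in the tutorial. The helper names `combine_and` and `esearch_url`, the `gene` database and the `retmax` value are illustrative choices, not part of the slides. The code only builds the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: NCBI E-utilities ESearch endpoint (programmatic access,
# not the Entrez web page used in the tutorial).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def combine_and(*terms):
    """Join search terms with AND, bracketing each one (hypothetical helper)."""
    return " AND ".join(f"({term})" for term in terms)


def esearch_url(term, db="gene", retmax=20):
    """Build an ESearch URL for `term`; db and retmax are illustrative defaults."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


# Search #1 and search #2 from the tutorial, combined as "#1 AND #2".
query = combine_and("penicillin-binding", "Mycobacterium tuberculosis")
url = esearch_url(query)

print(query)
print(url)
```

Opening the printed URL in a browser should return an XML list of matching record identifiers, which corresponds to the "scan the output" step.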

Page 21: An Introduction to Bioinformatics Molecular Biology Databases.

SRS guide

Page 22: An Introduction to Bioinformatics Molecular Biology Databases.

Searching the Databases

• Subject

• Accession Numbers, e.g. AF208262

• Author
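
An accession number such as AF208262 identifies one record, so it is the most direct way to retrieve a sequence file. As a rough sketch, assuming NCBI's E-utilities EFetch endpoint and its `nuccore` nucleotide database (neither appears on the slides), a FASTA download URL could be built like this. The function name `efetch_fasta_url` is hypothetical, and the code does not perform the download.

```python
from urllib.parse import urlencode

# Assumption: NCBI E-utilities EFetch endpoint for programmatic retrieval.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, db="nuccore"):
    """Build a URL that requests the record `accession` as plain-text FASTA."""
    params = {
        "db": db,            # nucleotide database (illustrative default)
        "id": accession,     # accession number from the slide
        "rettype": "fasta",  # FASTA sequence format
        "retmode": "text",   # plain text rather than XML
    }
    return EFETCH + "?" + urlencode(params)


url = efetch_fasta_url("AF208262")
print(url)
```

The resulting URL can be passed to `urllib.request.urlopen` or pasted into a browser to obtain the sequence file.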

Page 23: An Introduction to Bioinformatics Molecular Biology Databases.

Boolean Operators

AND locates all records containing both words, e.g. human AND protease

OR locates all records containing either word, not necessarily both, e.g. human OR protease

NOT locates records containing the first word but not the second, e.g. human NOT protease
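
The three operators behave like set intersection, union and difference. The following sketch illustrates this on four made-up records; the record IDs and their words are invented for the example and do not come from any real database.

```python
# Toy records (invented for illustration): record ID -> words it contains.
records = {
    "rec1": {"human", "protease"},
    "rec2": {"human", "kinase"},
    "rec3": {"mouse", "protease"},
    "rec4": {"yeast", "kinase"},
}


def matching(word):
    """Return the set of record IDs whose text contains `word`."""
    return {rec_id for rec_id, words in records.items() if word in words}


human = matching("human")
protease = matching("protease")

both = human & protease        # human AND protease
either = human | protease      # human OR protease
human_only = human - protease  # human NOT protease

print(sorted(both))        # ['rec1']
print(sorted(either))      # ['rec1', 'rec2', 'rec3']
print(sorted(human_only))  # ['rec2']
```

AND narrows a search, OR broadens it, and NOT excludes records, which is why the order of the two words matters only for NOT.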