Biological Databases Biology outside the lab
Jan 03, 2016

Susanna Cole
Biological Databases

Biology outside the lab

Why do we need Bioinformatics?

Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological information generated by the scientific community. This deluge of genomic information has, in turn, led to an absolute requirement for computerized databases to store, organize, and index the data and for specialized tools to view and analyze the data.

Information flux from data to decision

Biology, chemistry, and pharmaceutical research generate a huge amount of data. Data are analyzed more slowly than they are produced.

Human Genome Project:

22.1 billion bases sequenced, but … what do we really know about it?

Bioinformatics

- Building and managing biological databases (nucleotides, proteins, structures, small molecules, pathways, literature, …)

- Data mining and data analysis (Computational Biology)

- Ab initio protein modelling, homology modelling, simulations (Molecular Modeling)

Literature databases

http://www.ncbi.nlm.nih.gov/

Nucleotide databases

Protein databases

UniProt databases:

- Swiss-Prot: provides a high level of annotation, a minimal level of redundancy, and a high level of integration with other databases

- TrEMBL: a computer-annotated supplement of Swiss-Prot that contains all the translations of EMBL nucleotide sequence entries not yet integrated in Swiss-Prot

NCBI Protein database (a meta-database containing sequences from UniProt entries, PDB-derived sequences, and translations of predicted ORFs in GenBank)
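The Swiss-Prot/TrEMBL split is visible in UniProt's FASTA headers, where the identifier begins with `sp` for Swiss-Prot entries and `tr` for TrEMBL entries. The sketch below is not part of the original slides; the function and class names are illustrative, and it assumes headers in the usual `>db|accession|entry_name description` layout.

```python
from typing import NamedTuple


class UniProtHeader(NamedTuple):
    section: str      # "Swiss-Prot" (reviewed) or "TrEMBL" (computer-annotated)
    accession: str
    entry_name: str


# Database prefixes used at the start of UniProt FASTA identifiers.
_SECTIONS = {"sp": "Swiss-Prot", "tr": "TrEMBL"}


def parse_uniprot_header(line: str) -> UniProtHeader:
    """Split a header such as '>sp|P69905|HBA_HUMAN Hemoglobin subunit alpha'."""
    if not line.startswith(">"):
        raise ValueError("not a FASTA header line")
    identifier = line[1:].split(None, 1)[0]        # text before the first space
    prefix, accession, entry_name = identifier.split("|")
    if prefix not in _SECTIONS:
        raise ValueError(f"unknown database prefix: {prefix!r}")
    return UniProtHeader(_SECTIONS[prefix], accession, entry_name)
```

For example, `parse_uniprot_header(">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha")` reports the entry as belonging to Swiss-Prot, while a header starting with `>tr|` is reported as TrEMBL.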

Structural Database

Protein structures obtained by crystallography or NMR are stored in PDB.

Microarray Databases

- GEO (Gene Expression Omnibus)

- SMD (Stanford Microarray Database)

Gene expression databases provide raw data from microarray expression experiments.

Data originating from different experiments can be merged to reveal previously unidentified results.
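As a rough illustration of merging data from different experiments, the sketch below joins two expression tables on gene identifier and looks for genes that respond in both. The gene names, values, and the `merge_expression` helper are all hypothetical, and a real analysis would also need to normalize the data sets before comparing them.

```python
def merge_expression(*experiments: dict[str, float]) -> dict[str, list[float]]:
    """Collect the values of every gene that was measured in ALL experiments."""
    if not experiments:
        return {}
    shared = set(experiments[0])
    for exp in experiments[1:]:
        shared &= set(exp)                  # keep only genes present everywhere
    return {gene: [exp[gene] for exp in experiments] for gene in sorted(shared)}


# Hypothetical log2 expression ratios from two unrelated experiments.
heat_shock = {"geneA": 2.1, "geneB": -0.3, "geneC": 1.8}
starvation = {"geneA": 1.9, "geneC": -1.5, "geneD": 0.4}

merged = merge_expression(heat_shock, starvation)

# Genes induced (ratio above 1.0) in both conditions: a pattern that neither
# experiment shows on its own.
induced_in_both = [gene for gene, values in merged.items() if min(values) > 1.0]
```

Here only `geneA` and `geneC` appear in both tables, and only `geneA` is induced in both conditions.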

EST Databases

EST: Expressed Sequence Tags

5’ EST: generated from the 5' end of a transcript, which usually lies within the protein-coding sequence. These regions tend to be conserved across species and do not change much within a gene family.

3’ EST: because these ESTs are generated from the 3' end of a transcript, they are likely to fall within non-coding, or untranslated, regions (UTRs), and therefore tend to exhibit less cross-species conservation than coding sequences do.

Sequence Tagged Site (STS): helps to locate a gene in the genome. 3’ ESTs are a good source of STSs.

Available DBs:

GenBank – dbEST – UniGene

Tools

- ORF finder

- BLAST

- Multiple alignment

- Conserved domain identification

- Secondary structure and folding prediction

Example 1

A clone containing a recombinant plasmid shows an interesting phenotype

Sequencing -> raw sequence -> ORF identification -> in-frame sequence -> BLAST:

- Phylogenetically similar sequences

- Conserved domain
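The ORF identification step in this workflow can be sketched as a six-frame scan for ATG-to-stop stretches. This is a minimal illustration, not the algorithm of any particular ORF finder tool: it assumes the standard stop codons, treats ATG as the only start codon, and uses made-up names (`Orf`, `find_orfs`, `min_codons`).

```python
from typing import NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Orf(NamedTuple):
    strand: str    # "+" for the given strand, "-" for its reverse complement
    frame: int     # 0, 1 or 2
    start: int     # 0-based offset of the ATG on the scanned strand
    sequence: str  # from the ATG through the stop codon


def reverse_complement(dna: str) -> str:
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna: str, min_codons: int = 3) -> list[Orf]:
    """Return every ATG...stop stretch in all six reading frames."""
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append(Orf(strand, frame, start, seq[start:i + 3]))
                    start = None                   # look for the next ORF in this frame
    return orfs
```

The in-frame sequence with the longest ORF, for example `max(find_orfs(raw), key=lambda o: len(o.sequence))` for a raw read `raw`, is the natural candidate to translate and submit to BLAST.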

CDS

Example 2

Tune the method

a) Increase the window size used to evaluate the score

- increases local information by integrating “environmental” data

- 2-residue window -> 2 frames

- 3-residue window -> 3 frames

- …

b) Use degenerate matching methods (based on size, polarity, H-bond behavior, …)
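The slides do not name the method being tuned, so the sketch below assumes a dot-plot-style comparison of two protein sequences. It shows both tuning knobs: a score summed over a window of residues, and degenerate matching, where residues in the same class count as a match. The residue classes are a coarse, illustrative grouping, not the grouping used in the original example.

```python
# Illustrative residue classes (an assumption, not taken from the slides).
RESIDUE_CLASSES = {
    "hydrophobic": "AVLIMFW",
    "polar": "STNQYC",
    "positive": "KRH",
    "negative": "DE",
    "special": "GP",
}
CLASS_OF = {aa: name for name, members in RESIDUE_CLASSES.items() for aa in members}


def residues_match(a: str, b: str, degenerate: bool = False) -> bool:
    """Exact match, or same residue class when degenerate matching is on."""
    if a == b:
        return True
    return degenerate and a in CLASS_OF and CLASS_OF[a] == CLASS_OF.get(b)


def window_score(seq1: str, seq2: str, i: int, j: int, window: int,
                 degenerate: bool = False) -> int:
    """Number of matching positions in the windows starting at seq1[i] and seq2[j]."""
    return sum(residues_match(seq1[i + k], seq2[j + k], degenerate)
               for k in range(window))


def dot_matrix(seq1: str, seq2: str, window: int = 1,
               degenerate: bool = False) -> list[list[bool]]:
    """Mark cell (i, j) when every position in the window matches."""
    rows = len(seq1) - window + 1
    cols = len(seq2) - window + 1
    return [[window_score(seq1, seq2, i, j, window, degenerate) == window
             for j in range(cols)]
            for i in range(rows)]
```

Comparing `"ALKD"` with `"AIRE"`, exact matching with a one-residue window marks a single cell. Degenerate matching marks six cells, including off-diagonal noise, and widening the window to two residues keeps only the three cells on the main diagonal.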