BioSQL: A Generic relational model for Bioinformatics Chandan Kumar Deb 10272 Ph.D. (Computer Application) BI-691
Page 1: Bio sql presentation

BioSQL: A Generic relational model for Bioinformatics

Chandan Kumar Deb 10272

Ph.D. (Computer Application)

BI-691

Page 2: Bio sql presentation

Contents

• Introduction
• Generic Data Model
• Preface of BioSQL
• Overview of BioSQL Schema
• Dependency of BioSQL
• Installation of BioSQL

Page 3: Bio sql presentation

Contents

• Application of BioSQL
• Advantages of BioSQL
• Limitation of BioSQL
• Conclusion
• References

Page 4: Bio sql presentation

Introduction

The relational model is central to database management

It conceptualizes real-world things as a logical model

First formulated and proposed in 1969 by Edgar F. Codd

The logical model is built from relations and the relationships between them

Page 5: Bio sql presentation

Introduction..

Relational Model

• Table
• Tuple
• Relation Instance
• Relation Schema
• Relation Key
• Attribute Domain
• Key Constraint
• Domain Constraint
• Referential Integrity Constraint
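The three constraint types listed above can be seen in a small sketch. This is only an illustration using Python's built-in sqlite3 module; the `organism` and `gene` tables and the helper `try_insert` are hypothetical names invented for this example and are not part of BioSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables, used only to illustrate the constraint types.
conn.executescript("""
CREATE TABLE organism (
    organism_id INTEGER PRIMARY KEY,                  -- key constraint
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE gene (
    gene_id     INTEGER PRIMARY KEY,                  -- key constraint
    symbol      TEXT NOT NULL,
    strand      INTEGER CHECK (strand IN (-1, 1)),    -- domain constraint
    organism_id INTEGER NOT NULL
                REFERENCES organism(organism_id)      -- referential integrity
);
""")

conn.execute("INSERT INTO organism VALUES (1, 'Escherichia coli')")
conn.execute("INSERT INTO gene VALUES (1, 'geneA', 1, 1)")


def try_insert(sql):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


dup_key = try_insert("INSERT INTO gene VALUES (1, 'geneB', 1, 1)")     # repeated key
bad_domain = try_insert("INSERT INTO gene VALUES (2, 'geneB', 5, 1)")  # strand outside domain
dangling = try_insert("INSERT INTO gene VALUES (3, 'geneC', 1, 99)")   # no such organism
gene_count = conn.execute("SELECT COUNT(*) FROM gene").fetchone()[0]
```

Each tuple in `gene` is one row of the relation instance, and the `CREATE TABLE` statement plays the role of the relation schema.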

Page 6: Bio sql presentation

Introduction

The model represents data as tuples, grouped into relations

A database organized in terms of the relational model is a relational database

The relational data model is the primary data model in use

It is used widely around the world for data storage and processing

Page 7: Bio sql presentation

Introduction..

Page 8: Bio sql presentation

Generic Data Model

Page 9: Bio sql presentation

Generic Data Model

The generic data model is a generalization of the conventional data model

It defines standardised relation types

Consensus among the relational modelers of a particular domain can produce a generic model for that domain

Page 10: Bio sql presentation

Preface of BioSQL

Page 11: Bio sql presentation

Preface of BioSQL

Generic Data Model

Ewan Birney started BioSQL in 2001

Major redesign and refactoring in 2002-2003

PhyloDB module added in 2006

V1.0 released in March 2008

Page 12: Bio sql presentation

Preface of BioSQL

Not a query language. It is a schema (database model)!

Page 13: Bio sql presentation

Preface of BioSQL

Covers sequences, features, sequence and feature annotation, a reference taxonomy, and ontologies

Requires a highly normalized relational model

Local storage of global biological data

Page 14: Bio sql presentation

Overview of BioSQL schema

Page 15: Bio sql presentation

Overview of BioSQL schema

The BioSQL schema does not follow a strongly typed paradigm

Derived entities are always understood in the object-oriented sense

It follows a weakly typed paradigm

It is generic, but can hold any number of specializations

Page 16: Bio sql presentation

Overview of BioSQL schema

• Bioentry with taxon and namespaces
• Seqfeature with location and annotation
• Ontology term and relationship
• Annotation bundle

Page 17: Bio sql presentation

Schema overview

BioEntry & Taxon

• Biodatabase
• Bioentry
• Biosequence
• Bioentry Relationship
• Taxon
• Taxon Name

Page 18: Bio sql presentation

BioEntry

Core entity of BioSQL

Tracks any single entry or record in a biological database

The BIOENTRY table contains the record's public name, public accession and version

Page 19: Bio sql presentation

BioDatabase

A BIODATABASE is simply a collection of bioentries

One BIOENTRY may belong to only one BIODATABASE

One BIODATABASE may contain many bioentries
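The one-to-many link between BIODATABASE and BIOENTRY might be sketched as follows with Python's sqlite3 module. This is a trimmed illustration, not the released BioSQL DDL: the column names are modeled on the schema but should be checked against the official scripts, and the namespace name and accessions are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified subset of two tables (assumption: the real schema has more columns).
conn.executescript("""
CREATE TABLE biodatabase (
    biodatabase_id INTEGER PRIMARY KEY,
    name           TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry (
    bioentry_id    INTEGER PRIMARY KEY,
    biodatabase_id INTEGER NOT NULL REFERENCES biodatabase(biodatabase_id),
    name           TEXT NOT NULL,
    accession      TEXT NOT NULL,
    version        INTEGER NOT NULL,
    UNIQUE (accession, biodatabase_id, version)
);
""")

# Sample values are invented for the example.
conn.execute("INSERT INTO biodatabase (biodatabase_id, name) VALUES (1, 'local_genbank')")
conn.executemany(
    "INSERT INTO bioentry (biodatabase_id, name, accession, version) VALUES (?, ?, ?, ?)",
    [(1, "entry_one", "ACC0001", 1), (1, "entry_two", "ACC0002", 1)],
)

# Each bioentry row carries exactly one biodatabase_id, so counting per
# namespace shows the "one database, many entries" direction.
rows = conn.execute("""
    SELECT b.name, COUNT(e.bioentry_id)
    FROM biodatabase b
    LEFT JOIN bioentry e ON e.biodatabase_id = b.biodatabase_id
    GROUP BY b.biodatabase_id
""").fetchall()
```

Because `biodatabase_id` is a single NOT NULL column on `bioentry`, an entry cannot belong to two namespaces at once.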

Page 20: Bio sql presentation

BioSequence

In BioSQL, all relations are linked to bioentries

The BIOSEQUENCE table contains the raw sequence information associated with a BIOENTRY

It also stores alphabet information ('protein', 'dna', 'rna')

One-to-one relationship with BIOENTRY
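A one-to-one relationship of this kind is commonly modeled by making the foreign key the primary key as well. The sketch below assumes that approach with a trimmed set of columns; `store_sequence` is a hypothetical helper and the sequence is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified subset; column names modeled on the schema, trimmed for illustration.
conn.executescript("""
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL
);
CREATE TABLE biosequence (
    -- Primary key and foreign key at once: at most one sequence per bioentry.
    bioentry_id INTEGER PRIMARY KEY REFERENCES bioentry(bioentry_id),
    alphabet    TEXT,
    length      INTEGER,
    seq         TEXT
);
""")


def store_sequence(bioentry_id, alphabet, seq):
    """Hypothetical helper: store the raw sequence for one bioentry."""
    conn.execute(
        "INSERT INTO biosequence VALUES (?, ?, ?, ?)",
        (bioentry_id, alphabet, len(seq), seq),
    )


conn.execute("INSERT INTO bioentry VALUES (1, 'ACC0001')")
store_sequence(1, "dna", "ATGGCGTAA")

# A second sequence for the same bioentry violates the primary key.
try:
    store_sequence(1, "dna", "ATGTAA")
    second_accepted = True
except sqlite3.IntegrityError:
    second_accepted = False

stored = conn.execute(
    "SELECT alphabet, length, seq FROM biosequence WHERE bioentry_id = 1"
).fetchone()
```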

Page 21: Bio sql presentation

BioEntryRelationship

Bioentries may themselves be related to one another

(e.g., a PDB record may be composed of multiple subrecords for separate chains)
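The chain example can be sketched as a self-referencing link table. This is a simplified illustration: the entry names are invented, treating the object as the parent record and the subject as the sub-record is an assumption of this sketch, and the column that labels the relationship type is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified subset; both foreign keys point back at the same bioentry table.
conn.executescript("""
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE bioentry_relationship (
    bioentry_relationship_id INTEGER PRIMARY KEY,
    object_bioentry_id  INTEGER NOT NULL REFERENCES bioentry(bioentry_id),
    subject_bioentry_id INTEGER NOT NULL REFERENCES bioentry(bioentry_id),
    rank                INTEGER
);
""")

# Invented sample data: one structure record with two chain sub-records.
conn.executemany(
    "INSERT INTO bioentry VALUES (?, ?)",
    [(1, "structure"), (2, "structure_chain_A"), (3, "structure_chain_B")],
)
conn.executemany(
    "INSERT INTO bioentry_relationship "
    "(object_bioentry_id, subject_bioentry_id, rank) VALUES (?, ?, ?)",
    [(1, 2, 1), (1, 3, 2)],
)

# List the sub-records of entry 1 in rank order.
chains = [
    row[0]
    for row in conn.execute(
        """
        SELECT s.name
        FROM bioentry_relationship r
        JOIN bioentry s ON s.bioentry_id = r.subject_bioentry_id
        WHERE r.object_bioentry_id = ?
        ORDER BY r.rank
        """,
        (1,),
    )
]
```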

Page 22: Bio sql presentation

Taxon, Taxon Name

Basic taxonomic information about the organism to which a given BIOENTRY refers

Reflects the structure of NCBI's taxonomy database

Each BIOENTRY can be associated with only one taxon

Many bioentries can be associated with the same taxon
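A minimal sketch of the taxon tables, assuming a trimmed set of columns modeled on the schema. The accessions are invented and `scientific_name` is a hypothetical helper; the NCBI taxon id 9606 and its names are real values used as sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified subset; the real tables carry more columns (e.g. for the tree structure).
conn.executescript("""
CREATE TABLE taxon (
    taxon_id      INTEGER PRIMARY KEY,
    ncbi_taxon_id INTEGER UNIQUE,
    node_rank     TEXT
);
CREATE TABLE taxon_name (
    taxon_id   INTEGER NOT NULL REFERENCES taxon(taxon_id),
    name       TEXT NOT NULL,
    name_class TEXT NOT NULL,
    UNIQUE (taxon_id, name, name_class)
);
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    taxon_id    INTEGER REFERENCES taxon(taxon_id),  -- one taxon per entry
    accession   TEXT NOT NULL
);
""")

conn.execute("INSERT INTO taxon VALUES (1, 9606, 'species')")
conn.executemany(
    "INSERT INTO taxon_name VALUES (?, ?, ?)",
    [(1, "Homo sapiens", "scientific name"), (1, "human", "genbank common name")],
)
conn.executemany(
    "INSERT INTO bioentry (taxon_id, accession) VALUES (?, ?)",
    [(1, "ACC0001"), (1, "ACC0002")],
)


def scientific_name(accession):
    """Hypothetical helper: look up the scientific name for an entry."""
    row = conn.execute(
        """
        SELECT n.name
        FROM bioentry e
        JOIN taxon_name n ON n.taxon_id = e.taxon_id
        WHERE e.accession = ? AND n.name_class = 'scientific name'
        """,
        (accession,),
    ).fetchone()
    return row[0] if row else None


entries_for_taxon = conn.execute(
    "SELECT COUNT(*) FROM bioentry WHERE taxon_id = 1"
).fetchone()[0]
```

Keeping names in a separate table lets one taxon carry several names of different classes, while each bioentry still points at a single taxon.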

Page 24: Bio sql presentation

Schema overview

Seqfeature, Location & Annotation

• Location
• Seqfeature
• Seqfeature Relationship
• Location Qualifier Value
• Seqfeature Qualifier Value
• Seqfeature Dbxref

Page 25: Bio sql presentation

Seqfeature and Location

Seqfeature gives the semantics of the sequence

Location describes the start and stop coordinates and the strand
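How a location row is turned back into a piece of sequence can be sketched as below. The sketch assumes 1-based, inclusive coordinates and strand values of 1 and -1; the sequence, the feature name and the helper `feature_sequence` are invented for the example, and the tables are trimmed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified subset of the two tables.
conn.executescript("""
CREATE TABLE seqfeature (
    seqfeature_id INTEGER PRIMARY KEY,
    bioentry_id   INTEGER NOT NULL,
    display_name  TEXT
);
CREATE TABLE location (
    location_id   INTEGER PRIMARY KEY,
    seqfeature_id INTEGER NOT NULL REFERENCES seqfeature(seqfeature_id),
    start_pos     INTEGER,
    end_pos       INTEGER,
    strand        INTEGER NOT NULL DEFAULT 0,
    rank          INTEGER NOT NULL DEFAULT 0
);
""")

SEQ = "AAATGCCCTAGGG"  # invented sequence
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def feature_sequence(seq, start_pos, end_pos, strand):
    """Cut out a feature, assuming 1-based inclusive coordinates."""
    piece = seq[start_pos - 1:end_pos]
    if strand == -1:
        # Reverse complement for features on the minus strand.
        piece = piece.translate(COMPLEMENT)[::-1]
    return piece


conn.execute("INSERT INTO seqfeature VALUES (1, 1, 'example_cds')")
conn.execute(
    "INSERT INTO location (seqfeature_id, start_pos, end_pos, strand, rank) "
    "VALUES (1, 3, 11, 1, 1)"
)

start, end, strand = conn.execute(
    "SELECT start_pos, end_pos, strand FROM location WHERE seqfeature_id = 1"
).fetchone()

forward = feature_sequence(SEQ, start, end, strand)
reverse = feature_sequence(SEQ, start, end, -1)
```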

Page 27: Bio sql presentation

Schema overview

Ontology term and Relationship

• Term Relationship
• Term
• Term Synonym
• Term Dbxref
• Ontology

Page 28: Bio sql presentation

Term and Ontology

A term is used to "label" a seqfeature's name

An ontology is essentially a dictionary of terms in a somewhat-controlled vocabulary
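The labelling idea can be sketched as a lookup that reuses a term when it already exists in its ontology. The tables are trimmed, and the ontology name `feature_keys` and the helper `get_or_create_term` are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified subset: a seqfeature has no type of its own, only a pointer to a term.
conn.executescript("""
CREATE TABLE ontology (
    ontology_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE term (
    term_id     INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    ontology_id INTEGER NOT NULL REFERENCES ontology(ontology_id),
    UNIQUE (name, ontology_id)
);
CREATE TABLE seqfeature (
    seqfeature_id INTEGER PRIMARY KEY,
    type_term_id  INTEGER NOT NULL REFERENCES term(term_id)
);
""")


def get_or_create_term(ontology_name, term_name):
    """Hypothetical helper: return the id of a term, creating it if needed."""
    conn.execute("INSERT OR IGNORE INTO ontology (name) VALUES (?)", (ontology_name,))
    (ontology_id,) = conn.execute(
        "SELECT ontology_id FROM ontology WHERE name = ?", (ontology_name,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO term (name, ontology_id) VALUES (?, ?)",
        (term_name, ontology_id),
    )
    (term_id,) = conn.execute(
        "SELECT term_id FROM term WHERE name = ? AND ontology_id = ?",
        (term_name, ontology_id),
    ).fetchone()
    return term_id


cds_id = get_or_create_term("feature_keys", "CDS")
again_id = get_or_create_term("feature_keys", "CDS")   # reuses the existing term
gene_id = get_or_create_term("feature_keys", "gene")

conn.execute("INSERT INTO seqfeature VALUES (1, ?)", (cds_id,))
label = conn.execute(
    "SELECT t.name FROM seqfeature f JOIN term t ON t.term_id = f.type_term_id "
    "WHERE f.seqfeature_id = 1"
).fetchone()[0]
term_count = conn.execute("SELECT COUNT(*) FROM term").fetchone()[0]
```

New feature types need only a new row in `term`, not a change to the table definitions.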

Page 30: Bio sql presentation

Schema overview

Annotation Bundle

• Reference
• Bioentry Reference
• Comment
• Dbxref
• Bioentry Dbxref
• Bioentry and Dbxref Qualifier Value

Page 32: Bio sql presentation
Page 33: Bio sql presentation

Dependency of BioSQL

Page 34: Bio sql presentation

Dependency of BioSQL

Page 35: Bio sql presentation

Installation of BioSQL

Page 36: Bio sql presentation

Installation of BioSQL

Page 37: Bio sql presentation

Installation of BioSQL

http://www.biosql.org/wiki/Downloads

Page 38: Bio sql presentation

Installation of BioSQL

Page 39: Bio sql presentation

Installation of BioSQL

Page 40: Bio sql presentation

Installation of BioSQL

Page 41: Bio sql presentation

Installation of BioSQL

Page 42: Bio sql presentation

Installation of BioSQL

Local MySQL Database

Page 43: Bio sql presentation

Advantages of BioSQL

Page 44: Bio sql presentation

Advantages of BioSQL

The BioSQL project provides a well-thought-out relational database schema for storing biological sequences and annotations

Offers the advantages of reusability

Compatible with several Bio* toolkits, such as Biopython, BioPerl, BioJava and BioRuby

Flexible storage of data via a key/value pair model
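The key/value pair model mentioned above might look like the following sketch, where the key is a term and the value is free text. The table is a trimmed version modeled on the seqfeature qualifier table; the helpers `add_qualifier` and `qualifiers` and the sample values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified subset: one row per (feature, key, rank) with a free-text value.
conn.executescript("""
CREATE TABLE term (
    term_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE seqfeature_qualifier_value (
    seqfeature_id INTEGER NOT NULL,
    term_id       INTEGER NOT NULL REFERENCES term(term_id),
    rank          INTEGER NOT NULL DEFAULT 0,
    value         TEXT NOT NULL,
    PRIMARY KEY (seqfeature_id, term_id, rank)
);
""")


def add_qualifier(seqfeature_id, key, value):
    """Hypothetical helper: attach one key/value pair to a feature."""
    conn.execute("INSERT OR IGNORE INTO term (name) VALUES (?)", (key,))
    (term_id,) = conn.execute(
        "SELECT term_id FROM term WHERE name = ?", (key,)
    ).fetchone()
    # Repeated keys are kept apart by an increasing rank.
    (rank,) = conn.execute(
        "SELECT COUNT(*) + 1 FROM seqfeature_qualifier_value "
        "WHERE seqfeature_id = ? AND term_id = ?",
        (seqfeature_id, term_id),
    ).fetchone()
    conn.execute(
        "INSERT INTO seqfeature_qualifier_value "
        "(seqfeature_id, term_id, rank, value) VALUES (?, ?, ?, ?)",
        (seqfeature_id, term_id, rank, value),
    )


def qualifiers(seqfeature_id):
    """Hypothetical helper: collect a feature's qualifiers as key -> list of values."""
    result = {}
    rows = conn.execute(
        """
        SELECT t.name, q.value
        FROM seqfeature_qualifier_value q
        JOIN term t ON t.term_id = q.term_id
        WHERE q.seqfeature_id = ?
        ORDER BY t.name, q.rank
        """,
        (seqfeature_id,),
    )
    for key, value in rows:
        result.setdefault(key, []).append(value)
    return result


add_qualifier(1, "gene", "geneA")
add_qualifier(1, "note", "first note")
add_qualifier(1, "note", "second note")
annotations = qualifiers(1)
```

Storing annotations this way means a new kind of annotation needs a new key rather than a new column, which is where the flexibility comes from.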

Page 45: Bio sql presentation

Advantages of BioSQL

Extensible as the situation requires

Overall data model based on GenBank flat files

It also allows great flexibility in choosing the data used by Snapshot, since sequence data from any source, including online databases and locally generated sequence data, can be added

Page 46: Bio sql presentation

Application of BioSQL

Page 47: Bio sql presentation

Application of BioSQL

Page 48: Bio sql presentation

Application of BioSQL

Page 49: Bio sql presentation

Limitation…

Page 50: Bio sql presentation

Limitation…

This is a single-user solution

It is the least flexible option, since the database cannot be shared

No consideration of protein secondary structure prediction

Page 51: Bio sql presentation

Demonstration

Page 52: Bio sql presentation

Conclusion…

Page 53: Bio sql presentation

Conclusion…

Local ‘GenBank’ with random access

‘GenBank’ in relational format

Easy loading of NCBI taxonomy data into a local DB

Integrated sequence and annotation databases

Handy tool for the bioinformatics community

Page 54: Bio sql presentation

References

• http://biojava.org/wiki/BioJava:Tutorial:Installing_and_using_BioSQL
• http://biopython.org/wiki/BioSQL
• http://biosqlweb.appspot.com/
• http://en.wikipedia.org/wiki/Generic_data_model
• http://userweb.eng.gla.ac.uk/umer.ijaz/bioinformatics/BIOSQL_tutorial.pdf

Page 56: Bio sql presentation

Thank you