Simon Twigger, PhD MCW Driving Biological Project 1 Monday, September 27, 2010
Page 1: NCBO DBP

Simon Twigger, PhD

MCW Driving Biological Project

Monday, September 27, 2010

Page 2: NCBO DBP

Rat Genome Database


Page 3: NCBO DBP

What's the problem?

• Large-scale repositories with unused or inaccessible information

• How can these databases be made more useful?

• How to help researchers find and use this information to connect genes to disease?


Page 4: NCBO DBP

Rat researchers ask...

What tissue is this gene expressed in?

What expression data is known for SD (aka SD/NHsd, Harlan Sprague Dawley, Sprague Dawley) rats?

Are any of these genes associated with my phenotype?

Has this gene been seen in the brain?

What rat expression studies have been done on Mammary Cancer (aka breast neoplasms/breast cancer/cancer of the breast, breast carcinoma...)?

Has anyone done any expression studies using congenic rats?


Page 5: NCBO DBP

What's the strategy?

• Focus on GEO (microarray)

• Use the NCBO Annotator to mark up text, review the annotations, and then use them for tools and visualization

• Combine annotations with biological data to derive new insights.

[Diagram: annotation pipeline. Labels: GEO Records; Create Annotation Jobs & Queue Up; Q-In; 1..n Annot. Workers; Index text at OBA; Parse Results; Put results into queue for save; Q-Out; RabbitMQ; Results saved to GMiner database]
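The queue-based annotation flow on this slide (jobs queued up, 1..n workers annotating, results queued for saving) might look roughly like the sketch below. It is only an illustration under stated assumptions: Python's in-process `queue.Queue` stands in for the RabbitMQ Q-In and Q-Out, a toy dictionary lookup stands in for the NCBO Annotator, a plain dict stands in for the GMiner database, and the record IDs and term IDs are made up.

```python
import queue
import threading

# Hypothetical term dictionary standing in for the NCBO Annotator service.
TERMS = {
    "kidney": "anatomy:kidney",
    "brain": "anatomy:brain",
    "hypertension": "phenotype:hypertension",
}

def annotate(text):
    """Toy annotator: return the sorted IDs of dictionary terms found in the text."""
    words = set(text.lower().replace(",", " ").split())
    return sorted(TERMS[w] for w in words if w in TERMS)

def worker(q_in, q_out):
    """Annotation worker: take a job from Q-In, annotate it, put the result on Q-Out."""
    while True:
        job = q_in.get()
        if job is None:  # sentinel: no more jobs for this worker
            break
        record_id, text = job
        q_out.put((record_id, annotate(text)))

def run_pipeline(records, n_workers=3):
    """Queue one job per record, run the workers, then collect the saved results."""
    q_in, q_out = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(q_in, q_out))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for job in records.items():
        q_in.put(job)
    for _ in threads:
        q_in.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    saved = {}  # stand-in for the GMiner database
    while not q_out.empty():
        record_id, annotations = q_out.get()
        saved[record_id] = annotations
    return saved

# Made-up record IDs and descriptions, for illustration only.
records = {
    "REC1": "Expression profiling of rat kidney in hypertension",
    "REC2": "Rat brain time course",
}
results = run_pipeline(records)
```

Separating the input and output queues lets the number of workers change without touching the code that creates jobs or saves results, which is presumably the appeal of a message broker in the real pipeline.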


Page 6: NCBO DBP

Current Ontologies

http://bioportal.bioontology.org/

Page 7: NCBO DBP


Page 8: NCBO DBP


Page 9: NCBO DBP

Progress


Page 10: NCBO DBP

Linking annotations to data

Tm2d1

RGD1306410

Svs4

Hbb

Scgb2a1

Alb


Page 11: NCBO DBP

Linking annotations to data

Tm2d1, RGD1306410, Svs4, Hbb, Scgb2a1, Alb

+

Human (U133, U133v2), Mouse (430, U74, U95) and Rat (U34a/b/c, 230, 230v2)

62,000 samples x ca. 25,000 genes/sample = 1.5B data points

Hbb is_expressed_in rat kidney

Tm2d1 is_expressed_in rat kidney
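One way to read this slide is that tissue annotations on samples are combined with per-sample expression calls to produce statements such as "Hbb is_expressed_in rat kidney". The sketch below illustrates that idea only. The sample IDs, the detection calls and the "detected in every annotated sample" rule are assumptions made for the example, not the method used by GMiner. The last line simply checks the slide's arithmetic.

```python
# Hypothetical sample annotations (sample ID -> annotated tissue term).
sample_tissue = {"S1": "rat kidney", "S2": "rat kidney", "S3": "rat brain"}

# Hypothetical detection calls (sample ID -> genes called present on the array).
detected = {
    "S1": {"Hbb", "Tm2d1", "Alb"},
    "S2": {"Hbb", "Tm2d1"},
    "S3": {"Hbb"},
}

def expression_triples(sample_tissue, detected, min_fraction=1.0):
    """Emit (gene, 'is_expressed_in', tissue) when the gene is detected in at
    least min_fraction of the samples annotated with that tissue."""
    by_tissue = {}
    for sample, tissue in sample_tissue.items():
        by_tissue.setdefault(tissue, []).append(sample)
    triples = []
    for tissue, samples in sorted(by_tissue.items()):
        genes = set().union(*(detected.get(s, set()) for s in samples))
        for gene in sorted(genes):
            hits = sum(gene in detected.get(s, set()) for s in samples)
            if hits / len(samples) >= min_fraction:
                triples.append((gene, "is_expressed_in", tissue))
    return triples

triples = expression_triples(sample_tissue, detected)

# The slide's estimate: 62,000 samples x ca. 25,000 genes/sample.
data_points = 62_000 * 25_000  # 1,550,000,000, i.e. roughly 1.5B
```

With this toy data, Alb is seen in only one of the two kidney samples, so no statement is emitted for it at the default threshold. Lowering `min_fraction` to 0.5 would include it.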


Page 12: NCBO DBP

Probeset results on GMiner

Probeset L08490cds_at for Gabra1 - gamma-aminobutyric acid (GABA) A receptor, alpha 1

Hs GABRA1


Page 13: NCBO DBP

[Diagram: Strain 1 != Strain 2; a Hypertensive QTL containing genes (G); annotation types linked to the genes: Component, Function, Process, Pathway, Anatomy (Kidney), Phenotype (Hypertension)]


Page 14: NCBO DBP

QTL Gene ‘Highlighter’

[Diagram labels: QTL containing genes (G G G); Disease/Pheno.; AllegroGraph; GMiner, RGD, OBO, etc.; G]
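The highlighter idea, as far as the slide labels suggest, is to pick out genes inside a QTL whose annotations match the disease or phenotype of interest. A minimal in-memory sketch of that filtering step follows. It assumes a simple `Gene` record and made-up gene symbols, coordinates and terms. The talk places this data in AllegroGraph, which the sketch does not use.

```python
from dataclasses import dataclass, field

@dataclass
class Gene:
    """Hypothetical gene record: position plus a set of annotation terms."""
    symbol: str
    chrom: str
    start: int
    stop: int
    terms: set = field(default_factory=set)

def highlight(genes, qtl, terms_of_interest):
    """Return (symbol, matched terms) for genes overlapping the QTL interval,
    ordered by number of matched terms (most first), then by symbol."""
    chrom, q_start, q_stop = qtl
    hits = []
    for g in genes:
        overlaps = g.chrom == chrom and g.start <= q_stop and g.stop >= q_start
        matched = g.terms & terms_of_interest
        if overlaps and matched:
            hits.append((g.symbol, sorted(matched)))
    return sorted(hits, key=lambda h: (-len(h[1]), h[0]))

# Made-up genes and coordinates, for illustration only.
genes = [
    Gene("GeneA", "1", 100, 200, {"kidney", "hypertension"}),
    Gene("GeneB", "1", 300, 400, {"kidney"}),
    Gene("GeneC", "1", 900, 950, {"hypertension"}),            # outside the QTL
    Gene("GeneD", "2", 150, 250, {"kidney", "hypertension"}),  # other chromosome
    Gene("GeneE", "1", 150, 350, {"brain"}),                   # no matching term
]
qtl = ("1", 50, 500)  # (chromosome, start, stop) of a hypothetical QTL
highlighted = highlight(genes, qtl, {"kidney", "hypertension"})
```

Ranking by the number of matched terms is one arbitrary choice. A triple-store query could apply the same overlap-and-match logic across many more annotation types.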


Page 15: NCBO DBP

RDF/OWL sources

Cell Ontology: http://www.berkeleybop.org/ontologies/obo-all/cell/cell.owl

Mouse Adult Gross Anatomy: http://www.berkeleybop.org/ontologies/obo-all/adult_mouse_anatomy/adult_mouse_anatomy.owl

Mammalian Phenotype: http://www.berkeleybop.org/ontologies/obo-all/mammalian_phenotype/mammalian_phenotype.owl

GO Function: http://www.berkeleybop.org/ontologies/obo-all/molecular_function/molecular_function.owl

GO Process: http://www.berkeleybop.org/ontologies/obo-all/biological_process/biological_process.owl

GO Component: http://www.berkeleybop.org/ontologies/obo-all/cellular_component/cellular_component.owl


Page 16: NCBO DBP

Rat Genome Database


Wide variety of data types, genomic and physiological, many with corresponding ontologies


Page 17: NCBO DBP


Page 18: NCBO DBP

RGD->RDF

Existing RGD ‘object types’ & mappings to SO


Page 19: NCBO DBP

RGD Gene


Page 20: NCBO DBP

RGD QTL


Page 21: NCBO DBP

QTL Highlighter

• Rails source code will be available on GitHub

• RDFizer (Ruby): http://github.com/simont/MCW-RDF


Page 22: NCBO DBP

Next Steps

• Register PURL for RGD

• Create RGD core object ontology (OWL/RDF)

• Select appropriate URIs for RGD data

• Ontology annotations - how best to represent in triple store?

• Export GMiner data to RDF -> triple store

• Document & refine biological use cases related to candidate gene selection/evaluation

• Identify additional data required for candidate gene selection, RDFize as appropriate, load into triple store.

• Connections to other RDF collections/LOD, etc.?
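As a rough illustration of the export step in the list above, the sketch below serializes a couple of statements as N-Triples lines, a plain-text RDF format that triple stores commonly load. Everything specific in it is an assumption: the `http://example.org/rgd/` base is a placeholder rather than a registered RGD PURL, the `is_expressed_in` relation URI is invented, and the literal escaping covers only backslashes and double quotes.

```python
# Placeholder base URI; not a registered RGD PURL.
BASE = "http://example.org/rgd/"
RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"

def uri(kind, local_id):
    """Build a bracketed URI reference from the hypothetical URI scheme."""
    return f"<{BASE}{kind}/{local_id}>"

def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes only."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def to_ntriples(statements):
    """Serialize (subject, predicate, object) strings, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)

statements = [
    (uri("gene", "Hbb"), uri("relation", "is_expressed_in"), uri("anatomy", "kidney")),
    (uri("gene", "Hbb"), RDFS_LABEL, literal("Hbb")),
]
ntriples = to_ntriples(statements)
print(ntriples)
```

In practice an RDF library would handle escaping and datatypes. The choice of real URIs and of how to model ontology annotations is exactly what the open questions on this slide leave undecided.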
