The MGED Ontology: Providing Descriptors for Microarray Data
Trish Whetzel
Department of Genetics
Center for Bioinformatics
University of Pennsylvania
Acknowledgements
• CBIL
  – Chris Stoeckert
  – Angel Pizarro
  – Elisabetta Manduchi
• EBI
  – Helen Parkinson
  – Susanna Sansone
• TIGR
  – Joe White
• Stanford
  – Cathy Ball
• NCICB
  – Gilberto Fragoso
  – Liju Fan
  – Mervi Heiskanen
• Others
  – Paul Spellman
  – John Matese
  – Helen Causton
• Ontology Mailing List
MGED Society
• International organization
• Composed of biologists, computer scientists, and data analysts
• Aims to facilitate the sharing of functional genomics data generated by microarray and proteomics experiments
  – Establish standards for microarray data annotation
  – Create microarray databases
  – Promote sharing of high-quality, well-annotated data
www.mged.org
MGED Standardization Efforts
• MIAME
  – The formulation of the minimum information required about a microarray experiment in order to interpret and verify the results.
• MAGE
  – The establishment of a data exchange format (MAGE-ML) and an object model (MAGE-OM) for microarray experiments.
• Ontology Working Group
  – The development of an ontology to describe microarray experiments and in particular the biological material (biomaterial) used in these experiments.
• Transformations
  – The development of recommendations regarding microarray data transformations and normalization methods.
Microarray Information to be Shared
Figure from: David J. Duggan et al. (1999) Expression profiling using cDNA microarrays. Nature Genetics 21: 10-14
MGED Ontology (MO)
• Purpose
  – Provide standard terms for the annotation of microarray experiments
• Benefits
  – Unambiguous description of how the experiment was performed
  – Structured queries can be generated
• MGED Ontology concepts derived from the MIAME guidelines/MAGE-OM
MGED Ontology development
http://mged.sourceforge.net/ontologies/MGEDontology.php
• OilEd
• File formats
  – HTML file
  – DAML file
  – NCI DTS Browser
MGED Ontology Class Hierarchy
• MGED CoreOntology
  – In synch with MAGE v.1
  – Stable class structure
• MGED ExtendedOntology
  – Classes for additional terms as the usage of MO expands for genomics technologies
Relationship of MO to MAGE-OM
• MO class hierarchy follows that of MAGE-OM
  – Association to OntologyEntry
• MO provides terms for these associations by:
  – Instances internal to MO
  – Instances from external ontologies
• Take advantage of existing ontologies
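A minimal sketch of the two ways a term can be supplied for an OntologyEntry association: from an instance internal to MO, or by reference to an external ontology. The class below is an illustrative stand-in, not the MAGE-OM definition; the field names (`category`, `value`, `source`, `accession`) and the example values are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologyEntry:
    """Illustrative stand-in for an OntologyEntry (field names are assumptions)."""
    category: str                    # MO class the term is an instance of
    value: str                       # the term itself
    source: str = "MO"               # "MO" for internal instances, else the external ontology
    accession: Optional[str] = None  # identifier within the external ontology, if any

    def is_external(self) -> bool:
        """True when the term comes from an ontology other than MO."""
        return self.source != "MO"


# Term supplied by an instance internal to MO (example values are hypothetical).
sex = OntologyEntry(category="Sex", value="female")

# Term supplied by an external ontology (source name and accession are illustrative).
organism = OntologyEntry(
    category="Organism",
    value="Mus musculus",
    source="NCBI Taxonomy",
    accession="10090",
)

for entry in (sex, organism):
    origin = "external" if entry.is_external() else "internal"
    print(f"{entry.category}: {entry.value} ({origin}, source={entry.source})")
```

Keeping the source and accession alongside the term is one way to reuse existing ontologies without copying their terms into MO.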
Relationship of MO and MAGE-OM
MO and References to External Ontologies
Desirable Microarray Queries
• Return all experiments with species X examined at developmental stage Y
  – Sort by platform type
  – Which are untreated? Treated?
    • Treated with what compound?
    • How comparable are these results?
• These questions can be asked of all experiments annotated using the MGED Ontology.
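The queries above can be sketched as simple filters once every experiment carries the same controlled annotation fields. This is a hedged illustration only: the `Experiment` record, its field names, and the sample data are hypothetical and do not reflect any actual MGED database schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Experiment:
    """Hypothetical annotated experiment; field names are assumptions."""
    name: str
    species: str
    developmental_stage: str
    platform_type: str
    compound: Optional[str] = None  # None means the sample was untreated


def find_experiments(experiments: List[Experiment], species: str, stage: str) -> List[Experiment]:
    """Return experiments matching species and stage, sorted by platform type."""
    hits = [e for e in experiments
            if e.species == species and e.developmental_stage == stage]
    return sorted(hits, key=lambda e: e.platform_type)


def split_by_treatment(experiments: List[Experiment]) -> Tuple[List[Experiment], List[Experiment]]:
    """Partition experiments into (untreated, treated)."""
    untreated = [e for e in experiments if e.compound is None]
    treated = [e for e in experiments if e.compound is not None]
    return untreated, treated


# Made-up sample annotations for illustration.
experiments = [
    Experiment("exp1", "Mus musculus", "adult", "spotted cDNA"),
    Experiment("exp2", "Mus musculus", "adult", "oligonucleotide", compound="compound A"),
    Experiment("exp3", "Mus musculus", "embryo", "spotted cDNA"),
    Experiment("exp4", "Homo sapiens", "adult", "spotted cDNA"),
]

hits = find_experiments(experiments, "Mus musculus", "adult")
untreated, treated = split_by_treatment(hits)
print([e.name for e in hits])
print([(e.name, e.compound) for e in treated])
```

The exact-match filters only work because each field holds a term from a shared vocabulary; free-text annotations such as "mouse" versus "Mus musculus" would not match.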
MO and Structured Queries
Future Work
• Convert to OWL
  – W3C standard ontology language
  – Expressivity
• Add terms to describe
  – Data transformation and normalization methods
  – Protocol types used by the Protein Data Bank
Future Work cont.
• Expand the MGED Extended Ontology by adding classes and terms to describe new domains and technologies
  – Toxicogenomics, ecotoxicogenomics and pharmacogenomics …
    • A public forum for developing internationally compatible and public infrastructure for reporting array-based toxicogenomics.
  – Proteomics Standards Initiative
    • Defines community standards for data representation in proteomics to facilitate data comparison, exchange and verification.
Links
• mged.org
• http://mged.sourceforge.net/ontologies/MGEDontology.php
The Computational View of Microarray Information
Need an ontology to unambiguously represent this information.
Issues to Discuss
• Burning Issues
  – Developing MO in synch with related efforts (MAGE-OM v.2.0)
  – Use/presentation in annotation forms
  – Coverage of other technologies and biological domains
• Flame retardant structure
  – ExtendedOntology
    • Space to add new classes, terms and their relationship to one another
Microarray Information to be Shared
Figure labels: Experiment; Sample; RNA extract; Labeled nucleic acid; Protocols; Hybridizations; Genes; Array design; Microarray; Gene expression data matrix; normalization; integration