DisGeNET: Applying Semantic Web and Network Analysis Approaches for the Integration and Analysis of Gene-Disease Associations for Translational Research and Drug Discovery

Janet Piñero, Núria Queralt-Rosinach, Àlex Bravo, Ferran Sanz and Laura I. Furlong

Integrative Biomedical Informatics Group, Research Programme on Biomedical Informatics; Hospital del Mar Medical Research Institute; Pompeu Fabra University

Acknowledgements
The authors thank the Open PHACTS partners, Michel Dumontier and the OpenLink staff for their input, collaboration and help.

Funding: We received support from ISCIII-FEDER (PI13/00082, CP10/00524); from the IMI-JU under grant agreements no. 115002 (eTOX), no. 115191 (Open PHACTS), no. 115372 (EMIF) and no. 115735 (iPiE), the resources of which are composed of a financial contribution from the European Union's Seventh Framework Programme (FP7/2007-2013) and EFPIA companies' in-kind contribution; and from the EU H2020 Programme 2014-2020 under grant agreements no. 634143 (MedBioinformatics) and no. 676559 (Elixir-Excelerate). The Research Programme on Biomedical Informatics (GRIB) is a node of the Spanish National Institute of Bioinformatics (INB).
[Figure: DisGeNET discovery platform [2], http://www.disgenet.org/. Labels: knowledge base; tools; evidence-based discovery; interoperability; metadata; databases & literature; standards; integration; open; discoverability; community use; large-scale extraction; digital publication, sharing and linking. User roles: clinician, researcher, curator, bioinformatician & developer.]

Usage statistics (Aug 2014 - Aug 2015):
• 12,040 users, 22,696 sessions
• 14,494 downloads
• DisGeNET used in more than 20 publications, cited in more than 60 articles

Registered in:
• biosharing
• OMICtools
• NeuroLex
• Datahub

Integrated knowledge:
• Text mining extraction
• Integration with well-curated data
• Analysis
• Discovery and decision-making

Present in the Semantic Web:
• URI/RDF/nanopublications
• Machine-processable
• Semantic integration
• Links to the Linked Open Data cloud
• Data analysis across domains

[Panel headings: Implementation; Standardization; Translational research; Interoperability; Integration; Evidence; Semantic Web; Network analysis; Drug discovery; Web-based exploration; Knowledge base; Tools for exploration and analysis; Reproducibility; Accessibility. Other labels: LHGDN; syntactic; normalization; semantic.]

DisGeNET score: $S = W_{\text{curated}} + W_{\text{predicted}} + W_{\text{literature}}$

Access:
• Web interface: http://www.disgenet.org/
• SPARQL endpoint: http://rdf.disgenet.org/sparql/
• Linked Data Browser: http://rdf.disgenet.org/fct/
• Open Database License: http://opendatacommons.org/licenses/odbl/1.0/
• Downloads: tab-separated plain text, SQLite, RDF, Trusty nanopublications

Programmatic access:
• Automatic analysis
• Higher speed
• Reduced error
• Share results
• Embed in workflows

Metadata (digital objects; transparency and validation):
• Data item-level description
• Dataset-level description

Ontologies, data models and identifiers:
• DisGeNET association type ontology
• Semanticscience Integrated Ontology (SIO) [4]
• 11 common ontologies in NCBO BioPortal
• RDF [1]
• Nanopublications [3]
• NCBI Gene ID
• UMLS Concept Unique Identifiers
• Normalized identification scheme: http://rdf.disgenet.org/resource/gda/ + ID

Example question: What are the diseases associated with melanocortin 4 receptor (MC4R)?
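A question such as the MC4R one could be posed to the SPARQL endpoint programmatically. The following is a minimal sketch and not the authors' query: the endpoint URL comes from the poster, while the gene URI pattern, the predicate sio:SIO_000628, the class ncit:C7057 and the use of dcterms:title are assumptions about the DisGeNET-RDF schema that should be checked against its documentation. The function names are hypothetical.

```python
from string import Template
from urllib.parse import urlencode

# SPARQL endpoint URL as printed on the poster.
SPARQL_ENDPOINT = "http://rdf.disgenet.org/sparql/"

# ASSUMPTIONS (not stated on the poster; verify against the DisGeNET-RDF schema):
#  - genes are identified as http://identifiers.org/ncbigene/<NCBI Gene ID>
#  - sio:SIO_000628 ("refers to") links an association to its gene and disease
#  - ncit:C7057 is the class used for diseases
#  - dcterms:title carries the disease name
QUERY_TEMPLATE = Template("""\
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?gda ?disease ?diseaseName
WHERE {
  ?gda sio:SIO_000628 <http://identifiers.org/ncbigene/$gene_id> , ?disease .
  ?disease a ncit:C7057 ;
           dcterms:title ?diseaseName .
}
LIMIT $limit
""")


def build_gene_disease_query(ncbi_gene_id: int, limit: int = 20) -> str:
    """Return a SPARQL query for diseases associated with one gene."""
    # int() guards against injecting arbitrary text into the query.
    return QUERY_TEMPLATE.substitute(gene_id=int(ncbi_gene_id), limit=int(limit))


def build_request_url(query: str, endpoint: str = SPARQL_ENDPOINT) -> str:
    """Encode the query as an HTTP GET URL using the SPARQL protocol's `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # 4160 is the NCBI Gene ID of MC4R.
    print(build_request_url(build_gene_disease_query(4160)))
```

Requesting the printed URL with an `Accept: application/sparql-results+json` header would be one way to retrieve the bindings as JSON, provided the endpoint follows the SPARQL 1.1 protocol.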
Example question: What are the genes associated with Obesity?

429,111 gene-disease associations

Example question: What is the pattern of tissue expression of the genes associated with Obesity?

References
1. Queralt-Rosinach, N., Piñero, J., Bravo, À., Sanz, F. and Furlong, L.I. (2015). DisGeNET-RDF: harnessing the innovative power of the Semantic Web to explore the genetic basis of diseases (submitted).
2. Piñero, J., Queralt-Rosinach, N., Bravo, À., Deu-Pons, J., Bauer-Mehren, A., Baron, M., … Furlong, L.I. (2015). DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database, 2015(0), bav028.
3. Queralt-Rosinach, N., Kuhn, T., Chichester, C., Dumontier, M., Sanz, F. and Furlong, L.I. (2015). Publishing DisGeNET as Nanopublications. Semantic Web Journal (to appear), 1-10.
4. Dumontier, M., Baker, C.J., Baran, J., Callahan, A., Chepelev, L., Cruz-Toledo, J., … Hoehndorf, R. (2014). The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery. Journal of Biomedical Semantics, 5(1), 2014.
5. Gray, A.J.G., Groth, P., Loizou, A., Askjaer, S., Brenninkmeijer, C., Burger, K., … Williams, A.J. (2014). Applying linked data approaches to pharmacology: Architectural decisions and implementation. Semantic Web. IOS Press. doi:10.3233/SW-2012-0088

Several formats and models

Context metadata for each G-D pair:
• Provenance (PubMed ID, source)
• DisGeNET score (evidence)
• Sentence description

Coverage:
• 17,181 genes (PANTHER class)
• 14,610 diseases (MeSH class)

• Semantic Web platform to answer complex questions for the pharmacological field [5]

Example question: What is the pattern of tissue protein expression of the genes associated with Obesity that are involved in the same pathway, and which bioactive small molecules hit each target (filter minEx-pChembl=5)?
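The context metadata listed for each gene-disease pair, together with the additive score S = W CURATED + W PREDICTED + W LITERATURE and the identification scheme http://rdf.disgenet.org/resource/gda/ + ID, can be pictured as a small record. This is only an illustrative sketch: the class name, field names, weight values, source label and ID below are hypothetical and are not taken from DisGeNET.

```python
from dataclasses import dataclass
from typing import Optional

# Base of the identification scheme as printed on the poster.
GDA_BASE_URI = "http://rdf.disgenet.org/resource/gda/"


@dataclass(frozen=True)
class GeneDiseaseAssociation:
    """One gene-disease pair with its context metadata (hypothetical layout)."""
    gda_id: str              # association ID; format is an assumption
    gene_id: int             # NCBI Gene ID
    disease_cui: str         # UMLS Concept Unique Identifier
    source: str              # provenance: originating source
    pmid: Optional[int]      # provenance: PubMed ID, if any
    sentence: Optional[str]  # supporting sentence, if any
    w_curated: float         # weight from curated evidence
    w_predicted: float       # weight from predicted evidence
    w_literature: float      # weight from literature evidence

    @property
    def score(self) -> float:
        # S = W_curated + W_predicted + W_literature, as given on the poster.
        return self.w_curated + self.w_predicted + self.w_literature

    @property
    def uri(self) -> str:
        # Identification scheme: base URI + ID.
        return GDA_BASE_URI + self.gda_id


if __name__ == "__main__":
    # All values below are made up for illustration only.
    example = GeneDiseaseAssociation(
        gda_id="EXAMPLE_ID",
        gene_id=4160,            # MC4R
        disease_cui="C0028754",  # Obesity
        source="EXAMPLE_SOURCE",
        pmid=None,
        sentence=None,
        w_curated=0.5,
        w_predicted=0.25,
        w_literature=0.125,
    )
    print(example.uri, example.score)
```

Keeping the three weights as separate fields, and deriving the score from them, leaves the contribution of each evidence type inspectable.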
• Large-scale integration across domains
• DisGeNET Cytoscape plugin

60% complex, 36% rare/Mendelian, and 4% infectious diseases

[Figure: disease vocabularies (DO, MSH, OMIM, NCI, ORDO, ICD9); values 19, 58, 38, 33, 13, 12]

Recent findings