www.guidetopharmacology.org
Open and Closed Antimalarial Drug Discovery: Comparing Data Connectivity Gaps and Disclosure Speed
Dr Christopher Southan, Senior Database Curator, IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb), University of Edinburgh
BioIT Boston 2016, Wed 6th April, Track 11, Open Source Innovations 16:30
http://www.slideshare.net/cdsouthan/antimalarial-drug-dscovery-data-disclosure
Antimalarial research is the poster child for Open Source Drug Discovery (OSDD). However, many lead compounds still have their origins in Traditional Closed Drug Discovery (TCDD), and uncertainty remains as to the differences. To provide an assessment, this work examined 32 recent antimalarial structures in terms of their PubChem connectivity. Of these, 21 had patent matches, only 23 linked to publications and only 21 had BioAssay records. Major data connectivity problems included 1) leads not findable by code name, 2) patents not cited in publications, 3) leads not reciprocally linked to Plasmodium protein targets and pathways and 4) name-to-structure mappings only being declared years after patent disclosure. These issues will be contrasted with the Sydney University Open Source Malaria approach, where open lab books are used to surface structures (e.g. as Google-findable InChIKeys) and crowdsourced collaboration data close to real time, thereby shaving years off the discovery phase.
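As an illustration of what "Google-findable InChIKey" means in practice, here is a minimal Python sketch. It is not OSM tooling: the function names are invented for this example, the regular expression checks only the layout of an InChIKey (blocks of 14, 10 and 1 uppercase letters separated by hyphens) and not whether the key belongs to a real structure, and the aspirin key is used purely because it is well known.

```python
import re
from urllib.parse import quote_plus

# Layout of an InChIKey: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(text: str) -> bool:
    """Return True if text has the layout of an InChIKey (format check only)."""
    return INCHIKEY_PATTERN.fullmatch(text.strip()) is not None


def google_query_url(inchikey: str) -> str:
    """Build a Google search URL for an exact-match (quoted) InChIKey."""
    key = inchikey.strip()
    if not looks_like_inchikey(key):
        raise ValueError(f"not an InChIKey-shaped string: {inchikey!r}")
    # Wrapping the key in double quotes asks for an exact string match.
    return "https://www.google.com/search?q=" + quote_plus(f'"{key}"')


if __name__ == "__main__":
    # Aspirin's InChIKey, used only as a familiar example.
    print(google_query_url("BSYNRYMUTXBXSQ-UHFFFAOYSA-N"))
```

Because the key is a fixed-length string with no spaces or special characters, a quoted search matches it exactly wherever it has been posted as plain text, for example in an open lab book entry.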
3
Outline
• Introduction to Open Source Drug Discovery (OSDD)
• Differences to Traditional Closed Drug Discovery (TCDD)
• Extracting antimalarial leads from the literature
• Profiling structures in PubChem
• A look into the MMV Pathogen Box
• Introducing Open Source Malaria (OSM)
• Profiling the OSM structure collection
• Speed sharing
• Google searching InChIKeys
• Conclusions
• Open structure sets
• References and questions please
4
Introduction
• The OSDD concept is not tied to any particular group
• While antimalarials have become a poster child for OSDD, many leads still come through the TCDD route, so boundaries between the two are blurred
• OSDD has become a test bed (e.g. open data sets from GSK and others, the Medicines for Malaria Venture (MMV) “Malaria Box” and WIPO Re:Search IP sharing)
• Sydney Open Source Malaria project (@O_S_M) adheres to OSDD principles (see PMID 23985301)
• I have donated voluntary support to the OSM team since 2012 (i.e. in addition to my Guide to PHARMACOLOGY Senior Database Curator job)
• This has focused on structure searching and data surfacing
• I blog on data connectivity in general, and for antimalarials in particular
• The surfacing speed for structures reflects “shades of openness” that will be discussed
5
Open vs closed research routes to new medicines
TCDD
• Proprietary data
• Patent filings
• Leads may be blinded by code numbers
• Papers after patents
• No direct submissions to public databases
• Predominantly commercial software and databases
• Typically ~10 years R&D
• Still the dominant model
OSDD
• Open ELNs
• No patent filings
• Data surfaced rapidly for sharing
• Open access papers
• Submissions to public databases
• Anyone can contribute
• Crowdsourcing
• Preference for open source software and public databases
• Potential to shorten research
• Pure OSDD relatively rare
6
Recent review of leads - but
• Link-free zone (except for references)
• PDF “tomb” with images for structures
• No chemical specifications
• No database identifiers
• No target protein identifiers
• DDD107498 was blinded at that time (no structure)
• I mapped to PubChem CIDs as a community service
7
Consequently, much effort was needed to get from this to this
8
Getting name-to-structure out of primary papers: not trivial
• On a good day, MeSH curators will index the lead structures specified in PubMed and connect them to PubChem
• On a bad day (as in this case), they may record the name but without a link to a chemical structure
• The code name is still PubChem-negative (no match) after a year
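Whether a code name resolves in PubChem can be checked programmatically through the PUG REST name lookup. The sketch below is a minimal illustration under stated assumptions: the helper names are invented here, the code only builds the request URL and parses a response body (fetching is left to the reader), and the sample response in the usage note is an illustrative aspirin record, not a result for any antimalarial lead.

```python
import json
from urllib.parse import quote

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def name_lookup_url(name: str) -> str:
    """URL asking PubChem for the InChIKey of compounds matching a name."""
    return f"{PUG_REST}/compound/name/{quote(name, safe='')}/property/InChIKey/JSON"


def extract_hits(response_text: str) -> list:
    """Return (CID, InChIKey) pairs from a PUG REST property response.

    A name with no match comes back as a "Fault" object rather than a
    "PropertyTable"; in that case this returns an empty list.
    """
    payload = json.loads(response_text)
    rows = payload.get("PropertyTable", {}).get("Properties", [])
    return [(row["CID"], row["InChIKey"]) for row in rows]


if __name__ == "__main__":
    # The code name from the slides, used only to show the URL shape.
    print(name_lookup_url("DDD107498"))
```

An empty list from `extract_hits` corresponds to the "bad day" case above: the name exists in the literature but is not connected to a structure record. Note that PubChem also signals a failed lookup with an HTTP error status, so a fetching wrapper would need to handle that.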
9
Curatorial ferreting: DDD107498 structure and patent
IUPAC name from supplementary data > chemicalize.org > PubChem > SureChEMBL > SAR table
10
PubChem profile for 32 antimalarial lead structures
• Encouragingly, published output of antimalarial leads is increasing
• However, challenges of curating and mapping are similar to those encountered by the GtoPdb team for human targets and ligands
• There is a grey zone between TCDD and OSDD and some leads are patented
• Authors and stakeholders should ensure their SAR is surfaced and name-to-structure connected in databases (i.e. FAIR principles, see PMID 26978244)
• Gaps persist in mappings between leads, targets and pathways
• The practice of OSDD by OSM and collaborators accelerates research
• PubChem MyNCBI collections are useful for sharing structure sets
21
PubChem MyNCBI open structure sets
• 16 clinical candidates from PMID 26000721
• 22 leads from various sources
• 114 from the Pathogen Box
• 250 from the OSM PubChem matches
n.b. Those engaged in antimalarial research can contact me if they need technical details and/or possible generation of new lists (e.g. CID subsets or patent extractions)