
EUGM 2014 - Brock Luty (Dart Neuroscience): A ChemAxon/KNIME based tool for designing chemical libraries

Jul 07, 2015




As the use of parallel synthesis in early-stage drug discovery has evolved, medicinal chemists have demanded ever more sophisticated tools for the design and virtual screening of potential chemical libraries. We have created and deployed a chemical library design tool (LDT) using ChemAxon technology along with the Infocom nodes in KNIME. Users enumerate potential libraries with Reactor, employing curated reactions, and add standardized calculated properties. Custom KNIME nodes call back-end services on a high-performance computing grid to enable computationally intensive calculations (e.g. OpenEye ROCS), with result sets pushed back to the user on reconnection. Library profile shaping in Spotfire allows the selection of reaction sets with optimized properties, which are then pushed back into KNIME for further processing and export.
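The enumerate-then-filter idea described above can be pictured with a small sketch. It is not the LDT or Reactor: the reactant IDs and molecular weights are made up, the amide-coupling mass balance (acid plus amine minus one water) stands in for a curated reaction, and the 300 Da cutoff is an arbitrary illustrative filter.

```python
from itertools import product

WATER_MW = 18.015  # mass lost per amide bond formed (one H2O)

# Hypothetical reactant tables: ID -> molecular weight (illustrative values)
acids = {"A1": 122.12, "A2": 180.16, "A3": 256.30}
amines = {"B1": 73.14, "B2": 107.16}


def enumerate_library(acids, amines):
    """Pair every acid with every amine and attach a computed product MW."""
    rows = []
    for (acid_id, acid_mw), (amine_id, amine_mw) in product(
        acids.items(), amines.items()
    ):
        rows.append({
            "product_id": f"{acid_id}-{amine_id}",
            "acid": acid_id,
            "amine": amine_id,
            "mw": acid_mw + amine_mw - WATER_MW,
        })
    return rows


def filter_by_mw(rows, max_mw):
    """Keep only the products at or below the molecular-weight cutoff."""
    return [row for row in rows if row["mw"] <= max_mw]


library = enumerate_library(acids, amines)        # 3 acids x 2 amines
selected = filter_by_mw(library, max_mw=300.0)    # arbitrary cutoff
```

In the tool itself the property step is delegated to standardized server-side services; the local function here only marks where that call would sit.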

  • 1. A ChemAxon/KNIME based tool for designing chemical libraries. Tim Parrott, Dart NeuroScience; Brock Luty, Dart NeuroScience. September 25, 2013. ChemAxon UGM.
  • 2. Dart NeuroScience: Small molecules to maintain cognitive vitality (LTM). Currently about 200 FTEs with build-out expected at 260. Privately held LLC by a single individual.
  • 3. Scientific Computing: Scientific Computing collaborates with other DNS Departments to deliver solutions that simplify and accelerate the drug discovery process. We rely on our (non-traditional) knowledge and experience in both Science and Technology to develop novel and efficient systems to meet this goal.
  • 4. Scientific Computing Groups. Bioinformatics: Philip Cheung, Doug Fenger, + 1 FTE. Information Management: + 1 Group Lead, John Jaeger, Tim Parrott, James Harr, Eileen Tompkins, Heather Jones. Methods Development: Ron Blanford, Daniel Garden, Kevin Neal, Hari Muddana, + 1 FTE. Computational Chemistry: *Tami Marrone, Meg McCarrick, James Na, Amy Shih, Bill Sinko. Project Support: Modeling; SBDD/Library Design; Apply Methods; Pre-LO/LO/PCC. Data / Biz Analysis: Data Capture; Analytics; Data Access; QA/Scientific Support; Project Management. Software Development: Informatics Software Development; Developing new methods; Enterprise Scale Architecture; RIA (MVC) with SOA; Extensions for ELN, Spotfire, IJC, etc. Project Support: Target ID; Expression Analysis / Pathways; Novel Software algorithms; Enterprise Software (with Methods).
  • 5. Background: Dart NeuroScience (DNS): 200+ Scientists, 50+ Chemists. Parallel Synthesis Group: about 20 chemists involved in the design and creation of chemical libraries. We need a chemical library design tool!
  • 6. A Basic Chemical Library Design Tool: Enumerate Products; Calculate Properties; Analyze & Filter; Select Reactants. Design, Test, Analyze, Synthesize.
  • 7. Goals: Support, Ease of Use, Productivity. Standardize calculations & reactions (services). Simplify: wrap processes and minimize import/export operations. Enhance capabilities and speed by doing calculations remotely. Constraints: Limited IT/IM support; chemists already on software overload. Approach =
  • 8. Chemical Property Calculations, Reaction Enumeration; Data Pipelining; Visualization / Analytics; 3D Scoring Platforms.
  • 9. Architecture: Heavily invested in service-oriented architecture (REST-style API) with standardized DNS patterns. Domain CRUD (Create, Read, Update, Delete). GUIs written for specific entities using MVC pattern (relying on Backbone.js and standardized DNS patterns). Traditional stateless computational services (property calculation, enumeration, etc.). Services can be based on scripts using command-line applications (primary use case). Services can also be written in KNIME and run in this architecture. Move all the heavy lifting to the servers (automated parallelization). KNIME as a service orchestration layer. Application; Service; Database. Brock's Geeky Slide.
  • 10. Tool Overview: Selection & Configuration Panel; Custom Nodes; Spotfire Export.
  • 11. Reactant Selection: Import curated classes of reactants (CRUD Service).
  • 12. Reactant Selection: Import list of Reagent Numbers (CRUD Service).
  • 13. Reactant Deduplication: Input; Output. Need to identify and remove functionally equivalent reactants (Comp Service).
  • 14. Reaction Selection.
  • 15. Reactions: A Look under the Hood. Reactor nodes can contain multi-step workflows (Comp Services). Server-Side.
  • 16. Calculations.
  • 17. Clustering. Server-Side.
  • 18. Calculations: OpenEye ROCS. ROCS output includes the Shape/Pose that scored best and the Tanimoto Score against that query (Computational Service).
  • 19. Pausing Local Execution.
  • 20. Export to Spotfire.
  • 21. Selections made in Spotfire.
  • 22. Spotfire Selections returned to KNIME: New nodes with selected products & reactants appear in KNIME.
  • 23. Final Steps: The library design plan contains separate SDF files for the products and each reactant, along with a .csv file listing how many times each reactant is used. The zipped file is parsed on import into a chemist's electronic laboratory notebook. Stereochemical codes needed for registration are assigned based on structure (Computational Service).
  • 24. Load Library Design Plan into the Agilent ELN: Custom Forms for planning and products tables.
  • 25. Summary: June 2011: Parallel Synthesis Group formed. June 2012: First release of Library Design Tool (LDT). Sept 2012: Additional KNIME training. November 2012: Second release (Clustering, ROCS). April 2013: Pausable Nodes, Deduplication. August 2013: RN Lookup, Stereo Code Assigner. 40 Total Reactions.
  • 26. Acknowledgments: Node Development; Services & Deployment; Testing and troubleshooting; Management & PM. loki der quaeler, Ron Blanford, Karen Do, Kenny Leung, Zach Young, Daniel Garden, Eileen Tompkins, Andrew Burritt, The SGC Team, Melanie Nelson, Heather Jones, Brock Luty.
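The reactant-usage file mentioned under Final Steps lends itself to a short sketch. This is an assumption-laden illustration, not the LDT export code: the product and reactant IDs, the column names, and the file name `reactant_usage.csv` are all hypothetical, and the SDF files are left out because writing them would need a cheminformatics toolkit.

```python
import csv
import io
import zipfile
from collections import Counter

# Hypothetical selected products: (product ID, reactant IDs used to make it)
selected_products = [
    ("P1", ("A1", "B1")),
    ("P2", ("A1", "B2")),
    ("P3", ("A2", "B1")),
]


def reactant_usage(products):
    """Count how many selected products each reactant appears in."""
    counts = Counter()
    for _product_id, reactants in products:
        counts.update(reactants)
    return counts


def build_plan_zip(products):
    """Return the bytes of a zip archive holding the reactant usage CSV."""
    usage = reactant_usage(products)

    csv_buffer = io.StringIO()
    writer = csv.writer(csv_buffer)
    writer.writerow(["reactant_id", "times_used"])  # hypothetical header
    for reactant_id in sorted(usage):
        writer.writerow([reactant_id, usage[reactant_id]])

    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Hypothetical file name; the real plan also carries SDF files.
        archive.writestr("reactant_usage.csv", csv_buffer.getvalue())
    return zip_buffer.getvalue()


plan_bytes = build_plan_zip(selected_products)
```

Building the archive in memory keeps the sketch free of temporary files; an importer on the notebook side could open the same bytes with `zipfile.ZipFile(io.BytesIO(plan_bytes))`.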