
ChemModLab: A Web-based Cheminformatics Modeling Laboratory S. Stanley Young + ECCR and ChemSpider Teams.

Jan 01, 2016


Barry Price
Page 1: ChemModLab: A Web-based Cheminformatics Modeling Laboratory S. Stanley Young + ECCR and ChemSpider Teams.

ChemModLab: A Web-based Cheminformatics Modeling Laboratory

S. Stanley Young + ECCR and ChemSpider Teams

Page 2: ChemSpider: A Web-based Chemical Informatics Resource

S. Stanley Young + ECCR and ChemSpider Teams

Page 3: What is ChemSpider?

ChemSpider is a molecular structure-centric web service for chemists:
• Chemical structure drawing, manipulation, visualization, modeling & databasing
• Web location to deposit, curate and enhance data associated with chemical structures
• Web structure-based access to federated chemistry databases representing chemical vendors, literature, online data, patents and other forms of chemistry data

Page 4: How do people generally use ChemSpider?

• Searching for chemical structures, in rank order, via:
  • Registry numbers, trade names and synonyms
  • Structure identifiers such as SMILES or InChI
  • Intrinsic properties: commonly mass-based searches executed by mass spectrometrists
  • Systematic names: IUPAC or CAS Index name
• Generation of physicochemical properties
• Text-based searching of Open Access articles
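To make the mass-based search concrete, here is a minimal sketch that computes a monoisotopic mass from a molecular formula and filters a tiny in-memory table by a ppm window, returning hits ranked by closeness. The record table, the function names and the 5 ppm tolerance are assumptions for illustration only; this is not ChemSpider's API or its actual search implementation.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
                "O": 15.9949146, "F": 18.9984032}

def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C17H18F3NO'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

# Hypothetical mini-database: (name, molecular formula).
RECORDS = [("fluoxetine", "C17H18F3NO"),
           ("caffeine", "C8H10N4O2"),
           ("aspirin", "C9H8O4")]

def search_by_mass(target, ppm=5.0):
    """Return names whose mass lies within +/- ppm of target, closest first."""
    hits = []
    for name, formula in RECORDS:
        mass = monoisotopic_mass(formula)
        error_ppm = abs(mass - target) / target * 1e6
        if error_ppm <= ppm:
            hits.append((error_ppm, name))
    return [name for _, name in sorted(hits)]

matches = search_by_mass(309.1340)
```

A mass spectrometrist would typically start from an observed accurate mass and choose the tolerance to match the instrument; the ranking by ppm error mirrors the "in rank order" idea in the list above.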

Page 5: ChemSpider Status, August 2007

• Online database of over 16.5 million structures
• Systems in place for:
  • Single structure and data collection depositions
  • Association of analytical data with structures
  • Ability to curate data for each individual record
• Indexing of and integration to:
  • Over 70 individual databases
  • Patents from the US, European and Asian patent offices
• Text-based searching of over 50,000 Open Access articles
• Over a thousand unique users access ChemSpider per day

Page 6: Flexible Boolean Searching

Page 7: Predicted Properties Details: “Prozac”

Page 8: Search result: 49 hits in 2.8 seconds

Page 9: Integrated Visualization Tools

Page 10: External Integrations - Wikipedia

The links between Wikipedia and ChemSpider are formed automatically

Page 11: What is ChemModLab?

• ChemModLab is a Web Service for building and evaluating QSAR models.
• Send your data: assay results and SD file.
• Use any or all of five descriptor types (2D). (Use your own descriptors)
• Use any or all of 16 statistical modeling methods.
• Predict potency of untested compounds.
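As a rough illustration of the workflow above (descriptors in, potency prediction out), the sketch below uses k-nearest neighbors, one of the listed method families, with Tanimoto similarity on sets of "on" descriptor bits. The training data, the bit sets and the function names are hypothetical, and ChemModLab's own implementation may differ.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' descriptor bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def knn_predict(train, query_bits, k=3):
    """Average the potency of the k training compounds most similar to the query."""
    ranked = sorted(train, key=lambda rec: tanimoto(rec[0], query_bits), reverse=True)
    nearest = ranked[:k]
    return sum(potency for _, potency in nearest) / len(nearest)

# Hypothetical training set: (descriptor bits, measured potency).
TRAIN = [
    ({1, 2, 3, 4}, 8.0),
    ({1, 2, 3, 5}, 7.0),
    ({1, 2, 6}, 6.0),
    ({7, 8, 9}, 2.0),
    ({8, 9, 10}, 1.0),
]

# Predict the potency of an untested compound from its three nearest neighbors.
prediction = knn_predict(TRAIN, {1, 2, 3}, k=3)
```

In practice the descriptor bits would come from the submitted SD file, and the choice of descriptor set and of k would be compared across models rather than fixed up front.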

Page 12: Virtual Screening

ChemSpider

ChemModLab

Page 13: ChemModLab Dialog (1)

Data Input

Page 14: ChemModLab Dialog (2)

Five 2D Descriptor Sets

Page 15: ChemModLab Dialog (3)

16 Modeling Methods

Page 16: ChemModLab Modeling Methods

16 Statistical Modeling Methods
• Trees: randomForest, rpart, tree
• Neural networks
• k-nearest neighbors
• Support vector machines
• Partial least squares
• Partial least squares with linear discriminant analysis
• Least angle regression
• Ridge regression
• Elastic net
• Principal components regression
• Family ensemble of k-nearest neighbors, using 70% selection
• Family ensemble of tree, using 70% selection
• Family ensemble of rpart, using 70% selection
• randomForest using 70% selection

Page 17: ECCR@NCSU + ChemSpider Plan

User submits data to ChemModLab to get QSAR Model(s).

Model is sent to ChemSpider.

ChemSpider computes a “virtual screen”.

The hit-list is clustered and sent to the user.
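The plan above can be sketched end to end in a few lines: score a structure library with a model, keep the top-ranked hits, then cluster the hit-list. Everything here is an assumption for illustration: the library, the toy scoring function standing in for a QSAR model, the choice of leader clustering and the 0.5 Tanimoto threshold are not taken from the slides.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def virtual_screen(model, library, top_n):
    """Score every library compound with the model and keep the top_n by score."""
    scored = sorted(library.items(), key=lambda item: model(item[1]), reverse=True)
    return [name for name, _ in scored[:top_n]]

def leader_cluster(names, library, threshold=0.5):
    """Leader clustering: join the first cluster whose leader is similar enough,
    otherwise start a new cluster with this compound as its leader."""
    clusters = []  # the first element of each cluster is its leader
    for name in names:
        for cluster in clusters:
            if tanimoto(library[name], library[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters

# Hypothetical library: compound id -> fingerprint bits.
LIBRARY = {
    "cmpd_a": {1, 2, 3, 4},
    "cmpd_b": {1, 2, 3},
    "cmpd_c": {1, 7, 8},
    "cmpd_d": {7, 8, 9},
    "cmpd_e": {20, 21},
}

def toy_model(bits):
    """Hypothetical stand-in for a QSAR model: counts 'favourable' bits."""
    return len(bits & {1, 2, 3, 7, 8})

hits = virtual_screen(toy_model, LIBRARY, top_n=4)
clusters = leader_cluster(hits, LIBRARY)
```

Clustering the hit-list before returning it lets the user pick representatives from each structural family instead of many near-duplicates of the top-scoring compound.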

Page 18: Accumulation Curves

Compare descriptor sets, given a method

Page 19: Accumulation Curves

Compare modeling methods, given a descriptor set
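For readers unfamiliar with accumulation curves, the sketch below uses the common definition: rank compounds by predicted score, "test" them in that order, and track the cumulative number of actives found. The scores and activity labels are made up, and the slides do not specify how ChemModLab computes its curves, so treat this as an assumption.

```python
def accumulation_curve(scores, is_active):
    """Cumulative count of actives found when compounds are tested
    in order of decreasing predicted score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    curve, found = [], 0
    for i in order:
        if is_active[i]:
            found += 1
        curve.append(found)
    return curve

# Hypothetical predictions from two models for the same six compounds.
ACTIVE = [True, False, True, False, False, True]
MODEL_A = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7]  # ranks the actives first
MODEL_B = [0.3, 0.9, 0.5, 0.8, 0.1, 0.6]  # ranks two inactives first

curve_a = accumulation_curve(MODEL_A, ACTIVE)
curve_b = accumulation_curve(MODEL_B, ACTIVE)
```

A curve that rises faster finds actives with fewer tests, so overlaying the curves for several models (different methods on one descriptor set, or different descriptor sets for one method) gives a direct visual comparison.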

Page 20: Diversity Map

Figure labels: Cluster Active Compounds; Modeling Methods

Page 21: Continuous Response

Page 22: Continuous Response

Page 23: Continuous Response

Page 24: Model Evaluation

Take detailed looks at which models?

AID348 (NCGC):
• KNN – Ph
• ENet – CAP
• RF – B#
• RF – CAP
• RF – FF
• Tree – CAP
• Tree – Ph
• Tree – FF
• PLS – CAP

Page 25: Summary

1. ChemSpider is a web chemical informatics center.

2. ChemModLab is a free web service for QSAR.

3. Together they support sophisticated virtual screening.

* ChemModLab is supported by the NCI RoadMap project.

Page 26: ECCR@NCSU Group and ChemSpider Group

ChemModLab Team

Jacqueline M. Hughes-Oliver, Atina D. Brooks, Gary W. Howell, Kirtesh Patil, Stan Young, Qianyi Zhang

ChemSpider Team

Antony Williams (project lead)

A rotating team of advisors and developers including many contributions from the Open Source community

eccr.stat.ncsu.edu www.chemspider.com