
ChemAxon Presentation

May 09, 2015

Page 1: ChemAxon Presentation

Solutions for Cheminformatics

November 2008

Migration from ISIS environment

Szabolcs Csepregi et al

Page 2: ChemAxon Presentation

Migration - Topics

• ChemAxon - Product Overview

• From Isis/Host and MDL Direct to JChem Cartridge

• Alternatives to Cheshire (Standardizer)

• From ISIS/Base to Instant JChem

• From ISIS for Excel To JChem for Excel

• Migrating Custom Applications

• ChemAxon Web Services

• Appendix: ChemAxon for Developers (Resources)

Page 3: ChemAxon Presentation

Product Map

Page 4: ChemAxon Presentation

ChemAxon Embedded - examples

• Workflow: Pipeline Pilot, Inforsense, and KNIME

• ELN: Agilent, Contur, DeltaSoft, Kinematic, etc.

• SAR: Spotfire, Synaptic Science, Omniviz

• Databases: Aureus, GVK, Jubilant Biosys, Patcore

• Web: Thomson Reuters, Wiley, Houghton Mifflin, Cengage, Prentice Hall, Collaborative Drug Discovery, RCSB PDB, BindingDB, NIH/NLM ChemIDPlus, Molport, etc.

Page 5: ChemAxon Presentation

The Marvin family

Available as Java applets for HTML pages and Java beans for standalone apps (full API)

• MarvinSketch: structure, query & reaction editing

• MarvinView: individual and structure table visualization

• MarvinSpace: publication quality macromolecule visualization

MarvinSketch/View http://www.chemaxon.com/MarvinSketch_View.ppt

MarvinSpace http://www.chemaxon.com/MarvinSpace.ppt

Page 6: ChemAxon Presentation

Marvin Development History

Timeline figure, 1998 to 2008; milestones listed in the order they were extracted:

• Applets, Molfiles, stereo support, Windows, Unix

• SMILES, SMARTS, PDB, Rgroups, isotopes, shortcuts, Marvin Beans

• Ball and stick, JPG, PNG, SVG, Cut&Paste with Isis/ChemDraw, 2D cleaning, (de)aromatization, reaction drawing

• SDF, RDF, XYZ animations, CML, templates, compressed formats, Swing, 3D models

• Mac support, signed applets, Java Web Start, atom mapping

• Partial charge, pKa, logP/logD, 3D optimization, radicals, abbreviated groups

• Marvin file format, enhanced stereo, shapes, text boxes, multiple groups, link nodes, TPSA, recursive SMARTS, Donor/Acceptor, electron arrows

• Tautomers, resonance, lone pairs, conformers, 3D sketching, MarvinSpace, topology analysis, presentation quality graphics, ...

• More plugins, more R-groups, EMF, PDF and Mol2, improved property storage in MRV, SDfiles and RDfiles, .NET support in MarvinBeans

• Structure to name, coordination compounds, polymer drawing, OLE, Markush enumeration plugin, configurations

• Name to structure, OLE 2, Chemical Terms, customizable GUI

Page 7: ChemAxon Presentation

Calculator Plugins

Calculator Plugins http://www.chemaxon.com/Calculator_Plugins.ppt

• Elemental Analysis

• IUPAC Name: Standard IUPAC Name

• Protonation: pKa, Major Microspecies, Isoelectric Point

• Partitioning: logP, logD

• Charge: Charge, Polarizability, Orbital Electronegativity

• Isomers: Tautomerization, Resonance, Stereoisomer

• Conformation: Conformer, Molecular Dynamics

• Geometry: Topology Analysis, Geometry, Polar Surface Area (2D), Molecular Surface Area (3D)

• Markush enumeration

• Other: Hydrogen Bond Donor-Acceptor, Huckel Analysis, Refractivity

A variety of structure based calculations are available from the Marvin GUI, cxcalc command line tool and the API. The calculations are widely used within several JChem tools and are available as functions of Chemical Terms expressions.

Page 8: ChemAxon Presentation

Chemical Naming

Structure to Name/ Name to structure

Supported nomenclatures :

• Chains, Monocycles/ Traditional names with and without heteroatom/ Spiro ring systems/ Ethers/ Common characteristic groups, Ionic compounds/ Unlimited number of atoms and rings/ All atom types /Stereochemistry/ etc.

Usage:

• drag&drop or copy&paste to MarvinSketch

• Label updated in real-time

• Automatic format recognition

• Batch from command line

Page 9: ChemAxon Presentation

JChem family

• JChem Base: fast substructure and similarity searching; ChemAxon’s proprietary database structure

• JChem Cartridge: tight Oracle SQL integration; arbitrary database structure

• Instant JChem: desktop application for scientists; access to local and remote databases

Page 10: ChemAxon Presentation

JChem development history

Timeline figure, 2000 to 2008; milestones listed in the order they were extracted:

• Oracle, MySQL, SQLServer, Access, hashed fingerprints, substructure and similarity search

• DB2, PostgreSQL, Rgroup searching

• Reaction searching, fragmentation, reaction processing, standardization, pharmacophores, screening

• Clustering, diversity

• R-decomposition, R-enumeration, reaction library, custom fingerprints, random synthesis, link nodes…

• Tautomer search, Instant JChem reaction similarity, Library MCS, GUI for Standardizer/Reactor …

• Calculated columns, Installer, Tautomer Duplicate filtering, Query tables, Markush tables, speed enhancements for JChem Cartridge, form design, relational data for Instant JChem ...

• Cartridge, enhanced stereo searching, recursive SMARTS, Chemical Terms, virtual synthesis

• Position variation queries; Instant JChem: federated search, Cartridge support...; JChem for Excel

Page 11: ChemAxon Presentation

JChem Base

Features

• Fast and sophisticated searching (chemical and non-chemical data, Chemical Terms filter, many options)

• Custom standardization

• Calculated columns

• Combinatorial Markush structure tables

Interfaces

• Integration with most relational database engines

• JChem Cartridge for tight Oracle SQL integration

• JSP integration – open source web example

• Desktop-ready through Instant JChem

Structural Search http://www.chemaxon.com/Structural_Search.ppt

JChem Base http://www.chemaxon.com/JChem_Base.ppt

Page 12: ChemAxon Presentation

Searching in combinatorial Markush structures

Combinatorial Markush structure registration and search

• Markush features handled in search & enumeration:

• R-groups (nesting to any depth)

• Atom lists, bond lists

• Position variation bond

• Link nodes

• Compatible Markush enumeration plugin

• Not all query features supported

Detailed description:

http://www.chemaxon.com/product/markush_search.html

Page 13: ChemAxon Presentation

JChem Cartridge for Oracle

• Access JChem functionality via SQL functions

• All search features of JChem Base

• JChem index for chemical data in arbitrary database structure

• Chemical filters and property predictors using Chemical Terms

• Standardization (structure canonicalization) during registration

• Structure format conversions

• 2D, 3D image generation

• Library enumeration using virtual reactions and Markush structures

JChem Cartridge http://www.chemaxon.com/JChem_Cartridge.ppt

Page 14: ChemAxon Presentation

Instant JChem

Desktop application for local and remote chemical database management, search and structure based prediction

• Simply connect to external databases and share your native database simultaneously

• Powerful search functionalities

• Scalable – explore large datasets (10^6+)

• Dynamically predict properties using Calculator Plugins

• Apply canonicalization rules for import and viewing

• Wide import / export options

• Merge data sets into a single set

• Very active development – what do you want to do?

Instant JChem: http://www.chemaxon.com/conf/Instant_JChem.ppt

Page 15: ChemAxon Presentation

JChem for Excel

• Microsoft Excel integrated solution for Marvin and JChem functionality

• Use Excel’s powerful features: Functions, Sorting, Filtering, Charts…

• Implemented in C# .NET, and Visual Studio

– Proof that ChemAxon APIs can be used in a Java-less .NET environment

• Easy to install and deploy

• UNDER DEVELOPMENT

Page 16: ChemAxon Presentation

Canonicalization with Standardizer

Standardizer http://www.chemaxon.com/Standardizer.ppt

• Structure canonicalization

– Mesomers

– Tautomers

– Solvent and counter ion removal

– Aromatization, dearomatization

– Explicit/implicit hydrogen conversion

– Stoichiometry expansion

– Stereo manipulations

– 2D cleaning

– Template based cleaning

• Custom rules

• Availability

– JChem Base, Cartridge & IJC

– API (Java and .NET)

– Batch processing

– GUI

Page 17: ChemAxon Presentation

Drug discovery tools

• JKlustor: profiling, analysis, diversity

• Screen: virtual screening by topological descriptors

• Fragmenter: library profiling and reactant generation

• Reactor: virtual reactions and synthesis

Page 18: ChemAxon Presentation

Migration - Topics

• ChemAxon - Product Overview

• From Isis/Host and MDL Direct to JChem Cartridge

• Alternatives to Cheshire (Standardizer)

• From ISIS/Base to Instant JChem

• From ISIS for Excel To JChem for Excel

• Migrating Custom Applications

• ChemAxon Web Services

• Appendix: ChemAxon for Developers (Resources)

Page 19: ChemAxon Presentation

Contents

• A short introduction of JChem Cartridge

• MDL/Symyx features in JChem

• Migration from MDL/Direct and ISIS/Host

• Migration case studies and user feedback

Page 20: ChemAxon Presentation

Purpose of JChem Cartridge

• Access JChem functionality using SQL:

SELECT count(*) FROM nci WHERE jc_contains(structure, 'Brc1cnc2ccccc12') = 1

Access JChem in any programming environment offering Oracle connectivity (.NET, Java, Perl, PHP, Python, Apache mod_plsql...)

• Execute SQL queries efficiently using extensible indexes

Precompute chemical information on structures by creating jc_idxtype indexes:

CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype

The jc_idxtype implementation scans the indexed column for eligible structures in one single performance-optimized operation: domain index scan
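Since the slide lists Python among the environments that can reach the cartridge, the count query above can be wrapped as follows. This is a minimal sketch, not ChemAxon code: the helper names are hypothetical, the cursor is any DB-API cursor obtained from an Oracle driver of your choice, and it assumes jc_contains accepts a bind variable in place of the literal query structure.

```python
import re

# Plain Oracle identifier: a letter first, then letters, digits, _, $ or #.
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def _ident(name):
    """Return name unchanged if it is a plain identifier, otherwise raise."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def substructure_count_sql(table, column):
    """Build the jc_contains count query; the query structure is a named bind."""
    return (
        f"SELECT count(*) FROM {_ident(table)} "
        f"WHERE jc_contains({_ident(column)}, :query) = 1"
    )


def count_substructure_hits(cursor, table, column, query_smiles):
    """Run the count on a DB-API cursor that supports named bind parameters."""
    cursor.execute(substructure_count_sql(table, column), {"query": query_smiles})
    return cursor.fetchone()[0]
```

With a live connection, the slide's example corresponds to count_substructure_hits(conn.cursor(), "nci", "structure", "Brc1cnc2ccccc12"). Table and column names are validated rather than bound because SQL identifiers cannot be passed as bind variables.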

Page 21: ChemAxon Presentation

Features of JChem Cartridge

• Adds chemistry knowledge into the SQL language of Oracle (SELECT, INSERT, UPDATE, ...)

• Substructure, superstructure, exact structure, similarity searching

• Fast: typically 10k hits in 3M structures within a second

• Complex chemical expressions using the Chemical Terms language that includes logP, pKa, ...

• Automatic property calculation during registration

• Standardization (canonicalization) during registration

• Structure format conversions (MRV, Molfile, SDfile, RDfile, SMILES, CML, etc.)

• 2D, 3D image generation

• Structure enumeration using reaction rules

• Interaction with Oracle optimizer

Page 22: ChemAxon Presentation

Operators and functions

• jc_compare: substructure/similarity/exact searching combined with Chemical Terms expressions

• jc_matchcount: number of occurrences of the query structure in the target

• jc_evaluate: Chemical Terms evaluation

• jc_molweight: molecular weight

• jc_formula: molecular formula

• jc_react: structure enumeration based on virtual reactions

• jc_standardize: structure canonicalization

• jc_molconvert: conversion to different formats (image generation is supported)

• jc_tanimoto: similarity search

• jcf.hitColorAndAlign: substructure coloring and alignment

Similarity search example displaying ID, SMILES code, and molweight:

SELECT cd_id, cd_smiles, cd_molweight FROM my_structures WHERE jc_tanimoto(cd_smiles, 'CC(=O)Oc1ccccc1C(O)=O') >= 0.8;
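To make the 0.8 threshold in the similarity example concrete, the sketch below computes a Tanimoto coefficient on toy fingerprints. It assumes jc_tanimoto returns the standard Tanimoto coefficient (shared on-bits divided by the union of on-bits); the bit sets are invented for illustration and are not real JChem hashed fingerprints.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(union)


def passes_threshold(bits_a, bits_b, threshold=0.8):
    """True when the pair would survive a '>= threshold' similarity filter."""
    return tanimoto(bits_a, bits_b) >= threshold


# Toy fingerprints: 4 shared bits out of 6 distinct bits in total.
query = {1, 4, 7, 9, 12}
target = {1, 4, 7, 9, 15}
print(tanimoto(query, target), passes_threshold(query, target))
```

Here the pair shares 4 of 6 bits, so it falls below 0.8 and would be filtered out; a target that adds a single extra bit to a five-bit query (5 of 6 shared) would pass.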

Page 23: ChemAxon Presentation

Structure search features

• Wide range of query atoms

• Query properties

• R-group queries

• Full SMARTS support

• Coordination compounds

• Link nodes

• Position variation

• Pseudo atoms

• Lone pairs

• Relative stereo

• Reaction search features

• Hit coloring

See detailed information on structure search: www.chemaxon.com/conf/Structural_Search.ppt

Page 24: ChemAxon Presentation

Search options

• Chemical Terms filter constraint

• Tautomer search

• sp hybridization state check

• Stereo on/off

• Ignore charge/isotope/radical/valence/mixture brackets

• Vague bond matching modes: "or aromatic"; ignore bond types

• Inverse hit list

• Maximum search time / number of hits

• SQL SELECT statement for pre-filtering

• Ordering of results

• etc.

Page 25: ChemAxon Presentation

Compatibility and integration

File formats:

• SMILES

• MDL molfile (v2000 and v3000)

• MDL SDF

• RXN

• RDF

• MRV

• IUPAC name, InChI

Operating systems:

• Windows

• Linux

• Solaris

• HP-UX

• etc.

DB engines:

• Oracle versions 9i R2 or above

• For alternative RDBMS systems, see the JChem Base presentation: http://www.chemaxon.com/JChem_Base.ppt

Page 26: ChemAxon Presentation

Index parameters

Index parameters affect:

• Fingerprint attributes

• Standardizer configuration

• Table space and storage options of the index table

Examples:

• Standardization by stripping hydrogens and using basic aromatization:

CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype PARAMETERS('STD_CONFIG=dehydrogenize:optional..aromatize:b')

• Add structural keys to fingerprint for more efficient substructure searching (structural keys are defined in table stfp_keys):

CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype PARAMETERS('STRUCTURALFP_CONFIG=select structure from stfp_keys')
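When many tables need indexing, the two statements above can be generated instead of typed. The following is a sketch only: the function name is hypothetical, it emits one KEY=value parameter per index exactly as in the slide's examples, and it deliberately does not guess how several parameters would be combined.

```python
import re

# Plain Oracle identifier: a letter first, then letters, digits, _, $ or #.
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def create_jchem_index_sql(index, table, column, param=None):
    """Build a CREATE INDEX ... INDEXTYPE IS jc_idxtype statement.

    param is an optional (key, value) pair such as ("STD_CONFIG", "...").
    """
    for name in (index, table, column):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    sql = f"CREATE INDEX {index} ON {table}({column}) INDEXTYPE IS jc_idxtype"
    if param is not None:
        key, value = param
        # Double any single quote so the text stays one valid SQL string literal.
        literal = f"{key}={value}".replace("'", "''")
        sql += f" PARAMETERS('{literal}')"
    return sql


print(create_jchem_index_sql(
    "jcxnci", "nci", "structure",
    ("STD_CONFIG", "dehydrogenize:optional..aromatize:b"),
))
```

Passing ("STRUCTURALFP_CONFIG", "select structure from stfp_keys") as the parameter reproduces the second statement from the slide.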

Page 27: ChemAxon Presentation

Supported Column Types

• VARCHAR2: typically for short formats, e.g. SMILES

• CLOB, BLOB: for longer formats, e.g. MDL molfile, Marvin (mrv)

Page 28: ChemAxon Presentation

MDL Feature Compatibility

• Generic atom types and bond types

• Atom query properties

• Atom Lists/ not lists

• Aliases

• Pseudo atoms

• Atom values

• Group and brackets

– Abbreviated groups

– Multiple groups

– Repeating units

– Polymers

– Mixtures

– Attached Data

• Link Nodes

• R-groups, R-logic

• Stereochemistry

– Chiral flag

– Parity

– Double bond stereo

– Enhanced stereo (abs/and/or)

– inv/ret

• Reacting center on bonds

• Reaction mapping

• Topology (ring/chain)

• Option for ISIS-like look

• Others…

Chemists familiar with ISIS face a very short learning curve. After some practice, users report that Marvin is a more productive drawing environment. Most MDL features are available in Marvin and JChem, along with many others that are not available in MDL technology.

Page 29: ChemAxon Presentation

MDL Feature Compatibility

What is missing:

• Polymer search (coming in 5.2)

• Attached data S-group search (coming in 5.2)

• 3D special features

• Exact change flag (reaction)

Page 30: ChemAxon Presentation

Migration from MDL/Direct cartridge

• 2 alternatives:

– JChem indexes need to be created on structure columns of existing tables, or

– Structural data migrated to new tables with JChem Cartridge indexes

• The MDL/Direct SQL operators need to be changed to JChem operators in all uses.

• Non-chemistry tables: no need for migration
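The operator change can be partly automated for stored SQL text. The sketch below is an assumption-laden illustration: legacy_sss is a placeholder name, not a real MDL/Direct operator, and the old-to-new mapping must be supplied by whoever knows both cartridges. It renames function-style calls only; it does not adapt argument order or return conventions, and it does not skip string literals that happen to contain an operator name.

```python
import re


def rewrite_operators(sql, mapping):
    """Rename function-style operators: 'old(' becomes 'new(' for whole names only."""
    if not mapping:
        return sql
    # Longest names first so that a short name never shadows a longer one.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\s*\(",
        re.IGNORECASE,
    )
    lookup = {old.lower(): new for old, new in mapping.items()}
    return pattern.sub(lambda m: lookup[m.group(1).lower()] + "(", sql)


# Hypothetical mapping; 'legacy_sss' stands in for whatever the old operator is called.
mapping = {"legacy_sss": "jc_contains"}
old_sql = "SELECT id FROM cmpds WHERE legacy_sss(structure, :q) = 1"
print(rewrite_operators(old_sql, mapping))
```

A rewrite like this is best followed by the kind of side-by-side hit comparison that the respondents later in the deck describe, since identical syntax does not guarantee identical search semantics.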

Page 31: ChemAxon Presentation

Migration from ISIS/Host

• Molecule source needs to be accessible for JChem:

– Through exporting SD/RDfiles from ISIS and importing into new tables with JChem index, or

– Setting option in ISIS/Host to include molfile in RCG tables, and:

  • Use SQL to insert mol field into JChem tables, or

  • Add JChem index on original tables

• ISIS/Host interfaces need to be rewritten to use SQL only, referencing JChem operators.

• Hviews and GUIs need to be replaced separately. (See later slides.)

• Non-chemistry tables: no need for migration
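For the SDfile route, a hand-rolled loader first has to split the export into records. The sketch below relies only on the SD file layout (each record is a molfile ending in "M  END", followed by "> <FIELD>" data items, and closed by a "$$$$" line); the function names are hypothetical, and ChemAxon's own import tools are not shown.

```python
def _split_record(lines):
    """Separate one SD record into its molfile text and a dict of data fields."""
    # The molfile block ends at the "M  END" line; data items follow it.
    end = next(
        (i for i, line in enumerate(lines) if line.startswith("M  END")),
        len(lines) - 1,
    )
    molfile = "\n".join(lines[: end + 1]) + "\n"
    fields, name, value = {}, None, []
    for line in lines[end + 1:]:
        if line.startswith(">"):
            # Header looks like "> <FIELD_NAME>"; keep the text inside <...>.
            start, stop = line.find("<"), line.rfind(">")
            name = line[start + 1:stop] if 0 <= start < stop else None
            value = []
        elif line.strip() == "":
            # A blank line terminates the current data item.
            if name is not None:
                fields[name] = "\n".join(value)
            name = None
        elif name is not None:
            value.append(line)
    if name is not None:
        fields[name] = "\n".join(value)
    return molfile, fields


def iter_sd_records(text):
    """Yield (molfile, fields) for each record of an SD file given as a string."""
    record = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            if any(l.strip() for l in record):
                yield _split_record(record)
            record = []
        else:
            record.append(line)
    if any(l.strip() for l in record):
        # Tolerate a final record that lacks the closing $$$$ line.
        yield _split_record(record)
```

Each molfile string can then be written to the structure column of the new table with an ordinary INSERT, with the data fields going to regular columns.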

Page 32: ChemAxon Presentation

An Independent Comparison

FMC migrated from MDL® ISIS/Base ISIS/Host to ChemAxon’s JChem. They later published their detailed scientific comparison.

•Used 1.8 million vendor compounds to create a testing database

•Prepared 115 different query structures for comparison

•51 simple sub-structure search

•51 similarity search

•64 complex search

Search hits were identical in almost all cases; the major differences result from MDL’s incorrect aromatic bond definitions for 5-membered aromatic rings. ChemAxon's approach is the chemically correct one, and its performance is higher (faster).

Page 33: ChemAxon Presentation

Identical Results

Page 34: ChemAxon Presentation

Differences

Page 35: ChemAxon Presentation

Vague Bonds

For the sake of perfect compatibility with MDL searching ChemAxon provides vague bond options to retrieve results according to MDL systems.

Page 36: ChemAxon Presentation

Technical Comparison

• Supported Platforms

– ISIS®: Sun Solaris, Windows Servers

– JChem: Sun Solaris, Windows Servers, Linux, Irix, MAC

• Supported Databases

– ISIS: Oracle

– JChem: Oracle, MySQL, SQL Server, PostgreSQL, Access, DB2

• Processing SD Files

– ISIS: 31 hours, Pipeline Pilot & ISIS

– JChem: 11 hours, JChem

• Technology Transparency

– ISIS®: Unclear Data/Table Structures

– JChem: Clear Understanding of Flow of Data, Structure of Data, Execution Process; Native Oracle Tables and Procedures

• Performance

– ISIS®: Slow similarity search

– JChem: Fast similarity search

Page 37: ChemAxon Presentation

Comparison Conclusions

• Technical Conclusion

– Clear and straightforward understanding of data representation and system architecture

– Integrated system

– Quicker and less error-prone

– Less hassle for software development

From a technical point of view, ChemAxon is favorable

• Business Conclusion

ChemAxon was the better choice

Page 38: ChemAxon Presentation

Migration Experience Questionnaire

Five companies were interviewed about their JChem Cartridge migration experiences in the form of a questionnaire containing 14 questions.

•A UK based service/biotech company

•A Swedish biotech company

•A US branch of a Swiss pharmaceutical company

•A Japanese pharmaceutical company

•A US branch of a Japanese pharmaceutical company

Page 39: ChemAxon Presentation

Migration Experience

1. What was the platform you used before the migration?

– All systems were run using the Daycart cartridge on Linux servers

– MDL Cartridge running on Sun Solaris

– Daycart

– We used ISIS/Host as a server, the client was ISIS/Base customized using ISIS/PL

– Daylight and IDBS Chembridge

2. How long did it take to migrate?

– Very simple, hardly any time at all, just a few hours to uninstall old cartridge, install new cartridge and build indexes. Then modify a few SQL statements in the code to use the new cartridge functions.

– It took a full weekend to switch over and convert all old databases.

– Since we use SQL for structure searches, the actually change in the application code are few. Code changes takes about 1 day. However, we spent at least two weeks to compare the daylight and jcart.

– It took 1 year for planning, and another 1 year for designing and developing the system. 1-year-migration time includes all of the operation that is needed. That means our technical people worked for this project 1 year. We migrated the data structure of HView, but the form was re-designed in order to fit our existing (wet) workflow.

– Two months

Page 40: ChemAxon Presentation

Migration Experience

3. How many technical people were required in the migration process?

– It was fairly simple so just one developer with all round programming, database, and chemistry knowledge.

– One person

– 2 people

– 6 technical people. 2 were contacting with users. For the system design, 11 users were involved from chemistry, HTS, eADME groups.

– 1.5

4. Why did you decide on leaving the previous platform? (problems)

– Purely the cost. We found the Daycart system to be very good, very stable, fast, and the API was well thought out. However, it was just too expensive for us.

– Old technology not offering new functionality. High cost, in particular for new licenses.

– Daycart (at least at that time) did not take MOL query, not all query structures could be correctly presented as smiles/smarts.

– Two main reasons were the maintainance cost, and the accessibility. We had to suppress the raising system (software) cost, and at the same time we had to enlarge the number of users and client PCs from which we could use DB system.

– Cost, maintenance and risk

Page 41: ChemAxon Presentation

Migration Experience

5. What alternative platforms were considered/evaluated?

– Prior to selecting ChemAxon we looked at all the cartridges available at the time

– The Accord cartridge was also evaluated. Some others did not qualify for evaluation.

– None

– Accord (Accelrys), and ChemOffice (Cambridge Soft) were two major alternatives.

– Symyx/MDL Direct Oracle cartridge

6. Why did you choose ChemAxon technology? (advantages)

– Cost was a major factor, but also because we felt we could work with ChemAxon to develop the tools further as we wanted to use them. A very open approach. Another reason was that all the tools we needed were available from a single vendor, i.e. Oracle cartridge for searching, and sketching and viewing tools.

– Almost as good as Accord but with better impact on improvement and support.

– Marvin Sketch and JCart represent the molecules in MOL using exactly the same backend library. MOL is used instead of smiles/smarts. Much faster search. Price is good.

– We could keep the cost lowest by using ChemAxon, and more than that, the affinity for the web technology was favorable to our future vision of the cheminformatics system.

– The greatest advantage is the low cost and great support. We have always had MDL/Direct cartridge, but the greatest advantage is the low cost and stellar support speaks specifically to ChemAxon.

Page 42: ChemAxon Presentation

Migration Experience

7. What were the most problematic issues that occurred during the migration? (negative impressions)

– Understanding the finer points of all the search functions / options i.e. precisely how things like aromaticity, stereochemistry, etc. are handled. We've also had to spend time considering how to restandardise structures and how to rewrite SQL. When doing a straight forward structure search (i.e. benchmarking), the JChem cartridge performs very well against other systems such as Daylight, however, if you want to incorporate joins between tables can considerably affect the query times even when using what we call ChemAxon SQL.

– Structure matching bugs in the cartridge and undocumented actions needed to be performed.

– JCart installation was not so smooth 3 years ago. Much better now. Most of the problem and issues are because some structures are interpreted different between the two software. Some are Daylight bugs and others are jchem bugs. JChem has fix all their share.

– There were little problem, what I remember is that the response was slower than expected when the chemical object was included in the page.

– Identifying all the integration points.

Page 43: ChemAxon Presentation

Migration Experience

8. How could you overcome these difficulties? (resolutions)

– We spent a lot of time experimenting with the different functions/options so we completely understand what they do.

– The structure search bugs was overcome by rewriting the registration procedures, undocumented actions were overcome by hard work.

– Wait until major bugs in JChem are fixed. We live with about 0.01% of inconsistencies and work it out later.

– The needless chemical objects were replaced by pictures.

– Availability and quick turn around to patch any

9. Did you expect any other problem that did not occur? (positive impressions)

– We thought there may be problems running two different cartridges on the same table but this worked fine

– Not really. Most MDL features were available in JChem. This was one of the selection criteria, particularly important for chemical registration.

– No

– We expected that the transfer of the existing data might be problematic, and that the system change might be inconsistent with existing 'wet' workflow. That was why we organized 11 users as a system designing team, and I think the team worked well.

– Migration went very smooth

Page 44: ChemAxon Presentation

Migration Experience

10. What additional components were purchased together with the JChem Cartridge?

– Most of them!

– Descriptor calculations.

– None. User probably should consider plug-ins for calculating HBD, HBA, logp, psa, etc. We did not because we need to stick to CLOGP in order to be consistent with the rest of the company.

– Standardizer.

– Standardizer.

11. How much technical support did you need from ChemAxon for the migration?

– Initially quite a lot, though the products have been developed a lot since then. We haven't required much support for structure migration, but we've also migrated a load of SMIRKS and we've needed support for that mainly because of the way in which they were handled in the old system (non-standard).

– A few needed support cases where filed on the support forum and fairly quickly resolved.

– Lots, we had close communication with dev team during the migration.

– Our technical people sent e-mail several times to your support team.

– Little.

Page 45: ChemAxon Presentation

Migration Experience

12. Were/Are you satisfied with the ChemAxon support?

– Yes. Support has always been good.

– Yes very satisfied. The support has always been very fast and accurate.

– Yes.

– Yes.

– Yes.

13. Did the migration reach its original goals?

– So far, yes! The systems are up and running.

– Yes.

– Yes.

– Yes.

– Yes.

Page 46: ChemAxon Presentation

Migration Experience

14. Are you satisfied with the performance/functions of the ChemAxon powered system?

– The number of functions available and flexibility of the JChem tools is excellent, and allows us to develop very interesting and useful drug discovery software for our scientists.

– Yes.

– Yes.

– Yes.

– Yes.

Page 47: ChemAxon Presentation

Useful migration resources

ChemAxon's Marvin & JChem (v 3.1.3) vs. MDL® ISIS/Draw ISIS/Host (v 4.0)

Seong Jae Yu, David Roush*, Usha Ganesh, Young Moon, Henry Liu, FMC Corp.

http://www.chemaxon.com/conf/FMC_ChemAxon_JCHEM_Cart_xnotes.ppt

User Group Meeting presentations:

http://www.chemaxon.com/UGM/ugm_land.html

Page 48: ChemAxon Presentation

Migration - Topics

• ChemAxon - Product Overview

• From Isis/Host and MDL Direct to JChem Cartridge

• Alternatives to Cheshire (Standardizer)

• From ISIS/Base to Instant JChem

• From ISIS for Excel To JChem for Excel

• Migrating Custom Applications

• ChemAxon Web Services

• Appendix: ChemAxon for Developers (Resources)

Page 49: ChemAxon Presentation

Cheshire Alternatives from ChemAxon

What is Cheshire?

“Cheshire is a scripting language that enables you to write scripts to validate, modify, or gather information about chemical structures, such as molecules and reactions.”

What alternatives can ChemAxon offer?

•ChemAxon’s Java API (also available from .NET)

•Chemical Terms

•Standardizer

Page 50: ChemAxon Presentation

Java API for Cheminformatics from ChemAxon

ChemAxon’s class library consists of more than 1500 chemistry related classes tuned for usability and high performance.

Page 51: ChemAxon Presentation

Chemical Terms

charge() and match(amine) or match(hydrazine)

Chemical Terms offers more than a hundred popular chemistry functions, opening up the power of cheminformatics for those scientists who focus on quick results instead of the details of programming and scripting. The integration of Chemical Terms makes chemistry applications smarter and more customizable.

Page 52: ChemAxon Presentation

Standardizer for Batch Conversion

Standardizer is a batch conversion utility providing many useful and customizable functions for the canonicalization of chemical structures and the restoration of chemical information in structures from older databases.

Page 53: ChemAxon Presentation

Standardizer Actions

Aromatize

Dearomatize

Add Explicit Hydrogens

Remove Explicit Hydrogens

Clean2D

Clean3D

Transform

Wedge Clean

Clear Isotopes

Remove Fragments

Remove R-groups

Neutralize

Tautomerize

Mesomerize

Set Absolute Stereo

Remove Absolute Stereo

Convert Wedge Interpretation

Convert Double Bonds

Clear Stereo

Alias to Group, Alias to Atom

Contract Group

Expand Group

Ungroup

Expand Stoichiometry

Map Reaction

Unmap

Page 54: ChemAxon Presentation

Counting Groups – Cheshire

Counting O=S=O groups in Cheshire

Page 55: ChemAxon Presentation

Counting Groups – Java API

Counting any functional groups with ChemAxon’s Java API

Counting O=S=O groups in Chemical Terms

Page 56: ChemAxon Presentation

Adding Explicit Hydrogens - Cheshire

Adding explicit hydrogens and cleaning the molecule in Cheshire

Page 57: ChemAxon Presentation

Adding Explicit Hydrogens – Java API

Adding explicit hydrogens and cleaning the molecule with ChemAxon’s Java API

Page 58: ChemAxon Presentation

Adding Explicit Hydrogens – Standardizer

Adding explicit hydrogens and cleaning the molecule with Standardizer

The same in command line

Page 59: ChemAxon Presentation

Group Conversions – Cheshire

Conversion of neutral form of nitro to the ionic one in Cheshire

Page 60: ChemAxon Presentation

Group Conversions – Java API

Conversion of neutral form of nitro to the ionic one with ChemAxon’s Java API

Page 61: ChemAxon Presentation

Group Conversions – Standardizer

The same in command line

Conversion of neutral form of nitro to the ionic one in Standardizer

Page 62: ChemAxon Presentation

Structure Checker Framework

• ValenceChecker

• AromaticityChecker

• OverlappingAtomsChecker

• OverlappingBondsChecker

• CrossedDoubleBondChecker

• WigglyDoubleBondChecker

• WedgeBondsChecker

• BondLengthChecker

• BondAngleChecker

• AliasChecker

• PseudoAtomChecker

• AbbreviatedGroupChecker

• MultiComponentChecker

• QueryChecker

• MoleculeChargeChecker

• RadicalChecker

• IsotopeChecker

• ExplicitHydrogenChecker

• StereoDoubleBondChecker

• TetrahedralStereoAtomChecker

• UnspecifiedStereoDoubleBondChecker

• ChiralFlagChecker

• CovalentSaltChecker

• FerroceneChecker

• CumulatedRingBondChecker

• UnbalancedReactionChecker

• MultistepReactionChecker

• AtomMapChecker

• MissingAtomMapChecker

• AtomMapStyleChecker

• RgroupQueryChecker

• MarkushChecker

• 3DCoordinateChecker

• MolfileChecker

• RxnfileChecker

• SmilesChecker

• SmartsChecker

• InchiChecker

• PeptideSequenceChecker

• CmlChecker

• PdbChecker

The new Structure Checker framework will provide plenty of validation and correction functions to detect and repair defective or unpreferred structures.

Page 63: ChemAxon Presentation

Summary

• ChemAxon’s Java API provides similar freedom and flexibility to Cheshire for programmers to develop chemistry functions for any tier, such as web clients, desktop applications, server systems and Oracle stored procedures.

• Java is a standard language with a worldwide community, rich resources and lots of well educated developers. (The ChemAxon Java API is also accessible from .NET.)

• Chemical Terms provides more than a hundred high level, ready to use functions substituting dozens of lines of complex Cheshire code.

• Chemical Terms expressions can directly be used in database filters, virtual reactions, pharmacophore definitions or other cheminformatics applications.

• Standardizer is an easy to use batch tool and graphical interface for chemists to create conversion rules without writing a single line of code.

• The upcoming Structure Checker will provide an extensible set of quick “problem detection” functions that can be integrated in any application and will be added to Marvin and Standardizer as well.

Page 64: ChemAxon Presentation

Migration - Topics

• ChemAxon - Product Overview

• From Isis/Host and MDL Direct to JChem Cartridge

• Alternatives to Cheshire (Standardizer)

• From ISIS/Base to Instant JChem

• From ISIS for Excel To JChem for Excel

• Migrating Custom Applications

• ChemAxon Web Services

• Appendix: ChemAxon for Developers (Resources)

Page 65: ChemAxon Presentation

Instant JChem is…

• An “out of the box” desktop application designed for biologists and chemists

• A modular platform for developing chemistry applications

Page 66: ChemAxon Presentation

Instant JChem lets users…

• Create or connect to existing structure databases

• Easily manage relational data

• Import/export/merge/edit data

• Build forms for reporting

• Run combined structure + data searches

• Perform structure based predictions

• Access sophisticated chemistry features

• Collaborate with other users

Page 67: ChemAxon Presentation

Feature comparison to ISIS/Base

Feature | ISIS/Base | Instant JChem

Databases | Local + Oracle (differences in stereochemistry, calculations etc.) | Local + Oracle + MySQL (no differences in local and remote DB functionality)

Forms | Form builder | Form builder

Tabular view | Limited | Comprehensive

Relational data | Hview | Data Tree

Deployment | Installer only | Installer or Java Web Start; many deployment features

Collaboration | Limited | Many collaboration and sharing features

Scalability & performance | Limited, especially for local DBs | Good

Page 68: ChemAxon Presentation

IJC Architecture

• Built on modular platform
– Allows easy extension by ChemAxon, customers and 3rd parties
– Strong enforcement of APIs

• API
– Allows extension
– IJC functionality is built upon these APIs

Page 69: ChemAxon Presentation

Current architecture

[Architecture diagram; labels: IJC Client, Local DB, Remote DB, Database, Oracle cartridge]

Page 70: ChemAxon Presentation

IJC server architecture

[Architecture diagram; labels: IJC, IJC Client, IJC Server, Services API, Web services, Web Apps, Database, Oracle cartridge]

IJC server due Q1 2009

Page 71: ChemAxon Presentation

Migration issues: general

Issue | Current status

Database artefacts | IJC is currently table based. Access to views, synonyms etc. is currently being added. Use of database links has not been investigated yet, but no particular problems are expected.

Security model | The current implementation provides basic access control with three levels: (1) read-only, (2) read-write, (3) edit database model. Users can create forms, lists etc. even in read-only mode, but cannot modify data that affects other users.

Security integration | LDAP is probably the most suitable, but the security implementation is quite flexible and customisable.
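The three access levels described above can be sketched as a small permission check. This is only an illustration: the action names are hypothetical, and the assumption that each level includes the rights of the level below it is ours, not something the slide states. It does not reflect how IJC implements security internally.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """The three access levels named on the slide."""
    READ_ONLY = 1
    READ_WRITE = 2
    EDIT_MODEL = 3


# Hypothetical action names mapped to the minimum level assumed to allow them.
# Private forms and lists are allowed even in read-only mode, as the slide says.
REQUIRED_LEVEL = {
    "create_private_form": AccessLevel.READ_ONLY,
    "create_private_list": AccessLevel.READ_ONLY,
    "edit_shared_data": AccessLevel.READ_WRITE,
    "edit_database_model": AccessLevel.EDIT_MODEL,
}


def is_allowed(level: AccessLevel, action: str) -> bool:
    """Return True if the given level may perform the action.

    Assumes levels are cumulative (an assumption for this sketch).
    """
    return level >= REQUIRED_LEVEL[action]
```

For example, a read-only user may build a private form but may not change shared data.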

Page 72: ChemAxon Presentation

Migration issues: migration of ISIS DBs

Issue | Current status

Hview vs. Data Tree | No direct conversion, so this would currently need to be done manually, though some automation is potentially possible. Data Tree is modelled on the same approach as Hview, so migration in most cases should be relatively simple.

Forms | No direct import from ISIS, but creating IJC forms is very simple and fast.

Customisation | Currently no equivalent to ISIS/PL. ISIS applications with complex logic may be more suited as IJC extensions, or as standalone web or JWS applications.

Page 73: ChemAxon Presentation

Hview vs. Data Tree: standard tables

master_table: master_table_id, col1, col2, col3

detail_table: detail_table_id, master_table_id, cola, colb, colc

master_table (1) to detail_table (*)

ISIS Hview

HVIEW my_data

TREE master DEVICE oracle USERNAME scott PASSWORD tiger TNAME master_table

TREE detail DEVICE oracle USERNAME scott PASSWORD tiger TNAME detail_table

LINK master (master_table_id) over detail (master_table_id)

One-to-many relationship

IJC Data Tree
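Since the Data Tree is modelled on the same approach as Hview, a first step towards automation could be to read the Hview definition and list the tables and links that must be recreated. The sketch below parses only the HVIEW, TREE and LINK lines in the form shown on this slide. The grammar is inferred from these examples alone, so real Hview files may need more cases, and the function and key names are hypothetical.

```python
import re

# Hview text copied from the slide (credentials are the slide's sample values).
HVIEW_TEXT = """\
HVIEW my_data
TREE master DEVICE oracle USERNAME scott PASSWORD tiger TNAME master_table
TREE detail DEVICE oracle USERNAME scott PASSWORD tiger TNAME detail_table
LINK master (master_table_id) over detail (master_table_id)
"""

# Patterns inferred from the slide examples only (an assumption).
TREE_RE = re.compile(r"^TREE\s+(\w+)\s.*\bTNAME\s+(\w+)\s*$")
LINK_RE = re.compile(r"^LINK\s+(\w+)\s*\((\w+)\)\s+over\s+(\w+)\s*\((\w+)\)\s*$")


def parse_hview(text):
    """Return the Hview name, its tree-to-table map and its resolved links."""
    name = None
    trees = {}      # tree alias -> database table name
    raw_links = []  # (parent alias, parent column, child alias, child column)
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("HVIEW "):
            name = line.split(None, 1)[1]
            continue
        tree = TREE_RE.match(line)
        if tree:
            trees[tree.group(1)] = tree.group(2)
            continue
        link = LINK_RE.match(line)
        if link:
            raw_links.append(link.groups())
    # Resolve tree aliases to table names once all TREE lines are known.
    links = [
        {
            "parent_table": trees[parent],
            "parent_column": parent_col,
            "child_table": trees[child],
            "child_column": child_col,
        }
        for parent, parent_col, child, child_col in raw_links
    ]
    return {"name": name, "trees": trees, "links": links}


hview = parse_hview(HVIEW_TEXT)
```

Each entry in `hview["links"]` names the parent and child tables and the joining columns, which is the information needed to assemble the corresponding one-to-many relationship by hand in the Data Tree.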

Page 74: ChemAxon Presentation

Hview vs. Data Tree: Mol + Rxn tables

<RC tables>

inventory: inventory_id, molregno, cola, colb, colc

<RC tables> (1) to inventory (*)

ISIS Hview

HVIEW cpds_inv

TREE compounds DEVICE chemicaldb USERNAME CPD/CPD PASSWORD TNAME compounds

TREE inventory DEVICE oracle USERNAME scott PASSWORD tiger TNAME inventory

LINK compounds (molregno) over inventory (molregno)

IJC Data Tree

compounds: molregno, structure [jc_index]

inventory: inventory_id, molregno, cola, colb, colc

compounds (1) to inventory (*)

Page 75: ChemAxon Presentation

Migration options

Existing application | Migration option

Simple local or Host based databases used primarily for searching/reporting | Migrate to IJC

ISIS/Base application with complex application logic but standard (Hview based) data structure (e.g. registration applications) | Create a custom IJC extension module built upon the IJC API (much of the existing IJC functionality is essentially done this way)

Application with complex data model and logic | Either create a standalone web application, or create a standalone JWS application, or create a custom IJC module that defines its own data access API

Page 76: ChemAxon Presentation

Migration: local ISIS databases

• Analyse data hierarchy

• Export data as SDF/RDF

• Import into IJC

• Build forms

• It may be possible to automate this by writing a COM application that reads data from ISIS and writes it to an Oracle database.
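Before importing an exported SDF file, it can help to check which data fields the records actually carry, as part of analysing the data hierarchy. The sketch below is a minimal reader that relies only on two SDF conventions: records end with a `$$$$` line, and data items start with a `> <NAME>` header followed by value lines up to a blank line. The sample records use empty placeholder connection tables and invented field values, and the function names are hypothetical.

```python
from collections import Counter

# Two illustrative records with empty connection tables (placeholder data).
SAMPLE_SDF = "\n".join([
    "cpd-1", "  sketch", "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    "> <ID>", "1", "",
    "> <BATCH>", "A-17", "",
    "$$$$",
    "cpd-2", "  sketch", "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    "> <ID>", "2", "",
    "$$$$",
])


def iter_sdf_records(text):
    """Yield each SDF record as a list of lines, split on the $$$$ delimiter."""
    record = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            if record:
                yield record
            record = []
        else:
            record.append(line)
    # Tolerate a final record that lacks the closing delimiter.
    if any(line.strip() for line in record):
        yield record


def data_fields(record_lines):
    """Return a dict of data field name -> value for one record."""
    fields = {}
    name = None
    for line in record_lines:
        if line.startswith(">") and "<" in line:
            start = line.index("<") + 1
            name = line[start:line.index(">", start)]
            fields[name] = []
        elif name is not None:
            if line.strip() == "":
                name = None  # a blank line ends the current data item
            else:
                fields[name].append(line)
    return {key: "\n".join(value) for key, value in fields.items()}


def field_inventory(text):
    """Count in how many records each data field appears."""
    counts = Counter()
    for record in iter_sdf_records(text):
        counts.update(data_fields(record).keys())
    return counts


inventory = field_inventory(SAMPLE_SDF)
```

Fields that appear in only some records (here `BATCH`) are worth checking before the import, since they may need to map to nullable columns or to a separate detail table.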

Page 77: ChemAxon Presentation

Migration: ISIS/Host databases

• Analyse database tables and Hview

• Migrate RCG tables to JChem table(s)

• Connect IJC to the database

• Promote tables/columns/foreign keys into IJC

• Assemble IJC Data Tree

• Build forms

• Some automation may be possible.
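The "analyse database tables" and "promote foreign keys" steps depend on knowing which tables reference which. The sketch below shows that discovery step using an in-memory SQLite database as a stand-in, because it runs without any server. It is not Oracle and not the JChem table format: the table layout follows the compounds/inventory example from the earlier slide, the structure column is plain text, and on Oracle the same information would come from the data dictionary rather than from a PRAGMA.

```python
import sqlite3

# Stand-in schema modelled on the compounds/inventory example (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE compounds (
    molregno  INTEGER PRIMARY KEY,
    structure TEXT
);
CREATE TABLE inventory (
    inventory_id INTEGER PRIMARY KEY,
    molregno     INTEGER REFERENCES compounds(molregno),
    cola         TEXT
);
""")


def discover_relationships(connection):
    """List (child_table, child_column, parent_table, parent_column) tuples."""
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    relationships = []
    for table in tables:
        # PRAGMA foreign_key_list columns: id, seq, table, from, to, ...
        for row in connection.execute(f"PRAGMA foreign_key_list({table})"):
            relationships.append((table, row[3], row[2], row[4]))
    return relationships


relationships = discover_relationships(conn)
```

Each tuple corresponds to one parent-to-child edge that would then be assembled into the Data Tree.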

Page 79: ChemAxon Presentation

JChem for Excel

• Microsoft Excel integrated solution for Marvin and JChem functionality

• Use Excel’s powerful features: Functions, Sorting, Filtering, Charts…

• Implemented in C# (.NET) with Visual Studio

– Proof that ChemAxon APIs can be used in a Java-less .NET environment

• Easy to install and deploy

• UNDER DEVELOPMENT

Page 80: ChemAxon Presentation

ISIS for Excel to JChem for Excel

• Import ISIS SARTables (January 2009)
– Workbook exported from ISIS for Excel

• Migration of standard ISIS Workbooks?

Page 82: ChemAxon Presentation

Custom Applications

• Java Applications
– Swing

• .NET Applications
– JNBridge: commercial Java-.NET proxy
– Byte code to IL (.NET binary) translation (IKVM)
    • Open source, very good performance
    • No full GUI support at the moment, but coming
    • JChem for Excel is built using IKVM

• Web Based Applications
– JSP, ASP.NET, AJAX

• SOAP

Page 83: ChemAxon Presentation

Custom Components

• Plans to release custom components
– Java Swing
– AJAX Examples
– .NET
    • Visual Studio integrated
    • Windows Forms (from JChem for Excel), WPF?
    • ASP.NET
    • ASP.NET AJAX, MVC

Page 84: ChemAxon Presentation

.NET Integration Enhancements

• Problem: the ChemAxon API uses Java classes, which are unfamiliar to .NET developers

• Higher level .NET wrappers, components
– Properties, Events
– Search results in DataSet, IDataReader
– LINQ, IEnumerable interfaces
– GUI Components: DataGridView, Property Grids, Components for Search

Page 85: ChemAxon Presentation

Custom Application Migration and Development

• Resources and experience for migrating custom ISIS (Host - Base) based applications
– ISIS Forms to other applications
– Procedural Language (ISIS/PL)

• Consultation
– Help with custom application development on ChemAxon platform
– Both in-house (CXN) staff and partner companies are available
– Custom/prioritised improvements of ChemAxon products

Page 87: ChemAxon Presentation

Web Services

• Extends ChemAxon functionality to Web applications

• Enables interoperability from multiple programming languages with SOAP Protocol

• Allows migration of existing web applications to ChemAxon services

• Encourages creation of new web applications

Page 88: ChemAxon Presentation

Service Modules

• Application Building Blocks
– DB Searching
    • Substructure, Similarity, Exact, etc.
– Molecular Standardization
– Clustering and Diversity

• Chemically Intelligent Tools
– Shorthand Chemical Terms and Calculator Plugins
    • Lipinski Rule of 5, pKa, logP, logD, etc.
– Molecular Format Conversion
– Image Generation
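As an illustration of what a "shorthand" property filter such as the Lipinski Rule of 5 does, the sketch below applies the four standard thresholds (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors) to property values that are assumed to have been calculated already. It does not call any ChemAxon service or Chemical Terms function. The class and function names are hypothetical, and the example values are invented for illustration, not calculated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Properties:
    """Precomputed properties of one structure (values supplied by the caller)."""
    mol_weight: float
    logp: float
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(props: Properties) -> list:
    """Return the list of Rule of 5 criteria that the structure violates."""
    violations = []
    if props.mol_weight > 500:
        violations.append("mol_weight > 500")
    if props.logp > 5:
        violations.append("logP > 5")
    if props.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if props.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


# Illustrative values only.
small = Properties(mol_weight=180.2, logp=1.2, h_bond_donors=1, h_bond_acceptors=4)
large = Properties(mol_weight=720.9, logp=6.3, h_bond_donors=3, h_bond_acceptors=9)
```

Returning the list of violated criteria rather than a single boolean lets the caller decide whether to tolerate one violation, as is often done in practice.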

Page 89: ChemAxon Presentation

SOAP Protocol

• The SOAP protocol is supported by most major web application languages

• Programming languages
– Java
– .NET (C#, ASP.NET)

• Scripting languages
– JavaScript
– Perl
– Python

• Etc.
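To show why SOAP is easy to reach from any of these languages, the sketch below builds a SOAP 1.1 request envelope with nothing but the Python standard library. The envelope namespace is the standard SOAP 1.1 one. The service namespace, the operation name `convertStructure` and its parameters are hypothetical placeholders, because the slides do not list the actual operations. In practice a client generated from the service's WSDL would produce this XML automatically.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "http://example.org/chem-service"             # hypothetical namespace


def build_soap_request(operation: str, params: dict) -> str:
    """Serialise one operation call as a SOAP 1.1 envelope string."""
    ET.register_namespace("soapenv", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(call, f"{{{SERVICE_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameter names, for illustration only.
request_xml = build_soap_request(
    "convertStructure", {"structure": "CCO", "outputFormat": "mol"}
)
```

The resulting string would be sent as the body of an HTTP POST to the service endpoint; the endpoint URL and any required headers depend on the deployed service and are not shown here.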

Page 90: ChemAxon Presentation

AJAX Example

Page 91: ChemAxon Presentation

Migration of Existing Web Apps

• ChemAxon Web Services can be called from existing web services

• ChemAxon Web Services can directly replace specific functionality

• Migrate using security standards
– WS-Security, WS-SecurityPolicy
– Integrate with existing authentication services (e.g. LDAP, Active Directory)

Page 92: ChemAxon Presentation

Creation of New Web Apps

• Standard WSDL files allow for automated client side code generation (Python, Perl, Java, C#, etc.)

• AJAX provides asynchronous operation and desktop-like responsiveness

• Easily integrate with Marvin applets

Page 94: ChemAxon Presentation

API and Compatibility

Java API (Marvin GUI included)

Marvin Applets for web applications

.NET API over JNBridge (Marvin GUI included)

Native .NET solution under development (Marvin GUI included)

API from SQL: JChem Cartridge for Oracle

SOAP interface (Python, C, .NET, ... over SOAP) under development

AJAX interface under development (Marvin GUI included)

Instant JChem highly configurable + Java API

Integration: Pipeline Pilot, KNIME, Spotfire, ...

Page 95: ChemAxon Presentation

Java API

• Direct manipulation of structures

• Format conversions, name<=>structure, image generation

• Structure searching with/without DB access

• Standardization of structures

• Property calculations

• Reaction modelling (enumeration)

• Clustering

• Sketcher, 2D/3D viewers (Marvin family)

• Etc.

JChem API

Page 96: ChemAxon Presentation

Marvin Applets for Web Applications

• All relevant browsers (IE, FF, Safari, ...)

• Manipulation from HTML page (from JavaScript)

• Catching drawing events in JavaScript

• Can be used from .NET applications using the web browser control

Marvin demo

MarvinSketch Applet Examples

MarvinView Applet Examples

MarvinSpace Applet Examples

Page 97: ChemAxon Presentation

.NET API Over JNBridge

• Tight integration with .NET

• Full Java API is mirrored in .NET

• Marvin GUI components are also supported

Page 98: ChemAxon Presentation

Native .NET Solution

• Translating the non-GUI elements from Java binary to .NET binary (using IKVM)

• Building a thin .NET GUI for Marvin and other tools over the core.

Advantages

• Pure .NET solution; Java does not need to be installed

• No license issue

• No performance overhead of proxying

under development

Page 99: ChemAxon Presentation

JChem Cartridge for Oracle

• API from Oracle SQL

• All features needed for structure handling and searching

• Fast searching, insertion, and indexing

• Special features:
– Standardization of structures is tied with structure tables
– Property calculations
– Format conversions, name<=>structure, image generation
– Reaction and Markush based structure enumeration
– Markush libraries in structure tables (coming soon)

Page 100: ChemAxon Presentation

SOAP Interface

• Web services interface to most functionalities

• Bridges to Python, C, Perl, .NET, Java using WSDL

• Enables both remote and local access to ChemAxon functionalities

under development

Page 101: ChemAxon Presentation

AJAX GUI

• AJAX components for web applications

• Customization using CSS and XSL

• Accesses SOAP interface

• Structure searching, database handling example

• Fast and rich GUI
– Floating windows
– Scrolling through large database without paging

• Marvin Applets are integrated

under development

Page 102: ChemAxon Presentation

Instant JChem for Developers

• Sharable forms, queries, lists

• URLs to sharable items - Demos

• Instant JChem API

Page 103: ChemAxon Presentation

Integrations

Several software vendors have integrated ChemAxon components:

- Pipeline Pilot

- KNIME (by Infocom)

- Spotfire

- Aureus

- Integrity (Thomson)

- Others: (Agilent, Tripos, Symyx, Deltasoft, GVK, Wiley, Genedata, Contur, Inforsense, Kinematik, Houghton Mifflin, Kelaroo, Patcore, Cengage, Prentice Hall, Crossfire Beilstein, etc)

Page 104: ChemAxon Presentation

Visit other technical presentations

ChemAxon Overview http://www.chemaxon.com/conf/ChemAxon_Overview.ppt

MarvinSketch/View http://www.chemaxon.com/MarvinSketch_View.ppt

MarvinSpace http://www.chemaxon.com/MarvinSpace.ppt

Calculator Plugins http://www.chemaxon.com/Calculator_Plugins.ppt

Structural Search http://www.chemaxon.com/Structural_Search.ppt

JChem Base http://www.chemaxon.com/JChem_Base.ppt

Instant JChem http://www.chemaxon.com/conf/Instant_JChem.ppt

JChem Cartridge http://www.chemaxon.com/JChem_Cartridge.ppt

Standardizer http://www.chemaxon.com/Standardizer.ppt

Screen http://www.chemaxon.com/Screen.ppt

JKlustor http://www.chemaxon.com/JKlustor.ppt

Fragmenter http://www.chemaxon.com/Fragmenter.ppt

Reactor http://www.chemaxon.com/Reactor.ppt

Page 105: ChemAxon Presentation

Find out more

• Product descriptions & links: www.chemaxon.com/products.html

• Forum: www.chemaxon.com/forum

• Presentations and posters: www.chemaxon.com/conf

• Download: www.jchem.com/licensefrset.html

Page 106: ChemAxon Presentation

Thank you for your attention!