(ATS4-PLAT04) Chemistry Data Model Enhancements in Pipeline Pilot 9.0: what are they and how will they impact your users
Post on 25-May-2015
Keith T Taylor PhD
Advisory Product Manager, Chemistry Foundation
keith.taylor@accelrys.com
The information on the roadmap and future software development efforts is intended to outline general product direction and should not be relied on in making a purchasing decision.
• Enhanced representation – What you see is what you have
• New representations – New mapping options – Streamlined mapping
• Consistency between Pipeline Pilot Chemistry, Accelrys Direct, and Accelrys Draw
• Changes to perception mean that models and calculators must be relearned and re-baselined
– Significant effect from new ring perception option
– Stereochemistry and aromaticity have a smaller but important effect
Pipeline Pilot 9.0 – New Capabilities
• Pipeline Pilot 9.0 (2013) will support all chemical representations supported by Direct and Draw
– Generics
– Biologics
– Polymers and Mixtures
– Variable attachments
– Homology Groups
– Haptic Structures
– New bond types
• Depiction of all chemical objects supported by Accelrys Direct and Accelrys Draw – Look and feel that matches those of Accelrys Draw
• Mappers upgraded to support new representations
• Calculators upgraded to interpret the new representations appropriately
• Enhanced perception of stereochemistry, aromaticity, and rings
Pipeline Pilot 9.0 – New Capabilities
• Single/double/triple bonds supported in NONS
• Coordination/Dative bond
• Haptic bonds
• Markush Homology Groups
• Hydrogen bonds
Chemical Representations – New in 9.0
• Rendering between Accelrys Draw and Pipeline Pilot 9.0 now consistent
• Pipeline Pilot now supports: – PNG – JPEG – GIF – SVG – EMF – Linux and Windows!
• SVG and EMF generation is fast – ~10,000 structures per second
Depiction
[Figure: the same structures rendered in Draw and in Pipeline Pilot]
• Abbreviated groups are frequently used to simplify structures
• Attachment points are now correct
– The Pipeline Pilot 8.x depictions are incorrect on the left of the phenyl group
– The labels depicted imply different chemical entities
• Visual corruption
• Nitrile (CN) and isonitrile (NC) are chemically different
• NCS and SCN are also different entities
• Rich text markup renders correctly
• Whitespace around labels is consistent – Affects perceived bond length
Abbreviated groups render correctly
[Figure: abbreviated groups rendered in Draw and in Pipeline Pilot]
• Markush/Rgroup depiction is complete in Pipeline Pilot rendering
• Now renders
– Rgroup definitions (e.g. R1 …)
– Rgroup logic (R1 = 1; R2 >= 0)
– Indication of direction for fragments with multiple attachment points (e.g. “ on R2)
Rgroup/Markush is functional
[Figure: Rgroup/Markush structures rendered in Draw and in Pipeline Pilot]
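The Rgroup logic shown on the slide (R1 = 1; R2 >= 0) can be read as a set of occurrence conditions. The sketch below is only an illustration of that reading, assuming each condition constrains how many times an Rgroup occurs in a candidate structure. The function names, the condition syntax, and the `counts` dictionary are hypothetical and are not part of any Accelrys API.

```python
import operator
import re

# Comparison operators accepted in a condition (an assumed, minimal syntax).
OPS = {"=": operator.eq, ">=": operator.ge, "<=": operator.le,
       ">": operator.gt, "<": operator.lt}
CONDITION = re.compile(r"\s*(R\d+)\s*(>=|<=|=|>|<)\s*(\d+)\s*")


def parse_logic(text):
    """Split 'R1 = 1; R2 >= 0' into (name, comparison, value) triples."""
    conditions = []
    for part in text.split(";"):
        match = CONDITION.fullmatch(part)
        if match is None:
            raise ValueError(f"cannot parse condition: {part!r}")
        name, op, value = match.groups()
        conditions.append((name, OPS[op], int(value)))
    return conditions


def satisfies(counts, text):
    """True if the Rgroup occurrence counts meet every condition."""
    return all(compare(counts.get(name, 0), value)
               for name, compare, value in parse_logic(text))


logic = "R1 = 1; R2 >= 0"
print(satisfies({"R1": 1, "R2": 0}, logic))
print(satisfies({"R1": 2, "R2": 3}, logic))
```

Rgroups missing from `counts` are treated as occurring zero times, which is a design choice of this sketch rather than documented behavior.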
• Nonspecific (NONS) representations are equivalent to those in Direct 8.0 and Draw 4.1
– Pipeline Pilot version does not lose information
• Examples from mass spectrometry and industrial chemicals
Nonspecific representations are rendered
[Figure: two nonspecific structures, each rendered in Draw and in Pipeline Pilot]
• Growing importance
• Representation exposed in Pipeline Pilot 8.5
• Completed in 9.0
– Much more functional and sophisticated
Biologics
[Figure: a biologic rendered in Pipeline Pilot and in Draw]
• Representation and search new in 9.0
• Mix it up
Polymers, Mixtures, and Formulations New in 9.0
• Accelrys Direct and Accelrys Draw understand Antibody-Drug Conjugates
Example: Antibody-Drug Conjugates – New in 9.0
• Pipeline Pilot 9 understands Antibody-Drug Conjugates (ADC) – And with variable loading
• Harmonization facilitates support of all of Accelrys’ chemical representations in applications – Antibody-drug conjugates, polymers, formulations, mixtures, and metabolites
• Benefits of harmonized chemical representation are:
– Implement once
– Expose everywhere
– Consistent
Example: Antibody-Drug Conjugates
What does this mean to my scientists? (1)
• Higher quality reports
– Supports perception of quality research
• Enhanced depiction of biologics and Markush generics
– Will look different; minor adjustments to depiction protocols may be needed
• New chemical representations
– No change to existing protocols
– New opportunities open up
• Enhanced mapping – New in 9.0 e.g. Imipramine Metabolites
Mapping: Non Specific Structures - New
• Screen MDDR data set – 129,237 structures screened in ~30s
– No pre-processing
Mapping: Homology group screening
Hits = 470
Hits = 108
Hits = 45
Hits = 16
Hits = 10
What does this mean to my scientists? (2)
• New mapping capabilities – No change to existing protocols
– Enhancements to existing protocols – more efficient code
– New screening opportunities – screen by homology
• Changes to stereochemical and aromaticity perception will drive changes in the behavior of:
– Learned models
– Calculators
– Structure Matchers
• Will need to be relearned and re-baselined
• Change is discontinuous
• There will be no legacy mode
– Because this will cause incompatibilities and drive confusion
Data Model Changes from PP 8.x to PP 9
• Pipeline Pilot 9.0 (2012) and Accelrys Direct 9.0 (2013) – Will be 100% compatible
• Pipeline Pilot 9.0 and Accelrys Direct 8.0 – Only difference is aromaticity perception edge-cases
– Direct 8.0 will use its current aromaticity perception (template based)
– Will differ from that in Pipeline Pilot 9.0 (Hückel (4n+2) rule based)
– Minor differences will be observed
Compatibility: Pipeline Pilot and Accelrys Direct
Dataset     Number of Structures   Canonical SMILES   AlogP   Number of Rings   Number of Aromatic Rings   Number of Stereo Atoms   ECFP4
ACD         239,996                251                105     2,455             65                         0                        214
Asinex      137,799                26                 24      1,070             22                         0                        43
Maybridge   51,058                 2                  0       438               0                          0                        1
MDDR 2010   201,748                62                 24      3,271             29                         4                        46
WDI         53,517                 37                 14      612               10                         0                        42
Observed Differences in Calculated Values
• Table shows the number of structures in the datasets that had different values in 9.0 compared with 8.5
• Differences are generally very small
• Ring perception leads to more prominent differences, especially in drug-like datasets
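To put the ring-count column in proportion, the short calculation below converts the counts from the table into percentages of each dataset. It uses only the numbers shown on the slide; the variable names are illustrative.

```python
# (structures in dataset, structures whose ring count differs between 8.5 and 9.0)
ring_changes = {
    "ACD": (239_996, 2_455),
    "Asinex": (137_799, 1_070),
    "Maybridge": (51_058, 438),
    "MDDR 2010": (201_748, 3_271),
    "WDI": (53_517, 612),
}

# Share of each dataset affected by the new ring perception, in percent.
percent_changed = {
    name: 100.0 * changed / total
    for name, (total, changed) in ring_changes.items()
}

for name, pct in sorted(percent_changed.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {pct:.2f}% of structures have a different ring count")
```

On these figures every dataset stays under 2%, with MDDR 2010 the most affected.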
• Single chemistry foundation with single data model implemented in a single code stream
– Adopted by Tools and Platform
• Direct, Pipeline Pilot, and Accelrys Enterprise Platform
– Application Stack inherits all of the chemistry capabilities
• Simplifies development and application environment
• Enhances our ability to deliver new functionality to you more quickly across the products
Harmonization delivers
• Chemistry Harmonization project: – PPChem 9.0 inherits many new chemical representations – Existing representations enhanced – Aromaticity, Stereochemistry and Ring perceptions enhanced – Significant improvement to depiction aesthetics – PPChem 9.0 and Direct 8.0 share representations(1)
• Pipeline Pilot 9 and Direct 9 deliver the same results
• What Next – Get early access by joining the beta2 program – Get feedback from your chemists – What opportunities do the new features open up?
• Related Tech Summit Sessions (current or previous)
Summary
Chemistry Data Model
Changes to Chemical Perception Details
• Charged non-metals are now treated as their “isoelectronic” equivalent: – B- ~ C ~ N+ ~ O+2 ~ F+3
– Si- ~ P ~ S+ ~ Cl+2
• The bad valence filter has been improved and now catches more bad anions.
• Metal anions no longer have implicit hydrogens – Aluminum anions are an exception (for support of aluminum hydride anion)
• Nitrogen (V) is still allowed as a drawing alternative for nitro- and diazo- groups, amine oxides, and related substructures. However, the application is now less likely to perceive uncharged quaternary nitrogens as having implicit hydrogens.
• Atoms with illegal valence are now better distinguished from atoms with maximum valence in ECFP fingerprint bits. For example, the oxygen in N=O and N#O is now typed differently. This can affect the Canonical SMILES atom order for structures containing atoms with illegal valence.
• The changes in valence result in changes to ECFP fingerprint bits and Canonical SMILES.
Valence and Implicit Hydrogens
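The two isoelectronic series on this slide can be checked with simple arithmetic: each member has the same total electron count (atomic number minus formal charge). The sketch below only illustrates that bookkeeping. It is not the perception code used by Pipeline Pilot, and the function names are made up for the example.

```python
# Atomic numbers of the elements named in the two series.
ATOMIC_NUMBER = {"B": 5, "C": 6, "N": 7, "O": 8, "F": 9,
                 "Si": 14, "P": 15, "S": 16, "Cl": 17}


def electron_count(element, charge):
    """Total electrons of an atom or ion: atomic number minus formal charge."""
    return ATOMIC_NUMBER[element] - charge


def isoelectronic(a, b):
    """True if two (element, charge) pairs carry the same number of electrons."""
    return electron_count(*a) == electron_count(*b)


first_row = [("B", -1), ("C", 0), ("N", 1), ("O", 2), ("F", 3)]
second_row = [("Si", -1), ("P", 0), ("S", 1), ("Cl", 2)]

# Each series collapses to a single electron count.
print({electron_count(e, q) for e, q in first_row})
print({electron_count(e, q) for e, q in second_row})
```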
• Ring perception has been improved. Previously, the SSSR ring perception algorithm was used. An SSSR is not unique, so the rings perceived could depend on atom order and bond order, and rings in complex non-planar assemblies were often missed. The unique “K-rings” perception algorithm is now used, which is the union of all possible SSSR sets. These changes result in changes to Canonical SMILES and improvement to aromaticity perception.
• Examples
• Now perceived as 3 rings:
• Now perceived as 4 rings:
• Now perceived as 6 rings:
Rings
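The difference between a single SSSR and the union of all possible SSSR sets can be seen on a cage such as cubane. Cubane is chosen here purely as an illustration, since the example structures from the slide did not survive in this transcript. An SSSR of a connected structure always holds bonds - atoms + 1 rings, which is 5 for cubane, yet the cage has 6 equivalent four-membered faces. Any five of them form a valid SSSR, so the union of all SSSR sets contains all six. The sketch below is a brute-force count on a cube graph, not the K-rings algorithm itself.

```python
from itertools import combinations

# Cubane's carbon skeleton as a cube graph: atoms 0..7, with a bond between
# atoms whose 3-bit labels differ in exactly one bit.
atoms = range(8)
bonds = {frozenset((u, v)) for u, v in combinations(atoms, 2)
         if bin(u ^ v).count("1") == 1}


def bonded(u, v):
    return frozenset((u, v)) in bonds


def four_membered_rings():
    """Enumerate every 4-membered ring by testing all 4-atom subsets."""
    rings = set()
    for a, b, c, d in combinations(atoms, 4):
        # The three distinct cyclic orderings of four atoms.
        for w, x, y, z in ((a, b, c, d), (a, b, d, c), (a, c, b, d)):
            if bonded(w, x) and bonded(x, y) and bonded(y, z) and bonded(z, w):
                rings.add(frozenset((a, b, c, d)))
    return rings


# Number of rings in any SSSR of a connected structure: bonds - atoms + 1.
sssr_size = len(bonds) - len(atoms) + 1
all_smallest_rings = four_membered_rings()

print(sssr_size)                # rings reported by one SSSR
print(len(all_smallest_rings))  # equivalent four-membered rings in the cage
```

Which face an SSSR leaves out depends on traversal order, which is the kind of atom-order dependence a unique ring set avoids.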
• The isoelectronic equivalence enhancement described in Valence and Implicit Hydrogens improves the perception of ring systems containing charged non-metals. Improved detection of bad valence for anions also contributes to improved perception of aromaticity.
• The atoms that can contribute a lone pair to an aromatic ring are extended from (N,O,P,S) to include As, Se, and Te.
• These changes result in changes to ECFP fingerprint bits and Canonical SMILES.
Examples
• Now perceived as aromatic:
• No longer perceived as aromatic:
Aromaticity Perception
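The effect of extending the lone-pair donors can be illustrated with a toy Hückel (4n+2) electron count. This is a deliberately simplified model and an assumption of this sketch: it ignores charges, exocyclic bonds, and fused systems, and it is not the perception code in Pipeline Pilot. The ring encoding and function names are hypothetical.

```python
OLD_LONE_PAIR_DONORS = {"N", "O", "P", "S"}
NEW_LONE_PAIR_DONORS = OLD_LONE_PAIR_DONORS | {"As", "Se", "Te"}


def pi_electrons(ring, donors):
    """Count ring pi electrons; return None if an atom cannot contribute.

    `ring` is a list of (element, in_ring_double_bond) pairs.
    """
    total = 0
    for element, in_double_bond in ring:
        if in_double_bond:
            total += 1      # one electron per atom of a ring double bond
        elif element in donors:
            total += 2      # lone pair donated into the ring
        else:
            return None     # this atom interrupts the conjugation
    return total


def is_huckel_aromatic(ring, donors):
    """Apply the 4n+2 rule to the counted pi electrons."""
    n = pi_electrons(ring, donors)
    return n is not None and n >= 2 and (n - 2) % 4 == 0


thiophene = [("C", True)] * 4 + [("S", False)]
selenophene = [("C", True)] * 4 + [("Se", False)]

print(is_huckel_aromatic(thiophene, OLD_LONE_PAIR_DONORS))
print(is_huckel_aromatic(selenophene, OLD_LONE_PAIR_DONORS))
print(is_huckel_aromatic(selenophene, NEW_LONE_PAIR_DONORS))
```

In this toy model selenophene only reaches six pi electrons once Se is allowed to donate its lone pair, which mirrors the kind of change described in the bullets above.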
• The isoelectronic equivalence enhancement described in Valence and Implicit Hydrogens improves the perception of stereogenic centers that include charged non-metals.
• The symmetric equivalence of O-/OH/=O groups attached to P and S atoms has been extended to include As, Se, and Te centers.
• Stereo validation logic of reader code is synchronized with perception code. This allows for more consistent application of rules prohibiting S(IV) centers, P(V) centers, symmetric equivalence of O-/OH/=O, etc.
• “Double-symmetric” ring atom perception is improved. Several symmetric spiro cases are now correctly not marked as pseudo-stereo.
Examples
• Now perceived as stereo:
• More consistently perceived as not stereo:
• No longer perceived as pseudostereo:
Stereochemical Perception
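The symmetric equivalence of O-/OH/=O groups can be sketched as a normalization step before testing whether a center has four distinct substituents. The code below is a minimal illustration under that assumption. The substituent labels are plain strings invented for the example, and real stereo perception compares full substituent environments rather than labels.

```python
OXO_FORMS = {"O-", "OH", "=O"}
CENTERS_8X = {"P", "S"}                        # centers covered before 9.0
CENTERS_90 = CENTERS_8X | {"As", "Se", "Te"}   # centers covered in 9.0


def can_be_stereocenter(center, substituents, equivalent_centers):
    """True if all substituents stay distinct after merging O-/OH/=O forms."""
    if center in equivalent_centers:
        substituents = ["O*" if s in OXO_FORMS else s for s in substituents]
    return len(set(substituents)) == len(substituents)


# Hypothetical arsenic center carrying =O, OH, methyl, and phenyl.
arsenic_center = ["=O", "OH", "CH3", "C6H5"]

print(can_be_stereocenter("As", arsenic_center, CENTERS_8X))
print(can_be_stereocenter("As", arsenic_center, CENTERS_90))
```

With the extended set, the =O and OH groups on arsenic count as equivalent, so the center is no longer treated as a stereo candidate.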