Péter Kovács

May, 2005

Compound storage / retrieval with JChem Cartridge for Oracle

Contents

Purpose of JChem Cartridge

Elements of the JChem Cartridge API

Regular Tables vs. JChem Tables

Architecture of JChem Cartridge

Purpose of JChem Cartridge

• Access JChem functionality using SQL:

SELECT count(*) FROM nci WHERE jc_contains(structure, 'Brc1cnc2ccccc12') = 1

Access JChem in any programming environment offering Oracle connectivity (Visual Basic, Java, Perl, PHP, Python, Apache mod_plsql...).

• Execute SQL queries efficiently using extensible indexes

Precompute chemical information on structures by creating jc_idxtype indexes:

CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype

The jc_idxtype implementation scans the indexed column for eligible structures in a single uninterrupted operation, a domain index scan.
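Because the cartridge is reached through plain SQL, a client needs nothing beyond a database driver. The following is a minimal Python sketch, assuming a DB-API 2.0 cursor on an Oracle connection where JChem Cartridge is installed. The helper name count_substructure_hits is hypothetical, and passing the query structure as a bind variable is an assumption that the slides do not confirm.

```python
# Sketch: run the slide's substructure count from a Python client.
# Assumes `cursor` is a DB-API 2.0 cursor on an Oracle connection with
# JChem Cartridge installed; the helper name is hypothetical.

SUBSTRUCTURE_COUNT_SQL = (
    "SELECT count(*) FROM nci "
    "WHERE jc_contains(structure, :1) = 1"
)


def count_substructure_hits(cursor, query_smiles):
    """Return how many rows of nci contain the query structure."""
    # Assumption: the operator accepts a bind variable for the query
    # structure, which avoids hand-quoting the SMILES string.
    cursor.execute(SUBSTRUCTURE_COUNT_SQL, [query_smiles])
    row = cursor.fetchone()
    return row[0]
```

With a real connection, the call would look like count_substructure_hits(connection.cursor(), 'Brc1cnc2ccccc12').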

Elements of the JChem Cartridge API

• Operators (jc_...) and their functional forms (jcf_...)

• Index parameters and default properties

• DML operators for JChem tables

• Support functions for user-defined operators

Operators and their functional forms I.

Typical operator:

jc_<some-operation>(<target-structure-column>, <some-operand>)

Operator for substructure search:

jc_contains(<target-structure-column>, <query-structure>)

“Swiss-army-knife” search operator:

jc_compare(<target-structure-column>, <query-structure>, <options>)

Operators and their functional forms II.

Chemical Terms

Lipinski's rule of five expressed in Chemical Terms:

SELECT count(*) FROM nci_3m WHERE jc_compare(structure, 'O=C1ONC(N1c2ccccc2)-c3ccccc3','sep=! t:s!ctFilter:(mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10)') = 1

At present, about 100 functions are available, including topological and physicochemical descriptors.

Users can define their own functions.
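The option string passed to jc_compare in the query above follows a visible pattern: a sep= declaration naming the separator character, a space, then key:value options joined by that separator. The following is a minimal Python sketch of assembling such a string. The format is inferred only from the examples on these slides, the helper name build_compare_options is hypothetical, and the check that the separator does not occur inside an option is an assumption about how the string is parsed.

```python
# Sketch: assemble a jc_compare option string in the shape used on the
# slide ("sep=! t:s!ctFilter:..."). The format is inferred from the
# slide's examples; build_compare_options is a hypothetical helper.


def build_compare_options(options, sep="!"):
    """Join (key, value) pairs into 'sep=<sep> k1:v1<sep>k2:v2...'."""
    parts = []
    for key, value in options:
        text = f"{key}:{value}"
        if sep in text:
            # Assumption: a separator inside an option would make the
            # string ambiguous, so a different separator is needed.
            raise ValueError(f"separator {sep!r} occurs in option {key!r}")
        parts.append(text)
    return f"sep={sep} " + sep.join(parts)


# Lipinski's rule of five as a Chemical Terms filter, as on the slide.
LIPINSKI_FILTER = (
    "(mass() <= 500) && (logP() <= 5) && "
    "(donorCount() <= 5) && (acceptorCount() <= 10)"
)

lipinski_options = build_compare_options(
    [("t", "s"), ("ctFilter", LIPINSKI_FILTER)]
)
print(lipinski_options)
```

The printed string matches the third argument of jc_compare in the Lipinski query. A filter that itself contains the separator character (for example an expression using !=) would call for a different sep value.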

Operators and their functional forms III.

Chemical Terms and query filter:

SELECT id, purchase_date FROM compounds_instock WHERE jc_compare(structure, 'C(=S)([N][N])[S]', 'sep=! t:t!simThreshold:0.9!ctFilter:logp()>1!filterQuery:select rowid from compounds_instock where purchase_date > DATE ''2002-01-01''') = 1

Filter queries restrict the search to a subset of a table's rows while the performance-sensitive chemical computations still run in domain index scan mode.
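The filterQuery option embeds one SQL statement inside the string literal of another, which is why the date literal in the example appears with doubled quotes. The following is a small Python sketch of that quoting step. The helper name sql_string_literal is hypothetical, and it applies only the standard SQL rule of doubling single quotes.

```python
# Sketch: embed an option string that itself contains SQL into a SQL
# string literal. Only the standard rule "double every single quote"
# is applied; sql_string_literal is a hypothetical helper.


def sql_string_literal(text):
    """Wrap text in single quotes, doubling any quotes inside it."""
    return "'" + text.replace("'", "''") + "'"


filter_query = (
    "select rowid from compounds_instock "
    "where purchase_date > DATE '2002-01-01'"
)
options = (
    "sep=! t:t!simThreshold:0.9!ctFilter:logp()>1!filterQuery:"
    + filter_query
)

statement = (
    "SELECT id, purchase_date FROM compounds_instock "
    "WHERE jc_compare(structure, 'C(=S)([N][N])[S]', "
    + sql_string_literal(options)
    + ") = 1"
)
print(statement)
```

Writing the filter query with ordinary quotes and escaping it in one place keeps the inner statement readable and reproduces the statement shown on the slide.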

Dynamic generation of static images:

SELECT jc_molconvertb(structure, 'png -2') FROM nci WHERE id = :1

Available image formats: png, jpeg, svg, povray, ppm

Index parameters

Index parameters affect:

• Fingerprint attributes

• Standardizer configuration

• Table space and storage options of the index table

• Generate index jcxnci using structures in the table stfp_keys as structural keys:

CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype PARAMETERS('STRUCTURALFP_CONFIG=select structure from stfp_keys')

• Strip hydrogens and use Daylight-style aromatization during index creation:

CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype PARAMETERS('STD_CONFIG=dehydrogenize:optional..aromatize:d')
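When index creation is scripted, the statements above can be generated rather than typed by hand. The following is a minimal Python sketch. The helper name create_jc_index_sql is hypothetical, and the parameter strings are copied from the slide rather than taken from a complete list of supported parameters.

```python
# Sketch: compose CREATE INDEX statements with index parameters in the
# shape shown on the slide. create_jc_index_sql is a hypothetical
# helper; parameter names and values are copied from the slide.


def create_jc_index_sql(index_name, table, column, parameters=None):
    """Return a CREATE INDEX ... INDEXTYPE IS jc_idxtype statement."""
    sql = (
        f"CREATE INDEX {index_name} ON {table}({column}) "
        "INDEXTYPE IS jc_idxtype"
    )
    if parameters is not None:
        # Single quotes inside the parameter string must be doubled to
        # stay inside the SQL string literal.
        escaped = parameters.replace("'", "''")
        sql += f" PARAMETERS('{escaped}')"
    return sql


print(create_jc_index_sql("jcxnci", "nci", "structure"))
print(create_jc_index_sql(
    "jcxnci", "nci", "structure",
    "STD_CONFIG=dehydrogenize:optional..aromatize:d",
))
```

The sketch passes a single parameter string through unchanged; how several parameters are combined in one PARAMETERS clause is not shown on the slide and is left out here.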

Default properties

Used for SQL statements where no information from JChem indexes is available.

Sample SQL statement without index information:

SELECT jc_contains('O=C1C=CNC=C1', 'n1ccccc1') FROM dual

Set default properties:

CALL jc_set_default_property('standardizerConfig', 'aromatize:d')

Supported Column Types

• VARCHAR2 columns

Targets and queries are VARCHAR2.

Operator names: jc_contains, jc_compare, jc_equals...

• BLOB columns

Targets and queries are BLOB.

Operator names: jc_containsb, jc_compareb, jc_equalsb...

Exceptions:

jc_molconvertb: takes VARCHAR2, returns BLOB

jc_molconvertbb: takes BLOB, returns BLOB
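The naming pattern for the search operators (a trailing "b" for the BLOB variant) is simple enough to encode when SQL is generated for both column types. The following is a small Python sketch. The helper name operator_for is hypothetical, and only the three operators named on the slide are covered, since the slide's list ends in an ellipsis.

```python
# Sketch: pick the operator variant for a column type, following the
# naming pattern on the slide (BLOB variants carry a trailing "b").
# Only the operators named on the slide are covered.

VARCHAR2_OPERATORS = ("jc_contains", "jc_compare", "jc_equals")


def operator_for(base_name, column_type):
    """Return the operator name to use for the given column type."""
    if base_name not in VARCHAR2_OPERATORS:
        raise ValueError(f"operator not covered by this sketch: {base_name}")
    kind = column_type.upper()
    if kind == "VARCHAR2":
        return base_name
    if kind == "BLOB":
        return base_name + "b"
    raise ValueError(f"unsupported column type: {column_type}")


print(operator_for("jc_contains", "VARCHAR2"))
print(operator_for("jc_contains", "BLOB"))
```

The jc_molconvertb and jc_molconvertbb functions are deliberately excluded because, as the slide notes, they do not follow this pattern.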

Supported Table Types

• Regular table: nci_1k

CREATE INDEX jcxnci_1k...

Index table: jcxnci_1k_jcx, holding the rowid of the base table (nci_1k)

• JChem table (generated by jcman or API): jc_nci_1k

CREATE INDEX jcxjc_nci_1k...

Regular Tables vs. JChem Tables

• Regular tables

– base table and index table are physically distinct

– index properties are specified as index parameters

• JChem tables

– base table and index table are physically the same

– most of the “index” properties are specified during table creation (jcman or Java API)

• Pros & Cons:

– inserts from outside the database are faster with JChem tables than with regular tables

– JChem tables require Java API or the jcman command line tool (for table creation) and Java API or special cartridge functions for INSERTs, UPDATEs and DELETEs; standard SQL can be used with regular tables in all cases.

JChem Indexes with JChem Tables

• VARCHAR2

Index creation:

CREATE INDEX jcxjc_nci ON jc_nci(cd_smiles) INDEXTYPE IS jc_idxtype

Search:

SELECT jc_contains(cd_smiles, 'n1ccccc1') FROM jc_nci

• BLOB

Index creation:

CREATE INDEX jcxjc_nci ON jc_nci(cd_structure) INDEXTYPE IS jc_idxtype

Search:

SELECT jc_contains(cd_structure, 'n1ccccc1') FROM jc_nci

JChem Cartridge Architecture I.

Computation-intensive operations are performed in a separate JVM (currently Tomcat).

Advantages:

• fast execution (optimized native code)

• starting point for distributed architecture

[Architecture diagram; labels: Oracle, JChem Cartridge, JChem Server, JChem Core, JChem Streams, JChem Base, Search, Update, Cache, HTTP, JDBC]

JChem Cartridge Architecture II.

Type Of Database Access

• Single Session: One SQL statement accesses the database through one single database session

– Used for inserting and updating to maintain strict transaction semantics

– Computation is still done in Tomcat (for performance)

• Dual Session: One SQL statement accesses the database through multiple database sessions

– Used for searching (for performance and code-reuse reasons)

– Only committed changes are seen

– “Ex Machina” mechanism to maintain user identity across sessions acting on behalf of the same operation

JChem Cartridge Architecture III.

Single Session Database Access

[Diagram of single-session access; labels: SQL Plus/any DB application, Oracle, JChem Cartridge, jc_insert, Index Table, Cache, JChem Server, JChem Streams, JChem Streams Adapter, JChem Core, Search Engine, Execution Engine; steps numbered 1 to 11]

JChem Cartridge Architecture IV.

Dual Session Database Access

[Diagram of dual-session access; labels: SQL Plus/any DB application, Oracle, JChem Cartridge, jc_contains, Index Table, Cache, JChem Server, JChem Streams, JChem Streams Adapter, JChem Core, Search Engine, Execution Engine; steps numbered 1 to 12]

JChem Cartridge Architecture V.

Dual Session “Semantics”

• The transaction context differs across sessions:

Changes must be committed to be included in searches.

• The security context is disrupted across sessions.

Two options:

– Configure a “super user” with many privileges

– Use jchem_core_pkg.use_password(password VARCHAR2) for primary database sessions
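The commit requirement is ordinary transaction isolation between two sessions, and it can be observed without an Oracle installation. The following Python sketch uses SQLite purely as a runnable stand-in: one connection plays the primary session and a second plays the search session. The table, column and file names are made up for the illustration, and nothing here calls JChem Cartridge.

```python
# Sketch: why uncommitted rows are invisible to a second session.
# SQLite stands in for Oracle only to make the example runnable;
# table, column and file names are made up for the illustration.
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "sessions_demo.db")

writer = sqlite3.connect(db_path)    # plays the primary session
searcher = sqlite3.connect(db_path)  # plays the search session


def visible_rows(conn):
    """Count the rows the given session can currently see."""
    rows = conn.execute("SELECT count(*) FROM compounds").fetchall()
    return rows[0][0]


writer.execute("CREATE TABLE compounds (structure TEXT)")
writer.commit()

# The INSERT opens a transaction in the writer session.
writer.execute("INSERT INTO compounds VALUES ('Clc1ccccc1')")
seen_before_commit = visible_rows(searcher)

writer.commit()
seen_after_commit = visible_rows(searcher)

print(seen_before_commit, seen_after_commit)

writer.close()
searcher.close()
```

The second session counts no rows before the commit and one row after it, which mirrors the rule that an insert into a searched table should be committed before the search is issued.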

Performance

Table containing 3003012 structures (12 x NCI) in VARCHAR2 columns, on a 3 GHz dual Xeon with 2 GB system memory

Index creation (regular tables): 5801 seconds

Import w/o duplicate filtering (JChem tables): 13104 seconds

Substructure search results (times in milliseconds):

Query structure                                         Hit count  JChem tables (ms)  Regular tables (ms)
C1CN1c2cnnc3c(cncc23)C4=CSC=C4                                  0                364                  374
O=C1ONC(N1c2ccccc2)c3ccccc3                                   204                456                  467
[#8]-c1c(N=N)c(cc2cc(ccc12)S([#8])(=O)=O)S([#8])(=O)=O       1188               1017                 1042
C(Sc1ncnc2ncnc12)c3ccccc3                                    1752                980                 1016
[#7]C1=CC=NC2=C1C=CC(Cl)=C2                                  4632               1987                 2074
c1ncc2ncnc2n1                                               49848              15873                16446
Clc1ccccc1                                                 274356              60459                63139
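The raw figures are easier to compare as rates and ratios. The following Python sketch derives throughput for index creation and import, and the relative search-time difference between the two table types. All numbers are copied from the slide; only the derived quantities are new.

```python
# Sketch: derive throughput and relative timings from the figures on
# the performance slide. All numbers are copied from the slide.

STRUCTURES = 3003012
INDEX_CREATION_SECONDS = 5801   # regular tables
IMPORT_SECONDS = 13104          # JChem tables, no duplicate filtering

# (hit count, JChem-table time in ms, regular-table time in ms)
SEARCH_TIMINGS = [
    (0, 364, 374),
    (204, 456, 467),
    (1188, 1017, 1042),
    (1752, 980, 1016),
    (4632, 1987, 2074),
    (49848, 15873, 16446),
    (274356, 60459, 63139),
]

index_rate = STRUCTURES / INDEX_CREATION_SECONDS
import_rate = STRUCTURES / IMPORT_SECONDS
print(f"index creation: {index_rate:.0f} structures/s")
print(f"import:         {import_rate:.0f} structures/s")

# Extra time taken by regular tables relative to JChem tables.
overheads = [
    (regular - jchem) / jchem for _, jchem, regular in SEARCH_TIMINGS
]
for (hits, _, _), overhead in zip(SEARCH_TIMINGS, overheads):
    print(f"{hits:>7} hits: regular tables {overhead:.1%} slower")
```

On these figures, searches on regular tables take a few percent longer than on JChem tables for every query, and search time grows with the hit count for both table types.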

Future plans

Support more features already included or planned in JChem Base:

• Pharmacophore similarity search

• Custom descriptors (e.g. BCUT, scalar) and metrics in similarity search

• Coordination bond support

• Tautomeric search support

• Other S-groups

Summary

JChem Cartridge for Oracle gives access to the rich functionality of JChem Base in a flexible and efficient manner.

JChem Cartridge for Oracle uses creative solutions to broaden the applicability of JChem's core functions while preserving key benefits of the Java platform.

Links

• Documentation

– www.jchem.com/doc/admin/cartridge.html

– www.jchem.com/doc/guide/cartridge/index.html

• Forum

– www.chemaxon.hu/forum/forum7.html

• Brochure

– www.chemaxon.com/brochures/JChem_Cartridge.pdf

Máramaros köz 3/a, Budapest, 1037 Hungary

www.chemaxon.com

Thank you for your attention