InChI/InChIKey vs. NCI/CADD Structure Identifiers: A Comparison
Markus Sitzmann
Computer-Aided Drug Design Group (NCI/CADD), Laboratory of Medicinal Chemistry, NCI-Frederick, NIH, DHHS
Comparison Standard InChI/InChIKeys - NCI/CADD Structure Identifiers
The Adaption and Use of the IUPAC InChI/InChIKey
NCI/CADD Identifiers InChI/InChIKey
Chemical Structure Lookup Service
FICTS FICuS uuuuu Std. InChI/InChIKey
74 million structure records – 46 million unique structures
NCI/CADD Structure Identifiers: Unique Representation of Chemical Structures
• based on hashcodes calculated by the chemoinformatics toolkit CACTVS
• CACTVS hashcodes:
  - represent a chemical structure uniquely as a 16-digit hexadecimal number (64-bit unsigned)
  - have a high sensitivity to structural features of a compound
  - change if connectivity changes
[Figure: example structure with its CACTVS hashcode 9850FD9F9E2B4E25]
[Figure: the example structure (hashcode 9850FD9F9E2B4E25) surrounded by related drawings: charged form, tautomers, isotope-labeled form, salt, stereoisomers, and "errors". Hashcodes shown for the variants: A3DAE0788050DDE4, 3ECEF579D7DF025A, E92E4BA2869F3611, 8A7AD1EB498CC76A, 6C16DE2351F9FF50, 8F7A1DE5A733F0E0, 60525E1AF41497B6, B2FDA68AEDA06DB9]
NCI/CADD Structure Identifiers: Unique Representation of Chemical Structures
[Workflow diagram]
input structure (MDL Molfile, MDL SDF, SMILES, ChemDraw cdx, PDB)
→ structure normalization
→ parent structure (MDL SDF, SMILES, database)
→ hashcode calculation (E_HASHISY)
→ NCI/CADD Identifier
NCI/CADD Structure Identifiers: Structure Normalization
• adjustable levels of sensitivity:
  Fragments: sensitive, or un-sensitive (keep only largest organic fragment)
  Isotopes: sensitive, or un-sensitive (ignore isotope labels)
  Charges: sensitive, or un-sensitive (uncharge)
  Tautomers: sensitive, or un-sensitive (find canonical tautomer)
  Stereochemistry: sensitive, or un-sensitive (discard stereo information)
[Figure: example structure pairs for each of the five features]
NCI/CADD Structure Identifiers: Structure Normalization
FICTS identifier: representation of the exact drawing
  F  Fragments: sensitive
  I  Isotopes: sensitive
  C  Charges: sensitive
  T  Tautomers: sensitive
  S  Stereochemistry: sensitive
[Figure: example structure pairs for each feature, marked = or ≠ depending on whether both drawings receive the same FICTS identifier]
NCI/CADD Structure Identifiers: Structure Normalization
FICuS identifier: comes closest to how a chemist perceives a compound
  F  Fragments: sensitive
  I  Isotopes: sensitive
  C  Charges: sensitive
  u  Tautomers: un-sensitive
  S  Stereochemistry: sensitive
[Figure: example structure pairs for each feature, marked = or ≠ depending on whether both drawings receive the same FICuS identifier]
NCI/CADD Structure Identifier: Structure Normalization
uuuuu identifier: closely related forms of the same compound
  u  Fragments: un-sensitive
  u  Isotopes: un-sensitive
  u  Charges: un-sensitive
  u  Tautomers: un-sensitive
  u  Stereochemistry: un-sensitive
[Figure: example structure pairs for each feature, all marked = (same uuuuu identifier)]
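The five-letter tags on these slides can be read position by position: an uppercase letter means the identifier is sensitive to that feature, and a "u" means it is not. The following is a minimal sketch of that reading; the function names `decode_tag` and `merged_by` are hypothetical helpers for illustration and are not part of CACTVS or any NCI/CADD tool.

```python
from typing import Dict, List

# Feature order follows the letters of the FICTS tag.
FEATURES = ("Fragments", "Isotopes", "Charges", "Tautomers", "Stereochemistry")
LETTERS = "FICTS"


def decode_tag(tag: str) -> Dict[str, bool]:
    """Map each feature to True (sensitive) or False (un-sensitive)."""
    if len(tag) != len(LETTERS):
        raise ValueError(f"expected a 5-character tag, got {tag!r}")
    result = {}
    for feature, letter, char in zip(FEATURES, LETTERS, tag):
        if char == letter:
            result[feature] = True       # uppercase letter: sensitive
        elif char == "u":
            result[feature] = False      # 'u': un-sensitive
        else:
            raise ValueError(f"unexpected character {char!r} in tag {tag!r}")
    return result


def merged_by(tag: str) -> List[str]:
    """List the features whose differences the identifier ignores."""
    return [name for name, sensitive in decode_tag(tag).items() if not sensitive]


if __name__ == "__main__":
    for tag in ("FICTS", "FICuS", "uuuuu"):
        print(tag, "ignores:", merged_by(tag) or "nothing")
```

Read this way, FICuS differs from FICTS only in the tautomer position, which matches the description of FICuS as the identifier closest to a chemist's view of a compound.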
NCI/CADD Structure Identifier: Structure Normalization
[Workflow diagram: input structure → normalization steps → parent structures]
Normalization steps:
• correct structure: add hydrogen atoms, correct functional groups, correct metal atom bonds
• get largest fragment & uncharge: delete complex center, get largest organic fragment, delete radical center, uncharge structure
• define canonical resonance form/protonation state
• discard isotope labels
• define canonical tautomer
• normalize (n) or discard (d) stereo information
Parent structures: uuuuu, uuuuS, uuuTu, uuuTS, FICuu, FICuS, FICTS, FICTu
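The eight parent-structure tags listed on this slide follow a simple pattern: the F, I and C positions appear only together (either "FIC" or "uuu"), while the T and S positions vary independently. A short sketch that enumerates the combinations is given below. The grouping of F, I and C is an observation drawn from the eight tags on the slide, not a documented rule, and `parent_structure_tags` is a hypothetical name.

```python
from itertools import product
from typing import List


def parent_structure_tags() -> List[str]:
    """Enumerate tags as (FIC or uuu) x (T or u) x (S or u)."""
    tags = []
    for fic, tautomer, stereo in product(("uuu", "FIC"), ("u", "T"), ("u", "S")):
        tags.append(fic + tautomer + stereo)
    return tags


if __name__ == "__main__":
    print(parent_structure_tags())
```

The result contains exactly the eight tags shown on the slide, three of which (FICTS, FICuS, uuuuu) are used in the rest of the comparison.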
NCI/CADD Structure Identifier
Format:
<CACTVS hashcode (E_HASHISY)>-<tag>-<version>-<checksum>
Examples (same example structure, three sensitivity levels):
9850FD9F9E2B4E25-FICTS-01-57
9850FD9F9E2B4E25-FICuS-01-78
9850FD9F9E2B4E25-uuuuu-01-27
[Figure: the example structure]
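The identifier layout above can be split mechanically into its four fields. The sketch below parses the full form with a regular expression. It assumes that version and checksum are always two digits, as in the three examples on the slide, and it does not verify the checksum because the slides do not describe how it is computed. `StructureIdentifier` and `parse_identifier` are hypothetical names.

```python
import re
from typing import NamedTuple


class StructureIdentifier(NamedTuple):
    hashcode: int   # 64-bit unsigned CACTVS hashcode (E_HASHISY)
    tag: str        # e.g. FICTS, FICuS, uuuuu
    version: str
    checksum: str   # kept as text; the checksum algorithm is not given in the slides


# Assumption: two-digit version and checksum, as in the examples shown.
_PATTERN = re.compile(r"^([0-9A-F]{16})-([FICTSu]{5})-(\d{2})-(\d{2})$")


def parse_identifier(text: str) -> StructureIdentifier:
    """Split '<hashcode>-<tag>-<version>-<checksum>' into its fields."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a full NCI/CADD identifier: {text!r}")
    hex_hash, tag, version, checksum = match.groups()
    return StructureIdentifier(int(hex_hash, 16), tag, version, checksum)


if __name__ == "__main__":
    ident = parse_identifier("9850FD9F9E2B4E25-FICuS-01-78")
    print(ident.tag, format(ident.hashcode, "016X"))
```

Storing the hashcode as an integer makes the "16-digit hexadecimal, 64-bit unsigned" property easy to check, and `format(value, "016X")` restores the textual form.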
FICTS
[Figure: the example structure (9850FD9F9E2B4E25-FICTS) and its variants (charged form, tautomers, isotope-labeled form, salt, stereoisomers, "errors") with their FICTS identifiers]
Identifiers shown for the variants:
A3DAE0788050DDE4-FICTS
E5F83F10C5DB080A-FICTS (shown twice)
B2FDA68AEDA06DB9-FICTS
9850FD9F9E2B4E25-FICTS
E92E4BA2869F3611-FICTS
8A7AD1EB498CC76A-FICTS
6C16DE2351F9FF50-FICTS
FICuS
[Figure: the example structure (9850FD9F9E2B4E25-FICuS) and its variants (charged form, tautomers, isotope-labeled form, salt, stereoisomers, "errors") with their FICuS identifiers]
Identifiers shown for the variants:
A3DAE0788050DDE4-FICuS
E5F83F10C5DB080A-FICuS (shown twice)
B2FDA68AEDA06DB9-FICuS
9850FD9F9E2B4E25-FICuS (shown twice)
E92E4BA2869F3611-FICuS
8A7AD1EB498CC76A-FICuS
uuuuu
[Figure: the example structure and its variants (charged form, tautomers, isotope-labeled form, stereoisomers, salt, "errors")]
All variants shown receive the same identifier:
9850FD9F9E2B4E25-uuuuu
Std. InChIKey
[Figure: the example structure (HNDVDQJCIGZPNO-UHFFFAOYSA-N) and its variants (charged form, tautomers, isotope-labeled form, stereoisomers, salt, "errors") with their Standard InChIKeys]
InChIKeys shown for the variants:
HNDVDQJCIGZPNO-UHFFFAOYSA-N (shown several times)
HNDVDQJCIGZPNO-CDYZYAPPSA-N
HNDVDQJCIGZPNO-RXMQYKEDSA-N
HNDVDQJCIGZPNO-YFKPBYRVSA-N
UHPNKBYGGMJTIM-UHFFFAOYSA-M (shown twice)
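A Standard InChIKey consists of a 14-character first block derived from the connectivity part of the InChI, a second block covering the remaining layers, and a final protonation character. Grouping the keys from this slide by first block shows which variants still share a skeleton according to InChI. This is a minimal sketch; `first_block` and `group_by_first_block` are hypothetical helper names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def first_block(inchikey: str) -> str:
    """Return the 14-character connectivity block of an InChIKey."""
    return inchikey.split("-")[0]


def group_by_first_block(keys: Iterable[str]) -> Dict[str, Set[str]]:
    """Collect the distinct full InChIKeys found under each first block."""
    groups = defaultdict(set)
    for key in keys:
        groups[first_block(key)].add(key)
    return dict(groups)


# The distinct Standard InChIKeys shown on the slide.
SLIDE_KEYS = [
    "HNDVDQJCIGZPNO-UHFFFAOYSA-N",
    "HNDVDQJCIGZPNO-CDYZYAPPSA-N",
    "HNDVDQJCIGZPNO-RXMQYKEDSA-N",
    "HNDVDQJCIGZPNO-YFKPBYRVSA-N",
    "UHPNKBYGGMJTIM-UHFFFAOYSA-M",
]

if __name__ == "__main__":
    for block, keys in group_by_first_block(SLIDE_KEYS).items():
        print(block, len(keys), "distinct key(s)")
```

Four of the five distinct keys share the first block HNDVDQJCIGZPNO and differ only in the second block, while the UHPNKBYGGMJTIM key has a different first block altogether.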
Structure Normalization: Tautomers
[Figure: tautomeric forms of an example compound. Which one is the canonical tautomer?]
Structure Normalization: Tautomers
• CACTVS: generation of all formal tautomers for a given organic compound (prototropic tautomerism)
• rule set of 21 transforms encoded as (CACTVS-extended) SMIRKS
• types of tautomerism covered:
  - 1.3, 1.5 keto/enol
  - imine/enamine
  - imine/amine
  - lactam/lactim
  - 1.3, 1.5, 1.7, 1.11 hydrogen atom shift on (aromatic) heteroatoms
  - ketene/ynol
  - nitro/aci-nitro
  - nitroso/oxime
  - special cases: cyanic/isocyanic acid, phosphonic acid, formamidinesulfonic acid, isocyanide, furanones and more …
Structure Normalization: Tautomers
• 21 SMIRKS transforms, examples:
transform: 1.3 keto-enol
[O,S,Se,Te;X1:1]=[Cx1:2][CX4R{0-2}:3][#1:4]>>[#1:4][O,S,Se,Te;X2:1][Cx1,cx1:2]=[C,cx1,cx0:3]
transform: 1.3 heteroatom H shift
[N,n,S,s,O,o,Se,Te:1]=[NX2,nX2,C,c,P,p:2][N,n,S,O,Se,Te:3][#1:4]>>[#1:4][N,n,S,O,Se,Te:1][NX2,nX2,C,c,P,p:2]=[N,n,S,s,O,o,Se,Te:3]
transform: 1.5 heteroatom H shift
[nX2,NX2,S,O,Se,Te:1]=[C,c,nX2,NX2:6][C,c:5]=[C,c,nX2:2][N,n,S,s,O,o,Se,Te:3][#1:4]>>[#1:4][N,n,S,O,Se,Te:1][C,c,nX2,NX2:6]=[C,c:5][C,c,nX2:2]=[NX2,S,O,Se,Te:3]
Structure Normalization: Tautomers
guanine
[Figure: the formal tautomers of guanine, each labeled with its FICTS identifier]
FICTS identifiers shown:
A6199E68A788F2F5-FICTS
959B273B619C709F-FICTS
61248C4A7D045A47-FICTS
675R4FCC50F45026-FICTS
0B345B47F6625113-FICTS
181CA9BCE3EF47F4-FICTS
1AD375920BE60DAD-FICTS
67196F0B20B1D934-FICTS
BCCDA7D0CDACF120-FICTS
CE8F480C11DBFC4F-FICTS
D46A1E6500B06AB6-FICTS
D979CF9770AC0BA5-FICTS
56FFE8B5619FB01-FICTS
F802E527EC5C61BF-FICTS
EF060DA9D97091DE-FICTS
FICuS identifier shown for guanine: BCCDA7D0CDACF120-FICuS
Std. InChIKey shown for guanine: UYTPUPDQBNUYGX-UHFFFAOYSA-N
InChI/InChIKey - NCI/CADD Identifier comparison
Structure Normalization: Tautomerism & Stereochemistry
methyl propenyl ketone
[Figure: the Z and E isomers of methyl propenyl ketone, each connected to a common enol tautomer]
FICuS disregards stereochemistry on double bonds if the double bond is not located during tautomer generation.
76D03F08ACDF6C0C-FICuS
Standard InChI/InChIKeys shown:
InChI=1S/C5H8O/c1-3-4-5(2)6/h3-4H,1-2H3/b4-3+  LABTWGUMFABVFG-ONEGZZNKSA-N
InChI=1S/C5H8O/c1-3-4-5(2)6/h3-4H,1-2H3/b4-3-  LABTWGUMFABVFG-ARJAWSKDSA-N
InChI=1S/C5H8O/c1-3-4-5(2)6/h3-4,6H,1H2,2H3/b5-4-  LYGWZVOQSCPYDG-PLNGDYQASA-N
InChI=1S/C5H8O/c1-3-4-5(2)6/h3-4H,1-2H3  LABTWGUMFABVFG-UHFFFAOYSA-N
InChI/InChIKey - NCI/CADD Identifier comparison
Tautomerism & Stereochemistry
methyl propenyl ketone
[Figure: the Z and E isomers, the enol tautomer, and the structure without double-bond stereo information]
FICTS "sees" four different structures:
821D8C17ACE5040E-FICTS
6EB4AA2BAA11965F-FICTS
1677645190718885-FICTS
76D03F08ACDF6C0C-FICTS
Structure Normalization: Charges in Resonance Systems
[Figure: different protonation states of an example nitrogen heterocycle, with the hashcodes F3A27F03AE77A722 (shown twice), 62FADCB01F197FC9 and 2E011EE4519F7920]
different protonation states
uncharge ≠ uncharge: problem!
canonical resonance structure?
Structure Normalization: Charges in Resonance Systems
• generation of all formal resonance structures for a given (charged) organic compound
• rule set of 14 transforms encoded as (CACTVS-extended) SMIRKS
  - shifting of charges: 5 rules
  - recombination of charges: 5 rules
  - separation of charges: 4 rules
[Figure: resonance structures of an O-N-O group]
Structure Normalization: Charges in Resonance Systems
münchnones:
(no plausible unpolarized resonance structure can be drawn)
[Figure: resonance structures of a münchnone, connected by the transforms 1.2 shift, 1.2 recombination, 1.3 shift, 1.3 recombination, and separation (pentavalent N atom)]
IUYUGWCTOLFFCL-UHFFFAOYSA-N
F68AC07DE0D3379F-FICuS
InChI/InChIKey - NCI/CADD Identifier comparison
»Chemical Structure Lookup Service« Database
74 million structure records (~46 million unique structures)
• PubChem database (including Open NCI database, EPA DSSTox databases, NIAID HIV databases, NIST Webbook, NLM ChemIDplus, ChemSpider …)
• ChemNavigator iResearch Library (compilation of commercially available screening compounds from ~250 international chemistry suppliers)
• Commercial Sources / Others (Asinex, Comgenex, …)
[Pie chart: PubChem ~47%, ChemNavigator iResearch Library ~43%, Others ~10%]
InChI/InChIKey - NCI/CADD Identifier comparison: Unique Structure Counts
• structure records registered in CSLS: 74.2 million
  successful calculation of:
    Standard InChI/InChIKey: 73.8 million records
    NCI/CADD Structure Identifiers: 73.7 million records
• unique structure counts (compound sets):
    Standard InChI/InChIKey            48,027,940
    FICTS Identifier                   48,023,835
    FICuS Identifier                   46,715,521
    Standard InChIKey (first block)    43,055,589
    uuuuu Identifier                   41,671,010
Standard InChI/InChIKeys were calculated by stdinchi-1 (Linux i-386 executable) from the original SD file records.
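Each unique structure count above is the number of distinct identifier values of one type across all structure records. A toy sketch of that counting is shown below. The record layout (a dict per record, with None for a failed calculation) and the function name `unique_counts` are assumptions made for illustration, and the sample data is not taken from CSLS.

```python
from typing import Dict, Iterable, Optional


def unique_counts(records: Iterable[Dict[str, Optional[str]]]) -> Dict[str, int]:
    """Count distinct identifier values per identifier type.

    Records whose identifier is None (calculation failed) are skipped
    for that identifier type.
    """
    seen: Dict[str, set] = {}
    for record in records:
        for kind, value in record.items():
            if value is not None:
                seen.setdefault(kind, set()).add(value)
    return {kind: len(values) for kind, values in seen.items()}


# Toy data: two drawings that differ at the FICTS level but share a FICuS
# identifier, plus one record for which no identifier could be calculated.
TOY_RECORDS = [
    {"FICTS": "821D8C17ACE5040E", "FICuS": "76D03F08ACDF6C0C"},
    {"FICTS": "6EB4AA2BAA11965F", "FICuS": "76D03F08ACDF6C0C"},
    {"FICTS": None, "FICuS": None},
]

if __name__ == "__main__":
    print(unique_counts(TOY_RECORDS))
```

The less sensitive the identifier, the more records collapse onto the same value, which is why the counts fall from FICTS to FICuS to uuuuu.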
InChI/InChIKey - NCI/CADD Identifier comparison: Detailed Comparison
[Diagram]
original structure record set (74.2 million)
→ FICuS compound set (46.7 million unique)
→ Standard InChI/InChIKey set calculated by stdinchi-1 (73.8 million, 48.0 million unique)
Comparison 1: conflicts between the FICuS compound set and the Standard InChI/InChIKey set?
Comparison 2: Standard InChI/InChIKey calculated by CACTVS from the FICuS compound structure: same InChI/InChIKey?
InChI/InChIKey - NCI/CADD Identifier comparison: Detailed Comparison
Comparison 1: no conflicts between Std. InChI/InChIKey and FICuS
structure records (million records):
    all structure records                           73.7
    FICuS linked to a single InChI/InChIKey         62.3 (84.5%)
      both linked to a single structure record      34.4 (46.9%)
      both linked to multiple structure records     27.9 (38.0%)
InChI/InChIKey - NCI/CADD Identifier comparison: Detailed Comparison
Comparison 1: conflicts between Std. InChI/InChIKey and FICuS
structure records (million records):
    all structure records                                        73.7
    FICuS is linked to multiple InChI/InChIKeys or vice versa    10.9 (14.7%)
      one FICuS is linked to multiple InChI/InChIKeys             6.8 (9.2%)
        counted by InChIKey first block                           2.3 (3.1%)
      one InChI/InChIKey is linked to multiple FICuS              4.1 (5.5%)
        counted by InChIKey first block                           1.0 (1.3%)
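The conflict categories above can be derived by building the FICuS-to-InChIKey mapping in both directions and checking, for every structure record, whether either side maps to more than one partner. The sketch below does this on toy data. The function name `classify_links`, the category labels, and the rule that a record matching both conflict types is counted under the first one are all assumptions; the slides do not say how such records were assigned.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

NO_CONFLICT = "no conflict"
FICUS_TO_MANY = "one FICuS, multiple InChIKeys"
KEY_TO_MANY = "one InChIKey, multiple FICuS"


def classify_links(pairs: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count structure records per category.

    Each pair is (FICuS identifier, InChIKey) for one structure record.
    """
    pairs = list(pairs)  # the records are traversed twice
    ficus_to_keys = defaultdict(set)
    key_to_ficus = defaultdict(set)
    for ficus, key in pairs:
        ficus_to_keys[ficus].add(key)
        key_to_ficus[key].add(ficus)

    counts = {NO_CONFLICT: 0, FICUS_TO_MANY: 0, KEY_TO_MANY: 0}
    for ficus, key in pairs:
        if len(ficus_to_keys[ficus]) > 1:
            counts[FICUS_TO_MANY] += 1   # assumption: this category takes priority
        elif len(key_to_ficus[key]) > 1:
            counts[KEY_TO_MANY] += 1
        else:
            counts[NO_CONFLICT] += 1
    return counts


if __name__ == "__main__":
    toy = [("A", "k1"), ("A", "k2"), ("B", "k3"), ("C", "k3"), ("D", "k4")]
    print(classify_links(toy))
```

Running the same classification with the InChIKey truncated to its first block would correspond to the "counted by InChIKey first block" figures.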
InChI/InChIKey - NCI/CADD Identifier comparison: Detailed Comparison
Comparison 2: same InChI/InChIKey?
compounds (unique structures), million records:
    identifier    all compounds    InChI changes
    FICTS         48.0             3.8 (7.9%)
    FICuS         46.7             6.4 (13.7%)
    uuuuu         41.6             11.9 (28.6%)
structure records, million records (all records: 73.7):
    identifier    InChI changes
    FICTS         4.6 (6.2%)
    FICuS         9.3 (12.7%)
    uuuuu         21.9 (29.7%)
vs. InChIKey first block: 3.2 (7.6%), 6.3 (8.4%)
InChI/InChIKey - NCI/CADD Identifier comparison: Detailed Comparison
    compound classification         occurrence in FICuS set    occurrence in FICuS subset (InChI changes)
    (formal) tautomer count > 1     56.4%                      14.5%
    (formal) tautomer count > 3     25.4%                      18.5%
    (formal) tautomer count > 10    5.5%                       28.9%
    full stereo                     25.7%                      16.9%
    contains metal atoms            0.8%                       34.5%
    metal complexes                 0.2%                       52.1%
    salt                            1.0%                       18.6%
    has resonance charges           0.2%                       52.1%
    inorganic                       0.1%                       33.9%
InChI/InChIKey - NCI/CADD Identifier comparison: Example
ChemBlock A3422/0145215
[Figure: structure of ChemBlock A3422/0145215]
FICuS: 12 different structure records are linked to this structure.
Std. InChI/InChIKey (stdinchi-1): calculates 3 different strings/keys for these 12 structure records (all have the same connectivity layer/first block).
All 3 of these Std. InChI/InChIKeys differ from the Std. InChI/InChIKey calculated after FICuS normalization (including connectivity layer/first block).
InChI/InChIKey - NCI/CADD Identifier comparison: Example
[Figure, built up over several slides: the structure records related to ChemBlock A3422/0145215, namely its Z and E isomers, a tautomer, and S and R stereoisomers, connected by arrows labeled "tautomeric interconversion?"]
Record labels shown: ZINC04685909; ChemBlock A3422/0145215; ChemNavigator 47748165; NIST MS-Lib 1967005690; ChemNavigator 34903393; ChemNavigator 65635274
How many structures?
Std. InChIKey: InChIKey A, InChIKey B, InChIKey C (same connectivity layer/block)
FICuS: one parent structure
The Adaption and Use of the IUPAC InChI/InChIKey
NCI/CADD Identifiers InChI/InChIKey
Chemical Structure Lookup Service
FICTS FICuS uuuuu Std. InChI/InChIKey
74 million structure records – 46 million unique structures
http://cactus.nci.nih.gov/lookup
Web Service
Chemical Structure REST Service (beta)
URL scheme:
http://cactus.nci.nih.gov/chemical/structure/{identifier}/{method}
Examples:
http://cactus.nci.nih.gov/chemical/structure/InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N/smiles
http://cactus.nci.nih.gov/chemical/structure/InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N/names
http://cactus.nci.nih.gov/chemical/structure/InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N/ficus
http://cactus.nci.nih.gov/chemical/structure/InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N/stdinchi
http://cactus.nci.nih.gov/chemical/structure/InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N/image
http://cactus.nci.nih.gov/chemical/structure/ethanol/stdinchikey
http://cactus.nci.nih.gov/chemical/structure/64-17-5/stdinchikey
Returns plain text or a GIF image. If the structure identifier is not resolvable, the service returns an HTTP 404 status code.
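The URL scheme can be exercised from a script with the Python standard library alone. The sketch below builds request URLs from an identifier and a method and treats a 404 response as "not resolvable", as described on the slide. The helper names `build_url` and `resolve` are hypothetical, the decoding of responses as UTF-8 text is an assumption that holds only for text methods (not for `image`), and the service was labeled beta, so its current availability at this address is not guaranteed.

```python
from typing import Optional
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://cactus.nci.nih.gov/chemical/structure"


def build_url(identifier: str, method: str) -> str:
    """Fill the {identifier}/{method} template, percent-encoding as needed."""
    # '=' is kept literal so that 'InChIKey=...' identifiers look as on the slide.
    return f"{BASE_URL}/{quote(identifier, safe='=')}/{quote(method)}"


def resolve(identifier: str, method: str, timeout: float = 10.0) -> Optional[str]:
    """Return the plain-text response, or None if the identifier is not resolvable."""
    try:
        with urlopen(build_url(identifier, method), timeout=timeout) as response:
            return response.read().decode("utf-8").strip()
    except HTTPError as error:
        if error.code == 404:   # documented signal for "not resolvable"
            return None
        raise


if __name__ == "__main__":
    print(build_url("ethanol", "stdinchikey"))
    print(build_url("InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N", "ficus"))
```

Calling `resolve("ethanol", "stdinchikey")` performs the actual network request; the `__main__` block only prints URLs so that the example runs offline.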
Acknowledgments
ChemNavigator: Scott Hutton, Tad Hurst
CADD Group, LMC, NCI: Marc Nicklaus, Igor V. Filippov
CACTVS, Xemistry GmbH: Wolf-Dietrich Ihlenfeldt
Thanks to all database providers
Thanks to the InChI Team
Our web site: http://cactus.nci.nih.gov