The Case for CASE: Computer-Assisted Structure Elucidation

Introduction
Computer-Assisted Structure Elucidation (CASE) has become a broadly accepted method for deriving novel or difficult chemical structures, especially those of natural products or drug metabolites and impurities. Error-free structure determination requires first evaluating all isomers that truly match the connectivity criteria defined by experiment, and this cannot be done manually without bias. Modern NMR techniques provide ever-increasing amounts of valuable data to substantiate structural evaluations, with lower limits of detection and more information on connectivities. Nevertheless, errors in published structures are rampant [1], which strengthens the argument for computer-assisted evaluation of structures. [2]

CASE Methodology
CASE follows a simple set of steps to comprehensively evaluate structures against the available data:
1. Interpret experimental data to extract knowledge:
   • Molecular formula
   • Integrals
   • Chemical shifts
   • Multiplicities
   • Connectivities
   • Known fragments
   • Known exclusions
2. Search structure space to derive all possible structures
3. Rank-order based on set criteria
   • Predicted chemical shift

Data Interpretation
See Figures 2 and 3.

Structure Generation
Using the advanced algorithms of ACD/Structure Elucidator Suite, the entirety of chemical space can be queried for possible structures that agree with the experimental data.

Ranking the Best Candidates
The software efficiently evaluates all structural possibilities, and then ranks the best candidates for review based on the differences between the experimental and predicted chemical shifts.

The Structure Elucidation Challenge
Between 2003 and 2011, 112 different datasets were submitted from various industries as challenges to the ACD/Structure Elucidator Suite system. [3] The structures determined ranged in molecular weight from 100 to more than 1000 Da (Figure 5), and the software was highly successful in determining the correct structures (Figure 6). In one recent example, Structure Elucidator Suite determined the structure of a large, symmetrical dimer known as Bacillusin A [4] (containing 86 skeletal units), even in the presence of ambiguous connectivities. [5]

Conclusions
Modern CASE systems such as Structure Elucidator Suite provide the capability to accurately elucidate novel chemical structures of complex molecules from readily available NMR data sets. This allows organizations to avoid expensive, labor-intensive, and time-consuming synthetic efforts. The routine use of CASE tools should also prevent the introduction of many improper structures into the literature or into compound registries.

References
1. Nicolaou, KC; Snyder, SA. Angew. Chem. Int. Ed. Engl., 2005, 44(7): 1012-44.
2. Elyashberg, ME; Williams, AJ; Blinov, KA. Nat. Prod. Rep., 2010, 27(9): 1296-1328.
3. Moser, A; Elyashberg, ME; Williams, AJ; Blinov, KA; DiMartino, JC. J. Cheminf., 2012, 4:5.
4. Ravu, RR, et al. J. Nat. Prod., 2015, 78(4): 924-8.
5. http://www.acdlabs.com/comm/elucidation/2015_05.php

Advanced Chemistry Development, Inc.
Tel: (416) 368-3435 Fax: (416) 368-5596 Toll Free: 1-800-304-3988
www.acdlabs.com

Rapid Identification of Structures
With the right databases, known structures can also be quickly identified from previous work.
The following databases are available using ACD/Labs software:
• ACD/HNMR and CNMR DB
   • 320,000 assigned chemical structures
• ACD/Structure Elucidator DB
   • 425,000 structures
   • 2,300,000 structural fragments
• ChemSpider DB
   • 22,000,000 predicted compounds with 13C, 1H, 19F, 31P, and 15N NMR
• Dictionary of Natural Products
   • Over 220,000 compounds
• MarinLit
   • > 21,000 compounds (marine sources)
• AntiBase
   • > 35,000 compounds (microorganisms and higher fungi)

Figure 1: The large number of possible isomers for some relatively small organic molecules.
• C10H17Br2ClO2: 50,502,293
• C15H22O2: 138,136,211,624
• C15H20O: 37,568,150,635
• C12H12O3: 68,930,547,646
• C13H20O3: 14,431,269,166
• C11H12N2O2: 3×10^11 < n < 10^12

Figure 2: In ACD/Labs NMR software, peak assignments are synchronized throughout a set of related NMR experiments with NMRSync. During interpretation, the software displays correlations for assigned spectra and structures, highlighting those that are likely to be erroneous.

Figure 3: A Molecular Connectivity Diagram (MCD) is created automatically during data interpretation. It can be used either for de novo elucidation or as an independent check on proposed structures.

Figure 4: A list of potential structures, ranked by their associated chemical shift deviations. Structural assignments are color-coded based on their agreement with the experimental data.

Figure 5: The distribution of molecular weight in structural challenges submitted to ACD/Labs.

Figure 6: The outcomes of structural evaluations performed by ACD/Structure Elucidator Suite from 2003-2011.
Figure 5: The elucidated structure of Bacillusin A
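The ranking step described under "Ranking the Best Candidates" can be illustrated with a minimal Python sketch. It is not the algorithm used by ACD/Structure Elucidator Suite, whose internals are not described here. It assumes that each candidate structure already comes with predicted 13C shifts listed in the same order as the experimental peaks, and it scores each candidate by the mean absolute deviation between the two lists. The names `Candidate`, `average_deviation`, and `rank_candidates`, and all shift values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Candidate:
    """A candidate structure with predicted shifts (hypothetical container)."""
    name: str
    predicted_shifts: list[float]  # ppm, same order as the experimental peaks


def average_deviation(experimental, predicted):
    """Mean absolute deviation (ppm) between paired experimental and predicted shifts."""
    if len(experimental) != len(predicted):
        raise ValueError("experimental and predicted shift lists must have equal length")
    return mean(abs(e - p) for e, p in zip(experimental, predicted))


def rank_candidates(experimental, candidates):
    """Return (deviation, name) pairs sorted so the best-matching candidate comes first."""
    scored = [
        (average_deviation(experimental, c.predicted_shifts), c.name)
        for c in candidates
    ]
    return sorted(scored)


# Illustrative values only; these are not measured or software-predicted data.
experimental_shifts = [170.2, 128.5, 77.1, 21.0]
candidates = [
    Candidate("B", [165.0, 135.0, 70.0, 25.0]),
    Candidate("A", [171.0, 129.0, 76.5, 20.6]),
]

for deviation, name in rank_candidates(experimental_shifts, candidates):
    print(f"{name}: average deviation {deviation:.2f} ppm")
```

In this toy input, candidate A is listed first because its predicted shifts lie closer to the experimental values. A production system would also have to handle peak assignment, multiple nuclei, and prediction uncertainty, none of which this sketch attempts.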