An Expert System for Chemical Structure Elucidation Sean Walker COMP 4200 November 13, 2007

Dec 18, 2015

An Expert System for Chemical Structure Elucidation

Sean Walker

COMP 4200

November 13, 2007

Introduction

• I will be discussing an expert system developed to determine the chemical structure of an unknown compound (structure elucidation)

• The expert system is implemented on a blackboard

Introduction: Motivation

• Structure elucidation is a fundamental component of organic chemistry

• Requires a wide range of expertise
  – Each elucidation technique has its own unique vocabulary that needs to be mastered

• An expert system can be used to simplify this process

Introduction: Outline

• Outline of presentation:
  1) Fundamentals of blackboard systems
  2) The expertise being modeled
     • General spectroscopic techniques
  3) Description of the expert system

Blackboard Systems

“Metaphorically, we can think of a set of workers, all looking at the same blackboard: each is able to read everything that is on it and to judge when he has something worthwhile to add to it.” – Newell, 1969

Blackboard Systems

• A set of experts independently modify solution elements on a central database to produce a complete solution

• The experts communicate solely through their contributions to the central database

• Three major components:
  1) a globally accessible database (the blackboard)
  2) a set of knowledge sources (the experts)
  3) a control mechanism (the scheduler)

Blackboard Systems: The Blackboard

• Blackboard is structured as an abstraction hierarchy

• Problems can be solved from different points by different knowledge sources

• Items on the blackboard are called entries

• Entries on the same level or on different levels of the hierarchy are linked

• Linked entries constitute a potential solution
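
The entry-and-link structure described above can be sketched as a small data structure. This is only an illustration: the names `Entry`, `Blackboard`, `link` and `linked_group`, and the use of integers for hierarchy levels, are assumptions made for the example and are not taken from the system discussed in this talk.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One item on the blackboard (hypothetical representation)."""
    name: str
    level: int                                  # position in the abstraction hierarchy
    links: list = field(default_factory=list)   # names of linked entries


class Blackboard:
    """Globally accessible store of entries."""

    def __init__(self):
        self.entries = {}

    def add(self, name, level):
        self.entries[name] = Entry(name, level)
        return self.entries[name]

    def link(self, a, b):
        # Links are symmetric and may join entries on the same or different levels.
        self.entries[a].links.append(b)
        self.entries[b].links.append(a)

    def linked_group(self, start):
        """Return every entry reachable from `start`: one potential solution."""
        seen, stack = set(), [start]
        while stack:
            name = stack.pop()
            if name not in seen:
                seen.add(name)
                stack.extend(self.entries[name].links)
        return seen


bb = Blackboard()
bb.add("CH3", level=0)
bb.add("C=O", level=0)
bb.add("CH3-C=O", level=1)
bb.add("OH", level=0)            # not linked to anything yet
bb.link("CH3", "CH3-C=O")
bb.link("C=O", "CH3-C=O")
group = bb.linked_group("CH3")
```

Here the two low-level entries and the higher-level entry built from them form one linked group, while the unlinked `OH` entry stays outside it.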

Blackboard Systems: The Knowledge Sources

• Knowledge sources are structured as condition-action pairs
  – The condition component monitors the blackboard for any changes
  – The action component makes changes to the blackboard when the condition part is satisfied

• When the condition is satisfied, the knowledge source is “triggered” and the scheduler decides whether the knowledge source will execute its action
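
A condition-action pair of this kind might look like the following sketch. The `KnowledgeSource` class, the dictionary used as a blackboard, and the `ir_peaks` / `ir_fragments` keys are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class KnowledgeSource:
    name: str
    condition: Callable[[dict], bool]   # inspects the blackboard
    action: Callable[[dict], None]      # modifies the blackboard

    def is_triggered(self, blackboard: dict) -> bool:
        return self.condition(blackboard)


# Hypothetical expert: once an IR peak list is present, it posts a fragment.
ir_expert = KnowledgeSource(
    name="infrared",
    condition=lambda bb: "ir_peaks" in bb and "ir_fragments" not in bb,
    action=lambda bb: bb.update(ir_fragments=["C=O"]),
)

blackboard = {}
triggered_before = ir_expert.is_triggered(blackboard)   # nothing to react to yet
blackboard["ir_peaks"] = [1750]
triggered_after = ir_expert.is_triggered(blackboard)
# Whether the action actually runs is the scheduler's decision; here it is run directly.
ir_expert.action(blackboard)
```

Keeping the condition separate from the action is what lets a scheduler see that a knowledge source is triggered without being forced to execute it.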

Blackboard Systems: The Scheduler

• One or more problem solving strategies are implemented

• The scheduler examines the current state of the blackboard and decides which triggered knowledge source to execute based on the problem solving strategy in place

• The scheduler can abandon a strategy and adopt a new one or ignore a strategy altogether in order to pursue the most promising solution
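
One way to picture the scheduler is as a loop that collects the triggered knowledge sources and lets the current strategy pick among them. The sketch below assumes a strategy is just a scoring function; `run_scheduler`, the dictionary-based knowledge sources and both strategies are hypothetical.

```python
def run_scheduler(blackboard, knowledge_sources, strategy, max_steps=100):
    """Repeatedly execute the triggered knowledge source the strategy rates highest."""
    order = []
    for _ in range(max_steps):
        triggered = [ks for ks in knowledge_sources if ks["condition"](blackboard)]
        if not triggered:
            break
        chosen = max(triggered, key=lambda ks: strategy(ks, blackboard))
        chosen["action"](blackboard)
        order.append(chosen["name"])
    return order


def make_sources():
    # Hypothetical experts: each fires once when its spectrum is on the blackboard.
    return [
        {"name": "infrared",
         "condition": lambda bb: "ir_peaks" in bb and "ir_done" not in bb,
         "action": lambda bb: bb.update(ir_done=True)},
        {"name": "proton_nmr",
         "condition": lambda bb: "nmr_peaks" in bb and "nmr_done" not in bb,
         "action": lambda bb: bb.update(nmr_done=True)},
    ]


# Two interchangeable strategies, expressed as scoring functions.
def nmr_first(ks, blackboard):
    return 1 if ks["name"] == "proton_nmr" else 0


def ir_first(ks, blackboard):
    return 1 if ks["name"] == "infrared" else 0


order_a = run_scheduler({"ir_peaks": [1750], "nmr_peaks": [10.0]}, make_sources(), nmr_first)
order_b = run_scheduler({"ir_peaks": [1750], "nmr_peaks": [10.0]}, make_sources(), ir_first)
```

Because the strategy is passed in as a parameter, swapping it changes the execution order without touching the knowledge sources, which mirrors the idea of abandoning one strategy for another.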

Structure Elucidation

• Modern structure elucidation is done using spectroscopy

• In absorption spectroscopy a sample of the unknown is irradiated with light of a given frequency and the amount of light the compound absorbs is measured

• The resulting data is analyzed by an expert and information about the structure of the unknown can be obtained

• The information collected from each spectrum is integrated to determine the complete structure

Spectroscopy: The Electromagnetic Spectrum

Infrared Spectroscopy

• Involves the absorption of light in the infrared region of the electromagnetic spectrum

• Used primarily to determine what functional groups are present in a molecule

[Figure: example molecular structures containing different functional groups]

Infrared Spectroscopy

• The broad peak at around 3000 cm⁻¹ indicates the presence of a hydroxyl group (OH)

• The strong, sharp peak at around 1750 cm⁻¹ indicates the presence of a carbonyl group
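
The two interpretations above amount to a lookup from peak position and shape to a functional group. A minimal sketch of such a lookup follows; the rule table holds only the two bands mentioned on this slide, and the ±100 cm⁻¹ matching window, the `IR_RULES` table and the function name are illustrative assumptions rather than spectroscopic standards.

```python
# Approximate band positions taken from the two examples above; the +/-100 cm^-1
# window is an illustrative assumption, not a spectroscopic standard.
IR_RULES = [
    {"center": 3000, "shape": "broad", "group": "hydroxyl (OH)"},
    {"center": 1750, "shape": "sharp", "group": "carbonyl (C=O)"},
]


def assign_functional_groups(peaks, window=100):
    """Map (wavenumber, shape) peaks to functional groups using IR_RULES."""
    found = []
    for wavenumber, shape in peaks:
        for rule in IR_RULES:
            if abs(wavenumber - rule["center"]) <= window and shape == rule["shape"]:
                found.append(rule["group"])
    return found


# Hypothetical peak list: two peaks match a rule, the third matches nothing.
groups = assign_functional_groups([(2980, "broad"), (1745, "sharp"), (1200, "sharp")])
```

A real infrared expert would need many more rules and some way of expressing how confident each assignment is.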

UV Spectroscopy

• Involves the absorption of light in the ultraviolet region of the electromagnetic spectrum

• Used to determine the level of conjugation in the unknown
  – Conjugation is alternating single and double bonds

• UV spectroscopy is not very useful in structure elucidation

Proton NMR

• Contains information about the hydrogens in the molecule

• Three key aspects:
  1) chemical shift – the “type” of hydrogen
  2) integration – ratio of different types of hydrogens
  3) splitting – nearest neighbour relationship

• Can be used to identify the presence of certain functional groups

• Used primarily to determine how the different functional groups present fit together (the connectivity)

Proton NMR

• The peak at around 10 ppm indicates the presence of an aldehyde

• The peak at 2.6 ppm is split into 4 peaks (a quartet), indicating that these hydrogens are adjacent to a carbon with 3 hydrogens

[Figure: structure of the example aldehyde]
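
The quartet reasoning above follows the first-order n + 1 rule: n equivalent neighbouring hydrogens split a signal into n + 1 peaks. A small sketch of the rule and its inverse is given below; the function names and the `MULTIPLET_NAMES` table are made up for the example, and the rule itself is only a first-order approximation.

```python
MULTIPLET_NAMES = {1: "singlet", 2: "doublet", 3: "triplet", 4: "quartet"}


def peaks_from_neighbours(n_neighbours):
    """n + 1 rule: n equivalent neighbouring hydrogens give n + 1 peaks."""
    return n_neighbours + 1


def neighbours_from_peaks(n_peaks):
    """Invert the rule: a signal with k peaks implies k - 1 neighbouring hydrogens."""
    if n_peaks < 1:
        raise ValueError("a signal has at least one peak")
    return n_peaks - 1


# The quartet at 2.6 ppm implies three neighbouring hydrogens (a CH3 group).
quartet_neighbours = neighbours_from_peaks(4)
# Conversely, hydrogens next to a CH3 group are expected to appear as a quartet.
ch3_neighbour_pattern = MULTIPLET_NAMES[peaks_from_neighbours(3)]
```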

Carbon-13 NMR

• Contains information about the carbons in the molecule

• Three key aspects:
  1) chemical shift – the “type” of carbon
  2) splitting – the number of hydrogens bonded to each carbon
  3) number of unique carbons present

• Used to determine connectivity

Carbon-13 NMR

• Peak at 190 ppm indicates the presence of a carbonyl (C=O)

• There are 7 total peaks indicating that there are only 7 unique carbons in the molecule

Mass Spectroscopy

• Mass spectroscopy is used to determine the molecular formula of the unknown compound

• Mass spectroscopy data that provides structural information tends to be unreliable and thus will only be used to verify a possible structure or in the event that the other spectral techniques are unsuccessful

Structure Elucidation: Applicability of a Blackboard Architecture

• Each type of spectroscopy is unique

• A human expert will often analyze a set of spectra as a whole, selectively determining which spectral information to utilize at a given time

• The blackboard architecture is ideal for this approach

• The blackboard architecture also allows for new experts to be added (new spectroscopic techniques)

The Expert System: The Blackboard

• An expert system implemented on a distributed blackboard has been developed to determine the structure of a chemical compound

• A sequential implementation of a blackboard would allow only one expert to access the blackboard at a time

• In a distributed system experts can access different sections of the blackboard at the same time

The Expert System: The Blackboard

• The hierarchy of the blackboard is based on the complexity of the structures being produced
  – Low level, basic structures occupy a certain level of the blackboard while more complicated structures occupy a different level

The Expert System: The Experts

• There are two main types of experts:
  1) Structure generation routines
  2) Spectroscopy experts

Structure Generation Routines: Storing Structures

• Ideally every possible chemical structure could be stored but this is not feasible
  – Even a simple formula such as C23H48 has 5,731,580 structural isomers

• Instead a set of substructures (components) is stored such that any possible structure can be formed from a combination of these components

• There are 630 total components

• Components are classified as primary, secondary or tertiary components

Structure Generation Routines: Types of Components

• 1) Primary Components:
  – Primary components are the most basic components for constructing organic molecules (CH3, CH2, CH, C, CO, OH, O, NH2, NH, N, SH, S, F, Cl, Br, I)

• 2) Secondary Components:
  – Secondary components are combinations of primary components
  – There are 86 secondary components

• 3) Tertiary Components:
  – Tertiary components are secondary components with a restriction on what the component can bond to

Structure Generation Routines

• The structure generation routines produce sets of primary, secondary or tertiary components based on input data

• The sets can be further pruned using spectral information
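
Pruning of this kind can be pictured as a filter over candidate component sets. The sketch below assumes each set is a multiset of primary components and that the spectral evidence arrives as required and forbidden components; the `prune` function, the three C3H6O candidate sets and the evidence itself are invented for illustration.

```python
from collections import Counter


def prune(candidate_sets, required, forbidden=()):
    """Keep only the component sets that are compatible with spectral evidence."""
    kept = []
    for components in candidate_sets:
        has_required = all(components[c] >= n for c, n in required.items())
        has_forbidden = any(components[c] > 0 for c in forbidden)
        if has_required and not has_forbidden:
            kept.append(components)
    return kept


# Hypothetical candidate sets, all consistent with the formula C3H6O.
candidates = [
    Counter({"CH3": 2, "CO": 1}),
    Counter({"CH3": 1, "CH2": 1, "CH": 1, "O": 1}),
    Counter({"CH2": 2, "CH": 1, "OH": 1}),
]

# Suppose the infrared expert reports a carbonyl and no hydroxyl group.
pruned = prune(candidates, required={"CO": 1}, forbidden=["OH"])
```

Only the first candidate survives, which shows how a single piece of spectral evidence can sharply cut down the sets passed on to the next generation step.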

Spectroscopy Experts

• There is an expert for each type of spectroscopy:
  1) Infrared Expert
  2) Ultraviolet Expert
  3) Proton NMR Expert
  4) Carbon-13 NMR Expert
  5) Mass Spectroscopy Expert

Spectroscopy Experts

Spectroscopy Experts

• The data contained in a spectrum may be unreliable or ambiguous
  – e.g. in a proton NMR spectrum, if the difference in chemical shift between two hydrogens is < 1 then the splitting observed may be inaccurate

• Heuristic rules are used to handle this ambiguity

• Uncertainty factors are attached to each conclusion drawn from the spectra

Spectroscopy Experts

• Each spectral expert translates the data contained in the spectra into molecular fragments

• These fragments are placed in an “active list” which is used to direct and restrict the structure generation routines

• If fragments from different experts conflict then the fragment with the highest certainty factor is used

• The conflicting fragment is placed in an “inactive list” which is used in the event that a correct structure is not found using the active list
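
The active and inactive lists can be sketched as follows. The data layout, the `resolve_conflicts` function, the fragment names and the numeric certainty values are all hypothetical; the only behaviour taken from the slide is that the fragment with the higher certainty factor stays active and the other is set aside.

```python
def resolve_conflicts(fragments, conflicts):
    """Split fragments into active and inactive lists by certainty factor.

    fragments: dict mapping fragment name -> (expert, certainty)
    conflicts: list of (fragment_a, fragment_b) pairs that cannot both hold
    """
    inactive = set()
    for a, b in conflicts:
        # Keep the fragment with the higher certainty; ties keep the first one.
        loser = b if fragments[a][1] >= fragments[b][1] else a
        inactive.add(loser)
    active = [f for f in fragments if f not in inactive]
    return active, sorted(inactive)


# Hypothetical conclusions from three experts, with made-up certainty values.
fragments = {
    "aldehyde": ("proton_nmr", 0.9),
    "ketone": ("infrared", 0.6),
    "CH3": ("carbon13_nmr", 0.8),
}
active, inactive = resolve_conflicts(fragments, [("aldehyde", "ketone")])
```

The inactive list is kept rather than discarded so that the lower-certainty fragment can still be tried if no correct structure is found from the active list.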

Spectroscopy Experts

• The spectroscopy experts are also used to test generated structures for consistency with the spectral information

• The system is able to identify when there is not enough information to verify a possible structure

An Example…

• Formula of unknown: C7H12O4

• 93 possible sets of primary components are produced

• Using these primary sets 497 sets of secondary components are possible
  – The number of sets of secondary components can be decreased if the primary component sets are pruned using spectral data

An Example…

An Example…

• After pruning the sets of primary components only one possible set remains:
  – Set contains 2 CH3, 2 C=O, 2 OH, 1 C and 2 CH2

[Figure: candidate structures built from the remaining component set]
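
As a quick consistency check, the atoms in the remaining component set can be summed and compared with the formula of the unknown, C7H12O4. The sketch below does exactly that; the `COMPONENT_ATOMS` table and `formula_of` function are names made up for the example.

```python
from collections import Counter

# Atoms contributed by each component in the surviving set.
COMPONENT_ATOMS = {
    "CH3": {"C": 1, "H": 3},
    "CH2": {"C": 1, "H": 2},
    "C":   {"C": 1},
    "C=O": {"C": 1, "O": 1},
    "OH":  {"O": 1, "H": 1},
}


def formula_of(component_set):
    """Sum the atoms of every component, weighted by how many times it occurs."""
    total = Counter()
    for component, count in component_set.items():
        for atom, n in COMPONENT_ATOMS[component].items():
            total[atom] += n * count
    return dict(total)


surviving_set = {"CH3": 2, "C=O": 2, "OH": 2, "C": 1, "CH2": 2}
formula = formula_of(surviving_set)
```

The totals come to 7 carbons, 12 hydrogens and 4 oxygens, matching the formula of the unknown.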

An Example…

Conclusion

• Determining the chemical structure of an unknown is an important part of organic chemistry

• Expert system technology can be applied to this domain

• A blackboard architecture is especially well suited to this task

References

1) Craig, I. D., Blackboard Systems, Artificial Intelligence Review, 1988, 2, 103-118.

2) Funatsu, K., Susuta, Y., Sasaki, S., Introduction of Two-Dimensional NMR Spectral Information to an Automated Structure Elucidation System, CHEMICS. Utilization of 2D-INADEQUATE Information, J. Chem. Inf. Comput. Sci., 1989, 29, 6-11.

3) Sobczak, Ronald S., Matthews, Manton M., An Expert System for Chemical Structure Elucidation Implemented on a Blackboard, Proceedings of the 3rd International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, 1990, 91-98.

4) Sobczak, Ronald S., Matthews, Manton M., A Massively Parallel Expert System Architecture for Chemical Structure Analysis, Distributed Memory Computing Conference, 1990, 11-17.

5) Sasaki, S., Kudo, Y., Structure Elucidation System Using Structural Information from Multisources: CHEMICS, J. Chem. Inf. Comput. Sci., 1985, 25, 252-257.