Transcript

Chemoinformatics: a Hot Topic in Distance Education

Zarrin Es’haghi

Department of Chemistry, Faculty of Sciences

Payame Noor University, Mashhad, Iran

E-Mail: z_eshaghi@pnu.ac.ir

3

What is Chemoinformatics?

Chemoinformatics, Cheminformatics, Chemical Informatics, Computational Chemistry, …

Chem(o)informatics is a generic term that encompasses the design, creation, organization, management, analysis, visualization and use of chemical information.

In fact, chemoinformatics is the application of informatics methods to solve chemical problems.

“the set of computer algorithms and tools to store and analyse chemical data in the context of drug discovery and design projects etc…”

4

What is Chemoinformatics?

“the mixing of information resources to transform data into information and information into knowledge, for the intended purpose of making better decisions faster in the arena of drug lead identification and optimization”

5

What is Chemoinformatics?

“chemoinformatics encompasses the design, creation, organisation, management, retrieval, analysis, dissemination, visualization and use of chemical information”

6

Chemoinformatics: a new science (?)

7

Why do we need Chemoinformatics?

• To handle large amounts of information

• To move chemistry into the computer age

• To move from data to knowledge.

9

Why do we need Chemoinformatics?

And last but not least:

• To get funding (bioinformatics is doing well currently, whereas computational chemistry seems to be lagging behind).

• Data → information → knowledge

• measurements/calculations

10

How do we learn?

Inductive learning vs. Deductive learning

Deductive learning: A fundamental theory exists which allows us to calculate properties and predict the behavior of molecules. The fundamental theory for chemistry is quantum mechanics.

Inductive learning = Learning from examples
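
Learning from examples can be illustrated with a very small sketch. The descriptor values, the class labels and the choice of a 1-nearest-neighbour rule are all assumptions made for illustration; none of them come from the slides.

```python
import math

# Hypothetical training examples: (descriptor vector, observed class).
# The two descriptors and all values are invented for illustration.
EXAMPLES = [
    ((1.0, 0.2), "active"),
    ((0.9, 0.3), "active"),
    ((0.1, 0.8), "inactive"),
    ((0.2, 0.9), "inactive"),
]

def predict(query, examples=EXAMPLES):
    """Return the class of the training example closest to `query`."""
    nearest = min(examples, key=lambda ex: math.dist(ex[0], query))
    return nearest[1]

# No underlying theory is used: the prediction rests only on known examples.
print(predict((0.95, 0.25)))
print(predict((0.15, 0.85)))
```

The contrast with deductive learning is that nothing here is derived from first principles; the quality of the prediction depends entirely on how representative the examples are.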

13

General scheme for inductive learning

14

The fundamental tasks of a chemist

Property prediction, synthesis, design, reaction prediction, and structure elucidation

15

The realm of Chemoinformatics

a) Representing Chemical Compounds

b) Searching Chemical Structures

c) Similarity Searches

d) Relating structure to properties with models
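
Items a) and c) can be sketched together, assuming each compound is represented as a set of "on" bits of a structural fingerprint and that similarity is measured with the Tanimoto coefficient. Both choices are assumptions (the slides do not name a representation or a similarity measure), and the compound names and bit sets below are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)

# Hypothetical fingerprint database: compound name -> set of 'on' bit positions.
LIBRARY = {
    "cmpd_A": {1, 4, 7, 9},
    "cmpd_B": {1, 4, 8},
    "cmpd_C": {2, 3, 5},
}

def similarity_search(query: set, library=LIBRARY):
    """Rank all library compounds by decreasing similarity to the query."""
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    return sorted(scored, key=lambda item: item[1], reverse=True)

ranked = similarity_search({1, 4, 7})
for name, score in ranked:
    print(name, round(score, 2))
```

A similarity search of this kind returns a ranked list rather than a yes/no answer, which is what distinguishes it from the exact structure search in item b).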

16

Machine Learning Methods

• Important role in chemoinformatics

– For example, it is usually difficult to predict which types of descriptors are most suitable for a given search or classification task.

• Therefore, machine learning techniques are often used to facilitate descriptor selection.

17

Machine Learning Methods – Genetic algorithms

• Different parameters and model solutions to a given problem are encoded in a chromosome and subjected to random variation, thus generating a population.

• Solutions provided by these chromosomes are evaluated by a fitness function that assigns high scores to desired results.

• Chromosomes yielding the best intermediate solutions are subjected to mutation and crossover operations that correspond to random genetic mutations and gene recombination events.

• The resulting modified chromosomes represent the next generation, and the process is continued until the obtained results meet a satisfactory convergence criterion.
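
The four steps above can be sketched for descriptor selection. This is a minimal illustration under stated assumptions: a chromosome is a bit string marking which of eight descriptors are selected, and the fitness function is a hypothetical stand-in that rewards a fixed set of "useful" descriptors (in practice it would be the performance of a model built on the selected descriptors). All names and parameter values are invented.

```python
import random

N_DESCRIPTORS = 8
# Hypothetical set of informative descriptors, used only to define a toy fitness.
USEFUL = {0, 3, 5}

def fitness(chrom):
    """Reward each selected useful descriptor, penalise each selected useless one."""
    return sum(1 if i in USEFUL else -1 for i, bit in enumerate(chrom) if bit)

def mutate(chrom, rng, rate=0.1):
    """Flip each bit with probability `rate` (random genetic mutation)."""
    return [1 - b if rng.random() < rate else b for b in chrom]

def crossover(a, b, rng):
    """One-point crossover (gene recombination) of two parent chromosomes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(generations=200, pop_size=20, seed=0):
    rng = random.Random(seed)
    # Random variation generates the initial population.
    population = [[rng.randint(0, 1) for _ in range(N_DESCRIPTORS)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == len(USEFUL):  # convergence criterion
            break
        parents = population[: pop_size // 2]      # best intermediate solutions
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children            # next generation
    return max(population, key=fitness)

best = evolve()
print(best, fitness(best))
```

Keeping the parents in the next generation is one possible design choice; it guarantees that the best score never decreases from one generation to the next.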

18

Quantitative Structure Activity Relationship Analysis (QSAR)

Goal: Evaluation of molecular features that determine biological activity and the prediction of compound potency as a function of structural modification
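
As a minimal sketch of this goal, the example below fits a straight line relating a single molecular descriptor to measured potency and uses it to predict the potency of a modified compound. The single-descriptor linear model and all data values are assumptions for illustration; real QSAR models typically use many descriptors and careful validation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical training data: one descriptor value and one potency per compound.
descriptor = [1.0, 2.0, 3.0, 4.0]
potency = [5.1, 5.9, 7.1, 7.9]

slope, intercept = fit_line(descriptor, potency)

# Predict the potency of a hypothetical new analogue with descriptor value 2.5.
predicted = slope * 2.5 + intercept
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```

The fitted slope expresses how potency changes with the structural feature encoded by the descriptor, which is the "function of structural modification" named in the goal.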

19

Virtual Screening and Compound Filtering

VS (Virtual Screening): the process of screening large databases on the computer for molecules having desired properties and biological activity.

A major application of VS techniques is the identification of novel active molecules in large compound databases.
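
Compound filtering can be sketched as a property filter over a small database. The filter used here is a strict variant of Lipinski's rule of five (all four criteria must hold), which is an assumption chosen for illustration and is not named in the slides; the compounds and their property values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol/water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors

def passes_rule_of_five(c: Compound) -> bool:
    """Strict rule-of-five filter: every criterion must be satisfied."""
    return (c.mol_weight <= 500 and c.logp <= 5
            and c.h_donors <= 5 and c.h_acceptors <= 10)

# Hypothetical compound database.
DATABASE = [
    Compound("cmpd_1", 320.4, 2.1, 2, 5),
    Compound("cmpd_2", 612.8, 6.3, 4, 11),
    Compound("cmpd_3", 455.0, 4.8, 1, 7),
]

hits = [c.name for c in DATABASE if passes_rule_of_five(c)]
print(hits)
```

In a real screening campaign such a cheap filter would usually run first, so that more expensive steps such as docking are applied only to the compounds that remain.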

20

Impact of new technology on drug discovery

• The last few years have seen a number of “revolutionary” new technologies:

– Gene chips, genomics and HGP

– Bioinformatics & Molecular biology

– More protein structures

– High-throughput screening & assays

– Virtual screening and library design

– Combinatorial chemistry

– Other computational methods

• How do we make it all work for us?

21

How Chemoinformatics can help out

• Producing and managing information for metrics to reduce risk, e.g.

– Virtual screening

– Library design

– Docking

– Cost/benefit analysis

• Making information available at the right time and the right place

– Needs to be integrated into processes

22

Software relevance: Bridge between computation & science

Chemoinformatics

tools: clustering, sim. searching, activity models, scaffold detection, docking, logP calculation

tasks: “doing a cluster analysis”, “identifying activity-related fragments”

?

Science

tasks: work out a chemical synthesis, choose good reagents, try and document some reactions

goals: e.g. produce compounds that have high biological activity

23

Chemoinformatics: Where It Has Come From, Where It Is Now And Where It Is Going

Overview

From Chemical Information To Chemoinformatics

– Integration with techniques from molecular modeling

– Developments in computer hardware and software

– Data explosion arising from developments in combinatorial chemistry and high-throughput screening

25

Molecular Modelling

• Positioning of a putative ligand into a protein’s active site, first attempted by the DOCK program (UCSF, 1982)

• Initially restricted to rigid ligands and rigid proteins: current programs permit some degree of flexibility

• Use in structure-based design

– Move from docking a single ligand to sequential docking of large datasets

26

27

Graph Theory

• Graph theory is a branch of mathematics that considers sets of objects, called nodes, and the relationships, called edges, between pairs of these objects

• The definition is completely general, allowing graphs to be used in many different application domains as long as an appropriate representation can be derived
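
In chemistry the natural representation is a molecule as a graph, with atoms as nodes and bonds as edges. The sketch below uses the heavy atoms of ethanol as an example; the molecule, the atom numbering and the helper names are assumptions made for illustration.

```python
# Heavy-atom graph of ethanol (CH3-CH2-OH): nodes are atoms, edges are bonds.
ATOMS = {0: "C", 1: "C", 2: "O"}
BONDS = [(0, 1), (1, 2)]

def adjacency(atoms, bonds):
    """Build an adjacency list: node index -> set of neighbouring node indices."""
    neighbours = {i: set() for i in atoms}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours

def is_connected(atoms, bonds):
    """Depth-first search: True if every atom can be reached from the first one."""
    neighbours = adjacency(atoms, bonds)
    start = next(iter(atoms))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(atoms)

print(adjacency(ATOMS, BONDS))
print(is_connected(ATOMS, BONDS))
```

Once a molecule is stored this way, general graph algorithms such as traversal or subgraph matching can be applied to it without any chemistry-specific code.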

28

Examples Of Graphs

29

Proposed courses for a Distance learning Program

• Chemoinformatics Virtual Classroom

30

 

At present there are no specific software tools for chemical information training in Iran. A number of commercial software products used in the pharmaceutical and biotechnology industry are either too expensive or of limited utility for training in either academic or business settings. By employing distance learning through a web delivery system, the training software will provide an effective, low-cost solution for academic institutions, whether they are offering a single course to students in a remote setting or an entire program in cheminformatics.

31

32

In addition, such training tools will be very useful in industry settings with local area networks, where individuals in a multidisciplinary setting need to receive training on the concepts employed by industrial chemoinformatics software.

Chemoinformatics: aims

• Develop an awareness of Informatics Management techniques used in the design and implementation of chemoinformatics systems

• Enable students to demonstrate skills learned by carrying out a small-scale industrially relevant chemoinformatics research project

• Basic structure

– Three semesters of taught modules

– One semester dissertation working at the site of one of the companies supporting the programme

33

Proposed Courses:

An introduction to chemoinformatics.

– Chemoinformatics (Fundamental)

– Information Systems Modelling

– Information Storage and Retrieval

– Foundations of Object-Oriented Programming

34

35

• Chemoinformatics (Advanced; more programming)

• Database Design

• Research Methods and Dissertation Preparation

• Two from a range of elective modules, including Molecular Modelling (Chemistry), Healthcare Information...etc

Conclusions

Distance learning is becoming increasingly accepted by the professional bodies. However, the image of distance learning still needs to be improved. The concept would have to be well presented as something new, modern and completely different from the old-style correspondence courses.

Chemoinformatics can step in to assist in this effort. It can do so in all fields of chemistry: inorganic, analytical, organic, physical, medicinal, and biochemistry. It can also reach beyond chemistry to provide methods and information that can be used in biology, medicine, and physics.

