Chemistry Add-in for Word OR 10 Joe Townsend University of Cambridge jat45@cam.ac.uk
Page 1: Chemistry Add-in for Word OR 10 Joe Townsend University of Cambridge jat45@cam.ac.uk.

Chemistry Add-in for Word

OR 10

Joe Townsend

University of Cambridge

jat45@cam.ac.uk

Page 2:

The World

Publication

Page 3:

The World (2003)

The Scientist

The Lab

Journals

Web Pages

Sad

Page 4:

Front Matter

Abstract

Introduction

Discussion

Experimental

References

Results

Article Structure

Synthesis

Set up

Analysis

Compound Name

Page 5:

(6R,12aR)-6-(1,3-benzodioxol-5-yl)-2-methyl-2,3,6,7,12,12a-hexahydropyrazino[1',2':1,6]pyrido[3,4-b]indole-1,4-dione

Page 6:

N-{[(1,1-dimethylethyl)oxy]carbonyl}-L-tryptophyl-L-methionyl-L-α-aspartyl-3,4,5-tribromo-L-phenylalaninamide

[Structure diagram of the compound named above]

Page 7:

(2R,3R,4S)-2-[(2S,3R,4R,5S,6R)-3-acetamido-2-[(2S,3R,4S,5R,6R)-6-[(2S,3R,4R,5S,6R)-3-acetamido-2-[(2S,3R,4S,5R,6R)-6-[(2S,3R,4R,5S,6R)-3-acetamido-2-[(2S,3R,4S,5R,6R)-6-[(2S,3R,4R,5S,6R)-3-acetamido-2-[(2S,3R,4S,5R,6R)-6-[(2S,3R,4R,5S,6R)-3-acetamido-2-[(2S,3R,4S,5R,6R)-6-[(2S,3R,4R,5S,6R)-3-acetamido-2-[(2S,3R,4S,5R,6R)-6-[(3R,4R,5S,6R)-3-acetamido-2-hydroxy-5-sulfooxy-6-(sulfooxymethyl)oxan-4-yl]oxy-2-carboxylato-4,5-disulfooxyoxan-3-yl]oxy-5-sulfooxy-6-(sulfooxymethyl)oxan-4-yl]oxy-2-carboxylato-4,5-disulfooxyoxan-3-yl]oxy-5-sulfooxy-6-(sulfooxymethyl)oxan-4-yl]oxy-2-carboxylato-4,5-disulfooxyoxan-3-yl]oxy-5-sulfooxy-6-(sulfooxymethyl)oxan-4-yl]oxy-2-carboxylato-4,5-disulfooxyoxan-3-yl]oxy-5-sulfooxy-6-(sulfooxymethyl)oxan-4-yl]oxy-2-carboxylato-4,5-disulfooxyoxan-3-yl]oxy-5-sulfooxy-6-(sulfooxymethyl)oxan-4-yl]oxy-2-carboxylato-4,5-disulfooxyoxan-3-yl]oxy-5-sulfooxy-6-(sulfooxymethyl)oxan-4-yl]oxy-3,4-disulfooxy-3,4-dihydro-2H-pyran-6-carboxylate

[Structure diagram of the compound named above]

Page 8:

Hamburgers

Paper [1]

Communication [2]

Supplementary Information [2]

[1] Org. Biomol. Chem., 2010, 8, 3149-3156, DOI: 10.1039/c003511d

[2] Org. Biomol. Chem., 2010, 8, 3130-3132, DOI: 10.1039/c004556j

"Converting PDF to XML is a bit like converting hamburgers into cows. You may be best off printing it and then scanning the result through a decent OCR package." [3]

[3] Michael Kay. (2009, August) xml-dev - RE: [xml-dev] How we can convert pdf data into xml? http://lists.xml.org/archives/xml-dev/200607/msg00509.html

Page 9:
Page 10:

STOP! DEMO TIME

Page 11:

Under the Hood

Page 12:

Validation

Is it CML?
• Does it use Chemical Markup Language correctly?

Is it CMLLite?
• Tighter constraints and co-constraints

Is it normalised?
• Further constraints and normalisation
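The three validation questions on this slide read as successive tiers, each stricter than the last. The sketch below is only an illustration of that layering, not the add-in's validator: the slides do not list the CMLLite or normalisation rules, so the specific checks in tiers 2 and 3 (unique atom ids, bonds that refer to declared atoms, ids numbered a1..aN) are hypothetical stand-ins, and all function names are invented for the example.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"  # CML schema namespace


def q(tag):
    """Return the namespace-qualified form of a CML element name."""
    return f"{{{CML_NS}}}{tag}"


def is_cml(doc):
    """Tier 1: the text parses as XML and its root is in the CML namespace."""
    try:
        root = ET.fromstring(doc)
    except ET.ParseError:
        return False
    return root.tag.startswith(f"{{{CML_NS}}}")


def is_constrained(doc):
    """Tier 2 (hypothetical rules): atoms have unique ids and an elementType,
    and every bond refers to two declared atoms (a co-constraint)."""
    if not is_cml(doc):
        return False
    root = ET.fromstring(doc)
    atoms = list(root.iter(q("atom")))
    ids = [atom.get("id") for atom in atoms]
    if None in ids or len(set(ids)) != len(ids):
        return False
    if any(atom.get("elementType") is None for atom in atoms):
        return False
    for bond in root.iter(q("bond")):
        refs = (bond.get("atomRefs2") or "").split()
        if len(refs) != 2 or not set(refs) <= set(ids):
            return False
    return True


def is_normalised(doc):
    """Tier 3 (hypothetical rule): atom ids run a1, a2, ... in document order."""
    if not is_constrained(doc):
        return False
    ids = [atom.get("id") for atom in ET.fromstring(doc).iter(q("atom"))]
    return ids == [f"a{n}" for n in range(1, len(ids) + 1)]


def validation_level(doc):
    """Count how many tiers the document passes, stopping at the first failure."""
    level = 0
    for check in (is_cml, is_constrained, is_normalised):
        if not check(doc):
            break
        level += 1
    return level
```

Returning a level rather than a single pass/fail flag keeps the tiers visible to the caller: a document can be valid CML (level 1) without meeting the tighter constraints.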

Page 13:

Chemistry Zone

ChemistryZone

CML

Properties

Function&

OPC

Page 14:

Package Structure
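An OPC package such as a .docx file is a ZIP archive, so its parts can be listed with a standard ZIP reader. The sketch below assumes, without confirmation from the slides, that chemistry data would sit in custom XML parts under `customXml/`. The helper names and the toy package are invented for illustration, and the toy package is not a valid Word document.

```python
import io
import zipfile


def build_toy_package():
    """Build a tiny in-memory ZIP that mimics an OPC layout (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<document/>")
        package.writestr(
            "customXml/item1.xml",
            '<molecule xmlns="http://www.xml-cml.org/schema"/>',
        )
    return buffer.getvalue()


def list_parts(package_bytes):
    """Return the names of all parts in the package, sorted."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return sorted(package.namelist())


def custom_xml_parts(package_bytes):
    """Return only the XML parts stored under customXml/ (assumed CML location)."""
    return [
        name
        for name in list_parts(package_bytes)
        if name.startswith("customXml/")
        and name.endswith(".xml")
        and "/_rels/" not in name
    ]
```

For a real document, `list_parts(open(path, "rb").read())` would show the same kind of listing.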

Page 15:

APP. SPECIFIC

UI

MANIPULATION

DOMAIN MODEL

NUMBO

Chemical Intelligence

Zone

CML

Properties

List of Depictions

Show List

CID

Zone’

Properties’

Page 16:

Linked Zones

CML

Properties 1

Properties 2

Properties 3

154

Page 17:

CML

Properties

154

CML

Properties

154

CML’

Properties

155

COPY

Fn

Copied Zones

Page 18:

Record Keeping

Page 19:

What can we do with a Cow?

5-Cyclobutyl-2,3-dihydro-[1H]-2-benzazepine 82:

Potassium carbonate (0.63 g, 4.56 mmol) and thiophenol (0.19 g, 1.69 mmol) were added to the 2-nitrobenzene sulfonamide 50 (0.50 g, 1.302 mmol) in N,N-dimethylformamide (33 cm³) at room temperature and the mixture was stirred for 16 h. Deionised water (50 cm³) was added and the aqueous phase was extracted with ethyl acetate (5 × 50 cm³). The organic extracts were dried (MgSO₄) and concentrated under reduced pressure to give the title compound 82 (0.259 g, 1.302 mmol, ca. 100%) as an oil used without further purification.

Page 20:

Tokenization and Chunking
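As a rough illustration of the idea, the sketch below tokenizes the opening of the experimental paragraph from the previous slide and groups the parenthesised quantities into chunks. It uses plain regular expressions as a stand-in. The slides do not say which language-processing tools were used, and a real chemistry tokenizer would handle names such as "2-nitrobenzene" far better than this one, which drops the hyphen. All names here are invented for the example.

```python
import re

SENTENCE = (
    "Potassium carbonate (0.63 g, 4.56 mmol) and thiophenol (0.19 g, 1.69 mmol) "
    "were added to the 2-nitrobenzene sulfonamide 50 (0.50 g, 1.302 mmol)"
)

# Tokens: decimal numbers, words (letters and hyphens), brackets and commas.
TOKEN = re.compile(r"\d+(?:\.\d+)?|[A-Za-z][A-Za-z\-]*|[(),]")

# A quantity is a number followed by one of a few known units.
QUANTITY = re.compile(r"(\d+(?:\.\d+)?)\s*(mmol|mol|mg|g|cm3|h)(?![A-Za-z0-9])")

# A chunk candidate is the text inside one pair of parentheses.
GROUP = re.compile(r"\(([^()]*)\)")


def tokenize(text):
    """Split text into crude tokens; anything unrecognised is skipped."""
    return TOKEN.findall(text)


def chunk_quantities(text):
    """Return one list of (value, unit) pairs per parenthesised group
    that contains at least one quantity."""
    chunks = []
    for group in GROUP.finditer(text):
        found = [(float(value), unit) for value, unit in QUANTITY.findall(group.group(1))]
        if found:
            chunks.append(found)
    return chunks
```

Each chunk pairs a mass with a molar amount, which is the structure a later step would attach to the reagent named just before the parentheses.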

Page 21:

Phrase identification

Page 22:

RDF of reaction components
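To make the idea concrete, the sketch below writes two of the reagents from the experimental paragraph as RDF triples in N-Triples syntax. The vocabulary under `http://example.org/reaction#` is purely hypothetical, as are the property and function names. The slides do not show which ontology or serialisation was used.

```python
EX = "http://example.org/reaction#"  # hypothetical vocabulary, for illustration only
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

# (name, mass in g, amount in mmol), copied from the experimental paragraph.
COMPONENTS = [
    ("potassium carbonate", "0.63", "4.56"),
    ("thiophenol", "0.19", "1.69"),
]


def iri(local_name):
    """Wrap a local name in the example namespace as an N-Triples IRI."""
    return f"<{EX}{local_name}>"


def literal(text, datatype=None):
    """Format an N-Triples literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    if datatype:
        return f'"{escaped}"^^<{datatype}>'
    return f'"{escaped}"'


def component_triples(reaction_id, components):
    """Return N-Triples lines linking a reaction to each of its components."""
    lines = []
    for index, (name, mass_g, amount_mmol) in enumerate(components, start=1):
        node = iri(f"{reaction_id}_component{index}")
        lines.append(f"{iri(reaction_id)} {iri('hasComponent')} {node} .")
        lines.append(f"{node} {iri('name')} {literal(name)} .")
        lines.append(f"{node} {iri('massInGrams')} {literal(mass_g, XSD_DECIMAL)} .")
        lines.append(f"{node} {iri('amountInMmol')} {literal(amount_mmol, XSD_DECIMAL)} .")
    return lines
```

Calling `"\n".join(component_triples("reaction82", COMPONENTS))` yields text that an RDF toolkit should be able to load as N-Triples.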

Page 23:

3D Boxes: Solid

Double Circles: Oil

Octagon: Gum

Triple Octagon: Foam

Diamond: Crystals or Needles

Ellipses: Unknown or Unspecified

Page 24:
Page 25:

The World

The Scientist

The Lab

Journals

Web Pages

Repositories

Page 26:

Tony Hey, Lee Dirks, Alex Wade, Savas Parastatidis, Oscar Naim, Pablo Fernicola, Geraldine Wade, Murray Sargent, Rudy Potenzone, Tim Haughton, Mike Galos, Tola Chhoeun, Jim McGill

Peter Murray-Rust, Jim Downing, Sam Adams, Daniel Lowe

http://research.microsoft.com/chem4word

http://chem4word.codeplex.com

Thanks