Navigating an Internet of Chemistry via ChemSpider Antony Williams University of Arkansas, Little Rock, October 2011 UALR Chemistry Seminar Guest Lecture
Navigating an Internet of Chemistry via ChemSpider

Dec 07, 2014

This is a presentation I gave via the BigBlueButton system to students and faculty at the University of Arkansas, Little Rock, regarding searching the internet for Chemistry.
Transcript
Page 1: Navigating an Internet of Chemistry via ChemSpider

Navigating an Internet of Chemistry via ChemSpider

Antony Williams
University of Arkansas, Little Rock, October 2011

UALR Chemistry Seminar Guest Lecture

Page 2: Navigating an Internet of Chemistry via ChemSpider

Overview

What type of chemistry is available on the internet?

Representative flavors of chemistry

How can the internet be searched by chemical?

Quality on the Internet

Contributing to the chemistry internet

Page 3: Navigating an Internet of Chemistry via ChemSpider

Where is chemistry online?

Encyclopedic articles (Wikipedia)
Chemical vendor databases
Metabolic pathway databases
Property databases
Patents with chemical structures
Drug Discovery data
Scientific publications
Compound aggregators
Blogs/Wikis and Open Notebook Science

Page 4: Navigating an Internet of Chemistry via ChemSpider

Representative Flavors of Chemistry

Page 5: Navigating an Internet of Chemistry via ChemSpider

Molfiles

Molfiles are the primary exchange format between structure drawing packages
Can be different between different drawing packages
Most commonly carry X,Y coordinates for layout
Can support polymers, organometallics, etc.
Can carry 3D coordinates

Page 6: Navigating an Internet of Chemistry via ChemSpider

Molfiles

 10  9  0  0  1  0  0  0  0  0  1 V2000
   31.2937   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   26.6526   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   31.2937   -7.7066    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   30.1161   -9.6877    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   25.5096   -9.6877    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   28.9731   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   27.8163   -9.7016    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   26.6664   -7.7066    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   32.4367   -9.6877    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   30.1161  -11.0177    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
  3  1  2  0  0  0  0
  4  1  1  0  0  0  0
  9  1  1  0  0  0  0
  7  2  1  0  0  0  0
  5  2  2  0  0  0  0
  8  2  1  0  0  0  0
  6  4  1  0  0  0  0
  4 10  1  6  0  0  0
  7  6  1  0  0  0  0
M  END
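To make the structure of this connection table concrete, here is a minimal Python sketch that reads the counts line, the atom block and the bond block shown on the slide. It is only an illustration: the function name `parse_v2000_body` is hypothetical, the three header lines of a complete molfile are not on the slide and are not handled, and splitting on whitespace is a simplification (the real V2000 format uses fixed-width columns, which matters for larger molecules).

```python
from collections import Counter

# Counts line, atom block and bond block as shown on the slide
# (the three molfile header lines are not in the transcript).
MOL_BODY = """\
 10  9  0  0  1  0  0  0  0  0  1 V2000
   31.2937   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   26.6526   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   31.2937   -7.7066    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   30.1161   -9.6877    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   25.5096   -9.6877    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   28.9731   -9.0366    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   27.8163   -9.7016    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   26.6664   -7.7066    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   32.4367   -9.6877    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   30.1161  -11.0177    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
  3  1  2  0  0  0  0
  4  1  1  0  0  0  0
  9  1  1  0  0  0  0
  7  2  1  0  0  0  0
  5  2  2  0  0  0  0
  8  2  1  0  0  0  0
  6  4  1  0  0  0  0
  4 10  1  6  0  0  0
  7  6  1  0  0  0  0
M  END
"""


def parse_v2000_body(text):
    """Return (atoms, bonds) from a V2000 counts line plus atom/bond blocks.

    Simplification: fields are split on whitespace rather than read from
    fixed-width columns.
    """
    lines = text.strip("\n").splitlines()
    counts = lines[0].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for line in lines[1:1 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    for line in lines[1 + n_atoms:1 + n_atoms + n_bonds]:
        first, second, order, stereo = (int(v) for v in line.split()[:4])
        bonds.append((first, second, order, stereo))
    return atoms, bonds


atoms, bonds = parse_v2000_body(MOL_BODY)
print(Counter(symbol for symbol, *_ in atoms))
print([bond for bond in bonds if bond[3] != 0])  # bonds carrying a stereo flag
```

The only bond with a non-zero stereo field is the one between atoms 4 and 10, which is how the 2D drawing encodes a wedge or hash bond.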

Page 7: Navigating an Internet of Chemistry via ChemSpider

SMILES (http://en.wikipedia.org/wiki/SMILES)

SMILES is a common format
Can support polymers, organometallics, etc.
Does NOT carry X,Y or Z coordinates for layout so requires layout algorithms, which can be problematic!
Generally different between drawing packages

Page 8: Navigating an Internet of Chemistry via ChemSpider

Stereo

Page 9: Navigating an Internet of Chemistry via ChemSpider

Tautomeric forms

Page 10: Navigating an Internet of Chemistry via ChemSpider

Vendor-dependent SMILES

ACD/Labs: CC(C)CCC[C@@H](C)CCC[C@@H](C)CCCC(\C)=C\CC2=C(C)C(=O)c1ccccc1C2=O

OpenEye: CC1=C(C(=O)c2ccccc2C1=O)C/C=C(\C)/CCC[C@H](C)CCC[C@H](C)CCCC(C)C

ChEMBL: CC(C)CCC[C@@H](C)CCC[C@@H](C)CCC\C(=C\CC1=C(C)C(=O)c2ccccc2C1=O)\C
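A quick way to see that these three strings can describe the same compound despite looking different is to compare their heavy-atom composition. The sketch below does this with regular expressions from the Python standard library. It is a sanity check only, not canonicalization: matching compositions do not prove identical structures, and a real comparison would use a cheminformatics toolkit. The function name `heavy_atom_counts` is hypothetical, and the parser assumes only bracket atoms plus the usual organic-subset symbols.

```python
import re
from collections import Counter

# Bracket atoms such as [C@@H]: optional isotope digits, then the element symbol.
BRACKET_ATOM = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})[^\]]*\]")
# Organic-subset atoms written without brackets (aromatic atoms are lowercase).
BARE_ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def heavy_atom_counts(smiles):
    """Count non-hydrogen atoms per element in a SMILES string."""
    counts = Counter()
    for match in BRACKET_ATOM.finditer(smiles):
        symbol = match.group(1).capitalize()
        if symbol != "H":
            counts[symbol] += 1
    # Remove bracket atoms so their inner characters are not counted twice.
    outside = BRACKET_ATOM.sub("", smiles)
    for match in BARE_ATOM.finditer(outside):
        counts[match.group(0).capitalize()] += 1
    return counts


# The three strings from the slide (raw strings keep the backslashes intact).
ACD_LABS = r"CC(C)CCC[C@@H](C)CCC[C@@H](C)CCCC(\C)=C\CC2=C(C)C(=O)c1ccccc1C2=O"
OPENEYE = r"CC1=C(C(=O)c2ccccc2C1=O)C/C=C(\C)/CCC[C@H](C)CCC[C@H](C)CCCC(C)C"
CHEMBL = r"CC(C)CCC[C@@H](C)CCC[C@@H](C)CCC\C(=C\CC1=C(C)C(=O)c2ccccc2C1=O)\C"

for vendor, smiles in [("ACD/Labs", ACD_LABS), ("OpenEye", OPENEYE), ("ChEMBL", CHEMBL)]:
    print(vendor, dict(heavy_atom_counts(smiles)))
```

All three strings give the same heavy-atom counts, which is consistent with one structure written three different ways.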

Page 11: Navigating an Internet of Chemistry via ChemSpider

The InChI Identifier

Page 12: Navigating an Internet of Chemistry via ChemSpider

InChI

SINGLE code base managed by IUPAC – integrated into drawing packages. No variability as with SMILES

InChI Strings can be reversed to structures – same problem as with SMILES – no layout

Adopted by the community (databases, blogs, Wikipedia) – good for searching the internet

Page 13: Navigating an Internet of Chemistry via ChemSpider

Multiple Layers
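The layered structure of an InChI string can be illustrated with a short Python sketch that splits the string on "/" and labels each layer by its prefix letter. The example string is the standard InChI for L-alanine; the function name `split_inchi` and the descriptive layer labels are illustrative choices, and the sketch assumes a simple InChI with a formula layer in which no prefix occurs twice (isotopic and fixed-H sections can repeat prefixes in real strings).

```python
# Prefix letter -> descriptive label (labels are informal, for illustration).
LAYER_LABELS = {
    "c": "connectivity",
    "h": "hydrogens",
    "q": "charge",
    "p": "protonation",
    "b": "double-bond stereo",
    "t": "tetrahedral stereo",
    "m": "stereo inversion flag",
    "s": "stereo type",
    "i": "isotopes",
}


def split_inchi(inchi):
    """Split a simple InChI string into a dict of labelled layers."""
    prefix, _, body = inchi.partition("=")
    if prefix != "InChI" or not body:
        raise ValueError(f"not an InChI string: {inchi!r}")
    parts = body.split("/")
    layers = {"version": parts[0], "formula": parts[1]}
    for part in parts[2:]:
        label = LAYER_LABELS.get(part[0], part[0])
        layers[label] = part[1:]
    return layers


L_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
for label, value in split_inchi(L_ALANINE).items():
    print(f"{label:22s} {value}")
```

In the hydrogens layer, the term `(H,5,6)` marks a hydrogen shared between atoms 5 and 6 (the two carboxyl oxygens), which is the mobile-H treatment that lets tautomeric drawings map to one identifier. The `t`, `m` and `s` layers carry the stereochemistry.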

Page 14: Navigating an Internet of Chemistry via ChemSpider

Tautomers – “Mobile H Perception”

Page 15: Navigating an Internet of Chemistry via ChemSpider

Stereo

Page 16: Navigating an Internet of Chemistry via ChemSpider

Checking for Stereochemistry

Page 17: Navigating an Internet of Chemistry via ChemSpider

Checking for Stereochemistry

Use your drawing package!

Page 18: Navigating an Internet of Chemistry via ChemSpider

Checking for Stereochemistry

Page 19: Navigating an Internet of Chemistry via ChemSpider

Checking for Stereochemistry

Page 20: Navigating an Internet of Chemistry via ChemSpider

Checking for Stereochemistry

Page 21: Navigating an Internet of Chemistry via ChemSpider

Databases and Standardization

Page 22: Navigating an Internet of Chemistry via ChemSpider

Databases and Standardization

Page 23: Navigating an Internet of Chemistry via ChemSpider

InChI Strings Hash to InChIKeys
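An InChIKey is a fixed-length, 27-character hash of the InChI string: a 14-letter block derived from the molecular skeleton (connectivity), a second block of 8 hashed letters for the remaining layers such as stereochemistry, followed by a standard/non-standard flag and a version letter, and a final letter for protonation. The Python sketch below only validates and splits that layout; it does not compute the hash itself. The function name `split_inchikey` and the dictionary field names are hypothetical, and the example key is the commonly quoted one for L-alanine.

```python
import re

# 14 letters - 8 letters + flag (S or N) + version letter - protonation letter
INCHIKEY_PATTERN = re.compile(r"([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])")


def split_inchikey(key):
    """Split an InChIKey into its blocks, raising ValueError if malformed."""
    match = INCHIKEY_PATTERN.fullmatch(key)
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    skeleton, detail, flag, version, protonation = match.groups()
    return {
        "skeleton_hash": skeleton,   # connectivity only
        "detail_hash": detail,       # stereo, isotopes and other layers
        "is_standard": flag == "S",  # S = standard InChI, N = non-standard
        "version": version,
        "protonation": protonation,
    }


parts = split_inchikey("QNAYBMKLOCPYGJ-REOHZPDASA-N")
print(parts["skeleton_hash"], parts["detail_hash"], parts["is_standard"])
```

Because the key has a fixed length and contains only letters and hyphens, it works well as a search term in general-purpose web search engines, unlike a full InChI string.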

Page 24: Navigating an Internet of Chemistry via ChemSpider

Vancomycin

Page 25: Navigating an Internet of Chemistry via ChemSpider

Vancomycin

Search Molecular SKELETON

Search Full Molecule

Page 26: Navigating an Internet of Chemistry via ChemSpider

Searching Chemistry on the Internet

Searching Vincristine

Name searching Google
Name searching Wikipedia
Name searching Wolfram Alpha
Name, name, name, name… searching
Structure searching DOZENS of websites, each with different information or…

Page 27: Navigating an Internet of Chemistry via ChemSpider

Searching Chemistry on the Internet

Searching Vincristine

Name searching Google
Name searching Wikipedia
Name searching Wolfram Alpha
Name, name, name, name… searching
Structure searching DOZENS of websites, each with different information or…

Search ONE website integrating the others!

Page 28: Navigating an Internet of Chemistry via ChemSpider

www.chemspider.com

Page 29: Navigating an Internet of Chemistry via ChemSpider

I want to know about “Vincristine”

Page 30: Navigating an Internet of Chemistry via ChemSpider

Vincristine: Identifiers and Properties

Page 31: Navigating an Internet of Chemistry via ChemSpider

Vincristine: Identifiers and Properties

Page 32: Navigating an Internet of Chemistry via ChemSpider

Vincristine: Vendors and Sources

Page 33: Navigating an Internet of Chemistry via ChemSpider

Vincristine: Patents

Page 34: Navigating an Internet of Chemistry via ChemSpider

Vincristine: Articles

Page 35: Navigating an Internet of Chemistry via ChemSpider

Vancomycin

Search Molecular SKELETON

Search Full Molecule

Page 36: Navigating an Internet of Chemistry via ChemSpider

Full Skeleton Search: 104 Hits

Page 37: Navigating an Internet of Chemistry via ChemSpider

Full Molecule Search: 4 Hits
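The difference between the two hit counts (104 for the skeleton, 4 for the full molecule) follows from which part of the InChIKey is matched. The Python sketch below shows the idea on a tiny in-memory table: a skeleton search compares only the first block of the key, while a full-molecule search compares the whole key. This is one plausible way to implement the distinction, not necessarily how ChemSpider does it; the function names and the `RECORDS` table are hypothetical, and alanine is used as a small stand-in for vancomycin. The keys are the commonly quoted ones and are worth verifying against a database before reuse.

```python
# InChIKey -> name for a toy collection (keys quoted for illustration).
RECORDS = {
    "QNAYBMKLOCPYGJ-REOHZPDASA-N": "L-alanine",
    "QNAYBMKLOCPYGJ-UWTATZPHSA-N": "D-alanine",
    "QNAYBMKLOCPYGJ-UHFFFAOYSA-N": "alanine (stereo unspecified)",
    "LFQSCWFLJHTTHZ-UHFFFAOYSA-N": "ethanol",
}


def full_molecule_search(query_key, records):
    """Return names whose complete InChIKey equals the query."""
    return [name for key, name in records.items() if key == query_key]


def skeleton_search(query_key, records):
    """Return names sharing the query's first (connectivity) block."""
    skeleton = query_key.split("-")[0]
    return [name for key, name in records.items() if key.split("-")[0] == skeleton]


query = "QNAYBMKLOCPYGJ-REOHZPDASA-N"
print("Full molecule:", full_molecule_search(query, RECORDS))
print("Skeleton:     ", skeleton_search(query, RECORDS))
```

The skeleton search returns every stereoisomer and the stereo-unspecified record, while the full-molecule search returns only the exact form queried.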

Page 38: Navigating an Internet of Chemistry via ChemSpider

Quality on the Internet

Trust everything on the web???

Page 39: Navigating an Internet of Chemistry via ChemSpider

What’s said on the web is true…

Page 40: Navigating an Internet of Chemistry via ChemSpider

What’s said on the web is true…

Page 41: Navigating an Internet of Chemistry via ChemSpider

What’s said on the web is true…

“We then established a collaboration with professor Sum Ting Wong, a fugitive from the North Korean University Hu Yu Hai Ding, currently in Rome (Italy).”

“This was identified as the new protein Wai So Dim (WSD).”

Page 42: Navigating an Internet of Chemistry via ChemSpider

Contributing Chemistry to the Web

If it was not just about me

Page 43: Navigating an Internet of Chemistry via ChemSpider

Contributing Chemistry to the Web

If it was not just about me:
We might have a community-built encyclopedia
I might know where the best restaurants are
I might get good advice on books to read
I might know which movies to watch
I might know which plumber to call
Data might just be Open

Page 44: Navigating an Internet of Chemistry via ChemSpider

Contributing Chemistry to the Web

If it was not just about me:
We might have a community-built encyclopedia
I might know where the best restaurants are
I might get good advice on books to read
I might know which movies to watch
I might know which plumber to call
Data might just be Open

Page 45: Navigating an Internet of Chemistry via ChemSpider

Contributing Chemistry to the Web

ChemSpider as a host for community contributions:
Curation and validation input
Structures
Movies
Images
Analytical data, especially spectra

Page 46: Navigating an Internet of Chemistry via ChemSpider

Contributing Chemistry to the Web

Sites allow direct feedback – leave it!

Sites allow deposition of data:
Text: chemical names, properties
Structures
Spectra

Curation of existing data

Page 47: Navigating an Internet of Chemistry via ChemSpider

Spectra

Page 48: Navigating an Internet of Chemistry via ChemSpider

ChemSpider SyntheticPages

Page 49: Navigating an Internet of Chemistry via ChemSpider

Submission Process

Simple template-based submission process

Submissions reviewed by editorial board. Published as is or comments sent to author

Online Peer Review process

Data supported include web movies, images, live spectra etc.

DOI issued to author

Page 50: Navigating an Internet of Chemistry via ChemSpider

Conclusion

Diverse types of chemistry are available on the web
Searching of the internet is possible based on text, structure searching and substructure searching
The InChI has enabled linking on the internet
Quality on the Internet is diverse: separating the wheat from the chaff is not always easy!
It is possible to contribute to the chemistry internet!

Page 51: Navigating an Internet of Chemistry via ChemSpider

Thank you

Email: [email protected]
Twitter: ChemConnector
Blog: www.chemspider.com/blog
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams