<chemical_informatics_project> <inspires> <chemistry_majors> Stuart J. Chalk Department of Chemistry University of North Florida [email protected] 2014 Fall ACS Meeting

ACS 248th Paper 104 ChemData Project

Jul 02, 2015


Science

Stuart Chalk

Presentation on a project run in my Chemical Information Science course. Valuable referenced chemical data from 'reliable' static webpages was 'scraped', cleaned, and added to a database for search.
Transcript
Page 1: ACS 248th Paper 104 ChemData Project

<chemical_informatics_project> <inspires>

<chemistry_majors>

Stuart J. Chalk

Department of Chemistry

University of North Florida

[email protected]

2014 Fall ACS Meeting

Page 2: ACS 248th Paper 104 ChemData Project

Outline

Motivation

Chemical Information Science: The Course

Syllabus

Final Project Outline

Expected Student Activities

Submitted Data Sets

Example Data

Data Modeling

Compound Data

Website

Future Plans

Conclusion

Image from http://www.embl.de/chemcore/chemcore_services/computational_chemistry/

Page 3: ACS 248th Paper 104 ChemData Project

Motivation

Students are not exposed to informatics in the regular chemistry curriculum

There is so much information for chemists to access and use that they need to know how to deal with it

Giving students this exposure makes them more competitive in graduate/professional school

We need professionals at the interface of chemistry and information science

Page 4: ACS 248th Paper 104 ChemData Project

Chemical Information Science: A UNF Elective Class

First taught as a Freshman Honors course in 2003 (“Chemical Informatics”)

Five iterations over the last 10 years

Now an upper-level three-credit elective class

Fall 2013 cohort – 21 students (three-credit lecture), “Chemical Information Science”

Page 5: ACS 248th Paper 104 ChemData Project

Syllabus

What is information? What is data? What is metadata? What types of data are there?

How and where does informatics fit in chemistry?

How is information organized, stored, related, formatted, typed?

The object-oriented view of information (objects, classes, methods)

The Semantic Web – What is it and why is it important?

Defining relationships between data, Concept maps

Controlled vocabularies, Thesauri, Ontologies

Page 6: ACS 248th Paper 104 ChemData Project

Syllabus

The eXtensible Markup Language (XML) and Scientific Markup Languages

Understanding and using Web 2.0 technologies for information retrieval

Generating Information and Metadata

Finding Chemical Information

Tools for Finding, Organizing and Using Chemical Information

Searching databases

Internet/browser software for Chemistry

Using Excel for searching and organizing scientific information

Page 7: ACS 248th Paper 104 ChemData Project

Final Project Outline

The ChemData Database

For your project you will gather chemical data from sources on the Internet, organize/filter the data, add it to the Excel spreadsheet provided, and then send your completed Excel spreadsheet to Dr. Chalk by the deadline.

Requirements

600 pieces of metadata at minimum must be submitted (excluding reference data)

The data must be correctly entered in the spreadsheet (no extra spaces, loss of accuracy, etc.)

It must be referenced to its origin, and those references included in the spreadsheet

For chemicals, the InChI must be part of the submitted metadata for each chemical species

A minimum of six hours of time for this activity is expected

The Excel spreadsheet to use is available on the course website.
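
The requirements above lend themselves to an automated pre-submission check. The following is a minimal Python sketch, not the check actually used in the course. The column names (`name`, `inchi`, `reference`, `is_chemical`) are hypothetical, and counting one "piece of metadata" per non-empty cell outside the reference column is an assumption about how the 600-item minimum is tallied.

```python
# Hypothetical column names; the real spreadsheet layout is not shown in the slides.
EXCLUDED_FROM_COUNT = {"reference", "is_chemical"}


def validate_row(row):
    """Return a list of problems found in one spreadsheet row (a dict)."""
    problems = []
    for column, value in row.items():
        # "No extra spaces": flag leading or trailing whitespace in text cells.
        if isinstance(value, str) and value != value.strip():
            problems.append(f"{column}: leading or trailing whitespace")
    # Every chemical species must carry an InChI string.
    if row.get("is_chemical") and not row.get("inchi", "").startswith("InChI="):
        problems.append("inchi: missing or malformed InChI")
    # Every entry must be referenced to its origin.
    if not row.get("reference"):
        problems.append("reference: missing source reference")
    return problems


def count_items(row):
    """Count non-empty cells, excluding reference data (an assumed tally rule)."""
    return sum(
        1
        for column, value in row.items()
        if column not in EXCLUDED_FROM_COUNT and str(value).strip()
    )


def check_submission(rows, minimum=600):
    """Validate all rows and compare the item count with the required minimum."""
    row_problems = {}
    for number, row in enumerate(rows, start=1):
        problems = validate_row(row)
        if problems:
            row_problems[number] = problems
    item_count = sum(count_items(row) for row in rows)
    return {
        "item_count": item_count,
        "count_ok": item_count >= minimum,
        "row_problems": row_problems,
    }


# Illustrative rows only; the reference value is a placeholder.
rows = [
    {"name": "water", "is_chemical": True,
     "inchi": "InChI=1S/H2O/h1H2", "reference": "placeholder-ref-1"},
    {"name": "ethanol ", "is_chemical": True, "inchi": "", "reference": ""},
]
result = check_submission(rows)
```

Here the second row is flagged three times (trailing space, missing InChI, missing reference), and the submission as a whole falls short of the minimum count.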

Page 8: ACS 248th Paper 104 ChemData Project

Expected Student Activities

Find suitable data source (hand-coded web page) on ‘reputable’ site with original reference

Download webpage content to computer

‘Scrape’ data out of webpage

Perform any data normalization (e.g. scientific notation)

Get metadata about chemicals referenced

Get metadata about original reference (DOI)

Import data into Excel and organize

Assign unique ids and add ids to link data

Add units and other metadata
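
The scraping, normalization, and id-assignment steps can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the workflow the students followed (the slides do not say which tools they used). The HTML snippet, the `TableScraper` class, the `normalize_number` helper, and the `d0001`-style id format are all hypothetical. Values are kept as strings so that no significant figures are lost.

```python
import re
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
        elif tag == "sup" and self._cell is not None:
            # Mark superscripts so "10<sup>23</sup>" does not collapse to "1023".
            self._cell.append("^")

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


SCI_NOTATION = re.compile(
    r"^\s*([+-]?\d+(?:\.\d+)?)\s*[xX×*]\s*10\s*\^\s*([+-]?\d+)\s*$"
)


def normalize_number(text):
    """Rewrite 'a x 10^b' as 'aEb', keeping the mantissa digits unchanged."""
    text = text.replace("\u2212", "-")  # Unicode minus sign to ASCII hyphen
    match = SCI_NOTATION.match(text)
    if match:
        return f"{match.group(1)}E{int(match.group(2))}"
    return text.strip()


# Illustrative page content; a real page would be downloaded and saved first.
html = """
<table>
  <tr><th>Constant</th><th>Value</th><th>Unit</th></tr>
  <tr><td>Avogadro constant</td><td>6.022 x 10<sup>23</sup></td><td>mol<sup>-1</sup></td></tr>
  <tr><td>Boltzmann constant</td><td>1.381 × 10<sup>−23</sup></td><td>J/K</td></tr>
</table>
"""

scraper = TableScraper()
scraper.feed(html)
header, *body = scraper.rows

# Assign a unique id to each scraped row so other tables can link to it.
records = [
    {"id": f"d{i:04d}", "name": name, "value": normalize_number(value), "unit": unit}
    for i, (name, value, unit) in enumerate(body, start=1)
]
```

Chemical and reference metadata (InChI, DOI) would still have to be looked up separately and linked to these records through the assigned ids.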

Page 9: ACS 248th Paper 104 ChemData Project

Submitted Data Sets

Students used an Excel spreadsheet to organize their data

Page 10: ACS 248th Paper 104 ChemData Project

Submitted Data Sets

They chose to submit data about

Organic compound properties

Organic compound reactions

Solvent properties

Types of analytical instrumentation

Analytical instrument operating conditions

Mathematical equations used in PChem

Physical constants

Unit conversion factors

Page 11: ACS 248th Paper 104 ChemData Project

Example Data

Page 12: ACS 248th Paper 104 ChemData Project

Data Modeling

Page 13: ACS 248th Paper 104 ChemData Project

Compound Data Table

Page 14: ACS 248th Paper 104 ChemData Project

Website

Page 15: ACS 248th Paper 104 ChemData Project

Feedback

Very positive

“Course was an informative and enjoyable overview of the emerging field of informatics as it relates to the sciences and Chemistry in particular.”

“What I am taking away from this class is something that can be applied to other courses and my career. Interesting peek behind the curtains of how the sharing of scientific knowledge and discovery are evolving.”

“Dr. Chalk was very enthusiastic about the subject of chemical informatics. He exposed us to some very helpful chemistry resources that I plan on using in the future.”

“Very interesting class with a lot of hands on computer use and learning experience. The homework was relative to the course information and helped to prepare for exams. Would retake this class and recommend to a friend interested in data or computer science.”

Page 16: ACS 248th Paper 104 ChemData Project

Future Plans

Finish curating, cleaning up data

Make site publicly available

For students: provide detailed instructions on how to find, curate, and submit their own data

For faculty: provide detailed description of the project and Excel spreadsheet

Write up a paper about this for J. Chem. Ed.

Use site as the basis for a question bank for online study questions

Page 17: ACS 248th Paper 104 ChemData Project

Conclusion

This was a fun project to run at the end of the class

Bringing together all that we had talked about in an activity made it much more tangible for students

Students liked the idea that the ChemData website would be used by other students in chemistry

I can’t wait to teach this again…

Page 18: ACS 248th Paper 104 ChemData Project

[email protected]

Phone: 904-620-5311

Skype: stuartchalk

LinkedIn/SlideShare: https://www.linkedin.com/in/stuchalk

ORCID: http://orcid.org/0000-0002-0703-7776

ResearcherID: http://www.researcherid.com/rid/D-8577-2013

Questions?