ContentMine + EPMC: Finding Zika!

Research Data Management

ContentMine + Europe PubMed Central
Peter Murray-Rust, ContentMine.org and University of Cambridge
Wellcome Trust, London, UK, 2016-02-08

getpapers[0] and AMI[1] download and analyze papers from Europe PubMed Central

[0][1] F/OSS tools from contentmine.org

Hi, I'm here to talk about AMI, a data extraction framework and tool. First, I just want to highlight some of the key contributors to the project: Andy for his work on the ChemistryVisitor and Peter for the overall architecture.

In this talk, I'm going to stress the importance of having data in a specific format and how useful that is for automated machine processing. Then I'm going to demonstrate AMI's architecture and the transformation of data as it flows through the process. I'm going to dwell a little on a core format used, Scalable Vector Graphics (SVG), before introducing the concept of visitors, which are pluggable, context-specific data extractors. Next, I'm going to introduce Andy's ChemVisitor, for extracting semantic chemistry data, along with a few other visitors that can process data that is not chemistry-specific. Finally, I will demonstrate some uses of the ChemVisitor in the areas of validation and metabolism.
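The idea of visitors as pluggable extractors can be illustrated with a minimal Python sketch. This is not AMI's code or API: the names `Fact`, `Visitor`, `WordListVisitor` and `run_visitors` are hypothetical, and the sketch assumes a visitor is simply an object that takes text and returns the facts it recognises.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass
class Fact:
    visitor: str  # name of the visitor that produced the fact
    value: str    # the term that was found


class Visitor(Protocol):
    """Hypothetical plug-in interface: anything with a name and a visit() method."""
    name: str

    def visit(self, text: str) -> list[Fact]: ...


class WordListVisitor:
    """Hypothetical visitor that reports each listed term found in the text."""

    def __init__(self, name: str, terms: list[str]) -> None:
        self.name = name
        self.terms = terms

    def visit(self, text: str) -> list[Fact]:
        lowered = text.lower()
        # Case-insensitive substring match; a real extractor would be stricter.
        return [Fact(self.name, t) for t in self.terms if t.lower() in lowered]


def run_visitors(text: str, visitors: list[Visitor]) -> list[Fact]:
    """Pass the same text to every registered visitor and pool the facts."""
    facts: list[Fact] = []
    for visitor in visitors:
        facts.extend(visitor.visit(text))
    return facts
```

Under these assumptions, adding a new kind of extraction means writing one more class with a `visit` method and adding it to the list given to `run_visitors`; the rest of the pipeline stays unchanged.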

Automated Semantic Fulltext
EuropePMC provides coherent Open Access
getpapers: wrapper for repos and search engines.
AMI filters, checks[1], transforms facts in papers. Here:
  Sequences in text
  Species and genera
  Genes
  User dictionaries (RRIDs, chemistry, places, phylo)
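To give a feel for what filtering facts such as sequences and species out of text can involve, here is a small Python sketch. It is not how AMI is implemented: both regular expressions, the 12-letter minimum for a sequence, the function name `extract_facts` and the `known_genera` dictionary are assumptions made for illustration only.

```python
import re

# Assumed rule: a run of 12 or more nucleotide letters counts as a sequence.
DNA_RE = re.compile(r"\b[ACGT]{12,}\b")

# Assumed rule: a capitalised word followed by a lowercase word of 3+ letters
# is a candidate binomial name (genus + species).
BINOMIAL_RE = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")


def extract_facts(text, known_genera):
    """Return candidate sequences and species names found in the text.

    known_genera plays the role of a user dictionary: a candidate binomial
    is kept only when its first word is a listed genus.
    """
    sequences = DNA_RE.findall(text)
    species = [
        f"{genus} {epithet}"
        for genus, epithet in BINOMIAL_RE.findall(text)
        if genus in known_genera
    ]
    return {"sequences": sequences, "species": species}
```

The dictionary check is what keeps ordinary phrases that happen to match the binomial pattern (a capitalised sentence opener followed by a lowercase word, for instance) out of the species list.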

[0] All operations shown run in total of

