Lost In Translation when machines meet STM content

Oct 30, 2014

scrazzl

This slidedeck outlines the Resource Identification Initiative and how partners within the group are working to improve reproducibility in science by making experimental methods more machine readable.
Transcript
Page 1: Lost In Translation when machines meet STM content

Lost In Translation when machines meet STM content…

This presentation was designed to be delivered live. To help you understand the content we have added these notes…

Page 2: Lost In Translation when machines meet STM content

3 Questions…

Behind the shared vision held by the partners of the Resource Identification Initiative lie a number of questions…

Page 3: Lost In Translation when machines meet STM content

Which one?

REPRODUCIBILITY: In the scientific community it is difficult to find objective qualitative information about research materials. Choosing the wrong products often means failed experiments…

Page 4: Lost In Translation when machines meet STM content

Where is it?

EFFICIENCY: Poor resource visibility means that labs around the world waste thousands of man-hours duplicating each other's work. Greater visibility of produced research materials would dramatically improve efficiency in science and reduce waste…

Page 5: Lost In Translation when machines meet STM content

Who has used it?

CONNECTIVITY: By its nature, science is a collaborative endeavour. Efficiently identifying knowledge and expertise when required is key to progressing discovery…

Page 6: Lost In Translation when machines meet STM content

The Role of Content…

Research content has evolved over time as a means of communicating conclusions. However, the real untapped value in content is the information about the journey…

Page 7: Lost In Translation when machines meet STM content

Who has used it?

Where is it?

Which one?

Every article contains valuable information about experimental procedures and materials. When cross-referenced with location, author and time data, powerful new experimental and research insights are revealed…
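The cross-referencing described above can be sketched as a simple index from resource identifier to the articles that mention it. This is only an illustration: the record fields and every value in `mentions` are invented placeholders, and the helper names are hypothetical rather than part of any RII or scrazzl tooling.

```python
from collections import defaultdict

# Hypothetical extracted mentions; all values are invented placeholders.
mentions = [
    {"resource": "AB_1234578", "article": "article-1",
     "author": "Author A", "location": "City X", "year": 2012},
    {"resource": "AB_1234578", "article": "article-2",
     "author": "Author B", "location": "City Y", "year": 2014},
    {"resource": "AB_0000001", "article": "article-2",
     "author": "Author B", "location": "City Y", "year": 2014},
]


def build_index(records):
    """Group mention records by the resource identifier they refer to."""
    index = defaultdict(list)
    for record in records:
        index[record["resource"]].append(record)
    return index


def who_has_used(index, resource_id):
    """Return sorted, unique (author, location, year) tuples for a resource."""
    return sorted(
        {(m["author"], m["location"], m["year"])
         for m in index.get(resource_id, [])}
    )


index = build_index(mentions)
users = who_has_used(index, "AB_1234578")
```

With an index like this, "Who has used it?" and "Where is it?" become lookups rather than a reading exercise across many articles.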

Page 8: Lost In Translation when machines meet STM content

Challenges…

Today's research articles are designed to be read one at a time by humans. To cross-reference we rely on our note-taking, memory and prior knowledge. Machines have the potential to dramatically improve the efficiency of how we glean insight from content. But…

Page 9: Lost In Translation when machines meet STM content

1. XML
2. Ambiguity
3. Culture

1) Every publisher has slightly different XML standards.
2) The vocabulary for describing research entities is ambiguous.
3) There is a poor culture of facilitating data mining and enforcing best annotation practice in the publishing industry.

Page 10: Lost In Translation when machines meet STM content

XML

The XML produced by different publishers can be significantly different. This makes indexing and analysing content at scale challenging…
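To make the indexing problem concrete, here is a minimal sketch of mapping two differently structured XML documents onto one common record. Both layouts, the tag names and the `LAYOUTS` table are invented for illustration and do not represent any real publisher's schema.

```python
import xml.etree.ElementTree as ET

# Two invented layouts for the same article; neither is a real schema.
PUBLISHER_A = (
    "<article><front><article-title>Example study</article-title></front>"
    "<body><sec sec-type='methods'><p>Glass beads were used.</p></sec></body>"
    "</article>"
)
PUBLISHER_B = (
    "<doc><title>Example study</title>"
    "<section name='methods'><para>Glass beads were used.</para></section>"
    "</doc>"
)

# One set of paths per layout, keyed by root tag (hypothetical mapping).
LAYOUTS = {
    "article": {
        "title": "./front/article-title",
        "methods": "./body/sec[@sec-type='methods']/p",
    },
    "doc": {
        "title": "./title",
        "methods": "./section[@name='methods']/para",
    },
}


def normalise(xml_text):
    """Parse one article and return a layout-independent record."""
    root = ET.fromstring(xml_text)
    paths = LAYOUTS[root.tag]
    title = root.findtext(paths["title"], default="")
    methods = [p.text or "" for p in root.findall(paths["methods"])]
    return {"title": title, "methods": methods}


record_a = normalise(PUBLISHER_A)
record_b = normalise(PUBLISHER_B)
```

Every additional layout needs its own entry in the mapping, which is why analysing content at scale gets harder as the number of publisher formats grows.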

Page 11: Lost In Translation when machines meet STM content

Ambiguity

Insufficient annotation and naming in content makes it difficult to disambiguate material entities. Take this glass beads example…

Page 12: Lost In Translation when machines meet STM content

Sigma produces at least 5 variations of glass beads. Which version is being referred to in the article?
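The ambiguity can be illustrated with a toy catalogue. The five entries below are placeholders standing in for the product variants (the identifiers and variant labels are invented, not real catalogue numbers): a search by vendor and name returns every variant, while a lookup by identifier returns exactly one.

```python
# Invented catalogue: five placeholder variants sharing one vendor and name.
CATALOGUE = [
    {"id": f"EXAMPLE_{n}", "vendor": "Sigma", "name": "glass beads",
     "variant": f"variant {n}"}
    for n in range(1, 6)
]


def candidates_by_name(vendor, name):
    """Return every catalogue entry matching a free-text vendor and name."""
    return [e for e in CATALOGUE
            if e["vendor"] == vendor and e["name"] == name]


def lookup_by_id(identifier):
    """Return the single entry with this identifier, or None if absent."""
    matches = [e for e in CATALOGUE if e["id"] == identifier]
    return matches[0] if len(matches) == 1 else None


by_name = candidates_by_name("Sigma", "glass beads")
by_id = lookup_by_id("EXAMPLE_3")
```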

Page 13: Lost In Translation when machines meet STM content

Culture

Publishers have traditionally made money by attracting great content and selling access to as many people as possible. Advances in technology have largely been viewed by publishers as a means to do more of the same at a lower cost. Publishers have been slow to adopt practices that make their content machine accessible…

Page 14: Lost In Translation when machines meet STM content

Who is involved…

The RII is backed by a wide group of interests working together to change how experimental resources are documented in new research content…

Page 15: Lost In Translation when machines meet STM content

The group includes publishers, academic groups, funding agencies, resource repositories and commercial companies…

Page 16: Lost In Translation when machines meet STM content

Shared Goals…

The group has a number of shared goals with the aim of improving the machine accessibility of STM content in a practical and sustainable way…

Page 17: Lost In Translation when machines meet STM content

1. Unique Identifier

AB_1234578

1) By agreeing on and assigning standard unique identifiers for all known research materials (commercial and non-commercial)…
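Once identifiers appear in the text, a machine can pick them out with very little effort. The sketch below assumes identifiers follow the shape of the example on this slide ("AB_" followed by digits); the real identifier scheme may include other prefixes or formats, and the sample sentence is invented.

```python
import re

# Assumption: identifiers look like the slide's example, "AB_" plus digits.
IDENTIFIER_PATTERN = re.compile(r"\bAB_\d+\b")


def extract_identifiers(text):
    """Return identifiers in order of first appearance, without duplicates."""
    seen = []
    for match in IDENTIFIER_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


# Invented methods sentence for illustration.
methods = ("Samples were treated with a reagent (AB_1234578) "
           "and later re-tested with AB_1234578.")
found = extract_identifiers(methods)
```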

Page 18: Lost In Translation when machines meet STM content

2. Editor Awareness

• Drive adoption
• Better XML standards
• Content machine friendly

2) By working with publishers and other community members to encourage the inclusion of unique identifiers at the authoring stage and devising strategies for XML standardisation…

Page 19: Lost In Translation when machines meet STM content

3. Distribution…

3) By developing technology and APIs to disseminate research material information in a standardised form…
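As a rough idea of what a "standardised form" could look like when served by an API, the sketch below serialises one resource record to JSON. The field names and values are assumptions made for illustration; the slides do not define a schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ResourceRecord:
    """Hypothetical record shape; field names are illustrative only."""
    identifier: str
    name: str
    vendor: str
    resource_type: str


def to_json(record):
    """Serialise a record with stable key order so consumers can diff it."""
    return json.dumps(asdict(record), sort_keys=True)


# Placeholder values, apart from the identifier shown on the earlier slide.
record = ResourceRecord("AB_1234578", "example material",
                        "Example Vendor", "example type")
payload = to_json(record)
```

Any consumer that agrees on the same fields can rebuild the record from `payload`, regardless of which publisher or repository supplied it.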

Page 20: Lost In Translation when machines meet STM content
Page 21: Lost In Translation when machines meet STM content

4. Annotation

- Pre-publication: Prospective
- Post-publication: Retrospective

4) While NIF is focussing on research material annotation at the pre-publication stage, scrazzl is working on a separate initiative to drive retrospective annotation of published research…

Page 22: Lost In Translation when machines meet STM content

5. Interoperability

5) One of the main aims of the RII is to support the adoption of a standardised public research material ontology and vocabulary that is interoperable with other existing biological ontologies…

Page 23: Lost In Translation when machines meet STM content

So what does success look like?

Our Destination…

Page 24: Lost In Translation when machines meet STM content

Connectivity

Every new article published will contain unique identifiers either in the visible text or in the underlying metadata. This will improve machine readability and will dramatically improve the semantic connectivity of articles…

Page 25: Lost In Translation when machines meet STM content

Reproducibility

Data-driven qualitative metrics of material entities will be available, improving reproducibility and driving efficiency…

Page 26: Lost In Translation when machines meet STM content

Visibility

Improved geo- and time-dependent visualisation of resource availability will be possible. Finding where resources are and identifying key technical experts will be more efficient…

Page 27: Lost In Translation when machines meet STM content

E: [email protected]
T: +353 (0) 863-867-990

Twitter: @dkavanagh
www.scrazzl.com

Questions?