ViBRANT: Virtual Biodiversity

Literature Mining and Mark-up: ViBRANT’s text processing tools

David Morse, The Open University, UK, David.Morse@open.ac.uk
Dauvit King, The Open University, UK

ViBRANT/BeBOL/JEMU workshop, RBINS, 11 June 2013




Literature Mining


ViBRANT is for taxonomists, so we look for:
• Taxon names
• Authors
• Locations

Also interested in:
• Citations
• Relationships

Mining for Names and Concepts
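
As a rough illustration of what looking for taxon names and authors involves, the sketch below applies two naive regular expressions to a sentence. The patterns, the function name `find_candidates` and the sample sentence are all assumptions made up for this example; they are not taken from the ViBRANT tools and are far cruder than a real name-recognition service.

```python
import re

# Assumption: a candidate binomial is a capitalised word followed by a
# lowercase word of at least three letters (genus + specific epithet).
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")

# Assumption: a candidate authority is a capitalised surname, an optional
# comma, and a four-digit year.
AUTHORITY = re.compile(r"\b([A-Z][a-z]+),? (\d{4})\b")


def find_candidates(text):
    """Return (candidate names, candidate authorities) found in text."""
    names = [" ".join(m.groups()) for m in BINOMIAL.finditer(text)]
    authorities = [(m.group(1), int(m.group(2))) for m in AUTHORITY.finditer(text)]
    return names, authorities


# Illustrative sentence, not quoted from any publication.
sample = "Apis mellifera Linnaeus, 1758 is widespread. The genus Banchus is revised here."
names, authorities = find_candidates(sample)
print(names)        # ['Apis mellifera', 'The genus']
print(authorities)  # [('Linnaeus', 1758)]
```

The false positive "The genus" shows why pattern matching alone is not enough: any capitalised word at the start of a sentence looks like a genus, and a genus used on its own ("Banchus") is missed entirely.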


Literature Mining – harder than you think

M

BRITISH MUSEUM

(NATURAL HiSi

26JU

PRESENTED GENERAL UC.-lARY

Bulletin ofthe

BritishMuseum (Natural History)

The ichneumon-fly genus Banchus in the OldWorld

(Hymenoptera)

M. G. Fitton series

Entomology Vol51 Nol 25 July 1985
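
The scanned title page above shows typical OCR damage: words run together ("BritishMuseum", "OldWorld", "Vol51") and characters are misread ("Nol", "HiSi"). The sketch below is a minimal, assumed clean-up step that only splits fused tokens; the function name `split_fused_tokens` and its two rules are hypothetical and are not part of any ViBRANT tool.

```python
import re


def split_fused_tokens(line):
    """Insert spaces where OCR has probably dropped them.

    Two heuristic rules (assumptions for illustration only):
    1. lowercase letter followed by uppercase letter: "OldWorld" -> "Old World"
    2. letter followed by digit: "Vol51" -> "Vol 51"
    """
    line = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", line)
    line = re.sub(r"(?<=[A-Za-z])(?=\d)", " ", line)
    return line


for raw in [
    "BritishMuseum (Natural History)",
    "The ichneumon-fly genus Banchus in the OldWorld",
    "Entomology Vol51 Nol 25 July 1985",
    "Bulletin ofthe",
    "(NATURAL HiSi",
]:
    print(repr(raw), "->", repr(split_fused_tokens(raw)))
```

The first three lines improve, but "ofthe" is left untouched, "Nol" stays misread, and "HiSi" becomes "Hi Si", which is no closer to the original word. Simple rules help only with the easy cases.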



GoldenGATE


Sautter, G., Agosti, D., and Böhm, K. (2007) Semi-Automated XML Markup of Biosystematics Legacy Literature with the GoldenGATE Editor. In Proceedings of PSB 2007, Wailea, HI, USA.

Downloadable from http://psb.stanford.edu/psb-online/proceedings/psb07/sautter.pdf


GoldenGATE


GoldenGATE in OBOE


GoldenGATE in OBOE


GoldenGATE in OBOE


GoldenGATE in OBOE


Visualising mark-up


Taxonomic XML schemas


Lyubomir Penev, Christopher Lyal, Anna Weitzman, David Morse, David King, Guido Sautter, Teodor Georgiev, Robert Morris, Terry Catapano, and Donat Agosti. (2011) XML schemas and mark-up practices of taxonomic literature. ZooKeys 150: 89-116.

Downloadable from http://dx.doi.org/10.3897/zookeys.150.2213


Linked Open Data


Other tools


KEA: Keyphrase Extraction Algorithm

GNRD: Global Names Recognition and Discovery

Linnaeus: Used for molecular data


Conclusion


Developing Literature Mining services deployed through OBOE.

Initially aimed at ViBRANT’s core audience.

Setting up workflow integrated with Scratchpads.

Yet still permitting large, slow jobs.