
Post on 24-Dec-2015


Making molehills out of mountains:

Crowdsourcing digital access to natural history collections

Laurence Livermore, John Tweddle, Lisa French, Lucy Robinson, Sarah Phillips and Vincent S. Smith

Link to full report in Google Docs:

http://goo.gl/g6pBcH

Note: This is the working version of the report and will contain comments, notes and rough edges!

Background

• Dual Digital Collections Programme and SYNTHESYS3 report

• Audience - SYNTHESYS3 Taxonomic Access Facilities and internal NHM

• Aims:

– Review current natural history crowdsourcing platforms;

– Provide case studies of natural history crowdsourcing projects;

– Summarise motivation of volunteers;

– Recommend strategies for crowdsourcing success and future crowdsourcing research.

Crowdsourcing Definition & Context

• Crowd-based activity

• Clear task and goal

• Crowd is rewarded

• Distinct crowdsourcer (e.g. the NHM)

• Benefits the crowdsourcer

• Online and open participatory process

Tasks and goals:

• Majority are transcription based (labels, registers or diaries)

• Tasks are well-suited for human intelligence (handwriting interpretation and data categorisation)

Crowdsourcing Platforms

Platform Comparison

Feature        ALA       h@h       LH          NfN         SDV: TC
Data Entry     single    single    multi       multi       single
Review         Y         Y         N           N           Y
Open source    Y         N         N           Y           ?
Mobile         Partial   N         N           N           N
PM + Admin     Y         N         ?           N           Y
Georef tool    Y         N         N           N           ?
Projects       232       18**      30          4           139
Community      835       419       200+        6,721       340+
Contributions  128,135   145,574   1,365,200   1,025,033   ?
Platform age   4 years   7 years   3 years     2 years     2 years

Statistics gathered on 08/01/2014 unless otherwise stated in notes

Platform age is rounded up

Notes (annotations by Laurence Livermore):

• Underestimate - based on most popular project only

• Intention to release as open source

• Available upon request

• Partially available as open source code

• Underestimate - based on most popular project only

• Filtered on "Biodiverse planet"
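
The "Data Entry" row distinguishes single from multi entry. Assuming "multi" means that several volunteers transcribe the same item and their answers are then reconciled, one simple way to do that is a majority vote. The sketch below is illustrative only: the function names, the normalisation rule and the 50% threshold are assumptions, not the method used by any of the platforms compared.

```python
from collections import Counter


def normalise(text: str) -> str:
    """Collapse whitespace and case so trivial differences do not split the vote."""
    return " ".join(text.split()).casefold()


def reconcile(transcriptions, min_agreement=0.5):
    """Return (consensus, agreement) for one field transcribed several times.

    consensus is None when no answer is shared by more than
    min_agreement of the volunteers (an assumed threshold).
    """
    if not transcriptions:
        return None, 0.0
    counts = Counter(normalise(t) for t in transcriptions)
    winner, votes = counts.most_common(1)[0]
    agreement = votes / len(transcriptions)
    if agreement <= min_agreement:
        return None, agreement
    # Report the first original spelling that matches the winning answer.
    original = next(t for t in transcriptions if normalise(t) == winner)
    return original, agreement


# Hypothetical locality field entered by three volunteers.
entries = ["Kew Gardens", "kew  gardens", "Kew Garden"]
consensus, agreement = reconcile(entries)
print(consensus, round(agreement, 2))
```

Fields that fail to reach agreement would typically be routed to a reviewer, which is where the "Review" row of the comparison becomes relevant.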

NHM Case Studies – Science Uncovered

• 3 weeks to make prototype (1 dev)

• AngularJS, Node.js, MongoDB (open source)

• Images from Flickr

• Live imaging on the night

• Showcased entire digitisation process from collection to Data Portal

• Dataset: http://data.nhm.ac.uk/dataset/crowdsourcing-the-collection

• Stats: http://data.nhm.ac.uk/dataset/crowdsourcing-the-collection/resource/07555c45-ed3f-4178-83a4-dfa0144e35d2?view_id=59d600c4-5539-42ad-8435-a408f724f246

• Demo available from: http://su2014.benscott.co.uk/

NHM Case Studies – Notes from Nature

• Led by Tim Conyers and Robert Prys-Jones

• Bird register project – initial test project for NfN

• 2,950 pages

• 315,785 transcriptions

• 75% of transcriptions by 1 volunteer!

• Project page: http://www.notesfromnature.org/#/archives/ornithological

• Contributor stats: http://data.nhm.ac.uk/dataset/notes-from-nature/resource/7f8fc5f5-90ae-4959-b286-9cb7951f2875?view_id=ce329dfd-99cb-4223-b615-ce95d6c707c7

• Collaboration with Oxford, Leicester, Royal Society, RCS

• Project that will help to advance and inform NHM crowdsourcing

• Developing two new projects on Zooniverse platform (Spring 2015):

1. Images of nature within C19th periodicals (BHL) – CAHR & Leicester

2. Orchid phenology – AMC, Origins & Evolution Initiative & Oxford

Motivating the Crowd

• Understanding why volunteers participate in crowdsourcing endeavours and how to support, maintain and reward their involvement is central to success

• Narrative, tasks, supporting resources & feedback all affect participation

• Social aspects of crowdsourcing are critical and should not be ignored

• Motivations of participants vary and can be hard to determine

• Increasing number of studies, but biased coverage

• Report synthesises available evidence and relates this to effective project design

Initial decision to participate:

– Enthusiasm and interest in project topic

– Desire to record, find and discover

– Learning and development of new skills

– Contribution to the greater good (society/science)

– Sense of purpose and belonging to a community (social)

Maintaining volunteer participation (reward mechanisms):

– Rapid feedback

– Discussion with scientists and other contributors (forums)

– Opportunity to develop skills and project responsibility (e.g. transcription to verification)

– Acknowledging contributions made

– Gamification (stats, leaderboards and badges)
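
As a rough illustration of the stats and leaderboards mentioned under gamification, the sketch below counts contributions per volunteer and reports how much of the total comes from the single most active contributor. The contribution log and volunteer identifiers are hypothetical, and the functions are an assumed minimal design rather than any platform's implementation.

```python
from collections import Counter


def leaderboard(contributions, top_n=3):
    """Return the top_n (volunteer, count) pairs, most active first."""
    return Counter(contributions).most_common(top_n)


def top_contributor_share(contributions):
    """Fraction of all contributions made by the single most active volunteer."""
    counts = Counter(contributions)
    if not counts:
        return 0.0
    return counts.most_common(1)[0][1] / sum(counts.values())


# Hypothetical log: one entry per completed transcription.
log = ["vol_a"] * 6 + ["vol_b"] + ["vol_c"]

print(leaderboard(log, top_n=2))
print(top_contributor_share(log))
```

A high top-contributor share is a sign that a project depends on very few people, which is worth monitoring alongside the leaderboard itself.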

Report conclusions: Benefits of Crowdsourcing

• A stronger online presence/brand

• Increased rate of collections digitisation, hence access to data

• Higher scientific output

• An effective way of engaging (dispersed) members of the public

• Deeper and more meaningful engagement with our collections

Report conclusions: project choice and design

• Clear project rationale with both cultural and scientific benefits

• Projects should be actively promoted and monitored

• Scientists should be visible and engaged with volunteers

• Develop best practice for motivating and retaining volunteers (self-establishing community structure and forum, good science, tasks of interest, different rewards etc.)

• Platform should use existing data standards – reduce bottleneck for collections management ingestion

• Resulting data should be freely available – projects do not end when all tasks are complete!
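
To make the data-standards point concrete, the sketch below maps a platform's transcription fields onto Darwin Core terms before ingestion. Darwin Core is used here only as an assumed example of an existing standard; the slides do not name one. The platform field names and the sample record are hypothetical.

```python
# Assumed mapping from hypothetical platform field names to Darwin Core terms.
FIELD_TO_DWC = {
    "collector": "recordedBy",
    "date_collected": "eventDate",
    "place": "locality",
    "species": "scientificName",
    "specimen_no": "catalogNumber",
}


def to_darwin_core(record):
    """Split a transcription record into mapped terms and leftover fields.

    Leftover fields are returned separately so that nothing is silently
    dropped before import into a collections management system.
    """
    mapped, unmapped = {}, {}
    for field, value in record.items():
        term = FIELD_TO_DWC.get(field)
        if term is None:
            unmapped[field] = value
        else:
            mapped[term] = value
    return mapped, unmapped


# Hypothetical transcription output for one specimen label.
record = {
    "collector": "A. Collector",
    "place": "Example Wood",
    "volunteer_note": "label partly torn",
}
mapped, unmapped = to_darwin_core(record)
print(mapped)
print(unmapped)
```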

Recommended Areas of Organisational Investment

• Technical infrastructure (e.g. software, hardware and developers)

• Communication, outreach and support (e.g. dedicated staff time to develop and provide feedback to an external community, internal project manager and scientists)

• Strategic project selection (e.g. strong narrative, potential scientific outputs, public appeal, well-structured tasks of known complexity)

• Preparation of underlying data (e.g. data for autocomplete fields such as collector names or localities)

• Post-processing of data and subsequent import into institutional collections management system
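
The preparation of underlying data for autocomplete fields can be sketched as a sorted index with prefix lookup. This is a minimal example under stated assumptions: the collector names are made up, and a production system would more likely query a database or search service than an in-memory list.

```python
from bisect import bisect_left


def build_index(names):
    """Clean, de-duplicate and sort names as (casefolded, original) pairs."""
    cleaned = {n.strip() for n in names if n.strip()}
    return sorted((n.casefold(), n) for n in cleaned)


def suggest(index, prefix, limit=5):
    """Return up to `limit` names starting with `prefix`, ignoring case."""
    key = prefix.strip().casefold()
    if not key:
        return []
    # (key,) sorts before every (folded, original) pair whose folded name
    # starts with key, so this finds the first candidate match.
    start = bisect_left(index, (key,))
    matches = []
    for folded, original in index[start:]:
        if not folded.startswith(key) or len(matches) == limit:
            break
        matches.append(original)
    return matches


# Hypothetical collector names, including a blank and a padded duplicate.
collectors = ["Smythe, A.", "Jones, M.", "Smith, J.", "  Smith, J. ", ""]
index = build_index(collectors)
print(suggest(index, "sm"))
```

Cleaning and de-duplicating the list up front matters as much as the lookup: volunteers can only pick a consistent value if the suggestions themselves are consistent.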

Next steps? Discussion…

• Investigate platforms and differentiators (technical, sustainability, control)

• Consider options for implementation

• Create list of potential projects

• Funding potential

• What is the future of crowdsourcing? Can the “crowd” perform research-orientated activities?
