Crowd Sourcing and Community Management Capabilities Available within Symbiota Data Portals
Nico Franz¹, Corinna Gries², Thomas Nash III² & Edward Gilbert¹
¹ School of Life Sciences, Arizona State University
² Center for Limnology, University of Wisconsin
TDWG 2013 Annual Conference, Florence, Italy
Building and Maintaining Crowd Sourcing Websites and Their Communities
October 29, 2013
Presentation Overview @ http://taxonbytes.org/tdwg-2013-crowd-sourcing-and-community-management-capabilities-with-symbiota/
Symbiota (http://symbiota.org/tiki/tiki-index.php) is an open source software platform designed to promote and facilitate collaboration among those working to document biodiversity. Symbiota has become increasingly popular in North America in recent years, due in part to its suitability for supporting large herbarium networks and NSF-sponsored Thematic Collections Networks (TCNs; see https://www.idigbio.org/content/thematic-collectionsnetworks). The specimen-based Content Management System (CMS) provides a shared platform that allows researchers to manage biological resources as an integrated network. Data management through a community-based system has allowed for the development of several features and workflows that make data entry more efficient while improving overall data integrity and quality. On-line data entry directly from an image of the specimen label allows for label transcription and error resolution that can call upon a global user community. A novel crowd sourcing feature in Symbiota offers collection managers the ability to submit specimen label images to a queue for group data entry by a volunteer task force. To improve efficiency and quality, the user interface incorporates Optical Character Recognition (OCR) and Natural Language Processing (NLP) capabilities, as well as duplicate and exsiccati record harvesting and real-time data validation. The duplicate clustering module groups duplicate specimen records across institutions. This removes the need to re-enter a previously processed specimen and makes it easier to locate and resolve misidentified specimens, viz. by highlighting the most recent annotation events within a cluster. As an additional review step, collections can opt to allow registered users to fix basic errors if and when they encounter them. Collection managers have the ability to review, approve, or revert such edits.
Several other novel community features are available through Symbiota, including an integrated loan management module and pre-accessioned data entry by the original collector. We will demonstrate and discuss these features, their underlying concepts, implementation, utility, and future steps to further augment the community of contributing users.
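The duplicate clustering idea described above can be illustrated with a minimal Python sketch. It is not Symbiota's implementation: the matching key (collector last name, collector number, collection date) and all field names such as `annotation_date` are assumptions made for this example only.

```python
from collections import defaultdict


def duplicate_key(record):
    """Build a matching key; the choice of fields is an assumption of this sketch."""
    last_name = record["collector"].strip().lower().split()[-1]
    return (last_name, record["number"].strip(), record["date"])


def cluster_duplicates(records):
    """Group records that share a key and keep only groups with 2+ members."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[duplicate_key(rec)].append(rec)
    return [group for group in clusters.values() if len(group) > 1]


def latest_annotation(cluster):
    """Return the record with the most recent annotation (ISO dates), if any."""
    annotated = [r for r in cluster if r.get("annotation_date")]
    return max(annotated, key=lambda r: r["annotation_date"], default=None)


# Hypothetical records held by three different institutions.
records = [
    {"inst": "A", "collector": "J. Doe", "number": "1234", "date": "1975-06-01",
     "name": "Species one", "annotation_date": "1980-01-15"},
    {"inst": "B", "collector": "Doe", "number": "1234", "date": "1975-06-01",
     "name": "Species two", "annotation_date": "2011-03-02"},
    {"inst": "C", "collector": "R. Roe", "number": "7", "date": "1990-05-05",
     "name": "Species three", "annotation_date": None},
]

clusters = cluster_duplicates(records)
newest = latest_annotation(clusters[0])
```

In this sketch the two "Doe 1234" sheets fall into one cluster, and the record annotated in 2011 is the one a reviewer would be pointed to first.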
Transcript
Crowd Sourcing and Community Management
Capabilities Available within Symbiota Data Portals
Nico Franz¹, Corinna Gries², Thomas Nash III² & Edward Gilbert¹
¹ School of Life Sciences, Arizona State University ² Center for Limnology, University of Wisconsin
TDWG 2013 Annual Conference, Florence, Italy Building and Maintaining Crowd Sourcing Websites and Their Communities
Login to each member portal is simple, requiring no special rights.
"Annotate the Harriman Alaska Expedition"
"Transcribe the ALCAN Expedition"
"Create your own…" ** Instant feedback on data volume.
Listing of pending "Harriman" records; each Symbiota ID is clickable to edit.
• The LBCC digitization workflow pipeline has produced a "skeletal record", including:
  • Record GUID
  • Thesaurus-ratified Scientific Name (not editable)
  • OCR of voucher locality label image
• "Parse OCR (LBCC)" [a custom LBCC program] will get the transcription process underway.
LBCC Crowd Sourcing Central – Record 1545184
1. Initial "Parse OCR" outcome (issues with lat/long transcription)
2. Correction of the parse using Symbiota tools (e.g. GeoLocate)
3. Approaching a clean record¹ transcription, ready for saving
¹ DwC Class: CleanRecord – Utter these two words in front of a TDWG audience, then immediately prepare to… [remainder not yet ratified].
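To make the lat/long issue in step 1 concrete, here is a minimal Python sketch of pulling degree-minute-second coordinates out of OCR text. It is not the "Parse OCR (LBCC)" program, whose internals are not shown in the slides; the regular expression, the function names, and the sample label strings are all assumptions for illustration.

```python
import re

# Assumed label style: 61°13'05" N, 149°54'01" W (seconds optional).
DMS = re.compile(
    r"""
    (\d{1,3})\s*[°o*]\s*                 # degrees; OCR may misread the degree sign as o or *
    (\d{1,2})\s*['’′]\s*                 # minutes
    (?:(\d{1,2}(?:\.\d+)?)\s*["”″]\s*)?  # optional seconds
    ([NSEW])                             # hemisphere letter
    """,
    re.VERBOSE | re.IGNORECASE,
)


def dms_to_decimal(deg, minutes, seconds, hemi):
    """Convert one degree-minute-second reading to signed decimal degrees."""
    value = int(deg) + int(minutes) / 60 + float(seconds or 0) / 3600
    return -value if hemi.upper() in "SW" else value


def parse_lat_long(ocr_text):
    """Return (lat, lon) in decimal degrees, or None if the parse looks unreliable."""
    lat = lon = None
    for deg, minutes, seconds, hemi in DMS.findall(ocr_text):
        value = dms_to_decimal(deg, minutes, seconds, hemi)
        if hemi.upper() in "NS":
            lat = value
        else:
            lon = value
    if lat is None or lon is None:
        return None
    # Simple validation: out-of-range values are left for manual correction.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return round(lat, 5), round(lon, 5)


clean = parse_lat_long("Alaska. 61°13'05\" N, 149°54'01\" W")
misread = parse_lat_long("Alaska. 61o13'N 149o54'W")
missing = parse_lat_long("Alaska, near the glacier")
```

A `None` result stands for the situation in step 2: the automatic parse is not trusted, and the volunteer corrects the coordinates by hand or with a georeferencing tool.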
Crowd Sourcing Central – Score Board *
Options to review one's submitted records and review points assigned (by the collection's manager).
* See also Appendix I.
Crowd Sourcing Central – User's Review Pages *
My pending records with LBCC.
My 2/4 approved records/points.
* See also Appendix III.
Crowd Sourcing Central – Collection Manager's Control Panel
* See also Appendix II.
4215 newly digitized records are available for addition to the queue.
25 submissions pending.
Crowd Sourcing Central – Collection Manager's Review Pages
* See also Appendix III.
2 points = default score. Specific feedback possible.
• A key purpose of the LBCC portal CS entry environment is to create a user experience that is personalized.
• Special expeditions are a subset of the records queue for CS data entry, and are identified as being part of a "special group/theme" of specimens.
• Expeditions are meant to educate those who are performing the data entry about a specific event.
• They also aid data entry because the user generally deals with a homogeneous type of label format, as opposed to shifting between numerous layout types.
• User input and managerial control (review, feedback, scoring) are interactively facilitated in the same Crowd Sourcing module implemented in Symbiota.
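The submit, review, and score cycle summarized above can be sketched in a few lines of Python. This is an illustrative model only, not the Symbiota module: the class names, status values, and method names are hypothetical, and only the default score of 2 points comes from the slides.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_POINTS = 2  # default score per approved record, as stated in the slides


@dataclass
class QueueRecord:
    record_id: int
    status: str = "open"            # assumed states: open -> pending -> approved
    volunteer: Optional[str] = None
    points: int = 0
    feedback: str = ""


class CrowdSourcingQueue:
    def __init__(self):
        self.records = {}

    def add(self, record_id):
        """Collection manager adds a newly digitized record to the queue."""
        self.records[record_id] = QueueRecord(record_id)

    def submit(self, record_id, volunteer):
        """Volunteer saves a transcription; the record now awaits review."""
        rec = self.records[record_id]
        if rec.status != "open":
            raise ValueError(f"record {record_id} is not open for entry")
        rec.status, rec.volunteer = "pending", volunteer

    def review(self, record_id, points=DEFAULT_POINTS, feedback=""):
        """Manager approves a pending record, assigning points and feedback."""
        rec = self.records[record_id]
        if rec.status != "pending":
            raise ValueError(f"record {record_id} is not pending review")
        rec.status, rec.points, rec.feedback = "approved", points, feedback

    def pending(self):
        return [r.record_id for r in self.records.values() if r.status == "pending"]

    def score_board(self):
        """Map each volunteer to (approved records, total points)."""
        totals = {}
        for rec in self.records.values():
            if rec.status == "approved":
                approved, points = totals.get(rec.volunteer, (0, 0))
                totals[rec.volunteer] = (approved + 1, points + rec.points)
        return totals


queue = CrowdSourcingQueue()
for rid in (101, 102, 103):          # hypothetical record IDs
    queue.add(rid)
    queue.submit(rid, "volunteer_a")
queue.review(101)
queue.review(102, feedback="Check the collector number next time.")
board = queue.score_board()
```

With two of three submissions approved at the default score, the volunteer's tally is two records and four points, with one record still pending, which mirrors the kind of summary shown on the user's review page.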
Lichen, Bryophytes and Climate Change – CS in review
• TDWG 2013 Symposium organizers – Paul Kenneth Flemons
• Ben Brandt & John Brinda – LBCC software development
• Participating CNALH & CNABH collections
• NSF Award EF-1115116. "Digitization TCN – Collaborative Research: North American Lichens and Bryophytes: Sensitive Indicators of Environmental Quality and Change."