Crowdsourcing is for the tail Gianluca Demartini eXascale Infolab University of Fribourg, Switzerland gianlucademartini.net exascale.info
Crowdsourcing is for the tail

May 10, 2015

Talk given at the Dagstuhl Seminar 14282 "Crowdsourcing and the Semantic Web"
Transcript
Page 1: Crowdsourcing is for the tail

Crowdsourcing is for the tail

Gianluca Demartini
eXascale Infolab

University of Fribourg, Switzerland

gianlucademartini.net
exascale.info

Page 2: Crowdsourcing is for the tail

Crowdsourced Data Curation

• Enforce quality and coverage in knowledge bases (KBs)
• Curate the structured representation of tail entities
• Leverage the diversity of the crowd
• Targeted crowdsourcing

Page 3: Crowdsourcing is for the tail

The long tail of entity popularity

Page 4: Crowdsourcing is for the tail

Tail Entities

• Local restaurants
• Niche sport domains (chess, cricket)
• Emerging music bands
• Rare diseases
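
The head/tail distinction can be made concrete with a small sketch. The version below assumes that each entity has some popularity count (for example, page views or mentions) and treats as "head" the smallest set of top entities covering a chosen share of the total; the function name `split_head_tail`, the 80% threshold, and the counts are all illustrative assumptions, not taken from the talk.

```python
def split_head_tail(popularity, head_share=0.8):
    """Split entities into (head, tail) by cumulative popularity.

    Entities are ranked by count; the head is the smallest prefix whose
    counts cover at least `head_share` of the total. Everything else is tail.
    """
    ranked = sorted(popularity.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(popularity.values())
    head, tail, covered = [], [], 0
    for entity, count in ranked:
        if covered < head_share * total:
            head.append(entity)
            covered += count
        else:
            tail.append(entity)
    return head, tail


# Made-up counts, for illustration only.
popularity = {
    "famous_city": 9000,
    "pop_star": 5000,
    "local_restaurant": 40,
    "chess_club": 25,
    "emerging_band": 10,
    "rare_disease": 5,
}
head, tail = split_head_tail(popularity)
print("head:", head)
print("tail:", tail)
```

With these numbers, two entities account for almost all of the popularity, while most of the entities sit in the tail.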

Page 7: Crowdsourcing is for the tail

Improving Crowdsourcing Platforms

Page 8: Crowdsourcing is for the tail

Push Crowdsourcing

• Pick-A-Crowd: A system architecture that uses Task-to-Worker matching based on:
– The worker’s social profile
– The task context

• Workers can provide higher quality answers on tasks they relate to

Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux. Pick-A-Crowd: Tell Me What You Like, and I'll Tell You What to Do. In: 22nd International Conference on World Wide Web (WWW 2013), Rio de Janeiro, Brazil, May 2013.
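
As a rough illustration of task-to-worker matching, the sketch below scores each worker by the overlap between the topics in their profile and the keywords describing a task, then pushes the task to the best matches. This is not the model from the Pick-A-Crowd paper: the Jaccard overlap, the function names `match_score` and `rank_workers`, and the sample profiles are all assumptions chosen to keep the example short.

```python
def match_score(worker_likes, task_keywords):
    """Jaccard overlap between a worker's liked topics and a task's keywords."""
    likes = {t.lower() for t in worker_likes}
    keywords = {t.lower() for t in task_keywords}
    if not likes or not keywords:
        return 0.0
    return len(likes & keywords) / len(likes | keywords)


def rank_workers(workers, task_keywords, top_k=2):
    """Return up to `top_k` worker names with a non-zero score, best first."""
    scored = [
        (match_score(likes, task_keywords), name)
        for name, likes in workers.items()
    ]
    scored = [(score, name) for score, name in scored if score > 0]
    # Sort by descending score, breaking ties alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


# Hypothetical worker profiles (topics each worker likes).
workers = {
    "alice": ["cricket", "chess", "cooking"],
    "bob": ["indie music", "cricket"],
    "carol": ["gardening"],
}
# Hypothetical task context, expressed as keywords.
task = ["cricket", "players", "chess"]

ranked = rank_workers(workers, task)
print(ranked)
```

Workers with no overlap are never offered the task, which reflects the push model: tasks go to people likely to relate to them instead of waiting to be picked from a list.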

Page 9: Crowdsourcing is for the tail

Pick-A-Crowd

Page 10: Crowdsourcing is for the tail

Discussion

• Task-to-Worker recommendation / Matchmaking

• Experimental comparison with AMT shows a consistent quality improvement

“Workers Know what they Like”

Page 11: Crowdsourcing is for the tail

OpenTurk

• Yet another platform? Build on top of MTurk!
• Chrome Extension for push / notification
• 400+ users
• http://bit.ly/openturk-extension
• Open source: https://github.com/openturk/extension

Page 12: Crowdsourcing is for the tail

Transactive Search

Page 13: Crowdsourcing is for the tail

Transactive Search

• Transactive Memories
• Transactive Search:
– Memory reconstructed by a group of people
– Need to target the right people
– A form of Targeted Crowdsourcing

• “Who attended the ISWC 2013 conference?”

Page 14: Crowdsourcing is for the tail

Transactive Search

• Machines: Harvest the Web + Data Mining
• Crowd: Search Twitter, look at event pictures
• Transactive Memories: Remember who I met

Michele Catasta, Alberto Tonon, Djellel Eddine Difallah, Gianluca Demartini, Karl Aberer, and Philippe Cudré-Mauroux. Hippocampus: Answering Memory Queries using Transactive Search. In: 23rd International Conference on World Wide Web (WWW 2014), Web Science Track. Seoul, South Korea, April 2014.

Page 15: Crowdsourcing is for the tail

Who attended ISWC 2013?

Page 16: Crowdsourcing is for the tail

Conclusions

• Crowdsourcing For Tail Entities
• Focusing on the difficult part of the KB
– The tail is long!

• Challenges
– Which tail entities are valuable?
– Who is the right worker?
– Focus on passion rather than monetary incentives