COMMUNITY PROFILING FOR CROWDSOURCING QUERIES
Khalid Belhajjame 1, Marco Brambilla 2, Daniela Grigori 1, Andrea Mauri 2
1 PSL, Paris-Dauphine University, LAMSADE, France
2 Politecnico di Milano, Italy
Jul 13, 2015
Traditional vs Community Crowdsourcing
• General structure:
• the requestor poses some questions
• a wide set of responders, typically unknown to the requestor, is in charge of providing answers
• the system organizes a response collection campaign
• Traditional Crowdsourcing
• Cost – Quality Tradeoff
• Complex results aggregation
• Community Crowdsourcing
• Matching the task to the “correct” group of workers
Monday 15th September 2014, Community Profiling for Crowdsourcing Queries
Community
A set of people that share
• Interests
• Features
…or belong to a
• common entity
• social network
Leveraging communities
• Why?
• Experts
• More engaged
• How?
• Determine the communities of performers
• Target the correct community
• Monitor them, taking into account the behavior of their members
The approach
• Models
• Query Model
• Community Model
• Matching strategies
• Keyword-based
• Semantic-based
Query Model
• Textual description of the task
• Examples of query and responses
• Knowledge needed
• Prior knowledge (a knowledge base) that can be used to partially answer the query or to identify potential answers.
• Type of the task
• Unary: tag, classify, like, …
• N-ary: match, cluster, …
• Objects
• Kind, description, text, metadata …
• Temporal
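The elements above can be captured in a simple data structure. A minimal sketch; the class and field names are illustrative, not taken from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class CrowdQuery:
    """Sketch of the query model (field names are our own)."""
    description: str                               # textual description of the task
    examples: list = field(default_factory=list)   # example queries and responses
    knowledge: list = field(default_factory=list)  # prior knowledge usable for partial answers
    task_type: str = "unary"                       # "unary" (tag, classify, like) or "n-ary" (match, cluster)
    objects: list = field(default_factory=list)    # objects to evaluate (kind, description, metadata)
    deadline: str = ""                             # temporal constraint, if any

q = CrowdQuery(description="Classify images returned for a professor's name",
               task_type="unary",
               objects=["img1.jpg", "img2.jpg"])
print(q.task_type, len(q.objects))  # unary 2
```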
Community Model
• Textual description of the community
• name, web page, …
• Type of the community
• Explicit: Statically existing and consolidated
• Implicit: Dynamically built on demand
• Definition
• Intentional: defined by a property
• Extensional: list of members
• Both
• Grouping factor
• Friendship, interest, location, expertise, affiliation
Community Model
• Content
• Produced by the people of the community
• Members’ profiles
• Explicit
• Implicit
• Communication channel
• Email, Facebook, LinkedIn, Twitter, blogs or web sites (reviews, expert sites), AMT
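The community model can be sketched analogously; again the attribute names are illustrative assumptions, not the paper's notation:

```python
from dataclasses import dataclass, field

@dataclass
class Community:
    """Sketch of the community model (attribute names are our own)."""
    name: str
    ctype: str                                     # "explicit" (pre-existing) or "implicit" (built on demand)
    definition: str                                # "intensional" (a property), "extensional" (member list), or both
    grouping: str                                  # friendship, interest, location, expertise, affiliation
    members: set = field(default_factory=set)      # extensional part, if any
    channels: list = field(default_factory=list)   # email, Facebook, Twitter, AMT, ...
    content: list = field(default_factory=list)    # content produced by the members

db_group = Community(name="DB group", ctype="explicit", definition="extensional",
                     grouping="affiliation", members={"prof1", "prof2"},
                     channels=["email"])
print(db_group.ctype, len(db_group.members))  # explicit 2
```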
Relations between Communities
• Subsumption
• A given community contains another community
• e.g. the community of sport fans contains the community of soccer fans
• Similarity
• Two communities refer to similar expertise or topic
• e.g. experts in classical music and experts in opera
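For extensionally defined communities, both relations can be checked directly on member sets. A sketch under that assumption; using Jaccard overlap as the similarity measure is our choice, not prescribed by the slides:

```python
def subsumes(parent: set, child: set) -> bool:
    """Subsumption: every member of the child community belongs to the parent."""
    return child <= parent

def similarity(a: set, b: set) -> float:
    """Jaccard overlap between two member sets (one possible similarity measure)."""
    return len(a & b) / len(a | b) if a | b else 0.0

sport_fans = {"ann", "bob", "carl", "dora"}
soccer_fans = {"bob", "carl"}
print(subsumes(sport_fans, soccer_fans))    # True: sport fans contains soccer fans
print(similarity(sport_fans, soccer_fans))  # 0.5: they share 2 of 4 distinct members
```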
Matching
• Keyword-based
• Communities and the query are treated as bags of words
• Requires indexing
• Semantic-based
• Communities and the query are mapped to concepts
• Requires semantic annotation
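A toy illustration of the keyword-based strategy: the query and each community description are treated as bags of words and scored by word overlap. The scoring function is our simplification; a real deployment would use an inverted index with TF-IDF weighting:

```python
import re
from collections import Counter

def bag(text: str) -> Counter:
    """Bag-of-words representation of a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def score(query: str, community_text: str) -> int:
    """Number of shared word occurrences between the query and a community."""
    q, c = bag(query), bag(community_text)
    return sum(min(q[w], c[w]) for w in q)

# hypothetical community descriptions
communities = {
    "DB group": "databases query processing data integration",
    "AI group": "machine learning planning reasoning",
}
query = "evaluate query results over databases"
best = max(communities, key=lambda name: score(query, communities[name]))
print(best)  # DB group
```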
Community Control
Community control consists in adapting the crowdsourcing campaign to the behavior of the community
• Task / Object allocation (granularity)
• Static / Dynamic
SOCM’14, Monday, April 7
CrowdSearcher
A prototype that allows the definition, execution and control of a crowdsourcing campaign
http://crowdsearcher.search-computing.org/
Example (dynamic control)
[Data model diagram: µTObjExecution, Task, Object, Performer and Community entities, each paired with a control table (Task Control, Object Control, Performer Control, Community Control) holding attributes such as Status, Score, Enabled, completed objects/executions and timestamps]

Control rule (event / condition / action):
e: AFTER UPDATE FOR µTObjExecution
c: CommunityControl[CommunityID == NEW.CommunityID].score <= 0.5
   CommunityControl[CommunityID == NEW.CommunityID].eval = 10
a: SET CommunityControl[CommunityID == DB-Group].Enabled = true
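In reactive terms, the rule fires after each micro-task execution update: when the executing community's quality score drops to 0.5 or below, a fallback community (DB-Group in the example) is enabled. A minimal sketch; the data structures and the moving-average score update are our own assumptions, not CrowdSearcher's actual implementation:

```python
# hypothetical control tables, one row per community
community_control = {
    "AI-Group": {"score": 1.0, "enabled": True},
    "DB-Group": {"score": 1.0, "enabled": False},
}

def after_update(community_id: str, correct: bool) -> None:
    """Event: fired after each micro-task execution update."""
    ctrl = community_control[community_id]
    # running quality score via exponential moving average (our choice of metric)
    ctrl["score"] = 0.8 * ctrl["score"] + 0.2 * (1.0 if correct else 0.0)
    # condition: the community's score fell to 0.5 or below
    if ctrl["score"] <= 0.5:
        # action: enable the fallback community, as in the slide's rule
        community_control["DB-Group"]["enabled"] = True

for correct in [False, False, False, False]:  # four wrong evaluations from AI-Group
    after_update("AI-Group", correct)
print(community_control["DB-Group"]["enabled"])  # True (score fell to 0.4096)
```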
Experiment
• 16 professors within two research groups in our department (DB and AI groups)
• The top 50 images returned by the Google Image API for each query
• Each expert has to evaluate 5 images at a time
• Results are accepted when enough agreement on the class of the image is reached
• Evaluated objects are removed from new executions
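The acceptance criterion ("enough agreement on the class of the image") can be sketched as a simple majority rule; the vote threshold and agreement share here are our assumptions:

```python
from collections import Counter

def accepted(votes: list, min_votes: int = 3, min_share: float = 0.6):
    """Return the agreed class if enough evaluators converge, else None."""
    if len(votes) < min_votes:
        return None  # not enough evaluations yet
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_share else None

print(accepted(["relevant", "relevant", "not relevant"]))  # relevant (2/3 >= 0.6)
print(accepted(["relevant", "not relevant"]))              # None (too few votes)
```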
Communities
The communities:
• the research group of the professor,
• the research area containing the group (e.g. Computer Science)
• and the whole department (which accounts for more than 600 people in different areas)
Invitations are sent:
• inside-out: we started with invitations to experts, i.e. people in the same groups as the professor (DB and AI), then expanded invitations to Computer Science, then to the whole Department, and finally to open social networks (Alumni and PhD communities on Facebook and LinkedIn);
• outside-in: we proceeded in the opposite way, starting with the Department members, then restricting to Computer Scientists, and finally to the group's members.
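The two strategies only differ in the order in which the nested communities are contacted. A sketch, with wave names taken from the slide:

```python
def invitation_waves(strategy: str) -> list:
    """Order in which the nested communities are invited (names from the slide)."""
    if strategy == "inside-out":  # experts first, then broaden
        return ["research group", "research area", "department", "social network"]
    if strategy == "outside-in":  # broadest first, then narrow down (no social-network wave)
        return ["department", "research area", "research group"]
    raise ValueError(f"unknown strategy: {strategy}")

print(invitation_waves("inside-out")[0])   # research group
print(invitation_waves("outside-in")[0])   # department
```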
Number of performers per community
[Chart: number of performers over time (July 18-28, 2013) for each community: research group, research area, department, social network, and total. Annotations: 46%, 24%, 16%, 9 / "a lot"]
Precision of performers per community
[Chart: precision vs. number of evaluations (0-3000) for each community: research group, research area, department, social network, and total]
Precision of the evaluated objects
• Precision decreases for less expert communities
• The inside-out strategy (from expert to generic users) outperforms the outside-in strategy (from generic to expert users)
[Chart: precision vs. number of closed objects (0-800) for the main experiment (inside-out) and the reverse-invitation experiment (outside-in)]
General observations
A given community of workers can be broken down into
(possibly overlapping) sub-communities with different
expertise
Experts from a community feel more engaged with the task
• They are more demanding with respect to the quality of the application UI and of the evaluated objects
• They provide feedback on the application, the questions and the evaluated objects
• “How is it possible that this image is related to me?!”
Conclusions
• Communities can be effectively used for tasks that require domain expertise
• Open issues:
• How to deal with tasks requiring multiple areas of expertise
• How to build a knowledge base that allows profiling of both communities and queries in an optimal way
• How to cope with the dynamics over time of
• communities and tasks (changing needs)
• communities and worker expertise
Thanks for your attention
Any Questions?
http://crowdsearcher.search-computing.org/
Contacts
Khalid Belhajjame [email protected]
Marco Brambilla [email protected]
Daniela Grigori [email protected]
Andrea Mauri [email protected]
References
• Alessandro Bozzon, Marco Brambilla, Stefano Ceri, Andrea Mauri, Riccardo Volonterio. 2014. Pattern-Based Specification of Crowdsourcing Applications. In Proceedings of the 14th International Conference on Web Engineering (ICWE 2014), 218-235.
• Marco Brambilla, Stefano Ceri, Andrea Mauri, Riccardo Volonterio. 2014. Community-based Crowdsourcing. In The 2nd International Workshop on the Theory and Practice of Social Machines. Proceedings of the 23rd International Conference on World Wide Web (Companion Volume).
• Alessandro Bozzon, Marco Brambilla, Stefano Ceri, Andrea Mauri. 2013. Reactive Crowdsourcing. In Proceedings of the 22nd International Conference on World Wide Web (WWW 2013).
• Alessandro Bozzon, Marco Brambilla, Stefano Ceri, Matteo Silvestri, Giuliano Vesci. 2013. Choosing the right crowd: expert finding in social networks. In Proceedings of the 16th International Conference on Extending Database Technology (EDBT 2013). ACM, USA, 637-648.
• Alessandro Bozzon, Marco Brambilla, and Stefano Ceri. 2012. Answering search queries with CrowdSearcher. In Proceedings of the 21st International Conference on World Wide Web (WWW '12). ACM, New York, NY, USA, 1009-1018.
• Alessandro Bozzon, Marco Brambilla, Andrea Mauri. 2012. A Model-Driven Approach for …