CrowdQ: Crowdsourced Query Understanding. Gianluca Demartini, Beth Trushkowsky, Tim Kraska, Michael J. Franklin

May 10, 2015

Conference talk at CIDR 2013, January 2013, Asilomar, CA, USA.
Transcript
Page 1: CrowdQ: Crowdsourced Query Understanding

CrowdQ: Crowdsourced Query Understanding

Gianluca Demartini, Beth Trushkowsky, Tim Kraska, Michael J. Franklin

Page 2: CrowdQ: Crowdsourced Query Understanding

Scenario

Find the birthdate of the mayor of the capital city of France

Page 3: CrowdQ: Crowdsourced Query Understanding

Page 4: CrowdQ: Crowdsourced Query Understanding

Page 5: CrowdQ: Crowdsourced Query Understanding

Page 6: CrowdQ: Crowdsourced Query Understanding

Page 7: CrowdQ: Crowdsourced Query Understanding

Motivation

•  Web search engines can answer simple factual queries directly on the result page

•  Users with complex information needs are often unsatisfied

•  Purely automatic techniques are not enough

•  We want to solve this with crowdsourcing!

Page 8: CrowdQ: Crowdsourced Query Understanding

Background

•  Crowdsourcing has so far been used for data processing
   – DB/SemWeb: data integration and cleaning
   – IR: relevance judgments

We use the crowd to understand the query

Page 9: CrowdQ: Crowdsourced Query Understanding

CrowdQ

•  CrowdQ is the first system that uses crowdsourcing to
   – Understand the intended meaning
   – Build a structured query template
   – Answer the query over Linked Open Data

Page 10: CrowdQ: Crowdsourced Query Understanding

Page 11: CrowdQ: Crowdsourced Query Understanding

[Figure: CrowdQ architecture diagram. Panel titles: On-line Complex Query Processing; Off-line Complex Query Decomposition. Components shown: User, Keyword Query, Complex query classifier (with Y/N branches), Vertical selection, Unstructured Search, POS + NER tagging, Match with existing query templates, Query Template Index, Structured LOD Search, Structured Query, Result Joiner, Answer Composition, SERP, Query Log, Crowdsourcing Platform, Crowd Manager, Template Generation (t1, t2, t3), Queries, Templ + Answer Types, LOD Open Data Cloud.]

CrowdQ Architecture

Off-line: query template generation with the help of the crowd
On-line: query template matching using NLP and search over open data
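The on-line step, matching an incoming query against existing templates, can be illustrated with a small sketch. This is not the CrowdQ implementation: the tag names, the `TOY_TAGS` lookup that stands in for POS + NER tagging, the `TEMPLATE_INDEX` dictionary, and the template identifier are all assumptions made for the example.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical stand-in for POS + NER tagging: a fixed lookup table.
TOY_TAGS: Dict[str, str] = {
    "birthdate": "NOUN",
    "actors": "NOUN",
    "forrest gump": "ENTITY",
}

# Hypothetical template index, assumed to be filled off-line.
# Key: tag pattern of a query. Value: an opaque template identifier.
TEMPLATE_INDEX: Dict[Tuple[str, ...], str] = {
    ("NOUN", "NOUN", "ENTITY"): "attribute-of-related-entities",
}


def tag_pattern(chunks: Sequence[str]) -> Tuple[str, ...]:
    """Map each query chunk to its tag; unknown chunks become OTHER."""
    return tuple(TOY_TAGS.get(chunk.lower(), "OTHER") for chunk in chunks)


def match_template(chunks: Sequence[str]) -> Optional[str]:
    """Return the template for this query, or None if no template matches."""
    return TEMPLATE_INDEX.get(tag_pattern(chunks))


print(match_template(["birthdate", "actors", "forrest gump"]))
print(match_template(["weather", "paris"]))
```

A `None` result marks a query for which no template exists yet. What happens next is left to the caller in this sketch; one option would be to record the query for off-line template generation.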

Page 12: CrowdQ: Crowdsourced Query Understanding

Hybrid Human-Machine Pipeline

Q = birthdate of actors of forrest gump

Query annotation: birthdate (Noun), actors (Noun), forrest gump (Named entity)

Verification: Is forrest gump this entity in the query?

Entity relations: Which is the relation between actors and forrest gump? Answer: starring

Schema element: starring, mapped to <dbpedia-owl:starring>

Verification: Is the relation between
   – Indiana Jones – Harrison Ford
   – Back to the Future – Michael J. Fox
of the same type as Forrest Gump – actors?
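The verification steps above ask several crowd workers the same yes/no question, so their answers have to be combined somehow. The slides do not say how CrowdQ does this. The sketch below assumes simple majority voting with an agreement threshold; the function name `majority_answer` and the `min_agreement` parameter are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def majority_answer(answers: Iterable[str], min_agreement: float = 0.5) -> Optional[str]:
    """Return the most frequent answer if its share exceeds min_agreement.

    Returns None when there are no answers or no answer is frequent enough.
    """
    counts = Counter(answer.strip().lower() for answer in answers)
    total = sum(counts.values())
    if total == 0:
        return None
    answer, votes = counts.most_common(1)[0]
    return answer if votes / total > min_agreement else None


# Hypothetical worker answers to "Is forrest gump this entity in the query?"
print(majority_answer(["yes", "Yes", "no"]))
```

Returning `None` on a tie or an empty batch lets the caller ask more workers rather than accept an unreliable answer.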

Page 13: CrowdQ: Crowdsourced Query Understanding

Structured query generation

SELECT ?y ?x
WHERE { ?y <dbpedia-owl:birthdate> ?x .
        ?z <dbpedia-owl:starring> ?y .
        ?z <rdfs:label> 'Forrest Gump' }
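Once the attribute, the relation, and the entity label have been verified, the query above can be assembled by filling slots in a fixed pattern. The sketch below is only an illustration of that idea: the function `build_query` and its parameter names are assumptions, and the identifiers are written exactly as they appear on the slide rather than as full IRIs.

```python
def build_query(attribute: str, relation: str, entity_label: str) -> str:
    """Fill the slide's query pattern with an attribute, a relation, and a label."""
    # Escape backslashes and single quotes so the label stays a single literal.
    safe_label = entity_label.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT ?y ?x\n"
        f"WHERE {{ ?y <{attribute}> ?x .\n"
        f"        ?z <{relation}> ?y .\n"
        f"        ?z <rdfs:label> '{safe_label}' }}"
    )


query = build_query("dbpedia-owl:birthdate", "dbpedia-owl:starring", "Forrest Gump")
print(query)
```

Keeping the pattern separate from the slot values is what would let one crowd-built template serve other queries of the same shape, for example a different movie title.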

Results from BTC09:

Q = birthdate of actors of forrest gump

[Figure: result listing not recoverable from the transcript; surviving labels: MOVIE, MOVIE]

Page 14: CrowdQ: Crowdsourced Query Understanding

Current Status

•  Realize the vision
•  Running demo:
   – Daniel Haas, Daniel Bruckner, Jonathan Harper

•  Next steps
   – Evaluation of crowd effectiveness at each step
   – Comparison of hybrid vs. machine pipeline

Page 15: CrowdQ: Crowdsourced Query Understanding

Conclusions

•  CrowdQ: a hybrid approach to complex query understanding

•  Combines techniques from DB, NLP, IR, Data Mining, and Human Intelligence

•  Initial experiments show the potential of CrowdQ