
CSCI5070 Advanced Topics in Social Computing

Community Question Answering

Irwin King

The Chinese University of Hong Kong

[email protected]

©2012 All Rights Reserved.


Outline

•  Introduction
•  Question Retrieval
•  Question Recommendation
•  Question Subjectivity Analysis
•  Content Quality Evaluation

The Chinese University of Hong Kong, CSCI 5070 Advanced Topics in Social Computing, Irwin King


Introduction


Yahoo! Answers


Stack Overflow


Advantages of CQA

•  Can address information needs that are personal, heterogeneous, specific, or open-ended, and that cannot be expressed as a short query

•  No single Web page will directly answer these complex and heterogeneous needs; CQA users should understand and answer them better than a machine

•  CQA sites have accumulated rich knowledge
–  More than one billion posted answers in Yahoo! Answers (http://yanswersblog.com/index.php/archives/2010/05/03/1-billion-answers-served/)
–  More than 190 million resolved questions in Baidu Zhidao
–  In China, 25% of Google's top search result pages contain at least one link to some Q&A site (Si et al., VLDB, 2010)


Covered Topics


QUESTION RETRIEVAL


Ask A Question


Problem and Opportunity

•  Problem
–  Askers need to wait some time to get an answer (time lag)
–  15% of the questions do not receive any answer in Yahoo! Answers, which is one of the first CQA sites on the Web

•  Opportunity
–  25% of questions in certain categories are recurrent (Anna, Gideon and Yoelle, WWW, 2012)

•  Answer new questions by reusing past resolved questions

•  Question Retrieval: find semantically similar past questions for a new question


Question Retrieval Example


Benefit of Question Retrieval

•  Provide an alternative to automatic question answering

•  Help askers get an answer in a timely manner

•  Guide answerers toward unique questions, making better use of users' passion for answering


Notations

Symbol   Description
Q        A new question
D        A candidate question
|·|      Length of the text
C        Background collection
w        A term in the new question
t        A term in a candidate question


Lexical-based Approach: Language Model

•  In language modeling, similarity between a query and a document is given by the probability of generating the query from the document language model

•  Unigram language model, i.i.d. sampling

•  In the question retrieval setting, the query is the new question and the document is a candidate question

$$P(Q \mid D) = \prod_{w \in Q} P(w \mid D)$$


Lexical-based Approach: Language Model

•  To avoid zero probabilities and estimate more accurate language models, documents are smoothed using a background collection

$$P(w \mid D) = (1 - \lambda)\, P_{ml}(w \mid D) + \lambda\, P_{ml}(w \mid C)$$

–  $\lambda$ is a smoothing parameter, $0 \le \lambda \le 1$
–  Maximum likelihood estimator to calculate $P_{ml}(\cdot)$

$$P_{ml}(w \mid D) = \frac{\mathrm{termfrequency}(w, D)}{\sum_{w' \in D} \mathrm{termfrequency}(w', D)}$$


Language Model Example

•  Query (q): revenue down

•  Document 1 (d1): xyzzy reports a profit but revenue is down

•  Document 2 (d2): quorus narrows quarter loss but revenue decreases further

•  λ = 0.5

•  Ranking: d1 > d2
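The ranking in this example can be checked with a short script. This is a minimal sketch, assuming that the background collection consists of only the two documents d1 and d2 and that text is tokenized on whitespace; the function names `p_ml` and `query_likelihood` are illustrative and not from the slides.

```python
from collections import Counter


def p_ml(word, tokens):
    """Maximum likelihood estimate: term frequency divided by text length."""
    return Counter(tokens)[word] / len(tokens)


def query_likelihood(query, doc, collection, lam=0.5):
    """P(Q|D) as a product over query terms of the smoothed P(w|D)."""
    score = 1.0
    for w in query:
        score *= (1 - lam) * p_ml(w, doc) + lam * p_ml(w, collection)
    return score


q = "revenue down".split()
d1 = "xyzzy reports a profit but revenue is down".split()
d2 = "quorus narrows quarter loss but revenue decreases further".split()
collection = d1 + d2  # assumption: the collection is just these two documents

score_d1 = query_likelihood(q, d1, collection)  # 3/256
score_d2 = query_likelihood(q, d2, collection)  # 1/256
```

Under these assumptions d1 scores three times higher than d2, which agrees with the ranking d1 > d2: "down" never occurs in d2, so only the smoothing term keeps its probability above zero.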


Lexical-based Approach: Translation Model

•  Language Model
–  Advantage: Simple
–  Disadvantage: Lexical Gap

•  Lexical gap: two questions that have the same meaning may use very different wording
–  Is downloading movies illegal?
–  Can I share a copy of a DVD online?

•  Jiwoon Jeon, W. Bruce Croft and Joon Ho Lee, Finding Similar Questions in Large Question and Answer Archives, CIKM, 2005


Lexical-based Approach: Translation Model

•  T(w|t) is the probability that word w is the translation of word t; it denotes the semantic similarity between words

Language Model

$$P(w \mid D) = (1 - \lambda)\, P_{ml}(w \mid D) + \lambda\, P_{ml}(w \mid C)$$

Translation Model

$$P(w \mid D) = (1 - \lambda) \sum_{t \in D} T(w \mid t)\, P_{ml}(t \mid D) + \lambda\, P_{ml}(w \mid C)$$
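To make the difference between the two models concrete, the following sketch evaluates the translation model for a single query word. The translation table `T` and its values are made up for illustration (a real table would be learned from data), and the helper names are hypothetical.

```python
from collections import Counter


def p_ml(word, tokens):
    """Maximum likelihood estimate of a word in a tokenized text."""
    return Counter(tokens)[word] / len(tokens)


def translation_prob(w, doc, collection, T, lam=0.5):
    """P(w|D) = (1 - lam) * sum_t T(w|t) * P_ml(t|D) + lam * P_ml(w|C)."""
    translated = sum(T.get((w, t), 0.0) * p_ml(t, doc) for t in set(doc))
    return (1 - lam) * translated + lam * p_ml(w, collection)


# Hypothetical table: T[(w, t)] stands for T(w|t); the numbers are invented.
T = {("music", "sounds"): 0.4, ("insert", "link"): 0.3}

doc = "how can i link sounds in powerpoint".split()
collection = doc + "i would like to insert music into powerpoint".split()

p_with_translation = translation_prob("music", doc, collection, T)
p_without_translation = translation_prob("music", doc, collection, {})
```

The candidate question never contains "music", yet with the table it receives extra probability mass through "sounds"; with an empty table only the background collection term remains.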


Example

Table: Questions share few common words, but may have high semantic relatedness according to translation model

I'd like to insert music into PowerPoint.
How can I link sounds in PowerPoint?

How can I shut down my system in Dos-mode.
How to turn off computers in Dos-mode.

Photo transfer from cell phones to computers.
How to move photos taken by cell phones.

Which application can run bin files?
I download a game. How can I execute bin files?


Figure: The first row shows the source words and each column shows top 10 words that are most semantically similar to source word. A higher rank means a larger T(w│t ) value


Lexical-based Approach: Translation Model

•  How to learn T(w|t)?
–  Prepare a monolingual parallel corpus of pairs of text; each pair should be semantically similar
–  Employ the machine translation model IBM Model 1 on the parallel corpus to learn T(w|t)
–  IBM Model 1: Brown et al., Computational Linguistics, 1990

•  How this paper prepares the monolingual parallel corpus
–  Each pair contains two questions whose answers are very similar
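The learning step can be sketched with the standard EM procedure for IBM Model 1. This is a simplified illustration rather than the paper's exact setup: it omits the NULL word, uses a tiny invented corpus, and the function name `train_ibm_model1` is hypothetical.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate T(w|t) by EM from pairs (t_side_tokens, w_side_tokens)."""
    w_vocab = {w for _, ws in pairs for w in ws}
    # Start from a uniform translation table.
    T = defaultdict(lambda: 1.0 / len(w_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of w aligned to t
        total = defaultdict(float)  # expected count of t being aligned
        for ts, ws in pairs:
            for w in ws:
                # E-step: share w among the t-side words in proportion to T.
                norm = sum(T[(w, t)] for t in ts)
                for t in ts:
                    delta = T[(w, t)] / norm
                    count[(w, t)] += delta
                    total[t] += delta
        # M-step: renormalize the expected counts per t-side word.
        T = defaultdict(float)
        for (w, t), c in count.items():
            T[(w, t)] = c / total[t]
    return T


# Invented monolingual "parallel" corpus of semantically similar text pairs.
pairs = [
    ("pc crash".split(), "computer crash".split()),
    ("pc price".split(), "computer price".split()),
    ("laptop price".split(), "notebook price".split()),
]
T = train_ibm_model1(pairs)
```

On this toy corpus the table concentrates on T(computer|pc), because "computer" is the only word that co-occurs with "pc" in both pairs that contain it.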


Lexical-based Approach: Translation Model

•  Delphine Bernhard and Iryna Gurevych, Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding, ACL, 2009

•  Propose several methods to prepare parallel monolingual corpora
–  Question answer pairs: question – answer
–  Question reformulation pairs: question – question reformulation by user


Lexical-based Approach: Translation Model


Lexical-based Approach: Translation Model

•  Lexical Semantic Resources: glosses and definitions for the same lexeme in different lexical semantic and encyclopedic resources can be considered as near-paraphrases, since they define the same terms and hence have the same meaning
–  Example: Moon
–  WordNet: the natural satellite of the Earth
–  English Wiktionary: the Moon, the satellite of planet Earth
–  English Wikipedia: the Moon (Latin: Luna) is Earth’s only natural satellite and the fifth largest natural satellite in the Solar System


Lexical-based Approach: Translation-based Language Model

•  Translation Model

–  Advantage: Tackles the lexical gap to some extent

–  Disadvantage: setting T(w|w) = 1 for all w while leaving the other word translation probabilities unchanged produces inconsistent probability estimates and makes the model unstable

•  Xiaobing Xue, Jiwoon Jeon and W. Bruce Croft, Retrieval Models for Question and Answer Archives, SIGIR, 2008

•  Translation-based Language Model


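Lexical-based Approach: Translation-based Language Model

Translation Model

$$P(w \mid D) = (1 - \lambda) \sum_{t \in D} T(w \mid t)\, P_{ml}(t \mid D) + \lambda\, P_{ml}(w \mid C)$$

Translation-based Language Model

–  Linear combination of language model and translation model

$$P(w \mid D) = \frac{|D|}{|D| + \lambda}\, P_{mx}(w \mid D) + \frac{\lambda}{|D| + \lambda}\, P_{ml}(w \mid C)$$

$$P_{mx}(w \mid D) = (1 - \beta)\, P_{ml}(w \mid D) + \beta \sum_{t \in D} T(w \mid t)\, P_{ml}(t \mid D)$$

–  Answer part should provide additional evidence about relevance, incorporating the answer part

$$P_{mx}(w \mid (D, A)) = \alpha\, P_{ml}(w \mid D) + \beta \sum_{t \in D} T(w \mid t)\, P_{ml}(t \mid D) + \gamma\, P_{ml}(w \mid A), \qquad \alpha + \beta + \gamma = 1$$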
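A small sketch of the translation-based language model without the answer part may help in reading the formulas. The translation table, the parameter values `lam=1.0` and `beta=0.5`, and the toy collection are assumptions chosen for illustration only.

```python
from collections import Counter


def p_ml(word, tokens):
    """Maximum likelihood estimate of a word in a tokenized text."""
    return Counter(tokens)[word] / len(tokens)


def translm_word_prob(w, doc, collection, T, lam=1.0, beta=0.5):
    """Smoothed P(w|D) that mixes exact matching with word translation."""
    p_trans = sum(T.get((w, t), 0.0) * p_ml(t, doc) for t in set(doc))
    p_mx = (1 - beta) * p_ml(w, doc) + beta * p_trans
    d = len(doc)
    return d / (d + lam) * p_mx + lam / (d + lam) * p_ml(w, collection)


def translm_score(query, doc, collection, T, lam=1.0, beta=0.5):
    """P(Q|D) as the product of the per-word probabilities."""
    score = 1.0
    for w in query:
        score *= translm_word_prob(w, doc, collection, T, lam, beta)
    return score


T = {("execute", "run"): 0.3}  # hypothetical value for T(execute|run)
query = "execute bin files".split()
doc1 = "which application can run bin files".split()
doc2 = "how to move photos taken by cell phones".split()
collection = doc1 + doc2 + "i download a game how can i execute bin files".split()

score1 = translm_score(query, doc1, collection, T)
score2 = translm_score(query, doc2, collection, T)
```

Here doc1 outranks doc2: it matches "bin" and "files" exactly and reaches "execute" through the translation of "run", while doc2 is supported only by the background collection.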


Syntactic-based Approach: Syntactic Tree Matching

•  Some similar questions neither share many common words, nor follow identical syntactic structure –  How can I lose weight in a few months?

–  Are there any ways of losing pound in a short period?

•  Kai Wang, Zhaoyan Ming and Tat-Seng Chua, A Syntactic Tree Matching Approach to Finding Similar Questions in Community-based QA Services, SIGIR, 2009

•  Syntactic tree matching

Figure: (a) The Syntactic Tree of the Question "How to lose weight?". (b) Tree Fragments of the Sub-tree covering "lose weight".


Syntactic-based Approach: Syntactic Tree Matching

•  Tree kernel: utilize structural or syntactic information to capture higher order dependencies between grammar rules

•  $N_1$ and $N_2$ are the sets of nodes in the two syntactic trees $T_1$ and $T_2$, and $C(n_1, n_2)$ equals the number of common fragments rooted at $n_1$ and $n_2$

$$k(T_1, T_2) = \sum_{n_1 \in N_1} \sum_{n_2 \in N_2} C(n_1, n_2)$$
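The kernel can be computed with the usual recursion for counting common fragments (as in the Collins and Duffy convolution tree kernel). The sketch below assumes a hypothetical tree encoding, `(label, [children])` with plain strings for words, and uses no decay factor.

```python
def production(node):
    """Grammar production at a node; words and child labels stay distinguishable."""
    label, children = node
    return (label, tuple(c if isinstance(c, str) else (c[0],) for c in children))


def internal_nodes(tree):
    """Yield every non-leaf node of a tree encoded as (label, [children])."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1]:
        yield from internal_nodes(child)


def common_fragments(n1, n2):
    """C(n1, n2): number of common fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0
    result = 1
    for c1, c2 in zip(n1[1], n2[1]):
        if not isinstance(c1, str):  # a word offers no further expansion choice
            result *= 1 + common_fragments(c1, c2)
    return result


def tree_kernel(t1, t2):
    """k(T1, T2): sum of C(n1, n2) over all pairs of internal nodes."""
    return sum(common_fragments(n1, n2)
               for n1 in internal_nodes(t1)
               for n2 in internal_nodes(t2))


lose_weight = ("VP", [("VB", ["lose"]), ("NN", ["weight"])])
lose_pounds = ("VP", [("VB", ["lose"]), ("NN", ["pounds"])])

k_same = tree_kernel(lose_weight, lose_weight)  # 6
k_diff = tree_kernel(lose_weight, lose_pounds)  # 3
```

The sub-tree covering "lose weight" has six fragments, so its kernel with itself is 6. Replacing "weight" with "pounds" leaves only three shared fragments, even though the two phrases mean nearly the same thing.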


Syntactic-based Approach: Syntactic Tree Matching

•  Limitation of tree kernel
–  The tree kernel function merely relies on the intuition of counting the number of common sub-trees, whereas this number might not be a good indicator of the similarity between two questions
–  Two evaluated sub-trees have to be identical to allow further parent matching, for which semantic representations cannot fit in well

•  Syntactic tree matching
–  A new weighting scheme for tree fragments that is robust against some grammatical errors
–  Incorporate semantic features


QUESTION RECOMMENDATION


Motivation

•  Question Recommendation
–  Retrieve and rank other questions according to their likelihood of being good recommendations of the queried question

–  A good recommendation provides alternative aspects around users' interest


Example


Question Recommendation: MDL-based Tree Cut Model

•  Yunbo Cao, Huizhong Duan, Chin-Yew Lin, Yong Yu and Hsiao-Wuen Hon, Recommending Questions Using the MDL-based Tree Cut Model, WWW, 2008

–  Step 1: Represent questions as graphs of topic terms

–  Step 2: Rank recommendations on the basis of the graphs

•  Formalize both steps as tree-cutting problems and employ MDL (Minimum Description Length) to select the best cuts

Question Recommendation: MDL-based Tree Cut Model

•  Question
–  Any cool clubs in Berlin or Hamburg?

•  Question topic
–  Major context/constraint of a question, characterizes users’ interests
–  Berlin, Hamburg

•  Question focus
–  Certain aspect of the question topic
–  cool club

•  Suggest alternative aspects of the queried question topic


Question Recommendation: MDL-based Tree Cut Model

•  Extraction of topic terms: base noun phrase, WH-ngram

•  Reduction of topic terms: MDL-based tree cut model


Question Recommendation: MDL-based Tree Cut Model

•  Topic profile
–  Probability distribution of categories $\{p(c \mid t)\}_{c \in C}$

$$p(c \mid t) = \frac{\mathrm{count}(c, t)}{\sum_{c \in C} \mathrm{count}(c, t)}$$

–  count(c, t) is the frequency of the topic term t within the category c

•  Specificity
–  Inverse of the entropy of the topic profile
–  A topic term of high specificity usually specifies the question topic
–  A topic term of low specificity is usually used to represent the question focus

•  Topic chain
–  A sequence of topic terms sorted from large to small according to specificity

•  Question tree
–  Prefix tree built over the topic chains of the question set Q
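These four definitions chain together naturally, as the following sketch shows. The category counts are invented, the small constant `eps` that guards against a zero entropy is an assumption of this sketch, and the function names are hypothetical.

```python
import math


def topic_profile(counts):
    """Map each category c to p(c|t), given count(c, t) for one topic term t."""
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def specificity(counts, eps=1e-3):
    """Inverse entropy of the topic profile; eps (assumed) avoids division by zero."""
    profile = topic_profile(counts)
    entropy = -sum(p * math.log(p) for p in profile.values() if p > 0)
    return 1.0 / (entropy + eps)


def topic_chain(terms, category_counts):
    """Order topic terms from high to low specificity."""
    return sorted(terms, key=lambda t: specificity(category_counts[t]), reverse=True)


def question_tree(chains):
    """Prefix tree over topic chains, represented as nested dicts."""
    root = {}
    for chain in chains:
        node = root
        for term in chain:
            node = node.setdefault(term, {})
    return root


# Invented counts: how often each topic term appears in each category.
category_counts = {
    "berlin": {"Travel": 95, "Dining": 5},
    "cool club": {"Travel": 30, "Dining": 40, "Music": 30},
    "hotel": {"Travel": 60, "Dining": 40},
}
chain_a = topic_chain(["cool club", "berlin"], category_counts)
chain_b = topic_chain(["hotel", "berlin"], category_counts)
tree = question_tree([chain_a, chain_b])
```

With these counts "berlin" is concentrated in one category and therefore heads both chains (question topic), while "cool club" and "hotel" end up as sibling branches below it (question focus).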


Question Recommendation: MDL-based Tree Cut Model

•  Ranking recommendation candidates
–  Determine what topic terms (question focus) should be substituted
–  Collect a set of topic chains $Q^c = \{q_i^c\}_{i=1}^{N}$ such that at least one topic term occurs in both $q^c$ and $q_i^c$
–  Construct a question tree from the set of topic chains $Q^c \cup \{q^c\}$
–  Employ MDL to separate topic chains into Head, H, and Tail, T


Question Recommendation: MDL-based Tree Cut Model

•  Ranking recommendation candidates
–  Score recommendation candidates rendered by various substitutions
–  Specificity: the more similar $H(q^c)$ and $H(\hat{q}^c)$ are, the higher the score
–  Generality: the more similar $T(q^c)$ and $T(\hat{q}^c)$ are, the lower the score


Question Recommendation: TopicTRLM

•  Tom Chao Zhou, Chin-Yew Lin, Irwin King, Michael R. Lyu, Young-In Song and Yunbo Cao, Learning to Suggest Questions in Online Forums, AAAI, 2011

•  Suggest semantically related questions in online forums –  How is Orange Beach in Alabama?

•  Is the water pretty clear this time of year on Orange Beach?

•  Do they have chair and umbrella rentals on Orange Beach?

–  Topic: travel in Orange Beach

•  Fuse both lexical and latent semantic information


Question Recommendation: TopicTRLM

•  Document representation

–  Bag-of-words
•  Independent
•  Fine-grained representation
•  Lexically similar

–  Topic model
•  Assign a set of latent topic distributions to each word
•  Capturing important relationships between words
•  Coarse-grained representation
•  Semantically related

Question Recommendation: TopicTRLM


QUESTION SUBJECTIVITY ANALYSIS


Question Subjectivity Analysis

•  Question Analysis is to analyze characteristics of questions

•  Understand User Intent

•  Provide rich information to question search, question recommendation, answer quality prediction, etc.

•  Question Subjectivity Analysis is an important aspect of question analysis


Definition

•  Subjective question
–  Private statements
–  Personal opinion and experience
–  E.g. Has anyone got one of those home blood pressure monitors? And if so what make is it and do you think they are worth getting?

•  Objective question
–  Objective, verifiable information
–  Often with support from reliable sources
–  E.g. What’s the difference between chemotherapy and radiation treatments?


Motivation

•  More accurately identify similar questions, improve question search

•  Better rank or filter the answers based on whether an answer matches the question orientation

•  Crucial component of inferring user intent, a long-standing problem in Web search

•  Route subjective questions to users for answer, trigger automatic factual question answering system for objective questions


Challenge

•  Ill-formatted, e.g., word capitalization may be incorrect or missing, consecutive words may be concatenated

•  Ungrammatical, includes common online idioms, e.g., “u” for “you” and “2” for “to”


Question Subjectivity Analysis: Supervised Learning

•  Baoli Li, Yandong Liu, Ashwin Ram, Ernest V. Garcia and Eugene Agichtein, Exploring Question Subjectivity Prediction in Community QA, SIGIR, 2008

•  Support Vector Machine with linear kernel

•  Features
–  Character 3-gram
–  Word
–  Word + character 3-gram
–  Word n-gram
–  Word POS n-gram, mix of word and POS tri-grams

•  Term weighting schemes: binary, TF, TF*IDF
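The feature types and weighting schemes listed for the supervised approach can be illustrated without any machine learning library. This sketch covers only feature extraction; the classifier itself is not shown, the TF*IDF variant (raw term frequency times log(N/df)) is one common choice and an assumption here, and all function names are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """All overlapping character n-grams of the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n=2):
    """All overlapping word n-grams of the lower-cased text."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def weight(features, scheme="tf", doc_freq=None, num_docs=None):
    """Turn a list of features into a sparse vector under one weighting scheme."""
    tf = Counter(features)
    if scheme == "binary":
        return {f: 1.0 for f in tf}
    if scheme == "tf":
        return {f: float(c) for f, c in tf.items()}
    if scheme == "tfidf":
        return {f: c * math.log(num_docs / doc_freq[f]) for f, c in tf.items()}
    raise ValueError("unknown weighting scheme: " + scheme)


questions = ["what is your favorite movie", "what is the capital of france"]
doc_freq = Counter(w for q in questions for w in set(q.split()))

trigram_features = char_ngrams("why u")
bigram_features = word_ngrams("What is it")
tfidf_vector = weight(questions[0].split(), "tfidf", doc_freq, len(questions))
```

Character 3-grams remain usable on noisy text such as "why u", where word-level features may miss informal spellings. The resulting sparse vectors would then be passed to a linear classifier.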

Question Subjectivity Analysis: Semi-Supervised Learning

Figure: Yahoo Answers Example

•  Baoli Li, Yandong Liu and Eugene Agichtein, CoCQA: Co-Training Over Questions and Answers with an Application to Predicting Question Subjectivity Orientation, EMNLP, 2008

•  Incorporate relationships between questions and corresponding answers

•  Co-training, two views of the data, question and answer


•  At steps 1 and 2, the top Kj most confident examples of each category are chosen as additional “labeled” data

•  Terminate when the increments of both classifiers are less than threshold X or the maximum number of iterations is exceeded
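The co-training loop over the question view and the answer view can be sketched as follows. This is a simplified illustration, not the CoCQA implementation: a tiny word-count classifier stands in for the real classifiers, a single `k` replaces the per-category Kj, and the loop stops on an iteration limit or when nothing new is labeled instead of monitoring classifier increments against a threshold. All names and data are hypothetical.

```python
from collections import Counter


class CentroidClassifier:
    """Tiny bag-of-words classifier used as a stand-in for a real learner."""

    def fit(self, texts, labels):
        self.counts = {}
        for text, label in zip(texts, labels):
            self.counts.setdefault(label, Counter()).update(text.split())
        return self

    def predict(self, text):
        """Return (label, confidence), with confidence in [0, 1]."""
        words = text.split()
        scores = {label: sum(c[w] for w in words) / sum(c.values())
                  for label, c in self.counts.items()}
        best = max(scores, key=scores.get)
        norm = sum(scores.values())
        return best, (scores[best] / norm if norm else 0.0)


def co_train(labeled, unlabeled, k=1, max_iter=5):
    """labeled: (question, answer, label) tuples; unlabeled: (question, answer) tuples."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    categories = sorted({ex[2] for ex in labeled})
    for _ in range(max_iter):
        if not unlabeled:
            break
        chosen = {}  # index into unlabeled -> label assigned by a view classifier
        for view in (0, 1):  # 0 = question text, 1 = answer text
            clf = CentroidClassifier().fit([ex[view] for ex in labeled],
                                           [ex[2] for ex in labeled])
            preds = [clf.predict(ex[view]) for ex in unlabeled]
            for cat in categories:
                ranked = sorted((i for i, (lab, conf) in enumerate(preds)
                                 if lab == cat and conf > 0),
                                key=lambda i: -preds[i][1])
                for i in ranked[:k]:  # top-k most confident examples per category
                    chosen.setdefault(i, cat)
        if not chosen:
            break
        labeled += [unlabeled[i] + (lab,) for i, lab in chosen.items()]
        unlabeled = [ex for i, ex in enumerate(unlabeled) if i not in chosen]
    return labeled


seed = [
    ("what do you think of jazz", "i think jazz is great", "subjective"),
    ("what is the capital of france", "the capital is paris", "objective"),
]
pool = [
    ("do you think cats are cute", "i think they are great"),
    ("what is the capital of spain", "the capital is madrid"),
]
result = co_train(seed, pool)
```

On this toy data both views agree, so the two unlabeled pairs join the labeled pool after one round with the labels "subjective" and "objective" respectively.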


Question Subjectivity Analysis: Data-driven Approach

•  Tom Chao Zhou, Xiance Si, Edward Y. Chang, Irwin King and Michael R. Lyu, A Data-Driven Approach to Question Subjectivity Identification in Community Question Answering, AAAI, 2012

•  Li et al. 2008 (supervised) and Li et al. 2008 (CoCQA, semi-supervised) are based on manually labeled data

•  Manually labeling data is quite expensive


Question Subjectivity Analysis: Data-driven Approach

Web-scale learning is to use available large-scale data rather than hoping for annotated data that isn’t available

- Halevy, Norvig and Pereira


Question Subjectivity Analysis: Data-driven Approach

Can we utilize social signals to collect training data for question subjectivity identification with NO manual labeling?


Like Signal

•  Users like an answer if they find the answer useful

•  Intuition

–  Subjective: answers are opinions reflecting different tastes; the best answer receives a similar number of likes to the other answers

–  Objective: users like the answer that explains the universal truth in most detail; the best answer receives more likes than the other answers
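As a rough sketch of how the like signal could be turned into a training label: compare the likes on the best answer with the likes on the other answers. The function name, the `margin` parameter, and its default value are hypothetical; the slides do not give a concrete rule.

```python
def like_signal(best_answer_likes, other_answer_likes, margin=2.0):
    """Return 'objective', 'subjective', or None when the signal is unclear.

    margin is a hypothetical factor: the best answer must have at least
    `margin` times the likes of the strongest other answer to count as
    clearly preferred.
    """
    if not other_answer_likes:
        return None  # nothing to compare against
    top_other = max(other_answer_likes)
    if best_answer_likes >= margin * max(top_other, 1):
        return "objective"   # best answer clearly stands out
    if best_answer_likes <= top_other:
        return "subjective"  # best answer is liked no more than the others
    return None              # in between: do not use as training data
```

Returning `None` for ambiguous cases reflects the goal of collecting clean training data rather than labeling every question.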


Vote Signal

•  Users can vote for the best answer

•  Intuition

–  Subjective: users vote for different answers, supporting different opinions; low percentage of votes on the best answer

–  Objective: it is easy to identify the answer that contains the most facts; the percentage of votes on the best answer is high
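The vote signal can be illustrated by computing the share of votes that went to the best answer. The thresholds `low` and `high` below are hypothetical placeholders; the slides only say the percentage is "low" or "high".

```python
def best_answer_vote_share(votes_per_answer, best_index):
    """Fraction of all votes that went to the best answer, or None if no votes."""
    total = sum(votes_per_answer)
    if total == 0:
        return None
    return votes_per_answer[best_index] / total


def vote_signal(votes_per_answer, best_index, low=0.4, high=0.8):
    """Return 'objective', 'subjective', or None (thresholds are assumptions)."""
    share = best_answer_vote_share(votes_per_answer, best_index)
    if share is None:
        return None
    if share >= high:
        return "objective"   # voters agree on one answer
    if share <= low:
        return "subjective"  # votes are spread over several answers
    return None
```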


Source Signal

•  Reference to authoritative resources

•  Intuition

–  Only available for objective questions that have factual answers


Poll and Survey Signal

•  User intent is to seek opinions

•  Very likely to be subjective

•  What is something you learned in school that you think is useful to you today?

•  If you could be a cartoon character, who would you want to be?


Answer Number Signal

•  The number of posted answers to each question

•  Intuition

–  Subjective: users still post their opinions even when they notice there are other answers

–  Objective: users may not post answers to questions that have already received other answers, since the expected answer is usually fixed

–  A large answer number indicates subjectivity

–  A small answer number may be due to many reasons, such as objectivity or few page views
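Because only a large answer number is informative, a sketch of this signal is one-sided: it labels questions with many answers as subjective and abstains otherwise. The cutoff value below is a hypothetical placeholder, not a number from the slides.

```python
def answer_number_signal(num_answers, min_answers_for_subjective=20):
    """Return 'subjective' for questions with many answers, else None.

    A small answer number is deliberately not mapped to 'objective',
    since it may also be caused by few page views or other reasons.
    The default cutoff is an assumption for illustration only.
    """
    if num_answers >= min_answers_for_subjective:
        return "subjective"
    return None
```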



Question Subjectivity Analysis: Data-driven Approach

•  Features

–  Word: term frequency

–  Word n-gram: term frequency

–  Question length: information needs of subjective questions are complex and users use descriptions to explain them, leading to larger question length

–  Request word: particular words that explicitly indicate a request for opinions; manual list of 9 words
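A minimal sketch of these four features, using only the standard library. The tokenizer and the contents of `REQUEST_WORDS` are assumptions: the slides mention a manual list of 9 request words but do not enumerate them, so the set below is a stand-in.

```python
import re
from collections import Counter

# Hypothetical stand-in for the manual list of 9 request words,
# which the slides do not enumerate.
REQUEST_WORDS = {"should", "opinion", "recommend"}


def tokenize(text):
    """Lowercase and split into simple word tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(tokens, n):
    """Term frequency of word n-grams (n=1 gives plain word counts)."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def extract_features(question):
    tokens = tokenize(question)
    return {
        "word_tf": ngram_counts(tokens, 1),
        "bigram_tf": ngram_counts(tokens, 2),
        "length": len(tokens),  # question length in tokens
        "request_words": sum(1 for t in tokens if t in REQUEST_WORDS),
    }
```

Calling `extract_features("Which phone should I buy? Which phone?")` yields a length of 7 tokens and counts "which phone" twice among the bigrams.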


Question Subjectivity Analysis: Data-driven Approach

•  Subjectivity clue: external lexicon of over 8000 clues, a manually compiled word list from news used to express opinions

•  Punctuation density: density of punctuation marks

•  Grammatical modifier: inspired by opinion mining research that uses grammatical modifiers (adjectives and adverbs) to judge users' opinions

•  Entity: objective questions expect factual answers, leading to fewer relationships among entities; subjective questions contain more descriptions and may involve relatively complex relations
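Two of these features are simple enough to sketch directly. The definition of density used here (punctuation marks per whitespace-separated token) is an assumption, and `SUBJECTIVITY_CLUES` is a tiny hypothetical stand-in for the external lexicon of over 8000 clues.

```python
import string

# Hypothetical miniature lexicon; the real one has over 8000 clues.
SUBJECTIVITY_CLUES = {"best", "favorite", "think", "feel"}


def punctuation_density(question):
    """Punctuation marks per whitespace-separated token (assumed definition)."""
    tokens = question.split()
    if not tokens:
        return 0.0
    marks = sum(1 for ch in question if ch in string.punctuation)
    return marks / len(tokens)


def subjectivity_clue_count(question):
    """Number of tokens that appear in the clue lexicon."""
    words = (w.strip(string.punctuation).lower() for w in question.split())
    return sum(1 for w in words if w in SUBJECTIVITY_CLUES)
```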


CONTENT QUALITY EVALUATION


Content Quality Evaluation

•  Motivation

–  High variance in the quality of answers & questions

–  Automatically find the best answer & spam

–  Significant impact on user satisfaction


Approaches

•  Maximum Entropy (Jeon et al. 2006)

•  Learning to Rank (Surdeanu et al. 2008)

•  Analogical Reasoning (Wang et al., 2009)

•  Graph-based Models

–  Coupled Mutual Reinforcement (Bian et al., 2009)

–  EXHITS (Suryanto et al., 2009)

•  Logistic Regression (Shah et al. 2010)


Recognizing Reliable Users and Content with Coupled Mutual Reinforcement

•  Given a CQA archive

•  Determine the quality of each question and answer and the answer-reputation and question-reputation of each user simultaneously


Content Quality & User Reputation

•  Question Quality

–  A question's effectiveness at attracting high quality answers

•  Answer Quality

–  The responsiveness, accuracy, and comprehensiveness of the answer to a question

•  Question Reputation

–  The expected quality of the questions posted by a user

•  Answer Reputation

–  The expected quality of the answers posted by a user
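Reading "expected quality" as an average, reputation can be sketched as the mean quality score of a user's posts. This is an illustrative simplification with made-up names, not the coupled model itself, in which quality and reputation are estimated jointly.

```python
from collections import defaultdict


def reputation(posts):
    """Map each user to the mean quality of their posts.

    posts: iterable of (user_id, quality) pairs, where quality is a score
    for one question (question reputation) or one answer (answer reputation).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user_id, quality in posts:
        totals[user_id] += quality
        counts[user_id] += 1
    return {user: totals[user] / counts[user] for user in totals}
```

The same function serves both notions of reputation: pass answer scores to get answer reputation, or question scores to get question reputation.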


CQA-MR Model


Mutual Reinforcement Principle

[Figure: mutual reinforcement among the following quantities]

•  the quality of answer a's question

•  the question reputation of the user who asks question q

•  u's answer reputation

•  a's quality

•  q's quality

•  u's question reputation


Feature Space


Logistic Regression Model

•  P(x): probability of being “good” (x can be a question, answer or user feature vector)

•  Objective function
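The formula on this slide did not survive extraction. As a generic sketch only, standard logistic regression models P(x) with a sigmoid over a weighted sum of features and fits the weights by maximizing the log-likelihood of labeled examples. The names `weights`, `bias`, and `examples` are illustrative, and the objective actually used in the coupled model may contain additional terms.

```python
import math


def p_good(weights, bias, x):
    """P(x): probability that feature vector x is 'good' (standard sigmoid)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    # Numerically stable sigmoid for both signs of z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(weights, bias, examples):
    """Standard logistic regression objective over (x, y) pairs, y in {0, 1}."""
    total = 0.0
    for x, y in examples:
        p = p_good(weights, bias, x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total
```

With all weights at zero every item gets probability 0.5, which is a convenient sanity check before training.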


Algorithm


Experimental Result


References

•  B. Dom and D. Paranjpe. A Bayesian Technique for Estimating the Credibility of Question Answerers. Proceedings of SIAM Conference on Data Mining (SDM'08), pages 399-409, 2008.

•  Chirag Shah and Jeffrey Pomerantz. 2010. Evaluating and predicting answer quality in community QA. In Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval (SIGIR '10). ACM, New York, NY, USA, 411-418.

•  Dredze, M.; Crammer, K.; and Pereira, F. 2008. Confidence-Weighted Linear Classification. In Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML). Princeton, NJ: International Machine Learning Society.

•  Ferrucci, D., and Lally, A. 2004. UIMA: An Architectural Approach to Unstructured Information Processing in the Corporate Research Environment. Natural Language Engineering 10(3–4): 327–348.

•  GUO, J., XU, S., BAO, S., AND YU, Y. 2008. Tapping on the potential of q&a community by recommending answer providers. In Proceeding of the 17th ACM conference on Information and knowledge management. CIKM ’08. ACM, New York, NY, USA, 921–930.

•  Jiang Bian, Yandong Liu, Ding Zhou, Eugene Agichtein, and Hongyuan Zha. 2009. Learning to recognize reliable users and content in social media with coupled mutual reinforcement. In Proceedings of the 18th international conference on World wide web (WWW '09). ACM, New York, NY, USA, 51-60.

•  Jiwoon Jeon, W. Bruce Croft, Joon Ho Lee, and Soyeon Park. 2006. A framework to predict the quality of answers with non-textual features. In Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '06). ACM, New York, NY, USA, 228-235. 


•  Jun Zhang, Mark S. Ackerman, and Lada Adamic. 2007. Expertise networks in online communities: structure and algorithms. In Proceedings of the 16th international conference on World Wide Web (WWW '07). ACM, New York, NY, USA, 221-230.

•  Lenat, D. B. 1995. Cyc: A Large-Scale Investment in Knowledge Infrastructure. Communications of the ACM 38(11): 33–38.

•  Yunbo Cao, Huizhong Duan, Chin-Yew Lin, Yong Yu, and Hsiao-Wuen Hon. Recommending Questions Using the MDL-based Tree Cut Model, WWW, 2008.

•  Kai Wang, Zhaoyan Ming, and Tat-Seng Chua. A Syntactic Tree Matching Approach to Finding Similar Questions in Community-based QA Services, SIGIR, 2009.

•  Xin Cao, Gao Cong, Bin Cui, Christian Søndergaard Jensen, and Ce Zhang. The Use of Categorization Information in Language Models for Question Retrieval, CIKM, 2009.

•  Tom Chao Zhou, Chin-Yew Lin, Irwin King, Michael R. Lyu, Young-In Song, and Yunbo Cao. Learning to Suggest Questions in Online Forums, AAAI, 2011.

•  LIU, M., LIU, Y., AND YANG, Q. 2010. Predicting best answerers for new questions in community question answering. In Web-Age Information Management, L. Chen, C. Tang, J. Yang, and Y. Gao, Eds. Lecture Notes in Computer Science Series, vol. 6184. Springer Berlin / Heidelberg, 127–138.

•  Maggy Anastasia Suryanto, Ee Peng Lim, Aixin Sun, and Roger H. L. Chiang. 2009. Quality-aware collaborative question answering: methods and evaluation. In Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM '09), Ricardo Baeza-Yates, Paolo Boldi, Berthier Ribeiro-Neto, and B. Barla Cambazoglu (Eds.). ACM, New York, NY, USA, 142-151.


•  Pawel Jurczyk and Eugene Agichtein. 2007. Discovering authorities in question answer communities by using link analysis. In Proceedings of the sixteenth ACM conference on Conference on information and knowledge management (CIKM '07). ACM, New York, NY, USA, 919-922.

•  QU, M., QIU, G., HE, X., ZHANG, C., WU, H., BU, J., AND CHEN, C. 2009. Probabilistic question recommendation for question answering communities. In Proceedings of the 18th international conference on World wide web. WWW ’09. ACM, New York, NY, USA, 1229–1230.

•  Smith T. F., and Waterman M. S. 1981. Identification of Common Molecular Subsequences. Journal of Molecular Biology 147(1): 195–197.

•  Xin-Jing Wang, Xudong Tu, Dan Feng, and Lei Zhang. 2009. Ranking community answers by modeling question-answer relationships via analogical reasoning. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval (SIGIR '09). ACM, New York, NY, USA, 179-186.

•  X. Liu, W. B. Croft, and M. Koll. Finding experts in community-based question-answering services. In CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management, pages 315–316, New York, NY, USA, 2005.

•  ZHOU, Y., CONG, G., CUI, B., JENSEN, C. S., AND YAO, J. 2009. Routing questions to the right users in online communities. In Proceedings of the 2009 IEEE International Conference on Data Engineering.


•  Fei Song and W. Bruce Croft. A general language model for information retrieval, CIKM, 1999

•  John Lafferty and Chengxiang Zhai. Document language models, query models, and risk minimization for information retrieval, SIGIR, 2001

•  Chengxiang Zhai and John Lafferty. A study of smoothing methods for language models applied to information retrieval, TOIS

•  Jiwoon Jeon, W. Bruce Croft, and Joon Ho Lee. Finding Semantically Similar Questions Based on Their Answers, SIGIR, 2005

•  Jiwoon Jeon, W. Bruce Croft, and Joon Ho Lee. Finding similar questions in large question and answer archives, CIKM, 2005

•  Gao Cong, Long Wang, Chin-Yew Lin, Young-In Song, and Yueheng Sun. Finding question-answer pairs from online forums, SIGIR, 2008

•  Xiaobing Xue, Jiwoon Jeon, and W. Bruce Croft. Retrieval Models for Question and Answer Archives, SIGIR, 2008

•  Mihai Surdeanu, Massimiliano Ciaramita, and Hugo Zaragoza. Learning to Rank Answers on Large Online QA Collections. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL 2008), 2008.
