A Discriminative Approach to Topic-Based Citation Recommendation. Jie Tang and Jing Zhang. Presented by Pei Li, Knowledge Engineering Group, Dept. of Computer Science and Technology, Tsinghua University. April, 2009.

Jan 04, 2016

Page 1: A Discriminative Approach to Topic-Based Citation Recommendation

A Discriminative Approach to Topic-Based Citation Recommendation

Jie Tang and Jing Zhang

Presented by Pei Li

Knowledge Engineering Group, Dept. of Computer Science and Technology

Tsinghua University

April, 2009

Page 2

Motivation

However, we are surrounded by numerous academic data …

“Academic search is insufficient in many practical applications”

Page 3

Examples – Citation Suggestion

Researcher A: Which papers should we refer to?

Page 4

Problem Formulation

Query-focused Text Summarization

We are considering the extraction-based text summarization. …As for the models, we can adopt many existing probabilistic retrieval models such as the classic probabilistic retrieval models and the Kullback-Leibler (KL) divergence retrieval model.

Page 5

Problem Formulation

Two challenging questions:
• How to identify the topics?
• How to recommend citations based on the topics?

Page 6

Outline

• Prior Work
• Our Approach
  – The RBM-CS model
  – Ranking and recommendation
  – Matching recommended papers with sentences
• Experiments
• Conclusions

Page 7

Prior Work

• Measuring the quality of journal/paper
  – Science Citation Index (Garfield, Science’72)
  – Bibliographical Coupling (BC) (Kessler, American Documentation’63)
• Paper recommendation
  – using a graphical framework (Strohman et al., SIGIR’07)
  – collaborative filtering (McNee et al., CSCW’02)
• Restricted Boltzmann Machines (RBMs)
  – generative models based on latent variables to model an input distribution

Page 9

Modeling

[Figure: Approach Overview. Training data → topic analysis with RBM-CS (Topic 1, Topic 2) → discriminative model parameters Θ (U, M, a, b, e). Test data (a new document) → RBM-CS → candidate selection → citation set → matching with the numbered sentences of the citation context.]

Page 10

Modeling with RBM-CS model

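Discriminative objective function:

$$\mathcal{L} = \sum_{d \in D} \log p(\mathbf{l}_d \mid \mathbf{w}_d) = \sum_{d \in D} \log \prod_{j=1}^{L} p(l_j \mid \mathbf{w}_d)$$

Sigmoid func: σ(x) = 1/(1+exp(-x))

[Model equations annotated with "Bias terms": not recoverable from the extraction]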
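To make the objective concrete, here is a minimal Python sketch that evaluates the sum over documents d and candidate citations j of log p(l_j | w_d). It assumes each citation label l_j is a Bernoulli variable and that the model's probabilities are already available; the input names `prob_cited` and `labels` and the toy numbers are hypothetical, not taken from the slides.

```python
import math

def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x)), as given on the slide."""
    return 1.0 / (1.0 + math.exp(-x))

def log_likelihood(prob_cited, labels):
    """Sum over documents d and candidates j of log p(l_j | w_d).

    prob_cited[d][j]: model probability that document d cites paper j
                      (hypothetical input, assumed to lie strictly in (0, 1)).
    labels[d][j]:     1 if document d really cites paper j, else 0.
    Assumption: each label is Bernoulli, so the probability of an observed
    0 is 1 - prob_cited[d][j].
    """
    total = 0.0
    for probs, observed in zip(prob_cited, labels):
        for p, l in zip(probs, observed):
            total += math.log(p if l == 1 else 1.0 - p)
    return total

# Toy example: two documents, three candidate papers (made-up activations).
probs = [[sigmoid(2.0), sigmoid(-1.0), sigmoid(0.0)],
         [sigmoid(-2.0), sigmoid(1.5), sigmoid(0.5)]]
labels = [[1, 0, 0], [0, 1, 1]]
objective = log_likelihood(probs, labels)  # always <= 0; closer to 0 is better
```

Training would adjust the model parameters so that this quantity increases on the training set.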

Page 11

Parameter Estimation
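The body of this slide is not present in the transcript. The Experimental Setting slide reports a learning rate of 0.01/batch-size, momentum 0.9 and decay 0.001, which suggests a gradient-based update with momentum and weight decay. The sketch below shows one common form of such an update. It is an assumption for illustration, not necessarily the authors' rule, and the function name, batch size and gradient values are hypothetical.

```python
def momentum_step(weights, grads, velocity, lr, momentum=0.9, decay=0.001):
    """One gradient-ascent step with momentum and L2 weight decay (in place).

    Assumed update rule (a common choice, not confirmed by the slides):
        v <- momentum * v + lr * (grad - decay * w)
        w <- w + v
    """
    for i in range(len(weights)):
        velocity[i] = momentum * velocity[i] + lr * (grads[i] - decay * weights[i])
        weights[i] += velocity[i]
    return weights, velocity

batch_size = 10            # hypothetical batch size
lr = 0.01 / batch_size     # learning rate as stated in the experimental setting
w = [0.5, -0.2]            # made-up parameter values
v = [0.0, 0.0]             # velocity starts at zero
g = [1.0, -1.0]            # made-up gradient of the objective
momentum_step(w, g, v, lr)
```

With momentum 0.9, a gradient that keeps the same sign over many batches builds up a step several times larger than a single plain gradient step.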

Page 12

Ranking and Recommendation

• By applying the same modeling procedure to the citation context, we can obtain a topic representation {h_c} of the citation context c. Therefore, we can calculate:

$$p(l_d \mid \mathbf{h}_c) = \sigma\Big(\sum_{k=1}^{T} U_{jk} f(h_{ck}) + e_j\Big)$$

where j is the index of the candidate paper d.

• Finally, candidate papers are ranked according to p(l_d|h_c), and the top K ranked papers are returned as the recommended papers.
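The ranking step can be sketched in a few lines of Python: score each candidate paper j by σ(Σ_k U_jk f(h_ck) + e_j) and keep the K highest scores. The sketch assumes f is the identity function, and the weights `U`, biases `e` and topic vector `h_c` are made-up toy values rather than learned parameters.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rank_candidates(h_c, U, e, top_k):
    """Return the top_k (paper index, score) pairs for a citation context.

    h_c: topic representation of the citation context (length T).
    U:   U[j][k] is the weight between candidate paper j and topic k.
    e:   e[j] is the bias of candidate paper j.
    Assumption: the feature function f is the identity.
    """
    scores = []
    for j, (row, bias) in enumerate(zip(U, e)):
        activation = sum(u * h for u, h in zip(row, h_c)) + bias
        scores.append((sigmoid(activation), j))
    scores.sort(key=lambda s: s[0], reverse=True)
    return [(j, score) for score, j in scores[:top_k]]

# Toy example with T = 2 topics and 3 candidate papers.
h_c = [0.9, 0.1]
U = [[2.0, -1.0], [-1.0, 2.0], [0.5, 0.5]]
e = [0.0, 0.0, -0.5]
top = rank_candidates(h_c, U, e, top_k=2)
```

In this toy setup the context is dominated by the first topic, so the paper whose weights favor that topic ranks first.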

Page 13

Matching Recommended Papers with Citation Sentences

The goal is to match the recommended papers with the sentences of the citation context, for example:

1. We are considering the extraction-based text summarization.

2. As for the models, we can adopt many existing probabilistic retrieval models such as the classic probabilistic retrieval models

3. and the Kullback-Leibler (KL) divergence retrieval model.

Use KL-divergence to measure the relevance between the recommended paper and the citation sentence:

$$KL(d, s_{ci}) = \sum_{k=1}^{T} p(h_k \mid d) \log \frac{p(h_k \mid d)}{p(h_k \mid s_{ci})}$$

Here s_ci is the ith sentence in the citation context c, and the probabilities are obtained from RBM-CS.
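As an illustration of the matching step, the following Python sketch computes KL(d, s) = Σ_k p(h_k|d) log(p(h_k|d) / p(h_k|s)) and assigns each recommended paper to the sentence with the smallest divergence. Treating the smallest divergence as the best match, the smoothing constant `eps`, and the toy topic distributions are all assumptions made for this example.

```python
import math

def kl_divergence(p_d, p_s, eps=1e-12):
    """KL(d, s) = sum_k p(h_k|d) * log(p(h_k|d) / p(h_k|s)).

    eps is a small smoothing constant (an assumption) that avoids log(0).
    """
    return sum(pd * math.log((pd + eps) / (ps + eps))
               for pd, ps in zip(p_d, p_s))

def match_papers_to_sentences(paper_topics, sentence_topics):
    """Map each paper name to the index of its closest sentence (lowest KL)."""
    matches = {}
    for name, p_d in paper_topics.items():
        divergences = [kl_divergence(p_d, p_s) for p_s in sentence_topics]
        matches[name] = divergences.index(min(divergences))
    return matches

# Toy topic distributions over T = 2 topics (hypothetical values).
paper_topics = {"paper_A": [0.8, 0.2], "paper_B": [0.1, 0.9]}
sentence_topics = [[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]]
matches = match_papers_to_sentences(paper_topics, sentence_topics)
```

Here `paper_A` is matched to the first sentence and `paper_B` to the second, because those sentences have the most similar topic distributions.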

Page 15

Experimental Setting

• Data Sets
  – NIPS: 1,605 papers and 10,472 citations
  – Citeseer: 3,335 papers and 32,558 citations
• Baseline methods
  – Language model
  – Restricted Boltzmann Machines (RBMs)
• Evaluation Measures
  – P@1, P@3, P@5, P@10, Rprec, Bpref, MRR
• Parameter Setting
  – K=7 for NIPS and K=11 for Citeseer
  – Learning rate=0.01/batch-size, momentum=0.9, decay=0.001
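Several of the evaluation measures listed above have short standard definitions. The Python sketch below implements P@k, R-precision and MRR for ranked recommendation lists, using their usual textbook definitions; the paper identifiers and rankings are hypothetical, and Bpref is omitted.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended papers that are truly cited."""
    return sum(1 for paper in ranked[:k] if paper in relevant) / k

def r_precision(ranked, relevant):
    """Precision at R, where R is the number of truly cited papers."""
    return precision_at_k(ranked, relevant, len(relevant))

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first truly cited paper, or 0.0 if none is retrieved."""
    for rank, paper in enumerate(ranked, start=1):
        if paper in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked list, relevant set) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)

# Toy example: one ranked list and its ground-truth citations.
ranked = ["p3", "p1", "p7", "p2", "p9"]
relevant = {"p1", "p2"}
p_at_5 = precision_at_k(ranked, relevant, 5)                # 2 hits in top 5
rprec = r_precision(ranked, relevant)                       # 1 hit in top 2
mrr = mean_reciprocal_rank([(ranked, relevant),
                            (["p1", "p3"], {"p1"})])        # (1/2 + 1) / 2
```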

Page 16

Discovered “Topics”

Page 17

Recommendation Performance

Page 18

Sentence-level Performance

[Figure: sentence-level performance chart, annotated with +7.65% and +9.24%]

Page 20

Conclusion

• Formalize the problems of topic-based citation recommendation

• Propose a discriminative approach based on RBM-CS to solve this problem

• Experimental results show that the proposed RBM-CS can effectively improve the recommendation performance

• The citation recommendation is being integrated as a new feature into our academic search system ArnetMiner (http://arnetminer.org).

Page 21

Thanks!

Q&A

HP: http://keg.cs.tsinghua.edu.cn/persons/tj/