Computing text semantic relatedness using the contents and links of a hypertext encyclopedia. Presenter: Bo-Sheng Wang. Authors: Majid Yazdani, Andrei Popescu-Belis. AI, 2013.


Feb 23, 2016


Transcript
Page 1: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Presenter: Bo-Sheng Wang
Authors: Majid Yazdani, Andrei Popescu-Belis

AI, 2013

1

Page 2: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Outline

• Motivation
• Objectives
• Methodology
• Empirical analyses
• Experiments
• Conclusions
• Comments

2

Page 3: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Motivation

3

• Existing measures of semantic relatedness based on lexical overlap, though widely used, are of little help when text similarity is not based on identical words.

Page 4: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Objectives

• The authors therefore aim to compute text semantic relatedness based on concepts and their relations, which have linguistic as well as extra-linguistic dimensions. This remains a challenge, especially in the general domain and/or over noisy texts.

4

Page 5: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Methodology-build concept network

5

• Concepts
– They removed Wikipedia pages of the following types: Talk, File, Image, Template, Category, Portal, and List.
– Disambiguation pages were removed.
– They set a cut-off limit of 100 non-stop words.
– They extracted the anchor text of each hyperlink and considered it as another possible secondary title for the linked article.
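
The filtering rules on this slide can be sketched as follows. This is only an illustration, not the authors' code: the function name `keep_as_concept`, the title prefixes, the tiny stopword list, and the reading of the cut-off as "discard articles with fewer than 100 non-stop words" are all assumptions.

```python
import re

# Assumed title prefixes for the excluded page types (hypothetical mapping).
EXCLUDED_PREFIXES = ("Talk:", "File:", "Image:", "Template:",
                     "Category:", "Portal:", "List of ")

# Tiny illustrative stopword list; a real system would use a full one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "on", "is", "are", "to"}


def keep_as_concept(title, text, is_disambiguation=False, min_words=100):
    """Return True if a page should be kept as a concept node."""
    if title.startswith(EXCLUDED_PREFIXES):
        return False
    if is_disambiguation:
        return False
    tokens = re.findall(r"[a-z]+", text.lower())
    non_stop = [t for t in tokens if t not in STOPWORDS]
    # Assumption: pages below the cut-off are dropped.
    return len(non_stop) >= min_words
```

For instance, `keep_as_concept("Talk:Physics", long_text)` is rejected because of its prefix, while a regular article is kept only if it has at least `min_words` non-stop words.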

Page 6: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Methodology

6

Page 7: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Methodology-build concept network

• Relations
– The present study focuses on hyperlinks and on links computed from similarity of content or of category.
– They computed the lexical similarity between articles as the cosine similarity between the vectors derived from the articles' texts, after stopword removal and stemming using Snowball.
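
A minimal sketch of the lexical similarity step is given here, under stated assumptions: raw term counts are used as vector weights (the slide does not say how the vectors are weighted), the stopword list is a small stand-in, and `naive_stem` is a crude suffix stripper used only so the example has no dependencies. It is not the Snowball stemmer named on the slide.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (assumption, not the authors' list).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "on", "is", "are", "to"}


def naive_stem(token):
    """Crude stand-in for a real stemmer such as Snowball."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[:-len(suffix)]
    return token


def term_vector(text):
    """Bag-of-words vector after stopword removal and stemming."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(naive_stem(t) for t in tokens if t not in STOPWORDS)


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Two articles that share no stemmed terms get a similarity of 0, which is exactly the limitation of lexical overlap that the Motivation slide points out.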

7

Page 8: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Methodology

8

Page 9: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Methodology-VP

9

Page 10: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Methodology-VP to weighted sets of concepts and to texts

10

Page 11: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Methodology-Approximation

11

Page 12: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Methodology-Approximation

• T-truncated
• ε-truncated
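
The transcript names the two approximations without giving details. The sketch below shows one possible reading, assuming VP is the probability that a random walk starting at one node reaches a target node. The T-truncation stops the walk after `max_steps` steps, and the ε-truncation drops probability mass smaller than `epsilon`. The function name, the graph representation, and the per-node pruning rule are assumptions, not the authors' algorithm.

```python
def truncated_vp(transitions, start, target, max_steps, epsilon=0.0):
    """Approximate probability that a walk from `start` reaches `target`.

    transitions: dict mapping node -> {neighbor: transition probability}
    max_steps:   T-truncation, number of walk steps to simulate
    epsilon:     epsilon-truncation, per-node mass below this is dropped
    """
    if start == target:
        return 1.0
    mass = {start: 1.0}   # probability mass of walks that have not reached target
    reached = 0.0         # accumulated probability of having reached target
    for _ in range(max_steps):
        next_mass = {}
        for node, p in mass.items():
            for neighbor, w in transitions.get(node, {}).items():
                if neighbor == target:
                    reached += p * w   # walk stops at its first visit
                else:
                    next_mass[neighbor] = next_mass.get(neighbor, 0.0) + p * w
        # Assumed pruning rule: forget nodes holding negligible mass.
        mass = {n: p for n, p in next_mass.items() if p >= epsilon}
        if not mass:
            break
    return reached
```

Both truncations trade accuracy for speed: a larger `max_steps` or a smaller `epsilon` follows more paths and gives a value closer to the untruncated probability.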

12

Page 13: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Methodology-Learning embedding

13

Page 14: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Empirical analyses

• Convergence of the T-truncated

14

Page 15: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Empirical analyses

• Convergence of ε-truncated

15

Page 16: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Empirical analyses

16

Page 17: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

• Average training error

17

Page 18: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

• Average training error

18

Page 19: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

• Word Similarity

19

Page 20: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

• Word Similarity

20

Page 21: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

21

Page 22: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

• Document similarity

22

Page 23: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

• Document clustering

23

Page 24: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

• Comparison of VP and cosine similarity

24

Page 25: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

• Text classification

25

Page 26: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

26

Page 27: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

27

Page 28: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Experiments

28

Page 29: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Conclusions

29

Page 30: Computing text semantic relatedness using the contents and links of a hypertext encyclopedia

Comments

• Advantages

• Disadvantage

• Applications
– Text categorization

30