Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Post on 14-Jan-2016


DESCRIPTION

Supervised and Traditional Term Weighting Methods for Automatic Text Categorization. Presenter: Cheng-Han Tsai. Authors: Man Lan, Chew Lim Tan (Senior Member, IEEE), Jian Su, and Yue Lu (Member, IEEE). TPAMI, 2009. Outline: Motivation, Objectives, Methodology, Experiments

Transcript

Intelligent Database Systems Lab

國立雲林科技大學 (National Yunlin University of Science and Technology)


Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Presenter: Cheng-Han Tsai

Authors: Man Lan, Chew Lim Tan (Senior Member, IEEE), Jian Su, and Yue Lu (Member, IEEE)

TPAMI, 2009


Outline

· Motivation

· Objectives

· Methodology

· Experiments

· Conclusions

· Comments

Motivation

· The widely used tf·idf method has not shown uniformly good performance across different data sets


Text categorization

Objectives

· To propose a new, simple supervised term weighting method that improves the terms' discriminating power for the text categorization (TC) task

─ Do supervised term weighting methods perform better than unsupervised ones for TC?

─ Does the difference between supervised and unsupervised methods depend on the learning algorithm used?

─ Why is the new supervised method, i.e., tf·rf, effective for TC?


Methodology


tf·rf
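The transcript does not preserve the formula behind tf·rf. As a minimal sketch, assuming the relevance frequency definition given in the Lan et al. paper, rf = log2(2 + a / max(1, c)), where a is the number of positive-category training documents containing the term and c is the number of negative-category documents containing it, the weight could be computed as below. The function names and the toy corpus are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def relevance_frequency(a: int, c: int) -> float:
    """rf = log2(2 + a / max(1, c)).

    a: positive-category documents that contain the term
    c: negative-category documents that contain the term
    max(1, c) avoids division by zero when the term never
    appears in the negative category.
    """
    return math.log2(2 + a / max(1, c))


def tf_rf_weights(doc_tokens, positive_docs, negative_docs):
    """Return {term: tf * rf} for one tokenized document.

    positive_docs and negative_docs are lists of token lists
    (a toy corpus, assumed for illustration only).
    """
    tf = Counter(doc_tokens)  # raw term frequency in this document
    weights = {}
    for term, freq in tf.items():
        a = sum(term in doc for doc in positive_docs)
        c = sum(term in doc for doc in negative_docs)
        weights[term] = freq * relevance_frequency(a, c)
    return weights


# Hypothetical two-class toy corpus.
positive = [["stock", "market", "trade"], ["stock", "price"], ["market", "stock"]]
negative = [["football", "match"], ["match", "price"], ["game", "score"]]

weights = tf_rf_weights(["stock", "stock", "price"], positive, negative)
```

In this toy setup "stock" occurs only in positive documents, so it receives a larger rf than "price", which is split evenly between the two categories.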

Experiments

Conclusions

· Not all supervised term weighting methods are superior to unsupervised ones (e.g., tf·χ² and tf·ig are not)

· A well-suited learning algorithm matters more than the choice of term weighting method

· The best performance of tf·rf has been analyzed and explained through cross-method comparison, cross-classifier validation, and cross-corpus validation
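One possible way to see how two supervised factors can behave differently is to compute them from the same document counts. The sketch below uses the standard χ² statistic for a 2×2 term/category table and assumes the rf definition rf = log2(2 + a / max(1, c)) from the paper. The counts are made up for illustration and are not results from the experiments.

```python
import math


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for a 2x2 term/category contingency table.

    a: positive documents containing the term
    b: positive documents not containing the term
    c: negative documents containing the term
    d: negative documents not containing the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence either way
    return n * (a * d - b * c) ** 2 / denom


def relevance_frequency(a: int, c: int) -> float:
    """rf = log2(2 + a / max(1, c)), using only the counts a and c."""
    return math.log2(2 + a / max(1, c))


# Hypothetical term concentrated in the positive category ...
pos_term = dict(a=90, b=10, c=10, d=90)
# ... and its mirror image, concentrated in the negative category.
neg_term = dict(a=10, b=90, c=90, d=10)

chi_pos = chi_square(**pos_term)
chi_neg = chi_square(**neg_term)
rf_pos = relevance_frequency(pos_term["a"], pos_term["c"])
rf_neg = relevance_frequency(neg_term["a"], neg_term["c"])
```

With these counts χ² gives both terms the same score, because it is symmetric in the two categories, whereas rf gives the term concentrated in the positive category the larger weight.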


Comments

· Advantages

─ The writing structure of this paper is clear

· Applications

─ Text categorization
