Christine Preisach, Steffen Rendle and Lars Schmidt-Thieme

Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Germany

Relational Classification Using Automatically Extracted Relations by Record Linkage

Outline

• Motivation

• Relation Extraction and Multi-Relational Classification Framework

• Relation Extraction

• Multi-Relational Classification

• Evaluation

• Conclusion

Motivation

• Example:

Publication  Title                                      Author      Conference  Category
1            Classification of scientific publications  John Smith  ICDM        Data Mining
2            Classification of Hypertext                John Smith  KDD         Data Mining
3            Hierarchical Clustering                    Dan Miller  ICDM        Data Mining

[Figure: graph over the publications P1, P2 and P3]

Motivation

• Traditional classifiers take only local attributes such as keywords, title and abstract into account

• Assumption: Instances are independent

• But: The assumption does not hold

    – Instances can be related to other documents through authorship, citations, the same conference, etc.

These relations should be exploited and combined in order to improve classification accuracy.

• But: Manual extraction of relations by experts is expensive

Therefore: automatic extraction of relations from noisy attributes.

Relation Extraction and Relational Classification Framework

Publication  Title                                      Author      Conference                                   Category
1            Classification of scientific publications  J. Smith    ICDM 2005                                    Data Mining
2            Classification of Hypertext                John Smith  KDD                                          Data Mining
3            Hierarchical Clustering                    Dan Miller  5th International Conference on Data Mining  Data Mining

• Relation Extraction Component

    – Extraction of relations from objects with noisy attributes

• Multi-Relational Classification Component

    – Use extracted relations instead of, or in addition to, local attributes for classification

[Figure: framework diagram with objects $x \in X$, attributes $a$ and relation $R$]

Relation Extraction

• Pairwise feature extraction

    – From noisy attributes $a: X \to V$ with several similarity measures $f: X^2 \to \mathbb{R}$ (e.g. TFIDF, cosine similarity, Levenshtein)

• Probabilistic pairwise decision model

    – Use the $l$ extracted similarities as features for a probabilistic classifier $\hat{C}: \mathbb{R}^l \to [0, 1]$ and build a model on pairs $(x, y)$ from the training data $X_{tr}$

    – Apply it to unknown pairs

• Collective decision model

    – If $R$ is an equivalence relation, use constrained clustering (e.g. hierarchical agglomerative clustering, HAC) with the pairwise decision model as a learned similarity measure to transform the pairwise probabilities into a binary relation $R$

[Figure: pipeline from Attributes through pairwise feature extraction, the probabilistic pairwise decision model and the collective decision model to Relations]
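The pairwise feature extraction step can be illustrated with a minimal sketch. It assumes two of the similarity measures named in the slides (Levenshtein and Jaccard), each normalized to [0, 1]; the function names, the record layout as dictionaries and the choice of attributes are hypothetical and not taken from the presentation. The resulting vector is what a probabilistic classifier would receive as input.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit costs, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Map the edit distance to a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def jaccard_similarity(a: str, b: str) -> float:
    """Overlap of the lower-cased whitespace tokens of both strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def pair_features(x: dict, y: dict,
                  attributes=("title", "author", "conference")) -> list:
    """One feature per (attribute, similarity measure) combination."""
    feats = []
    for name in attributes:
        feats.append(levenshtein_similarity(x[name].lower(), y[name].lower()))
        feats.append(jaccard_similarity(x[name], y[name]))
    return feats


# Publications 1 and 2 from the noisy example table.
p1 = {"title": "Classification of scientific publications",
      "author": "J. Smith", "conference": "ICDM 2005"}
p2 = {"title": "Classification of Hypertext",
      "author": "John Smith", "conference": "KDD"}
features = pair_features(p1, p2)
```

Here `features` has six entries (three attributes times two measures); a trained classifier would map such a vector to the probability that the pair is related.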

Relation Extraction: Collective Decision Model

[Figure: constrained clustering, with panels labeled Initialisation, Must Links and Cannot Links]
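A rough sketch of how a collective decision with must-link and cannot-link constraints might work is given below. It is one common scheme and not necessarily the authors' procedure: clusters start as singletons, must-linked objects are merged first, and the remaining clusters are merged greedily by average linkage as long as no cannot-link is violated. The function `constrained_hac`, the `threshold` parameter and the toy scores are assumptions for illustration; `sim` stands in for the learned pairwise decision model.

```python
from itertools import combinations


def constrained_hac(n, sim, must_links=(), cannot_links=(), threshold=0.5):
    """Greedy average-linkage clustering of objects 0..n-1 under constraints."""
    clusters = [{i} for i in range(n)]          # initialisation: singletons
    cannot = {frozenset(pair) for pair in cannot_links}

    def cluster_of(i):
        return next(c for c in clusters if i in c)

    def merge(a, b):
        clusters.remove(b)
        a |= b                                  # grow cluster a in place

    # Must-links are applied before any similarity-based merge.
    for i, j in must_links:
        a, b = cluster_of(i), cluster_of(j)
        if a is not b:
            merge(a, b)

    def violates(a, b):
        return any(frozenset((i, j)) in cannot for i in a for j in b)

    def average_linkage(a, b):
        return sum(sim(i, j) for i in a for j in b) / (len(a) * len(b))

    while True:
        best, best_pair = threshold, None
        for a, b in combinations(clusters, 2):
            if violates(a, b):
                continue                        # cannot-link blocks this merge
            score = average_linkage(a, b)
            if score > best:
                best, best_pair = score, (a, b)
        if best_pair is None:
            break                               # nothing left above threshold
        merge(*best_pair)
    return sorted(sorted(c) for c in clusters)


# Toy similarities standing in for the output of the pairwise model.
scores = {frozenset((0, 1)): 0.9, frozenset((1, 2)): 0.8,
          frozenset((0, 2)): 0.7, frozenset((3, 4)): 0.4}


def sim(i, j):
    return scores.get(frozenset((i, j)), 0.1)


unconstrained = constrained_hac(5, sim)
constrained = constrained_hac(5, sim, must_links=[(3, 4)],
                              cannot_links=[(0, 2)])
```

Each resulting cluster is one equivalence class, so every pair of objects inside a cluster belongs to the extracted binary relation. The sketch does not check whether the must-links themselves contradict a cannot-link.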

Multi-Relational Classification

• Relational classification problem:

    – Make use of additional information about related objects (i.e. their classes or attributes)

    – Propositionalize the relational data, e.g. with:

      $$\mathrm{freq}_c(x) := |\{x' \in N_x \mid c(x') = c\}|$$

      where

      $$N_x := \{x' \in X \mid (x, x') \in R,\ c(x') \neq \,.\,\}$$

      is the neighborhood of $x$ (the dot denotes an unknown class)
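The propositionalization by class frequencies can be sketched as follows. The representation of the relation as a set of ordered pairs, the use of `None` for an unknown class, and the second category name are assumptions made for this example only.

```python
from collections import Counter


def neighborhood(x, relation, labels):
    """N_x: objects related to x whose class is known."""
    return {b for a, b in relation if a == x and labels.get(b) is not None}


def class_frequencies(x, relation, labels, classes):
    """One count per class over the labeled neighbors of x."""
    counts = Counter(labels[n] for n in neighborhood(x, relation, labels))
    return [counts[c] for c in classes]


# Hypothetical symmetric relation between publications 1..4.
relation = {(1, 2), (2, 1), (1, 3), (3, 1), (1, 4), (4, 1)}
labels = {1: None, 2: "Data Mining", 3: "Data Mining", 4: "Databases"}
classes = ["Data Mining", "Databases"]

freq = class_frequencies(1, relation, labels, classes)
```

For publication 1 this yields the counts `[2, 1]`, a fixed-length attribute vector that a conventional classifier can consume.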


• Algorithm:

1. For each relation $R_j$, $j = 1$ to $m$:
    (a) Build an undirected weighted graph $G = (X, E)$ with weights $w: X \times X \to \mathbb{R}$
    (b) Perform relational classification simultaneously for all instances in the test set
    (c) Output a probability distribution
2. Apply ensemble classification to the resulting probability distributions of these relations
3. Output final classification

[Figure: one relational classification per relation, all feeding into the ensemble classification]

• Simple Relational Methods

    – Probabilistic Relational Neighbor Classifier (EPRN) [Macskassy and Provost 2003]

      $$P^{(t)}(c \mid x) = \frac{1}{Z} \sum_{x' \in N_x} w(x, x')\, P^{(t-1)}(c \mid x')$$

      where $Z$ is a normalization factor, $w$ is the weight and $t$ is the iteration

    – EPRN2HOP

        • Additionally takes the neighbors of the direct neighbors into account if the direct neighborhood size is small

      $$P^{(t)}(c \mid x) = \frac{1}{Z} \Big( \sum_{x' \in N_x} w(x, x')\, P^{(t-1)}(c \mid x') + \sum_{x'' \in N_{x'},\ |N_x| < d} w(x', x'')\, P^{(t-1)}(c \mid x'') \Big)$$
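The iterative update of the relational neighbor classifier can be sketched in a few lines. Several details are assumptions not stated in the slides: unlabeled nodes start from a uniform distribution, labeled nodes stay fixed, and all nodes are updated synchronously. The function name `eprn`, the graph layout and the class names "DM" and "DB" are hypothetical.

```python
def eprn(weights, known, classes, iterations=10):
    """Propagate class distributions over a weighted graph.

    weights: node -> {neighbor: weight}; known: node -> class label.
    """
    probs = {}
    for x in weights:
        if x in known:
            # Labeled nodes get a fixed one-hot distribution.
            probs[x] = {c: 1.0 if c == known[x] else 0.0 for c in classes}
        else:
            # Assumption: unlabeled nodes start uniform.
            probs[x] = {c: 1.0 / len(classes) for c in classes}

    for _ in range(iterations):
        new_probs = {}
        for x in weights:
            if x in known:
                new_probs[x] = probs[x]
                continue
            # Weighted sum of the neighbors' previous distributions.
            scores = {c: sum(w * probs[n][c] for n, w in weights[x].items())
                      for c in classes}
            z = sum(scores.values())            # normalization factor Z
            new_probs[x] = ({c: s / z for c, s in scores.items()}
                            if z > 0 else probs[x])
        probs = new_probs
    return probs


# Small graph: a and b are labeled, u and v are to be classified.
weights = {"a": {"u": 2.0}, "b": {"v": 1.0},
           "u": {"a": 2.0, "v": 1.0}, "v": {"u": 1.0, "b": 1.0}}
result = eprn(weights, known={"a": "DM", "b": "DB"},
              classes=["DM", "DB"], iterations=30)
```

In this toy graph, node `u` is pulled toward "DM" by its heavily weighted labeled neighbor, while `v` leans toward "DB".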

• Aggregation-based Relational Learning Methods

    – Use aggregation functions in order to propositionalize the set-valued attribute

    – Use the aggregated values as attributes for traditional machine learning methods

    – We used Logistic Regression as the classifier

[Figure: example with neighbors labeled Category 1, Category 2, Category 3 and Category 1]

Ensemble Classification

• Methods which combine different models

• Increase classification accuracy

• Usage

    – Combine results achieved by relational classification for different relations

    – Combine results of relational and local models

• Voting

  $$P(c \mid x) = \frac{1}{L} \sum_{l=1}^{L} P_{M_l}(c \mid x)$$

• Stacking

    – Use a meta-classifier to learn a model on the results of different models

    – Build new instances

      $$x_{new} = (P_1(c_1 \mid x), \ldots, P_1(c_n \mid x), \ldots, P_L(c_1 \mid x), \ldots, P_L(c_n \mid x), c)$$

    – Apply cross validation
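Both combination schemes can be illustrated with a short sketch. It assumes each model outputs a dictionary from class to probability; the function names, the class names and the example numbers are invented for illustration and are not results from the presentation.

```python
def vote(distributions):
    """Unweighted voting: average the class distributions of L models."""
    count = len(distributions)
    classes = distributions[0].keys()
    return {c: sum(d[c] for d in distributions) / count for c in classes}


def stacking_instance(distributions, classes, true_class):
    """Meta-level instance: all per-model probabilities, then the class."""
    features = [d[c] for d in distributions for c in classes]
    return features + [true_class]


# Hypothetical outputs of three models for one publication.
author_model = {"DM": 0.8, "DB": 0.2}
journal_model = {"DM": 0.6, "DB": 0.4}
text_model = {"DM": 0.4, "DB": 0.6}
models = [author_model, journal_model, text_model]

combined = vote(models)
x_new = stacking_instance(models, ["DM", "DB"], "DM")
```

Voting needs no training, whereas the stacked instances serve as training data for a meta-classifier, which is why the slides pair stacking with cross validation.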

Evaluation

• Data

    – CompuScience data set
        • 147 571 scientific papers
        • 77 topics (categories)
        • Relations: authors, reviewers, journals

    – Cora deduplication data set
        • 1 295 citations
        • 112 unique publications
        • Relation: SamePaper

    – Cora data set
        • 3 298 papers
        • 12 categories
        • Relations: conferences, authors, citations

Evaluation – Relation Extraction

Evaluation set  single linkage  complete linkage  average linkage
X_tst           0.90            0.74              0.92
X               0.92            0.71              0.93

F1 measure for finding the SamePaper relation on Cora

Pairwise feature extraction with TFIDF, Levenshtein, Jaccard and cosine similarity on all attributes
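For readers unfamiliar with the metric, the following sketch shows one way an F1 score for an extracted equivalence relation could be computed, namely over the object pairs that share a cluster. Whether the reported numbers were computed in exactly this pairwise fashion is an assumption; the function names and the toy clusterings are hypothetical.

```python
from itertools import combinations


def pairs_from_clusters(clusters):
    """All unordered object pairs that share a cluster."""
    return {frozenset(p) for c in clusters
            for p in combinations(sorted(c), 2)}


def pairwise_f1(predicted_clusters, true_clusters):
    """Harmonic mean of pairwise precision and recall."""
    predicted = pairs_from_clusters(predicted_clusters)
    actual = pairs_from_clusters(true_clusters)
    tp = len(predicted & actual)                # correctly linked pairs
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(actual)
    return 2 * precision * recall / (precision + recall)


# Toy clusterings of five citations.
truth = [[0, 1, 2], [3, 4]]
predicted = [[0, 1], [2], [3, 4]]
f1 = pairwise_f1(predicted, truth)
```

In this toy case every predicted pair is correct (precision 1.0) but only two of the four true pairs are found (recall 0.5).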

Evaluation – Multi-Relational Classification

• The ensemble of relational and content-based text classification achieved a significantly higher F-measure than the pure text classifier

3-fold cross validation on CompuScience for the Author, Reviewer and Journal relations

Evaluation – Multi-Relational Classification Using Automatically Extracted Relations

• 50%/50% splits, 10 runs

[Figure: "Author Relation" plot of accuracy (y-axis, 0.5 to 0.75) over runs 1 to 10 (x-axis), comparing the Annotated Relation with the Learned Relation]

Conclusion and Future Work

• Summary:

    – Presented a framework for relation extraction and multi-relational classification
        • Automatic relation extraction with record linkage
        • Relational classification using each extracted relation for classification and fusing the results with ensemble methods

• Future Work

    – Evaluate our framework on different data sets and relations

    – Evaluate the relational classifiers' quality depending on the quality of the extracted relations

Questions?

www.ismll.uni-hildesheim.de

Christine Preisach

Steffen Rendle

Lars Schmidt-Thieme

Thank you
