Learning Similarity Functions from Qualitative Feedback
Weiwei Cheng and Eyke Hüllermeier
University of Marburg, Germany


Jul 02, 2015


The performance of a case-based reasoning system often depends on the suitability of an underlying similarity (distance) measure, and specifying such a measure by hand can be very difficult. In this paper, we therefore develop a machine learning approach to similarity assessment. More precisely, we propose a method that learns how to combine given local similarity measures into a global one. As training information, the method merely assumes qualitative feedback in the form of similarity comparisons, revealing which of two candidate cases is more similar to a reference case. Experimental results, focusing on the ranking performance of this approach, are very promising and show that good models can be obtained with a reasonable amount of training information. See more at www.chengweiwei.com
Transcript
Page 1: Learning similarity functions from qualitative feedback

Learning Similarity Functions from Qualitative Feedback

Weiwei Cheng and Eyke Hüllermeier
University of Marburg, Germany

Page 2: Learning similarity functions from qualitative feedback

Introduction

Proper definition of similarity (distance) measures is crucial for CBR systems.

The specification of local similarity measures, pertaining to individual properties (attributes) of a case, is often less difficult than their combination into a global measure.

Goal of this work: Using machine learning techniques to support elicitation of similarity measures (combination of local into global measures) on the basis of qualitative feedback.

7/31/2009, Weiwei Cheng & Eyke Hüllermeier

Page 3: Learning similarity functions from qualitative feedback

Problem Setting

Local-global principle: The global distance is an aggregation of local distances

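For now, we focus on a linear model: the global distance is a weighted sum of the local distances, $\Delta(a,b) = \sum_{i=1}^{d} w_i \, \delta_i(a,b)$, with $w_i \ge 0$ (monotonicity).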

… easy to incorporate background knowledge
… amenable to efficient learning scheme
… non-linear extension via kernelization
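The linear aggregation of local distances can be sketched in a few lines of Python. The function name `global_distance` and the example weights and distance values are illustrative assumptions and do not come from the slides.

```python
from typing import Sequence


def global_distance(weights: Sequence[float], local_distances: Sequence[float]) -> float:
    """Weighted sum of local distances (linear local-global aggregation)."""
    if len(weights) != len(local_distances):
        raise ValueError("one weight per local distance is required")
    if any(w < 0 for w in weights):
        # Monotonicity: a negative weight would let the global distance
        # shrink when a local distance grows.
        raise ValueError("weights must be nonnegative")
    return sum(w * d for w, d in zip(weights, local_distances))


# Hypothetical example with three attributes.
weights = [0.5, 0.25, 0.25]
print(global_distance(weights, [0.5, 1.0, 0.0]))  # 0.5
```

With nonnegative weights, increasing any single local distance can only keep or increase the returned value, which is the monotonicity property named on the slide.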


Page 4: Learning similarity functions from qualitative feedback

Problem Setting cont.

Learning the weights from qualitative feedback: a triplet (a, b, c) means “case a is more similar to b than to c”.

Given a query, a distance measure induces a linear order on cases:

Notice: Often the ordering of cases is more important than the distance itself, so it is sufficient to find a distance measure that induces the same ordering of cases as the true one.
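A small sketch of why the ordering matters more than the distance values: two different distance measures can rank the cases identically for a given query. The helper names `rank_cases` and `weighted_l1`, the use of absolute attribute differences as local distances, and the toy cases are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Case = Tuple[float, ...]


def rank_cases(query: Case, cases: Sequence[Case],
               distance: Callable[[Case, Case], float]) -> List[Case]:
    """Order cases by increasing distance to the query."""
    return sorted(cases, key=lambda c: distance(query, c))


def weighted_l1(weights: Sequence[float]) -> Callable[[Case, Case], float]:
    """Linear combination of per-attribute absolute differences."""
    def distance(a: Case, b: Case) -> float:
        return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))
    return distance


query = (0.0, 0.0)
cases = [(1.0, 0.0), (0.0, 3.0), (2.0, 2.0)]

true_d = weighted_l1([1.0, 1.0])
scaled_d = weighted_l1([10.0, 10.0])  # different distances, same ordering

print(rank_cases(query, cases, true_d) == rank_cases(query, cases, scaled_d))  # True
```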


Page 5: Learning similarity functions from qualitative feedback

The Learning Algorithm

Basic idea: From distance learning to classification

Extension 1: Incorporating monotonicity

Extension 2: Ensemble learning

Extension 3: Active learning


Page 6: Learning similarity functions from qualitative feedback

From Distance Learning to Classification

[Figure: diagram of the reduction from distance learning to classification; surviving labels: "case base", "(d-dim. vector)"]
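One way to read the reduction, assuming the global distance is a weighted sum of local distances: "a is more similar to b than to c" holds exactly when the weighted sum of the differences between the local distances of (a, c) and of (a, b) is positive. Each comparison therefore yields one d-dimensional vector that a linear classifier through the origin should label positive. The sketch below is an illustration under that assumption; the function names and the choice of absolute attribute differences as local distances are hypothetical.

```python
from typing import List, Sequence, Tuple

Case = Sequence[float]


def local_distances(a: Case, b: Case) -> List[float]:
    """Assumed local distances: absolute difference per attribute."""
    return [abs(x - y) for x, y in zip(a, b)]


def triplet_to_example(a: Case, b: Case, c: Case) -> Tuple[List[float], int]:
    """Encode 'a is more similar to b than to c' as a positive example.

    The vector x = delta(a, c) - delta(a, b) satisfies <w, x> > 0 exactly
    when the weighted distance from a to b is smaller than from a to c.
    """
    d_ab = local_distances(a, b)
    d_ac = local_distances(a, c)
    return [ac - ab for ab, ac in zip(d_ab, d_ac)], +1


x, y = triplet_to_example(a=(0.0, 0.0), b=(1.0, 0.0), c=(3.0, 2.0))
print(x, y)  # [2.0, 2.0] 1
```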

Page 7: Learning similarity functions from qualitative feedback

Monotonicity

Our model requires that when a local distance increases, the global distance cannot decrease.

Our approach: (Noise-tolerant) Perceptron learning with a modified update rule.

The modified algorithm provably converges after a finite number of iterations.
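The exact update rule did not survive in this transcript. As a rough illustration only, the sketch below assumes one simple way to keep the weights nonnegative: perform the standard perceptron step and then clip negative components to zero. This is not necessarily the authors' rule, it does not include their noise-tolerance mechanism, and the function name and parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (feature vector, label in {-1, +1})


def train_monotone_perceptron(examples: Sequence[Example], dim: int,
                              eta: float = 1.0, epochs: int = 100) -> List[float]:
    """Perceptron whose weights are clipped at zero after every update."""
    w = [0.0] * dim
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misclassified (or on the boundary)
                # Standard perceptron step, then projection onto w >= 0.
                w = [max(0.0, wi + eta * y * xi) for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return w


# Hypothetical toy data.
examples = [([2.0, -1.0], +1), ([1.0, 1.0], +1), ([-1.0, 0.5], -1)]
print(train_monotone_perceptron(examples, dim=2))  # [2.0, 0.0]
```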

Page 8: Learning similarity functions from qualitative feedback

Ensemble Learning

[Figure: flow diagram with the labels "permutations of training data", "ensemble of perceptrons", "CoM of version space" and "Bayes point", next to a sketch of the hypothesis space showing the version space, the Bayes point and the committee]
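The ensemble idea can be sketched as follows, assuming that each committee member is a plain perceptron trained on a different permutation of the same data and that the members are combined by averaging their unit-length weight vectors. The function names, the number of members and the seed are illustrative assumptions; the slides do not give these details.

```python
import math
import random
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (feature vector, label in {-1, +1})


def perceptron(examples: Sequence[Example], dim: int, epochs: int = 50) -> List[float]:
    """Plain perceptron; the result depends on the order of the examples."""
    w = [0.0] * dim
    for _ in range(epochs):
        updated = False
        for x, y in examples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                updated = True
        if not updated:
            break
    return w


def committee_average(examples: Sequence[Example], dim: int,
                      n_members: int = 10, seed: int = 0) -> List[float]:
    """Average the unit-length weight vectors of perceptrons trained on
    different permutations of the same training data."""
    rng = random.Random(seed)
    total = [0.0] * dim
    for _ in range(n_members):
        shuffled = list(examples)
        rng.shuffle(shuffled)
        w = perceptron(shuffled, dim)
        norm = math.sqrt(sum(wi * wi for wi in w)) or 1.0
        total = [t + wi / norm for t, wi in zip(total, w)]
    return [t / n_members for t in total]
```

Because every consistent perceptron lies in the version space, their average also lies in it for separable data, and it tends to sit closer to the center than any single member.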

Page 9: Learning similarity functions from qualitative feedback

Active Learning

Goal: Reducing the feedback effort of the user by choosing the most informative training data.

Our approach (a variation of QBC, query by committee):
1. choose 2 most conflicting models
2. generate 2 rankings with these 2 models
3. get the first conflict pair of these rankings

Example:

ranking 1: a b c d e

ranking 2: a b d e c
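Step 3 can be sketched as below, under the assumption that "first conflict pair" means the two cases found at the first position where the rankings differ. Those two cases are necessarily ordered differently by the two rankings, provided both rank the same set of cases. The function name is hypothetical.

```python
from typing import Hashable, Optional, Sequence, Tuple


def first_conflict_pair(r1: Sequence[Hashable],
                        r2: Sequence[Hashable]) -> Optional[Tuple[Hashable, Hashable]]:
    """Return the two cases at the first position where the rankings differ."""
    for x, y in zip(r1, r2):
        if x != y:
            return (x, y)
    return None  # identical rankings: nothing to ask the user


print(first_conflict_pair("abcde", "abdec"))  # ('c', 'd')
```

For the two rankings on the slide, the user would then be asked whether c or d is more similar to the query.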

Page 10: Learning similarity functions from qualitative feedback

Experimental Setting


Goal: Investigating the efficacy of our approach and the effectiveness of the extensions:

1. incorporating monotonicity
2. ensemble learning
3. active learning

Data sets

            uni   iris   wine   yeast    nba
#features     6      4     13      24     15
#cases      200    150    178    2465   3924

Page 11: Learning similarity functions from qualitative feedback

Quality Measures


Kendall’s tau (a common rank correlation measure)
… defined by the number of rank inversions (normalized to [-1,+1])

Recall (a common retrieval measure)
… defined as the number of predicted top-k cases that are among the true top-k cases (k=10)

Position error
… defined by the position of the true topmost case in the predicted ranking (minus 1)
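The three measures can be sketched in Python as follows. The formulas on the slide were lost, so two choices here are assumptions: Kendall's tau uses the standard form 1 - 4 * inversions / (n * (n - 1)) for rankings without ties, and recall is divided by k to give a value in [0, 1]. Function names are illustrative.

```python
from typing import Hashable, Sequence


def kendall_tau(true_rank: Sequence[Hashable], pred_rank: Sequence[Hashable]) -> float:
    """1 - 4 * inversions / (n * (n - 1)); +1 for identical, -1 for reversed."""
    pos = {case: i for i, case in enumerate(pred_rank)}
    n = len(true_rank)  # assumes n >= 2 and no ties
    inversions = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if pos[true_rank[i]] > pos[true_rank[j]]
    )
    return 1.0 - 4.0 * inversions / (n * (n - 1))


def recall_at_k(true_rank: Sequence[Hashable], pred_rank: Sequence[Hashable],
                k: int = 10) -> float:
    """Share of the true top-k cases that appear in the predicted top-k."""
    return len(set(true_rank[:k]) & set(pred_rank[:k])) / k


def position_error(true_rank: Sequence[Hashable], pred_rank: Sequence[Hashable]) -> int:
    """Position of the true topmost case in the predicted ranking, minus 1."""
    return list(pred_rank).index(true_rank[0])
```

For example, comparing the true ranking a b c d e with the prediction a b d e c gives two inversions (c/d and c/e), a position error of 0, and two of the true top-3 cases recovered.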

Page 12: Learning similarity functions from qualitative feedback


Page 13: Learning similarity functions from qualitative feedback

Extension to Nonlinear Models


Actually, we only need linearity in the coefficients, not in the local distances. Therefore, some generalizations are easily possible, such as

More generally, with :
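The formulas for this slide are missing from the transcript. As an illustration of the general point only, the sketch below uses a hypothetical feature map that appends pairwise products of the local distances. The global distance is then nonlinear in the local distances but still linear in the weights, so the same learning scheme applies to the expanded vector. Function names and the choice of feature map are assumptions.

```python
from itertools import combinations
from typing import List, Sequence


def expand(local_distances: Sequence[float]) -> List[float]:
    """Hypothetical feature map: local distances plus pairwise products."""
    products = [x * y for x, y in combinations(local_distances, 2)]
    return list(local_distances) + products


def global_distance(weights: Sequence[float], local_distances: Sequence[float]) -> float:
    """Linear in the weights, nonlinear in the local distances."""
    features = expand(local_distances)
    if len(weights) != len(features):
        raise ValueError("one weight per expanded feature is required")
    return sum(w * f for w, f in zip(weights, features))


print(expand([0.5, 2.0, 4.0]))  # [0.5, 2.0, 4.0, 1.0, 2.0, 8.0]
```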

Page 14: Learning similarity functions from qualitative feedback

Extensions


Special case of a kernel function leads to kernelization:

Nonlinear classification and sorting

[Figure: surviving labels: "(a, b, c)", "b c", "classifier"; "(b, c)", "0.7", "distance"]

Page 15: Learning similarity functions from qualitative feedback

Conclusions


Learning to combine local distance measures into a global measure.

Only assuming qualitative feedback of the type “a is more similar to b than to c”.

Reduction of distance learning to classification.