
Web-Mining Agents: Transfer Learning

TrAdaBoost

R. Möller, Institute of Information Systems, University of Lübeck

by: Haitham Bou Ammar, Maastricht University

Based on an excerpt of: Transfer for Supervised Learning Tasks

Traditional Machine Learning vs. Transfer

[Diagram: in traditional machine learning, each task, including source and target, trains its own learning system in isolation; in transfer learning, knowledge from the source task's learning system is passed to the target task's learning system.]

Transfer Learning Definition

- Notation:
  - Domain: $\mathcal{D} = \{\mathcal{X}, P(X)\}$, with
    - Feature space: $\mathcal{X}$
    - Marginal probability distribution: $P(X)$, where $X = \{x_1, \ldots, x_n\} \subset \mathcal{X}$
  - Given a domain $\mathcal{D}$, a task is: $\mathcal{T} = \{\mathcal{Y}, f(\cdot)\}$

Transfer Learning Definition

Given a source domain $\mathcal{D}_S$ and source learning task $\mathcal{T}_S$, and a target domain $\mathcal{D}_T$ and target learning task $\mathcal{T}_T$, transfer learning aims to improve the learning of the target predictive function $f_T(\cdot)$ using the source knowledge, where $\mathcal{D}_S \neq \mathcal{D}_T$ or $\mathcal{T}_S \neq \mathcal{T}_T$.

Transfer Definition

- Therefore, transfer learning applies if either:
  - Domain differences: $\mathcal{X}_S \neq \mathcal{X}_T$ or $P_S(X) \neq P_T(X)$
  - Task differences: $\mathcal{Y}_S \neq \mathcal{Y}_T$ or $P(Y_S \mid X_S) \neq P(Y_T \mid X_T)$

Questions to answer when transferring

Algorithms: TrAdaBoost

- Assumptions:
  - Source and target tasks have the same feature space: $\mathcal{X}_S = \mathcal{X}_T$
  - Marginal distributions are different: $P_S(X) \neq P_T(X)$

Not all source data might be helpful!

Algorithm: TrAdaBoost

- Idea: Iteratively reweight the source samples so as to:
  - reduce the effect of "bad" source instances
  - encourage the effect of "good" source instances

- Requires:
  - a labeled source-task data set
  - a very small labeled target-task data set
  - an unlabeled target data set
  - a base learner

Algorithm: TrAdaBoost

The algorithm has three components (a code sketch follows below):
1. Weight initialization
2. Hypothesis learning and error calculation
3. Weight update
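A minimal sketch of these three components, assuming binary labels in {0, 1} and scikit-learn's DecisionTreeClassifier (a decision stump) as the base learner; the function names and all variables (Xs, ys, Xt, yt, n_rounds) are illustrative, not from the slides:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost(Xs, ys, Xt, yt, n_rounds=20):
    """Hypothetical helper: Xs/ys = labeled source, Xt/yt = small labeled target."""
    n, m = len(Xs), len(Xt)
    X = np.vstack([Xs, Xt])
    y = np.concatenate([ys, yt])
    w = np.ones(n + m)                                # 1) weight initialization
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / n_rounds))
    learners, betas = [], []
    for _ in range(n_rounds):
        p = w / w.sum()
        # 2) hypothesis learning on the weighted union of source and target
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=p)
        err = np.abs(h.predict(X) - y)                # 0/1 loss per instance
        # ... but the error is measured on the target portion only
        eps = np.clip(np.sum(w[n:] * err[n:]) / np.sum(w[n:]), 1e-3, 0.49)
        beta_t = eps / (1.0 - eps)
        # 3) weight update: shrink misclassified source, boost misclassified target
        w[:n] *= beta ** err[:n]
        w[n:] *= beta_t ** (-err[n:])
        learners.append(h)
        betas.append(beta_t)
    return learners, betas

def tradaboost_predict(learners, betas, X):
    # majority vote over the second half of the rounds, as in the original paper
    half = len(learners) // 2
    score = sum(-np.log(b) * h.predict(X)
                for h, b in zip(learners[half:], betas[half:]))
    thresh = sum(-0.5 * np.log(b) for b in betas[half:])
    return (score >= thresh).astype(int)
```

The key design choice is the asymmetric update: source instances that the current hypothesis misclassifies are treated as "bad" and permanently down-weighted, while misclassified target instances are up-weighted as in ordinary AdaBoost.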


Algorithms: Self-Taught Learning

- Assumptions:
  - Source and target tasks have different feature spaces: $\mathcal{X}_S \neq \mathcal{X}_T$
  - Marginal distributions are different: $P_S(X) \neq P_T(X)$
  - Label spaces are different: $\mathcal{Y}_S \neq \mathcal{Y}_T$

Algorithms: Self-Taught Learning

- Framework:
  - Unlabeled source data set: $D_S = \{x_s^{(i)}\}_{i=1}^{m}$
  - Labeled target data set: $D_T = \{(x_T^{(j)}, y_T^{(j)})\}_{j=1}^{n}$, with $n \ll m$
- Example target task: build a classifier for cars and motorbikes.

Algorithms: Self-Taught Learning

- Step One: Discover high-level features from the source data by solving

$$\min_{b, a} \; \sum_{i=1}^{m} \Big\| x_s^{(i)} - \sum_{k} a_{s,k}^{(i)} b_k \Big\|_2^2 + \beta \big\| a_s^{(i)} \big\|_1 \quad \text{s.t.} \quad \|b_k\|_2 \le 1, \; \forall k \in \{1, \ldots, K_s\}$$

The first term is the reconstruction error, the $\ell_1$ term is the sparsity-inducing regularization term, and the constraint bounds the bases $b_k$.
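This objective is standard sparse coding (dictionary learning). As a minimal sketch, scikit-learn's DictionaryLearning solves essentially the same problem, with $\|b_k\|_2 \le 1$ enforced on the atoms; the sizes m, d, Ks, the value of beta, and the random Xs are placeholder assumptions:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

m, d, Ks, beta = 500, 64, 128, 1.0          # illustrative sizes, not from the slides
Xs = np.random.randn(m, d)                  # stand-in for the unlabeled source data

dico = DictionaryLearning(n_components=Ks, alpha=beta,
                          transform_algorithm='lasso_lars', max_iter=50)
A_s = dico.fit_transform(Xs)                # sparse activations a_s, shape (m, Ks)
B = dico.components_                        # learned bases b_k, shape (Ks, d)
```

Note that Ks > d here, i.e. the learned basis is overcomplete, which is typical for self-taught learning.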


Algorithm: Self-Taught Learning

- Step Two: Project the target data onto the attained features by solving

$$a^{*}_{T_j} = \arg\min_{a_{T_j}} \Big\| x_{T_j} - \sum_{k} a_{T_j}^{(k)} b_k \Big\|_2^2 + \beta \big\| a_{T_j} \big\|_1$$

Informally, find the activations in the attained bases such that:
1. the reconstruction error is minimized, and
2. the attained activation vector is sparse.
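Continuing the sketch above (reusing B, beta, and d from Step One), this per-sample lasso problem can be solved with scikit-learn's SparseCoder; Xt is a random stand-in for the small labeled target set:

```python
import numpy as np
from sklearn.decomposition import SparseCoder

Xt = np.random.randn(20, d)                 # stand-in for the small labeled target set

coder = SparseCoder(dictionary=B, transform_algorithm='lasso_lars',
                    transform_alpha=beta)
A_t = coder.transform(Xt)                   # activations a*_Tj, shape (n, Ks)
```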


Algorithms: Self-Taught Learning

- Step Three: Learn a classifier with the new features.

[Diagram: source task → learn new features (Step 1); target task → project target data (Step 2) → learn model (Step 3).]
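A minimal sketch of Step Three, continuing the previous sketches (A_t comes from Step Two, and yt is a random stand-in for the target labels); logistic regression is one reasonable choice of classifier, not one prescribed by the slides:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

yt = np.random.randint(0, 2, size=len(A_t))           # stand-in target labels
clf = LogisticRegression(max_iter=1000).fit(A_t, yt)  # learn f_T on the activations
print("target training accuracy:", clf.score(A_t, yt))
```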

Conclusions

- Transfer learning re-uses source knowledge to help a target learner.
- Transfer learning is not the same as generalization.
- TrAdaBoost transfers instances.
- Self-Taught Learning transfers features learned from unlabeled source data.

Next in Web-Mining Agents:

- Unlabeled Features Revisited
- Unsupervised Learning: Clustering
