Privacy-preserving SOM-based recommendations on horizontally distributed data

Presenter: Jian-Ren Chen
Authors: Cihan Kaleli, Huseyin Polat
2012, KBS


Feb 24, 2016


Transcript
Page 1

Intelligent Database Systems Lab

Privacy-preserving SOM-based recommendations on horizontally distributed data

Page 2

Outlines
• Motivation
• Objectives
• Methodology
• Privacy analysis
• Experiments
• Conclusions
• Comments

Page 3

Motivation
• Collaborative Filtering (CF) systems are used to suggest web pages.
  – A limited number of users' data -> lack of accuracy -> cold start problem
• Data are horizontally partitioned among multiple vendors.

Page 4

Objectives
• Companies holding an inadequate number of users' data might decide to combine their data.
  – Accurate predictions
  – Performance
• Privacy-preserving scheme

Page 5

Methodology
• Privacy-preserving SOM clustering on horizontally distributed data
• Privacy-preserving k-nn-based predictions on horizontally distributed data

a. Off-line
   i. Cluster users' data distributed among multiple parties using SOM while preserving data owners' privacy.
   ii. Compute aggregate data values required for recommendation estimations.

b. Online
   i. Determine a's cluster.
   ii. Estimate the prediction after receiving the required aggregate data from other parties. Return the referral to a.
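The online step "determine a's cluster" can be illustrated with a minimal Python sketch. It assumes that a cluster is chosen as the SOM weight vector closest to a's ratings in Euclidean distance, and that a has rated every item; the function name `nearest_cluster` and the sample data are hypothetical and are not taken from the slides.

```python
import math


def nearest_cluster(ratings, weights):
    """Return the index of the weight vector closest to `ratings`.

    Assumption: closeness is plain Euclidean distance over all items.
    """
    def dist(w):
        return math.sqrt(sum((r - x) ** 2 for r, x in zip(ratings, w)))

    return min(range(len(weights)), key=lambda j: dist(weights[j]))


# Hypothetical cluster weight vectors produced by the off-line phase.
weights = [[1.0, 1.0, 1.0], [4.0, 5.0, 4.0]]
active_user = [5.0, 4.0, 4.0]  # ratings of a (illustrative values)
cluster = nearest_cluster(active_user, weights)
```

Only the weight vectors are needed for this lookup, so the party serving a does not need the other parties' raw ratings at this point.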

Page 6

Methodology
• SOM clustering
• k-nn-based collaborative filtering

SOM clustering:
1. Determine values of initial constants.
2. Find the winning Kohonen layer neuron.
3. Update the weight vectors of all neurons.
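The slide's equations did not survive in the transcript, so the following is only a sketch of a textbook SOM training step, not the authors' exact formulation. It assumes a one-dimensional Kohonen layer, squared Euclidean distance for picking the winner, and a Gaussian neighbourhood function; `alpha` (learning rate) and `radius` are hypothetical names for the initial constants.

```python
import math


def winner(x, weights):
    """Index of the neuron whose weight vector is closest to input x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((xi - wi) ** 2 for xi, wi in zip(x, weights[j])),
    )


def som_step(x, weights, alpha, radius):
    """Apply one SOM update in place and return the winning neuron index."""
    c = winner(x, weights)
    for j, w in enumerate(weights):
        # Gaussian neighbourhood: the winner moves most, distant neurons least.
        h = math.exp(-((j - c) ** 2) / (2 * radius ** 2))
        weights[j] = [wi + alpha * h * (xi - wi) for xi, wi in zip(x, w)]
    return c


weights = [[0.0, 0.0], [1.0, 1.0]]
c = som_step([1.0, 0.8], weights, alpha=0.5, radius=1.0)
```

Every neuron is updated on each step, scaled by its distance from the winner on the layer, which matches the slide's wording "update the weight vectors of all neurons".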

Page 7

Methodology
• SOM clustering
• k-nn-based collaborative filtering

k-nn-based collaborative filtering:
• Pearson correlation coefficient
• The prediction for a on q
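Since the formulas are missing from the transcript, here is a hedged Python sketch of the two quantities named on the slide. It assumes the common z-score form of the prediction, p_aq = mean_a + sigma_a * sum(w_au * z_uq) / sum(|w_au|), with Pearson weights computed over co-rated items; the paper may normalise differently. The function names and the sample ratings are hypothetical.

```python
import math
from statistics import mean, pstdev


def pearson(a, u):
    """Pearson correlation over the items both users rated (dict: item -> rating)."""
    common = [i for i in a if i in u]
    if len(common) < 2:
        return 0.0
    ma = mean([a[i] for i in common])
    mu = mean([u[i] for i in common])
    num = sum((a[i] - ma) * (u[i] - mu) for i in common)
    den = math.sqrt(sum((a[i] - ma) ** 2 for i in common)) * math.sqrt(
        sum((u[i] - mu) ** 2 for i in common)
    )
    return num / den if den else 0.0


def z_score(u, q):
    """z-score of user u's rating on item q, relative to u's own ratings."""
    vals = list(u.values())
    s = pstdev(vals)
    return (u[q] - mean(vals)) / s if s else 0.0


def predict(a, neighbours, q):
    """Predict a's rating on q from the neighbours who rated q (assumed form)."""
    vals = list(a.values())
    num = den = 0.0
    for u in neighbours:
        if q not in u:
            continue
        w = pearson(a, u)
        num += w * z_score(u, q)
        den += abs(w)
    return mean(vals) + (pstdev(vals) * num / den if den else 0.0)


a = {"i1": 4, "i2": 2, "i3": 5}
u = {"i1": 5, "i2": 1, "i3": 4, "q": 5}
p_aq = predict(a, [u], "q")
```

In this toy example u agrees with a on the co-rated items and rated q above its own average, so the prediction lands above a's mean rating.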

Page 8

Methodology
• Privacy-preserving SOM clustering on horizontally distributed data
• Privacy-preserving k-nn-based predictions on horizontally distributed data

Privacy-preserving SOM clustering:
1. Determine values of initial constants: the number of clusters and the sequence of active parties.
2. The first party runs SOM: all users it holds are assigned to a cluster, and it sends the updated Wj vectors to the second party.
3. The next party repeats step 2 and sends the newly updated Wj vectors to the next party.
4. The last party sends the updated Wj vectors to the IP.
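The hand-off of Wj vectors from party to party can be sketched as follows. This is a single-process simulation under stated assumptions, not the authors' protocol: the SOM update is the textbook one with a Gaussian neighbourhood on a one-dimensional layer, and `distributed_som`, `alpha` and `radius` are hypothetical names. In a real deployment each loop iteration would run at a different party, and only the weight vectors would cross the network.

```python
import math


def som_step(x, weights, alpha, radius):
    """One SOM update in place; returns the index of the winning neuron."""
    c = min(
        range(len(weights)),
        key=lambda j: sum((xi - wi) ** 2 for xi, wi in zip(x, weights[j])),
    )
    for j, w in enumerate(weights):
        h = math.exp(-((j - c) ** 2) / (2 * radius ** 2))
        weights[j] = [wi + alpha * h * (xi - wi) for xi, wi in zip(x, w)]
    return c


def distributed_som(parties, initial_weights, alpha=0.5, radius=1.0):
    """Train sequentially: each party updates the weights with its own users.

    `parties` is a list of per-party user lists, already in the agreed order.
    """
    weights = [list(w) for w in initial_weights]  # copy; leave the input intact
    assignments = []
    for users in parties:
        # Local step: assign every local user to a cluster and update Wj.
        assignments.append([som_step(x, weights, alpha, radius) for x in users])
        # Only `weights` is passed on to the next party; raw ratings stay local.
    return weights, assignments


parties = [
    [[0.0, 0.1], [0.1, 0.0]],  # users held by the first party
    [[1.0, 0.9], [0.9, 1.0]],  # users held by the second party
]
final_weights, assignments = distributed_som(parties, [[0.2, 0.2], [0.8, 0.8]])
```

Each party keeps its own cluster assignments; the list returned here exists only so the simulation can be inspected.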

Page 9

Methodology
• Privacy-preserving SOM clustering on horizontally distributed data
• Privacy-preserving k-nn-based predictions on horizontally distributed data

Privacy-preserving k-nn-based predictions:
• paq = va + P
• Among C parties, P can be written as a combination of the aggregate values that each party computes.
• Choose j percent of the users who did not rate q, where j in (0,).
• Choose j percent of their zuj values, remove those values, and replace them with zero, where j in (0,].
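The idea that P splits across parties can be sketched in Python. The exact expression for P is missing from the transcript, so this assumes the common form P = sigma_a * sum(w_au * z_uq) / sum(|w_au|), in which both sums decompose into per-party partial sums. The function names `partial_aggregate` and `combine` are hypothetical, and the masking steps listed on the slide are not modelled.

```python
def partial_aggregate(neighbours):
    """Aggregate one party's local neighbours.

    `neighbours` is a list of (w_au, z_uq) pairs that the party computed
    from its own users; only the two sums leave the party.
    """
    num = sum(w * z for w, z in neighbours)
    den = sum(abs(w) for w, _ in neighbours)
    return num, den


def combine(mean_a, sigma_a, partials):
    """Combine per-party partial sums into the prediction p_aq = mean_a + P."""
    num = sum(n for n, _ in partials)
    den = sum(d for _, d in partials)
    return mean_a + (sigma_a * num / den if den else 0.0)


party_1 = [(0.5, 1.0), (0.5, -1.0)]
party_2 = [(1.0, 0.5)]
partials = [partial_aggregate(party_1), partial_aggregate(party_2)]
p_aq = combine(3.0, 2.0, partials)

# Pooling all neighbours in one place gives the same prediction.
pooled = combine(3.0, 2.0, [partial_aggregate(party_1 + party_2)])
```

Because the combined result equals the pooled one, collaboration does not require any party to reveal individual similarity weights or z-scores under this assumed form.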

Page 10

Privacy analysis
• Attacks and vulnerabilities:
  1) A1: Parties can form a coalition to capture a target party's data
  2) A2: Paying-off
  3) V1: Not able to return any result
  4) V2: Missing values in the aggregate values vector

Page 11

Experiments

• Data sets

Page 12

Experiments

Page 13

Experiments

Page 14

Experiments

Page 15

Conclusions
• Integrating split data significantly improves preciseness.
• Although privacy concerns make accuracy worse, the accuracy losses are smaller than the accuracy gains due to collaboration.

Page 16

Comments
• Advantages
  – accuracy, performance, and privacy
• Disadvantage
  – cost, accuracy
• Applications
  – Collaborative Filtering
  – Privacy-preserving scheme
