
Towards Social User Profiling: Unified and Discriminative Influence Model for Inferring Home Locations

Jul 14, 2015

Page 1: Towards Social User Profiling: Unified and Discriminative Influence Model for Inferring Home Locations

Rui Li, Shengjie Wang, Hongbo Deng, Rui Wang, Kevin Chen-Chuan Chang University of Illinois at Urbana-Champaign

13/04/24 KDE Seminar: Yuto Yamaguchi 1

Paper Introduction Speaker: Yuto Yamaguchi

KDD ‘12

Page 2: Introduction

• Users’ locations are important to many applications
  •  e.g., advertisement, recommendation
• But most users do not provide their location information
  •  On Twitter, only 16% of users register city-level locations in their profiles
• The objective of this paper is to profile users’ home locations in a social network.

Page 3: General Ideas for Location Inference

• A user is more likely to follow another user who lives nearby
  •  e.g., a user in Chicago follows another user in Chicago
  •  [Backstrom et al., WWW ‘10], [Clodoveu et al., T-GIS ‘11], …
• A user is more likely to post about a location near them
  •  e.g., a user in Houston posts about rockets
  •  [Cheng et al., CIKM ‘10], [Chandra et al., SocialCom ’11], [Kinsella et al., SMUC ‘11], …

Page 4: Challenges

• On Twitter, the following network and tweets provide valuable signals for profiling users’ home locations
• But there are two challenges:
• Scarce signals
  •  126 friends on average, but only 16% of them provide locations
  •  6 location-related terms in every 100 tweets
• Noisy signals
  •  a user may follow another user who lives in a distant location
  •  a user may post about distant locations

Page 5: Ideas in this paper

• The authors propose a unified discriminative influence model (UDI), which has the two features below
• Unified signals (for the scarce-signal challenge)
  •  Integrates social network and user-centric data (i.e., tweets) in a probabilistic framework, which is viewed as a heterogeneous graph
• Discriminative influence (for the noisy-signal challenge)
  •  Users and locations have their own influence scopes
  •  e.g., Lady Gaga (with a broad influence scope) is more likely to be followed by a user far away
  → users with broad scopes provide weaker signals for location inference

Page 6: Contributions

• Propose a unified discriminative influence model (UDI)
  •  Heterogeneous graph
  •  Influence scope
• Propose two location profiling methods using the above model (introduced later)
  •  Local prediction method
  •  Global prediction method
• Conduct extensive experiments using a Twitter dataset
  •  Their method places 66% of users within 100 miles of their true location

Page 7: PROBLEM FORMULATION

Page 8: Heterogeneous Graph

• User nodes: $u_i \in U$
• Venue nodes: $v_j \in V$
• If $u_i$ posts about $v_j$, create an edge $\langle u_i, v_j \rangle$
• If $u_i$ follows $u_j$, create an edge $\langle u_i, u_j \rangle$
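The two edge rules above can be sketched as a small data structure. This is only an illustration: the class name `TwitterGraph`, its method names, and the example users and venue are assumptions, not something taken from the paper or the slides.

```python
from dataclasses import dataclass, field


@dataclass
class TwitterGraph:
    """Heterogeneous graph with user nodes U, venue nodes V and two edge types."""
    users: set = field(default_factory=set)
    venues: set = field(default_factory=set)
    follow_edges: set = field(default_factory=set)  # (u_i, u_j): u_i follows u_j
    tweet_edges: set = field(default_factory=set)   # (u_i, v_j): u_i posts about v_j

    def add_follow(self, follower, followee):
        # Both endpoints of a follow edge are user nodes.
        self.users.update((follower, followee))
        self.follow_edges.add((follower, followee))

    def add_tweet(self, user, venue):
        # A tweet edge links a user node to a venue node.
        self.users.add(user)
        self.venues.add(venue)
        self.tweet_edges.add((user, venue))


# Hypothetical example: alice follows bob and posts about Chicago.
g = TwitterGraph()
g.add_follow("alice", "bob")
g.add_tweet("alice", "Chicago")
```

Keeping the two edge sets separate mirrors the "heterogeneous" aspect: follow edges and tweet edges can later be given different likelihood terms.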

Page 9: Location Profiling Problem

Given a Twitter graph $G$, estimate a location $\hat{L}_{u_i}$ for each user $u_i$ so that $\hat{L}_{u_i}$ is close to $u_i$’s true location $L_{u_i}$.

Page 10: INFLUENCE MODEL

Page 11: Motivation 1/2

• Nearby users (venues) are more likely to be followed (tweeted about) than distant ones

Page 12: Motivation 2/2

• Each user (venue) has an influence scope of a different size

[Figure: influence scopes of an influential user and a regular user.]

Page 13: Basic Ideas for the Influence Model

• A geographically influential user has a broad influence scope
  •  e.g., worldwide celebrities such as Lady Gaga
• The fact that a user follows a geographically influential user does NOT provide a valuable signal for location inference
  •  NOT VALUABLE: a user follows Lady Gaga
  •  VALUABLE: a user follows a regular user in Chicago

Page 14: Model Formulation

• The authors adopt a Gaussian distribution to model the above characteristics

[Figure: Gaussian surface $\mathcal{N}(L_{n_i}, \Sigma_{n_i})$ over latitude and longitude; the height is the probability to follow (tweet), and the spread is node $n_i$’s influence scope.]

Page 15: Influence scope – users

[Figure: Gaussian surface $\mathcal{N}(L_{u_i}, \Sigma_{u_i})$ over latitude and longitude, centered at user $u_i$’s home location. The height is the probability to follow $u_i$: high near the center, low far from it. The spread is user $u_i$’s influence scope.]

Page 16: Influence scope – venues

[Figure: Gaussian surface $\mathcal{N}(L_{v_i}, \Sigma_{v_i})$ over latitude and longitude, centered at venue $v_i$’s location. The height is the probability to tweet about $v_i$: high near the center, low far from it. The spread is venue $v_i$’s influence scope.]

Page 17: Different scope size – users

[Figure: influence scopes of a regular user and a geographically influential user (high influence).]

• A geographically influential user is more likely to be followed by distant users

Page 18: Different scope size – venues

[Figure: influence scopes of a regular venue and a geographically influential venue (high influence).]

• A geographically influential venue is more likely to be tweeted about by distant users

Page 19: Model Parameters

• Mean and variance for each Gaussian $\mathcal{N}(L_{n_i}, \Sigma_{n_i})$
• The mean $L_{n_i}$ is the location of node $n_i$
• The variance $\Sigma_{n_i}$ decides the size of each influence scope:

$$\Sigma_{n_i} = \begin{pmatrix} \sigma_{n_i} & 0 \\ 0 & \sigma_{n_i} \end{pmatrix}$$

• The number of parameters is $2(|U| + |V|)$

Page 20: LOCATION PROFILING METHODS

• Local prediction method
• Global prediction method

Page 21: Basic Ideas for Location Profiling

• Estimate the model parameters that maximize the likelihood of observing the given Twitter graph
• Parameters: $L_{n_i}$ and $\Sigma_{n_i}$ for each node $n_i$

Page 22: Local Prediction Method

• This method only considers the ego-network
• Maximize the likelihood of this network

[Figure: ego-network of an unlabeled user, with follow and tweet edges to its neighbors, some of which are labeled users.]

• labeled user: a user whose location is known
• unlabeled user: a user whose location is unknown

Page 23: Likelihood Function of Local Method

$$
\begin{aligned}
P(\text{ego-network of } u_i \mid \text{parameters}) ={} & \prod_{u_j \in \mathrm{Followers}(u_i)} P(u_j \text{ follows } u_i \mid L_{u_j}, L_{u_i}, \Sigma_{u_i}) \\
\times & \prod_{u_j \in \mathrm{Followees}(u_i)} P(u_i \text{ follows } u_j \mid L_{u_i}, L_{u_j}, \Sigma_{u_j}) \\
\times & \prod_{v_j \in \mathrm{Venues}(u_i)} P(u_i \text{ tweets } v_j \mid L_{u_i}, L_{v_j}, \Sigma_{v_j})
\end{aligned}
$$

• Each factor is Gaussian
• Maximize this function

Page 24: Each Gaussian

$$P(u_j \text{ follows } u_i \mid L_{u_j}, L_{u_i}, \Sigma_{u_i}) = \frac{1}{2\pi\sigma_{u_i}^2} \exp\left( \frac{(X_{u_i} - X_{u_j})^2 + (Y_{u_i} - Y_{u_j})^2}{-2\sigma_{u_i}^2} \right)$$

• High probability if $u_i$ and $u_j$ are close
• Relatively high probability even for a distant $u_j$ if $u_i$ has a broad influence scope (large $\sigma_{u_i}$)
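The Gaussian on this slide can be evaluated directly. The sketch below is a minimal illustration that treats coordinates as planar (x, y) pairs, which is a simplifying assumption; the function name `follow_probability` and the example values are hypothetical.

```python
import math


def follow_probability(loc_follower, loc_followed, sigma):
    """Density that a user at loc_follower follows a user at loc_followed.

    sigma is the followed user's influence scope; locations are (x, y)
    pairs in a planar coordinate system (an assumption of this sketch).
    """
    dx = loc_followed[0] - loc_follower[0]
    dy = loc_followed[1] - loc_follower[1]
    squared_distance = dx * dx + dy * dy
    return math.exp(-squared_distance / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)


# A nearby follower is more likely than a distant one for the same scope.
near = follow_probability((0.0, 0.0), (1.0, 0.0), sigma=1.0)
far = follow_probability((0.0, 0.0), (10.0, 0.0), sigma=1.0)

# For a distant follower, a broad scope gives a higher density than a narrow one.
far_broad = follow_probability((0.0, 0.0), (10.0, 0.0), sigma=10.0)
```

Note that a broad scope also lowers the density for nearby followers, because the same probability mass is spread over a larger area.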

Page 25: Solution of Local Method

$$X_{u_i} = \frac{\displaystyle \sum_{u_j \in \mathrm{Followers}(u_i)} \frac{X_{u_j}}{\sigma_{u_i}} + \sum_{u_j \in \mathrm{Followees}(u_i)} \frac{X_{u_j}}{\sigma_{u_j}} + \sum_{v_j \in \mathrm{Venues}(u_i)} \frac{X_{v_j}}{\sigma_{v_j}}}{\displaystyle \sum_{u_j \in \mathrm{Followers}(u_i)} \frac{1}{\sigma_{u_i}} + \sum_{u_j \in \mathrm{Followees}(u_i)} \frac{1}{\sigma_{u_j}} + \sum_{v_j \in \mathrm{Venues}(u_i)} \frac{1}{\sigma_{v_j}}}$$

$$\sigma_{u_i}^2 = \sum_{u_j \in \mathrm{Followers}(u_i)} \frac{(X_{u_i} - X_{u_j})^2 + (Y_{u_i} - Y_{u_j})^2}{2\,|\mathrm{Followers}(u_i)|}$$

• Obtained in closed form by substitution (no need to memorize)
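As a rough illustration of the closed-form solution above, the location estimate is a weighted mean of neighbor locations with weights 1/σ, and the scope estimate averages squared distances to followers. The function names and the input layout are assumptions made for this sketch; applying the same weighted mean to the Y coordinate is also an assumption, since the slide only shows X.

```python
def local_estimate(neighbors):
    """Weighted mean of neighbor locations with weights 1 / sigma.

    neighbors: list of ((x, y), sigma) pairs, one per edge in the ego-network.
    For a follower edge, sigma is the user's own scope; for a followee or
    venue edge, it is the scope of that followee or venue.
    """
    weight_sum = sum(1.0 / s for _, s in neighbors)
    x = sum(p[0] / s for p, s in neighbors) / weight_sum
    y = sum(p[1] / s for p, s in neighbors) / weight_sum
    return x, y


def scope_variance(location, follower_locations):
    """sigma^2 = sum of squared distances to followers / (2 * number of followers)."""
    total = sum(
        (location[0] - fx) ** 2 + (location[1] - fy) ** 2
        for fx, fy in follower_locations
    )
    return total / (2 * len(follower_locations))


# Hypothetical ego-network: one narrow-scope neighbor at the origin and one
# broad-scope neighbor at x = 10. The estimate is pulled toward the narrow one.
estimate = local_estimate([((0.0, 0.0), 1.0), ((10.0, 0.0), 4.0)])
variance = scope_variance((0.0, 0.0), [(3.0, 4.0), (0.0, 0.0)])
```

The weighting is what makes the model discriminative: a neighbor with a broad scope contributes less to the estimate than a neighbor with a narrow one.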

Page 26: Global Prediction Method

• This method maximizes the likelihood of the whole network
• Predicts the locations of all unlabeled users simultaneously

Page 27: Likelihood Function of Global Method

$$
\begin{aligned}
P(\text{whole network} \mid \text{parameters}) ={} & \prod_{\langle u_i, u_j \rangle \in \mathrm{FollowEdges}} P(u_i \text{ follows } u_j \mid L_{u_i}, L_{u_j}, \Sigma_{u_j}) \\
\times & \prod_{\langle u_i, v_j \rangle \in \mathrm{TweetEdges}} P(u_i \text{ tweets } v_j \mid L_{u_i}, L_{v_j}, \Sigma_{v_j})
\end{aligned}
$$

• Each factor is Gaussian
• Maximize this function

Page 28: Iterative Algorithm for Global Method

• The global method has no closed-form solution → iterative algorithm

1. Initialize locations $L_u$ for all unlabeled users
2. $k \leftarrow 1$
3. repeat
    1. update $\sigma_n^k$ for all nodes using $L_u$
    2. repeat
        1. update $L_u^k$ for all unlabeled users using $\sigma_n^k$ and $L_u^k$
    3. until converge
    4. $L_u \leftarrow L_u^k$
    5. $k \leftarrow k + 1$
4. until converge
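The alternating structure of the iterative algorithm can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it uses follow edges only, planar coordinates, a centroid initialization, a default scope of 1.0 for nodes without followers, and a small floor on σ, all of which are assumptions. All function names are hypothetical.

```python
import math


def update_sigmas(loc, follow_edges, floor=1e-3):
    """Scope per node: sigma^2 = sum of squared distances to followers / (2 * count)."""
    sq_sum, count = {}, {}
    for follower, followee in follow_edges:
        dx = loc[follower][0] - loc[followee][0]
        dy = loc[follower][1] - loc[followee][1]
        sq_sum[followee] = sq_sum.get(followee, 0.0) + dx * dx + dy * dy
        count[followee] = count.get(followee, 0) + 1
    sigma = {n: 1.0 for n in loc}  # assumed default for nodes without followers
    for n in count:
        sigma[n] = max(math.sqrt(sq_sum[n] / (2 * count[n])), floor)
    return sigma


def update_location(u, loc, sigma, follow_edges):
    """Weighted mean of neighbor locations with weights 1 / sigma."""
    wx = wy = w = 0.0
    for follower, followee in follow_edges:
        if followee == u:      # follower edge: use u's own scope
            other, s = follower, sigma[u]
        elif follower == u:    # followee edge: use the followee's scope
            other, s = followee, sigma[followee]
        else:
            continue
        wx += loc[other][0] / s
        wy += loc[other][1] / s
        w += 1.0 / s
    return (wx / w, wy / w) if w > 0 else loc[u]


def global_predict(labeled, unlabeled, follow_edges, tol=1e-6, max_iter=100):
    """Alternate between scope updates (outer loop) and location updates (inner loop)."""
    cx = sum(p[0] for p in labeled.values()) / len(labeled)
    cy = sum(p[1] for p in labeled.values()) / len(labeled)
    loc = dict(labeled)
    loc.update({u: (cx, cy) for u in unlabeled})  # initialize at the labeled centroid
    for _ in range(max_iter):
        sigma = update_sigmas(loc, follow_edges)
        before = {u: loc[u] for u in unlabeled}
        for _ in range(max_iter):
            shift = 0.0
            for u in unlabeled:
                new = update_location(u, loc, sigma, follow_edges)
                shift = max(shift, math.dist(new, loc[u]))
                loc[u] = new
            if shift < tol:
                break
        if max((math.dist(loc[u], before[u]) for u in unlabeled), default=0.0) < tol:
            break
    return {u: loc[u] for u in unlabeled}


# Hypothetical example: x follows two labeled users and lands between them.
pred = global_predict({"a": (0.0, 0.0), "b": (2.0, 0.0)}, ["x"], [("x", "a"), ("x", "b")])
```

Labeled users keep their known locations throughout; only unlabeled locations and the scopes change between iterations.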

Page 29: EXPERIMENTS

Page 30: Dataset

• Twitter dataset
  •  Crawled profiles, followers, and followees of 3,980,061 users
  •  Geocoded their profile locations into coordinates based on the U.S. Gazetteer
  •  630,187 users are correctly geocoded ← labeled users
  •  158,220 of the labeled users have at least one labeled neighbor (neighbor: follower or followee)
  •  Crawled at most 600 tweets for each labeled user, and obtained 139,180 users’ tweets (the other users are protected users)
• Using this dataset, the authors conducted five-fold cross validation
  •  80% of the 139,180 users form the training set, 20% form the test set
  •  Repeated for 5 runs
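The five-fold protocol described above (80% training, 20% test, 5 runs) can be sketched as a generic split. The function name `five_fold_splits`, the shuffling seed, and the integer user IDs are assumptions for illustration; the slides do not say how the authors assigned users to folds.

```python
import random


def five_fold_splits(user_ids, seed=0):
    """Yield 5 (train, test) pairs; each user is in the test set exactly once."""
    ids = list(user_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::5] for i in range(5)]
    for i in range(5):
        test = folds[i]
        train = [u for j, fold in enumerate(folds) if j != i for u in fold]
        yield train, test


# Hypothetical example with 100 user IDs: each run has 80 training and 20 test users.
splits = list(five_fold_splits(range(100)))
```

In this setting, training users keep their known locations as labels, and test users are treated as unlabeled when predicting.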

Page 31: Methods

• Compared 6 methods
  •  BaseU: Backstrom et al.’s method [1], using only the social graph (no influence model)
  •  BaseC: Cheng et al.’s method [2], using only tweets (no influence model)
  •  UDIU: Local prediction method, but only uses user nodes
  •  UDIC: Local prediction method, but only uses venue nodes
  •  UDII: Local prediction method
  •  UDIG: Global prediction method

[1] Backstrom et al., “Find me if you can: improving geographical prediction with social and spatial proximity”, WWW’10
[2] Cheng et al., “You are where you tweet: a content-based approach to geo-locating twitter users”, CIKM’10

Page 32: Results – Prediction results

• ACC: fraction of users whose predicted location is within 100 miles of their true location
• AED@k%: average error distance of the top k% of users
• The influence model is effective for predicting locations
  •  Comparing BaseU and UDIU (BaseC and UDIC)
• Integrating both signals is effective for predicting locations
  •  Comparing UDIU and UDII (UDIC and UDII)
• The global method improves on the local one by only 1.5%
  •  Comparing UDIG and UDII
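The two metrics can be computed from per-user error distances. The sketch below assumes that "top k%" means the k% of users with the smallest errors and that the error distance is a great-circle distance in miles; both are interpretations, not statements from the slides. The function names and the Earth radius constant are choices made for this illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used for great-circle distance


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def accuracy_within(errors_miles, threshold=100.0):
    """ACC: fraction of users whose error distance is within the threshold."""
    return sum(e <= threshold for e in errors_miles) / len(errors_miles)


def aed_at(errors_miles, k_percent):
    """AED@k%: mean error over the k% of users with the smallest errors (assumed ranking)."""
    ranked = sorted(errors_miles)
    n = max(1, len(ranked) * k_percent // 100)
    return sum(ranked[:n]) / n


# Hypothetical error distances (miles) for five test users.
errors = [10.0, 50.0, 120.0, 300.0, 20.0]
acc = accuracy_within(errors)
aed60 = aed_at(errors, 60)
aed100 = aed_at(errors, 100)
```

AED@100% is the plain mean error, while smaller k values show how accurate the method is on the users it locates best.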

Page 33: Results – Global and Local

• With 20% training users and 80% test users: +9% in ACC
• When most users are unlabeled, the global method improves on the local one substantially

Page 34: Results – Influence scope

• Users with a large number of followers do not always have a large σ
  •  e.g., MythBusters Official has a larger σ than Lady Gaga but fewer followers

Page 35: CONCLUSION

Page 36: Conclusion

• Proposed
  •  Unified discriminative influence model (UDI)
  •  Two location prediction methods based on the influence model: global and local
• Conducted experiments using a large Twitter dataset
  •  The proposed methods significantly outperform existing methods
• NO future work