Local/Global Term Analysis for Discovering Community Differences in Social Networks

David Fuhry, Yiye Ruan, and Srinivasan Parthasarathy

Data Mining Research Laboratory

Dept. of Computer Science and Engineering

The Ohio State University

Communities in Social Networks

Observations:
• Social networks consist of many interacting communities of users.
• Each community can be characterized by the content which its members generate.

Motivating questions:
• Given a community, how can we determine what its members are talking about, relative to the entire social network?
• Given two communities, how can we determine the difference between them?

Methodology

• A community’s users mention relevant terms frequently.

• Many works look at #hashtags or most frequent terms.

• But not all frequent terms are relevant.

• Desiderata:
– Consider all content terms
– Interpretable
– Scalable to million-user social networks

Four-step Process

• Four-step process for determining community differences:
– Community Discovery
– Term Extraction & Aggregation
– Visualization
– Handling Time Varying Data

1. Community Discovery (I)

• Keyword search based identification of candidate users

• Extract underlying network of users

• Local community identification
– Graph clustering (e.g. METIS [KARYPIS’99], Graclus [DHILLON’07], MLR-MCL [SATULURI’09], Localized Clustering (L-Spar) [SATULURI’11])

• Modularity [NEWMAN’04]

• Content-Sensitive Viewpoint Neighborhoods [Asur’09]
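The first two bullets of this slide (keyword search for candidate users, then extraction of their underlying network) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `candidate_users` and `induced_edges`, the `(user, text)` message format, and the whole-word matching rule are all assumptions.

```python
import re

def candidate_users(messages, keywords):
    """Return the users who mention at least one keyword.

    messages: iterable of (user, text) pairs (assumed format).
    Matching is case-insensitive on whole words (an assumption).
    """
    wanted = {k.lower() for k in keywords}
    return {
        user
        for user, text in messages
        if wanted & set(re.findall(r"\w+", text.lower()))
    }

def induced_edges(edges, users):
    """Keep only the edges whose two endpoints are both candidate users."""
    return [(u, v) for u, v in edges if u in users and v in users]

messages = [("a", "I love my Nikon!"), ("b", "lunch time"), ("c", "nikon lens")]
edges = [("a", "b"), ("a", "c"), ("b", "c")]
users = candidate_users(messages, ["nikon"])
sub = induced_edges(edges, users)
```

The resulting edge list can then be handed to any of the clustering methods named above to identify local communities.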

1. Community Discovery (II)

• Start with the network of all users

• Extract candidate communities
– Using any community discovery algorithm

• Filter candidate communities by keyword strength
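The slides do not define "keyword strength". As a hedged sketch, the filter below assumes it is the fraction of a community's members who mention the keyword at least once. The names `keyword_strength`, `filter_communities`, and the `min_strength` threshold are hypothetical.

```python
def keyword_strength(community, user_texts, keyword):
    """Fraction of members with at least one message containing the keyword.

    This definition of "strength" is an assumption, not taken from the slides.
    community: set of user ids; user_texts: dict of user id -> list of messages.
    """
    members = list(community)
    if not members:
        return 0.0
    kw = keyword.lower()
    hits = sum(
        1
        for user in members
        if any(kw in text.lower() for text in user_texts.get(user, []))
    )
    return hits / len(members)

def filter_communities(communities, user_texts, keyword, min_strength=0.5):
    """Keep the candidate communities whose strength meets the threshold."""
    return [
        c for c in communities
        if keyword_strength(c, user_texts, keyword) >= min_strength
    ]

communities = [{"a", "b"}, {"c", "d"}]
user_texts = {"a": ["nikon d90"], "b": ["Nikon rocks"], "c": ["hello"], "d": ["nikon"]}
kept = filter_communities(communities, user_texts, "nikon", min_strength=0.6)
```

Here the first community has strength 1.0 and the second 0.5, so only the first survives a threshold of 0.6.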

2. Term Extraction & Aggregation

• Extract terms from each message and weight them

– Term Frequency
– TF/IDF
– Domain-dependent semantic importance

• Merge terms
– Combine synonyms
– Handle hypernyms

• Aggregate them by user
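The three steps on this slide (extract and weight, merge, aggregate by user) could look roughly like the sketch below. It assumes each message is one document for the IDF computation, uses the plain `log(N / df)` form of IDF, and merges synonyms through a small hand-written table. The `SYNONYMS` table and the function names are illustrative only, and hypernym handling is left out.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical synonym table: maps a variant to its canonical term.
SYNONYMS = {"pics": "photos", "pictures": "photos"}

def extract_terms(text):
    """Lowercase, tokenize, and merge synonyms into a canonical term."""
    tokens = re.findall(r"[a-z0-9#@']+", text.lower())
    return [SYNONYMS.get(t, t) for t in tokens]

def tfidf_by_user(messages):
    """Weight terms per message with TF-IDF, then sum the weights per user.

    messages: list of (user, text) pairs; each message counts as one document.
    """
    docs = [(user, Counter(extract_terms(text))) for user, text in messages]
    n_docs = len(docs)
    doc_freq = Counter()
    for _, tf in docs:
        doc_freq.update(tf.keys())  # count each term once per message
    per_user = defaultdict(Counter)
    for user, tf in docs:
        for term, count in tf.items():
            idf = math.log(n_docs / doc_freq[term])
            per_user[user][term] += count * idf
    return per_user

weights = tfidf_by_user([("a", "nice pics"), ("a", "nice lens"), ("b", "nice day")])
```

A term that appears in every message (here "nice") receives weight zero, which illustrates why the most frequent terms are not necessarily the most relevant ones.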

3. Visualization

• Plot terms by frequency across two axes.

– Global (all users) on Y-axis
– Local (community users) on X-axis

• Terms on the regression line are equifrequent in both groups

• Terms off the regression line are relatively more frequent in one group

• Support for multiple scales of local community identification
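To make the regression-line reading concrete, the sketch below fits an ordinary least-squares line to (local, global) term frequencies and reports each term's vertical offset from it. The slides do not say how the line is fitted or whether the axes are log-scaled, so plain least squares on raw frequencies is an assumption, as are the names `fit_line` and `term_offsets`.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Raises ZeroDivisionError if all x values are identical.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def term_offsets(local_freq, global_freq):
    """Vertical offset of each shared term from the fitted line.

    Local frequency is on the X-axis and global frequency on the Y-axis.
    Offset > 0: relatively more frequent globally.
    Offset < 0: relatively more frequent in the local community.
    """
    terms = sorted(set(local_freq) & set(global_freq))
    xs = [local_freq[t] for t in terms]
    ys = [global_freq[t] for t in terms]
    slope, intercept = fit_line(xs, ys)
    return {t: y - (slope * x + intercept) for t, x, y in zip(terms, xs, ys)}

offsets = term_offsets(
    {"lens": 1, "health": 2, "blog": 3},
    {"lens": 1, "health": 5, "blog": 3},
)
```

In this toy input, "health" lies above the fitted line (more frequent globally), while "lens" and "blog" lie below it (relatively more frequent locally).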

4. Handling Time Varying Data

• Time range divided into batches
• Perform steps 1 to 3 for each batch
• Visualize results
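Dividing the time range into batches might be sketched as below, assuming fixed-width batches over numeric timestamps (for example Unix seconds). The slides do not specify the batch width or whether batches overlap; `batch_by_time` and its parameters are hypothetical names.

```python
from collections import defaultdict

def batch_by_time(messages, start, width):
    """Group messages into consecutive fixed-width time batches.

    messages: iterable of (timestamp, user, text) with numeric timestamps.
    Returns a dict mapping batch index -> list of (user, text), ordered by
    index. Messages before `start` are skipped; empty batches are omitted.
    """
    batches = defaultdict(list)
    for timestamp, user, text in messages:
        if timestamp < start:
            continue
        index = int((timestamp - start) // width)
        batches[index].append((user, text))
    return dict(sorted(batches.items()))

batches = batch_by_time(
    [(0, "a", "x"), (5, "b", "y"), (12, "a", "z")], start=0, width=10
)
```

Steps 1 to 3 would then be run on each batch's message list in turn.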

Experimental Results

Using a dataset of 1M tweets, we look at groups discussing Canon, Nikon, and Olympus cameras:

Between the Nikon and Olympus communities, the Olympus community talks more about blogs.

Experimental Results

Between the camera and global communities, the camera community talks less about health, teeth, and success.

Experimental Results

Using a dataset of 2M tweets about the “Occupy” movement, we compare “Occupy Oakland” to the entire “Occupy” movement:

The Occupy Oakland movement talks less about NYPD, p2 (a group of progressives using social media), and tcot (“Top Conservatives On Twitter”).

Filter and Zoom

Conclusions

• Four-part visual analytic framework for discovering differences between communities in social networks.
– Simple
– Scalable

• Qualitative and quantitative results.

• Future
– Temporal
– More quantitative measures
– Automatically determine best scale

Thank You!
