Social Web 2015 Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? Anca Dumitrache & Lora Aroyo The Network Institute VU University Amsterdam

Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam Social Web Course)

Jul 14, 2015


Lora Aroyo
Page 1: Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam Social Web Course)

Social Web 2015

Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web?

Anca Dumitrache & Lora Aroyo

The Network Institute

VU University Amsterdam

Page 2

•  25 billion tweets on Twitter in 2010, by 175 million users

•  360 billion pieces of content on Facebook in 2010, by 600 million different users

•  35 hours of video uploaded to YouTube every minute

•  130 million photos uploaded to Flickr per month

The Age of BIG Data

Social Web 2015, Lora Aroyo

Page 3

Science with BIG Data

Page 4

BIG Data Challenges

Page 5

enormous wealth of data = lots of insights

•  insights into users’ daily lives and activities

•  insights into history

•  insights into politics

•  insights into communities

•  insights into trends

•  insights into businesses & brands

Why?

Page 6

enormous wealth of data = lots of insights

•  who uploads/talks? (age, gender, nationality, community, etc.)

•  what are the trending topics? when?

•  what else do these users like? on which platform?

•  who are the most/least active users?

•  …

Why?

Page 7

Image: http://www.co.olmsted.mn.us/prl/propertyrecords/RecordingDocuments/PublishingImages/forms.jpg

This doesn’t work

Page 8

How about this?

Page 9

Who uses it?

Page 10

Politicians

Governmental institutions

Page 11

Whole society!

Page 12

Whole society!

repurposing data

danger of second-order effects

Page 13

Whole society!

repurposing data

discoveries & correlations

Web-Scale Pharmacovigilance: Listening to Signals from the Crowd, R.W. White et al. (2013)

Page 14

Scientists!

Bibliometrics

Page 15

Culture

History

Page 16

Culture

History

Page 17

Culture

Bill Howe, University of Washington

Page 18

Entertainment

Page 19

You?!

Page 20

Companies!

Page 21

Who does it?

Page 22

The Rise of the Data Scientist

Data Geeks

Skills: Statistics, Data munging, Visualisation

Page 23

http://radar.oreilly.com/2010/06/what-is-data-science.html

The Rise of the Data Scientist

Page 24

•  Data Science enables the creation of data products

•  Data products are applications that acquire their value from the data, and create more data as a result.

•  Users are in a feedback loop: they constantly provide information about the products they use, which gets used in the data product.

Data Science

Page 25

Data Science Venn Diagram

Drew Conway

Page 26

Page 27

Popular Data Products

Data Science is about building products, not just answering questions

Page 28

Popular Data Products

empower others to use the data

empower others to do their own analysis

Page 29

(Inspired by George Tziralis’ FOSS Conf’09, John Elder IV’s Salford Systems Data Mining Conf. and Toon Calders’ slides)

Data mining is the exploration & analysis of large quantities of data in order to discover valid, novel, potentially useful, & ultimately understandable patterns in data

http://www.freefoto.com/images/33/12/33_12_7---Pebbles_web.jpg

Data Mining 101

Page 30

Databases, Statistics, Artificial Intelligence

Data Mining 101

• Data input & exploration

• Preprocessing

• Data mining algorithms

• Evaluation & Interpretation

Page 31

•  What data do I need to answer question X?

•  What variables are in the data?

•  Basic stats of my data?

Data Input & Exploration

“LikeMiner”
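
The exploration questions above can be answered in a few lines before any mining starts. A minimal sketch, assuming a hypothetical list of "like" records; the field names `user`, `page` and `age` are invented for illustration and are not taken from LikeMiner:

```python
from collections import Counter
from statistics import mean, median

# Hypothetical sample: one record per "like" event.
likes = [
    {"user": "u1", "page": "music", "age": 21},
    {"user": "u2", "page": "music", "age": 34},
    {"user": "u1", "page": "film", "age": 21},
    {"user": "u3", "page": "sport", "age": 45},
]

def explore(records):
    """Return the variable names and a few summary statistics."""
    variables = sorted({key for r in records for key in r})
    # Keep one age per distinct user, so active users are not counted twice.
    age_by_user = {r["user"]: r["age"] for r in records}
    ages = list(age_by_user.values())
    return {
        "variables": variables,
        "n_records": len(records),
        "n_users": len(age_by_user),
        "mean_age": mean(ages),
        "median_age": median(ages),
        "top_pages": Counter(r["page"] for r in records).most_common(2),
    }

summary = explore(likes)
```

Deduplicating by user before computing age statistics is a design choice: without it, heavy users would dominate the averages.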

Page 32

•  Cleanup!

•  Choose a suitable data model

• What happens if you integrate data from multiple sources?

•  Reformat your data

Preprocessing

“LikeMiner”
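
A minimal sketch of the preprocessing steps above (cleanup, reformatting, integrating several sources). The two "platform exports" and their fields are hypothetical, and summing the counts of a duplicated entity is only one possible answer to the integration question:

```python
def clean(record):
    """Normalise one raw record: trim and lower-case the name, coerce the count."""
    name = (record.get("name") or "").strip().lower()
    if not name:
        return None  # cleanup: a record without an identifier is unusable
    return {"name": name, "likes": int(record.get("likes") or 0)}

def integrate(*sources):
    """Merge several sources into one dict keyed by the normalised name."""
    merged = {}
    for source in sources:
        for raw in source:
            rec = clean(raw)
            if rec is None:
                continue
            # The same entity may occur in several sources: sum its counts.
            merged[rec["name"]] = merged.get(rec["name"], 0) + rec["likes"]
    return merged

# Hypothetical exports from two platforms with inconsistent formatting.
platform_a = [{"name": " Alice ", "likes": "3"}, {"name": "", "likes": "9"}]
platform_b = [{"name": "alice", "likes": 2}, {"name": "Bob", "likes": None}]

combined = integrate(platform_a, platform_b)
```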

Page 33

•  Classification: Generalising a known structure & applying it to new data

•  Association: Finding relationships between variables

•  Clustering: Discovering groups and structures in data

Data Mining Algorithms
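
The three families can each be illustrated with a toy function. This is a sketch on one-dimensional numbers and small item lists, using simple stand-in techniques chosen for brevity (1-nearest-neighbour, pair counting, plain k-means); real mining tools use far more robust methods:

```python
from collections import Counter
from itertools import combinations

def classify(point, labelled):
    """Classification: give a new value the label of its nearest known example."""
    nearest = min(labelled, key=lambda pair: abs(pair[0] - point))
    return nearest[1]

def associate(baskets, min_support=2):
    """Association: item pairs that co-occur in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(set(basket)), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

def cluster(values, k=2, rounds=10):
    """Clustering: plain k-means on 1-D values, returning the groups found."""
    centres = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(rounds):
        groups = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups

label = classify(4, [(1, "low"), (2, "low"), (10, "high")])
pairs = associate([["a", "b"], ["a", "b", "c"], ["b", "c"]])
groups = cluster([1, 2, 3, 10, 11, 12])
```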

Page 34

•  Filter users by interests

•  Construct user graphs

•  PageRank on graphs to mine representativeness

•  Result: set of influential users

•  Compare page topics to user interests to find pages most representative for topics

Mining in “LikeMiner”
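
The PageRank step can be sketched with a basic power iteration. This is a generic textbook version, not LikeMiner's actual implementation; the user graph, the meaning of an edge and the parameter values are assumptions made for illustration:

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank; graph maps a node to the nodes it links to."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for n in nodes:
                    new[n] += damping * rank[node] / len(nodes)
        rank = new
    return rank

# Hypothetical user graph: an edge u -> v means "u points to v".
users = {"ann": ["cat"], "bob": ["cat"], "cat": ["ann"], "dan": ["cat", "ann"]}
scores = pagerank(users)
influential = sorted(scores, key=scores.get, reverse=True)
```

Sorting users by score gives a candidate list of influential users; where to cut that list off is a separate decision.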

Page 35

Evaluation & Interpretation

What does the pattern I found mean?

Pitfalls:

•  Meaningless Discoveries

•  Implication ≠ Causality (Intensive care -> death)

•  Simpson’s paradox

•  Data Dredging

•  Redundancy

•  No New Information

•  Overfitting

•  Bad Experimental Setup
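
Simpson’s paradox is easiest to see with numbers. The sketch below uses the well-known kidney-stone treatment figures often quoted for this paradox (they are not from this lecture): treatment A wins within each stone size, yet B looks better once the groups are pooled.

```python
# (recovered, treated) per treatment and stone size.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(pairs):
    """Success rate over one or more (recovered, treated) pairs."""
    pairs = list(pairs)
    return sum(s for s, _ in pairs) / sum(n for _, n in pairs)

# Rate inside each subgroup, then the pooled rate per treatment.
per_group = {
    t: {g: rate([counts]) for g, counts in groups.items()}
    for t, groups in outcomes.items()
}
overall = {t: rate(groups.values()) for t, groups in outcomes.items()}

a_wins_each_group = all(per_group["A"][g] > per_group["B"][g] for g in ("small", "large"))
b_wins_overall = overall["B"] > overall["A"]
```

The reversal happens because the two treatments were applied to very different mixes of small and large stones, so the pooled rates compare unlike populations.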

Page 36

Data Mining is not easy

Page 37

Data Journalism

Page 38

Page 39

Page 40

source: http://kunau.us/wp-content/uploads/2011/02/Screen-shot-2011-02-09-at-9.03.46-PM-w600-h900.png

Mining Social Web Data

Page 41

Source: http://infosthetics.com/archives/2011/12/all_the_information_facebook_knows_about_you.html

See also: http://www.youtube.com/watch?feature=player_embedded&v=kJvAUqs3Ofg

Single Person

Page 42

http://www.brandrants.com/brandrants/obama/

Populations

Page 43

Brand Sentiment via Twitter

http://flowingdata.com/2011/07/25/brand-sentiment-showdown/

Page 44

Sentiment Analysis as Service

Page 45

http://text-processing.com/demo/sentiment/
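
To make the idea of sentiment scoring concrete, here is a toy lexicon-based scorer. It is only an illustration and not how the service linked above works; the word lists are tiny, hand-made assumptions:

```python
import re

# Tiny hand-made word lists, for illustration only.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "awful"}

def sentiment(text):
    """Label text 'pos', 'neg' or 'neutral' by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neutral"

def brand_sentiment(tweets):
    """Share of each label over a list of tweets about one brand."""
    labels = [sentiment(t) for t in tweets]
    return {lab: labels.count(lab) / len(labels) for lab in ("pos", "neg", "neutral")}

shares = brand_sentiment([
    "I love this phone, great camera",
    "Terrible service, I hate it",
    "Just arrived",
])
```

Counting words ignores negation ("not good") and sarcasm, which is one reason trained classifiers are normally preferred.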

Page 46

http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book.pdf

Recommended Reading

Page 47

http://www.actmedia.eu/media/img/text_zones/English/small_38421.jpg

Assignment 2: Semantic Markup

•  Part I: enrich/create a Web page with semantic markup

•  Step 1: Mark up two different Web pages with the appropriate markup describing properties of at least people, relationships to other people, locations, some temporally related data and some multimedia. You can also try out tools such as Google Markup Helper.

•  Step 2: Validate your semantic markup. Use an existing validator.

•  Step 3: Explain why you chose particular markups. Compare the advantages and disadvantages of the different markups. Include screenshots from validators.

•  Part II: analyse another team’s Web page markup, as a consumer & as a publisher

•  Step 1: Perform an evaluation and report your findings (consider findability or content extraction)

•  Step 2: Support your critique with examples of how the semantic markup could be improved.

•  In the introductory section, explain what semantic markup is, what it is for, what it looks like, etc.

•  Support your choices and explanations with appropriate literature references.

•  5 pages (excluding screenshots).

•  Other group’s evaluation details in the appendix.

•  Deadline: 3 March 23:59

Page 48

Image Source: http://blog.compete.com/wp-content/uploads/2012/03/Like.jpg

Final Assignment: Your SocWeb App

•  Create your own Social Web app (in a group)

•  Use structured data, entity relations, data analysis, visualisation

•  Write an individual report on one of the main aspects of your app

•  Pitch your app idea before finalising: 12 Mar, during Hands-on

•  Submit final assignment: 27 March 23:59

Page 49

image source: http://www.flickr.com/photos/bionicteaching/1375254387/

Hands-on Teaser

•  Build your own recommender system 101

•  Recommend pages on del.icio.us

•  Recommend pages to your Facebook friends
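
As a taste of what a "recommender system 101" can look like, here is a minimal user-based sketch: pages are recommended when users with similar bookmarks have saved them. The bookmark data is made up, and Jaccard similarity is just one simple choice of similarity measure; the hands-on session may use a different approach.

```python
def jaccard(a, b):
    """Overlap between two sets of bookmarked pages (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, bookmarks, top_n=3):
    """Rank pages the target has not saved by the similarity of users who did."""
    mine = bookmarks[target]
    scores = {}
    for user, pages in bookmarks.items():
        if user == target:
            continue
        weight = jaccard(mine, pages)
        for page in pages - mine:
            scores[page] = scores.get(page, 0.0) + weight
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return [p for p in ranked if scores[p] > 0][:top_n]

# Hypothetical bookmark sets per user.
bookmarks = {
    "you": {"python.org", "d3js.org"},
    "ann": {"python.org", "d3js.org", "scipy.org"},
    "bob": {"python.org", "nltk.org"},
    "cat": {"knitting.example"},
}

suggestions = recommend("you", bookmarks)
```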
