Taming Social Media with MongoDB, Danny Holloway, June 26, 2012
Jun 20, 2015

HumanGeo Group

Slides from presentation given at MongoDC on June 26, 2012.
Transcript
Page 1: Taming Social Media with MongoDB

Taming Social Media with MongoDB

Danny Holloway

June 26, 2012

Page 2: Taming Social Media with MongoDB

Overview

• Introduction
• Social Media Challenges
• MongoDB Setup
• Collecting Tweets
• Querying Tweets
• Accessing the Data
• Finding Most Active Tweeter
• Lessons Learned
• Building an Interface
• Demo

Page 3: Taming Social Media with MongoDB

Introduction

• Built a tool to collect tweets over Australia and interact with them on a map

• Working at HumanGeo
  – Building tools and services for geospatial analysis of Big Data
  – Using MongoDB for horizontally scalable storage and geospatial analysis

Page 4: Taming Social Media with MongoDB

Social Media Challenges

• No control over data
  – “Consumers of Tweets should tolerate the addition of new fields and variance in ordering of fields with ease.” - Twitter
• High Volume
  – ~17k tweets in a day or 6.2M per year with exact coordinates in Australia
  – Record high of >25k tweets per second or >788B per year around the world - Twitter

Page 5: Taming Social Media with MongoDB

MongoDB Setup

• Create database
• Create capped collections
• Create indexes

Page 6: Taming Social Media with MongoDB

Collecting Tweets

• Using tweetstream to collect tweets over Australia from statuses/filter endpoint

• Insert results into collections
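
The collection loop can be sketched without tying it to a particular streaming client. Here `stream` is assumed to be any iterable of decoded tweet dicts (for example, what a streaming library yields), `collection` is assumed to expose PyMongo's `insert_one`, and the bounding box values are rough, illustrative numbers for Australia. The location filter on statuses/filter can also match tweets by place, so the sketch keeps only tweets that carry an exact point.

```python
# Rough box around Australia as (sw_lon, sw_lat, ne_lon, ne_lat).
# Illustrative values, not taken from the slides.
AUSTRALIA_BOX = (112.0, -44.0, 154.0, -10.0)


def has_exact_point(tweet):
    """True when the tweet carries a GeoJSON point in "coordinates"."""
    coords = tweet.get("coordinates")
    return bool(coords) and coords.get("type") == "Point"


def in_box(tweet, box=AUSTRALIA_BOX):
    """True when the tweet's [lon, lat] point lies inside `box`."""
    lon, lat = tweet["coordinates"]["coordinates"]
    sw_lon, sw_lat, ne_lon, ne_lat = box
    return sw_lon <= lon <= ne_lon and sw_lat <= lat <= ne_lat


def collect(stream, collection, box=AUSTRALIA_BOX):
    """Insert every geotagged tweet inside `box`; return how many were stored."""
    stored = 0
    for tweet in stream:
        if has_exact_point(tweet) and in_box(tweet, box):
            collection.insert_one(tweet)
            stored += 1
    return stored
```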

Page 7: Taming Social Media with MongoDB

Collecting Tweets (cont)

• Augment results for better queries
  – Twitter provides date strings like "Wed Jun 13 23:17:58 +0000 2012"
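
One way to augment a tweet before inserting it is to parse that string into a real datetime, which can then be range-queried and sorted. A minimal sketch using only the standard library; the field name `created_at_dt` is a hypothetical choice.

```python
from datetime import datetime

# Matches Twitter's "created_at" strings, e.g. "Wed Jun 13 23:17:58 +0000 2012".
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def augment(tweet):
    """Return a copy of `tweet` with a parsed `created_at_dt` field.

    `created_at_dt` is an illustrative field name.
    """
    augmented = dict(tweet)
    augmented["created_at_dt"] = datetime.strptime(
        tweet["created_at"], TWITTER_DATE_FORMAT
    )
    return augmented
```

PyMongo stores `datetime` values as BSON dates, so the new field supports `$gt` / `$lt` range queries, which the original string does not.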

Page 8: Taming Social Media with MongoDB

Querying Tweets

• Get all of the latest tweets

• Get all the tweets from a user
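
Both queries can be sketched as below, assuming a PyMongo-style `collection` and the standard `user.screen_name` field of a tweet. Sorting on `$natural` in descending order returns the most recently inserted documents first, which suits a capped collection; the default limit of 20 is arbitrary.

```python
DESCENDING = -1  # same value as pymongo.DESCENDING


def latest_tweets(collection, limit=20):
    """Newest tweets first, relying on insertion ($natural) order."""
    return collection.find().sort("$natural", DESCENDING).limit(limit)


def tweets_from_user(collection, screen_name, limit=20):
    """Newest tweets first for a single user."""
    query = {"user.screen_name": screen_name}
    return collection.find(query).sort("$natural", DESCENDING).limit(limit)
```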

Page 9: Taming Social Media with MongoDB

Querying Tweets (cont)

• Get tweets near a point

• Get tweets within a bounding box
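
The two geospatial queries can be expressed as plain filter documents. This sketch assumes a `2d` index on `coordinates.coordinates` (the [lon, lat] array inside Twitter's GeoJSON point). It uses `$geoWithin`, the current operator name; servers from the time of the talk used `$within` instead.

```python
# [lon, lat] array inside Twitter's GeoJSON "coordinates" field.
GEO_FIELD = "coordinates.coordinates"


def near_query(lon, lat):
    """Filter matching tweets ordered from nearest to farthest from a point."""
    return {GEO_FIELD: {"$near": [lon, lat]}}


def box_query(sw_lon, sw_lat, ne_lon, ne_lat):
    """Filter matching tweets inside a box given by its SW and NE corners."""
    return {
        GEO_FIELD: {
            "$geoWithin": {"$box": [[sw_lon, sw_lat], [ne_lon, ne_lat]]}
        }
    }
```

Either document can be passed to `collection.find(...)`; adding `.limit(n)` to a `$near` query keeps only the `n` closest tweets.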

Page 10: Taming Social Media with MongoDB

Accessing the Data

• Using Bottle to create a RESTful API
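
A rough sketch of what such an API could look like. The routes `/tweets` and `/tweets/user/<screen_name>` and the `limit` parameter are hypothetical, and `collection` is assumed to be a PyMongo collection. Bottle serializes a returned dict to JSON, so results are wrapped in a dict and non-JSON values are converted to strings first.

```python
def serialize(tweet):
    """Make a stored tweet JSON-friendly (top-level fields only).

    The ObjectId in "_id" and any datetime values become strings.
    """
    out = {}
    for key, value in tweet.items():
        if key == "_id":
            out[key] = str(value)
        elif hasattr(value, "isoformat"):
            out[key] = value.isoformat()
        else:
            out[key] = value
    return out


def make_app(collection):
    """Build a Bottle app with two read-only routes (hypothetical URLs)."""
    # Imported here so serialize() stays usable without Bottle installed.
    from bottle import Bottle, request

    app = Bottle()

    @app.get("/tweets")
    def latest():
        limit = int(request.query.get("limit", 20))
        cursor = collection.find().sort("$natural", -1).limit(limit)
        return {"tweets": [serialize(t) for t in cursor]}

    @app.get("/tweets/user/<screen_name>")
    def by_user(screen_name):
        cursor = collection.find({"user.screen_name": screen_name})
        return {"tweets": [serialize(t) for t in cursor]}

    return app
```

For local testing, `make_app(tweets).run(host="localhost", port=8080)` starts Bottle's built-in development server.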

Page 11: Taming Social Media with MongoDB

Finding Most Active Tweeter

• Calculate tweet count for each user and return tweets for that user
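
The transcript does not show which mechanism was used for the count. One possible approach, assuming a server with the aggregation framework and a PyMongo-style `collection`, groups tweets by screen name, sorts by count, and then fetches the top user's tweets.

```python
# Count tweets per screen name and keep the single largest group.
MOST_ACTIVE_PIPELINE = [
    {"$group": {"_id": "$user.screen_name", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
    {"$limit": 1},
]


def most_active_tweeter(collection):
    """Return (screen_name, tweets) for the user with the most tweets.

    Returns (None, []) when the collection is empty.
    """
    top = next(iter(collection.aggregate(MOST_ACTIVE_PIPELINE)), None)
    if top is None:
        return None, []
    name = top["_id"]
    return name, list(collection.find({"user.screen_name": name}))
```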

Page 12: Taming Social Media with MongoDB

Lessons Learned

• Use Longitude, Latitude ordering for coordinates

• Default 2d index value range (-180 to 180) is exclusive of the upper bound

• Twitter has bugs too
• Making your own maps isn’t hard (but it can take some time)
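
The first two lessons can be captured in a small guard applied before inserting a point. This is a sketch, assuming the default `2d` index bounds of -180 (inclusive) to 180 (exclusive); the helper name `as_lonlat` is made up for illustration.

```python
def as_lonlat(lat, lon):
    """Return [lon, lat], the order a 2d index expects.

    Rejects a longitude of exactly 180, which falls outside the default
    index range because the upper bound is exclusive.
    """
    if not -180.0 <= lon < 180.0:
        raise ValueError("longitude %r is outside [-180, 180)" % (lon,))
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude %r is outside [-90, 90]" % (lat,))
    return [lon, lat]
```

For example, `as_lonlat(-33.87, 151.21)` gives `[151.21, -33.87]`, while a longitude of `180.0` raises `ValueError`.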

Page 13: Taming Social Media with MongoDB

Building an Interface

• Dust JavaScript templating library
• Leaflet JavaScript interactive map library
• jQuery JavaScript library
• TileStream map tile server