Scalable Event Analytics with Ruby on Rails & MongoDB Ruby Conf China 2010 Jared Rosoff (@forjared) [email protected]

Scalable Event Analytics with MongoDB & Ruby on Rails

Sep 08, 2014

Jared Rosoff

Slides from my talk at RubyConfChina 2010.
Transcript
Page 1: Scalable Event Analytics with MongoDB & Ruby on Rails

Scalable Event Analytics with Ruby on Rails & MongoDB

Ruby Conf China 2010
Jared Rosoff (@forjared)

[email protected]

Page 2: Scalable Event Analytics with MongoDB & Ruby on Rails

Yottaa!!!! (www.yottaa.com)

Page 3: Scalable Event Analytics with MongoDB & Ruby on Rails

Overview

• Ruby at Scale

• What is Event Analytics?

• What are the different ways you could do it?

• How we did it

Page 4: Scalable Event Analytics with MongoDB & Ruby on Rails

Ruby At Scale?

http://www.flickr.com/photos/laughingsquid

Page 5: Scalable Event Analytics with MongoDB & Ruby on Rails

Event Analytics

[Diagram labels: four Data Sources, each emitting a stream of Events, feed "Event Analytics"; Users send Queries and receive Reports]

Page 6: Scalable Event Analytics with MongoDB & Ruby on Rails

High Write Volume
• Each new data source adds X requests per second
• Data never stops arriving

Continuous Data Growth
• We only add more data
• Historical data is valuable

Flexible Data Exploration
• Ad hoc queries
• Complex aggregations

Page 7: Scalable Event Analytics with MongoDB & Ruby on Rails

Oh and we are a startup

Page 8: Scalable Event Analytics with MongoDB & Ruby on Rails

Our requirements:

On launch day
• # of data sources: 15
• # of events per minute: 80
• # of GBs of data stored: 20

3 months later (projected)
• # of data sources: 45
• # of events per minute: 5600
• # of GBs of data stored: 100
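To make the scale of these requirements concrete, here is a small arithmetic sketch in Ruby. The figures are copied from the slide; the constant and method names (`LAUNCH`, `PROJECTED`, `GROWTH`, `per_second`) are made up for illustration.

```ruby
# Figures copied from the requirements slide.
LAUNCH    = { sources: 15, events_per_minute: 80,   gigabytes: 20 }
PROJECTED = { sources: 45, events_per_minute: 5600, gigabytes: 100 }

# Growth factor for each metric: projected value divided by launch value.
GROWTH = LAUNCH.keys.to_h { |k| [k, PROJECTED[k].fdiv(LAUNCH[k])] }

# Convert a per-minute event rate into a sustained per-second rate.
def per_second(events_per_minute)
  events_per_minute / 60.0
end

GROWTH.each { |metric, factor| puts "#{metric}: #{factor}x" }
puts format("launch: %.1f events/s, projected: %.1f events/s",
            per_second(LAUNCH[:events_per_minute]),
            per_second(PROJECTED[:events_per_minute]))
```

With these numbers the event rate grows 70x while data sources grow 3x and stored data grows 5x, so the write path is the part that has to scale first.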

Pages 9-11: Scalable Event Analytics with MongoDB & Ruby on Rails

Rails default architecture

[Diagram labels: Data Source, Collection Server, MySQL, Reporting Server, User]

“Just” a Rails App

Performance Bottleneck: Too much load

Pages 12-14: Scalable Event Analytics with MongoDB & Ruby on Rails

Let’s add replication!

[Diagram labels: Data Source, Collection Server, MySQL Master (x4), Replication, Reporting Server, User]

Off the shelf!
Scalable Reads!

Performance Bottleneck: Still can’t scale writes
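A toy model, not the talk's code, of why replication helps reads but not writes: every write still goes through a single master, while reads can rotate across replicas. The class name `ReplicatedStore` and the node names are hypothetical, and the model ignores the fact that real replicas must also apply every write.

```ruby
# Toy model of master/replica routing: every write goes to the single
# master, while reads rotate across the replicas.
class ReplicatedStore
  attr_reader :writes_per_node, :reads_per_node

  def initialize(replica_count)
    @nodes = ["master"] + (1..replica_count).map { |i| "replica#{i}" }
    @writes_per_node = Hash.new(0)
    @reads_per_node = Hash.new(0)
    @next_replica = 0
  end

  # Writes cannot be spread out: only the master accepts them.
  def write(_event)
    @writes_per_node["master"] += 1
  end

  # Reads are round-robined over the replicas.
  def read(_query)
    replicas = @nodes.drop(1)
    node = replicas[@next_replica % replicas.size]
    @next_replica += 1
    @reads_per_node[node] += 1
  end
end

store = ReplicatedStore.new(3)
5600.times { |i| store.write(i) }
300.times  { |i| store.read(i) }
puts store.writes_per_node  # every write is counted against "master"
puts store.reads_per_node   # reads are split across the replicas
```

Adding more replicas lowers the read count per node but leaves the master's write count unchanged, which is the bottleneck the slide points at.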

Pages 15-17: Scalable Event Analytics with MongoDB & Ruby on Rails

What about sharding?

[Diagram labels: Data Source, Collection Server, MySQL Master (x3), Sharding (x2), Reporting Server, User]

Scalable Writes!

Development Bottleneck: Need to write custom code
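As a rough idea of what such custom code can look like, the sketch below hashes a data source id to pick a database and merges results across shards by hand. It is an assumption-laden illustration, not the talk's code: the shard names, the CRC32 hashing scheme, and the helpers `shard_for`, `insert_event`, and `count_events` are all hypothetical, and in-memory arrays stand in for MySQL tables.

```ruby
require "zlib"

# Hypothetical application-level sharding: the app itself picks a database
# for each event by hashing the data source id.
SHARDS = %w[mysql_shard_0 mysql_shard_1 mysql_shard_2].freeze

def shard_for(source_id)
  SHARDS[Zlib.crc32(source_id.to_s) % SHARDS.size]
end

# A write touches exactly one shard.
def insert_event(tables, event)
  tables[shard_for(event[:source_id])] << event
end

# A report has to query every shard and merge the results by hand.
def count_events(tables)
  SHARDS.sum { |name| tables[name].size }
end

tables = Hash.new { |h, k| h[k] = [] }  # shard name => rows
10.times { |i| insert_event(tables, { source_id: "source-#{i}", value: i }) }
puts count_events(tables)
```

Every query type needs its own fan-out and merge logic like `count_events`, which is the development cost the slide refers to.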

Pages 18-20: Scalable Event Analytics with MongoDB & Ruby on Rails

Key Value stores to the rescue?

[Diagram labels: Data Source, Collection Server, MySQL Master (x2), Cassandra or Voldemort, Reporting Server, User]

Scalable Writes!

Development Bottleneck: Reporting is limited / hard

Pages 21-25: Scalable Event Analytics with MongoDB & Ruby on Rails

Can I Hadoop my way out of this?

[Diagram labels: Data Source, Collection Server, MySQL Master (x5), MySQL Slave, Cassandra or Voldemort, Hadoop, Reporting Server, User]

Scalable Writes!

Flexible Reports!

“Just” a Rails App

Development Bottleneck: Too many systems!

Pages 26-29: Scalable Event Analytics with MongoDB & Ruby on Rails

MongoDB!

[Diagram labels: Data Source, Collection Server, MySQL Master (x2), MongoDB, Reporting Server, User]

Scalable Writes!

Flexible Reporting!

“Just” a Rails app

Pages 30-33: Scalable Event Analytics with MongoDB & Ruby on Rails

[Diagram labels: Data Source, User, Load Balancer, App Server, Nginx, Passenger, Collection, Reporting, Mongos, MongoD (x3)]

Sharding!

High Concurrency

Scale-Out

Pages 34-37: Scalable Event Analytics with MongoDB & Ruby on Rails

MongoDB Sharding

• Replica Sets let us scale storage & transaction capacity for each shard
• Mongos routes transactions to shards based on “shard key”
• Config servers store information about which shards exist

Page 38: Scalable Event Analytics with MongoDB & Ruby on Rails

Inserting

1. insert { ‘name’ : bob }
2. Shard key == name; bob → Shard 2
3. insert { ‘name’ : bob }

Page 39: Scalable Event Analytics with MongoDB & Ruby on Rails

Querying

1. Query { ‘name’ : bob }
2. Shard key == name; bob → Shard 2
3. Query { ‘name’ : bob }
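The insert and query flows above can be sketched with a simplified in-memory model. This is an illustration under stated assumptions, not MongoDB's implementation: the `Chunk` struct, the `Router` class, the chunk boundaries, and the shard names are hypothetical. The chunk table stands in for the config servers' metadata and the router stands in for mongos.

```ruby
# Simplified model of shard-key routing. Chunk boundaries and shard names
# are hypothetical.
Chunk = Struct.new(:min, :max, :shard)  # covers min <= key < max

CHUNKS = [
  Chunk.new("",  "b", "shard1"),
  Chunk.new("b", "n", "shard2"),
  Chunk.new("n", nil, "shard3")  # nil max means "no upper bound"
].freeze

class Router
  def initialize(chunks, shard_key)
    @chunks = chunks
    @shard_key = shard_key
    @data = Hash.new { |h, k| h[k] = [] }  # shard name => documents
  end

  # Find the shard whose chunk range contains the shard key value.
  def shard_for(key_value)
    @chunks.find { |c| key_value >= c.min && (c.max.nil? || key_value < c.max) }.shard
  end

  # An insert is forwarded to the one shard that owns the key.
  def insert(doc)
    @data[shard_for(doc[@shard_key])] << doc
  end

  # A query on the shard key only needs to visit one shard.
  def find(key_value)
    @data[shard_for(key_value)].select { |d| d[@shard_key] == key_value }
  end
end

router = Router.new(CHUNKS, "name")
router.insert({ "name" => "bob", "page" => "/home" })
puts router.shard_for("bob")
puts router.find("bob").inspect
```

The boundaries were chosen so that `bob` lands on the second shard, matching the slide. A query that does not include the shard key could not be routed this way and would have to visit every shard.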

Page 40: Scalable Event Analytics with MongoDB & Ruby on Rails

Map Reduce

1. Map-reduce( … )
2. Map-reduce( … ) (label repeated across the diagram)
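One way to read this slide is as scatter-gather: the same job runs on each shard and the partial results are combined. The sketch below shows that idea in plain Ruby under that assumption; the sample documents and the helper names `local_counts` and `merge_counts` are made up, and this is not how MongoDB implements it internally.

```ruby
# Sketch of scatter-gather: each shard counts its own documents, then the
# partial results are merged into one answer. Sample data is made up.
SHARD_DOCS = {
  "shard1" => [{ "url" => "/home" }, { "url" => "/home" }],
  "shard2" => [{ "url" => "/pricing" }],
  "shard3" => [{ "url" => "/home" }, { "url" => "/pricing" }]
}.freeze

# Runs on one shard: count views per url among that shard's documents.
def local_counts(docs)
  docs.each_with_object(Hash.new(0)) { |doc, counts| counts[doc["url"]] += 1 }
end

# Merge partial results. Summing partial sums gives the same total as
# summing everything at once, so the merge step is safe.
def merge_counts(partials)
  partials.each_with_object(Hash.new(0)) do |partial, total|
    partial.each { |url, count| total[url] += count }
  end
end

partials = SHARD_DOCS.values.map { |docs| local_counts(docs) }
puts merge_counts(partials).inspect
```

The merge only works because the reduce step (addition) can be applied again to its own outputs; a reduce without that property could not be split across shards this way.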

Page 41: Scalable Event Analytics with MongoDB & Ruby on Rails

Working with Mongo

• MongoMapper makes it look like ActiveRecord

• Documents are more natural than rows in many cases

• Map-Reduce rocks (but needs better support in Rails)

http://www.flickr.com/photos/elhamalawy/2526783078/

Page 42: Scalable Event Analytics with MongoDB & Ruby on Rails
Page 43: Scalable Event Analytics with MongoDB & Ruby on Rails

Ruby

Mongo

Page 44: Scalable Event Analytics with MongoDB & Ruby on Rails

Runs over all the objects in the views table, counting how many times a page was viewed

Adds up all the counts for a unique url / date combination

Run the map reduce job and return a collection containing the results
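The code these three captions describe did not survive in the transcript. As an illustration only, the same three steps can be sketched in plain Ruby over an in-memory array, without MongoDB. The sample views, the field names `url` and `viewed_at`, and the helpers `map_view`, `reduce_counts`, and `map_reduce` are assumptions, not the original code.

```ruby
require "date"

# Made-up sample of the "views" collection; field names are assumptions.
VIEWS = [
  { "url" => "/home",    "viewed_at" => Date.new(2010, 6, 1) },
  { "url" => "/home",    "viewed_at" => Date.new(2010, 6, 1) },
  { "url" => "/home",    "viewed_at" => Date.new(2010, 6, 2) },
  { "url" => "/pricing", "viewed_at" => Date.new(2010, 6, 1) }
].freeze

# Map: emit one [key, 1] pair per view, keyed by url and date.
def map_view(view)
  [{ "url" => view["url"], "date" => view["viewed_at"] }, 1]
end

# Reduce: add up all the counts emitted for one url / date key.
def reduce_counts(_key, counts)
  counts.sum
end

# Run the job and return the results as an array of documents.
def map_reduce(docs)
  grouped = docs.map { |d| map_view(d) }.group_by(&:first)
  grouped.map do |key, pairs|
    { "_id" => key, "value" => reduce_counts(key, pairs.map(&:last)) }
  end
end

map_reduce(VIEWS).each { |row| puts row.inspect }
```

Each result document carries the url / date pair as its `_id` and the view count as its `value`, which mirrors the shape of a map-reduce output collection.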

Page 45: Scalable Event Analytics with MongoDB & Ruby on Rails

Results

• Version 1 of our analytics system took 2 weeks with 1 engineer
  – We have since added a lot more complexity, but we did it incrementally
• We replaced MySQL entirely with MongoDB
  – No need for joins, transactions
  – Every table is now a document collection
• It’s fast!
  – 63ms: average response time for sending data to server
  – 93ms: average response time for displaying reports