MongoDB San Francisco 2013: Using MongoDB for Groupon's Place Data presented by Peter Bakkum, Member of Technical Staff, Groupon

Jul 18, 2015

Page 1:

MongoDB at Groupon

Peter Bakkum

@pbbakkum

Page 2:

MerchantData

CRM

MerchantPages

Page 4:

MerchantData

CRM

MerchantPages

Self-Service

Others

Page 5:

Arnold: Declarative Crowd-Machine Data Integration

Shawn Jeffery, Liwen Sun, Matt DeLand, Nick Pendar, Rick Barber, Andrew Galdi

CIDR 2013

cidrdb.org/cidr2013/Papers/CIDR13_Paper22.pdf

Page 6:

Concordance

{
  name: "Joe's Pizza",
  location: {
    address: "1000 Market St.",
    postal_code: "94100-1001"
  },
  source: 1
}

{
  name: "Joes Pizza",
  location: {
    address: "1000 Market Street",
    postal_code: "94100"
  },
  source: 2
}

{
  name: "Joes",
  location: {
    address: "1000 Market",
    postal_code: "94100"
  },
  source: 3
}
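
The three records above describe the same place with different spellings. The slides do not show how concordance decides that, so the following is only a minimal sketch under assumed rules: the class name, the normalization steps, and the `samePlace` match rule are all hypothetical and are not Groupon's algorithm.

```java
import java.util.Locale;

public class ConcordanceSketch {
    // Lowercase, drop straight and curly apostrophes, and collapse whitespace.
    static String normalizeName(String name) {
        return name.toLowerCase(Locale.ROOT)
                   .replace("'", "")
                   .replace("\u2019", "")
                   .trim()
                   .replaceAll("\\s+", " ");
    }

    // Lowercase, drop periods, and strip the street suffix ("st" or "street").
    static String normalizeAddress(String address) {
        return address.toLowerCase(Locale.ROOT)
                      .replace(".", "")
                      .replaceAll("\\b(street|st)\\b", "")
                      .trim()
                      .replaceAll("\\s+", " ");
    }

    // Keep only the part before the dash: "94100-1001" becomes "94100".
    static String normalizePostalCode(String code) {
        String trimmed = code.trim();
        int dash = trimmed.indexOf('-');
        return dash >= 0 ? trimmed.substring(0, dash) : trimmed;
    }

    // Assumed match rule: equal postal code and address after normalization,
    // and one normalized name is a prefix of the other.
    static boolean samePlace(String name1, String address1, String postal1,
                             String name2, String address2, String postal2) {
        String n1 = normalizeName(name1);
        String n2 = normalizeName(name2);
        boolean namesAgree = n1.startsWith(n2) || n2.startsWith(n1);
        return namesAgree
            && normalizeAddress(address1).equals(normalizeAddress(address2))
            && normalizePostalCode(postal1).equals(normalizePostalCode(postal2));
    }
}
```

Under these assumed rules, all three source records compare as the same place, while a record with a different postal code does not.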

Page 7:

Data Systems

Page 8:

Content Input

Data Processing

Serving

Page 9:

Content Input

Data Sets

Input Feeds

Normalization

Crowd Sourcing

Web Crawling

Page 10:

Content Input

Crawl Store

Web Crawler

Configuration

Normalized Data

Recent Feed History

Page 11:

Data Processing

Storm Topology

Storm

Page 12:

Serving

HTTP Access Layer

Varnish

Page 13:

Data Processing

Page 14:

storm topology

bolts

parsing

normalization

concordance

geocoding

persistence

Page 15:

place model

record

tree

Page 16:

places.find({
  _id: "013e4e2afc26"
})

placeCollection.find({
  "location.postcode": "94100",
  "location.country": "US"
})

places.findAndModify(
  {
    _id: "013e4e2afc26",
    persisted_at: "2013-02-01T00:00:00Z"
  },
  { place model })

Concordance

Persistence
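
The findAndModify query above matches on both _id and persisted_at. One plausible reading, which the slides do not state outright, is an optimistic-concurrency check: the write applies only if the stored document still carries the timestamp the writer saw when it read the place. The sketch below illustrates that pattern with an in-memory map standing in for the collection. It does not use the MongoDB driver, and the `Place` class and `replaceIfUnchanged` method are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConditionalUpdateSketch {
    // Hypothetical, trimmed-down place document: only the fields the check needs.
    static final class Place {
        final String id;
        final String persistedAt;
        final String name;

        Place(String id, String persistedAt, String name) {
            this.id = id;
            this.persistedAt = persistedAt;
            this.name = name;
        }
    }

    // In-memory stand-in for the places collection, keyed by _id.
    private final Map<String, Place> places = new ConcurrentHashMap<>();

    void insert(Place place) {
        places.put(place.id, place);
    }

    Place find(String id) {
        return places.get(id);
    }

    // Replaces the stored document only if its persistedAt still equals the
    // value the caller read earlier. Returns true when the write was applied.
    boolean replaceIfUnchanged(String id, String expectedPersistedAt, Place replacement) {
        boolean[] applied = {false};
        // computeIfPresent runs atomically per key on ConcurrentHashMap.
        places.computeIfPresent(id, (key, current) -> {
            if (current.persistedAt.equals(expectedPersistedAt)) {
                applied[0] = true;
                return replacement;
            }
            return current;
        });
        return applied[0];
    }
}
```

A writer that loses the race gets `false` back and can re-read the place, redo its merge, and try again.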

Page 17:

config cluster

4 arbiters

4 shards of 2 nodes

replica set failover

64 GB dedicated hardware

storm workers

mongos routers

Page 18:

ID Scheme

Page 19:

UUID v1

82d991c6-b098-11e2-8fc0-c82a14fffe86

82d9996e-b098-11e2-8fc0-c82a14fffe86

82d99f04-b098-11e2-8fc0-c82a14fffe86

82d9a40e-b098-11e2-8fc0-c82a14fffe86

Page 21:

UUID vB

wwwwwwww-xxxx-byyy-yyyy-zzzzzzzzzzzz

w: controllable counter

x: process id

b: literal 'b'

y: fragment of MAC address

z: milliseconds since epoch (UTC)
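
The field layout above can be sketched as a formatter. This is an illustration of the layout only, not the code in the locality-uuid libraries: the class and method names are hypothetical, the process id and MAC fragment are passed in as plain numbers rather than read from the system, and the counter here simply increments by one.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class UuidVbSketch {
    // w: process-wide counter (assumed here to increment by one per id).
    private static final AtomicInteger COUNTER = new AtomicInteger();

    // Lays the five fields out as wwwwwwww-xxxx-byyy-yyyy-zzzzzzzzzzzz.
    static String format(int counter, int pid, long macFragment, long millis) {
        return String.format("%08x-%04x-b%03x-%04x-%012x",
                counter,                        // w: 32-bit counter
                pid & 0xFFFF,                   // x: low 16 bits of the process id
                (macFragment >>> 16) & 0xFFF,   // y: upper 12 bits of a 28-bit MAC fragment
                macFragment & 0xFFFF,           // y: lower 16 bits of the MAC fragment
                millis & 0xFFFFFFFFFFFFL);      // z: low 48 bits of milliseconds since epoch
    }

    // Builds the next id for the given process id and MAC fragment.
    static String next(int pid, long macFragment) {
        return format(COUNTER.getAndIncrement(), pid, macFragment, System.currentTimeMillis());
    }

    // Reads the z field back out: the last 12 hex digits are the timestamp.
    static long millisOf(String id) {
        return Long.parseLong(id.substring(24), 16);
    }
}
```

Because the timestamp sits in the last segment, the creation time of an id can be recovered with `millisOf` without a separate field.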

Page 22:

toggle

c8c9cef9-7a7f-bd53-7a50-013e4e2afbde
14951cfa-7a7f-bd53-7a50-013e4e2afbde
6f5169fb-7a7f-bd53-7a50-013e4e2afbde
ba2da6fc-7a7f-bd53-7a50-013e4e2afbde

f5166777-7a7f-bd53-7a50-013e4e2afc26
f5166778-7a7f-bd53-7a50-013e4e2afc26
f5166779-7a7f-bd53-7a50-013e4e2afc26
f516677a-7a7f-bd53-7a50-013e4e2afc26

Page 24:

github.com/groupon/locality-uuid.java

github.com/groupon/locality-uuid.rb

Page 25:

Backup and MapReduce

Page 26:

Hadoop Cluster

Page 27:

places.ns

places.0

places.1

places.2

…

char[128] name

DiskLoc firstExtent

DiskLoc lastExtent

Page 28:

places.ns

places.0

places.1

places.2

places.0 places.1

extent extent extent

Page 29:

places.ns

places.0

places.1

places.2

places.0 places.1

extent extent extent

MapReduce Input Split

MapReduce Input Split

MapReduce Input Split

Page 30:

public void map(
    Text key,
    WritableBSONObject value,
    Context context)
{
  String id = (String) value.get("_id");
  ...
}
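
The mapper above receives one BSON document per call. How the job pulls documents out of the MongoDB data files is not shown in the slides. As a rough illustration of the underlying idea, every BSON document begins with a little-endian int32 holding its total length, so a reader can walk a byte range document by document. The sketch below assumes a buffer of back-to-back BSON documents with nothing between them. Real MongoDB extents also carry their own record headers, which this hypothetical `BsonSplitSketch` does not handle.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

public class BsonSplitSketch {
    // Splits a buffer of concatenated BSON documents into one byte array each.
    // Assumes documents are packed back to back with no headers in between.
    static List<byte[]> splitDocuments(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        List<byte[]> docs = new ArrayList<>();
        while (buf.remaining() >= 4) {
            // Peek at the length prefix without advancing the position.
            int length = buf.getInt(buf.position());
            // The smallest valid document is 5 bytes: the int32 length plus a 0x00 terminator.
            if (length < 5 || length > buf.remaining()) {
                throw new IllegalArgumentException(
                    "Invalid BSON length " + length + " at offset " + buf.position());
            }
            byte[] doc = new byte[length];
            buf.get(doc); // copies the whole document, length prefix included
            docs.add(doc);
        }
        // Fewer than 4 trailing bytes cannot hold a length prefix and are ignored here.
        return docs;
    }
}
```

Each returned array is a complete document that a BSON decoder could then turn into the object passed to map.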

Page 31:

Mongo Cluster

Hadoop Cluster

MapReduce Job

Backs up Mongo data to Hadoop

Much faster data export

Exploits our Hadoop cluster

Page 32:

Peter Bakkum

@pbbakkum
