Building a web scale architecture Kaushik Paranjape [email protected] CTO, Sokrati

Scalable web architecture

Jul 19, 2015



Transcript
Page 1: Scalable web architecture

Building a web scale architecture

Kaushik Paranjape [email protected] CTO, Sokrati

Page 2: Scalable web architecture

Put your thinking caps on!

• Let's design an e-com website which should

• capture all user interactions (every event)

• run analytics and come up with good recommendations

• have a stock ticker for the owner to monitor performance across categories

Page 3: Scalable web architecture

This is what it looks like!

Website with recommendations

Category-wise conversions ticker for the owner

Page 4: Scalable web architecture

Components?

UI for users

Queuing

Database

UI to monitor performance

Data fetcher

Analytics algorithms (R)

Page 5: Scalable web architecture

Correct way?

• Yes, as long as it works

• Build simple solutions with a shorter time to market

• Don't run blind

Page 6: Scalable web architecture

Problems?

• Which queuing system to choose?

• How do I handle the load?

• How do I provide real time insights?

• How real time is data fetcher?

• When am I doomed?

Page 7: Scalable web architecture

Capture every stat! Monitor everything!

• Stats logging tools

• Graphite

• Ganglia

• OpenTSDB

• Monitoring tools

• Nagios

• Bosun
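
As a rough illustration of the "capture every stat" idea, the sketch below formats a data point in Graphite's plaintext protocol (one `path value timestamp` line per metric) and sends it over TCP. The host name, the port 2003 (the usual Carbon plaintext listener), and the metric name `shop.checkout.conversions` are assumptions for illustration, not details from the talk.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of Graphite's plaintext protocol: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="graphite.example.com", port=2003):
    """Send a single data point to a Carbon plaintext listener (hypothetical host)."""
    line = format_metric(path, value)
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(line.encode("ascii"))


# Formatting only; no network call is made here.
line = format_metric("shop.checkout.conversions", 42, timestamp=1437264000)
```

A monitoring tool such as Nagios or Bosun could then alert on thresholds over the stored series; that part is not shown here.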

Page 8: Scalable web architecture

Graphite

Page 9: Scalable web architecture

Nagios

Page 10: Scalable web architecture

Service based architecture

Database

Service layer

UI for the user

UI for the owner

Queue

Analytics algo (R)

Page 11: Scalable web architecture

Scaling up service layer

• Load balancing + auto scaling

• stateless services - easier to scale
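
To see why stateless services are easier to scale, here is a minimal round-robin balancer sketch. Because no instance keeps per-user state, any instance can serve any request, and scaling out amounts to adding one more instance to the pool. The class and instance names (`RoundRobinBalancer`, `svc-1`, and so on) are hypothetical.

```python
class RoundRobinBalancer:
    """Rotate requests across identical, stateless service instances."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._instances = list(instances)
        self._next = 0

    def add_instance(self, instance):
        # What an auto-scaler would do when load goes up.
        self._instances.append(instance)

    def handle(self, request):
        instance = self._instances[self._next % len(self._instances)]
        self._next += 1
        return instance(request)


def make_instance(name):
    """Hypothetical stateless handler: its output depends only on the request."""
    def handler(request):
        return f"{name} served {request}"
    return handler


balancer = RoundRobinBalancer([make_instance("svc-1"), make_instance("svc-2")])
responses = [balancer.handle(f"req-{i}") for i in range(4)]
```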

Page 12: Scalable web architecture

Scaling up app layer

• Distributed scheduler

• Map-Reduce jobs

• Storm

• Spark

• Kafka + storm for stream processing

• SQS
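
The map-reduce pattern named above can be sketched in a few lines of plain Python. This is a single-process toy, not Hadoop, Storm, or Spark: it only shows the map, shuffle, and reduce steps on made-up purchase events, and the event fields are assumptions.

```python
from collections import defaultdict


def map_phase(event):
    """Emit one (key, value) pair per event: here, a count of 1 per category."""
    return (event["category"], 1)


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Combine all values for one key into a single result."""
    return sum(values)


events = [
    {"category": "books", "action": "purchase"},
    {"category": "shoes", "action": "purchase"},
    {"category": "books", "action": "purchase"},
]
grouped = shuffle(map(map_phase, events))
counts = {key: reduce_phase(values) for key, values in grouped.items()}
```

In a real cluster the map and reduce functions run on many machines and the shuffle moves data over the network; the functions themselves stay this small.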

Page 13: Scalable web architecture

#mychoice?

• HBase, Mongo, neo4j are cool

• operational maturity

• expertise / skills

• MySQL / PostgreSQL

• Every computer engineer would have learnt this in college

• Start with a simple solution, capture right signals, know when to scale

Page 14: Scalable web architecture

Signals to capture

• Disk usage

• RAM usage

• size of indexes

• Disk / RAM ratio

• Slow logs

• Table crashes

• Box crashes

• Number of queries

• Locks? Lock wait timeouts?
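
A few of these signals can be turned into simple checks. The sketch below computes the disk/RAM ratio and flags two conditions; the 80% disk threshold and the function name `check_signals` are assumptions chosen for illustration, not values from the talk.

```python
def check_signals(disk_used_gb, disk_total_gb, ram_gb, index_size_gb,
                  disk_alert_fraction=0.8):
    """Return the disk/RAM ratio and a list of warnings for one database box."""
    warnings = []
    if disk_used_gb / disk_total_gb >= disk_alert_fraction:
        warnings.append("disk usage above threshold")
    if index_size_gb > ram_gb:
        # Once indexes spill out of memory, lookups start hitting disk.
        warnings.append("indexes no longer fit in RAM")
    ratio = disk_used_gb / ram_gb
    return ratio, warnings


ratio, warnings = check_signals(
    disk_used_gb=850, disk_total_gb=1000, ram_gb=64, index_size_gb=80
)
```

Tracking the ratio over time, rather than a single reading, is what tells you when to scale.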

Page 15: Scalable web architecture

Scaling up database layer

• Probably the hardest

• Inherently stateful!

• Replication is a must

• Large data-sets! - GBs, TBs, PBs - and they keep growing

• Fault tolerance is harder

• “last mile” of complete web-stack scalability

Page 16: Scalable web architecture

Challenges for high volume MySQL

• Indexes don’t fit in memory any more!

• schema changes are harder / impossible

• frequent table crashes

• Reliable backup-restore

• locking issues

Page 17: Scalable web architecture

Sharding

• Scale out

• MySQL clustering

DB Service

Routing Table

Database, Database, Database
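
One way to picture the DB service with its routing table: hash each key into a fixed number of buckets, and let the routing table map buckets to shards. This is a sketch under assumptions (bucket count, shard names `db-1` to `db-3`, MD5 as the stable hash), not the routing scheme Sokrati actually uses.

```python
import hashlib


class ShardRouter:
    """Route a key to a shard through a bucket-to-shard routing table."""

    def __init__(self, routing_table):
        # routing_table maps bucket number -> shard name (hypothetical names).
        self._routing_table = dict(routing_table)
        self._buckets = len(self._routing_table)

    def shard_for(self, key):
        # A stable hash keeps routing consistent across processes and restarts.
        digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % self._buckets
        return self._routing_table[bucket]

    def move_bucket(self, bucket, shard):
        # Re-sharding: only this bucket's data has to move to the new shard.
        self._routing_table[bucket] = shard


ROUTING_TABLE = {0: "db-1", 1: "db-1", 2: "db-2", 3: "db-2", 4: "db-3", 5: "db-3"}
router = ShardRouter(ROUTING_TABLE)
shard = router.shard_for(842378947)
```

Keeping more buckets than shards means a hot or oversized shard can be split by reassigning buckets, without rehashing every key.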

Page 18: Scalable web architecture

Helps?

• Small databases are fast

• Bigger ones are slower

• keep them small and reap the benefits

• Run queries using parallel processing and collate the results

• Keep collecting stats!

• Re-shard when needed

• replication lag can result in lost transactions
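
"Run queries using parallel processing and collate the results" is often called scatter-gather. A minimal sketch, with in-memory lists standing in for the separate shard databases (the shard names and rows are made up):

```python
from concurrent.futures import ThreadPoolExecutor

# Each shard holds a slice of the orders; lists stand in for real databases.
SHARDS = {
    "db-1": [{"category": "books", "amount": 10}, {"category": "shoes", "amount": 40}],
    "db-2": [{"category": "books", "amount": 15}],
    "db-3": [{"category": "toys", "amount": 5}, {"category": "books", "amount": 20}],
}


def query_shard(rows, category):
    """Run the same partial aggregate on one shard."""
    return sum(row["amount"] for row in rows if row["category"] == category)


def total_for_category(category):
    """Fan the query out to every shard in parallel, then collate the partial sums."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = pool.map(lambda rows: query_shard(rows, category), SHARDS.values())
        return sum(partials)


books_total = total_for_category("books")
```

This works cleanly for aggregates that combine associatively (sums, counts, max); queries such as medians or cross-shard joins need more care.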

Page 19: Scalable web architecture

#NoSQL

• Johan Oskarsson

• In-Memory database

• eventual consistency

• no transactional support

• Typical NoSQL DBs

• Document databases

• key/value store

• Hybrid

• graph databases

• columnar databases

Page 20: Scalable web architecture

Criteria for choosing a DB

• ACID Properties

• Join support?

• Performance (inserts, updates, queries, deletes)

• Machine requirements -> TCO

• Community edition / enterprise edition / community support

• Schemaless?

• scalable?

• write-to-master-read-from-slave

• Always consistent / eventual consistency

• Business problem being solved
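
Two of these criteria, write-to-master-read-from-slave and eventual consistency, interact. The toy below makes the replication step explicit so the stale-read window is visible; the class `ReplicatedStore` and the key `stock:42` are hypothetical, and real replication happens asynchronously in the database rather than through a manual call.

```python
class ReplicatedStore:
    """Writes go to the master; reads come from a replica that may lag behind."""

    def __init__(self):
        self.master = {}
        self.replica = {}
        self._pending = []  # changes written to the master but not yet replicated

    def write(self, key, value):
        self.master[key] = value
        self._pending.append((key, value))

    def read(self, key):
        # Reads never touch the master, so they can return stale data.
        return self.replica.get(key)

    def replicate(self):
        """Apply pending changes to the replica, as asynchronous replication eventually would."""
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()


store = ReplicatedStore()
store.write("stock:42", 7)
stale = store.read("stock:42")   # replica has not caught up yet
store.replicate()
fresh = store.read("stock:42")   # now consistent with the master
```

Whether that stale window is acceptable depends on the business problem: fine for a recommendations widget, usually not for inventory at checkout.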

Page 21: Scalable web architecture

Document stores

{
  "customer_id": 842378947,
  "customer": {
    "name": "Harshad",
    "company": "Sokrati",
    "interestAreas": "Algorithms, Analytics"
  },
  "Address": { . . }
}
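
A document store lets you read nested fields by path without a fixed schema. The sketch below parses a trimmed version of the slide's document (the elided address block is left out) and looks up fields with a small helper; `get_path` is a hypothetical helper, not the API of MongoDB or any other product.

```python
import json

RAW_DOCUMENT = """
{
  "customer_id": 842378947,
  "customer": {
    "name": "Harshad",
    "company": "Sokrati",
    "interestAreas": "Algorithms, Analytics"
  }
}
"""


def get_path(document, dotted_path, default=None):
    """Walk a nested document using a dotted path such as 'customer.company'."""
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return default  # missing fields are normal in schemaless data
        current = current[part]
    return current


doc = json.loads(RAW_DOCUMENT)
company = get_path(doc, "customer.company")
```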

Page 22: Scalable web architecture

Column Family

Page 23: Scalable web architecture

Key-value stores

Key1 -> Value1

Key2 -> Value2

. . .

KeyN -> ValueN
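
The key-value model is small enough to sketch directly: values are opaque, and the only access path is the key. The class and the `cart:` key naming are assumptions for illustration.

```python
class KeyValueStore:
    """Minimal in-memory key-value store: put, get, and delete by key only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Deleting a missing key is not an error.
        self._data.pop(key, None)


store = KeyValueStore()
store.put("cart:842378947", ["book-17", "shoe-3"])
cart = store.get("cart:842378947")
```

There is no way to ask "which carts contain book-17" without scanning every value, which is the trade-off for the speed and simplicity of key lookups.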

Page 24: Scalable web architecture

Graph Databases

Page 25: Scalable web architecture

Let's redesign our solution for scale!

Page 26: Scalable web architecture

Problem statement

• Let's design an e-com website which should

• capture all user interactions (every event)

• run analytics and come up with good recommendations

• have a stock ticker for the owner to monitor performance across categories

Page 27: Scalable web architecture

UI (user)

UI (Owner)

Kafka

Data collector

DB Service

Database

Database

MySQL

ETL

Database

Database

Columnar DB

Service

Analytics algorithms

Mongo

Service

Data Collector

Graph DB
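
The event path in this design (UI, then Kafka, then a data collector feeding the owner's ticker) can be sketched with Python's standard `queue.Queue` standing in for a Kafka topic. Nothing here uses a real Kafka client; the event fields and function names are assumptions.

```python
import queue
from collections import Counter

event_log = queue.Queue()  # stand-in for a Kafka topic
ticker = Counter()         # per-category conversions shown to the owner


def track(event):
    """Called by the user-facing UI for every interaction."""
    event_log.put(event)


def collect():
    """Data collector: drain the queue and update conversion counts."""
    processed = 0
    while True:
        try:
            event = event_log.get_nowait()
        except queue.Empty:
            break
        if event["type"] == "purchase":
            ticker[event["category"]] += 1
        processed += 1
    return processed


track({"type": "view", "category": "books"})
track({"type": "purchase", "category": "books"})
track({"type": "purchase", "category": "shoes"})
processed = collect()
```

The queue decouples the website from the collectors: the UI only appends events, so a slow database or analytics job does not slow down page loads.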

Page 28: Scalable web architecture

Collect every stat! Monitor every event!

Page 29: Scalable web architecture

Sokrati architecture evolution

Single MySQL server

Sharded MySQL Solution

HBase

Database as a service

Sharded Columnar DBs

Page 30: Scalable web architecture

DB As A Service

• We decided to build our DB warehouse as a service

• for it makes developers' lives easier

• for it makes schema modifications seamless

• for it makes database choice more flexible

• for it lets app teams focus exclusively on business logic

• One service to rule all data :-)
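
A minimal sketch of the idea: app code talks to one service interface, and the storage backend behind it can be swapped without touching business logic. `DataService` and `InMemoryBackend` are hypothetical names; a real backend would wrap MySQL, HBase, or a columnar store behind the same two methods.

```python
class InMemoryBackend:
    """Stand-in storage backend exposing insert and find."""

    def __init__(self):
        self._tables = {}

    def insert(self, table, row):
        self._tables.setdefault(table, []).append(dict(row))

    def find(self, table, field, value):
        return [row for row in self._tables.get(table, []) if row.get(field) == value]


class DataService:
    """Single entry point for app teams; the backend is an internal detail."""

    def __init__(self, backend):
        self._backend = backend

    def save_event(self, event):
        self._backend.insert("events", event)

    def events_for(self, customer_id):
        return self._backend.find("events", "customer_id", customer_id)


service = DataService(InMemoryBackend())
service.save_event({"customer_id": 1, "type": "view"})
service.save_event({"customer_id": 2, "type": "purchase"})
service.save_event({"customer_id": 1, "type": "purchase"})
history = service.events_for(1)
```

Because callers never see table layouts, a schema change or a move to a different database stays inside the service.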

Page 31: Scalable web architecture

Take-aways

• All the databases are here to stay

• Your solution will have a combination of databases

• Choose the right one for your problem

• Business needs drive selection

• collect every stat, monitor every event!

• Be prepared for a failure

Page 33: Scalable web architecture

At Sokrati we do all of this and we are hiring!

• Send your resumes to [email protected] !