Aviran Mordo, Head of Back-end Engineering (@aviranm, linkedin.com/in/aviran, aviransplace.com): Scaling Wix with Microservices Architecture

Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Jul 17, 2015


Engineering

Aviran Mordo
Transcript
Page 1: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Aviran Mordo

Head of Back-end Engineering

@aviranm

linkedin.com/in/aviran

aviransplace.com

Scaling Wix with Microservices Architecture

Page 2: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015
Page 3: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Wix in Numbers

Over 61M users

1.5M new users/month

Static storage is >2PB of data

1.5TB new files/day

3 data centers + 3 clouds (Google, Amazon, Azure)

1.5B HTTP requests/day

900 people work at Wix, of whom ~300 are in R&D
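To put the traffic figures above in per-second terms, here is a rough back-of-the-envelope calculation. It only derives daily averages from the numbers on this slide; peak load is not given in the deck and will be higher. The variable names and the choice of 1 TB = 10**12 bytes are assumptions made for illustration.

```python
# Back-of-the-envelope averages derived from the figures on this slide.
SECONDS_PER_DAY = 24 * 60 * 60

requests_per_day = 1.5e9           # "1.5B HTTP requests/day"
new_file_bytes_per_day = 1.5e12    # "1.5TB new files/day", assuming 1 TB = 10**12 bytes

avg_requests_per_second = requests_per_day / SECONDS_PER_DAY
avg_ingest_mb_per_second = new_file_bytes_per_day / SECONDS_PER_DAY / 1e6

print(f"average HTTP load: {avg_requests_per_second:,.0f} requests/s")
print(f"average new-file ingest: {avg_ingest_mb_per_second:,.1f} MB/s")
```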

Page 4: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Initial Architecture

Built for fast development

Stateful login (Tomcat session), Ehcache, file uploads

No consideration for performance, scalability and testing

Intended for short-term use

Tomcat, Hibernate, custom web framework

Lighttpd (file serving)

MySQL DB

Wix (Tomcat)

Page 5: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

The Monolithic Giant

One monolithic server that handled everything

Dependency between features

Changes in unrelated areas of the system caused deployment of the whole system

Failures in unrelated areas caused system-wide downtime

Page 6: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Breaking the System Apart

Page 7: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Concerns and SLA

Edit websites: data validation, security / authentication, data consistency, lots of data

Serving media: high availability, high performance, lots of static files, very high traffic volume, viewport optimization, long tail (immutable)

View sites created by the Wix editor: high availability, high performance, high traffic volume, long tail (mutable)

Page 8: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Wix Segmentation

1. Editor Segment

2. Media Segment

3. Public Segment

Networking

Page 9: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

HTML Editor

Flash Editor

MSM

Private Media

Public Media

Editor Segment

Public Segment

Premium Services

eCommerce

List DB

App Builder

App Store

App Market

Dashboard

Statics/media

Mailer

TimeZone

Public HTML API

Public API (Flash)

MSP

Public Server

HTML Renderer

HTML SEO Renderer

Flash Renderer

Flash SEO Renderer

Sitemap Renderer

Robots.txt Renderer

User Server

Template Viewer

Contacts

HUB

Activity

Site Members

Provided Mailing Service

Comments

Snapshoter

User Pref

Feed Me

Shout-out

Hotels

PETRI

Site Pref

Dist Logger

Slicer

eComRenderer

eCom Cart

eComCheckout

eComCatalog

eComOrders

Payment Facade

Account Info

HTML API

HTML Embeder

Blog

Mobile

Page 10: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Microservices Guidelines

Each service has its own database (if one is needed)

Only one service can write to a specific DB table(s)

There may be additional read-only services that directly access the DB (for performance reasons)

Services are stateless

No DB transactions

Cache is not a building block, but an optimization
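One way to read "cache is not a building block, but an optimization" is that a service must return the same answers with the cache removed, only more slowly. The sketch below illustrates that reading. `PrefService`, `load_from_db`, and the dictionary used as a cache are hypothetical names for illustration, not Wix code.

```python
from typing import Callable, Dict, Optional


class PrefService:
    """Hypothetical service: the database is the source of truth, the cache is optional."""

    def __init__(
        self,
        load_from_db: Callable[[str], str],
        cache: Optional[Dict[str, str]] = None,
    ) -> None:
        self._load_from_db = load_from_db
        self._cache = cache  # may be None; the service must still work

    def get(self, key: str) -> str:
        # Fast path: use the cache only if one was provided and it has the key.
        if self._cache is not None and key in self._cache:
            return self._cache[key]
        # Slow path: always correct, because the DB owns the data.
        value = self._load_from_db(key)
        if self._cache is not None:
            self._cache[key] = value
        return value
```

Constructing the service with `cache=None` and with `cache={}` should give identical results, which is a cheap check that no logic has come to depend on the cache.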

Page 11: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015
Page 12: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Microservices Tradeoffs

Each service has its own database (if one is needed)

Easy to scale microservices based on SLA concerns

Tradeoff – system complexity, performance

Only one service can write to a specific DB table(s)

De-coupling architecture – faster development

Tradeoff – system complexity / performance

There may be additional read-only services that access the DB

Performance gain

Tradeoff – coupling

Services are stateless

Easy to scale out (just add more servers)

Tradeoff – performance / consistency

No DB transactions

Better DB performance, easier to scale

Tradeoff – system complexity

Page 13: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

1. Editor Segment

Page 14: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Editor Server

Immutable JSON pages (~2.5M / day)

Site revisions

Active-standby MySQL across data centers

Editor Server

MySQL Active Sites

MySQL Archive

Page 15: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015
Page 16: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Protect The Data

DB outage with fast recovery = replication

Data poisoning/corruption = revisions / backup

Make the data available at all times = data distribution to multiple locations / providers

Page 17: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Saving Editor Data

Diagram components: Browser, Editor Server, Static Grid, Google Cloud Storage, MySQL Active Sites, MySQL Archive, Archive (Amazon), Archive (Google)

Diagram arrows: Save Page(s), 200 OK, Save Page, Upload, Notify, Download Page, DC replication

Page 18: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Self Healing Process

Diagram components: Browser, Editor Server, Static Grid, Google Cloud Storage, MySQL Active Sites, MySQL Archive, Archive (Amazon), Archive (Google)

Diagram arrows: Save Page(s), 200 OK, Save Page, Upload, Notify, Download Page, DC replication

Page 19: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

No DB Transactions

Save each page (JSON) as an atomic operation

Page ID is a content-based hash (immutable/idempotent)

Finalize transaction by sending site header (list of pages)

Can generate orphaned pages, not a problem in practice
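The scheme on this slide can be sketched roughly as follows: each page is stored under a key derived from its content, and the site header is written last as the single step that makes the new revision visible. This is a minimal in-memory sketch. The choice of SHA-256, the canonical JSON encoding, and the names `page_store`, `site_headers`, `save_page`, and `save_site` are assumptions for illustration; the deck does not say which hash or storage layout Wix uses.

```python
import hashlib
import json
from typing import Dict, List

page_store: Dict[str, str] = {}          # page_id -> page JSON (stand-in for the pages table)
site_headers: Dict[str, List[str]] = {}  # site_id -> ordered list of page ids


def save_page(page: dict) -> str:
    """Store one page atomically under a content-based ID and return that ID."""
    # Canonical encoding so that equal content always produces the same hash.
    body = json.dumps(page, sort_keys=True, separators=(",", ":"))
    page_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
    # Idempotent: saving the same content again writes the same key and value.
    page_store[page_id] = body
    return page_id


def save_site(site_id: str, pages: List[dict]) -> List[str]:
    """Save every page, then 'commit' by writing the site header last."""
    page_ids = [save_page(page) for page in pages]
    # If the process dies before this line, the saved pages are orphans:
    # unreferenced, harmless, and never visible to readers.
    site_headers[site_id] = page_ids
    return page_ids
```

Because a retry after a partial failure rewrites identical keys with identical values, the client can simply resend the whole save without any rollback logic.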

Page 20: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

2. Media Segment

Page 21: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Prospero – Wix Media Storage

2PB user media files

3M files uploaded daily

800M metadata records

Dynamic media processing

• Picture resize, crop and sharpen “on the fly”

• Watermark

• Audio format conversion

Page 22: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Prospero – Wix Media Manager

Diagram locations: CDN, Austin, Google Cloud, Amazon (instance-count labels: x36, x36, x32)

Diagram arrows: get image.jpg, If not in CDN, First fallback, Second fallback
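The fallback labels on this diagram suggest an ordered chain of media sources that is tried until one returns the file. A minimal sketch of that idea follows. The `Fetcher` type and `fetch_with_fallback` are hypothetical, and the deck does not state which location is primary or in what order the fallbacks are tried, so the ordering is left to the caller.

```python
from typing import Callable, Optional, Sequence

# A fetcher returns the file bytes, returns None on a miss, or raises on an outage.
Fetcher = Callable[[str], Optional[bytes]]


def fetch_with_fallback(name: str, sources: Sequence[Fetcher]) -> Optional[bytes]:
    """Try each source in order and return the first hit, or None if all fail."""
    for fetch in sources:
        try:
            data = fetch(name)
        except Exception:
            # Treat an unreachable source like a miss and move to the next one.
            continue
        if data is not None:
            return data
    return None
```

A caller would pass something like `[primary, first_fallback, second_fallback]`, with the CDN sitting in front of the whole chain so that most requests never reach it.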

Page 23: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

3. Public Segment

Page 24: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Public Segment Roles

Routing (resolve URLs)

Dispatching (to a renderer)

Rendering (HTML, XML, TXT)

Public Server

HTML Renderer

HTML SEO Renderer

Flash Renderer

Sitemap Renderer

Robots.txt Renderer

www.example.com

Flash SEO Renderer
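As an illustration of the dispatching role, the sketch below picks one of the renderers named on this slide for an already-routed request. The rules themselves are assumptions: the deck does not say how the Public Server chooses a renderer, so the paths, the crawler markers in `BOT_MARKERS`, and the `site_type` parameter are all hypothetical.

```python
# Hypothetical crawler markers; a real list would be longer and maintained elsewhere.
BOT_MARKERS = ("googlebot", "bingbot")


def choose_renderer(path: str, user_agent: str, site_type: str) -> str:
    """Return the name of the renderer that should handle this request."""
    # Fixed-path resources get their own small renderers.
    if path == "/robots.txt":
        return "Robots.txt Renderer"
    if path == "/sitemap.xml":
        return "Sitemap Renderer"
    # Assumption: crawlers get the SEO variant, browsers get the regular one.
    is_bot = any(marker in user_agent.lower() for marker in BOT_MARKERS)
    family = "Flash" if site_type == "flash" else "HTML"
    return f"{family} SEO Renderer" if is_bot else f"{family} Renderer"
```

Keeping the dispatcher this thin fits the deck's later advice to minimize business logic on the public path.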

Page 25: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Public SLA

Our goal: 99% of responses in under 100 ms at peak traffic
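A goal of this kind is usually checked against a latency percentile rather than an average. The sketch below shows one way to test a batch of measured latencies against it, using the nearest-rank percentile definition. The function names and the choice of nearest-rank are assumptions; the deck does not describe how Wix measures this.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Return the smallest sample such that at least pct% of samples are <= it."""
    if not samples:
        raise ValueError("at least one sample is required")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_sla(latencies_ms: Sequence[float], pct: float = 99.0, limit_ms: float = 100.0) -> bool:
    """True if the pct-th percentile latency is under the limit."""
    return percentile_nearest_rank(latencies_ms, pct) < limit_ms
```

With 100 samples, a single slow outlier still passes the check, while two slow responses push the 99th percentile over the limit.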

Page 26: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Publish Site

Publish site header (a map of pages for a site)

Publish routing table

Publish site header / routes (CQRS)

Editor Segment

Public Segment
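The publish step can be read as a simple form of CQRS: the editor segment owns the writable model, and publishing copies a read-optimized snapshot (site header plus routes) to the public segment. A minimal in-memory sketch under that reading is shown below; the dictionaries and the `publish` and `resolve` functions are hypothetical stand-ins for the real stores and services.

```python
from typing import Dict, List

# Write side (editor segment): site_id -> current list of page ids.
editor_sites: Dict[str, List[str]] = {}

# Read side (public segment): denormalized, keyed for direct lookup.
public_headers: Dict[str, List[str]] = {}  # site_id -> published page ids
public_routes: Dict[str, str] = {}         # hostname -> site_id


def publish(site_id: str, hostname: str) -> None:
    """Copy a snapshot of the site header and its route to the read side."""
    public_headers[site_id] = list(editor_sites[site_id])  # snapshot, not a reference
    public_routes[hostname] = site_id


def resolve(hostname: str) -> List[str]:
    """Public-side lookup: two key reads, no access to the editor segment."""
    return public_headers[public_routes[hostname]]
```

Edits made after a publish stay invisible to visitors until the next publish, which is the behavior a site owner expects from a draft.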

Page 27: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Built For Speed

Minimize out-of-process hops (2 DB, 1 RPC)

Lookup tables are cached in memory, updated every few minutes

Denormalized data – optimize for read by primary key (MySQL)

Minimize business logic
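The in-memory lookup tables mentioned above can be sketched as a table that is reloaded once its age exceeds a time-to-live. This version refreshes lazily on read to stay short; a background refresh thread is an equally plausible design. `RefreshingLookup`, `loader`, and the 300-second default are assumptions, since the deck only says "every few minutes".

```python
import time
from typing import Callable, Dict, Optional


class RefreshingLookup:
    """In-memory lookup table that reloads itself when it is older than ttl_seconds."""

    def __init__(
        self,
        loader: Callable[[], Dict[str, str]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._table = loader()
        self._loaded_at = clock()

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        # Reload the whole table if it has expired, then answer from memory.
        if self._clock() - self._loaded_at >= self._ttl:
            self._table = self._loader()
            self._loaded_at = self._clock()
        return self._table.get(key, default)
```

Between refreshes every lookup is a plain dictionary read, so the table adds no out-of-process hop to the request path.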

Page 28: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

How a Page Gets Rendered

Bootstrap HTML template that contains only data

Only JavaScript imports

JSON data (site-header + dynamic data)

No “real” HTML view
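To make the idea concrete, here is a rough sketch of a server-side function that emits such a bootstrap page: JSON data embedded in a script tag, a list of JavaScript imports, and an empty body for the client code to fill in. The function name, the `window.siteModel` variable, and the JSON field names are hypothetical; the deck does not show the actual template.

```python
import html
import json
from typing import Sequence


def bootstrap_html(site_header: dict, dynamic_data: dict, script_urls: Sequence[str]) -> str:
    """Build a page that carries only data and script imports, no rendered view."""
    payload = json.dumps({"siteHeader": site_header, "dynamicData": dynamic_data})
    # Escape "</" so user data containing "</script>" cannot close the tag early.
    payload = payload.replace("</", "<\\/")
    imports = "\n".join(
        f'<script src="{html.escape(url, quote=True)}"></script>' for url in script_urls
    )
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f"<script>window.siteModel = {payload};</script>\n"
        f"{imports}\n"
        "</head><body></body></html>"
    )
```

The server does string assembly only; layout and DOM construction happen in the imported JavaScript on the visitor's machine.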

Page 29: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Offload rendering work to the browser

Page 30: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

The average Intel Core i750 can push up to 7 GFLOPS without overclocking

Page 31: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Why JSON?

Easy to parse in JavaScript and Java/Scala

Fairly compact text format

Highly compressible (5:1 even for small payloads)

Easy to fix rendering bugs (just deploy new code)
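The compressibility claim is easy to check against your own payloads. The sketch below measures the gzip ratio of a JSON document; the sample page structure is invented for illustration, and the ratio you get will depend on how repetitive the real data is.

```python
import gzip
import json


def compression_ratio(payload: dict) -> float:
    """Return uncompressed size divided by gzip-compressed size for a JSON payload."""
    raw = json.dumps(payload).encode("utf-8")
    return len(raw) / len(gzip.compress(raw))


# Hypothetical page: a list of similar components, as a site editor might produce.
sample_page = {
    "components": [
        {"type": "text", "x": i * 10, "y": i * 5, "text": "Hello"} for i in range(50)
    ]
}

print(f"gzip ratio: {compression_ratio(sample_page):.1f}:1")
```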

Page 32: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Minimum Number of Public Servers Needed to Serve 60M Sites

4

Page 33: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Public SLA

Be available 99.999%
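For a sense of scale, the availability target can be converted into a downtime budget. The helper below is a small arithmetic sketch; the function name and the 365-day year are assumptions.

```python
def downtime_budget_minutes(availability_percent: float, period_days: float = 365.0) -> float:
    """Minutes of allowed downtime in the period for a given availability target."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * period_days * 24 * 60


yearly = downtime_budget_minutes(99.999)
print(f"99.999% availability allows about {yearly:.2f} minutes of downtime per year")
```

Five nines leaves roughly five minutes per year, which is why the following slides focus on automatic fallbacks rather than manual recovery.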

Page 34: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Serving a Site – Sunny Day

Diagram components: Browser, LB, Public, Renderer, Archive, CDN, Statics

Diagram arrows: HTTP Request (http://example.wix.com), HTML, Store HTML to cache, Notify site view, Resources / Media

Page 35: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Serving a Site – DC Lost

Diagram components: Browser, LB / Public / Renderer (shown twice), Archive, CDN, Statics

Diagram arrows: HTTP Request (http://example.wix.com), Change DNS

Page 36: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Serving a Site – Public Lost

Diagram components: Browser, LB / Public / Renderer (shown twice), Archive

Diagram arrows: HTTP Request (http://example.wix.com), HTML, Get Cached HTML Version, Fallback to 2nd DC
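Taken together, the "Sunny Day" and "Public Lost" slides suggest a pattern: store the rendered HTML on every successful request, and serve that stored copy when no renderer can answer. The sketch below illustrates the pattern only. The order in which Wix tries the second data center and the cached copy is not recoverable from the diagram, so the order here is an assumption, as are the names `serve`, `Renderer`, and `html_cache`.

```python
from typing import Callable, Dict, Optional, Sequence

# A renderer returns HTML for a URL or raises if its data center is unavailable.
Renderer = Callable[[str], str]


def serve(url: str, renderers: Sequence[Renderer], html_cache: Dict[str, str]) -> Optional[str]:
    """Render via the first healthy renderer; fall back to the last stored HTML."""
    for render in renderers:  # assumed order: primary DC first, then the second DC
        try:
            page = render(url)
        except Exception:
            continue
        html_cache[url] = page  # sunny day: keep a copy for later outages
        return page
    # Every renderer failed: serve the last known HTML, if there is one.
    return html_cache.get(url)
```

A stale page is a degraded answer, but for a published site it is usually far better than an error.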

Page 37: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Living in the Browser

Diagram components: Browser, LB, Public, Renderer, Archive, CDN, Statics, Editor Pages

Diagram arrows: HTTP Request (http://example.wix.com), HTML, JSON / Media, Fallback, Fallback

Page 38: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Summary

Identify your critical path and concerns

Build redundancy in critical path (for availability)

De-normalize data (for performance)

Minimize out-of-process hops (for performance)

Take advantage of client’s CPU power

Page 39: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015
Page 40: Scaling Wix with microservices architecture and multi-cloud platforms - Reversim Summit 2015

Aviran Mordo

Head of Back-end Engineering

Q&A

@aviranm

linkedin.com/in/aviran

aviransplace.com

http://engineering.wix.com

http://goo.gl/wlq9Ih

@WixEng