
Spilgames Storage Platform

Nov 11, 2014

Technology

enriquepazperez

A presentation of the Spilgames Storage Platform, a massively scalable solution that abstracts your storage behind transparent sharding and lets you expand your datacenters worldwide.
Transcript
Page 1: Spilgames Storage Platform

• Introduction: Spilgames & Me
• Another Storage: Motivation
• System Properties: Overview
• Internals
• Versions & Shards
• Worldwide: Scaling Datacenters
• Disaster Scenarios
• Lessons Learned: Current Status
• Contributions

Spilgames Storage Platform

Enrique Paz

Senior Backend Developer

21/03/2013

1/42

Page 2: Spilgames Storage Platform

Introduction

Page 3: Spilgames Storage Platform

About Me

• Passionate Erlang developer
• Testing enthusiast
• Love beautiful code!

Page 4: Spilgames Storage Platform

Spilgames

• Gaming Platform
• Serving data to 190+ countries world-wide
• 200+ million unique users per month
• Multiple Platforms: Desktop, Mobile Native & Web
• 300+ employees
• Offices in The Netherlands & China
• Revenue: Advertising & EUM

Page 5: Spilgames Storage Platform

Gaming Portals

Page 6: Spilgames Storage Platform

Gaming Portals

Page 7: Spilgames Storage Platform

Games

Page 8: Spilgames Storage Platform

Another Storage

Page 9: Spilgames Storage Platform

Another Storage

Page 10: Spilgames Storage Platform

LAMP Stack & MySQL

• Not all developers are DB experts
• Difficult to shard the databases
• Storage model all over the place
• Security
• Performance
• Caching
• . . .

Page 11: Spilgames Storage Platform

Our Ambition

• Transparent sharding layer
• Sharding on data ownership
• High availability
• Centralized caching layer
• Storage engine agnostic
• One strict data model
• Transparent storage changes
• Scaling geographically

Page 12: Spilgames Storage Platform

System Properties

Page 13: Spilgames Storage Platform

Mindset

• Be always available
• Avoid global locks
• Accept change as the only constant
• Embrace inconsistencies
  - Hardware breaks down (power failures)
  - Version mismatches (upgrading system not atomic)
  - State mismatches (adding new machines)

Page 14: Spilgames Storage Platform

A Key-Value Store With Schema

• Buckets
  - Largely generated OTP applications
  - Offer a CRUD-like interface (with filters)
• GIDs (64 bits) identify the data owner
  - user
  - game
• Buckets can use different storage engines
  - Several MySQL tables in different databases
  - Just a binary storage (SWIFT)
  - . . .
• Data for a bucket/GID is cached
• Requests can be atomic per bucket/GID
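
The bucket idea can be illustrated with a minimal sketch. It is written in Python purely for illustration (the platform itself is Erlang), and the `Bucket` class, its method names, the `highscores` schema and the in-memory dict are all assumptions made for this example, not the SSP API. The slide only states that a bucket has a schema, offers a CRUD-like interface with filters, and is keyed by a 64-bit GID.

```python
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]
Filter = Callable[[Record], bool]


class Bucket:
    """Hypothetical schema-checked store keyed by a 64-bit GID (illustration only)."""

    def __init__(self, name: str, schema: Dict[str, type]) -> None:
        self.name = name
        self.schema = schema
        self._rows: Dict[int, List[Record]] = {}  # stand-in for a real storage engine

    def _check(self, gid: int, record: Record) -> None:
        # Reject GIDs outside 64 bits and records that do not match the schema.
        if not 0 <= gid < 2**64:
            raise ValueError("GID must fit in 64 bits")
        if set(record) != set(self.schema):
            raise ValueError("record fields do not match the bucket schema")
        for field, expected in self.schema.items():
            if not isinstance(record[field], expected):
                raise TypeError(f"{field} must be {expected.__name__}")

    def insert(self, gid: int, record: Record) -> None:
        self._check(gid, record)
        self._rows.setdefault(gid, []).append(dict(record))

    def get(self, gid: int, where: Optional[Filter] = None) -> List[Record]:
        rows = self._rows.get(gid, [])
        return [dict(r) for r in rows if where is None or where(r)]

    def update(self, gid: int, where: Filter, changes: Record) -> int:
        updated = 0
        for row in self._rows.get(gid, []):
            if where(row):
                self._check(gid, {**row, **changes})  # validate before mutating
                row.update(changes)
                updated += 1
        return updated

    def delete(self, gid: int, where: Filter) -> int:
        rows = self._rows.get(gid, [])
        kept = [r for r in rows if not where(r)]
        self._rows[gid] = kept
        return len(rows) - len(kept)


# All data for one owner (GID 42) lives under the same key.
scores = Bucket("highscores", {"game": str, "score": int})
scores.insert(42, {"game": "chess", "score": 1200})
scores.insert(42, {"game": "pool", "score": 300})
top = scores.get(42, where=lambda r: r["score"] > 1000)
```

Because every operation names a single bucket and GID, caching and per-owner atomicity can both be scoped to that pair.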

Page 15: Spilgames Storage Platform

Optimistic Operations

• Speed > Consistency
• Losing some updates in case of a crash is affordable
• Act first on cache and then on disk
• No guarantees of eventual consistency upon crashes

  - e.g. Activity feeds
  - e.g. Popular games list
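
A minimal sketch of the optimistic write path, in Python for illustration only. The `OptimisticStore` class, the pending queue and the explicit `flush`/`crash` methods are assumptions of this example. The slide only says that the cache is updated first, the disk afterwards, and that updates may be lost on a crash.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class OptimisticStore:
    """Cache-first writes; persistence happens later (hypothetical sketch)."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}
        self.disk: Dict[str, str] = {}  # stand-in for the storage engine
        self._pending: Deque[Tuple[str, str]] = deque()

    def put(self, key: str, value: str) -> None:
        self.cache[key] = value             # 1. act on the cache and return at once
        self._pending.append((key, value))  # 2. remember to persist later

    def flush(self) -> int:
        """Write queued updates to disk; returns how many were persisted."""
        persisted = 0
        while self._pending:
            key, value = self._pending.popleft()
            self.disk[key] = value
            persisted += 1
        return persisted

    def crash(self) -> None:
        """Simulate a node crash: the cache and any unflushed updates are gone."""
        self.cache.clear()
        self._pending.clear()


feed = OptimisticStore()
feed.put("feed:42", "played chess")
feed.flush()
feed.put("feed:42", "played pool")  # acknowledged, but never flushed
feed.crash()
# feed.disk still holds only the first update.
```

The second update was visible to readers until the crash and then disappeared, which is the trade-off accepted for data such as activity feeds.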

Page 16: Spilgames Storage Platform

Pessimistic Operations

• Consistency is key and confirmation is required
• Dealing with critical data
• Persist data and, upon success, update cache

  - e.g. Payments
  - e.g. Personal information
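
A minimal sketch of the pessimistic write path, again in Python for illustration. The `PessimisticStore` class and the `persist` callback are assumptions of this example. The only behaviour taken from the slide is the ordering: persist first, and touch the cache only after the storage engine has confirmed.

```python
from typing import Callable, Dict


class PessimisticStore:
    """Persist first; update the cache only on confirmed success (hypothetical sketch)."""

    def __init__(self, persist: Callable[[str, str], None]) -> None:
        self.cache: Dict[str, str] = {}
        self._persist = persist  # raises OSError when the write fails

    def put(self, key: str, value: str) -> bool:
        try:
            self._persist(key, value)  # 1. durable write, caller waits for it
        except OSError:
            return False               # cache untouched, caller is told
        self.cache[key] = value        # 2. only now expose the value via the cache
        return True


disk: Dict[str, str] = {}
store = PessimisticStore(disk.__setitem__)
ok = store.put("payment:1", "paid")


def broken(key: str, value: str) -> None:
    raise OSError("storage unavailable")


failing = PessimisticStore(broken)
failed = failing.put("payment:2", "paid")
```

Unlike the optimistic path, a failed write never becomes visible to readers, at the cost of waiting for the storage engine on every request.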

Page 17: Spilgames Storage Platform

How does it work?

Page 18: Spilgames Storage Platform

System Components

• lookup application in all nodes. Uses a hashing ring (mnesia):
  - replicated in all nodes
  - ram_copies
  - dirty reads
  - transactional writes
• Buckets have Pipeline Factories (PFs)
• Buckets register PFs in lookup
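
The slides do not show how the hashing ring is laid out, so the following is only a generic consistent-hashing sketch in Python, not the lookup application's implementation. The `HashRing` class, the use of SHA-1, the virtual-node count and the idea that the ring maps a GID to a node are all assumptions of this example.

```python
import bisect
import hashlib
from typing import List, Sequence, Tuple


def _point(key: str) -> int:
    """Map a string to a position on the ring (first 8 bytes of its SHA-1)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustration only)."""

    def __init__(self, nodes: Sequence[str], vnodes: int = 64) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self._ring: List[Tuple[int, str]] = sorted(
            (_point(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, gid: int) -> str:
        # The first ring position clockwise from the key owns it; wrap at the end.
        idx = bisect.bisect_right(self._points, _point(str(gid))) % len(self._ring)
        return self._ring[idx][1]


three = HashRing(["node_a", "node_b", "node_c"])
two = HashRing(["node_a", "node_b"])
owner = three.node_for(42)
```

The property that makes a ring attractive here is that removing a node only remaps the keys that node owned; every other GID keeps its owner, so a small replicated table read with cheap dirty reads stays valid for most requests while membership changes.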

Page 19: Spilgames Storage Platform

Tracing Pessimistic Operations

1. Bucket/GID request in a node
2. Local lookup to find a PF
3. PF receives request
4. PF builds job
5. PF ensures Pipeline for GID
6. PF queues operation in Pipeline
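
The six steps above can be sketched as follows, in Python for illustration only. In the real system these would be Erlang processes; here `Pipeline`, `PipelineFactory`, the `lookup` dict and the string-returning jobs are hypothetical stand-ins that only show the control flow: one factory per bucket, one pipeline per GID, and jobs executed strictly in arrival order within a pipeline.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Job = Callable[[], str]


class Pipeline:
    """FIFO of jobs for a single GID; jobs run strictly one after another."""

    def __init__(self, gid: int) -> None:
        self.gid = gid
        self._jobs: Deque[Job] = deque()

    def enqueue(self, job: Job) -> None:
        self._jobs.append(job)

    def drain(self) -> List[str]:
        results = []
        while self._jobs:
            results.append(self._jobs.popleft()())
        return results


class PipelineFactory:
    """One factory per bucket; it owns the per-GID pipelines."""

    def __init__(self, bucket: str) -> None:
        self.bucket = bucket
        self._pipelines: Dict[int, Pipeline] = {}

    def build_job(self, op: str, gid: int, value: str) -> Job:
        # Step 4: a job is just a deferred operation in this sketch.
        return lambda: f"{self.bucket}:{op}:{gid}:{value}"

    def ensure_pipeline(self, gid: int) -> Pipeline:
        # Step 5: reuse the pipeline for this GID, or create it on first use.
        if gid not in self._pipelines:
            self._pipelines[gid] = Pipeline(gid)
        return self._pipelines[gid]

    def handle(self, op: str, gid: int, value: str) -> Pipeline:
        job = self.build_job(op, gid, value)
        pipeline = self.ensure_pipeline(gid)
        pipeline.enqueue(job)  # Step 6
        return pipeline


# Step 2: buckets register their factory in lookup (a plain dict here).
lookup: Dict[str, PipelineFactory] = {"profile": PipelineFactory("profile")}


def request(bucket: str, op: str, gid: int, value: str) -> Pipeline:
    return lookup[bucket].handle(op, gid, value)  # Steps 1 to 3


first = request("profile", "insert", 7, "a")
second = request("profile", "update", 7, "b")
other = request("profile", "insert", 8, "c")
```

Requests for the same GID land in the same pipeline and are serialized, while requests for different GIDs get independent pipelines that could run in parallel.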

Page 20: Spilgames Storage Platform

Wait a minute. . .

• “Why do we need pipelines?”
• “Sequential == Bottleneck !!!”
• “Don’t you guys know Erlang is about parallelizing work?”

Page 21: Spilgames Storage Platform

About Pipelines

• CONS
  - Sequential (read) access for hotspots, e.g. Popular games
  - Optimization: read from SSP cache in PF
• PROS
  - No need for storage engines to support global locks
  - A bucket can combine several engines
• Requests to most GIDs (users) are evenly distributed

Page 24: Spilgames Storage Platform

Schema Versions

• Schema Versions determine allowed operations and storage(s)
• Client is not aware of them
• Max 2 schema versions of a bucket at a time
• Schema version can be changed at bucket/GID level

Page 25: Spilgames Storage Platform

Shards

• Useful for partitioning big blocks of data
• Shard points to the physical location of the data
• Sharding rules are bucket specific. Default is GID % Shards
• bucket/GID combinations can be migrated between shards
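
A minimal sketch of the default rule and of per-GID migration, in Python for illustration. The `ShardMap` class and its override table are assumptions of this example. Only the default rule `GID % Shards`, the bucket-specific rule and the ability to migrate a bucket/GID combination come from the slide.

```python
from typing import Callable, Dict, Optional


class ShardMap:
    """Shard selection for one bucket (hypothetical sketch)."""

    def __init__(self, shard_count: int,
                 rule: Optional[Callable[[int], int]] = None) -> None:
        if shard_count <= 0:
            raise ValueError("shard_count must be positive")
        self.shard_count = shard_count
        # Default rule from the slide: GID % Shards. A bucket may supply its own.
        self._rule = rule or (lambda gid: gid % shard_count)
        self._overrides: Dict[int, int] = {}  # GIDs that were migrated explicitly

    def shard_for(self, gid: int) -> int:
        if gid in self._overrides:
            return self._overrides[gid]
        return self._rule(gid)

    def migrate(self, gid: int, shard: int) -> None:
        if not 0 <= shard < self.shard_count:
            raise ValueError("unknown shard")
        self._overrides[gid] = shard


profiles = ShardMap(shard_count=4)
before = profiles.shard_for(10)  # 10 % 4 == 2
profiles.migrate(10, 3)
after = profiles.shard_for(10)   # the override wins: 3
```

Keeping migrated GIDs in a separate table is one simple way to let a single owner move without changing the rule for everyone else.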

Page 26: Spilgames Storage Platform

Working With Versions & Shards

1. insert(Bucket, Gid, Value)
2. insert(Gid, Value)
3. get_vs(Bucket, Gid)
4. {v2, Shard1}
5. build_job(insert, Gid, Shard1)
6. {ok, InsertJob}
7. {ok, InsertJob}
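
The message sequence above can be mimicked in a few lines of Python. The names `insert`, `get_vs` and `build_job` are taken from the slide, but their signatures, the `Job` tuple, the `VS_TABLE` contents and the default version and shard are assumptions made for this sketch.

```python
from typing import Any, Dict, NamedTuple, Tuple


class Job(NamedTuple):
    op: str
    bucket: str
    gid: int
    version: str
    shard: str
    value: Any


# Hypothetical per bucket/GID table: which schema version and shard apply.
VS_TABLE: Dict[Tuple[str, int], Tuple[str, str]] = {
    ("profile", 42): ("v2", "shard1"),
}
DEFAULT_VS = ("v1", "shard0")  # assumed fallback for GIDs without an entry


def get_vs(bucket: str, gid: int) -> Tuple[str, str]:
    """Steps 3 and 4: resolve the (version, shard) pair for a bucket/GID."""
    return VS_TABLE.get((bucket, gid), DEFAULT_VS)


def build_job(op: str, bucket: str, gid: int, value: Any) -> Job:
    """Step 5: bind the operation to the resolved version and shard."""
    version, shard = get_vs(bucket, gid)
    return Job(op, bucket, gid, version, shard, value)


def insert(bucket: str, gid: int, value: Any) -> Tuple[str, Job]:
    """Steps 1, 2, 6 and 7: the client never sees versions or shards."""
    return ("ok", build_job("insert", bucket, gid, value))


status, job = insert("profile", 42, {"nick": "neo"})
```

The client call carries only the bucket, the GID and the value. Version and shard are resolved on the server side, which is what allows them to change per bucket/GID without the client noticing.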

Page 27: Spilgames Storage Platform

One API To Rule Them All

• “Don’t care where it is, just want my data!!!”
• PIQI helps with the API
  - Erlang client + Protocol Buffers
  - HTTP + JSON
  - HTTP + Protocol Buffers

Page 29: Spilgames Storage Platform

Worldwide

Page 30: Spilgames Storage Platform

Worldwide

Page 31: Spilgames Storage Platform

Masters & Satellites

• Master Datacenters
  - Have persistent storage
  - Can own GIDs
  - GIDs can be migrated between Master DCs
• Satellite Datacenters
  - Don’t have persistent storage
  - Easy to setup and decommission
  - Virtual/Cloud-based
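
The ownership rules above can be modelled in a few lines of Python. This is a sketch under stated assumptions: the `Datacenter` and `Topology` classes, the `promote` method and the datacenter names `ams` and `sgp` are invented for illustration, and the transcript does not describe how requests are actually routed between datacenters.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Datacenter:
    name: str
    is_master: bool  # masters have persistent storage, satellites do not


class Topology:
    """Tracks which master datacenter owns each GID (hypothetical sketch)."""

    def __init__(self, datacenters: Iterable[Datacenter]) -> None:
        self._dcs: Dict[str, Datacenter] = {dc.name: dc for dc in datacenters}
        self._owner: Dict[int, str] = {}

    def _require_master(self, name: str) -> None:
        if not self._dcs[name].is_master:
            raise ValueError(f"{name} is a satellite and cannot own GIDs")

    def assign(self, gid: int, dc_name: str) -> None:
        self._require_master(dc_name)
        self._owner[gid] = dc_name

    def migrate(self, gid: int, to_dc: str) -> None:
        if gid not in self._owner:
            raise KeyError(gid)
        self.assign(gid, to_dc)

    def promote(self, dc_name: str) -> None:
        """Turn a satellite into a master once it has persistent storage."""
        self._dcs[dc_name] = Datacenter(dc_name, True)

    def storage_dc(self, gid: int) -> str:
        return self._owner[gid]


world = Topology([Datacenter("ams", True), Datacenter("sgp", False)])
world.assign(42, "ams")
try:
    world.assign(43, "sgp")
    satellite_accepted = True
except ValueError:
    satellite_accepted = False

world.promote("sgp")
world.migrate(42, "sgp")
```

Since only masters can own GIDs, a satellite holds nothing that must survive its removal, which is consistent with the slide calling satellites easy to set up and decommission.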

Page 32: Spilgames Storage Platform

Start With One DC

Page 33: Spilgames Storage Platform

Scale Up A Satellite Where Needed

Page 34: Spilgames Storage Platform

Turn Satellites Into Masters When Ready

Page 35: Spilgames Storage Platform

Working With Multiple Datacenters

Page 36: Spilgames Storage Platform

Disaster Scenarios

Page 37: Spilgames Storage Platform

Losing A Satellite Datacenter

Page 38: Spilgames Storage Platform

Losing A Master Datacenter

Page 39: Spilgames Storage Platform

Losing A Master Datacenter

Page 40: Spilgames Storage Platform

Lessons Learned

Page 41: Spilgames Storage Platform

Where We Are

• Using simple buckets on LIVE in one DC
• Added relup support for bucket-only updates
• Hammering SSP using property-based testing
• Integrating restricted search capabilities
• Testing the WAN protocol for inter-DC communication
• More buckets to go live in H1 2013
• Satellite DCs coming in H2 2013

Page 42: Spilgames Storage Platform

What We’ve Used

• Emysql
  - (+) multi-database transaction support
  - (*) multi-timezone support
• Eep0018/Jiffy
• Estatsd
• PropEr
• Poolboy
• Lager
• Rebar
  - (*) semantic versioning, i.e. [">=1.3.1", "<2.0.0"]
  - (*) shared dependencies
  - (*) xref fixes
• Piqi
• BashoBench
  - (+) Several tests on the same plot support
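
To show what a constraint list such as `[">=1.3.1", "<2.0.0"]` expresses, here is a small Python sketch. It is not Rebar's implementation. The `satisfies` function is hypothetical and handles only plain `MAJOR.MINOR.PATCH` versions, without pre-release or build tags.

```python
import operator
import re
from typing import Sequence, Tuple

_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}
# Two-character operators are listed first so ">=" is not read as ">".
_CONSTRAINT = re.compile(r"^(>=|<=|==|>|<)(\d+)\.(\d+)\.(\d+)$")


def parse(version: str) -> Tuple[int, ...]:
    """Turn "1.4.0" into (1, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, constraints: Sequence[str]) -> bool:
    """True when the version meets every constraint in the list."""
    candidate = parse(version)
    for constraint in constraints:
        match = _CONSTRAINT.match(constraint)
        if match is None:
            raise ValueError(f"unsupported constraint: {constraint}")
        compare = _OPS[match.group(1)]
        bound = tuple(int(match.group(i)) for i in (2, 3, 4))
        if not compare(candidate, bound):
            return False
    return True


wanted = [">=1.3.1", "<2.0.0"]
accepted = satisfies("1.4.0", wanted)
rejected = satisfies("2.0.0", wanted)
```

Read together, the two constraints accept any release from 1.3.1 up to, but not including, the next major version.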

Page 43: Spilgames Storage Platform

Questions?

Page 44: Spilgames Storage Platform

That’s All Folks!

Thanks!

Page 45: Spilgames Storage Platform

Join Us!

http://www.spilgames.com/careers/job-openings/
