A presentation of the Spilgames Storage Platform, a massively scalable solution that abstracts your storage behind transparent sharding and lets you expand to datacenters worldwide.
Transcript
Introduction
  Spilgames & Me
Another Storage
  Motivation
System Properties
  Overview
  Internals
  Versions & Shards
Worldwide Scaling
  Datacenters
  Disaster Scenarios
Lessons Learned
  Current Status
  Contributions
Spilgames Storage Platform
Enrique Paz
Senior Backend Developer
21/03/2013
Introduction
About Me
• Passionate Erlang developer
• Testing enthusiast
• Love beautiful code!
Spilgames
• Gaming Platform
• Serving data to 190+ countries world-wide
• 200+ million unique users per month
• Multiple Platforms: Desktop, Mobile Native & Web
• 300+ employees
• Offices in The Netherlands & China
• Revenue: Advertising & EUM
Gaming Portals
Games
Another Storage
LAMP Stack & MySQL
• Not all developers are DB experts
• Difficult to shard the databases
• Storage model all over the place
• Security
• Performance
• Caching
• ...
Our Ambition
• Transparent sharding layer
• Sharding on data ownership
• High availability
• Centralized caching layer
• Storage engine agnostic
• One strict data model
• Transparent storage changes
• Scaling geographically
System Properties
Mindset
• Always be available
• Avoid global locks
• Accept change as the only constant
• Embrace inconsistencies
  - Hardware breaks down (power failures)
  - Version mismatches (upgrading the system is not atomic)
  - State mismatches (adding new machines)
• GIDs (64 bits) identify the data owner
  - user
  - game
• Buckets can use different storage engines
  - Several MySQL tables in different databases
  - Just a binary storage (SWIFT)
  - ...
• Data for a bucket/GID is cached
• Requests can be atomic per bucket/GID
Optimistic Operations
• Speed > Consistency
• Losing some updates in case of a crash is affordable
• Act first on cache and then on disk
• No guarantee of eventual consistency after a crash
• Examples: activity feeds, popular games list
Pessimistic Operations
• Consistency is key and confirmation is required
• Dealing with critical data
• Persist data and, upon success, update cache
• Examples: payments, personal information
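The two write orderings described on the Optimistic and Pessimistic Operations slides can be contrasted in a small sketch. This is an illustration only, not SSP code: `FlakyStore`, `Bucket`, `pending` and `flush` are hypothetical names, and the deferred flush is an assumed way of modelling "act first on cache and then on disk".

```python
class FlakyStore:
    """Hypothetical persistent store that can be switched into failure mode."""

    def __init__(self):
        self.rows = {}
        self.fail = False

    def put(self, gid, value):
        if self.fail:
            raise IOError("storage unavailable")
        self.rows[gid] = value


class Bucket:
    """Toy bucket: an in-memory cache in front of a persistent store."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.pending = []  # optimistic writes that have not reached disk yet

    def write_optimistic(self, gid, value):
        # Cache first: readers see the value before it is persisted.
        self.cache[gid] = value
        self.pending.append((gid, value))

    def flush(self):
        # Persist queued optimistic writes. A crash before (or during) this
        # step loses them, which the optimistic mode accepts.
        while self.pending:
            gid, value = self.pending.pop(0)
            self.store.put(gid, value)

    def write_pessimistic(self, gid, value):
        # Disk first: put() raises on failure, so the cache is only
        # updated after the store has confirmed the write.
        self.store.put(gid, value)
        self.cache[gid] = value
```

An activity-feed update would go through `write_optimistic`, while a payment would use `write_pessimistic` and surface the error to the caller if the store rejects it.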
How does it work?
System Components
• lookup application in all nodes. Uses a hashing ring (mnesia):
  - replicated in all nodes
  - ram_copies
  - dirty reads
  - transactional writes
• Buckets have Pipeline Factories
• Buckets register PFs in lookup
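The slides do not show how the hashing ring maps a request to a Pipeline Factory, so the following is only a generic consistent-hashing sketch of the idea. The `Ring` class, the MD5-based ring positions and the virtual-node count are all assumptions made for illustration; the real ring lives in mnesia.

```python
import bisect
import hashlib


def _point(key):
    # Map a string to a ring position (first 8 bytes of its MD5 digest).
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class Ring:
    """Hypothetical lookup table: PFs register, requests are resolved locally."""

    def __init__(self, vnodes=64):
        self.vnodes = vnodes
        self.points = []  # sorted ring positions
        self.owner = {}   # ring position -> PF name

    def register(self, pf):
        # Each PF claims several positions so keys spread evenly.
        for i in range(self.vnodes):
            p = _point(f"{pf}#{i}")
            self.owner[p] = pf
            bisect.insort(self.points, p)

    def lookup(self, bucket, gid):
        # The first registered position clockwise from the key owns it.
        p = _point(f"{bucket}/{gid}")
        i = bisect.bisect_right(self.points, p) % len(self.points)
        return self.owner[self.points[i]]
```

With this scheme, registering an extra PF only moves the keys that now fall on its positions; every other bucket/GID keeps its previous PF.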
Tracing Pessimistic Operations
1. Bucket/GID request in a node
2. Local lookup to find a PF
3. PF receives request
4. PF builds job
5. PF ensures Pipeline for GID
6. PF queues operation in Pipeline
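Steps 3 to 6 above can be sketched as follows. SSP itself is written in Erlang and its internals are not shown in the slides, so this Python model is only an analogy: `PipelineFactory`, `ensure_pipeline` and `request` are hypothetical names, and a single-worker thread pool stands in for a pipeline that runs one GID's jobs strictly in order.

```python
import functools
import threading
from concurrent.futures import ThreadPoolExecutor


class PipelineFactory:
    """Hypothetical PF: one sequential pipeline per GID of a bucket."""

    def __init__(self, bucket):
        self.bucket = bucket
        self.pipelines = {}  # gid -> single-worker executor
        self.lock = threading.Lock()

    def ensure_pipeline(self, gid):
        # Step 5: create the pipeline for this GID on first use.
        with self.lock:
            if gid not in self.pipelines:
                self.pipelines[gid] = ThreadPoolExecutor(max_workers=1)
            return self.pipelines[gid]

    def request(self, gid, operation, *args):
        # Step 4: build the job from the incoming request.
        job = functools.partial(operation, gid, *args)
        # Step 6: queue it; one worker means jobs run in submission order.
        return self.ensure_pipeline(gid).submit(job)

    def shutdown(self):
        for pipeline in self.pipelines.values():
            pipeline.shutdown(wait=True)
```

Operations for the same GID never overlap, while different GIDs get separate pipelines and can progress in parallel.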
Wait a minute...
• “Why do we need pipelines?”
• “Sequential == Bottleneck !!!”
• “Don’t you guys know Erlang is about parallelizing work?”
About Pipelines
• CONS
  - Sequential (read) access for hotspots, e.g. popular games
  - Optimization: read from SSP cache in PF
• PROS
  - No need for storage engines to support global locks
  - A bucket can combine several engines
• Requests to most GIDs (users) are evenly distributed
Schema Versions
• Schema Versions determine allowed operations and storage(s)
• Client is not aware of them
• Max 2 schema versions of a bucket at a time
• Schema version can be changed at bucket/GID level
Shards
• Useful for partitioning big blocks of data
• Shard points to the physical location of the data
• Sharding rules are bucket-specific. Default is GID % Shards
• Bucket/GID combinations can be migrated between shards
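The default rule (GID % Shards) and per-bucket/GID migration can be illustrated with a short sketch. `ShardMap` and its override table are hypothetical; the slides do not say how SSP records migrated combinations.

```python
class ShardMap:
    """Hypothetical shard resolver for the default rule GID % Shards."""

    def __init__(self, shard_count):
        self.shard_count = shard_count
        self.moved = {}  # (bucket, gid) -> shard, for migrated combinations

    def shard_for(self, bucket, gid):
        # A migrated bucket/GID wins over the default modulo rule.
        return self.moved.get((bucket, gid), gid % self.shard_count)

    def migrate(self, bucket, gid, shard):
        if not 0 <= shard < self.shard_count:
            raise ValueError("unknown shard")
        self.moved[(bucket, gid)] = shard
```

With four shards, GID 10 resolves to shard 2 by default; migrating `("scores", 10)` to shard 0 changes only that combination and leaves GID 10 in other buckets on shard 2.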
• Master Datacenters
  - Have persistent storage
  - Can own GIDs
  - GIDs can be migrated between Master DCs
• Satellite Datacenters
  - Don’t have persistent storage
  - Easy to set up and decommission
  - Virtual/Cloud-based
Start With One DC
Scale Up A Satellite Where Needed
Turn Satellites Into Masters When Ready
Working With Multiple Datacenters
Disaster Scenarios
Losing A Satellite Datacenter
Losing A Master Datacenter
Lessons Learned
Where We Are
• Using simple buckets on LIVE in one DC
• Added relup support for bucket-only updates
• Hammering SSP using property-based testing
• Integrating restricted search capabilities
• Testing the WAN protocol for inter-DC communication
• More buckets to go live in H1 2013
• Satellite DCs coming in H2 2013
What We’ve Used
• Emysql
  - (+) multi-database transaction support
  - (*) multi-timezone support