Deliver on the Promise of Internet Scale Design and Architecture Marc Jones, VP of Product Innovation
Jun 10, 2015
100k servers
24k customers
23 million domains
Defined user base
Global potential
Enterprise apps
Internet scale apps
Relative chaos
Relative predictability
Cost
Revenue
Defining Cloud Computing
Marketing term
Reference Architecture
Operations Model
Capacity on demand
Consumption-based pricing
Self-service provisioning
Accessible via API
Internet Scale?
On-demand, elastic resources with 24×7 reliability.
Scale is resource availability and capacity.
Scale does not directly correlate to “faster” performance.
Scale can help provide predictable performance.
51 million blogs
57 million daily posts
21 billion blog posts
50 million downloads
30 hrs drawings / sec
200 million dollars
[Chart: Online Retail net sales, in millions of dollars ($0 to $13,000), by quarter from Q1 08 to Q4 10; annotated "seasonality"]
Does an SMB need Internet scale?
[Chart: SMB monthly unique visitors (0 to 100,000), January through December; annotated "promotional activity, social buzz"]
Built for Internet Scale
Unpredictable traffic patterns
Unconstrained user base
Global potential
Network-sensitive
Leading-edge technology stack
Workload characteristics
Performance targets
Control requirements
Availability mandates
Choice
Applications dictate infrastructure
Design + Architecture
Why is scalability so hard? It can’t be an afterthought. It requires applications and platforms designed with scaling in mind.
Is achieving good scalability possible? Absolutely, but only if we architect and engineer our systems to take scalability into account.
Simply throwing additional CPU cycles or storage at an application is not going to deliver linear scalability unless the application was designed to scale in such a manner.
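One common way to reason about why extra hardware alone does not give linear scaling is Amdahl's law. The slides do not cite it by name, so treat this as an illustrative sketch: it assumes a hypothetical workload in which a fixed fraction of the work is serial and cannot be spread across workers.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can be parallelized.

    parallel_fraction is the share of the work (0..1) that can be spread
    across workers; the remainder is assumed to run serially.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical application in which 95% of the work parallelizes.
    for workers in (1, 2, 8, 64, 1024):
        print(workers, round(amdahl_speedup(0.95, workers), 2))
```

Under this assumption the speedup flattens out well below the worker count, which is the sense in which an application has to be designed to scale before more CPU helps.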
Simplicity Over Complexity
Eliminate multi-step processes. Reduce each step of every process to its atoms. These atomic steps perform their work in complete isolation and communicate with one another through messages.
Think simple. If it is a complex problem, reduce to a simpler form. Iterate.
Simple is actually hard. You have to work to simplify.
“A system can be so simple that there are obviously no bugs, or it can be so complex that there are no obvious bugs.”
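The idea of small steps that work in isolation and talk only through messages can be sketched with standard-library queues and threads. The names `stage`, `run_pipeline`, and `STOP` are hypothetical, chosen for this illustration; a production system would more likely use a message broker than in-process queues.

```python
import queue
import threading

STOP = object()  # sentinel message that tells a stage to shut down


def stage(work, inbox, outbox):
    """Run one isolated step: read a message, process it, pass the result on."""
    while True:
        message = inbox.get()
        if message is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            return
        outbox.put(work(message))


def run_pipeline(items, steps):
    """Chain the steps together with queues; the steps share no other state."""
    queues = [queue.Queue() for _ in range(len(steps) + 1)]
    threads = [
        threading.Thread(target=stage, args=(step, queues[i], queues[i + 1]))
        for i, step in enumerate(steps)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(STOP)

    results = []
    while True:
        message = queues[-1].get()
        if message is STOP:
            break
        results.append(message)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(["  hello ", " world"], [str.strip, str.upper]))
```

Because each stage sees only its inbox and outbox, a stage can be replaced or moved to another process without touching the others.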
Design + Architecture
Choose tools that can grow. The cloud makes it easy to add nodes. Does your software?
Choose tools that can shrink. The cloud makes it easy to remove nodes. Most software does not.
Leverage the best available to meet your requirements.
Design + Architecture
Stateless and async. One of the guiding principles for linear scalability is to have lightweight, independent, stateless operations that can be executed anywhere and run on newly deployed threads/processes/cores/machines transparently as needed in order to service an increasing number of requests. Share nothing.
Testing async code can be non-trivial. Test coverage should be pursued early (as in at the start). Test early, test often.
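A minimal sketch of a stateless, asynchronous operation and an early test for it, using only `asyncio` from the standard library. The handler name, the payload shape, and the `asyncio.sleep(0)` stand-in for real non-blocking I/O are assumptions made for illustration.

```python
import asyncio


async def handle_request(payload: dict) -> dict:
    """Stateless handler: the result depends only on the payload passed in."""
    await asyncio.sleep(0)  # stand-in for a non-blocking I/O call
    return {"user": payload["user"], "total": sum(payload["amounts"])}


async def handle_all(payloads):
    """Run independent requests concurrently; results keep the input order."""
    return await asyncio.gather(*(handle_request(p) for p in payloads))


def test_handle_request_is_stateless():
    """The same input gives the same output no matter how often it runs."""
    payload = {"user": "a", "amounts": [1, 2, 3]}
    first = asyncio.run(handle_request(payload))
    second = asyncio.run(handle_request(payload))
    assert first == second == {"user": "a", "total": 6}


if __name__ == "__main__":
    test_handle_request_is_stateless()
    print(asyncio.run(handle_all([{"user": "b", "amounts": [4, 5]}])))
```

Since the handler keeps no state between calls, any process on any machine could serve a given request, and the test needs no setup beyond the input itself.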
Bottlenecks
Eliminate choke points. Everything that has to be coordinated by a single machine, or even a single cluster, is a failure waiting to happen.
- Network
- Storage
- I/O
- Architecture, design, and/or implementation flaws.
Try to find them intentionally, not accidentally.
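One way to look for a bottleneck on purpose is to load-test at increasing concurrency and check where throughput stops keeping up. The helper below is a hypothetical sketch: the function name, the 0.7 efficiency threshold, and the sample numbers are all assumptions, and the measurements would come from your own load tests.

```python
def find_saturation_point(measurements, min_efficiency=0.7):
    """Return the first concurrency level where scaling efficiency drops too low.

    measurements is a list of (concurrency, throughput) pairs sorted by
    concurrency. Efficiency is per-worker throughput relative to the first
    (baseline) measurement. Returns None if no level falls below the threshold.
    """
    base_concurrency, base_throughput = measurements[0]
    per_worker_baseline = base_throughput / base_concurrency
    for concurrency, throughput in measurements[1:]:
        efficiency = (throughput / concurrency) / per_worker_baseline
        if efficiency < min_efficiency:
            return concurrency
    return None


if __name__ == "__main__":
    # Made-up load-test results: (concurrent clients, requests per second).
    observed = [(1, 100), (2, 195), (4, 380), (8, 450)]
    print(find_saturation_point(observed))
```

The level it reports is where to start looking for the shared resource (network, storage, I/O, or a design flaw) that all workers are waiting on.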
Design + Architecture
Expect failure. Hosting infrastructure and the cloud comprise many moving parts, each of which is prone to fail in its own way. Understand the potential failure points and architect your mission-critical resources to survive them.
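A common way to design for failure at the client side is to retry with exponential backoff and then fail over to another location. The sketch below is an assumption-laden illustration, not a method from the slides: `call_with_failover`, its parameters, and the endpoint names are hypothetical, and `request` is any callable you supply.

```python
import time


def call_with_failover(endpoints, request, attempts_per_endpoint=3,
                       base_delay=0.1, sleep=time.sleep):
    """Try each endpoint in order, retrying with exponential backoff.

    request is a callable that takes an endpoint and either returns a result
    or raises ConnectionError. The sleep function is injectable so tests can
    run without waiting.
    """
    last_error = None
    for endpoint in endpoints:
        for attempt in range(attempts_per_endpoint):
            try:
                return request(endpoint)
            except ConnectionError as error:
                last_error = error
                # Back off before retrying the same endpoint; skip the wait
                # after the final attempt and move on to the next endpoint.
                if attempt < attempts_per_endpoint - 1:
                    sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all endpoints failed") from last_error


if __name__ == "__main__":
    def flaky(endpoint):
        # Hypothetical request: the first location is down.
        if endpoint == "dallas":
            raise ConnectionError("dallas unreachable")
        return f"ok from {endpoint}"

    print(call_with_failover(["dallas", "singapore"], flaky, sleep=lambda s: None))
```

Retrying only on `ConnectionError` is a deliberate choice here: errors that a retry cannot fix should surface immediately rather than be hidden behind backoff.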
Connectivity
< 40ms
IPv6
10Gb
Reach
13 data centers
16 network POPs
20Gb fiber interconnects
Singapore Dallas Amsterdam
Distributed
Local Storage
Network Storage
Inventory (You can’t deploy what doesn’t exist)
Common network
API
Hybrid architectures
Scale Up
Scale Out
Scale Down
Unified image-based provisioning system
Move between virtual, physical
Clone/reload/snapshot physical servers
Time
Provisioning speed
API feature set
Contract length
Control
Better living through programming
Increase agility
Reduce human error
Enable application autoscaling
Evaluate on scope, documentation, support & community
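Application autoscaling through an API usually comes down to a small decision rule that turns a metric into a node count. The sketch below shows one such rule (target tracking on CPU utilization); the function name, defaults, and limits are hypothetical, and the call that would actually provision or release servers through a provider's API is deliberately left out.

```python
import math


def desired_node_count(current_nodes, cpu_utilization, target_utilization=0.5,
                       min_nodes=2, max_nodes=20):
    """Return how many nodes would bring average CPU back to the target.

    cpu_utilization and target_utilization are fractions between 0 and 1.
    The result is rounded up and clamped to the [min_nodes, max_nodes] range.
    """
    if current_nodes < 1:
        raise ValueError("current_nodes must be at least 1")
    if not 0.0 < target_utilization <= 1.0:
        raise ValueError("target_utilization must be in (0, 1]")
    raw = current_nodes * cpu_utilization / target_utilization
    return max(min_nodes, min(max_nodes, math.ceil(raw)))


if __name__ == "__main__":
    # Hypothetical readings: scale out under load, scale down when idle.
    print(desired_node_count(current_nodes=4, cpu_utilization=0.9))
    print(desired_node_count(current_nodes=10, cpu_utilization=0.12))
```

The same rule covers scaling out and scaling down; the minimum keeps enough nodes for availability and the maximum caps spend.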
Global deployments
No capital expenditure
Significant scale
Consumptive billing
Complete control
Why CloudPlatform?
Mature
Easy-to-use GUI
Citrix-backed
RightScale compatible
Flexible network architecture
Internet scale
Open source platform
Zones
One Management node supports one or more zones
[Diagram: Management Server; Zone 1 (Data Center 1) on a private VLAN; multiple other zones (multiple data centers) on a private VLAN; VLAN spanning]
Clusters & Hosts
One or more clusters per zone, one or more hosts per cluster
Cluster defined by storage
Local storage == single-server cluster
[Diagram: Management Server; Zone 1 (Data Center 1) containing clusters of physical hosts, each host running guest VMs]
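The hierarchy described on this slide (a management server over zones, zones over clusters, clusters over hosts, with local storage implying a single-server cluster) can be sketched as a small data model. This is an illustration only, not the CloudPlatform API; every class, field, and name below is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str


@dataclass
class Cluster:
    name: str
    storage: str  # "local" or "network"; the storage defines the cluster
    hosts: list = field(default_factory=list)

    def add_host(self, host: Host) -> None:
        # Local storage == single-server cluster, per the slide.
        if self.storage == "local" and self.hosts:
            raise ValueError("a local-storage cluster holds exactly one host")
        self.hosts.append(host)


@dataclass
class Zone:
    name: str
    clusters: list = field(default_factory=list)


@dataclass
class ManagementServer:
    zones: list = field(default_factory=list)

    def host_count(self) -> int:
        """Total hosts across every cluster in every managed zone."""
        return sum(len(c.hosts) for z in self.zones for c in z.clusters)


if __name__ == "__main__":
    shared = Cluster("cluster-1", storage="network")
    shared.add_host(Host("host-1"))
    shared.add_host(Host("host-2"))
    single = Cluster("cluster-2", storage="local")
    single.add_host(Host("host-3"))
    server = ManagementServer([Zone("zone-1", [shared, single])])
    print(server.host_count())
```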
Hardware Options
Management Node: Single Proc Quad Core Westmere, 6 GB DDR3, 2x2TB SATA
Host Node Options
Smallest: Dual Proc Quad Core Nehalem, 6-192GB DDR3
Biggest: Quad Processor 10 Core Westmere EX, 32-512GB DDR3
Storage: SATA, SA SCSI, SSD
Network: 100Mb-10Gb line speed
[Diagram: Management Server; host nodes running guest VMs]
[Diagram labels: Command & control; Management Server; Private cloud, Zone 1: Dallas (physical hosts, guest VMs, object storage); Private cloud, Zone 2: Singapore (physical hosts, guest VMs, object storage); Dedicated Hadoop cluster: Dallas; Public cloud web servers; San Jose, Dallas, Singapore]