How to build a state-of-the-art rails cluster Tim Lossen Infopark AG, Berlin
Aug 17, 2014
You mean ... like this?
System X
• Virginia Tech
• 1100 x Apple XServe G5
• 2 CPUs each (2.3 GHz)
• Infiniband network
• 12 Teraflops
How to build a state-of-the-art rails cluster for 100,000 Koruny
100,000 Koruny = 4700 $ = 3500 €
• Evolving a cluster setup
• Scaling
• Make or buy?
• Discussion
Aims
• performance
• efficiency
• availability
• scalability
Performance
• pages / second
• throughput
• 10 Mbit connection = max. about 1 million unique page impressions per day
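The bandwidth figure above can be checked with a little arithmetic. The sketch below assumes the link is saturated around the clock and works out the average page weight at which 10 Mbit/s corresponds to one million pages per day. The function names are illustrative and not from the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def bytes_per_day(link_mbit_per_s):
    """Bytes a fully saturated link can move in one day (1 Mbit = 10**6 bits)."""
    return link_mbit_per_s * 1_000_000 / 8 * SECONDS_PER_DAY

def max_pages_per_day(link_mbit_per_s, page_bytes):
    """Upper bound on pages served per day for a given average page weight."""
    return bytes_per_day(link_mbit_per_s) / page_bytes

# Average page weight implied by "10 Mbit = 1 million pages per day"
implied_page_bytes = bytes_per_day(10) / 1_000_000
print(f"{implied_page_bytes / 1000:.0f} kB per page")
```

Under these assumptions the limit corresponds to roughly 100 kB per page including assets. Real traffic is not spread evenly over the day, so the practical ceiling is lower.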
Efficiency
• bang per buck
• bang per watt
Availability
• uptime
• can be expressed in "nines"
• 99% (2 nines) = max. 100 minutes downtime per week
• 99.9999% (6 nines) = max. 0.6 seconds downtime per week
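The downtime budgets above follow directly from the number of nines. A minimal sketch of the calculation (the function name is an assumption for illustration):

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60

def max_downtime_per_week(nines):
    """Allowed downtime in seconds per week for an availability of N nines."""
    unavailability = 10 ** -nines  # 2 nines -> 0.01, 6 nines -> 0.000001
    return SECONDS_PER_WEEK * unavailability

for n in (2, 3, 4, 5, 6):
    seconds = max_downtime_per_week(n)
    print(f"{n} nines: {seconds:.1f} s ({seconds / 60:.2f} min) per week")
```

Two nines come out at about 100 minutes and six nines at about 0.6 seconds per week, matching the figures on the slide.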
Scalability
• grow (or shrink) app as needed
• ideal: linear with number of machines
Setup 1
Example software
• nginx
• mongrel cluster
• mysql
Ideas
• put everything on one machine
• use top quality hardware
• go for high processor speed
Example hardware
• Sun Fire X2200 M2
• 2 x Opteron 2220 (dual core / 2.8 GHz)
• 4 GB ram
• 250 GB disk
• 3290 €
Result
• performance ✔
• efficiency
• availability
• scalability
Setup 2
Ideas
• put app and database on separate machines
• use cheap commodity hardware
• like Google!
• go for high number of CPUs
Example hardware
• Dell PowerEdge SC 1435
• 2 x Opteron 2212 (dual core / 2 GHz)
• 4 GB ram, 160 GB disk
• 2 machines = 8 cores
• 3340 €
Result
• performance ✔
• efficiency ✔
• availability
• scalability
Setup 3
Ideas
• cluster of identical nodes
• loadbalancing
• failover
• eliminate single points of failure through redundancy
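To make the loadbalancing and failover ideas concrete, here is a small sketch of round-robin balancing that skips nodes marked as down. It is a conceptual illustration only, not the pf/CARP setup used in the talk. The class and node names are hypothetical.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hands out nodes in turn and skips nodes marked as down."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.down = set()
        self._ring = cycle(self.nodes)

    def mark_down(self, node):
        self.down.add(node)

    def mark_up(self, node):
        self.down.discard(node)

    def next_node(self):
        # Try each node at most once per call before giving up.
        for _ in range(len(self.nodes)):
            node = next(self._ring)
            if node not in self.down:
                return node
        raise RuntimeError("no healthy node available")

balancer = RoundRobinBalancer(["node-a", "node-b"])
print([balancer.next_node() for _ in range(4)])  # alternates between nodes
balancer.mark_down("node-a")                     # simulate a failure
print([balancer.next_node() for _ in range(3)])  # only node-b is used
```

With identical nodes, losing one machine only reduces capacity; requests keep flowing to the remaining node.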
Setup 3a
Virtualization
• partition one physical machine into multiple virtual machines
• assign dedicated resources (cpu, ram) to virtual machines
• example: xen (linux), zones (solaris)
Why virtualization?
• flexibility
• simplified deployment
• move images between machines
• optimization
• hardware utilization
• tuning
Why virtualization?
• isolation
• use different operating systems
• contain intrusion
• multiple environments (staging, production)
• but: small performance overhead (ca. 5%)
Setup 3a
Loadbalancer
• openBSD
• CARP: shared virtual ip, automatic failover
• pf: firewall + loadbalancer
• openBSD on xen: currently only on ramdisk
• better use dedicated hardware instead
Redundant firewall
Database
• mysql
• master-master replication
• caveat: avoid primary key clashes!
• other possibilities:
• master-slave replication
• mysql cluster
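One common way to avoid primary key clashes with master-master replication is to give each master its own interleaved ID sequence. MySQL provides the server variables auto_increment_increment and auto_increment_offset for this; whether the talk's setup used them is an assumption. The sketch below only simulates the resulting sequences in Python, and the function name is illustrative.

```python
def auto_increment_ids(offset, increment, count):
    """IDs one master would hand out: offset, offset + increment, ..."""
    return [offset + i * increment for i in range(count)]

masters = 2  # increment equals the number of masters

# Each master gets a distinct offset, so the sequences never overlap.
master_a = auto_increment_ids(offset=1, increment=masters, count=5)
master_b = auto_increment_ids(offset=2, increment=masters, count=5)

print("master A:", master_a)  # odd IDs
print("master B:", master_b)  # even IDs
assert not set(master_a) & set(master_b)
```

Because both masters can insert rows at the same time without coordinating, disjoint sequences are what keeps replicated inserts from colliding.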
Result
• performance ✔
• efficiency ✔
• availability ✔
• scalability ?
Scaling
• not a rails-specific problem
• find and eliminate bottlenecks!
Scaling approaches
• scaling up
• increase power of machines
• scaling out
• increase number of machines
Scaling up: memory
• add more memory
• mongrel can be very memory-hungry
• useful for caching (memcached)
• hint: check crucial.com for low prices
Scaling up: CPU
• upgrade to quad-core Opterons
• available later this year
• same socket
• same power / heat envelope
• use as drop-in replacement
Scaling out
• app servers scale (almost) linearly
• use same hardware (if possible)
• one type of processor, memory, disk
• simplifies maintenance / repair
Bottleneck: application
• lots of concurrent users
• rails is single threaded (mutex lock)
• solution: add more mongrels
• rule of thumb: 10 mongrels per cpu
• twitter runs on ca. 300 mongrels
Bottleneck: database
• avoid hitting the database
• cache as much as possible
• scale up database masters
• more disks = faster access
• add read-only db slaves
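The "cache as much as possible" advice is often implemented as a cache-aside pattern: look in the cache first and query the database only on a miss. The sketch below uses a plain dict as a stand-in for memcached and a hypothetical load_from_db callable, so it shows the pattern rather than any real client API.

```python
class CachedPageStore:
    """Cache-aside: look in the cache first, fall back to the database."""

    def __init__(self, load_from_db):
        self.load_from_db = load_from_db  # hypothetical slow database lookup
        self.cache = {}                   # dict stands in for memcached
        self.db_hits = 0

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        # Cache miss: hit the database once and remember the result.
        self.db_hits += 1
        value = self.load_from_db(key)
        self.cache[key] = value
        return value

    def invalidate(self, key):
        """Call after a write so the next read fetches fresh data."""
        self.cache.pop(key, None)

store = CachedPageStore(lambda key: f"rendered page {key}")
for _ in range(1000):
    store.get("home")
print("database hits:", store.db_hits)
```

In this toy run a thousand reads of the same key cost a single database lookup. The hard part in practice is invalidation, which is why writes call invalidate.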
Scaling 2.0
Scaling storage
• disk failure very common
• use RAID1 for database masters
• network storage
• example: ATA-over-Ethernet (AoE)
• accessible as block device
• combine with cluster file system
Coraid EtherDrive, 11 TB, 200MB/s
Make or buy?
• think about total cost of ownership (TCO)
• not only hardware costs
• rackspace, power, bandwidth
• administration overhead
• "Utility? Or generator in the back yard?" (David Young)
Our cluster
• hardware: 3340 €
• colocation with burst.net (for 1 year)
• 60 $ + 12 x 140 $ = 900 $ = 670 €
• 4000 € + administration
Rails hosting
• example: Engine Yard
• 1 slice = 1 CPU core, 400 MB ram
• 4 slices (for 1 year)
• 796 $ + 12 x 1116 $ = 14,200 $
• 10,500 €
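The first-year comparison can be reproduced from the totals on the slides. This sketch assumes the exchange rate implied by "4700 $ = 3500 €" earlier in the talk and takes the 900 $ colocation total as given. Administration costs are left out, as on the slide, and the variable names are illustrative.

```python
USD_TO_EUR = 3500 / 4700  # rate implied by "4700 $ = 3500 €" in the talk

def to_eur(usd):
    return usd * USD_TO_EUR

# Own cluster: hardware price plus the one-year colocation total (900 $)
own_cluster_eur = 3340 + to_eur(900)

# Engine Yard, 4 slices: one-time amount plus twelve monthly payments
hosting_usd = 796 + 12 * 1116
hosting_eur = to_eur(hosting_usd)

print(f"own cluster: {own_cluster_eur:.0f} EUR (before administration)")
print(f"hosting:     {hosting_eur:.0f} EUR")
print(f"ratio:       {hosting_eur / own_cluster_eur:.1f}x")
```

The result is roughly 4000 € against roughly 10,500 €. The two offers are not equivalent (8 cores of own hardware versus 4 hosted slices, and hosting includes administration), so the ratio is a starting point for the make-or-buy decision rather than a verdict.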
Summary
• a 2-machine cluster is a cool (and quite affordable) setup
• virtualization is a Good Thing
• scaling the database can be tricky
• make or buy depends on the concrete situation
Questions?
http://tim.lossen.de
❦
photo credits
http://www.flickr.com/photos/shivayanamahohm/192322513
http://www.flickr.com/photos/nicholas_t/286352669
http://www.flickr.com/photos/redjar/480709870