SmugMug spent six years split between its datacenters and AWS. Find out how and why SmugMug went 100% AWS, migrating 30 TB of databases, hundreds of frontends, load balancing, and caches, across the US in one night with zero downtime. We show you specific techniques and processes that made our large-scale migration a resounding success: moving massive MySQL databases, testing and sizing a new AWS infrastructure, automating AWS operations, managing the risks involved in wholesale infrastructure change, and architecting for reliability in multiple AWS Availability Zones. We talk about the performance, scalability, operational, and business benefits and challenges we've seen since moving 100% to AWS. Finally, we share secrets about our favorite AWS products.
Andrew Shieh, SmugMug Operations
shandrew @ smugmug.com
November 15, 2013
SmugMug’s Zero Downtime Migration to AWS
ARC312
Friday, November 15, 13
SmugMug—Who are we?
The early days of SmugMug
• Gradual bootstrapped growth
• Multiple self-managed datacenter cages
• Too many servers of varying types
• Too many disks
• Tons of valuable skilled employee hours spent in cages
Data Center Fantasy
Data Center Reality
Data Center Reality
SmugMug <3 AWS
• Early adopter of Amazon S3
• Over the years, moved rendering, upload, archiving, payments, permissions, email, and more compute to AWS