The Journey to the Cloud: Databases @ AWS
Yan Sun, Database Service Team, AOL
AWS Meetup, July 2016
Agenda
● Who are We?
● Myths of Cloud Databases
● Options to Move Databases to AWS
● Database as Code (DBaC)
● Best Practices and Recommendations
● Q&A
Who are We?
● Database Service Team: 20+ DBAs/DBEs
● Manage databases across AOL
● Support Oracle, MySQL, PostgreSQL, MongoDB, Vertica, Sybase, Redis, MSSQL…
● Hybrid: Data Centers & AWS
Myth 1: Cloud databases are expensive
● Cloud cost model
  ○ Pay-as-you-go
  ○ No upfront cost
● No more over-provisioning
● DBA/SA: need to design with cost in mind ($$-smart):
  ○ Compute, storage, network transfer ...
Truth: Cloud databases are cost effective
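The cost argument on this slide can be made concrete with a little arithmetic. The following is a minimal sketch, assuming made-up rates and workload sizes (none of the figures come from the talk or from AWS pricing). It compares capacity sized for peak all month with pay-as-you-go capacity that follows demand:

```python
# Hypothetical rates for illustration only; not actual AWS prices.
HOURLY_RATE_PER_INSTANCE = 0.50   # USD per instance-hour (assumed)
STORAGE_RATE_PER_GB_MONTH = 0.10  # USD per GB-month (assumed)
TRANSFER_RATE_PER_GB = 0.09       # USD per GB transferred out (assumed)
HOURS_IN_MONTH = 720


def monthly_cost(instance_hours, storage_gb, transfer_gb):
    """Sum the three cost drivers named on the slide: compute, storage, network transfer."""
    compute = instance_hours * HOURLY_RATE_PER_INSTANCE
    storage = storage_gb * STORAGE_RATE_PER_GB_MONTH
    transfer = transfer_gb * TRANSFER_RATE_PER_GB
    return round(compute + storage + transfer, 2)


# Over-provisioned: 4 instances sized for peak load, running all month.
over_provisioned = monthly_cost(4 * HOURS_IN_MONTH, storage_gb=500, transfer_gb=200)

# Pay-as-you-go: 1 instance all month, plus 3 more only during 100 peak hours.
pay_as_you_go = monthly_cost(1 * HOURS_IN_MONTH + 3 * 100, storage_gb=500, transfer_gb=200)

print(f"over-provisioned: ${over_provisioned}")
print(f"pay-as-you-go:    ${pay_as_you_go}")
```

With these assumed numbers the difference comes entirely from compute hours, which is why the slide asks DBAs/SAs to design with cost in mind.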
Myth 2: Databases cannot perform well in the cloud
● Many choices of cloud resources
  ○ Compute resources (CPU-optimized, memory-optimized instances, etc.)
  ○ Storage options
● DBA/SA:
  ○ Benchmark your systems
  ○ Tune your system and database parameters
Truth: Cloud databases perform better with more capacity and scalability
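"Benchmark your systems" can start very small. The sketch below times a query repeatedly and reports latency percentiles. It is only an illustration: the in-memory SQLite table is a stand-in for a real database connection, and the `benchmark` and `percentile` helpers are hypothetical names, not tooling from the talk.

```python
import sqlite3
import time


def percentile(sorted_values, fraction):
    """Simple index-based percentile of an already sorted list (fraction in 0..1)."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]


def benchmark(run_query, iterations=200):
    """Time run_query() repeatedly and summarize latency in milliseconds."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    return {
        "p50_ms": percentile(latencies, 0.50),
        "p99_ms": percentile(latencies, 0.99),
        "max_ms": latencies[-1],
    }


# Stand-in workload: an in-memory SQLite table instead of a real cloud database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO t (payload) VALUES (?)", [("x" * 100,) for _ in range(1000)])

result = benchmark(lambda: conn.execute("SELECT COUNT(*) FROM t WHERE payload LIKE 'x%'").fetchone())
print(result)
```

Running the same harness against different instance classes and storage options gives comparable numbers before and after tuning.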
Myth 3: Cloud databases are not reliable
● Cloud provides options for easy replication and redundancy
● DBA/SA: design for failures
  ○ Auto-healing
  ○ Auto-failover
  ○ Replication: cross-AZ/region
  ○ Backup/Recovery
Truth: Cloud databases are designed for high availability & redundancy
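The "design for failures" bullets can be illustrated with a toy failover decision. This is a sketch under stated assumptions, not the mechanism used by the team or by any AWS service: the `Node` class, the threshold of three failed checks, and the preference for a replica in another availability zone are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    az: str                 # availability zone label
    role: str               # "primary" or "replica"
    failed_checks: int = 0  # consecutive failed health checks
    replication_lag_s: float = 0.0


FAILURE_THRESHOLD = 3  # assumed: consecutive failures before declaring a node dead


def record_health_check(node, healthy):
    """Reset the counter on success, increment it on failure."""
    node.failed_checks = 0 if healthy else node.failed_checks + 1


def choose_new_primary(nodes):
    """Return the replica to promote, or None if no failover is needed or possible."""
    primary = next(n for n in nodes if n.role == "primary")
    if primary.failed_checks < FAILURE_THRESHOLD:
        return None
    candidates = [n for n in nodes if n.role == "replica" and n.failed_checks == 0]
    if not candidates:
        return None
    # Prefer a replica in a different AZ than the failed primary, then the least lag.
    return min(candidates, key=lambda n: (n.az == primary.az, n.replication_lag_s))


cluster = [
    Node("db1", "us-east-1a", "primary"),
    Node("db2", "us-east-1a", "replica", replication_lag_s=0.2),
    Node("db3", "us-east-1b", "replica", replication_lag_s=0.5),
]

no_failover_yet = choose_new_primary(cluster)  # primary is healthy, nothing to do

for _ in range(FAILURE_THRESHOLD):
    record_health_check(cluster[0], healthy=False)

promoted = choose_new_primary(cluster)
print(promoted.name)
```

Here the cross-AZ replica wins even though it lags slightly more, on the assumption that whatever took down the primary may affect its whole zone.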
Myth 4: Cloud is not secure
● For example: network security in AWS
  ○ Network ACL
  ○ Security Group for instances
● Shared Security Model
● DBA/SA: bottom-up security
  ○ OS/Filesystems/Networks
  ○ Databases
  ○ Applications…
Truth: The cloud has a lot of security options
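To show how allow-list network rules of the security-group kind behave, here is a small sketch that evaluates ingress rules with a default of deny. The rule format, subnets, and the `is_allowed` helper are hypothetical and exist only for illustration; real security groups are configured through AWS, not evaluated by user code.

```python
import ipaddress

# Hypothetical ingress rules in the spirit of a security group: allow-only,
# anything not matched is denied.
INGRESS_RULES = [
    {"protocol": "tcp", "port_range": (3306, 3306), "cidr": "10.0.1.0/24"},  # app subnet -> MySQL
    {"protocol": "tcp", "port_range": (22, 22), "cidr": "10.0.0.10/32"},     # bastion -> SSH
]


def is_allowed(rules, source_ip, protocol, port):
    """Return True if any rule permits the connection; the default is deny."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        low, high = rule["port_range"]
        if (
            rule["protocol"] == protocol
            and low <= port <= high
            and address in ipaddress.ip_network(rule["cidr"])
        ):
            return True
    return False


print(is_allowed(INGRESS_RULES, "10.0.1.25", "tcp", 3306))    # app server reaching the database
print(is_allowed(INGRESS_RULES, "203.0.113.7", "tcp", 3306))  # arbitrary internet host
```

Keeping the database port open only to the application subnet is one layer; the bottom-up list on the slide still applies to the OS, the database, and the application.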
Lift & Shift (Move as-is)
● What is Lift & Shift?
  ○ Move in-house databases to AWS (EC2) without modifying the designs and build/deployment processes.
● Pros:
  ○ Minimal changes required to move databases
  ○ Straightforward migration and deployment process
● Cons:
  ○ Doesn’t take advantage of native cloud features
  ○ More manual effort in the long term
AWS RDS - Relational Database Service
● Pros
  ○ Hands-Off Approach (Oracle, MySQL, Postgres, MSSQL...)
    ■ Quick & easy!
    ■ Quick provisioning/deploying
    ■ Multi-AZ High Availability / Auto-Failover
● Cons
  ○ No access to OS
  ○ Limited options of database platform and versions
    ■ No MongoDB, Vertica, Aerospike ...
  ○ Cost approx. 30% more than databases@EC2
  ○ Limited architecture and features
  ○ 30+ minutes downtime for database minor version upgrades
  ○ Limited support for major version upgrades
Database as Code - DBs@EC2
● What is Database as Code?
  ○ Scripting the process of provisioning and configuring databases on EC2 instances
● Pros
  ○ Flexibility
    ■ Support any database platform and any version
    ■ Customizable to any projects and applications
  ○ Cost effectiveness
  ○ Automation
● Cons
  ○ Takes resources and time to build & maintain the automation processes, but brings long-term business benefit
Goal #1 - End-to-End Automation
● Host provisioning
● Database provisioning: install and create database
● Patch/upgrades
● Replication/clustering
● Failover
● Rebuilding of failing host
● Monitoring
● Metrics collection
99% of automation = 0% automation
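The "99% of automation = 0% automation" point can be expressed as a check that refuses to call a pipeline end-to-end while any lifecycle step is still manual. This is a sketch only: the step names come from the slide, while `manual_steps`, `run_pipeline`, and the placeholder handlers are hypothetical.

```python
# Lifecycle steps taken from the slide; the handlers below are placeholders.
LIFECYCLE_STEPS = [
    "host provisioning",
    "database provisioning",
    "patch/upgrades",
    "replication/clustering",
    "failover",
    "rebuilding of failing host",
    "monitoring",
    "metrics collection",
]


def manual_steps(handlers):
    """Return lifecycle steps that have no automated handler registered."""
    return [step for step in LIFECYCLE_STEPS if step not in handlers]


def run_pipeline(handlers):
    """Run every step in order; refuse to start if any step is still manual."""
    missing = manual_steps(handlers)
    if missing:
        raise RuntimeError(f"not end-to-end, manual steps remain: {missing}")
    return [handlers[step]() for step in LIFECYCLE_STEPS]


# Placeholder handlers that only report what they would do.
handlers = {step: (lambda s=step: f"done: {s}") for step in LIFECYCLE_STEPS}
results = run_pipeline(handlers)
print(results[0])

# Remove one handler: the pipeline is no longer hands-off.
del handlers["failover"]
print(manual_steps(handlers))
```

A single missing handler is enough to put a person back in the loop, which is the situation the slide warns about.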
Goal #2 - Leverage Cloud Features
● Take advantage of cloud features and AWS native services
● Design for Failures
DBaC - How?
● CloudFormation
● Auto Scaling Groups
● DynamoDB
● S3
● Lambda
● SNS/SQS
● IAM Role
● Consul (non-AWS)
CloudFormation
● Cfn Templates
  ○ JSON templates + scripts + Lambda (Custom Resources)
    ■ User Data
    ■ AWS::CloudFormation::Init
  ○ Database as Code
    ■ Standardized
    ■ Reusable
    ■ Versioning
● Stacks: deploy cfn template
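As an illustration of a JSON template with User Data, the sketch below generates a small CloudFormation template from Python. It is not one of the team's templates. The resource names, the placeholder AMI ID, the bootstrap script, and the choice of an Auto Scaling group fixed at one instance (so a failed host is replaced) are all assumptions made for the example.

```python
import json


def build_template(instance_type, ami_id, db_engine, db_version):
    """Return a CloudFormation template (as a dict) for one self-replacing database host."""
    # Hypothetical bootstrap script; a real one would install and configure the database.
    user_data = "\n".join([
        "#!/bin/bash",
        f"echo 'bootstrap {db_engine} {db_version}' >> /var/log/dbac-bootstrap.log",
    ])
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"DBaC sketch: {db_engine} {db_version} on EC2",
        "Resources": {
            "DbLaunchConfig": {
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": {
                    "ImageId": ami_id,
                    "InstanceType": instance_type,
                    "UserData": {"Fn::Base64": user_data},
                },
            },
            "DbGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "LaunchConfigurationName": {"Ref": "DbLaunchConfig"},
                    "MinSize": "1",
                    "MaxSize": "1",  # exactly one host; the group replaces it if it fails
                    "AvailabilityZones": {"Fn::GetAZs": ""},
                },
            },
        },
    }


# "ami-00000000" is a placeholder, not a real image ID.
template = build_template("r3.large", "ami-00000000", "mysql", "5.6")
print(json.dumps(template, indent=2))
```

Because the template is produced by a function, the same code can be parameterized per project and kept under version control, which is the standardized, reusable, versioned property the slide lists. Launch configurations were the usual mechanism at the time of the talk; AWS has since steered users toward launch templates.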
DBaC Templates
● Oracle (RDS & EC2)
● MySQL (EC2)
● Galera MySQL Cluster (EC2)
● MongoDB (EC2)
● PostgreSQL (EC2)
Implementation Example
Oracle@EC2
● Click-a-button Provision (host & db)
● Auto Failover
● Fast Failover (~30 seconds)
● Auto Healing of Failed Host
● Rolling Updates/Upgrades
● Cost Effective
● DBA Support
Best Practices & Recommendations
● Money talks:
  ○ EC2: Start with small instances or use Reserved Instances
  ○ Storage: Stripe EBS to get best performance with lower costs
  ○ Backup/Archive: S3/Glacier
● RDS vs. EC2
● Choose Right Instance (instance class & size)
  ○ Understand your applications and databases
● Storage, Storage, Storage!
  ○ General Purpose vs. Provisioned IOPS
  ○ EBS Optimized Instances
● Backup
  ○ Snapshots vs. Database Backup
● Design for Failures!
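The "Stripe EBS" recommendation can be reasoned about with simple arithmetic. The sketch below assumes General Purpose volumes that earn 3 baseline IOPS per GiB with a per-volume floor and ceiling. These constants are assumptions for illustration (the ceiling in particular has changed over time), so check current AWS documentation before relying on them. The sketch also ignores burst credits and instance-level EBS bandwidth limits.

```python
# Assumed General Purpose volume characteristics; check current AWS documentation.
IOPS_PER_GIB = 3          # baseline IOPS earned per GiB
MIN_VOLUME_IOPS = 100     # baseline floor for small volumes
MAX_VOLUME_IOPS = 10_000  # per-volume ceiling (assumed; it has changed over time)


def volume_baseline_iops(size_gib):
    """Baseline IOPS of a single volume of the given size."""
    return max(MIN_VOLUME_IOPS, min(MAX_VOLUME_IOPS, size_gib * IOPS_PER_GIB))


def striped_baseline_iops(total_gib, volume_count):
    """Aggregate baseline IOPS when total_gib is split evenly across a RAID 0 stripe."""
    per_volume_gib = total_gib // volume_count
    return volume_count * volume_baseline_iops(per_volume_gib)


total_gib = 8000
single = striped_baseline_iops(total_gib, 1)   # one large volume hits the per-volume ceiling
striped = striped_baseline_iops(total_gib, 4)  # four 2000 GiB volumes stay under it
print(single, striped)
```

Under these assumptions the same total capacity yields more aggregate IOPS once it is spread over several volumes, without paying for Provisioned IOPS. Below the per-volume ceiling, striping does not change the baseline, so the benefit depends on volume size.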