Managed Database Services on Amazon Web Services
Scott Ward, Solutions Architect
March 30th, 2016
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Apr 12, 2017
Today’s agenda
• Why managed database services?
• A non-relational managed database
• A relational managed database
• A managed in-memory cache
• A managed data warehouse
• What to do next
Why managed database services?
Options for running your database
• Self-managed: You are responsible for the hardware, OS, security, updates, backups, replication, and so on, but you have full control.
• EC2 instances: You focus only on database-level updates, patches, replication, and backups, and don’t have to worry about the hardware or the OS installation.
• Fully managed: You get features such as backup and replication as a packaged service and don’t have to handle patching and updates.
What are the AWS managed DB options?
A managed service for each major DB type
• In-memory key-value store: Amazon ElastiCache
• Data warehouse: Amazon Redshift
• SQL database engines: Amazon RDS
• Document and key-value store: Amazon DynamoDB
Pick the best tool for the job
What is Amazon RDS?
Relational databases
Fully managed
Fast, predictable performance
Simple and fast to scale
Low cost, pay for what you use
Amazon RDS
Amazon Aurora
Use cases
Applicable wherever you need relational databases
• eCommerce
• Gaming
• Websites
• IT solutions
• Apps
• Reporting
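RDS feature matrix
 Feature          | Aurora | MySQL           | PostgreSQL      | MariaDB         | Oracle          | SQL Server
------------------+--------+-----------------+-----------------+-----------------+-----------------+-----------------
 Max storage      | 64 TB  | 6 TB            | 6 TB            | 6 TB            | 6 TB            | 4 TB
 Provisioned IOPS | N/A    | 30,000          | 30,000          | 30,000          | 30,000          | 20,000
 Largest instance | R3.8XL | R3.8XL, M4.10XL | R3.8XL, M4.10XL | R3.8XL, M4.10XL | R3.8XL, M4.10XL | R3.8XL, M4.10XL
Also compared on the slide (per-engine marks not recoverable): VPC, high availability, instance scaling, encryption, read replicas (cross-region; Oracle GoldenGate), scale storage (Auto Scaling).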
Amazon Aurora: Fast, available, and MySQL-compatible
SQL
Transactions
AZ 1 AZ 2 AZ 3
Caching
Amazon S3
5x faster than MySQL on same hardware
Sysbench: 100K writes/sec and 500K reads/sec
Designed for 99.99% availability
6-way replicated storage across 3 AZs
Scale to 64 TB and 15 read replicas
Amazon RDS is simple and fast to scale
Database instance types offer a range of CPU and memory selections
Scale up or down among instance types on demand
Database storage is scalable on demand
Amazon RDS offers fast, predictable storage
General Purpose (SSD) for most workloads
Provisioned IOPS (SSD) for OLTP workloads up to 30,000 IOPS
Magnetic for small workloads with infrequent access
High availability Multi-AZ deployments
Enterprise-grade fault tolerance solution for production databases
Automatic failover
Synchronous replication
Inexpensive and enabled with a few clicks
Choose Read Replicas for greater scalability
Bring data close to your customer’s applications in different regions
Relieve pressure on your master node for supporting reads and writes.
Promote a read replica to a master for faster recovery in the event of disaster
Choose cross-region replication for enhanced data locality, even more ease of migration
Even faster recovery in the event of disaster
Bring data close to your customers
Promote to a master for easy migration
Choose cross-region snapshot copy for even greater durability, ease of migration
Copy a database snapshot to a different AWS region
Warm standby for disaster recovery
Base for migration to a different region
How Amazon RDS backups work
Automated backups
• Restore your database to a point in time
• Enabled by default
• Choose a retention period, up to 35 days
Manual snapshots
• Build a new database instance from a snapshot when needed
• Initiated by you
• Persist until you delete them
Stored in Amazon S3
You pay for the resources that you use
Monthly bill = N × (duration for which the nodes were used) + GB of storage consumed
• N = number of nodes
• Node price depends on the type of node
• Storage price depends on the type of storage
Further details at http://aws.amazon.com/rds/pricing/
Free tier (for first 12 months)
• 750 micro DB instance hours
• 20 GB of DB storage
• 20 GB for backups
• 10 million I/O operations
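The billing formula on this slide can be sketched as a small function. This is a minimal illustration, not an official calculator: the function name and every price below are hypothetical placeholders, so check the pricing page for real rates.

```python
def rds_monthly_bill(nodes, hours, price_per_node_hour, storage_gb, price_per_gb_month):
    """Monthly bill = N x duration x node price + GB stored x storage price."""
    instance_charge = nodes * hours * price_per_node_hour
    storage_charge = storage_gb * price_per_gb_month
    return instance_charge + storage_charge


# Hypothetical prices, for illustration only (not real AWS rates).
bill = rds_monthly_bill(
    nodes=2,
    hours=720,                 # a 30-day month
    price_per_node_hour=0.10,  # assumed hourly price for the chosen node type
    storage_gb=100,
    price_per_gb_month=0.10,   # assumed price for the chosen storage type
)
print(f"Estimated monthly bill: ${bill:.2f}")
```

The two terms mirror the slide: one grows with the number and type of nodes, the other with the amount and type of storage.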
Selected Amazon RDS customers
What is Amazon DynamoDB?
Amazon DynamoDB
NoSQL database
Fully managed
Single-digit millisecond latency
Massive and seamless scalability
Low cost
Amazon DynamoDB: a managed document and key-value store
Simple and fast to deploy
Simple and fast to scale
• To millions of IOPS
Data is automatically replicated
Fast, predictable performance
• Backed by SSD storage
Secondary indexes offer fast lookups
No cost to get started; pay only for what you consume
Popular use cases
• Ad Tech: Ad serving, retargeting, ID lookup, user profile management, session tracking, RTB
• IoT: Tracking state, metadata and readings from millions of devices, real-time notifications
• Gaming: Recording game details, leaderboards, session information, usage history, and logs
• Mobile & Web: Storing user profiles, session details, personalization settings, entity-specific metadata
Automatic replication for rock-solid durability and availability
Writes
• Replicated continuously to 3 AZs
• Persisted to disk (custom SSD)
Reads
• Strongly or eventually consistent
• No latency trade-off
Amazon DynamoDB is a schemaless database
Table
Items
Attributes (name-value pairs)
Each item must include a key
• Hash key (DynamoDB maintains an unordered index)
Each item must include a key
• Hash key
• Range key (DynamoDB maintains a sorted index)
Local secondary indexes = alternate range keys
• Hash key
• Range key
• LSI key
Global secondary indexes = “pivot charts” for your table
• Choose which attributes to project (if any)
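The key model above can be pictured with a toy in-memory table. This is only a sketch of the idea (an unordered lookup by hash key, plus a sorted index on the range key), not how DynamoDB is implemented; `ToyTable` and its methods are hypothetical names.

```python
from bisect import bisect_left, bisect_right, insort


class ToyTable:
    """Toy hash-and-range table: unordered lookup by hash key, sorted range keys."""

    def __init__(self):
        self._partitions = {}  # hash key -> sorted list of range keys
        self._items = {}       # (hash key, range key) -> attribute dict

    def put_item(self, hash_key, range_key, **attributes):
        # Only index the range key the first time this full key is seen.
        if (hash_key, range_key) not in self._items:
            insort(self._partitions.setdefault(hash_key, []), range_key)
        self._items[(hash_key, range_key)] = attributes

    def query(self, hash_key, low, high):
        """Items for one hash key with range key in [low, high], in sorted order."""
        keys = self._partitions.get(hash_key, [])
        start, end = bisect_left(keys, low), bisect_right(keys, high)
        return [(k, self._items[(hash_key, k)]) for k in keys[start:end]]


table = ToyTable()
table.put_item("player-1", 20160301, score=40)
table.put_item("player-1", 20160315, score=75, level="gold")  # schemaless: extra attribute
table.put_item("player-2", 20160310, score=10)

first_ten_days = table.query("player-1", 20160301, 20160310)
print(first_ten_days)
```

Because range keys are kept sorted within each hash key, a range query touches only the matching slice. A local secondary index would add a second sorted list under the same hash key.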
Define the desired performance using provisioned throughput
Read capacity units
Write capacity units
1 request per second (RPS) > 2.5 million requests in a month
DynamoDB: What are capacity units?
One write capacity unit
• One write per second, up to 1 KB
One read capacity unit
• One strongly consistent read per second, up to 4 KB
• or two eventually consistent reads per second
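The capacity unit definitions above translate into a short calculation. The sketch below assumes that larger items are rounded up to the next 1 KB (writes) or 4 KB (reads) boundary; the function names are hypothetical.

```python
import math


def write_capacity_units(item_size_kb, writes_per_second):
    """One unit covers one write per second for an item up to 1 KB."""
    return math.ceil(item_size_kb / 1) * writes_per_second


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """One unit covers one strongly consistent (or two eventually consistent)
    reads per second for an item up to 4 KB."""
    units = math.ceil(item_size_kb / 4) * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)


wcu = write_capacity_units(item_size_kb=3, writes_per_second=10)
rcu_strong = read_capacity_units(item_size_kb=6, reads_per_second=10)
rcu_eventual = read_capacity_units(item_size_kb=6, reads_per_second=10,
                                   strongly_consistent=False)

# One request per second, sustained over a 30-day month.
requests_per_month = 1 * 60 * 60 * 24 * 30

print(wcu, rcu_strong, rcu_eventual, requests_per_month)
```

The last line of the calculation backs the "1 RPS > 2.5 M requests in a month" figure: a single sustained request per second adds up to roughly 2.6 million requests over 30 days.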
Simple app architecture with Amazon DynamoDB
Clients
Elastic Load Balancing
Amazon EC2 app instances (business logic)
DynamoDB
How DynamoDB billing works
Monthly bill = GB of storage consumed (plus 100 bytes per item) + charge for write capacity units per hour + charge for read capacity units per hour
≈ 5 GB * $0.25 + 21 * 720 hrs * $0.0065/10 + 35 * 720 hrs * $0.0065/50 ≈ $14.36
Assumes DB instance accessed only from AWS region
Further details at http://aws.amazon.com/dynamodb/pricing/
How DynamoDB billing works (with free tier)
Free tier (for first 12 months)
• 25 GB storage
• 25 units write capacity
• 25 units read capacity
Monthly bill = GB of storage consumed (plus 100 bytes per item) + charge for write capacity units per hour + charge for read capacity units per hour
≈ (5 − 25) GB * $0.25 + (21 − 25) * 720 hrs * $0.0065/10 + (35 − 25) * 720 hrs * $0.0065/50
≈ 0 + 0 + 10 * 720 hrs * $0.0065/50 ≈ $0.94
Usage below a free tier allowance is billed as zero, not as a credit.
Assumes DB instance accessed only from AWS region
Further details at http://aws.amazon.com/dynamodb/pricing/
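The billing arithmetic from these slides can be reproduced with a few lines of code. This sketch uses only the rates quoted on the slides ($0.25 per GB, $0.0065 per hour per 10 write units or per 50 read units, 720 hours); the function and parameter names are hypothetical, and current prices may differ.

```python
HOURS_PER_MONTH = 720


def dynamodb_monthly_bill(storage_gb, write_units, read_units,
                          free_gb=0, free_write_units=0, free_read_units=0):
    """Storage + write capacity + read capacity, after free tier allowances."""
    # max(..., 0) keeps unused free tier allowance from becoming a credit.
    storage = max(storage_gb - free_gb, 0) * 0.25
    writes = max(write_units - free_write_units, 0) * HOURS_PER_MONTH * 0.0065 / 10
    reads = max(read_units - free_read_units, 0) * HOURS_PER_MONTH * 0.0065 / 50
    return storage + writes + reads


without_free_tier = dynamodb_monthly_bill(5, 21, 35)
with_free_tier = dynamodb_monthly_bill(5, 21, 35,
                                       free_gb=25, free_write_units=25,
                                       free_read_units=25)

print(f"Without free tier: ${without_free_tier:.3f}")
print(f"With free tier:    ${with_free_tier:.3f}")
```

The unrounded sum without the free tier comes to about $14.354; the $14.36 on the slide is consistent with rounding each term to cents before adding. With the free tier, only the 10 read units above the allowance are billed.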
Selected DynamoDB customers
NoSQL vs. SQL for a new app: how to choose?
NoSQL
• Schema-less, easy reads and writes, simple data model
• Scaling is easy
• Focus on performance and availability at any scale
SQL
• Strong schema, complex relationships, transactions and joins
• Scaling is difficult
• Focus on consistency over scale and availability
What is Amazon Redshift?
Amazon Redshift
A lot faster, a lot cheaper, a whole lot simpler
Relational data warehouse
Massively parallel; petabyte scale
Fully managed
HDD and SSD platforms
$1,000/TB/year; starts at $0.25/hour
Who uses Amazon Redshift?
Traditional enterprise DW
• Reduce costs by extending DW rather than adding HW
• Migrate completely from existing DW systems
• Respond faster to business; provision in minutes
Companies with big data
• Improve performance by an order of magnitude
• Make more data available for analysis
• Access business data via standard reporting tools
SaaS companies
• Add analytic functionality to applications
• Scale DW capacity as demand grows
• Reduce HW and SW costs by an order of magnitude
Amazon Redshift architecture
SQL clients/BI tools
• Connect over JDBC/ODBC
Leader node
• Simple SQL endpoint
• Stores metadata
• Optimizes query plan
• Coordinates query execution
Compute nodes
• Local columnar storage
• Parallel/distributed execution of all queries, loads, backups, restores, resizes
• 10 GigE (HPC) interconnect
• Ingestion, backup, restore: Amazon S3/DynamoDB/Amazon EMR
Start at just $0.25/hour, grow to 2 PB (compressed)
• DC1: SSD; scale 160 GB–326 TB
• DS2: HDD; scale 2 TB–2 PB
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
Column storage
• With row storage, you do unnecessary I/O
• To get total amount, you have to read everything
• With column storage, you only read the data you need
 ID  | Age | State | Amount
-----+-----+-------+--------
 123 |  20 | CA    |    500
 345 |  25 | WA    |    250
 678 |  40 | FL    |    125
 957 |  37 | WA    |    375
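The difference between row and column storage can be shown with the four-row table from the slide. This is a conceptual sketch in plain Python, not Redshift's storage format: it simply counts how many values each layout has to touch to total the Amount column.

```python
# The example table from the slide, stored row by row.
rows = [
    {"id": 123, "age": 20, "state": "CA", "amount": 500},
    {"id": 345, "age": 25, "state": "WA", "amount": 250},
    {"id": 678, "age": 40, "state": "FL", "amount": 125},
    {"id": 957, "age": 37, "state": "WA", "amount": 375},
]

# The same data stored column by column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Row storage: every value of every row is read to reach the amounts.
values_read_row_store = sum(len(row) for row in rows)

# Column storage: only the "amount" column is read.
values_read_column_store = len(columns["amount"])

total_amount = sum(columns["amount"])
print(total_amount, values_read_row_store, values_read_column_store)
```

With four columns, the column layout reads a quarter of the values for this query; the saving grows with the width of the table.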
Amazon Redshift dramatically reduces I/O: data compression
analyze compression listing;
 Table   | Column         | Encoding
---------+----------------+----------
 listing | listid         | delta
 listing | sellerid       | delta32k
 listing | eventid        | delta32k
 listing | dateid         | bytedict
 listing | numtickets     | bytedict
 listing | priceperticket | delta32k
 listing | totalprice     | mostly32
 listing | listtime       | raw
• COPY compresses automatically
• You can analyze and override
• More performance, less cost
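Several columns in the output above are assigned a delta encoding. As a rough illustration of why that helps on sorted or slowly changing columns, the sketch below stores the first value and then only the differences between neighbors. It shows the general idea only and is not Redshift's on-disk format; the sample IDs are made up.

```python
def delta_encode(values):
    """Keep the first value, then the difference from each previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    values, running = [], 0
    for delta in deltas:
        running += delta
        values.append(running)
    return values


list_ids = [1001, 1002, 1003, 1005, 1008]  # made-up, mostly sequential IDs
encoded = delta_encode(list_ids)
decoded = delta_decode(encoded)
print(encoded)
```

After the first entry, the encoded values are small numbers that fit in fewer bytes than the original IDs, which is where the space and I/O savings come from.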
Amazon Redshift dramatically reduces I/O: zone maps
• Track the minimum and maximum value for each block
• Skip over blocks that don’t contain relevant data
Block 1: 10 | 13 | 14 | 26 | … | 100 | 245 | 324 (min 10, max 324)
Block 2: 375 | 393 | 417 | … | 512 | 549 | 623 (min 375, max 623)
Block 3: 637 | 712 | 809 | … | 834 | 921 | 959 (min 637, max 959)
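Zone maps can be sketched with the three blocks shown on the slide. The code below keeps a (min, max) pair per block and skips any block whose range cannot overlap the query. The values elided with "…" on the slide are simply left out, and `scan` is a hypothetical helper name.

```python
# Sorted blocks from the slide (elided values omitted).
blocks = [
    [10, 13, 14, 26, 100, 245, 324],
    [375, 393, 417, 512, 549, 623],
    [637, 712, 809, 834, 921, 959],
]

# Zone map: the minimum and maximum value of each block.
zone_map = [(min(block), max(block)) for block in blocks]


def scan(low, high):
    """Return values in [low, high] and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for (block_min, block_max), block in zip(zone_map, blocks):
        if block_max < low or block_min > high:
            continue  # the zone map proves this block has no relevant data
        blocks_read += 1
        matches.extend(v for v in block if low <= v <= high)
    return matches, blocks_read


matches, blocks_read = scan(400, 600)
print(matches, blocks_read)
```

For a query between 400 and 600, only the middle block overlaps the requested range, so the other two are never read.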
Amazon Redshift dramatically reduces I/O: direct-attached storage
• > 2 GB/sec scan rate
• Optimized for data processing
• High disk density
DW.HS1.XL: 16 GB RAM, 2 cores, 2 TB disk
DW.HS1.8XL: 128 GB RAM, 16 cores, 16 TB disk
Fully managed, continuous/incremental backups
• Multiple copies within cluster
• Continuous and incremental backups to Amazon S3
• Continuous and incremental backups across regions
• Streaming restore
(Diagram: compute nodes backing up to Amazon S3 in Region 1 and Region 2)
Amazon Redshift offers rock-solid fault tolerance
• Disk failures
• Node failures
• Network failures
• AZ/region-level disasters
You pay for what you use
Monthly bill = N × (duration for which the nodes were used)
• N = number of nodes
• Price depends on type of node
Further details at https://aws.amazon.com/redshift/pricing/
Free tier (2 month free trial)
• 750 DC1.Large hours per month
Redshift has a large ecosystem
• Data Integration
• Business Intelligence
• Systems Integrators
Selected Amazon Redshift customers
What is Amazon ElastiCache?
In-memory key-value store
High-performance
Resizable in-memory caching
Memcached and Redis
Fully managed; zero admin
Compatible with your existing applications
Amazon ElastiCache
Popular use cases
Caching layer for performance or cost optimization of an underlying database
Storage of ephemeral key-value data
High-performance application patterns such as leaderboards (for gaming users), session management, event counters, in-memory lists
Key ElastiCache features
Memcached
• Fully managed
• Cache node auto-discovery
• Multi-AZ node placement
Redis
• Fully managed
• Multi-AZ with auto-failover
• Persistence
• Read replicas
Amazon ElastiCache: simple app architecture
Clients
Elastic Load Balancing
Amazon EC2 app instances
Amazon RDS
Amazon ElastiCache
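In an architecture like this, app instances commonly check the cache before querying the database (the cache-aside pattern). The sketch below illustrates that pattern with an in-memory dict standing in for the cache and another for the database; `CacheAside` and its methods are hypothetical names, and no ElastiCache or RDS client is involved.

```python
class CacheAside:
    """Read-through helper: serve from the cache, fall back to the database."""

    def __init__(self, load_from_db):
        self._cache = {}                  # stand-in for a Memcached/Redis node
        self._load_from_db = load_from_db
        self.db_reads = 0                 # counts trips to the database

    def get(self, key):
        if key in self._cache:
            return self._cache[key]       # cache hit: no database work
        self.db_reads += 1
        value = self._load_from_db(key)   # cache miss: query the database
        self._cache[key] = value          # populate the cache for next time
        return value

    def invalidate(self, key):
        """Drop a cached entry, for example after the row is updated."""
        self._cache.pop(key, None)


fake_db = {"user:1": {"name": "Ana"}}     # stand-in for the relational database
store = CacheAside(fake_db.get)

first = store.get("user:1")   # miss: loaded from the database
second = store.get("user:1")  # hit: served from the cache
print(store.db_reads)
```

Repeated reads of the same key reach the database only once, which is how a caching layer relieves load on the underlying database.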
Amazon ElastiCache: resilient app architecture
Elastic Load Balancing
Clients
AZ a
AZ b
How ElastiCache billing works
Monthly bill = N × (duration for which the nodes were used)
• N = number of nodes
• Price depends on type of node
Further details at http://aws.amazon.com/elasticache/pricing/
Free tier (for first 12 months)
• 750 micro cache node hours
Selected ElastiCache customers
Managed DB services: better together
Elastic Load Balancing
Clients
AZ a
AZ b
Next Steps
Free Tier
DynamoDB
RDS
ElastiCache
Redshift
Thank you!