Disaster Recovery Site on AWS - Minimal Cost Maximum Efficiency (STG305) | AWS re:Invent 2013

Post on 12-Jan-2015


DESCRIPTION

Implementation of a disaster recovery (DR) site is crucial for the business continuity of any enterprise. Due to the fundamental nature of features like elasticity, scalability, and geographic distribution, DR implementation on AWS can be done at 10-50% of the conventional cost. In this session, we do a deep dive into proven DR architectures on AWS and the best practices, tools and techniques to get the most out of them.

Transcript

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

Disaster Recovery Site on AWS: Minimal Cost, Maximum Efficiency

Abdul Sathar Sait, Vikram Garlapati, and Kamal Arora (AWS)

November 15, 2013

What you will learn

• Why AWS for disaster recovery?

• Common DR architectures

– Pilot light architecture (demo and code walkthrough)

– Backup and restore

• Customer case studies

• Where to go next

Conventional Disaster Recovery sites

• High cost

• Low ROI

• Implemented only for most critical systems

• Usually scaled down to 50% of production

• Operating systems in a remote region is challenging

• Costly software licenses based on hardware usage

Disaster Recovery site on AWS

• Unprecedented capabilities to implement DR sites

• Easily set up DR sites in different geographic regions

• Cut down DR site cost by up to 70%

• Substantial savings on software licenses

Global reach from your desktop

Common DR architectures

• Backup and restore

• Pilot light

• Warm standby

• Hot standby

Pilot light architecture

Build resources around the replicated dataset

• Create instances from AMIs

• Keep the ‘pilot light’ on by replicating core databases

• Build AWS resources around the dataset and leave them in a stopped state

Pilot light architecture

Scale resources in AWS in response to a DR event

• Start up a pool of resources in AWS when events dictate

• Scale up the database instance to handle production capacity

Pilot light architecture

Switch over to AWS

• Make the necessary DNS changes to redirect traffic to the DR site on AWS

Pilot light demo

[Diagram: Simple DR solution – awsdrdemo.com, before failover. Amazon Route 53 sends traffic to the active site in US East (N. Virginia): Elastic Load Balancing, web/app servers in an Auto Scaling group, and the Oracle master DB with its data volume. US West (N. California) holds the passive, scaled-down standby: the Oracle slave DB with its data volume, fed by data replication, plus a copy of the web/app server AMI.]

[Diagram: Simple DR solution – awsdrdemo.com, after failover. US East (N. Virginia) is gone and US West (N. California) becomes active, with Elastic Load Balancing, web/app servers in an Auto Scaling group, and the Oracle DB with its data volume. Failover actions shown: DNS failover in Amazon Route 53, autoscale, scale up DB.]

Architecture

[Diagram: active/passive architecture behind Amazon Route 53 with active mirroring / replication between sites. Primary in US East (N. Virginia): web/app server, data volume, webserver AMI. Scaled-down standby in US West (N. California): secondary DB, data volume, AMI copy (ami-996634f0), and the failover app.]

Resource identifiers shown on the slide:

• VPC ID: vpc-a4f2efcc; subnet IDs: subnet-bbf2efd3, subnet-884b01ce, subnet-bef2efd6

• VPC ID: vpc-5f9ef53e; subnet IDs: subnet-440c786c, subnet-289ef549, subnet-2c9ef54d

• Site: awsdrdemo.com

• Active ELB: DRDemoPrimaryELB-52152634.us-east-1.elb.amazonaws.com

• Web servers: i-36af5751

• Primary database server: i-026aad65, private IP 174.168.1.11

• Secondary database server: i-3b266960, private IP 174.168.1.11

• Failover app instance: i-55cfde0e, Elastic IP 54.215.157.25, failover.awsdrdemo.com

• DR ELB: created on failover

• Web servers: created on failover

Demo – AWS Resources: console.aws.amazon.com

Demo – Application: awsdrdemo.com

Demo – Failover Kickoff: failover.awsdrdemo.com

Demo – Failover Status Updates: status.awsdrdemo.com/dr

Failover Steps

• Launch failover application

• AWS CloudFormation: launch web servers

• Resize target database instance

• Route 53 DNS updates

• AWS CloudFormation: launch ELB

• Go live
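The failover steps above can be sketched as a small shell driver that announces each step, runs it, and stops at the first failure. This is only an illustration and not the presenters' failover application: the function names, the placeholder step bodies, the default SNS topic ARN, and the `DRY_RUN` switch are all assumptions, and the step order simply follows the order in which the slide lists them.

```shell
# Sketch of a failover driver. All names here are illustrative assumptions.
DR_REGION="${DR_REGION:-us-west-1}"
SNS_ARN="${SNS_ARN:-arn:aws:sns:us-west-1:123456789012:dr-status}"  # placeholder ARN

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# notify: publish a status message to the SNS topic.
notify() {
  run aws --region "$DR_REGION" sns publish --topic-arn "$SNS_ARN" --message "$1"
}

# step: announce a step, run it, and report whether it succeeded.
step() {
  local description="$1"
  shift
  notify "START: $description"
  if "$@"; then
    notify "DONE: $description"
  else
    notify "FAILED: $description"
    return 1
  fi
}

# Placeholder step bodies; replace them with real commands.
launch_web_servers() { echo "placeholder: create the web server stack"; }
resize_database()    { echo "placeholder: resize the standby database"; }
update_dns()         { echo "placeholder: repoint Route 53"; }
launch_elb()         { echo "placeholder: create the DR load balancer"; }

# failover: run the steps in slide order; '&&' stops at the first failure.
failover() {
  step "Launch web servers" launch_web_servers &&
  step "Resize target database instance" resize_database &&
  step "Route 53 DNS updates" update_dns &&
  step "Launch ELB" launch_elb &&
  notify "Go live"
}

# Example: DRY_RUN=1 failover
```

Running it with `DRY_RUN=1` prints the SNS commands instead of calling AWS, which makes it possible to rehearse the sequence without touching any resources.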

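Failover Application Architecture

[Diagram: the failover app runs in the AWS Region next to the webserver AMI, drives the DR procedure through the CLI and AWS CloudFormation, and reports progress to admin users through SNS HTTP notifications.]

(1) Trigger DR procedure

(2) Invoke shell script

(3) Launch CloudFormation

(4) Script updates

(5) CF updates

(6) Real-time feed from SNS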

Metadata Requests

// Sample code for an instance metadata request using .NET
using System.IO;
using System.Net;
using System.Text;

string uri = "http://169.254.169.254/latest/meta-data/placement/availability-zone";

// Create the web request and read the response
HttpWebRequest webrequest = (HttpWebRequest)WebRequest.Create(uri);
HttpWebResponse webresponse = (HttpWebResponse)webrequest.GetResponse();
Encoding enc = Encoding.GetEncoding(1252);
StreamReader loResponseStream = new StreamReader(webresponse.GetResponseStream(), enc);

// Get the Availability Zone value
string availzone = loResponseStream.ReadToEnd();
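The same metadata lookup can be done from a shell script, which is convenient when the region is needed for the `--region` flag of later CLI calls. The following is a minimal sketch, not code from the session: `current_az` and `region_from_az` are hypothetical helper names, and the sketch assumes the metadata endpoint answers plain unauthenticated requests, as in the sample above.

```shell
# Sketch: read the Availability Zone from instance metadata and derive the region.
# Function names are illustrative and not part of the original demo.
METADATA_URL="http://169.254.169.254/latest/meta-data/placement/availability-zone"

# current_az: fetch the AZ of the running instance (only works on EC2).
current_az() {
  curl -s --max-time 2 "$METADATA_URL"
}

# region_from_az: strip the trailing zone letter, e.g. us-west-1a -> us-west-1.
region_from_az() {
  local az="$1"
  printf '%s\n' "${az%[a-z]}"
}

# Example (on an EC2 instance):
#   region=$(region_from_az "$(current_az)")
```

Keeping the string handling in its own function means it can be checked on any machine, while only `current_az` depends on running inside EC2.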

Amazon Route53 Updates

# Retrieve existing ELB details from the Route 53 hosted zone
domainname=www.awsdrdemo.com
hostedzoneid="ZXXXXXXXXXXXXR"

# Retrieve the ELB alias zone ID and DNS name from the existing Route 53 record
zoneid=$(aws --region us-west-1 --output text route53 list-resource-record-sets --hosted-zone-id "$hostedzoneid" --start-record-name "$domainname" --start-record-type A --max-items 1 | grep ALIASTARGET | awk '{print $2}')
dns=$(aws --region us-west-1 --output text route53 list-resource-record-sets --hosted-zone-id "$hostedzoneid" --start-record-name "$domainname" --start-record-type A --max-items 1 | grep ALIASTARGET | awk '{print $4}')

# Apply the change batch
aws --region us-west-1 route53 change-resource-record-sets --hosted-zone-id "$hostedzoneid" --change-batch file:///usr/local/bin/route53.json

Change batch file: http://vrg.s3.amazonaws.com/downloads/route53.json
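The contents of route53.json are not reproduced in this transcript. As a rough sketch of what such a change batch could look like, the hypothetical helper below prints one that repoints an alias A record at a load balancer. The function name, the comment text, the `UPSERT` action, and the `EvaluateTargetHealth` setting are assumptions, not the presenters' file.

```shell
# Sketch: print a Route 53 change batch that repoints an alias A record.
# make_change_batch is a hypothetical helper; arguments are
#   $1 record name, $2 hosted zone ID of the alias target, $3 DNS name of the alias target.
make_change_batch() {
  local name="$1"
  local alias_zone_id="$2"
  local alias_dns="$3"
  cat <<EOF
{
  "Comment": "Repoint ${name} to the DR load balancer",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "${name}",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "${alias_zone_id}",
          "DNSName": "${alias_dns}",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}
EOF
}

# Example:
#   make_change_batch www.awsdrdemo.com. "$elb_zone_id" "$elb_dns" > /usr/local/bin/route53.json
```

Note that the `HostedZoneId` inside `AliasTarget` is the load balancer's own hosted zone ID, not the ID of the domain's hosted zone that is passed to `--hosted-zone-id`.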

Resize Database Instance

# Stop the DB instance so it can be resized
aws --region us-west-1 ec2 stop-instances --instance-ids "$dbInstanceId"

# Publish an Amazon SNS message for the action
aws --region us-west-1 sns publish --topic-arn "$snsarn" --message "Resizing the stopped instance"

# Resize the DB instance
aws --region us-west-1 ec2 modify-instance-attribute --instance-id "$dbInstanceId" --instance-type "{\"Value\": \"m1.small\"}"

# Start the resized DB instance
aws --region us-west-1 ec2 start-instances --instance-ids "$dbInstanceId"
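An instance has to be fully stopped before its type can be changed, and `stop-instances` returns before that point. One way to handle this, sketched below, is to wait between the calls using the CLI's `ec2 wait` subcommands. The `resize_instance` function, its arguments, and the `DRY_RUN` switch are assumptions added for illustration and do not appear in the session's script.

```shell
# Sketch: stop, wait, resize, start, wait. Names are illustrative assumptions.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# resize_instance REGION INSTANCE_ID NEW_TYPE
resize_instance() {
  local region="$1"
  local instance_id="$2"
  local new_type="$3"
  run aws --region "$region" ec2 stop-instances --instance-ids "$instance_id" &&
  # Block until the instance has reached the stopped state.
  run aws --region "$region" ec2 wait instance-stopped --instance-ids "$instance_id" &&
  run aws --region "$region" ec2 modify-instance-attribute --instance-id "$instance_id" \
      --instance-type "{\"Value\": \"$new_type\"}" &&
  run aws --region "$region" ec2 start-instances --instance-ids "$instance_id" &&
  # Block until the resized instance is running again.
  run aws --region "$region" ec2 wait instance-running --instance-ids "$instance_id"
}

# Example: DRY_RUN=1 resize_instance us-west-1 "$dbInstanceId" m1.large
```

Chaining the calls with `&&` means a failed stop or a timed-out wait prevents the later steps from running against an instance in the wrong state.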

AWS CloudFormation Stack Launch

# Launch the DR stack using an AWS CloudFormation template
launchedstackid=$(aws --region us-west-1 --output text cloudformation create-stack --stack-name "$stackname" --template-body file:///usr/local/bin/ELBWithEC2Instances.template --notification-arns "$snsarn" --parameters ParameterKey="HostedZoneId",ParameterValue="$hostedzoneid")
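`create-stack` returns as soon as the request is accepted, so a script that needs the new load balancer has to wait for the stack to finish. The sketch below shows one possible way to do that and then read the stack's `URL` output. `launch_dr_stack`, its argument order, and the `DRY_RUN` switch are hypothetical and not taken from the session.

```shell
# Sketch: create the stack, wait for completion, then read its URL output.
# Names are illustrative assumptions.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# launch_dr_stack REGION STACK_NAME TEMPLATE_FILE SNS_ARN HOSTED_ZONE_ID
launch_dr_stack() {
  local region="$1"
  local stack_name="$2"
  local template_file="$3"
  local sns_arn="$4"
  local hosted_zone_id="$5"
  run aws --region "$region" cloudformation create-stack \
      --stack-name "$stack_name" \
      --template-body "file://$template_file" \
      --notification-arns "$sns_arn" \
      --parameters "ParameterKey=HostedZoneId,ParameterValue=$hosted_zone_id" &&
  # Block until CloudFormation reports CREATE_COMPLETE (fails on rollback).
  run aws --region "$region" cloudformation wait stack-create-complete \
      --stack-name "$stack_name" &&
  # Print the value of the stack output named URL.
  run aws --region "$region" --output text cloudformation describe-stacks \
      --stack-name "$stack_name" \
      --query "Stacks[0].Outputs[?OutputKey=='URL'].OutputValue"
}

# Example:
#   launch_dr_stack us-west-1 "$stackname" /usr/local/bin/ELBWithEC2Instances.template "$snsarn" "$hostedzoneid"
```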

AWS CloudFormation Template

{

"AWSTemplateFormatVersion" : "2010-09-09",

"Description" : "AWS CloudFormation Template ELBWithEC2Instances: Create a load balanced, Auto Scaled sample website where the instances are locked down to only accept traffic from the load balancer. This script creates an Auto Scaling group behind a load balancer with a simple health check. The web site is available on port 80, however, the instances can be configured to listen on any port (8888 by default).",

"Parameters" : {

"KeyPairName" : {

"Description" : "Name of an existing Amazon EC2 key pair for SSH access",

"Type" : "String",

"Default" : "kamalkeydr"

},

"InstanceType" : {

"Description" : "WebServer EC2 instance type",

"Type" : "String",

"Default" : "m1.small",

"AllowedValues" : [ "t1.micro","m1.small","m1.medium","m1.large","m1.xlarge","m2.xlarge","m2.2xlarge","m2.4xlarge","c1.medium","c1.xlarge","cc1.4xlarge","cc2.8xlarge","cg1.4xlarge"],

"ConstraintDescription" : "must be a valid EC2 instance type."

},

"WebServerPort" : {

"Description" : "TCP/IP port of the web server",

"Type" : "String",

"Default" : "80"

},

"HostedZoneId" : {

"Type" : "String",

"Description" : "The Record Set's Hosted Zone Id for the existing hosted zone",

"Default" : "Z1M58G0W56PQJA"

}

},

"Mappings" : {

"AWSInstanceType2Arch" : {

"t1.micro" : { "Arch" : "64" },

"m1.small" : { "Arch" : "64" },

"m1.medium" : { "Arch" : "64" },

"m1.large" : { "Arch" : "64" },

"m1.xlarge" : { "Arch" : "64" },

"m2.xlarge" : { "Arch" : "64" },

"m2.2xlarge" : { "Arch" : "64" },

"m2.4xlarge" : { "Arch" : "64" },

"c1.medium" : { "Arch" : "64" },

"c1.xlarge" : { "Arch" : "64" }

},

"AWSRegionArch2AMI" : {

"us-west-1" : { "32" : "ami-5e41761b", "64" : "ami-5e41761b" }

}

},

"Resources" : {

"WebServerGroup" : {

"Type" : "AWS::AutoScaling::AutoScalingGroup",

"Properties" : {

"AvailabilityZones" : [ "us-west-1a"],

"LaunchConfigurationName" : { "Ref" : "LaunchConfig" },

"MinSize" : "2",

"MaxSize" : "2",

"LoadBalancerNames" : [ { "Ref" : "ElasticLoadBalancer" }],

"VPCZoneIdentifier" : ["subnet-bbf2efd3"]

}

},

"LaunchConfig" : {

"Type" : "AWS::AutoScaling::LaunchConfiguration",

"Properties" : {

"ImageId" : { "Fn::FindInMap" : [ "AWSRegionArch2AMI", { "Ref" : "AWS::Region" },

{ "Fn::FindInMap" : [ "AWSInstanceType2Arch", { "Ref" : "InstanceType" },

"Arch" ] } ] },

"UserData" : { "Fn::Base64" : { "Ref" : "WebServerPort" }},

"SecurityGroups" : [ { "Ref" : "InstanceSecurityGroup" } ],

"InstanceType" : { "Ref" : "InstanceType" },

"KeyName" : { "Ref" : "KeyPairName" },

"AssociatePublicIpAddress" : "true"

}

},

"ElasticLoadBalancer" : {

"Type" : "AWS::ElasticLoadBalancing::LoadBalancer",

"Properties" : {

"SecurityGroups" : [ { "Ref" : "LoadBalancerSecurityGroup" } ],

"Subnets" : ["subnet-bbf2efd3"],

"Listeners" : [ {

"LoadBalancerPort" : "80",

"InstancePort" : { "Ref" : "WebServerPort" },

"Protocol" : "HTTP"

} ],

"HealthCheck" : {

"Target" : { "Fn::Join" : [ "", ["HTTP:", { "Ref" : "WebServerPort" }, "/"]]},

"HealthyThreshold" : "2",

"UnhealthyThreshold" : "10",

"Interval" : "10",

"Timeout" : "3"

}

}

},

"LoadBalancerSecurityGroup" : {

"Type" : "AWS::EC2::SecurityGroup",

"Properties" : {

"GroupDescription" : "Enable HTTP access on port 80",

"VpcId" : "vpc-a4f2efcc",

"SecurityGroupIngress" : [ {

"IpProtocol" : "tcp",

"FromPort" : "80",

"ToPort" : "80",

"CidrIp" : "0.0.0.0/0"

} ],

"SecurityGroupEgress" : [ {

"IpProtocol" : "tcp",

"FromPort" : { "Ref" : "WebServerPort" },

"ToPort" : { "Ref" : "WebServerPort" },

"CidrIp" : "0.0.0.0/0"

} ]

}

},

"myDNS" : {

"Type" : "AWS::Route53::RecordSetGroup",

"Properties" : {

"HostedZoneName" : "awsdrdemo.com.",

"Comment" : "Zone apex alias targeted to myELB LoadBalancer.",

"RecordSets" : [

{

"Name" : "www.awsdrdemo.com.",

"Type" : "A",

"AliasTarget" : {

"HostedZoneId" : { "Fn::GetAtt" : ["ElasticLoadBalancer", "CanonicalHostedZoneNameID"] },

"DNSName" : { "Fn::GetAtt" : ["ElasticLoadBalancer","CanonicalHostedZoneName"] }

}

}

]

}

},

"InstanceSecurityGroup" : {

"Type" : "AWS::EC2::SecurityGroup",

"Properties" : {

"GroupDescription" : "Enable SSH access and HTTP access on the inbound port",

"VpcId" : "vpc-a4f2efcc",

"SecurityGroupIngress" : [ {

"IpProtocol" : "tcp",

"FromPort" : { "Ref" : "WebServerPort" },

"ToPort" : { "Ref" : "WebServerPort" },

"CidrIp" : "0.0.0.0/0"

} ]

}

}

},

"Outputs" : {

"URL" : {

"Description" : "URL of the website",

"Value" : { "Fn::Join" : [ "", [ "http://", { "Fn::GetAtt" : [ "ElasticLoadBalancer", "DNSName" ]}]]}

}

}

}

Template sections: headers, parameters, mappings, resources, outputs

Full template: http://vrg.s3.amazonaws.com/downloads/ELBWithEC2Instances.template

Parameters

"Parameters" : {

"KeyPairName" : {

"Description" : "Name of an existing Amazon EC2 key pair for SSH access",

"Type" : "String"

},

"InstanceType" : {

"Description" : "WebServer EC2 instance type",

"Type" : "String",

"Default" : "m1.small",

"AllowedValues" : [ "t1.micro","m1.small","m1.medium","m1.large","m1.xlarge","m2.xlarge","m2.2xlarge","m2.4xlarge","c1.medium","c1.xlarge","cc1.4xlarge","cc2.8xlarge","cg1.4xlarge"],

"ConstraintDescription" : "must be a valid EC2 instance type."

},

"HostedZoneId" : {

"Type" : "String",

"Description" : "The Record Set's Hosted Zone Id for the existing hosted zone"

}

}

Resources – Web Servers

"WebServerGroup" : {

"Type" : "AWS::AutoScaling::AutoScalingGroup",

"Properties" : {

"AvailabilityZones" : [ "us-west-1a"],

"LaunchConfigurationName" : { "Ref" : "LaunchConfig" },

"MinSize" : "2",

"MaxSize" : "2",

"LoadBalancerNames" : [ { "Ref" : "ElasticLoadBalancer" }],

"VPCZoneIdentifier" : ["subnet-bbf2efd3"]

}

},

"LaunchConfig" : {

"Type" : "AWS::AutoScaling::LaunchConfiguration",

"Properties" : {

"ImageId" : { "Fn::FindInMap" : [ "AWSRegionArch2AMI", { "Ref" : "AWS::Region" },

{ "Fn::FindInMap" : [ "AWSInstanceType2Arch", { "Ref" : "InstanceType" }, "Arch" ] } ] },

"UserData" : { "Fn::Base64" : { "Ref" : "WebServerPort" }},

"SecurityGroups" : [ { "Ref" : "InstanceSecurityGroup" } ],

"KeyName" : { "Ref" : "KeyPairName" }

}

}


A disaster recovery site on AWS can serve

• A primary site in a customer data center

• A primary site on AWS itself

Primary and DR sites on AWS

Backup & Restore pattern

Simple to get started

• Easy starting point for exploring the AWS cloud

Low technical barrier to entry

• Focus on incorporating cloud into your DR strategy, not on complex technical issues related to hot-hot systems

Cost-effective

• Very high levels of data durability at low price

• Cost of storing snapshots in Amazon S3

• Archiving possibilities beyond tape using Amazon Glacier

Backup and restore

• Create instances from AMIs

• Restore data from backups

Backup and restore: many ways to back up
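One of those ways, for data that already lives on EBS volumes, is to take a snapshot and copy it to the DR region. The sketch below illustrates that option only. The function names, descriptions, and `DRY_RUN` switch are assumptions, and the session does not state which backup method the presenters used.

```shell
# Sketch: snapshot an EBS volume, then copy the snapshot to the DR region.
# Names are illustrative assumptions.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# snapshot_volume REGION VOLUME_ID
snapshot_volume() {
  local region="$1"
  local volume_id="$2"
  run aws --region "$region" ec2 create-snapshot \
      --volume-id "$volume_id" \
      --description "DR backup of $volume_id"
}

# copy_snapshot_to_dr SOURCE_REGION DR_REGION SNAPSHOT_ID
# The copy is requested in the destination region and pulls from the source.
copy_snapshot_to_dr() {
  local source_region="$1"
  local dr_region="$2"
  local snapshot_id="$3"
  run aws --region "$dr_region" ec2 copy-snapshot \
      --source-region "$source_region" \
      --source-snapshot-id "$snapshot_id" \
      --description "DR copy of $snapshot_id"
}

# Example:
#   DRY_RUN=1 snapshot_volume us-east-1 vol-0123456789abcdef0
#   DRY_RUN=1 copy_snapshot_to_dr us-east-1 us-west-1 snap-0123456789abcdef0
```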


Customer case study

We are sincerely eager to hear your feedback on this presentation and on re:Invent. Please fill out an evaluation form when you have a chance.
