The Columbus Dispatch on Amazon

Dec 17, 2015

Dinah Davidson
Transcript
Page 1: The Columbus Dispatch on Amazon. The Presenters David Landreman Web Services IT Manager Email: dlandreman@dispatch.com Twitter: @GraphIt2000 LinkedIn:

The Columbus Dispatch on Amazon

The Presenters

David Landreman

Web Services IT Manager

Email: dlandreman@dispatch.com

Twitter: @GraphIt2000

LinkedIn: www.linkedin.com/in/davidlandreman

Andrew Roth

Senior Internet Development Engineer

Email: [email protected]

LinkedIn: www.linkedin.com/in/rothandrew

• Newspapers, weekly periodicals, TV stations, and radio stations

• 23 unique websites hosted on Amazon in a unified content management system (OpenCMS)

• Millions of pageviews daily

Cloud Migration

• Project to upgrade content management system to new version

• Original plan was to migrate from physical hardware in a co-location data center to VMware virtual machines at the same data center

• Two months before completion, the decision was made to migrate to Amazon Web Services (AWS) instead

Amazon Selection Factors

• Team familiarity with AWS

• Ease of beginning an engagement
  o No contract

• Cost
  o Limited payment options

• Large client base (Netflix, Instagram, …)

• Large selection of services beyond virtual computing resources

Cloud Computing Paradigms

• Infrastructure as a Service (IaaS)
  o Virtualized computing hardware

• Platform as a Service (PaaS)
  o Prepackaged / managed runtime application platform. Reduced complexity when compared to IaaS

• Software as a Service (SaaS)
  o Full service software solution running in the cloud.

* Amazon has offerings in all these areas *

The Cloud Model

Amazon Offerings

• Infrastructure as a Service
  o EC2, Elastic Load Balancers

• Platform as a Service
  o Elastic Beanstalk, Elastic MapReduce, CloudFormation, Relational Database Service, SimpleDB, DynamoDB

• Software as a Service
  o Flexible Payments Service, Mechanical Turk

Scalability on Amazon

• RDS and EC2 allow for easy scaling up / down of server sizes

• Elastic Beanstalk-compatible applications can make use of autoscaling of EC2 instances (EC2 as a PaaS)

• Spot Instances for non-time-sensitive tasks

Reliability on Amazon

• Regions
  o Consist of one or more Availability Zones, are geographically dispersed, and will be in separate geographic areas or countries. Currently there are 8 Regions.

• Availability Zones
  o Distinct locations engineered to be insulated from failures in other Availability Zones and provide inexpensive, low latency network connectivity to other Availability Zones in the same Region.

• Protect application against failures in a single location by launching instances in different Regions and Availability Zones.
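
The idea of spreading instances across Availability Zones can be sketched in a few lines of Python. This is only an illustration, not the presenters' tooling: the zone names, instance names, and both helper functions are hypothetical, and nothing here calls AWS.

```python
from itertools import cycle

def spread_across_zones(instance_names, zones):
    """Assign each instance to a zone in round-robin order (hypothetical helper)."""
    if not zones:
        raise ValueError("at least one Availability Zone is required")
    zone_cycle = cycle(zones)
    return {name: next(zone_cycle) for name in instance_names}

def survivors(placement, failed_zone):
    """Return the instances that are still running if one zone goes offline."""
    return [name for name, zone in placement.items() if zone != failed_zone]

# Illustrative names only; these are assumptions, not values from the slides.
ZONES = ["us-east-1a", "us-east-1b", "us-east-1c"]
placement = spread_across_zones(["web-1", "web-2", "web-3", "web-4"], ZONES)
remaining = survivors(placement, "us-east-1a")
```

With four instances over three zones, losing any single zone still leaves at least two instances available to serve traffic.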

Security on Amazon

• Can it be secure?
  o Security Groups
  o IAM - User / Role Management

• Can it be PCI compliant?
  o PCI DSS Level 1 for most services (RDS, S3, …)

• Can it be HIPAA compliant?
  o Security Groups
  o In flight / at rest encryption
  o Case Study: MedCommons patient records

• Government Compliance
  o GovCloud - Requires pre-approval from AWS to start infrastructure in this cloud.
  o Only United States

• Virtual Private Cloud (VPC) vs Classic Cloud

How Dispatch Uses Amazon

• All 23 public facing sites are hosted entirely in Amazon

• Limited communication back to internal infrastructure via Web Services

• Multiple environments (Prod, QA, Development)
  o Ease of keeping QA as a mirror of production

CloudWatch

• Default metrics for all EC2 Instances
  o e.g. CPU, memory, disk IO
  o 1 minute measurement interval

• Tracking custom metrics
  o OpenCMS Publish Times, Java Heap Usage, Database Connections

• Alert Thresholds
  o Text message, email, JSON posts
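
A custom metric such as an OpenCMS publish time is essentially a named, timestamped number with a unit. The sketch below builds such a data point and checks it against an alert threshold. The field names loosely follow the shape of a CloudWatch metric datum, but the metric name, dimension, and threshold are assumptions for illustration, and the code deliberately stops short of sending anything to AWS.

```python
from datetime import datetime, timezone

def build_metric_datum(name, value, unit, dimensions=None, timestamp=None):
    """Build one custom-metric data point as a plain dict (hypothetical helper)."""
    datum = {
        "MetricName": name,
        "Value": float(value),
        "Unit": unit,
        "Timestamp": timestamp or datetime.now(timezone.utc),
    }
    if dimensions:
        # Dimensions are stored as a list of name/value pairs, sorted for stable output.
        datum["Dimensions"] = [
            {"Name": key, "Value": val} for key, val in sorted(dimensions.items())
        ]
    return datum

def breaches_threshold(datum, threshold):
    """Return True when the data point is above the alert threshold."""
    return datum["Value"] > threshold

# Assumed example: a publish that took 1250 ms in a QA environment.
publish_time = build_metric_datum(
    "OpenCmsPublishTime", 1250, "Milliseconds", {"Environment": "QA"}
)
should_alert = breaches_threshold(publish_time, threshold=1000)
```

In a real deployment the resulting dict would be handed to an AWS SDK call, and the alerting would be configured in CloudWatch itself rather than computed in application code.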

SimpleDB - NoSQL

• Used for tracking user content access for metered site access
  o 60 million+ records and growing

• Managed NoSQL Solution
  o NoSQL = Non-relational data store

• Extremely simple and flexible data model (e.g. key-value store)

Example record:

Unique User ID: ClcGrU8Eapig2y5eD7r2Ag==

Year/Month of Access: 201202

Site Content Accessed: www.dispatch.com/content/index.html
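
The record layout on this slide (user ID, year/month, content URL) lends itself to a simple key-value sketch. The code below uses an in-memory dict as a stand-in for SimpleDB and assumes the three fields are concatenated into one item name, as the example record suggests. The class, the function names, and the free-page limit are hypothetical.

```python
from collections import defaultdict

def make_item_name(user_id, year_month, url):
    """Concatenate the three fields into one key (an assumption about the layout)."""
    return f"{user_id}{year_month}{url}"

class MeterStore:
    """In-memory stand-in for a key-value store that tracks metered access."""

    def __init__(self):
        self._items = {}
        self._by_user_month = defaultdict(set)

    def record_access(self, user_id, year_month, url):
        # Writing the same key twice is idempotent, so repeat views are not double counted.
        name = make_item_name(user_id, year_month, url)
        self._items[name] = {"user_id": user_id, "year_month": year_month, "url": url}
        self._by_user_month[(user_id, year_month)].add(url)
        return name

    def distinct_pages(self, user_id, year_month):
        return len(self._by_user_month.get((user_id, year_month), ()))

def over_limit(store, user_id, year_month, free_limit):
    """True once the user has used up the free page views for the month."""
    return store.distinct_pages(user_id, year_month) >= free_limit

store = MeterStore()
key = store.record_access(
    "ClcGrU8Eapig2y5eD7r2Ag==", "201202", "www.dispatch.com/content/index.html"
)
```

Keying on the full combination means each user/month/page triple is stored once, which is what makes a monthly page count cheap to compute.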

SimpleDB – NoSQL (cont.)

• High availability and scalability
  o Amazon manages multiple geographically distributed replicas of your data
  o SimpleDB is eventually consistent

• Weaknesses
  o Complex pricing structure / hard to estimate
  o No good mechanism for backups

• Decision to use SimpleDB came before launch of DynamoDB

Relational Database Service (RDS)

• Hosts all web application data
  o Exception is user access data in SimpleDB

• Managed Relational Database
  o MySQL, Oracle, MSSQL
  o We use MySQL

• Change to traditional database administration
  o No access to server console
  o No access to a true SA account

Relational Database Service (cont.)

• Multi Availability Zone deployment
  o Failover is not instantaneous (1 - 2 minutes)

• Easy server upgrades with maintenance windows

• Support for read replica databases

• Automatically generated snapshots for restores
  o Up to the past 10 days
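
Because a Multi-AZ failover takes a minute or two, an application generally needs to retry its database connection rather than give up on the first error. The following is a minimal sketch of retry with exponential backoff. The function name, the default values, and the use of `ConnectionError` as the failure signal are assumptions; a real database driver would raise its own exception types.

```python
import time

def connect_with_retry(connect, attempts=8, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the failure
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

With these defaults the waits are 1, 2, 4, 8, 16, 30, and 30 seconds, about a minute and a half in total, which is in the same range as the failover window quoted above. The `sleep` parameter exists only so the waiting can be replaced in tests.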

Simple Queue Service (SQS)

• Message passing service to facilitate indirect communication

• Similar to Java Message Service (JMS)

• Why use SQS?

• We run in Amazon's classic cloud configuration
  o IP Addresses randomly assigned

• Ideal architecture on AWS reduces direct communication between boxes

• Used for application servers communicating with video transcoding server

• Starting new transcoding jobs

• Getting transcoding progress updates

• Scalability
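
The queue-based decoupling described on this slide can be illustrated without AWS at all. In the sketch below, Python's standard `queue.Queue` stands in for SQS: one queue carries "start job" messages to the transcoder and another carries progress updates back. The message fields, function names, and progress values are hypothetical, chosen only to show that neither side needs to know the other's address.

```python
import json
import queue

def submit_job(job_queue, video_id, source_key):
    """Application server side: enqueue a request to start transcoding."""
    body = json.dumps({"type": "start", "video_id": video_id, "source": source_key})
    job_queue.put(body)

def run_transcoder_once(job_queue, progress_queue):
    """Transcoder side: take one job if available and report progress messages."""
    try:
        body = job_queue.get_nowait()
    except queue.Empty:
        return False
    job = json.loads(body)
    for percent in (0, 50, 100):  # placeholder for real transcoding work
        progress_queue.put(
            json.dumps({"type": "progress", "video_id": job["video_id"], "percent": percent})
        )
    return True

def drain(q):
    """Application server side: read every pending message from a queue."""
    messages = []
    while True:
        try:
            messages.append(json.loads(q.get_nowait()))
        except queue.Empty:
            return messages
```

Because the two sides only share queue names, additional transcoding workers could read from the same job queue without any change to the application servers.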

ElastiCache

• Managed Memcached Service

• Web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud.

• Integration of ElastiCache metrics with CloudWatch

• Hit rates, Eviction Rates etc.

• Reduces hits to our database servers, speeds up page loads, and increases our ability to handle high traffic volumes
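
One common way a cache reduces database load is the cache-aside pattern: check the cache first and query the database only on a miss. The sketch below uses a plain dict in place of Memcached and tracks hits and misses, similar in spirit to the hit-rate metric mentioned above. The class name and loader function are hypothetical, and the slides do not say which caching pattern the presenters actually use.

```python
class CacheAside:
    """Read-through helper: look in the cache first, fall back to the database."""

    def __init__(self, load_from_db, cache=None):
        self._load = load_from_db
        self._cache = {} if cache is None else cache  # dict stands in for Memcached
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._load(key)   # only a miss reaches the database
        self._cache[key] = value
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A real cache would also need expiry and invalidation when content is republished; both are left out here to keep the sketch short.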

Simple Email Service (SES)

• Managed service for bulk email sending

• CloudWatch Integration
  o Monitor deliveries, bounces etc. from within AWS console

• Multiple endpoints
  o SMTP
  o Web Service Calls

Simple Storage Service (S3)

• First Amazon Web Service Offering

• File storage in the cloud

• Can serve static websites directly from S3

• We use it for various processes:
  o Backing up a SQL dump of our production database
  o Temporary storage of video content during transcoding process
  o Data storage for workflows involving importing content from other areas of the business into their websites.

June 29th Weekend

• Wind Storm
  o US East Region (1 of 4 zones offline)
  o Both primary and backup power lost
  o Network connectivity issues between availability zones

• RDS fail-over to a bad zone

• Leap Second Bug

Wind Storm - Lessons Learned

• Single point of failure on ElastiCache
  o All nodes in a cluster occupy a single AWS Zone

• Database failover
  o Up to a minute of no connectivity

• Limited communication from Amazon

• No time estimates for service restoration

Leap Second – Lessons Learned

• Not an Amazon specific issue

• Amazon support options
  o Open a ticket
  o Phone Calls
  o Online Chat
  o Forums (Only free option)

• Amazon able to diagnose issue quickly

23:59:60 UTC, Saturday, June 30, 2012

What is Next?

• Moving to CloudSearch

• Reduce Total Cost of Ownership

• Moving from Limelight to CloudFront CDN
  o Significant cost savings

Questions?

David Landreman

Web Services IT Manager

Email: dlandreman@dispatch.com

Twitter: @GraphIt2000

LinkedIn: www.linkedin.com/in/davidlandreman

Andrew Roth

Senior Internet Development Engineer

Email: [email protected]

LinkedIn: www.linkedin.com/in/rothandrew