Page 1
Webinar Audio Options
• Audio will remain quiet until we begin at the top of the hour
• Streaming Audio
  • Appears automatically in pop-up window
  • Or click Communicate : Join Audio Broadcast
  • Remember to unmute your computer
• No Streaming Audio?
  • Request phone access
• Technical Support
  • US & Canada 866.229.3239
  • International Support 408.435.7088
Thank you for joining! The webinar will begin shortly.
Page 2
Housekeeping
• Slides and recording will be posted in the next 48 hours
• Submit questions via the Q&A Tab in WebEx; we’ll answer as many as we can
• Try it now: tell us where you are joining from
• Hashtags: #acquia #drupal
http://acquia.com/resources/recorded_webinars
Page 3
Upcoming Webinars
• Building a Common Drupal Platform for Your Organization Using Drupal 7
• Accessible Theming in Drupal
• Integrating a CDN with Acquia Cloud
• Ensuring Success When Migrating Your Content to Drupal
• OpenPublic & Drupal: Taking the Guesswork Out of Open Source For Government
• Community Box 2.0: Multilingual Communities with Commons
http://acquia.com/resources/webinars
Page 4
Acquia is Hiring
• Do you love working with Drupal?
• Acquia is hiring in North America, Europe, and Australia!
• Engineering / DevOps
• Design
• Support
• Operations
• Client Advisors
• Sales and Marketing
http://acquia.com/careers
Page 5
Constructing a Fault-Tolerant, Highly Available Cloud Infrastructure for Your Drupal Site
Andrew Kenney, VP of Platform Engineering
Jess Iandiorio, Sr. Director, Cloud Product Marketing
December 12, 2012
Page 6
Creating killer websites is hard …
Page 7
Hosting them shouldn’t be.
Page 8
For business-critical sites, how do you avoid a crisis?
Page 9
Agenda
• Drupal Hosting Challenges
• Cloud Failure Scenarios
• HA & Resiliency
• Resource Challenges
• Designing for Failure
• Architecting & Automating failover
• Testing Failure
Page 10
Drupal Hosting Challenges
• Drupal expects a POSIX filesystem
• Drupal is not optimized for high-latency MySQL operations
• Drupal is not built with partition tolerance in mind
• Shortage of talent or expertise for operating Drupal in the Cloud or at scale
Page 11
Cloud Failure Scenarios
• Machine loss
• Service outage
• Network disruption
• Inaccessible/unreliable storage system
• Traffic spike
• Control Plane failure
• Corrupt/Partial Backups
Page 12
High Availability & Resiliency
• Plan for Failure
• Automate deployment & configurations
• Eliminate SPOFs
• Two (at least) of everything
• Monitor everything
• Monitor the monitors
• Back up all data
• Periodically test all backups
• Test emergency procedures
• Never assume any procedure works unless it’s periodically tested
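The "periodically test all backups" point can be made concrete with a small check. The following is a minimal sketch, assuming backups are gzip-compressed dumps with a SHA-256 checksum recorded at backup time; the function names and file layout are hypothetical and not taken from the webinar. It only catches corrupt or partial archives, so a real test would also restore the dump into a scratch database.

```python
import gzip
import hashlib
import zlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash the file in 1 MiB chunks so large backups do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path: Path, expected_sha256: str) -> bool:
    """Return True only if the archive exists, matches its recorded
    checksum, and decompresses all the way to the end."""
    if not path.is_file() or sha256_of(path) != expected_sha256:
        return False
    try:
        with gzip.open(path, "rb") as fh:
            while fh.read(1 << 20):
                pass  # reading to EOF exposes truncated or corrupt streams
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

Running a check like this on a schedule, and alerting when it returns False, is one way to avoid discovering a partial backup during an actual recovery.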
Page 13
Resource Challenges
• Cloud Hype – the cloud frees developers from needing operations staff to do their job
• Cloud Reality – the cloud introduces even more instability unless you plan for failure
Page 14
Designing for Failure
1. Multiple AZ hosting
Page 15
Designing for Failure
1. Multiple AZ hosting
2. Multiple region hosting
Page 16
Designing for Failure
1. Multiple AZ hosting
2. Multiple region hosting
3. Shared security model
Page 17
Designing for Failure
1. Multiple AZ hosting
2. Multiple region hosting
3. Shared security model
4. Monitoring
Infrastructure & Application Health: Acquia Operations Team
Security Scanning: Acquia Security Team
Page 18
Monitoring
[Diagram: web servers and monitoring ("mon") servers in US-West and US-East, with external monitoring from Rackspace and Pingdom]
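Monitoring from several independent vantage points is only useful if their results are combined sensibly. The following is a minimal sketch, assuming each monitor reports a simple up/down result plus the age of its last report; the `Report` type, the monitor names, and the 120-second freshness threshold are illustrative assumptions, not details from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    monitor: str      # hypothetical vantage-point name, e.g. "us-east"
    site_up: bool     # result of that monitor's most recent check
    age_seconds: int  # time since that monitor last reported


def stale_monitors(reports, max_age_seconds=120):
    """Monitor the monitors: list vantage points that stopped reporting."""
    return [r.monitor for r in reports if r.age_seconds > max_age_seconds]


def site_is_down(reports, max_age_seconds=120):
    """Declare an outage only when a majority of fresh monitors agree."""
    fresh = [r for r in reports if r.age_seconds <= max_age_seconds]
    if not fresh:
        return False  # no trustworthy data; alert on the monitors instead
    failures = sum(1 for r in fresh if not r.site_up)
    return failures * 2 > len(fresh)
```

Requiring a majority keeps a network problem at one vantage point from paging anyone, while the stale list surfaces a monitor that has itself failed.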
Page 19
Designing for Failure
1. Multiple AZ hosting
2. Multiple region hosting
3. Shared security model
4. Monitoring
5. Recovering from failure
Page 20
Failover in the Cloud
• Amazon Elastic Load Balancers (ELBs) allow for failover from one Availability Zone (AZ) to another
• Acquia load balancers allow for unhealthy web nodes in any given AZ to be removed from service
• DNS switch allows for failover or promotion of database servers
• Manual DNS switch allows for (one-way) failover of a site from one region to another
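The mechanisms above can be pictured with a toy model. The following is a minimal sketch, assuming a load balancer that keeps only health-checked web nodes in service and a DNS record that names the current database primary; every name here (`WebNode`, `in_service`, `promote_database`, the host names) is hypothetical and does not represent an Amazon or Acquia API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebNode:
    name: str
    zone: str      # Availability Zone the node runs in
    healthy: bool  # result of the load balancer's last health check


def in_service(nodes):
    """Remove unhealthy web nodes from rotation, whatever their zone."""
    return [n for n in nodes if n.healthy]


def serving_zones(nodes):
    """Zones still receiving traffic once unhealthy nodes are removed."""
    return sorted({n.zone for n in in_service(nodes)})


def promote_database(dns, record, primary_healthy, standby):
    """Return a copy of the DNS mapping; if the primary is unhealthy,
    repoint the record at the standby database server."""
    updated = dict(dns)
    if not primary_healthy:
        updated[record] = standby
    return updated
```

If every node in one zone fails its health check, `serving_zones` shrinks to the surviving zone, which is the AZ-to-AZ failover behavior described in the first bullet.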
Page 21
Testing failover
• Failover and failback should be a scriptable process that can be routinely handled by automated systems or by operations personnel
• Failover scenarios may be useful in events such as application deployment or database schema changes
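A scripted drill is one way to keep failover and failback routine. The following is a minimal sketch, assuming a toy site with one active and one standby location that can be swapped in either direction; the `Site` class and `run_drill` function are hypothetical, and a real drill would call actual infrastructure tooling and health checks instead of flipping in-memory state.

```python
class Site:
    """Toy model with one active and one standby location."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby
        self.down = set()  # locations currently simulated as failed

    def serving(self):
        return self.active not in self.down

    def failover(self):
        # Swap roles; calling this a second time is the failback.
        self.active, self.standby = self.standby, self.active


def run_drill(site):
    """Fail over, verify, fail back, verify; return a log of each step."""
    original = site.active
    log = []
    site.down.add(original)          # simulate losing the active location
    log.append(("outage", site.serving()))
    site.failover()
    log.append(("failover", site.serving()))
    site.down.discard(original)      # original location recovers
    site.failover()                  # failback to the original location
    log.append(("failback", site.serving() and site.active == original))
    return log
```

Because the drill returns a step-by-step log, an automated system can run it on a schedule and alert if any step after the simulated outage reports False.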
Page 22
Why not DIY?
• Your core competency is not HA
  • Let your precious engineering/IT ops staff focus on what’s key to your organization’s success
• Most organizations are not 24x7x365
  • The Internet doesn’t sleep and failure can strike at any time
• Don’t get stuck in the blame game
  • If your site goes down and you are called upon at an inconvenient time, you’ll be caught between the hosting provider or team and the Drupal application team
Page 23
Why Acquia?
• White glove service
• 24x7 operations
• Drupal expertise
  • Operations
  • Scalability
  • Performance
• HA Offerings
  • Multi-zone
  • Multi-region
Page 24
Dev Cloud
Acquia’s Continuous Integration Platform for Developers.
• Intuitive development workflow
• Power tools for power users
• Drupal-tuned hosting infrastructure
Page 25
Managed Cloud
Never let your best day become your worst.
• White-glove managed service for mission-critical Drupal websites
• Drupal-tuned hosting infrastructure
• HA, elastic resources with multi-region failover
Page 26
• For more information visit: http://www.acquia.com
• Contact us: [email protected] or 888.9.ACQUIA
• Follow us: @acquia
• Comments welcome:
• [email protected]
• [email protected]
Today’s webinar recording will be posted to: http://acquia.com/resources/recorded_webinars
Questions?