Cloudpocalypse We put “fail” in failover Vlad Mazek, MCSE CEO, Own Web Now Corp [email protected] facebook.com/vladmmd @vladmazek Cell: (407) 536-VLAD

Cloudpocalypse We put “fail” in failover

Feb 24, 2016

Transcript
Page 1: Cloudpocalypse We put “fail” in failover

Cloudpocalypse
We put “fail” in failover

Vlad Mazek, MCSE
CEO, Own Web Now Corp

[email protected]
facebook.com/vladmmd

@vladmazek
Cell: (407) 536-VLAD

Page 2: Cloudpocalypse We put “fail” in failover

Agenda

• Summary of events
• What to tell your clients about the outage
• Our current network design
• What failed?
• What we are doing to address it

Page 3: Cloudpocalypse We put “fail” in failover

Power Infrastructure

Page 4: Cloudpocalypse We put “fail” in failover

So what failed?

ATS: Automatic Transfer Switch

An electrical switch that transfers the load from its primary power source to a standby source.
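To make the role of the ATS concrete, here is a minimal sketch of the transfer decision in Python. It is a toy model only: the class, field names, and the "stuck switch" failure mode are illustrative assumptions, not a description of the actual equipment or of how it failed in the data center.

```python
from dataclasses import dataclass


@dataclass
class TransferSwitch:
    """Toy model of an automatic transfer switch (all names are hypothetical)."""
    primary_ok: bool = True     # utility feed is healthy
    standby_ok: bool = True     # standby source (e.g. generator) is healthy
    switch_ok: bool = True      # False models a failed ATS (assumed failure mode)
    active: str = "primary"     # source the load is currently connected to

    def update(self) -> str:
        """Return the source feeding the load, or "none" if the load is dark."""
        if not self.switch_ok:
            # Assumption: a failed switch cannot transfer, so the load stays
            # on its current source even if that source is dead.
            source_ok = self.primary_ok if self.active == "primary" else self.standby_ok
            return self.active if source_ok else "none"
        if self.primary_ok:
            self.active = "primary"
        elif self.standby_ok:
            self.active = "standby"
        else:
            return "none"
        return self.active


ats = TransferSwitch()
ats.primary_ok = False
print(ats.update())  # a healthy switch moves the load to standby

stuck = TransferSwitch(primary_ok=False, switch_ok=False)
print(stuck.update())  # a failed switch leaves the load without power
```

The point of the sketch is that a healthy standby source does not help if the switch itself is the component that fails.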

Page 5: Cloudpocalypse We put “fail” in failover

Summary of Events

• 12:04 Power failure
• 1:34 ATS replacement advised by DC
• 2:00 Partial power restored
• 4:10 First ETA issued, 6:30 PM
• 4:30 Emergency systems start coming online
• 4:46 DC offers additional details on the problem
• 5:10 Restored Exchange 2010 clusters
• 7:10 DC restores power
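For a rough sense of scale, the elapsed time from the initial power failure to each milestone can be worked out as below. This is a minimal sketch that assumes every listed time falls on the same afternoon and evening; the slide gives AM/PM only for the 6:30 PM ETA, so the PM reading and the function names are assumptions.

```python
START = "12:04"  # power failure, taken from the timeline above


def to_minutes(clock: str) -> int:
    """Convert an "H:MM" clock time to minutes since midnight, assuming PM."""
    hours, minutes = map(int, clock.split(":"))
    return (hours % 12 + 12) * 60 + minutes


def elapsed_minutes(clock: str, start: str = START) -> int:
    """Minutes between the start of the outage and the given clock time."""
    return to_minutes(clock) - to_minutes(start)


for clock, event in [
    ("2:00", "Partial power restored"),
    ("5:10", "Restored Exchange 2010 clusters"),
    ("7:10", "DC restores power"),
]:
    total = elapsed_minutes(clock)
    print(f"{event}: {total // 60}h {total % 60:02d}m after the failure")
```

Under that assumption, full power returned a little over seven hours after the failure.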

Page 6: Cloudpocalypse We put “fail” in failover

How this really felt

Page 7: Cloudpocalypse We put “fail” in failover

How this really felt

Page 8: Cloudpocalypse We put “fail” in failover

How this really felt

Page 9: Cloudpocalypse We put “fail” in failover

How this really felt

Page 10: Cloudpocalypse We put “fail” in failover

How this really felt

Page 11: Cloudpocalypse We put “fail” in failover

Impact

• This is the first major issue with the Dallas DC in over a decade

• We moved our critical systems to Dallas from California and Florida because of weather and power issues at those sites

• This has adjusted our roadmap for service delivery

Page 12: Cloudpocalypse We put “fail” in failover

Agenda

• Extend LiveArchive to a second DC
• Extend Exchange 2010 hosting to additional data centers
• Improve our communications across partner networks
  – Facebook: ExchangeDefender
  – Twitter: @xdnoc @ExchangDefender

Page 13: Cloudpocalypse We put “fail” in failover

What can I tell my clients?

• Power issues happen.
• There will be a partial refund.
• There is no additional support cost.
• The company is going to improve the solution.
• The uptime record thus far has been impressive.
• Complex systems lead to complex problems and aren’t you glad you don’t have to worry about it?

Page 14: Cloudpocalypse We put “fail” in failover

What next?

• Look for an email from me in the morning.
• Advise customers about LiveArchive.
• Stay tuned for network enhancements.
• Keep the issue in perspective: this isn’t Microsoft’s fault or general negligence/incompetence; it’s a massive failure.

Page 15: Cloudpocalypse We put “fail” in failover

Something funny…

You know why I don’t trust the cloud? It’s still powered by guys whose butt cracks show when they squat to fix an electrical issue.