MICROSOFT CONFIDENTIAL – INTERNA
FailSafe IaaS
Ulrich Homann, Chief Architect, WW Services
Marc Mercuri, Sr. Director, MCS, Applied Incubation
Presented 2013
Today's Platform
• Each layer “early bound” to layer below
• Must provision entire stack for each layer instance
• Difficult to balance isolation and utilization/efficiency
[Diagram: stack built bottom-up in five steps, then serving Requests]
1. Purchase
2. Install (OS)
3. Install (Role)
4. Deploy (App)
5. Configure (Context)
Today's Platform
• Virtualization breaks the tight coupling between hardware & software
• Software stack is still mostly statically bound though…
[Diagram: two OS / Role / App / Context stacks on a shared Virtualization layer]
“Fabric Based” Computing Platform
[Diagram: many OS / Role instances running on a shared Infrastructure Fabric]
• Base infrastructure serves multiple workloads / roles
• Infrastructure is managed as one resource
• Provisioned to aggregate need rather than per project
Hardware becomes fungible
What are the “9”s
• 90% ("one nine")
• 99% ("two nines")
• 99.9% ("three nines")
• 99.99% ("four nines")
• 99.999% ("five nines")
• 99.9999% ("six nines")
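To make the percentages concrete, the following minimal sketch converts each availability level into the downtime it permits per year. It assumes a 365-day year and continuous measurement; the function and constant names are illustrative and not taken from any SLA document. Three nines, for example, works out to roughly 8.76 hours per year.

```python
# Sketch: allowed downtime per year for each availability level.
# Assumption: a 365-day year (525,600 minutes), measured continuously.
MINUTES_PER_YEAR = 365 * 24 * 60

# The availability levels listed on the slide, in percent.
NINES = [90.0, 99.0, 99.9, 99.99, 99.999, 99.9999]


def downtime_minutes_per_year(availability_percent):
    """Return the minutes of downtime per year allowed at this availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for pct in NINES:
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% -> {minutes:,.2f} minutes/year ({minutes / 60:,.2f} hours)")
```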
The Truth About 9s
SLA Constraints and Throttling
SendGrid
Not every service has an SLA
TODO: IaaS is a Bridge Slide
Define Lifecycle Model
[Diagram: Workload 1 and Workload 2 shown against Scale, Resources and Demands; Unit of Scale per workload; lifecycle phases Bottom, Ramp, Peak]
Deployment Redundancy
Auto-Scaling Compute in Windows Azure
Fault and upgrade domains
Fault Domains
• Failed component can’t take down service
• Isolated infrastructure
• Physical hosts, racks
• Network equipment
• Two by default
• Role instances across 2+ fault domains
Upgrade Domains
• VM rolling upgrades, no availability impact
• Logical grouping of role instances
• Five by default
• Role instances spread over upgrade domains
• Deployment upgraded for all or one at a time
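The effect of spreading instances over fault and upgrade domains can be sketched as below. The defaults (two fault domains, five upgrade domains) come from the slide; the round-robin placement and all function names are assumptions made for illustration, since the deck does not describe how the fabric actually allocates instances.

```python
from collections import defaultdict


def spread_instances(instance_count, fault_domains=2, upgrade_domains=5):
    """Assign each role instance a fault domain and an upgrade domain.

    Assumption: simple round-robin placement, used here only to illustrate
    the idea; the real allocation policy is not described in the slides.
    """
    return [
        {
            "instance": i,
            "fault_domain": i % fault_domains,
            "upgrade_domain": i % upgrade_domains,
        }
        for i in range(instance_count)
    ]


def survivors_after_rack_failure(placement, failed_fault_domain):
    """Instances still running when one fault domain (rack) fails."""
    return [p["instance"] for p in placement
            if p["fault_domain"] != failed_fault_domain]


def rolling_upgrade_batches(placement):
    """Group instances by upgrade domain; one batch is upgraded at a time."""
    batches = defaultdict(list)
    for p in placement:
        batches[p["upgrade_domain"]].append(p["instance"])
    return [batches[ud] for ud in sorted(batches)]


placement = spread_instances(10)
print(survivors_after_rack_failure(placement, failed_fault_domain=0))
print(rolling_upgrade_batches(placement))
```

With at least two instances, losing a single fault domain always leaves some instances running, and a rolling upgrade takes down only one batch at a time.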
[Diagram: two fault domains (racks). One holds IIS1 and SQL1, the other IIS2 and SQL2. IIS1 and IIS2 form the Web Availability Set; SQL1 and SQL2 form the SQL Availability Set.]
Make VMs Resilient to Failures with Availability Sets
• Get SLA by deploying multiple instances in availability sets
• Ensure availability during updates & maintenance
• Continue to architect availability into the application
Custom Health Probes
[Diagram: two load balancer (LB) setups, each in front of two VMs. In one, the LB probes Your Application on each VM; in the other, the Azure Agent on each VM reports the Role Status of the Customer Application.]
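A custom health probe is typically an endpoint inside the application that the load balancer polls. The sketch below shows one possible shape using only the Python standard library. The `/health` path, port 8080, the check names, and the assumption that the probe treats HTTP 200 as healthy and any other status as unhealthy are all illustrative and not taken from the deck.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_status(checks):
    """Run each named check; return (HTTP status, body) for the probe."""
    failed = [name for name, check in checks.items() if not check()]
    if failed:
        return 503, "unhealthy: " + ", ".join(failed)
    return 200, "ok"


# Hypothetical dependency checks; replace with real application checks.
CHECKS = {
    "database": lambda: True,
    "cache": lambda: True,
}


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the assumed probe path is served.
        if self.path != "/health":
            self.send_error(404)
            return
        code, body = health_status(CHECKS)
        payload = body.encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the probe endpoint (blocks until interrupted)."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Calling `serve()` starts the endpoint. Reporting application-level health lets the load balancer stop routing to a VM whose role is running but whose application cannot serve requests.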
Understand Geo-Replication
VM Disks: Built on Windows Azure Storage
Windows Azure Storage asynchronous geo-replication
[Diagram: WEST DC and EAST DC, > 400 miles apart]
Hybrid solutions in Windows Azure
[Diagram: CLOUD and ENTERPRISE connected by the options below]
• Secure Site-to-Site Network Connectivity: Windows Azure Virtual Network
• Data Synchronization: Multiple Options
• Application-Layer Connectivity & Messaging: Service Bus
• Secure Machine-to-Machine Network Connectivity: Windows Azure Connect
• Secure Point-to-Site Network Connectivity: Windows Azure Virtual Network
StorSimple: Extend your storage to Azure
Backup, Restore & DR with StorSimple: Automated, Optimized, Reliable
[Diagram: Primary Volume, Snapshots, Cloud Snapshots]
Cloud Snapshots
• Backup copy of data volume created in cloud
• Changes to local volume automatically transferred
• Cloud snapshots mountable for restore
Benefits
• Backup now as easy as snapshots
• Fast restores from off-site backups
• Integrated, easy to test disaster recovery
• Eliminates tape
Backup, Restore & DR Today: Inefficient, Complex, Laborious, and Risky
[Diagram: Primary Volume, Snapshot, Virtual Tape/Replication, Physical Tape, Offsite Tape Storage]
…Enables Seamless Scalability and Rapid Recovery
• Connect Many Servers to Cloud Storage and Scale Data Sets with StorSimple Solution
• Rapidly Recover to Any Data Center, Location-Independent, via Mounting the Cloud
[Diagram: Enterprise Data Center 1 and Enterprise Data Center 2, each with Production Data, sharing Cloud Snapshots]
Windows Azure Backup
• Back up datacenter data to Windows Azure using System Center Data Protection Manager
• Back up and recover files/folders from Windows Server 2012
Benefits
• Reliable offsite data protection
• Simple, familiar, integrated
• Efficient backup and recovery
• Easy set up
[Diagram: Your On-Premises Datacenter, System Center Data Protection Manager, Windows Azure Backup]
• Windows Server 2012
• Windows Server 2012 Essentials
• Windows Server 2008 R2 (SP1)
• System Center 2012 DPM SP1
SQL Server 2012 on IaaS: High Availability
High availability within regions using SQL Availability Groups
SQL Server 2012 on IaaS: High Availability
High availability within regions using database mirroring
Two approaches:
• Use Domain Controller
• Use Certificates
Domain Controller Approach
Certificate-Based Approach
SQL Server 2012 on IaaS: High Availability
High availability across regions using database mirroring and log shipping
Domain Controller Approach
Certificate-Based Approach
SQL Server 2012 on IaaS: Disaster Recovery
High availability and Disaster Recovery with Availability Groups across on-prem and cloud
Microsoft SQL Server backup and restore to the cloud
Backup and restore database to the cloud
• Direct URL backup to Azure Storage
• Restore in Azure Virtual Machine
Benefits
• Reliable off-site data backup for SQL images
• Easily restore databases using VMs
[Diagram: SQL Server Management Studio]
Failure Points - Virtualized DCs
Background
• Common virtualization operations such as backing up/restoring VMs/VHDs can roll back the state of a virtual DC
• With Active Directory, this can introduce USN bubbles leading to permanently divergent state, causing:
  • lingering objects
  • inconsistent passwords
  • inconsistent attribute values
  • schema mismatches if the Schema FSMO is rolled back
• The potential also exists for security principals to be created with duplicate SIDs
How Domain Controllers are Impacted
Timeline of events (DC1 and DC2)
TIME: T1
• Create Snapshot
• DC1: USN: 100, ID: A, RID Pool: 500 - 1000
TIME: T2
• +100 users added
• DC1: USN: 200, ID: A, RID Pool: 600 - 1000
• DC2 receives updates: USNs >100
• DC2: DC1(A)@USN = 200
TIME: T3
• T1 Snapshot Applied!
• DC1: USN: 100, ID: A, RID Pool: 500 - 1000
TIME: T4
• +150 more users created
• DC1: USN: 250, ID: A, RID Pool: 650 - 1000
• DC2 receives updates: USNs >200
• DC2: DC1(A)@USN = 250
Result
• USN rollback NOT detected: only 50 users converge across the two DCs
• All others are either on one or the other DC
• 100 security principals (users in this example) with RIDs 500-599 have conflicting SIDs
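The timeline can be replayed with a small simulation to show where the numbers come from. This is a deliberately simplified sketch and not a model of Active Directory replication: the class, method, and user names are hypothetical, the RID pool is reduced to a "next RID" counter, and DC2 is represented only by the objects it has received and its high-watermark for DC1.

```python
from dataclasses import dataclass, field


@dataclass
class DomainController:
    usn: int
    next_rid: int
    objects: dict = field(default_factory=dict)  # name -> (usn, rid)

    def create_users(self, prefix, count):
        # Each new user consumes one USN and one RID.
        for i in range(count):
            self.usn += 1
            self.objects[f"{prefix}{i}"] = (self.usn, self.next_rid)
            self.next_rid += 1

    def snapshot(self):
        return (self.usn, self.next_rid, dict(self.objects))

    def restore(self, snap):
        self.usn, self.next_rid, objects = snap
        self.objects = dict(objects)

    def changes_after(self, watermark):
        # What a partner pulls: only objects stamped above its watermark.
        return {n: v for n, v in self.objects.items() if v[0] > watermark}


def simulate():
    dc1 = DomainController(usn=100, next_rid=500)
    dc2_objects, dc2_watermark = {}, 100

    snap = dc1.snapshot()                                  # T1: snapshot
    dc1.create_users("early", 100)                         # T2: USN 200
    dc2_objects.update(dc1.changes_after(dc2_watermark))
    dc2_watermark = dc1.usn                                # DC1(A)@USN = 200
    dc1.restore(snap)                                      # T3: back to USN 100
    dc1.create_users("late", 150)                          # T4: USN 250
    dc2_objects.update(dc1.changes_after(dc2_watermark))   # only USNs > 200
    dc2_watermark = dc1.usn                                # DC1(A)@USN = 250

    converged = set(dc1.objects) & set(dc2_objects)
    rid_owner_on_dc1 = {rid: name for name, (_, rid) in dc1.objects.items()}
    conflicting_rids = sorted(
        rid for name, (_, rid) in dc2_objects.items()
        if rid in rid_owner_on_dc1 and rid_owner_on_dc1[rid] != name
    )
    return len(converged), conflicting_rids


converged, conflicting_rids = simulate()
print(converged, len(conflicting_rids), conflicting_rids[0], conflicting_rids[-1])
```

Because DC2 only asks for USNs above 200 after the restore, it never sees the 100 users DC1 re-created with USNs 101-200, and those users reuse RIDs already handed out before the rollback.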
PowerShell for Automation and Advanced Management
Everything has an API: automate, automate, automate
Automation
Query, manage and configure at scale:
• Virtual machines
• Storage across multiple subscriptions and storage accounts
• Tiered deployment workflows
Virtual Machines
• Configure storage and networking
• Domain join to AD DS on-premises
• Bring your own machine images or disks
• Use remote PowerShell
Virtual Network
• Configure virtual network
• Manage configuration and gateway
• Connect to on-premises networks
Storage
• Upload and download VHDs from storage accounts to on-premises
• Copy VHDs between storage accounts and subscriptions
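A tiered deployment workflow boils down to deploying each tier only after the tiers it depends on. The sketch below illustrates that ordering in Python rather than PowerShell, using the standard-library `graphlib` module (Python 3.9+). The tier names, the dependency map, and the `deploy_tier` callback are hypothetical; a real workflow would call the platform's management API at that point.

```python
from graphlib import TopologicalSorter

# Hypothetical tiers: each key lists the tiers it depends on.
TIERS = {
    "web": {"app"},
    "app": {"sql"},
    "sql": {"network"},
    "network": set(),
}


def deployment_order(depends_on):
    """Return tiers ordered so every tier follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def deploy(depends_on, deploy_tier):
    """Deploy tiers in dependency order using the supplied callback."""
    deployed = []
    for tier in deployment_order(depends_on):
        deploy_tier(tier)  # placeholder for the real provisioning call
        deployed.append(tier)
    return deployed


deploy(TIERS, lambda tier: print(f"deploying {tier} tier"))
```

Expressing the dependencies as data keeps the workflow repeatable, and a cycle in the map raises `graphlib.CycleError` before anything is deployed.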
© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.