LECTURE 1: INTRODUCTION TO CLOUD COMPUTING
    Essential Characteristics
    Service Models
    Deployment Models
LECTURE 2: Introducing Windows Azure
    Azure Overview
    Is Your Application a Good Fit for Windows Azure?
    Understand the Benefits of Windows Azure
    Target Scenarios that Leverage the Strengths of Windows Azure
    Scenarios that Do Not Require the Capabilities of Windows Azure
    Evaluate Architecture and Development
    Summary
LECTURE 3: Main Components of Windows Azure
    Table of Contents
    The Components of Windows Azure
    Execution Models
    Data Management
    Networking
    Business Analytics
    Messaging
    Caching
    Identity
    High-Performance Computing
    Media
    Commerce
    SDKs
    Getting Started
Lecture 4: WINDOWS AZURE COMPUTE
    Web Sites vs Cloud Services vs Virtual Machines
    WINDOWS AZURE CLOUD SERVICES
    WEB ROLE AND WORKER ROLE
    THE THREE RULES OF THE WINDOWS AZURE PROGRAMMING MODEL
    A WINDOWS AZURE APPLICATION IS BUILT FROM ONE OR MORE ROLES
Transcript
LECTURE 1: INTRODUCTION TO CLOUD COMPUTING
Different workloads can, and often do, have different requirements, different levels of criticality to the
business, and different levels of financial consideration associated with them. By decomposing an application
into workloads, an organization provides itself with valuable flexibility. A workload-centric approach provides
better controls over costs, more flexibility in choosing technologies best suited to the workload, workload
specific approaches to availability and security, flexibility and agility in adding and deploying new
capabilities, etc.
Scenarios
When thinking about resiliency, it's sometimes helpful to do so in the context of scenarios. The following are
examples of typical scenarios:
Scenario 1 – Sports Data Service
A customer provides a data service that provides sports information. The service has two primary workloads.
The first provides statistics for the player and teams. The second provides scores and commentary for games
that are currently in progress.
Scenario 2 – E-Commerce Web Site
An online retailer sells goods via a website in a well-established model. The application has a number of
workloads, with the most popular being "search and browse" and "checkout."
Scenario 3 – Social
A high profile social site allows members of a community to engage in shared experiences around forums,
user generated content, and casual gaming. The application has a number of workloads, including
registration, search and browse, social interaction, gaming, email, etc.
Scenario 4 - Web
An organization wishes to provide an experience to customers via its web site. The application needs to
deliver experiences on both PC-based browsers as well as popular mobile device types (phone, tablet). The
application has a number of workloads including registration, search and browse, content publishing, social
commenting, moderation, gaming, etc.
Example of Decomposing by Workload
Let's take a closer look at one of the scenarios and decompose it into its child workloads. Scenario #2, an
ecommerce web site, could have a number of workloads – browse & search, checkout & management, user
registration, user generated content (reviews and rating), personalization, etc.
Example definitions of two of the core workloads for the scenario would be:
Browse & Search enables customers to navigate through a product catalog, search for specific items, and
perhaps manage baskets or wish lists. This workload can have attributes such as anonymous user access,
sub-second response times, and caching. Performance degradation may occur in the form of increased
response times with unexpected user load or application-tolerant interrupts for product inventory refreshes.
In those cases, the application may choose to continue to serve information from the cache.
Checkout & Management helps customers place, track, and cancel orders; select delivery methods and
payment options; and manage profiles. This workload can have attributes such as secure access, queued
processing, access to third-party payment gateways, and connectivity to back-end on-premises systems.
While the application may tolerate increased response time, it may not tolerate loss of orders; therefore, it is
designed to guarantee that customer orders are always accepted and captured, regardless of whether the
application can process the payment or arrange delivery.
Establish a Lifecycle Model
An application lifecycle model defines the expected behavior of an application when operational. At different
phases and times, an application will put different demands on the system whether at a functional or scale
level. The lifecycle model(s) will reflect this.
Workloads should have defined lifecycle models for all relevant and applicable scenarios. Services may have
hourly, daily, weekly, or seasonal lifecycle differences that, when modeled, identify specific capacity,
availability, performance, and scalability requirements over time.
Many services will have a minimum of two applicable models, particularly if service demand bursts in a predictable fashion. Whether it's a spike related to peak demand during a holiday period, increased filing of tax returns just before their due date, morning and afternoon commuter time windows, or end-of-year filing of employee performance reviews, many organizations have an understanding of predictable spikes in demand for a service that should be modeled.
Figure 1. A view of the lifecycle model on a month by month basis
Figure 2. A look at the lifecycle model more granularly, at the daily level
Establish an Availability Model and Plan
Once a lifecycle model is identified, the next step is to establish an availability model and plan. An availability model for your application identifies the level of availability that is expected for your workload. It is critical, as it will inform many of the decisions you'll make when establishing your service.
There are a number of things to consider and a number of potential actions that can be taken.
SLA Identification
When developing your availability plan, it's important to understand what the desired availability is for your
application, the workloads within that application, and the services that are utilized in the delivery of those
workloads.
Defining the Desired SLA for Your Workload
Understanding the lifecycle of your workload will help you understand the desired Service Level Agreement that you'd like to deliver. Even if an SLA is not provided for your service publicly, this is the baseline you'll aspire to meet in terms of availability.
There are a number of options that can be taken that will provide scalability and resiliency. These vary in cost and can be applied in multiple layers. At the application level, utilizing all of these is unfeasible for most projects due to cost and implementation time. By decomposing your application to the workload level, you gain the ability to make these investments at a more targeted level: the workload.
Even at the workload level, you may not choose to implement every option. What you choose to implement
or not is determined by your requirements. Regardless of the options you do choose, you should make a
conscious choice that's informed and considerate of all of the options.
Autonomy
Autonomy is about independence and reducing dependency between the parts which make up the service
as a whole. Dependency on components, data, and external entities must be examined when designing
services, with an eye toward building related functionality into autonomous units within the service. Doing so
provides the agility to update versions of distinct autonomous units, finer tuned control of scaling these
autonomous units, etc.
Workload architectures are often composed of autonomous components that do not rely on manual
intervention, and do not fail when the entities they depend upon are not available. Applications composed
of autonomous parts are:
available and operational
resilient and easily fault-recoverable
lower-risk for unhealthy failure states
easy to scale through replication
less likely to require manual interventions
These autonomous units will often leverage asynchronous communication, pull-based data processing, and
automation to ensure continuous service.
Looking forward, the market will evolve to a point where there are standardized interfaces for certain types
of functionality for both vertical and horizontal scenarios. When this future vision is realized, a service
provider will be able to engage with different providers and potentially different implementations that solve
the designated work of the autonomous unit. For continuous services, this will be done autonomously and
be based on policies.
As much as autonomy is an aspiration, most services will take a dependency on a third party service – if only
for hosting. It's imperative to understand the SLAs of these dependent services and incorporate them into
your availability plan.
Understanding the SLAs and Resiliency Options for Service Dependencies
This section identifies the different types of SLAs that can be relevant to your service. For each of these
service types, there are key considerations and approaches, as well as questions that should be asked.
Public Cloud Platform Services
Services provided by a commercial cloud computing platform, such as compute or storage, have service level
agreements that are designed to accommodate a multitude of customers at significant scale. As such, the
SLAs for these services are non-negotiable. A provider may offer tiered levels of service with different
SLAs, but these tiers will be non-negotiable.
Questions to consider for this type of service:
Does this service allow only a certain number of calls to the Service API?
Does this service place limits on the call frequency to the Service API?
Does the service limit the number of servers that can call the Service API?
What is the publicly available information on how the service delivers on its availability promise?
How does this service communicate its health status?
What is the stated Service Level Agreement (SLA)?
What are the equivalent platform services provided by other 3rd parties?
3rd Party "Free" Services
Many third parties provide "free" services to the community. For private sector organizations, this is largely
done to help generate an ecosystem of applications around their core product or service. For the public sector, this is done to provide data to the citizenry and businesses that have ostensibly paid for its collection through taxes.
Most of these services will not come with service level agreements, so availability is not guaranteed. When
SLAs are provided, they typically focus on restrictions that are placed on consuming applications and
mechanisms that will be used to enforce them. Examples of restrictions can include throttling or blacklisting
your solution if it exceeds a certain number of service calls, exceeds a certain number of calls in a given time
period (x per minute), or exceeds the number of allowable servers that are calling the service.
Questions to consider for this type of service:
Does this service allow only a certain number of calls to the Service API?
Does this service place limits on the call frequency to the Service API?
Does the service limit the number of servers that can call the Service API?
What is the publicly available information on how the service delivers on its availability promise?
How does this service communicate its health status?
What is the stated Service Level Agreement (SLA)?
Is this a commodity service where the required functionality and/or data are available from multiple service
providers?
If a commodity service, is the interface interoperable across other service providers (directly or through an
available abstraction layer)?
What are the equivalent platform services provided by other 3rd parties?
3rd Party Commercial Services
Commercial services provided by third parties have service level agreements that are designed to
accommodate the needs of paying customers. A provider may offer tiered levels of SLAs with different
levels of availability, but these SLAs will be non-negotiable.
Questions to consider for this type of service:
Does this service allow only a certain number of calls to the Service API?
Does this service place limits on the call frequency to the Service API?
Does the service limit the number of servers that can call the Service API?
What is the publicly available information on how the service delivers on its availability promise?
How does this service communicate its health status?
What is the stated Service Level Agreement (SLA)?
Is this a commodity service where the required functionality and/or data are available from multiple service
providers?
If a commodity service, is the interface interoperable across other service providers (directly or through an
available abstraction layer)?
What are the equivalent platform services provided by other 3rd parties?
Community Cloud Services
A community of organizations, such as a supply chain, may make services available to member
organizations.
Questions to consider for this type of service:
Does this service allow only a certain number of calls to the Service API?
Does this service place limits on the call frequency to the Service API?
Does the service limit the number of servers that can call the Service API?
What is the publicly available information on how the service delivers on its availability promise?
How does this service communicate its health status?
What is the stated Service Level Agreement (SLA)?
As a member of the community, is there a possibility of negotiating a different SLA?
Is this a commodity service where the required functionality and/or data are available from multiple service
providers?
If a commodity service, is the interface interoperable across other service providers (directly or through an
available abstraction layer)?
What are the equivalent platform services provided by other 3rd parties?
1st Party Internal Enterprise Wide Cloud Services
An enterprise may make core services, such as stock price data or product metadata, available to its divisions
and departments.
Questions to consider for this type of service:
Does this service allow only a certain number of calls to the Service API?
Does this service place limits on the call frequency to the Service API?
Does the service limit the number of servers that can call the Service API?
What is the publicly available information on how the service delivers on its availability promise?
How does this service communicate its health status?
What is the stated Service Level Agreement (SLA)?
As a member of the organization, is there a possibility of negotiating a different SLA?
Is this a commodity service where the required functionality and/or data are available from multiple service
providers?
If a commodity service, is the interface interoperable across other service providers (directly or through an
available abstraction layer)?
What are the equivalent platform services provided by other 3rd parties?
1st Party Internal Divisional or Departmental Cloud Services
An enterprise division or department may make services available to other members of their immediate
organization.
Questions to consider for this type of service:
Does this service allow only a certain number of calls to the Service API?
Does this service place limits on the call frequency to the Service API?
Does the service limit the number of servers that can call the Service API?
What is the publicly available information on how the service delivers on its availability promise?
How does this service communicate its health status?
What is the stated Service Level Agreement (SLA)?
As a member of the division, is there a possibility of negotiating a different SLA?
Is this a commodity service where the required functionality and/or data are available from multiple service
providers?
If a commodity service, is the interface interoperable across other service providers (directly or through an
available abstraction layer)?
What are the equivalent platform services provided by other 3rd parties?
The "True 9s" of Composite Service Availability
Taking advantage of existing services can provide significant agility in delivering solutions for your
organization or for commercial sale. While attractive, it is important to truly understand the impacts these
dependencies have on the overall SLA for the workload.
Availability is typically expressed as a percentage of uptime in a given year. This availability percentage is referred to as the number of "9s." For example, 99.9% represents a service with "three nines."
Figure 3. Downtime related to the more common "9s"
One common misconception relates to the number of "9s" a composite service provides. Specifically, it is often assumed that if a given service is composed of 5 services, each with a promised 99.99% uptime in their SLAs, the resulting composite service has an availability of 99.99%. This is not the case.
The percentage is actually a calculation that considers the amount of downtime per year. A service with an SLA of "four 9s" (99.99%) can be offline up to 52.56 minutes per year. Incorporating 5 of these services into a composite introduces an identified SLA risk of 262.8 minutes, or 4.38 hours. This reduces the availability to 99.95% before a single line of code is written! You generally can't change the availability of a third-party service; however, when writing your code, you can increase the overall availability of your application using concepts laid out in this document.
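To make the arithmetic concrete, the calculation can be sketched in a few lines of Python (used here purely for illustration; the SLA figures are the ones from the example above):

    # Composite availability of five dependencies consumed in series:
    # the workload is only up when every dependency is up.
    MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

    dependency_slas = [0.9999] * 5    # five services, each with "four 9s"

    composite = 1.0
    for sla in dependency_slas:
        composite *= sla

    downtime = (1 - composite) * MINUTES_PER_YEAR
    print(f"Composite availability: {composite:.4%}")          # ~99.9500%
    print(f"Potential downtime: {downtime:.1f} minutes/year")  # ~262.8

Multiplying the availabilities gives nearly the same result as summing the individual downtime allowances, and both make the same point: the composite is weaker than any one of its parts.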
When leveraging external services, the importance of understanding SLAs, both individually and in their impact on the composite, cannot be stressed enough.
Identify Failure Points and Failure Modes
To create a resilient architecture, it's important to understand it: specifically, to make a proactive effort to understand and document what can cause an outage.
Understanding the failure points and failure modes for an application and its related workload services can
enable you to make informed, targeted decisions on strategies for resiliency and availability.
Failure Points
A failure point is a design element that can cause an outage. An important focus is on design elements that
are subject to external change.
Examples of failure points include:
Database connections
Website connections
Configuration files
Registry keys
Categories of common failure points include:
ACLs
Database access
External web site/service access
Transactions
Configuration
Capacity
Network
Failure Modes
While failure points define the areas that can result in an outage, failure modes identify the root cause of an outage at those failure points.
Examples of failure modes include:
A missing configuration file
Significant traffic exceeding resource capacity
A database reaching maximum capacity
Resiliency Patterns and Considerations
This document will look at key considerations across compute, storage, and platform services. Before covering these topics, it is important to recap several basic topics that affect resiliency and are often misunderstood and/or not implemented.
Default to Asynchronous
As mentioned previously, a resilient architecture should optimize for autonomy. One of the ways to achieve
autonomy is by making communication asynchronous. A resilient architecture should default to
asynchronous interaction, with synchronous interactions happening only as the result of an exception.
Stateless web-tiers or web-tiers with a distributed cache can provide this on the front end of a solution.
Queues can provide this capability for communication between workload services or for services within a workload service.
The latter allows messages to be placed on queues so that secondary services can retrieve them, based on logic-, time-, or volume-driven triggers. In addition to making the process asynchronous, it also allows scaling of the tiers "pushing" to or "pulling" from the queues as appropriate.
Timeouts
A common area where transient faults will occur is where your architecture connects to a service or a resource such as a database. When consuming these services, it's a common practice to implement logic that introduces the concept of a timeout. This logic identifies an acceptable timeframe in which a response is expected and generates an identifiable error when that timeframe is exceeded. When the timeout error appears, appropriate steps are taken based on the context in which the error occurs. Context
can include the number of times this error has occurred, the potential impact of the unavailable resource,
SLA guarantees for the current time period for the given customer, etc.
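As a minimal sketch of this timeout logic (in Python; the worker-thread approach and the error type are assumptions for illustration, and a client library's own timeout parameter should be preferred where one exists):

    import concurrent.futures

    class ResourceTimeoutError(Exception):
        # Raised when a dependency does not respond within its budget.
        pass

    _pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

    def call_with_timeout(func, timeout_seconds, *args, **kwargs):
        # Run the call on a worker thread and fail fast if it overruns.
        # Note: the abandoned call keeps running on its thread; the caller
        # chooses the next step based on context (retry count, SLA, etc.).
        future = _pool.submit(func, *args, **kwargs)
        try:
            return future.result(timeout=timeout_seconds)
        except concurrent.futures.TimeoutError:
            raise ResourceTimeoutError(
                f"{func.__name__} exceeded its {timeout_seconds}s budget")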
Handle Transient Faults
When designing the service(s) that will deliver your workload, you must accept and embrace that failures will
occur and take the appropriate steps to address them.
One of the common areas to address is transient faults. As no service has 100% uptime, it's realistic to expect
that you may not be able to connect to a service that a workload has taken a dependency on. The inability to
connect to or faults seen from one of these services may be fleeting (less than a second) or permanent (a
provider shuts down).
Degrade Gracefully
Your workload service should aspire to handle these transient faults gracefully. Netflix, for example, during an outage at its cloud provider, served customers from an older copy of the video queue when the primary data store was not available. Another example would be an ecommerce site continuing to collect orders if its payment
gateway is unavailable. This provides the ability to process orders when the payment gateway is once again
available or after failing over to a secondary payment gateway.
When doing this, the ideal scenario is to minimize the impact to the overall system. In both cases, the service
issues are largely invisible to end users of these systems.
Transient Fault Handling Considerations
There are several key considerations for the implementation of transient fault handling, as detailed in the
following sections.
Retry logic
The simplest form of transient fault handling is to retry the operation that failed. If using a commercial third
party service, implementing "retry logic" will often resolve this issue.
It should be noted that designs should typically limit the number of times the logic will be retried. The logic
will typically attempt to execute the action(s) a certain number of times, registering an error and/or utilizing
a secondary service or workflow if the fault continues.
Exponential Backoff
If the transient fault is the result of throttling by the service due to heavy load, repeated attempts to call the service will only extend the throttling and impact overall availability.
It is often desirable to reduce the volume of calls to the service to help avoid or reduce throttling. This is typically done algorithmically, such as immediately retrying after the first failure, waiting 1 second after the second failure, 5 seconds after the third failure, etc., until ultimately succeeding or hitting an application-defined threshold for failures.
This approach is referred to as "exponential backoff."
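A sketch of retry logic with backoff follows; the TransientFaultError type, the delay schedule, and the retry cap are assumptions for illustration:

    import random
    import time

    class TransientFaultError(Exception):
        # Assumed application-defined marker for faults worth retrying.
        pass

    def retry_with_backoff(operation, max_attempts=5, base_delay=1.0):
        # Double the wait after each failure, plus a little random jitter so
        # that many clients backing off at once do not retry in lockstep.
        for attempt in range(max_attempts):
            try:
                return operation()
            except TransientFaultError:
                if attempt == max_attempts - 1:
                    raise  # application-defined threshold reached; escalate
                time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))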
Idempotency
A core assumption with connected services is that they will not be 100% available and that transient fault
handling with retry logic is a core implementation approach. In cases where retry logic is implemented, there
is the potential for the same message to be sent more than once, for messages to be sent out of sequence,
etc.
Operations should be designed to be idempotent, ensuring that sending the same message multiple times
does not result in an unexpected or polluted data store.
For example, inserting data from all requests may result in multiple records being added if the service
operation is called multiple times. An alternate approach would be to implement the code as an intelligent
‗upsert‘. A timestamp or global identifier could be used to identify new from previously processed messages,
inserting only newer ones into the database and updating existing records if the message is newer than what
was received in the past.
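Such an 'intelligent upsert' might look like the sketch below, which uses SQLite as a stand-in data store; the table, columns, and message fields are hypothetical:

    import sqlite3

    SCHEMA = """CREATE TABLE IF NOT EXISTS customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        updated_at  TEXT)"""

    def apply_message(conn, customer_id, name, sent_at):
        # Insert new customers; update existing rows only when the incoming
        # message is newer, so re-delivered or out-of-order copies are no-ops.
        # Timestamps are ISO-8601 strings, which compare correctly as text.
        conn.execute("""
            INSERT INTO customers (customer_id, name, updated_at)
            VALUES (?, ?, ?)
            ON CONFLICT(customer_id) DO UPDATE SET
                name = excluded.name,
                updated_at = excluded.updated_at
            WHERE excluded.updated_at > customers.updated_at""",
            (customer_id, name, sent_at))
        conn.commit()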
Compensating Behavior
In addition to idempotency, another area for consideration is the concept of compensating behavior. In a world of an ever-growing set of connected systems and the emergence of composite services, understanding how to handle compensating behavior is important.
For many developers of line of business applications, the concepts of transactions are not new, but the frame
of reference is often tied to the transactional functionality exposed by local data technologies and related
code libraries. When looking at the concept in terms of the cloud, this mindset needs to take into account new considerations related to the orchestration of distributed services.
A service orchestration can span multiple distributed systems and be long running and stateful. The orchestration itself is rarely synchronous, can span multiple systems, and can run for anywhere from seconds to years based on the business scenario.
In a supply chain scenario that could tie together 25 organizations in the same workload activity, for
example, there may be a set of 25 or more systems that are interconnected in one or more service
orchestrations.
If success occurs, the 25 systems must be made aware that the activity was successful. For each connection point in the activity, participant systems can provide a correlation ID for the messages they receive from other systems. Depending on the type of activity, the receipt of that correlation ID may satisfy the party that the transaction is notionally complete. In other cases, upon the completion of the interactions of all 25 parties, a confirmation message may be sent to all parties (either directly from a single service or via the specific orchestration interaction points for each system).
To handle failures in composite and/or distributed activities, each service would expose a service interface
and operation(s) to receive requests to cancel a given transaction by a unique identifier. Behind the service
façade, workflows would be in place to compensate for the cancellation of this activity. Ideally these would
be automated procedures, but they can be as simple as routing to a person in the organization to remediate
manually.
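As a sketch of the automated case, each service could record an undo action for every completed step, keyed by the activity's correlation ID; the names here are hypothetical, and the notification stub represents routing to a person for manual remediation:

    compensations = {}  # correlation_id -> undo actions, in completion order

    def record_step(correlation_id, undo_action):
        # Called as each step of the distributed activity completes.
        compensations.setdefault(correlation_id, []).append(undo_action)

    def cancel_activity(correlation_id):
        # Exposed behind the service facade; unwinds steps newest-first.
        for undo in reversed(compensations.pop(correlation_id, [])):
            try:
                undo()
            except Exception as exc:
                notify_operator(correlation_id, exc)

    def notify_operator(correlation_id, exc):
        print(f"Manual remediation needed for {correlation_id}: {exc}")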
Circuit Breaker Pattern
A circuit breaker is a switch that automatically interrupts the flow of electric current if the current exceeds a
preset limit. Circuit breakers are used most often as a safety precaution where excessive current through a
circuit could be hazardous. Unlike a fuse, a circuit breaker can be reset and re-used.
The same pattern is applicable to software design, and particularly applicable for services where availability
and resiliency are a key consideration.
When a resource is unavailable, implementing a software circuit breaker allows the system to detect the condition and respond appropriately.
A common implementation of this pattern is related to accessing of databases or data services. Once an
established type and level of activity fails, the circuit breaker would react. With data, this is typically caused
by the inability to connect to a database or a data service in front of that database.
If a call to a database resource failed after 100 consecutive attempts to connect, there is likely little value in
continuing to call the database. A circuit breaker could be triggered at that threshold and the appropriate
actions can be taken.
In some cases, particularly when connecting to data services, this could be the result of throttling based on a
client exceeding the number of allowed calls within a given time period. The circuit breaker may inject delays
between calls until such time that connections are successfully established and meet the tolerance levels.
In other cases, the data store may be unavailable. If a redundant copy of the data is available, the system
may fail over to that replica. If a true replica is unavailable or if the database service is down broadly across
all data centers within a provider, a secondary approach may be taken. This could include sourcing a version of the requested data via an alternate data service provider. This alternate source could be a cache, an alternate persistent data store type on the current cloud provider, a separate cloud provider, or an on-premises data center. When such an alternate is not available, the service could also return a recognizable
error that could be handled appropriately by the client.
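A bare-bones version of the pattern is sketched below; the failure threshold and cool-off period are illustrative, and the fallback stands in for any of the secondary approaches described above:

    import time

    class CircuitBreaker:
        # Stop calling a failing dependency until a cool-off period passes.

        def __init__(self, threshold=5, cooloff_seconds=30.0):
            self.threshold = threshold
            self.cooloff_seconds = cooloff_seconds
            self.failures = 0
            self.opened_at = None  # None means the circuit is closed

        def call(self, operation, fallback):
            if self.opened_at is not None:
                if time.monotonic() - self.opened_at < self.cooloff_seconds:
                    return fallback()     # tripped: skip the resource entirely
                self.opened_at = None     # cool-off elapsed: probe again
                self.failures = 0
            try:
                result = operation()
            except Exception:
                self.failures += 1
                if self.failures >= self.threshold:
                    self.opened_at = time.monotonic()  # trip the breaker
                return fallback()
            self.failures = 0
            return result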
Circuit Breaker Example: Netflix
Netflix, a media streaming company, is often held up as a great example of a resilient architecture. When
discussing the circuit breaker pattern on the Netflix Tech Blog, the team calls out several criteria that are included in its circuit breaker. These include:
1. A request to the remote service times out.
2. The thread pool and bounded task queue used to interact with a service dependency are at 100% capacity.
3. The client library used to interact with a service dependency throws an exception.
All of these contribute to the overall error rate. When that error rate exceeds their defined thresholds, the
circuit breaker is "tripped" and the circuit for that service immediately serves fallbacks without even
attempting to connect to the remote service.
In that same blog entry, the Netflix team states that the circuit breaker for each of their services implements
a fallback using one of the following three approaches:
1. Custom fallback – a service client library provides an invokable fallback method or locally available data on
an API server (e.g., a cookie or local cache) is used to generate a fallback response.
2. Fail silent – a method returns a null value to the requesting client, which works well when the data being
requested is optional.
3. Fail fast – when data is required or no good fallback is available, a 5xx response is returned to the client. This
approach focuses on keeping API servers healthy and enabling a quick recovery when impacted services
come back online, but does so at the expense of negatively impacting the client UX.
Handling SLA Outliers: Trusted Parties and Bad Actors
To enforce an SLA, an organization should address how its data service will deal with two categories of outliers: trusted parties and bad actors.
Trusted Parties and White Listing
Trusted parties are organizations with whom the organization could have special arrangements, and for
whom certain exceptions to standard SLAs might be made.
There may be some users of a service that want to negotiate special pricing terms or policies. In some cases,
a high volume of calls to the data service might warrant special pricing. In other cases, demand for a given
data service could exceed the volume specified in standard usage tiers. Such customers should be defined as
trusted parties to avoid inadvertently being flagged as bad actors.
White Listing
The typical approach to handling trusted parties is to establish a white list. A white list, which identifies a list
of trusted parties, is used by the service when it determines which business rules to apply when processing
customer usage. White listing is typically done by authorizing either an IP address range or an API key.
When establishing a consumption policy, an organization should identify if white listing is supported; how a
customer would apply to be on the white list; how to add a customer to the white list; and under what
circumstances a customer is removed from the white list.
Handling Bad Actors
If trusted parties stand at one end of the customer spectrum, the group at the opposite end is what is
referred to as "bad actors." Bad actors place a burden on the service, typically through attempted
"overconsumption." In some cases bad behavior is genuinely accidental. In other cases it is intentional, and,
in a few situations, it is malicious. These actors are labeled "bad," as their actions, intentional or otherwise,
have the ability to impact the availability of one or more services.
The burden of bad actors can introduce unnecessary costs to the data service provider and compromise
access by consumers who faithfully follow the terms of use and have a reasonable expectation of service, as
spelled out in an SLA. Bad actors must therefore be dealt with in a prescribed, consistent way. The typical
responses to bad actors are throttling and black listing.
Throttling
Organizations should define a strategy for dealing with spikes in usage by data service consumers.
Significant bursts of traffic from any consumer can put an unexpected load on the data service. When such
spikes occur, the organization might want to throttle access for that consumer for a certain period of time. In
this case the service refuses all requests from the consumer for a certain period of time, such as one minute,
five minutes, or ten minutes. During this period, service requests from the targeted consumer result in an
error message advising that they are being throttled for overuse.
The consumer making the requests can respond accordingly, such as by altering its behavior.
The organization should determine whether it wants to implement throttling and set the related business
rules. If it determines that consumers can be throttled, the organization will also need to decide what
behaviors should trigger the throttling response.
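A fixed-window rate limiter is one simple way to implement this; the per-minute quota, the consumer key, and the advisory error are assumptions for illustration:

    import time

    WINDOW_SECONDS = 60
    MAX_CALLS_PER_WINDOW = 100  # assumed per-consumer quota

    _usage = {}  # consumer_id -> (window_start, call_count)

    class ThrottledError(Exception):
        # Advises the consumer that it is being throttled for overuse.
        pass

    def check_rate_limit(consumer_id):
        now = time.monotonic()
        start, count = _usage.get(consumer_id, (now, 0))
        if now - start >= WINDOW_SECONDS:
            start, count = now, 0  # a new window begins
        if count >= MAX_CALLS_PER_WINDOW:
            wait = WINDOW_SECONDS - (now - start)
            raise ThrottledError(f"Throttled for overuse; retry in {wait:.0f}s")
        _usage[consumer_id] = (start, count + 1)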
Black listing
Although throttling should correct the behavior of bad actors, it might not always be successful. In cases in
which it does not work, the organization might want to ban a consumer. The opposite of a white list, a black
list identifies consumers that are barred from access to the service. The service will respond to access
requests from black-listed customers appropriately, and in a fashion that minimizes the use of data service
resources.
Black listing, as with white listing, is typically done by using either an API key or with an IP address range.
When establishing a consumption policy, the organization should specify what behaviors will place a
consumer on the black list; how black listing can be appealed; and how a consumer can be removed from
the black list.
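Combining the two lists, the service can make a cheap admit/deny decision before any other business rules run; the API keys below are placeholders:

    WHITELIST = {"key-partner-001"}  # trusted parties: relaxed rules apply
    BLACKLIST = {"key-banned-007"}   # barred consumers: refuse cheaply

    def classify_consumer(api_key):
        # Run before throttling or any expensive processing.
        if api_key in BLACKLIST:
            return "deny"      # minimal-cost rejection
        if api_key in WHITELIST:
            return "trusted"   # skip or relax the throttling rules
        return "standard"      # normal SLA, throttling, and quotas apply

A "standard" consumer would then pass through a rate-limit check like the one sketched in the Throttling section.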
“Automate All the Things”
People make mistakes. Whether it's a developer making a code change that could have unexpected
consequences, a DBA accidentally dropping a table in a database, or an operations person who makes a
change but doesn‘t document it, there are multiple opportunities for a person to inadvertently make a
service less resilient.
To reduce human error, a logical approach is to reduce the number of humans in the process. Through the introduction of automation, you limit the opportunity for ad hoc, inadvertent deviations from expected behavior to jeopardize your service.
There is a meme in the DevOps community with a cartoon character saying "Automate All the Things." In the
cloud, most services are exposed with an API. From development tools to virtualized infrastructure to
platform services to solutions delivered as Software as a Service, most things are scriptable.
Scripting is highly recommended. Scripting makes deployment and management consistent and predictable
and pays significant dividends for the investment.
Automating Deployment
One of the key areas of automation is in the building and deployment of a solution. Automation can make it
easy for a developer team to test and deploy to multiple environments. Development, test, staging, beta,
and production can all be deployed readily and consistently through automated builds. The ability to deploy
consistently across environments works toward ensuring that what's in production is representative of what's
been tested.
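As a sketch of the idea, a single parameterized script can push the same build artifact to every environment; the deploy_package helper is hypothetical and stands in for whatever provider API or tooling is actually used:

    ENVIRONMENTS = {
        "development": {"instances": 1},
        "test":        {"instances": 2},
        "staging":     {"instances": 2},
        "production":  {"instances": 6},
    }

    def deploy_package(artifact, environment, instances):
        # Hypothetical stand-in for the real provider API or CLI call.
        print(f"Deploying {artifact} to {environment} ({instances} instances)")

    def deploy_all(artifact):
        # The same artifact and the same steps for every environment, so
        # that what reaches production is exactly what was tested.
        for env, settings in ENVIRONMENTS.items():
            deploy_package(artifact, env, settings["instances"])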
Establishing and Automating a Test Harness
Testing is another area that can be automated. Like automated deployment, establishing automated testing
is valuable in ensuring that your system is resilient and stays resilient over time. As the code and usage of your service evolve, it's important to ensure that all appropriate testing is done, both functionally and at scale.
Automating Data Archiving and Purging
One of the areas that gets little attention is that of data archiving and purging. Data continues to grow at a higher volume and in greater variety than at any time in history. Depending on the database technology and the types of queries required, unnecessary data can reduce the response time of
database technology and the types of queries required, unnecessary data can reduce the response time of
your system and increase costs unnecessarily. For resiliency plans that include one or more replicas of a data
store, removing all but the necessary data can expedite management activities such as backing up and
restoring data.
Identify the requirements for your solution related to data needed for core functionality, data that is needed for compliance purposes but can be archived, and data that is no longer necessary and can be purged.
Utilize the APIs available from the related products and services to automate the implementation of these
requirements.
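A retention sweep along these lines might look like the following sketch, again with SQLite as a stand-in and with hypothetical table names and retention windows:

    import sqlite3
    from datetime import datetime, timedelta

    RETENTION = {
        "orders_archive": timedelta(days=7 * 365),  # compliance: keep 7 years
        "session_logs":   timedelta(days=30),       # purge once unnecessary
    }

    def purge_expired(conn):
        # Delete rows older than each table's retention window; run this on
        # a schedule via the platform's automation APIs.
        for table, keep_for in RETENTION.items():
            cutoff = (datetime.utcnow() - keep_for).isoformat()
            conn.execute(f"DELETE FROM {table} WHERE created_at < ?", (cutoff,))
        conn.commit()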
Understand Fault Domains and Upgrade Domains
When building a resilient architecture, it's also important to understand the concepts of fault domains and upgrade domains.
Fault Domains
Fault domains constrain the placement of services based on known hardware boundaries and the likelihood
that a particular type of outage will affect a set of machines. A fault domain is defined as a series of machines that can fail simultaneously, usually determined by physical properties (a particular rack of machines, a series of machines sharing the same power source, etc.).
Upgrade Domains
Upgrade domains are similar to fault domains. Upgrade domains define a physical set of services that are
updated by the system at the same time. The load balancer at the cloud provider must be aware of upgrade domains in order to ensure that, if a particular domain is being updated, the overall system remains balanced and services remain available.
Depending on the cloud provider and platform services utilized, fault domains and upgrade domains may be
provided automatically, be something your service can opt-in to via APIs, or require a 1st or 3rd party
solution.
Identify Compute Redundancy Strategies
On-premises solutions have often relied on redundancy to help them with availability and scalability. From
an availability standpoint, redundant data centers provided the ability to increase likelihood of business
continuity in the face of infrastructure failures in a given data center or part of a data center.
For applications with geo-distributed consumers, traffic management and redundant implementations
routed users to local resources, often with reduced latency.
Note: Data resiliency, which includes redundancy, is covered as a separate topic in the section titled Establishing a Data Resiliency Approach.
Redundancy and the Cloud
On-premises, redundancy has historically been achieved through duplicate sets of hardware, software, and
networking. Sometimes this is implemented in a cluster in a single location or distributed across multiple
data centers.
When devising a strategy for the cloud, it is important to rationalize the need for redundancy across three
vectors. These vectors include deployed code within a cloud provider‘s environment, redundancy of
providers themselves, and redundancy between the cloud and on premises.
Deployment Redundancy
When an organization has selected a cloud provider, it is important to establish a redundancy strategy for
the deployment within the provider.
If deployed to Platform as a Service (PaaS), much of this may be handled by the underlying platform. In an
Infrastructure as a Service (IaaS) model, much of this is not.
Deploy n Roles within a Data Center
The simplest form of redundancy is deploying your solution to multiple compute nodes within a single cloud
provider. By deploying to multiple nodes, the solution can limit downtime that would occur when only a
single node is deployed.
In many Platform as a Service environments, the state of the virtual machine hosting the code is monitored
and virtual machines detected to be unhealthy can be automatically replaced with a healthy node.
Deploy Across Multiple Data Centers
While deploying multiple nodes in a single data center will provide benefits, architectures must consider that
an entire data center could potentially be unavailable. While not a common occurrence, events such as
natural disasters, war, etc. could result in a service disruption in a particular geo-location.
To achieve your SLA, it may be appropriate for you to deploy your solution to multiple data centers for your
selected cloud provider. There are several approaches to achieving this, as identified below.
1. Fully Redundant Deployments in Multiple Data Centers
The first option is a fully redundant solution in multiple data centers done in conjunction with a traffic
management provider. A key consideration for this approach will be impact to the compute-related costs for
this type of redundancy, which will increase 100% for each additional data center deployment.
2. Partial Deployment in Secondary Data Center(s) for Failover
Another approach is to deploy a partial deployment to a secondary data center of reduced size. For example,
if the standard configuration utilized 12 compute nodes, the secondary data center would contain a
deployment containing 6 compute nodes.
This approach, done in conjunction with traffic management, would allow for business continuity with
degraded service after an incident that solely impacted the primary center.
Given the limited number of times a data center goes offline entirely, this is often seen as a cost-effective
approach for compute – particularly if a platform allows the organization to readily onboard new instances in
the second data center.
3. Divided Deployments across Multiple Data Centers with Backup Nodes
For certain workloads, particularly those in the financial services vertical, there is a significant amount of data
that must be processed within a short, immovable time window. In these circumstances, work is done in
shorter bursts and the costs of redundancy are warranted to deliver results within that window.
In these cases, code is deployed to multiple data centers. Work is divided and distributed across the nodes
for processing. In the instance that a data center becomes unavailable, the work intended for that node is
delivered to the backup node which will complete the task.
4. Multiple Data Center Deployments with Geography Appropriate Sizing per Data Center
This approach utilizes redundant deployments that exist in multiple data centers but are sized appropriately
for the scale of a geo-relevant audience.
Provider Redundancy
While data-center-centric redundancy is good, Service Level Agreements apply at the service level rather than the data center level. There is the possibility that the services delivered by a provider could become unavailable across
multiple or all data centers.
Based on the SLAs for a solution, it may be desirable to also incorporate provider redundancy. To realize this,
cloud-deployable products or cloud services that will work across multiple cloud platforms must be
identified. Microsoft SQL Server, for example, can be deployed in a Virtual Machine inside of Infrastructure as
a Service offerings from most vendors.
For cloud provided services, this is more challenging as there are no standard interfaces in place, even for
core services such as compute, storage, queues, etc. If provider redundancy is desired for these services, it is
often achievable only through an abstraction layer. An abstraction layer may provide enough functionality for a solution, but it will not evolve as fast as the underlying services and may inhibit an organization
from being able to readily adopt new features delivered by a provider.
If redundant provider services are warranted, this can be at one of several levels: an entire application, a
workload, or an aspect of a workload. At the appropriate level, evaluate the need for compute, data, and
platform services and determine what must truly be redundant and what can be handled via approaches to
provide graceful degradation.
On-Premises Redundancy
While taking a dependency on a cloud provider may make fiscal sense, there may be certain business
considerations that require on-premises redundancy for compliance and/or business continuity.
Based on the SLAs for a solution, it may be desirable to also incorporate on-premises redundancy. To realize
this, private cloud-deployable products or cloud services that will work across multiple cloud types must be
identified. As with the case of provider redundancy, Microsoft SQL Server is a good example of a product
that can be deployed on-premises or in an IaaS offering.
For cloud provided services, this is more challenging as there are often no on-premises equivalents with
interface and capability symmetry.
If redundant provider services are required on premises, this can be at one of several levels: an entire
application, a workload, or an aspect of a workload. At the appropriate level, evaluate the need for compute,
data, and platform services and determine what must truly be redundant and what can be handled via
approaches to provide graceful degradation.
Redundancy Configuration Approaches
When identifying your redundancy configuration approaches, classifications that existed pre-cloud also
apply. Depending on the types of services utilized in your solution, some of this may be handled by the
underlying platform automatically. In other cases, this capability is handled through technologies like
Windows Fabric.
1. Active/active — Traffic intended for a failed node is either passed onto an existing node or load balanced
across the remaining nodes. This is usually only possible when the nodes utilize a homogeneous software
configuration.
2. Active/passive — Provides a fully redundant instance of each node, which is only brought online when its
associated primary node fails. This configuration typically requires the most extra hardware.
3. N+1 — Provides a single extra node that is brought online to take over the role of the node that has failed.
In the case of heterogeneous software configuration on each primary node, the extra node must be
universally capable of assuming any of the roles of the primary nodes it is responsible for. This normally
refers to clusters which have multiple services running simultaneously; in the single service case, this
degenerates to active/passive.
4. N+M — In cases where a single cluster is managing many services, having only one dedicated failover node
may not offer sufficient redundancy. In such cases, more than one (M) standby server is included and
available. The number of standby servers is a tradeoff between cost and reliability requirements.
5. N-to-1 — Allows the failover standby node to become the active one temporarily, until the original node can
be restored or brought back online, at which point the services or instances must be failed-back to it in order
to restore high availability.
6. N-to-N — A combination of active/active and N+M, N to N redistributes the services, instances or
connections from the failed node among the remaining active nodes, thus eliminating (as with active/active)
the need for a 'standby' node, but introducing a need for extra capacity on all active nodes.
Traffic Management
Whether traffic is always geo-distributed or routed to different data centers to satisfy business continuity
scenarios, traffic management functionality is important to ensure that requests to your solution are being
routed to the appropriate instance(s).
It is important to note that taking a dependency on a traffic management service introduces a single point of failure. It is important to investigate the SLA of your application's primary traffic management service and
determine if alternate traffic management functionality is warranted by your requirements.
Establish a Data Partitioning Strategy
While many high scale cloud applications have done a fine job of partitioning their web tier, they are less successful in scaling their data tier in the cloud. With an ever-growing diversity of connected devices, the
level of data generated and queried is growing at levels not seen before in history. The need to be able to
support 500,000 new users per day, for example, is now considered reasonable.
Having a partitioning strategy is critically important across multiple dimensions, including storing, querying,
or maintaining that data.
Decomposition and Partitioning
Because of the benefits and tradeoffs of different technologies, it is common to leverage technologies that
are most optimal for the given workload.
Having a solution that is decomposed by workloads provides you with the ability to choose data
technologies that are optimal for a given workload. For example, a website may utilize table storage for
content for an individual, utilizing partitions at the user level for a responsive experience. Those table rows
may be aggregated periodically into a relational database for reporting and analytics.
Partitioning strategies may, and often will, vary based on the technologies chosen.
Understanding the 3 Vs
To properly devise a partitioning strategy, an organization must first understand its data.
The 3 Vs, made popular by Gartner, look at three different aspects of data. Understanding how the 3 Vs
relate to your data will assist you in making an informed decision on partitioning strategies.
Volume
Volume refers to the size of the data. Volume has very real impacts on the partitioning strategy. Volume
limitations on a particular data technology may force partitioning due to size limitations, query speeds at
volume, etc.
Velocity
Velocity refers to the rate at which your data is growing. You will likely devise a different partitioning strategy
for a slow growing data store vs. one that needs to accommodate 500,000 new users per day.
Variety
Variety refers to the different types of data that are relevant to the workload. Whether it's relational data, key-value pairs, social media profiles, images, audio files, videos, or other types of data, it's important to
understand it. This is both to choose the right data technology and make informed decisions for your
partitioning strategy.
Horizontal Partitioning
Likely the most popular approach to partitioning data is to partition it horizontally. When partitioning
horizontally, a decision is made on criteria to partition a data store into multiple shards. Each shard contains
the entire schema, with the criteria driving the placement of data into the appropriate shards.
Based on the type of data and the data usage, this can be done in different ways. For example, an
organization could choose to partition their data based on a customer last name. In another case, the
partition could be date centric, partitioning on the relevant calendar interval of hour, day, week, or month.
Figure 4. An example of horizontal partitioning by last name
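A shard-routing function for the last-name scheme in Figure 4 might look like the sketch below; the shard names and letter ranges are illustrative:

    # Route customer rows to shards by the first letter of the last name.
    SHARD_RANGES = [
        ("A", "H", "shard-0"),
        ("I", "Q", "shard-1"),
        ("R", "Z", "shard-2"),
    ]

    def shard_for(last_name):
        initial = last_name[:1].upper()
        for low, high, shard in SHARD_RANGES:
            if low <= initial <= high:
                return shard
        return "shard-other"  # digits, punctuation, non-Latin initials

Every shard holds the entire schema; only the rows differ, with the routing criterion deciding placement.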
Vertical Partitioning
Another approach is vertical partitioning. This optimizes the placement of data in different stores, often tied
to the variety of the data. Figure 5 shows an example where metadata about a customer is placed in one
store while thumbnails and photos are placed in separate stores.
Vertical partitioning can result in optimized storage and delivery of data. In Figure 5, for example, if the photo is rarely displayed for a customer, returning 3 megabytes per record can add unnecessary costs in a pay-as-you-go model.
Figure 5. An example of vertical partitioning.
Hybrid Partitioning
In many cases it will be appropriate to establish a hybrid partitioning strategy. This approach provides the
efficiencies of both approaches in a single solution.
Figure 6 shows an example of this, where the vertical partitioning seen earlier is now augmented to take
advantage of horizontal partitioning of the customer metadata.
Figure 6. An example of hybrid partitioning.
Cloud computing == network computing
At the heart of cloud computing is the network. The network is crucial as it provides the fabric or backbone
for devices to connect to services as well as services connecting to other services. There are three network
boundaries to consider in any FailSafe application.
Those network boundaries are detailed below with Windows Azure used as an example to provide context:
1. Role boundaries are traditionally referred to as tiers. Common examples are a web tier or a business logic
tier. If we look at Windows Azure as an example, it formally introduced roles as part of its core design to
provide infrastructure support for the multi-tier nature of modern, distributed applications. Windows Azure guarantees that role instances belonging to the same service are hosted within the scope of a single network
environment and managed by a single fabric controller.
2. Service boundaries represent dependencies on functionality provided by other services. Common examples
are a SQL environment for relational database access and a Service Bus for pub/sub messaging support.
Within Windows Azure, for example, service boundaries are enforced through the network: no guarantee will
be given that a service dependency will be part of the same network or fabric controller environment. That
might happen, but the design assumption for any responsible application has to be that any service
dependency is on a different network managed by a different fabric controller.
3. Endpoint boundaries are external to the cloud. They include any consuming endpoint, generally assumed to
be a device, connecting to the cloud in order to consume services. Special consideration is needed in this
part of the design because of the variable and unreliable nature of the network. Role boundaries and service
boundaries are within the boundaries of the cloud environment, and one can assume a certain level of
reliability and bandwidth. For the external dependencies, no such assumptions can be made, and extra care
has to be given to the device's ability to consume services, meaning both data and interactions.
The network by its very nature introduces latency as it passes information from one point of the network to
another. In order to provide a great experience both for users and for dependent services or roles, the
application architecture and design should look for ways to reduce latency as much as is sensible and to manage
unavoidable latency explicitly. One of the most common ways to reduce latency is to avoid service calls that
involve the network: local access to data and services is a key approach to reducing latency and improving
responsiveness. Using local data and services also provides another layer of protection against failure; as long
as the requests of the user or application can be served from the local environment, there is no need to interact
with other roles or services, which removes the unavailability of a dependent component as a failure mode.
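A minimal read-through cache sketch in C# (the ICatalogService dependency and Product type are hypothetical) that serves warm reads locally and touches the network only on a miss:

```csharp
using System.Collections.Concurrent;

class Product { public string Id; public string Name; }

interface ICatalogService { Product GetProduct(string id); }

class ProductReader
{
    private readonly ConcurrentDictionary<string, Product> cache =
        new ConcurrentDictionary<string, Product>();
    private readonly ICatalogService remoteCatalog;

    public ProductReader(ICatalogService remoteCatalog)
    {
        this.remoteCatalog = remoteCatalog;
    }

    public Product GetProduct(string id)
    {
        // Warm reads never cross the network, avoiding both its latency and
        // the possibility that the remote dependency is unavailable.
        return cache.GetOrAdd(id, key => remoteCatalog.GetProduct(key));
    }
}
```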
Millions of developers around the world know how to create applications using the Windows Server
programming model. Yet applications written for Windows Azure, Microsoft’s cloud platform, don’t
exactly use this familiar model. While most of a Windows developer’s skills still apply, Windows Azure
provides its own programming model.
Why? Why not just exactly replicate the familiar world of Windows Server in the cloud? Many vendors’
cloud platforms do just this, providing virtual machines (VMs) that act like on-premises VMs. This
approach, commonly called Infrastructure as a Service (IaaS), certainly has value, and it’s the right
choice for some applications. Yet cloud platforms are a new world, offering the potential for solving
today’s problems in new ways. Instead of IaaS, Windows Azure offers a higher-level abstraction that’s
typically categorized as Platform as a Service (PaaS). While it’s similar in many ways to the on-premises
Windows world, this abstraction has its own programming model meant to help developers build better
applications. The Windows Azure programming model focuses on improving applications in three areas:
Administration: In PaaS technologies, the platform itself handles the lion’s share of administrative
tasks. With Windows Azure, this means that the platform automatically takes care of things such as
applying Windows patches and installing new versions of system software. The goal is to reduce the
effort—and the cost—of administering the application environment.
Availability: Whether it’s planned or not, today’s applications usually have down time for Windows
patches, application upgrades, hardware failures, and other reasons. Yet given the redundancy that
cloud platforms make possible, there’s no longer any reason to accept this. The Windows Azure
programming model is designed to let applications be continuously available, even in the face of
software upgrades and hardware failures.
Scalability: The kinds of applications that people want to write for the cloud are often meant to
handle lots of users. Yet the traditional Windows Server programming model wasn’t explicitly
designed to support Internet-scale applications. The Windows Azure programming model, however,
was intended from the start to do this. Created for the cloud era, it’s designed to let developers build
the scalable applications that massive cloud data centers can support. Just as important, it also allows
applications to scale down when necessary, letting them use just the resources they need.
Whether a developer uses an IaaS technology or a PaaS offering such as Windows Azure, building
applications on cloud platforms has some inherent benefits. Both approaches let you pay only for the
computing resources you use, for example, and both let you avoid waiting for your IT department to
deploy servers. Yet important as they are, these benefits aren’t the topic here. Instead, the focus is
entirely on making clear what the Windows Azure programming model is and what it offers.
THE THREE RULES OF THE WINDOWS AZURE
PROGRAMMING MODEL
To get the benefits it promises, the Windows Azure programming model imposes three rules on
applications:
A Windows Azure application is built from one or more roles.
A Windows Azure application runs multiple instances of each role.
A Windows Azure application behaves correctly when any role instance fails.
It’s worth pointing out that Windows Azure can run applications that don’t follow all of these rules—it
doesn’t actually enforce them. Instead, the platform simply assumes that every application obeys all
three. Still, while you might choose to run an application on Windows Azure that violates one or more
of the rules, be aware that this application isn’t actually using the Windows Azure programming model. Unless you understand and follow the model’s rules, the application might not run as you expect it to.
A WINDOWS AZURE APPLICATION IS BUILT FROM ONE OR
MORE ROLES
Whether an application runs in the cloud or in your data center, it can almost certainly be divided into
logical parts. Windows Azure formalizes these divisions into roles. A role includes a specific set of code,
such as a .NET assembly, and it defines the environment in which that code runs. Windows Azure today
lets developers create three different kinds of roles:
Web role: As the name suggests, Web roles are largely intended for logic that interacts with the
outside world via HTTP. Code written as a Web role typically gets its input through Internet
Information Services (IIS), and it can be created using various technologies, including ASP.NET,
Windows Communication Foundation (WCF), PHP, and Java.
Worker role: Logic written as a Worker role can interact with the outside world in various ways—it’s
not limited to HTTP. For example, a Worker role might contain code that converts videos into a
standard format or calculates the risk of an investment portfolio or performs some kind of data
analysis.
Virtual Machine (VM) role: A VM role runs an image—a virtual hard disk (VHD)—of a Windows Server
2008 R2 virtual machine. This VHD is created using an on-premises Windows Server machine, then
uploaded to Windows Azure. Once it's stored in the cloud, the VHD can be loaded on demand into a VM role and executed. (From January 2012 onwards, the VM role has been superseded by Windows Azure Virtual Machines.)
All three roles are useful. The VM role was made available quite recently, however, and so it’s fair to say
that the most frequently used options today are Web and Worker roles. Figure 1 shows a simple
Windows Azure application built with one Web role and one Worker role. This application might use a Web role to accept HTTP requests from users, then hand off the work these users request, such as reformatting a video file and making it available for viewing, to a Worker role. A
primary reason for this two-part breakdown is that dividing tasks in this way can make an application easier to scale.
It’s also fine for a Windows Azure application to consist of just a single Web role or a single Worker role—
you don’t have to use both. A single application can even contain different kinds of Web and Worker
roles. For example, an application might have one Web role that implements a browser interface, perhaps
built using ASP.NET, and another Web role that exposes a Web services interface implemented using
WCF. Similarly, a Windows Azure application that performed two different kinds of data analysis might
define a distinct Worker role for each one. To keep things simple, though, we’ll assume that the example
application described here has just one Web role and one Worker role.
As part of building a Windows Azure application, a developer creates a service definition file that names
and describes the application’s roles. This file can also specify other information, such as the ports each
role can listen on. Windows Azure uses this information to build the correct environment for running
the application.
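The service definition file is XML; a minimal sketch for the one-Web-role, one-Worker-role example might look like the following (the service and role names are illustrative):

```xml
<ServiceDefinition name="ExampleService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="WebRole">
    <Endpoints>
      <!-- The port this role listens on for HTTP traffic -->
      <InputEndpoint name="HttpIn" protocol="http" port="80" />
    </Endpoints>
  </WebRole>
  <WorkerRole name="WorkerRole" />
</ServiceDefinition>
```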
A WINDOWS AZURE APPLICATION RUNS MULTIPLE
INSTANCES OF EACH ROLE
Every Windows Azure application consists of one or more roles. When it executes, an application
that conforms to the Windows Azure programming model must run at least two copies—two distinct
instances—of each role it contains. Each instance runs as its own VM, as Figure 2 shows.
Figure 2: A Windows Azure application runs multiple instances of each role.
As described earlier, the example application shown here has just one Web role and one Worker role. A
developer can tell Windows Azure how many instances of each role to run through a service
configuration file (which is distinct from the service definition file mentioned in the previous section).
Here, the developer has requested four instances of the application’s Web role and three instances of its
Worker role.
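In the service configuration file (again XML, with role names matching the service definition), the instance counts from this example would be expressed roughly as:

```xml
<ServiceConfiguration serviceName="ExampleService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="WebRole">
    <Instances count="4" />   <!-- four Web role instances -->
  </Role>
  <Role name="WorkerRole">
    <Instances count="3" />   <!-- three Worker role instances -->
  </Role>
</ServiceConfiguration>
```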
Every instance of a particular role runs the exact same code. In fact, with most Windows Azure
applications, each instance is just like all of the other instances of that role—they’re interchangeable. For
example, Windows Azure automatically load balances HTTP requests across an application’s Web role
instances. This load balancing doesn’t support sticky sessions, so there’s no way to direct all of a client’s
requests to the same Web role instance. Storing client-specific state, such as a shopping cart, in a
particular Web role instance won’t work, because Windows Azure provides no way to guarantee that all
of a client’s requests will be handled by that instance. Instead, this kind of state must be stored
externally, as described later.
A WINDOWS AZURE APPLICATION BEHAVES CORRECTLY
WHEN ANY ROLE INSTANCE FAILS
An application that follows the Windows Azure programming model must be built using roles, and it
must run two or more instances of each of those roles. It must also behave correctly when any of those
role instances fails. Figure 3 illustrates this idea.
Figure 3: A Windows Azure application behaves correctly even when a role instance fails.
Here, the application shown in Figure 2 has lost two of its Web role instances and one of its Worker role
instances. Perhaps the computers they were running on failed, or maybe the physical network
connection to these machines has gone down. Whatever the reason, the application’s performance is
likely to suffer, since there are fewer instances to carry out its work. Still, the application remains up and
functioning correctly.
If all instances of a particular role fail, an application will stop behaving as it should—this can’t be
helped. Yet the requirement to work correctly during partial failures is fundamental to the Windows
Azure programming model. In fact, the service level agreement (SLA) for Windows Azure requires
running at least two instances of each role. Applications that run only one instance of any role can’t get
the guarantees this SLA provides.
The most common way to achieve this is by making every role instance equivalent, as with load-balanced
Web roles accepting user requests. This isn’t strictly required, however, as long as the failure of a single
role instance doesn’t break the application. For example, an application might use a group of Worker
role instances to cache data for Web role instances, with each Worker role instance holding different
data. If any Worker role instance fails, a Web role instance trying to access the cached data it contained
behaves just as it would if the data wasn’t found in the cache (e.g., it accesses persistent storage to
locate that data). The failure might cause the application to run more slowly, but as seen by a user, it still
behaves correctly.
One more important point to keep in mind is that even though the sample application described so far
contains only Web and Worker roles, all of these rules also apply to applications that use VM roles. Just
like the others, every VM role must run at least two instances to qualify for the Windows Azure SLA,
and the application must continue to work correctly if one of these instances fails. Even with VM roles,
Windows Azure still provides a form of PaaS—it's not traditional IaaS.
WHAT THE WINDOWS AZURE PROGRAMMING MODEL
PROVIDES
The Windows Azure programming model is based on Windows, and the bulk of a Windows developer’s
skills are applicable to this new environment. Still, it’s not the same as the conventional Windows Server
programming model. So why bother to understand it? How does it help create better applications? To
answer these questions, it’s first worth explaining a little more about how Windows Azure works. Once
this is clear, understanding how the Windows Azure programming model can help create better software
is simple.
SOME BACKGROUND: THE FABRIC CONTROLLER
Windows Azure is designed to run in data centers containing lots of computers. Accordingly, every
Windows Azure application runs on multiple machines simultaneously. Figure 4 shows a simple example
of how this looks.
Figure 4: The Windows Azure fabric controller creates instances of an application’s roles on different
machines, then monitors their execution.
As Figure 4 shows, all of the computers in a particular Windows Azure data center are managed by
an application called the fabric controller. The fabric controller is itself a distributed application that
runs across multiple computers.
When a developer gives Windows Azure an application to run, he provides the code for the application’s
roles together with the service definition and service configuration files for this application. Among
other things, this information tells the fabric controller how many instances of each role it should create.
The fabric controller chooses a physical machine for each instance, then creates a VM on that machine
and starts the instance running. As the figure suggests, the role instances for a single application are
spread across different machines within this data center.
Once it’s created these instances, the fabric controller continues to monitor them. If an instance fails for
any reason—hardware or software—the fabric controller will start a new instance for that role. While
failures might cause an application’s instance count to temporarily drop below what the developer
requested, the fabric controller will always start new instances as needed to maintain the target number
for each of the application’s roles. And even though Figure 4 shows only Web and Worker roles, VM roles
are handled in the same way, with each of the role’s instances running on a different physical machine.
THE BENEFITS: IMPROVED ADMINISTRATION, AVAILABILITY, AND SCALABILITY
Applications built using the Windows Azure programming model can be easier to administer, more
available, and more scalable than those built on traditional Windows servers. These three attributes are
worth looking at separately.
The administrative benefits of Windows Azure flow largely from the fabric controller. Like every operating
system, Windows must be patched, as must other system software. In on-premises environments, doing
this typically requires some human effort. In Windows Azure, however, the process is entirely automated:
The fabric controller handles updates for Web and Worker role instances (although not for VM role
instances). When necessary, it also updates the underlying Windows servers those VMs run on. The result
is lower costs, since administrators aren’t needed to handle this function.
Lowering costs by requiring less administration is good. Helping applications be more available is also
good, and so the Windows Azure programming model helps improve application availability in
several ways. They are the following:
Protection against hardware failures. Because every application is made up of multiple instances of
each role, hardware failures—a disk crash, a network fault, or the death of a server machine—won’t
take down the application. To help with this, the fabric controller doesn’t choose machines for an
application’s instances at random. Instead, different instances of the same role are placed in different
fault domains. A fault domain is a set of hardware—computers, switches, and more—that share a
single point of failure. (For example, all of the computers in a single fault domain might rely on the
same switch to connect to the network.) Because of this, a single hardware failure can’t take down an
entire application. The application might temporarily lose some instances, but it will continue to
behave correctly.
Protection against software failures. Along with hardware failures, the fabric controller can also
detect failures caused by software. If the code in an instance crashes or the VM in which it’s running
goes down, the fabric controller will start either just the code or, if necessary, a new VM for that role.
While any work the instance was doing when it failed will be lost, the new instance will become part
of the application as soon as it starts running.
The ability to update applications with no application downtime. Whether for routine maintenance or to install a whole new version, every application needs to be updated. An application built using the
Windows Azure programming model can be updated while it’s running—there’s no need to take it
down. To allow this, different instances for each of an application’s roles are placed in different
update domains (which aren’t the same as the fault domains described earlier). When a new version
of the application needs to be deployed, the fabric controller can shut down the instances in just one
update domain, update the code for these, then create new instances from that new code. Once
those instances are running, it can do the same thing to instances in the next update domain, and so
on. While users might see different versions of the application during this process, depending on
which instance they happen to interact with, the application as a whole remains continuously
available.
The ability to update Windows and other supporting software with no application downtime. The
fabric controller assumes that every Windows Azure application follows the three rules listed earlier,
and so it knows that it can shut down some of an application’s instances whenever it likes, update the
underlying system software, then start new instances. By doing this in chunks, never shutting down
all of a role’s instances at the same time, Windows and other software can be updated beneath a
continuously running application.
Availability is important for most applications—software isn’t useful if it’s not running when you need it—
but scalability can also matter. The Windows Azure programming model helps developers build more
scalable applications in two main ways:
Automatically creating and maintaining a specified number of role instances. As already described, a
developer tells Windows Azure how many instances of each role to run, and the fabric controller
creates and monitors the requested instances. This makes application scalability quite
straightforward: Just tell Windows Azure what you need. Because this cloud platform runs in very
large data centers, getting whatever level of scalability an application needs isn’t generally a problem.
Providing a way to modify the number of executing role instances for a running application: For
applications whose load varies, scalability is more complicated. Setting the number of instances just
once isn’t a good solution, since different loads can make the ideal instance count go up or down
significantly. To handle this situation, Windows Azure provides both a Web portal for people and an
API for applications to allow changing the desired number of instances for each role while an
application is running. Making applications simpler to administer, more available, and more scalable is useful, and so using the Windows Azure programming model generally makes sense. But as mentioned earlier, it’s possible to run
applications on Windows Azure that don’t follow this model. Suppose, for example, that you build an
application using a single role (which is permitted) but then run only one instance of that role (violating
the second and third rules). You might do this to save money, since Windows Azure charges separately
for each running instance. Anybody who chooses this option should understand, however, that the fabric
controller won’t know that his application doesn’t follow all three rules. It will shut down this single
instance at unpredictable times to patch the underlying software, then restart a new one. To users, this
means that the application will go down from time to time, since there’s no other instance to take over.
This isn’t a bug in Windows Azure; it’s a fundamental aspect of how the technology works.
Getting all of the benefits that Windows Azure offers requires conforming to the rules of its programming
model. Moving existing applications from Windows Server to Windows Azure can require some work, a topic
addressed in more detail later in this paper. For new applications, however, the argument for using the
Windows Azure model is clear. Why not build an application that costs less to administer? Why not build an
application that need never go down? Why not build an application that can easily scale up and down? Over
time, it’s reasonable to expect more and more applications to be created using the Windows
Azure programming model.
IMPLICATIONS OF THE WINDOWS AZURE PROGRAMMING
MODEL: WHAT ELSE CHANGES?
Building applications for Windows Azure means following the three rules of its programming model.
Following these rules isn’t enough, though—other parts of a developer’s world must also adjust. The
changes the Windows Azure programming model brings to the broader development environment can be
grouped into three areas:
How role instances interact with the operating system.
How role instances interact with persistent storage.
How role instances interact with other role instances.
This section looks at all three.
INTERACTIONS WITH THE OPERATING SYSTEM
For an application running on a typical Windows Server machine, the administrator of that machine is in
control. She can reboot VMs or the machine they run on, install Windows patches, and do whatever
else is required to keep that computer available. In Windows Azure, however, all of the servers are
owned by the fabric controller. It decides when VMs or machines should be rebooted, and for Web and
Worker roles (although not for VM roles), the fabric controller also installs patches and other updates to
the system software in every instance.
This approach has real benefits, as already described. It also creates restrictions, however. Because the
fabric controller owns the physical and virtual machines that Windows Azure applications use, it’s free to
do whatever it likes with them. This implies that letting a Windows Azure application modify the system
it runs on—letting it run in administrator mode rather than user mode—presents some challenges. Since
the fabric controller can modify the operating system at will, there’s no guarantee that changes a role
instance makes to the system it’s running on won’t be overwritten. Besides, the specific virtual (and
physical) machines an application runs in change over time. This implies that any changes made to the
default local environment must be made each time a role instance starts running.
In its first release, Windows Azure simply didn’t allow applications to modify the systems they ran on—
applications only ran in user mode. This restriction has been relaxed—both Web and Worker roles now
give developers the option to run applications in admin mode—but the overall programming model hasn’t
changed. Anybody creating a Windows Azure application needs to understand what the fabric controller
is doing, then design applications accordingly.
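In practice, this means putting any local-environment setup in the role's startup path so it is re-applied on every (re)start. A minimal sketch using the SDK's RoleEntryPoint follows; the setup helper is hypothetical:

```csharp
using Microsoft.WindowsAzure.ServiceRuntime;

public class WebRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // The fabric controller may move this instance to a fresh VM at any
        // time, so local changes must be re-applied on every start.
        ConfigureLocalEnvironment();
        return base.OnStart();
    }

    private void ConfigureLocalEnvironment()
    {
        // Hypothetical per-instance setup: create working directories,
        // adjust local settings, and so on.
    }
}
```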
INTERACTIONS WITH PERSISTENT STORAGE
Applications aren’t just code—they also use data. And just as the programming model must change to
make applications more available and more scalable, the way data is stored and accessed must also
change. The big changes are these:
Storage must be external to role instances. Even though each instance is its own VM with its own file
system, data stored in those file systems isn’t automatically made persistent. If an instance fails, any
data it contains may be lost. This implies that for applications to work correctly in the face of failures,
data must be stored persistently outside role instances. Another role instance can now access data
that otherwise would have been lost if that data had been stored locally on a failed instance.
Storage must be replicated. Just as a Windows Azure application runs multiple role instances to allow
for failures, Windows Azure storage must provide multiple copies of data. Without this, a single
failure would make data unavailable, something that’s not acceptable for highly available
applications.
Storage must be able to handle very large amounts of data. Traditional relational systems aren’t
necessarily the best choice for very large data sets. Since Windows Azure is designed in part for
massively scalable applications, it must provide storage mechanisms for handling data at this scale.
To allow this, the platform offers blobs for storing binary data along with a non-SQL approach called tables for storing large structured data sets.
Figure 5 illustrates these three characteristics, showing how Windows Azure storage looks to an application.
Figure 5: While applications see a single copy, Windows Azure storage replicates all blobs and tables three times.
In this example, a Windows Azure application is using two blobs and one table from Windows Azure
storage. The application sees each blob and table as a single entity, but under the covers, Windows Azure
storage actually maintains three instances of each one. These copies are spread across different physical
machines, and as with role instances, those machines are in different fault domains. This improves the
application’s availability, since data is still accessible even when some copies are unavailable. And because
persistent data is stored outside any of the application’s role instances, an instance failure loses only
whatever data it was using at the moment it failed.
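Accessing this storage from code is straightforward. Below is a sketch assuming the Windows Azure SDK's StorageClient library of this era; the container and blob names are illustrative, and the development-storage connection string stands in for a real account string kept in configuration:

```csharp
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class BlobExample
{
    static void Main()
    {
        // Development-storage account; a real connection string would
        // come from configuration.
        CloudStorageAccount account =
            CloudStorageAccount.Parse("UseDevelopmentStorage=true");

        // Write a blob: the platform replicates it three times behind the scenes.
        CloudBlobClient blobClient = account.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference("photos");
        container.CreateIfNotExist();
        CloudBlob blob = container.GetBlobReference("customer-17/thumbnail.jpg");
        blob.UploadText("thumbnail bytes would go here in a real application");
    }
}
```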
The Windows Azure programming model requires an application to behave correctly when a role instance
fails. To do this, every instance in an application must store all persistent data in Windows Azure storage
or another external storage mechanism (such as SQL Azure, Microsoft’s cloud-based service for relational
data). There’s one more option worth mentioning, however: Windows Azure drives. As already described,
any data an application writes to the local file system of its own VM can be lost when that VM stops
running. Windows Azure drives change this, using a blob to provide persistent storage for the file system
of a particular instance. These drives have some limitations—only one instance at a time is allowed to
both read from and write to a particular Windows Azure drive, for example, with all other instances in this application allowed only read access—but they can be useful in some situations.
INTERACTIONS AMONG ROLE INSTANCES
When an application is divided into multiple parts, those parts commonly need to interact with one
another. In a Windows Azure application, this is expressed as communication between role instances.
For example, a Web role instance might accept requests from users, then pass those requests to a
Worker role instance for further processing. The way this interaction happens isn't identical to how it's done with ordinary Windows applications.
Once again, a key fact to keep in mind is that, most often, all instances of a particular role are
equivalent—they’re interchangeable. This means that when, say, a Web role instance passes work to a
Worker role instance, it shouldn’t care which particular instance gets the work. In fact, the Web role
instance shouldn’t rely on instance-specific things like a Worker role instance’s IP address to
communicate with that instance. More generic mechanisms are required.
The most common way for role instances to communicate in Windows Azure applications is through Windows Azure queues. Figure 6 illustrates the idea.
Figure 6: Role instances can communicate through queues, each of which replicates the messages it holds three times.
In the example shown here, a Web role instance gets work from a user of the application, such as a person
making a request from a browser (step 1). This instance then creates a message containing this work and
writes it into a Windows Azure queue (step 2). These queues are implemented as part of Windows Azure
storage, and so like blobs and tables, each queue is replicated three times, as the figure
shows. As usual, this provides fault-tolerance, ensuring that the queue’s messages are still available if a failure occurs.
Next, a Worker role instance reads the message from the queue (step 3). Notice that the Web role
instance that created this message doesn’t care which Worker role instance gets it—in this application,
they’re all equivalent. That Worker role instance does whatever work the message requires (step 4),
then deletes the message from the queue (step 5).
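A sketch of steps 2 through 5 using the StorageClient library of this era (the queue name and message contents are illustrative):

```csharp
using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class QueueExample
{
    static void Main()
    {
        CloudStorageAccount account =
            CloudStorageAccount.Parse("UseDevelopmentStorage=true");
        CloudQueue queue =
            account.CreateCloudQueueClient().GetQueueReference("workitems");
        queue.CreateIfNotExist();

        // Step 2: the Web role instance writes a message describing the work.
        queue.AddMessage(new CloudQueueMessage("reformat-video:42"));

        // Step 3: some Worker role instance reads it. The message becomes
        // invisible to other readers for the visibility timeout.
        CloudQueueMessage msg = queue.GetMessage(TimeSpan.FromMinutes(5));

        // Step 4: do the work the message describes.
        // ...

        // Step 5: explicitly delete the message. If the instance crashes
        // before this line, the message reappears after the timeout and
        // another instance processes it (at-least-once delivery).
        queue.DeleteMessage(msg);
    }
}
```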
This last step—explicitly removing the message from the queue—is different from what on-premises
queuing technologies typically do. In Microsoft Message Queuing (MSMQ), for example, an application can
do a read inside an atomic transaction. If the application fails before completing its work, the transaction
aborts, and the message automatically reappears on the queue. This approach guarantees that every
message sent to an MSMQ queue is delivered exactly once in the order in which it was sent.
Windows Azure queues don’t support transactional reads, and so they don’t guarantee exactly-once,
in-order delivery. In the example shown in Figure 6, for instance, the Worker role instance might finish
processing the message, then crash just before it deletes this message from the queue. If this happens,
the message will automatically reappear after a configurable timeout period, and another Worker role
instance will process it. Unlike MSMQ, Windows Azure queues provide at-least-once semantics: A
message might be read and processed one or more times.
This raises an obvious question: Why don’t Windows Azure queues support transactional reads? The
answer is that transactions require locking, and so they necessarily slow things down (especially with the
message replication provided by Windows Azure queues). Given the primary goals of the platform, its
designers opted for the fastest, most scalable approach.
Most of the time, queues are the best way for role instances within an application to communicate. It’s
also possible for instances to interact directly, however, without going through a queue. To allow this,
Windows Azure provides an API that lets an instance discover all other instances in the same application
that meet specific requirements, then send a request directly to one of those instances. In the most
common case, where all instances of a particular role are equivalent, the caller should choose a target
instance randomly from the set the API returns. This isn’t always true—maybe a Worker role
implements an in-memory cache with each role instance holding specific data, and so the caller must
access a particular one. Most often, though, the right approach is to treat all instances of a role as
interchangeable.
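A sketch of that discovery API from the SDK's ServiceRuntime library; the role and endpoint names are illustrative and assume an internal endpoint declared in the service definition:

```csharp
using System;
using System.Net;
using Microsoft.WindowsAzure.ServiceRuntime;

class InstanceDiscovery
{
    static readonly Random random = new Random();

    // Pick one instance of a role at random, since equivalent instances
    // are interchangeable.
    static IPEndPoint PickCacheEndpoint()
    {
        var instances = RoleEnvironment.Roles["CacheWorker"].Instances;
        RoleInstance target = instances[random.Next(instances.Count)];
        return target.InstanceEndpoints["CachePort"].IPEndpoint;
    }
}
```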
MOVING WINDOWS SERVER APPLICATIONS TO WINDOWS
AZURE
Anybody building a new Windows Azure application should follow the rules of the Windows Azure
programming model. To move an existing application from Windows Server to Windows Azure,
however, that application should also be made to follow the same rules. In addition, the application
might need to change how it interacts with the operating system, how it uses persistent storage, and the
way its components interact with each other.
How easy it is to make these changes depends on the application. Here are a few representative examples:
An ASP.NET application with multiple load-balanced instances that share state stored in SQL Server.
This kind of application typically ports easily to Windows Azure, with each instance of the original
application becoming an instance of a Web or Worker role. Applications like this don’t use sticky
sessions, which helps make them a good fit for Windows Azure. (Using ASP.NET session state is
acceptable, however, since Windows Azure provides an option to store session state persistently in
Windows Azure Storage tables.) And moving an on-premises SQL Server database to SQL Azure is
usually straightforward.
An ASP.NET application with multiple instances that maintains per-instance state and relies on sticky
sessions. Because it maintains client-specific state in each instance between requests, this application
will need some changes. Windows Azure doesn’t support sticky sessions, and so making the
application run on this cloud platform will require redesigning how it handles state.
A Silverlight or Windows Presentation Foundation (WPF) client that accesses WCF services running in
a middle tier. If the services don’t maintain per-client state between calls, moving them to Windows
Azure is straightforward. The client will continue to run on user desktops, as always, but it will now
call services running on Windows Azure. If the current services do maintain per-client state, however,
they’ll need to be redesigned.
An application with a single instance running on Windows Server that maintains state on its own
machine. Whether the clients are browsers or something else, many enterprise applications are built
this way today, and they won’t work well on Windows Azure without some redesign. It might be
possible to run this application unchanged in a single VM role instance, but its users probably won’t
be happy with the results. For one thing, the Windows Azure SLA doesn’t apply to applications with
only a single instance. Also, recall that the fabric controller can at any time reboot the machine on
which this instance runs to update that machine’s software. The application has no control over when
this happens; it might be smack in the middle of a workday. Since there’s no second instance to take
over—the application wasn’t built to follow the rules of the Windows Azure programming model—it
will be unavailable for some period of time, and so anybody using the application will have their work
interrupted while the machine reboots. Even though the VM role makes it easy to move a Windows
Server binary to Windows Azure, this doesn’t guarantee that the application will run successfully in its
new home. The application must also conform to the rules of the Windows Azure programming
model.
A Visual Basic 6 application that directly accesses a SQL Server database, i.e., a traditional
client/server application. Making this application run on Windows Azure will most likely require
rewriting at least the client business logic. While it might be possible to move the database (including
any stored procedures) to SQL Azure, then redirect the clients to this new location, the application’s
desktop component won’t run as is on Windows Azure. Windows Azure doesn’t provide a local user
interface, and it also doesn’t support using Remote Desktop Services (formerly Terminal Services) to
provide remote user interfaces.
Windows Azure can help developers create better applications. Yet the improvements it offers require
change, and so moving existing software to this new platform can take some effort. Making good
decisions requires understanding both the potential business value and any technical challenges that
moving an application to Windows Azure might bring.
CONCLUSION
Cloud platforms are a new world, and they open new possibilities. Reflecting this, the Windows Azure
programming model helps developers create applications that are easier to administer, more available, and
more scalable than those built in the traditional Windows Server environment. Doing this requires following
three rules:
A Windows Azure application is built from one or more roles.
A Windows Azure application runs multiple instances of each role.
A Windows Azure application behaves correctly when any role instance fails.
Using this programming model successfully also requires understanding the changes it brings to how applications
interact with the operating system, use persistent storage, and communicate between role instances. For
developers willing to do this, however, the value is clear. While it’s not right for every scenario, the Windows
Azure programming model can be useful for anybody who wants to create easier to administer, more available,
and more scalable applications.
FOR FURTHER READING
Introducing Windows Azure: http://go.microsoft.com/?linkid=9682907
Introducing the Windows Azure Platform: http://go.microsoft.com/?linkid=9752185