Azure App Dev Best Practices July 2019
Page 1: Azure App Dev Best Practices - Microsoft · 2020. 11. 30.

Azure App Dev

Best Practices

July 2019

Page 2:

2 | Copyright © 2020 HCL Technologies Limited | www.hcltech.com

ADvantage Azure: The Key Areas

• Automate X
• Source Control
• CI/CD
• Web Development
• Single Sign-on
• Data Storage
• Data Partition Strategies
• Design to detect Failures
• Monitoring & Telemetry
• Transient Fault Handling
• Distributed Caching
• Security

Page 3:

Automate X (i.e., whatever can be automated)


• X = DevOps Workflows – To increase certainty in shorter release cycles; this ensures repeatability, reliability and predictability.

• X = Azure Management Scripts – Script-based automation through APIs using Windows PowerShell or open-source frameworks such as Puppet or Chef. Alternatively, the .NET management API can be used to write code instead of scripts.

• X = Environment Creation Scripts – To create an Azure environment that one can deploy apps to for testing. This is the main script; it calls the scripts for all of the following.

• X = Storage Account Creation and Database Creation Scripts – These are created by the main script

• X = Store App Settings and Connection Strings

• X = Prepare to Deploy – The environment creation script calls two functions to create files that will be used by the deployment script

• X = Troubleshooting & Error Handling – When the scripts fail, we should turn on verbose output so we can learn about the failure and what caused it. For this reason, the environment creation script changes the value of the $VerbosePreference variable from SilentlyContinue to Continue.

• X = Deployment Script - The deployment script gets the name of the website from the website-environment.xml file created by the environment creation script

Note: This may be extended to Databases too. This is touched upon in the Data Storage section of this POV.
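The deployment-script step above can be sketched in a few lines: read the website name out of the website-environment.xml file that the environment creation script produces. The XML layout shown here (a websiteName element under the root) is an assumption for illustration, not the actual file format used by the scripts.

```python
# Hedged sketch: parse website-environment.xml for the site name.
# The element name "websiteName" is assumed, not taken from this deck.
import xml.etree.ElementTree as ET

def read_website_name(path: str) -> str:
    """Return the website name recorded by the environment creation script."""
    root = ET.parse(path).getroot()
    node = root.find("websiteName")
    if node is None or not (node.text or "").strip():
        raise ValueError(f"websiteName element missing or empty in {path}")
    return node.text.strip()
```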

Page 4:

Source Control


• Automation Scripts – All automation scripts should be kept under source control. All the scripts that one uses to create the environment, to deploy to it, to scale it, and so on need to be in sync with the application source code.

• Secrets should not be checked in – If scripts rely on secrets such as passwords, parameterize those settings so that they don’t get saved in source code, and store all secrets somewhere else.
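One way to parameterize secrets as advised above is to resolve them from the environment at run time so that nothing sensitive lands in the repository. A minimal Python sketch; the variable name APP_DB_PASSWORD is illustrative:

```python
# Read a secret from the environment instead of checking it into source control.
import os
from typing import Optional

def get_secret(name: str, default: Optional[str] = None) -> str:
    """Fetch a secret configured outside source control (e.g. an env variable)."""
    value = os.environ.get(name, default)
    if value is None:
        raise RuntimeError(f"secret {name} is not configured; set it outside source control")
    return value
```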

• Structure source branches to facilitate the DevOps workflow – Different patterns should be used for mid-sized deployments (e.g. the usual master, dev and staging branches) and large-sized deployments (e.g. feature toggles/feature flags).

• Store sensitive data in Azure – One way to avoid storing credentials in source control is to store them in Azure instead. Microsoft Visual Studio Online offers both Git (a Distributed Version Control System, DVCS) and Team Foundation Version Control (TFVC).

• Governance - Measuring the success of the source control system based on how quickly one can make a change and get it live in a safe and predictable way is critical from a Governance perspective.

o Note: If the development team is not confident about making a change because of the ensuing manual testing effort, which is time-consuming, it is time to examine what needs to be done process-wise or test-wise so that the change can be made in minutes or, at worst, no longer than an hour. One strategy for doing that is to implement continuous integration and continuous delivery, which is covered next.

Page 5:

CI/CD


• CI/CD Workflows – These workflows comprise five key steps: 1. The delivery team checks code into version control; 2. Version control triggers the build and unit tests; 3. Automated test scripts are triggered; 4. User acceptance testing is triggered; and 5. Release is triggered.

• Version Everything that affects Production – Apart from source code, also version tests, scripts, configuration files, etc.

• Shorten the pipeline – Rather than going linear, reduce dependencies; e.g. UAT, capacity testing and some features of Production can be released in parallel to reduce cycle time, as part of a Fail Fast and Fail Often strategy.

• Provide Visibility to all the team members and Surface the common pains – Perform the less critical analyses early and often.

• Orchestrating the release pipeline – Trigger pipelines on small check-ins; keep pipeline instances small; run steps in parallel; stop the pipeline on a failure; build only once; use environment-agnostic binaries; standardize deployments; deploy to a copy of Production.
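The "stop the pipeline on a failure" rule above can be sketched as a tiny orchestrator. The stage names mirror the five workflow steps on this slide; the boolean success convention is an illustrative assumption, not any particular CI product's API:

```python
# Run pipeline stages in order and halt at the first failure.
from typing import Callable, List, Tuple

def run_pipeline(stages: List[Tuple[str, Callable[[], bool]]]) -> List[str]:
    """Return the names of stages that completed; stop at the first failing stage."""
    completed = []
    for name, stage in stages:
        if not stage():        # False signals a failed stage
            break              # stop the pipeline on a failure
        completed.append(name)
    return completed
```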

• Automating the release pipeline – Automate deployments and tests; deploy the same way to every environment; tokenize configurations; automate the BVTs; use one-click deployments; leave the environment in a known state; opt for a viable testing orchestration model; have a rollback mechanism available; lock down the environments; make deployment scripts granular; adopt Trey Research (Issue, Cause & Solution).

• Getting feedback and improving the pipeline – In and from the pipeline: metrics for the release process, visuals to display data, canary releases, A/B testing, Blue/Green deployments, error detection in the Production environment, telemetry and analytics, causing random failures, and so on.

Page 6:

Web Development


• Stateless Web Servers – Web servers should not store any application data in server memory or the file system. If the web tier is stateless and sits behind a load balancer, we can quickly respond to changes in application traffic by dynamically adding or removing servers, making it easier to scale out. Cloud servers, like on-premises servers, need to be patched and rebooted occasionally; if the web tier is stateless, rerouting traffic when a server goes down temporarily won't cause errors or unexpected behavior.

• No Session State – It's often not practical in a real-world cloud app to avoid storing some form of state for a user session, but some approaches impact performance and scalability more than others. If we have to store state, the best solution is to keep the amount of state small and store it in cookies. If that isn't feasible, the next best solution is to use ASP.NET session state with a provider for a distributed, in-memory cache. The worst solution from a performance and scalability standpoint is to use database-backed session state.

• CDN to store static file assets – Static file assets such as images and script files may be sent to a CDN provider, which caches these files in data centers all over the world so that wherever people access your application, they get relatively quick responses and low latency for the cached assets. This speeds up the overall load time of the site and reduces the load on your web servers. Good for a geographically distributed audience.

• Use .NET 4.5 Async Support & Entity Framework 6 – In ASP.NET 4.5, support for asynchronous programming has been added not just to the language but also to the MVC, Web Forms, and Web API frameworks. For example, an ASP.NET MVC controller action method receives data from a web request and passes the data to a view, which then creates the HTML to be sent to the browser. As part of .NET 4.5, Microsoft provided async support for web service calls, sockets, and file system I/O, but the most common pattern for web applications is to hit a database, and Microsoft’s data libraries didn’t support async programming for this situation. Entity Framework 6 now supports it.
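The async pattern described above is .NET-specific, but the same idea can be sketched with Python's asyncio: while the (simulated) database call is awaited, the worker is free to serve other requests instead of blocking. The function names here are illustrative.

```python
# Async request handling: the await frees the thread during (simulated) DB I/O.
import asyncio

async def query_database(product_id: int) -> dict:
    await asyncio.sleep(0.01)                  # stands in for an async database call
    return {"id": product_id}

async def handle_request(product_id: int) -> dict:
    product = await query_database(product_id)  # controller awaits the data...
    return {"status": 200, "body": product}     # ...then builds the response

async def serve_all() -> list:
    # several requests overlap on a single thread instead of blocking it
    return await asyncio.gather(*(handle_request(i) for i in range(3)))
```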

Page 7:

Single Sign-On


• Adding a Global Administrator to your AD – Azure AD enables you to make enterprise line-of-business (LOB) apps available over the Internet, and it enables you to make these apps available to business partners as well. Enabling Windows Azure Authentication configures your application to authenticate users using a single Azure Active Directory tenant.

• Create an ASP.NET application with the authentication choice set to Organizational. Choose the “Host in the Cloud” option.

• Leverage Graph API – To call the Graph API, you first need to retrieve a token. When the token is retrieved, its string value must be appended in the Authorization header for all subsequent requests to the Graph API.
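The token-in-header rule above can be sketched with Python's standard library. The URL is a placeholder (token retrieval is not shown, and nothing is actually sent here):

```python
# Build a Graph API request with the token appended in the Authorization header.
import urllib.request

def graph_request(url: str, access_token: str) -> urllib.request.Request:
    """Attach the retrieved token as a Bearer credential (request is not sent)."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
    )
```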

• Federated Identity Pattern – Implement an authentication mechanism that can use federated identity. Separating user authentication from the application code, and delegating authentication to a trusted identity provider, can considerably simplify development and allow users to authenticate using a wider range of identity providers (IdPs) while minimizing the administrative overhead. It also allows you to clearly decouple authentication from authorization.

• ADFS 2.0 – The primary advantage of AD FS is that it enables organizations to use federated partnerships when they plan for cross-organizational identity. You can think of federation as two organizations agreeing to facilitate user access and support sharing of collaborative applications, with each organization retaining responsibility for managing its own IT assets and user accounts. SSO simplifies access for trusted users, enabling them to reuse the session credentials that they have already applied and authenticated to reauthorize access to all other resources in a networked environment.

Page 8:

Data Storage


• Data Storage Options On Azure – Relational, Key/Value, Column Family, Document and Graph

• Relational – Azure SQL, SQL Server, Oracle, MySQL, SQL Compact, SQLite, Postgres; Key/Value – Azure Blob, Azure Table, Azure Cache, Redis, Memcached and Riak; Column Family – Cassandra, HBase; Document – Mongo, Raven, Couch; Graph – Neo4j

• Hadoop and MapReduce – The high volumes of data that one can store in NoSQL databases may be difficult to analyze efficiently in a timely manner. To perform this type of analysis, we can use a framework such as Hadoop, which implements MapReduce functionality. On Azure, HDInsight enables you to process, analyze, and gain new insights from big data by using the power of Hadoop. As part of Automate X, most functions, including the setup and execution of HDInsight analysis jobs, can be automated. The steps are as follows: 1. Provision an HDInsight cluster; 2. Upload the MapReduce job executables to the HDInsight cluster; 3. Submit a MapReduce job; 4. Wait for the job to complete and delete the cluster; 5. Access the output from Blob storage. By running a script that performs these steps, one can minimize the amount of time that the HDInsight cluster is provisioned, which minimizes one’s costs.
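The five scripted steps above can be sketched as one orchestration function. All six callables passed in are hypothetical stand-ins for the real Azure PowerShell/SDK calls; the point is the flow, in particular deleting the cluster as soon as the job finishes to minimize cost.

```python
# Orchestrate an HDInsight MapReduce run; all helpers are hypothetical stand-ins.
def run_mapreduce_job(provision, upload, submit, wait, delete, fetch_output):
    cluster = provision()              # 1. provision an HDInsight cluster
    try:
        upload(cluster)                # 2. upload the MapReduce job executables
        job = submit(cluster)          # 3. submit a MapReduce job
        wait(job)                      # 4. wait for the job to complete...
    finally:
        delete(cluster)                # ...then delete the cluster to stop billing
    return fetch_output()              # 5. access the output from Blob storage
```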

• PaaS vs. IaaS – The PaaS option is easier and cheaper to manage, while IaaS gives almost unlimited data storage options.

• Choosing the right option – There are multiple factors that we need to assess before choosing the right option: data semantics, query support needed, functional projection (e.g. aggregation on the server), ease of scalability, ease of deployment, API support, transactional integrity and data consistency, business continuity, and cost.

• SQL PaaS vs. SQL VM – Less Cost vs. More Control.

Note: We have not covered some best practices around Azure Blob Storage in this POV.

Page 9:

Data Partitioning Strategies


• The Options – Vertical, Horizontal and Hybrid

• Vertical Partitioning – In a cloud database, storage is relatively expensive, and a high volume of images or similar assets can make the size of the database grow beyond the limits at which it can operate efficiently. We can address these problems by partitioning the data vertically, which means we choose the most appropriate data store for each column in the table of data. What might work best for this example is to put the string data in a relational database and the images in Blob storage.

• Horizontal Partitioning (“sharding”) – Horizontal partitioning is like splitting up a table by rows: one set of rows goes into one data store, and another set of rows goes into a different data store. We have to be very careful about our sharding scheme to be sure that the data is evenly distributed, to avoid hot spots. Otherwise we might hit table size limitations earlier than expected, because some databases would become very large while most would remain small. A downside of horizontal partitioning is that it might be hard to run queries across all the data.
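One common sharding scheme for the horizontal case above can be sketched in a few lines: hash the row key to pick a shard, so the same key always lands on the same database and keys spread roughly evenly, avoiding the hot spots warned about. The shard names are illustrative.

```python
# Hash-based shard routing for horizontally partitioned ("sharded") data.
import hashlib

SHARDS = ["customers-db-0", "customers-db-1", "customers-db-2"]

def shard_for(key: str) -> str:
    """Map a row key to a shard deterministically and roughly evenly."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]
```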

• Hybrid Partitioning – We can combine vertical and horizontal partitioning. For example, we could store the images in Blob storage and horizontally partition the string data.

• Conclusion – Any partitioning scheme increases code complexity and introduces many new complications that we have to deal with. If we're moving images to Blob storage, what do we do when the storage service is down? How do we handle blob security? What happens if the database and Blob storage get out of sync? If we are sharding, how do we handle querying across all the databases? In summary, an effective partitioning scheme can enable your cloud app to scale to petabytes of data without bottlenecks, but proper planning is critical to succeed.

Page 10:

Design to detect and survive Failures


• Two types of Failure – Transient (that goes away without intervention) and Enduring (that needs intervention)

• Transient Failure – A few ways to manage this: retry/back-off logic (instead of throwing an exception right away). Entity Framework 6 builds this kind of retry logic right into the framework. For SQL Database exceptions that the framework identifies as typically transient errors, the code shown instructs EF to retry the operation up to three times, with an exponential back-off delay between retries and a maximum delay of 5 seconds. Exponential back-off means that after each failed retry, the app waits for a longer period before trying again. If three tries in a row fail, the app throws an exception. If we run into issues when using the Azure Storage service, the .NET storage client API already implements the same kind of logic; we just specify the retry policy, unless the default settings are not enough. The Circuit Breaker pattern is another approach: at a certain retry threshold the app stops retrying and takes some other action, such as one of the following: custom rollback, fail silently, or fail fast (error out the user before other requests flood the service with similar requests).
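The retry/back-off behaviour described above (up to three retries, exponential delay, capped at 5 seconds) can be sketched as a plain helper. This illustrates the policy only; it is not the EF6 implementation.

```python
# Retry with exponential back-off: wait longer after each failed attempt.
import time

def with_retries(operation, retries=3, base_delay=0.5, max_delay=5.0):
    """Run operation; on failure retry up to `retries` times with capped back-off."""
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == retries:
                raise                                   # retries exhausted: surface it
            time.sleep(min(base_delay * 2 ** attempt, max_delay))  # capped back-off
```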

• Enduring Failure – Details are given in the Monitoring and Telemetry section of this POV.

• Failure Scope – Machine, Service and Regions

• Machine – In Azure, this is managed automatically. A failed server is automatically replaced by a new one, so this is not a concern for the application development team.

• Service – Cloud apps typically use multiple services. We can use the Queue-centric work pattern to address these challenges.

• Regions – This is a rare possibility. If we are running mission-critical applications that affect end users globally, it is possible to set up our app in Azure to run in multiple regions simultaneously so that if a disaster occurs in one, the app continues running in another region.

Note: More details on the management of transient failures can be found on this slide.

Page 11:

Monitoring & Telemetry


• Native Azure Options – A proactive approach is a must; otherwise, getting sufficient data to analyze incidents would be close to impossible. Azure provides the Application Insights feature, which helps dev teams collect data across several parameters, as follows:

o Request rates, response times, and failure rates - Find out which pages are most popular, at what times of day, and where your users are. See which pages perform best. If your response times and failure rates go up when there are more requests, then perhaps you have a resourcing problem.

o Dependency rates, response times, and failure rates - Find out whether external services are slowing the app down.

o Exceptions - Analyze the aggregated statistics or pick specific instances and drill into the stack trace and related requests, both server and browser exceptions.

o Page views and load performance - reported by your users' browsers.

o AJAX calls from web pages - rates, response times, and failure rates.

o User and session counts.

o Performance counters from your Windows or Linux server machines, such as CPU, memory, and network usage.

o Host diagnostics from Docker or Azure.

o Diagnostic trace logs from your app - so that we can correlate trace events with requests.

o Custom events and metrics that the Dev team can write in the client or server code, to track business events (e.g. items sold)

• Buy Options – There are a few good Microsoft partners in this space to consider: New Relic, AppDynamics, MetricsHub and Dynatrace.

• Logs – Configure logging levels (e.g. Error, Info, Warning and Debug) at run time through a logging framework (e.g. System.Diagnostics). Log inner exceptions too. Create a logger class (implementing the ILogger interface) to standardize the catching of exceptions. We can collect custom logs with the Log Analytics agent in Azure Monitor. Custom scripts can write data to Windows Events or Syslog, or send data to Azure Monitor through the HTTP Data Collector API.
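A hedged sketch of the logger-class advice above, using Python's standard logging module in place of System.Diagnostics/ILogger; it records the inner (chained) exception too, as recommended. The class name is illustrative.

```python
# Standardized exception logging, including the inner (chained) exception.
import logging
from typing import Optional

class AppLogger:
    def __init__(self, name: str = "app"):
        self._log = logging.getLogger(name)

    def error(self, message: str, exc: Optional[Exception] = None) -> str:
        """Log at ERROR level; append the exception and any inner exception."""
        if exc is not None:
            message += f" | exception: {exc!r}"
            inner = exc.__cause__ or exc.__context__   # Python's "inner exception"
            if inner is not None:
                message += f" | inner: {inner!r}"
        self._log.error(message)
        return message                                 # returned to ease testing
```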

Page 12:

Transient Fault Handling


• Causes – Failed and dropped database connections happen because apps go through more load balancers than in an on-premises environment. Also, when the app depends on a multitenant service, calls to the service sometimes become slower or time out because another tenant of the service is hitting it heavily. In some cases, the service deliberately throttles the app—denies connections—to prevent the app from adversely affecting other tenants of the service.

• Don’t Show “Not Available” or Error Messages – Instead, implement smart retry logic. This may be done in several ways:

1. Transient Fault Handling Application Block – First, the block includes logic to identify transient faults for several common cloud-based services (e.g. SQL Database, Azure Service Bus, Azure Storage Service and Azure Caching Service) in the form of detection strategies. These detection strategies contain built-in knowledge that can identify whether a particular exception is likely to be caused by a transient fault condition. Second, the application block enables you to define your retry strategies so that you can follow a consistent approach to handling transient faults in your applications. The specific retry strategy you use will depend on several factors; for example, how aggressively you want your application to perform retries, and how the service typically behaves when you perform retries. Some services can further throttle or even block client applications that retry too aggressively. A retry strategy defines how many retries you want to make before you decide that the fault is not transient or that you cannot wait for it to be resolved, and what the intervals should be between the retries.

2. Leveraging Entity Framework – When we use the Entity Framework, we typically are not working directly with SQL connections, so we cannot use the Patterns & Practices package, but Entity Framework 6 builds this kind of retry logic right into the framework. In a similar way, we specify the retry strategy, and then EF uses that strategy whenever it accesses the database.

3. Circuit Breakers – There are many reasons why we should not retry too many times, e.g. degrading the user experience, flooding the service, prolonging throttling, etc. Exponential back-off addresses some of these issues by limiting the frequency of retries that a service receives from your application. But we also need circuit breakers: at a certain retry threshold the app stops retrying and takes some other action, such as one of the following: 1. Custom fallback – if we cannot get data from the database, maybe we can get it from cache (the strategy on the next slide). 2. Fail silently – if what we need from a service is not all-or-nothing, just return null when we cannot get the data. 3. Fail fast – error out the user to avoid flooding the service with retry requests that could cause service disruption for other users or extend a throttling window; we can display a friendly “try again later” message. There is no one-size-fits-all retry policy; it depends on multiple factors, e.g. the use case, non-functional requirements, etc.
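The circuit-breaker behaviour above in sketch form: after a threshold of consecutive failures the breaker opens and returns the fallback immediately instead of hitting the struggling service again. A production breaker would also re-close after a cool-off period, which is omitted here.

```python
# Minimal circuit breaker: stop calling a failing service past a threshold.
class CircuitBreaker:
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.threshold

    def call(self, operation, fallback):
        if self.is_open:
            return fallback()          # fail fast: don't touch the service at all
        try:
            result = operation()
        except Exception:
            self.failures += 1         # count this failed attempt
            return fallback()          # fail silently this time
        self.failures = 0              # success closes the breaker again
        return result
```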

Page 13:

Distributed Caching (for better Performance)


• Why Distributed Caching? – A cache provides high-throughput, low-latency access to commonly accessed application data by storing the data in memory. For a cloud app, the most useful type of cache is a distributed cache, which means that the data is not stored in an individual web server's memory but on other cloud resources, and the cached data is made available to all of an application's web servers (or other cloud VMs used by the application).

• When to use – Caching works best for application workloads that do more reading than writing of data, and when the data model supports the key/value organization used to store and retrieve data in the cache. Caching is also more useful when application users share a lot of common data; an example where caching could be very beneficial is a product catalog. The benefit of caching becomes increasingly measurable the more an application scales, because the throughput limits and latency delays of the persistent data store become more of a limit on overall application performance. We may implement caching for reasons other than performance as well: for data that doesn't have to be up to date when shown to a user, cache access can act as a circuit breaker when the persistent data store is unresponsive or unavailable.

• Ways to use –

1. On demand/cache aside - The application tries to retrieve data from cache, and when the cache doesn't have the data (a “miss”), the application stores the data in the cache so that it will be available the next time. The next time the application tries to get the same data, it finds what it's looking for in the cache (a “hit”). To prevent fetching cached data that has changed in the database, we invalidate the cache when making changes to the data store.

2. Background data push - Background services push data into the cache on a regular schedule, and the app always pulls from the cache. This approach works great with high-latency data sources that do not require that we always return the latest data.

3. Circuit breaker - The application normally communicates directly with the persistent data store, but when the persistent data store has availability problems, the application retrieves data from cache. Data may have been put in cache using either the cache aside or background data push strategy. This is a fault-handling strategy rather than a performance-enhancing strategy.
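The cache-aside flow from item 1 above can be sketched in a few lines. The dict stands in for a distributed cache such as Redis, and the load/save callables stand in for the persistent data store.

```python
# Cache-aside: read through the cache, invalidate on writes.
def get_product(cache, product_id, load_from_db):
    if product_id in cache:                  # hit: serve from memory
        return cache[product_id]
    value = load_from_db(product_id)         # miss: go to the persistent store
    cache[product_id] = value                # populate for the next reader
    return value

def update_product(cache, product_id, value, save_to_db):
    save_to_db(product_id, value)
    cache.pop(product_id, None)              # invalidate so stale data isn't served
```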

Page 14:

Security

Security Policy
• Enable automatic provisioning of the monitoring agent – keep “Data Collection” on
• System Updates to be kept on
• OS vulnerabilities to be kept on
• Endpoint protection to be set on
• Disk Encryption on
• Network Security Groups to be on
• Web Application Firewall to be set to on
• Many others

IAM
• Multi-factor authentication enabled
• No guest users
• Disable MFA on devices that users trust
• No. of reset methods set to 2
• No. of days before users are asked to reconfirm to be 0
• Notify users on password resets – Yes
• User permission to access company data – to be set to No
• Many other user rights to be set to No

Storage Accounts
• Allow requests to the storage account only over a secure connection – turn “secure transfer required” on. When you are using the Azure Files service, connection without encryption will fail, including scenarios using SMB 2.1, SMB 3.0 without encryption, and some flavors of the Linux SMB client.
• “Storage service encryption” to be set to enabled. Enable data encryption at rest for blobs.

SQL Services
• Auditing to be set to on, and auditing type – blob
• Threat detection on
• Threat detection types to be set to “all”
• On SQL Servers, email service and co-admin to be enabled
• Enable all threat detection types
• Proper firewall rules to be set
• “Send alerts to” – On
• Transparent data encryption on SQL DBs
• Others

Virtual Machine
• Install endpoint protection for VMs
• Enable latest patch updates
• Enforce disk encryption
• Enable VM Agent

Others
• Minimize the number of admins/owners with access to subscriptions
• Remove deprecated/stale accounts
• No permission to external accounts
• Service accounts not to support MFA
• ASC (Security Center) must be configured
• Critical application resources should be protected using a resource lock
• Others

Others (continued)
• Secure subscription – configure security elements such as alerts, ARM policies, RBAC, Security Center policies, JEA, Resource Locks, etc.
• Secure cloud development

Note: We do not provide services on Networking, so that aspect has been kept out of this POV.

Page 15:

$9.7 BILLION | 149,000+ IDEAPRENEURS | 45 COUNTRIES

www.hcltech.com