DevOps.com


Business Continuity in the Azure Cloud: Understanding the Options

By: Dave Bermingham on February 21, 2019

There are numerous ways to assure business continuity for applications running in the Azure cloud through various high availability and disaster recovery provisions. But selecting the best and most cost-effective provisions for each application can be extraordinarily difficult, owing to the myriad choices available. At best, a poor choice made during development wastes money. At worst, it causes failover provisions to fail when they are needed in production.


All business continuity provisions involve hardware and software redundancy, data replication and some means of failover and failback. Purpose-built failover clustering software has long been among the most popular choices based on its proven dependability and cost-effectiveness. The clusters are relatively easy to deploy in an enterprise data center using shared storage. But with no shared storage available in the public cloud, configuring failover clusters in Azure becomes considerably more challenging.

This article examines the options available for high availability (HA) and disaster recovery (DR) provisions within and for the Azure cloud. Special emphasis is given to SQL Server as a particularly popular application for Azure.

Options Available Within the Azure Cloud

The Azure cloud offers redundancy at three layers: within data centers, within regions and across multiple regions. Within data centers, Availability Sets distribute redundant servers across fault domains located in different racks. This protects against failures at the server and rack levels, but not against a sitewide failure such as the one that occurred in September 2018 in Azure’s South Central US Region. The 99.95 percent Service Level Agreement (SLA) guarantees only that, in an Availability Set with two or more servers, at least one will have external connectivity; it does nothing to assure availability at the application level.
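To make the fault-domain idea concrete, here is a minimal sketch of why spreading redundant servers across racks helps. It is an illustrative model only: Azure performs the real placement itself, and the server names, the round-robin rule and the fault domain count of two are assumptions made for this example.

```python
from collections import defaultdict


def place_round_robin(servers, fault_domain_count):
    """Assign each server to a fault domain in round-robin order (illustrative rule)."""
    placement = defaultdict(list)
    for index, server in enumerate(servers):
        placement[index % fault_domain_count].append(server)
    return dict(placement)


def survivors(placement, failed_domain):
    """Return the servers still running after one fault domain (rack) fails."""
    return [
        server
        for domain, members in placement.items()
        if domain != failed_domain
        for server in members
    ]


# Hypothetical two-node SQL Server pair spread over two fault domains.
placement = place_round_robin(["sql-a", "sql-b"], fault_domain_count=2)
for failed in placement:
    print(f"fault domain {failed} down -> still up: {survivors(placement, failed)}")
```

Whichever single rack fails, one server remains. The model also shows the limit noted above: if every fault domain sits in the same data center, a sitewide failure takes them all out at once.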

To protect against sitewide failures, Azure offers Availability Zones (AZs). Regions with AZs have at least three data centers interconnected via a high-bandwidth, low-latency network that supports synchronous replication. Azure offers a 99.99 percent SLA for AZs but, again, guarantees only that at least one server will have external connectivity, nothing less and nothing more.
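It can help to translate the two SLA percentages quoted above into permitted downtime. The sketch below annualizes them for intuition; the function name is hypothetical, and the actual measurement period and credit terms are whatever Microsoft's SLA documents define.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored for simplicity


def allowed_downtime_hours(sla_percent):
    """Maximum yearly downtime permitted by an uptime SLA, in hours."""
    return (100 - sla_percent) / 100 * HOURS_PER_YEAR


for label, sla in [("Availability Set", 99.95), ("Availability Zones", 99.99)]:
    hours = allowed_downtime_hours(sla)
    print(f"{label}: {sla}% -> {hours:.2f} h/year ({hours * 60:.0f} min)")
```

On this reckoning 99.95 percent allows about 4.4 hours of lost connectivity per year and 99.99 percent allows a little under an hour. Neither figure says anything about whether the application itself is up.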

For redundancy during major disasters, Azure offers Region Pairs, in which each region is paired with another in the same geography (e.g., United States or Europe). The regions are separated by at least 300 miles to protect against widespread disasters that might impact an entire region, including all of its AZs. By pairing regions, Microsoft is able to apply updates to one region at a time to prevent the “update gone bad” scenario, and it will prioritize the recovery of at least one region in each pair during an Azure-wide outage. But again, Azure only guarantees “dial tone” for the servers, leaving it to customers to ensure availability at the application level.

Options Available in OS and SQL Server Software

Windows Server Failover Clustering (WSFC) is a standard operating system feature that is utilized by many applications to provide HA protection in enterprise data centers. However, WSFC requires some form of shared storage, which historically has not been available in any public cloud, including Azure’s.

Microsoft addressed this problem in the Datacenter Edition of Windows Server 2016 by adding Storage Spaces Direct (S2D), a software-defined, virtual storage area network. But because the cluster must reside entirely within a single data center, S2D is incompatible with Availability Zones. Applications that require multisite HA/DR protection will, therefore, need to use third-party failover clustering software, log shipping or some other additional provision(s).

With no equivalent to WSFC or S2D for Linux, HA/DR protection requires either the use of open source software, such as Pacemaker, or a third-party failover clustering solution. Because supporting open source software requires a substantial and ongoing commitment, only the largest organizations have the wherewithal to even consider this do-it-yourself option.

SQL Server, whether for Windows or Linux, offers two of its own HA/DR features: Failover Cluster Instances (FCIs) and Always On Availability Groups. FCIs afford two major advantages: inclusion in the Standard Edition, and protection for the entire SQL Server instance, including system databases. A notable disadvantage is the need for cluster-aware shared storage; the virtual variety provided by S2D is supported only for SQL Server 2016 and later.

Always On Availability Groups is SQL Server’s more robust HA/DR offering, capable of delivering recovery times of 5-10 seconds and recovery points of seconds or less. Its disadvantages include the lack of protection for the entire SQL instance and the need to license the more expensive Enterprise Edition, which can be cost-prohibitive for many applications.
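The trade-offs between the two SQL Server features can be summarized as a small lookup. This is a sketch that encodes only the characteristics described in this article; the class, field and function names are hypothetical, and current Microsoft licensing and feature documentation should be checked before relying on any of it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SqlHaOption:
    name: str
    needs_enterprise_edition: bool
    protects_whole_instance: bool
    needs_shared_storage: bool


# Characteristics as described in this article, not an authoritative feature matrix.
OPTIONS = [
    SqlHaOption("Failover Cluster Instance", False, True, True),
    SqlHaOption("Always On Availability Group", True, False, False),
]


def candidates(has_enterprise_license, must_protect_instance, has_shared_storage):
    """Return the names of the options that satisfy the stated constraints."""
    return [
        option.name
        for option in OPTIONS
        if (has_enterprise_license or not option.needs_enterprise_edition)
        and (option.protects_whole_instance or not must_protect_instance)
        and (has_shared_storage or not option.needs_shared_storage)
    ]


# Standard Edition shop that needs system databases protected and has S2D storage.
print(candidates(has_enterprise_license=False,
                 must_protect_instance=True,
                 has_shared_storage=True))
```

An empty result, for instance when there is neither shared storage nor an Enterprise license, is the situation in which an additional provision has to be considered.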

A significant disadvantage with all application-specific options is the need for DevOps staff to use different HA and/or DR solutions for different applications. Having multiple HA/DR solutions inevitably increases complexity and costs, making this another reason why application-agnostic third-party solutions are so popular.

The Third-party Failover Clustering Software Option

Being agnostic with respect to both applications and platforms enables purpose-built failover clustering software to provide a complete HA/DR solution for virtually all Windows and Linux applications. Application-agnosticism eliminates the need to have different HA/DR solutions for different applications. Platform-agnosticism makes it possible to leverage, while not being dependent upon, various capabilities and services within the Azure cloud, making this option suitable for use in private, public and hybrid cloud environments.

Failover clustering solutions include, at a minimum, real-time data replication, continuous monitoring for detecting failures at the application level and configurable policies for failover/failback. All are designed to satisfy mission-critical recovery time and recovery point objectives, and most also offer a variety of value-added capabilities to simplify implementation and management.
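The monitoring and policy pieces can be pictured with a minimal sketch. It assumes a policy of failing over after a configurable number of consecutive failed application health checks. The class names, node names and threshold are hypothetical and do not describe any particular product; real clustering software also handles replication state, quorum and failback.

```python
from dataclasses import dataclass


@dataclass
class FailoverPolicy:
    failure_threshold: int = 3  # consecutive failed checks before failing over


class ClusterMonitor:
    """Tracks application health checks and decides when to fail over."""

    def __init__(self, nodes, policy):
        self.nodes = list(nodes)
        self.active = self.nodes[0]  # the first node starts as the active one
        self.policy = policy
        self.consecutive_failures = 0

    def record_check(self, healthy):
        """Feed one health-check result; return the node that is active afterwards."""
        if healthy:
            self.consecutive_failures = 0
            return self.active
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.policy.failure_threshold:
            self._fail_over()
        return self.active

    def _fail_over(self):
        # Promote the next node in the list. A real product would first
        # confirm that the standby's replicated data is current.
        current = self.nodes.index(self.active)
        self.active = self.nodes[(current + 1) % len(self.nodes)]
        self.consecutive_failures = 0


monitor = ClusterMonitor(["node-a", "node-b"], FailoverPolicy(failure_threshold=3))
for result in [True, False, False, False]:
    active = monitor.record_check(result)
print(active)
```

Requiring several consecutive failures is one simple way to avoid failing over on a single transient glitch. The cost is a slower reaction, which counts against the recovery time objective.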

Clustering with Confidence

Whether used individually or in various combinations, all of these options can have a role to play in making HA and DR protections more effective and more affordable for all applications—from those that can tolerate some downtime, to those that demand five 9s of uptime. But be sure that the options chosen afford protection at the application level for all likely failure scenarios.
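To see how demanding five 9s is, consider how many failovers fit in a year's downtime budget. The sketch below is plain arithmetic under stated assumptions: the function names are hypothetical, and the 10-second recovery time is simply the upper end of the range quoted earlier for Always On Availability Groups.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # leap years ignored for simplicity


def downtime_budget_seconds(nines):
    """Yearly downtime allowed by an availability target of N nines."""
    return SECONDS_PER_YEAR / 10 ** nines


def failovers_within_budget(nines, recovery_seconds):
    """Whole failovers per year that fit inside the downtime budget."""
    return int(downtime_budget_seconds(nines) // recovery_seconds)


budget = downtime_budget_seconds(5)
print(f"five 9s budget: {budget:.2f} s/year")
print(f"10-second failovers that fit: {failovers_within_budget(5, 10)}")
```

Five 9s leaves a little over five minutes of downtime per year. That is roughly 31 ten-second failovers and nothing else, so any outage that requires manual intervention will almost certainly exceed the budget.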

— Dave Bermingham

© 2023 · Techstrong Group, Inc. All rights reserved.