The Str8 Deal – Read Before Using Microsoft Azure Backup Services or Site Recovery

The Str8 Deal offers “Unreserved Words”, penned directly from the experts at Liquid Mercury Solutions. The opinions expressed in this article are ours and ours alone.

We’ve read many, many articles and walkthroughs stating that Microsoft Azure Backup Server (MABS) can be used for a variety of recovery needs, including backup of system state and even Active Directory. In this blog post, we will debunk this myth based on our real-life experiences, and explain a methodology that will actually work for protecting your systems and domain from a variety of risks. We start with the common wisdom regarding Azure Backup and Site Recovery. There is a lot of material out there describing how you can protect your on-premises data with Azure Backup Services. We have been using this solution in-house and also running it for several of our customers.

MABS is billed as having the primary advantage of being essentially “free” except for whatever storage you require. Indeed, installing a MABS server onto whatever hardware you have lying around will cost you nothing extra beyond the Windows licenses you already have and whatever drives you need to add for on-premises disk-based backups. Moreover, it’s a great feature that MABS manages your data so that local copies are maintained and replicated to the cloud in accordance with your retention policies. If you are lucky enough to have machines that have already been upgraded to Windows Server 2016, then you will be able to take advantage of MABS v3, including better data de-duplication technology, support for ReFS, and more. It is said that (especially when combined with Azure Site Recovery for your VMs) MABS will provide you with a level of protection that will allow you to rest easy knowing that your company’s data is safe.

Why the Common Wisdom is Wrong

There are so many things wrong with these assertions that it is difficult to know where to begin. We’ll start by simply stating that, when the rubber hits the road, things are often not as simple as they appeared to be while you were getting directions from Google Navigator. Let’s go through the list of all the things that prove the above dream to be little more than a mirage.

Recovery Is Not The Same as Having a Backup

Backing up and restoring a few files or even an entire drive is one thing. Bringing systems back into a working state where they can actually be used is a whole other ball of wax. Much like the Hotel California, your backup data can check into Azure any time it wants, but getting things back to a working state will likely prove more difficult than you ever imagined. Compounding this problem is the fact that Microsoft provides loads of documentation about the process for creating backups, but guidance is comparatively sparse with respect to restoration scenarios. A few restoration cases are documented, which is good if your situation matches exactly what they describe. Sadly, we’ll soon learn that you can’t take even this for granted, since, as Fred Brooks once famously said, “The documentation lies.” So, if you’ve never done a disaster recovery simulation (fire drill), you should do one soon. More than likely, you will come across hidden obstacles that need to be dealt with.

Connectivity Between Azure and Your Network

Many aspects of Azure Backup and Recovery have VPN connectivity as an unstated requirement, yet no documentation we have found mentions this. If you haven’t configured a Site-to-Site (S2S) VPN connection for your Azure tenant yet, there are many good reasons to do it that go far beyond backup and recovery concerns. That said, we have many clients who have used Azure in ways where they were never forced to configure VPN connectivity. Setting up S2S ensures that on-premises workers can access VMs in Azure on equal footing with those in the office, which will be important if you want to do useful things like running an off-site backup domain controller, a robust file server, etc.

Even if you already have S2S configured, it’s still very important to consider how you are accomplishing this connection. If you’re using RRAS to make the connection, do your disaster recovery plans take into account that the connection will most likely be lost when your Windows Server is not working correctly? If you’re lucky enough to have dedicated hardware to support the IPsec tunnel to Azure, is it located in the same server closet as the rest of your servers, and does it have any backup device at all? These things may come into play if you’re facing fire, flood, or electrical problems. Strictly speaking, of course, MABS doesn’t require the S2S VPN connection to do its work. It will happily back the data up non-stop. You only encounter the need for this connectivity when you no longer have working hardware on-premises to run your DC or the MABS server.
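To make the shared-failure question concrete, here is a minimal Python sketch of the kind of check we mean. Every component name, location, and dependency in it is hypothetical and only illustrates the reasoning; it does not query Azure or your network in any way.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One piece of the DR picture. All names used below are hypothetical."""
    name: str
    location: str  # e.g. "server-closet" or "azure"
    depends_on: list[str] = field(default_factory=list)


def lost_when(location_down: str, components: list[Component]) -> set[str]:
    """Return every component that stops working when one location fails.

    A component is lost if it lives in the failed location, or if anything
    it depends on is lost (re-checked until nothing new is found).
    """
    lost = {c.name for c in components if c.location == location_down}
    changed = True
    while changed:
        changed = False
        for c in components:
            if c.name not in lost and any(d in lost for d in c.depends_on):
                lost.add(c.name)
                changed = True
    return lost


# Hypothetical small-office layout: the S2S tunnel terminates on RRAS,
# which runs on the same Windows Server as everything else.
inventory = [
    Component("windows-server", "server-closet"),
    Component("rras-vpn", "server-closet", depends_on=["windows-server"]),
    Component("mabs", "server-closet", depends_on=["windows-server"]),
    Component("s2s-tunnel", "azure", depends_on=["rras-vpn"]),
    Component("azure-dc", "azure"),
    Component("backup-vault", "azure"),
]

print(sorted(lost_when("server-closet", inventory)))
```

In this made-up layout, losing the server closet also takes the tunnel down, while the cloud domain controller and the backup vault survive. That is the situation to plan for before the fire drill, not during it.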

That’s why we call this a hidden requirement; once you cross that Rubicon into a true DR scenario, you suddenly find yourself wishing you had the VPN and other servers set up in the cloud. Having the VPN available opens up the possibility of running a replica of Active Directory on Azure, which in turn makes it possible for VMs on the other side of the tunnel to join your domain. Such a setup also survives drops in VPN connectivity pretty well, provided that the connection can eventually be restored.

But… Being Domain Joined Isn’t Needed

Yeah, we hear a lot that you can restore data with MARS onto Azure VMs that are not joined to the on-premises domain. Strictly speaking, we suppose this is technically true. Let us share why it is a myth in practice. MARS (the Microsoft Azure Recovery Services agent) is the recovery agent for MABS. As such, it is not as feature-rich as the MABS server is. We would go so far as to say that it is actually a poorly documented and buggy piece of crap, but don’t let that scare you away from using it. MARS will get you through any case where you need to recover files or system state on the same machine they were backed up from. However, while Microsoft documentation demonstrates the ability to restore System State onto a different machine, our tests have shown that this does not work as expected. Instead of being presented with a set of backups from which to choose, you will either see no results or be given dropdown dialogs too narrow to determine which backup is which. Perhaps this worked better in older versions of Windows; who can say.

How to Overcome These Problems (If You Must)

The workaround for this annoyance is to install a secondary MABS server, attach it to the backup vault, restore the System State to a network share, and then use Windows Backup to restore it to the target machine. Needing MABS instead of MARS in turn means that the machine on which you run recovery operations must be joined to the domain. Simply put, it’s an installation requirement for MABS. Moreover, we tested this using just the S2S VPN back to the home office, and for whatever reason, the MABS installer simply refused to acknowledge that we were joined to the domain and running as a Domain Administrator unless we also had a local domain controller in Azure. Perhaps we could have overcome this. However, the addition of a live secondary DC is so important in its own right, why even bother? One thing to point out here is that a MABS install feels like it takes a fortnight compared to the MARS agent, so this is really something you should be doing pre-emptively and not while your boss is looking over your shoulder because the entire network is offline. You can install MABS to a VM, attach it to the backup vault, and then shut it down (deallocate it) so that it only incurs minimal storage costs. This is just good sense.
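For the last step of that workaround, the sketch below shows one way to assemble the restore command ahead of time. It assumes the Windows Backup step is done with the `wbadmin` command-line tool, which is our assumption and not something the steps above require. The function name and the example values (share path, machine name, version identifier) are hypothetical. The code only builds the argument list and does not run anything, so check the flags against `wbadmin start systemstaterecovery /?` on your own server before relying on it.

```python
def systemstate_recovery_cmd(version: str, share: str, machine: str) -> list[str]:
    """Build (but do not run) a wbadmin System State recovery command.

    version -- backup version identifier as listed by `wbadmin get versions`
    share   -- UNC path of the network share the System State was restored to
    machine -- name of the computer the backup was originally taken from
    """
    if not share.startswith("\\\\"):
        raise ValueError("share must be a UNC path such as \\\\server\\share")
    return [
        "wbadmin", "start", "systemstaterecovery",
        f"-version:{version}",
        f"-backupTarget:{share}",
        f"-machine:{machine}",
    ]


# Hypothetical values, for illustration only.
cmd = systemstate_recovery_cmd(
    version="01/31/2019-09:00",
    share=r"\\restore-host\mabs-restore",
    machine="DC01",
)
print(" ".join(cmd))
```

Keeping the command as a list rather than one long string makes it easy to paste into a runbook now and hand to a process launcher later, without worrying about quoting the share path.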

Some folks will note here that putting a VM into Azure to be a domain controller is redundant given the Azure AD DS service and shouldn’t be needed. We agree with this sentiment, but feel that a few things need to be mentioned. Firstly, Azure AD DS will provide a fault-tolerant two-server domain that you can join Azure VMs to in lieu of an on-premises domain. This is great, but the Azure AD DS domain is not the same as your on-premises Active Directory, if you have one; it is a replica of the Azure AD that comes from Office 365 or the Azure portal, not a replica of your domain controllers. It does not have the same capabilities as a real DC, nor would it have the same data in it; if you have AD locally, that is tied to Azure AD via AD Connect (sync), which in turn syncs to Azure AD DS. So, if you need the advanced capabilities of Active Directory (GPO, anyone?), fault tolerance for your primary DC, or systems in Azure joined to the same domain as on-premises, you will need a real VM to do this, not Azure AD DS. If none of these apply to you, feel free to use it.

As a final note, MABS v3 may have slightly different requirements. However, since our systems still include some Windows Server 2012 R2 machines and moving from MABS v2 to v3 is a breaking change, we have not been able to test it yet. We plan to do so in the near future and will report back in a future blog post for those who are interested. Hopefully after all this you can see that just because joining the domain isn’t required, that doesn’t mean that doing so and/or having a cloud-based domain controller is a bad idea; it may even prove to be necessary.

Close to the Metal

Anybody remember that episode of “Halt and Catch Fire” where they had to recover the floppy disks by hand-spinning them across the drive heads one sector at a time? Well, regardless of whether you’re doing BMR or System State, rebuilding a physical server is kind of like that. It will be much more involved than anything virtual. This type of recovery looks difficult because it is difficult. You should read this as “risky, time-consuming, and costly”. Once upon a time, we lost a large RAID array due to a two-drive failure. We had no backup plan in place, so we were glad that it was recoverable. However, for the six grand we spent in doing so, we could’ve easily afforded an effective backup solution for several years. Don’t rely on the IT equivalent of Evel Knievel; your best backup strategy is to actually have one.

What You Should Be Doing Instead

We can’t state this strongly enough. You would likely be significantly better off having VMware or Windows Server Core with Hyper-V running as the sole application on your physical servers, with everything you own running in VMs on top of it, than having even a single application running on the bare metal.

Why is this? It goes back to recovery not being the same thing as backup.

If you are fully virtualized, restoring your virtual host becomes largely an academic matter. It can be fun to debate it or kick around expensive solutions, but for the most part only the largest organizations needing time-to-recover in minutes or seconds will find them worthwhile. (Full disclosure: Zerto is a Liquid Mercury Solutions customer, and we like their products.) Anyone with fewer than 500 employees should just implement Azure Site Recovery for the VMs. If or when the on-premises host hardware breaks down, they can fail over to the cloud for however long it takes to order newer and better hardware from Dell/HP, pop in the thumb drive to re-install Windows Server Core and Hyper-V, connect the new server to Site Recovery, and fail back to it.

In such a case, the most that System State backups will give you is fast restoration of a corrupted OS when the cause isn’t an underlying hardware failure, and maybe spare you a few minutes otherwise wasted configuring your LAN adapter. It would be just as reliable (and often as quick) to document the process for creating a new VM host from the ground up and go from there. Nevertheless, we do not always have this luxury. Often, we inherit systems that were managed by someone else, or configured before current best practices were common or even before Azure Site Recovery existed.

If you’re a small business with lots of critical services running on your physical servers, there are things you can do to make the situation better without too much hardship. Start by doing a full inventory of what you have running on that Small Business Server that you upgraded to Windows Server 2012. Once you know what it does, you can build out individual VMs for subsets of services. More than likely, these can be hosted on Hyper-V using the same hardware you have now, or with inexpensive upgrades.

Here are some common things you may be running there, and where they can move to:

  • AD Domain Controller -> On-prem VM plus Azure backup VM
  • Internal DNS -> should follow your domain controller(s)
  • Public facing DNS -> Use a DNS hosting / registrar service
  • DHCP/WINS -> On-prem VM or replace with your firewall
  • VPN/RRAS -> On-prem VM or replace with your firewall
  • Printer Shares -> On-prem VM
  • File Shares -> On-prem or Azure VM
  • Accounting Software -> On-prem or Azure VM
  • Databases -> On-prem or Azure VM, SQL Azure, or Managed Instance
  • Exchange Server -> Office 365
  • SharePoint -> Office 365
  • On-prem CRM Server -> Dynamics 365
  • IIS Web Site/Applications -> Azure Web Apps
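If you keep your inventory in a script or spreadsheet, the list above maps naturally onto a lookup table. The following is a minimal Python sketch under that assumption. The targets are copied from the list, while the function name and the sample inventory (including the “Fax Server” entry) are hypothetical.

```python
# Migration targets copied from the list above.
MIGRATION_TARGETS = {
    "AD Domain Controller": "On-prem VM plus Azure backup VM",
    "Internal DNS": "Follow your domain controller(s)",
    "Public facing DNS": "DNS hosting / registrar service",
    "DHCP/WINS": "On-prem VM or replace with your firewall",
    "VPN/RRAS": "On-prem VM or replace with your firewall",
    "Printer Shares": "On-prem VM",
    "File Shares": "On-prem or Azure VM",
    "Accounting Software": "On-prem or Azure VM",
    "Databases": "On-prem or Azure VM, SQL Azure, or Managed Instance",
    "Exchange Server": "Office 365",
    "SharePoint": "Office 365",
    "On-prem CRM Server": "Dynamics 365",
    "IIS Web Site/Applications": "Azure Web Apps",
}


def plan_migration(found: list[str]) -> tuple[dict[str, str], list[str]]:
    """Split an inventory into workloads with a known target and leftovers.

    Leftovers are the workloads that need a decision of their own, or that
    stay on the physical server to be covered by MABS / System State / BMR.
    """
    planned = {w: MIGRATION_TARGETS[w] for w in found if w in MIGRATION_TARGETS}
    leftovers = [w for w in found if w not in MIGRATION_TARGETS]
    return planned, leftovers


# Hypothetical inventory of one small-business server.
planned, leftovers = plan_migration(["Exchange Server", "File Shares", "Fax Server"])
for workload, target in planned.items():
    print(f"{workload} -> {target}")
print("Needs a decision:", leftovers)
```

The useful part is the leftovers list, which tells you what still ties you to the bare metal after everything with an obvious home has moved.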

If you have workloads like these, you can virtualize slowly, one workload at a time, until eventually there is little or nothing left on your physical server itself. Should the worst ever come to pass, recovery of these remnants can be left to MABS / System State / BMR. Beyond that, you should also feel free to ask us about solutions from Veeam.

Finally, keep in mind that Azure Backup BMR and Site Recovery for bare-metal servers were really designed to be “lift-and-shift” operations. Why? Because of the best practices we talked about earlier: you should be *wanting* to virtualize everything. Of course, Microsoft will be more than happy to have you do this in Azure. Once lifted to the cloud, your images are VMs, so moving them back on-premises to either VMware or Hyper-V is technically possible, though some paths are more clearly marked than others. So, if you have legacy scenarios like the ones we just mentioned, this wouldn’t be a terrible way to make the transition to virtualized systems, keeping in mind that Microsoft will charge you for the bandwidth used when you move them back to on-premises hardware.

At the very least, all of this sounds like a project for a long weekend. If you need help getting the job done, you should definitely reach out to us. We can do a full evaluation of your unique situation, provide you with a comprehensive report and written plan, and even do the implementation.

Credit: this post was republished from Liquid Mercury Solutions' Staff of LiquidHg with permission from the author and/or publisher; original post URL is LiquidHg.