Active Directory Forest Recovery: Plan to Eliminate Downtime

Active Directory (AD) is still the predominant identity and management platform for tens of thousands of organizations worldwide. Larger organizations with hundreds to even thousands of applications continue to rely on AD, even as they transition to a cloud-first or AD-minimized environment. The reason? Unraveling all the AD-related technology and applications an organization has invested in over the years can take a lot of time, money, and other resources. Although vendors are offering more products to help facilitate the move to the cloud, it’s still a slow process and not all vendors provide such options.

Malware and ransomware attacks continue to increase every year. The adage that “AD holds the keys to the kingdom” is still true, which makes AD a prime target. Once a domain is compromised, it can be weaponized to deliver malicious code to every system, or it can be a direct target for ransomware. Because so many systems still depend on AD, authentication and access control are unavailable whenever AD is unavailable. You can restore applications and data, but without AD you can’t get back to business as usual. And if the restored AD is still weaponized, reinfection is much more likely. Malware and disaster recovery plans must therefore include the tools and procedures necessary to recover an AD forest, expel the threat, and prevent reinfection.

In late June 2017, the giant shipping company Maersk was hit by the NotPetya ransomware, effectively shutting down its operations. According to the Redmond magazine article “Inside a Domain Controller Nightmare,” the attack cost the company an estimated $300 million in lost revenue. Wired magazine reported on the incident in more detail on August 22, 2018, in “The Untold Story of NotPetya, the Most Devastating Cyberattack in History.” Although administrators located all the backups for the servers, they could not find any backups of the company’s domain controllers (DCs). By luck, a power outage had isolated a single DC in Lagos, Nigeria, and that DC was used to bring AD back online.

Recovering an Active Directory Forest Requires a Specific Plan and Process

We can all agree that hope is not a plan. Similarly, relying on luck is not a disaster recovery strategy. However, we often fail to devote enough attention and resources to protect one of the most important services in our organization. In many cases, AD forest recovery plans aren’t documented, reviewed, or practiced. Teams are therefore unprepared when disaster strikes.

Recovering an entire AD forest requires a specific method, as discussed in Microsoft’s “Active Directory Forest Recovery Guide.” If done incorrectly, AD recovery can lead to problems during the restore process or later after recovery is complete and all the DCs have been manually rebuilt. The entire process can be tedious and time-consuming, often taking several days to weeks or even months for full recovery. Traditional file-level backup solutions cannot recover AD, and virtual machine (VM)-level backups can reintroduce malware. Without a well-documented plan and proper tools, recovery can be exceedingly difficult.

Strategies to Consider when Choosing an Active Directory Forest Backup and Recovery Solution

An AD forest backup and recovery strategy should be well documented, and it should be designed to reduce the amount of lost data and the overall time to recover. In addition, it should not reintroduce malware. There are some exceptionally good third-party solutions that incorporate Microsoft best practices and allow for automated recovery while protecting against reintroduction of malware connected with the operating system. Semperis Active Directory Forest Recovery (ADFR) and Quest Recovery Manager for Active Directory Disaster Recovery Edition (RMAD DRE) are two great solutions.
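One way to make the "reduce lost data" goal checkable is to confirm that every domain in the forest has at least one recent DC backup, which is the gap that hurt Maersk. The following is a minimal sketch only: the inventory format, the `BackupRecord` type, the example domain names, and the 24-hour threshold are all assumptions made for illustration and are not tied to any product mentioned here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BackupRecord:
    """One DC backup entry (hypothetical inventory format)."""
    domain: str
    dc_name: str
    taken_at: datetime


def domains_without_recent_backup(domains, backups, now, max_age=timedelta(hours=24)):
    """Return, sorted, the domains that have no DC backup newer than max_age."""
    covered = {b.domain for b in backups if now - b.taken_at <= max_age}
    return sorted(d for d in domains if d not in covered)


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    backups = [
        BackupRecord("corp.example.com", "DC01", datetime(2024, 1, 10, 2, 0)),
        # Stale: a week old, so this domain counts as uncovered.
        BackupRecord("emea.corp.example.com", "DC07", datetime(2024, 1, 3, 2, 0)),
    ]
    domains = ["corp.example.com", "emea.corp.example.com", "apac.corp.example.com"]
    print(domains_without_recent_backup(domains, backups, now))
```

A check like this could run on a schedule and raise an alert whenever the returned list is not empty. The right threshold depends on how much lost data your organization can tolerate.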

When choosing a solution, consider the following guidelines.

AD Aware

The solution should adhere to Microsoft’s detailed processes for recovering an AD forest, as documented in the “Active Directory Forest Recovery Guide.”

Clean Source

A typical file and folder backup solution cannot restore an AD forest. VM-level backups may reintroduce the malware that caused the problem. An effective recovery solution should decouple AD from the underlying operating system, where malware resides. AD recovery should be performed on a new system, using verified clean-source installation media for the operating system.

Push-Button Orchestrated Recovery

The decision to restore an AD forest should not be made lightly, but it usually follows a period of hair-on-fire phone calls and meetings. When the time comes to flatten the existing AD environment and perform a forest restore, it’s imperative that the recovery process is streamlined; it should require minimal manual interaction from enterprise administrators. The solution should provide push-button orchestration of multiple DCs simultaneously, to speed up the restore time and to eliminate the confusion of several domain admins performing manual processes and complicating or impeding the recovery process. Manually performing a forest recovery on a medium-sized organization with more than 25 DCs could take weeks to complete and to successfully validate.
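To illustrate why orchestration shortens recovery, the sketch below runs a restore step for several DCs concurrently and records a per-DC outcome instead of stopping at the first failure. It is a toy model under stated assumptions: `restore_in_parallel` and `fake_restore` are hypothetical names, the restore step is a placeholder, and a real forest recovery has ordering requirements (described in Microsoft's guide) that this sketch deliberately ignores.

```python
from concurrent.futures import ThreadPoolExecutor


def restore_in_parallel(dc_names, restore_one, max_workers=8):
    """Run restore_one(name) for each DC concurrently and collect the outcomes."""
    def attempt(name):
        try:
            restore_one(name)
            return name, "ok"
        except Exception as exc:  # record the failure instead of aborting the run
            return name, f"failed: {exc}"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so the report is stable.
        return dict(pool.map(attempt, dc_names))


def fake_restore(name):
    """Hypothetical stand-in for a real restore step; fails for one DC."""
    if name == "DC03":
        raise RuntimeError("backup not found")


if __name__ == "__main__":
    results = restore_in_parallel(["DC01", "DC02", "DC03"], fake_restore)
    for name, status in results.items():
        print(name, status)
```

Collecting every outcome in one report gives a single coordinator a clear view of which DCs still need attention, rather than leaving several administrators to track manual steps separately.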

Platform Agnostic

Sophisticated supply-chain attacks on hardware and software, as well as various other causes, can render a platform useless during a disaster. An ideal recovery solution should be able to restore the AD forest to any platform capable of hosting a Windows Server operating system and not be dependent on the exact same underlying hardware or hypervisor. In a crisis scenario, the option to restore AD to a different platform eliminates dependencies on specific hardware or hypervisors that may be inaccessible or compromised. This option provides the flexibility to shift immediately to the cloud or even a secure, isolated environment, thereby reducing the overall recovery time.

Develop a Playbook

Tools are only one part of the solution. To be fully prepared, you need a written recovery playbook. This part of disaster preparedness can take the most time and effort but is essential to a recovery exercise. The playbook should identify the members who make up the crisis management team, such as executive leadership, coordinator, internal and external communications lead, application and network engineers, cybersecurity leads, and enterprise administrators. It should also contain a list of mission-critical applications that must be addressed as the plan progresses. Include detailed step-by-step instructions for every member of the crisis management team, along with a call tree and details about an alternative communications technology for conference calls. Finally, document any processes necessary to access break-glass accounts that are located off site or in a safe within the datacenter.
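If the playbook is also kept in a structured form, its completeness can be checked automatically between exercises. The sketch below is one possible shape, not a standard: the `Playbook` fields, the role names (adapted from the list above), and the wording of the gap messages are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Roles adapted from the paragraph above; adjust to your organization.
REQUIRED_ROLES = (
    "executive leadership",
    "coordinator",
    "communications lead",
    "application engineer",
    "network engineer",
    "cybersecurity lead",
    "enterprise administrator",
)


@dataclass
class Playbook:
    """Hypothetical, minimal playbook model used only for a completeness check."""
    team: dict = field(default_factory=dict)  # role -> list of contact names
    critical_apps: list = field(default_factory=list)
    call_tree_documented: bool = False
    alternate_comms: str = ""
    break_glass_procedure: str = ""


def find_gaps(playbook):
    """Return a list of human-readable gaps; an empty list means none were found."""
    gaps = [f"no contact for role: {role}"
            for role in REQUIRED_ROLES if not playbook.team.get(role)]
    if not playbook.critical_apps:
        gaps.append("no mission-critical applications listed")
    if not playbook.call_tree_documented:
        gaps.append("call tree missing")
    if not playbook.alternate_comms:
        gaps.append("no alternative communications channel")
    if not playbook.break_glass_procedure:
        gaps.append("break-glass account procedure missing")
    return gaps


if __name__ == "__main__":
    draft = Playbook(team={"coordinator": ["A. Example"]}, critical_apps=["payroll"])
    for gap in find_gaps(draft):
        print(gap)
```

A check like this only confirms that each section exists. Whether the instructions actually work is something only an exercise can show.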

Practice Makes Perfect

Once a plan and tools are in place, it’s time to train and test. The worst time to discover a flaw in your plan is during a crisis. At a minimum, perform a yearly tabletop exercise with the crisis management team so that everyone knows their job and how to execute. This is the time to identify inexperienced staff who should be trained, update the call tree, and refamiliarize experienced staff. Note any bottlenecks or flaws, then create and assign tasks with deadlines to resolve any issues. Work with any third-party vendors and include them in the yearly exercise.

If you really want to validate your plan, create an isolated environment and perform a yearly forest restore. This will undoubtedly reveal issues for you to address before a real event occurs.

Conclusion

No organization is immune to misconfigurations, zero-day vulnerabilities, sophisticated phishing campaigns, or even insider threats. Often, organizations must strike a balance between security and continuing operations. These compromises, coupled with exploits and unknown vulnerabilities, can affect even the most prepared organizations.

Do not overlook AD when it comes to malware preparedness. Put proper tools and an effective strategy in place so that you can quickly and efficiently recover the forest after a malware infection and return the organization to business as usual without reintroducing the problem. Then, practice and refine the recovery plan so that all team members are well acquainted with the procedures.

In today’s cyber landscape, it’s not IF but WHEN you’ll get hit by ransomware. Being prepared with the proper plan and tools will minimize the effects of an attack and get your operations back up so that your organization can continue its mission.

If you need assistance with implementing an AD forest recovery plan in your environment, contact the experts at Ravenswood Technology Group. We’re here to help!
