Wednesday, August 19, 2015

Repairing and Recovering AD

I looked at some Active Directory (AD) maintenance and disaster-prevention activities that you should regularly undertake. Now let's take a look at a topic you need to know about when everything else fails: AD repair and recovery.
Repairing AD with Ntdsutil 
If you suspect—because of error messages, log entries, or application errors—that the AD replica on a domain controller (DC) is corrupted, you might consider using the Ntdsutil utility's Repair feature to repair the damage. However, I recommend that you use this method only as a last resort. If a valid backup is available, restoring the database, which I discuss later, should be your first course of action.
Repairing the directory database doesn't always achieve successful results. For example, if a database file is corrupted, using the Ntdsutil Repair feature might not restore all objects and attributes. In fact, in some cases, using the Repair feature could cause further data loss. Isolating a DC from the rest of the network before you attempt this kind of repair can prevent additional corruption to other DCs' AD replicas. After you ensure that all is well, you can reattach the DC to the network.
Figure 1, page 54, shows how to use Ntdsutil to repair the AD database. To perform a repair operation on the AD database file, follow these steps:
  1. Go to the command prompt window, type
     ntdsutil
     and press Enter.
  2. At the Ntdsutil prompt, type
     files
     and press Enter. The utility will display the File Maintenance category.
  3. At the File Maintenance prompt, type
     repair
     and press Enter.
Restoring AD 
When all else fails, you might find that restoring functionality to a Windows 2000 DC (or the entire AD network) requires that you restore AD from backup. Although the process of physically restoring the AD database on a Win2K DC from a backup isn't a logistically complex procedure, you need to consider some important logical and architectural factors before you perform any type of AD restore operation. On networks that have more than one Win2K DC, AD doesn't exist in only one location—an important factor to consider because it relates to the AD restore process. Ask yourself the following questions:
  • Is only the local DC's copy of AD corrupted or damaged, or are other replicas on other DCs also in the same state?
  • Is the data I'm restoring the definitive copy I should use to overwrite all other copies of AD object data? If so, do I risk losing changes or structural modifications, such as added or deleted organizational units (OUs) or changes to user or computer objects, by restoring this copy of AD as a master copy?
  • Should I restore AD on a local DC only to regain functionality on that DC (i.e., is the corruption, damage, or other type of problem isolated to the local copy of AD on that computer), which should then receive updates from other DCs that use AD replication to bring its data store up-to-date?
Answering these questions will help you determine which AD restore mode, nonauthoritative or authoritative, to use. (To read more about recovering AD, see the sidebar "AD Recovery Resources.")
Nonauthoritative restore. Most restore operations use the nonauthoritative restore mode. You typically perform a nonauthoritative restore when the problem is limited to the local Win2K DC and you believe that the AD replicas housed on other Win2K DCs are valid. During a nonauthoritative restore, any data that you restore (including AD objects) will retain its original update sequence number (USN). AD replication then uses this number to detect and propagate any changes to other DCs in the same domain.
Authoritative restore. Perform an authoritative restore when the other Win2K DCs contain invalid replicas or undesirable data. In this case, you manually designate the copy of the AD database that you want to restore. Designate only the local DC as authoritative (i.e., the master copy from which all other DCs seed their AD replicas). Authoritative restores modify the AD objects' USNs so that each object's USN is higher than those of any other AD database replicas; as a result, all the restored objects will be replicated to the other DCs' AD replicas.
You can use backup data from one DC to restore to only the same DC; you can't use a backup of one DC to restore another machine. However, if the DC system fails, you can restore the backup data to another computer that replaces the original DC. Keep this restriction in mind when you develop your backup strategy. To completely back up your environment, you need a backup of every DC in the network. In addition, you need to frequently back up the first DC that you installed in the forest root domain. This DC typically hosts unique forestwide roles and contains unique data essential to network operation.
If you're using Win2K's backup utility (ntbackup.exe) to perform a restore, you must meet the following additional conditions to successfully restore the system state (including AD). If you don't meet all these conditions, the restore operation will fail.
  • The server name must be identical to the backed-up server's name.
  • The drive letter on which the %systemroot% folder resides must be the same letter it was when you performed the backup.
  • The %systemroot% folder must be in the same location as it was when you originally backed it up (e.g., in the C:\winnt directory).
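The three conditions above lend themselves to a quick pre-restore check. The following is a minimal Python sketch, not part of ntbackup.exe: the `SystemInfo` record and the `restore_blockers` function are hypothetical names, and comparing names and paths case-insensitively is an assumption.

```python
from dataclasses import dataclass
from pathlib import PureWindowsPath


@dataclass(frozen=True)
class SystemInfo:
    """Hypothetical record of the facts the three conditions compare."""
    server_name: str
    system_root: str  # e.g. r"C:\winnt"


def _folder_parts(path: PureWindowsPath) -> list[str]:
    # Folder components without the drive, lowercased (assumed case-insensitive).
    return [part.lower() for part in path.parts[1:]]


def restore_blockers(backed_up: SystemInfo, target: SystemInfo) -> list[str]:
    """Return the reasons a system state restore would be expected to fail."""
    problems = []
    if backed_up.server_name.lower() != target.server_name.lower():
        problems.append("server name differs")
    old_root = PureWindowsPath(backed_up.system_root)
    new_root = PureWindowsPath(target.system_root)
    if old_root.drive.lower() != new_root.drive.lower():
        problems.append("system root drive letter differs")
    if _folder_parts(old_root) != _folder_parts(new_root):
        problems.append("system root folder location differs")
    return problems


if __name__ == "__main__":
    backup = SystemInfo("DC1", r"C:\winnt")
    print(restore_blockers(backup, SystemInfo("DC1", r"D:\winnt")))
```

An empty list means none of the three conditions is violated; it says nothing about whether the backup itself is valid.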
Performing Nonauthoritative Restores 
If the AD replicas on DCs other than the DC you're restoring are intact and valid, you'll probably want to perform a nonauthoritative restore, which is a typical restore of the AD objects to the DC that originally contained them. Although you can back up AD either online (while the directory service is running) or offline (when the services are stopped), you can restore AD only when the directory services are offline.
To restore AD, start the Win2K DC in a special startup mode called Directory Services Restore Mode. Select this mode at system startup by pressing F8 when the Win2K Boot Loader menu appears, then selecting the option from the alternative boot menu. Win2K will start in Safe Mode, and you can use the following steps to restore AD information on a DC:
  1. Log on as a member of the Administrators or Backup Operators group.
  2. Run the Win2K backup program and select the Restore Wizard option from the Welcome tab. Choose File, Backup, then select the System State check box. System State data includes the registry, AD, and other key system components.
  3. If you restore the System State data and don't designate an alternative location for it, the utility erases the System State data that's currently on your computer and replaces it with the System State data you're restoring. The AD, Certificate Services database, and COM+ Class Registration databases won't be restored if you designate an alternative location.
  4. After you finish the restore, restart the DC.
The DC will now participate in AD replication and will receive directory updates from the other DCs. After the completion of the nonauthoritative restore, the restored data (which might be out of date) is synchronized.
Use the nonauthoritative restore if the DC fails or the entire AD database is corrupted. Objects restored nonauthoritatively keep their original USNs. (AD uses these numbers to detect and propagate the most recent changes to other DCs.)
To minimize replication traffic on the network, the nonauthoritative restore provides a start point (the point at which backup began) for data replication—only changed data (rather than the entire directory) is replicated. Without this start point, all data from other servers would be replicated. Simply reinstalling Win2K and reconfiguring the system as a DC (through dcpromo.exe) is another option for restoring AD on a Win2K DC. The AD-replication process will automatically repopulate the DC with current directory information.
Performing Authoritative Restores 
An authoritative restore lets you recover a DC, restore it to a specific point in time, and mark AD objects as authoritative with respect to their replication partners. For example, you might need to perform an authoritative restore if an administrator inadvertently deletes an OU that contains many users. You can use the authoritative restore process to recover the AD information and mark it as the definitive source for replication to the other DCs in the domain.
The authoritative restore modifies the USN of the AD objects that you're restoring to the DC so that each object has the highest value of any AD replica on any DC in the domain. This, in turn, forces replication of the newly restored objects to the other replicas residing on all other DCs.
Authoritative restores are unusual; they can roll back all the AD objects in the DC to the point in time when you performed the original backup. You can use this action to restore information that was erroneously deleted from a replicated data set. For example, if you inadvertently delete or modify objects stored in AD, you can authoritatively restore those objects so that you can replicate them again to the other DCs. If you don't authoritatively restore the missing objects, they'll never get replicated to the other DCs in the same domain because the missing or deleted objects you're restoring appear to be older than the objects currently on your DC.
To mark the target objects for authoritative restore, you can use the Ntdsutil utility, which ensures that the data you want to restore is replicated to the appropriate DCs after the restoration. Table 1 lists and describes the authoritative Ntdsutil restore commands.
By definition, an authoritative restore replicates any changes you made to the current data set to its outbound replication partners. Use the following steps to perform an authoritative restore of AD on a specific DC:
  1. Open a command prompt window (select Start, Run, type
     cmd
     then press Enter).
  2. Type
     ntdsutil
     and press Enter.
  3. At the Ntdsutil prompt, type
     authoritative restore
     and press Enter. This action puts Ntdsutil into Authoritative Restore mode.
  4. At the Authoritative Restore prompt, type
     restore database
     to set the entire database as authoritative. Alternatively, you can set only a subtree of the database (for example, an individual OU); doing so requires that you use the Lightweight Directory Access Protocol (LDAP) string that identifies the AD portion that you're authoritatively restoring. For example, to authoritatively restore an OU called Engineering in the mycompany.com domain, type the following command at the Authoritative Restore prompt:
     restore subtree ou=engineering,dc=mycompany,dc=com
  5. When the system prompts you to confirm the authoritative restore you specified in Step 4, answer Yes.
  6. Type quit and press Enter, then repeat at the Ntdsutil prompt to return to the command prompt.
  7. Close the command prompt session.
  8. After the AD restore operation is complete, answer No to the option to restart the server. This step is crucial; otherwise, the restore will be nonauthoritative when the server restarts, and you'll risk reinheriting unwanted data from other AD replicas.
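The LDAP string in Step 4 follows mechanically from the OU name and the domain's DNS name. Here is a small Python sketch of that mapping. The function names are hypothetical, and it assumes a top-level OU whose name contains no characters that need LDAP escaping.

```python
def domain_to_dn(dns_name: str) -> str:
    """Turn a DNS domain name such as mycompany.com into dc=mycompany,dc=com."""
    return ",".join(f"dc={label}" for label in dns_name.split("."))


def subtree_restore_command(ou_name: str, dns_name: str) -> str:
    """Build the text to type at the Authoritative Restore prompt.

    Assumes the OU sits directly under the domain root.
    """
    return f"restore subtree ou={ou_name},{domain_to_dn(dns_name)}"


if __name__ == "__main__":
    print(subtree_restore_command("engineering", "mycompany.com"))
```

For a nested OU you would need to add each parent OU to the string yourself, innermost first.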
Always authoritatively restore the Sysvol folder whenever you authoritatively restore AD. This process ensures that Sysvol and AD remain synchronized. Also, be aware that authoritative restores have several potentially negative consequences.
One such effect relates to trust relationships and computer account passwords, which are automatically negotiated at a specific interval (every 7 days by default, except for computer accounts that administrators can disable). During an authoritative restore, you can restore a previously used password for the AD objects that maintain trust relationships and computer accounts. For trust relationships, this action could void communication with DCs from other domains. For computer account passwords, it could void communications between the member workstation or server and a DC.
In this article, I've attempted to cover some of the more important maintenance activities related to AD upkeep and give you information about how to repair and restore AD when things go awry. Following these recommendations can help ensure that your network stays healthy and available, that your users are productive, and that the boss stays off your back.

Recovering from Active Directory Disasters

Active Directory (AD) is typically one of the key network services in an organization. Without it, everything comes to a grinding halt. With this in mind, it’s important to be prepared for the various disasters that might strike a forest.
When it comes to AD, the scope of a disaster can vary quite a bit. It can be as simple as the failure of a single domain controller (DC) or the accidental deletion of a single object. An even worse situation is when an entire organizational unit (OU) hierarchy is accidentally deleted. In the worst case scenario, an entire domain or forest might need to be restored.
The good news is that many of the techniques that apply to recovering from simple disasters also apply to recovering from catastrophic disasters. I’ll discuss how to recover from the two most common calamities: a failed DC and accidentally deleted objects.

Backup Strategy

You first need to make sure that you have something to use for a recovery. At a minimum, you should have valid system state backups of at least two DCs in each domain in your AD forest. Windows Server Backup (Windows Server 2008 and later), NTBackup (Windows Server 2003 and Windows 2000 Server), and most commercially available backup tools can perform valid system state backups. However, it’s always worth testing the backups to make sure everything is in order. One important point regarding backup tools is that you should use a Volume Shadow Copy Service (VSS)–aware backup tool. Backup tools that rely on disk imaging or virtual machine (VM) snapshot technologies are generally incompatible with AD. Restoring a backup made by one of these tools can cause serious replication failures known as update sequence number (USN) rollback.
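The "at least two DCs per domain" minimum is easy to check against an inventory of backups. The following Python sketch assumes you can produce such an inventory yourself; the tuple layout and the function name are hypothetical.

```python
from collections import defaultdict


def domains_short_on_backups(backups, minimum=2):
    """Return domains with fewer than `minimum` DCs that have a valid backup.

    `backups` is an iterable of (domain, dc_name, is_valid) tuples, an
    assumed inventory format rather than the output of any backup tool.
    """
    valid_dcs = defaultdict(set)
    domains = set()
    for domain, dc_name, is_valid in backups:
        domains.add(domain)
        if is_valid:
            valid_dcs[domain].add(dc_name)
    return sorted(d for d in domains if len(valid_dcs[d]) < minimum)


if __name__ == "__main__":
    inventory = [
        ("contoso.com", "DC1", True),
        ("contoso.com", "DC2", True),
        ("child.contoso.com", "DC3", True),
        ("child.contoso.com", "DC4", False),  # backup failed validation
    ]
    print(domains_short_on_backups(inventory))
```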
In many organizations, the responsibility for server backups and restores falls to a different team than the team that runs AD. This leads to a couple of problems. First, you have no direct control over the backup process, which makes validating backups difficult. Second, many backup tools require an agent on each DC being backed up, which indirectly provides elevated access to the DC.
To mitigate these problems, I frequently employ a two-tiered approach to DC backups. I use a script to run Windows Server Backup each night on the DC and keep a week or two of backups locally on the DC. The folder containing the backups is then shared, with access restricted to the backup tool, as many backup tools can back up a file share without an agent. I also sometimes store the backup files on neighboring DCs within a site. So, for example, if you have DC1 and DC2 in a site, the backups of DC1 are stored on a file share on DC2 and vice versa.
The benefits of this two-tiered approach include:
  • You mitigate some of the risk of being dependent on another team for backups.
  • In the event you need to perform a restore, you can proceed right away with the native backup files you have on hand versus waiting for another team to perform the restore.
  • You’re not waiting for a backup to copy over the WAN from another site in the event backups are performed remotely.
I posted the script I use to run Windows Server Backup as well as directions for setting it up in my article, "Managing Local Backups with Windows Server Backup".
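Keeping "a week or two" of local backups means pruning older ones. The Python sketch below illustrates that retention rule only; it isn't the script mentioned above. The function name, the 14-day default, and the choice to always keep the newest backup are assumptions.

```python
from datetime import date, timedelta


def backups_to_prune(backup_dates, today, keep_days=14):
    """Return the backup dates that fall outside the retention window.

    The newest backup is never pruned, so a DC whose nightly job has been
    failing still has something on hand.
    """
    cutoff = today - timedelta(days=keep_days)
    newest = max(backup_dates, default=None)
    return sorted(d for d in backup_dates if d < cutoff and d != newest)


if __name__ == "__main__":
    dates = [date(2015, 8, 18), date(2015, 8, 10), date(2015, 8, 1), date(2015, 7, 20)]
    print(backups_to_prune(dates, today=date(2015, 8, 19)))
```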

DC Recovery

One of the great things about AD is the mostly stateless nature of the DC. Aside from potentially holding one or more Flexible Single-Master Operation (FSMO) roles, a DC should generally be a matching replica of other DCs in the domain, except for some potential delay in replication depending on your topology. If a failure renders a DC inoperable, this stateless nature is fantastic because it will often remove the need to go through a complicated restore from a backup. Instead, you can simply reinstall Windows and use Dcpromo to promote the server to a DC and replicate all of the data back in—assuming your domain has more than one DC. If you only have one DC in your domain, you can greatly reduce your exposure to failure by deploying a second one.

Before you reinstall and repromote a DC, though, you need to clean up AD, which is a two-step process. The first step is to seize any FSMO roles that the DC might hold for another DC in the domain. If you’re not sure which DCs are hosting FSMO roles in the domain, run
netdom query fsmo
in a command prompt window to find out. You can then seize the FSMO roles using the Ntdsutil utility. Follow the instructions under the “Seize FSMO roles” section in the Microsoft article “Using Ntdsutil.exe to Transfer or Seize FSMO Roles to a Domain Controller”. It’s very important to note that when you seize a FSMO role, best practice dictates that you should never bring the original role-holder back online.
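If you capture the output of netdom query fsmo, a few lines of Python can list the roles a failed DC was holding. This is a sketch: the sample output reflects the layout the command typically prints (role, run of spaces, holder), which is an assumption, and the server names are made up.

```python
import re

# Assumed layout of `netdom query fsmo` output; server names are hypothetical.
SAMPLE_OUTPUT = """\
Schema master               DC1.contoso.com
Domain naming master        DC1.contoso.com
PDC                         DC2.contoso.com
RID pool manager            DC2.contoso.com
Infrastructure master       DC2.contoso.com
The command completed successfully.
"""


def parse_fsmo_holders(output):
    """Map each role name to its holder, skipping lines without two columns."""
    holders = {}
    for line in output.splitlines():
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) == 2:
            role, holder = parts
            holders[role] = holder
    return holders


def roles_held_by(output, dc_name):
    """Return the roles whose holder matches dc_name (case-insensitive)."""
    return [role for role, holder in parse_fsmo_holders(output).items()
            if holder.lower() == dc_name.lower()]


if __name__ == "__main__":
    print(roles_held_by(SAMPLE_OUTPUT, "dc2.contoso.com"))
```

Any role returned for the failed DC is one you would need to seize before cleaning up its metadata.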
Because it isn’t possible to put the original FSMO role-holder back in service, the second step is performing a metadata cleanup of the failed DC’s configuration in AD. You can use Ntdsutil for this step as well. Follow the instructions in the Microsoft article “How to Remove Data in Active Directory After an Unsuccessful Domain Controller Demotion”. Alternatively, if you’re using the Server 2008 (or later) version of the Active Directory Users and Computers snap-in, you can complete this step by deleting the DC’s computer account in the Domain Controllers OU.
Repromoting a DC over the network might not be feasible when the amount of data to replicate would place an undue amount of strain on the network. In this case, there are a couple of other options. The first option is to restore the DC’s system state from a backup and continue on. The second option is to use the Install from Media (IFM) functionality, which was added in the Windows 2003 release. IFM lets you take a system state backup (created with NTBackup in Windows 2003) or IFM media (created with Ntdsutil in Server 2008 or later) and point Dcpromo to the AD database in the IFM media. IFM media created by Windows 2003 must first be restored to an alternate location on the file system so that Dcpromo can consume it. The DC will make the necessary changes to the database in the media and replicate only the changes since the media was created over the network.

AD Object Life Cycle

When you delete an AD object, a number of things happen behind the scenes. Most important, deleting an object doesn’t directly correlate to a record being removed from the AD database. To maintain consistency in AD’s replication model, objects first transition through a state known as being tombstoned, as Figure 1 shows. Rather than implementing a distributed mechanism to replicate physical deletions from the database, AD replicates a change to an attribute that indicates the object has been deleted.
Figure 1: Default life cycle of an AD object
When you delete an object from AD, the isDeleted attribute is set to True and nearly all the object’s attributes are removed. The object is moved to the Deleted Objects container, and its lastKnownParent attribute is stamped with the distinguished name (DN) of the parent it had before it was deleted. After an object has been marked as deleted, it won’t be visible to any tools that query AD, unless you add a special LDAP control to indicate that you want AD to return deleted objects in the search results. Various free LDAP query tools (such as AdFind) will include this LDAP control for you and allow you to easily search for deleted objects.
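To make the "special LDAP control" concrete, here is a Python sketch that prepares the pieces of such a search without talking to a server. The OID is the one Microsoft documents as LDAP_SERVER_SHOW_DELETED_OID. The function names, the returned dictionary shape, and the prefix match on the name attribute are assumptions; how you hand these values to an LDAP library depends on the library.

```python
SHOW_DELETED_OID = "1.2.840.113556.1.4.417"  # LDAP_SERVER_SHOW_DELETED_OID


def escape_filter_value(value):
    """Escape the characters RFC 4515 reserves inside LDAP filter values."""
    escaped = []
    for ch in value:
        if ch in "\\*()\0":
            escaped.append(f"\\{ord(ch):02x}")
        else:
            escaped.append(ch)
    return "".join(escaped)


def deleted_object_search(name):
    """Describe a search for deleted objects whose name starts with `name`.

    Hypothetical helper: returns plain data, not a library-specific request.
    """
    return {
        "filter": f"(&(isDeleted=TRUE)(name={escape_filter_value(name)}*))",
        "controls": [SHOW_DELETED_OID],
    }


if __name__ == "__main__":
    print(deleted_object_search("John Doe"))
```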
At this point, the object will remain as a tombstone for a period of time. The default tombstone lifetime for forests is based on the OS of the first DC in the forest. Table 1 shows the default tombstone lifetimes. Upgrading AD doesn’t change the tombstone lifetime for the forest.
Table 1: Default Tombstone Lifetime for New Forests
Periodically, a background process called garbage collection runs on each DC. The garbage collection process (aka garbage collector) scans the database for tombstones that are older than the forest’s tombstone lifetime and purges them from the AD database.
Up until the point when a tombstone is purged by the garbage collector, you can recover the object using a process known as tombstone reanimation. When you reanimate a tombstone, you only get back a handful of attributes that are kept during the tombstoning process. For example, the attributes saved for a user object include the user’s SID, SID history, and username (sAMAccountName). Notice that this list doesn’t include attributes such as the user’s password, group membership, or demographic information (e.g., name, department). You can control the list of attributes that are preserved when an object is tombstoned by modifying the searchFlags attribute of an individual attribute’s definition in the schema. You can add as many attributes as you like. However, you can’t add linked attributes, such as group membership or the mailbox database containing a user’s mailbox. For information about how to modify the searchFlags attribute, see the MSDN web page “Search-Flags Attribute”.
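The searchFlags attribute is a bit field, and the bit that preserves an attribute on a tombstone has the value 8 (0x8). The Python sketch below shows only the bit arithmetic; the function names are hypothetical, and writing the new value to the schema is a separate step not shown here.

```python
PRESERVE_ON_DELETE = 0x8  # searchFlags bit: keep this attribute on the tombstone


def with_preserve_on_delete(search_flags):
    """Return search_flags with the preserve-on-delete bit set."""
    return search_flags | PRESERVE_ON_DELETE


def is_preserved(search_flags):
    """Report whether the preserve-on-delete bit is already set."""
    return bool(search_flags & PRESERVE_ON_DELETE)


if __name__ == "__main__":
    current = 1  # example: an attribute that is only indexed
    print(is_preserved(current), with_preserve_on_delete(current))
```

Using a bitwise OR rather than simply writing 8 keeps whatever other flags the attribute already carries.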

In AD forests operating at the Server 2008 R2 forest functional level (FFL), you can enable a new feature known as the Active Directory Recycle Bin. As Figure 2 shows, the Active Directory Recycle Bin adds an intermediate state between when an object is deleted and when it is tombstoned. When an object is in this new deleted state, it’s hidden from search results but all its attributes (including linked attributes such as group membership) are preserved.
Figure 2: Life cycle of an AD object when the Active Directory Recycle Bin is enabled
An object in the deleted object phase can be recovered to the exact state it was in at the time of deletion using the same process that’s used to reanimate a tombstone. By default, an object stays in the deleted object phase for the same amount of time as the forest’s tombstone lifetime, as outlined in Table 1. You can change this time period by modifying the forest’s msDS-deletedObjectLifetime attribute.
After the deleted object lifetime expires, the garbage collector moves the object into the recycled object phase. A recycled object is the functional equivalent of a tombstone, with one important difference: You can’t reanimate a recycled object or restore it from a backup.
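Both life cycles can be summarized as a function of how long ago an object was deleted. The following Python sketch is a simplified model: it assumes the recycled object phase lasts one tombstone lifetime, and it treats an object as "purged" as soon as it becomes eligible, even though the garbage collector only removes it on its next run. The function name and state labels are hypothetical.

```python
def object_state(days_since_deletion, tombstone_lifetime,
                 recycle_bin=False, deleted_object_lifetime=None):
    """Return the life cycle phase of a deleted object (simplified model)."""
    if not recycle_bin:
        if days_since_deletion < tombstone_lifetime:
            return "tombstone"
        return "purged"
    if deleted_object_lifetime is None:
        # By default the deleted object lifetime equals the tombstone lifetime.
        deleted_object_lifetime = tombstone_lifetime
    if days_since_deletion < deleted_object_lifetime:
        return "deleted"   # fully recoverable
    if days_since_deletion < deleted_object_lifetime + tombstone_lifetime:
        return "recycled"  # no longer recoverable
    return "purged"


if __name__ == "__main__":
    for days in (10, 200, 400):
        print(days, object_state(days, 180, recycle_bin=True))
```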

Object Recovery Mechanisms

As AD matured from release to release, the mechanisms to recover a deleted object have evolved significantly. In Windows 2000, the only way to get a deleted object back was to perform an authoritative restore from a backup. Windows 2003 introduced the concept of tombstone reanimation, which lets you get a partial copy of the deleted object back without restoring it from a backup. Server 2008 R2 added the Active Directory Recycle Bin, which allows the complete recovery of a deleted object without a restoration.
It’s important to note that the shelf lifetime of an AD backup (as well as IFM media) is the same as the tombstone lifetime. If you have the Active Directory Recycle Bin enabled, the shelf lifetime is the lesser of the deleted object lifetime and the recycled object lifetime. For example, if the deleted object lifetime is 180 days and the recycled object lifetime is 60 days, then the shelf lifetime is 60 days. Thus, it isn’t possible to restore a deleted object from a backup that’s older than the lesser of these two values.
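The shelf lifetime rule is simple arithmetic, sketched here in Python. The function names are hypothetical, and treating a backup whose age exactly equals the shelf lifetime as still usable is an assumption.

```python
from datetime import date


def backup_shelf_life_days(deleted_object_lifetime, recycled_object_lifetime):
    """With the Recycle Bin enabled, the shelf lifetime is the lesser value."""
    return min(deleted_object_lifetime, recycled_object_lifetime)


def backup_is_usable(backup_date, today, shelf_life_days):
    """Report whether a backup is still within its shelf lifetime."""
    return (today - backup_date).days <= shelf_life_days


if __name__ == "__main__":
    shelf = backup_shelf_life_days(180, 60)
    print(shelf, backup_is_usable(date(2015, 7, 1), date(2015, 8, 19), shelf))
```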

Authoritative Restore

When you need to get an object or series of objects back from a backup, the authoritative restore process is often the way to go. If you’ve ever wondered what the Directory Services Restore Mode (DSRM) option on a DC’s F8 boot menu is for, this is the option you choose to perform an authoritative restore. When you boot in DSRM mode, AD is never started and the database is offline. You can restore the AD database from a backup while booted into DSRM mode, then use Ntdsutil to select the objects that need to be restored. Note that it isn’t possible to perform a restore when the AD NTDS service is stopped on Server 2008 and later DCs.
When you perform an authoritative restore, AD increments the internal version number of the objects being restored. This ensures that when the DC is back online, those objects are replicated out into the rest of the domain and the restored version becomes globally effective.
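A toy model helps show why the version bump matters. In the Python sketch below, each replica's copy of an object carries a stamp, and the copy with the higher version wins, with the timestamp and originating DC as tie-breakers. The `Stamp` type, the function names, and the increment of 100,000 are illustrative assumptions, not AD's actual data structures.

```python
from typing import NamedTuple


class Stamp(NamedTuple):
    """Simplified replication stamp; fields compare in this order."""
    version: int
    timestamp: int       # seconds, simplified
    originating_dc: str


def winner(a: Stamp, b: Stamp) -> Stamp:
    """Pick the stamp that replication would keep in this toy model."""
    return max(a, b)


def mark_authoritative(stamp: Stamp, increment: int = 100_000) -> Stamp:
    """Raise the version so the restored copy outranks every other replica.

    The increment is an illustrative value, assumed for this sketch.
    """
    return stamp._replace(version=stamp.version + increment)


if __name__ == "__main__":
    from_backup = Stamp(version=3, timestamp=1000, originating_dc="DC1")
    live_deleted = Stamp(version=5, timestamp=2000, originating_dc="DC2")
    print(winner(from_backup, live_deleted))                      # the deletion wins
    print(winner(mark_authoritative(from_backup), live_deleted))  # the restore wins
```

Without the bump, the restored copy looks older than the deletion and would be overwritten as soon as the DC replicates.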
Authoritative restores are often performed to recover OUs that contain a large number of objects (e.g., users, groups, computers, other OUs). Suppose that you accidentally deleted the Executives OU from the contoso.com domain. To get the OU and everything in it back, here are the steps you need to take:
1.     Boot into DSRM mode and log on with the DSRM password you set during Dcpromo.
2.     Restore a system state backup that was created before the accident. Don’t reboot. (This is a common mistake, especially when under pressure.)
3.     Launch a command prompt window and run Ntdsutil.
4.     Run the command
authoritative restore
5.     Run the command
restore subtree OU=Executives,DC=contoso,DC=com
(Enter this command on a single line. The same holds true for any other command in this article that appears wrapped.)
6.     Review and confirm the confirmation safety prompts. You should then receive a message like the one in Figure 3. Make note of the text and LDAP Data Interchange Format (LDIF) files that are generated.
Figure 3: Message noting a successful authoritative restore
7.     Reboot the DC into normal operating mode.
8.     Log on to the DC and open a command prompt window. Import the LDIF file exported during step 6 by running the command
ldifde -i -f ar_20110221-151131_links_contoso.com.ldf
This will import the linked attribute values (such as group membership) for the objects restored.



If you need to restore only a single object (e.g., a deleted computer object), you can use the restore object command instead of the restore subtree command in step 5. If your forest contains multiple domains, you need to use the text file exported in step 6 to restore group membership for domain local groups in other domains.

Tombstone Reanimation

There are a number of tools that you can use to reanimate a tombstone, but they all ultimately perform the same steps. So, as an example, here are the steps you need to take to reanimate a deleted user named John Doe with the AdRestore utility:
1.     Open a command prompt window and search for the user with the command
adrestore Doe
AdRestore will search the deleted objects for anything matching *doe* and return output like that in Figure 4.
Figure 4: Sample output from the AdRestore utility
2.     Make sure the object you want to reanimate is present, then run AdRestore again with the -r switch:
adrestore -r Doe
3.     Confirm the prompt asking if you want to reanimate the object. AdRestore will then reanimate the object to the location where it previously resided.
As discussed earlier, tombstones lose most of their attributes when they’re deleted. So, you’ll have to repopulate many of the attributes to make the reanimated object useful again. (Generally speaking, if you use an automated identity management tool, the attributes will be automatically repopulated after the tombstone is reanimated.)

Active Directory Recycle Bin Undelete

The Active Directory Recycle Bin is undoubtedly the best recovery option because all attributes are restored, including linked attributes such as group membership. However, as mentioned previously, your forest needs to be operating at the Windows Server 2008 R2 FFL to take advantage of it.
You can use Windows PowerShell to enable the Active Directory Recycle Bin by running a command such as
Enable-ADOptionalFeature -Identity 'CN=Recycle Bin Feature,CN=Optional Features,CN=Directory Service,CN=Windows NT,CN=Services,CN=Configuration,DC=contoso,DC=com' -Scope ForestOrConfigurationSet -Target 'contoso.com'
Note that enabling the Active Directory Recycle Bin is not a reversible step. In addition, objects that are already tombstoned when you enable the Active Directory Recycle Bin will no longer be recoverable through tombstone reanimation.
After you’ve enabled the Active Directory Recycle Bin, any objects that are subsequently deleted will be recoverable in their entirety for the duration of the forest’s deleted object lifetime. There are a number of ways to undelete objects, but the easiest is to use PowerShell’s Restore-ADObject cmdlet. For example, here are the steps to undelete a user named John Doe:
1.     Launch the Active Directory Module for Windows PowerShell from the Administrative Tools section of the Start menu.
2.     Search for the deleted user by running the command
Get-ADObject -SearchBase "CN=Deleted Objects,DC=contoso,DC=com" -ldapFilter:"(msDs-lastKnownRDN=John Doe)" -IncludeDeletedObjects -Properties lastKnownParent
Make sure that it’s the only object returned in the result set.
3.     Restore that object with the command
Get-ADObject -SearchBase "CN=Deleted Objects,DC=contoso,DC=com" -ldapFilter:"(msDs-lastKnownRDN=John Doe)" -IncludeDeletedObjects -Properties lastKnownParent | Restore-ADObject
If you deleted an entire OU, you’ll need to recover objects in the correct order (i.e., such that an object is not recovered before its parent is recovered) so that they can be put back where they belong. Microsoft has posted a tree undelete PowerShell script that you can use to perform this task.
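The ordering requirement amounts to sorting by depth in the tree. The Python sketch below shows that idea only; it isn't Microsoft's script. It assumes you have already worked out each object's original DN (for example, from its last known parent and name), and the function names are hypothetical.

```python
import re


def split_dn(dn):
    """Split a DN on commas that aren't escaped with a backslash."""
    return [part.strip() for part in re.split(r"(?<!\\),", dn)]


def parents_first(original_dns):
    """Order DNs so every container comes before the objects inside it.

    Sorting by the number of DN components guarantees a parent (which has
    one component fewer) is always listed before its children.
    """
    return sorted(original_dns, key=lambda dn: len(split_dn(dn)))


if __name__ == "__main__":
    deleted = [
        "CN=John Doe,OU=Sales,OU=Executives,DC=contoso,DC=com",
        "OU=Executives,DC=contoso,DC=com",
        "OU=Sales,OU=Executives,DC=contoso,DC=com",
    ]
    for dn in parents_first(deleted):
        print(dn)
```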

A Complex Task

Planning for an AD disaster is a complex task because of the multitude of things that can go wrong. However, if you know how to recover from a failed DC and the accidental deletion of an object or an entire tree of objects (such as an OU), you’re well on your way to being prepared for a disaster.
