Workflow Steps
Perform a datacenter switchover for a database availability group
Version 1.2 (Updated 12/2012)

Exchange 2010 - Datacenter Switchover: Stop-DatabaseAvailabilityGroup, Restore-DatabaseAvailabilityGroup
Exchange 2010 - Datacenter Switchback: Start-DatabaseAvailabilityGroup

Stop-DatabaseAvailabilityGroup
Has the datacenter switchover been approved? YES / NO

Stop-DatabaseAvailabilityGroup
Is the primary datacenter online or physically accessible? YES / NO

Stop-DatabaseAvailabilityGroup
Do the remote and primary datacenters have network connectivity? YES / NO

Stop-DatabaseAvailabilityGroup
Are the Exchange servers in the primary datacenter online? YES / NO

Stop-DatabaseAvailabilityGroup
Is your DAG extended to multiple Active Directory sites? YES / NO

Stop-DatabaseAvailabilityGroup
COMMANDS:
Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>
Repeat the above command for every Active Directory site that contains DAG members and is not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) Exchange servers that were accessible in the primary datacenter should have their Cluster services forcibly cleaned up, and the Cluster service should be configured with a startup type of DISABLED. You can verify this using Services.msc.
3) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>
Repeat the above command for all DAG members that are not in the recovery datacenter.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) Exchange servers that were accessible in the primary datacenter should have their Cluster services forcibly cleaned up, and the Cluster service should be configured with a startup type of DISABLED. You can verify this using Services.msc.
3) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
Is your DAG extended to multiple Active Directory sites? YES / NO

Stop-DatabaseAvailabilityGroup
COMMANDS:
Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
Repeat the command for all DAG members that are not in the recovery datacenter.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
Are the Exchange servers in the primary datacenter online?
YES / NO

Stop-DatabaseAvailabilityGroup
Is your DAG extended to multiple Active Directory sites? YES / NO

Stop-DatabaseAvailabilityGroup
COMMANDS:
Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
Optional: If Exchange Management Shell access to the primary datacenter is available, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>
Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) Exchange servers that were accessible in the primary datacenter should have their Cluster services forcibly cleaned up, and the Cluster service should be configured with a startup type of DISABLED. You can verify this using Services.msc.
3) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
If no Exchange server is functional to service the Exchange Management Shell, this step can be skipped.
Command Completed?
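
The "repeat for any additional Active Directory sites" instruction in the site-based steps above can be scripted as a loop. The following is a minimal sketch rather than part of the original workflow. The DAG name and site names are hypothetical placeholders, and it assumes the Exchange Management Shell on a server that can service the command. Set $configurationOnly to match the step you are performing.

```powershell
# Hypothetical placeholder values; substitute your own DAG and site names.
$dagName           = 'DAG1'
$sitesToStop       = @('PrimarySiteA', 'PrimarySiteB')   # every DAG site except the recovery site
$configurationOnly = $true   # $true for the -ConfigurationOnly:$True steps, $false otherwise

foreach ($site in $sitesToStop) {
    # One Stop-DatabaseAvailabilityGroup call per non-recovery Active Directory site.
    Stop-DatabaseAvailabilityGroup -Identity $dagName -ActiveDirectorySite $site -ConfigurationOnly:$configurationOnly
}

# Review the Started/Stopped lists afterwards, as described in the expected outcomes.
Get-DatabaseAvailabilityGroup -Identity $dagName |
    Format-List Name, StartedMailboxServers, StoppedMailboxServers
```

The same pattern applies to the per-server steps: iterate over the DAG member names and pass -MailboxServer instead of -ActiveDirectorySite.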

Stop-DatabaseAvailabilityGroup
COMMANDS:
Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>
Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
Is your DAG extended to multiple Active Directory sites? YES / NO

Stop-DatabaseAvailabilityGroup
COMMANDS:
Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
OPTIONAL: Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG (this assumes at least one Exchange server exists):
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
Optional: If Exchange Management Shell access to the primary datacenter is available, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG (this assumes at least one Exchange server exists):
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
Is your DAG extended to multiple Active Directory sites? YES / NO

Stop-DatabaseAvailabilityGroup
COMMANDS:
Optional: If Exchange Management Shell access to the primary datacenter is available, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
2) The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
Repeat the command for all DAG members that are not in the recovery datacenter.
EXPECTED OUTCOMES:
1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
   The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
Command Completed?
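
The list check that each Stop-DatabaseAvailabilityGroup step asks for can also be done programmatically rather than by reading the FL output. The helper below is a sketch under stated assumptions: Test-SameServerSet is a hypothetical function name, the server names are placeholders, and it assumes that each entry in StartedMailboxServers and StoppedMailboxServers converts to a string holding either a short host name or an FQDN.

```powershell
# Returns $true when two server lists name the same set of hosts.
# Entries may be short names or FQDNs; only the host part is compared, ignoring case.
function Test-SameServerSet {
    param(
        [string[]]$Actual,
        [string[]]$Expected
    )

    $normalize = {
        param($names)
        @($names |
            Where-Object { $_ } |
            ForEach-Object { ($_ -split '\.')[0].ToUpper() } |
            Sort-Object -Unique)
    }

    $a = & $normalize $Actual
    $e = & $normalize $Expected
    return (($a -join ',') -eq ($e -join ','))
}

# Usage in the Exchange Management Shell (DAG and server names are hypothetical):
#   $dag = Get-DatabaseAvailabilityGroup -Identity 'DAG1'
#   Test-SameServerSet -Actual ($dag.StoppedMailboxServers | ForEach-Object { "$_" }) -Expected 'MBX-P1','MBX-P2'
#   Test-SameServerSet -Actual ($dag.StartedMailboxServers | ForEach-Object { "$_" }) -Expected 'MBX-R1','MBX-R2'
```

Both calls should return True before you continue to Restore-DatabaseAvailabilityGroup. A False result means a server is missing from, or unexpectedly present on, one of the lists.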

Restore-DatabaseAvailabilityGroup
Did Stop-DatabaseAvailabilityGroup complete successfully? YES / NO

Restore-DatabaseAvailabilityGroup
COMMANDS:
Stop the Cluster service on each DAG member in the recovery datacenter. To do this, run the appropriate command for your DAG member’s operating system:
• Windows Server 2008 R2: Stop-Service Clussvc
• Windows Server 2008 SP2: Net Stop Clussvc
EXPECTED OUTCOMES:
Cluster services are stopped on the remaining nodes.
COMMON ERRORS:
Access denied. If you are not using the default Administrator account, run the command from an elevated command prompt (Run as administrator).
Command Completed?

Restore-DatabaseAvailabilityGroup
Is the Cluster service stopped on all DAG members in your recovery datacenter? YES / NO

Restore-DatabaseAvailabilityGroup
COMMANDS:
From the Exchange Management Shell on an Exchange server in the recovery datacenter, run:
    Restore-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <recovery site> -AlternateWitnessDirectory:<AWSPath> -AlternateWitnessServer:<AWSName>
EXPECTED OUTCOMES:
1) A DAG member in the recovery datacenter is randomly selected and its Cluster service is started in /forcequorum mode.
2) DAG members on the StoppedMailboxServers list are evicted from the DAG’s cluster, thereby adjusting the membership count.
   a) If the resulting membership count is EVEN or results in a SINGLE node, the cluster is configured with a Node and File Share Majority quorum and begins using the Alternate Witness Server and Alternate Witness Directory.
3) Cluster services are started on the remaining DAG members and they successfully join the DAG’s cluster.
VERIFICATION:
Run the following commands to verify that the DAG members are up and the Cluster Group is online:
Windows Server 2008 R2
1) Import-Module FailoverClusters
2) Get-ClusterNode -Cluster <DAGName>
3) Get-ClusterGroup -Cluster <DAGName>
Windows Server 2008 SP2
1) Cluster <DAGName> node
2) Cluster <DAGName> group
COMMON ERRORS:
Nodes fail to evict with error 0x46. See http://aka.ms/0x46
Command Completed?

Restore-DatabaseAvailabilityGroup
Assuming all prerequisites have been met, any activation blocks can now be removed and databases can be mounted.
Command Completed?

Start-DatabaseAvailabilityGroup
Is your primary datacenter online? YES / NO

Start-DatabaseAvailabilityGroup
Ensure that supporting services are available, including but not limited to:
1) Active Directory / domain controllers / global catalog / FSMO role holders
2) Domain Name Services (DNS)
3) Witness Server
4) Supporting Exchange roles: Client Access and Hub Transport
OPTIONAL:
• Dynamic Host Configuration Protocol (DHCP) servers, if DHCP addresses are used for DAG networks
• Edge Transport server
• Unified Messaging server
Continue…

Start-DatabaseAvailabilityGroup
Are the necessary services established and functioning? YES / NO

Start-DatabaseAvailabilityGroup
COMMANDS:
Verify network connectivity between all DAG members. Suggested methods:
1) Ping test between DAG members
2) Map administrative shares between DAG members
EXPECTED OUTCOMES:
Connectivity between datacenters is functioning and all cluster inter-node communications are operating normally.
Command Completed?

Start-DatabaseAvailabilityGroup
Have datacenter communications been verified? YES / NO

Start-DatabaseAvailabilityGroup
Verify that the Cluster service on each DAG member in the primary datacenter has a startup type of DISABLED. If it does not, either the Stop-DatabaseAvailabilityGroup command was not successful or the DAG members in the primary datacenter failed to receive eviction notification after network connectivity between the datacenters was restored.
Do not proceed until Cluster service cleanup has occurred and the Cluster service has a startup type of DISABLED.
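
As an alternative to opening Services.msc on each server, the startup type can be read remotely through WMI. This is a sketch only: the server names are hypothetical placeholders, and it assumes WMI connectivity and administrative rights on the primary datacenter DAG members.

```powershell
# Hypothetical names of the DAG members in the primary datacenter.
$primaryServers = @('MBX-P1', 'MBX-P2')

foreach ($server in $primaryServers) {
    # Win32_Service reports the startup type as StartMode (Auto, Manual, or Disabled).
    Get-WmiObject -Class Win32_Service -ComputerName $server -Filter "Name='ClusSvc'" |
        Select-Object SystemName, Name, StartMode, State
}
```

Every server should report a StartMode of Disabled before you continue with the switchback.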
You can optionally run the following command on the DAG members in the primary datacenter to forcibly clean up the outdated cluster information:
    Cluster node /forcecleanup
Continue…

Start-DatabaseAvailabilityGroup
Does the Cluster service show a startup type of DISABLED? YES / NO

Start-DatabaseAvailabilityGroup
Is your DAG extended to multiple Active Directory sites? YES / NO

Start-DatabaseAvailabilityGroup
COMMAND:
Using the Exchange Management Shell, run the following command:
    Start-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>
Repeat for all other Active Directory sites that were stopped during the datacenter switchover process.
EXPECTED OUTCOMES:
1) DAG members in the primary datacenter are added to the DAG’s cluster.
2) If the resulting membership count is EVEN, the cluster is configured to use the Node and File Share Majority quorum.
VERIFICATION:
Run the following commands to verify that the DAG members are up and the Cluster Group is online:
Windows Server 2008 R2
1) Import-Module FailoverClusters
2) Get-ClusterNode -Cluster <DAGName>
3) Get-ClusterGroup -Cluster <DAGName>
Windows Server 2008 SP2
1) Cluster <DAGName> node
2) Cluster <DAGName> group
The following command should show a StartedMailboxServers list that contains all DAG members and an empty StoppedMailboxServers list:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
COMMON ERRORS:
Nodes may fail to join the cluster with an invalid node error. If this occurs, retry the command.
Continue…

Start-DatabaseAvailabilityGroup
COMMAND:
Using the Exchange Management Shell, run the following command:
    Start-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>
Repeat for all other Mailbox servers that were stopped during the datacenter switchover process.
EXPECTED OUTCOMES:
1) DAG members in the primary datacenter are added to the DAG’s cluster.
2) If the resulting membership count is EVEN, the cluster is configured to use the Node and File Share Majority quorum.
VERIFICATION:
Run the following commands to verify that the DAG members are up and the Cluster Group is online:
Windows Server 2008 R2
1) Import-Module FailoverClusters
2) Get-ClusterNode -Cluster <DAGName>
3) Get-ClusterGroup -Cluster <DAGName>
Windows Server 2008 SP2
1) Cluster <DAGName> node
2) Cluster <DAGName> group
The following command should show a StartedMailboxServers list that contains all DAG members and an empty StoppedMailboxServers list:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
COMMON ERRORS:
Nodes may fail to join the cluster with an invalid node error. If this occurs, retry the command.
Continue…

Start-DatabaseAvailabilityGroup
Were the DAG members added to the cluster successfully? YES / NO

Start-DatabaseAvailabilityGroup
Were the DAG members added to the cluster successfully? YES / NO

Start-DatabaseAvailabilityGroup
COMMANDS:
Reset the DAG’s Witness Server and Alternate Witness Server properties by running the following command:
    Set-DatabaseAvailabilityGroup -Identity <DAGName> -WitnessServer <WSName> -AlternateWitnessServer <AWSName>
EXPECTED OUTCOMES:
The Witness Server and Alternate Witness Server properties are configured so that the appropriate witness server is in use.
If the cluster configuration does not match the DAG configuration, the cluster is updated with the proper configuration.
COMMON ERRORS:
Administrators incorrectly verify which file share witness is currently in use. See http://aka.ms/E14FSW.
Continue…

Start-DatabaseAvailabilityGroup
After any activation blocks have been removed, active database copies can be moved to servers in the primary datacenter.
Continue…
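
After the witness properties have been reset, it can help to compare the DAG's configured witness settings with what the cluster actually reports. The sketch below is an illustration, not part of the original workflow. The DAG name is a hypothetical placeholder, it assumes that your Exchange 2010 build returns the WitnessShareInUse property when -Status is specified, and Get-ClusterQuorum is available only on Windows Server 2008 R2 with the FailoverClusters module.

```powershell
# Hypothetical placeholder; substitute your own DAG name.
$dagName = 'DAG1'

# -Status requests live values in addition to the configuration stored in Active Directory.
Get-DatabaseAvailabilityGroup -Identity $dagName -Status |
    Format-List Name, WitnessServer, WitnessDirectory, AlternateWitnessServer, AlternateWitnessDirectory, WitnessShareInUse

# Windows Server 2008 R2 only: show the quorum type and quorum resource the cluster is using.
Import-Module FailoverClusters
Get-ClusterQuorum -Cluster $dagName
```

If the two outputs disagree about which witness is in use, resolve that before moving active database copies back to the primary datacenter.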