How to recover a Single DAG Mailbox Server Member
Whoa!!! One of my servers in DAG crashed and it’s crashed for good (RIP)… what next???
The steps outlined in this document are a “MUST-MUST HAVE” for all Exchange pros as part of your DR documentation. I can tell you with confidence that you don’t want to wait for a server to crash and then go haywire looking for this. Read and understand every piece of information outlined here and try to replicate it in a lab if you have one… Let’s get started…!
In the case of a single server failure there will be no impact on production, as the database copies will come online on the other DAG member. We still need the server back in production so that we can continue providing the agreed DAG infrastructure. The following steps guide you through recovering from a single server failure in an Exchange 2010 DAG.
So, I have a 2-member DAG setup and one of the members failed. The following are the steps to recover that server. Yes – I know you can add a different server as a member of the DAG, but knowing this process won’t hurt.
Note:
If Exchange is installed in a location other than the default location, you must use the /TargetDir Setup switch to specify the location of the Exchange program files. If you don’t use the /TargetDir switch, the Exchange program files will be installed in the default location (%programfiles%\Microsoft\Exchange Server\V14).
To determine the install location, follow these steps:
- Open ADSIEDIT.MSC or LDP.EXE.
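- Navigate to the following location: CN=ExServerName,CN=Servers,CN=First Administrative Group,CN=Administrative Groups,CN=ExOrg Name,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=DomainName,DC=Com (in an Exchange 2010 organization the administrative group is normally named Exchange Administrative Group (FYDIBOHF23SPDLT) rather than First Administrative Group).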
- Right-click the Exchange server object, and then click Properties.
- Locate the msExchInstallPath attribute. This attribute stores the current installation path.
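If you prefer a shell over ADSIEDIT, the same lookup can be sketched in PowerShell. This is only a sketch: it assumes the ActiveDirectory module (RSAT) is available on the machine you run it from, and it uses MBX1 (the example server name used later in this post) as the failed server.

```powershell
# Sketch: read msExchInstallPath for the failed server from the
# AD configuration partition. Assumes the ActiveDirectory module is installed.
Import-Module ActiveDirectory

$serverName = 'MBX1'   # example name; replace with your failed server
$configNC   = (Get-ADRootDSE).configurationNamingContext

# Search under the Exchange organization container for the server object
$exServer = Get-ADObject `
    -SearchBase "CN=Microsoft Exchange,CN=Services,$configNC" `
    -LDAPFilter "(&(objectClass=msExchExchangeServer)(cn=$serverName))" `
    -Properties msExchInstallPath

# The attribute holds the original installation path
$exServer.msExchInstallPath
```

If the value returned is not the default location, pass it to Setup during the recovery, for example Setup.com /m:RecoverServer /TargetDir:"<path returned above>".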
Steps of DAG Single server recovery
Clean-up of the crashed server data from Exchange DAG:
Retrieve any replay lag or truncation lag settings for any mailbox database copies.
Get-MailboxDatabase DB1 | Format-List *lag*
Remove any mailbox database copies that were hosted on the failed server.
Remove-MailboxDatabaseCopy DB1\MBX1
Remove the failed server’s configuration from the DAG.
Remove-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer MBX1 -ConfigurationOnly
Evict the node from the failover cluster: open Failover Cluster Manager and evict the crashed node from the cluster.
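The clean-up steps above can also be run as one sequence. What follows is a rough sketch, not a tested script. It assumes you run it in the Exchange Management Shell on the surviving DAG member, that the FailoverClusters module is available there, and that MBX1 and DAG1 are the example names from this post. The output file location is just an illustration.

```powershell
# Sketch of the clean-up sequence; MBX1 and DAG1 are example names.
$failedServer = 'MBX1'
$dagName      = 'DAG1'

# 1. Record the lag settings before removing anything, so they can be
#    re-applied when the copies are added back.
Get-MailboxDatabase -Server $failedServer |
    Format-List Name, ReplayLagTimes, TruncationLagTimes, ActivationPreference |
    Out-File "$env:TEMP\$failedServer-lag-settings.txt"

# 2. Remove each database copy hosted on the failed server.
#    The confirmation prompt is left on deliberately.
Get-MailboxDatabase -Server $failedServer | ForEach-Object {
    Remove-MailboxDatabaseCopy -Identity "$($_.Name)\$failedServer"
}

# 3. Remove the failed server's configuration from the DAG.
Remove-DatabaseAvailabilityGroupServer -Identity $dagName `
    -MailboxServer $failedServer -ConfigurationOnly

# 4. Evict the dead node from the underlying failover cluster.
Import-Module FailoverClusters
Remove-ClusterNode -Name $failedServer -Force
```

Check the saved lag settings file before moving on; you will need those values when you add the database copies back.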
Now proceed with steps below:
- Reset the server’s computer account in Active Directory.
- Join the rebuilt server to the domain using the same name as the failed server.
- Install IIS.
- Install .NET Framework 3.5.1.
- Install the Office Filter Pack.
- Open a Command Prompt window with Run as administrator.
- Run Exchange Setup with the command below.
Setup.com /m:RecoverServer
- When the Setup recovery process has completed, open the Exchange Management Shell and run the cmdlet below.
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer MBX1
- After the server has been added back to the DAG, reconfigure the mailbox database copies using the three cmdlets below. The lag values shown are examples; use the replay lag and truncation lag settings you retrieved earlier.
Add-MailboxDatabaseCopy -Identity <DBName> -MailboxServer <destinationmailboxservername> -ReplayLagTime 0.00:01:00 -TruncationLagTime 0.00:01:00 -ActivationPreference <number>
Suspend-MailboxDatabaseCopy -Identity <DBName>\<destinationservername>
Update-MailboxDatabaseCopy -Identity <DBName>\<destinationservername> -SourceServer <sourcemailboxserver> -DeleteExistingFiles
Note: Wait for the seeding to complete, which might take more than 24 hours depending on the size of the database.
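While you wait, you can keep an eye on the recovered member from the Exchange Management Shell. The lines below are a minimal sketch using the example names DAG1 and MBX1; the columns picked for the table are just a suggestion.

```powershell
# Sketch: verify the recovered member and watch seeding/replication.
# DAG1 and MBX1 are the example names used in this post.

# Confirm the server is back in the DAG and reported as operational
Get-DatabaseAvailabilityGroup DAG1 -Status |
    Format-List Name, Servers, OperationalServers

# Check the state of every database copy on the recovered server
Get-MailboxDatabaseCopyStatus -Server MBX1 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState -AutoSize

# Run the built-in replication health checks against the recovered server
Test-ReplicationHealth -Identity MBX1
```

Once seeding has finished, the copies on the recovered server should report a Healthy status and the copy queue should drain.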
Prabhat Nigam (Wizkid)
Team@MSExchangeGuru
January 15th, 2014 at 12:13 am
Yeah that’s all well and good but when you have 2 nodes and one dies, how the hell do you get the remaining one up and running while you rebuild the failed one?
January 15th, 2014 at 12:38 am
@Craig
There is no way to avoid it other than quickly installing another server. I have never seen 2 nodes go down together. But if you have bought bad hardware, then add an additional server.
If all the hardware is faulty, replace the hardware and rebuild the environment from backup.
Let me know if you need any help. We have a few architects in Melbourne and Sydney who can help.
January 15th, 2014 at 1:14 am
Sorry mate but you are wrong cause I just did it.
Removed the failed server in adsiedit and forced a remount of the databases on the working server. I am back up and running.
I know its not your fault. It is extremely poor design. The whole idea of a failover cluster is to work if one fails. Not to shut it down cause one node aint working. SERIOUSLY, if microsoft worked in the real world, it may be a good system.
January 15th, 2014 at 9:33 am
Craig
I have done hundreds of DAG installations, and many of them were just 2-node. Failover has worked every single time. So this is not Microsoft’s fault; if the DAG isn’t configured properly, you need a skilled resource.
I would highly recommend training for you; what you are doing is badly incorrect. You should have contacted an expert and had it fixed rather than breaking the environment. Send me your organization’s info and I will organize some training for you.
Removing the failed server with ADSI Edit means you are not recovering the server in your environment. We never recommend this option.
December 23rd, 2014 at 1:28 am
Dear Prabhat, Do we need to consider anything if we have EDGE sync, could you give details on this.
December 24th, 2014 at 10:18 am
Lakshmi,
Yes. We need separate Edge Transport servers on the DR site, configured for separate mail flow from the DR site.
When reverting back to the Prod site, you need to reconfigure the Edge Transport servers.
February 26th, 2015 at 3:24 pm
Hello, i have a situation where one of the member servers in my dag has a hardware problem so i am replacing it with another server (same IP configuration, different name) do i follow these same steps? or can you guide me please
February 26th, 2015 at 4:01 pm
Yes follow the blog
February 26th, 2015 at 5:11 pm
OK THERES ONLY TWO MEMBER SERVERS IN MY DAG PROCEED ANYWAYS? THANKS
February 26th, 2015 at 5:35 pm
Yes Enrique Listen to Prabhat follow the blog step by step. Make sure YOU DO NOT SKIP ANY STEP or else you wont succeed.
February 26th, 2015 at 5:59 pm
Make sure you have a healthy copy active on the other server and you have taken full back up of exchange and AD
February 26th, 2015 at 6:47 pm
Ok will do. Thanks
March 2nd, 2015 at 12:59 pm
Hi Prabhat, im trying to install exchange on the new server that will replace the server that died in my DAG, but im getting an error RMS Shared Identity user ‘CN=FederatedEmail.4c1f4d8b-8179-4148-93bf-00a95fa1e042,CN=Users,DC=exchange,DC=com’ (originating server = ‘W2K3DC01.exchange.com’) is being linked to computer ‘CN=Computer,CN=Computers,DC=exchange,DC=com’ (originating server = ‘W2K3DC01.exchange.com).
[06-05-2009 11:50:26.0856] [2] [ERROR] Database is mandatory on UserMailbox. Property Name: Database.
[06-05-2009 11:50:26.0950] [2] Ending processing.
[06-05-2009 11:50:26.0950] [1] The following 1 error(s) occurred during task execution:
[06-05-2009 11:50:26.0950] [1] 0. ErrorRecord: Database is mandatory on UserMailbox. Property Name: Database.
[06-05-2009 11:50:26.0950] [1] 0. ErrorRecord: Microsoft.Exchange.Data.DataValidationException: Database is mandatory on UserMailbox. Property Name: Database.
at Microsoft.Exchange.Data.Directory.ADSession.Save(ADObject instanceToSave, IEnumerable`1 properties)
at Microsoft.Exchange.Management.Deployment.UpdateRmsSharedIdentity.Link()
at Microsoft.Exchange.Management.Deployment.UpdateRmsSharedIdentity.InternalProcessRecord()
at Microsoft.Exchange.Configuration.Tasks.Task.ProcessRecord()
[06-05-2009 11:50:26.0950] [1] [ERROR] The execution of: “$error.Clear(); if ( ($server -eq $null) -and ($RoleIsDatacenter -ne $true) ) { Update-RmsSharedIdentity -ServerName $RoleNetBIOSName }”, generated the following error: “Database is mandatory on UserMailbox. Property Name: Database.”.
[06-05-2009 11:50:26.0950] [1] [ERROR] Database is mandatory on UserMailbox. Property Name: Database.
[06-05-2009 11:50:26.0981] [1] [WARNING] <<< Setup failed to execute a task.
March 2nd, 2015 at 1:03 pm
MY QUESTION IS, I FOUND POSSIBLE SOLUTION,WHICH IS TO LOCATE Locate the CN=FederatedEmail.4c1f4d8b-8179-4148-93bf-00a95fa1e042 container THEN DELETE IT, AND RERUN THE INSTALL THEN I WILL BE ABLE TO RECREATE CN=FederatedEmail.4c1f4d8b-8179-4148-93bf-00a95fa1e042 container. SO MY QUESTION IS, IF I DELETE THIS? WHAT PROBLEMS MUST I EXPECT? AND JUST TO CONFIRM I DELETE IT FROM MY DC RIGHT?
March 2nd, 2015 at 1:05 pm
Looks like you have lost one of the system mailboxes on this server. You need to disable this mailbox and enable it again on an active database.
March 2nd, 2015 at 1:22 pm
WHEN I RUN GET-MAILBOX -ARBITRATION I DONT GET AN ERROR, IT DISPLAYS THE NAME, AND IT SHOWS 2 SYSTEM MAILBOXES, SAME UNDER ALIAS, AND THE SERVER NAME IT DISPLAYS THE SERVER IN MY DAG THAT DOES WORK! THE NEW SERVER THAT IM INSTALLING EXCHANGE ON HAS THE SAME IP CONFIG AS THE OLD SERVER THAT DIED, FROM WHERE DO I DISABLE THE MAILBOX AND ENABLE IT BACK WITH AN ACTIVE DATABASE??? PLEASE HELP
March 2nd, 2015 at 4:19 pm
Run the following commands in the Exchange Management Shell on the active server:
Disable-Mailbox "FederatedEmail.4c1f4d8b-8179-4148-93bf-00a95fa1e042" -Arbitration
Enable-Mailbox "FederatedEmail.4c1f4d8b-8179-4148-93bf-00a95fa1e042" -Arbitration -Database "<currently mounted DB>"
Hope this helps
March 2nd, 2015 at 4:31 pm
ok, what is the risk of running this command, i currently have 5 mounted DBs, i run this command for each DB mounted??
March 2nd, 2015 at 4:36 pm
These commands just disable and re-enable the federated mailbox, which is what is affecting your RMS.
March 2nd, 2015 at 4:39 pm
ok, so theres no risk of running this?? and i run this command with each of my 5 DBS correct? once i run this command i should be able to install my Exchange on my new Server that is replacing the old member server from my DAG right? Thanks in advanced Prabhat.
March 2nd, 2015 at 5:24 pm
A mailbox lives in only one DB, so you have to run it for only one DB. No risk. Your RMS might already be down.
July 26th, 2016 at 4:15 am
Awesome how-to much appreciated.
August 8th, 2016 at 12:29 pm
Hi Prabhat
I still fail to understand the real need of recovery. If i have mail1 & mail2 in my DAG and mail1 fails the mail flow will continue via mail2. i can always install mail3 and make it as a third node in the same DAG and just remove the failed mail1 from the DAG. isn’t that more straight forward and simple or i am missing on something here? please clarify!
August 8th, 2016 at 2:05 pm
A new server is your choice, but it comes with more work, meaning you have to redo all the configuration.
August 9th, 2016 at 12:33 am
thanks prabhat! another question while recovering the server is it mandatory to install using the same build of exchange 2013 or a more recent CU can be used?
August 9th, 2016 at 1:07 am
Yes same build is recommended.
December 29th, 2016 at 6:08 am
Hi Prabhat,
is there any step by step blog available for below issue:
I have only single site with 2 node dag and one hub/cas server , and all three exchange server has crashed , i have backup available in the backup device. now i want recover my exchange infra back.
December 30th, 2016 at 6:12 pm
Here are the steps.
Run Setup with the /m:RecoverServer switch for the CAS/Hub server first, then for the Mailbox servers.
Once the servers are back, bring them back into production.
Then restore the backup and swap the databases.
Mail me at prabhat.nigam@GoldenFive.net if you need a professional escalation service, which is recommended in case of disaster.
January 20th, 2017 at 5:36 am
Thanks Prabhat,
Definitely will contact you if required.
January 20th, 2017 at 5:42 am
Hi,
I have two node dag with 10 database , in the both node there is two network configured Network 1 for Mapi and network 2 for replication. How would i know that, currently replication is going through which network either Mapi or replication for specific database?
January 21st, 2017 at 12:32 am
In the DAG configuration you have an option to choose which traffic you would like to pass through it.
February 1st, 2017 at 6:03 am
Thanks, that I understand, but if I have enabled replication on both network 1 and network 2, then how would I know which network is currently shipping the logs? I mean, which network is currently replicating logs?
February 1st, 2017 at 2:23 pm
That is the wrong configuration. There should be one NIC for replication and another for production (MAPI) traffic.
February 10th, 2017 at 4:36 am
Hi Prabhat, I have initiated move request from one to another server in same DAG. mailbox size is 2 gb, but its showing In Progress since 50 hours, and i am not seeing any option to stop the move request, i have already restarted ms exchange mailbox replication service from both side, could you please suggest further steps to troubleshoot.
February 10th, 2017 at 4:42 am
Hello,
Do we have option to stop or Kill seeding/Reseeding process if it is taking more then expected time. how ? and what affect this will cause on the database, source and target ?
February 10th, 2017 at 2:23 pm
Why do you want to do it? It is not recommended. There are multiple ways to cancel reseeding, and doing so causes no issue on either side.
February 10th, 2017 at 2:25 pm
Are you doing a cross-forest move? It might take time.
You can check the status using the command below. You could have googled this command.
Get-MoveRequest | Get-MoveRequestStatistics | Format-List
March 3rd, 2017 at 8:17 am
Hi Prabhat,
We are facing issue while adding IP remote range in receive connector and below are the error , could you please help us.
Microsoft Exchange Error
——————————————————–
The following error(s) occurred while saving changes:
Set-ReceiveConnector
Failed
Error:
The administrative limit for this request was exceeded.
The administration limit on the server was exceeded.
Warning:
The cmdlet extension agent with the index 0 has thrown an exception in OnComplete(). The exception is: System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a send. —> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. —> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
March 3rd, 2017 at 11:11 pm
There is a clear message “The administrative limit for this request was exceeded.”
The AD attribute limit has been reached. Split the IPs across servers if you have multiple Exchange servers.
June 20th, 2017 at 12:37 am
Hi Prabhat,
We have exchange 2010 enterprise environment, and now we are planning to install install two edge server in load balancing in our environment so could you please suggest below query,
How many licence we need for two edge server?
we have enterprise version of exchange So Do we need enterprise or standard license for Edge servers ?
any other suggestion are welcome
June 23rd, 2017 at 11:18 am
2 licenses. Standard should work.
October 24th, 2017 at 9:21 am
Recovering Single member of 2 node dag in my lab. FSW and other member are gone. I restored a full backup of the server and DC VM’s, it was backed up with exchange aware app.
When I get to Remove-MailboxDatabaseCopy it doesn’t want to proceed because it thinks the failed member is the active copy. Furthermore my backup copy came up in a failed state for some reason even though it was not at the time of backup. The cmdlet suggests Move-ActiveDatabaseCopy but I can’t do that because the DAG won’t start. I was NOT running DAC at the time of this backup.
October 24th, 2017 at 9:21 am
I’m on Exchange 2010, Windows server 2012
October 25th, 2017 at 2:36 am
Hi Kenny,
VM restore is not a correct solution for Exchange and AD so there could be multiple issues related to USN and AD objects. I would recommend an Exchange expert like us to handle this issue.
If you need professional help then mail at architects@GoldenFive.net