Monitor Databases in DAGs
A few days ago, someone on the Microsoft Forums asked whether there was a script to alert an administrator when Exchange fails over databases in a DAG.
This is something I have wanted to do for a long time but never got around to. So here is my current solution (it might be improved in the future).
The script can be downloaded from this link. Read on to understand what the script does and how it adds value.
Databases in a DAG, which therefore have multiple copies, have an ActivationPreference attribute that shows the order of preference in which servers should mount the database, for example in case of a disaster.
The following output is an example of what you get if you run the command below in an environment with at least one DAG and multiple database copies:
Get-MailboxDatabase | Sort Name | Select Name, ActivationPreference
Name ActivationPreference
---- --------------------
ADB1 {[MBXA1, 1], [MBXA2, 2]}
ADB2 {[MBXA1, 1], [MBXA2, 2]}
ADB3 {[MBXA1, 1], [MBXA2, 2]}
…
MDB1 {[MBX1, 1], [MBX2, 2], [MBX3, 3], [MBX4, 4]}
MDB2 {[MBX1, 1], [MBX2, 2], [MBX3, 3], [MBX4, 4]}
MDB3 {[MBX1, 1], [MBX2, 2], [MBX3, 3], [MBX4, 4]}
…
Based on the ActivationPreference attribute, we can monitor if databases are currently active on the servers that they should be, i.e., on servers with an ActivationPreference of 1.
To check this, we can use the following script:
Get-MailboxDatabase | Sort Name | ForEach {
    $db = $_.Name
    $curServer = $_.Server.Name
    $ownServer = $_.ActivationPreference | ? {$_.Value -eq 1}
    Write-Host "$db on $curServer should be on $($ownServer.Key) - " -NoNewLine
    If ($curServer -ne $ownServer.Key)
    {
        Write-Host "WRONG" -ForegroundColor Red
    }
    Else
    {
        Write-Host "OK" -ForegroundColor Green
    }
}
This compares the server where the database is currently active with the server that has an ActivationPreference of 1. If they differ, it writes WRONG in red to let the administrator know.
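The same comparison can be wrapped in a small helper so that it can be tried without an Exchange server at hand. This is only a minimal sketch: the function name Test-DatabaseOnPreferredServer and the sample data are hypothetical, and the ActivationPreference entries are modelled as plain objects with Key and Value properties rather than the real Exchange types.

```powershell
# Hypothetical helper: returns $true when a database is active on the server
# whose ActivationPreference value is 1.
function Test-DatabaseOnPreferredServer {
    param(
        [string] $CurrentServer,           # server where the database is currently active
        [object[]] $ActivationPreference   # objects with Key (server) and Value (preference)
    )
    $preferred = $ActivationPreference | Where-Object { $_.Value -eq 1 } | Select-Object -First 1
    # "$($preferred.Key)" converts the key to its string form before comparing
    return ($CurrentServer -eq "$($preferred.Key)")
}

# Sample data standing in for the ActivationPreference of one database (assumed values)
$prefs = @(
    (New-Object PSObject -Property @{ Key = 'MBX1'; Value = 1 }),
    (New-Object PSObject -Property @{ Key = 'MBX2'; Value = 2 })
)

# Database active on its preferred server
Test-DatabaseOnPreferredServer -CurrentServer 'MBX1' -ActivationPreference $prefs
# Database active on a lower-preference server, i.e. it has failed over
Test-DatabaseOnPreferredServer -CurrentServer 'MBX2' -ActivationPreference $prefs
```

Returning a boolean instead of writing to the console makes the check easier to reuse later, for example when building an e-mail body.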
But since we are at it, why not also check for the status of the database and the state of its content index? This can be checked using the Get-MailboxDatabaseCopyStatus cmdlet.
According to the Monitoring High Availability and Site Resilience TechNet article, here are all the possible values for the database copy status:
Database copy status
Failed – The mailbox database copy is in a Failed state because it isn’t suspended, and it isn’t able to copy or replay log files. While in a Failed state and not suspended, the system will periodically check whether the problem that caused the copy status to change to Failed has been resolved. After the system has detected that the problem is resolved, and barring no other issues, the copy status will automatically change to Healthy;
Seeding – The mailbox database copy is being seeded, the content index for the mailbox database copy is being seeded, or both are being seeded. Upon successful completion of seeding, the copy status should change to Initializing;
SeedingSource – The mailbox database copy is being used as a source for a database copy seeding operation;
Suspended – The mailbox database copy is in a Suspended state as a result of an administrator manually suspending the database copy by running the Suspend-MailboxDatabaseCopy cmdlet;
Healthy – The mailbox database copy is successfully copying and replaying log files, or it has successfully copied and replayed all available log files;
ServiceDown – The Microsoft Exchange Replication service isn’t available or running on the server that hosts the mailbox database copy;
Initializing – The mailbox database copy will be in an Initializing state when a database copy has been created, when the Microsoft Exchange Replication service is starting or has just been started, and during transitions from Suspended, ServiceDown, Failed, Seeding, SinglePageRestore, LostWrite, or Disconnected to another state. While in this state, the system is verifying that the database and log stream are in a consistent state. In most cases, the copy status will remain in the Initializing state for about 15 seconds, but in all cases, it should generally not be in this state for longer than 30 seconds;
Resynchronizing – The mailbox database copy and its log files are being compared with the active copy of the database to check for any divergence between the two copies. The copy status will remain in this state until any divergence is detected and resolved;
Mounted – The active copy is online and accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Mounted;
Dismounted – The active copy is offline and not accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Dismounted;
Mounting – The active copy is coming online and not yet accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Mounting;
Dismounting – The active copy is going offline and terminating client connections. Only the active copy of the mailbox database copy can have a copy status of Dismounting;
DisconnectedAndHealthy – The mailbox database copy is no longer connected to the active database copy, and it was in the Healthy state when the loss of connection occurred. This state represents the database copy with respect to connectivity to its source database copy. It may be reported during DAG network failures between the source copy and the target database copy;
DisconnectedAndResynchronizing – The mailbox database copy is no longer connected to the active database copy, and it was in the Resynchronizing state when the loss of connection occurred. This state represents the database copy with respect to connectivity to its source database copy. It may be reported during DAG network failures between the source copy and the target database copy;
FailedAndSuspended – The Failed and Suspended states have been set simultaneously by the system because a failure was detected, and because resolution of the failure explicitly requires administrator intervention. An example is if the system detects unrecoverable divergence between the active mailbox database and a database copy. Unlike the Failed state, the system won’t periodically check whether the problem has been resolved, and automatically recover. Instead, an administrator must intervene to resolve the underlying cause of the failure before the database copy can be transitioned to a healthy state;
SinglePageRestore – This state indicates that a single page restore operation is occurring on the mailbox database copy;
Based on these values, we want the Status attribute to be either Mounted (true for the server where the database is mounted) or Healthy (for the servers that hold a copy of it). For the ContentIndexState attribute, we want it to be always Healthy.
To monitor both of these attributes, we can use the following command:
Get-MailboxServer | Get-MailboxDatabaseCopyStatus | ForEach {
    # Exact comparisons: -notmatch "Healthy" would also accept DisconnectedAndHealthy
    If (($_.Status -ne "Mounted" -and $_.Status -ne "Healthy") -or $_.ContentIndexState -ne "Healthy")
    {
        Write-Host "`n$($_.Name) - Status: $($_.Status) - Index: $($_.ContentIndexState)" -ForegroundColor Red
    }
}
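When testing the status values, keep in mind that -match and -notmatch perform regular-expression matching, so the pattern "Healthy" also matches DisconnectedAndHealthy. The sketch below uses exact comparisons instead. The helper name Test-CopyHealthy and the sample copies are hypothetical, and plain objects stand in for the output of Get-MailboxDatabaseCopyStatus.

```powershell
# Hypothetical helper: $true only when the copy status is exactly Mounted or
# Healthy and the content index state is exactly Healthy.
function Test-CopyHealthy {
    param(
        [string] $Status,
        [string] $ContentIndexState
    )
    $okStatus = @('Mounted', 'Healthy')
    return (($okStatus -contains $Status) -and ($ContentIndexState -eq 'Healthy'))
}

# Sample values (assumed) in place of Get-MailboxDatabaseCopyStatus output
$copies = @(
    (New-Object PSObject -Property @{ Name = 'MDB1\MBX1'; Status = 'Mounted'; ContentIndexState = 'Healthy' }),
    (New-Object PSObject -Property @{ Name = 'MDB1\MBX2'; Status = 'DisconnectedAndHealthy'; ContentIndexState = 'Healthy' }),
    (New-Object PSObject -Property @{ Name = 'MDB2\MBX1'; Status = 'Healthy'; ContentIndexState = 'Failed' })
)

# Keep only the copies that need attention and print one line for each
$unhealthy = @($copies | Where-Object { -not (Test-CopyHealthy $_.Status $_.ContentIndexState) })
$unhealthy | ForEach-Object { "$($_.Name) - Status: $($_.Status) - Index: $($_.ContentIndexState)" }
```

With this sample data, the disconnected copy and the copy with a failed index are reported, while the mounted copy is not.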
Now, let’s put everything together and have the script send an e-mail to the administrator if something is wrong with any database! This way, we can create a scheduled task to run this script every 2 minutes, for example.
Function Get-ExchangeServerADSite ([String] $excServer)
{
    # We could use WMI to check for the domain, but I think this method is better
    # Get-WmiObject Win32_NTDomain -ComputerName $excServer
    $configNC = ([ADSI]"LDAP://RootDse").configurationNamingContext
    $search = New-Object DirectoryServices.DirectorySearcher([ADSI]"LDAP://$configNC")
    $search.Filter = "(&(objectClass=msExchExchangeServer)(name=$excServer))"
    $search.PageSize = 1000
    [Void] $search.PropertiesToLoad.Add("msExchServerSite")

    Try {
        # msExchServerSite holds the site's distinguished name; keep only the first CN value
        $adSite = [String] ($search.FindOne()).Properties.Item("msExchServerSite")
        Return ($adSite.Split(",")[0]).Substring(3)
    } Catch {
        Return $null
    }
}

[Bool] $bolFailover = $False
[String] $errMessage = $null

Get-MailboxDatabase | Sort Name | ForEach {
    $db = $_.Name
    $curServer = $_.Server.Name
    $ownServer = $_.ActivationPreference | ? {$_.Value -eq 1}

    # Compare the server where the DB is currently active to the server where it should be
    If ($curServer -ne ($ownServer.Key).Name)
    {
        # Compare the AD sites of both servers
        $siteCur = Get-ExchangeServerADSite $curServer
        $siteOwn = Get-ExchangeServerADSite $ownServer.Key

        If ($siteCur -ne $null -and $siteOwn -ne $null -and $siteCur -ne $siteOwn) {
            $errMessage += "`n$db on $curServer should be on $($ownServer.Key) (DIFFERENT AD SITE: $siteCur)!"
        } Else {
            $errMessage += "`n$db on $curServer should be on $($ownServer.Key)!"
        }

        $bolFailover = $True
    }
}

$errMessage += "`n`n"

#Get-MailboxDatabase -Status | ? {$_.Recovery -eq $False -and $_.Mounted -eq $False} | Sort Name (…)
Get-MailboxDatabase | Sort Name | Get-MailboxDatabaseCopyStatus | ForEach {
    # Exact comparisons: -notmatch "Healthy" would also accept DisconnectedAndHealthy
    If (($_.Status -ne "Mounted" -and $_.Status -ne "Healthy") -or $_.ContentIndexState -ne "Healthy" -or $_.CopyQueueLength -gt 300 -or $_.ReplayQueueLength -gt 300)
    {
        $errMessage += "`n`n$($_.Name) - Status: $($_.Status) - Copy QL: $($_.CopyQueueLength) - Replay QL: $($_.ReplayQueueLength) - Index: $($_.ContentIndexState)"
        $bolFailover = $True
    }
}

If ($bolFailover) {
    # Disable the scheduled task so the alert is not repeated on every run
    Schtasks.exe /Change /TN "MonitorDAG" /DISABLE

    # Send alert containing $errMessage
    Send-MailMessage -From "admin@letsexchange.com" -To "nuno.mota@letsexchange.com", "user2@letsexchange.com" -Subject "DAG NOT Healthy!" -Body $errMessage -Priority High -SMTPserver "smtp.letsexchange.com" -DeliveryNotificationOption onFailure
}
At the end of the script, the scheduled task is disabled so you don’t receive an e-mail every two minutes until you resolve the issue. Alternatively, we could use the Enable-ScheduledTask and Disable-ScheduledTask cmdlets, which are available in Windows 8 and Server 2012. We could also put the script in an infinite loop, set it to check the DBs every 5 minutes and, if a problem is found, send an alert and sleep for 1h.
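The loop-based alternative could look roughly like the sketch below. It is not part of the published script: the function name Start-MonitorLoop and its parameters are hypothetical, and the check and alert are passed in as script blocks so the loop can be tried with stand-in values. In a real deployment, the check block would run the database checks and the alert block would call Send-MailMessage.

```powershell
# Hypothetical loop: run a check at a fixed interval and back off after an alert.
function Start-MonitorLoop {
    param(
        [scriptblock] $Check,            # returns an error message, or nothing when all is well
        [scriptblock] $Alert,            # receives the error message
        [int] $IntervalSeconds = 300,    # 5 minutes between checks
        [int] $BackoffSeconds  = 3600,   # 1 hour pause after an alert
        [int] $MaxIterations   = 0       # 0 = loop forever; a positive value is handy for testing
    )
    $i = 0
    while ($MaxIterations -eq 0 -or $i -lt $MaxIterations) {
        $i++
        $message = & $Check
        if ($message) {
            # A problem was found: alert once, then wait longer before checking again
            & $Alert $message
            Start-Sleep -Seconds $BackoffSeconds
        } else {
            Start-Sleep -Seconds $IntervalSeconds
        }
    }
}

# Dry run with stand-in script blocks and zero-second sleeps (assumed values)
$sent = New-Object System.Collections.ArrayList
Start-MonitorLoop -Check { "MDB1 on MBX2 should be on MBX1!" } `
                  -Alert { param($m) [void] $sent.Add($m) } `
                  -IntervalSeconds 0 -BackoffSeconds 0 -MaxIterations 2
```

Compared with disabling the scheduled task, this approach resumes monitoring on its own after the back-off period, at the cost of needing something to keep the script running.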
Please note that there are more attributes that can and should be monitored! For example, you could run the Test-ReplicationHealth cmdlet to view replication status information about mailbox database copies.
EDIT: I have just updated the attached script to check if a database has failed over to a server in another AD site. This is important as it might affect the way users log in to OWA, for example.
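The AD site check relies on reducing the msExchServerSite value, which is a distinguished name, to the site name in its first component. A minimal sketch of that step on its own follows. The helper name Get-FirstRdnValue and the site names are hypothetical, and the simple split on commas assumes the site name itself contains no escaped comma.

```powershell
# Hypothetical helper: extract the value of the first component of a
# distinguished name, e.g. the site name from an msExchServerSite value.
function Get-FirstRdnValue {
    param([string] $DistinguishedName)
    if ([string]::IsNullOrEmpty($DistinguishedName)) { return $null }
    # "CN=Lisbon,CN=Sites,..." -> "CN=Lisbon" -> "Lisbon"
    $first = $DistinguishedName.Split(',')[0]
    return $first.Substring($first.IndexOf('=') + 1)
}

# Sample distinguished names (assumed values)
$siteA = Get-FirstRdnValue 'CN=Lisbon,CN=Sites,CN=Configuration,DC=letsexchange,DC=com'
$siteB = Get-FirstRdnValue 'CN=London,CN=Sites,CN=Configuration,DC=letsexchange,DC=com'

# Report only when both lookups succeeded and the sites differ
if ($siteA -ne $null -and $siteB -ne $null -and $siteA -ne $siteB) {
    "Servers are in different AD sites: $siteA vs $siteB"
}
```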
Hope this helps!
Nuno Mota
Team @MSExchangeGuru
August 16th, 2011 at 11:47 am
Looking good.. I will try in my setup
August 16th, 2011 at 12:10 pm
Thanks! Yeah, give it a go and let us know how it went. I think it is useful especially for environments without any monitoring solution like SCOM.
Of course, it requires some tweaking to match your infrastructure, such as using specific HUB servers to send the e-mail, or the CAS Array if your CAS/HUB roles are on the same servers.
August 16th, 2011 at 4:57 pm
It’s very useful for environments that are using incapable monitoring solutions (like Solarwind Orion)
I tried it and it works fine 🙂
August 16th, 2011 at 7:22 pm
Nice one. Thank you very much
August 19th, 2011 at 1:19 pm
Glad it works, hope it is useful! 🙂
Just updated the (attached) script to include a check in case a database fails over to a server in another AD site, which might impact how users access OWA for example.
October 21st, 2011 at 8:35 am
In environments with NON DAG Exchange 2007 Servers it’s good to add a wildcard filter to line 76.
For example, if all your DAG names contain the phrase "dag", it could look like this:
# Get-MailboxServer | Where-Object {$_.DatabaseAvailabilityGroup -like "DAG*"} | Get-MailboxDatabaseCopyStatus | ForEach {
Otherwise the script is great! Thanks 🙂
October 23rd, 2011 at 6:23 pm
Hi Robert, thanks for your comment!
The original cmdlet is not correct like you pointed out.
I actually updated it to: Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus | ForEach {
If you run this command from an Exchange 2010 server, it will only return 2010 DBs. But your suggestion is also good and prevents further problems as well.
Need to update the script to work with every possible scenario 🙂
October 27th, 2011 at 5:54 pm
Looks useful, but I am having some odd errors on 2010:
The string starting:
At D:\exchsrvr\scripts\DAGHealthCheck.ps1:38 char:94
+ $errMessage += “`n$($_.Name) – Status: $($_.Status) – Index: $($_.ContentIndexState)! <<<< "
is missing the terminator: ".
At D:\exchsrvr\scripts\DAGHealthCheck.ps1:43 char:44
+ If ($bolFailover) { sendEmail $errMessage } <<<<
+ CategoryInfo : ParserError: (
$bolF…l $errMessage }:String) [], ParseException
+ FullyQualifiedErrorId : TerminatorExpectedAtEndOfString
October 29th, 2011 at 9:53 am
Hi Josh,
Does your code match exactly what I posted? Aren’t you missing a " after $_.ContentIndexState)"?
That’s a simple string inside ” ” so it shouldn’t need any terminator…
October 31st, 2011 at 12:25 pm
Must have been a cut/paste error. I did a stare and compare and didn’t see anything wrong, but cut and pasted again and everything works. Thanks!
December 12th, 2011 at 9:01 am
Great script, thanks for taking the time to put it together. Any reason for not using Send-MailMessage?
December 12th, 2011 at 2:40 pm
Hi Tyler. Thanks! Hope it is useful. I wrote this script for Exchange 2007… I have updated it for 2010 but this version here is still the “old” one.
February 22nd, 2012 at 11:11 am
Hi – got the below output when pasting the script. XX-MAIL is an Exchange 2007 Box in our exchange organisation? Any clues why it’s stopping or how do I filter that server out?
The operation couldn’t be performed because object ‘*\XX-EMAIL’ couldn’t be found on ‘XX-DC01.XXXXXX.local’.
+ CategoryInfo : NotSpecified: (:) [Get-MailboxDatabaseCopyStatus], ManagementObjectNotFoundException
+ FullyQualifiedErrorId : 719921BD,Microsoft.Exchange.Management.SystemConfigurationTasks.GetMailboxDatabaseCopySt
atus
February 23rd, 2012 at 5:26 pm
Hi Mark,
If you have Exchange 2007 servers, you should exclude them. There are many ways of doing this. For example, you can just include 2010 servers:
Get-MailboxServer | ? {$_.AdminDisplayVersion -match "Version 14"} | Get-MailboxDatabaseCopyStatus | ForEach {
or if you just want to exclude that particular server:
Get-MailboxServer | ? {$_.Name -ne "XX-MAIL"} | Get-MailboxDatabaseCopyStatus | ForEach {
Hope this helps!
May 30th, 2012 at 9:35 pm
Is there any way of excluding specific databases? For instance, we have an unmounted test database I would like to exclude from the report.
Second, when I run .\monitordag.ps1 I receive an email notification regarding the test database, but I never see the Scheduled Task listed.
Thanks!
May 31st, 2012 at 4:45 am
Hi Eric,
Sure there is! 🙂 You have a couple of options. You can manually exclude the DBs based on their name:
Get-MailboxDatabase | ? {$_.Name -ne "MDB10" -and $_.Name -ne "MDB11"} | Sort Name (…)
and
Get-MailboxServer | Get-MailboxDatabaseCopyStatus | ? {$_.Name -notmatch "MDB10" -and $_.Name -notmatch "MDB11"} | Sort Name (…)
Or, better yet, you can filter them based on their status for example:
Get-MailboxDatabase -Status | ? {$_.Recovery -eq $False -and $_.Mounted -eq $True} | Sort Name (…)
This will exclude Recovery DBs and DBs not currently mounted. But be careful as dismounted DBs might be a sign of problems!
Did you create the scheduled task to run this script? Have a look at the latest version at http://gallery.technet.microsoft.com/scriptcenter/Monitor-Databases-in-DAG-310b7bd1
In there, the last line of code disables the scheduled task, so when you get an e-mail saying something is wrong, you should still see the task but as disabled.
Please let me know exactly what happens.
Hope this helps!
Regards,
Nuno
May 31st, 2012 at 9:31 am
Thanks so much for the quick reply Nuno! I was able to get the scheduled task running just fine.
I have modified the above as you suggested:
Get-MailboxDatabase | ? {$_.Name -ne "DBTest" -and $_.Name -ne "DBTest2" -and $_.Name -ne "DBTest3"} | Sort Name (…)
However excuse my novice questions but where do I paste this into the script? Or does it replace any other line in the script?
Thanks!
May 31st, 2012 at 9:46 am
never mind I figured it out. thank again for this awesome script!
May 31st, 2012 at 10:16 am
Hi Eric,
Thank you! 🙂
No worries! Yes, simply replace the lines that start with Get-MailboxDatabase and Get-MailboxServer with the ones I mentioned.
Glad it is working now!
Regards, Nuno
August 9th, 2012 at 12:58 pm
Occasionally I am getting Status of mounted but index failed (Mailbox Database 1994346563\server – Status: Mounted – Index: Failed )
What does this indicate?
Thanks.
August 13th, 2012 at 2:23 pm
Hi Eli,
It may be that the Content Index Catalog has become corrupted for example (look for EventID 123 in the Application Log of that server). If this is the case, your users might have issues performing searches…
Run the following and see if you get anything in ContentIndexErrorMessage:
Get-MailboxDatabaseCopyStatus -Server | FL Name, *Index*
To fix it, you can either use the ResetSearchIndex.ps1 script provided by Microsoft in the Scripts folder or the “Update-MailboxDatabaseCopy -CatalogOnly” cmdlet to reseed the Catalog from a healthy server.
The fact that you only get this occasionally, means that Exchange is able to eventually fix the issue but if you get it very often, then there might be something wrong…
Hope this helps!
Regards, Nuno
August 31st, 2012 at 11:44 am
Thanks – very useful
November 14th, 2012 at 3:46 pm
Thanks Nuno, I am keen to set this up in my organisation.
One simple question, I hope: what dependencies are there for the Send email component?
I have a cas with CA,HT,MT I have tried but get the error
[PS] C:\_Source>.\MonitorDAG.PS1
Exception calling “Send” with “1” argument(s): “Failure sending mail.”
At C:\_Source\MonitorDAG.PS1:19 char:18
+ $SMTPClient.Send <<<< ($MailMessage)
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : DotNetMethodException
Thanks
Wade
November 14th, 2012 at 3:48 pm
sorry sorted it.. smtp server issue. now solved . please ignore my previous
April 28th, 2013 at 11:57 pm
hi wade,
how did you solve your issue? I’ve also encountered the same issue in my organization
thanks
April 29th, 2013 at 12:17 pm
Hi Wade/ButterSuck,
Please check here for the latest version of this script that does not use the .NET method to send e-mails: http://gallery.technet.microsoft.com/Monitor-Databases-in-DAG-310b7bd1
Please let me know if that works for you.
Best regards,
Nuno
April 29th, 2013 at 4:39 pm
Hi buttersucks,
My issue was related to not setting the mail server in the script correctly. You can verify it works by using telnet to the server with the settings in the script.
Nuno, will test your new script and advise (although the old one is working just great for me currently)
cheers
Wade
April 29th, 2013 at 5:22 pm
Hi Wade,
Excellent, really glad to hear that! 🙂
Cheers,
Nuno
June 14th, 2013 at 4:09 am
Hi Nuno,
Great script thanks, just a question about deleting the scheduled task. If it fires off the email, and the script deletes the scheduled task, do I need to recreate the scheduled task again? Or does that just delete the one instance of the repeating job?
July 29th, 2013 at 1:35 pm
Hi Matt,
Thanks for the feedback!
If you are using the version posted here, it simply deletes the task (Schtasks.exe /Delete /TN “MonitorDAG” /F) so you will have to manually re-create it…
However, if you go to http://gallery.technet.microsoft.com/Monitor-Databases-in-DAG-310b7bd1#content you can get the latest version of this same script which actually disables the task (Schtasks.exe /Change /TN “MonitorDAG” /DISABLE). However, you still have to manually re-enable it…
Best regards,
Nuno
July 30th, 2013 at 3:15 am
Hi Nuno
Thanks for the info. I have implemented the old script but have taken the portion that deletes the scheduled task out completely. It works quite well for us because if the databases do fail over, we get the alert every 5 mins until it’s sorted out. Obviously if it’s due to planned work, then we disable the scheduled task temporarily.
Thanks for a great script.
July 30th, 2013 at 4:26 am
Hi Matt,
That is also a good idea. Glad to hear it is useful, thanks for the feedback! 🙂
Regards,
Nuno
January 9th, 2015 at 11:33 am
Hi
Many thanks for the script.. I run it as a scheduled task and it does the job beautifully! I was wondering how we can improvise the script to get the best out of it.
Instead of disabling/deleting the scheduled task as one can easily forget to re-enable it, can we make use of start-sleep or register-objectevent to check if the databases are mounted on the preferred server and then send another email out once everything is ok?
Thanks.
January 23rd, 2015 at 4:35 am
Hi Saaj,
Excellent, glad to hear that! 🙂
Of course! This is just one way that at the time I decided to use. We can use the Enable-ScheduledTask and Disable-ScheduledTask which are now available in Windows 8 and Server 2012. We could, for example, put the script in an infinite loop, set it to check the DBs every 5 minutes and if a problem is found, send an alert and sleep for 1h.
There are many ways we could change this, it all depends what exactly you need and prefer.
Best regards,
Nuno
January 26th, 2016 at 10:37 pm
Any Ideas Will this work on Exchange 2013!!. I tried and it just goes in a loop and not having any output…
February 19th, 2016 at 5:13 pm
Try this script if you like this report: https://msexchangeguru.com/2015/06/05/e2013-health-check-script/