Exchange Server transport: Shadow Redundancy, Bifurcation and Duplicate detection
Introduced for the first time in Exchange 2010, Shadow Redundancy is a feature that provides redundancy for messages so long as they are in transit. This means that a message is not deleted from the transport database until the transport server verifies that all of the next hops for that message have completed delivery. If any of the next hops fail before reporting back successful delivery, the message is resubmitted for delivery to that next hop.
In this article, we deal in detail with Shadow Redundancy.
Before we move into the details of shadow redundancy, we need to know two basic ideas in Exchange, namely Bifurcation and Duplicate Detection.
You may also want to read the article:
Exchange 2013 Safety Net Vs Exchange 2010 Transport Dumpster: https://msexchangeguru.com/2013/04/15/safetynet/
Bifurcation in Exchange
Exchange 2010, by default, uses Active Directory sites for email routing. During this routing process, the route is chosen by the AD site selection algorithm, which works as follows. First, the least-cost path is selected. If several paths have the same cost, the one with the fewest hops is chosen. If the number of hops is also the same, the tie is broken by the alphanumeric order of the site names, with the name that sorts lowest winning.
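The three-step selection above can be sketched in a few lines of Python. This is only an illustration, not Exchange code: the RoutePath class, the site names and the costs are made up, and the name tie-break is simplified to comparing the lowercase site names in path order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutePath:
    """A candidate routing path (hypothetical model, not an Exchange API)."""
    sites: tuple  # AD site names from source to target, in order
    cost: int     # total site link cost along the path


def select_route(paths):
    """Pick the lowest cost, then the fewest hops, then the lowest site names."""
    return min(
        paths,
        key=lambda p: (p.cost, len(p.sites) - 1, tuple(s.lower() for s in p.sites)),
    )


# Made-up candidates: three paths tie on cost, two of those tie on hop count.
paths = [
    RoutePath(("SiteA", "SiteC", "SiteD"), 20),
    RoutePath(("SiteA", "SiteB", "SiteD"), 20),
    RoutePath(("SiteA", "SiteB", "SiteC", "SiteD"), 20),
    RoutePath(("SiteA", "SiteE", "SiteD"), 30),
]
best = select_route(paths)
print(best.sites)
```

The path through SiteB wins here: it ties with the SiteC path on cost and hop count, and SiteB sorts before SiteC.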
If and when there is a hub server available between the sender and recipient, Exchange can make a direct connection with the hub server and this connection is made to the hub server closest to the mailbox server of the recipient.
Bifurcation comes into play when an email has multiple recipients whose servers are located in different sites. In such cases, once the email is put in the submission queue, the categorizer performs recipient resolution, content conversion and routing on the message before it is put in a delivery queue. The route is then selected and a direct connection is established to the last hub server common to both (or all) recipients. That Hub Transport server then performs bifurcation of the message. It can then set up a direct connection to the Hub Transport servers closest to each of the recipients’ mailbox servers.
This process is also called delayed fan-out. As per TechNet, “Delayed fan-out is only used when the delivery group is an Active Directory site. Delayed fan-out attempts to reduce the number of message transmissions when multiple recipients share any part of the least-cost routing path.”
The advantage of delayed fan-out method is that it saves a lot of bandwidth on the internal network.
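To make the idea of "the last hop common to all recipients" concrete, here is a small Python sketch. It assumes each recipient's least-cost path is already known as a list of site names; the function name and the site names are hypothetical.

```python
def fan_out_point(paths):
    """Return the last site shared by every path, or None if nothing is shared.

    Each path is a list of site names from the source site to a recipient's site.
    """
    common = None
    for hops in zip(*paths):
        if len(set(hops)) != 1:  # the paths diverge at this position
            break
        common = hops[0]
    return common


# Two recipients whose paths share SiteA -> SiteB -> SiteC and then split.
recipient_paths = [
    ["SiteA", "SiteB", "SiteC", "SiteD"],
    ["SiteA", "SiteB", "SiteC", "SiteE"],
]
split_site = fan_out_point(recipient_paths)
print(split_site)
```

In this example a single copy of the message travels as far as SiteC, and only there is it split into one copy per destination site.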
Duplicate Detection
How does exchange deal with duplicate messages – makes a pretty good interview question !!!
Two properties of a message in Exchange help in detecting duplicate messages: the Internet message ID and the client submit time.
The store maintains a JET table named DeliveredTo, which it uses to track duplicate messages. Whenever a message is delivered to a user, the DeliveredTo table is checked for an entry for that message. If there is no existing entry, a new one is created. Otherwise the message is treated as a duplicate and discarded.
The default duration for tracking duplicates is 1 hour. This can be changed through the registry setting:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeIS\<Server Name>\<Private/Public-Guid>\Track Duplicates (in hours)
The maximum duration for which a message can be tracked for duplication is 49 days. When a value greater than this maximum is set, it is automatically reset to 24 hours. A longer duration makes the table larger, which in turn slows down message delivery.
Further, old records are deleted from the table at regular intervals. That interval can be configured using the registry setting:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeIS\<Server Name>\<Private/Public-Guid>\Background Cleanup (in msecs).
This is, however, not a foolproof mechanism. Duplicates may still occur in any of the following cases.
- When either of the key parameters, the Internet message ID or the submit time, differs between the two messages, they will not be treated as duplicates.
- When the entry in the DeliveredTo table has already been deleted because its tracking time expired, the user will get a duplicate even though the messages are the same.
- When the user’s mailbox is moved to a different server, the DeliveredTo table is not updated.
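The behaviour described above, including two of the failure cases, can be imitated with a toy Python class. It is only a sketch: the class and method names are made up, the real table lives inside the store, and a single table is assumed here rather than one per database.

```python
from datetime import datetime, timedelta


class DeliveredToTable:
    """Toy table keyed on (Internet message ID, client submit time)."""

    def __init__(self, track_duplicates=timedelta(hours=1)):
        self.track_duplicates = track_duplicates
        self._entries = {}  # key -> time the entry was recorded

    def cleanup(self, now):
        """Drop entries older than the tracking window (the background cleanup)."""
        expired = [key for key, seen in self._entries.items()
                   if now - seen >= self.track_duplicates]
        for key in expired:
            del self._entries[key]

    def deliver(self, message_id, submit_time, now):
        """Return True if the message is delivered, False if it is a duplicate."""
        self.cleanup(now)
        key = (message_id, submit_time)
        if key in self._entries:
            return False
        self._entries[key] = now
        return True


t0 = datetime(2013, 4, 15, 9, 0)
submit = datetime(2013, 4, 15, 8, 59)
table = DeliveredToTable()

first = table.deliver("<abc@contoso.com>", submit, t0)
second = table.deliver("<abc@contoso.com>", submit, t0 + timedelta(minutes=5))
other_time = table.deliver("<abc@contoso.com>", submit + timedelta(seconds=1),
                           t0 + timedelta(minutes=10))
after_expiry = table.deliver("<abc@contoso.com>", submit, t0 + timedelta(hours=2))
print(first, second, other_time, after_expiry)
```

The second delivery is caught as a duplicate. The third gets through because the submit time differs, and the fourth gets through because the original entry has already expired.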
Shadow Redundancy
Shadow Redundancy is a cool feature, first introduced in Exchange 2010, which ensures that a message is delivered from one hop to the next. It retains a copy of the message that is in transit from one hop to the other until it is sure that the message has been delivered to the next hop. The feature is enabled by default and works for message transmission between Exchange 2010 Hub Transport and/or Edge Transport servers. It does not work with previous versions of Exchange.
A server advertises the XSHADOW verb to indicate that it supports shadow redundancy.
Shadow Redundancy Components
- Transport server: The server that holds the message queues and routes messages to other servers. In Exchange 2013, the Transport service runs on the Mailbox server, so the transport server and the Mailbox server are the same.
- Transport database: The message queue database. It also stores the shadow queues and Safety Net.
- Transport high availability boundary: Either a database availability group (DAG) in DAG environments, or an Active Directory site in non-DAG environments. Exchange maintains two redundant copies of a message once it arrives at a transport server inside the boundary. These redundant copies are removed once the message leaves the boundary.
- Primary message: The original message, which is in transit before delivery.
- Shadow message: The redundant copy of the message, maintained by the shadow server as long as the primary message is in transit. It is deleted once the shadow server is sure that the primary server has successfully processed the primary message.
- Primary server: The transport server that is processing a primary message.
- Shadow server: As the name suggests, the transport server that maintains the shadow message for the primary server. A transport server can simultaneously be the primary server for one message and the shadow server for another.
- Shadow queue: The delivery queue, maintained by the shadow server, which contains the shadow messages. For messages with more than one recipient, each hop will contain a shadow queue.
- Discard status: An indicator that the primary message has been processed successfully. It is maintained on the transport server.
- Discard notification: The notification the primary server gives the shadow server whenever a shadow message is ready to be discarded.
- Shadow Redundancy Manager: The transport component that manages shadow redundancy.
- Heartbeat: A very aptly titled process by which primary servers and shadow servers indicate their availability to each other.
Requirements for shadow redundancy
Fundamentally, shadow redundancy needs more than one Mailbox server, all running an Exchange version that supports the feature. It does not matter whether these are standalone Mailbox servers or Mailbox servers that also hold the Client Access role.
If the Mailbox server is not part of a DAG, the other Mailbox servers must be located in the local Active Directory site.
If the Mailbox server is part of a DAG, the other Mailbox servers must be in the same DAG.
Shadow Redundancy will not work in the following situations:
- In single Exchange server environments.
- In under-provisioned DAGs.
- During the simultaneous failure of two or more transport servers involved in the shadow redundancy of a message.
Enabling Shadow Redundancy
Shadow Redundancy can be enabled using the ShadowRedundancyEnabled parameter of the Set-TransportConfig cmdlet.
By default, when shadow redundancy is enabled, messages for which a redundant copy cannot be created are still accepted. To reject such messages instead, use the RejectMessageOnShadowFailure parameter of the Set-TransportConfig cmdlet.
Creating Shadow Redundancy Messages
The aim of shadow redundancy is to create and retain a copy of the message as long as it is in transit. Where this redundant copy is created depends on the origin and destination of the original message. The major determining factors are:
- Messages received from outside a transport high availability boundary.
- Messages sent outside a transport high availability boundary.
- Messages received from the Mailbox Transport Submission service from a Mailbox server within the transport high availability boundary.
A transport high availability boundary can be defined as:
- A DAG, for Mailbox servers that are members of a DAG. This includes a DAG that spans multiple Active Directory sites.
- An Active Directory site, for Mailbox servers that don’t belong to a DAG.
Shadow redundancy never tracks shadow messages beyond a transport high availability boundary. As and when a message goes past a transport high availability boundary, shadow redundancy begins or restarts.
Messages received from outside a transport high availability boundary
When a message arrives from outside the transport high availability boundary, the Transport service on the receiving Exchange 2013 Mailbox server does not care whether the sending server supports shadow redundancy. If shadow redundancy is enabled on the receiving Mailbox server, it creates a redundant copy of the message as soon as the message is received.
The process flow is illustrated in the following figure:
- The Transport service on a Mailbox server receives a message from an SMTP server. The Mailbox server is the primary server, and the transmitted message is the primary message.
- The Transport service on the primary server creates a redundant copy of the message by opening a new, simultaneous SMTP session to a different Mailbox server within the organization.
- If the primary server belongs to a DAG, the redundant copy is made by connecting to another Mailbox server within the DAG. If the DAG spans multiple Active Directory sites, a Mailbox server in another Active Directory site is chosen by default, if one is available. This setting can be managed using the ShadowMessagePreferenceSetting parameter on the Set-TransportConfig cmdlet. The default value is PreferRemote. RemoteOnly and LocalOnly are the other available options.
- If the primary server is not part of a DAG, it looks for a Mailbox server located within the same Active Directory site. In this case the ShadowMessagePreferenceSetting parameter is ignored completely.
- Next, the primary server transmits the copy of the message to the other Mailbox server. Once the Transport service on that Mailbox server receives the copy, it acknowledges that it was received successfully. This copy is called the shadow message, and the Mailbox server that holds it is known as the shadow server. The message is placed in a shadow queue on the shadow server.
- Once the acknowledgment from the shadow server reaches the primary server, the primary server in turn sends an acknowledgment to the SMTP server and closes the original SMTP session.
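How the primary server narrows down where the shadow copy may go can be sketched in Python. This is a simplified model, not Exchange code: the MailboxServer class and the server, site and DAG names are invented, and for the non-DAG case it assumes that only other non-DAG servers in the same site qualify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MailboxServer:
    """Hypothetical description of a Mailbox server."""
    name: str
    site: str
    dag: Optional[str] = None


def shadow_candidates(primary, servers, preference="PreferRemote"):
    """Return the names of servers that may hold the shadow copy, in order of preference."""
    others = [s for s in servers if s.name != primary.name]
    if primary.dag is None:
        # Non-DAG: the boundary is the AD site and the preference is ignored.
        # Assumption: only other non-DAG servers in that site are considered.
        return [s.name for s in others if s.dag is None and s.site == primary.site]
    in_dag = [s for s in others if s.dag == primary.dag]
    remote = [s.name for s in in_dag if s.site != primary.site]
    local = [s.name for s in in_dag if s.site == primary.site]
    if preference == "PreferRemote":
        return remote + local
    if preference == "RemoteOnly":
        return remote
    if preference == "LocalOnly":
        return local
    raise ValueError(f"unknown preference: {preference}")


servers = [
    MailboxServer("MBX1", "London", "DAG1"),
    MailboxServer("MBX2", "London", "DAG1"),
    MailboxServer("MBX3", "Paris", "DAG1"),
    MailboxServer("MBX4", "Paris"),
]
prefer_remote = shadow_candidates(servers[0], servers)
local_only = shadow_candidates(servers[0], servers, "LocalOnly")
standalone = shadow_candidates(servers[3], servers)
print(prefer_remote, local_only, standalone)
```

MBX4 ends up with no candidates at all, which is the situation in which a message cannot be shadowed.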
Messages sent outside a transport high availability boundary
When a message is sent to a server outside the transport high availability boundary and the SMTP server on the other side acknowledges that it received the message successfully, the transport server moves the message into Safety Net. This ensures that the message is not resubmitted once it has crossed the boundary.
Messages transmitted within a transport high availability boundary
Exchange 2013 optimizes message routing so that, if the final destination of the message is within a DAG or an Active Directory site, multiple hops between Transport services on Mailbox servers in that DAG or Active Directory site are eliminated.
When the transport service on the mailbox server of the final point of the route receives the message, the message route is completed within a single hop.
Shadow redundancy with Exchange 2010 Hub Transport servers in the same Active Directory site
When an Exchange 2010 Hub Transport server sends a message to an Exchange 2013 Mailbox server in the same Active Directory site, the Exchange 2010 Hub Transport server indicates that it supports shadow redundancy by using the XSHADOW command. The Exchange 2013 Mailbox server, however, does not advertise shadow redundancy support in return. This prevents the Exchange 2010 Hub Transport server from creating a shadow copy of the message on an Exchange 2013 Mailbox server.
When the roles are reversed, the Exchange 2013 Mailbox server shadows the message. Once the Exchange 2010 Hub Transport server acknowledges that the message was received, the Exchange 2013 Mailbox server moves it to Safety Net. However, messages stored in Safety Net by an Exchange 2013 Mailbox server are never resubmitted to Exchange 2010 Hub Transport servers.
SMTP Timeouts
Both the primary and the shadow server have a ConnectionInactivityTimeOut setting, so the SMTP session can time out while the redundant copy of the message is being created.
What happens when a timeout occurs before a shadow message is created is decided by the RejectMessageOnShadowFailure parameter on the Set-TransportConfig cmdlet.
If its value is $true, the primary message is rejected. If its value is $false, the primary message goes through without a shadow copy. The default value is $false.
If the SMTP server times out after the shadow copy has been created successfully, the message is accepted and processed by the primary server. The SMTP server sends the message once again, but it is detected as a duplicate.
Shadow message creation parameters
ShadowMessagePreferenceSetting on Set-TransportConfig
Values:
- PreferRemote: A Mailbox server in a different Active Directory site is used when creating the shadow copy. If that fails, a copy is attempted in the local Active Directory site.
- LocalOnly: The shadow copy will be created only in the local Active Directory site.
- RemoteOnly: The shadow copy will be created only in a different Active Directory site.
Default Value: PreferRemote
MaxRetriesForRemoteSiteShadow on Set-TransportConfig
As the name suggests, it defines the maximum number of times a Mailbox server attempts to create a shadow copy in a remote Active Directory site.
It applies only when ShadowMessagePreferenceSetting is PreferRemote or RemoteOnly.
When a shadow copy of the message can’t be successfully created:
- If RejectMessageOnShadowFailure is $true, the primary message is rejected with a transient error.
- If RejectMessageOnShadowFailure is $false, the primary message is accepted anyway, but isn’t redundantly persisted
Default Value 4
MaxRetriesForLocalSiteShadow on Set-TransportConfig
This has the same function as MaxRetriesForRemoteSiteShadow on Set-TransportConfig, but for the local Active Directory site. It applies only when ShadowMessagePreferenceSetting is set to PreferRemote or LocalOnly.
In the case of PreferRemote, the server first makes the number of attempts specified by the MaxRetriesForRemoteSiteShadow parameter. If none of them succeeds, it then makes the number of attempts specified by MaxRetriesForLocalSiteShadow.
When a shadow copy of the message can’t be successfully created:
- If RejectMessageOnShadowFailure is $true, the primary message is rejected with a transient error.
- If RejectMessageOnShadowFailure is $false, the primary message is accepted anyway, but isn’t redundantly persisted.
Default Value 2
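Putting the three settings together, the attempt sequence can be sketched in Python. This is a rough model, not how Exchange is implemented: the function name and the callables standing in for a real SMTP attempt are hypothetical, and it assumes that PreferRemote tries the remote site first and then falls back to the local site.

```python
def create_shadow_copy(preference, try_remote, try_local,
                       max_remote=4, max_local=2, reject_on_failure=False):
    """Return (outcome, log) for one incoming primary message.

    try_remote / try_local are callables that return True when a shadow copy
    was created; they stand in for a real attempt against another server.
    """
    if preference not in ("PreferRemote", "RemoteOnly", "LocalOnly"):
        raise ValueError(f"unknown preference: {preference}")
    attempts = []
    if preference in ("PreferRemote", "RemoteOnly"):
        attempts += [("remote", try_remote)] * max_remote
    if preference in ("PreferRemote", "LocalOnly"):
        attempts += [("local", try_local)] * max_local
    log = []
    for site, attempt in attempts:
        log.append(site)
        if attempt():
            return "accepted with shadow copy", log
    if reject_on_failure:
        return "rejected with transient error", log
    return "accepted without shadow copy", log


# Remote site unreachable, local site healthy, default retry counts.
outcome, log = create_shadow_copy("PreferRemote",
                                  try_remote=lambda: False,
                                  try_local=lambda: True)
print(outcome, log)

# RemoteOnly never falls back to the local site.
remote_only = create_shadow_copy("RemoteOnly", lambda: False, lambda: True,
                                 reject_on_failure=True)
print(remote_only)
```

With the defaults, the first call makes four remote attempts and then succeeds on the first local one. The second call gives up after four remote attempts and rejects the message.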
ConnectionInactivityTimeout on Set-ReceiveConnector
The maximum duration that an SMTP Receiver connection will remain idle before disconnecting automatically is defined by this parameter.
The value of this parameter must be smaller than the value specified by the ConnectionTimeout parameter.
Default Value
- 5 minutes in the Transport service on Mailbox servers
- 5 minutes in the Front End Transport service on Client Access servers.
- 1 minute on Edge Transport servers.
ConnectionTimeout on Set-ReceiveConnector
This parameter defines the maximum duration that a connection to the Receive connector can remain open, even if the connection is actively transmitting data.
The value of this parameter must be greater than the value specified by the ConnectionInactivityTimeout parameter.
Default Value
- 10 minutes in the Transport service on Mailbox servers
- 10 minutes in the Front End Transport service on Client Access servers.
- 5 minutes on Edge Transport servers.
ConnectionInactivityTimeOut on Set-SendConnector
This parameter specifies the maximum time that an open SMTP connection with a destination messaging server can remain idle before the connection is closed.
Default Value 10 minutes
How Exchange server Shadow Redundancy Works
HUB01 delivers a message whose recipient is outside the organization to the EDGE01 server. If EDGE01 supports shadow redundancy, HUB01 detects this and moves the message to the shadow redundancy queue with EDGE01 as its primary owner. When the Queue Viewer is opened, this message is listed in the Shadow Redundancy Queue.
Success
HUB contacts EDGE on port 25 and issues an EHLO, and EDGE responds with 250 XSHADOW, advertising the XSHADOW SMTP verb. The 250 code is a success reply, which tells HUB that it can contact EDGE as expected.
When EDGE delivers the message correctly, the discard status of the message is updated to show that the message was delivered successfully. HUB01 checks the status of sent messages at regular intervals (by default every 15 minutes) using the XQDISCARD command. EDGE returns a list of all messages whose discard status reads ‘successfully delivered’. HUB then removes every message on that discard list from the Shadow Redundancy Queue.
Failure (EDGE Outage or is down)
If EDGE stays unreachable for HUB for the whole time-out period after the message was sent, HUB resends the message to the other EDGE. The primary owner of the message in the Shadow Redundancy Queue is then changed to the second EDGE.
When there is no alternate path available, the message is not resubmitted and is eventually deleted automatically from the redundancy queue.
Temporary Failure
If EDGE01 is only temporarily offline, HUB01 cannot be sure that the message was delivered properly, so the same message may end up being submitted through both EDGE01 and EDGE02. Thanks to the duplicate detection feature of Exchange, mailbox users won’t see duplicate messages in their mailbox.
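The bookkeeping HUB01 does in the success and failure cases can be pictured with a toy Python class. It is only a sketch of the idea: the class, its methods and the message IDs are invented, and the expiry of messages that have no alternate path is not modelled.

```python
class ShadowQueue:
    """Toy model of the shadow messages one server holds, grouped by primary owner."""

    def __init__(self):
        self._by_primary = {}  # primary server name -> set of message IDs

    def add(self, primary, message_id):
        self._by_primary.setdefault(primary, set()).add(message_id)

    def pending(self, primary):
        return sorted(self._by_primary.get(primary, set()))

    def apply_discard_list(self, primary, delivered_ids):
        """Drop the messages the primary reports as successfully delivered."""
        self._by_primary.get(primary, set()).difference_update(delivered_ids)

    def fail_over(self, failed_primary, alternate=None):
        """Hand the remaining messages to another primary; return what was resubmitted."""
        if alternate is None:
            return []  # no alternate path: nothing is resubmitted
        moved = self._by_primary.pop(failed_primary, set())
        self._by_primary.setdefault(alternate, set()).update(moved)
        return sorted(moved)


queue = ShadowQueue()
for message_id in ("msg-1", "msg-2", "msg-3"):
    queue.add("EDGE01", message_id)

# Success: EDGE01 reports msg-1 on its discard list.
queue.apply_discard_list("EDGE01", {"msg-1"})
after_discard = queue.pending("EDGE01")

# Failure: EDGE01 stays unreachable, so the rest is resent through EDGE02.
resubmitted = queue.fail_over("EDGE01", alternate="EDGE02")
print(after_discard, resubmitted, queue.pending("EDGE02"))
```

After the failover, EDGE02 is the primary owner of msg-2 and msg-3, and nothing is left queued for EDGE01.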
Ratish Nair
MVP Exchange
Team @MSExchangeGuru
April 23rd, 2013 at 12:00 pm
Highly Informative. Thankyou Somuch…:)
June 3rd, 2013 at 7:54 am
Cost depends on the environment. In an intrasite architecture between two Edge servers it is possible, but it is a hard task.
June 3rd, 2014 at 4:55 pm
Hi, I cannot change the shadowmessagepreferencesetting to “localonly” on my Exchange 2013 Org. It gives me this error: “the value for maxretriesforremotesiteshadow must be set to zero for the LocalOnly shadow redundancy preference setting.” And yet, when I try to set maxretriesforremotesiteshadow, I get another error saying that it cannot be changed because shadowmessagepreferencesetting is set to “preferredremote”.
So, how do I change it! Thanks.
February 3rd, 2015 at 4:14 pm
Great Article! Now to the problem at hand–
Every 2 min I am getting “The periodic heartbeat to primary server “server.domain.com” failed.” This started happening after doing Windows updates to the server that’s complaining. The server it’s complaining about has no errors. Any ideas where to start? I found nothing so far.
April 1st, 2015 at 1:18 pm
Ian, just sync the clock time on your Exchange servers.
May 7th, 2015 at 4:03 am
Let us know if this helps
May 7th, 2015 at 8:17 am
The clocks are all in sync via AD. This happens randomly with my Exchange 2013 DAG servers. Sometimes it goes away by itself, sometimes bouncing the server will clear the issue. Getting the error doesn’t seem to affect anything, AFAIK.
October 4th, 2016 at 10:24 pm
Good article. I came across an issue like this: in message tracking I see
the event IDs HAREDIRECT, HADISCARD and HARECEIVE for inbound application email. Mail flow is Sender –> Lotus Notes –> Exchange 2013 CU9 –> SAP application (SAP has a feature to accept email).
Question: is there any reason why these event IDs are generated? Is mail flow working properly?
Has anyone done such a configuration?