Traveler Optimization for Slow Backend Mail Server Connections
Daniel Nashed – 3 March 2019 08:30:57
In the last couple of months we have been working on performance bottlenecks for customers with higher-latency network connections between Traveler and the back-end mail-servers. It took a while to get all the fixes implemented after very detailed analysis (for example, I wrote an extension manager to track object reads).
The good news is that those fixes are included in the current release, and most of the settings are now even enabled by default in the latest releases.
[Side note about Traveler accepted latency]
IBM/HCL recommend that the connection between your mail-servers and Traveler servers should have less than 50 ms latency!
But you don't always have a choice. On the other hand, I have seen corporate network connections with latency around 5-6 ms today!
Even internet connections between two different providers I use are around 6 ms!
See technote for recommendations and troubleshooting steps:
https://www.ibm.com/support/docview.wss?uid=swg21961707
My first observation was that attachments in rich text messages are sent multiple times over the network during sync, which led to the first fix, already implemented in the Traveler 9.0.1 code stream.
After we got that fix, I figured out that MIME messages were affected in a similar way -- it was just harder to track.
Especially on WAN networks, transferring attachments multiple times causes additional network utilization, and in combination with higher latency it also causes slower sync.
This happens not just when the attachment is synced, because attachments might be pre-streamed in some cases.
The changes are very low-level, in how Traveler uses the Domino APIs in the back-end. So the overhead was only trackable below the Traveler interface to Domino (C-API calls).
The two main changes have been implemented in Traveler 10.0.1 / 10.0.0, and one fix needed a notes.ini change to not pre-stream the attachment.
In our first hotfixes the parameter NTS_ATTACHMENT_PRESTREAM=false needed to be set explicitly, but since 10.0.1 pre-streaming is disabled by default.
The pre-streaming of attachments was needed for BlackBerry devices, which need the exact size before syncing a document. Unless you have BlackBerry devices, the new default should work for you.
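For reference, this is how the setting looks in the Traveler server's notes.ini (the Traveler task has to be restarted for a notes.ini change to take effect):

```
NTS_ATTACHMENT_PRESTREAM=false
```

On 10.0.1 and later this is already the default. Setting it back to true should restore the old pre-streaming behavior, which you would only want if your devices (for example BlackBerry) still need the exact attachment size up front.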
The two main fixes are the following:
Traveler 10.0.1.1
TRAV-3279 MIME message processing reads attachments multiple times
Traveler 10.0
TRAV-3004 Avoid streaming attachments just to calculate size.
In addition Traveler 10.0 introduces two other optimization fixes for slow network connections:
TRAV-3165 Reduce Dispatch logging to reduce network utilization.
TRAV-2952 Master Monitor queue bottlenecked by slow response from mail servers.
Network Session Optimization
There is one additional notes.ini parameter which is helpful for optimizing the back-end connections between Traveler and the Domino mail-servers.
I have worked in two larger environments with a high number of Domino mail-servers in the same Traveler HA pool.
Usually you should use separate Traveler pools for servers in different locations, and best practice would be to have a Traveler pool in the same data center as your mail-servers. But this isn't possible in all customer environments.
In combination with a high number of users on different mail-servers and a single Traveler HA pool, we have seen many open network connections per mail-server.
You can see up to 40 ESTABLISHED network sessions with mail-servers staying open for a long time.
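One way to check this is to count the ESTABLISHED NRPC sessions (Domino's NRPC port is 1352) per remote mail-server from netstat output. The snippet below is only a sketch: it parses an embedded sample with made-up addresses so it is self-contained; on a live Traveler server you would feed it the real `netstat -an` output instead.

```shell
# Count ESTABLISHED NRPC sessions (port 1352) per back-end mail server.
# On a live server, replace the embedded sample with: netstat -an
netstat_sample='tcp 0 0 10.0.0.1:40001 10.0.1.10:1352 ESTABLISHED
tcp 0 0 10.0.0.1:40002 10.0.1.10:1352 ESTABLISHED
tcp 0 0 10.0.0.1:40003 10.0.1.11:1352 ESTABLISHED
tcp 0 0 10.0.0.1:40004 10.0.1.12:8080 ESTABLISHED'

# Field 5 is the remote address, field 6 the TCP state.
sessions=$(echo "$netstat_sample" \
  | awk '$6 == "ESTABLISHED" && $5 ~ /:1352$/ { count[$5]++ }
         END { for (s in count) print count[s], s }' \
  | sort -rn)

echo "$sessions"
```

The connection on port 8080 is filtered out; only NRPC sessions are counted, with the busiest mail-server first.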
The following (finally officially documented) NTS parameter helps to optimize and properly recycle those Domino NRPC network sessions between Traveler and your mail-servers.
If you are experiencing a high number of open NRPC sessions per Domino back-end mail-server, you should have a look at this parameter.
NTS_DOMINO_THREADS_OPTIMIZE_RECYCLE=false
Controls whether IBM Traveler threads that use Domino API calls are Domino initialized and terminated when IBM Traveler is done with the thread and
the thread is destroyed (true) or when each usage of the thread for a user's device is done but the thread is not destroyed (false).
True saves the overhead of doing the initialization and termination for each user's device but NRPC connections are cached per thread and only released upon the termination.
If your IBM Traveler server is talking over NRPC to a large number of mail servers (for example, more than 100) and the IBM Traveler server is running out of TCP/IP network ports,
you may want to change this value to False to force more frequent thread terminations which release NRPC connections more frequently.
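Put together, the two notes.ini settings discussed in this post would look like this on the Traveler server (restart the Traveler task after changing them):

```
NTS_ATTACHMENT_PRESTREAM=false
NTS_DOMINO_THREADS_OPTIMIZE_RECYCLE=false
```

Keep in mind that on 10.0.1 and later the pre-stream setting is already the default, and that disabling the thread recycle optimization only makes sense when a Traveler server talks NRPC to a large number of mail-servers and is running short on TCP/IP ports.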