Domino on Linux/Unix, Troubleshooting, Best Practices, Tips and more ...

DOMINO NETWORK PERFORMANCE OPTIMIZATION WINDOWS 2008 R2 VERSUS 2012

Daniel Nashed  31 December 2017 10:19:26
There is a new APAR which describes a performance issue on Windows 2008 and earlier.
The APAR is based on a PMR which I had open with IBM, so I want to give you the full details of what we found out instead of just the summary described in:

LO93355: DOMINO NETWORK PERFORMANCE OPTIMIZATION WINDOWS 2008 R2 VERSUS 2012
https://www-01.ibm.com/support/entdocview.wss?uid=swg1LO93355

We had a situation where we needed to replicate databases from an existing Domino 8.5.3 FP6 Windows 2008 environment to a new Domino 9.0.1 FP9 Linux 64 environment.  
The replication was quite slow and we tried all kinds of optimizations on Domino, Windows and Linux.
Besides increasing the send/receive buffers and memory, we looked into ways to optimize the Domino configuration.

It turned out that Domino NRPC network compression was not always helpful, depending on the configuration. So we ended up disabling network compression in our particular case. But this might not help in your configuration. It's something that needs testing.
Between the Domino application sending the data and the IP stack sending the actual data there is a layer called the "NTI" layer, which is responsible for coordinating the sending of the data.

The buffer size cannot be modified and, depending on the transaction, higher-latency networks take some time for the round trip between the sending and the receiving side.
But the main issue we saw was that sending attachments, which are sent over the network in bigger chunks, was also slow.


Windows 2008 TCP/IP Issue
 


The issue we found in the Windows IP stack only had an impact in network environments with higher latency than a local network, where the latency is around 1 ms.
Our environment had 1 GBit and around 6 ms latency, which is already great for a wide area connection. If you have higher latency, the performance might be even lower!
We also reproduced the slow performance with a faster connection with similar latency (10 GBit network with 5-6 ms latency). So it is the latency that has the impact!

When transferring attachments with my own C-API test application we saw 2.5 MB/sec sending data from Win2008.
In contrast, Windows 2012 did not have the same issue, which was very strange for us.

After discussion with the network team and a lot of tests we found the following tuning parameters.
Neither parameter exists by default on Win2008 R2, but DefaultSendWindow exists, for example, on Win7 (which has a comparable network stack) with a smaller value.


[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\AFD\Parameters]  
"DefaultSendWindow"=dword:00080000  
"DefaultReceiveWindow"=dword:00080000  


Those settings ensure that much more data is sent over the network before the IP stack waits for the ACK from the other side. By default it was around 12 KB of data, which is quite small!
The first tests on our internal environment after the change showed 35 MB/sec!
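
To see why the window size matters so much at 6 ms, a rough estimate helps: if the sender may have at most one window of data unacknowledged and an ACK takes one round trip, throughput cannot exceed window size divided by round-trip time. The following Python sketch applies that simplified model to the numbers from this post. It assumes the 6 ms latency is the round-trip time and that nothing else limits the transfer; the function name is made up for illustration.

```python
def window_limited_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound in bytes/sec when at most window_bytes may be unacknowledged."""
    return window_bytes / rtt_seconds


RTT_SECONDS = 0.006          # assumption: the ~6 ms latency is the round-trip time
DEFAULT_WINDOW = 12 * 1024   # "around 12 KB" observed by default
TUNED_WINDOW = 0x00080000    # registry value from above = 524288 bytes (512 KB)

for label, window in (("default", DEFAULT_WINDOW), ("tuned", TUNED_WINDOW)):
    ceiling = window_limited_throughput(window, RTT_SECONDS)
    print(f"{label:8s} window {window // 1024:4d} KB -> "
          f"at most {ceiling / 1_000_000:.1f} MB/sec")
```

Under these assumptions the default window caps out at around 2 MB/sec, which is the same order of magnitude as the 2.5 MB/sec measured from Win2008. With the 512 KB window the ceiling is well above the 35 MB/sec seen in the test, so other factors become the limit.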


But that does not mean that normal replication will have the same performance because it is a mix of different transactions! We only tested object write transactions which had the biggest impact in our case.


Object Write Chunk-Size is 256 KB
 


In discussions with IBM we also found out that the documentation for changing the chunk size for sending attachment data was wrong.  

The WIKI documentation says that the chunk size is 64 KB and can be increased with a Notes.ini parameter up to 1 MB.
But it turns out that the parameter was only implemented as a test for a customer and the fix had never been added to the code.  


Here is the technote describing Notes.ini SERVER_SEND_OBJECT_CHUNK_SIZE.  

This is the only documentation for the parameter, and it should be corrected. The parameter does not currently exist and the default is 256 KB instead of 64 KB.


https://www-10.lotus.com/ldd/dominowiki.nsf/dx/Optimising_NRPC_Bandwidth_Consumption_for_attachment
I tested different chunk sizes between 64 KB and 1 MB with a low-level C-API application which writes attachments.
I found that 256 KB is a well-balanced value, so there would be no need to change this parameter.

Conclusion and some additional tips for AdminP


When you are running on Win2012 or higher you don't need to change anything.
For Windows 2008 you should really set the registry parameter, because this will be a big boost for your replication performance.  
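
If you have to roll this out to several Win2008 servers, one option is to generate a .reg file once and import it on each machine. The Python sketch below only builds the file content. The key and both values are the ones shown earlier in this post; the helper name is hypothetical, and the "Windows Registry Editor Version 5.00" header is the standard .reg file header.

```python
from typing import Mapping

# Key and values are taken from this post; everything else is illustrative.
AFD_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\AFD\Parameters"
WINDOW_BYTES = 0x00080000  # 524288 bytes = 512 KB


def build_reg_file(key: str, values: Mapping[str, int]) -> str:
    """Return .reg file text that sets each name under key to a REG_DWORD."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in values.items():
        # dword values are written as 8 hexadecimal digits
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


reg_text = build_reg_file(
    AFD_KEY,
    {"DefaultSendWindow": WINDOW_BYTES, "DefaultReceiveWindow": WINDOW_BYTES},
)
print(reg_text)
```

Save the output as a .reg file and import it on a test server first. I would also plan for a reboot before measuring, since the network stack may only pick up the new values after a restart.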

On the other hand, the nature of replication is that document-level replication will always take some time, even in a local network.
That's why Domino provides accelerated replica creation, which uses a different transaction type.
It's a kind of backup and restore over the network. But that only works if the database is not DAOS enabled.
For DAOS-enabled databases the replicator is used and it benefits from the storage optimization.
It will only send the attachment if it isn't yet on the remote side. But this might still be slower than an accelerated replica.

To better utilize the bandwidth of your 1 GBit line we ended up having multiple AdminP threads leveraging the replicator code to push databases in parallel.
There is an enhancement in the 9.0.1 codestream (we got it backported to 8.5.3 FP6) which allows one process with multiple threads to replicate in parallel.


And if you want AdminP to create the replica immediately instead of just creating a replica stub you need the following notes.ini parameter: ADMINP_EXCHANGE_ALL_UNREAD_MARKS=1.

When you set this parameter, AdminP actually pushes the database instead of creating a replica stub and also syncs all unread marks for the database.
Note: In admin4.nsf the request type will look like an accelerated replica copy even if DAOS is enabled on the database, and the status of the request also looks a bit different.
You should not be worried about that. It will use the normal replicator code including unread mark sync.

 

Comments

1  Steven Vaughan  03.01.2018 12:19:18  DOMINO NETWORK PERFORMANCE OPTIMIZATION WINDOWS 2008 R2 VERSUS 2012

Very interesting indeed.

Thanks for sharing !

2  Stuart Bogom  04.01.2018 16:01:07  DOMINO NETWORK PERFORMANCE OPTIMIZATION WINDOWS 2008 R2 VERSUS 2012

Thanks for this information! By the way, the IBM link now redirects to:

https://www-01.ibm.com/support/entdocview.wss?uid=swg1LO93355

Just to clarify, you are recommending that we add the registry entry:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\AFD\Parameters]

"DefaultSendWindow"=dword:00080000

"DefaultReceiveWindow"=dword:00080000

on 2008 R2 servers?
