Domino on Linux/Unix, Troubleshooting, Best Practices, Tips and more ...

Fix Available: SMTP regression issue in Domino 9.0.1 FP9 can cause malformed headers

Daniel Nashed  16 September 2017 00:43:31
We finally got IF1 for 9.0.1 FP9 for the issue I reported in an earlier blog post.
The regression was introduced by a fix that IBM has now removed in IF1 (I had already received a hotfix, as mentioned in an earlier blog post).

The root cause is malformed headers, especially the From header, which are generated at message itemization.

Depending on your configuration this causes garbage characters in your headers. In any case, some functionality such as SMTPVerifyAuthenticatedSender=1, or capturing mail for certain recipients via SMTPSaveImportErrors=3 and SMTPSaveFileFrom=sender, no longer worked.

This is not the final fix. IBM is working on resolving the regression. So this fix along with another agent bug fix is really just a quick fix to allow you to deploy FP9.

I installed the fix on my Linux64 machine, where it shows up as 9.0.1 FP9 HF 63, and it resolves the regression.

See details here:

http://www.ibm.com/support/docview.wss?uid=swg22008327

IF1 contains the two fixes:

KBRNAQKKK9
        Domino agents crash in the backend in FP8 with a memory overwrite        

JCARAQSJB6
        SMTP regression issue in Domino 9.0.1FP9 can cause malformed headers & prevent Internet mail delivery with SMTPVerifyAuthenticatedSender=1 (technote 2008327)        

Domino Performance issue on some Linux Versions

Daniel Nashed  14 September 2017 12:13:17
When working on a larger Domino migration and consolidation project I ran into a new Linux-specific performance issue that might hit some of you, depending on your Linux version.
I have tested with current RHEL 7 servers which are not affected.

But at the customer site we are using the latest patch level of RHEL 6.9, and I have also seen it with SLES 11 SP2/3. I have not yet tested with SLES 12 (maybe someone volunteers to do some testing).


There has been an issue in the 8.5.3 code stream which has been fixed in 8.5.3 FP2.


SPR# PHEY8RJHXR - Fixed a performance issue where creating multiple documents with attachments led to high NETIO delays on Linux, Mac, and IBM i, resulting in slower transactions for other users accessing other databases.


The old issue was a timing issue between the Domino network stack/listener and the scheduling of the kernel.

The change was to use native pthread semaphores in the Domino network layer.


But even at that time we saw some performance issues with the standard kernel tuning (see --> http://blog.nashcom.de/nashcomblog.nsf/dx/runfaster1-for-domino-on-linux.htm for details).

Over time, other changes in the kernel meant that the default settings of the CFS process scheduler no longer work well with Domino on some kernel versions.


I discovered this slowdown, especially for attachment write transactions, while troubleshooting some Windows-related issues on the sending side (I am working on another blog post for the Windows 2008 issue).


But at least on RHEL 6.x and some SLES versions the receiving side can and should be optimized.


For testing I wrote a simple servertask which creates attachments in a remote database from memory to benchmark the performance.

It turned out that with standard kernel settings, for servers in a local network, we were able to write at 25 MB/sec.

With the kernel tuning changes we were able to write at over 100 MB/sec.


Attachment write operations are just one part of the communication, but especially when consolidating servers all attachments have to be transferred, and they make up the bigger part of the data to be transferred.


The setting I found controls how the CFS scheduler schedules processes. It especially hits larger transfer operations like attachments (I did not test other transaction types).


By default the value is 12 ms on RHEL 6.9 (take care: it is specified in nanoseconds). This causes some timing issues with the Domino network layer.


I found a recommendation for SAP on Linux which suggested reducing the value to 1 ms. But in my testing, reducing it to 6 ms already helped.


My suggestion would be to set the value to 4 ms.


You can change the parameter via:


echo 4000000 >  /proc/sys/kernel/sched_latency_ns


Or you can set it permanently in /etc/sysctl.conf:


kernel.sched_latency_ns = 4000000


The setting in /etc/sysctl.conf is applied automatically at boot; to apply it right away, run sysctl -p once.
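
To check the current value before and after the change, a small helper can do the nanosecond conversion. This is a minimal sketch and not part of the test tool mentioned in this post: the function name check_sched_latency and its arguments are made up for illustration, and it assumes the tunable is exposed under /proc/sys/kernel as on RHEL 6.x. Other kernels may expose it elsewhere or not at all, in which case the function only reports that the file is not available.

```shell
#!/bin/sh
# Sketch: report the CFS scheduler latency and compare it with a target value.
# check_sched_latency is a hypothetical helper name, not a Linux or Domino tool.

check_sched_latency() {
    # $1 = file to read (optional), $2 = target value in nanoseconds (optional)
    file="${1:-/proc/sys/kernel/sched_latency_ns}"
    target_ns="${2:-4000000}"    # 4 ms, the value suggested in this post

    if [ ! -r "$file" ]; then
        echo "not available: $file"
        return 2
    fi

    current_ns=$(cat "$file")
    echo "current: $((current_ns / 1000000)) ms, target: $((target_ns / 1000000)) ms"

    # Exit status 0 if the current value is at or below the target, 1 otherwise.
    [ "$current_ns" -le "$target_ns" ]
}

# Usage: check_sched_latency
#        check_sched_latency /proc/sys/kernel/sched_latency_ns 6000000
```

Called without arguments it reads the live value; an exit status of 1 means the value is still above the target.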



Again, this setting might not be needed for all Linux versions and should be independent of the Domino release (I have tested the latest 8.5.3 FP6 versions and 9.0.1 FP8/9).


So you could either set it as a best practice or use my test tool to check your current performance.
My tests with the latest RHEL 7 version showed that even setting the value much higher there did not have any performance impact.


I am happy to send over the test tool for Windows or Linux. I cannot make it available for download because I don't want the binaries to spread uncontrolled.

But feel free to contact me by mail. I am also interested in seeing your results when you test it.

The tool can create documents in any target database and you can specify the number of documents (default is 10).


See detailed test results below.


-- Daniel



-- Test with Default Settings --


cat /proc/sys/kernel/sched_latency_ns

12000000

/local/notesdata $ /opt/ibm/domino/bin/nshobj dsim012\!\!admin/nshobj.nsf

Local Notes/Domino Release 9.0 QMR:1 QMU:9 Hotfix: 0 Fixpack: 0 (0)

Remote Notes/Domino Release 9.0 QMR:1 QMU:9 Hotfix: 0 Fixpack: 0 (0)

Database:   'dsim012!!admin/nshobj.nsf'

Att-Size:   2097152

Chunk-Size: 262144

Count:      10

Total:      814

Minimum:    19

Maximum:    194

Average:    81

MB/Sec:     25,3



-- Test with Modified Settings --


echo 4000000 >  /proc/sys/kernel/sched_latency_ns


/local/notesdata $ /opt/ibm/domino/bin/nshobj dsim012\!\!admin/nshobj.nsf

Local Notes/Domino Release 9.0 QMR:1 QMU:9 Hotfix: 0 Fixpack: 0 (0)

Remote Notes/Domino Release 9.0 QMR:1 QMU:9 Hotfix: 0 Fixpack: 0 (0)

Database:   'dsim012!!admin/nshobj.nsf'

Att-Size:   2097152

Chunk-Size: 262144

Count:      10

Total:      192

Minimum:    16

Maximum:    35

Average:    19

MB/Sec:     107,8


How to resolve synchronization issues that start after upgrading to IBM Traveler 9.0.1.18 (or higher)

Daniel Nashed  9 September 2017 11:21:53
If you are running Traveler 9.0.1.18 or higher you should read the following support flash technote in detail.

http://www.ibm.com/support/docview.wss?uid=swg22005703

This technote is a must-read if you are running 9.0.1.18 or higher.
And with this new information it makes a lot of sense to move to this new version soon.

As mentioned before, IBM changed the default security mode for Traveler.
Traveler uses a run as user feature to ensure that all functionality is invoked in the name of the user.


Therefore the Traveler server has to be listed as a trusted server on the security tab of the mail server (a missing entry already caused a yellow status warning on your servers in earlier versions, in preparation for this change).

But there are additional requirements for each mail database to sync correctly with this new security model.
Some of them had not been documented in detail before this technote was available. And Traveler 9.0.1.19 has more detailed checking/logging if access capabilities for a database are missing.
There is also a per-user fallback to the old mode if not all requirements are fulfilled for a mail database.
For example, Maximum Internet Access for a mail database needs to be set to Editor or higher.

The technote describes the requirements and the new error logging in great detail, along with all the options you have to disable the new access mode for the server or per user.

-- Daniel


Traveler 9.0.1.19 with important fixes

Daniel Nashed  8 September 2017 09:15:12
We have been waiting for Traveler 9.0.1.19 for some important fixes, and it also brings updated SQL Server support and a push certificate update:
 
 
  • Support for MS SQL Server 2016 Enterprise Edition.
  • Updated APNS Certificates with expiration 8/1/2018.
  • Improvements for the Run as User Feature.

But the most important changes are for the "Run as User" feature, which was introduced in 9.0.1.18.
Some of my customers had issues with Traveler profiles which could not be read correctly in some cases.

Besides this fix there are a couple of minor enhancements listed below.

-- Daniel

Fixlist:
APAR # Abstract
LO92524 Sync performance impacted if syncing a large repeating calendar event.
LO92525 Reply notice sent from an FYI recipient for a calendar event when processed on a mobile device.
LO92557 Cancelled event may appear ghosted after the cancel on the iOS Native Calendar application.
LO92638 Add invitee from native iOS Calendar application and the recipient may be added twice to the meeting.
LO92645 Error reading some policy documents when Run as User feature is enabled.
LO92713 User unable to sync when sync request is internally routed from an older server to a 9.0.1.18 or later server.
LO92728 DB Connection exception during migration from Derby to an Enterprise database.
LO92783 Android user incorrectly denied access if specific set of administration settings are enabled.
LO92829 Handle comma character in display name for mail sent from an Outlook client.
LO92881 High CPU may be seen on the database server for one particular SQL query.
LO92897 Update APNS Certificates, new expiration is 8/1/2017.



SSLV3 disabled by default since 9.0.1 FP9

Daniel Nashed  5 September 2017 16:18:57
This change was discussed a while ago.
Now it was finally time to disable SSLv3 by default in Domino.

The SPR did not make it into the fixlist. Thanks Thibaud Maes for your mail!

The change addressed by SPR # DKENAKNSEG will affect all connection types that utilise the native Domino security stack such as HTTPS and secure DIIOP.

If you still need SSLv3, you need to set this new notes.ini parameter: ENABLE_SSLV3=1
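
To verify whether a server already has this parameter set, you can look it up in notes.ini. The following is a sketch only: notes_ini_get is a hypothetical helper name, the path in the usage comment is just the data directory used in the examples on this blog, and matching the parameter name case-insensitively is an assumption about how the file is read.

```shell
#!/bin/sh
# Sketch: print the value of one notes.ini parameter.
# notes_ini_get is a hypothetical helper, not a Domino command.

notes_ini_get() {
    # $1 = path to notes.ini, $2 = parameter name
    # Prints the last value found; exit status 1 if the parameter is not set.
    awk -F= -v key="$2" '
        tolower($1) == tolower(key) {
            sub(/^[^=]*=/, "")      # strip "name=" and keep the value
            value = $0
            found = 1
        }
        END {
            if (found) print value
            else exit 1
        }
    ' "$1"
}

# Usage (path is an assumption):
#   notes_ini_get /local/notesdata/notes.ini ENABLE_SSLV3 || echo "not set"
```

The same lookup works for any other notes.ini parameter you want to audit across servers.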

There are not many applications left that need SSLV3 ...

Daniel

Domino 9.0.1 FP9 SMTP Issue

Daniel Nashed  30 August 2017 23:15:56
Last Friday a friend contacted me about an SMTP issue. I was able to reproduce the root cause of the issue, but the emails I receive still look OK.
So it depends on your SMTP configuration how much impact this issue has in your infrastructure with FP9.


In my environment I see in my SpamGeek log database that sender and recipient information contains some garbage chars at the beginning of the string.

This can corrupt the header information. We don't know exactly how this happens and why it has a different impact in different environments. The issue we see is cross platform (verified on Windows and Linux).


I have had a PMR open since Friday, escalated to L3, and I got a test hotfix today which corrected the header. This confirmed that one of the changes in FP9 caused this regression. But I don't know yet whether this first fix was just a test fix that reverted the behavior or a final fix.


So if you did not yet update your SMTP server to FP9, you should wait until this issue is fixed.


Update 1.9.2017:


Got the information from IBM that the test fix they did removed one SPR included in FP9.


Here is the description for this SPR:


SPR# TPON949L2M - Fixed an issue where encoded phrases may have embedded delimiters after decoding -- e.g., the comma (',') in Ziffle, Fred causes an error for Notes.  Fix is to unconditionally quote the decoded phrase:  "Ziffle, Fred"
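
To illustrate why the comma matters: unquoted, Ziffle, Fred followed by an address can be read as a list of two entries, while a quoted display name stays one unit. The sketch below only shows the quoting idea described in the SPR. It is not IBM's code; quote_phrase is a made-up helper name and fred@example.com is a placeholder address.

```shell
#!/bin/sh
# Sketch: unconditionally quote a decoded display name (phrase).
# quote_phrase is a hypothetical helper for illustration only.

quote_phrase() {
    # Escape backslashes first, then double quotes, then wrap in quotes.
    escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '"%s"\n' "$escaped"
}

# Build a From header with a quoted display name.
printf 'From: %s <%s>\n' "$(quote_phrase 'Ziffle, Fred')" 'fred@example.com'
# prints: From: "Ziffle, Fred" <fred@example.com>
```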


It's still not clear what they are going to do. I would expect them to change the fix, and because many customers are affected I would also expect it to be included in an IF soon.


For now we have the hotfix available if you need to upgrade to FP9 now.

If you want to request the hotfix, my PMR number is PMR# 25337,999,724.



I will continue to post updates once I have more details.


-- Daniel

midpoints Let’s Encrypt for Domino (LE4D)

Daniel Nashed  28 August 2017 04:44:58
As posted before I am running my server with Let's Encrypt certificates.
The first available client choices for requesting certificates (ACME clients) did not make me happy because I had to install Python just for that.
Meanwhile there are multiple tools available including simple shell scripts and also a Java implementation.

On Linux, running the "getssl" script (https://github.com/srvrco/getssl) with a small script to automate the process works quite well. But it is still completely server-backend based and only works for Linux.

As mentioned in the comments of my previous blog post Detlev Poettgen and Ulrich Krause from midpoints created a Notes/Domino application leveraging the Java ACME client implementation to handle Let's Encrypt certificates!
After configuring the application it handles everything for you in one database. It works on Windows and Linux and you can centrally manage certs for all your servers!

The application "midpoints Let's Encrypt for Domino (LE4D)" is available for free.
You find it here --> https://www.midpoints.de/LE4D

Once you have filled in the request form you will get a template and documentation.

There will also be a session at AdminCamp next month about the tool --> http://www.admincamp.de/AC17/Agenda

Huge thanks to Detlev and Ulrich!

Daniel



Notes/Domino 9.0.1 Feature Pack 9 shipped

Daniel Nashed  20 August 2017 21:39:16
Notes and Domino 9.0.1 Feature Pack 9 is available.

Both the client side and the server side introduce fixes and also new features.

The official fixlist can be found here --> http://www.lotus.com/ldd/fixlist.nsf/0/12d957b7c277fc728525816300434c53

Here are the highlights and some important comments.


JVM Update in Notes Client & Domino Server

The security fixed version introduced with a JVM patch for FP8 is included in FP9:

Notes/Domino - Java 1.8 SR4 FP5

But this is still just the runtime in the client. Compile-time support for Java 1.8 has to wait for FP10 because it also needs an updated Eclipse version.

The Designer compile is Java 1.6 SR16 FP45


Notes Client Updates

Some of the changes need an updated mail-template. Some changes need notes.ini parameters or other settings.
But some other new functionality is enabled by default.


High resolution support for the Notes client

The Windows client now correctly scales text and icons on high-resolution displays (higher than HD) and also with custom DPI settings.
This was a long-awaited feature request. FP8 already had some improvement but the main change is shipped with FP9.


Full fidelity for fonts in Notes emails

There is a change in the fonts used for the MIME body for messages to render fonts better in other clients.

In my tests it turned out that this causes interoperability issues with Notes clients.
When using sans-serif fonts in outgoing messages the messages are displayed with a serif font depending on your Notes client configuration.

It happens when you are using "Disable Embedded Browser for MIME mail" via preference (notes.ini BrowserRenderDisable=1).

Also in any case when the message is put into edit mode.

IBM is currently looking into this issue and I will post an update with what I find out.


Improved handling of non-English characters in internet messages

Notes and Domino now support RFC2231, a standard internet protocol for handling non-English characters in internet messages, to improve message fidelity in communications with other applications that support this standard.


Improved name lookup in Notes

Searching by last name first name through typeahead or in the ambiguous name dialog returns the same results as searching by first name last name.
For example, searching for don smith or smith don returns the same results, including variants such as Donald, Donovan, Smithfield.

This is also a long-awaited feature. But it needs the updated pernames.ntf template provided with 9.0.1 FP9, and you have to add the notes.ini setting AllowWildcardLookup=1.



Domino Server Updates

Enhancement Request To Be Able To Increase The Amgr Queue Beyond 100 (SPR #RSTNA4SL7C APARID: LO87242)

The Agent Manager's Eligible queue can now be raised from 100, the lowest possible value, up to 255, the highest possible value, via the notes.ini parameter AMGRMaxQueue.

I worked with one of those customers who has the requirement to run many agents in a very tight schedule. In their environment, even on a fast server, not all agents were scheduled.
This parameter allows you to increase the queue. But sadly the parameter is just a BYTE and cannot be increased above 255.
I would have wished they had increased the limit to a higher value (which would have been a bigger change), and I would have wished that the parameter were 255 by default.
But it should already help for most environments.


Databases and views can be opened more quickly in databases that are enabled for transaction logging

It takes less time to open databases and views that are at ODS 52 or higher and enabled for transaction logging.
Previously, performance for opening databases or views could be slow in frequently updated databases.
This improvement is due to the implementation of less contention with update operations.

So it is really important that your servers have translog enabled! We still see customers running without translog.
And it is important that you update your ODS to 52 via create_r9_databases=1 and compact -C.

I have not seen any performance values. But this should also improve performance for NIFNSF enabled databases.

Enabling and managing inline view indexing

A view index is an internal filing system that Notes® uses to build the list of documents to display in a database view or folder.
By default, view indexes are updated on a server at scheduled intervals.
To update a view index immediately after documents change instead, administrators can enable inline view indexing.
When you enable inline view indexing, a critical view is always kept up-to-date for your users.

I have not looked into it in detail yet. But it is a bigger change and the release information does not contain all the details.

Here is the documentation update with all the details --> https://www.ibm.com/support/knowledgecenter/en/SSKTMJ_9.0.1/admin/admn_inline_index_enabling.html
The documentation explains how to implement it and also how to query information, statistics and other details.

I hope those additional comments on top of the release notes are helpful for your first look into FP9.


-- Daniel


Blog Certificate updated and Let’s Encrypt Update

Daniel Nashed  8 August 2017 11:30:13
My certificate expired after 90 days because I did not track it. And the original Let's Encrypt client configuration no longer worked when I looked into renewal today.
The client was Python based and there is a newer client --> https://certbot.eff.org/ which is officially recommended by Let's Encrypt.

It's still complicated to use and you need to have Python installed.

But since I first implemented it, many other ACME clients have become available that properly integrate with Let's Encrypt --> https://letsencrypt.org/docs/client-options/.
There are even two simple shell script based clients which both do not require root permission and work in combination with Domino.


I have installed the "getssl" script (https://github.com/srvrco/getssl) and it was quite easy to implement, even for a server with multiple certificates (SAN cert).

And I also updated my shell script to automatically generate a Domino keyring file now with the getssl script.

But it still needs a manual restart of all servertasks that use the certificate. So it is not a completely automated process yet.


The getssl script works with the Domino html root and port 80.

With some additional checks I could potentially automate certificate updates on my server completely.

For now there is a manual step required.
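
Since the certificates are only valid for 90 days, a small expiry check helps to avoid missing a renewal. The following is a minimal sketch and not the script mentioned in this post: the function names are made up, the certificate path in the usage comment is a placeholder, and it assumes the openssl command line tool and GNU date (for the -d option) are installed.

```shell
#!/bin/sh
# Sketch: report how many days a PEM certificate is still valid.
# days_between and cert_days_left are hypothetical helper names.

days_between() {
    # $1 = now (epoch seconds), $2 = expiry (epoch seconds)
    echo $(( ($2 - $1) / 86400 ))
}

cert_days_left() {
    # $1 = PEM certificate file
    # openssl prints a line like "notAfter=<date>", keep only the date part.
    end=$(openssl x509 -enddate -noout -in "$1" | cut -d= -f2)
    [ -n "$end" ] || return 1
    days_between "$(date +%s)" "$(date -d "$end" +%s)"
}

# Usage (file name is a placeholder):
#   left=$(cert_days_left /path/to/cert.pem) && [ "$left" -lt 30 ] && echo "renew soon"
```

Run from cron, a check like this could at least send a reminder until the whole renewal is automated.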


Is anyone using Let's Encrypt Certificates with Domino? Which ACME client are you using?



Let's Encrypt Certificates are a good alternative if certificate updates can be installed automatically.

Right now it's a simple shell script. I could polish it and make it available if there is demand for it.


What do you think? Any feedback is welcome!


-- Daniel



SLES 12 SP2 Issues with Domino running with Systemd

Daniel Nashed  24 July 2017 12:01:20
There is a new feature introduced in SLES 12 SP2 which could lead to issues with larger Domino or Traveler servers.

The default nproc size is still set to 7400, so in most cases this tunable still does not need to be set in your Domino service file.


But there is a new security feature introduced in SLES 12 SP2 which can cause processes to fail to start or to be unable to spawn more threads.


The error you might see is the following:


Jul 20 11:02:41 dom-srv kernel: cgroup: fork rejected by pids controller in /system.slice/domino.service

By default, the new feature prevents a service from using more than 512 processes or threads.
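
To find out whether a server has already hit this limit, you can count these kernel messages. This is a sketch under assumptions: count_fork_rejects is a made-up helper name, the unit name domino.service and the /system.slice/ path are taken from the log line above, and where the kernel log ends up depends on your logging setup.

```shell
#!/bin/sh
# Sketch: count "fork rejected" messages for one systemd unit.
# count_fork_rejects is a hypothetical helper; log lines are read from stdin.

count_fork_rejects() {
    # $1 = systemd unit name, for example domino.service
    # grep -c prints 0 and returns 1 when nothing matches, so mask the status.
    grep -cF "fork rejected by pids controller in /system.slice/$1" || true
}

# Usage:
#   journalctl -k | count_fork_rejects domino.service
```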


Here is the relevant extract from SLES 12 SP2 readme:


-- snip --


2.3.2 Support for PIDs cgroup Controller

The version of systemd shipped in SLES 12 SP2 uses the PIDs cgroup controller. This provides some per-service fork() bomb protection, leading to a safer system.
However, under certain circumstances you may notice regressions. The limits have already been raised above the upstream default values to avoid this but the risk remains.
If you notice regressions, you can change a number of TasksMax settings.

To control the default TasksMax= setting for services and scopes running on the system, use the system.conf setting DefaultTasksMax=. This setting defaults to 512, which means services that are not explicitly configured otherwise will only be able to create 512 processes or threads at maximum.

For thread- or process-heavy services, you may need to set a higher TasksMax value. In such cases, set TasksMax directly in the specific unit files. Either choose a numeric value or even infinity.
Similarly, you can limit the total number of processes or tasks each user can own concurrently. To do so, use the logind.conf setting UserTasksMax (the default is 12288).
nspawn containers now also have a TasksMax value set, with a default of 16384.


-- snip --

The best solution for Domino is to increase the limit directly in the domino.service file.

Besides adding the new setting TasksMax= 8000, I also updated the value for LimitNPROC= 8000 in the config file.

8000 threads should be sufficient for all Domino server environments.


So in case you are running a larger scale environment with SLES 12 SP2 you really should check those settings in your service file!
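
A quick way to check an existing service file is to read both values and compare them with the limit you expect. This is only a sketch: unit_setting and check_limits are made-up helper names, the unit file path in the usage comment is an assumption, and only plain numbers or infinity are handled.

```shell
#!/bin/sh
# Sketch: check TasksMax and LimitNPROC in a systemd unit file.
# unit_setting and check_limits are hypothetical helpers.

unit_setting() {
    # $1 = unit file, $2 = setting name; prints the last value assigned.
    # Whitespace around "=" is skipped, so "TasksMax= 8000" yields 8000.
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

check_limits() {
    # $1 = unit file, $2 = minimum value expected for both settings
    for name in TasksMax LimitNPROC; do
        value=$(unit_setting "$1" "$name")
        if [ -z "$value" ]; then
            echo "$name: not set"
        elif [ "$value" = "infinity" ]; then
            echo "$name: infinity ok"
        elif [ "$value" -ge "$2" ]; then
            echo "$name: $value ok"
        else
            echo "$name: $value is below $2"
        fi
    done
}

# Usage (path is an assumption):
#   check_limits /etc/systemd/system/domino.service 8000
```

After changing the unit file, run systemctl daemon-reload and restart the service so that the new limits take effect.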


-- Daniel



-- snip --


[Unit]

Description=IBM Domino Server (notes)

After=syslog.target network.target


[Service]

Type=forking

User=notes

LimitNOFILE=60000

LimitNPROC= 8000

TasksMax= 8000

PIDFile=/local/notesdata/domino.pid

ExecStart=/opt/ibm/domino/rc_domino_script start

ExecStop=/opt/ibm/domino/rc_domino_script stop

TimeoutSec=100

TimeoutStopSec=300

KillMode=none

RemainAfterExit=no

#Environment=LANG=en_US.UTF-8

#Environment=LANG=de_DE.UTF-8


[Install]

WantedBy=multi-user.target

-- snip --

