Domino on Linux/Unix, Troubleshooting, Best Practices, Tips and more ...

Domino 9.0.1 FT Index Hang and potential crash

Daniel Nashed  10 December 2017 06:08:37
We ran into a hang situation multiple times during FT indexing. It turned out that this is a regression introduced in FP9 due to changes in the FT index area.

In certain situations the FT index update hangs while reading document data and keeps one CPU core maxed out for that thread.
The SPR description calls it a "spike", but it looks more like the thread uses the CPU permanently.

This can happen with updall, DBMT and also other tasks updating the FT index.
The process cannot be stopped, which also means the server cannot be shut down cleanly.

We got a hotfix which will be included in IF3. After applying the hotfix we had no new server hangs.

I am including the call stack of the hang in this blog post so that it is searchable for others who might run into the same issue.
If you have not installed FP9 yet, you should wait for IF3. If you are on FP9 and run into this issue, take a full NSD, open a PMR and reference the SPR numbers below to get the fix.

-- Daniel


-- Fixed SPRs --

SPR #SVEM9SLCL7
J3 server crashed on DBMT task, while full text indexing the database

SPR #TDOOAT6LK9
CPU spike when running dbmt (or updall/update task) and creating full text index.

-- Call Stack --

Thread 3 (Thread 0x7f2c5da71700 (LWP 17594)):
#0  ODSToOrFromHost (toHost=32769, type=0, vbuffer=0x7f2c5da6e8e0, iterations=1) at ods.c:824
#1  0x00007f2cba7ef8fe in ODSReadItem (src=0x7f2c54466d96, type=, dest=0x7f2c5da6e8e0) at ods.c:1420
#2  0x00007f2cbab635e2 in GetChar(STREAM_CTX*, STREAM_DATA*) () from /opt/ibm/domino/notes/latest/linux/libnotes.so
#3  0x00007f2cbab64932 in FTGetDocStream () from /opt/ibm/domino/notes/latest/linux/libnotes.so
#4  0x00007f2c5d390919 in NotesStreamReadChar (arg=) at ftg_dstr.cpp:1412
#5  0x00007f2cbab5ca7c in FTLexMatch () from /opt/ibm/domino/notes/latest/linux/libnotes.so
#6  0x00007f2c5d39296c in FTGCreateIndex (pFTGCtx=0x7f2c4c00abf8) at ftg_dstr.cpp:1839
#7  0x00007f2c5d38bac0 in CFTNoteIndexer::ProcessDoc(FTG_CTX *, struct {...} &) (this=, pFTGCtx=0x7f2c4c00abf8, docIndexerInfo=...) at ftgindex.cpp:2074
#8  0x00007f2c5d38c5d1 in FTGIndexIDProc (Parameter=, NoteID=207326) at ftgindex.cpp:1685
#9  0x00007f2cb999285d in IDEnumerate (hTable=536872571, Routine=0x7f2c5d38c343 , Parameter=0x7f2c4c00abf8) at idtable.c:2216
#10 0x00007f2c5d38e252 in FTGIndex(FT_THREAD *, struct {...} *, WORD, char *) (pftt=0x7f2cb41004d0, pFTStreamCtx=0x7f2c4c00abf8, Options=392, StopFile=) at ftgindex.cpp:1146
#11 0x00007f2cbab5adce in FTCallIndex () from /opt/ibm/domino/notes/latest/linux/libnotes.so
#12 0x00007f2cbab5c3a3 in FTIndexExt2 () from /opt/ibm/domino/notes/latest/linux/libnotes.so
#13 0x00007f2cb93e8485 in UpdateFullTextIndex (hDB=1154, Pathname=0x7f2cb4101648 "mail/c1/xn06451.nsf", Flags=201342976, fullTextStatus=8) at update.c:1239
#14 0x00007f2cb93ea78f in UpdateCollectionsExt (_hModule=, Pathname=0x7f2cb4101648 "mail/c1/xn06451.nsf", Type=2, Flags=201342976, Flags2=0, mSecs=0, ViewNoteID=0, ContainerObjectID=0, ViewTitle=0x40a360 "", retDbTitle=0x0, fSrchSite=0, QueuedRequest=0, retbLater=0x0, fullTextStatus=8, wantsFulltext=0x0) at update.c:660
#15 0x00007f2cb93ea957 in UpdateCollections (_hModule=32769, Pathname=0x0, Flags=, ViewNoteID=, ContainerObjectID=, ViewTitle=, retDbTitle=0x0, fSrchSite=0, QueuedRequest=0, retbLater=0x0, fullTextStatus=8, wantsFulltext=0x0) at update.c:106
#16 0x0000000000405238 in UpdallThread (threadparam=) at dbmt.c:2108
#17 0x00007f2cb98e7be3 in ThreadWrapper (Parameter=) at thread.c:1183
#18 0x0000003aae007aa1 in start_thread () from /lib64/libpthread.so.0
#19 0x0000003aadce8bcd in clone () from /lib64/libc.so.6

End of Service for JVM 1.6

Daniel Nashed  25 November 2017 13:34:30

IBM uses the Oracle JVM as the base for their IBM JVM platform, which is used in IBM products like Notes, Domino and Traveler.


JVM 6.0 has been around for almost 10 years and has been discontinued since September 2017.
Oracle discontinued their support for JVM 1.6 so IBM cannot support JVM 1.6 on their side.


That also means for IBM platforms that there is no patch support for JVM 1.6!


For Notes and Domino this means you have to update to 9.0.1 FP8/FP10 to get JVM 1.8, and hopefully FP10 will bring compile-time support for JVM 1.8 as well (current planning).


If you are running on Notes/Domino 8.5.3 or an earlier 9.0.1 FP don't panic. Most Java applications on Domino are not directly accessible over the network. There is at least the Domino HTTP stack between the client and the Java application.

On the client side you might have a direct connection from the client to the internet. And for encrypted connections there have been limitations in the SSL/TLS area, as posted before.
For example, there is only very limited TLS 1.2 support in JVM 1.6, with just one cipher.


I personally would still wait for Feature Pack 10 and have the full JVM 1.8 support also at compile time. But you should be aware that it is time to move to a current release.


If you are on 9.0.1 you are just a "FP" install away. If you are on 8.5.3 there are other good reasons to move to a current 9.0.1 release from a security point of view, for example missing SHA-256 support and no TLS 1.2 support, and not just for the JVM.



Here is a link to the support cycle for the IBM JVM

https://developer.ibm.com/javasdk/support/lifecycle/


Daniel

Traveler 9.0.1.20 Released

Daniel Nashed  18 November 2017 14:44:58
Traveler 9.0.1.20 has been released and I installed it already.
As usual, unless you are waiting for an urgent fix that is on the fix list, it might make sense to wait a while before installing a new version in production.
I installed it before the weekend and it looks good in my small environment.

Beside the fixes listed below there is a new feature:
  • Support for invitee availability search from Calendar on Exchange ActiveSync clients.

I am still trying to test it and am not sure whether the iOS native calendar supports it.

I did not have this on the radar and have never tested it with the Verse app either. I don't see it working here.
Maybe someone has an idea?

I did some testing, but without luck on any of my clients.

APAR # Abstract
LO93044 Slow sync due to prime sync thread looping over large number of child documents.
LO93067 Better handling of encrypted mail when syncing to mobile device.
LO93070 Traveler cleanup bind command may fail when using MS SQL Server.
LO93084 Better handling of Notes Doc Links when syncing to mobile devices.
LO93196 Traveler "did not respond in time" messages on the console log.
LO93217 Additional HTML to plain text conversion options to improve generated plain text content.
LO93221 Do not include previous attachments on reply mails from MaaS Secure Mail client.
LO93236 Improve crash prevention on Traveler server when processing documents.
LO93238 Phone messages with HTML content may not display correctly on mobile device.
LO93258 Traveler server may be unresponsive due to logging thread deadlock.
LO93319 Support for Domino 9.0.1 FP10.


    VIEW_REBUILD_DIR changed to /dev/shm/view_rebuild

    Daniel Nashed  14 November 2017 14:35:22
    We just discovered an interesting configuration issue, which generates quite some logging and is a bit annoying.
    When you specify the view_rebuild_dir without the trailing slash / back-slash, the server will internally append the slash.

    But if you configured the view_rebuild_dir in the config document without the (back) slash the server will tell you every couple of minutes that the server changed the setting.

    This happens because the internal path is always stored with the trailing (back) slash, while the notes.ini check that updates the parameter compares against the config document entry without the (back) slash.

    VIEW_REBUILD_DIR changed to /dev/shm/view_rebuild

    So you have a constantly changing parameter, even though it looks the same in the notes.ini.

    The correct notes.ini entry would be

    VIEW_REBUILD_DIR=/dev/shm/view_rebuild/

    including the trailing slash.

    This avoids the log messages.
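
A minimal sketch of how the value could be normalized before it goes into the config document or notes.ini. The helper name with_trailing_slash is hypothetical and not part of Domino or the start script; it only covers the forward slash used on Linux.

```shell
#!/bin/sh
# Hypothetical helper: append a trailing slash unless the path already
# ends with one, so the configured value matches what the server stores.
with_trailing_slash() {
    case "$1" in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

# Example path taken from the post; prints a notes.ini style line.
printf 'VIEW_REBUILD_DIR=%s\n' "$(with_trailing_slash /dev/shm/view_rebuild)"
```

Running the same helper on a value that already ends with a slash leaves it unchanged, so it is safe to apply to every configured directory.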

    -- Daniel

    First DNUG "Domino Next" Event on 23.11.2017

    Daniel Nashed  9 November 2017 13:44:18
    Following the announcements about Domino 10, the IBM & HCL cooperation and Domino 2025, we have rearranged the agenda for the DNUG Domino Day on 23.11.2017 in Düsseldorf.

    Part of the Domino Day will be a Domino Next feedback session looking toward Domino 10 and beyond.

    Besides Uffe Sorensen, a colleague from HCL will also be there.

    Uffe's talk will cover, among other things, the current information about the cooperation.
    You can bring all your questions, and the "Feedback" part is then about feedback and wishes for Domino 10 and beyond.

    This part is deliberately placed at the end so that there is enough time for questions ...


    I am very curious to see how it goes.


    PS: The event is free for DNUG members! Non-members can also attend for a fee...

     

    https://www.eventbrite.de/e/dnug-fachgruppentag-domino-day-fachgruppe-verse-und-notesdomino-tickets-35785282744



    09:00 - 09:10
    Welcome
    Daniel Nashed - CEO (Nash!Com)
    Manfred Lenz - Technical Sales Professional, IBM Collaboration and Talent Solutions (IBM Software Sales)

    09:10-10:05
    UPDATE: IBM Notes/Domino Feature Packs
    Daniel Nashed - CEO (Nash!Com)

     
    10:05-11:05
    ApplicationInsights & IBM Domino Doublecheck - The way to be able to make the right decisions
    Christoph Adler - Senior Consultant (panagenda)

    11:05-12:00
    SSL Certificates on Domino - General introduction and presentation of the free CA Let's Encrypt
    Detlev Poettgen - Managing Director (midpoints)

    12:00-13:00
    Lunch break


    13:00-13:45
    Domino Application Cloud (DAC) & Domino on Docker
    Michael Finkenbrink - Certified Senior Architect, IBM Collaboration Solutions (IBM Software Services)

    13:45-14:15
    Coffee break

    14:15-15:15
    Keynote: IBM Notes/Domino and Verse On-Premises - News/Strategy/Roadmaps incl. Notes Domino 10
    Uffe Sorensen - Messaging & Collaboration Director (IBM Software Sales)
    n.n. - HCL Industries

    15:15 - 17:00
    "Notes/Domino Next Feedback" - Workshop in the form of a Knowledge Cafe on the topic "Requirements for Verse, Notes, Domino 10 and beyond"
    Uffe Sorensen - Messaging & Collaboration Director (IBM Software Sales)
    Peter Schütt - Leader IBM Collaboration Solutions Strategy D-A-CH (IBM Software Sales)
    Manfred Lenz - Technical Sales Professional, IBM Collaboration and Talent Solutions (IBM Software Sales)


    Domino on Linux Start Script 3.1.3 with changed way to request it

    Daniel Nashed  30 October 2017 05:43:05
    Just updated the start script to a new version with some minor changes.
    There was one issue with systemd on shutdown and I made a change in the way config files are used.

    Most of the new features come in either through projects or when I want something for my own environment.
    I don't get much feedback or many feature requests besides that.

    One change triggered by a project was how config files apply. We wanted to use the same configuration for all servers.
    But we wanted special settings for the Traveler servers. So I changed the way the config files apply.
    Now you can use a general config and additional or changed parameters for individual servers.
    That way you can have a general config that you deploy automatically and you keep a server specific file with changes.

    So in that case the general config would be /etc/sysconfig/rc_domino_config.
    And the specific config would be for example: /etc/sysconfig/rc_domino_config_notes.

    This would also work in partitioned environments where each server has a basic configuration and you want additional parameters for a partition.
    On the other hand, even on partitioned servers you could use settings which depend on variables like DOMINO_USER.
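
The layering described above can be sketched roughly as follows. This is not the start script's actual code: the function name load_domino_config, the temporary demo files and the sample values are assumptions for illustration, while the variable names come from the posts on this page.

```shell
#!/bin/sh
# Sketch of layered config loading: the general file is read first, then
# the server specific file, which may add or override values.
# load_domino_config is a hypothetical name, not part of the start script.
load_domino_config() {
    general_cfg=$1
    specific_cfg=$2
    # Load the general config if it exists and is readable.
    if [ -r "$general_cfg" ]; then
        . "$general_cfg"
    fi
    # The specific config is loaded second, so its values win.
    if [ -r "$specific_cfg" ]; then
        . "$specific_cfg"
    fi
}

# Self-contained demo using temporary files instead of /etc/sysconfig.
demo_dir=$(mktemp -d)
cat > "$demo_dir/rc_domino_config" <<'EOF'
DOMINO_USER=notes
DOMINO_PRE_SHUTDOWN_DELAY=10
EOF
cat > "$demo_dir/rc_domino_config_notes" <<'EOF'
DOMINO_PRE_SHUTDOWN_COMMAND="tell traveler shutdown"
DOMINO_PRE_SHUTDOWN_DELAY=30
EOF

load_domino_config "$demo_dir/rc_domino_config" "$demo_dir/rc_domino_config_notes"
echo "user=$DOMINO_USER delay=$DOMINO_PRE_SHUTDOWN_DELAY cmd=$DOMINO_PRE_SHUTDOWN_COMMAND"
rm -rf "$demo_dir"
```

In the demo the delay ends up at the server specific value because the second file is read last, while DOMINO_USER is kept from the general file.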


    Changed way to request the start script

    I am also changing the way you can request the new version. Until now I had a request form.
    Now you just send a mail to dominostartscript  at nashcom.de with the subject "script".

    The old implementation was a servertask which read the data posted in a database.
    I switched to a pre-delivery agent with some additional logic to check the message.
    So for example I am not triggering an automatic reply if the message is a reply or is an autosubmitted message.
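
To illustrate the kind of check meant here, the following is a rough sketch and not the actual pre-delivery agent (which runs inside Domino). The function should_auto_reply is hypothetical. It assumes replies can be recognized by an In-Reply-To header and automatic messages by an Auto-Submitted header with a value other than "no" (RFC 3834).

```shell
#!/bin/sh
# Hypothetical check, not the actual agent: decide whether a message
# (its headers passed as one string) should get an automatic reply.
should_auto_reply() {
    headers=$1
    # Assumption: replies carry an In-Reply-To header.
    if printf '%s\n' "$headers" | grep -qi '^In-Reply-To:'; then
        return 1
    fi
    # Any Auto-Submitted value other than "no" marks an automatic message.
    if printf '%s\n' "$headers" | grep -i '^Auto-Submitted:' \
        | grep -qiv '^Auto-Submitted:[[:space:]]*no'; then
        return 1
    fi
    # Only answer requests whose subject is exactly "script".
    printf '%s\n' "$headers" | grep -qi '^Subject:[[:space:]]*script[[:space:]]*$'
}

if should_auto_reply "From: admin@example.com
Subject: script"; then
    echo "send start script"
else
    echo "no automatic reply"
fi
```

Skipping automatic and reply messages is what keeps two auto-responders from answering each other in a loop.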

    I am not yet updating the start script page and want to see first how this works with requests coming in through the blog.
    And I hope you like the new way to request the start script? Any feedback is welcome.

    -- Daniel



    --------------
    Change History
    --------------

    V3.1.3 30.10.2017

    Problems Solved
    ---------------

    Fixed an issue with systemd in combination with server controller.
    Now the server controller correctly shuts down when the service is stopped.


    New Features
    ------------

    listini -- displays server's notes.ini

    Changes
    -------

    Changed sample rc_domino_config_notes setting DOMINO_PRE_SHUTDOWN_COMMAND to "tell traveler shutdown"


    V3.1.2 01.09.2017

    New Features
    ------------

    New check whether the Domino ".res" files exist and are readable; a warning is generated if not

    New short cut command "res" for "resources"

    Changes
    -------

    In previous versions either the server specific config file or the default config file was used.

    The config files are now used in the following order to allow more flexible configurations:

    - First the default config file is loaded if it exists (by default: /etc/sysconfig/rc_domino_config)
    - In the next step the server specific config file (by default: /etc/sysconfig/rc_domino_config_notes) is included.
    - The server specific config file can add or overwrite configuration parameters.

    This allows very flexible configurations. You can specify global parameters in the default config file and have specific config files  per Domino partition.
    So you can now use both config files in combination or just one of them.

    Great news Notes Domino 10 and beyond

    Daniel Nashed  25 October 2017 20:03:05
    We got great news today. There will be a Notes & Domino 10 in 2018. And IBM also announced that, in a joint effort with HCL Technologies, they are working on a strategy for #Domino2025.

    This isn't a new partnership. IBM and HCL are already working together on Tivoli and Rational software. But it was still a big surprise today.

    Notes and Domino 9.0.1 FP10 is committed for 2017, and I am looking forward to hearing more about the Notes & Domino strategy planned for 2018 with Notes and Domino 10.

    And I am really looking forward to actively giving feedback on the future directions, which IBM is asking customers and partners for.


    See the details in these official links:


    https://www.ibm.com/blogs/social-business/2017/10/25/ibm-announces-investment-notes-domino-version-10-beyond/

    https://www.ibm.com/social-business/us-en/announce/domino-jam2025/


    What a coincidence. Tomorrow I am part of a Notes & Domino strategy workshop for a customer and I bet some of the slides of one of the co-speakers need some updates tonight :-)





    Correctly Stopping a Traveler Server

    Daniel Nashed  24 October 2017 15:02:57
    This is not new, but I have run into it a couple of times at customer sites. Especially on a Traveler HA server this becomes important.

    Shutting down the Traveler servertask only when the Domino server/service is stopped might lead to hang situations of the HTTP task.

    The better way would be to shut down the Traveler servertask first. But even that might lead to undesired results.


    There is a special Traveler shutdown command "tell traveler shutdown" that can be used to let Traveler finish its work and not accept any new requests before cleanly shutting down.


    If you have configured NTS_AUTOSTART_HTTP=true, the Traveler task starts the HTTP task automatically, but by default the HTTP task is not shut down automatically when you just shut down the Traveler servertask.

    If you configure NTS_AUTOSTOP_HTTP=true, the Traveler servertask will automatically take care of shutting down the HTTP task at the right moment.


    This is important for Traveler HA environments because the Traveler OSGi servlet will only connect to the local Traveler servertask, which will then either process the request or forward it to another Traveler server (if another server holds the master monitor for the user).

    So if you don't shut down the HTTP task, your load-balancer might still send requests to the Traveler server, which reports back that the Traveler server is not available ("IBM Traveler server is not available.").


    That's an undesirable result for a load-balancer because usually the load-balancer does not check for this status and might continue to send devices to that Traveler server.


    It makes sense to configure the server to be either fully available or to not respond to HTTPS requests at all if the Traveler service is not available.

    This applies when you shut down only the Traveler task and also during shutdown of the whole server.


    My Domino start script for Linux allows you to configure a pre-shutdown command which can be used to shut down Traveler before you shut down the whole server.


    -- /etc/sysconfig/rc_domino_config --


    # -- Command to execute before shutting down the Domino server --

    DOMINO_PRE_SHUTDOWN_COMMAND="tell traveler shutdown"


    # -- Delay before shutting down the Domino server after invoking the pre-shutdown command --

    DOMINO_PRE_SHUTDOWN_DELAY=10


    There is no simple solution for Windows because Domino is started as a Windows service.
    But if you shut down your server manually, you should shut down the Traveler server first and also use the notes.ini setting.


    Sadly, if you shut down the whole server, all processes receive the quit command at the same time and start to shut down.
    This can lead to undesired timing issues when HTTP and Traveler do their shutdown at the same time.

    A controlled Traveler server shutdown would be the cleaner solution.


    Update to make it more clear:

    Yes, there is a big difference between using "tell traveler quit" and "tell traveler shutdown"!

    If you quit a servertask, the task only has limited time to continue working with full resources available.
    If a task or a server is in pending shutdown, some resources are not available any more and the task cannot finish pending work in a fully controlled way.

    Here is the documentation for the shutdown command:

    Shutdown [command]     - Stop accepting new work requests, allow current work to complete, and then Quit.  
    If you specify a [command], then this command will be run after the server is idle but before it quits.

    So it makes a lot of sense to use shutdown instead of quit.

    ------------


    Another tip that might still be new for some customers: if your load-balancer or other monitoring software supports authenticated (with user credentials), SSL enabled probes, there is a very nice feature in Traveler that allows an end-to-end check of your Traveler server availability.


    HTTPS://traveler.acme.com/traveler?action=getStatus

    Depending on your language settings you either get


    a.) IBM Notes Traveler server is available.


    b.) IBM Notes Traveler server is not available.

    Or you get an error message from the end-to-end check for the user you specified. The user must be allowed to use Traveler and have a proper mail file for this end-to-end test.

    c.) The IBM Notes Traveler server cannot connect to your mail database mail/jdoe.nsf on server CN=domino.acme.com/OU=Srv/O=Acme.  Verify that your mail server mail database grants access to server CN=domino.acme.de/OU=Srv/O=Acme and is operational.  If this does not resolve the problem, your administrator may need to verify the network connection between the servers and that the IBM Notes Traveler server is allowed to access your mail server.
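
A monitoring probe could evaluate the returned text roughly like this. It is only a sketch: the helper traveler_status is hypothetical, the matched phrases are the English messages quoted above (other language settings would need other patterns), and the curl call with its placeholder credentials is left commented out so the example runs without a server.

```shell
#!/bin/sh
# Hypothetical helper: map the text returned by ?action=getStatus to a
# simple probe result based on the English messages quoted above.
traveler_status() {
    case "$1" in
        *"server is available."*)     echo available ;;
        *"server is not available."*) echo unavailable ;;
        *)                            echo error ;;
    esac
}

# Sketch of the probe itself; host name and credentials are placeholders.
# body=$(curl -s -u "$PROBE_USER:$PROBE_PASSWORD" \
#     "https://traveler.acme.com/traveler?action=getStatus")
body="IBM Notes Traveler server is available."
traveler_status "$body"
```

Anything that is neither of the two status messages, such as the mail database error in c.), is reported as "error" so the probe fails rather than silently passing.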


    So I would recommend to
    • always use the "tell traveler shutdown" command first
    • enable HTTP Auto shutdown -> notes.ini NTS_AUTOSTOP_HTTP=true
    • use the options in my start script if you are running on Linux
    • have a look into the ?action=getStatus command if you didn't use that before
    -- Daniel

    Notes Client 9.0.1 FP9 IF1 released

    Daniel Nashed  14 October 2017 19:31:24
    There is also a client IF1 for 9.0.1 FP9 which fixes one part of the issue that I reported.
    Depending on your configuration, sent MIME messages showed up with different fonts on Notes clients.
    It happened in edit mode or when the embedded MIME browser was disabled.

    What has been fixed is that the IF1 client shows the correct fonts. But earlier clients still show different fonts (for example, if you send a mail with sans serif it will show up in serif).

    I don't know if that can be fixed at all but IBM is aware of it and is looking into it.

    -- Daniel



    AYAVAQF7WZ         Fix an issue where sent internet mail shows as "serif" font instead of "san serif"        
    JVEKARBEP2         Fixed an issue where the contents are not displayed after editing if a Richtext field contains an image and "Store contents as HTML and MIME" is enabled
    YNABANLSUB         Fix an error 4399 "Value Is Out Of Range" when running deleteuser lotusscript        

    Domino 9.0.1 FP 9 IF2 available with important fixes

    Daniel Nashed  13 October 2017 11:09:53
    Two of the issues fixed in IF2 have been discussed before in my blog.
    But there are also two other critical issues fixed.

    Some of my customers reported DBMT and updall hangs which have been fixed with TDOOAREP8W.
    And the Private on first use folder issue also has been reported before.

    If you have installed 9.0.1 FP 9 you should update to IF2!


    -- Daniel


    JPAIAQ5SKW
            PANIC: DbMarkCorrupt! (d:\notefile\admin4.nsf Dbiid: 0x3D91E116 0x3C07FE17)        

    JVEKAQSGCC
            Shared, Private on First Use Folder not working as expected in 9.0.1 FP9. It is not possible to view the folder in Designer.        

    TDOOAREP8W
            Performances issue (deadlock) with long held lock after update to Domino 9.0.1 FP9.        

    YNABANLSUB
            Error 4399 (Value Is Out Of Range) When Running Deleteuser Lotusscript.        
