Domino on Linux/Unix, Troubleshooting, Best Practices, Tips and more ...

 

Daniel Nashed

 

Domino 9.0.1 FT Index Hang and potential crash

Daniel Nashed  10 December 2017 05:08:37
We ran into a hang situation multiple times during FT indexing. It turned out that this is a regression introduced in FP9 due to changes in the FT index area.

In certain situations the FT index update hangs while reading document data, and the affected thread maxes out one CPU core.
The SPR description calls it a "spike", but it looks more like the thread uses the CPU permanently.
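
As a quick check before taking a full NSD, you can look for the thread that is pegging a core by sampling per-thread CPU time from /proc on Linux. The following is a minimal Python sketch, not a Domino tool: the function names `thread_cpu_ticks` and `busiest_thread` and the 5 second default interval are assumptions made for illustration. The thread ID it reports is the same number gdb shows as "LWP" in a backtrace.

```python
import os
import time


def thread_cpu_ticks(pid):
    """Return {tid: utime + stime in clock ticks} for each thread of pid."""
    ticks = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                stat = f.read()
        except OSError:
            continue  # the thread exited while we were scanning
        # The comm field may contain spaces or parentheses, so split
        # after the last ')'. The remaining fields start at "state"
        # (field 3), which puts utime (14) and stime (15) at 11 and 12.
        fields = stat.rsplit(")", 1)[1].split()
        ticks[int(tid)] = int(fields[11]) + int(fields[12])
    return ticks


def busiest_thread(pid, interval=5.0):
    """Sample twice and return (tid, cores_used) for the hottest thread."""
    before = thread_cpu_ticks(pid)
    time.sleep(interval)
    after = thread_cpu_ticks(pid)
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    usage = {
        tid: (after[tid] - before.get(tid, 0)) / hz / interval
        for tid in after
    }
    tid = max(usage, key=usage.get)
    return tid, usage[tid]
```

A value close to 1.0 that stays there over repeated samples means the thread is using a full core permanently rather than spiking. Pass the process ID of the task that updates the FT index (for example the dbmt or updall process) as `pid`.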

This can happen with updall, DBMT and other tasks that update the FT index.
The process cannot be stopped, which also means the server cannot be shut down cleanly.

We got a hotfix which will be included in IF3. After applying the hotfix we had no new server hangs.

I am including the call stack of the hang in this blog post so that others who run into the same issue can find it by searching.
If you have not installed FP9 you should wait for IF3. If you are on FP9 and run into this issue, take a full NSD, open a PMR and reference the mentioned SPR numbers to get the fix.

-- Daniel


-- Fixed SPRs --

SPR #SVEM9SLCL7
J3 server crashed on DBMT task, while full text indexing the database

SPR #TDOOAT6LK9
CPU spike when running dbmt (or updall/update task) and creating full text index.

-- Call Stack --

Thread 3 (Thread 0x7f2c5da71700 (LWP 17594)):
#0  ODSToOrFromHost (toHost=32769, type=0, vbuffer=0x7f2c5da6e8e0, iterations=1) at ods.c:824
#1  0x00007f2cba7ef8fe in ODSReadItem (src=0x7f2c54466d96, type=, dest=0x7f2c5da6e8e0) at ods.c:1420
#2  0x00007f2cbab635e2 in GetChar(STREAM_CTX*, STREAM_DATA*) () from /opt/ibm/domino/notes/latest/linux/libnotes.so
#3  0x00007f2cbab64932 in FTGetDocStream () from /opt/ibm/domino/notes/latest/linux/libnotes.so
#4  0x00007f2c5d390919 in NotesStreamReadChar (arg=) at ftg_dstr.cpp:1412
#5  0x00007f2cbab5ca7c in FTLexMatch () from /opt/ibm/domino/notes/latest/linux/libnotes.so
#6  0x00007f2c5d39296c in FTGCreateIndex (pFTGCtx=0x7f2c4c00abf8) at ftg_dstr.cpp:1839
#7  0x00007f2c5d38bac0 in CFTNoteIndexer::ProcessDoc(FTG_CTX *, struct {...} &) (this=, pFTGCtx=0x7f2c4c00abf8, docIndexerInfo=...) at ftgindex.cpp:2074
#8  0x00007f2c5d38c5d1 in FTGIndexIDProc (Parameter=, NoteID=207326) at ftgindex.cpp:1685
#9  0x00007f2cb999285d in IDEnumerate (hTable=536872571, Routine=0x7f2c5d38c343 , Parameter=0x7f2c4c00abf8) at idtable.c:2216
#10 0x00007f2c5d38e252 in FTGIndex(FT_THREAD *, struct {...} *, WORD, char *) (pftt=0x7f2cb41004d0, pFTStreamCtx=0x7f2c4c00abf8, Options=392, StopFile=) at ftgindex.cpp:1146
#11 0x00007f2cbab5adce in FTCallIndex () from /opt/ibm/domino/notes/latest/linux/libnotes.so
#12 0x00007f2cbab5c3a3 in FTIndexExt2 () from /opt/ibm/domino/notes/latest/linux/libnotes.so
#13 0x00007f2cb93e8485 in UpdateFullTextIndex (hDB=1154, Pathname=0x7f2cb4101648 "mail/c1/xn06451.nsf", Flags=201342976, fullTextStatus=8) at update.c:1239
#14 0x00007f2cb93ea78f in UpdateCollectionsExt (_hModule=, Pathname=0x7f2cb4101648 "mail/c1/xn06451.nsf", Type=2, Flags=201342976, Flags2=0, mSecs=0, ViewNoteID=0, ContainerObjectID=0, ViewTitle=0x40a360 "", retDbTitle=0x0, fSrchSite=0, QueuedRequest=0, retbLater=0x0, fullTextStatus=8, wantsFulltext=0x0) at update.c:660
#15 0x00007f2cb93ea957 in UpdateCollections (_hModule=32769, Pathname=0x0, Flags=, ViewNoteID=, ContainerObjectID=, ViewTitle=, retDbTitle=0x0, fSrchSite=0, QueuedRequest=0, retbLater=0x0, fullTextStatus=8, wantsFulltext=0x0) at update.c:106
#16 0x0000000000405238 in UpdallThread (threadparam=) at dbmt.c:2108
#17 0x00007f2cb98e7be3 in ThreadWrapper (Parameter=) at thread.c:1183
#18 0x0000003aae007aa1 in start_thread () from /lib64/libpthread.so.0
#19 0x0000003aadce8bcd in clone () from /lib64/libc.so.6
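
To check whether a backtrace from your own NSD or gdb session resembles the one above, you can compare the frame names with a small script. The following is a minimal Python sketch, not an IBM tool: the names `frame_functions`, `matches_hang_signature` and `HANG_SIGNATURE` are hypothetical, and which frames are really characteristic of this SPR is an assumption based only on the call stack in this post. A match is a hint, not a diagnosis.

```python
import re

# Matches gdb backtrace frames such as
#   "#3  0x00007f2cbab64932 in FTGetDocStream () from ..."
#   "#0  ODSToOrFromHost (toHost=32769, ...) at ods.c:824"
FRAME_RE = re.compile(
    r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([A-Za-z_][\w:]*)"
)

# Frame names copied from the call stack in this post, innermost first.
# Assumption: these frames are enough to recognize the hang.
HANG_SIGNATURE = [
    "FTGetDocStream",
    "NotesStreamReadChar",
    "FTGCreateIndex",
    "FTGIndex",
    "UpdateFullTextIndex",
]


def frame_functions(backtrace):
    """Return the function names of all frames, innermost first."""
    funcs = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            funcs.append(match.group(2))
    return funcs


def matches_hang_signature(backtrace, signature=HANG_SIGNATURE):
    """True if all signature frames appear in the backtrace, in order."""
    remaining = iter(frame_functions(backtrace))
    # "name in remaining" consumes the iterator up to the first hit,
    # so this checks that signature is a subsequence of the frames.
    return all(name in remaining for name in signature)
```

Feed it the text of one thread's backtrace at a time; other frames between the signature frames are ignored, so small differences between builds should not prevent a match.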

Comments

1  Greg  11.12.2017 8:40:45  Domino 9.0.1 FT Index Hang and potential crash

I have the same problem.

I configured DBMT to run at Domino startup and used the "-range" option to specify a time slot for DBMT.

I also used the "-ftindays" option to renew the FT index on databases frequently.

The hangs you describe happen only when the time slot is too short to renew the FT index in all queued databases (all databases whose FT index creation date is older than the "-ftindays" value). I opened a PMR and was told by IBM that there is a collision with the "updall" process. As I observed, when the time slot is about to end, DBMT can stop the compact threads but cannot stop the FT threads (it allows the FT threads to complete the databases they have started).

From my point of view, DBMT stops controlling the "updall" process after the time slot ends. "updall" does not know that a DBMT FT thread is still rebuilding the FT index on database xxxx.nsf and starts updating this index, and we have a collision.

If you are already testing IF3, you could force the described condition.

If someone has already updated to FP9: I found a workaround for this problem (it works on my system). You can start DBMT with the same options -updallthreads -compactthreads -ftithreads, but not at system startup, and with the -timelimit option instead of -range.

DBMT will still hang if the conditions I described occur, but the command "tell dbmt quit" will work.

Good luck.

2  Rob Kirkland  19.12.2017 17:24:00  Domino 9.0.1 FT Index Hang and potential crash

Reading Greg's interesting comment, I wonder if Greg's problem would disappear if he disabled UPDALL entirely. IBM expressly warns us to disable UPDALL upon enabling DBMT. Greg, if you read this and I'm misunderstanding your comment, please clarify.

3  Irv Schor  20.12.2017 21:18:21  Domino 9.0.1 FT Index Hang and potential crash

I reported this to IBM Support in early September. After going through a couple of hotfix attempts that did not work, we were provided with a patch that merely downgrades the FT code to FP8 until they get it resolved (that is what I was told). We're okay now, but it definitely caused pain on a busy mail server with DBMT. The CPU was basically pegged at 100% without any letdown. Updall was not running on our servers.
