Fault Recovery with External Applications
Daniel Nashed – 5 August 2013 09:13:24
When a Domino server is recovered after a crash using the fault recovery functionality, the server tries to clean up all resources it was using.
This includes processes, shared memory and message queues.
NSD cleans up all resources that the Domino server keeps track of, that is, resources allocated through Domino code.
For example, this covers all shared memory allocated via the Domino memory manager and all processes invoked through Domino.
Domino keeps track of these resources in the following files:
pid.nbf (processes)
shm.nbf (shared memory)
mq.nbf (message queues)
With add-on applications such as virus scanners, backup software and other C-API based applications, not all of those resources are necessarily tracked by the Domino server.
This happens, for example, when an application creates its own message queues or allocates its own shared memory through direct OS calls.
In that case Domino cannot remove those resources automatically via NSD after a crash.
For that kind of application you should ask your software vendor whether they have a separate cleanup script that you can specify in your Domino server document.
Also, if the server is in an unstable state and not all resources have been tracked correctly, an "nsd -kill" might not clear all resources even without external software, because only the "tracked" resources are cleaned up.
In those cases you need to clean up resources manually. My start/stop script has a "cleanup" option which removes all processes, all shared memory and all MQs, regardless of whether they are tracked or not.
The script basically enumerates all resources used by the Linux/Unix user the server is running as and removes them as a last resort.
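To make the enumeration idea a bit more concrete, here is a minimal Python sketch. It is not the code of the start script itself, just an illustration under a few assumptions: the input is the text printed by "ipcs -m" and "ipcs -q" in the Linux (util-linux) layout, where data rows start with a hex key and the owner is the third column, and the function names are made up for this example. Other Unix flavors print different columns. The sketch only builds the "ipcrm" command lines and does not execute anything.

```python
def owned_ipc_ids(ipcs_output, user):
    """Return the IDs (second column) of IPC objects owned by `user`.

    Assumes the Linux util-linux layout: key, id, owner, perms, ...
    """
    ids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows start with a hex key such as 0xf8100400;
        # banner and column-header lines do not.
        if len(fields) >= 3 and fields[0].startswith("0x") and fields[2] == user:
            ids.append(fields[1])
    return ids


def ipcrm_commands(shm_output, mq_output, user):
    """Build (but do not run) ipcrm calls for the user's shared memory and MQs."""
    cmds = [["ipcrm", "-m", shm_id] for shm_id in owned_ipc_ids(shm_output, user)]
    cmds += [["ipcrm", "-q", mq_id] for mq_id in owned_ipc_ids(mq_output, user)]
    return cmds
```

Because the selection is done purely by owner, this removes tracked and untracked resources alike, which is exactly why it should only be used as a last resort and only for a user that runs nothing but the Domino partition.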
This works fine in most cases, but if external software is installed in separate directories, the start script filters those processes out, so they are not cleaned up.
For that kind of configuration I have added a new configuration variable to specify additional binary directories to check processes against during cleanup.
I cannot simply kill all processes belonging to the user, because that could also include other relevant processes that are not directly used by the Domino server partition.
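The directory check can be pictured roughly like the following Python sketch. It is only an illustration of the idea, not how the start script implements it: the function name, the (pid, executable path) input format and the example paths are assumptions made for this example.

```python
import posixpath


def processes_to_kill(processes, bin_dirs):
    """Select PIDs whose executable lives below one of the given directories.

    processes: iterable of (pid, absolute_executable_path) tuples
    bin_dirs:  Domino binary directory plus any additional 3rd-party directories
    """
    # Normalize and append a trailing slash so that /opt/av/bin
    # does not accidentally match /opt/av/binx.
    prefixes = [posixpath.normpath(d).rstrip("/") + "/" for d in bin_dirs]
    selected = []
    for pid, exe in processes:
        exe = posixpath.normpath(exe)
        if any(exe.startswith(prefix) for prefix in prefixes):
            selected.append(pid)
    return selected
```

With a hypothetical process list containing the Domino server binary, a virus scanner daemon in /opt/vendor/av/bin and an unrelated shell, only the first two would be selected once /opt/vendor/av/bin is listed as an additional directory. Everything else owned by the user is left alone.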
In addition, you always have the option to include your own scripts via DOMINO_PRE_KILL_SCRIPT / DOMINO_POST_KILL_SCRIPT.
But in most cases just adding the additional binary directories via DOMINO_3RD_PARTY_BIN_DIRS should work fine.
This option is new and I am currently testing it. If you have feedback or better ideas in this area, please let me know before I release the update.
-- Daniel