Domino on Linux/Unix, Troubleshooting, Best Practices, Tips and more ...

 

Daniel Nashed

 

Debugging program crashes with gdb on Linux

Daniel Nashed  9 April 2023 03:51:09

This bugged me for a while because I had no idea what was happening.
One of my OpenSSL-based tools in C crashed once in a while.
I was only able to find out why once I wrote my own small tool to check the server listener of my other application.

The crash happened very intermittently in different places when I opened and closed the connection very quickly.
Adding a delay of 1 ms stopped the crash. But if you are running a service on the internet with port scanners around, you had better find out exactly what is going on.

It turned out to be the SIGPIPE signal causing my program to terminate.
But the interesting part is how I found out about the crash.

gdb (GNU debugger) is the tool also used by HCL Domino's NSD to get the call stacks of running or crashed processes.
But you can also invoke it manually to trace your application and catch terminations.

The interface is a bit low-level, which reminds me of how I low-level formatted my first hard drive (an MFM drive in my XT) by jumping into the controller's BIOS with the MS-DOS debugger ages ago. But that's a different story...


First of all, you have to install the debugger. On a Domino server it is hopefully already installed. The command line below is for CentOS; on SUSE use zypper install, on Ubuntu/Debian use apt install.

yum install gdb

Then you simply run it from the command line

gdb

In the next step, you specify the program to run

file bin/nshciphers
Reading symbols from bin/nshciphers...

Your application should be compiled with debug symbols (-g switch), so that gdb can show source file names and line numbers in the call stack

Depending on the program, you also need to define its command line parameters

set args -s -cert /mnt/d/rsa_cert.pem -key /mnt/d/rsa_key.pem

Finally, you run the program

run

In my case I then hit the program really hard trying out different types of ciphers via OpenSSL C code until the following showed up on screen:

Program received signal SIGPIPE, Broken pipe.
0x00007f27a62bb920 in write () from /usr/lib/libc.so.6

In the final step I ran the backtrace command to show the call stack causing this program termination:

bt

#0  0x00007f27a62bb920 in write () from /usr/lib/libc.so.6
#1  0x00007f27a6873db5 in sock_write () from /usr/lib/libcrypto.so.3
#2  0x00007f27a686b4d7 in bwrite_conv () from /usr/lib/libcrypto.so.3
#3  0x00007f27a6869e86 in bio_write_intern () from /usr/lib/libcrypto.so.3
#4  0x00007f27a686a403 in BIO_write () from /usr/lib/libcrypto.so.3
#5  0x00007f27a6721878 in ssl3_write_pending () from /usr/lib/libssl.so.3
#6  0x00007f27a6722864 in do_ssl3_write.localalias () from /usr/lib/libssl.so.3
#7  0x00007f27a66fbd3a in ssl3_dispatch_alert () from /usr/lib/libssl.so.3
#8  0x00007f27a66fa8e5 in ssl3_shutdown () from /usr/lib/libssl.so.3
#9  0x000055ef1f11019e in ServerCheck (pszHost=0x0, pszPort=0x0, pszPemCert=0x7ffe6b505ed2 "/mnt/d/rsa_cert.pem", pszPemKey=0x7ffe6b505eeb "/mnt/d/rsa_key.pem", pszCipherList=0x0, Options=0)
    at nshciphers.cpp:388
#10 0x000055ef1f1112f5 in main (argc=6, argv=0x7ffe6b5059d8) at nshciphers.cpp:877


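It turned out that the signal is raised at the lowest level: the write() call in glibc hits a socket which the remote side has already closed, the kernel sends SIGPIPE, and the default action for SIGPIPE is to terminate the process. Because the write happens deep inside OpenSSL, the workaround I ended up with was to ignore the signal.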

This led to the following code in my application to prevent the problem from happening:

    struct sigaction SignalAction = {0};

    /* Ignore broken pipes, which can happen when the remote side is not behaving well */
    SignalAction.sa_handler = SIG_IGN;
    ret = sigaction (SIGPIPE, &SignalAction, NULL);


The interesting part is that I found almost no reference to this problem until I found out the underlying issue.
It was really the GNU debugger helping me to find the right path to solve it.

I had this problem open for a while with my own small OpenSSL-based CA in C, which crashed once in a while.

Maybe this helps others debugging their own weird problems as well.

This made my day after being quite frustrated about this problem yesterday.

-- Daniel

