Domino on Linux/Unix, Troubleshooting, Best Practices, Tips and more ...

 

Daniel Nashed

 

    Domino Storage optimization -- why are there still customers not leveraging the full potential?

    Daniel Nashed  19 September 2021 10:08:37

    In the context of Domino Backup I am looking again into storage optimization.

There are many different ways to back up Domino databases.
    One of the most advanced options is to leverage a snapshot vendor solution.


For a snapshot backup, reducing the NSF storage and moving all data that doesn't need a snapshot backup out of the data directory should be clear to every admin.

    But storage optimization best practices are beneficial for any server and also for run-time performance.


It's still not clear why so many customers are not taking advantage of all the storage optimization benefits Domino offers.


    I looked back into a presentation I did in 2008 when Domino 8.5.x introduced DAOS and I copied one of the most essential slides.


    Here is a very quick summary of storage optimization benefits without going into too much detail.

All the functionality is easy to implement, and I really don't see why it's not used more.


What is missing to get more admins to implement all of it?



    -- Daniel



Design compression

    • Reduces the design of a database by about 50%.
    • Enabled by default on mail templates but needs to be applied manually to older databases.
    Data compression
    • Reduces the notes data (basically everything that is not an attachment) by about 50%.
    • Enabled by default on mail templates but needs to be applied manually to older databases.
    DAOS
    • Usually more than 70% of the data in databases is attachments.
    • Moving the attachments to DAOS allows incremental DAOS backup.
    • The NSF is reduced by 70%, and backup, compact and other operations greatly benefit in performance and backup storage costs.
    • Deduplication saves another 30-35% of attachment storage on average on top of that!
      DAOS T2 can be used to offload older attachments to cheaper storage if needed.

    NIFNSF
    • Moves NIF data from the database to a separate NDX file, which can also be stored in a different file-system outside the data directory.
    • Up to 10-15% less storage needed in the NSF.
    FT index moved to a different directory -- especially important for snapshot backups
    • You can move the FT index to a different disk so a snapshot backup does not include that storage.
    • Sometimes moving the FT index also helps to avoid growing your NSF file-system.
    DBMT
    • The database maintenance tool is designed to perform all maintenance operations on databases.
    • I still see that most customers use compact -B to maintain databases.
    • DBMT has many advantages, including pre-allocating the required disk space and allowing the file-system to align the file on disk properly.
    • Its compact is a copy-style compact and should be performed at most once per week.
    • This servertask is the recommended way to maintain your databases (see the command sketch below)!
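    For reference, here is a console and notes.ini sketch of how these options are typically enabled. The flags and parameters below are from my notes, and the paths are examples -- verify them against the HCL documentation for your release.

    # notes.ini -- enable NIFNSF and relocate NIF and FT data (example paths)
    NIFNSFEnable=1
    NIFBasePath=/local/nif
    FTBasePath=/local/ft

    # Copy-style compact enabling design and data compression (-n / -v),
    # LZ1 compressed attachments (-ZU) and DAOS for existing databases:
    load compact mail/ -c -n -v -ZU -daos on

    # Enable NIFNSF for existing databases (requires the notes.ini settings above):
    load compact mail/ -c -NIFNSF on

    # DBMT instead of compact -B, e.g. via a program document:
    load dbmt -compactThreads 4 -updallThreads 4 -compactNdays 7 -range 1:00AM 5:00AM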


For snapshot backups you certainly want to split data from index to avoid taking a backup of index data.
    But also for standard backups you greatly benefit from the storage optimization outlined above.




    Image:Domino Storage optimization -- why are there still customers not leveraging the full potential?

    Domino One touch setup meets Domino One touch install

    Daniel Nashed  18 September 2021 09:55:52
    Now that I introduced a Domino One Touch server installation, the next logical step is to combine it with Domino V12 One Touch setup.
I added an extension to the start script, with environment variable and JSON configurations for One Touch setup.

The new functionality provides an easy to use "setup" command, which automatically creates a sample configuration for you.
    When you use the install script, all the files are automatically copied to the /opt/nashcom/startscript/OneTouchSetup directory.

The new command "domino setup" comes with a couple of options.
    I don't think I have to explain those in detail, and I appended the draft help below.
    It works for first and additional servers with ENV and JSON configurations.

Now bringing up a new server from scratch boils down to these 3 commands :-)

    curl -sL https://github.com/IBM/domino-docker/raw/develop/start_script/install_domino.sh | bash -
    domino setup
    domino start
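
    For illustration, a first-server ENV configuration follows roughly this pattern. The variable names follow the One Touch setup documentation; all values are samples you would adapt:

    # Example One Touch setup ENV file for a first server (sample values)
    SERVERSETUP_SERVER_TYPE=first
    SERVERSETUP_ORG_ORGNAME=Acme
    SERVERSETUP_ADMIN_FIRSTNAME=John
    SERVERSETUP_ADMIN_LASTNAME=Doe
    SERVERSETUP_ADMIN_PASSWORD=secret-password
    SERVERSETUP_SERVER_NAME=acme-srv-01
    SERVERSETUP_SERVER_DOMAINNAME=Acme
    SERVERSETUP_NETWORK_HOSTNAME=domino.acme.com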

    What do you think?
    Would this help admins who never used Linux or One Touch setup in Domino V12 to have an easy start?

PS: I also built it for myself. I don't want to set up any of my natively installed test servers manually.
    It also helps with testing One Touch configurations. I am looking into adding JSON validation via jq ...


    -- Daniel


PS2: Of course you have to have the software available somewhere, as described in my One Touch install blog post.
    This is not published yet. I have it working here but will do some testing with multiple configurations first.



    One Touch Setup Commands

    setup            edits an existing One Touch config file or creates a 1st server ENV file
    setup env 1      creates and edits a first server ENV file
    setup env 2      creates and edits an additional server ENV file
    setup json 1     creates and edits a first server JSON file
    setup json 2     creates and edits an additional server JSON file

    setup log        lists the One Touch setup log file
    setup log edit   edits the One Touch setup log file


    CentOS Stream uses "epel-next-release"

    Daniel Nashed  14 September 2021 05:35:05

    This morning my servers installed "epel-next-release".
There is a page describing all the details, and it also comes with an FAQ.

    See details here: https://docs.fedoraproject.org/en-US/epel/epel-about-next/
    But let me quote the most important details for administrators below.

First of all, you don't really need to care about this change. It all works automatically and makes sure all packages still
    work for CentOS Stream; the changes are in preparation for the next minor version update of RHEL.
    But the details are interesting to read and give some background on how Red Hat is handling it.
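
    If you want to verify this on your own CentOS Stream machine, the standard dnf commands are all you need (a sketch):

    # Install the repository packages on CentOS Stream
    dnf install epel-release epel-next-release

    # Check which EPEL repositories are enabled
    dnf repolist | grep -i epel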

This isn't a CentOS Stream-only repository, and it will help to ensure packages are compatible.
    It will be used only for a few packages. Just in case you are wondering, here are the details.

    -- Daniel

    ...

    EPEL packages are built against RHEL.
    EPEL Next is an additional repository that allows package maintainers to alternatively build against CentOS Stream.
    This is sometimes necessary when CentOS Stream contains an upcoming RHEL library rebase, or if an EPEL package has
    a minimum version build requirement that is already in CentOS Stream but not yet in RHEL.

    ...

    Due to the strong compatibility guarantees of RHEL, most EPEL packages built against RHEL install just fine on CentOS Stream.

    ...

    "Next" correctly describes the purpose of the repository, which is providing packages compatible with the next minor release of RHEL.

    ...

    EPEL Next is bound by the same guidelines and policies as regular EPEL. If a version upgrade is inappropriate for EPEL, it’s inappropriate for EPEL Next.

    ...


    Tika in Notes/Domino

    Daniel Nashed  9 September 2021 19:01:12

At DNUG Domino this week there were some interesting questions about how Tika works and how it could be used to search attachments.
    I explained that Tika is only used in the back-end by Notes/Domino to index attachments; it is not used for searching attachments.

So Tika feeds attachment text extracts into the indexing process and is not part of the search operations.
    Tika replaced the legacy external "KeyView" package used for a long time. Tika is an open source Apache project.

When it was first introduced, I ran an NSD to see what happens under the covers.
    You can see from the call stack below that the Notes/Domino index processes communicate with Tika leveraging libcurl functionality.

    Tika is listening on localhost only after being started by Notes/Domino and any Notes/Domino process can send requests to Tika.
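
    You can verify this yourself on a Linux server. A quick sketch (the Tika listener port is dynamic, so don't hard-code it):

    # Show the Java based Tika server process started by Notes/Domino
    ps -ef | grep -i tika | grep -v grep

    # Show its listening socket -- it binds to 127.0.0.1 only
    ss -tlnp | grep java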


    Call stack from indexing attachments in a database

    ############################################################
    ### thread 1/12: [ nUpdate:  0450:  0f84]
    ### FP=0x001ab4e8, PC=0x77b315b0, SP=0x001ab4e8
    ### stkbase=0x001b0000, total stksize=81920, used stksize=19224
    ############################################################
     [ 1] 0x77b315b0 kernel32.TlsGetValue+48 (1ab640,FFFFFFFFFB3B4C0,100000000,10)
     [ 2] 0x7FEFCF06FF9 mswsock+28665 (4c0,7FEFDB9D0D8,76C6F2242BD7,0)
     [ 3] 0x7FEFDB9507C WS2_32.select+348 (0,4c0,0,7FEFDB9507C)
     [ 4] 0x7FEFDB94FFD WS2_32.select+221 (4c1,0,FFFFFFFFFFFFFFF,a0)
    @[ 5] 0x7FEF2044936 nnotes.Curl_socket_check+582 (4c1,0,0,0)
    @[ 6] 0x7FEF2049B9F nnotes.Curl_readwrite+159 (0,15c73b20,1abf89,0)
    @[ 7] 0x7FEF2035E41 nnotes.multi_runsingle+3393 (15c78ce0,1ac010,cc61600,0)
    @[ 8] 0x7FEF2034436 nnotes.curl_multi_perform+118 (0,0,15c78ce0,3e8)
    @[ 9] 0x7FEF20308B4 nnotes.easy_perform+404 (0,7FE00004E2B,7FE00000000,1)
    @[10] 0x7FEF20267CB nnotes.GetChar+4651 (0,d104ae0,cc616d8,13)
    @[11] 0x7FEF2029E37 nnotes.FTGetDocStream+519 (0,7FEF03A9C30,243,1accb0)
    @[12] 0x7FEF03A9D08 nftgtr40.NotesStreamReadChar+216 (45,290021000C,d104ae0,B000007E2)
    @[13] 0x7FEF202DDFE nnotes.FTLexMatch+142 (d104b88,1ad080,7FEF03A9C30,d1031c0)
    @[14] 0x7FEF03A752A nftgtr40.FTGCreateIndex+1466 (d100003,1ad2f0,0,465687300000000)
    @[15] 0x7FEF03A2F2E nftgtr40.CFTNoteIndexer::ProcessDoc+350 (d1031c0,90ef,121de,0)
    @[16] 0x7FEF03A5F81 nftgtr40.FTGIndexIDProc+817 (0,121de,fe,0)
    @[17] 0x7FEF275F20B nnotes.IDEnumerate+235 (20000466,1ad5f0,0,125833F003F0036)
    @[18] 0x7FEF03A519D nftgtr40.FTGIndex+6701 (1cdcc48,1,1148,1cdcc48)
    @[19] 0x7FEF2021690 nnotes.FTIndexExt2+4416 (243,20001148,0,0)
    @[20] 0x13F51BE68 nUpdate.UpdateFullTextIndex+488 (0,3fff,1aeaf0,0)
    @[21] 0x13F51BB26 nUpdate.UpdateCollectionsExt+3318 (0,1aeaf0,7FEF2740001,4000000)
    @[22] 0x13F51AE27 nUpdate.UpdateCollections+135 (1aeb38,7FE00000001,1aef40,20000012)
    @[23] 0x13F51483B nUpdate.PerformRequest+715 (0,1e,cd60098,0)
    @[24] 0x13F517978 nUpdate.Update+3576 (3304,1,0,3)
    @[25] 0x13F511181 nUpdate.AddInMain+385 (32cc38,0,11700001,0)
    ...

    Tika documentation

The Tika service is fully documented and provides a simple REST-based interface.
    See this link for the full documentation including the REST interface: https://cwiki.apache.org/confluence/display/TIKA/TikaServer
    This makes Tika available for attachment-based applications outside the standard use case.
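
    For example, with a stand-alone Tika server on its default port 9998, extracting text from a file is a single request. A sketch using the documented endpoints:

    # Extract plain text from a file via the Tika REST interface
    curl -s -T document.pdf -H "Accept: text/plain" http://localhost:9998/tika

    # Get the document meta data as JSON instead
    curl -s -T document.pdf -H "Accept: application/json" http://localhost:9998/meta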


    Using Tika for your own text filtering

The discussion yesterday at the on-line conference showed that the customer needs a different approach.

They will have to analyze attachments already in the routing phase to categorize and re-route messages based on sender, text, subject and also attachment content.

For an application like this, Tika could be a good candidate to extract the attachment data. This would need a custom solution to send the attachments to Tika.
    I would probably use a dedicated Tika instance on another port. But sending attachments over a REST interface isn't rocket science if you know how to use libcurl.

    I am thinking about building a flexible and customizable routing and processing add-on for Domino leveraging Tika in the back end.

This will need an extension manager to hold the mail in mail.box and a small servertask to process the message.
    And this would open new business cases for how Domino could be used to route mail for different types of purposes.

    But back to the troubleshooting approach.


    Troubleshooting and analysis

In a customer situation we needed to figure out how certain attachment types are handled by Tika and which results we get back.

    There is debugging on the Notes/Domino side, which helps to trace requests.

    notes.ini DEBUG_FILTER_TIMING=1

    This setting results in console.log output, which is easy to annotate:

    13.05.2021 12:34:54 Tika Attachment Filtering - took 4539 msecs Filtering Attachment 'domino_backup.odp' in '/local/notesdata/tika.nsf' (DocID = 2330), size = 402668, occurrences = 1785
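
    Even without a tool you can get a first overview with standard shell commands. A minimal sketch that pulls the duration and attachment name out of such lines and sorts by time, slowest first (adjust the console.log path for your server):

    grep "Tika Attachment Filtering" /local/notesdata/IBM_TECHNICAL_SUPPORT/console.log |
      sed "s/.*took \([0-9]*\) msecs.*Attachment '\([^']*\)'.*/\1 \2/" |
      sort -rn | head -20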

This was our first step to look into what was going on in the customer environment.
    I wrote some code to parse the log results into a Notes database.
    It is similar to the tools I have for client_clock, server_clock, Domino iostat, semaphore debug and other files -- all tools that grew over the years through on-site troubleshooting.

    It just takes the data that is already there and puts it into sortable documents.

This approach already provided more details about how Tika behaves with different attachment types.
    But it made me curious to really find out how Tika works.
    So I ended up writing my own C-API based analysis and performance troubleshooting tool.


    Analysis application to benchmark and troubleshoot Tika

The tool crawls a whole NSF and sends all attachments to a Tika process that you can run manually on the same machine.
    The requests are sent to Tika in a very similar way to how Notes/Domino sends them.

All information returned from the request, including meta information coming from Tika, is logged into trace documents.
    In addition, it hooks into the Tika process to get performance data directly from it.
    Assuming we are the only thread sending requests, this data can be aligned with the other results.

This showed very detailed information about what is returned from Tika.
    I am getting the full text stream back, but I am just checking the size.

    I found out some very interesting details that might be similar for other environments.

    Here is an example:

    Image:Tika in Notes/Domino



    Analyzing logs and getting best practices

Notes/Domino indexing can only be as good as the way the Tika process handles the data.
    And every new Tika version might bring better results.

    But there are some general rules for optimization.
• You should exclude all graphics formats if they are not really required (see the notes.ini sketch after this list)!
      Some graphics formats cause a lot of overhead with not very useful text data returned.
    • All types of ZIP/compressed files should be avoided, because the exclusions always apply to the attachment name you pass.
      An extracted text file inside a ZIP might get you huge results back.
      For example, a zipped NSD is a small attachment. But expanded it can be huge!
    • PDFs took some time in my environment. But you can't avoid PDFs.
      It wasn't always just a matter of the attachment size.
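
    For the exclusions themselves, there is a notes.ini exclusion list for attachment indexing. A sketch -- please verify the exact parameter name and syntax against the current HCL documentation for your release:

    # notes.ini -- exclude file types from attachment indexing (sample list)
    FT_INDEX_IGNORE_ATTACHMENT_TYPES=*.gif,*.jpg,*.png,*.zip,*.gz,*.7z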

    In my environment I have only limited analysis data.
    But in a larger environment this could provide useful information for optimizing the Tika indexing.

    Obfuscating data

The data can be completely obfuscated: the attachment names and the database names can be obfuscated with a single switch.
    The attachment extension remains the same, as does the path name. But the file names are always obfuscated by turning them into a hash.

Because this data can be very sensitive, I added obfuscation from day one to allow the customer who first ran it to share data with me.


    Conclusion

This would be a good tool if you are looking into larger full text index deployments -- for example, when you want to enable FT indexing including attachment indexing for your Verse users.

    I have never been a fan of attachment indexing. But if you want to enable it in a larger environment, this tool might help you.

    And for sure you should look into optimizing attachment indexing by excluding certain types of file extensions.

ZIP formats are problematic! They cannot be avoided in all types of environments, but you should try to!




    How do you backup DAOS? Some new ideas ...

    Daniel Nashed  29 August 2021 09:54:47

    Storage optimization before backup
For my server I am already testing multiple ways to back up NSF files.
    You have seen BorgBackup, and I am looking into different ways of snapshot and standard Domino V12 backups via the file-system (OpenZFS snapshots etc.).
    I moved out FT and NIFNSF, enabled all compression options and DAOS, and now have pretty small NSF files to back up.
    OK, DAOS is a bit overkill for a one-person server. But it also allows me to optimize my backup.
    DAOS can be backed up using any backup solution at any time. The files are written once and never change.
    If an attachment changes, a new NLO is created, and the ref-count of the old NLO is lowered as the referencing documents are removed.

    DAOS Backup

I am now thinking about multiple ways to back up. A standard backup doesn't sound like the best option for me.
    A daily incremental backup still causes a lot of overhead on the backup side.

    In my case I probably don't have to care too much about deletions.
    The back-end storage would be allocated once, and deleting NLOs after some time would be really difficult to handle.

Also, I want to push the DAOS backups from my cluster partner to the same location for deduplication.
    The target storage is a Hetzner Storage Box, which can be accessed either over a Samba file share or SFTP/SSH.


    rclone
A great tool for Linux and other platforms, with many options for different back-ends and also with an optional encryption layer.
    I am currently looking into pushing all my NLOs into a single directory for deduplication reasons.

    The copyto option would help to copy only the new files.
    https://rclone.org/commands/rclone_copy/

And I have just tested the encryption layer as a connector on top with my OneDrive account.

    https://rclone.org/crypt/

OneDrive would be another interesting option for a smaller environment.
    The standard private or family accounts already come with 1 TB of storage included per user!
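
    A crypt remote in rclone is just a wrapper around another configured remote. A sketch of the relevant rclone.conf sections -- remote names and the target folder are examples, and the OneDrive token and crypt passwords are created interactively via "rclone config":

    [onedrive]
    type = onedrive
    # token etc. generated by "rclone config"

    [daos-crypt]
    type = crypt
    remote = onedrive:domino-daos
    # password / password2 are stored obscured by "rclone config"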


    Copyto operation with rclone

    So rclone is a very interesting data mover and is very flexible to integrate different back-ends.

The encryption on top would be an interesting option for DAOS.
    But Domino V12 also offers shared encryption keys for DAOS T1 storage.
    So we could even have multi-level security ...

    A nightly sync job would just sync all DAOS objects to the remote.
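
    As a sketch, such a job could be as simple as an rclone copy of the DAOS directory -- "copy" never deletes on the target, which matches the idea of not deleting NLOs. Paths and the remote name are examples:

    # Nightly DAOS push: copy new NLOs to the (encrypted) remote
    rclone copy /local/daos daos-crypt: --include "*.nlo" --log-file /var/log/daos-rclone.log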

And yes, the biggest challenge would be deleting NLOs. In my case I might just move attachments to an archive database,
    so I would not expect many NLOs to be deleted.

For a single server, keeping track of deleted NLOs would still be possible.
    But for a multi-server environment it would be the safer bet to not delete any NLO.

    Yes, there are regulations like GDPR. But if the NLOs are encrypted and there is no attachment reference any more to access them, this might be an acceptable way.



    Notes 12.0.1 Advanced Properties Box for Replication Conflict Troubleshooting

    Daniel Nashed  27 August 2021 05:21:56

    Did I already say how much I love the new properties box?  :-)

I just created a replication/save conflict to see how this looks.
    Even without checking the item sequence numbers, this is already very helpful for replication/save conflicts.

    And you can use the filter for different and new fields to narrow it down.

    -- Daniel

    Image:Notes 12.0.1 Advanced Properties Box for Replication Conflict Troubleshooting


    DNUG Domino Day Online -- With latest Domino V12 infos

    Daniel Nashed  26 August 2021 12:15:13
    Another exciting DNUG Domino Day on-line.

    We will cover the latest information about Notes and Domino including Beta material.
    And we have Thomas Hampel from HCL product management to talk about brand new stuff.

My sessions about CertMgr and Domino Backup will also cover up-to-date Domino 12.0.1 Beta functionality.

    https://dnug.de/event/dnug-online-domino-best-practices-domino-notes-v12/


Sorry that this is again in German .. but we are a German-speaking user group and tried to get local speakers.

    -- Daniel

    Image:DNUG Domino Day Online -- With latest Domino V12 infos

    Happy 30th birthday Linux!!

    Daniel Nashed  26 August 2021 07:45:50

Image:Happy 30th birthday Linux!!

    WOW 30 years Linux..

I was an early bird and downloaded it onto floppy disks at Düsseldorf University via the ftp command line ..

    Later in Domino 5.0.3 we got Linux support for the first SLES versions.
    Now Linux is the OS powering the internet and IMHO the best OS we have for Domino.
    It has been an interesting journey so far.

    And it is getting easier every day for users as well.
    Today my dad likes his Ubuntu desktop more than his Windows notebook ..
    Updates and support are so much easier and faster!

    Congrats to the Linux community!

    Daniel


    Introducing Domino on Linux One Touch Install

    Daniel Nashed  25 August 2021 10:18:28

For the preparation of a DNUG Sametime on Docker/K8s workshop, I got the question how I would install Domino on Linux using current best practices.
    So I installed Domino step by step and noted all the different steps.

    I came up with the idea to bring this into a simple script that also re-uses functionality we already have in our Docker Community project.

    This includes finding the latest version and also helping you to find the software download.

I came up with this markdown document and also a simple-to-use script that you can either just use or adapt to your own needs (it's open source).

You find the step-by-step Domino on Linux guide in the referenced document.
    It also shows which steps the shell script performs automatically for you.

The software file contains all the information about current Domino versions and would install the latest official Domino 12.0.0 release.

    Of course you need the Domino Linux web kit available locally or on a shared network location where it can be automatically downloaded.

    https://github.com/IBM/domino-docker/blob/develop/start_script/install_domino.md


Here is the script. Again, as mentioned in another blog post: I would download it and have a look before executing a script via curl/bash ;-)
    But yes, this single command installs Domino and my start script, and configures everything else described in the document for you.


    curl -sL https://github.com/IBM/domino-docker/raw/develop/start_script/install_domino.sh | bash -
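
    If you prefer to inspect first, the equivalent download, review and run sequence looks like this:

    curl -sL -o install_domino.sh https://github.com/IBM/domino-docker/raw/develop/start_script/install_domino.sh
    less install_domino.sh
    bash install_domino.sh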


    If you are not a container fan, this would be the fastest way to install Domino.
    And once installed you can use Domino V12 One Touch Setup to configure your server.

    The script could also be an inspiration for your own automated installation.

What do you think? Is this helpful? What is missing?
    Right now it is just for the current version, but it can be customized.

    -- Daniel





    Notes 12.0.1 Beta 1 Client

    Daniel Nashed  24 August 2021 09:35:22

    There is a blog post with all the details of what is in Beta 1 --> https://blog.hcltechsw.com/domino/introducing-domino-v12-0-1-and-htmo-3-0-3-beta-program/
But there are 3 major things in 12.0.1 Beta 1 that I really want to outline from my side ...
    • "Domi" the new integration for Sametime Meetings, Zoom, Webex, Teams is part of the new mail template.
      It was available before with a template customizer and is now fully integrated.
    • The new look & fell in the workspace with high resolution images really looks a lot better.
      Some applications already got a new icon for beta 1.
      And the mail template got some other look & feel improvements like in the memo form
    • And the most awesome part is the advanced properties/resizable box coming from Panagenda.
      Thanks to Julian Robichaux for this master piece of properties box!

      You can not only display and copy, search, select data from documents and profile documents

      There is also a very flexible compare for two documents!
      Some might still know the old "difference of two documentes" add-in menu we lost many releases ago.
      Now we finally got something new and it's extremely well done implemented by a developer using it very day for his own work.
      You can tell when using it for a couple of minutes ;-)
    -- Daniel



    Image:Notes 12.0.1 Beta 1 Client




