Domino backup using a ZFS target in production
Daniel Nashed – 21 October 2021 04:24:40
Especially for hosted servers, Domino backup can be a challenge.
I am implementing and testing multiple ways to run backup. One I posted about earlier was a Borg Backup integration.
Borg Backup compresses and deduplicates data.
But there are other ways to optimize a storage back end for backup.
ZFS is a very cool and powerful file-system which provides many options that standard Linux file-systems like ext4 or XFS don't offer.
Besides very easy and flexible storage management and integrated RAID support, there are three other main factors I am interested in for backup:
- Compression
- Deduplication
- Very efficient snapshots
Snapshots would require the data disk of your server to run on ZFS; this will be a future integration I am working on for Domino Backup.
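Just to illustrate what that looks like on the ZFS side (the pool and dataset names are only examples -- my Domino data is not on ZFS yet), a snapshot is a single command and snapshots can be listed afterwards:
zfs snapshot nsh-pool/domino@before-backup
zfs list -t snapshot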
In this first integration I am focusing on using ZFS as the backup target file-system.
Similar to using it on a dedicated machine like a TrueNAS Scale server or other storage devices leveraging ZFS.
Leveraging ZFS allows you to store backups for multiple days in a very efficient way thanks to the integrated compression and deduplication.
Of course this means you have to optimize your maintenance operations and should run a database compact at most once a week, because a copy-style compact rewrites the databases and reduces the deduplication benefit!
A file-level back-end also allows the delta data, which can occur during online backup, to be merged automatically into the backup target databases on the ZFS file-system.
So your backup is always consistent in itself and you could copy it back even without the restore command.
Storage optimization first
My server uses DAOS and I am storing DAOS NLOs using the rclone tool on an encrypted repository remotely via SSH.
And I moved the NIF indexes out of the databases using NIFNSF. So I only need to take care of backing up the pure NSF data.
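Just as a rough sketch of those two pieces (the remote name, paths and values below are examples, not my exact configuration) -- the NLOs can be synced with a single rclone command and NIFNSF is controlled via notes.ini settings:
rclone sync /local/daos encrypted-remote:nlo-backup
NIFNSFEnable=1
NIFBasePath=/local/nif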
Storage back-end ZFS
To store backups I added another disk to my virtual machine, installed the ZFS storage driver on CentOS Linux and created the ZFS file-system.
Enabling ZFS and making sure it continues to work even if you update your machine to later kernels is a different story.
Let's assume for this article that we have ZFS installed and available. A TrueNAS appliance running in a VM, for example, would have ZFS installed and configured out of the box in the right way.
In my home lab I have a TrueNAS VM running on an ESXi server, serving different types of volumes to all my local servers and clients.
For the hosted server, a directly attached ZFS disk seemed like the easiest way for now.
Creating your disk pool from a single disk & adding a file-system and mount point
Creating a pool from a disk is a very simple step. You can extend the pool with additional disks later.
And you can add cache and log devices. Usually for a simple backup on a small environment a single disk would be a good starting point.
You would gain a lot of performance adding cache devices and multiple disks! But that would be a different type of article.
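For reference, extending an existing pool with another disk or adding a cache device later is just one command each (the device names are examples):
zpool add nsh-pool /dev/sdc
zpool add nsh-pool cache /dev/sdd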
If you are interested in the ZFS file-system the following presentation is a must watch/read --> https://papers.freebsd.org/2020/linux.conf.au/paeps_the_zfs_filesystem/
Let's create a simple pool and our backup file system.
zpool create nsh-pool /dev/sdb
zfs create -o mountpoint=/local/backup nsh-pool/backup
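To verify the pool and the new file-system you can check the pool status and list the dataset:
zpool status nsh-pool
zfs list nsh-pool/backup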
Enable dedup per volume
Deduplication takes CPU and RAM resources (especially for larger file-systems).
On the other hand it is very efficient and helpful. For a Domino server data file-system this would not perform well.
For a backup target the performance is good and it allows backup data to be stored in an optimized way.
zfs set dedup=on nsh-pool/backup
Enable compression per volume
Compression takes almost no performance. The resources spent on compression are offset by the disk I/O they save.
So compression should always be enabled.
zfs set compression=on nsh-pool/backup
Disable atime
Like other file-systems, ZFS stores the access time of files. As a good practice this should also be disabled to avoid additional write requests for information that is useless for Domino or backup.
In contrast to other file-systems this isn't a mount option. All operations are performed using the zfs command line tool.
zfs set atime=off nsh-pool/backup
Get volume properties
The zfs command can also be used to query single or all parameters at run-time.
There are a lot of parameters you can set and query. That would also be a blog post on its own.
zfs get all nsh-pool/backup
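For example, to check only the settings and results relevant for this setup instead of all properties:
zfs get compression,compressratio,dedup,atime nsh-pool/backup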
Server file-systems
After introducing the ZFS pool and volume my main file-systems look like this:
df -h
Filesystem       Size  Used  Avail  Use%  Mounted on
/dev/sda1         38G   28G   8.0G   78%  /
nsh-pool          53G  128K    53G    1%  /nsh-pool
nsh-pool/backup   68G   15G    53G   22%  /local/zfs/backup
Configuring Domino Backup
Loading the Domino Backup task once creates the configuration database from the template.
After creating the database you simply change the default configuration and point it to your backup ZFS file-system location to leverage the standard file-copy integration.
This is the simple out-of-the-box configuration integrated into Domino Backup, which works well for storage back-ends like ZFS.
Once configured, the Domino Backup application also takes care of managing the backups. In my case I keep the 7 day retention time.
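For reference, the configuration database dominobackup.nsf is created on the first load of the task, and a backup can also be started manually from the server console (this is just the basic command, without any options):
load backup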
Scheduling backup
I decided to run backup on one cluster node at 12:00 and on the other cluster mate at 18:00 each day.
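One simple way to schedule this (assuming you use a standard program document; the values shown are just an example matching the times above) is a program document per server:
Program name: backup
Enabled/disabled: ENABLED
Run at times: 12:00 PM
Days of week: Sun, Mon, Tue, Wed, Thu, Fri, Sat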
That's all you have to configure to get your backup up and running.
Backup logs
After 4 days my backup repository is filled with backups from the last few days.
Domino Backup takes care of pruning older backups automatically when the retention time is reached. So this will be a rolling window.
Backup results on storage level
My server is a small box hosted at my favorite provider. It's not a large production environment but shows the effect of ZFS on your backup storage.
In my case I see a great storage reduction on this server. Another server only has a factor of 1.5. I would not expect dramatic compression benefits on a larger server.
It might be that the free space I have in my databases is causing the better compression level.
Domino by default overwrites free space with a defined pattern, which can be compressed quite well.
zfs get all nsh-pool/backup|grep compress
nsh-pool/backup compressratio 2.61x -
nsh-pool/backup compression on inherited from nsh-pool
The more interesting part here is that for the 4 days of backup I already have a deduplication rate of around 3.5!
And this is also what I have seen at customers using other types of deduplication, like Cohesity backup.
zpool list
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
nsh-pool  59.5G  4.34G  55.2G        -         -    5%   7%  3.49x  ONLINE  -
Conclusion
ZFS storage is a very good way to store Domino backup data. It provides an easy to use interface without complex integration.
You just run your backup after pointing it to the right storage pool. And this is almost a fire-and-forget backup -- fully self-maintained.
Of course this will copy all data for a full backup to ZFS storage. But the physical size of the backup data does not increase dramatically if you apply the right Domino database maintenance operations.
I am also looking into ZFS storage snapshots with Domino running on ZFS. But that would be a completely different and more complex approach to set up.
This quick article is more to get you up and running with backup in a simple way and show what I am currently using in production as a first step.