Looking into S3 performance numbers for MinIO -- Is this the right target for backup?
Daniel Nashed – 16 March 2024 18:45:33
Introduction
I have known MinIO for a while and have been using it for DAOS T2 testing early on. Years later they have grown up and now play in the cloud native storage league.
Still, the devil is in the detail, and for production environments customers should hopefully use the enterprise subscription to get tuning support.
Once you pay for support, it isn't cheap storage any more if you look at their price tag.
S3 is an interesting technology. But it isn't the solution to all problems. "If you only have a hammer, every problem looks like a nail."
Coming from the cloud, it was designed by AWS as a "Simple Storage Service", as the name indicates. It also has built-in verification and optional encryption.
For sure it isn't useful for all types of operations. There is a certain overhead when you are not accessing the file system directly.
It also requires quite some additional resources, like CPU, if used extensively.
I am mainly interested in taking a look at it for Domino Backup.
But understanding the nature of the access is also important for understanding DAOS T2.
Especially when it comes to listing NLOs for resync operations (which is not part of this test setup).
Test the MinIO server
To scale MinIO, you need quite some hardware resources, as you can see from their report. My first test on a smaller machine failed because I ran out of memory.
When you look at their benchmark, they are running a cluster with a couple of nodes and multiple client drivers to generate the load.
For the backup use case, and also for DAOS T2, the performance of individual requests is more relevant.
https://min.io/resources/docs/MinIO-Throughput-Benchmarks-on-HDD-24-Node.pdf
MinIO used a Go-based test program from Wasabi for the benchmark above.
https://github.com/wasabi-tech/s3-benchmark
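If you want to reproduce the tests, the tool has to be built from source. This is only a rough sketch, assuming a Go toolchain is installed (the project predates Go modules, so depending on your Go version you might need a legacy GOPATH-style build instead):

# Clone and build the Wasabi s3-benchmark tool (sketch, not an official build recipe)
git clone https://github.com/wasabi-tech/s3-benchmark.git
cd s3-benchmark
go build -o s3-benchmark s3-benchmark.go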
I took a larger Hetzner cloud server with an Intel CPU and ran some quick tests.
The local disks at Hetzner are always fast SSDs, as you can see from the results.
The machine I used was pretty much untuned. MinIO even lists some interesting tuning parameters for their load test (this reminds me of the old Domino NotesBench results).
The basic command I used is the simple performance test also used in their workload.
Example:
./s3-benchmark -a s3-user -s s3-password -u http://127.0.0.1:9000 -t 1 -z 1M
The MinIO server is the plain MinIO Docker container without extra tuning, using a native disk.
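For reference, this is roughly how such an untuned instance can be started. A minimal sketch, assuming the standard MinIO container image, the default ports and the credentials from the example above (adjust the data path to your own disk):

# Start a plain MinIO container on a local disk without any tuning
docker run -d --name minio \
  -p 9000:9000 -p 9001:9001 \
  -e MINIO_ROOT_USER=s3-user \
  -e MINIO_ROOT_PASSWORD=s3-password \
  -v /local/minio-data:/data \
  quay.io/minio/minio server /data --console-address ":9001"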
I mostly used the 1 MB object size for testing, but changed it for two tests to see the difference when backing up larger databases.
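The other runs below just vary the thread count and object size via the -t and -z options. The 10 thread / 100 MB run should look something like this:

./s3-benchmark -a s3-user -s s3-password -u http://127.0.0.1:9000 -t 10 -z 100M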
The results of parallel operations with the client and the server on the same machine were impressive.
But the write operations for a single object were not great for smaller files.
Domino Backup uses a single thread to back up a database. You can see that with parallel write operations the box was able to handle up to 580 MB/sec write performance.
So the disk itself wasn't the bottleneck here. It's probably the overhead of starting each operation that causes the slower performance for a single write.
My test was completely local. This is the lowest network latency you can get.
In a modern network environment LAN latency should not play a big role here.
But usually machines only have a 1 GBit network connection, which caps transfers at roughly 125 MB/sec anyway.
Conclusion
~160 MB/sec for a single writer thread is probably the fastest you can get with S3.
I saw similar performance when uploading files to AWS S3 via the AWS CLI from an AWS-hosted machine.
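That kind of test is easy to reproduce with a plain single-stream upload. A simple sketch, with a hypothetical bucket name:

# Time a single upload of one large file (the bucket name is just an example)
time aws s3 cp mailfile.nsf s3://my-backup-bucket/mailfile.nsf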
For Domino backup, S3 does not buy you any simplification, and the performance might really depend on your database sizes and the tuning of your MinIO server.
In addition, S3 itself does not de-duplicate data, which is essential for Domino backup to simple storage.
When you look at https://blog.min.io/myths-about-deduplication-and-compression/ it really sounds like they have no interest in de-duplication at all.
But compression alone will not help for daily backups with something like 14 days of backup retention.
They are probably right about the general use case. But for daily backups, de-duplication is essential.
I don't see the benefit of using S3 if you have to install, support, tune and back it up on your own.
It's a different story in the cloud, where S3 is a native implementation (for example the AWS S3 infrastructure) and you consume it as a highly optimized service.
For a company, putting Domino backup on a MinIO S3 target increases the overhead and will potentially cost more than storing it on a simple ZFS de-duplicated share.
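For comparison, a de-duplicated ZFS dataset for backup data is a one-liner. A sketch with a placeholder pool name (keep in mind that ZFS de-duplication needs a lot of RAM for its dedup table):

# Create a backup dataset with de-duplication and compression enabled ("tank" is a placeholder pool name)
zfs create -o dedup=on -o compression=lz4 tank/domino-backup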
Also, when it comes to the backup performance of a simple file copy operation without de-duplication, a Hetzner 1 TB Storage Box for around 4 Euro/month can copy at 500 MB/sec without any additional CPU overhead.
-- Daniel
Test results
Threads: 1 / Object Size 1 MB
Loop 1: PUT time 60.0 secs, objects = 3617, speed = 60MB/sec, 60.3 operations/sec. Slowdowns = 0
Loop 1: GET time 60.0 secs, objects = 20158, speed = 336MB/sec, 336.0 operations/sec. Slowdowns = 0
Loop 1: DELETE time 9.0 secs, 402.1 deletes/sec. Slowdowns = 0
---
Threads: 1 / Object Size 100 MB
Loop 1: PUT time 60.0 secs, objects = 97, speed = 161.6MB/sec, 1.6 operations/sec. Slowdowns = 0
Loop 1: GET time 60.1 secs, objects = 589, speed = 980.1MB/sec, 9.8 operations/sec. Slowdowns = 0
Loop 1: DELETE time 0.2 secs, 415.6 deletes/sec. Slowdowns = 0
---
Threads: 10 / Object Size 1 MB
Loop 1: PUT time 60.0 secs, objects = 28594, speed = 476.5MB/sec, 476.5 operations/sec. Slowdowns = 0
Loop 1: GET time 60.0 secs, objects = 264612, speed = 4.3GB/sec, 4410.0 operations/sec. Slowdowns = 0
Loop 1: DELETE time 4.7 secs, 6093.5 deletes/sec. Slowdowns = 0
---
Threads: 10 / Object Size 100 MB
Loop 1: PUT time 60.4 secs, objects = 776, speed = 1.3GB/sec, 12.8 operations/sec. Slowdowns = 0
Loop 1: GET time 60.2 secs, objects = 3696, speed = 6GB/sec, 61.4 operations/sec. Slowdowns = 0
Loop 1: DELETE time 0.8 secs, 977.7 deletes/sec. Slowdowns = 0
---
Threads: 100 / Object Size 1 MB
Loop 1: PUT time 60.0 secs, objects = 34512, speed = 574MB/sec, 574.8 operations/sec. Slowdowns = 0
Loop 1: GET time 60.0 secs, objects = 308293, speed = 5GB/sec, 5137.5 operations/sec. Slowdowns = 0
Loop 1: DELETE time 20.6 secs, 1677.5 deletes/sec. Slowdowns = 0
---
Threads: 1000 / Object Size 1 MB
Loop 1: PUT time 60.2 secs, objects = 28628, speed = 475.5MB/sec, 475.5 operations/sec. Slowdowns = 0
Loop 1: GET time 60.1 secs, objects = 406941, speed = 6.6GB/sec, 6774.5 operations/sec. Slowdowns = 0
Loop 1: DELETE time 21.1 secs, 1359.6 deletes/sec. Slowdowns = 0