Domino on Linux/Unix, Troubleshooting, Best Practices, Tips and more ...


Daniel Nashed

Simple LLM performance test for different GPU and CPU hardware

Daniel Nashed – 26 April 2025 23:13:48

The LLAMA C++ project is the base for many LLM solutions.
It is also the base for Ollama. Picking the right hardware can be quite tricky.

The project has a simple performance test. I took a quick look, picking a small model which runs quite well even on modern CPU hardware.
With modern CPU hardware you can get quite OK performance for small models with a few parallel requests.

The following is what I tested today on different hardware.
Modern Intel CPUs are good for simple testing.

Apple Silicon is already better. A modern NVIDIA GPU -- even on a notebook -- has way better performance.

Small enterprise GPUs have better performance even for small LLMs.

Of course for larger models it is also a matter of GPU RAM.
But in this case it is the pure compute performance that is compared in this simple test.

This simple test already gives a direction of what type of performance you can expect in general with modern hardware.
I might redo the test with older GPUs, AMD CPUs and older NVIDIA cards just to get an idea.

But I would also like to hear about your experience. Especially if you have access to NVIDIA enterprise hardware like an H100.



I gave up formatting this richtext. The Domino blog template is unbelievably broken. I should move, but I don't want to lose my blog history.

When saving the document it is always messed up again. It removes and adds new lines in a very weird way when saved.


-- Anyhow here is the text --



All the tests have been performed on Linux with this simple command:



./llama-bench -m qwen2.5-0.5b-instruct-q3_k_m.gguf
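A hedged variant for deeper testing, assuming the standard llama-bench flags (-p sets the prompt length, -n the number of generated tokens) as listed by `./llama-bench --help` in current builds:

```shell
# Assumption: -p / -n flags as documented by ./llama-bench --help
./llama-bench -m qwen2.5-0.5b-instruct-q3_k_m.gguf -p 1024 -n 256
```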



Hosted server Intel Xeon Processor (Icelake)
model                  | size       | params   | backend | ngl | test  | t/s
qwen2 1B Q3_K - Medium | 406.35 MiB | 630.17 M | RPC     | 99  | pp512 | 196.84 ± 0.50
qwen2 1B Q3_K - Medium | 406.35 MiB | 630.17 M | RPC     | 99  | tg128 | 64.26 ± 0.20



Proxmox 12th Gen Intel(R) Core(TM) i9-12900HK
model                  | size       | params   | backend | ngl | test  | t/s
qwen2 1B Q3_K - Medium | 406.35 MiB | 630.17 M | RPC     | 99  | pp512 | 352.61 ± 25.50
qwen2 1B Q3_K - Medium | 406.35 MiB | 630.17 M | RPC     | 99  | tg128 | 130.07 ± 2.84

Apple M4


model                  | size       | params   | backend        | threads | test  | t/s
qwen2 1B Q3_K - Medium | 406.35 MiB | 630.17 M | Metal,BLAS,RPC | 4       | pp512 | 2294.79 ± 45.80
qwen2 1B Q3_K - Medium | 406.35 MiB | 630.17 M | Metal,BLAS,RPC | 4       | tg128 | 150.85 ± 2.63


NVIDIA GeForce RTX 4060 Laptop GPU, compute capability 8.9, VMM: yes


model                  | size       | params   | backend | ngl | test  | t/s
qwen2 1B Q3_K - Medium | 406.35 MiB | 630.17 M | CUDA    | 99  | pp512 | 16827.08 ± 228.85
qwen2 1B Q3_K - Medium | 406.35 MiB | 630.17 M | CUDA    | 99  | tg128 | 288.66 ± 1.67

NVIDIA RTX 4000 SFF Ada Generation, compute capability 8.9, VMM: yes


model                  | size       | params   | backend | ngl | test  | t/s
qwen2 1B Q3_K - Medium | 406.35 MiB | 630.17 M | CUDA    | 99  | pp512 | 21249.40 ± 86.94
qwen2 1B Q3_K - Medium | 406.35 MiB | 630.17 M | CUDA    | 99  | tg128 | 363.51 ± 1.70





 Domino  NVIDIA  Docker  WSL  DominoIQ 

How to use NVIDIA GPUs on a Windows notebook with Linux

Daniel Nashed – 24 April 2025 22:42:35

The following is actually a "run this at home" instead of a "don't try this at home".

I have been playing around with VMware workstation today. It turns out that VMware Workstation 17.6 Pro can provide 3D acceleration.

But it can't do a true vGPU passthrough from what it looks like.


On the other hand, running LLMs on Linux is quite desirable.
Instead of running Linux on VMware, it makes a lot of sense to use WSL2 anyhow.

But also for GPUs WSL2 is the right choice as you can see below.


WSL + Ubuntu = NVIDIA GPU support


If you have the right drivers installed, you can use the GPU on Windows (for example with Ollama) and also access it in WSL at the same time, sharing the card.

Usually you want to run Ollama only once and share the loaded models via REST requests.


But it would be possible to run Ollama inside an Ubuntu WSL instance.

Not only that. With the right drivers you can also run Docker containers in WSL that expose the GPU.
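A quick way to verify this is the canonical NVIDIA sample container; a sketch assuming the NVIDIA Container Toolkit is installed inside the WSL distribution and that this image tag is still published:

```shell
# Runs nvidia-smi inside a CUDA base container with the GPU exposed
# (assumes the NVIDIA Container Toolkit is configured for Docker)
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```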


WSL + Ubuntu + Docker = better NVIDIA GPU support


There is one special case where you even need a Docker container in this context.

The very useful nvtop tool (included in Ubuntu) does not run on WSL2.


The reason is that WSL does not expose the low level hardware which nvtop relies on.

nvidia-smi (a tool shipped by NVIDIA to query GPU information) works in the WSL Ubuntu instance.

But nvtop only works inside a container, because the NVIDIA drivers for the Docker engine provide the full support.
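A minimal sketch of running nvtop from a container; the Ubuntu image tag and the apt package name are assumptions, and the GPU must be exposed via the NVIDIA Container Toolkit:

```shell
# Install and start nvtop inside an Ubuntu container with the GPU exposed
docker run --rm -it --gpus all ubuntu:24.04 \
  bash -c "apt-get update && apt-get install -y nvtop && nvtop"
```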


You can run a wild mix on the Windows host, the WSL Ubuntu instance and a Docker container running Ubuntu inside WSL.


Usually you pick one way to run LLMs. But it is good to understand all different ways for your specific use case.


I am running all of those in parallel for testing on my lab notebook.


Now that Domino IQ shipped with external GPU support, those options might become more interesting for you.

But you can also run Ollama without GPU support on modern CPUs with quite decent performance for a lab environment.


With a GPU, Domino on Linux could run natively inside the WSL Ubuntu instance or on Docker inside WSL.


Windows --> WSL2 Ubuntu 24.04 --> Docker 28.x -->  Ubuntu 24.04 container --> Domino 14.5 EA3


I am not providing step by step instructions for installing drivers, because this might change over time and the versions change.


You can for example ask ChatGPT to provide detailed steps -- which are actually pretty good.




Image:How to use NVIDIA GPUs on a Windows notebook with Linux



 Domino 

Domino 14.5 EA3 shipped yesterday

Daniel Nashed – 23 April 2025 21:26:05

The final beta for Domino 14.5 shipped yesterday.
I was quite busy with installing software and preparing the container image.

There were also other updates at the same time, like Domino Leap, the REST API and the Domino C-API 14.5 EA3.


Adding new versions isn't just changing the software.txt lines in the project.

Everything gets re-tested with an automation script including the REST API combinations for Domino 12 and 14.


But there was more I had to prepare. The Domino IQ Lab database got updated to the new functionality in Domino 14.5 EA3.

And there was also a web cast for Domino 14.5 EA3 today.


The webcast could not show all the details of the functionality added.


You should really look into the Early Access help and join us in the forum:


-->
https://hclsw.co/domino-14-5-eap-forum

The forum contains all the links and information to the new and updated features.


After updating all of my servers, I am looking into features like the new Domino License Dashboard (DLA).


The upcoming Engage conference will have sessions for all of the new features.

At the DNUG conference the following month we are planning sessions and workshops.


But really you want to look into the Early Access forum to get current information and provide final feedback.


I am looking forward to seeing many of you at the conferences. But please also have a look into the forum.


All developers are looking into the forum and are looking forward to your feedback and questions.

We could not answer all questions in the detail we would want in a one-hour webinar.

That's where the forum can help.


-- Daniel



 VMware 

VMware Workstation 17.6 free for private and commercial use

Daniel Nashed – 23 April 2025 21:15:10

Friedhelm Klein pointed out in a comment to my last post that VMware Workstation and also Fusion on Mac are free to use nowadays.

I did know that but didn't want to look into it, because registering at Broadcom and their website are not that easy to handle.


But actually VMware Workstation Pro and Fusion are great products.
I gave it another try on my new demo notebook, which is by the way an Asus gaming notebook for AI work.


Besides WSL and Docker Desktop it is now running VMware Workstation Pro.
Friedhelm and others reported that Fusion works great on the Mac (which will be my next install).


If you are coming to DNUG conference you will see my full setup in the DNUG Lab.

And probably also in the Domino IQ workshop.

I might not bring it with me to Engage in May, because I only have an AutoUpdate session.

My new Thinkpad will probably also get VMware Workstation Pro.


But my home lab has other components for virtual machines.

One of the machines is running Proxmox, which is a great option for server based environments.


For notebooks including GPU hardware, VMware Workstation Pro is a good idea.


You can either register and download it from the free software section.

Or you can use Chocolatey to get it via the command-line. I decided to download it from the software support site.

Thanks Friedhelm for making me take another look at my good old friend VMware Workstation, which I used for many years.
It has many advantages over other solutions, including snapshots and cloning.

Also the installation of Windows is optimized. I got my two new servers set up in minutes.


Here are the two relevant links

Product page

https://www.vmware.com/products/desktop-hypervisor/workstation-and-fusion

Download Link for free downloads

https://support.broadcom.com/group/ecx/free-downloads

I just got someone to send me the exact link for the download.
Actually I had someone else helping me find them in the UI earlier.
The direct link is very helpful. Navigate this page to the VMware downloads at the end.


-- Daniel


Image:VMware Workstation 17.6 free for private and commercial use
 Ollama 

Ollama keep models loaded for longer than 5 minutes idle time

Daniel Nashed – 20 April 2025 11:24:15

Ollama keeps loaded models in memory for 5 minutes if they are idle.

When you use the API to load a model, you can specify per model how long it stays in memory.

But it would probably be a good idea to change the default time as well.
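A hedged example of the per-request setting, assuming the documented Ollama REST API on its default port 11434 and that the qwen2.5:0.5b model has already been pulled:

```shell
# Load a model and keep it resident for 30 minutes (keep_alive per request)
curl http://localhost:11434/api/generate -d '{"model": "qwen2.5:0.5b", "keep_alive": "30m"}'
```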


The setting is controlled via an environment variable.
For a container image you would just pass it as a parameter:


-e "OLLAMA_KEEP_ALIVE=30m"


For Ollama running as a Linux service, you have to add it to the systemd service like this:


/etc/systemd/system/ollama.service


Environment="OLLAMA_KEEP_ALIVE=30m"


The default unit is seconds. You can specify a unit behind the idle shutdown delay (like m for minutes).
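After editing the unit file, systemd has to pick up the change; a sketch assuming Ollama runs as the standard systemd service named ollama:

```shell
# Reload unit files and restart the service so the new environment takes effect
sudo systemctl daemon-reload
sudo systemctl restart ollama
```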


I just looked it up and set it on my servers.


Ollama loads LLMs automatically if they have been downloaded.

Dynamic loading of LLMs is quite useful. But loading a model again after 5 minutes of idle time would delay requests if the LLM has been unloaded.


You can also configure how many models should be loaded at most via OLLAMA_MAX_LOADED_MODELS (the default is 3).

Usually you can't run too many models at the same time. But if you are testing with smaller models, this still might be important.
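Both settings combined in a container start; a sketch assuming the official ollama/ollama image and its default port:

```shell
# Keep up to 5 models resident, each for 30 minutes of idle time
docker run -d -p 11434:11434 \
  -e "OLLAMA_KEEP_ALIVE=30m" \
  -e "OLLAMA_MAX_LOADED_MODELS=5" \
  ollama/ollama
```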



-- Daniel



Example for some smaller models all loaded in parallel:

ollama ps

NAME             ID              SIZE      PROCESSOR    UNTIL
qwen2.5:0.5b     a8b0c5157701    1.3 GB    100% GPU     29 minutes from now
granite3.2:2b    9d79a41f2f75    3.2 GB    100% GPU     29 minutes from now
gemma3:1b        8648f39daa8f    1.9 GB    100% GPU     27 minutes from now


 Domino 

Get ready for Domino 14.5 EA3

Daniel Nashed – 20 April 2025 10:08:08

The final 14.5 EA3 Early Access version is about to be released next week.
We can expect updates in Domino IQ and a couple of other features.
The EAP Forum is the place to get all up to date information and to ask questions --> https://hclsw.co/domino-14-5-eap-forum

I am planning to provide an updated version of the Domino IQ Lab database and to post some examples to help with the new functionality we can expect.

There is also a webinar on Wednesday you should join:

Date: April 23, 2025 Time: 10 AM - 11 AM EDT
Register Now:
https://register.gotowebinar.com/register/5527396029191085664

You should really join the webinar and the forum.

If you are on a Domino 14.5 EA release you can leverage AutoUpdate to update to the latest version automagically on Windows and Linux.

See you in the forum...


 ACME  Go 

Creating ACME certificates with Go

Daniel Nashed – 20 April 2025 08:25:24

"LEGO" is a full featured ACME implementation including DNS-01 challenges.

But if you just need basic ACME functionality for HTTP-01 requests, there are modules available directly from Go.


"golang.org/x/crypto/acme/autocert" provides an easy to use interface, which works hand in hand with the Go web-server functionality.


The following simple program creates a certificate and starts a simple demo web-server.
This brings ACME directly into your application without any extra tools if you are working with Go.


Reference:
https://pkg.go.dev/golang.org/x/crypto/acme/autocert


package main

import (
    "crypto/tls"
    "log"
    "net"
    "net/http"
    "os"
    "strings"

    "golang.org/x/crypto/acme"
    "golang.org/x/crypto/acme/autocert"
)

func main() {

    // Determine the local host name and resolve it to a FQDN if possible
    szHostname, err := os.Hostname()
    if err != nil {
        log.Println("Error getting hostname:", err)
        return
    }

    szFQDN, err := net.LookupCNAME(szHostname)
    if err == nil {
        szHostname = strings.TrimSuffix(szFQDN, ".")
    }

    log.Println("Local Hostname:", szHostname)

    manager := &autocert.Manager{
        Cache:      autocert.DirCache("certs"), // Local cert cache
        Prompt:     autocert.AcceptTOS,
        HostPolicy: autocert.HostWhitelist(szHostname),

        // Use the Let's Encrypt staging directory for testing
        Client: &acme.Client{
            DirectoryURL: "https://acme-staging-v02.api.letsencrypt.org/directory",
        },
    }

    server := &http.Server{
        Addr: ":443",
        TLSConfig: &tls.Config{
            GetCertificate: manager.GetCertificate,
        },
        Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            w.Write([]byte("Hello, Staging HTTPS world!"))
        }),
    }

    // Serve the HTTP-01 challenge and redirect HTTP to HTTPS
    go func() {
        log.Fatal(http.ListenAndServe(":80", manager.HTTPHandler(nil)))
    }()

    log.Println("Starting HTTPS server with Let's Encrypt staging certs...")
    log.Fatal(server.ListenAndServeTLS("", ""))
}

Using a proxy to optimize downloads and packet updates

Daniel Nashed – 20 April 2025 23:32:52

Proxying downloads can be tricky. If the resource is HTTPS, there isn't an easy way, because the connection is encrypted.
But for HTTP resources a Squid proxy can cache downloads very effectively.


Cache HTTP resource


Linux distributions often use HTTP resources, because the software itself is signed.
This opens the door for effective caching.

I have pointed my Linux machines and Docker hosts to my Squid proxy.

The cache not only dramatically reduces the downloaded data on a second download.
It also reduces the download time dramatically.

Some hot data might even come from memory as you see in my test below.

Caching should be set up with care. Not everything should be cached for a longer time.
*.deb packages can be safely cached, because an updated package has a different file name.
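A minimal squid.conf sketch for this kind of setup; the cache sizes and paths are assumptions, and the refresh_pattern line uses Squid's standard min/percent/max syntax (values in minutes):

```
# Cache .deb packages for up to 90 days -- safe because updated packages get new file names
maximum_object_size 512 MB
cache_dir ufs /var/spool/squid 20000 16 256
refresh_pattern -i \.deb$ 129600 100% 129600
```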



1745101322.369    485 172.17.0.2 TCP_OFFLINE_HIT/200 42288974 GET
http://archive.ubuntu.com/ubuntu/pool/universe/o/openjdk-lts/openjdk-11-jre-headless_11.0.26%2b4-1ubuntu1%7e24.04_amd64.deb - HIER_NONE/- application/vnd.debian.binary-package
1745101322.370      0 172.17.0.2 TCP_MEM_HIT/200 209690 GET
http://archive.ubuntu.com/ubuntu/pool/universe/o/openjdk-lts/openjdk-11-jre_11.0.26%2b4-1ubuntu1%7e24.04_amd64.deb - HIER_NONE/- application/vnd.debian.binary-package


Cache container images


Container registries mostly use HTTPS. In that case a registry with caching support like the Harbor registry could make sense.



Cache MyHCLSoftware downloads


HCL Software downloads, like many other software downloads, are HTTPS only.
One of the main reasons is that authentication is required.

The Domino Download Server project provides a way to download software, which then can be locally accessed.
A Domino Download Server could be the source for internal downloads and container builds, and even the Domino AutoUpdate functionality can leverage the Download Server.



https://github.com/nashcom/domino-startscript/tree/develop/domdownload-server

Conclusion


No matter how fast your internet connection is, a local download is always faster and reduces resource usage.


 ACME  GO 

Let’s Encrypt ACME implementation in GO -- LEGO

Daniel Nashed – 19 April 2025 19:13:27
Image:Let’s Encrypt ACME implementation in GO -- LEGO

Domino CertMgr has full ACME / Let's Encrypt integration.
But there might be cases where you need certificates for other servers.

I discovered an interesting project when looking for an ACME client for Go.


The project offers:


  • A command-line interface
  • Go Libs to use in your own applications
  • A container image

Besides HTTP-01 challenges, the project supports many DNS-01 integrations with references to the APIs used.

Especially when setting up a new environment, HTTP-01 challenges are important.
Getting a certificate uses a simple command-line, which is also available on Docker.


Here is a simple Docker example:


docker run --rm \
  -v $(pwd)/data:/data \
  -p 80:80 \
  goacme/lego \
  --http \
  --domains www.acme.com \
  --email info@acme.com \
  --accept-tos \
  --server https://acme-staging-v02.api.letsencrypt.org/directory \
  --path /data \
  run


It's a well done integration with many options and it is easy to use at the same time.
This should not replace any of your existing Domino CertMgr flows, but it can help you in one or the other case.

On top of that, the project name is pretty cool. Not sure how they get away with this name.










 Docker  WSL  Hyper-V  UTM 

Choosing the Right Desktop Virtualization Solution for Mac and Windows

Daniel Nashed – 19 April 2025 13:58:21

Here is an experiment. I told ChatGPT what I'd like to say and just fine-tuned the result.

ChatGPT is really helpful these days -- if you ask the right questions, it returns great results on all sorts of topics.

The following blog post is co-authored with ChatGPT.


Especially on Windows, the world has changed a bit. For most people (including me) VMware is not an option any more, even though it is free for personal use nowadays.

I was a big fan of Windows Sandbox. But with Windows 11 and other software installed, it became quite unstable for me.


Even though I never liked Hyper-V on the server side, as an easy local solution it plays well with other virtualization built on the same foundation, like WSL.

WSL2 has been a game changer for me for quite a while -- not only for Docker development. I am working with it every day and it is well integrated into the Windows stack both ways:
From Windows to access Linux resources and from Linux to access Windows resources.


I would be interested to hear which VM solutions you use in your environment and why.


Daniel



Introduction


As development environments shift toward containers and platform-agnostic workflows, desktop virtualization remains an essential tool -- especially for developers, system administrators, and DevOps engineers.

Whether you're working on macOS or Windows, selecting the right virtualization solution is critical for performance, compatibility, and efficiency.
This post highlights the best options for virtualization on Mac and Windows, with a clear breakdown of commercial and free tools, and a special focus on Apple Silicon Macs and Windows ARM-based systems.



--- Virtualization on macOS ---


Apple’s transition to Apple Silicon has changed how virtualization works.

These systems only support ARM-based operating systems natively, which affects tool compatibility, including some legacy enterprise software.


Best Commercial Option: Parallels Desktop


Parallels remains the most refined, high-performance commercial virtualization tool for macOS.

It supports Windows 11 ARM, various Linux distros, and offers excellent Apple Silicon support with fast boot times and tight macOS integration.


Best Free Option: UTM


UTM is a solid open-source alternative. It uses QEMU under the hood and can run ARM and emulated x86 systems (though x86 emulation is slower). It’s a great option for personal projects or basic dev/testing needs.


Commercial Container Option: Docker Desktop for Mac


Docker Desktop is widely used for container-based development. On Apple Silicon, containers must be ARM-based, and some legacy images may not work without modification.

Docker Desktop is free for small teams but requires a paid subscription for larger organizations (250+ employees or $10M+ in revenue).

Important for Domino users: Domino agents on Linux do not work on Apple Silicon Macs due to the lack of x86 compatibility.


Free Container Option: Rancher Desktop


Rancher Desktop is an open-source alternative to Docker Desktop, supporting Kubernetes and containerd. It’s a great choice for developers who want to avoid licensing fees while still getting a container-native workflow.



--- Virtualization on Windows ---


Windows offers a rich set of virtualization and containerization tools—many built-in or easily installed. With support for both x86 and ARM editions of Windows 11, your choice depends on the host hardware and guest OS requirements.


Native VM Option: Hyper-V


Since VMware Workstation is off the table for me (even though it is now free, it is hard to get), Hyper-V seems to be the best option.


Hyper-V is Microsoft’s built-in hypervisor for Windows 10 Pro and all versions of Windows 11 with Pro or Enterprise licensing. It supports a wide range of guest OSes and is ideal for more traditional VM workloads.


Note: If you’re running Windows on ARM (like on a Surface Pro X), you’ll need to use Windows 11 ARM Edition for your guest VMs.



Virtualizing Linux on Windows


For developers focused on Linux environments, Windows offers multiple high-performance options that don’t require full VMs.


WSL2 with Ubuntu (Recommended)


WSL2 (Windows Subsystem for Linux v2) with Ubuntu is the best way to run a full Linux environment on Windows. It uses a lightweight VM backend and provides near-native performance. Perfect for dev work, scripting, and server emulation.
  • Fast startup
  • Deep Windows integration (file system access, network)
  • No need for dual-boot or heavy VM management


Docker Desktop with WSL2 Integration


Docker Desktop integrates seamlessly with WSL2, making it easy to run containers inside your Linux subsystem. This hybrid model gives you the best of both worlds: native Docker CLI in Linux with a Windows GUI when needed.
  • Preferred by most container-based developers
  • Easy setup and Kubernetes support
  • Commercial licensing applies to larger organizations

Docker Daemon in WSL2 (Manual Setup)


For power users or those avoiding Docker Desktop licensing, you can install the Docker Engine directly inside a WSL2 instance (e.g., Ubuntu). This gives you full Docker CLI functionality without Docker Desktop.
  • Lightweight, manual setup
  • Avoids licensing fees
  • Ideal for scripting or custom CI/CD environments
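A minimal sketch of that manual setup inside an Ubuntu WSL2 instance; the distro package name docker.io and the service handling are assumptions that may vary by release:

```shell
# Install the distribution's Docker engine inside WSL2 Ubuntu
sudo apt-get update
sudo apt-get install -y docker.io
sudo usermod -aG docker "$USER"   # log out and back in to pick up the group
sudo service docker start         # needed if the WSL distro does not run systemd
docker run --rm hello-world       # smoke test
```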


Bonus Tip: Docker as a "Mini-VM" with Red Hat UBI


If you're looking for VM-like behavior within containers, check out the Red Hat UBI init image. This image runs systemd and other init services, making it behave much more like a traditional Linux VM.

Perfect for testing services that rely on init behavior or for more complex container orchestration needs.
With the UBI init image, containers feel a lot closer to VMs—great for simulating production-like environments.
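A hedged sketch of the UBI init pattern; the image path is the Red Hat registry's UBI 9 init image, and the tmpfs mounts are a common requirement for running systemd under Docker that may differ in your setup:

```shell
# Run systemd as PID 1 inside the UBI 9 init image
docker run -d --name ubi-init \
  --tmpfs /run --tmpfs /tmp \
  registry.access.redhat.com/ubi9-init

# Interact with it like a small VM
docker exec -it ubi-init systemctl status
```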


Comparison Table
Platform | Virtualization Tool   | Type                 | Notes
Mac      | Parallels Desktop     | Commercial VM        | Best for Apple Silicon
Mac      | UTM                   | Free VM              | Great for light use, ARM-based only
Mac      | Docker Desktop        | Commercial Container | ARM-only; licensing applies
Mac      | Rancher Desktop       | Free Container       | Good for Kubernetes users
Windows  | Hyper-V               | Free VM              | Built-in on Pro/Enterprise
Windows  | WSL2 + Ubuntu         | Free Linux Env       | Best Linux dev experience
Windows  | Docker Desktop + WSL2 | Commercial Container | Powerful, GUI-based
Windows  | Docker Engine in WSL2 | Free Container       | Lightweight, CLI-focused





Conclusion


Virtualization and containerization have matured into robust, flexible options for Mac and Windows users alike. Just remember:

  • On Apple Silicon, you're locked into ARM-compatible guests.
  • On Windows ARM, use Windows 11 ARM Edition for VMs.
  • For container workloads, Docker + WSL2 (on Windows) and Docker or Rancher (on Mac) offer clean, modular workflows.

-- Links --


macOS Virtualization Tools


Parallels Desktop

https://www.parallels.com/products/desktop/

UTM (Free)

https://mac.getutm.app/

Docker Desktop for Mac (Commercial)

https://www.docker.com/products/docker-desktop/

Rancher Desktop (Free)

https://rancherdesktop.io/


Windows Virtualization Tools


Hyper-V (Built-in)

https://learn.microsoft.com/en-us/windows-server/virtualization/hyper-v/hyper-v-overview

WSL2 (Windows Subsystem for Linux)

https://learn.microsoft.com/en-us/windows/wsl/

Docker Desktop for Windows (Commercial)

https://www.docker.com/products/docker-desktop/

Docker Engine in WSL2 (Free)

https://docs.docker.com/engine/

