Domino on Linux/Unix, Troubleshooting, Best Practices, Tips and more ...


Daniel Nashed

Docker - a new player in the LLM business

Daniel Nashed – 5 May 2025 09:30:28

Docker has a new feature in beta: running models on Docker.
There is not much information about the underlying technology used.

But during installation you can see that it installs llama-server (which Ollama and also Domino IQ use).


Here is a link to the official documentation -->
https://docs.docker.com/model-runner/
Docker also provides a registry for models. For example:
https://hub.docker.com/r/ai/qwen3

To pull a model, you just use the new model command. The following is a good small model for testing:


docker model pull ai/qwen3:0.6B-Q4_0



Once downloaded, you can list the models:


docker model list

MODEL NAME          PARAMETERS  QUANTIZATION    ARCHITECTURE  MODEL ID      CREATED     SIZE
ai/qwen3            8.19 B      IQ2_XXS/Q4_K_M  qwen3         79fa56c07429  4 days ago  4.68 GiB
ai/qwen3:0.6B-Q4_0  751.63 M    Q4_0            qwen3         df9f2a333a63  4 days ago  441.67 MiB



There are multiple ways to access the AI components.


1. Command Line


From the command line you can just start a model, very similar to what Ollama does:


docker model run ai/qwen3:0.6B-Q4_0
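You can also pass a prompt directly instead of starting an interactive chat, very much like ollama run. As far as I can see from the beta documentation, this looks roughly like this:


docker model run ai/qwen3:0.6B-Q4_0 "Give me a short fact about whales."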



2. Within containers


From within containers you can use the API endpoints directly against:
http://model-runner.docker.internal/

For example, the OpenAI-compatible endpoint:


POST /engines/llama.cpp/v1/chat/completions
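A minimal request from inside a container could look like this. The body follows the usual OpenAI chat completion format; the model name assumes the small Qwen3 model pulled above:


curl http://model-runner.docker.internal/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ai/qwen3:0.6B-Q4_0",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Say hello in one sentence."}
        ]
      }'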



3. Docker Socket


curl --unix-socket $HOME/.docker/run/docker.sock \
   localhost/exp/vDD4.40/engines/llama.cpp/v1/chat/completions

The same JSON headers and request body as in the container example above apply here as well.



4. Expose a TCP socket on the Docker host loopback interface


curl http://localhost:12434/engines/llama.cpp/v1/chat/completions
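The TCP endpoint might not be enabled by default. The Model Runner documentation describes enabling host-side TCP access on Docker Desktop roughly like this (the command and port are taken from the docs and may still change during the beta):


docker desktop enable model-runner --tcp 12434


After that, the curl request works with the same JSON headers and body as in the container example above.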


First look results


This looks like a great new option for running LLMs.


For my first test it looked like it was not using my GPU.

But even on my very old ThinkPad (I will test on the new GPU machine later), the performance with this small model was OK.


This is just the beginning, and I only took a quick peek into it.


There is more to discover. The integration into the registry alone, and having everything from one vendor, is interesting.

In addition, it is part of the Docker stack, so companies would not need to use an open-source project like Ollama directly.


This sounds like a smart Docker move to me.


Below are some screenshots from my test this morning.



Image: Docker - a new player in the LLM business


Image: Docker - a new player in the LLM business

Image: Docker - a new player in the LLM business

