Docker - a new player in the LLM business
Daniel Nashed – 5 May 2025 09:30:28
Docker has a new feature in beta: running LLM models on Docker.
There is not much information about the underlying technology used.
But during installation you can see that it installs a llama-server (which Ollama and also Domino IQ use).
Here is a link to the official documentation --> https://docs.docker.com/model-runner/
Docker provides a registry for models. For example: https://hub.docker.com/r/ai/qwen3
To pull a model you just use the new "docker model" command. The following is a good small model to test with.
docker model pull ai/qwen3:0.6B-Q4_0
Once downloaded, you can list the available models:
docker model list
MODEL NAME          PARAMETERS  QUANTIZATION    ARCHITECTURE  MODEL ID      CREATED     SIZE
ai/qwen3            8.19 B      IQ2_XXS/Q4_K_M  qwen3         79fa56c07429  4 days ago  4.68 GiB
ai/qwen3:0.6B-Q4_0  751.63 M    Q4_0            qwen3         df9f2a333a63  4 days ago  441.67 MiB
There are multiple ways to access the AI components.
1. Command Line
From the command line you can just start a model, very similar to what Ollama does:
docker model run ai/qwen3:0.6B-Q4_0
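If I read the documentation correctly, you can also pass a prompt as an argument to get a single one-shot answer instead of the interactive chat (the prompt text is just an example):
docker model run ai/qwen3:0.6B-Q4_0 "Write one sentence about Docker"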
2. Within containers
From within containers you can just use the API endpoints against http://model-runner.docker.internal/
For example, the OpenAI-compatible chat completions endpoint:
POST /engines/llama.cpp/v1/chat/completions
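Here is a minimal sketch of a complete request against that endpoint from inside a container. The payload follows the usual OpenAI chat completions format; the model name and prompt are just examples:
curl http://model-runner.docker.internal/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ai/qwen3:0.6B-Q4_0",
        "messages": [
          {"role": "user", "content": "Say hello in one sentence."}
        ]
      }'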
3. Docker Socket
curl --unix-socket $HOME/.docker/run/docker.sock \
localhost/exp/vDD4.40/engines/llama.cpp/v1/chat/completions
4. Expose a TCP socket on the Docker host loopback interface
curl http://localhost:12434/engines/llama.cpp/v1/chat/completions
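As far as I can tell, the TCP endpoint is not enabled by default in the current beta. The documentation describes a Docker Desktop command along these lines to enable it (12434 is the default port used above):
docker desktop enable model-runner --tcp 12434
After that, the same chat completion request shown in the container example should work by just swapping the base URL to http://localhost:12434.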
First look results
This looks like a great new option to run LLM models.
For my first test it looked like it was not using my GPU.
But even on my very old ThinkPad (I will test with the new GPU machine later) the performance with this small model was OK.
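A simple way to check whether the GPU is actually used (assuming an NVIDIA card with the driver installed) is to watch nvidia-smi in a second terminal while a prompt is running:
watch -n 1 nvidia-smi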
This is just the beginning and there is more to discover. I just took a quick peek into it.
The integration into the registry alone, and having everything from one vendor, is interesting.
In addition, it is part of the Docker stack, so companies would not need to use an open-source project like Ollama directly.
This sounds like a smart Docker move to me.
Below are some screenshots from my test this morning.