Ollama - latest small LLMs that work well on modern CPUs
Daniel Nashed – 1 May 2025 16:29:14
Domino IQ now supports external LLMs. One of the best ways to run LLMs is Ollama.
Running on an NVIDIA GPU is the best-performing option.
Running on a Mac with Apple Silicon is probably the second-best option.
But if you choose a small model, performance is quite OK on a modern CPU, such as a 12th or 13th generation Intel Core.
The following recent models will work well.
For external LLMs you will need TLS, which can be implemented with a simple NGINX reverse proxy setup, for example.
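A minimal sketch of such a proxy configuration could look like the following. The server name and certificate paths are placeholders you have to adjust for your environment; Ollama listens on 127.0.0.1:11434 by default.

server {
    listen 443 ssl;
    server_name llm.example.com;                    # placeholder hostname

    ssl_certificate     /etc/nginx/certs/llm.pem;   # placeholder certificate
    ssl_certificate_key /etc/nginx/certs/llm.key;   # placeholder key

    location / {
        proxy_pass http://127.0.0.1:11434;          # default Ollama address
        proxy_set_header Host $host;
        proxy_read_timeout 300s;                    # LLM responses can take a while
    }
}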
qwen3
ollama run qwen3:0.6b
ollama run qwen3:1.7b
The 0.6b model is really small but quite OK, and very fast. The next bigger one is still fine from a performance point of view.
https://ollama.com/library/qwen3
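To quickly check how a model responds, you can query the local Ollama REST API directly. For example, for the 0.6b model:

curl http://localhost:11434/api/generate -d '{"model": "qwen3:0.6b", "prompt": "Why is the sky blue?", "stream": false}'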
granite3.3
ollama run granite3.3:2b
https://ollama.com/library/granite3.3
gemma3
ollama run gemma3:1b
https://ollama.com/library/gemma3
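Once the NGINX reverse proxy is in place, you can verify the TLS endpoint end to end. The /api/tags endpoint lists the locally available models (llm.example.com is the placeholder hostname from the proxy sketch above):

curl https://llm.example.com/api/tags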