# Ollama Models — ThinkCentre 1
CPU-only inference (Intel UHD 730, no dedicated GPU). 16 GB RAM.
## Installed
| Model | Size | RAM | Speed (tok/s) | Best for |
|---|---|---|---|---|
| qwen2.5:7b | ~4.7 GB | ~6 GB | ~15-25 | Default — code, German, reasoning |
## Recommended Candidates
| Model | Size | RAM needed | Notes |
|---|---|---|---|
| qwen2.5:7b ✅ | 4.7 GB | 6 GB | Best quality/speed ratio on CPU |
| mistral:7b | 4.1 GB | 5 GB | Strong English reasoning |
| llama3.2:3b | 2.0 GB | 3 GB | Fastest, lower quality |
| qwen2.5:14b | 9.0 GB | 11 GB | Better quality, slower (~8 tok/s) |
| deepseek-r1:7b | 4.7 GB | 6 GB | Strong at reasoning/math |
| nomic-embed-text | 0.3 GB | 1 GB | Embeddings (QMD alternative) |
## API
```bash
# Generate (single-turn; use /api/chat for multi-turn conversations)
curl http://192.168.0.91:11434/api/generate -d '{
  "model": "qwen2.5:7b",
  "prompt": "Your prompt here",
  "stream": false
}'

# Via OpenAI-compatible endpoint
curl http://192.168.0.91:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen2.5:7b","messages":[{"role":"user","content":"Hello"}]}'
```
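The curl calls above can also be scripted. A minimal Python sketch for the `/api/generate` endpoint, using only the standard library (the `build_generate_payload` and `generate` helper names are illustrative, not an official client; host and model are taken from this doc):

```python
import json
import urllib.request

OLLAMA_URL = "http://192.168.0.91:11434"  # host from the curl examples above


def build_generate_payload(prompt: str, model: str = "qwen2.5:7b",
                           stream: bool = False) -> str:
    """Build the JSON body for POST /api/generate (same fields as the curl example)."""
    return json.dumps({"model": model, "prompt": prompt, "stream": stream})


def generate(prompt: str, model: str = "qwen2.5:7b") -> str:
    """Send a non-streaming generate request and return the model's text."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=build_generate_payload(prompt, model).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # With "stream": false the full completion arrives in one JSON
        # object under the "response" key.
        return json.loads(resp.read())["response"]
```

With `"stream": true` the server instead returns one JSON object per line, so the response would need to be read incrementally rather than in a single `json.loads`.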
## Management
```bash
ollama list           # Installed models
ollama pull <model>   # Download model
ollama rm <model>     # Remove model
ollama run <model>    # Interactive chat

sudo systemctl status ollama
sudo systemctl restart ollama
```
## OpenClaw Integration (future)
Add to openclaw.json as fallback:
```json
{
  "agents": {
    "defaults": {
      "model": {
        "fallbacks": [
          "openrouter/anthropic/claude-sonnet-4-6",
          "ollama/qwen2.5:7b@http://192.168.0.91:11434",
          "google/gemini-2.5-flash"
        ]
      }
    }
  }
}
```