# Ollama Models

ThinkCentre 1 — CPU-only inference (Intel UHD 730, no dedicated GPU). 16 GB RAM.

## Installed

| Model | Size | RAM | Speed (tok/s) | Best for |
|-------|------|-----|---------------|----------|
| `qwen2.5:7b` | ~4.7 GB | ~6 GB | ~15-25 | Default — code, German, reasoning |

## Recommended Candidates

| Model | Size | RAM needed | Notes |
|-------|------|------------|-------|
| `qwen2.5:7b` ✅ | 4.7 GB | 6 GB | Best quality/speed ratio on CPU |
| `mistral:7b` | 4.1 GB | 5 GB | Strong English reasoning |
| `llama3.2:3b` | 2.0 GB | 3 GB | Fastest, lower quality |
| `qwen2.5:14b` | 9.0 GB | 11 GB | Better quality, slower (~8 tok/s) |
| `deepseek-r1:7b` | 4.7 GB | 6 GB | Strong at reasoning/math |
| `nomic-embed-text` | 0.3 GB | 1 GB | Embeddings (QMD alternative) |

## API

```bash
# Chat (native API)
curl http://192.168.0.91:11434/api/generate -d '{
  "model": "qwen2.5:7b",
  "prompt": "Your prompt here",
  "stream": false
}'

# Via OpenAI-compatible endpoint
curl http://192.168.0.91:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen2.5:7b","messages":[{"role":"user","content":"Hello"}]}'
```

## Management

```bash
ollama list                    # List installed models
ollama pull <model>            # Download a model
ollama rm <model>              # Remove a model
ollama run <model>             # Interactive chat
sudo systemctl status ollama   # Check service status
sudo systemctl restart ollama  # Restart the service
```

## OpenClaw Integration (future)

Add to `openclaw.json` as fallback:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "fallbacks": [
          "openrouter/anthropic/claude-sonnet-4-6",
          "ollama/qwen2.5:7b@http://192.168.0.91:11434",
          "google/gemini-2.5-flash"
        ]
      }
    }
  }
}
```
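The inline-JSON curl calls above break as soon as the prompt contains quotes or newlines. A minimal sketch of a safer pattern, assuming `jq` is installed on the client (the host, port, and model name are the ones used in this document):

```bash
#!/bin/sh
# Build the /api/generate payload with jq instead of hand-written JSON,
# so special characters in the prompt are escaped correctly.
PROMPT='Summarize this in one sentence: "Ollama runs LLMs locally."'

# jq -n constructs a JSON object from scratch; --arg binds $PROMPT safely.
PAYLOAD=$(jq -n --arg p "$PROMPT" \
  '{model: "qwen2.5:7b", prompt: $p, stream: false}')

# POST it and print only the generated text. --max-time bounds the wait
# if the ThinkCentre is unreachable; host/port are assumptions from above.
curl -s --max-time 60 http://192.168.0.91:11434/api/generate \
  -d "$PAYLOAD" | jq -r '.response'
```

The same `jq -n --arg` trick works for the OpenAI-compatible endpoint; only the payload shape (`messages` array instead of `prompt`) changes.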
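Before wiring the box into a fallback chain, it helps to verify the service is up and the default model is actually present. A hedged sketch using Ollama's `/api/tags` endpoint (which lists locally installed models); the host and the plain-`grep` check are assumptions, not part of the original setup:

```bash
#!/bin/sh
# Health check: is the Ollama service answering, and is qwen2.5:7b installed?
HOST=http://192.168.0.91:11434
MODEL='qwen2.5:7b'

# /api/tags returns JSON listing local models; -f makes curl fail on HTTP
# errors, --max-time keeps the check from hanging if the host is down.
if curl -sf --max-time 3 "$HOST/api/tags" | grep -q "\"$MODEL\""; then
  echo "$MODEL ready"
else
  # Missing or service down — try pulling (no-op if only the service is down).
  ollama pull "$MODEL"
fi
```

This could run from cron or as a systemd timer so the fallback entry in `openclaw.json` never points at a cold host.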