
# Ollama Models — ThinkCentre 1

CPU-only inference (Intel UHD 730, no dedicated GPU). 16 GB RAM.

## Installed

| Model      | Size    | RAM   | Speed (tok/s) | Best for                           |
|------------|---------|-------|---------------|------------------------------------|
| qwen2.5:7b | ~4.7 GB | ~6 GB | ~15-25        | Default — code, German, reasoning  |

Other models that fit in 16 GB RAM:

| Model            | Size   | RAM needed | Notes                              |
|------------------|--------|------------|------------------------------------|
| qwen2.5:7b       | 4.7 GB | 6 GB       | Best quality/speed ratio on CPU    |
| mistral:7b       | 4.1 GB | 5 GB       | Strong English reasoning           |
| llama3.2:3b      | 2.0 GB | 3 GB       | Fastest, lower quality             |
| qwen2.5:14b      | 9.0 GB | 11 GB      | Better quality, slower (~8 tok/s)  |
| deepseek-r1:7b   | 4.7 GB | 6 GB       | Strong at reasoning/math           |
| nomic-embed-text | 0.3 GB | 1 GB       | Embeddings (QMD alternative)       |

## API

```bash
# Generate a completion (native Ollama API)
curl http://192.168.0.91:11434/api/generate -d '{
  "model": "qwen2.5:7b",
  "prompt": "Your prompt here",
  "stream": false
}'

# Same model via the OpenAI-compatible endpoint
curl http://192.168.0.91:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen2.5:7b","messages":[{"role":"user","content":"Hello"}]}'
```

## Management

```bash
ollama list                      # Installed models
ollama pull <model>              # Download model
ollama rm <model>                # Remove model
ollama run <model>               # Interactive chat
sudo systemctl status ollama     # Service health
sudo systemctl restart ollama    # Restart the service
```
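
To check the service from another machine on the LAN (without the `ollama` CLI), the documented `/api/tags` endpoint lists locally installed models, roughly the API counterpart of `ollama list`. A small sketch, again assuming `requests`:

```python
# Sketch: ping the Ollama server and list locally installed models.
# /api/tags is Ollama's documented "list local models" endpoint.
import requests

OLLAMA_URL = "http://192.168.0.91:11434"

def installed_models() -> list[str]:
    resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10)
    resp.raise_for_status()
    return [m["name"] for m in resp.json().get("models", [])]

if __name__ == "__main__":
    names = installed_models()
    print("\n".join(names) if names else "no models installed")
```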

## OpenClaw Integration (future)

Add to `openclaw.json` as a fallback:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "fallbacks": [
          "openrouter/anthropic/claude-sonnet-4-6",
          "ollama/qwen2.5:7b@http://192.168.0.91:11434",
          "google/gemini-2.5-flash"
        ]
      }
    }
  }
}
```
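
For intuition only: the list expresses an order of preference, i.e. try OpenRouter first, fall back to the local Ollama model if that fails, then to Gemini. A hypothetical sketch of that semantics (not OpenClaw's actual code; the names here are made up):

```python
# Illustration of fallback semantics only; not OpenClaw internals.
# `backends` stands in for callables wrapping each configured provider.
from typing import Callable

def complete_with_fallbacks(prompt: str,
                            backends: list[Callable[[str], str]]) -> str:
    last_err: Exception | None = None
    for backend in backends:  # try providers in the configured order
        try:
            return backend(prompt)
        except Exception as err:  # rate limit, timeout, host down, ...
            last_err = err
    raise RuntimeError("all fallback models failed") from last_err
```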