GoModel registers Ollama through OLLAMA_BASE_URL. To run multiple Ollama
instances behind one gateway, use suffixed env vars, one per instance:
OLLAMA_A_BASE_URL=http://host.docker.internal:11434/v1
OLLAMA_B_BASE_URL=http://host.docker.internal:11435/v1
GOMODEL_MASTER_KEY=change-me
OLLAMA_A_BASE_URL registers provider ollama-a; OLLAMA_B_BASE_URL
registers ollama-b. Use different ports or hostnames per instance.
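The naming pattern in these two examples can be sketched as a small shell helper. It assumes the rule generalizes as: drop the _BASE_URL suffix, lowercase the rest, and turn underscores into hyphens. Only the two mappings above come from this page; the helper name provider_name is hypothetical, and GoModel's actual rule may differ for other suffixes.

```shell
# Hypothetical helper: guess the provider name GoModel derives from an
# env var name. ASSUMPTION: strip "_BASE_URL", lowercase, "_" becomes "-".
provider_name() {
  # ${1%_BASE_URL} removes the trailing suffix if it is present.
  printf '%s\n' "${1%_BASE_URL}" | tr 'A-Z_' 'a-z-'
}

provider_name OLLAMA_A_BASE_URL
provider_name OLLAMA_B_BASE_URL
```

A quick check like this can help catch a typo in a variable name before starting the gateway.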
On Linux, add --add-host=host.docker.internal:host-gateway to the
docker run command so the container can reach Ollama on the host.
Run GoModel
Docker (.env file)
docker run --rm -p 8080:8080 --env-file .env enterpilot/gomodel
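For hosts where the Linux note above applies, the same command with the extra flag can be assembled by a small wrapper. This is only a sketch: build_run_args is a hypothetical helper, and detecting Linux through uname -s is an assumption about your environment. The flags themselves are the ones shown on this page.

```shell
# Hypothetical helper: print the docker arguments one per line, adding the
# host-gateway mapping only when the OS is Linux.
# The optional first argument overrides the detected OS (useful for testing).
build_run_args() {
  os="${1:-$(uname -s)}"
  set -- run --rm -p 8080:8080 --env-file .env
  if [ "$os" = "Linux" ]; then
    # Lets the container resolve host.docker.internal to the host.
    set -- "$@" --add-host=host.docker.internal:host-gateway
  fi
  # The image name must come after all docker run options.
  set -- "$@" enterpilot/gomodel
  printf '%s\n' "$@"
}
```

Running docker $(build_run_args) then starts the gateway. The unquoted substitution relies on word splitting, which is safe here only because none of the arguments contain spaces.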
Route to a specific backend
curl -s http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer change-me" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ollama-a/llama3.2",
    "messages": [{"role": "user", "content": "Reply with exactly ok."}]
  }'
GET /v1/models returns provider-qualified IDs such as ollama-a/llama3.2
and ollama-b/llama3.2. A bare model name is routed to whichever provider
exposes it first in registration order; use the qualified form to pick a
backend explicitly.
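To see which bare names are ambiguous, it can help to list every provider that exposes a given model. The sketch below reads provider-qualified IDs from standard input, one per line. The helper name providers_for is hypothetical, and the jq filter in the usage comment assumes the response follows the OpenAI-style list shape with IDs under data[].id, which this page does not confirm.

```shell
# Hypothetical helper: given provider-qualified IDs on stdin, print the
# providers that expose the model named in $1.
providers_for() {
  awk -v m="$1" '{
    i = index($0, "/")               # position of the first "/"
    if (i > 0 && substr($0, i + 1) == m)
      print substr($0, 1, i - 1)     # everything before the first "/"
  }'
}

# Usage against a running gateway (ASSUMES IDs live under .data[].id):
#   curl -s http://localhost:8080/v1/models \
#     -H "Authorization: Bearer change-me" \
#     | jq -r '.data[].id' | providers_for llama3.2
```

If more than one provider is printed, a request with the bare name depends on registration order, so the qualified ID is the safer choice.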
When YAML still helps
Reach for config.yaml only when generated names like ollama-a are not
enough, or when you want per-provider resilience overrides:
providers:
  local-fast:
    type: ollama
    base_url: "http://ollama-fast:11434/v1"
  local-large:
    type: ollama
    base_url: "http://ollama-large:11434/v1"