Add local Ollama model selection to the web composer #147

Merged: pufit merged 1 commit into ClickHouse:main from Wachynaky:feat/ollama-model-selection on Jun 24, 2026

@Wachynaky

Contributor

Expose locally installed Ollama models as selectable chat models in the composer's model picker. The Claude Agent SDK only speaks the Anthropic Messages API, so Ollama (OpenAI-compatible) is reached through the bundled CLIProxyAPI proxy, where it is registered as an openai-compatibility upstream. This requires proxy.enabled.

Backend:

  • OllamaConfig + ollama_routable gate (Ollama enabled AND proxy enabled)
  • nerve/ollama.py: best-effort model discovery via Ollama GET /api/tags
  • proxy/service.py: register discovered models as a proxy upstream
  • GET /api/models route for the picker
  • engine: thread per-session model, recreate the SDK client on a mid-session model switch, and suppress Anthropic-only knobs (extended thinking, effort, context-1m beta) for non-Claude models
  • server: pass the WS per-message model through to run()
  • startup warning when ollama.enabled but proxy.enabled is false
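
To illustrate the best-effort discovery step in the list above, here is a minimal Python sketch. It assumes Ollama's default address `http://localhost:11434` and the `models[].name` shape of the `GET /api/tags` response. The function names `parse_tags` and `discover_models` are hypothetical and are not taken from nerve/ollama.py.

```python
import http.client
import json
import urllib.error
import urllib.request


def parse_tags(payload: dict) -> list[str]:
    """Extract unique model names from an Ollama /api/tags response body."""
    models = payload.get("models")
    if not isinstance(models, list):
        return []
    names: list[str] = []
    for entry in models:
        name = entry.get("name") if isinstance(entry, dict) else None
        # Skip malformed entries and duplicates, keep discovery order.
        if isinstance(name, str) and name and name not in names:
            names.append(name)
    return names


def discover_models(base_url: str = "http://localhost:11434",
                    timeout: float = 2.0) -> list[str]:
    """Best-effort discovery: return [] instead of raising on any failure.

    The default base_url is an assumption (Ollama's usual local port).
    """
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Ollama not running, timeout, or a non-JSON body: offer no models.
        return []
    if not isinstance(payload, dict):
        return []
    return parse_tags(payload)
```

Returning an empty list on failure means an unreachable Ollama daemon only hides the local models from the picker and does not break startup.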

Frontend:

  • api.getModels() + optional model arg on ws.sendMessage
  • chatStore holds available/selected/default model (persisted to localStorage)
  • ChatInput renders a model picker, shown only when more than one model is offered

config.example.yaml: document the proxy and ollama blocks.
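
The two config blocks interact through the `ollama_routable` gate and the startup warning mentioned in the Backend list. A rough Python sketch of that relationship follows. Only `OllamaConfig`, `ollama_routable`, `ollama.enabled`, and `proxy.enabled` come from this PR's description; `ProxyConfig`, `AppConfig`, `startup_warnings`, and the warning text are hypothetical names chosen for illustration.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("nerve.config")  # logger name is an assumption


@dataclass
class ProxyConfig:  # hypothetical stand-in for the proxy block
    enabled: bool = False


@dataclass
class OllamaConfig:
    enabled: bool = False


@dataclass
class AppConfig:  # hypothetical container for both blocks
    proxy: ProxyConfig = field(default_factory=ProxyConfig)
    ollama: OllamaConfig = field(default_factory=OllamaConfig)

    @property
    def ollama_routable(self) -> bool:
        # Ollama models are only reachable through the proxy,
        # so both flags must be on.
        return self.ollama.enabled and self.proxy.enabled


def startup_warnings(cfg: AppConfig) -> list[str]:
    """Log and return warnings for config combinations that cannot work."""
    warnings: list[str] = []
    if cfg.ollama.enabled and not cfg.proxy.enabled:
        warnings.append(
            "ollama.enabled is true but proxy.enabled is false; "
            "Ollama models will not be offered"
        )
    for message in warnings:
        logger.warning(message)
    return warnings
```

With this shape, enabling Ollama alone leaves `ollama_routable` false and produces one warning, while enabling both blocks makes the models routable with no warning.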

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@Wachynaky

Contributor Author

Sorry, I guess this isn't the right way or the right code for this feature. I was working on it locally, and in the end the Nerve agent uploaded it itself, with my help running the Git commands!

@pufit pufit merged commit 983735c into ClickHouse:main Jun 24, 2026
2 checks passed