feat(plugins): Ollama support #352

Open
Dialvive wants to merge 3 commits into mpfaffenberger:main from Dialvive:feature/ollama_support

Dialvive commented May 21, 2026

Summary

Adds a new plugin that lets Code Puppy connect to local inference servers — Ollama, LM Studio, vLLM, llama.cpp, and any other OpenAI Chat Completions-compatible endpoint — without sending requests to remote model providers.

The plugin is auto-discovered by the existing plugin loader. Zero existing files were modified.

User Story

As a user, I want to run Code Puppy with local models through Ollama (or any OpenAI Chat Completions-compatible endpoint), so that I can avoid connecting to remote model providers, process requests on my own hardware, and reduce costs.

Acceptance Criteria

  • A new plugin code_puppy/plugins/ollama/ is created following the plugin architecture
  • The plugin registers an "ollama" model type via the register_model_type callback hook
  • Users can configure Ollama models in ~/.code_puppy/extra_models.json using the existing custom_endpoint config format
  • When no custom_endpoint is provided, the plugin defaults to http://localhost:11434/v1 with api_key "ollama"
  • The OLLAMA_HOST environment variable overrides the default base URL
  • The handler creates OpenAIChatModel (Chat Completions API), NOT OpenAIResponsesModel (Responses API)
  • All existing Code Puppy features (tools, agents, sub-agents, plugins, streaming) work with Ollama models that support tool/function calling
  • The plugin fails gracefully — never crashes the app
  • Unit tests cover all paths with 95%+ coverage
  • No existing files are modified — plugin loader auto-discovers the new directory
  • Code passes ruff format and ruff check
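The defaulting rules in the criteria above can be sketched as follows. This is a minimal illustration, not the plugin's code: `resolve_settings` is a hypothetical helper, it stands in for the real `get_custom_config()` call, and it leaves out the `OLLAMA_HOST` override.

```python
from typing import Any

# Defaults stated in the acceptance criteria.
DEFAULT_URL = "http://localhost:11434/v1"
DEFAULT_API_KEY = "ollama"


def resolve_settings(config_key: str, model_config: dict[str, Any]) -> dict[str, str]:
    """Hypothetical helper: pick model name, base URL, and API key for one entry."""
    # Fall back to the config key when the entry has no "name".
    name = model_config.get("name") or config_key

    # Assumption: a custom_endpoint block, when present, wins over the defaults.
    endpoint = model_config.get("custom_endpoint") or {}
    return {
        "name": name,
        "url": endpoint.get("url", DEFAULT_URL),
        "api_key": endpoint.get("api_key", DEFAULT_API_KEY),
    }


if __name__ == "__main__":
    print(resolve_settings("ollama-qwen3", {"type": "ollama", "name": "qwen3:8b"}))
```

The resolved name, URL, and key would then be passed to whatever constructs the OpenAIChatModel; that step is omitted here because it depends on the installed model library.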

Test Plan

| # | Test Case | Expected Result |
|---|-----------|-----------------|
| 1 | Handler with custom_endpoint config | Uses provided URL, api_key, headers via get_custom_config() |
| 2 | Handler without custom_endpoint | Defaults to http://localhost:11434/v1, api_key "ollama" |
| 3 | OLLAMA_HOST env var set | Overrides default URL, appends /v1 if missing |
| 4 | Handler returns OpenAIChatModel | NOT OpenAIResponsesModel; verified via isinstance check |
| 5 | Handler failure (e.g., bad config) | Returns None, does not raise |
| 6 | get_ollama_model_types() return structure | Returns [{"type": "ollama", "handler": callable}] |
| 7 | ModelFactory.get_model integration | Type "ollama" routes to the handler via callback |
| 8 | Plugin auto-discovery | register_callbacks.py is found by plugin loader |
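Test cases 5 and 6 describe a registration structure and a failure contract that might look roughly like this. The handler signature and the `_build_model` stand-in are assumptions made for illustration; only the returned list shape and the "return None, do not raise" behavior come from the test plan.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def _build_model(model_name: str, model_config: dict[str, Any]) -> Any:
    """Hypothetical stand-in for the code that constructs the chat model."""
    name = model_config.get("name") or model_name
    if not name:
        raise ValueError("no model name available")
    return {"model": name}


def ollama_handler(
    model_name: str, model_config: dict[str, Any], config: Any = None
) -> Optional[Any]:
    """Assumed handler signature; returns None instead of raising."""
    try:
        return _build_model(model_name, model_config)
    except Exception:
        # Log and swallow the error so a bad config never crashes the app.
        logger.warning("Ollama model setup failed", exc_info=True)
        return None


def get_ollama_model_types() -> list[dict[str, Any]]:
    """Return the structure described in test case 6."""
    return [{"type": "ollama", "handler": ollama_handler}]
```

Catching the broad `Exception` is deliberate in this sketch: the caller only needs to know that no model is available, and the log entry keeps the cause visible.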

How to Use

1 — Install Ollama

brew install ollama

2 — Pull a model with tool calling support

ollama pull qwen3:8b         # ~6 GB VRAM  — good for testing
ollama pull qwen3:14b        # ~10 GB VRAM — better quality
ollama pull qwen3:30b        # ~20 GB VRAM — recommended

3 — Start the Ollama server

brew services start ollama
curl http://localhost:11434/api/tags
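The same check can be done from Python, for example to confirm that the pulled model is visible before configuring Code Puppy. This is a sketch using only the standard library; the function names are illustrative, and it assumes the `/api/tags` response is a JSON object with a `models` list whose items carry a `name` field.

```python
import json
from urllib.request import urlopen


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [model["name"] for model in tags_payload.get("models", [])]


def fetch_model_names(base_url: str = "http://localhost:11434", timeout: float = 3.0) -> list[str]:
    """Query a running Ollama server; raises URLError if it is unreachable."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return model_names(json.load(response))


if __name__ == "__main__":
    print(fetch_model_names())
```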

4 — Configure the model in Code Puppy

Create or edit ~/.code_puppy/extra_models.json:

{
  "ollama-qwen3": {
    "type": "ollama",
    "name": "qwen3:8b",
    "context_length": 32768
  }
}

Multiple models can be registered at once:

{
  "ollama-qwen3-8b": {
    "type": "ollama",
    "name": "qwen3:8b",
    "context_length": 32768
  },
  "ollama-qwen3-30b": {
    "type": "ollama",
    "name": "qwen3:30b",
    "context_length": 131072
  }
}
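If extra_models.json already contains entries, editing it by hand risks overwriting them. A small script can merge a new entry instead. This is only a sketch: `add_model_entry` is a hypothetical helper, not part of Code Puppy, and it assumes the file, when present, holds a valid JSON object.

```python
import json
from pathlib import Path


def add_model_entry(path: Path, key: str, entry: dict) -> dict:
    """Insert or replace one model entry, keeping the other entries intact."""
    models = json.loads(path.read_text()) if path.exists() else {}
    models[key] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(models, indent=2) + "\n")
    return models


if __name__ == "__main__":
    config_path = Path.home() / ".code_puppy" / "extra_models.json"
    add_model_entry(
        config_path,
        "ollama-qwen3-8b",
        {"type": "ollama", "name": "qwen3:8b", "context_length": 32768},
    )
```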

Custom host (different port, remote machine, LM Studio):

{
  "lmstudio-codellama": {
    "type": "ollama",
    "name": "codellama:34b",
    "context_length": 16384,
    "custom_endpoint": {
      "url": "http://192.168.1.50:1234/v1",
      "api_key": "lm-studio"
    }
  }
}

Alternatively, set OLLAMA_HOST to override the default endpoint without editing the config:

export OLLAMA_HOST=http://myserver:11434
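Because the value above has no /v1 suffix, the plugin has to normalize it before use. One way that normalization could work is sketched below; `base_url_from_env` is a hypothetical name, and the handling of a trailing slash or an empty value is an assumption about behavior the PR only lists as tested cases.

```python
import os
from typing import Mapping, Optional

DEFAULT_URL = "http://localhost:11434/v1"


def base_url_from_env(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the base URL, honoring OLLAMA_HOST when it is set and non-empty."""
    env = os.environ if env is None else env
    host = env.get("OLLAMA_HOST", "").strip()
    if not host:
        # Assumption: an unset or empty variable falls back to the default.
        return DEFAULT_URL
    # Assumption: trailing slashes are dropped before checking the suffix.
    host = host.rstrip("/")
    return host if host.endswith("/v1") else f"{host}/v1"


if __name__ == "__main__":
    print(base_url_from_env({"OLLAMA_HOST": "http://myserver:11434"}))
```

This sketch expects a full URL with a scheme, as in the export above; a bare `host:port` value would need extra handling.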

5 — Run Code Puppy and switch to the local model

./code-puppy-dev --interactive

Inside the session:

/model ollama-qwen3

Or start directly on the model:

./code-puppy-dev --interactive --model ollama-qwen3

Testing

Unit tests (no Ollama required — fully mocked)

pytest tests/test_ollama_plugin.py -v

Coverage: 100% on plugin files across 15 test cases covering:

  • custom_endpoint path (uses get_custom_config())
  • Default localhost path
  • OLLAMA_HOST env var — append /v1, already ends with /v1, trailing slash, empty string
  • Returns OpenAIChatModel not OpenAIResponsesModel
  • model_config["name"] resolution with fallback to config key
  • Graceful None return on exception
  • Handler registration structure

Verified End-to-End

  • ollama serve + qwen3:8b on localhost
  • Model appears in /model picker after configuring extra_models.json
  • File read/write tool calls work
  • Multi-step agentic tasks complete successfully
  • No remote API calls made during local model session

Dialvive added 3 commits May 20, 2026 18:12
Adds docstring for the Ollama plugin.
Add unit tests for the Ollama plugin model type handler, covering various scenarios including custom endpoints, environment variables, and model creation.