Ollama Code

Corey Gaspard

Agentic coding panel for your local Ollama models. A Claude Code / Codex–style chat experience powered by the open-source OpenCode agent, running entirely against your local models.

An agentic coding panel for your local Ollama models — a Claude Code / Codex–style chat experience that runs entirely on your machine.

Under the hood it drives the open-source OpenCode agent (Apache/MIT) as a headless server pointed at your Ollama server. You get a real agent — file edits, shell tools, permissions, multi-step reasoning — with no cloud model and no API key.

Why

The official Claude Code and Codex VS Code extensions are not open source, so they can't be adapted to local models. The CLIs behind several agents are open, though — and OpenCode in particular ships a headless server with a built-in Ollama provider. This extension wraps that server in a native chat panel and fills its model picker with the models you actually have installed.

Features

  • Chat panel in the Activity Bar / secondary side bar (and "Open in Editor Tab" for parallel conversations)
  • Streaming responses with a Claude-style timeline — thinking, tool steps, answer
  • Reasoning blocks for thinking-capable models (collapsible)
  • Agent tools — file reads/edits, shell, search — surfaced as collapsible tool cards
  • Permission prompts — Allow once / Allow always / Deny, inline
  • Model manager — load / eject Ollama models from the composer, with loaded state, context size, and capability badges (👁 vision / 🔧 tools)
  • Multi-server — register, switch, and remove Ollama servers; offline mode with a connection banner
  • Context meter with compaction indicator, thinking toggle, image attachments for vision models, and the open file attached as excludable context
  • Session history — persistent, resumable, auto-named; delete one or clear all
  • Auto-context — loads the selected model with an adequate num_ctx so OpenCode's large system prompt doesn't overflow Ollama's small default window

Requirements

  • VS Code 1.104+
  • Ollama installed and running (default http://127.0.0.1:11434) with at least one model pulled — e.g. ollama pull llama3.2 (use a tool-capable model for the agent)
  • OpenCode installed (brew install sst/tap/opencode or npm i -g opencode-ai). Auto-detected from PATH or ~/.opencode/bin/opencode.
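The lookup order described above (PATH first, then ~/.opencode/bin/opencode) can be sketched roughly as follows. This is an illustration only, not the extension's actual code; the function name find_opencode and the explicit_path parameter (standing in for the ollamaCode.opencodePath setting) are hypothetical.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def find_opencode(explicit_path: Optional[str] = None) -> Optional[str]:
    """Return a path to an executable opencode binary, or None if none is found.

    Hypothetical helper: checks an explicitly configured path first, then PATH,
    then the ~/.opencode/bin/opencode fallback mentioned in the requirements.
    """
    candidates = []
    if explicit_path:
        candidates.append(Path(explicit_path).expanduser())
    on_path = shutil.which("opencode")
    if on_path:
        candidates.append(Path(on_path))
    candidates.append(Path.home() / ".opencode" / "bin" / "opencode")

    for candidate in candidates:
        # Accept only regular files that the current user may execute.
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)
    return None


if __name__ == "__main__":
    print(find_opencode() or "opencode not found")
```

If this prints "opencode not found", installing OpenCode with one of the commands above or pointing ollamaCode.opencodePath at the binary should resolve it.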

Quick start

  1. Start Ollama and pull a model: ollama pull llama3.2 (or qwen3, mistral-small, etc.).
  2. Install this extension (or run it from source — see below).
  3. Click the Ollama icon in the Activity Bar.
  4. Pick a model, type a task, hit Enter.

Settings

Setting                        Default                  Description
ollamaCode.ollamaBaseUrl       http://127.0.0.1:11434   Ollama server host (root, no /v1)
ollamaCode.opencodePath        (auto)                   Path to the opencode binary
ollamaCode.serverPort          0                        Embedded server port (0 = auto)
ollamaCode.defaultModel        (first)                  Default model id (e.g. llama3.2:3b)
ollamaCode.agent               build                    build (can edit) or plan (read-only)
ollamaCode.autoEnsureContext   true                     Load the model with an adequate num_ctx before prompting
ollamaCode.minContextLength    32768                    Context window (num_ctx) to load models with
ollamaCode.keepAlive           30m                      Ollama keep_alive (how long a model stays loaded)
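To make the last three settings concrete, here is a minimal sketch of how minContextLength and keepAlive could map onto an Ollama /api/generate request that loads or ejects a model. The helper names are hypothetical and the extension's real request may differ. The sketch relies on Ollama's documented behaviour that a generate request without a prompt loads the model, and that keep_alive set to 0 unloads it.

```python
import json
import urllib.request


def build_load_request(model: str, num_ctx: int = 32768, keep_alive: str = "30m") -> dict:
    """Payload that asks Ollama to load `model` with a given context window.

    Defaults mirror ollamaCode.minContextLength and ollamaCode.keepAlive.
    """
    return {
        "model": model,
        "stream": False,
        "keep_alive": keep_alive,
        "options": {"num_ctx": num_ctx},
    }


def build_eject_request(model: str) -> dict:
    """Payload that asks Ollama to unload `model` immediately (keep_alive = 0)."""
    return {"model": model, "stream": False, "keep_alive": 0}


def post_generate(base_url: str, payload: dict, timeout: float = 120.0) -> dict:
    """POST a payload to <base_url>/api/generate and return the decoded JSON reply."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires a running Ollama server with llama3.2 already pulled.
    print(post_generate("http://127.0.0.1:11434", build_load_request("llama3.2")))
```

Loading with an explicit num_ctx matters because a model loaded with Ollama's smaller default window would truncate OpenCode's large system prompt.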

How it works

VS Code webview (chat UI)
        │  postMessage
        ▼
Extension host (bridge)
        │  HTTP + SSE  (raw fetch)
        ▼
opencode serve  ──native ollama provider──▶  Ollama (/api/chat, local model)
   (OLLAMA_HOST + OPENCODE_CONFIG_CONTENT injected at launch)

The extension enumerates your installed models with Ollama's REST API (/api/tags, /api/show, /api/ps), then augments OpenCode's built-in ollama provider with those models (capabilities, context limit, num_ctx) via the OPENCODE_CONFIG_CONTENT environment variable — nothing is written to your workspace or global config. The active server is passed through OLLAMA_HOST. Model load/eject uses /api/generate with keep_alive and options.num_ctx.
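The flow in the paragraph above might look roughly like the sketch below: read model names from /api/tags, then build the environment for opencode serve. Only the environment variable names OLLAMA_HOST and OPENCODE_CONFIG_CONTENT and the /api/tags endpoint come from the description; the helper names and the exact key layout inside the config JSON (provider, models, limit) are assumptions made for illustration and may not match OpenCode's actual schema.

```python
import json
import os
import urllib.request
from typing import Dict, List, Optional


def fetch_tags(base_url: str, timeout: float = 10.0) -> dict:
    """GET <base_url>/api/tags, which lists the models installed in Ollama."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/api/tags", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def list_model_names(tags_response: dict) -> List[str]:
    """Extract model names such as 'llama3.2:3b' from an /api/tags response."""
    return [entry["name"] for entry in tags_response.get("models", [])]


def build_opencode_env(
    base_url: str,
    model_names: List[str],
    context_limit: int = 32768,
    base_env: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return an environment dict for launching `opencode serve`.

    The config structure below is a guess at a plausible shape, not a
    documented schema; nothing is written to disk.
    """
    config = {
        "provider": {
            "ollama": {
                "models": {
                    name: {"limit": {"context": context_limit, "output": context_limit}}
                    for name in model_names
                }
            }
        }
    }
    env = dict(os.environ if base_env is None else base_env)
    env["OLLAMA_HOST"] = base_url
    env["OPENCODE_CONFIG_CONTENT"] = json.dumps(config)
    return env


if __name__ == "__main__":
    url = "http://127.0.0.1:11434"
    env = build_opencode_env(url, list_model_names(fetch_tags(url)))
    print(env["OPENCODE_CONFIG_CONTENT"])
```

Passing the configuration through the process environment, rather than a file, is what keeps the workspace and global config untouched.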

Develop from source

npm install
npm run compile        # type-check + bundle (extension + webview)
# then press F5 in VS Code to launch the Extension Development Host
npm run package:vsix   # build a .vsix

License

MIT
