## Features

- **OpenAI-compatible** — `/v1/chat/completions` and `/v1/models`, with streaming (SSE)
- **Auto-discovery** — finds every language model registered in VS Code
- **Tool forwarding** — pass OpenAI-format tools, get `tool_calls` back
- **Multi-provider content handling** — normalises Anthropic-style content arrays, OpenAI strings, and Gemini parts into a consistent format
- **XML tool call fallback** — when native tool forwarding isn't available, parses Claude's XML `<function_calls>` output into proper `tool_calls` objects
- **Rate limiting** — configurable per-minute request cap
- **API key auth** — optional Bearer token authentication
- **Zero dependencies** — pure Node.js HTTP, no Express, no frameworks
## Models

Any model available through VS Code's Language Model API is automatically exposed — no configuration needed. This typically includes:

- **Claude** — Opus, Sonnet, Haiku
- **GPT** — Codex, GPT-4.1, o4-mini
- **Gemini** — Gemini Pro, Gemini Flash
- **Ollama** — any locally running Ollama models (Llama, Qwen, DeepSeek, Mistral, etc.)
- Any other models registered via the VS Code Language Model API

Run `GET /v1/models` to see what's available in your setup.
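Because the endpoint is OpenAI-compatible, the response follows OpenAI's standard model-list shape. An illustrative sketch, not verbatim output (the model IDs depend on your setup):

```json
{
  "object": "list",
  "data": [
    { "id": "claude-sonnet-4.6", "object": "model" },
    { "id": "gpt-4.1", "object": "model" }
  ]
}
```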
## Provider Compatibility

OpenWire normalises differences between providers so callers always get a consistent OpenAI-format response:

| Provider | Content format | Tool calling | Status |
|---|---|---|---|
| Claude (Anthropic) | Array of `{"type":"text","text":"..."}` parts | Native via VS Code API; XML `<function_calls>` fallback parsed automatically | ✅ Full support |
| GPT (OpenAI) | Plain string | Native `tool_calls` via VS Code API | ✅ Full support |
| Gemini (Google) | Plain string or parts array | Native via VS Code API | ✅ Full support |
| Ollama (local) | Plain string | Depends on model capability | ✅ Supported |

**Content normalisation** — incoming messages with `content` as an array of content parts (Anthropic format), a plain string (OpenAI/Gemini), or `null` are all normalised to plain strings before forwarding to the VS Code LM API.

**Tool call fallback** — when the VS Code LM API can't forward tools natively (e.g. older VS Code versions), Claude may output tool calls as XML. OpenWire detects and converts these to standard `tool_calls` objects in the response, so callers never see raw XML.
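The content normalisation described above can be sketched in a few lines of TypeScript. This is a hypothetical helper, not OpenWire's actual internals; it assumes Anthropic-style parts carry `type: "text"`:

```typescript
// Shapes a caller might send in the "content" field of a message.
type ContentPart = { type: string; text?: string };
type MessageContent = string | ContentPart[] | null | undefined;

// Collapse Anthropic-style part arrays, plain strings, or null
// into a single plain string for the VS Code LM API.
function normalizeContent(content: MessageContent): string {
  if (content == null) return "";
  if (typeof content === "string") return content;
  return content
    .filter((part) => part.type === "text" && typeof part.text === "string")
    .map((part) => part.text ?? "")
    .join("");
}
```

Non-text parts (images, tool results) are simply dropped in this sketch; a fuller implementation would handle them separately.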
## Quick Start

Install from the VS Code Marketplace (or load the `.vsix`). The server starts automatically on `http://127.0.0.1:3030`.

```bash
# List available models
curl http://localhost:3030/v1/models

# Chat completion
curl http://localhost:3030/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4.6",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

# Streaming
curl http://localhost:3030/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4.6",
    "messages": [{"role": "user", "content": "Explain zero-knowledge proofs"}],
    "stream": true
  }'
```
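Streaming responses use OpenAI's SSE chunk format: `data: {json}` lines terminated by `data: [DONE]`. A minimal client-side sketch of collecting the text deltas (a hypothetical helper, assuming the standard `choices[0].delta.content` chunk shape):

```typescript
// Extract assistant text deltas from an OpenAI-style SSE response body.
function collectDeltas(sseBody: string): string {
  let out = "";
  for (const line of sseBody.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const chunk = JSON.parse(payload);
    out += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return out;
}
```

In practice you would feed this incrementally from a `fetch` body reader rather than a complete string, buffering partial lines between reads.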
## Endpoints

| Method | Path | Description |
|---|---|---|
| GET | `/health` | Health check |
| GET | `/v1/models` | List available models |
| GET | `/v1/models/:id` | Get specific model |
| POST | `/v1/chat/completions` | Chat completion (streaming + non-streaming) |
| POST | `/v1/completions` | Legacy completions (mapped to chat) |
## Configuration

All settings live under `openWire.server.*` in VS Code:

| Setting | Default | Description |
|---|---|---|
| `autoStart` | `true` | Start server when VS Code launches |
| `host` | `127.0.0.1` | Bind address |
| `port` | `3030` | Port number |
| `apiKey` | `""` | Bearer token for authentication (disabled when empty) |
| `defaultModel` | `""` | Fallback model when none is specified |
| `defaultSystemPrompt` | `""` | System prompt injected when the request has none |
| `maxConcurrentRequests` | `4` | Concurrent request limit |
| `rateLimitPerMinute` | `60` | Requests allowed per minute |
| `requestTimeoutSeconds` | `300` | Request timeout in seconds |
| `enableLogging` | `false` | Verbose logging |
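As an example, requiring a Bearer token and raising the rate limit would look like this in `settings.json` (illustrative values; the keys come from the table above):

```json
{
  "openWire.server.apiKey": "my-secret-token",
  "openWire.server.rateLimitPerMinute": 120,
  "openWire.server.enableLogging": true
}
```

With `apiKey` set, requests must send an `Authorization: Bearer my-secret-token` header.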
## Commands
- OpenWire: Start Server
- OpenWire: Stop Server
- OpenWire: Restart Server
- OpenWire: Toggle Server
## Using with OpenClaw

OpenWire can serve as a model provider for OpenClaw agents. Register OpenWire as a custom provider called `copilot-proxy` in your `~/.openclaw/openclaw.json`:

```jsonc
{
  "models": {
    "providers": {
      "copilot-proxy": {
        "baseUrl": "http://localhost:3030/v1",
        "apiKey": "n/a",
        "api": "openai-completions",
        "authHeader": false,
        "models": [
          {
            "id": "claude-sonnet-4.6",
            "name": "Claude Sonnet 4.6",
            "contextWindow": 128000,
            "maxTokens": 8192
          }
          // add any other models from /v1/models
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "copilot-proxy/claude-sonnet-4.6"
      }
    }
  },
  "plugins": {
    "entries": {
      "copilot-proxy": { "enabled": true }
    }
  }
}
```
Set `authHeader: false` since OpenWire handles authentication through VS Code's Copilot session — no API keys are needed. Run `curl http://localhost:3030/v1/models` to see all available model IDs.
## Architecture

```
src/
  extension.ts      — activation, commands, status bar
  models/
    discovery.ts    — model discovery, caching, dedup
  routes/
    chat.ts         — chat completions + tool forwarding
  server/
    config.ts       — settings loader
    gateway.ts      — HTTP server, routing, middleware
  ui/
    sidebar.ts      — webview sidebar panel
  types/
    vscode-lm.d.ts  — type augmentations
```

Lightweight · zero runtime dependencies.
## License

MIT