## Features

- **OpenAI-compatible** — `/v1/chat/completions` and `/v1/models`, with streaming (SSE)
- **Auto-discovery** — finds every language model registered in VS Code
- **Tool forwarding** — pass OpenAI-format tools, get `tool_calls` back
- **Multi-provider content handling** — normalises Anthropic-style content arrays, OpenAI strings, and Gemini parts into a consistent format
- **XML tool call fallback** — when native tool forwarding isn't available, parses Claude's XML `<function_calls>` output into proper `tool_calls` objects
- **Rate limiting** — configurable per-minute request cap
- **API key auth** — optional Bearer token authentication
- **Zero dependencies** — pure Node.js HTTP, no Express, no frameworks
## Models

Any model available through VS Code's Language Model API is automatically exposed — no configuration needed. This typically includes:

- **Claude** — Opus, Sonnet, Haiku
- **GPT** — Codex, GPT-4.1, o4-mini
- **Gemini** — Gemini Pro, Gemini Flash
- **Ollama** — any locally running Ollama models (Llama, Qwen, DeepSeek, Mistral, etc.)
- Any other models registered via the VS Code Language Model API

Run `GET /v1/models` to see what's available in your setup.
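Because the endpoint is OpenAI-compatible, the response follows OpenAI's standard model-list shape. An illustrative sketch, not verbatim output (the model IDs depend on your setup):

```json
{
  "object": "list",
  "data": [
    { "id": "claude-sonnet-4.6", "object": "model" },
    { "id": "gpt-4.1", "object": "model" }
  ]
}
```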
## Provider Compatibility

OpenWire normalises differences between providers so callers always get a consistent OpenAI-format response:

| Provider | Content format | Tool calling | Status |
|---|---|---|---|
| Claude (Anthropic) | Array of `{"type":"text","text":"..."}` parts | Native via VS Code API; XML `<function_calls>` fallback parsed automatically | ✅ Full support |
| GPT (OpenAI) | Plain string | Native `tool_calls` via VS Code API | ✅ Full support |
| Gemini (Google) | Plain string or parts array | Native via VS Code API | ✅ Full support |
| Ollama (local) | Plain string | Depends on model capability | ✅ Supported |

**Content normalisation** — incoming messages with `content` as an array of content parts (Anthropic format), a plain string (OpenAI/Gemini), or `null` are all normalised to plain strings before forwarding to the VS Code LM API.

**Tool call fallback** — when the VS Code LM API can't forward tools natively (e.g. older VS Code versions), Claude may output tool calls as XML. OpenWire detects and converts these to standard `tool_calls` objects in the response, so callers never see raw XML.
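The content normalisation described above can be sketched in a few lines of TypeScript. This is a hypothetical helper, not OpenWire's actual internals; it assumes Anthropic-style parts carry `type: "text"`:

```typescript
// Shapes a caller might send in the "content" field of a message.
type ContentPart = { type: string; text?: string };
type MessageContent = string | ContentPart[] | null | undefined;

// Collapse Anthropic-style part arrays, plain strings, or null
// into a single plain string for the VS Code LM API.
function normalizeContent(content: MessageContent): string {
  if (content == null) return "";
  if (typeof content === "string") return content;
  return content
    .filter((part) => part.type === "text" && typeof part.text === "string")
    .map((part) => part.text ?? "")
    .join("");
}
```

Non-text parts (images, tool results) are simply dropped in this sketch; a fuller implementation would handle them separately.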
## Quick Start

Install from the VS Code Marketplace (or load the `.vsix`). The server starts automatically on `http://127.0.0.1:3030`.

```bash
# List available models
curl http://localhost:3030/v1/models

# Chat completion
curl http://localhost:3030/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4.6",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

# Streaming
curl http://localhost:3030/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4.6",
    "messages": [{"role": "user", "content": "Explain zero-knowledge proofs"}],
    "stream": true
  }'
```
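Streaming responses use OpenAI's SSE chunk format: `data: {json}` lines terminated by `data: [DONE]`. A minimal client-side sketch of collecting the text deltas (a hypothetical helper, assuming the standard `choices[0].delta.content` chunk shape):

```typescript
// Extract assistant text deltas from an OpenAI-style SSE response body.
function collectDeltas(sseBody: string): string {
  let out = "";
  for (const line of sseBody.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const chunk = JSON.parse(payload);
    out += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return out;
}
```

In practice you would feed this incrementally from a `fetch` body reader rather than a complete string, buffering partial lines between reads.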
## Endpoints

| Method | Path | Description |
|---|---|---|
| GET | `/health` | Health check |
| GET | `/v1/models` | List available models |
| GET | `/v1/models/:id` | Get specific model |
| POST | `/v1/chat/completions` | Chat completion (streaming + non-streaming) |
| POST | `/v1/completions` | Legacy completions (mapped to chat) |
## Configuration

All settings live under `openWire.server.*` in VS Code:

| Setting | Default | Description |
|---|---|---|
| `autoStart` | `true` | Start server when VS Code launches |
| `host` | `127.0.0.1` | Bind address |
| `port` | `3030` | Port number |
| `apiKey` | `""` | Bearer token for authentication (disabled when empty) |
| `defaultModel` | `""` | Fallback model when none is specified |
| `defaultSystemPrompt` | `""` | System prompt injected when the request has none |
| `maxConcurrentRequests` | `4` | Concurrent request limit |
| `rateLimitPerMinute` | `60` | Requests allowed per minute |
| `requestTimeoutSeconds` | `300` | Request timeout in seconds |
| `enableLogging` | `false` | Verbose logging |
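As an example, requiring a Bearer token and raising the rate limit would look like this in `settings.json` (illustrative values; the keys come from the table above):

```json
{
  "openWire.server.apiKey": "my-secret-token",
  "openWire.server.rateLimitPerMinute": 120,
  "openWire.server.enableLogging": true
}
```

With `apiKey` set, requests must send an `Authorization: Bearer my-secret-token` header.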
## Commands
- OpenWire: Start Server
- OpenWire: Stop Server
- OpenWire: Restart Server
- OpenWire: Toggle Server
## Using with OpenClaw

OpenWire can serve as a model provider for OpenClaw agents. Register OpenWire as a custom provider called `copilot-proxy` in your `~/.openclaw/openclaw.json`:

```jsonc
{
  "models": {
    "providers": {
      "copilot-proxy": {
        "baseUrl": "http://localhost:3030/v1",
        "apiKey": "n/a",
        "api": "openai-completions",
        "authHeader": false,
        "models": [
          {
            "id": "claude-sonnet-4.6",
            "name": "Claude Sonnet 4.6",
            "contextWindow": 128000,
            "maxTokens": 8192
          }
          // add any other models from /v1/models
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "copilot-proxy/claude-sonnet-4.6"
      }
    }
  },
  "plugins": {
    "entries": {
      "copilot-proxy": { "enabled": true }
    }
  }
}
```
Set `authHeader: false` since OpenWire handles authentication through VS Code's Copilot session — no API keys are needed. Run `curl http://localhost:3030/v1/models` to see all available model IDs.
## Architecture

```
src/
  extension.ts      — activation, commands, status bar
  models/
    discovery.ts    — model discovery, caching, dedup
  routes/
    chat.ts         — chat completions + tool forwarding
  server/
    config.ts       — settings loader
    gateway.ts      — HTTP server, routing, middleware
  ui/
    sidebar.ts      — webview sidebar panel
  types/
    vscode-lm.d.ts  — type augmentations
```

Lightweight · zero runtime dependencies.
## License

MIT