Copilot Adapter Kit


salilvnair

Plugin-based provider mesh for Copilot Chat. BYOK, zero config.

CAK
Copilot Adapter Kit

Plugin · Mesh · Deploy
Any model. Every provider. One picker. Zero compromise.

MIT VS Code Node


Copilot Adapter Kit brings any OpenAI‑compatible model into GitHub Copilot Chat — OpenAI, Ollama, LM Studio, vLLM, Groq, Fireworks, Together AI, or your own self‑hosted endpoint. All at once, side‑by‑side in the model picker.

Agent mode. Tool calling. Streaming. Vision. Thinking blocks. Built‑in 429 retry, error mapping, diagnostics, and request dumps. Zero code changes when adding new providers — just JSON config.

One extension. Every backend. Zero friction.


⚡ 30‑Second Setup

// settings.json
{
  "copilot-adapter-kit.providers": {
    "openai": { "baseUrl": "https://api.openai.com/v1" }
  }
}

1. Cmd+Shift+P → "Copilot Adapter Kit: Set API Key" → pick "openai" → paste sk-...
2. Click $(cak-icon) CAK in the status bar → Add Provider (family: openai, name: My OpenAI, URL) → Add Model (id, name, family: openai)
3. Cmd+Shift+I → Copilot Chat → pick your model from the dropdown
4. Chat. Done.

📋 Table of Contents

  • Providers & Recipes
    • OpenAI
    • Ollama
    • LM Studio
    • Any OpenAI‑Compatible Provider
    • Multiple Providers at Once
  • Manage Models
  • Architecture
  • Settings Reference
  • Commands Reference
  • Interceptors (Built‑in Middleware)
  • Developer Guide
    • Project Structure
    • Adding a New Engine
    • Adding a New Interceptor
  • Troubleshooting
  • Contributing
  • License

🌐 Providers & Recipes

Copilot Adapter Kit uses a family‑based routing model. Each provider family gets its own endpoint, API key, and optional model aliases. The bridge resolves model → family → provider config + key → engine automatically.
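
The resolution chain described above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the type names (ProviderConfig, ModelDef, Route) and the function resolveRoute are hypothetical, and the key store is modeled as a plain object instead of the OS keychain.

```typescript
// Hypothetical shapes for illustration; the extension's real types may differ.
interface ProviderConfig {
  baseUrl: string;
  modelAlias?: Record<string, string>;
}

interface ModelDef {
  id: string;
  family: string;
}

interface Route {
  family: string;
  baseUrl: string;
  apiModel: string; // name actually sent to the provider
  apiKey: string;
}

// model -> family -> provider config + key, with the alias applied last.
function resolveRoute(
  modelId: string,
  models: ModelDef[],
  providers: Record<string, ProviderConfig>,
  keys: Record<string, string>,
): Route {
  const model = models.find((m) => m.id === modelId);
  if (!model) throw new Error(`Unknown model: ${modelId}`);

  const provider = providers[model.family];
  if (!provider) throw new Error(`No baseUrl configured for provider: ${model.family}`);

  const apiKey = keys[model.family];
  if (apiKey === undefined) throw new Error(`No API key configured for: ${model.family}`);

  return {
    family: model.family,
    baseUrl: provider.baseUrl,
    apiModel: provider.modelAlias?.[model.id] ?? model.id,
    apiKey,
  };
}

const route = resolveRoute(
  "llama3-8b",
  [{ id: "llama3-8b", family: "ollama" }],
  {
    ollama: {
      baseUrl: "http://localhost:11434/v1",
      modelAlias: { "llama3-8b": "llama3.1:8b-instruct-q8_0" },
    },
  },
  { ollama: "ollama" },
);
console.log(route.apiModel); // llama3.1:8b-instruct-q8_0
```

The alias lookup falls back to the model ID itself, which matches the recipes below where modelAlias is optional.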

OpenAI

The default family. Uses the OpenAI Chat Completions API over SSE streaming.

// settings.json
{
  "copilot-adapter-kit.providers": {
    "openai": {
      "name": "My OpenAI",
      "baseUrl": "https://api.openai.com/v1"
    }
  }
}

Cmd+Shift+P → "Copilot Adapter Kit: Set API Key" → pick "openai" → paste sk-...

What you get:

  • Add your own models via the Panel UI or copilot-adapter-kit.models setting
  • Vision (paste screenshots into chat)
  • Tool calling (Agent mode)
  • 128 tool limit, up to 1M context
  • All models are user‑defined — no built‑ins, full control

Ollama

Run models locally via Ollama's OpenAI‑compatible endpoint. No real API key is needed (a placeholder value is enough), and no data leaves your machine.

Step 1: Install & start Ollama

# macOS
brew install ollama
ollama serve

# Pull a model
ollama pull llama3.1:8b-instruct-q8_0
ollama pull qwen2.5-coder:14b

Step 2: Configure the provider

{
  "copilot-adapter-kit.providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "modelAlias": {
        "llama3-8b": "llama3.1:8b-instruct-q8_0",
        "qwen-coder": "qwen2.5-coder:14b"
      }
    }
  }
}

Step 3: Register models in the picker

{
  "copilot-adapter-kit.models": [
    {
      "id": "llama3-8b",
      "name": "Llama 3.1 8B",
      "family": "ollama",
      "detail": "Local Llama 3.1 8B Instruct Q8_0",
      "maxIn": 128000,
      "maxOut": 8192,
      "toolCalling": 64
    },
    {
      "id": "qwen-coder",
      "name": "Qwen 2.5 Coder 14B",
      "family": "ollama",
      "detail": "Local Qwen 2.5 Coder 14B",
      "maxIn": 32768,
      "maxOut": 8192,
      "toolCalling": 64,
      "image": false
    }
  ]
}

Step 4: Set a dummy API key (the extension expects a key for every family; Ollama itself ignores the value)

Cmd+Shift+P → "Copilot Adapter Kit: Set API Key" → pick "ollama" → type "ollama"

Step 5: Select your local model from the Copilot Chat picker and start chatting.

💡 Tip for large models: Set "maxOut": 8192 or lower to avoid timeouts. Use "toolCalling": 64 for smaller models that struggle with many tools.


LM Studio

LM Studio exposes an OpenAI‑compatible server on port 1234. Same pattern as Ollama.

Step 1: Install LM Studio, download a model, and start the local server.

  • Go to the Developer tab (or Local Server tab)
  • Select your model
  • Click Start Server (default: http://localhost:1234)

Step 2: Configure

{
  "copilot-adapter-kit.providers": {
    "lmstudio": {
      "baseUrl": "http://localhost:1234/v1"
    }
  },
  "copilot-adapter-kit.models": [
    {
      "id": "qwen-coder",
      "name": "Qwen 2.5 Coder",
      "family": "lmstudio",
      "detail": "Local via LM Studio",
      "maxIn": 32768,
      "maxOut": 8192,
      "toolCalling": 64,
      "image": false
    }
  ]
}

Step 3: Set a dummy key

Cmd+Shift+P → "Copilot Adapter Kit: Set API Key" → pick "lmstudio" → type "lmstudio"

Step 4: Pick your model in Copilot Chat and go.

⚠️ LM Studio note: The model ID you configure ("qwen-coder") must match the model loaded in LM Studio's server. Use "modelAlias" if the LM Studio model name differs from what you want in the picker.


Any OpenAI‑Compatible Provider

The openai engine speaks the standard OpenAI Chat Completions API. Any provider that implements /v1/chat/completions with SSE streaming works out of the box:

| Provider | Base URL |
| --- | --- |
| Groq | https://api.groq.com/openai/v1 |
| Fireworks | https://api.fireworks.ai/inference/v1 |
| Together AI | https://api.together.xyz/v1 |
| DeepSeek | https://api.deepseek.com/v1 |
| vLLM (self‑hosted) | http://your-server:8000/v1 |
| OpenRouter | https://openrouter.ai/api/v1 |
| Mistral | https://api.mistral.ai/v1 |
| xAI Grok | https://api.x.ai/v1 |
| Your own proxy | https://your-proxy.example.com/v1 |
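
For readers checking whether a provider is compatible, the sketch below shows what "SSE streaming" means in practice for the Chat Completions format: each event is a `data:` line carrying a JSON chunk, and the stream ends with `data: [DONE]`. The function name parseSseText is hypothetical and this is not the extension's parser.

```typescript
// Extracts text deltas from raw SSE text in the OpenAI streaming format.
// Returns the tokens found and whether the [DONE] sentinel was seen.
function parseSseText(raw: string): { tokens: string[]; done: boolean } {
  const tokens: string[] = [];
  let done = false;

  for (const line of raw.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank lines and SSE comments
    const data = line.slice("data:".length).trim();
    if (data === "[DONE]") {
      done = true;
      break;
    }
    const chunk = JSON.parse(data) as {
      choices?: { delta?: { content?: string | null } }[];
    };
    const text = chunk.choices?.[0]?.delta?.content;
    if (text) tokens.push(text);
  }
  return { tokens, done };
}

const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

const parsed = parseSseText(sample);
console.log(parsed.tokens.join("")); // Hello
```

A real client also has to buffer partial lines, because a network chunk can end in the middle of a `data:` line. That step is left out here to keep the sketch short.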

Generic recipe:

{
  "copilot-adapter-kit.providers": {
    "groq": {
      "baseUrl": "https://api.groq.com/openai/v1",
      "modelAlias": { "llama3-70b": "llama-3.1-70b-versatile" }
    }
  },
  "copilot-adapter-kit.models": [
    {
      "id": "llama3-70b",
      "name": "Llama 3.1 70B (Groq)",
      "family": "groq",
      "detail": "Groq LPU — blazing fast",
      "maxIn": 128000,
      "maxOut": 8192,
      "toolCalling": 128
    }
  ]
}

Cmd+Shift+P → "Copilot Adapter Kit: Set API Key" → pick "groq" → paste your Groq API key

Multiple Providers at Once

All providers can coexist. Each model's "family" field determines which engine and API key are used:

{
  "copilot-adapter-kit.providers": {
    "openai":   { "baseUrl": "https://api.openai.com/v1" },
    "ollama":   { "baseUrl": "http://localhost:11434/v1" },
    "lmstudio": { "baseUrl": "http://localhost:1234/v1" },
    "groq":     { "baseUrl": "https://api.groq.com/openai/v1" },
    "deepseek": { "baseUrl": "https://api.deepseek.com/v1" }
  },
  "copilot-adapter-kit.models": [
    { "id": "gpt-5.5",       "name": "GPT-5.5",          "family": "openai" },
    { "id": "llama3-8b",     "name": "Llama 3.1 8B",     "family": "ollama" },
    { "id": "qwen-coder",    "name": "Qwen 2.5 Coder",   "family": "lmstudio" },
    { "id": "llama3-70b",    "name": "Llama 3.1 70B",    "family": "groq" },
    { "id": "deepseek-chat", "name": "DeepSeek V4",      "family": "deepseek" }
  ]
}

Each family gets its own API key. Run Set API Key once per provider.


🧩 Manage Models

There are no built‑in models. All models are user‑defined — add them via the Panel UI or JSON.

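The Model ID is the name sent to the API (e.g. gpt-5.2). If the provider defines a modelAlias for that ID (e.g. llama3-70b → llama-3.1-70b-versatile), the aliased name is sent instead. The Name is the display label in the picker.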

| Field | Required | Description |
| --- | --- | --- |
| id | ✅ | Exact API model name sent to the provider |
| family | ✅ | Provider family (openai, ollama, groq, etc.) |
| name | — | Display name in the picker. Defaults to id. |
| maxIn | — | Max input tokens. Default 128000. |
| maxOut | — | Max output tokens. Default 16384. |
| image | — | Vision/image support. Default true. |
| thinking | — | Reasoning token support. Default false. |
| toolCalling | — | Max parallel tool calls. Default 128. |
| apiPath | — | Per‑model API path override (e.g. /responses). Falls back to provider default. |
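
To make the defaults in the table concrete, here is a small sketch of how an entry with only the required fields might be expanded. The names ModelInput, ResolvedModel, and withDefaults are hypothetical; only the field names and default values come from the table.

```typescript
interface ModelInput {
  id: string;
  family: string;
  name?: string;
  maxIn?: number;
  maxOut?: number;
  image?: boolean;
  thinking?: boolean;
  toolCalling?: number;
  apiPath?: string;
}

type ResolvedModel = Required<Omit<ModelInput, "apiPath">> & { apiPath?: string };

// Fills in the documented defaults for every optional field.
function withDefaults(m: ModelInput): ResolvedModel {
  return {
    id: m.id,
    family: m.family,
    name: m.name ?? m.id,
    maxIn: m.maxIn ?? 128000,
    maxOut: m.maxOut ?? 16384,
    image: m.image ?? true,
    thinking: m.thinking ?? false,
    toolCalling: m.toolCalling ?? 128,
    apiPath: m.apiPath, // stays undefined so the provider default applies
  };
}

const resolved = withDefaults({ id: "deepseek-chat", family: "deepseek" });
console.log(resolved.name, resolved.maxIn, resolved.maxOut); // deepseek-chat 128000 16384
```

Note that image defaults to true, so text‑only local models should set "image": false explicitly, as the Ollama and LM Studio recipes do.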

🧱 Architecture

┌──────────────────────────────────────────┐
│          Copilot Chat (VS Code)            │
└────────────────┬─────────────────────────┘
                 │ LanguageModelChatProvider
┌────────────────▼─────────────────────────┐
│  conduit/copilot-bridge.ts                 │
│  model → family → provider config + key   │
│  Tool stabilization (opt‑in)              │
└────────────────┬─────────────────────────┘
                 │ Engine SPI
┌────────────────▼─────────────────────────┐
│  mesh/pipeline.ts  (AOP Chain)             │
│  ┌───────────────┐ ┌──────────────┐       │
│  │RateLimitGuard │→│ ErrorWarden  │→      │
│  │ 429 retry×3   │ │ HTTP+net map │       │
│  └───────────────┘ └──────────────┘       │
│  ┌───────────────┐                        │
│  │  DiagTracer   │  fingerprint·dump      │
│  └───────────────┘                        │
└────────────────┬─────────────────────────┘
                 │
┌────────────────▼─────────────────────────┐
│  mesh/discovery.ts  (Provider Registry)    │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐  │
│  │ openai   │ │ ollama   │ │ lmstudio │  │
│  │ SSE      │ │ SSE      │ │ SSE      │  │
│  └──────────┘ └──────────┘ └──────────┘  │
└──────────────────────────────────────────┘

Design Patterns

| Pattern | Implementation |
| --- | --- |
| SPI (Service Provider Interface) | mesh/contract.ts — Engine interface. Every backend implements it. |
| IoC (Inversion of Control) | kernel/context.ts — single bootstrapper wires all services. |
| AOP (Aspect‑Oriented) | mesh/pipeline.ts — interceptor chain wraps every engine call. |
| Factory | mesh/discovery.ts — register engines by family, lookup at runtime. |
| Strategy | Per‑family ProviderConfig with optional modelAlias translations. |

⚙️ Settings Reference

All settings are under the copilot-adapter-kit prefix.

copilot-adapter-kit.providers

{
  "copilot-adapter-kit.providers": {
    "openai": {
      "baseUrl": "https://api.openai.com/v1",   // Required: API endpoint
      "modelAlias": {                            // Optional: picker-id → API model name
        "gpt-4o": "gpt-4o-2024-08-06"
      }
    }
  }
}

Each key under providers is a family name. It must match the family field in your model definitions and the engine registered in ProviderDiscovery.

  • baseUrl — The API endpoint (e.g. https://api.openai.com/v1).
  • name — Friendly display name (shown as label with family as colored chip).
  • defaultApiPath — Default API path. Default: /chat/completions.
  • modelApiPaths — Per‑model path overrides (e.g. {"codex-5.3": "/responses"}).
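
One plausible way the path settings combine into a request URL is sketched below. The precedence shown (model‑level apiPath, then modelApiPaths, then defaultApiPath, then /chat/completions) is an assumption based on the "falls back to provider default" wording, and resolveEndpoint is a hypothetical helper, not a function from the extension.

```typescript
interface ProviderPaths {
  baseUrl: string;
  defaultApiPath?: string;
  modelApiPaths?: Record<string, string>;
}

// Assumed precedence: per-model apiPath > modelApiPaths > defaultApiPath > built-in default.
function resolveEndpoint(
  provider: ProviderPaths,
  modelId: string,
  modelApiPath?: string,
): string {
  const path =
    modelApiPath ??
    provider.modelApiPaths?.[modelId] ??
    provider.defaultApiPath ??
    "/chat/completions";
  // Strip trailing slashes from baseUrl to avoid a double slash.
  return provider.baseUrl.replace(/\/+$/, "") + path;
}

const openai: ProviderPaths = {
  baseUrl: "https://api.openai.com/v1",
  modelApiPaths: { "codex-5.3": "/responses" },
};

console.log(resolveEndpoint(openai, "gpt-4o"));    // https://api.openai.com/v1/chat/completions
console.log(resolveEndpoint(openai, "codex-5.3")); // https://api.openai.com/v1/responses
```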

copilot-adapter-kit.models

Array of custom model definitions. See Manage Models for the full schema.

copilot-adapter-kit.maxTokens

{ "copilot-adapter-kit.maxTokens": 4096 }

Maximum output tokens sent to the provider. 0 (default) means no limit — the provider's default applies.
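
As a rough illustration of the "0 means no limit" rule, a request builder could simply leave the limit out of the body when the setting is 0. The function buildBody and the use of the max_tokens field are assumptions for this sketch; the extension may use a different field or combine the setting with a model's maxOut.

```typescript
// Builds a minimal Chat Completions body; the limit is omitted when maxTokens is 0.
function buildBody(model: string, maxTokens: number): Record<string, unknown> {
  const body: Record<string, unknown> = { model, stream: true };
  if (maxTokens > 0) {
    body["max_tokens"] = maxTokens; // assumed field name
  }
  return body;
}

console.log(JSON.stringify(buildBody("gpt-4o", 0)));    // {"model":"gpt-4o","stream":true}
console.log(JSON.stringify(buildBody("gpt-4o", 4096))); // {"model":"gpt-4o","stream":true,"max_tokens":4096}
```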

copilot-adapter-kit.logLevel

{ "copilot-adapter-kit.logLevel": "quiet" }   // Default: no output channel
{ "copilot-adapter-kit.logLevel": "meta"  }   // Log request fingerprints & diffs
{ "copilot-adapter-kit.logLevel": "dump"  }   // meta + write request payloads to disk

View logs: Cmd+Shift+P → "Copilot Adapter Kit: Show Logs".
View dumps: Cmd+Shift+P → "Copilot Adapter Kit: Open Dumps Folder".

copilot-adapter-kit.stabilizeTools

{ "copilot-adapter-kit.stabilizeTools": true }

⚠️ Experimental. Pre‑activates VS Code tool activators to lock the tools array across conversation turns. Helps maintain cache prefix stability. If you see "tool list is unstable" warnings, enable this.

copilot-adapter-kit.hiddenCustomModels

Managed automatically by the Panel UI when you hide custom models. No JSON editing needed.


⌨️ Commands Reference

| Command | ID | Description |
| --- | --- | --- |
| Open Panel | copilot-adapter-kit.openPanel | 🎨 Form UI — providers, models, keys, config, danger zone |
| Configure | copilot-adapter-kit.configure | QuickPick wizard — max tokens, log level, etc. |
| Add Provider | copilot-adapter-kit.addProvider | Step‑by‑step form — family, friendly name, base URL |
| Remove Provider | copilot-adapter-kit.removeProvider | Cascade deletes provider + all its models + keys |
| Add Model | copilot-adapter-kit.addModel | 8‑step form with dropdowns — id, name, family, context, vision, thinking, tools |
| Remove Model | copilot-adapter-kit.removeModel | Pick a model to remove |
| Set API Key | copilot-adapter-kit.setApiKey | Store per‑provider API key in OS keychain |
| Clear API Key | copilot-adapter-kit.clearApiKey | Remove a provider's API key |
| Open Settings | copilot-adapter-kit.openSettings | Jump to raw JSON settings editor |
| Show Logs | copilot-adapter-kit.showLogs | Open the "Copilot Adapter Kit" output channel |
| Open Dumps Folder | copilot-adapter-kit.openDumps | Reveal the request dump directory in Finder |

Click $(cak-icon) CAK in the status bar to open the panel. All commands are also available under Copilot Adapter Kit: in the Command Palette (Cmd+Shift+P).


🛡️ Interceptors

Every request passes through a chain of interceptors — middleware that can inspect, modify, or short‑circuit the stream. Think of it as Express‑style middleware for LLM calls.

RateLimitGuard — 429 Auto‑Retry

  • Catches HTTP 429 responses from the provider
  • Parses "try again in Xs" from the response body
  • Shows a thinking block with wait time: "Rate limited. Retrying in 10s... (1/3)"
  • Retries up to 3 times with the full request
  • If the Retry-After header or body is missing, defaults to 10s
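
The behaviour listed above can be approximated as follows. This is a sketch under stated assumptions, not the RateLimitGuard source: the names retryDelaySeconds, withRetry, and Attempt are hypothetical, and preferring the Retry-After header over the body hint is an assumption.

```typescript
const DEFAULT_WAIT_SECONDS = 10;
const MAX_RETRIES = 3;

// Picks a wait time: Retry-After header (seconds), then "try again in Xs" in the body, then 10s.
function retryDelaySeconds(retryAfter: string | null, body: string): number {
  if (retryAfter !== null && retryAfter.trim() !== "") {
    const fromHeader = Number(retryAfter);
    if (Number.isFinite(fromHeader) && fromHeader >= 0) return fromHeader;
  }
  const match = /try again in (\d+(?:\.\d+)?)\s*s/i.exec(body);
  return match ? Number(match[1]) : DEFAULT_WAIT_SECONDS;
}

interface Attempt {
  status: number;
  retryAfter: string | null;
  body: string;
}

// Re-sends the full request while the provider answers 429, up to MAX_RETRIES times.
async function withRetry(
  send: () => Promise<Attempt>,
  sleep: (ms: number) => Promise<void>,
  notify: (message: string) => void,
): Promise<Attempt> {
  let res = await send();
  for (let n = 1; n <= MAX_RETRIES && res.status === 429; n++) {
    const wait = retryDelaySeconds(res.retryAfter, res.body);
    notify(`Rate limited. Retrying in ${wait}s... (${n}/${MAX_RETRIES})`);
    await sleep(wait * 1000);
    res = await send();
  }
  return res;
}

console.log(retryDelaySeconds(null, "Rate limit reached. Please try again in 7s.")); // 7
console.log(retryDelaySeconds(null, "")); // 10
```

Passing sleep and notify as parameters keeps the loop testable without real timers. In the extension, the notification corresponds to the thinking block shown in chat.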

ErrorWarden — Friendly Error Messages

Maps raw error codes to actionable user messages:

| Error | Message |
| --- | --- |
| 401 | "Invalid API key. Run Set API Key." |
| 402 | "Insufficient balance. Top up your account." |
| 500/502/503 | "Provider server error. Retry shortly." |
| ENOTFOUND | "DNS lookup failed. Check network/firewall." |
| ECONNREFUSED | "Connection refused. Verify baseUrl and service status." |
| ETIMEDOUT | "Connection timed out. Service may be overloaded." |
| CERT_HAS_EXPIRED | "TLS verification failed. Check certificate." |
| ECONNRESET | "Connection interrupted. Check network stability." |

All errors include a collapsible <details> block with raw response text for debugging.
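
A mapping like the one in the table could be expressed as two lookup tables plus a formatter, as in the sketch below. The function friendlyError, the fallback headline for unlisted codes, and the summary label are hypothetical; only the listed messages come from the table.

```typescript
const HTTP_MESSAGES: Record<number, string> = {
  401: "Invalid API key. Run Set API Key.",
  402: "Insufficient balance. Top up your account.",
  500: "Provider server error. Retry shortly.",
  502: "Provider server error. Retry shortly.",
  503: "Provider server error. Retry shortly.",
};

const NETWORK_MESSAGES: Record<string, string> = {
  ENOTFOUND: "DNS lookup failed. Check network/firewall.",
  ECONNREFUSED: "Connection refused. Verify baseUrl and service status.",
  ETIMEDOUT: "Connection timed out. Service may be overloaded.",
  CERT_HAS_EXPIRED: "TLS verification failed. Check certificate.",
  ECONNRESET: "Connection interrupted. Check network stability.",
};

// Maps an HTTP status (number) or a network error code (string) to chat-ready text.
function friendlyError(codeOrStatus: number | string, raw: string): string {
  const known =
    typeof codeOrStatus === "number"
      ? HTTP_MESSAGES[codeOrStatus]
      : NETWORK_MESSAGES[codeOrStatus];
  const headline = known ?? `Unexpected error (${codeOrStatus}).`; // hypothetical fallback
  return `${headline}\n\n<details><summary>Raw response</summary>\n\n${raw}\n\n</details>`;
}

console.log(friendlyError("ECONNREFUSED", "connect ECONNREFUSED 127.0.0.1:11434"));
```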

DiagTracer — Request Diagnostics

At logLevel: meta:

  • Logs each request: model, message count, tool count
  • Computes a fingerprint (message structure hash) and diffs against the previous request
  • Detects shifts in system prompt windows, user messages, and tool configuration
  • Calibrates token estimation from real usage data
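
The fingerprint‑and‑diff idea can be illustrated with a small dependency‑free sketch. Everything here is an assumption made for illustration: the three fingerprint parts, the FNV‑1a style hash (used as a stand‑in for whatever hash the extension uses), and the names Msg, fingerprint, and diffFingerprints.

```typescript
interface Msg {
  role: string;
  content: string;
}

// 32-bit FNV-1a style hash over UTF-16 code units; a stand-in, not a cryptographic hash.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

interface Fingerprint {
  system: string;
  messages: string;
  tools: string;
}

// Hashes the system prompt, the message structure (role + length), and the tool list separately.
function fingerprint(messages: Msg[], toolNames: string[]): Fingerprint {
  const system = messages
    .filter((m) => m.role === "system")
    .map((m) => m.content)
    .join("\n");
  const structure = messages
    .filter((m) => m.role !== "system")
    .map((m) => `${m.role}:${m.content.length}`)
    .join("|");
  return {
    system: fnv1a(system),
    messages: fnv1a(structure),
    tools: fnv1a(toolNames.join(",")),
  };
}

// Returns the names of the parts that changed between two requests.
function diffFingerprints(prev: Fingerprint, next: Fingerprint): string[] {
  return (Object.keys(next) as (keyof Fingerprint)[]).filter((k) => prev[k] !== next[k]);
}

const turn1 = fingerprint(
  [{ role: "system", content: "You are helpful." }, { role: "user", content: "Hi" }],
  ["read_file", "run_tests"],
);
const turn2 = fingerprint(
  [{ role: "system", content: "You are helpful." }, { role: "user", content: "Hi" }],
  ["read_file"],
);
console.log(diffFingerprints(turn1, turn2));
```

Hashing the parts separately is what lets a diff say which part shifted between turns, for example a changed tool list with an unchanged system prompt.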

At logLevel: dump:

  • Writes the full request payload (JSON) to $TMPDIR/copilot-adapter-kit-dumps/
  • Writes the system prompt separately as .sys.txt for easy inspection

👨‍💻 Developer Guide

nvm use 22             # Requires Node ≥22
npm install            # Install dependencies
npm run watch          # Compile in watch mode

Project Structure

copilot-adapter-kit/
├── src/
│   ├── entry.ts                   # VS Code activate/deactivate
│   ├── kernel/                    # IoC container & configuration
│   │   ├── context.ts             # ApplicationContext — boots all services
│   │   ├── vault.ts               # Per-family key storage (OS keychain)
│   │   └── tuning.ts              # Settings facade
│   ├── mesh/                      # Engine SPI & pipeline
│   │   ├── contract.ts            # Engine, Payload, StreamEvents interfaces
│   │   ├── discovery.ts           # Provider registry (register engines here)
│   │   ├── pipeline.ts            # AOP interceptor chain
│   │   └── engines/
│   │       └── openai/
│   │           ├── openai-engine.ts   # OpenAI SSE stream implementation
│   │           └── wire-format.ts     # VS Code messages → OpenAI JSON
│   ├── conduit/                   # VS Code API integration
│   │   ├── copilot-bridge.ts      # LanguageModelChatProvider impl
│   │   └── model-catalog.ts       # Model registry + user model loader
│   ├── crosscut/                  # Interceptors (cross-cutting concerns)
│   │   ├── rate-limit-guard.ts    # 429 retry with thinking block
│   │   ├── error-warden.ts        # HTTP + network error → friendly text
│   │   ├── diag-tracer.ts         # Request logging, fingerprinting, dumps
│   │   ├── insight-engine.ts      # Request fingerprint hashing & diff
│   │   └── tool-stabilizer.ts     # Tool pre-activation stabilizer
│   └── tooling/                   # Utility classes
│       └── token-math.ts          # Approximate token counting
├── package.json
├── tsconfig.json
└── README.md

Adding a New Engine

The Engine SPI is the only contract you need to implement. Here's how to add a new provider (e.g., Anthropic):

1. Implement the Engine interface

// src/mesh/engines/anthropic/anthropic-engine.ts
import { Engine, Payload, StreamEvents } from '../../contract';

export class AnthropicEngine implements Engine {
  readonly family = 'anthropic';
  private baseUrl = '';
  private apiKey = '';

  configure(endpoint: string, key: string): void {
    this.baseUrl = endpoint;
    this.apiKey = key;
  }

  async stream(req: Payload, sink: StreamEvents, signal?: AbortSignal): Promise<void> {
    // Translate Payload → Anthropic Messages API format
    // Call fetch(), handle streaming, emit onToken/onToolSignal/onComplete
    // On error: sink.onFault(error)
  }
}

2. Register the engine

// src/mesh/discovery.ts
import { AnthropicEngine } from './engines/anthropic/anthropic-engine';

export class ProviderDiscovery {
  constructor() {
    this.register(new OpenAIEngine());
    this.register(new AnthropicEngine());   // ← Add here
  }
}

3. Add built‑in models (optional)

// src/conduit/model-catalog.ts
export const BUILTIN_CATALOG: ModelMeta[] = [
  // ...existing...
  { id: 'claude-opus', name: 'Claude Opus 4', family: 'anthropic',
    version: 'claude-opus-4', detail: 'Most capable Anthropic model',
    maxIn: 200000, maxOut: 16384, image: true, thinking: true, toolCalling: 128 },
];

4. Users configure it

{
  "copilot-adapter-kit.providers": {
    "anthropic": { "baseUrl": "https://api.anthropic.com" }
  }
}

That's it. The pipeline, interceptors, key management, and model picker all work automatically for the new family.

Adding a New Interceptor

Implement the Interceptor interface and register it in Context.bootstrap():

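// src/crosscut/my-interceptor.ts
import type { Engine, Payload, StreamEvents } from '../mesh/contract';
import type { Interceptor } from '../mesh/pipeline';

export class MyInterceptor implements Interceptor {
  // Parameter types are assumed from the Engine SPI; adjust them to match the Interceptor interface.
  async intercept(
    payload: Payload,
    engine: Engine,
    sink: StreamEvents,
    signal: AbortSignal | undefined,
    next: () => Promise<void>,
  ): Promise<void> {
    // BEFORE: inspect/modify payload or sink
    console.log('request:', payload.model);

    await next();  // Call the next interceptor (or the engine)

    // AFTER: the stream has completed
  }
}

// src/kernel/context.ts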
import { MyInterceptor } from '../crosscut/my-interceptor';

static async bootstrap(ext: vscode.ExtensionContext): Promise<Context> {
  // ...
  ctx.pipeline.use(ctx.rateLimitGuard);
  ctx.pipeline.use(ctx.errorWarden);
  ctx.pipeline.use(new MyInterceptor());   // ← Add here
  ctx.pipeline.use(ctx.tracer);
  // ...
}

Interceptors run in registration order. RateLimitGuard and ErrorWarden wrap sink.onFault to intercept errors — DiagTracer wraps lifecycle events for observability.
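
The ordering rule can be seen in a stripped‑down chain like the one below. This is a generic middleware sketch with hypothetical names (MiniPipeline, Step), not the code in mesh/pipeline.ts, and it passes a simple trace array where the real chain passes the payload, engine, sink, and signal.

```typescript
type Next = () => Promise<void>;
type Step = (trace: string[], next: Next) => Promise<void>;

class MiniPipeline {
  private readonly steps: Step[] = [];

  use(step: Step): this {
    this.steps.push(step);
    return this;
  }

  // Runs steps in registration order; `terminal` stands in for the engine call.
  async run(trace: string[], terminal: Next): Promise<void> {
    const dispatch = (i: number): Promise<void> => {
      const step = this.steps[i];
      return step ? step(trace, () => dispatch(i + 1)) : terminal();
    };
    await dispatch(0);
  }
}

const trace: string[] = [];
const pipeline = new MiniPipeline()
  .use(async (t, next) => {
    t.push("guard:before");
    await next();
    t.push("guard:after");
  })
  .use(async (t, next) => {
    t.push("warden:before");
    await next();
    t.push("warden:after");
  });

const finished = pipeline.run(trace, async () => {
  trace.push("engine");
});

finished.then(() => console.log(trace.join(" > ")));
// guard:before > warden:before > engine > warden:after > guard:after
```

The first registered step is the outermost wrapper: it sees the request first and the completed stream last.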

Key Design Decisions

  • No shared API key. Each family gets its own key in the OS keychain (copilot-adapter-kit.apiKey.{family}). There is no fallback key.
  • No engine discovery from config. Engines are compile‑time registered in ProviderDiscovery. The config only provides endpoint + key. This keeps the SPI surface small and prevents arbitrary code execution.
  • Error reporting is inline, not thrown. The bridge reports errors as LanguageModelTextPart so the user sees a formatted message in chat rather than a red error banner.
  • Thinking blocks use LanguageModelThinkingPart (proposed API) with ID 'cak-thinking' for the glow animation in VS Code Insiders.
  • Every interceptor awaits the full chain. This is critical for 429 retry: if an await is missing, the guard fires after the response has already been returned.
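
The "no shared API key" decision can be sketched as a lookup that builds the per‑family secret name and refuses to fall back. The SecretStore interface below is a hypothetical minimal stand‑in for a keychain‑backed store, and keyName and requireKey are illustrative names; only the key pattern copilot-adapter-kit.apiKey.{family} comes from the list above.

```typescript
// Minimal stand-in for a keychain-backed secret store.
interface SecretStore {
  get(key: string): Promise<string | undefined>;
  store(key: string, value: string): Promise<void>;
}

const keyName = (family: string): string => `copilot-adapter-kit.apiKey.${family}`;

// Looks up the key for exactly one family; there is deliberately no fallback key.
async function requireKey(secrets: SecretStore, family: string): Promise<string> {
  const key = await secrets.get(keyName(family));
  if (!key) {
    throw new Error(`No API key configured for "${family}". Run Set API Key.`);
  }
  return key;
}

console.log(keyName("groq")); // copilot-adapter-kit.apiKey.groq
```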

🔧 Troubleshooting

"No API key configured" warning

Run Cmd+Shift+P → "Copilot Adapter Kit: Set API Key", pick the provider, and paste your key. Keys are stored per‑family — make sure you set the key for the correct provider.

"No baseUrl configured for provider" error

Add the provider to copilot-adapter-kit.providers in your settings.json:

{
  "copilot-adapter-kit.providers": {
    "ollama": { "baseUrl": "http://localhost:11434/v1" }
  }
}

Ollama connection refused

Make sure Ollama is running:

curl http://localhost:11434/v1/models

If not, start it: ollama serve

LM Studio connection refused

Make sure the LM Studio local server is started:

  1. Open LM Studio → Developer tab (or Local Server)
  2. Load your model
  3. Click "Start Server"
  4. Verify: curl http://localhost:1234/v1/models

429 Rate Limit errors

CAK auto‑retries up to 3 times. If you still see 429s:

  • Reduce request rate (fewer parallel chats)
  • Upgrade your provider tier
  • For local models, 429 shouldn't happen — check your proxy configuration

Model not showing in picker

  1. Make sure the model's family matches a key in copilot-adapter-kit.providers
  2. Make sure you've set an API key for that family
  3. Run Cmd+Shift+P → "Developer: Reload Window" to refresh the picker

🤝 Contributing

Issues and PRs are welcome. Before adding a new engine, please read the Adding a New Engine section. The SPI is intentionally small — keep engine implementations self‑contained in src/mesh/engines/{family}/.


License

MIT © salilvnair


📦 Publishing

npm run compile          # Build TypeScript → out/
npm run logo             # Generate icon font from resources/cak-icon-src.svg
npm run package          # Create .vsix file

Quick publish

npm run publish          # Publish to marketplace
npm run publish:patch    # Auto‑bump patch version + publish
npm run publish:minor    # Auto‑bump minor version + publish

Manual upload

  1. Go to marketplace.visualstudio.com/manage
  2. Click New Extension → upload the .vsix

Prerequisites

  • Node ≥22 (use nvm use 22)
  • "publisher": "salilvnair" matches marketplace publisher ID
  • resources/icon.png — extension icon (also used in panel header)
  • resources/cak-icons.woff — custom icon font for status bar $(cak-icon)
  • .vscodeignore excludes src/, node_modules/, etc.

| Provider | Family | baseUrl |
| --- | --- | --- |
| OpenAI | openai | https://api.openai.com/v1 |
| Azure OpenAI | openai | https://{res}.openai.azure.com/openai/deployments/{dep} |
| Ollama | ollama | http://localhost:11434/v1 |
| LM Studio | lmstudio | http://localhost:1234/v1 |
| vLLM | openai | http://localhost:8000/v1 |
| Groq | openai | https://api.groq.com/openai/v1 |
| Together AI | openai | https://api.together.xyz/v1 |
| LiteLLM Proxy | openai | http://localhost:4000/v1 |

Providers with non‑OpenAI APIs (Anthropic, Google) need a custom engine — see the Anthropic example above.


📄 License

MIT · Copyright © 2026 salilvnair
