VS Code AI Normalizer
One local proxy. Many VS Code AI surfaces. Route Copilot Chat, Agent mode, and inline chat through your own OpenAI-compatible endpoints—even when the upstream model speaks inline XML tools instead of JSON tool_calls.
AI Normalizer runs a lightweight local proxy, translates tool formats via pluggable adapters, discovers models from each upstream GET /v1/models, and syncs them into VS Code BYOK chatLanguageModels.json.
Who this is for
- You use a custom or proxied API (Gemini non-customtools, OpenRouter, vLLM, corporate gateway) with VS Code Copilot BYOK.
- Upstream returns XML / text tool calls but VS Code expects OpenAI-style JSON tools.
- You want one config for endpoints, per-model overrides, and automatic model list refresh—without hand-editing every model id.
Requirements
| Requirement | Notes |
| --- | --- |
| VS Code 1.96+ | Language model BYOK APIs |
| Custom Endpoint provider | Language models in VS Code |
| normalizer-proxy binary | Built via pnpm run build:proxy (dev) or bundled with the extension |
| Copilot BYOK policy | Business/Enterprise may need admin opt-in |
How it works
Copilot Chat / Agent
        │
        ▼
http://127.0.0.1:3847/v1/chat/completions  (AI Normalizer proxy)
        │
        ├── adapter: inline-xml-tools     → XML ↔ JSON tool_calls
        └── adapter: openai-pass-through  → unchanged
        │
        ▼
Your upstream API (Gemini proxy, OpenRouter, etc.)
On activate, the extension:
- Discovers models per endpoint (GET /v1/models or custom modelsUrl).
- Caches results in models-cache.json (global storage).
- Merges with aiNormalizer.modelOverrides and optional manual models[].
- Reloads the proxy and syncs chatLanguageModels.json.
Installation
From source (development):
pnpm install
pnpm run build
Press F5 to launch an Extension Development Host (.vscode preLaunch runs pnpm run build, which compiles TypeScript and builds bin/normalizer-proxy). Or package a VSIX with pnpm run build then npx @vscode/vsce package — see docs/PUBLISHING.md.
End users: Install AI Endpoint Normalizer from the Marketplace (enable Pre-release until stable) or install the VSIX from GitHub Releases. Configure endpoints below, run AI Normalizer: Sync Language Models, reload the window if models do not appear. The extension ships bin/normalizer-proxy (platform-specific); override with aiNormalizer.proxyBinaryPath if needed.
Questions & bugs: GitHub Issues
Configuration reference
| Setting | Default | Description |
| --- | --- | --- |
| aiNormalizer.proxyPort | 3847 | Local proxy port |
| aiNormalizer.autoStartProxy | true | Start proxy on activation |
| aiNormalizer.autoSyncOnActivate | true | Sync language models after proxy is up |
| aiNormalizer.proxyBinaryPath | "" | Override path to normalizer-proxy |
| aiNormalizer.profilesPath | "" | External JSON for named tool profiles |
| aiNormalizer.profiles | {} | Inline named profiles |
| aiNormalizer.modelCachePath | "" | Discovered-models cache file (empty = global storage) |
| aiNormalizer.copilotByokSecretId | aiNormalizer | chat.lm.secret.* id in synced chatLanguageModels.json |
| aiNormalizer.modelOverrides | {} | Per-model overrides, keys endpointId/modelId |
| aiNormalizer.endpoints | [] | Upstream endpoints (see below) |
| aiNormalizer.syncTargets | [chatLanguageModels] | Which consumers to update |
| aiNormalizer.inlineCompletion | { enabled: false } | Experimental FIM inline provider |
Per endpoint:
| Field | Required | Description |
| --- | --- | --- |
| id | yes | Stable id for routing and overrides |
| upstreamUrl | yes | Chat completions URL upstream |
| adapter | yes | inline-xml-tools, openai-pass-through, or json-tools-in-text |
| adapterProfile | no | Named profile (e.g. gemini-non-customtools) |
| apiKeySecretId | no | SecretStorage key (default aiNormalizer.endpoint.<id>) |
| discoverModels | no | Discovery options (see auto-discovery example) |
| models | no | Manual pin list; merged with discovered ids |
discoverModels:
| Field | Default | Description |
| --- | --- | --- |
| enabled | true if models[] empty | Fetch upstream model list |
| modelsUrl | derived from upstreamUrl | Override list URL |
| refreshOnActivate | true | Refetch when stale on startup |
| ttlMinutes | 60 | Cache TTL before refetch |
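The ttlMinutes check can be pictured as a simple timestamp comparison. The following is a minimal sketch, not the extension's actual code: isCacheStale is a hypothetical helper, and it assumes the cache stores an ISO 8601 fetchedAt per endpoint (as in the cache shape shown under Model cache file).

```typescript
// Hypothetical helper: decides whether a cached model list should be refetched.
// Assumes fetchedAt is an ISO 8601 timestamp, as in models-cache.json.
function isCacheStale(
  fetchedAt: string,
  ttlMinutes = 60, // matches the documented default for discoverModels.ttlMinutes
  now: Date = new Date(),
): boolean {
  const fetched = Date.parse(fetchedAt);
  // An unparseable timestamp is treated as stale so discovery runs again.
  if (Number.isNaN(fetched)) return true;
  return now.getTime() - fetched >= ttlMinutes * 60_000;
}
```

With refreshOnActivate enabled, a check like this would run once per endpoint on startup; whether the real implementation treats the boundary as inclusive is an assumption here.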
Models URL derivation (override with discoverModels.modelsUrl if needed):
| Upstream upstreamUrl (chat) | Derived modelsUrl |
| --- | --- |
| …/v1/chat/completions | …/v1/models |
| …/v1beta/openai/chat/completions | …/v1beta/openai/models |
| …/v1beta/openai | …/v1beta/openai/models |
| …/openai/chat/completions | …/openai/models |
| …/openai | …/openai/models |
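The derivation rows above all follow one rule: strip a trailing /chat/completions if present, then append /models. A minimal sketch of that rule follows; deriveModelsUrl is a hypothetical name used for illustration, and the extension's real implementation may handle more cases.

```typescript
// Hypothetical helper mirroring the derivation table; not the extension's actual code.
function deriveModelsUrl(upstreamUrl: string): string {
  const url = new URL(upstreamUrl);
  // Drop trailing slashes so ".../openai/" and ".../openai" behave the same.
  let path = url.pathname.replace(/\/+$/, "");
  const chatSuffix = "/chat/completions";
  if (path.endsWith(chatSuffix)) {
    path = path.slice(0, -chatSuffix.length);
  }
  url.pathname = `${path}/models`;
  return url.toString();
}

// Usage, with a placeholder host:
console.log(deriveModelsUrl("https://gateway.example/v1beta/openai/chat/completions"));
```

If a gateway serves its model list somewhere this rule cannot reach, discoverModels.modelsUrl remains the escape hatch.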
Quick start (Gemini OpenAI-compatible)
- Add endpoint settings (no API key in JSON):
{
"aiNormalizer.endpoints": [
{
"id": "gemini",
"displayName": "Gemini",
"upstreamUrl": "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions",
"adapter": "openai-pass-through",
"discoverModels": { "enabled": true }
}
]
}
- Command Palette → AI Normalizer: Set Endpoint API Key → paste your Gemini API key.
- AI Normalizer: Refresh Model Catalog → AI Normalizer: Sync Language Models → reload VS Code window.
- Pick an AI Normalizer model in Chat; if prompted for a BYOK secret, enter local (see API keys below).
Configuration examples
Gemini Google OpenAI-compatible (/v1beta/openai/)
Google's official OpenAI-compatible base path is /v1beta/openai/ (not /v1/). Use the full chat completions URL; the models URL is derived automatically.
{
"aiNormalizer.endpoints": [
{
"id": "gemini",
"upstreamUrl": "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions",
"adapter": "openai-pass-through",
"discoverModels": { "enabled": true }
}
]
}
Run Set Endpoint API Key (stores under aiNormalizer.endpoint.gemini by default). For XML-tool proxies, use inline-xml-tools + gemini-non-customtools profile instead.
Inline XML tools (gemini-non-customtools profile)
Upstream returns <tool_use> blocks; Copilot needs JSON tool_calls.
{
"aiNormalizer.endpoints": [
{
"id": "gemini-locked",
"displayName": "Gemini (normalized)",
"upstreamUrl": "https://your-proxy.example/v1/chat/completions",
"adapter": "inline-xml-tools",
"adapterProfile": "gemini-non-customtools",
"discoverModels": {
"enabled": true,
"ttlMinutes": 120
}
}
]
}
Run AI Normalizer: Set Endpoint API Key after saving settings.
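To illustrate the kind of translation the inline-xml-tools adapter performs, here is a rough sketch that lifts XML tool blocks out of assistant text into OpenAI-style tool_calls. The wire shape it parses (a tool_use element with name and id attributes and a JSON body) is an assumption for illustration only; real upstreams differ, which is why tag names and attributes are configurable through profiles. extractToolCalls is a hypothetical function, not part of the proxy.

```typescript
// OpenAI-style tool call entry as expected by Copilot.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

// Hypothetical sketch: assumes blocks like
//   <tool_use name="read_file" id="c1">{"path":"a.txt"}</tool_use>
// The actual adapter reads its tag and attribute names from the active profile.
function extractToolCalls(text: string): { content: string; toolCalls: ToolCall[] } {
  const pattern = /<tool_use\s+name="([^"]+)"(?:\s+id="([^"]+)")?\s*>([\s\S]*?)<\/tool_use>/g;
  const toolCalls: ToolCall[] = [];
  const content = text
    .replace(pattern, (_match, name: string, id: string | undefined, body: string) => {
      toolCalls.push({
        // Synthesize an id when the upstream block does not carry one.
        id: id ?? `call_${toolCalls.length + 1}`,
        type: "function",
        // OpenAI tool_calls carry arguments as a JSON string, not an object.
        function: { name, arguments: body.trim() },
      });
      return ""; // remove the XML block from the visible text
    })
    .trim();
  return { content, toolCalls };
}
```

The reverse direction (rendering JSON tool results back into XML for the upstream) is not shown.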
OpenAI-compatible pass-through (OpenRouter, vLLM)
No tool translation; discovery still works.
{
"aiNormalizer.endpoints": [
{
"id": "openrouter",
"upstreamUrl": "https://openrouter.ai/api/v1/chat/completions",
"adapter": "openai-pass-through",
"discoverModels": { "enabled": true }
}
]
}
Auto-discovery only (no manual model list)
Leave models omitted or []. Overrides tune discovered ids:
{
"aiNormalizer.endpoints": [
{
"id": "local-llm",
"upstreamUrl": "http://127.0.0.1:1234/v1/chat/completions",
"adapter": "openai-pass-through",
"discoverModels": {
"enabled": true,
"modelsUrl": "http://127.0.0.1:1234/v1/models"
}
}
],
"aiNormalizer.modelOverrides": {
"local-llm/llama3": {
"name": "Llama 3 Local",
"toolCalling": true,
"maxInputTokens": 32768,
"maxOutputTokens": 4096
}
}
}
Per-model overrides (Agent vs chat)
Disable tools for a model that does not support Agent mode:
{
"aiNormalizer.modelOverrides": {
"openrouter/meta-llama/llama-3-70b-instruct": {
"toolCalling": false,
"name": "Llama 3 70B (chat only)"
}
}
}
Keys are always endpointId/modelId where modelId matches the upstream list id.
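Because upstream model ids can themselves contain slashes (as in the OpenRouter example above), a key has to be split on the first slash only. The sketch below shows one way to do that; splitOverrideKey is a hypothetical helper, and it assumes endpoint ids never contain a slash.

```typescript
// Hypothetical helper: splits "endpointId/modelId" on the FIRST slash only,
// so model ids such as "meta-llama/llama-3-70b-instruct" stay intact.
// Assumes endpoint ids themselves contain no "/".
function splitOverrideKey(key: string): { endpointId: string; modelId: string } {
  const slash = key.indexOf("/");
  if (slash <= 0 || slash === key.length - 1) {
    throw new Error(`Expected "endpointId/modelId", got "${key}"`);
  }
  return { endpointId: key.slice(0, slash), modelId: key.slice(slash + 1) };
}

console.log(splitOverrideKey("openrouter/meta-llama/llama-3-70b-instruct"));
```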
Manual pin + discovery
Keep a manual entry to force capabilities for one id; discovery fills the rest:
{
"aiNormalizer.endpoints": [
{
"id": "gemini-locked",
"upstreamUrl": "https://your-proxy.example/v1/chat/completions",
"adapter": "inline-xml-tools",
"adapterProfile": "gemini-non-customtools",
"discoverModels": { "enabled": true },
"models": [
{
"id": "gemini-2.5-pro",
"toolCalling": true,
"maxInputTokens": 1048576,
"maxOutputTokens": 65536
}
]
}
]
}
Merge order: discovered → profile defaults → modelOverrides → manual models[] (manual wins on conflicts).
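The merge order can be read as "later layers overwrite earlier ones". A minimal sketch using object spread, assuming each layer is a partial capability record; mergeModel and ModelCapabilities are illustrative names, not the extension's API.

```typescript
// Illustrative capability record; field names follow the settings examples.
interface ModelCapabilities {
  name?: string;
  toolCalling?: boolean;
  maxInputTokens?: number;
  maxOutputTokens?: number;
}

// Hypothetical sketch of the documented order:
// discovered → profile defaults → modelOverrides → manual models[].
function mergeModel(
  discovered: ModelCapabilities,
  profileDefaults: ModelCapabilities = {},
  override: ModelCapabilities = {},
  manual: ModelCapabilities = {},
): ModelCapabilities {
  // Later spreads win, so the manual entry has the final say on conflicts.
  return { ...discovered, ...profileDefaults, ...override, ...manual };
}

const merged = mergeModel(
  { name: "gemini-2.5-pro" },
  { toolCalling: true, maxInputTokens: 200000 },
  { maxInputTokens: 32768 },
  { toolCalling: false },
);
console.log(merged);
```

Note that with plain spread, a key explicitly set to undefined in a later layer would also overwrite an earlier value; a real implementation may filter those out.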
Multi-endpoint
All models appear under one BYOK provider group (default name AI Normalizer):
{
"aiNormalizer.endpoints": [
{
"id": "gemini-locked",
"upstreamUrl": "https://gemini-proxy.example/v1/chat/completions",
"adapter": "inline-xml-tools",
"adapterProfile": "gemini-non-customtools",
"discoverModels": { "enabled": true }
},
{
"id": "openrouter",
"upstreamUrl": "https://openrouter.ai/api/v1/chat/completions",
"adapter": "openai-pass-through",
"discoverModels": { "enabled": true }
}
]
}
Inline named profile
{
"aiNormalizer.profiles": {
"my-gateway-xml": {
"toolFormatProfile": {
"toolCallOpen": "<invoke>",
"toolCallClose": "</invoke>",
"toolResultOpen": "<result>",
"toolResultClose": "</result>",
"nameAttribute": "tool",
"idAttribute": "call_id"
},
"capabilityDefaults": {
"toolCalling": true,
"maxInputTokens": 200000,
"maxOutputTokens": 8192
}
}
},
"aiNormalizer.endpoints": [
{
"id": "gateway",
"upstreamUrl": "https://gateway.example/v1/chat/completions",
"adapter": "inline-xml-tools",
"adapterProfile": "my-gateway-xml",
"discoverModels": { "enabled": true }
}
]
}
External profiles file
{
"aiNormalizer.profilesPath": "C:\\Users\\you\\.config\\ai-normalizer\\profiles.json"
}
Sync provider name
{
"aiNormalizer.syncTargets": [
{
"id": "chatLanguageModels",
"enabled": true,
"options": { "providerName": "My Company AI" }
}
]
}
Inline editor chat
After sync, run AI Normalizer: Set Inline Chat Default Model or set:
{
"inlineChat.defaultModel": "gemini-2.5-pro"
}
(use the exact model id from the picker).
Experimental ghost-text completions
Copilot ghost-text does not use BYOK. Optional FIM via the extension:
{
"aiNormalizer.inlineCompletion": {
"enabled": true,
"modelId": "codestral-latest",
"completionsPath": "/v1/completions"
}
}
API keys
There are two separate keys. Most setup friction is from mixing them up.
1. Upstream API key (required for discovery and proxy → provider)
Used as Authorization: Bearer … when the extension discovers models and when the proxy calls Gemini/OpenRouter/etc.
| Step | Action |
| --- | --- |
| 1 | Configure aiNormalizer.endpoints[].id + upstreamUrl (optional apiKeySecretId; default is aiNormalizer.endpoint.<id>) |
| 2 | AI Normalizer: Set Endpoint API Key; pick the endpoint and paste the key (stored in VS Code SecretStorage) |
| 3 | AI Normalizer: Refresh Model Catalog |
To remove a key: AI Normalizer: Clear Endpoint API Key.
Keys are never written to settings.json or chatLanguageModels.json.
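For orientation, a discovery request amounts to a GET on the models URL with the upstream key as a Bearer token, followed by reading the ids out of the response. The sketch below assumes the upstream returns the standard OpenAI list shape ({ "data": [{ "id": … }] }); discoveryHeaders and parseModelList are hypothetical helpers, not the extension's code.

```typescript
interface DiscoveredModel {
  id: string;
}

// Hypothetical helper: headers for GET <modelsUrl>, using the upstream key from SecretStorage.
function discoveryHeaders(apiKey: string): Record<string, string> {
  return { Authorization: `Bearer ${apiKey}`, Accept: "application/json" };
}

// Hypothetical helper: assumes the OpenAI-style list body { data: [{ id: "..." }, ...] }.
// Entries without a string id are skipped; any other shape yields an empty list.
function parseModelList(body: unknown): DiscoveredModel[] {
  const data = (body as { data?: unknown } | null)?.data;
  if (!Array.isArray(data)) return [];
  return data
    .filter((m): m is { id: string } => typeof (m as { id?: unknown } | null)?.id === "string")
    .map((m) => ({ id: m.id }));
}
```

An empty result from a helper like this corresponds to the "Discovery returns 0 models" row under Troubleshooting.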
2. Copilot BYOK key (VS Code → local proxy)
Synced models use "apiKey": "${input:chat.lm.secret.<id>}" where <id> is aiNormalizer.copilotByokSecretId (default aiNormalizer).
When you first use a synced model, VS Code may prompt for this secret. The local proxy usually does not validate it — enter any placeholder such as local or none. Only use a real value if you add auth on the proxy later.
Never commit API keys in git-tracked files.
Commands
| Command | Description |
| --- | --- |
| AI Normalizer: Start Proxy | Start local proxy |
| AI Normalizer: Stop Proxy | Stop proxy if this window started it; otherwise detach from a shared proxy |
| AI Normalizer: Reload Proxy Config | Push config to /admin/reload |
| AI Normalizer: Set Endpoint API Key | Store upstream Bearer token (SecretStorage) |
| AI Normalizer: Clear Endpoint API Key | Remove stored upstream key |
| AI Normalizer: Refresh Model Catalog | Force upstream discovery, reload proxy, sync |
| AI Normalizer: Sync Language Models | Write chatLanguageModels.json from merged catalog |
| AI Normalizer: Export Active Profile | Save merged tool profiles JSON |
| AI Normalizer: Set Inline Chat Default Model | Set inlineChat.defaultModel |
Status bar (cloud icon): click to sync language models.
Model cache file
Default location: extension global storage models-cache.json.
Shape:
{
"version": 1,
"updatedAt": "2026-05-29T12:00:00.000Z",
"endpoints": {
"gemini-locked": {
"fetchedAt": "2026-05-29T11:00:00.000Z",
"sourceUrl": "https://your-proxy.example/v1/models",
"models": [{ "id": "gemini-2.5-pro", "name": "Gemini 2.5 Pro" }]
}
}
}
Override path with aiNormalizer.modelCachePath. Per-model tuning stays in aiNormalizer.modelOverrides (user settings), not in the cache file.
Multiple VS Code windows
One shared proxy listens on aiNormalizer.proxyPort (default 3847). Shared files: chatLanguageModels.json, extension globalStorage (models-cache.json, proxy-config.json).
- The first window to start the proxy owns the process (PID recorded in proxy-owner.json).
- Additional windows attach to the existing proxy, push config via /admin/reload, and do not spawn a second process.
- Stop Proxy in an attached window only detaches locally; it does not kill the shared proxy.
- Closing a window only stops the proxy if that window spawned it.
Run Refresh Model Catalog or Sync Language Models from any window after changing settings.
Troubleshooting
| Problem | What to try |
| --- | --- |
| Models missing in picker | Refresh Model Catalog → Sync Language Models → reload window |
| Model hidden in Agent mode | Set toolCalling: true in override or manual models[] |
| Discovery returns 0 models | Check modelsUrl, API key, Output channel logs; some gateways omit /v1/models, so add manual models[] |
| Discovery 401/403 | Set Endpoint API Key; check Output for derived modelsUrl |
| Wrong models URL (/v1/models on v1beta) | Use full …/chat/completions upstream URL or set discoverModels.modelsUrl |
| Proxy binary not found | pnpm run build (populates bin/); auto-detect also checks target/release and target/debug under the extension folder |
| Proxy won't start (port in use) | Another window may already be running the proxy: check Output for "Attaching to existing proxy", or change proxyPort |
| Second window shows proxy failed | Reload extension build with attach support; ensure first window’s proxy is healthy on /health |
| Tools not invoked | Confirm inline-xml-tools + profile tags match upstream XML |
| Cursor vs Code path | Cache/sync uses Cursor or Code under %APPDATA% based on vscode.env.appName |
| Garbled Output logs | Update extension + proxy build (ANSI stripped when spawned from VS Code) |
Limitations
| Surface | Support |
| --- | --- |
| Copilot Chat / Agent (BYOK) | Yes |
| Inline editor chat | Yes (BYOK model id) |
| Copilot ghost-text | No BYOK; optional extension FIM only |
| Cline / Continue sync | Planned (sync target registry); use same proxy URL manually for now |
Proxy routes
| Route | Description |
| --- | --- |
| GET /health | Health check |
| GET /v1/models | Models from merged config |
| POST /v1/chat/completions | Normalized chat |
| POST /v1/completions | Pass-through for inline FIM |
| POST /admin/reload | Hot-reload config |
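A quick way to confirm the proxy is up is to call GET /health on the loopback address. The sketch below only checks the HTTP status, since the response body of /health is not specified here; proxyUrl and isProxyHealthy are hypothetical helpers, and the default fetch assumes a runtime with a global fetch (for example Node 18+).

```typescript
// Minimal fetch-like signature so a fake can be injected in tests.
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

// Hypothetical helper: builds a URL for one of the proxy routes listed above.
function proxyUrl(route: string, port = 3847): string {
  return `http://127.0.0.1:${port}${route}`;
}

// Hypothetical helper: true when GET /health answers with a 2xx status.
// Connection errors (proxy not running) are reported as unhealthy rather than thrown.
async function isProxyHealthy(port = 3847, fetchImpl: FetchLike = fetch): Promise<boolean> {
  try {
    const res = await fetchImpl(proxyUrl("/health", port));
    return res.ok;
  } catch {
    return false;
  }
}
```

The same proxyUrl base is what manually configured clients would point at until dedicated sync targets exist.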
Development
The extension is ESM (package.json "type": "module"). TypeScript uses moduleResolution: "bundler" with .ts import paths; esbuild bundles src/extension.ts → dist/extension.js (single file, vscode external).
pnpm run build # proxy binary + esbuild extension bundle
pnpm run compile # esbuild only
pnpm run watch # esbuild --watch
pnpm run lint # tsc --noEmit
pnpm run test # Rust + TS (tsx) + proxy integration
pnpm run test:ts # unit tests via tsx (no VS Code host)
pnpm run test:integration
Manual pre-publish checklist (Copilot BYOK, multi-window): docs/TESTING.md.
Releases and Marketplace upload: docs/PUBLISHING.md.
Security
AI Normalizer runs a local native proxy and handles upstream API keys. Please read SECURITY.md before installing in sensitive environments.
| Topic | Behavior |
| --- | --- |
| Network | Proxy binds to loopback by default (aiNormalizer.proxyPort, default 3847). Do not forward this port to untrusted networks. |
| API keys | Stored in VS Code SecretStorage only, never in settings.json or synced chatLanguageModels.json. |
| Synced file | Updates chatLanguageModels.json with model metadata and ${input:chat.lm.secret.*} placeholders, not raw keys. |
| Binaries | normalizer-proxy is built from this repo; release VSIXes are produced by CI. |
| Reporting | Vulnerabilities: GitHub Security Advisories (see SECURITY.md). |
Review the AI Normalizer output channel when debugging; treat upstream keys like production credentials.
License
MIT