Custom LLM ProviderConnect any OpenAI-compatible AI endpoint to GitHub Copilot Chat in Visual Studio Code. 🎯 Primary Use Case — Alibaba Cloud Coding PlanThis extension was developed primarily to bring Alibaba Cloud Coding Plan into Visual Studio Code. Alibaba's Coding Plan feature in Model Studio lets you run powerful Qwen Coder models in full agent mode — editing files, running tests, searching your codebase — all from within GitHub Copilot Chat. This extension bridges the gap by exposing those models directly in the VS Code model picker.
✨ Features
🚀 Quick Start1. Add your first providerOpen the Command Palette ( The wizard will ask for:
After saving, the extension automatically fetches available models from the provider.
2. Pick a model in Copilot ChatOpen Copilot Chat (
3. Use the
|
| Provider | Base URL |
|---|---|
| Alibaba DashScope (international) | https://coding-intl.dashscope.aliyuncs.com/v1 |
| Alibaba DashScope (standard) | https://dashscope.aliyuncs.com/compatible-mode/v1 |
| OpenRouter | https://openrouter.ai/api/v1 |
| Any OpenAI-compatible API | custom URL |
🔌 Multi-Provider Support
You can connect multiple providers at once — for example, use Alibaba DashScope and OpenRouter side by side. Each provider has its own URL and API key; models from all providers are merged into a single list in the Copilot Chat picker.
Add a provider:
Ctrl+Shift+P → Custom LLM: Add provider
Manage providers (edit / remove):
Ctrl+Shift+P → Custom LLM: Manage providers
Refresh the model list:
Ctrl+Shift+P → Custom LLM: Refresh model list from API
⚙️ Settings
Provider configuration
Providers are stored in the customLlm.providers array. Each entry has three fields:
| Field | Description |
|---|---|
name |
Display name shown in info messages (e.g. "Alibaba DashScope") |
baseUrl |
Base URL ending with /v1 |
apiKey |
API key (sk-…). Leave empty if not required. |
You can also edit settings directly in User Settings JSON (Ctrl+Shift+P → Open User Settings (JSON)):
"customLlm.providers": [
{
"name": "Alibaba DashScope",
"baseUrl": "https://coding-intl.dashscope.aliyuncs.com/v1",
"apiKey": "sk-YOUR-KEY-HERE"
},
{
"name": "OpenRouter",
"baseUrl": "https://openrouter.ai/api/v1",
"apiKey": "sk-or-YOUR-KEY-HERE"
}
]
Legacy settings (
customLlm.baseUrlandcustomLlm.apiKey) are automatically migrated to the newcustomLlm.providersformat on the first startup after updating to v0.4.0. No manual action needed.
Model list
customLlm.models is auto-populated by model discovery and does not normally need to be edited manually. The extension merges discovered models with any existing entries — custom entries are preserved.
Default models
If no providers are configured or the API is unreachable, the extension falls back to these built-in defaults (all available in Alibaba Cloud Coding Plan):
| Model ID | Display Name | Provider | Context |
|---|---|---|---|
qwen3-coder-plus |
Qwen3 Coder Plus | Alibaba | 128K |
qwen3-coder-next |
Qwen3 Coder Next | Alibaba | 128K |
qwen3-max-2026-01-23 |
Qwen3 Max | Alibaba | 128K |
qwen3.5-plus |
Qwen3.5 Plus | Alibaba | 1M |
qwen3.6-plus |
Qwen3.6 Plus | Alibaba | 1M |
glm-5 |
GLM-5 | Zhipu | 200K |
glm-4.7 |
GLM-4.7 | Zhipu | 128K |
kimi-k2.5 |
Kimi K2.5 | Moonshot | 256K |
MiniMax-M2.5 |
MiniMax M2.5 | MiniMax | 256K |
🛠️ Commands
| Command | Description |
|---|---|
Custom LLM: Add provider |
Guided wizard to add a new provider (name → URL → API key → auto-discover models) |
Custom LLM: Manage providers |
List, edit, or remove configured providers |
Custom LLM: Refresh model list from API |
Manually re-fetch models from all configured providers |
Custom LLM: Test connection |
Send a test request to each configured provider and report the result |
📋 Requirements
- Visual Studio Code
1.119.0or later - GitHub Copilot extension installed and signed in (individual plan)
- An API key for your chosen provider
🔁 Retry Behavior
The extension automatically retries failed requests with exponential backoff:
| Retry | Delay | Triggered by |
|---|---|---|
| 1st | ~1 second | Rate limit (429), Server errors (5xx) |
| 2nd | ~2 seconds | Same as above |
| 3rd | ~4 seconds | Same as above |
Maximum delay capped at 10 seconds. Request cancellation is never retried.
⚠️ Known Limitations
- GitHub Copilot Coding Plan (GitHub's native multi-step agent mode) is tied to GitHub's own infrastructure and cannot use custom providers. Use Alibaba Cloud Coding Plan as a powerful alternative.
- Inline completions (ghost text) are provided by GitHub Copilot and cannot be redirected
- Models only appear in the picker on individual GitHub Copilot plans (not Business/Enterprise)
🛠️ Troubleshooting
"Add Models" → "Custom LLM" shows no input dialogs (v0.4.8)
Fixed in v0.4.8. In earlier versions, clicking Add Models → Custom LLM in the Copilot model picker opened the management command, but the name / URL / API key dialogs immediately disappeared without any input. The root cause was that VS Code 1.104+ keeps the model-picker webview in focus when it invokes the managementCommand, causing any synchronous showInputBox call to be dismissed as soon as the webview stole focus back. The fix adds a short settle-time so the picker panel fully closes before the first input dialog opens.
Custom model can't be invoked — Copilot loops on gpt-4o-mini requests (v0.4.7)
Fixed in v0.4.7. If a custom model appeared to do nothing and VS Code kept sending auxiliary gpt-4o-mini (*.copilotmd) requests in a loop, the extension was inadvertently writing model IDs into Copilot's BYOK settings (github.copilot.chat.customOAIModels). When Copilot found the same ID in both registrations, it routed via BYOK, found no API key there, and silently looped. The fix removes that secondary write entirely. The extension also cleans up stale BYOK entries left by earlier versions on startup.
Original Copilot models reply "no input given" / context fills to ~95% in a loop
Fixed in v0.4.5. In earlier versions, the @qwen participant was sticky, so once you used it, every subsequent message was silently routed through the participant — even when you switched the picker back to a built-in Copilot model (GPT-4, Claude, etc.). On agent-mode turns where the actual content was in references/tool results rather than plain prompt text, the participant forwarded an empty user message, the model replied "what do you need?", and the loop repeated until the context filled.
If you see this on 0.4.4 or earlier, update to 0.4.5+. As a workaround on older versions, fully clear the @qwen mention from the prompt and reload the chat.
Agent mode loops / repeating the same actions
If the model keeps calling the same tool in a loop, switch to a larger variant — smaller models sometimes struggle with complex multi-step tool orchestration. Try qwen3-coder-plus or qwen3-max instead of lighter models.
Models don't appear in the picker
- Open
Ctrl+Shift+P→ Chat: Manage Language Models - Hover over each model in the Custom LLM section
- Click the eye icon 👁 to make it visible in the picker
If the section doesn't appear at all, check that the extension is active: Ctrl+Shift+P → Extensions: Show Installed Extensions and verify Custom LLM Provider is enabled.
To force a model refresh: Ctrl+Shift+P → Custom LLM: Refresh model list from API.
401 Unauthorized — "invalid access token or token expired"
Your API key is missing or incorrect. Fix:
- Open
Ctrl+Shift+P→ Custom LLM: Manage providers - Select your provider → Edit
- Paste your API key (starts with
sk-) - Make sure there are no extra spaces around the key
Get your key from Alibaba Cloud Model Studio → API Keys section.
Requests fail with 404 or empty responses
Check that the baseUrl ends with /v1 and the model id values match exactly what your provider expects. For DashScope, use https://coding-intl.dashscope.aliyuncs.com/v1.
Settings changes not taking effect
The extension hot-reloads on settings change, but it may take a few seconds. If models still don't update, reload the VS Code window: Ctrl+Shift+P → Developer: Reload Window.
Image attachment returns an error
Not all models support image input. If you see "This model does not support image input", switch to a multimodal model. For Alibaba DashScope, qwen-vl-max supports vision. Coding-focused models (qwen3-coder-*, qwen3.6-plus, etc.) are text-only.
Migrating from v0.3.x or earlier
The old customLlm.baseUrl and customLlm.apiKey settings are automatically migrated to the new customLlm.providers array on the first launch. If you need to re-run migration manually, remove the customLlm.providers entry from your settings and reload VS Code.
🔨 Building from Source
# Clone the repository
git clone https://github.com/milhaus123/vscode-custom-llm-provider.git
cd vscode-custom-llm-provider
# Install dev dependencies
npm install
# Compile TypeScript
npm run compile
# Package as .vsix
npx vsce package
# Publish to the marketplace
npx vsce publish
# zvýší a publikuje
npx vsce publish patch
# Install locally
code --install-extension vscode-custom-llm-provider-*.vsix
💖 Support the Project
If this extension saves you time, consider buying me a coffee!
Your support helps keep the project maintained and updated with new model releases. 🙏
📄 License
MIT © 2026 Martin Říha



