Unify Chat Provider

Integrate multiple LLM API providers into VS Code's GitHub Copilot Chat using the Language Model API.


Features

🐑 Free Tier Access: Aggregates up-to-date configurations for providers that offer mainstream models for free.
📦 Out of the Box: One-click configuration, automatic syncing of official model lists, and migration from other tools.
🔌 Perfect Compatibility: Supports all major LLM API formats (OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, Ollama Chat, Gemini).
🎯 Deep Adaptation: Adapts to special API features and best practices of 45+ mainstream providers.
🚀 Best Performance: Built-in recommended parameters for 200+ mainstream models, allowing you to maximize model potential without tuning.
💾 Import and Export: Complete import/export support; import configs via Base64, JSON, URL, or URI.
💎 Great UX: Visual configuration interface, fully customizable model parameters, unlimited provider and model configurations, and multiple coexisting configuration variants for the same provider and model.
✨ One More Thing: One-click use of your Claude Code, Gemini CLI, Antigravity, GitHub Copilot, OpenAI Codex (ChatGPT Plus/Pro), xAI Grok (SuperGrok / X Premium+) account quotas.

Installation

Search for Unify Chat Provider in the VS Code Extension Marketplace and install it.
Download the latest .vsix file from GitHub Releases, then install it in VS Code via Install from VSIX... or by dragging it into the Extensions view.

Quick Start

If the provider you want to add is in the Provider Support Table, use One-Click Configuration.

Otherwise, you can manually configure any provider and model.

You might also be looking for:

One-Click Migration: Migrate from other apps or extensions.
Manage Providers: Unified management for all providers and models.
Import and Export: Back up or export configurations to share with others.

⚠️ Avoid VS Code background tasks consuming Copilot quota

VS Code currently uses utility models for some background tasks by default. If you use a free Copilot account, this may consume your Copilot quota.

To avoid consuming Copilot quota, set these to other models in settings.json, or use the quick settings interface provided by this extension. See Quick Set VS Code Default Model for details.

Basic Operations

The UI is integrated into the VS Code Command Palette for a more native experience. Here’s the basic workflow:

Open the Command Palette:
- From the menu: View -> Command Palette...
- Or with the shortcut: Ctrl+Shift+P (Windows/Linux) or Cmd+Shift+P (Mac)
Search commands:
- Type Unify Chat Provider: or ucp: to find all commands.
Run a command:
- Select a command with mouse or arrow keys, then press Enter.

One-Click Configuration

See the Provider Support Table for providers supported by one-click configuration.

If your provider is not in the list, you can add it via Manual Configuration.

Steps:

Open the VS Code Command Palette and search for Unify Chat Provider: Add Provider From Well-Known Provider List.
Select the provider you want to add.
Follow the prompts to configure authentication (usually an API key, or it may require logging in via the browser), then you’ll be taken to the config import screen.
- This screen lets you review and edit the config that will be imported.
- For details, see the Provider Settings section.
Click Save to complete the import and start using the models in Copilot Chat.

Manual Configuration

This section uses DeepSeek as an example, adding the provider and two models.

DeepSeek supports One-Click Configuration. This section shows the manual setup for demonstration purposes.

Preparation: get the API information from the provider docs, at least the following:
- API Format: The API format (e.g., OpenAI Chat Completions, Anthropic Messages).
- API Base URL: The base URL of the API.
- Authentication: Usually an API key; obtained from the user center or console after registration.
Open the VS Code Command Palette and search for Unify Chat Provider: Add Provider.
- This screen is similar to the Provider Settings screen, and includes in-place documentation for each field.
Fill in the provider name: Name.
- The name must be unique and is shown in the model list. Here we use DeepSeek.
- You can create multiple configs for the same provider with different names, e.g., DeepSeek-Person, DeepSeek-Team.
Choose the API format: API Format.
- DeepSeek uses the OpenAI Chat Completion format, so select that.
- To see all supported formats, refer to the API Format Support Table.
Set the base URL: API Base URL.
- DeepSeek’s base URL is https://api.deepseek.com.
Configure authentication: Authentication.
- DeepSeek uses API Key for authentication, so select API Key.
- Enter the API key generated from the DeepSeek console.
Click Models to go to the model management screen.
Enable Auto-Fetch Official Models.
- This example uses auto-fetch to reduce configuration steps; see Auto-Fetch Official Models for details.
- For model fields and other ways to add models, see Manage Models.
Click Save to finish. You can now use the models in Copilot Chat.

One-Click Migration

See the Application Migration Support Table to learn which apps and extensions are supported.

If your app/extension is not in the list, you can configure it via One-Click Configuration or Manual Configuration.

Steps:

Open the VS Code Command Palette and search for Unify Chat Provider: Import Config From Other Applications.

The UI lists all supported apps/extensions and the detected config file paths.
Use the button group on the far right of each item for additional actions:
1. Custom Path: Import from a custom config file path.
2. Import From Config Content: Paste the config content directly.

Choose the app/extension you want to import, then you’ll be taken to the config import screen.

This screen lets you review and edit the config that will be imported.
For details, see the Provider Settings section.

Click Save to complete the import and start using the imported models in Copilot Chat.

Manage Providers

You can create unlimited provider configurations, and multiple configs can coexist for the same provider.
Provider names must be unique.

Provider List

Open the VS Code Command Palette and search for Unify Chat Provider: Manage Providers.

Add Provider: Add a new provider via Manual Configuration.
Add From Well-Known Provider List: Add a new provider via One-Click Configuration.
Import From Config: Import an existing provider config (or an array of provider configs). See Import and Export.
Import From Other Applications: Import configs from other apps/extensions via One-Click Migration.
Export All Providers: Export all provider configs. See Import and Export.

The UI also shows all existing providers. Click a provider item to enter the Model List screen.

The button group on the right of each provider item provides additional actions:

Export: Export this provider config. See Import and Export.
Duplicate: Clone this provider config to create a new one.
Delete: Delete this provider config.

Provider Settings

Models: This button only appears while adding or importing a config; click it to enter the Model List screen.

This screen shows all configuration fields for the provider. For field details, see Provider Parameters.

Manage Models

Each provider can have unlimited model configurations.
The same model ID can exist under different providers.
Within a single provider config, you cannot have multiple identical model IDs directly, but you can create multiple configs by adding a #xxx suffix.
For example, you can add both glm4.7 and glm4.7#thinking to quickly switch thinking on/off.
The #xxx suffix is automatically removed when sending requests.
Model names can be duplicated, but using distinct names is recommended to avoid confusion.
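
The suffix rule above can be illustrated with a minimal Python sketch. This is not the extension's code: the function name split_model_id is hypothetical, and splitting at the first # is an assumption, since the text only states that the #xxx suffix is removed before the request is sent.

```python
from typing import Optional, Tuple


def split_model_id(configured_id: str) -> Tuple[str, Optional[str]]:
    """Split a configured model ID into (request ID, variant suffix).

    Assumption: everything from the first '#' onward is the variant suffix.
    """
    base, sep, variant = configured_id.partition("#")
    return base, (variant if sep else None)


for configured in ("glm4.7", "glm4.7#thinking"):
    request_id, variant = split_model_id(configured)
    print(f"{configured!r} -> request id {request_id!r}, variant {variant!r}")
```

Both configs in the example would send the same model ID, glm4.7, while remaining separate entries in the model list.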

Model List

Add Model: Go to Add Model Manually.
Add From Well-Known Model List: Go to One-Click Add Models.
Add From Official Model List: Fetch the latest official model list via API. See One-Click Add Models.
Import From Config: Import an existing model config (or an array of model configs). See Import and Export.
Auto-Fetch Official Models: Enable or disable Auto-Fetch Official Models.
Provider Settings: Go to Provider Settings.
Export: Export this provider config or the model array config. See Import and Export.
Duplicate: Clone this provider config to create a new one.
Delete: Delete this provider config.

Add Model Manually

This screen is similar to the Model Settings screen; you can read the in-place documentation to understand each field.

One-Click Add Models

This screen lists all models that can be added with one click. You can import multiple selected models at once.

See the Model Support Table for the full list of supported models.

Auto-Fetch Official Models

This feature periodically fetches the latest official model list from the provider’s API and automatically configures recommended parameters, greatly simplifying model setup.

Tip

A provider’s API may not return recommended parameters. In that case, recommended parameters are looked up from an internal database by model ID. See the Model Support Table for models that have built-in recommendations.

Auto-fetched models show an internet icon before the model name.
If an auto-fetched model ID conflicts with a manually configured one, only the manually configured model is shown.
Auto-fetched models are refreshed periodically; you can also click (click to fetch) to refresh manually.
Run the VS Code command Unify Chat Provider: Refresh All Provider's Official Models to trigger refresh for all providers.

Model Settings

Export: Export this model config. See Import and Export.
Duplicate: Clone this model config to create a new one.
Delete: Delete this model config.

This screen shows all configuration fields for the model. For field details, see Model Parameters.

Sync Built-in Parameters to All Configs

Run Unify Chat Provider: Sync Built-in Parameters to All Configs to sync local model parameters with the built-in model parameters.

This is typically used after a new version updates or optimizes built-in model parameters, allowing you to sync existing configs in one click.

Commit Message Generation

You can generate commit messages via the following commands:

Unify Chat Provider: Generate Commit Message
Unify Chat Provider: Generate Commit Message(All Changes)
Unify Chat Provider: Generate Commit Message(Staged Changes)
Unify Chat Provider: Generate Commit Message(Unstaged Changes)

You can also click the sparkle button on the right side of the commit message input box in the Source Control view to generate a commit message (on first use, you need to click the dropdown arrow next to the button and select Unify Chat Provider: Generate Commit Message from the dropdown menu).

Balance Monitoring

Use this feature to monitor provider balances.

Run the VS Code command Unify Chat Provider: Provider Balance Monitoring to open the balance monitoring panel.
Configure it per provider from the Balance Monitor field in Provider Settings.
Built-in methods:
- Moonshot AI Balance: no extra config required; uses provider baseUrl and API key.
- Kimi Code Usage: no extra config required; uses provider baseUrl and API key.
- New API Balance: always shows API key balance; user balance is optional and requires userId + systemToken (sensitive data).
- DeepSeek Balance: no extra config required; uses provider baseUrl and API key.
- OpenRouter Balance: no extra config required; uses provider baseUrl and API key.
- SiliconFlow Balance: no extra config required; uses provider baseUrl and API key.
- AIHubMix Balance: no extra balance config required; uses provider baseUrl, API key, and optional APP-Code from provider extraHeaders.
- Claude Relay Service Balance: no extra config required; uses provider baseUrl and API key.
- Antigravity Usage: no extra config required; uses the provider OAuth credential and project settings.
- Gemini CLI Usage: no extra config required; uses the provider OAuth credential and optional project settings.
- Codex Usage: no extra config required; uses provider credential (API key or Codex auth token).
Run the VS Code command Unify Chat Provider: Refresh All Providers' Balance Information to force refresh balances for all configured providers.

Adjust Parameters

Global Settings

Most unifyChatProvider.* settings are application-scoped and shared across profiles on the same device.
Commit message generation settings are window-scoped and can be configured per workspace.

Name	ID	Description
Global Network Settings	`networkSettings`	Global network settings. Timeout and retry affect chat requests; proxy affects provider HTTP requests.
Model Display Name Template	`modelDisplayNameTemplate`	Template for chat model names. Default: `{modelName}{{ ({providerName})}}`.
Balance Refresh Interval	`balanceRefreshIntervalMs`	Periodic refresh interval for provider balances (milliseconds).
Balance Throttle Window	`balanceThrottleWindowMs`	Throttle window for post-request balance refresh (milliseconds).
Display Balance in Configuration	`displayBalanceInConfiguration`	Shows refreshed balance information in the model configuration button area. Default: disabled.
Store API Key in Settings	`storeApiKeyInSettings`	Please see Cloud Sync Compatibility for details.
Enable Detailed Logging	`verbose`	Enables more detailed logging for troubleshooting errors.
Commit Message Buttons	`commitMessageGeneration.enableButtons`	Controls whether commit message generation buttons are shown in the Source Control view.
Commit Message Model	`commitMessageGeneration.model`	Model selection used for commit message generation.
Commit Message Format	`commitMessageGeneration.format`	Commit message format used for commit message generation.
Commit Message Custom Instructions	`commitMessageGeneration.customInstructions`	Additional instructions appended to the commit message generation system prompt.
Commit Message Exclude Files	`commitMessageGeneration.excludeFiles`	VS Code glob patterns for files whose diffs should be omitted from commit message generation prompts.

Proxy Configuration

Proxy settings can be configured globally through unifyChatProvider.networkSettings.proxy or per provider through unifyChatProvider.endpoints[].proxy. The effective order is:

Provider proxy
Global networkSettings.proxy
VS Code HTTP proxy settings

proxy.type supports:

vscode (default): Use VS Code http.proxy, http.proxyAuthorization, http.proxyStrictSSL, and http.noProxy.
direct: Connect directly and bypass VS Code/global proxy settings.
custom: Use proxy.url; optional fields are authorization, strictSSL, and noProxy.

Supported custom proxy URL protocols are http, https, socks, socks4, socks4a, socks5, and socks5h. Proxy settings apply to provider HTTP requests, including chat requests, balance refreshes, and official model fetching.

Example global proxy:

{
  "unifyChatProvider.networkSettings": {
    "proxy": {
      "type": "custom",
      "url": "http://127.0.0.1:7890",
      "noProxy": ["localhost", "127.0.0.1", ".example.com"]
    }
  }
}

Example provider override:

{
  "unifyChatProvider.endpoints": [
    {
      "type": "openai",
      "name": "OpenAI Direct",
      "baseUrl": "https://api.openai.com",
      "proxy": {
        "type": "direct"
      },
      "models": ["gpt-5"]
    }
  ]
}
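
The precedence order described in this section can be sketched in Python as follows. This is an illustration only: resolve_proxy is a hypothetical helper, and treating a missing or empty proxy object as "not set at this level" is an assumption about how the fallback works.

```python
from typing import Optional


def resolve_proxy(provider_proxy: Optional[dict], global_proxy: Optional[dict]) -> dict:
    """Return the proxy settings that take effect for one provider."""
    # Order from the text: provider proxy, then global networkSettings.proxy.
    for candidate in (provider_proxy, global_proxy):
        if candidate:  # assumption: a missing or empty object falls through
            return candidate
    # Nothing configured: the documented default type is "vscode",
    # which defers to VS Code's own http.* proxy settings.
    return {"type": "vscode"}


global_proxy = {"type": "custom", "url": "http://127.0.0.1:7890"}
print(resolve_proxy({"type": "direct"}, global_proxy))  # provider override wins
print(resolve_proxy(None, global_proxy))                # falls back to global
print(resolve_proxy(None, None))                        # falls back to VS Code
```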

Provider Parameters

The following fields correspond to ProviderConfig (field names used in import/export JSON).

Name	ID	Description
API Format	`type`	Provider type (determines the API format and compatibility logic).
Provider Name	`name`	Unique name for this provider config (used for list display and references).
API Base URL	`baseUrl`	API base URL, e.g. `https://api.anthropic.com`.
Disable URL Normalization	`useRawBaseUrl`	Whether to disable automatic URL normalization, including provider-specific URL handling such as appending `/v1` or stripping suffixes.
Transport Mode	`transport`	Preferred transport mode for this provider. Leave empty to use the provider default.
Service Tier	`serviceTier`	Default processing tier for requests from this provider.
Context Cache	`contextCache`	Context cache configuration (used by providers that support prompt caching).
Context Cache Type	`contextCache.type`	`only-free` (default): use context cache only when it's free. `allow-paid`: use it even if it may incur cost.
Context Cache TTL (seconds)	`contextCache.ttl`	TTL in seconds. Leave empty to use the provider's default TTL. Some providers may quantize this to supported TTL presets; paid presets may require `allow-paid`.
Authentication	`auth`	Authentication config object.
Balance Monitor	`balanceProvider`	Provider-level balance monitoring config.
Models	`models`	Array of model configurations (`ModelConfig[]`).
Extra Headers	`extraHeaders`	HTTP headers appended to every request (`Record<string, string>`).
Extra Body Fields	`extraBody`	Extra fields appended to request body (`Record<string, unknown>`), for provider-specific parameters.
Proxy	`proxy`	Provider-level proxy override. See Proxy Configuration.
Timeout	`timeout`	Timeout settings for HTTP requests and SSE streaming (milliseconds).
Connection Timeout	`timeout.connection`	Maximum time to wait for establishing a TCP connection; default `60000` (60 seconds).
Response Interval Timeout	`timeout.response`	Maximum time to wait between SSE chunks; default `300000` (5 minutes).
Retry	`retry`	Retry settings for transient errors (chat requests only).
Max Retries	`retry.maxRetries`	Maximum number of retry attempts; default `10`.
Initial Delay	`retry.initialDelayMs`	Initial delay before the first retry (milliseconds); default `1000`.
Max Delay	`retry.maxDelayMs`	Maximum delay cap for retries (milliseconds); default `60000`.
Backoff Multiplier	`retry.backoffMultiplier`	Exponential backoff multiplier; default `2`.
Jitter Factor	`retry.jitterFactor`	Jitter factor (0-1) to randomize delay; default `0.1`.
Auto-Fetch Official Models	`autoFetchOfficialModels`	Whether to periodically fetch and auto-update the official model list from the provider API.
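
To make the retry fields concrete, here is a minimal Python sketch of an exponential backoff schedule using the default values from the table. The exact formula the extension uses is not documented here, so both the formula and the jitter model (a random scale factor between 1 - jitterFactor and 1 + jitterFactor) are assumptions, and retry_delay_ms is a hypothetical name.

```python
import random


def retry_delay_ms(attempt, initial_delay_ms=1000, max_delay_ms=60000,
                   backoff_multiplier=2.0, jitter_factor=0.1, rng=random.random):
    """Delay in milliseconds before retry number `attempt` (1-based)."""
    # Exponential growth from the initial delay, capped at the maximum delay.
    base = min(max_delay_ms, initial_delay_ms * backoff_multiplier ** (attempt - 1))
    # Assumed jitter model: scale by a random factor in [1 - j, 1 + j].
    jitter = 1 + jitter_factor * (2 * rng() - 1)
    return min(max_delay_ms, base * jitter)


# With jitter disabled the schedule is deterministic:
# it doubles from 1000 ms and stays at the 60000 ms cap from the 7th retry on.
print([retry_delay_ms(n, jitter_factor=0) for n in range(1, 11)])
```

Under these assumptions, ten retries with the defaults would wait roughly five minutes in total before the request is given up.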

Model Parameters

The following fields correspond to ModelConfig (field names used in import/export JSON).

Name	ID	Description
Model ID	`id`	Model identifier (you can use a `#xxx` suffix to create multiple configs for the same model; the suffix is removed when sending requests).
Display Name	`name`	Name shown in the UI (usually falls back to `id` if empty).
Model Family	`family`	A grouping identifier for grouping/matching models (e.g., `gpt-4`, `claude-3`).
Max Input Tokens	`maxInputTokens`	Maximum input/context tokens (some providers interpret this as total context for “input + output”).
Max Output Tokens	`maxOutputTokens`	Maximum generated tokens (required by some providers, e.g., Anthropic’s `max_tokens`).
Tokenizer	`tokenizer`	Tokenizer used for VS Code token counting (`provideTokenCount`). Default: `default`.
Token Count Multiplier	`tokenCountMultiplier`	Multiplier applied to the token count before returning it to VS Code. Default: `1.0`.
Capabilities	`capabilities`	Capability declaration (for UI and routing logic; may also affect request construction).
Tool Calling	`capabilities.toolCalling`	Whether tool/function calling is supported; if a number, it represents the maximum tool count.
Image Input	`capabilities.imageInput`	Whether image input is supported.
Edit Tools	`capabilities.editTools`	Edit tool hint preset for VS Code / Copilot Chat.
Streaming	`stream`	Whether streaming responses are enabled (if unset, default behavior is used).
Temperature	`temperature`	Sampling temperature (randomness).
Top-K	`topK`	Top-k sampling.
Top-P	`topP`	Top-p (nucleus) sampling.
Frequency Penalty	`frequencyPenalty`	Frequency penalty.
Presence Penalty	`presencePenalty`	Presence penalty.
Parallel Tool Calling	`parallelToolCalling`	Whether to allow parallel tool calling (`true` enable, `false` disable, `undefined` use default).
Service Tier	`serviceTier`	Processing tier / speed preset for OpenAI and Anthropic requests. `auto` lets the provider choose; `standard` maps to OpenAI `default` and Anthropic `standard_only`; `flex` / `scale` map to OpenAI tiers and Anthropic `standard_only`; `priority` maps to OpenAI Priority Tier and Anthropic Fast mode. Leave empty to omit the field.
Verbosity	`verbosity`	Constrain verbosity: `low` / `medium` / `high` (not supported by all providers).
Thinking	`thinking`	Thinking/reasoning related config (support varies by provider).
Thinking Mode	`thinking.type`	`enabled` / `disabled` / `auto`
Thinking Budget Tokens	`thinking.budgetTokens`	Token budget for thinking.
Thinking Effort	`thinking.effort`	`none` / `minimal` / `low` / `medium` / `high` / `xhigh` / `max`
Thinking Summary	`thinking.summary`	Reasoning / thinking summary mode: `none` / `auto` / `concise` / `detailed`
Extra Headers	`extraHeaders`	HTTP headers appended to this model request (`Record<string, string>`).
Extra Body Fields	`extraBody`	Extra fields appended to this model request body (`Record<string, unknown>`).
Preset Templates	`presetTemplates`	Configured preset templates can be selected from the VS Code model configuration submenu. Each template corresponds to one enum option group. Templates are applied in declaration order, and later templates override earlier fields with the same name.

Service Tier Notes

Leaving serviceTier empty means this extension omits the service tier / speed fields and keeps the provider default behavior.
Mapping for the OpenAI API:
- auto -> auto
- standard -> default
- flex -> flex
- scale -> scale
- priority -> priority
Mapping for Anthropic Messages API:
- auto -> auto
- standard / flex / scale -> standard_only
- priority -> speed: "fast" with fast-mode-2026-02-01
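
The two mappings above can be written as a small lookup, sketched here in Python. The request field name service_tier, the "beta" key used to carry the fast-mode identifier, and the function name tier_fields are assumptions made for illustration; only the value mappings come from the lists above.

```python
OPENAI_TIER = {"auto": "auto", "standard": "default", "flex": "flex",
               "scale": "scale", "priority": "priority"}
ANTHROPIC_TIER = {"auto": "auto", "standard": "standard_only",
                  "flex": "standard_only", "scale": "standard_only"}
ANTHROPIC_FAST_MODE = "fast-mode-2026-02-01"


def tier_fields(api: str, service_tier):
    """Return the extra request fields for a configured serviceTier value."""
    if not service_tier:
        return {}  # empty: omit the fields and keep the provider default
    if api == "openai":
        return {"service_tier": OPENAI_TIER[service_tier]}
    if api == "anthropic":
        if service_tier == "priority":
            # How the fast-mode identifier is transmitted is an assumption here.
            return {"speed": "fast", "beta": ANTHROPIC_FAST_MODE}
        return {"service_tier": ANTHROPIC_TIER[service_tier]}
    raise ValueError(f"unsupported API: {api}")


print(tier_fields("openai", "standard"))
print(tier_fields("anthropic", "flex"))
print(tier_fields("anthropic", "priority"))
```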

Preset Template Notes

You can configure multiple preset templates for a single model. Each template corresponds to one enum option group displayed in the VS Code model selection submenu.

You can define custom preset templates to switch model parameters quickly. For example:

{
  "presetTemplates": [
    {
      "name": "Reasoning Effort",
      "id": "reasoningEffort",
      "presets": [
        {
          "name": "High",
          "description": "Suitable for tasks involving planning, coding, synthesis, or more difficult reasoning.",
          "id": "high",
          "config": {
            "thinking": {
              "type": "enabled",
              "effort": "high"
            },
            "temperature": 0.7
          }
        },
        {
          "name": "Low",
          "description": "A small amount of extra thinking can improve reliability with almost no added latency.",
          "id": "low",
          "config": {
            "thinking": {
              "type": "enabled",
              "effort": "low"
            },
            "temperature": 0.4
          }
        },
        {
          "name": "Default",
          "description": "Use the model's current configuration.",
          "id": "default",
          "config": {}
        }
      ],
      "default": "default"
    }
  ]
}

config overrides fields in the model configuration. In the example above, high and low override thinking and temperature, while default overrides nothing and uses the model's current configuration.
If multiple templates override the same field, they are applied in declaration order, and later templates override earlier fields with the same name.
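
The override order can be sketched in Python as below. This is an illustration under stated assumptions: apply_presets is a hypothetical helper, overrides are assumed to replace top-level fields rather than deep-merge them, and the second template ("sampling") is invented only to show two templates touching the same field.

```python
from typing import Dict, List


def apply_presets(model_config: dict, templates: List[dict], selected: Dict[str, str]) -> dict:
    """Merge the selected preset of each template into a copy of model_config."""
    result = dict(model_config)
    for template in templates:  # declaration order: later templates win
        preset_id = selected.get(template["id"], template.get("default"))
        for preset in template["presets"]:
            if preset["id"] == preset_id:
                # Assumption: top-level fields are replaced, not deep-merged.
                result.update(preset["config"])
    return result


templates = [
    {"id": "reasoningEffort", "default": "default", "presets": [
        {"id": "high", "config": {"thinking": {"type": "enabled", "effort": "high"},
                                  "temperature": 0.7}},
        {"id": "default", "config": {}},
    ]},
    # Hypothetical second template, only here to show the override order.
    {"id": "sampling", "default": "default", "presets": [
        {"id": "precise", "config": {"temperature": 0.2}},
        {"id": "default", "config": {}},
    ]},
]
model = {"id": "glm4.7", "temperature": 1.0}
print(apply_presets(model, templates, {"reasoningEffort": "high", "sampling": "precise"}))
print(apply_presets(model, templates, {}))  # defaults override nothing
```

In the first call both templates set temperature, and the value from the later template (0.2) is the one that remains.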

Import and Export

Supported import/export payloads:

Single provider configuration
Single model configuration
Multiple provider configurations (array)
Multiple model configurations (array)

Supported import/export formats:

Base64-url encoded JSON config string (export uses this format only)
Plain JSON config string
A URL pointing to a Base64-url encoded or plain JSON config string
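
As a rough illustration of the first two formats, the Python sketch below encodes a config as a Base64-url JSON string and decodes either form. The function names are hypothetical, and the compact JSON serialization and the stripped "=" padding are assumptions; the decoder accepts input with or without padding.

```python
import base64
import json


def encode_config(config) -> str:
    """Serialize a config (object or array) as a Base64-url encoded JSON string."""
    raw = json.dumps(config, separators=(",", ":")).encode("utf-8")
    # Assumption: trailing '=' padding is dropped to keep the string URL-friendly.
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_config(text: str):
    """Accept either plain JSON or a Base64-url encoded JSON string."""
    text = text.strip()
    if text[:1] in ("{", "["):
        return json.loads(text)
    padded = text + "=" * (-len(text) % 4)  # restore padding if it was stripped
    return json.loads(base64.urlsafe_b64decode(padded))


provider = {"type": "openai-chat-completion", "name": "DeepSeek",
            "baseUrl": "https://api.deepseek.com"}
encoded = encode_config(provider)
print(encoded)
print(decode_config(encoded) == provider)
```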

URI Support

Supports importing provider configs via VS Code URI.

Example:

vscode://SmallMain.vscode-unify-chat-provider/import-config?config=<input>

<input> supports the same formats as in Import and Export.

Override Config Fields

You can add query parameters to override certain fields in the imported config.

Example:

vscode://SmallMain.vscode-unify-chat-provider/import-config?config=<input>&auth={"method":"api-key","apiKey":"my-api-key"}

The import will override the auth field before importing.
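
An import link with overrides could be assembled as in the Python sketch below. The helper build_import_uri is hypothetical, and percent-encoding the query values is an assumption made so that JSON override values survive inside a URI; the example above shows the values unencoded.

```python
import json
from urllib.parse import parse_qs, quote, urlsplit

IMPORT_URI = "vscode://SmallMain.vscode-unify-chat-provider/import-config"


def build_import_uri(config_input: str, overrides=None) -> str:
    """Build an import URI; each override value is serialized as compact JSON."""
    params = [("config", config_input)]
    for field, value in (overrides or {}).items():
        params.append((field, json.dumps(value, separators=(",", ":"))))
    # Assumption: values are percent-encoded so quotes and braces are URI-safe.
    query = "&".join(f"{key}={quote(value, safe='')}" for key, value in params)
    return f"{IMPORT_URI}?{query}"


uri = build_import_uri("eyJ0eXBlIjoi...",
                       {"auth": {"method": "api-key", "apiKey": "my-api-key"}})
print(uri)

# Reading the query back recovers the override object.
print(json.loads(parse_qs(urlsplit(uri).query)["auth"][0]))
```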

Provider Advocacy

If you are a developer for an LLM provider, you can add a link like the following on your website so users can add your model to this extension with one click:

<a href="vscode://SmallMain.vscode-unify-chat-provider/import-config?config=eyJ0eXBlIjoi...">Add to Unify Chat Provider</a>

Cloud Sync Compatibility

Extension configs are stored in settings.json, so they work with VS Code Settings Sync.

However, sensitive information is stored in VS Code Secret Storage by default, which currently does not sync.

So after syncing to another device, you may be prompted to re-enter keys or re-authorize.

If you want to sync sensitive data that is safe to sync (e.g., API keys), enable storeApiKeyInSettings.

This can increase the risk of user data leakage, so evaluate the risk before enabling it.

OAuth credentials are always kept in Secret Storage to avoid multi-device token refresh conflicts.

Quick Set VS Code Default Model

You can open the quick settings interface with the VS Code command Unify Chat Provider: Change VS Code Default Model.

The following settings can be changed quickly:

★ chat.utilityModel
★ chat.utilitySmallModel
★ chat.exploreAgent.defaultModel
★ github.copilot.chat.exploreAgent.model
inlineChat.defaultModel
chat.planAgent.defaultModel
github.copilot.chat.askAgent.model
github.copilot.chat.implementAgent.model

Items marked with ★ mean:

By default, VS Code uses Copilot built-in models for these settings. These models do not consume premium quota on paid plans, but may consume free quota on free plans.
It is recommended to set them to fast, inexpensive models.

You can select the Change All Built-in Utility Models button to update all ★ items at once.

API Format Support Table

API	ID	Typical Endpoint	Notes
OpenAI Chat Completion API	`openai-chat-completion`	`/v1/chat/completions`	If the base URL doesn’t end with a version suffix, `/v1` is appended automatically.
OpenAI Responses API	`openai-responses`	`/v1/responses`	If the base URL doesn’t end with a version suffix, `/v1` is appended automatically.
Google AI Studio (Gemini API)	`google-ai-studio`	`/v1beta/models:generateContent`	Automatically detects the version number suffix.
Google Vertex AI	`google-vertex-ai`	`/v1beta/models:generateContent`	Uses a different base URL depending on the authentication method.
Anthropic Messages API	`anthropic`	`/v1/messages`	Automatically removes duplicated `/v1` suffix.
Ollama Chat API	`ollama`	`/api/chat`	Automatically removes duplicated `/api` suffix.
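
The /v1 rule noted for the OpenAI formats can be sketched in Python as follows. This is an approximation, not the extension's implementation: the function name and the pattern used to recognize a "version suffix" are assumptions, and the use_raw_base_url parameter mirrors the Disable URL Normalization (useRawBaseUrl) field.

```python
import re

# Assumed pattern for a version suffix, e.g. /v1, /v3, /v1beta.
VERSION_SUFFIX = re.compile(r"/v\d+[a-z0-9]*$")


def normalize_openai_base_url(base_url: str, use_raw_base_url: bool = False) -> str:
    """Append /v1 unless the URL already ends with a version suffix."""
    if use_raw_base_url:
        return base_url  # normalization disabled: use the URL exactly as given
    url = base_url.rstrip("/")
    return url if VERSION_SUFFIX.search(url) else url + "/v1"


for candidate in ("https://api.deepseek.com",
                  "https://api.openai.com/v1",
                  "https://example.com/api/v3/"):
    print(candidate, "->", normalize_openai_base_url(candidate))
```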

Provider Support Table

The providers listed below support One-Click Configuration. Implementations follow the best practices from official docs to help you get the best performance.

Tip

Even if a provider is not listed, you can still use it via Manual Configuration.

Provider	Supported Features	Free Quota	Balance Monitor
OpenAI
Google AI Studio
Google Vertex AI	Authentication
Anthropic	InterleavedThinking FineGrainedToolStreaming AlwaysOnAdaptiveThinking
xAI
Hugging Face (Inference Providers)
OpenRouter	CacheControl ReasoningParam ReasoningDetails ClaudeAdaptiveVerbosity	Details	✅
AIHubMix			✅
Cerebras	ReasoningField DisableReasoningParam ClearThinking	Details
OpenCode Zen (OpenAI Chat Completion)	ReasoningContent	Details
OpenCode Zen (OpenAI Responses)	ReasoningContent	Details
OpenCode Zen (Anthropic Messages)	InterleavedThinking FineGrainedToolStreaming	Details
OpenCode Zen (Gemini)		Details
OpenCode Go (OpenAI Chat Completion)	ReasoningContent	Details
OpenCode Go (Anthropic Messages)	InterleavedThinking FineGrainedToolStreaming	Details
Nvidia		Details
Kilo Code	RawBaseUrl	Details
Alibaba Cloud Model Studio (China)	ThinkingParam3 ReasoningContent
Alibaba Cloud Model Studio (Team Token Plan)	ThinkingParam3 ReasoningContent
Alibaba Cloud Model Studio (International)	ThinkingParam3 ReasoningContent
Tencent Cloud TokenHub (China)	ThinkingParam ReasoningEffortParam ReasoningContent
Tencent Cloud TokenHub (International)	ThinkingParam ReasoningEffortParam ReasoningContent
Tencent Cloud TokenHub (Personal Token Plan)	RawBaseUrl ThinkingParam ReasoningEffortParam ReasoningContent
Tencent Cloud Token Plan (Enterprise)	RawBaseUrl ThinkingParam ReasoningEffortParam ReasoningContent
Model Scope (API-Inference)	ThinkingParam3 ReasoningContent	Details
Cline Bot		Details
Volcano Engine	AutoThinking ThinkingParam2 VolcContextCaching	Details
Volcano Engine (Coding Plan)	AutoThinking ThinkingParam2
Byte Plus	AutoThinking ThinkingParam2 VolcContextCaching
DeepSeek	ThinkingParam ReasoningEffortParam ReasoningContent		✅
Gitee AI
Xiaomi MIMO	ThinkingParam ReasoningContent
Xiaomi MIMO (China, Token Plan)	ThinkingParam ReasoningContent
Xiaomi MIMO (Singapore, Token Plan)	ThinkingParam ReasoningContent
Xiaomi MIMO (Europe, Token Plan)	ThinkingParam ReasoningContent
Ollama Local
Ollama Cloud
StepFun (China)	ReasoningField
StepFun (International)	ReasoningField
ZhiPu AI	ThinkingParam ReasoningEffortParam ReasoningContent ClearThinking	Details
ZhiPu AI (Coding Plan)	ThinkingParam ReasoningEffortParam ReasoningContent ClearThinking
Z.AI	ThinkingParam ReasoningEffortParam ReasoningContent ClearThinking	Details
Z.AI (Coding Plan)	ThinkingParam ReasoningEffortParam ReasoningContent ClearThinking
MiniMax (China)	ReasoningDetails
MiniMax (International)	ReasoningDetails
LongCat		Details
Moonshot AI (China)	ReasoningContent		✅
Moonshot AI (International)	ReasoningContent		✅
Moonshot AI (Coding Plan)	ReasoningContent		✅
StreamLake Vanchin (China)		Details
StreamLake Vanchin (China, Coding Plan)
StreamLake Vanchin (International)		Details
StreamLake Vanchin (International, Coding Plan)
SiliconFlow (China)	ThinkingParam3 ThinkingBudgetParam ReasoningContent	Details	✅
SiliconFlow (International)	ThinkingParam3 ThinkingBudgetParam ReasoningContent	Details	✅

Experimental Supported Providers:

⚠️ Warning: Adding the following providers may violate their Terms of Service!

Your account may be suspended or permanently banned.

Use these providers at your own risk.

Provider	Free Quota	Balance Monitor
OpenAI Codex (ChatGPT Plus/Pro)		✅
xAI Grok Build (SuperGrok / X Premium+)
GitHub Copilot	Details
Google Antigravity	Details	✅
Google Gemini CLI	Details	✅
Claude Code
Synthetic	Details	✅

Long-Term Free Quotas:

Kilo Code

Often includes free models, including stealth models and limited-time frontier models.
Availability can change frequently, so check Kilo's latest listing in-app.

Cline Bot

Supported models:
- minimax/minimax-m2.5
- kwaipilot/kat-coder-pro
- z-ai/glm-5

GitHub Copilot

Some models have free quotas; others require a Copilot subscription. With a subscription, they are available at no extra cost, with quotas that refresh monthly.
Supported models: Claude, GPT, Grok, Gemini and other mainstream models.

Google Antigravity

Each model has a certain free quota, refreshing over time.
Supported models: Claude 4.5 Series, Gemini 3.1 Series, Gemini 3 Series.

Google Gemini CLI

Each model has a certain free quota, refreshing over time.
Supported models: Gemini 3.1 Series, Gemini 3 Series, Gemini 2.5 Series.

Synthetic

Provides various mainstream models via OpenAI-compatible API.
Supported models: MiniMax M2.5, Qwen 3.5, Kimi K2.5, GLM 4.7, DeepSeek V3.2 / V3 / R1, Llama 3.3 and others.

Cerebras

Some models have free quotas, refreshing over time.
Supported models:
- GLM 4.7
- GPT-OSS-120B
- Qwen 3 235B Instruct
- ...

Nvidia

Completely free, but with rate limits.
Supports almost all open-weight models.

Volcano Engine

Each model has a certain free quota, refreshing over time.
Supported models: Doubao, Kimi, DeepSeek and other mainstream models.

Model Scope

Each model has a certain free quota, refreshing over time.
Supported models: GLM, Kimi, Qwen, DeepSeek and other mainstream models.

ZhiPu AI / Z.AI

Some models are completely free.
Supported models: GLM Flash series models.

SiliconFlow

Some models are completely free.
Supported models: Mostly open-weight models under 32B parameters.

StreamLake

Completely free, but with rate limits.
Supported models:
- KAT-Coder-Pro V2
- KAT-Coder-Pro V1
- KAT-Coder-Exp-72B-1010
- KAT-Coder-Air V1

LongCat

Offers a free quota that refreshes over time.
Supported models:
- LongCat-Flash-Chat
- LongCat-Flash-Thinking
- LongCat-Flash-Thinking-2601
- LongCat-Flash-Lite

OpenRouter

Some models have free quotas that refresh over time.
Supported models: The list changes frequently; look for models with 'free' in the name.

OpenCode Zen

Some models are completely free.
Supported models: The list changes frequently; look for models with 'free' in the name.
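
For providers such as OpenRouter and OpenCode Zen, where free models are identified by name, a simple filter can pick them out of a model list. The following is a minimal TypeScript sketch, assuming you already have the model IDs as strings. The helper `filterFreeModels` and the sample IDs are hypothetical and are not part of this extension.

```typescript
// Hypothetical helper: keep only model IDs whose name contains "free"
// (case-insensitive). This is an illustration, not the extension's API.
function filterFreeModels(modelIds: string[]): string[] {
  return modelIds.filter((id) => /free/i.test(id));
}

// Made-up sample IDs for illustration only.
const sampleIds = [
  "vendor-a/model-x:free",
  "vendor-a/model-x",
  "vendor-b/Model-Y-Free",
];

// Logs the two IDs that contain "free".
console.log(filterFreeModels(sampleIds));
```

A name match is only a heuristic; check the provider's own listing to confirm that a model is currently free.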

Ollama Cloud

Each model has a free quota that refreshes over time.
Supports almost all open-weight models.

Model Support Table

The models listed below support One-Click Add Models, and have built-in recommended parameters to help you get the best performance.

Tip

Even if a model is not listed, you can still use it via Add Model Manually and tune the parameters yourself.

Vendor	Series	Supported Models
OpenAI	GPT-5 Series	GPT-5, GPT-5.1, GPT-5.2, GPT-5.4, GPT-5.5, GPT-5.4 pro, GPT-5.4 Mini, GPT-5.4 Nano, GPT-5.2 pro, GPT-5 mini, GPT-5 nano, GPT-5 pro, GPT-5-Codex, GPT-5.1-Codex, GPT-5.2-Codex, GPT-5.3-Codex, GPT-5.3-Codex-Spark, GPT-5.1-Codex-Max, GPT-5.1-Codex-mini, GPT-5.2 Chat, GPT-5.1 Chat, GPT-5 Chat
	GPT-4 Series	GPT-4o, GPT-4o mini, GPT-4o Search Preview, GPT-4o mini Search Preview, GPT-4.1, GPT-4.1 mini, GPT-4.1 nano, GPT-4.5 Preview, GPT-4 Turbo, GPT-4 Turbo Preview, GPT-4
	GPT-3 Series	GPT-3.5 Turbo, GPT-3.5 Turbo Instruct
	o Series	o1, o1 pro, o1 mini, o1 preview, o3, o3 mini, o3 pro, o4 mini
	oss Series	gpt-oss-120b, gpt-oss-20b
	Deep Research Series	o3 Deep Research, o4 mini Deep Research
	Other Models	babbage-002, davinci-002, Codex mini, Computer Use Preview
Google	Gemini 3.5 Series	gemini-3.5-flash
	Gemini 3.1 Series	gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview
	Gemini 3 Series	gemini-3-pro-preview, gemini-3-flash-preview
	Gemini 2.5 Series	gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
	Gemini 2.0 Series	gemini-2.0-flash, gemini-2.0-flash-lite
	Gemma 4 Series	Gemma 4 31B, Gemma 4 26B A4B, Gemma 4 E4B, Gemma 4 E2B
Anthropic	Claude 5 Series	Claude Fable 5, Claude Mythos 5, Claude Sonnet 5
	Claude 4 Series	Claude Opus 4.8, Claude Opus 4.7, Claude Opus 4.6, Claude Sonnet 4.6, Claude Sonnet 4.5, Claude Haiku 4.5, Claude Opus 4.5, Claude Sonnet 4, Claude Opus 4.1, Claude Opus 4
	Claude 3 Series	Claude Sonnet 3.7, Claude Sonnet 3.5, Claude Haiku 3.5, Claude Haiku 3, Claude Opus 3
xAI	Grok 4.20 Series	Grok 4.20 0309 (Reasoning), Grok 4.20 0309 (Non-Reasoning)
	Grok 4 Series	Grok 4.1 Fast (Reasoning), Grok 4.1 Fast (Non-Reasoning), Grok 4, Grok 4 Fast (Reasoning), Grok 4 Fast (Non-Reasoning), Grok 4.3
	Grok Build Series	Grok Build 0.1
	Grok Code Series	Grok Code Fast 1
	Grok 3 Series	Grok 3, Grok 3 Mini
	Grok 2 Series	Grok 2 Vision
Cursor	Composer Series	Composer 2.5
Meta	Llama 3 Series	Llama 3.1 8B, Llama 3.1 70B, Llama 3.1 405B, Llama 3.3 70B
NVIDIA	Nemotron 3 Series	Nemotron 3 Super 120B A12B
DeepSeek	DeepSeek V4 Series	DeepSeek V4 Flash, DeepSeek V4 Pro
	Compatibility Aliases	DeepSeek Chat, DeepSeek Reasoner
	DeepSeek V3 Series	DeepSeek V3.2, DeepSeek V3.2 Exp, DeepSeek V3.2 Speciale, DeepSeek V3.1, DeepSeek V3.1 Terminus, DeepSeek V3, DeepSeek V3 (0324)
	DeepSeek R1 Series	DeepSeek R1, DeepSeek R1 (0528)
	DeepSeek V2.5 Series	DeepSeek V2.5
	DeepSeek V2 Series	DeepSeek V2
	DeepSeek VL Series	DeepSeek VL, DeepSeek VL2
	DeepSeek Coder Series	DeepSeek Coder, DeepSeek Coder V2
	DeepSeek Math Series	DeepSeek Math V2
ByteDance	Doubao 2.1 Series	Doubao Seed 2.1 Pro, Doubao Seed 2.1 Turbo
	Doubao 2.0 Series	Doubao Seed 2.0 Pro, Doubao Seed 2.0 Lite, Doubao Seed 2.0 Mini, Doubao Seed 2.0 Code Preview
	Doubao 1.8 Series	Doubao Seed 1.8, Doubao Seed Code Preview
	Doubao 1.6 Series	Doubao Seed 1.6, Doubao Seed 1.6 Lite, Doubao Seed 1.6 Flash, Doubao Seed 1.6 Vision
	Doubao 1.5 Series	Doubao 1.5 Pro 32k, Doubao 1.5 Pro 32k Character, Doubao 1.5 Lite 32k
	Other Models	Doubao Lite 32k Character
MiniMax	MiniMax M3 Series	MiniMax-M3
	MiniMax M2 Series	MiniMax-M2.7, MiniMax-M2.7-Highspeed, MiniMax-M2.5, MiniMax-M2.5-Highspeed, MiniMax-M2.1, MiniMax-M2.1-Highspeed, MiniMax-M2
LongCat	LongCat 2 Series	LongCat 2.0
	LongCat Flash Series	LongCat Flash Chat, LongCat Flash Thinking, LongCat Flash Thinking 2601, LongCat Flash Lite
StreamLake	KAT-Coder Series	KAT-Coder-Pro V2, KAT-Coder-Pro V1, KAT-Coder-Exp-72B-1010, KAT-Coder-Air V1
Moonshot AI	Kimi K2.7 Series	Kimi K2.7 Code
	Kimi K2.6 Series	Kimi K2.6
	Kimi K2.5 Series	Kimi K2.5
	Kimi K2 Series	Kimi K2 Thinking, Kimi K2 Thinking Turbo, Kimi K2 0905 Preview, Kimi K2 0711 Preview, Kimi K2 Turbo Preview, Kimi For Coding
Qwen	Qwen 3.7 Series	Qwen3.7-Max, Qwen3.7-Plus
	Qwen 3.6 Series	Qwen3.6-Max-Preview, Qwen3.6-Plus, Qwen3.6-Flash, Qwen3.6-35B-A3B
	Qwen 3.5 Series	Qwen3.5-Plus, Qwen3.5-Flash, Qwen3.5-397B-A17B, Qwen3.5-122B-A10B, Qwen3.5-27B, Qwen3.5-35B-A3B, Qwen3.5-9B, Qwen3.5-4B, Qwen3.5-2B, Qwen3.5-0.8B
	Qwen 3 Series	Qwen3-Max, Qwen3-Max-Thinking, Qwen3-Max Preview, Qwen3-Coder-Next, Qwen3-Coder-Plus, Qwen3-Coder-Flash, Qwen3-VL-Plus, Qwen3-VL-Flash, Qwen3-VL-32B-Instruct, Qwen3 0.6B, Qwen3 1.7B, Qwen3 4B, Qwen3 8B, Qwen3 14B, Qwen3 32B, Qwen3 30B A3B, Qwen3 235B A22B, Qwen3 30B A3B Thinking 2507, Qwen3 30B A3B Instruct 2507, Qwen3 235B A22B Thinking 2507, Qwen3 235B A22B Instruct 2507, Qwen3 Coder 480B A35B Instruct, Qwen3 Coder 30B A3B Instruct, Qwen3-Omni-Flash, Qwen3-Omni-Flash-Realtime, Qwen3-Omni 30B A3B Captioner, Qwen-Omni-Turbo, Qwen-Omni-Turbo-Realtime, Qwen3-VL 235B A22B Thinking, Qwen3-VL 235B A22B Instruct, Qwen3-VL 32B Thinking, Qwen3-VL 30B A3B Thinking, Qwen3-VL 30B A3B Instruct, Qwen3-VL 8B Thinking, Qwen3-VL 8B Instruct, Qwen3 Next 80B A3B Thinking, Qwen3 Next 80B A3B Instruct, Qwen-Plus, Qwen-Flash, Qwen-Turbo, Qwen-Max, Qwen-Long, Qwen-Doc-Turbo, Qwen Deep Research
	Qwen 2.5 Series	Qwen2.5 0.5B Instruct, Qwen2.5 1.5B Instruct, Qwen2.5 3B Instruct, Qwen2.5 7B Instruct, Qwen2.5 14B Instruct, Qwen2.5 32B Instruct, Qwen2.5 72B Instruct, Qwen2.5 7B Instruct (1M), Qwen2.5 14B Instruct (1M), Qwen2.5 Coder 0.5B Instruct, Qwen2.5 Coder 1.5B Instruct, Qwen2.5 Coder 3B Instruct, Qwen2.5 Coder 7B Instruct, Qwen2.5 Coder 14B Instruct, Qwen2.5 Coder 32B Instruct, Qwen2.5 Math 1.5B Instruct, Qwen2.5 Math 7B Instruct, Qwen2.5 Math 72B Instruct, Qwen2.5-VL 3B Instruct, Qwen2.5-VL 7B Instruct, Qwen2.5-VL 32B Instruct, Qwen2.5-Omni-7B, Qwen2 7B Instruct, Qwen2 72B Instruct, Qwen2 57B A14B Instruct, Qwen2-VL 72B Instruct
	Qwen 1.5 Series	Qwen1.5 7B Chat, Qwen1.5 14B Chat, Qwen1.5 32B Chat, Qwen1.5 72B Chat, Qwen1.5 110B Chat
	QwQ/QvQ Series	QwQ-Plus, QwQ 32B, QwQ 32B Preview, QVQ-Max, QVQ-Plus, QVQ 72B Preview
	Qwen Coder Series	Qwen-Coder-Plus, Qwen-Coder-Turbo
	Other Models	Qwen-Math-Plus, Qwen-Math-Turbo, Qwen-VL-OCR, Qwen-VL-Max, Qwen-VL-Plus, Qwen-Plus Character (JA)
Xiaomi MiMo	MiMo V2.5 Series	MiMo V2.5 Pro UltraSpeed, MiMo V2.5 Pro, MiMo V2.5
	MiMo V2 Series	MiMo V2 Pro, MiMo V2 Omni, MiMo V2 Flash
ZhiPu AI	GLM 5 Series	GLM-5.2, GLM-5.1, GLM-5, GLM-5V-Turbo, GLM-5-Turbo
	GLM 4 Series	GLM-4.7, GLM-4.7-Flash, GLM-4.7-FlashX, GLM-4.6, GLM-4.5, GLM-4.5-X, GLM-4.5-Air, GLM-4.5-AirX, GLM-4-Plus, GLM-4-Air-250414, GLM-4-Long, GLM-4-AirX, GLM-4-FlashX-250414, GLM-4.5-Flash, GLM-4-Flash-250414, GLM-4.6V, GLM-4.5V, GLM-4.1V-Thinking-FlashX, GLM-4.6V-Flash, GLM-4.1V-Thinking-Flash
	CodeGeeX Series	CodeGeeX-4
Tencent HY	HY 2.0 Series	HY 2.0 Think, HY 2.0 Instruct
	HY 1.5 Series	HY Vision 1.5 Instruct
StepFun	Step 3 Series	Step 3, Step 3.5 Flash
	Step 2 Series	Step 2 16k, Step 2 16k Exp, Step 2 Mini
	Step 1 Series	Step 1 8k, Step 1 32k, Step 1 128k, Step 1 256k, Step 1o Turbo Vision, Step 1o Vision 32k, Step 1v 8k, Step 1v 32k, Step R1 V Mini
OpenCode Zen	Zen	Big Pickle

Application Migration Support Table

The applications listed below support One-Click Migration.

Application	Notes
Claude Code	Migration is supported only when using a custom Base URL and API Key.
Codex	Supports Base URL, API Key, and OAuth.
Gemini CLI	Migration is supported only when using the following auth methods: `GEMINI_API_KEY`, `GOOGLE_API_KEY`, `GOOGLE_APPLICATION_CREDENTIALS`.
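
Before migrating from Gemini CLI, it can help to check whether one of the supported auth variables is actually set. The sketch below is one way to do that in TypeScript. The three variable names come from the table above; the function `findGeminiAuthVars` and its types are hypothetical and do not reflect how the extension itself detects credentials.

```typescript
// Variable names taken from the migration table; everything else here
// (constant, type, and function names) is hypothetical.
const SUPPORTED_GEMINI_AUTH_VARS = [
  "GEMINI_API_KEY",
  "GOOGLE_API_KEY",
  "GOOGLE_APPLICATION_CREDENTIALS",
] as const;

type GeminiAuthVar = (typeof SUPPORTED_GEMINI_AUTH_VARS)[number];

// Returns the supported auth variables that are set to a non-empty value.
// The environment is passed in as a plain object so the function is easy to test.
function findGeminiAuthVars(
  env: Record<string, string | undefined>
): GeminiAuthVar[] {
  return SUPPORTED_GEMINI_AUTH_VARS.filter((name) => {
    const value = env[name];
    return value !== undefined && value.trim() !== "";
  });
}

// Example with a hand-written environment object.
console.log(findGeminiAuthVars({ GEMINI_API_KEY: "example-key" }));
```

In Node.js you could pass `process.env` instead of a hand-written object. An empty result suggests that none of the auth methods listed in the table is configured.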

Contributing

Feel free to open an issue to report bugs, request features, or request support for new providers/models.
Pull requests are welcome. See the roadmap.

Development

Build: npm run compile
Watch: npm run watch
Interactive release: npm run release
GitHub Actions release: Actions → Release (VS Code Extension) → Run workflow

License

MIT @ SmallMain

Acknowledgements

Awesome Codex CLI