Copilot Custom Provider

Adds a VS Code language model provider named Custom OpenAI Responses for services compatible with the OpenAI Responses API.

This extension uses VS Code's public LanguageModelChatProvider extension API. It does not replace or proxy GitHub Copilot's built-in models, and it is not the built-in Custom Endpoint provider. It adds separate custom models to the VS Code/Copilot model picker and sends requests to the endpoints you configure.

VS Code also documents a built-in Custom Endpoint/BYOK path for compatible third-party endpoints. This extension does not target that implementation or require the built-in Custom Endpoint UI. Those docs are used only as a reference for common model capability names and Responses API conventions.

Requirements

VS Code 1.121.0 or newer with the chatProvider API available.
One or more services compatible with the OpenAI Responses API.

Quick Setup

Configure one or more profiles in User Settings or workspace .vscode/settings.json:

{
  "copilotCustomProvider.profiles": [
    {
      "id": "host-a",
      "name": "Host A",
      "baseUrl": "https://host-a.example.com",
      "models": [
        {
          "id": "gpt-5.5",
          "name": "GPT-5.5 Medium",
          "toolCalling": true,
          "vision": true,
          "reasoningEffort": "medium",
          "patch": {
            "drop": {
              "truncation": true
            }
          }
        },
        {
          "id": "gpt-5.5-high",
          "apiModel": "gpt-5.5",
          "name": "GPT-5.5 High",
          "toolCalling": true,
          "vision": true,
          "reasoningEffort": "high"
        }
      ]
    },
    {
      "id": "host-b",
      "name": "Host B",
      "baseUrl": "https://host-b.example.com",
      "models": [
        {
          "id": "gpt-5.5",
          "name": "GPT-5.5",
          "toolCalling": true,
          "vision": true,
          "reasoningEffort": "medium",
          "supportedEndpoints": ["/responses", "ws:/responses"]
        }
      ]
    }
  ]
}

Then run this command for each profile that requires a key:

Custom OpenAI Responses: Set API Key

This step matters: settings.json defines the profiles and models, but the key is normally stored separately in VS Code SecretStorage. The command asks which profile to update. Custom models can appear in the Copilot/Chat model picker before a key is set, but requests fail until the selected profile has a key or sets requireApiKey to false.

Base URL

baseUrl is resolved to a Responses API URL:

https://host-a.example.com: requests are sent to https://host-a.example.com/v1/responses.
https://host-a.example.com/v1: requests are sent to https://host-a.example.com/v1/responses.
https://host-a.example.com/proxy: requests are sent to https://host-a.example.com/proxy/v1/responses.
URLs already containing /responses, /chat/completions, or /messages are treated as explicit endpoint URLs and used as configured.

The same rule applies to model-level baseUrl. If a model does not set baseUrl, it uses the profile baseUrl.
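
The resolution rule above can be sketched in TypeScript. This is an illustration only, not the extension's source: resolveResponsesUrl and effectiveBaseUrl are hypothetical names, and the handling of trailing slashes is an assumption the rules above do not cover.

```typescript
// Hypothetical helpers; the extension's real implementation may differ.
const EXPLICIT_ENDPOINT_MARKERS = ["/responses", "/chat/completions", "/messages"];

function resolveResponsesUrl(baseUrl: string): string {
  // Assumption: trailing slashes are ignored when deciding what to append.
  const trimmed = baseUrl.replace(/\/+$/, "");
  // Inspect only the path, so a host such as messages.example.com is not
  // mistaken for an explicit endpoint URL.
  const path = trimmed.replace(/^[a-z][a-z0-9+.-]*:\/\/[^/]+/i, "");
  if (EXPLICIT_ENDPOINT_MARKERS.some((marker) => path.includes(marker))) {
    return baseUrl; // explicit endpoint URL: used as configured
  }
  if (path.endsWith("/v1")) {
    return `${trimmed}/responses`;
  }
  return `${trimmed}/v1/responses`;
}

// A model-level baseUrl, when present, replaces the profile baseUrl.
function effectiveBaseUrl(profileBaseUrl?: string, modelBaseUrl?: string): string | undefined {
  return modelBaseUrl ?? profileBaseUrl;
}
```

Under these assumptions, resolveResponsesUrl("https://host-a.example.com/proxy") yields https://host-a.example.com/proxy/v1/responses, matching the third rule.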

API Keys

Keys should normally be set with Custom OpenAI Responses: Set API Key.

Keys are stored in this extension's VS Code SecretStorage, not in settings.json.
The secret is tied to profiles[].id. Internally it is stored as copilotCustomProvider.apiKey.<profile id>.
If you change a profile id, run Set API Key again for the new id.
profiles[].apiKey is supported as an inline fallback, but avoid it for real secrets, especially in workspace settings.
SecretStorage takes priority over inline apiKey.
Use Custom OpenAI Responses: Clear API Key to remove the stored key for one profile.
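
As a rough sketch of the storage key and lookup priority described above, assuming hypothetical helper names (secretKeyFor, pickApiKey) and assuming the stored value has already been read from SecretStorage:

```typescript
// Hypothetical helpers; names and shapes are illustrative assumptions.
function secretKeyFor(profileId: string): string {
  // The secret is tied to profiles[].id.
  return `copilotCustomProvider.apiKey.${profileId}`;
}

interface KeySources {
  storedKey?: string; // value read from SecretStorage, if any
  inlineKey?: string; // profiles[].apiKey from settings.json, if any
}

function pickApiKey(sources: KeySources): string | undefined {
  // SecretStorage takes priority; the inline key is only a fallback.
  return sources.storedKey ?? sources.inlineKey;
}
```

Because the key name embeds the profile id, renaming a profile changes the lookup key, which is why Set API Key has to be run again for the new id.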

When apiKeyHeader is omitted, requests use common OpenAI/Azure auth defaults:

URLs containing openai.azure use api-key: <key>.
Other URLs use Authorization: Bearer <key>.

Set apiKeyHeader and apiKeyPrefix only when the whole profile needs a different default. For per-model gateways, use models[].requestHeaders instead: an authorization or api-key header set there overrides the inferred auth header, and ${apiKey} in a header value is replaced with the profile key.
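
The header rules in this section might look like the following sketch. The function names are hypothetical, and two details are assumptions: that a non-empty apiKeyPrefix is joined to the key with a single space, and that a model-level authorization or api-key header replaces whichever default auth header was chosen.

```typescript
// Hypothetical sketch of auth header selection; not the extension's source.
interface AuthConfig {
  url: string;
  apiKey: string;
  apiKeyHeader?: string;
  apiKeyPrefix?: string;
}

function defaultAuthHeader(cfg: AuthConfig): [string, string] {
  if (cfg.apiKeyHeader !== undefined) {
    const prefix = cfg.apiKeyPrefix ?? "Bearer";
    // Assumption: prefix and key are separated by one space; "" means raw key.
    return [cfg.apiKeyHeader, prefix === "" ? cfg.apiKey : `${prefix} ${cfg.apiKey}`];
  }
  if (cfg.url.includes("openai.azure")) {
    return ["api-key", cfg.apiKey];
  }
  return ["Authorization", `Bearer ${cfg.apiKey}`];
}

function buildHeaders(
  cfg: AuthConfig,
  requestHeaders: Record<string, string> = {}
): Record<string, string> {
  const headers: Record<string, string> = {};
  const [authName, authValue] = defaultAuthHeader(cfg);
  headers[authName] = authValue;
  for (const name of Object.keys(requestHeaders)) {
    const lower = name.toLowerCase();
    if (lower === "authorization" || lower === "api-key") {
      delete headers[authName]; // model-level auth header overrides the default
    }
    // Replace every ${apiKey} placeholder with the profile key.
    headers[name] = requestHeaders[name].split("${apiKey}").join(cfg.apiKey);
  }
  return headers;
}
```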

Profile Fields

Field	Required	Default	Meaning
`id`	Yes	-	Stable profile id. Used for SecretStorage and default model ids. Must be unique.
`name`	No	`id`	Display name shown in model details and key prompts.
`baseUrl`	Usually	-	Service base URL for this profile. Unless it already contains `/responses`, `/chat/completions`, or `/messages`, `/v1/responses` is appended. Can be omitted if every model has its own `baseUrl`.
`apiKey`	No	-	Inline key fallback. Prefer the Set API Key command.
`requireApiKey`	No	`true`	Require a key when using this profile. Models still appear before a key is set. Set `false` for local/proxy endpoints that need no key.
`apiKeyHeader`	No	inferred	Optional profile-level auth header override. When omitted, `openai.azure` URLs use `api-key`; other URLs use `Authorization`.
`apiKeyPrefix`	No	`Bearer`	Prefix used when `apiKeyHeader` is explicitly set. Use `""` for raw key headers.
`extraHeaders`	No	`{}`	Extra static headers for this profile. Reserved, unsafe, and auth-related header overrides are ignored. Do not put secrets here.
`requestBodyOverrides`	No	`{}`	JSON fields merged into every request for this profile.
`models`	No	GPT-5 default	Models shown under this profile. When omitted, a default GPT-5 model is used.

Model Fields

Field	Required	Default	Meaning
`id`	Yes	-	Model id exposed under this profile. VS Code/Copilot sees `<profile id>/<model id>`. Sent as the Responses API `model` only when `apiModel` is not set.
`apiModel`	No	`id`	Actual model value sent to the Responses API request. Use this when the exposed model id should differ from the upstream model id.
`name`	No	`id`	Base display name. Used as `${modelName}` in `modelNameTemplate`.
`baseUrl`	No	profile `baseUrl`	Per-model base URL or full Responses API URL override. Uses the same automatic `/v1/responses` rule, profile key, and headers.
`family`	No	`apiModel` or `id`	Model family advertised to VS Code.
`version`	No	`1`	Model version advertised to VS Code.
`maxInputTokens`	No	`128000`	Input token budget advertised to VS Code.
`maxOutputTokens`	No	`16384`	Output token budget advertised and sent as `max_output_tokens`.
`toolCalling`	No	`false`	Advertise tool support and forward VS Code tools to the Responses API request.
`vision`	No	`false`	Advertise image input support. Image data is sent as Responses API `input_image`.
`thinking`	No	`false`	Advertise thinking support. When `true`, requests include `reasoning.encrypted_content`, encrypted reasoning items are round-tripped, and `temperature` is removed from the final body.
`streaming`	No	global setting	Set `false` to send `stream: false` for this model even when global streaming is enabled.
`editTools`	No	-	Edit tool hints exposed through VS Code model capabilities: `find-replace`, `multi-find-replace`, `apply-patch`, `code-rewrite`.
`reasoningEffort`	No	unset	Preferred/default reasoning effort for this model. The value can be any string, but it is sent only if included in the model's advertised effort list.
`supportsReasoningEffort`	No	default five levels	Reasoning effort levels accepted by the model. Because this provider is Responses-only, omit it or set `[]` to use the provider default five levels, or set a non-empty array to use those exact picker values.
`temperature`	No	-	Sent as `temperature` when set. Removed from the final request body for thinking models, matching VS Code BYOK behavior.
`topP`	No	-	Sent as `top_p` when set.
`zeroDataRetentionEnabled`	No	`false`	Uses the common BYOK/Custom Endpoint field name. When `true`, `previous_response_id` is not sent and requests use `store: false`.
`supportedEndpoints`	No	`["/responses"]`	Endpoint mode metadata for this extension. Keep the default for HTTP/SSE. Include `ws:/responses` when the model/endpoint supports Responses WebSocket v2.
`requestHeaders`	No	`{}`	Model-level request headers. Auth headers can override the inferred default, and `${apiKey}` is interpolated.
`extraBody`	No	`{}`	Extra JSON fields merged into requests for this model.
`patch.drop.truncation`	No	`false`	Deletes top-level `truncation` for third-party relay APIs that cannot handle it. Default `false` keeps request semantics unchanged.
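
To make the defaults in the table concrete, here is a minimal sketch that fills in omitted model fields. ModelConfig and withModelDefaults are hypothetical names covering only a subset of the fields, and treating version as a string is an assumption.

```typescript
// Hypothetical subset of the model fields listed above.
interface ModelConfig {
  id: string;
  apiModel?: string;
  name?: string;
  family?: string;
  version?: string;
  maxInputTokens?: number;
  maxOutputTokens?: number;
  toolCalling?: boolean;
  vision?: boolean;
  thinking?: boolean;
  supportedEndpoints?: string[];
}

function withModelDefaults(model: ModelConfig): Required<ModelConfig> {
  const apiModel = model.apiModel ?? model.id;
  return {
    id: model.id,
    apiModel,
    name: model.name ?? model.id,
    family: model.family ?? apiModel, // "apiModel or id"
    version: model.version ?? "1", // assumption: version is a string
    maxInputTokens: model.maxInputTokens ?? 128000,
    maxOutputTokens: model.maxOutputTokens ?? 16384,
    toolCalling: model.toolCalling ?? false,
    vision: model.vision ?? false,
    thinking: model.thinking ?? false,
    supportedEndpoints: model.supportedEndpoints ?? ["/responses"],
  };
}
```

Calling withModelDefaults({ id: "gpt-5.5" }) produces an entry with tool calling and vision off, which is why the Quick Setup example enables both explicitly.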

Global Settings

Field	Default	Meaning
`copilotCustomProvider.enabled`	`true`	Enable or disable the provider.
`copilotCustomProvider.defaultReasoningEffort`	`medium`	Used when a model does not set `reasoningEffort`. It is sent only if the value is included in that model's advertised effort list.
`copilotCustomProvider.requestTimeoutMs`	`120000`	HTTP timeout in milliseconds.
`copilotCustomProvider.enableStreaming`	`true`	Request streaming responses.
`copilotCustomProvider.maxRetries`	`1`	Retry count for failed non-cancelled HTTP requests.
`copilotCustomProvider.tokenEstimateCharsPerToken`	`4`	Fallback token estimate used by VS Code.
`copilotCustomProvider.modelNameTemplate`	`${profileName}/${modelName}`	Template for model names shown in the picker.
`copilotCustomProvider.logLevel`	`off`	Output logging level. Set `debug` to log outgoing HTTP request headers and body.
`copilotCustomProvider.logRequests`	`false`	Legacy metadata logging switch. Prefer `logLevel`.
`copilotCustomProvider.requestBodyOverrides`	`{}`	JSON fields merged into every request.

modelNameTemplate supports ${profileId}, ${profileName}, ${modelId}, ${modelName}, ${apiModel}, ${reasoningEffort}, and ${baseUrlHost}.
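
Template expansion could work roughly as follows. renderModelName is a hypothetical helper, and leaving unknown placeholders untouched is an assumption, since the text does not say how unsupported variables are handled.

```typescript
// Hypothetical sketch of ${...} placeholder substitution.
type TemplateVars = Record<string, string | undefined>;

function renderModelName(template: string, vars: TemplateVars): string {
  return template.replace(/\$\{(\w+)\}/g, (match: string, key: string) => {
    const value = vars[key];
    // Assumption: placeholders without a value are left as written.
    return value !== undefined ? value : match;
  });
}

const example = renderModelName("${profileName}/${modelName} (${reasoningEffort})", {
  profileName: "Host A",
  modelName: "GPT-5.5",
  reasoningEffort: "high",
});
// example is "Host A/GPT-5.5 (high)"
```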

Model Names and IDs

By default, model names follow the template ${profileName}/${modelName}, for example:

Host A/GPT-5.5

This keeps models from different profiles distinguishable inside the single Custom OpenAI Responses provider group.

VS Code model ids must be unique inside this provider. Upstream model ids do not have to be unique.

For example, two profiles can both use:

{ "id": "gpt-5.5" }

They appear to VS Code as host-a/gpt-5.5 and host-b/gpt-5.5, while each service receives:

{ "model": "gpt-5.5" }

If one profile exposes the same upstream model multiple times, give each picker entry a different id and point them at the same apiModel:

[
  { "id": "gpt-5.5-medium", "apiModel": "gpt-5.5", "reasoningEffort": "medium" },
  { "id": "gpt-5.5-high", "apiModel": "gpt-5.5", "reasoningEffort": "high" }
]
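
The id mapping can be summarized in a short sketch. pickerId, upstreamModel, and findDuplicatePickerIds are hypothetical names used only for illustration.

```typescript
// Hypothetical helpers showing how exposed ids and upstream ids relate.
interface ModelEntry {
  id: string;
  apiModel?: string;
}

interface ProfileEntry {
  id: string;
  models: ModelEntry[];
}

// The id VS Code/Copilot sees: <profile id>/<model id>.
function pickerId(profileId: string, model: ModelEntry): string {
  return `${profileId}/${model.id}`;
}

// The value sent as "model" in the Responses API request.
function upstreamModel(model: ModelEntry): string {
  return model.apiModel ?? model.id;
}

// Picker ids must be unique inside the provider; upstream ids need not be.
function findDuplicatePickerIds(profiles: ProfileEntry[]): string[] {
  const seen = new Set<string>();
  const duplicates: string[] = [];
  for (const profile of profiles) {
    for (const model of profile.models) {
      const id = pickerId(profile.id, model);
      if (seen.has(id)) {
        duplicates.push(id);
      }
      seen.add(id);
    }
  }
  return duplicates;
}
```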

Reasoning Effort

This provider is Responses-only, so each configured model exposes Copilot's native Thinking Effort picker by default. Omit supportsReasoningEffort or set it to [] to use the provider default five levels, currently minimal, low, medium, high, and xhigh. Set a non-empty array to use those exact picker values. The selected value is received as options.modelConfiguration.reasoningEffort and sent to the Responses API as nested reasoning.effort.

Request priority is options.modelConfiguration.reasoningEffort, then options.modelOptions.reasoningEffort, then options.modelOptions.reasoning.effort, then model reasoningEffort. If no request value exists, the default is chosen from the advertised enum: model reasoningEffort, then global defaultReasoningEffort, then high for Claude families or medium for others, then the first advertised level.
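
One way to read the priority chain is as a single ordered list of candidates, where the first candidate found in the advertised levels wins. The sketch below follows that reading. All names are hypothetical, detecting Claude families by substring is an assumption, and so is skipping request values that are not advertised.

```typescript
// Hypothetical sketch of reasoning effort resolution.
const DEFAULT_LEVELS = ["minimal", "low", "medium", "high", "xhigh"];

interface EffortInputs {
  modelConfigurationEffort?: string; // options.modelConfiguration.reasoningEffort
  modelOptionsEffort?: string; // options.modelOptions.reasoningEffort
  modelOptionsNestedEffort?: string; // options.modelOptions.reasoning.effort
  modelEffort?: string; // models[].reasoningEffort
  globalDefaultEffort?: string; // copilotCustomProvider.defaultReasoningEffort
  family?: string;
  supportsReasoningEffort?: string[];
}

function advertisedLevels(levels?: string[]): string[] {
  // Omitted or [] means the provider default five levels.
  return levels && levels.length > 0 ? levels : DEFAULT_LEVELS;
}

function resolveEffort(inputs: EffortInputs): string {
  const levels = advertisedLevels(inputs.supportsReasoningEffort);
  // Assumption: a family counts as Claude if its name contains "claude".
  const isClaude = (inputs.family ?? "").toLowerCase().includes("claude");
  const candidates = [
    inputs.modelConfigurationEffort,
    inputs.modelOptionsEffort,
    inputs.modelOptionsNestedEffort,
    inputs.modelEffort,
    inputs.globalDefaultEffort,
    isClaude ? "high" : "medium",
  ];
  for (const candidate of candidates) {
    if (candidate !== undefined && levels.indexOf(candidate) !== -1) {
      return candidate;
    }
  }
  return levels[0]; // last resort: first advertised level
}

// The resolved value is sent as nested reasoning.effort.
function effortBody(effort: string): { reasoning: { effort: string } } {
  return { reasoning: { effort } };
}
```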

Relay Compatibility

patch.drop.truncation is a compatibility switch. Some third-party relay APIs do not correctly process Copilot's truncation: "disabled" field in Responses API requests. If debugging shows that this field causes the relay to reject or mishandle requests, set patch.drop.truncation to true for that model.
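
In effect, the switch removes one top-level field from the outgoing body. A minimal sketch, assuming a hypothetical applyRelayPatch helper that runs just before the request is sent:

```typescript
// Hypothetical helper; shows only the effect of patch.drop.truncation.
function applyRelayPatch(
  body: Record<string, unknown>,
  dropTruncation: boolean
): Record<string, unknown> {
  if (!dropTruncation) {
    return body; // default: request semantics unchanged
  }
  const patched: Record<string, unknown> = { ...body };
  delete patched["truncation"]; // only the top-level field is removed
  return patched;
}
```

The input object is copied rather than mutated, so the unpatched body stays available for logging or for a retry against a different endpoint.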

Endpoint Modes

By default, this extension uses HTTP/SSE for Responses requests:

{
  "supportedEndpoints": ["/responses"]
}

To use Responses WebSocket v2 for a model, declare:

{
  "supportedEndpoints": ["/responses", "ws:/responses"]
}

Only add ws:/responses when your service explicitly supports the Responses WebSocket API. The setting is a capability declaration; the extension does not probe or guess WebSocket support from the URL.
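
Since the setting is a declaration rather than a probe, transport selection can be pictured as a simple lookup. pickTransport and the Transport type are hypothetical, and the sketch assumes that declaring ws:/responses is what switches a model to WebSocket.

```typescript
// Hypothetical sketch; the transport names are illustrative.
type Transport = "http-sse" | "websocket";

function pickTransport(supportedEndpoints?: string[]): Transport {
  const endpoints =
    supportedEndpoints && supportedEndpoints.length > 0 ? supportedEndpoints : ["/responses"];
  // No probing: only the declared capability is consulted.
  return endpoints.indexOf("ws:/responses") !== -1 ? "websocket" : "http-sse";
}
```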

Debug Logs

For request debugging, set:

{
  "copilotCustomProvider.logLevel": "debug"
}

Debug logs are written to the Custom OpenAI Responses output channel. API key headers are redacted, but request bodies can contain prompt and workspace content.
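
Header redaction of the kind described above might be sketched like this. redactHeaders is a hypothetical helper, and both the placeholder text and the default list of sensitive header names are assumptions. Note that it covers headers only, so request bodies still need care.

```typescript
// Hypothetical sketch of redacting auth headers before logging.
function redactHeaders(
  headers: Record<string, string>,
  secretNames: string[] = ["authorization", "api-key"]
): Record<string, string> {
  const redacted: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    const isSecret = secretNames.indexOf(name.toLowerCase()) !== -1;
    redacted[name] = isSecret ? "<redacted>" : headers[name];
  }
  return redacted;
}
```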

tzraeq