# Copilot Proxy
## About

Copilot Proxy is a VS Code extension that exposes GitHub Copilot's language models through local OpenAI-compatible and Anthropic-compatible API servers. This lets you leverage your existing Copilot subscription to power external applications, scripts, and tools - no additional API costs, just your Copilot subscription. Perfect for developers who want to use Copilot's models in custom workflows, automation scripts, or with tools that expect an OpenAI- or Anthropic-compatible endpoint - including Claude Code.
## Features
## Prerequisites
## Installation

### Manual Install (Preferred)

### From Source (debugging/launching may not work)
## Usage

### Starting the Server

The server starts automatically by default. You can also:
### Status Bar

The status bar shows the current server state:
Click the status bar item to open the interactive status panel.

### Status Panel

The status panel provides:
### Output Logging

View real-time logs in VS Code's Output panel (select "Copilot Proxy" from the dropdown):
Example output:
## Using with External Tools

### Example Scripts

Two Python examples are included in the examples/ folder.

#### Simple Example
| Environment Variable | Default | Description |
|---|---|---|
| `VSCODE_LLM_ENDPOINT` | `http://127.0.0.1:8080/v1/chat/completions` | Proxy endpoint URL |
| `VSCODE_LLM_FALLBACK` | `true` | Enable/disable Anthropic fallback |
| `ANTHROPIC_API_KEY` | (none) | Required for fallback support |
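Scripts that read these settings need only a few lines; a minimal sketch (the `load_config` helper is hypothetical, with defaults copied from the table above):

```python
def load_config(env):
    """Resolve proxy settings from an environment mapping.
    Defaults match the table above."""
    return {
        "endpoint": env.get(
            "VSCODE_LLM_ENDPOINT",
            "http://127.0.0.1:8080/v1/chat/completions",
        ),
        "fallback": env.get("VSCODE_LLM_FALLBACK", "true").lower() == "true",
        "api_key": env.get("ANTHROPIC_API_KEY"),  # only needed for fallback
    }

# Pass os.environ in real code; an empty mapping shows the defaults:
print(load_config({})["endpoint"])  # http://127.0.0.1:8080/v1/chat/completions
```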
### With Python (OpenAI client)

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8080/v1",
    api_key="not-needed"  # Any value works
)

response = client.chat.completions.create(
    model="claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```
### With Python (streaming)

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8080/v1",
    api_key="not-needed"
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a short poem"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
### With curl (streaming)

```bash
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3.5-sonnet",
    "messages": [{"role": "user", "content": "Write a haiku"}],
    "stream": true
  }'
```
### With Node.js

```javascript
const response = await fetch('http://127.0.0.1:8080/v1/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'claude-3.5-sonnet',
    messages: [{ role: 'user', content: 'Hello!' }]
  })
});
const data = await response.json();
console.log(data.choices[0].message.content);
```
### With Claude Code

You can use the proxy to run Claude Code through your Copilot subscription - no Anthropic API key required.

PowerShell:

```powershell
$env:ANTHROPIC_BASE_URL = 'http://127.0.0.1:8080'
$env:ANTHROPIC_MODEL = 'claude-opus-4.6'
$env:ANTHROPIC_AUTH_TOKEN = 'a'
$env:ANTHROPIC_API_KEY = 'a'
claude
```

Bash/Zsh:

```bash
export ANTHROPIC_BASE_URL="http://127.0.0.1:8080"
export ANTHROPIC_MODEL="claude-opus-4.6"
export ANTHROPIC_AUTH_TOKEN="a"
export ANTHROPIC_API_KEY="a"
claude
```
Note: The `ANTHROPIC_API_KEY` and `ANTHROPIC_AUTH_TOKEN` values can be anything - the proxy doesn't validate them. `ANTHROPIC_MODEL` should be a valid Anthropic model name that Claude Code recognizes. The proxy maps all requests to the best available Copilot model (or the model configured in `copilotProxy.defaultModel`).
### With Anthropic Python SDK

```python
import anthropic

client = anthropic.Anthropic(
    base_url="http://127.0.0.1:8080",
    api_key="not-needed"  # Any value works
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)
```
### With curl (Anthropic format)

```bash
curl -X POST http://127.0.0.1:8080/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: a" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
### With LangChain

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://127.0.0.1:8080/v1",
    api_key="not-needed",
    model="claude-3.5-sonnet"
)

response = llm.invoke("What is the capital of France?")
print(response.content)
```
## API Endpoints
Once running, the following endpoints are available:
### POST /v1/messages
Anthropic-compatible messages endpoint. Works with the Anthropic SDK and Claude Code.
```bash
curl -X POST http://127.0.0.1:8080/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: a" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'
```
Request Body:

- `model` (optional): Model name. The proxy maps this to the best available Copilot model.
- `messages`: Array of messages with `role` (`user`, `assistant`) and `content`
- `max_tokens` (required): Maximum tokens to generate
- `stream` (optional): Set to `true` for streaming responses (SSE format)
- `tools` (optional): Array of tool definitions (Anthropic format)
- `tool_choice` (optional): Tool choice configuration
- `use_vscode_tools` (optional): Include VS Code registered tools
- `tool_execution` (optional): `"none"` or `"auto"` for auto-execute mode
Response:

```json
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [{
    "type": "text",
    "text": "Hello! How can I help you today?"
  }],
  "model": "copilot-claude-3.5-sonnet",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 0,
    "output_tokens": 0
  }
}
```
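Client code usually wants just the text out of the `content` array; a small sketch of flattening the blocks in the response shape above (the `extract_text` helper is illustrative, not part of the proxy):

```python
def extract_text(message):
    """Concatenate the text blocks of an Anthropic-format message."""
    return "".join(
        block["text"]
        for block in message["content"]
        if block["type"] == "text"
    )

msg = {"content": [{"type": "text", "text": "Hello! How can I help you today?"}]}
print(extract_text(msg))  # Hello! How can I help you today?
```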
### POST /v1/chat/completions
OpenAI-compatible chat completions endpoint.
```bash
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3.5-sonnet",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'
```
Request Body:

- `model` (optional): Model ID or partial name to match. If omitted, uses the default model setting or the first available model.
- `messages`: Array of chat messages with `role` (`system`, `user`, `assistant`) and `content`
- `stream` (optional): Set to `true` for streaming responses (SSE format)
- `temperature` (optional): Accepted but not forwarded to the VS Code API
- `max_tokens` (optional): Accepted but not forwarded to the VS Code API
Response (non-streaming):

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "copilot-claude-3.5-sonnet",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
```
Response (streaming):
Server-Sent Events (SSE) format compatible with OpenAI's streaming API.
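Each streamed event is a `data: {...}` line carrying one chunk, terminated by a `data: [DONE]` sentinel. Most clients handle this for you; a minimal sketch of extracting the content delta from one line, assuming that framing (the `parse_sse_line` helper is illustrative):

```python
import json

def parse_sse_line(line):
    """Return the content delta from one OpenAI-style SSE data line,
    or None for blank lines, comments, and the [DONE] sentinel."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    return chunk["choices"][0]["delta"].get("content")

line = 'data: {"choices": [{"index": 0, "delta": {"content": "Hi"}}]}'
print(parse_sse_line(line))  # Hi
```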
### GET /v1/models
List available models.
```bash
curl http://127.0.0.1:8080/v1/models
```
Response:

```json
{
  "object": "list",
  "data": [
    {
      "id": "copilot-claude-3.5-sonnet",
      "object": "model",
      "created": 1234567890,
      "owned_by": "copilot",
      "name": "Claude 3.5 Sonnet",
      "family": "claude-3.5-sonnet",
      "version": "1.0",
      "maxInputTokens": 16384
    }
  ]
}
```
### GET /health
Health check endpoint.
```bash
curl http://127.0.0.1:8080/health
```
Response:

```json
{
  "status": "ok",
  "models_available": 5
}
```
### GET /v1/tools
List available tools from VS Code (built-in, extensions, and MCP servers).
```bash
# List all tools
curl http://127.0.0.1:8080/v1/tools

# Filter by tags
curl "http://127.0.0.1:8080/v1/tools?tags=vscode,editor"

# Filter by name pattern
curl "http://127.0.0.1:8080/v1/tools?name=get_*"
```
Response:

```json
{
  "object": "list",
  "data": [
    {
      "name": "get_open_editors",
      "description": "Get list of currently open editors",
      "inputSchema": { "type": "object", "properties": {} },
      "tags": ["vscode", "editor"]
    }
  ]
}
```
## Tool Calling
The proxy supports OpenAI-compatible tool/function calling, allowing models to invoke tools and receive results.
### Pass-Through Mode (Default)
In pass-through mode, the proxy returns tool calls to your application. You execute the tools and send results back.
```bash
# Step 1: Send request with tools
curl -X POST http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3.5-sonnet",
    "messages": [{"role": "user", "content": "What is the weather in London?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "City name"}
          },
          "required": ["location"]
        }
      }
    }]
  }'

# Response includes tool_calls - execute the tool, then send results back
```
See `examples/vscode_llm_tools_simple.py` for a complete pass-through example.
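The key step in pass-through mode is the second request: the conversation must include the assistant's tool call and a `tool` message carrying your result. A sketch of building that follow-up using OpenAI's message shapes (the `append_tool_result` helper is hypothetical):

```python
import json

def append_tool_result(messages, tool_call, result):
    """Extend the conversation so the model sees its own tool call
    plus the locally executed result on the next request."""
    messages.append({
        "role": "assistant",
        "content": None,
        "tool_calls": [tool_call],
    })
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    })
    return messages

history = [{"role": "user", "content": "What is the weather in London?"}]
call = {"id": "call_1", "type": "function",
        "function": {"name": "get_weather",
                     "arguments": "{\"location\": \"London\"}"}}
history = append_tool_result(history, call, {"temp_c": 12})
# history is now user -> assistant(tool_calls) -> tool, ready to resend
```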
### Auto-Execute Mode
In auto-execute mode, the proxy handles tool execution using VS Code's registered tools. You just send a request and get the final answer.
```bash
curl -X POST http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3.5-sonnet",
    "messages": [{"role": "user", "content": "List files in the src folder"}],
    "use_vscode_tools": true,
    "tool_execution": "auto",
    "max_tool_rounds": 5
  }'
```
Tool Calling Options:

| Option | Type | Default | Description |
|---|---|---|---|
| `tools` | array | - | Array of tool definitions (OpenAI format) |
| `tool_choice` | string | `"auto"` | `"none"`, `"auto"`, or `"required"` |
| `use_vscode_tools` | boolean | `false` | Include all VS Code registered tools |
| `tool_execution` | string | `"none"` | `"none"` (pass-through) or `"auto"` (proxy executes) |
| `max_tool_rounds` | number | `10` | Max iterations in auto mode (`0` = unlimited) |
See `examples/vscode_llm_tools_auto.py` for a complete auto-execute example.
## Tool Calling Examples

Three tool calling examples are included in the `examples/` folder:
### List Tools (`vscode_llm_list_tools.py`)

Discover what tools are available in VS Code:

```bash
py examples/vscode_llm_list_tools.py
py examples/vscode_llm_list_tools.py --tags vscode
py examples/vscode_llm_list_tools.py --schema   # Show parameter schemas
```
### Pass-Through Mode (`vscode_llm_tools_simple.py`)

Handle tool calls yourself - useful when you need control over tool execution:

```bash
py examples/vscode_llm_tools_simple.py
```
### Auto-Execute Mode (`vscode_llm_tools_auto.py`)

Let the proxy handle everything - just ask and get answers:

```bash
py examples/vscode_llm_tools_auto.py
```
## Configuration

Settings available in VS Code Settings (search for "Copilot Proxy"):

| Setting | Default | Description |
|---|---|---|
| `copilotProxy.port` | `8080` | Port number for the proxy server |
| `copilotProxy.autoStart` | `true` | Automatically start when VS Code opens |
| `copilotProxy.defaultModel` | `""` | Default model when not specified in a request (leave empty for first available) |
## Commands

- `Copilot Proxy: Start Server` - Start the proxy server
- `Copilot Proxy: Stop Server` - Stop the proxy server
- `Copilot Proxy: Show Status` - Open the interactive status panel
## Limitations
- **System Messages**: The VS Code LM API has no system role - system messages are converted to user messages
- **Token Counts**: Token counts in responses are always 0 (the VS Code API doesn't expose them)
- **Temperature/Max Tokens**: These parameters are accepted but not forwarded to the underlying API
- **Request Size**: The maximum request body size is 10 MB (larger requests receive a 413 error)
- **Request Timeout**: Requests time out after 30 seconds (and receive a 408 error)
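The system-message limitation is easy to account for on the client side; an illustrative sketch of the documented conversion (the proxy's exact internal handling may differ):

```python
def convert_system_messages(messages):
    """Re-role system messages as user messages, mirroring the
    documented behavior of the VS Code LM API bridge."""
    return [
        {**m, "role": "user"} if m["role"] == "system" else m
        for m in messages
    ]

msgs = [{"role": "system", "content": "Be terse."},
        {"role": "user", "content": "Hi"}]
print(convert_system_messages(msgs)[0]["role"])  # user
```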
## Security
Copilot Proxy is designed for local development use. The following security considerations apply:
### Localhost-Only Binding

The server binds to `127.0.0.1` (localhost) by default. This means:
- Only applications on your local machine can access the proxy
- The server is not accessible from other devices on your network
- This is intentional to prevent unauthorized access
### No Authentication
The API does not require authentication because:
- It's designed for trusted local applications only
- Your Copilot subscription credentials are managed securely by VS Code
- Adding authentication would add friction without meaningful security benefit in a localhost context
### CORS Configuration

The server allows all origins (`Access-Control-Allow-Origin: *`) because:
- Browser-based local development tools need CORS headers
- Localhost binding already limits access to local applications
- Restrictive CORS would break integration with local web tools
### Request Limits
The following limits protect against resource exhaustion:
| Limit | Value | Purpose |
|---|---|---|
| Request body size | 10 MB | Prevents memory exhaustion |
| Request timeout | 30 seconds | Prevents connection exhaustion |
| Keep-alive timeout | 5 seconds | Manages idle connections |
### Best Practices

- Do not expose the proxy to the network (don't change the binding to `0.0.0.0`)
- Do not run in production environments
- The proxy is for development and testing only
## Troubleshooting

### "No language models available"
- Ensure the GitHub Copilot extension is installed
- Ensure you're signed into GitHub with Copilot access
- Try running `GitHub Copilot: Sign In` from the Command Palette
- Check the Output panel for error details
"Port already in use"
- Change the port in settings (
copilotProxy.port) - Or stop whatever is using that port
### Model not found

- Use `GET /v1/models` to see available models
- Model matching is flexible: `claude`, `sonnet`, or `claude-3.5-sonnet` all work
- Check the Output panel to see which model was selected
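The flexible matching described above behaves roughly like an exact-then-substring lookup; an illustrative sketch (the proxy's real matching rules may differ):

```python
def match_model(requested, available):
    """Exact id match first, then case-insensitive substring match."""
    req = requested.lower()
    for model in available:
        if model.lower() == req:
            return model
    for model in available:
        if req in model.lower():
            return model
    return None

models = ["copilot-claude-3.5-sonnet", "copilot-gpt-4o"]
print(match_model("sonnet", models))  # copilot-claude-3.5-sonnet
```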
### Check the Logs
Open VS Code's Output panel and select "Copilot Proxy" from the dropdown to see detailed logs including:
- All errors with timestamps
- Request/response details
- Model selection information
## License
MIT