LLS OAI - OpenAI-compatible & Anthropic for Copilot Chat
A VS Code extension that integrates multiple OpenAI-compatible and Anthropic API providers into GitHub Copilot Chat.
Features
- 🚀 Multiple Provider Support - Add and manage multiple OpenAI-compatible and Anthropic API providers
- 🦙 Local LLM Friendly - OpenAI-compatible and Responses API providers can be used without an API key, making local runtimes such as llama.cpp, LM Studio, Ollama-compatible endpoints, vLLM, and LocalAI easy to connect
- 🔐 Secure Key Storage - API keys are stored securely using VS Code's secret storage
- 🎨 Beautiful Configuration UI - Easy-to-use webview interface for managing providers
- 🔌 Copilot Integration - Seamlessly integrates with GitHub Copilot Chat
- 📤 Import/Export Config - Backup and restore your provider configurations
- 💾 Auto Save Chat History - Automatically save chat conversations to local files with dual-path support (global + project-level)
- 🔄 Expert Mode Chat History - Automatically saves Expert Mode conversations including tool calls and multi-turn expert interactions
- 🔄 Copilot Records Migration - Import and export Copilot chat records between machines
- 🌐 Multi-language UI - Supports English, Simplified Chinese, Traditional Chinese, Korean, Japanese, French, and German. Auto mode follows the VS Code display language and falls back to English for unsupported languages
- 🖥️ Global & Project System Prompt Settings - Dual system prompt inputs (global + workspace-scoped) appended to user messages for better model adherence
- ✅ Enhanced TODO Settings - When enabled, the model is strongly instructed to create, track, and update all tasks through the TODO tool before taking any action
- 🎯 Expert Mode - Use mid/low-tier models for development tasks and high-tier models as expert reviewers for supplementation and quality assurance
- 🧭 Solution Provider - Delegate solution design, implementation planning, and architecture proposals to a dedicated solution model, with optional expert review before finalizing
- ✨ Prompt Enhancement - Optimize prompts with a dedicated provider/model before sending or saving them. Supports global and workspace configuration, provider/model overrides, auto-submit behavior, and system prompt optimization buttons
- 🖼️ Vision/Multimodal Input Support - Supports image input forwarding across OpenAI-compatible Chat Completions, Responses API, and Anthropic Messages API providers
- 🧠 Strict Reasoning Compatibility - Preserves and replays
reasoning_content for tool-call turns, improving compatibility with DeepSeek Reasoner and other strict OpenAI-compatible providers
✨ Prompt Enhancement
Prompt Enhancement lets you use a selected model to rewrite, clarify, and structure prompts before they are used. It is useful when you want a lightweight or specialized model to turn a rough request into a more precise instruction for Copilot Chat or for your custom system prompts.
Key Features
- Dedicated Optimization Model — Choose a Prompt Enhancement provider and model independently from your normal chat model.
- Global and Workspace Settings — Enable Prompt Enhancement globally, or use workspace-level
Use global / Enabled / Disabled controls for project-specific behavior.
- Provider/Model Overrides — Workspace settings can override the global Prompt Enhancement provider/model when both values are selected.
- System Prompt Optimization — Global and project system prompt editors include an Optimize button that rewrites the current prompt using the selected Prompt Enhancement model.
- Uses Current Dropdown Selection — System prompt optimization uses the provider/model currently selected in the settings dropdowns, even before you save the settings.
- Optional Auto-submit — Prompt Enhancement can insert the optimized prompt into Chat as a draft or submit it automatically, depending on your configuration.
- Localized UI — Prompt Enhancement controls are localized across English, Simplified Chinese, Traditional Chinese, Korean, Japanese, French, and German.
Configuration
Prompt Enhancement is configured in the settings UI:
- Global Settings — Enable/disable Prompt Enhancement, select the optimization provider/model, and configure auto-submit behavior.
- Project Settings — Follow global settings or force Prompt Enhancement on/off for the current workspace. You can also override the optimization provider/model for that project.
If a project provider/model override is incomplete, the extension falls back to the global Prompt Enhancement provider/model.
🧠 Strict Reasoning Content Compatibility
Some OpenAI-compatible reasoning models, such as DeepSeek Reasoner, require the original assistant reasoning_content to be included again when a conversation continues after tool results. LLS OAI preserves this content for tool-call turns and restores it to the matching assistant tool-call message by tool call ID.
Reasoning cache files are stored locally per chat session:
~/.LLSOAI/reasoning/<session-id>.json
This compatibility layer helps avoid strict-provider errors when using tools, Expert Mode, or Solution Provider flows with models that validate reasoning-content continuity.
🦙 Local LLMs Without API Keys
OpenAI-compatible and Responses API providers support an optional API key. This makes it possible to connect local model servers that do not require authentication, such as llama.cpp, LM Studio, Ollama-compatible OpenAI endpoints, vLLM, and LocalAI.
Example local llama.cpp-style configuration:
| Field |
Example |
| API Type |
OpenAI-Compatible |
| Base URL |
http://127.0.0.1:8080/v1 |
| API Key |
Leave empty |
When the API key is empty, the extension does not send an empty Authorization: Bearer header. If your provider does require authentication, simply enter the API key and the normal Authorization: Bearer <key> header will be sent.
Notes:
- Anthropic providers still require an API key.
- Expert Mode and Solution Provider selectors intentionally show only providers with a configured API key. No-key local providers are supported for normal chat models.
- If a local server does not expose
/models, you can disable auto-fetch and add models manually in the provider UI.
🎯 Expert Mode
Use mid/low-tier models for daily development work, and bring in high-tier models as expert reviewers when you need deeper insight — fast, cost-effective, and powerful.
Expert Mode enables a dual-model workflow that maximizes both efficiency and quality. Your primary model handles the bulk of development tasks at high speed and low cost, while a high-tier expert model is called in to review, supplement, and elevate the output.
How It Works
- Main Model — Your configured primary model (e.g., GPT-4o-mini, Claude Haiku) takes on all development tasks. It's fast, affordable, and capable for the majority of day-to-day coding work.
- Expert Model — A high-tier model (e.g., GPT-4o, Claude 3.5/3.7 Sonnet/Opus) that reviews the main model's output and adds expert-level corrections, improvements, and additional context.
The Workflow
[User Request]
↓
[Main Model — Mid/Low-tier]
Fast, affordable development
↓
[Expert Model — High-tier]
Expert review & supplement
↓
[Enhanced Response → Copilot Chat]
When to Use
| Scenario |
Main Model |
Expert Model |
| Routine coding, refactoring, bug fixes |
✅ |
Optional |
| Complex architecture decisions |
✅ |
✅ Recommended |
| Security-sensitive or critical code |
✅ |
✅ Recommended |
| Deep reasoning or edge-case analysis |
✅ |
✅ Recommended |
Configuration
Expert Mode is configured in the provider settings:
- Main Model Tool Name — The tool name (e.g.,
ask_llsoai) that triggers the main model in Copilot Chat
- Expert Model Tool Name — The tool name for the expert model (e.g.,
ask_llsoai_expert)
- Expert Tool Invocation — The main model is guided by a system prompt to call the expert tool (
ask_llsoai) when it cannot confidently solve a task, needs independent verification, requires deeper investigation, or when you explicitly ask it to delegate to the expert. The expert model can use the same VS Code tools as the main model and returns its findings for the main model to incorporate into the final response.
- Expert Settings Hint — A custom hint displayed in the Expert Mode settings panel to guide the expert model's behavior
Benefits
- 💰 Cost Efficiency — Handle the majority of tasks with affordable mid/low-tier models
- ⚡ Speed — Main model responses are fast, reducing wait time during development
- 🧠 Quality Assurance — Expert model reviews catch issues that smaller models might miss
- 🔧 Flexible — Choose how often the expert model is involved based on your needs
- 🔗 Seamless — Expert model output is automatically integrated into the Copilot Chat conversation
- 💾 Chat History — Expert Mode conversations are automatically saved to chat history, including tool calls and multi-turn interactions
🧭 Solution Provider
Delegate solution design, implementation planning, and architecture proposals to a dedicated solution model — with optional expert review before finalizing.
Solution Provider enables a three-layer collaboration workflow that maximizes planning quality:
[User Request]
↓
[Main Model]
Decides whether to delegate solution design
↓
[Solution Provider Model]
Generates structured solution: goals, constraints, phased steps, risks, validation plan
↓
[Expert Model — Optional Review]
Independent review of the proposed solution
↓
[Solution Model Absorbs Review → Final Solution → Main Model]
Enhanced Response → Copilot Chat
How It Works
- Main Model — Your primary model that decides whether a task requires structured solution design. If so, it calls
ask_solution_provider with a self-contained task description.
- Solution Model — A dedicated model configured for solution design, implementation planning, architecture proposals, risk analysis, and phased roadmaps. It uses VS Code tools to inspect the workspace and ground the plan in the actual project.
- Expert Model (Optional) — When "expert review" is enabled, the solution model must call
ask_llsoai at least once before finalizing. The expert independently reviews the proposed solution, and the solution model absorbs the review feedback before returning the final result.
Configuration
Solution Provider is configured in the provider settings:
- Global Settings — Enable/disable, select solution provider, select solution model, and toggle expert review.
- Workspace Settings — Override global with
Use global / Force enabled / Force disabled for both solution provider and expert review independently.
- Project Provider/Model Override — Leave blank to use global solution model, or set workspace-specific provider/model.
Key Features
- 📋 Structured Output — Solution models return structured results with
writeStatus, solutionSummary, solutionFile/fullSolutionInline, and error/reason fields.
- 📝 Auto Draft Persistence — Solution models can optionally persist solutions as Markdown in
.LLSOAI/Solution/drafts/ for traceability and better expert review quality.
- 🔒 Safety Guards — Recursive delegation prevention (solution model never sees
ask_solution_provider), expert review count limits, forced review reminder limits with graceful degradation.
- 🔀 Tool Call Prefix Isolation — Solution model tools use
llsoai_solution: prefix, expert model tools use llsoai: prefix — no interference when both run concurrently.
💾 Chat History
Dual-Path Saving
Chat history can be saved to two locations simultaneously:
| Location |
Default Path |
Description |
| Global |
~/.LLSOAI/chat_*.json |
Centralized storage for all conversations |
| Project |
<project>/.LLSOAI/YYYY-MM-DD/ |
Date-organized per-project storage |
Each save location can be independently enabled or disabled:
- Global — Always overwrites the latest session, keeping a single up-to-date record
- Project — Organized by date, creating a new file each day for historical tracking
Expert Mode Chat History
Expert Mode conversations are automatically saved when the expert model completes streaming:
- ✅ Saves after each expert response (text or tool calls)
- ✅ Includes user's question, expert responses, and tool interactions
- ✅ Both global and project-level saves apply to Expert Mode
- ✅ Saves the complete expert context for review and continuity
📸 Timeline Snapshot
Timeline is an intelligent file history management system that automatically tracks file changes and provides powerful restoration capabilities:
Auto Snapshot on Save
When you save any file, the extension automatically creates a timestamped snapshot:
| Feature |
Description |
| Automatic Backup |
Files are snapshotted on every save |
| Location |
~/.LLSOAI/History/<mapped-file-path>/ |
| Metadata |
metadata.json tracks all snapshots with SHA-256, line count, timestamps |
| Smart Trimming |
Keeps max 20 snapshots per file to save space |
| Exclusion |
Automatically excludes .git, node_modules, build dirs, and secret files |
Git Integration
Timeline intelligently integrates with Git to keep snapshots clean:
| Feature |
Description |
| Git Clean Detection |
When a file is removed by git clean, its snapshots are automatically cleaned |
| Commit Change Tracking |
Watches .git/HEAD, packed-refs, refs/heads/** for commit changes |
| Worktree Support |
Handles Git worktree scenarios correctly |
| Safe Cleanup |
On git operations, snapshots of modified files are cleaned while preserving metadata |
Three built-in tools are available in every provider for file history exploration:
timeline_list_by_file
Lists all snapshots for a given file path:
- Returns snapshot records with timestamps, SHA-256, line counts
- Shows which snapshots have been git-cleaned
timeline_restore_snapshot
Restores a file to a previous snapshot state:
- Validates snapshot ID against metadata
- Refuses to restore files protected by Git HEAD
- Creates automatic
beforeRestore backup before restoration
- Safety checks ensure you can always recover
timeline_read_snapshot_lines
Reads partial content from any snapshot:
- Supports 1-based line numbers
- Maximum 200 lines per request
- Returns
totalLines for pagination
- Handles out-of-range requests gracefully
Timeline tools support seamless multi-turn interactions:
- When you call a timeline tool, the model can continue making timeline calls
- Maximum 3 continuation rounds per conversation turn
- Tool results are automatically appended to the conversation
- Perfect for exploring file history without leaving Copilot Chat
~/.LLSOAI/History/<file-absolute-path>/
├── metadata.json # Snapshot index and metadata
└── snapshots/
└── <snapshot-id> # Raw file content (no JSON wrapper)
Benefits
- 🛡️ Safety Net — Every save creates a recoverable backup
- 🔍 Easy Exploration — List and read historical versions without leaving Copilot
- ♻️ Smart Restoration — Restore any snapshot with automatic safety backups
- 🧹 Git-Aware — Automatically cleans stale snapshots on git operations
- 🔧 Non-blocking — Snapshot failures don't interfere with file operations
Supported APIs
| API Type |
Endpoint |
Notes |
| OpenAI-compatible |
/v1/chat/completions |
Any OpenAI-compatible API |
| Anthropic |
/v1/messages |
Claude models with automatic format conversion |
Anthropic API Features
When using Anthropic API type, the extension automatically handles:
- ✅ Message format conversion (system/user/assistant/tool roles)
- ✅ Tool definitions conversion (
input_schema ↔ parameters)
- ✅ Tool choice mapping (
auto/none/required ↔ auto/none/any)
- ✅ Streaming response translation
- ✅ Full tool calling support (including no-argument tools)
Requirements
- VS Code 1.104.0 or higher
- GitHub Copilot Chat extension
Getting Started
- Install the extension
- Click on the "LLS OAI" status bar item or use the command palette:
LLS OAI: Manage Providers
- Click "Add Provider" to configure your first provider
- Fill in:
- Name: A unique identifier for this provider (e.g., "MyOpenAI", "Claude")
- API Type: Select "OpenAI-compatible" or "Anthropic"
- Base URL: The API endpoint (e.g.,
https://api.openai.com/v1 or https://api.anthropic.com)
- API Key: Your API key for authentication
- Models: Add one or more models with their configurations
- Save and start using your provider in Copilot Chat!
When configuring the Base URL, do NOT include the API endpoint path suffix:
| ✅ Correct |
❌ Incorrect |
https://api.openai.com/v1 |
https://api.openai.com/v1/chat/completions |
https://api.anthropic.com/v1 |
https://api.anthropic.com/v1/messages |
https://your-proxy.com/v1 |
https://your-proxy.com/v1/chat/completions |
The extension automatically appends the correct endpoint based on the API type:
- OpenAI-compatible → appends
/chat/completions
- Anthropic → appends
/messages
Provider Configuration
Each provider requires:
- Name: Unique identifier shown in Copilot
- API Type:
OpenAI-compatible or Anthropic
- Base URL: API endpoint URL
- API Key: Authentication key (stored securely)
- Models: List of models with:
- Model ID (API identifier)
- Display Name (shown in Copilot UI)
- Context Length
- Max Tokens
- Temperature & Top-P settings
- Vision support flag
Commands
LLS OAI: Manage Providers - Open the provider management UI
LLS OAI: Open Configuration UI - Open configuration panel
Import/Export
You can backup and restore your provider configurations:
- Click "Export" to save all configurations to a JSON file
- Click "Import" to restore from a previously exported file
Note: API keys are not included in exports for security reasons. You'll need to re-enter them after importing.
Multi-language UI
The configuration UI supports multiple display languages:
- English
- Simplified Chinese
- Traditional Chinese
- Korean
- Japanese
- French
- German
Use the language selector above the Global Settings and Project Settings buttons to switch languages. The Auto (VS Code) option follows the VS Code display language. Unsupported or unknown languages fall back to English.
Auto Save Chat History
You can configure automatic chat history saving:
- Open the LLS OAI configuration panel
- Scroll to "Save Chat History" section
- Click "Settings" to configure:
- Auto Save Chat History: Toggle to enable/disable
- Save Path: Custom directory for saved chats (Default: Windows
%APPDATA%/LLSOAI, macOS/Linux ~/.LLSOAI)
Chat sessions are automatically saved as JSON files. When a conversation is compressed, an archive file is created with a timestamp.
Custom System Prompt
Customize the system prompt that is sent with every chat request. This is useful for adding persistent instructions, coding style preferences, or project-specific context.
Features
- Global System Prompt: Applies to all VS Code projects (user settings)
- Workspace System Prompt: Applies only to the current project (workspace settings)
- Dual Input: Both prompts can be used simultaneously — they are merged into a single system message
- User Message Appendix: Custom prompts are also appended to the last user message for better model adherence
- Open the LLS OAI configuration panel
- Scroll to System Prompt section
- Click Edit to open the modal
- Fill in:
- Global System Prompt: Your personal default instructions (applies everywhere)
- Workspace System Prompt: Project-specific instructions (applies only to this workspace)
- Click Save
Debug
The merged system message content is written to ~/.LLSOAI/system.txt for verification.
Copilot Records Migration
Migrate your Copilot chat records between different machines:
Export
- Click "Export" in the Copilot Records section
- The extension will find your current project's chat records in VS Code storage
- Records are saved to
.LLSOAI/<timestamp>/ folder in your project
Import
- Place the exported
.LLSOAI/<timestamp>/ folder into your project's .LLSOAI/ directory
- Click "Import" in the Copilot Records section
- The extension will find the latest exported records and copy them to VS Code storage
- Close and reopen VS Code to load the migrated chat records
Changelog
2.1.0
- Multi-language UI: Added UI language selection with English, Simplified Chinese, Traditional Chinese, Korean, Japanese, French, and German support
- Auto Language Detection: The Auto option follows the VS Code display language and falls back to English when the language is unsupported
- Localized Configuration UI: Provider management, settings panels, modals, validation messages, and dynamic UI text are localized
2.0.0
- Enhanced TODO Settings: Renamed "Force TODO" to "Enhanced TODO" throughout the configuration UI for clearer terminology
- Mandatory TODO Tool Usage: When Enhanced TODO is enabled, the model is now strongly instructed to use the TODO tool before taking any action, with clear requirements that all TODO items must be detailed, specific, and include actionable steps
- Global & Project System Prompt Settings: Added global and workspace-scoped system prompt settings with dual input fields in the configuration UI. System prompts are appended to user messages for better model adherence
1.3.3
- Custom System Prompt: Global and workspace-scoped custom system prompts with dual input fields in configuration UI
- System Prompt Merging: Multiple system prompt sources (global, workspace, VS Code Copilot) are merged into a single system message
- User Message Prompt Appendix: Custom prompts are also appended to the last user message for better model adherence
1.3.0
- Anthropic API Support: Full support for Anthropic Messages API (
/v1/messages) alongside OpenAI-compatible endpoints
- Automatic Format Conversion: Bidirectional conversion between OpenAI and Anthropic formats
License
MIT
Support
For issues and feature requests, please open an issue on the repository.