# Universal LLM

By Marcel Kowalik · 24 installs · Free

The most customizable LLM wrapper: OpenAI, Anthropic, Ollama & Python scripts.
## Installation

Launch VS Code Quick Open (`Ctrl+P`), paste the extension's install command, and press Enter.


The easiest, most customizable LLM wrapper for VS Code on the planet.

Universal LLM is a "dumb pipe" extension. It doesn't assume which AI model you are using. It simply connects VS Code to any API endpoint (OpenAI, Anthropic, Ollama, LM Studio, or your own Python scripts) using full JSON customization.

## Features

  • Universal Compatibility: Works with OpenAI, Claude, DeepSeek, Ollama, LM Studio, and generic HTTP endpoints.
  • Multiple Profiles: Switch instantly between "GPT-4o", "Local Llama 3", and "Custom Python Script" via a dropdown.
  • Staging Area: Highlight code (Ctrl+Shift+L) to stack multiple files into context before sending.
  • Markdown Support: Full syntax highlighting and markdown rendering in the chat.
  • One-Click Copy: Copy code blocks instantly.
  • "God Mode" Configuration: Full control over JSON body, headers, and parsing logic.

## Keybindings

  • Ctrl + L (Cmd + L on Mac): Start New Chat with selected text.
  • Ctrl + Shift + L (Cmd + Shift + L on Mac): Add Selection to current input without sending.
  • Ctrl + Enter: Send Message.

## Setup & Configuration

  1. Install the extension.
  2. Open VS Code Settings (`Ctrl + ,`) and search for `ullm`.
  3. Click **Edit in settings.json** under **Ullm: Profiles**.

### Example Configuration

Paste this into your `settings.json` to get started:

"ullm.profiles": [
    {
        "name": "OpenAI GPT-4o",
        "url": "https://api.openai.com/v1/chat/completions",
        "apiKey": "sk-YOUR_OPENAI_KEY_HERE",
        "bodyTemplate": {
            "model": "gpt-4o",
            "temperature": 0.7,
            "messages": "${messages}"
        }
    },
    {
        "name": "Anthropic Claude 3.5",
        "url": "https://api.anthropic.com/v1/messages",
        "apiKey": "sk-ant-YOUR_KEY_HERE",
        "headers": {
            "x-api-key": "sk-ant-YOUR_KEY_HERE",
            "anthropic-version": "2023-06-01"
        },
        "bodyTemplate": {
            "model": "claude-3-5-sonnet-20240620",
            "max_tokens": 1024,
            "messages": "${messages}",
            "_responsePath": "content[0].text" 
        }
    },
    {
        "name": "Local Ollama",
        "url": "http://localhost:11434/api/chat",
        "bodyTemplate": {
            "model": "llama3",
            "stream": false,
            "messages": "${messages}",
            "_responsePath": "message.content"
        }
    }
]


## How It Works

### The `${messages}` Placeholder
In your `bodyTemplate`, the string `"${messages}"` is replaced by the actual chat history array `[{"role": "user", "content": "..."}]` before sending.

### The `_responsePath` Magic Key
Different APIs hide the answer in different JSON fields. You can tell the extension where to look by adding `_responsePath` to your template. The extension reads this, removes it from the payload, and uses it to parse the answer.

*   **OpenAI:** (Auto-detected, usually not needed)
*   **Anthropic:** `content[0].text`
*   **Ollama:** `message.content` or `response`
*   **Custom Script:** `result` or `output.text`
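A path such as `content[0].text` can be resolved with a small lookup helper that splits on dots and bracketed indices. This is a sketch of the idea under that assumed path syntax, not the extension's own parser:

```python
import re

def get_by_path(data, path):
    """Walk a parsed JSON object along a dotted path with [n] indexing,
    e.g. "content[0].text" or "message.content". Returns None if the
    path does not match the shape of the data."""
    tokens = []
    for part in path.split("."):
        match = re.match(r"^(\w+)((?:\[\d+\])*)$", part)
        if not match:
            return None
        tokens.append(match.group(1))                      # field name
        tokens.extend(int(i) for i in re.findall(r"\[(\d+)\]", match.group(2)))  # indices
    current = data
    for token in tokens:
        try:
            current = current[token]
        except (KeyError, IndexError, TypeError):
            return None
    return current

anthropic_response = {"content": [{"type": "text", "text": "Hello!"}]}
print(get_by_path(anthropic_response, "content[0].text"))  # -> Hello!
```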

### Debugging
If something isn't working:
1.  Open the Output Panel (`Ctrl + Shift + U`).
2.  Select **"Universal LLM Debug"** from the dropdown.
3.  You will see the exact URL, Payload, and Raw Response for every request.

## License
MIT