# Universal LLM
The easiest, most customizable LLM wrapper for VS Code on the planet.
Universal LLM is a "dumb pipe" extension. It doesn't assume which AI model you are using. It simply connects VS Code to any API endpoint (OpenAI, Anthropic, Ollama, LM Studio, or your own Python scripts), with full JSON-level control over the request.
## Features
- Universal Compatibility: Works with OpenAI, Claude, DeepSeek, Ollama, LM Studio, and generic HTTP endpoints.
- Multiple Profiles: Switch instantly between "GPT-4o", "Local Llama 3", and "Custom Python Script" via a dropdown.
- Staging Area: Highlight code and press `Ctrl+Shift+L` to stack multiple files into context before sending.
- Markdown Support: Full syntax highlighting and markdown rendering in the chat.
- One-Click Copy: Copy code blocks instantly.
- "God Mode" Configuration: Full control over JSON body, headers, and parsing logic.
## Keybindings
- `Ctrl + L` (`Cmd + L` on Mac): Start a new chat with the selected text.
- `Ctrl + Shift + L` (`Cmd + Shift + L` on Mac): Add the selection to the current input without sending.
- `Ctrl + Enter`: Send the message.
## Setup & Configuration
1. Install the extension.
2. Open VS Code Settings (`Ctrl + ,`) and search for `ullm`.
3. Click **Edit in settings.json** under **Ullm: Profiles**.
### Example Configuration
Paste this into your settings.json to get started:
"ullm.profiles": [
{
"name": "OpenAI GPT-4o",
"url": "https://api.openai.com/v1/chat/completions",
"apiKey": "sk-YOUR_OPENAI_KEY_HERE",
"bodyTemplate": {
"model": "gpt-4o",
"temperature": 0.7,
"messages": "${messages}"
}
},
{
"name": "Anthropic Claude 3.5",
"url": "https://api.anthropic.com/v1/messages",
"apiKey": "sk-ant-YOUR_KEY_HERE",
"headers": {
"x-api-key": "sk-ant-YOUR_KEY_HERE",
"anthropic-version": "2023-06-01"
},
"bodyTemplate": {
"model": "claude-3-5-sonnet-20240620",
"max_tokens": 1024,
"messages": "${messages}",
"_responsePath": "content[0].text"
}
},
{
"name": "Local Ollama",
"url": "http://localhost:11434/api/chat",
"bodyTemplate": {
"model": "llama3",
"stream": false,
"messages": "${messages}",
"_responsePath": "message.content"
}
}
]
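A typo in a profile tends to fail silently until the first request, so a quick sanity check can save a debugging round-trip. This hypothetical Python snippet is not part of the extension; it assumes, based on the examples above, that `name`, `url`, and `bodyTemplate` are the keys every profile needs:

```python
import json

REQUIRED = {"name", "url", "bodyTemplate"}  # assumption drawn from the example profiles

def check_profiles(profiles):
    """Return a list of human-readable problems found in the
    ullm.profiles array (an empty list means it looks sane)."""
    problems = []
    for i, p in enumerate(profiles):
        missing = REQUIRED - p.keys()
        if missing:
            problems.append(f"profile {i}: missing {sorted(missing)}")
        body = p.get("bodyTemplate", {})
        # The placeholder must appear somewhere in the template,
        # or the chat history will never be sent.
        if "${messages}" not in json.dumps(body):
            problems.append(f'profile {i}: bodyTemplate has no "${{messages}}" placeholder')
    return problems
```

Paste your profiles array into a Python session and run `check_profiles` on it before restarting VS Code.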
## How It Works
### The `${messages}` Placeholder
In your `bodyTemplate`, the string `"${messages}"` is replaced by the actual chat history array `[{"role": "user", "content": "..."}]` before sending.
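The substitution is a structural replacement of the JSON value, not string interpolation. A rough Python sketch of the idea (a hypothetical `render_body` helper, not the extension's actual code):

```python
import json

def render_body(template, messages):
    """Deep-copy the template, replacing any "${messages}" value
    with the actual chat-history array."""
    def walk(node):
        if node == "${messages}":
            return messages
        if isinstance(node, dict):
            return {k: walk(v) for k, v in node.items()}
        if isinstance(node, list):
            return [walk(v) for v in node]
        return node
    return walk(template)

template = {"model": "llama3", "stream": False, "messages": "${messages}"}
history = [{"role": "user", "content": "Explain closures."}]
print(json.dumps(render_body(template, history), indent=2))
```

Everything else in the template (model name, temperature, and so on) passes through to the API untouched.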
### The `_responsePath` Magic Key
Different APIs hide the answer in different JSON fields. You can tell the extension where to look by adding `_responsePath` to your template. The extension reads this, removes it from the payload, and uses it to parse the answer.
* **OpenAI:** (Auto-detected, usually not needed)
* **Anthropic:** `content[0].text`
* **Ollama:** `message.content` or `response`
* **Custom Script:** `result` or `output.text`
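Resolving a path like `content[0].text` is just walking keys and list indices through the parsed JSON response. A minimal Python sketch of the technique (a hypothetical `extract` helper, not the extension's code):

```python
import re

def extract(response, path):
    """Walk a dotted/indexed path such as "content[0].text"
    through a parsed JSON response."""
    node = response
    for part in re.findall(r"[^.\[\]]+", path):
        node = node[int(part)] if part.isdigit() else node[part]
    return node

# Anthropic-shaped response:
anthropic = {"content": [{"type": "text", "text": "Hello!"}]}
print(extract(anthropic, "content[0].text"))  # → Hello!

# Ollama-shaped response:
ollama = {"message": {"role": "assistant", "content": "Hi there"}}
print(extract(ollama, "message.content"))  # → Hi there
```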
### Debugging
If something isn't working:
1. Open the Output Panel (`Ctrl + Shift + U`).
2. Select **"Universal LLM Debug"** from the dropdown.
3. You will see the exact URL, payload, and raw response for every request.
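When debugging, it can also help to test against a local endpoint you fully control, as with the "Custom Python Script" case mentioned above. Below is an illustrative echo server using only the Python standard library (an assumption-laden sketch, not a supported companion script):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class ChatHandler(BaseHTTPRequestHandler):
    """Accepts a {"messages": [...]} payload and echoes the last
    message's content back under a "result" key."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        last = (payload.get("messages") or [{}])[-1].get("content", "")
        body = json.dumps({"result": f"echo: {last}"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the terminal quiet

# Run with: HTTPServer(("127.0.0.1", 8000), ChatHandler).serve_forever()
```

Point a profile's `url` at `http://localhost:8000` with `"_responsePath": "result"` and no `apiKey`, and you can exercise the whole pipeline offline.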
## License
MIT