Bring any local LLM into your VS Code workflow, including the new OpenAI open-weight models gpt-oss-120b and gpt-oss-20b. Run advanced local AI models like DeepSeek, Gemma 3, and GPT-OSS directly inside your editor, without sending data to the cloud.
## Installation
Launch VS Code Quick Open (Ctrl+P), paste the following command, and press enter.
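The Marketplace snippet normally follows the `ext install` form; the identifier below is a placeholder, since the extension's actual publisher ID isn't shown here:

```
ext install <publisher>.<extension>
```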
A privacy-first VS Code extension to chat with any local LLM through Ollama—including the new OpenAI open-weight models (gpt-oss-120b and gpt-oss-20b).
Get code help, brainstorm, and run advanced AI—all offline. Your code and queries never leave your machine.
## Features

- **Works with Any Major Open LLM:** Use DeepSeek, Gemma, Llama 3, or the newest OpenAI models (gpt-oss-120b, gpt-oss-20b) through Ollama.
- **Real-Time AI Interaction:** Get coding assistance, brainstorm ideas, and debug issues without leaving VS Code.
- **Privacy-First:** All processing happens locally; your code and data never leave your machine.
- **Beautifully Formatted Responses:** Clean markdown rendering for better readability.
- **Seamless Integration:** Matches VS Code's themes for a consistent experience.
Pull at least one model with Ollama:

```sh
ollama pull gpt-oss-120b
ollama pull gpt-oss-20b

# Or any other model you prefer
ollama pull deepseek-r1:1.5b
ollama pull gemma:7b
ollama pull llama3:8b
```
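To confirm what's installed locally, Ollama's built-in `list` command shows every model you've pulled:

```sh
ollama list
```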
## Usage
Start Ollama in your terminal:

```sh
ollama serve
```
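Once `ollama serve` is running, you can sanity-check the endpoint before opening the chat panel. This uses Ollama's standard local REST API on its default port (11434), not anything specific to the extension; swap in whichever model you pulled:

```sh
# Ask the locally running model for a one-off completion
curl http://localhost:11434/api/generate \
  -d '{"model": "gpt-oss-20b", "prompt": "Say hello", "stream": false}'
```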
In VS Code:

- Open the Command Palette (Ctrl+Shift+P)
- Run **OmniChat: Start**
- A chat panel will open
Set your preferred model:

- Open the Command Palette (Ctrl+Shift+P)
- Run **OmniChat: Set Model**
- Enter a model name (e.g. gpt-oss-120b, gpt-oss-20b, or deepseek-r1:1.5b)
Start chatting!

- Type your question and click "Ask"
- Receive beautifully formatted responses
## Models
OmniChat AI works with any model available through Ollama. Some popular options:
- gpt-oss-120b — Large, enterprise-grade open-weight model
- gpt-oss-20b — Smaller open-weight model, a good fit for laptops and edge devices
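If you want to try a model from the terminal before selecting it in the extension, plain Ollama usage (independent of OmniChat) lets you start an interactive session:

```sh
# Opens an interactive chat with the model in your terminal
ollama run gpt-oss-20b
```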