RB Ollama Agents — Visual Studio Code, Antigravity, Cursor, VSCodium, Windsurf, Gitpod

Free, MIT, BYOK chat sidebar for VS Code, Antigravity, Cursor, VSCodium, Windsurf, Gitpod — direct APIs for Ollama (local + cloud), DeepSeek, Qwen / Alibaba, Zhipu (GLM), Baidu (ERNIE), Moonshot (Kimi), Tencent Hunyuan, Xiaomi Mimo, OpenAI, Anthropic Claude, Google Gemini, OpenRouter, Groq. The pay-less alternative to ~$5K/year Antigravity / Cursor / Copilot Pro / Codeium.

🐳 Taking the revolution a bit further for our WhalesBrother. Open-weights and open-API frontier models from China and the global open-source community deserve first-class IDE support. This is our small contribution to that wave — free, MIT, public.

Version history

Full release notes: CHANGELOG.md
Tagged releases and downloadable VSIX files: https://github.com/robinbakshi007/ollama-direct-custom-agent/releases

If RB Ollama saves you the ~$5,000/year an agentic IDE subscription would have cost, please consider supporting development:

💖 Become a GitHub Sponsor — monthly or one-time tiers
☕ Tip $1.99 via PayPal — quick one-tap thank-you

Sponsorship funds open-source frontier-model tooling for WhalesBrother, EU, India, Asia-Pacific and USA developer communities — keeping this stack free, MIT, BYOK and zero-telemetry forever.

We are an ISO 27001:2022 certified company. Our security posture, change management and key-handling practices are independently audited — your API keys live only in your OS keychain (VS Code SecretStorage, macOS Keychain / Windows Credential Manager / libsecret).

🏆 Why we are the best

Built to last 10+ years. The architecture is OpenAI-compatible at the wire level — every new frontier model that ships an /v1/chat/completions endpoint works on day one, no extension update required.
No middleman, no proxy, no telemetry. Your prompts go straight from your editor to the model provider you choose. Period.
MIT licensed, fully open source. Audit, fork, self-host, embed.
One install, six editors. VS Code, Antigravity, Cursor, VSCodium, Windsurf, Gitpod — same extension, same UX.
Encrypted secrets at rest. API keys + Ollama Cloud session cookies stored via OS-level SecretStorage. Never written to settings.json in plaintext.
Frontier coverage. Ollama (local + cloud), DeepSeek, Qwen / Alibaba, Zhipu (GLM), Baidu (ERNIE), Moonshot (Kimi), Tencent Hunyuan, Xiaomi Mimo, OpenAI, Anthropic, Gemini, OpenRouter, Groq.

🚀 Coming next: bigger than Mythos

We've already shipped the foundation that will easily last the next decade. The next tool we publish will go further than Mythos — and it will be fully open-source so the community can audit it and prevent any malicious harm. Star the repo to get notified.

📊 Full feature table

Capability	RB Ollama (free)	Antigravity	Cursor	Copilot Pro	Codeium
Annual cost (typical seat)	$0	~$5,000	~$240	~$240	~$180
Direct Ollama local models	✅	❌	❌	❌	partial
Direct Ollama Cloud (`:cloud` models)	✅	❌	❌	❌	❌
Direct DeepSeek / Qwen / GLM / Kimi / ERNIE / Hunyuan / Mimo	✅	❌	partial	❌	❌
Direct OpenAI / Claude / Gemini / Groq / OpenRouter (BYOK)	✅	❌	partial	❌	partial
Auto routing (local → cloud) with $-saved counter	✅	❌	❌	❌	❌
Encrypted SecretStorage for API keys + cookies	✅	n/a	n/a	n/a	n/a
Drag-and-drop images / PDFs / DOCX / TXT / MD	✅	partial	partial	partial	❌
Vision auto-routing (image → vision-capable model)	✅	❌	❌	❌	❌
Agent modes: Chat / Plan / Code / Ask / Architect	✅	partial	partial	partial	❌
Multi-agent roster + drag-sortable priority order	✅	❌	❌	❌	❌
Role-split parallel agents (architect / builder / validator)	✅	❌	❌	❌	❌
Per-task assistant routing (Image / Doc / Code / QA)	✅	❌	❌	❌	❌
Multi-period analytics (today / week / fortnight / month / quarter / YTD / custom)	✅	❌	❌	❌	❌
CSV export with date-range filename suffix	✅	❌	❌	❌	❌
Cloud token guard + account rotation	✅	❌	❌	❌	❌
Context window meter + reserved-for-response	✅	partial	partial	❌	❌
Zero telemetry	✅	❌	❌	❌	❌
MIT licensed, fully open source	✅	❌	❌	❌	❌
ISO 27001:2022 certified maintainer	✅	n/a	n/a	n/a	n/a
One install across VS Code / Antigravity / Cursor / VSCodium / Windsurf / Gitpod	✅	❌	❌	❌	partial

🌍 Communities we serve

We are actively assisting and accepting contributions from:

🐳 WhalesBrother — open-weights / open-API frontier models from China and the global open-source community
🇪🇺 European Union developer collectives (GDPR-respecting AI tooling, on-device first)
🇮🇳 India — IndiaStack, ONDC, BHASHINI integrators
🌏 Asia Pacific — Japan, Korea, Singapore, Australia, NZ, ASEAN
🇺🇸 United States developer communities — independents, startups and education

If your community wants a localised onboarding guide, open an issue or sponsor a workstream.

🔐 Trust & compliance

ISO 27001:2022 certified information-security management system covering source-code handling, key custody and release engineering.
API keys & Ollama Cloud session cookies are encrypted at rest in the OS keychain via VS Code SecretStorage — never in settings.json.
Privacy-first analytics model — by default, no usage telemetry is sent. If you explicitly enable billing/security analytics consent in Plans settings, the client shares only an anonymous random install UUID (no device fingerprint) plus consent/retention metadata; country is derived server-side from request IP and not collected directly on-device.
Data minimization & retention — analytics consent is optional and revocable, and retention days are configurable (default 30 days).
MIT licensed — full source on GitHub, reproducible build via node esbuild.js --production && npx @vscode/vsce package.

Billing/security analytics are opt-in and disabled unless you provide explicit consent.
The extension uses a random anonymous install ID only; it does not use hardware IDs, MAC address, serial number, or persistent device fingerprinting.
Country is resolved server-side from request IP and should be stored only at country granularity.
Retention must be limited to the configured period and deleted after expiry.
Promo codes and license tokens are stored in encrypted SecretStorage (OS keychain), never plaintext settings.

Why this exists

The big-name agentic IDEs are charging hundreds-to-thousands of dollars per seat per year for thin wrappers around the same public APIs you can call yourself. Meanwhile:

Ollama Cloud has frontier models like gpt-oss:120b-cloud, deepseek-v4-pro:cloud, gemini-3-flash-preview:cloud, kimi-k2.6:cloud, glm-5.1:cloud, gemma4:31b-cloud — none of which the major IDE agents let you wire into their chat sidebar.
DeepSeek, Qwen, Zhipu, Moonshot, Hunyuan, Mimo all expose OpenAI-compatible APIs — and they are dramatically cheaper than GPT-4-class models for everyday coding.
Antigravity / Cursor / Codex / Copilot Chat keep this firmly behind their own backends.

This extension is the missing bridge. It is free, MIT-licensed, open source, zero-telemetry, no proxy of mine in the middle.

What you get

✅ One install across VS Code, Antigravity, Cursor, VSCodium, Windsurf, Gitpod
✅ Sidebar chat with model picker inside the composer (matches native Antigravity / Codex / Gemini Code Assist UX)
✅ ✨ Auto routing — automatically picks the cheapest viable model (local first), with a live $ saved / % saved counter
✅ Inline bottom Cost & Analytics panel — expandable below the composer, with per-model usage %, request/token/task split, and day/week CSV export
✅ Context window meter with Reserved for response display and token guard warnings
✅ Cloud token guard — when cloud requests approach reserve limits, route to local fallback automatically (configurable)
✅ Cloud account rotation — configure multiple Ollama cloud account profiles and auto-switch when weekly usage threshold is reached
✅ Agent modes in the + menu — Chat / Plan / Code / Ask / Architect
✅ Assistant dropdown with grouped options: Digital Assistant (OpenClaw), Agents (Hermes, Claude Code, Codex, Copilot CLI, OpenCode, Droid, Goose, Pi, Pool), Chat & RAG (Onyx), and Automation (n8n)
✅ Assistant Routing panel with dedicated dropdowns for Image / Document / Code / QA and optional auto-launch toggle
✅ Multi-agent setup controls in settings (multiAgentSetupEnabled, multiAgentRoster) to activate/deactivate routing and define the agent roster
✅ Team orchestration controls (assistantTeamOrchestrationEnabled, assistantTeamRoles, assistantTeamRequirePlanApproval) for lead/subagent style workflows
✅ In-chat gear shortcut on extension info rows to open RB Ollama settings directly
✅ + menu: drag-and-drop PNG, JPG, PDF, DOCX, TXT, MD — vision images automatically route to a vision-capable model
✅ Permissions preset (Default / Auto-review / Full access / Custom) shaping the system prompt
✅ One-click "Add provider" — quick-pick presets for DeepSeek, Qwen, Zhipu, Baidu, Moonshot, Hunyuan, Mimo, OpenAI, Claude, Gemini, OpenRouter, Groq. Just paste your API key.
✅ Bring-your-own-key custom providers for any other OpenAI-compatible endpoint
✅ Settings page surfaced under @ext:RobinBakshi.ollama-direct-custom-agent
✅ Zero telemetry, zero proxies, zero accounts of mine

Digital Assistants quick guide

Use this when you want OpenClaw, Hermes Agent, Claude Code, Codex, Copilot CLI, and others to be routed automatically by task.

Open RB Ollama Settings (gear icon in chat, or command: RB Ollama: Open Settings).
In Digital Assistants tab, assign assistant per task type:

Image tasks
Document tasks
Code tasks
QA tasks

In RB Ollama Agents tab, enable:

Auto-route by task type
Multi-agent orchestration (optional)

In API Keys tab, add keys only when needed:

If you already use Ollama Cloud models, assistant keys are usually not required.
For standalone assistants/providers, add keys there.

API key security

Keys are encrypted via VS Code SecretStorage (OS keychain: macOS Keychain, Windows Credential Manager, Linux libsecret).
The extension UI shows only masked previews of stored keys.
Legacy plaintext keys in settings are auto-migrated to SecretStorage and cleared.

Screenshots

Screenshots are optional and intentionally not hot-linked here unless files exist, to avoid broken rendering in GitHub pages.

Add your files here when ready:

docs/screenshots/antigravity.png
docs/screenshots/vscode.png

Tip: keep image names exactly as above for consistency across release notes.

Install

Step 1 — Install Ollama (free) and (optionally) Ollama Pro

Install Ollama → https://ollama.com/download
Sign in: ollama signin (free)
(Optional, only for :cloud models) subscribe to Ollama Pro: https://ollama.com/settings/billing

Pull at least one model:

# Free local models (no Pro needed) — recommended for token savings
ollama pull qwen3-coder:30b      # 18 GB — great coding
ollama pull llama3.1:8b          # 4.9 GB — fast general
ollama pull gemma4:e4b           # 9.6 GB — multimodal/vision
ollama pull deepseek-coder-v2:16b   # 9 GB — coding (smaller)

# Cloud (Ollama Pro)
ollama pull gemini-3-flash-preview:cloud
ollama pull gpt-oss:120b-cloud
ollama pull deepseek-v4-pro:cloud
ollama pull gemma4:31b-cloud

Verify:

ollama list
curl http://127.0.0.1:11434/api/tags

Step 2 — Install the extension

Editor	One-click	Direct .vsix download	CLI
VS Code	Install from Marketplace	⬇ Latest .vsix	`code --install-extension RobinBakshi.ollama-direct-custom-agent`
Antigravity	Extensions panel → search RB Ollama Agents	⬇ Latest .vsix	`antigravity --install-extension RobinBakshi.ollama-direct-custom-agent`
Cursor	Extensions panel → search RB Ollama Agents	⬇ Latest .vsix	`cursor --install-extension RobinBakshi.ollama-direct-custom-agent`
VSCodium	Install from Open VSX	⬇ Latest .vsix	`codium --install-extension RobinBakshi.ollama-direct-custom-agent`
Windsurf / Gitpod	Extensions panel → search RB Ollama Agents	⬇ Latest .vsix	`<editor-cli> --install-extension RobinBakshi.ollama-direct-custom-agent`

Direct downloads (all releases): https://github.com/robinbakshi007/ollama-direct-custom-agent/releases

Or install the .vsix directly:

# Download from the latest release page and install that file
# (filename changes every version)
code         --install-extension ./ollama-direct-custom-agent-<version>.vsix
antigravity  --install-extension ./ollama-direct-custom-agent-<version>.vsix
cursor       --install-extension ./ollama-direct-custom-agent-<version>.vsix
codium       --install-extension ./ollama-direct-custom-agent-<version>.vsix

Step 3 — Open it

Restart your editor once after install
Click the speech-bubble icon in the left activity bar → RB Ollama Agents
Pick ✨ Auto in the composer dropdown — done.

✨ Auto routing & savings counter

Selecting ✨ Auto (prefer local → cloud) routes each turn to the cheapest viable model:

If your prompt has images attached, Auto picks the best vision-capable model (cloud preferred — Gemini 3 Flash, Gemma 4, GPT-4o, Claude 3 — falling back to a local vision model).
Otherwise, Auto picks a local model (free, $0/token), preferring coder/instruct variants.
Only if no local model is installed does Auto fall back to a :cloud model, preferring small/cheap ones (*flash*, *mini*, *haiku*).

The header above the chat shows a live tally:

$1.27 saved (78%)        24 local · 7 cloud requests       ⚙  ↺

$ saved = local-token-count × your cloudPricePerMTok setting (default $0.50 / 1M tok).
% saved = local tokens ÷ total tokens.
⚙ opens settings, ↺ resets the counter.

Tune the strategy under settings → RB Ollama: Auto Prefer:

local-first (default) — always go local when possible
cheapest-cloud — for prompts > 4 K chars, jump to a cheap cloud model
balanced — for prompts > 8 K chars, use cloud

Per-model usage percentages

The extension now tracks request share by model and shows percentages in two places:

A compact Usage: bar under the savings header (modelA 42% · modelB 31% ...)
Model dropdown labels (use X%)

This helps users manually track model mix while still keeping Auto as default.

Model Analytics panel + exports

Expand the Model Analytics panel below the usage bar to view:

model-level usage percentage
request count
token split
task split (chat, image, doc, code, qa)

Export manual reports for finance/compliance/audits:

Export Day CSV
Export Week CSV

Coding accuracy percentages

There is no universal real-time "accuracy" feed from model vendors, so this extension supports manual benchmark percentages via settings:

ollamaDirectCustomAgent.modelAccuracyOverrides

Example:

"ollamaDirectCustomAgent.modelAccuracyOverrides": [
  { "id": "deepseek-v4-pro:cloud", "accuracyPercent": 91 },
  { "id": "qwen3-coder:30b", "accuracyPercent": 86 },
  { "id": "custom:openrouter:google/gemini-2.5-flash", "accuracyPercent": 88 }
]

These show up in the picker as acc X%.

Task-based model handover

When Auto is selected, you can route specific task types to dedicated models:

taskModelImage (image understanding)
taskModelDoc (PDF/DOCX/text-heavy extraction)
taskModelCode (coding mode)
taskModelQa (Q&A / Ask mode, e.g. Jules-style QA model)

Enable/disable with:

ollamaDirectCustomAgent.taskRoutingEnabled

If a task model is set to __auto__, normal Auto routing applies.

Onyx (Chat & RAG)

Onyx is a self-hostable chat/RAG system that can connect to Ollama and supports custom agents, connectors, deep research, and MCP/OpenAPI actions.

Quick path:

Deploy Onyx via quickstart: https://docs.onyx.app/deployment/getting_started/quickstart
In setup, choose Ollama as provider
Set Ollama URL:

local: http://127.0.0.1:11434
Docker: http://host.docker.internal:11434

In this extension, choose assistant Onyx (Chat & RAG) and click Launch to open setup/launch guidance.

n8n (Automation with Ollama)

n8n workflows can call Ollama nodes for automations and agents.

Quick path:

Install n8n
In n8n, create Ollama credentials
Set API URL:

local: http://localhost:11434
Docker: http://host.docker.internal:11434

Build workflow with Ollama nodes and select model (for example qwen3-coder)

Cloud path:

Create API key at https://ollama.com/settings/keys
In n8n, set API URL https://ollama.com and add key

In this extension, choose assistant n8n (Automation) and click Launch.

Documentation index sync (`llms.txt`)

Use one of these to sync the official index for model/tool discovery:

+ → Plugins → Sync Ollama docs index (llms.txt)
command: RB Ollama: Sync Ollama Documentation Index (llms.txt)

Source: https://docs.ollama.com/llms.txt

Click + in the composer or drag & drop files anywhere on the composer:

File type	Behaviour
PNG / JPG / GIF / WebP	Sent as multimodal images. Auto routes to a vision model (`gemma4:31b-cloud`, `gemini-3-flash-preview:cloud`, `gpt-4o`, `claude-3-*`, etc.)
PDF	Text extracted via `pdfjs-dist`, prepended to your prompt
DOCX	Text extracted via `mammoth`, prepended to your prompt
TXT / MD / source files	Read as UTF-8 and prepended

Coming via MCP (planned): browser actions, Slack/Gmail/Drive/Calendar connectors.

Click + in the composer and pick a mode — it shapes the assistant's system prompt:

Mode	Behaviour
💬 Chat (default)	Normal conversational coding assistant.
🗂 Plan	Produces a numbered, step-by-step plan. Does not write final code unless asked.
💻 Code	Direct, ready-to-paste code edits with minimal prose.
❓ Ask	Answers in 1–3 sentences, no proposed changes.
🏗 Architect	High-level design, trade-offs, mermaid diagrams.

The mode shows in the composer placeholder, e.g. Ask anything… [Plan mode].

Settings → RB Ollama: Permissions:

Preset	What changes
Default	System prompt: only reference files explicitly attached
Auto-review	System prompt: present commands clearly and ask for confirmation
Full access	System prompt: freely reference workspace context
Custom	Use your own multi-line `systemPrompt`

🔒 Honest note: VS Code extensions cannot enforce OS-level sandboxing. This setting controls what the model is told it may do, not what your editor will actually let it do. Real shell/browser execution is a future feature via MCP.

Bring your own key — DeepSeek, Qwen, Zhipu, Baidu, Moonshot, Hunyuan, Mimo, Claude, GPT, Gemini

The fastest way: open the + menu in the chat composer → 🔑 Add provider… (or Cmd-Shift-P → RB Ollama: Add Provider). Pick from the preset catalogue, paste your API key, done.

Provider	Endpoint preset	Models shipped
DeepSeek (direct)	`https://api.deepseek.com/v1`	`deepseek-chat`, `deepseek-reasoner`, `deepseek-coder`
Qwen / Alibaba DashScope	`https://dashscope-intl.aliyuncs.com/compatible-mode/v1`	`qwen-max`, `qwen-plus`, `qwen-flash`, `qwen-vl-max`, `qwen-coder-plus`, `qwen3-coder-plus`, `qwen3-max`
Zhipu AI (GLM)	`https://open.bigmodel.cn/api/paas/v4`	`glm-4.6`, `glm-4-plus`, `glm-4-flash`, `glm-4v-plus`
Baidu Qianfan (ERNIE)	`https://qianfan.baidubce.com/v2`	`ernie-4.5-turbo-128k`, `ernie-4.0-turbo-8k`, `ernie-speed-128k`
Moonshot AI (Kimi)	`https://api.moonshot.cn/v1`	`kimi-k2-0905-preview`, `moonshot-v1-128k`, `moonshot-v1-32k`
Tencent Hunyuan (混元)	`https://api.hunyuan.cloud.tencent.com/v1`	`hunyuan-turbos-latest`, `hunyuan-large`, `hunyuan-vision`
Xiaomi Mimo V2 Pro	`https://api.xiaomi.com/v1`	`mimo-v2-pro`, `mimo-v2`
OpenAI	`https://api.openai.com/v1`	`gpt-4o`, `gpt-4o-mini`, `o4-mini`
Anthropic Claude	`https://api.anthropic.com/v1`	`claude-3-5-sonnet-latest`, `claude-3-5-haiku-latest`
Google Gemini	`https://generativelanguage.googleapis.com/v1beta/openai`	`gemini-2.5-flash`, `gemini-2.5-pro`
OpenRouter	`https://openrouter.ai/api/v1`	`anthropic/claude-3.5-sonnet`, `google/gemini-2.5-flash`, `x-ai/grok-2`
Groq	`https://api.groq.com/openai/v1`	`llama-3.3-70b-versatile`, `qwen-2.5-coder-32b`

You can also add them by hand in settings.json:

"ollamaDirectCustomAgent.customProviders": [
  {
    "id": "deepseek",
    "name": "DeepSeek",
    "baseUrl": "https://api.deepseek.com/v1",
    "apiKey": "sk-...",
    "models": ["deepseek-chat", "deepseek-reasoner"]
  },
  {
    "id": "qwen",
    "name": "Qwen",
    "baseUrl": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    "apiKey": "sk-...",
    "models": ["qwen-max", "qwen-plus", "qwen-flash", "qwen-vl-max", "qwen-coder-plus"]
  }
]

Note: a few endpoints (Anthropic native, some Baidu/Hunyuan auth modes) deviate slightly from OpenAI's spec. If you hit 400/401, try the OpenRouter route for the same model — it's always OpenAI-compat.

All settings (`@ext:RobinBakshi.ollama-direct-custom-agent`)

Setting	Default	Purpose
`endpoint`	`http://127.0.0.1:11434`	Ollama HTTP base URL
`defaultModel`	`__auto__`	Initial selection in the picker
`defaultAssistant`	`openclaw`	Assistant selected in the assistant dropdown
`assistantTaskImage`	`openclaw`	Assistant assignment for image tasks
`assistantTaskDoc`	`hermes`	Assistant assignment for document tasks
`assistantTaskCode`	`codex`	Assistant assignment for coding tasks
`assistantTaskQa`	`claude`	Assistant assignment for QA tasks
`assistantAutoRouting`	`true`	Automatically route assistant selection by task type (image/doc/code/qa)
`assistantAutoLaunch`	`false`	Auto-launch assigned assistant when task is detected
`assistantApiKeys`	`{}`	Legacy migration field. Values are moved to encrypted SecretStorage and removed from settings. Use Settings → API Keys.
`multiAgentSetupEnabled`	`true`	Activates task-based multi-agent setup (Image/Doc/Code/QA)
`multiAgentRoster`	`[openclaw, hermes, claude, codex, copilot, opencode, droid, goose, pi, pool, onyx, n8n]`	Controls which assistants are available in multi-agent routing
`permissions`	`default`	`default` / `auto-review` / `full-access` / `custom`
`systemPrompt`	(empty)	Used when permissions = custom
`autoPrefer`	`local-first`	Auto routing strategy
`taskRoutingEnabled`	`true`	Enable task-based model routing for Auto
`taskModelImage`	`__auto__`	Preferred model for image tasks
`taskModelDoc`	`__auto__`	Preferred model for doc tasks
`taskModelCode`	`__auto__`	Preferred model for code tasks
`taskModelQa`	`__auto__`	Preferred model for QA tasks
`modelAccuracyOverrides`	`[]`	Manual accuracy labels shown in picker
`cloudPricePerMTok`	`0.5`	$/1M tokens, used for the savings counter
`openOnStartup`	`false`	Focus the sidebar at startup
`composerEnterBehavior`	`send`	`send` (Enter sends) or `newline`
`customProviders`	`[]`	Array of `{id,name,baseUrl,apiKey,models[]}`

Cost-savings playbook

Default to Auto. It chooses local whenever possible.
Pull qwen3-coder:30b or llama3.1:8b — they cover ~80 % of everyday coding for free.
Reserve :cloud and BYOK models (Claude, GPT-4o, Gemini) for: long-context reasoning, vision, hard refactors.
Watch the $ saved counter go up.

Troubleshooting

Problem	Fix
`Cannot reach Ollama at http://127.0.0.1:11434`	Run `ollama serve` or open the Ollama menubar app.
`No models found`	`ollama pull llama3.1:8b` then click ↻.
Cloud model auth error	`ollama signin` and verify Ollama Pro at https://ollama.com/settings/billing.
Doesn't appear in Antigravity	Restart Antigravity after install. Speech-bubble icon in left activity bar.
Extension installs but engine version error	This extension targets `vscode ^1.95.0`. Update Antigravity / Cursor / VSCodium.
BYOK provider returns 401	Wrong API key, or the `baseUrl` doesn't end at the `/v1`-style root.

Roadmap

🛠 MCP client — connect Slack, Gmail, Drive, Calendar, Playwright (browser), shell — using the open Model Context Protocol ecosystem
🛠 Shell tool execution with the permissions preset above gating each call
🛠 Markdown / code-block rendering in the chat log
🛠 Per-workspace model defaults

PRs welcome.

Building from source

git clone https://github.com/robinbakshi007/ollama-direct-custom-agent
cd ollama-direct-custom-agent
npm install
npm run package
npx @vscode/vsce package
# → ollama-direct-custom-agent-0.7.5.vsix

code         --install-extension ./ollama-direct-custom-agent-0.7.5.vsix
antigravity  --install-extension ./ollama-direct-custom-agent-0.7.5.vsix

License

MIT — see LICENSE. Use it, fork it, ship it.

RB Ollama Agents

Robin Bakshi

RB Ollama Agents — Visual Studio Code, Antigravity, Cursor, VSCodium, Windsurf, Gitpod

Version history

🏆 Why we are the best

🚀 Coming next: bigger than Mythos

📊 Full feature table

🌍 Communities we serve

🔐 Trust & compliance

Why this exists

What you get

Digital Assistants quick guide

API key security

Screenshots

Install

Step 1 — Install Ollama (free) and (optionally) Ollama Pro

Step 2 — Install the extension

Step 3 — Open it

✨ Auto routing & savings counter

Per-model usage percentages

Model Analytics panel + exports

Coding accuracy percentages

Task-based model handover

Onyx (Chat & RAG)

n8n (Automation with Ollama)

Documentation index sync (`llms.txt`)

+ menu — attachments

Agent modes (in the + menu)

Bring your own key — DeepSeek, Qwen, Zhipu, Baidu, Moonshot, Hunyuan, Mimo, Claude, GPT, Gemini

All settings (`@ext:RobinBakshi.ollama-direct-custom-agent`)

Cost-savings playbook

Troubleshooting

Roadmap

Building from source

License

RB Ollama Agents

Robin Bakshi

RB Ollama Agents — Visual Studio Code, Antigravity, Cursor, VSCodium, Windsurf, Gitpod

Version history

💖 Sponsor this project

🏆 Why we are the best

🚀 Coming next: bigger than Mythos

📊 Full feature table

🌍 Communities we serve

🔐 Trust & compliance

Privacy Policy Notes (GDPR / DPDP / CCPA)

Why this exists

What you get

Digital Assistants quick guide

API key security

Screenshots

Install

Step 1 — Install Ollama (free) and (optionally) Ollama Pro

Step 2 — Install the extension

Step 3 — Open it

✨ Auto routing & savings counter

Per-model usage percentages

Model Analytics panel + exports

Coding accuracy percentages

Task-based model handover

Onyx (Chat & RAG)

n8n (Automation with Ollama)

Documentation index sync (llms.txt)

+ menu — attachments

Agent modes (in the + menu)

Bring your own key — DeepSeek, Qwen, Zhipu, Baidu, Moonshot, Hunyuan, Mimo, Claude, GPT, Gemini

All settings (@ext:RobinBakshi.ollama-direct-custom-agent)

Cost-savings playbook

Troubleshooting

Roadmap

Building from source

License

Documentation index sync (`llms.txt`)

All settings (`@ext:RobinBakshi.ollama-direct-custom-agent`)