Grom

Ryan Connolly | Free
Your local AI assistant. Runs entirely on your machine — no cloud, no account, no telemetry. Powered by Ollama, LM Studio, or any OpenAI-compatible API.
Installation

Launch VS Code Quick Open (Ctrl+P), paste the extension's install command, and press Enter.

BYOLLM

Bring Your Own LLM. Your model. Your machine. Your rules.


Grom is a privacy-first AI coding assistant for VS Code that runs entirely on your local hardware.

No cloud. No account. No telemetry. No dark patterns. No upsell.

Just you, your code, and Grom.


Why Grom?

Other tools say they support local models. Try it and you'll find yourself three config screens deep, staring at a broken connection, wondering why the cloud path is suspiciously smooth.

Grom was built because local AI shouldn't require fighting your tools.

If Ollama is running, Grom works. That's the whole deal.


Meet Grom

Grom is the little robot who lives in your sidebar. He watches your cursor, thinks while you type, and goes to sleep when you're idle. He's not a feature — he's the soul of the tool.

His antenna tells you what's happening before you read a word:

| State | What it means |
| --- | --- |
| Antenna bent | PLAN mode — thinking broadly |
| Antenna straight | BUILD mode — focused, ready to ship |
| Antenna bobbing | Waiting for a response |
| Antenna drooping, faded | Your server isn't running |
| Glitching | Server returned an error |
| Eyes closed, thought bubble | Idle — wake him up by typing |

Two Modes, One Purpose

PLAN mode — warm honey gold. Grom thinks architecturally. Break down problems, plan features, talk through ideas before a single line is written.

BUILD mode — focused blue. Grom is direct and implementation-ready. Write code, fix bugs, ship things.

The UI colour shifts with the mode. So does Grom's personality.


What Grom Does

Providers

Grom works with local servers out of the box — no account, no key. Cloud providers are supported optionally if you want to bring your own key.

Built-in:

| Provider | Notes |
| --- | --- |
| Ollama | Local, 127.0.0.1:11434 — recommended |
| LM Studio | Local, 127.0.0.1:1234 |
| Open Code | Local |
| OpenAI | GPT-4o, o1, o3-mini |
| Anthropic | Claude Sonnet, Claude Opus |

Custom providers — add any OpenAI-compatible endpoint or Anthropic-compatible proxy via grom.customProviders. Gemini, Groq, Mistral, OpenRouter, Together AI, and most other cloud APIs work out of the box. See Adding a Custom Provider below.

Switch providers and models without leaving the panel. Grom detects model capabilities automatically — vision, tool use, and reasoning models each show their own icon.

Knows What You're Working On

Use @ in any message to attach context:

| Mention | What it includes |
| --- | --- |
| @filename | Any workspace file — open tabs shown first |
| @problems | All current VS Code errors and warnings |
| @git | Your current uncommitted diff (git diff HEAD) |
| @terminal | Recent output from the integrated terminal |
| @url:https://... | Fetches a web page and includes its text |

Auto-context is on by default — Grom reads the file you have open automatically.
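
For example, several mentions can be combined in a single message (the file name here is purely illustrative):

```
@server.ts is crashing on startup. Here is my uncommitted diff: @git
And the current diagnostics: @problems
```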

Inline Autocomplete

Ghost-text completions as you type, powered by FIM (fill-in-the-middle) models.

  • Adaptive debounce — speeds up when you're accepting, slows down when you're not
  • Word-by-word accept — Tab accepts the next word; keep pressing for more
  • Dedicated model — set a fast FIM model (e.g. qwen2.5-coder:1.5b) separate from your chat model
  • Per-language routing — different models for different languages via grom.languageModels
  • Toggle — click ✦ Grom in the status bar to enable/disable instantly
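
As a sketch, a settings.json fragment that enables autocomplete with a dedicated FIM model and one per-language override (the model names are illustrative examples, not requirements):

```json
{
  "grom.autocomplete": true,
  "grom.autocompleteModel": "qwen2.5-coder:1.5b",
  "grom.languageModels": {
    "python": "deepseek-coder:1.3b"
  }
}
```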

Inline Edit

Select code, press Ctrl+Shift+I, describe what you want. Grom rewrites it and opens a diff. Accept or Reject.

Compose — Multi-file Edit

Press Ctrl+Shift+O or type /compose. Describe changes across your codebase. Review per-file or apply everything at once. Undo the whole run with one click.

Every code block in compose format gets a 💾 Save button — opens a diff showing exactly what will change.

Agentic Loop

Grom doesn't just reply once — it works through tasks step by step, calling tools based on what the last one returned.

| Tool | What it does |
| --- | --- |
| read_file | Read any file in your workspace |
| write_file | Write or create a file, then open it in the editor |
| list_directory | List files and folders at a path |
| delete_file | Delete a file |
| search_files | Search workspace files by regex pattern |
| run_terminal | Run a shell command and return its output |
| browse_web | Fetch a live web page and return its text content |

Note on model size: Tool call accuracy scales with model size. 32B+ models call tools reliably. Smaller models (1.5B–7B) occasionally write prose instead of a tool call. Grom handles this by re-prompting once when it detects prose where a tool call was expected, and enables structured JSON mode after the first tool use. For complex agentic tasks, 14B+ is significantly more reliable.

MCP Tool Use

Connect any Model Context Protocol server and Grom's model can call its tools during chat. Tool calls stream live with a badge showing which tool is running. Configure via grom.mcpServers.

Grom Memory

Persistent memory injected into every new chat — like custom instructions, but yours.

```
Only use TypeScript.
Never push code directly, always explain changes first.
My stack is React 18 + Express.
```

Open it with the brain icon in the header.

Conversations

  • Multiple chat sessions, persistent across restarts
  • /compact trims long histories — a divider marks exactly where the cut was made
  • Export any conversation as .md, import it back to continue
  • Search through any conversation with live highlighting
  • Per-session system prompt override via the chat bubble icon

Custom Prompt Files

Create .grom/*.md files in your workspace. A file at .grom/deploy.md becomes /deploy — shareable with your whole team via git.
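
A prompt file is just plain instructions in markdown. For instance, a hypothetical .grom/deploy.md (its contents here are purely illustrative) might read:

```markdown
Review the staged changes, bump the version in package.json,
and draft release notes in CHANGELOG.md. Do not push anything.
```

Committing the .grom/ directory makes /deploy available to everyone who clones the repo.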


Slash Commands

Type / to open the command menu:

| Command | What it does |
| --- | --- |
| /explain | Explain the active file |
| /refactor | Refactor for clarity and best practices |
| /fix | Find and fix bugs |
| /tests | Write unit tests |
| /docs | Write documentation |
| /review | Full code review |
| /commit | Generate a git commit message |
| /compose | Multi-file edit mode |
| /search <query> | Web search via DuckDuckGo |
| /<name> | Any .grom/<name>.md file in your workspace |

Keyboard Shortcuts

| Shortcut | Action |
| --- | --- |
| Ctrl+Shift+G / Cmd+Shift+G | Open Grom |
| Ctrl+Shift+I / Cmd+Shift+I | Inline edit (requires selection) |
| Ctrl+Shift+Y / Cmd+Shift+Y | Accept inline diff |
| Ctrl+Shift+U / Cmd+Shift+U | Reject inline diff |
| Ctrl+Shift+O / Cmd+Shift+O | Open Compose mode |
| Enter | Send message |
| Shift+Enter | New line |

Requirements

Grom works with local servers (no account or key needed) or cloud providers (bring your own key).

Editors:

Grom runs in VS Code and any VS Code-compatible editor:

  • Visual Studio Code — recommended
  • Google Antigravity — confirmed working
  • Cursor, Windsurf, and other VS Code forks should work too

Local — runs entirely on your machine:

  • Ollama — recommended, free, runs most open models
  • LM Studio — great UI for managing models

Cloud — optional, requires an API key from each provider:

  • OpenAI — GPT-4o, o1, o3-mini
  • Anthropic — Claude Sonnet, Claude Opus
  • Gemini, Groq, Mistral, OpenRouter, and any OpenAI-compatible endpoint

Recommended local models:

| Use | Model |
| --- | --- |
| Chat | qwen2.5-coder:32b, deepseek-coder-v2, llama3.1 |
| Autocomplete | qwen2.5-coder:1.5b, deepseek-coder:1.3b, starcoder2:3b |
| Embeddings (RAG) | nomic-embed-text, mxbai-embed-large |

Settings

| Setting | Description | Default |
| --- | --- | --- |
| grom.apiUrl | Your local server URL | http://127.0.0.1:11434 |
| grom.model | Chat model name | qwen2.5-coder |
| grom.useOllamaFormat | Use Ollama's chat format | true |
| grom.autocomplete | Enable inline completions | true |
| grom.autocompleteModel | Dedicated FIM model | (chat model) |
| grom.languageModels | Per-language model overrides | {} |
| grom.ragEnabled | Enable codebase indexing | true |
| grom.embeddingModel | Ollama model for semantic RAG | (blank) |
| grom.mcpServers | MCP server definitions | [] |
| grom.customProviders | Custom provider endpoints; keys stored securely in OS keychain | [] |
| grom.robotAnimations | Enable Grom's animations | true |
| grom.theme | UI theme: Grom, Cyberpunk, Classic, High Contrast | Grom |
| grom.agentEnabled | Enable the agentic loop | true |
| grom.agentMaxIterations | Max tool-call rounds per task | 20 |
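
Putting a few of these together, a typical local-only setup in settings.json might look like this (the FIM model is an example; the other values shown are the defaults):

```json
{
  "grom.apiUrl": "http://127.0.0.1:11434",
  "grom.model": "qwen2.5-coder",
  "grom.autocompleteModel": "qwen2.5-coder:1.5b",
  "grom.agentEnabled": true,
  "grom.agentMaxIterations": 20
}
```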

Per-Language Model Routing

Set grom.languageModels in settings.json to map language IDs to model names:

```json
{
  "python": "qwen2.5-coder:1.5b",
  "typescript": "qwen2.5-coder:32b",
  "rust": "deepseek-coder-v2"
}
```

Adding a Custom Provider

OpenAI and Anthropic are built-in — select them from the provider dropdown. Use grom.customProviders for everything else.

API keys are never stored in settings files. Grom prompts for a key the first time you select a provider that needs one, then stores it securely in the OS keychain (Windows Credential Manager / macOS Keychain / libsecret on Linux). Click the lock icon next to the provider dropdown at any time to update or clear a key.

```json
[
  { "name": "Gemini",     "url": "https://generativelanguage.googleapis.com/v1beta/openai" },
  { "name": "Groq",       "url": "https://api.groq.com/openai" },
  { "name": "Mistral",    "url": "https://api.mistral.ai" },
  { "name": "OpenRouter", "url": "https://openrouter.ai/api" },
  { "name": "Together",   "url": "https://api.together.xyz" },
  { "name": "Local (no key)", "url": "http://127.0.0.1:8080", "authType": "none" },
  { "name": "Claude proxy",   "url": "https://my-proxy.example.com", "providerFormat": "anthropic" }
]
```

For most cloud providers, name and url are all you need. Optional fields:

| Field | Values | Default | When to set |
| --- | --- | --- | --- |
| providerFormat | openai, anthropic | openai | Only for a self-hosted Claude-compatible proxy |
| authType | bearer, x-api-key, none | bearer | Set to none for keyless local servers |
| useOllamaFormat | true, false | false | Only for servers using Ollama's /api/chat format |

MCP Servers

```json
[
  {
    "name": "filesystem",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/workspace"]
  }
]
```

Context Window

The radial circle in the toolbar shows how full your context window is. Hover it to see the exact token count and window size. When it fills up, /compact trims old messages — Grom marks the cut point so you always know what's been removed.

Set the exact context window size for your model via grom.modelPricing for an accurate reading.


License

PolyForm Noncommercial 1.0.0 — free to use, free to modify, free to share. Not for commercial use.


BYOLLM. Built in Ireland.
