LocalPilot

LocalPilot is a local Ollama-powered AI coding assistant for VS Code. It is built for developers who want inline suggestions, code explanations, fixes, tests, and chat without sending code to cloud AI services.

Features

  • Copilot-style inline ghost-text suggestions powered by local Ollama models.
  • Status bar menu for Ollama health, model selection, setup, chat, and inline mode.
  • Chat panel for asking questions about selected code or the active file, with quick actions and code-block copy buttons.
  • Selected-code commands for explaining code, adding comments, fixing code, generating tests, and solving coding problems.
  • Code Actions for quick lightbulb access in common languages.
  • Error explainer for selected stack traces or pasted terminal errors.
  • Safe apply workflow with diff preview, copy option, and explicit apply confirmation.
  • First-run setup that checks Ollama, lists local models, and asks before any model pull.
  • Privacy-first defaults: no telemetry, no cloud AI APIs, and local-only model calls.

Screenshots

Screenshots will be added before Marketplace publication.

  • Inline suggestions: resources/screenshots/inline-suggestions.png
  • Chat sidebar: resources/screenshots/chat-sidebar.png
  • Safe diff preview: resources/screenshots/diff-preview.png

Installation

Install LocalPilot from the VS Code Marketplace, or install a packaged VSIX:

code --install-extension localpilot-0.0.1.vsix

For local development:

npm install
npm run compile

Then press F5 in VS Code to launch the Extension Development Host.

Ollama Setup

Install Ollama from:

https://ollama.com/download

Start Ollama:

ollama serve

LocalPilot connects to this local Ollama endpoint by default:

http://localhost:11434

You can change it with localpilot.ollamaHost.
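For example, to point LocalPilot at an Ollama server on a different host or port, set localpilot.ollamaHost in your settings.json. A minimal sketch, assuming the setting takes a plain URL string (the host and port below are placeholders):

{
  // Hypothetical non-default Ollama endpoint; replace with your own host and port.
  "localpilot.ollamaHost": "http://192.168.1.50:11434"
}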

Recommended Models

On first run, the LocalPilot: Run Setup command can guide you through model setup. You can also pull models manually:

ollama pull qwen2.5-coder:1.5b
ollama pull qwen2.5-coder:7b
ollama pull smollm2:360m
ollama pull deepseek-coder:1.3b
ollama pull codegemma:2b

Suggested profiles:

  • Balanced: qwen2.5-coder:1.5b for responsive local autocomplete and chat.
  • Quality: qwen2.5-coder:7b for stronger local coding help on machines with more memory.
  • Micro: smollm2:360m for very low-resource machines.
  • Lite: deepseek-coder:1.3b for older low-end setups.
  • Compact: codegemma:2b for a small general coding model.

LocalPilot never pulls a model automatically. It asks before downloads because model files can be large.
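Once a model is pulled, point LocalPilot at it through the model settings listed under Settings below. A minimal settings.json sketch, assuming the Balanced profile for inline completions and the Quality profile for chat; the setting names come from this extension's Settings list, and the pairing of models to settings is just one possible choice:

{
  // Small, responsive model for inline ghost-text suggestions.
  "localpilot.inlineModel": "qwen2.5-coder:1.5b",
  // Larger model for the chat panel on machines with more memory.
  "localpilot.chatModel": "qwen2.5-coder:7b",
  // Tiny fallback model for low-memory situations.
  "localpilot.lowRamModel": "smollm2:360m"
}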

Low RAM Mode

Use localpilot.mode to control resource usage:

  • auto: conservative defaults.
  • micro: shortest context and smallest outputs.
  • lite: balanced low-resource behavior.
  • standard: more useful context and output sizes.
  • custom: use your configured context and output settings.

Related settings (combined in the sketch after this list):

  • localpilot.inlineDebounceMs
  • localpilot.disableInlineForLargeFiles
  • localpilot.maxFileSizeKb
  • localpilot.enableLowRamWarnings
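A minimal settings.json sketch of a low-resource setup; the numeric values below are illustrative assumptions, not the extension's documented defaults:

{
  // Shortest context and smallest outputs.
  "localpilot.mode": "micro",
  // Wait longer before requesting inline suggestions (milliseconds, assumed from the setting name).
  "localpilot.inlineDebounceMs": 500,
  // Skip inline suggestions in large files.
  "localpilot.disableInlineForLargeFiles": true,
  "localpilot.maxFileSizeKb": 256,
  // Keep low-memory warnings on.
  "localpilot.enableLowRamWarnings": true
}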

Commands

  • LocalPilot: Open Chat
  • LocalPilot: Explain Selected Code
  • LocalPilot: Explain Error
  • LocalPilot: Add Comments to Selection
  • LocalPilot: Fix Selected Code
  • LocalPilot: Generate Tests
  • LocalPilot: Solve Coding Problem
  • LocalPilot: Check Ollama Status
  • LocalPilot: Select Local Model
  • LocalPilot: Switch Inline Completion Mode
  • LocalPilot: Open Status Menu
  • LocalPilot: Run Setup
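All of these commands are available from the Command Palette. If you want a keyboard shortcut, you can bind one in keybindings.json. The sketch below assumes a command ID of localpilot.openChat for LocalPilot: Open Chat; that ID is a guess, so check the extension's Feature Contributions tab or its package.json for the real identifier:

[
  {
    // Hypothetical binding; verify the command ID against the extension's contributed commands.
    "key": "ctrl+alt+l",
    "command": "localpilot.openChat",
    "when": "editorTextFocus"
  }
]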

Settings

  • localpilot.ollamaHost
  • localpilot.inlineModel
  • localpilot.chatModel
  • localpilot.lowRamModel
  • localpilot.mode
  • localpilot.enableInlineSuggestions
  • localpilot.inlineCompletionMode (full by default, or line for conservative one-line suggestions)
  • localpilot.maxContextLines
  • localpilot.maxOutputTokens
  • localpilot.temperature
  • localpilot.inlineDebounceMs
  • localpilot.disableInlineForLargeFiles
  • localpilot.maxFileSizeKb
  • localpilot.enableLowRamWarnings
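Generation-related settings can be tuned together. A minimal settings.json sketch, assuming lines and tokens as units and a 0 to 1 temperature scale; the extension's actual defaults and accepted ranges may differ:

{
  // Conservative one-line inline suggestions instead of full completions.
  "localpilot.inlineCompletionMode": "line",
  // Amount of surrounding code sent as context (assumed to be a line count).
  "localpilot.maxContextLines": 80,
  // Cap on generated output length (assumed to be a token count).
  "localpilot.maxOutputTokens": 256,
  // Lower temperature for more deterministic completions.
  "localpilot.temperature": 0.2
}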

Privacy

LocalPilot does not send code to cloud services. It talks to local Ollama by default and does not collect telemetry. See PRIVACY.md for details.

Troubleshooting

  • Ollama disconnected: run ollama serve, then use LocalPilot: Check Ollama Status.
  • No models found: run LocalPilot: Run Setup or pull a model with ollama pull <model>.
  • Inline suggestions feel slow: set localpilot.inlineCompletionMode to line, or set localpilot.mode to micro or lite.
  • Output looks weak: try a larger local model and set it with LocalPilot: Select Local Model.
  • Files are skipped: LocalPilot blocks secrets, generated folders, lock files, and large files by design.

Known Limitations

  • Model quality depends on the local model you install.
  • Very small models may produce incomplete fixes or tests.
  • LocalPilot does not read terminal contents automatically.
  • Full inline completions are bounded and filtered, but local model quality still matters.
  • Large workspaces and generated files are filtered aggressively for safety and responsiveness.

Development

npm run compile
npm run lint
npm test
npm run package