LocalPilot

LocalPilot is a local Ollama-powered AI coding assistant for VS Code. It is built for developers who want inline suggestions, code explanations, fixes, tests, and chat without sending code to cloud AI services.

Features

  • Copilot-style inline ghost-text suggestions powered by local Ollama models.
  • Status bar menu for Ollama health, model selection, setup, chat, and inline mode.
  • Chat panel for asking questions about selected code or the active file, with quick actions and code-block copy buttons.
  • Selected-code commands for explaining code, adding comments, fixing code, generating tests, and solving coding problems.
  • Code Actions for quick lightbulb access in common languages.
  • Error explainer for selected stack traces or pasted terminal errors.
  • Safe apply workflow with diff preview, copy option, and explicit apply confirmation.
  • First-run setup that checks Ollama, lists local models, and asks before any model pull.
  • Privacy-first defaults: no telemetry, no cloud AI APIs, and local-only model calls.

Screenshots

Screenshots will be added before Marketplace publication.

  • Inline suggestions: resources/screenshots/inline-suggestions.png
  • Chat sidebar: resources/screenshots/chat-sidebar.png
  • Safe diff preview: resources/screenshots/diff-preview.png

Installation

Install LocalPilot from the VS Code Marketplace, or install a packaged VSIX:

code --install-extension localpilot-0.0.1.vsix

For local development:

npm install
npm run compile

Then press F5 in VS Code to launch the Extension Development Host.

Ollama Setup

Install Ollama from:

https://ollama.com/download

Start Ollama:

ollama serve

LocalPilot connects to this local Ollama endpoint by default:

http://localhost:11434

You can change it with localpilot.ollamaHost.
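For example, to point LocalPilot at an Ollama server on a different host or port, set localpilot.ollamaHost in your settings.json. A minimal sketch, assuming the setting takes a plain URL string (the host and port below are placeholders):

{
  // Hypothetical non-default Ollama endpoint; replace with your own host and port.
  "localpilot.ollamaHost": "http://192.168.1.50:11434"
}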

Recommended Models

On first run, the LocalPilot: Run Setup command can guide you through model setup. You can also pull models manually:

ollama pull qwen2.5-coder:1.5b
ollama pull qwen2.5-coder:7b
ollama pull smollm2:360m
ollama pull deepseek-coder:1.3b
ollama pull codegemma:2b

Suggested profiles:

  • Balanced: qwen2.5-coder:1.5b for responsive local autocomplete and chat.
  • Quality: qwen2.5-coder:7b for stronger local coding help on machines with more memory.
  • Micro: smollm2:360m for very low-resource machines.
  • Lite: deepseek-coder:1.3b for older low-end setups.
  • Compact: codegemma:2b for a small general coding model.

LocalPilot never pulls a model automatically. It asks before downloads because model files can be large.
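Once a model is pulled, point LocalPilot at it through the model settings listed under Settings below. A minimal settings.json sketch, assuming the Balanced profile for inline completions and the Quality profile for chat; the setting names come from this extension's Settings list, and the pairing of models to settings is just one possible choice:

{
  // Small, responsive model for inline ghost-text suggestions.
  "localpilot.inlineModel": "qwen2.5-coder:1.5b",
  // Larger model for the chat panel on machines with more memory.
  "localpilot.chatModel": "qwen2.5-coder:7b",
  // Tiny fallback model for low-memory situations.
  "localpilot.lowRamModel": "smollm2:360m"
}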

Low RAM Mode

Use localpilot.mode to control resource usage:

  • auto: conservative defaults.
  • micro: shortest context and smallest outputs.
  • lite: balanced low-resource behavior.
  • standard: more useful context and output sizes.
  • custom: use your configured context and output settings.

Related settings (combined in the sketch after this list):

  • localpilot.inlineDebounceMs
  • localpilot.disableInlineForLargeFiles
  • localpilot.maxFileSizeKb
  • localpilot.enableLowRamWarnings
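A minimal settings.json sketch of a low-resource setup; the numeric values below are illustrative assumptions, not the extension's documented defaults:

{
  // Shortest context and smallest outputs.
  "localpilot.mode": "micro",
  // Wait longer before requesting inline suggestions (milliseconds, assumed from the setting name).
  "localpilot.inlineDebounceMs": 500,
  // Skip inline suggestions in large files.
  "localpilot.disableInlineForLargeFiles": true,
  "localpilot.maxFileSizeKb": 256,
  // Keep low-memory warnings on.
  "localpilot.enableLowRamWarnings": true
}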

Commands

  • LocalPilot: Open Chat
  • LocalPilot: Explain Selected Code
  • LocalPilot: Explain Error
  • LocalPilot: Add Comments to Selection
  • LocalPilot: Fix Selected Code
  • LocalPilot: Generate Tests
  • LocalPilot: Solve Coding Problem
  • LocalPilot: Check Ollama Status
  • LocalPilot: Select Local Model
  • LocalPilot: Switch Inline Completion Mode
  • LocalPilot: Open Status Menu
  • LocalPilot: Run Setup
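All of these commands are available from the Command Palette. If you want a keyboard shortcut, you can bind one in keybindings.json. The sketch below assumes a command ID of localpilot.openChat for LocalPilot: Open Chat; that ID is a guess, so check the extension's Feature Contributions tab or its package.json for the real identifier:

[
  {
    // Hypothetical binding; verify the command ID against the extension's contributed commands.
    "key": "ctrl+alt+l",
    "command": "localpilot.openChat",
    "when": "editorTextFocus"
  }
]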

Settings

  • localpilot.ollamaHost
  • localpilot.inlineModel
  • localpilot.chatModel
  • localpilot.lowRamModel
  • localpilot.mode
  • localpilot.enableInlineSuggestions
  • localpilot.inlineCompletionMode (full by default, or line for conservative one-line suggestions)
  • localpilot.maxContextLines
  • localpilot.maxOutputTokens
  • localpilot.temperature
  • localpilot.inlineDebounceMs
  • localpilot.disableInlineForLargeFiles
  • localpilot.maxFileSizeKb
  • localpilot.enableLowRamWarnings
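Generation-related settings can be tuned together. A minimal settings.json sketch, assuming lines and tokens as units and a 0 to 1 temperature scale; the extension's actual defaults and accepted ranges may differ:

{
  // Conservative one-line inline suggestions instead of full completions.
  "localpilot.inlineCompletionMode": "line",
  // Amount of surrounding code sent as context (assumed to be a line count).
  "localpilot.maxContextLines": 80,
  // Cap on generated output length (assumed to be a token count).
  "localpilot.maxOutputTokens": 256,
  // Lower temperature for more deterministic completions.
  "localpilot.temperature": 0.2
}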

Privacy

LocalPilot does not send code to cloud services. It talks to local Ollama by default and does not collect telemetry. See PRIVACY.md for details.

Troubleshooting

  • Ollama disconnected: run ollama serve, then use LocalPilot: Check Ollama Status.
  • No models found: run LocalPilot: Run Setup or pull a model with ollama pull <model>.
  • Inline suggestions feel slow: set localpilot.inlineCompletionMode to line, or set localpilot.mode to micro or lite.
  • Output looks weak: try a larger local model and set it with LocalPilot: Select Local Model.
  • Files are skipped: LocalPilot blocks secrets, generated folders, lock files, and large files by design.

Known Limitations

  • Model quality depends on the local model you install.
  • Very small models may produce incomplete fixes or tests.
  • LocalPilot does not read terminal contents automatically.
  • Full inline completions are bounded and filtered, but local model quality still matters.
  • Large workspaces and generated files are filtered aggressively for safety and responsiveness.

Development

npm run compile
npm run lint
npm test
npm run package