Baremetal AI is a read-only VS Code chat extension that uses a local Ollama model to explain and audit code in your workspace.
Features
- Local chat sidebar powered by Ollama.
- Read-only code analysis; the extension does not edit files or apply patches.
- Workspace indexing for common source and text files.
- Context selection from the active file, import graph, recently modified files, symbols, and cached summaries.
- Explicit file context through @ file mentions and drag-and-drop file references.
- Streaming responses with cancellation support.
- Persistent conversation history stored in VS Code global state.
- Background chunk, module, and codebase summaries for additional context.
- Commands to open chat, reindex the workspace, clear history, and check Ollama.
Requirements
- Visual Studio Code ^1.85.0.
- Node.js and npm for source builds.
- Ollama installed and running locally or at a configured endpoint.
- At least one Ollama model pulled locally.
Recommended starter models:
ollama pull qwen2.5-coder:1.5b
ollama pull phi3:mini
Larger models such as qwen2.5-coder:7b or deepseek-coder:6.7b may provide better code analysis if your machine has enough memory.
Installation
This repository does not currently include a Marketplace publishing workflow, so install Baremetal AI by running it from source or from a locally built VSIX.
Run From Source
npm install
npm run compile
Then open this folder in VS Code and press F5 to start an Extension Development Host.
Install From a VSIX
Build the package:
npm install
npm run package
Install the generated .vsix:
code --install-extension baremetal-ai-0.1.0.vsix
The package script expects the vsce CLI (published on npm as @vscode/vsce) to be available on your machine.
Setup & Configuration
Start Ollama:
ollama serve
Pull a model:
ollama pull qwen2.5-coder:1.5b
Open VS Code settings and search for Baremetal AI.
Set baremetalAi.ollamaUrl if Ollama is not running at http://localhost:11434.
Set baremetalAi.model to the model you pulled.
Use Baremetal AI: Check Ollama Connection to verify connectivity.
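The same connectivity check can be approximated outside the extension. The sketch below is not the extension's code: it assumes the check is equivalent to calling Ollama's `GET /api/tags` endpoint, which lists installed models, and the names `FetchLike`, `tagsUrl`, `modelNames`, and `checkOllama` are hypothetical.

```typescript
// Minimal shape of the fields this sketch reads from Ollama's GET /api/tags response.
interface TagsResponse {
  models?: { name: string }[];
}

// Hypothetical minimal fetch signature, so a fake can be injected for testing.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Join the configured base URL and the tags path without doubling slashes.
function tagsUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "") + "/api/tags";
}

// Extract model names, tolerating a missing or empty "models" array.
function modelNames(payload: TagsResponse): string[] {
  return (payload.models ?? []).map((m) => m.name);
}

// Resolves to the installed model names; rejects if the request fails
// or the server answers with an error status.
async function checkOllama(baseUrl: string, fetchImpl: FetchLike): Promise<string[]> {
  const res = await fetchImpl(tagsUrl(baseUrl));
  if (!res.ok) {
    throw new Error(`Ollama responded with HTTP ${res.status}`);
  }
  return modelNames((await res.json()) as TagsResponse);
}
```

On Node.js 18 or later, `checkOllama("http://localhost:11434", fetch)` should resolve to the models you have pulled. If the value of baremetalAi.model is not in that list, pull it before chatting.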
Usage
Open the Baremetal AI activity bar view, then ask questions in the chat panel.
Examples:
Explain how the chat provider builds context.
Audit the current file for security and reliability issues.
Trace how Ollama streaming errors are handled.
To force specific files into context, type @ in the chat input and select an indexed workspace file. You can also drag a workspace file into the chat composer. Selected files appear as chips and are included before automatic context selection.
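The ordering described above can be pictured with a small sketch. It is an assumption about behaviour, not the extension's implementation: `mergeContextFiles` is a hypothetical helper that places explicitly selected files first, appends automatically selected files, drops duplicates, and stops at a limit such as baremetalAi.maxContextFiles. The file paths are made up for illustration.

```typescript
// Hypothetical helper: explicit (@-mentioned or dropped) files come first,
// automatically selected files fill the remaining slots.
function mergeContextFiles(explicit: string[], automatic: string[], maxFiles: number): string[] {
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const path of [...explicit, ...automatic]) {
    if (merged.length >= maxFiles) break; // assumed: the cap applies to the combined list
    if (seen.has(path)) continue; // a file chosen both ways is sent once
    seen.add(path);
    merged.push(path);
  }
  return merged;
}

const files = mergeContextFiles(
  ["src/chatProvider.ts"],
  ["src/ollama.ts", "src/chatProvider.ts", "src/indexer.ts"],
  3,
);
// files is ["src/chatProvider.ts", "src/ollama.ts", "src/indexer.ts"]
```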
Press Stop or Escape to cancel an active response. To remove stored conversation history, use the clear control in the chat header or run Baremetal AI: Clear Conversation History.
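For readers curious how a streamed response can be consumed and stopped part-way, here is a minimal sketch. It assumes the stream is Ollama's `/api/chat` output, which is newline-delimited JSON with the text in `message.content` and a final line carrying `done: true`. `NdjsonTokenReader` is a hypothetical name, and the extension's own streaming code may be organised differently. Cancellation is not shown: a client would typically pass an `AbortSignal` to `fetch` and abort it when the user stops the response, then simply stop feeding chunks to the reader.

```typescript
// Shape of the fields this sketch reads from each streamed line.
interface ChatChunk {
  message?: { content?: string };
  done?: boolean;
}

// Hypothetical reader: buffers partial lines and returns text from complete ones.
class NdjsonTokenReader {
  private buffer = "";
  done = false;

  push(chunk: string): string[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on a newline).
    this.buffer = lines.pop() ?? "";
    const tokens: string[] = [];
    for (const line of lines) {
      if (line.trim() === "") continue;
      const parsed = JSON.parse(line) as ChatChunk;
      if (parsed.message?.content) tokens.push(parsed.message.content);
      if (parsed.done) this.done = true;
    }
    return tokens;
  }
}

// Network chunks can split a JSON line anywhere, so the reader must tolerate that.
const reader = new NdjsonTokenReader();
const first = reader.push('{"message":{"content":"Hel"}}\n{"message":{"con');
const second = reader.push('tent":"lo"}}\n{"done":true}\n');
// first is ["Hel"], second is ["lo"], and reader.done is true
```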
Commands
| Command | Description | Keybinding |
| --- | --- | --- |
| Baremetal AI: Open Chat | Opens the Baremetal AI activity bar view. | None |
| Baremetal AI: Reindex Workspace | Re-scans indexed workspace files. | None |
| Baremetal AI: Clear Conversation History | Clears persisted chat history. | None |
| Baremetal AI: Check Ollama Connection | Checks the configured Ollama endpoint and lists installed models. | None |

Extension Settings
| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| baremetalAi.ollamaUrl | string | http://localhost:11434 | Base URL of the Ollama server. |
| baremetalAi.model | string | qwen2.5-coder:1.5b | Ollama model name. The model must already be installed with ollama pull. |
| baremetalAi.temperature | number | 0.2 | Sampling temperature. Lower values are more deterministic. |
| baremetalAi.maxContextFiles | number | 6 | Maximum number of files to send as context. |
| baremetalAi.maxContextBytes | number | 16384 | Hard cap on total bytes of file context. |
| baremetalAi.maxFileBytes | number | 3500 | Maximum bytes included from a single file. |
| baremetalAi.historyTurns | number | 4 | Number of recent user/assistant turns to keep in model context. |
| baremetalAi.firstTokenTimeoutSec | number | 120 | Timeout before aborting a request that has not produced a first token. |
| baremetalAi.numCtx | number | 2048 | Ollama context window in tokens. |
| baremetalAi.numThread | number | 0 | Ollama CPU thread count. 0 lets Ollama choose automatically. |

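
The three context limits interact, and a worked example may help when tuning them. The sketch below is an illustration under stated assumptions, not the extension's algorithm: `planContextBytes` is a hypothetical function, and it assumes oversized files are truncated rather than skipped and that limits are applied in candidate order.

```typescript
interface ContextLimits {
  maxContextFiles: number;
  maxContextBytes: number;
  maxFileBytes: number;
}

// Default values copied from the settings table.
const defaults: ContextLimits = { maxContextFiles: 6, maxContextBytes: 16384, maxFileBytes: 3500 };

// Hypothetical budget pass: returns how many bytes of each candidate file would be sent.
// Assumes a file that exceeds a limit is truncated, not dropped.
function planContextBytes(fileSizes: number[], limits: ContextLimits): number[] {
  const plan: number[] = [];
  let remaining = limits.maxContextBytes;
  for (const size of fileSizes.slice(0, limits.maxContextFiles)) {
    if (remaining <= 0) break;
    const take = Math.min(size, limits.maxFileBytes, remaining);
    plan.push(take);
    remaining -= take;
  }
  return plan;
}

// Six large files: the total byte cap is reached before the file-count cap.
const plan = planContextBytes([9000, 9000, 9000, 9000, 9000, 9000], defaults);
// plan is [3500, 3500, 3500, 3500, 2384]
```

With the defaults, 6 × 3500 = 21000 bytes exceeds the 16384-byte total, so under these assumptions raising baremetalAi.maxContextFiles alone would not add context for large files; baremetalAi.maxContextBytes would need to grow as well.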
Known Issues
- Startup scans the first workspace folder immediately on activation.
- Only the first workspace folder is indexed in multi-root workspaces.
- Background summarization can compete with chat requests for the local Ollama process.
- Context selection and chunking are heuristic, not parser-backed for every language.
- The extension stores conversation history and summaries in VS Code global state without a retention setting.
- No automated test suite is currently included.
- No LICENSE file is present in this repository.
Roadmap
- Add automated tests for indexing, context selection, prompt construction, and Ollama streaming.
- Improve multi-root workspace support.
- Add user controls for background summarization.
- Add retention controls for stored chat history and summaries.
- Improve parser-backed chunking for more languages.
Contributing
Install dependencies:
npm install
Compile:
npm run compile
Watch during development:
npm run watch
Run the extension:
- Open this repository in VS Code.
- Press F5.
- Use the Extension Development Host window to test Baremetal AI.
Package a VSIX:
npm run package
Before contributing, run:
npm run compile
npm audit --audit-level=low
License
The README previously identified the project as MIT licensed, but this repository does not currently include a LICENSE file. Add one before publishing or distributing the extension.