Bramha — Local Memory for GitHub Copilot
Bramha gives GitHub Copilot Chat a long-term memory that runs entirely on your
own machine. Install the extension, reload the window, and Copilot can remember
your decisions, preferences, and project facts across sessions.
Open source · MIT License · No data leaves your computer.
Why you might want this
GitHub Copilot is great inside one chat, but it forgets everything when you
close the window. That means you keep re-explaining the same things:
- "We use pytest, not unittest."
- "Our API base path is /v2/...."
- "Last week we decided to skip Redis for now."
Bramha stores these small facts locally and feeds the most relevant ones back
to Copilot whenever you ask a related question. You stop repeating yourself,
and Copilot stops giving generic answers.
What it actually does
- Saves notes — Copilot (or you) can save short facts, decisions, and
lessons through an MCP tool call (upsert_fact, record_episode, etc.).
- Finds the right note — When you ask Copilot something, Bramha searches
your past notes and returns the most relevant ones.
- Stays on your laptop — Everything sits in a local SQLite + LanceDB
store under your VS Code user folder. No cloud, no telemetry.
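As a rough illustration of the "Saves notes" step, the sketch below builds the kind of JSON-RPC 2.0 message an MCP client sends to invoke a tool. The `tools/call` method and the `name`/`arguments` fields come from the MCP specification; the `text` argument is a hypothetical placeholder, since the actual input schema of upsert_fact is not documented here. The helper `build_tool_call` is also made up for this example.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical argument name ("text"); the real upsert_fact schema may differ.
request = build_tool_call(
    1,
    "upsert_fact",
    {"text": "We use pytest, not unittest."},
)
print(request)
```

In normal use Copilot Chat builds and sends these messages itself; the sketch only shows what travels over the wire.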
Under the hood it uses a hybrid search engine: vector search (LanceDB +
nomic-embed-text-v1.5, 768-dim) + keyword search (BM25) + a small
cross-encoder reranker (MiniLM-L6). Results are merged with Reciprocal Rank
Fusion (RRF).
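To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists. It is not Bramha's actual code: the function name, the note IDs, and the constant k=60 (a commonly used default) are assumptions for illustration, and the reranking stage is left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, not a value
    confirmed for Bramha.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the vector and keyword searches.
vector_hits = ["note_pytest", "note_redis", "note_api"]
keyword_hits = ["note_api", "note_pytest", "note_ci"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['note_pytest', 'note_api', 'note_redis', 'note_ci']
```

Notes that rank well in both lists rise to the top, which is why RRF needs no score normalization between the vector and BM25 engines.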
Install
- Download
bramha-3.0.0.vsix from the Releases page (or build it yourself
with npx vsce package).
- In VS Code: Extensions panel → "..." menu → Install from VSIX...
- Reload the window.
The first time it starts, Bramha creates a private Python virtual environment
under VS Code's globalStorage/azul-bramha.bramha/ folder and installs its
dependencies. This takes 1–2 minutes and only happens once.
It then registers itself as a local MCP server, so Copilot Chat can call its
tools automatically.
How well does it work?
Bramha has been measured on standard public long-memory evaluation suites
(independent research datasets we did not create). Headline numbers:
Long-session memory (500 questions, ~19,000 chat sessions)
| Metric | Bramha |
|---|---|
| Recall@1 | 93.4% |
| Recall@5 | 99.0% |
| Recall@10 | 99.4% |
| MRR | 0.957 |
On the same evaluation, Bramha's Recall@5 is higher than the best public
result we are aware of for an open-source memory layer (~98.4%).
Reproducible with the bundled benchmark script.
Long multi-turn conversations (~2,000 questions, 10 long dialogues)
| Metric | Result |
|---|---|
| Recall@1 | 80.4% |
| Recall@5 | 94.9% |
| Recall@10 | 97.7% |
| p50 latency | 2.5 s (CPU) |
Reproducible with the bundled benchmark script.
Plain-English takeaway: on a standard long-memory test, Bramha finds the
right memory in its top 5 results 99 times out of 100, which is higher than
the best public open-source result we are aware of (~98.4%).
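For readers unfamiliar with the metrics above, the sketch below shows one common way to compute Recall@k and MRR. It assumes the "hit rate" definition, where a question counts as a success if at least one relevant memory appears in the top k results; the bundled benchmark script may define the metrics differently. The four toy queries are invented for the example.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Return 1.0 if any relevant ID is in the top k results, else 0.0."""
    return 1.0 if any(doc in relevant_ids for doc in ranked_ids[:k]) else 0.0


def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant result, or 0.0 if none is found."""
    for rank, doc in enumerate(ranked_ids, start=1):
        if doc in relevant_ids:
            return 1.0 / rank
    return 0.0


# Toy evaluation set: (ranked results, set of relevant IDs) per question.
results = [
    (["a", "b", "c"], {"a"}),  # relevant note at rank 1
    (["b", "a", "c"], {"a"}),  # relevant note at rank 2
    (["c", "b", "a"], {"a"}),  # relevant note at rank 3
    (["b", "c", "d"], {"a"}),  # relevant note not retrieved
]

n = len(results)
recall_1 = sum(recall_at_k(r, rel, 1) for r, rel in results) / n
recall_2 = sum(recall_at_k(r, rel, 2) for r, rel in results) / n
mrr = sum(reciprocal_rank(r, rel) for r, rel in results) / n

print(f"Recall@1={recall_1:.2f} Recall@2={recall_2:.2f} MRR={mrr:.3f}")
```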
How fast is it?
Typical query times on an Apple M-series laptop (CPU only, no GPU needed):
- ~30 ms — BM25 keyword search alone
- ~300 ms — full hybrid pipeline with reranking (warm cache)
- ~2.5 s — full hybrid pipeline cold (first query after start)
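If you want to check warm-cache numbers like these on your own machine, a small timing harness along the following lines is enough. This is a sketch, not the bundled benchmark: `fake_search` is a hypothetical stand-in for whatever function issues a query, and the warm-up pass is there so the cold-start cost is excluded from the median.

```python
import statistics
import time


def measure_latencies(search_fn, queries, warmup=1):
    """Time search_fn on each query and return the latencies in milliseconds."""
    for query in queries[:warmup]:
        search_fn(query)  # warm caches so the cold-start cost is excluded
    latencies = []
    for query in queries:
        start = time.perf_counter()
        search_fn(query)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def fake_search(query):
    """Hypothetical stand-in for a real search call."""
    time.sleep(0.001)
    return [query]


latencies = measure_latencies(fake_search, ["pytest", "redis", "api path"])
p50 = statistics.median(latencies)
print(f"p50 latency: {p50:.1f} ms")
```

To measure the cold figure instead, set warmup=0 and look at the first entry of the returned list.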
Privacy
- All data is stored locally under VS Code's user folder.
- No network calls are made for search, embedding, or reranking — models run on
your CPU.
- The extension never sends your notes anywhere.
Uninstall
VS Code: Extensions panel → Bramha → Uninstall.
To also delete the local data:
rm -rf "$HOME/Library/Application Support/Code/User/globalStorage/azul-bramha.bramha"
(The path above is for macOS. On Windows, the folder is
%APPDATA%\Code\User\globalStorage\azul-bramha.bramha; on Linux, it is
~/.config/Code/User/globalStorage/azul-bramha.bramha.)
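If you are unsure which of these paths applies, the sketch below resolves the data folder for the current platform from the three locations listed above. It assumes a standard VS Code install (folder name "Code"); Insiders, VSCodium, or portable builds use different folders. The function name is made up for this example, and the script only prints the path without deleting anything.

```python
import os
import sys
from pathlib import Path

EXTENSION_ID = "azul-bramha.bramha"


def bramha_data_dir(platform=sys.platform, env=os.environ, home=None):
    """Return the globalStorage folder for Bramha on the given platform."""
    home = Path(home) if home is not None else Path.home()
    if platform == "darwin":
        base = home / "Library" / "Application Support" / "Code"
    elif platform.startswith("win"):
        base = Path(env["APPDATA"]) / "Code"
    else:
        # Assumes the default Linux config location (~/.config).
        base = home / ".config" / "Code"
    return base / "User" / "globalStorage" / EXTENSION_ID


data_dir = bramha_data_dir()
print(f"{data_dir} (exists: {data_dir.exists()})")
```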
License
MIT — see the LICENSE file bundled with the extension.