Bramha — Local Memory for GitHub Copilot

Bramha gives GitHub Copilot Chat a long-term memory that runs entirely on your own machine. Install the extension, reload the window, and Copilot can remember your decisions, preferences, and project facts across sessions.

Open source · MIT License · No data leaves your computer.

Why you might want this

GitHub Copilot is great inside one chat, but it forgets everything when you close the window. That means you keep re-explaining the same things:

"We use pytest, not unittest."
"Our API base path is /v2/...."
"Last week we decided to skip Redis for now."

Bramha stores these small facts locally and feeds the most relevant ones back to Copilot whenever you ask a related question. You stop repeating yourself, and Copilot stops giving generic answers.

What it actually does

Saves notes — Copilot (or you) can save short facts, decisions, and lessons through an MCP tool call (upsert_fact, record_episode, etc.).
Finds the right note — When you ask Copilot something, Bramha searches your past notes and returns the most relevant ones.
Stays on your laptop — Everything sits in a local SQLite + LanceDB store under your VS Code user folder. No cloud, no telemetry.
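To make the "saves notes" idea concrete, here is a minimal sketch of an upsert into a local SQLite table. It is an illustration only: the facts table, its columns, and the Python function signatures are assumptions made for this example, not Bramha's actual schema or the real signature of its upsert_fact tool.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a tiny key/value fact store. Hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facts ("
        " key TEXT PRIMARY KEY,"
        " value TEXT NOT NULL,"
        " updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def upsert_fact(conn, key, value):
    """Insert a fact, or overwrite the value if the key already exists."""
    conn.execute(
        "INSERT INTO facts (key, value) VALUES (?, ?) "
        "ON CONFLICT(key) DO UPDATE SET value = excluded.value, "
        "updated_at = CURRENT_TIMESTAMP",
        (key, value),
    )
    conn.commit()


def get_fact(conn, key):
    """Return the stored value for a key, or None if it is unknown."""
    row = conn.execute("SELECT value FROM facts WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None
```

Saving the same key twice keeps a single row with the latest value, which is the behaviour you would want when a decision such as "we use pytest, not unittest" changes over time.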

Under the hood it uses a hybrid search engine: vector search (LanceDB + nomic-embed-text-v1.5, 768-dim) + keyword search (BM25) + a small cross-encoder reranker (MiniLM-L6). Results are merged with Reciprocal Rank Fusion (RRF).
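As a rough illustration of the merging step, the sketch below fuses two ranked lists with Reciprocal Rank Fusion. The constant k=60 is the value commonly used in the RRF literature; the value Bramha uses, and the note identifiers shown here, are assumptions for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one list, best first.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the vector and keyword searches.
vector_hits = ["note_pytest", "note_redis", "note_api"]
bm25_hits = ["note_api", "note_pytest", "note_ci"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Notes that rank well in both lists rise to the top, while a note found by only one search still stays in the candidate set for the reranker.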

Install

1. Download bramha-3.0.0.vsix from the Releases page (or build it yourself with npx vsce package).
2. In VS Code: Extensions panel → "..." menu → Install from VSIX...
3. Reload the window.

The first time it starts, Bramha creates a private Python virtual environment under VS Code's globalStorage/azul-bramha.bramha/ folder and installs its dependencies. This takes 1–2 minutes and only happens once.

It then registers itself as a local MCP server, so Copilot Chat can call its tools automatically.

How well does it work?

Bramha has been measured on standard public long-memory evaluation suites (independent research datasets we did not create). Headline numbers:

Long-session memory (500 questions, ~19,000 chat sessions)

Metric	Bramha
Recall@1	93.4%
Recall@5	99.0%
Recall@10	99.4%
MRR	0.957

On the same evaluation, Bramha's Recall@5 (99.0%) is higher than the best public result we are aware of for an open-source memory layer (~98.4%). These numbers can be reproduced with the bundled benchmark script.
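For readers unfamiliar with the metrics, the sketch below shows one common way to compute Recall@k and MRR. It assumes the hit-rate definition of Recall@k (a question counts as a hit if any relevant memory appears in the top k); the bundled benchmark script may define or aggregate these differently, and the function names here are made up for the example.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Return 1.0 if any relevant id is in the top k results, else 0.0."""
    return 1.0 if any(doc in relevant_ids for doc in ranked_ids[:k]) else 0.0


def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1 / rank of the first relevant result, or 0.0 if none is found."""
    for rank, doc in enumerate(ranked_ids, start=1):
        if doc in relevant_ids:
            return 1.0 / rank
    return 0.0


def evaluate(results, ks=(1, 5, 10)):
    """Average the metrics over (ranked_ids, relevant_ids) pairs."""
    n = len(results)
    report = {
        f"recall@{k}": sum(recall_at_k(r, rel, k) for r, rel in results) / n
        for k in ks
    }
    report["mrr"] = sum(reciprocal_rank(r, rel) for r, rel in results) / n
    return report
```

Under this definition, a Recall@5 of 99.0% over 500 questions means that 495 questions had a correct memory somewhere in the top five results.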

Long multi-turn conversations (~2,000 questions, 10 long dialogues)

Metric	Result
Recall@1	80.4%
Recall@5	94.9%
Recall@10	97.7%
p50 latency	2.5 s (CPU)

These numbers can be reproduced with the bundled benchmark script.

Plain-English takeaway: on a standard long-memory test, Bramha finds the right memory in its top 5 results 99 times out of 100, which is higher than the best public result we are aware of for an open-source memory layer.

How fast is it?

Typical query times on an Apple M-series laptop (CPU only, no GPU needed):

~30 ms — BM25 keyword search alone
~300 ms — full hybrid pipeline with reranking (warm cache)
~2.5 s — full hybrid pipeline cold (first query after start)
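If you want to check timings like these on your own machine, a small harness along the following lines may help. It is a sketch under assumptions: query_fn stands in for whatever function issues a search, and is not part of Bramha's API.

```python
import statistics
import time


def measure_latencies(query_fn, queries):
    """Run query_fn on each query and return per-query latencies in ms."""
    latencies_ms = []
    for query in queries:
        start = time.perf_counter()
        query_fn(query)  # hypothetical search call; result is discarded
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def p50(latencies_ms):
    """Median latency: half of the queries finished at least this fast."""
    return statistics.median(latencies_ms)
```

Timing the first query separately from the rest is what distinguishes the cold figure from the warm one.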

Privacy

All data is stored locally under VS Code's user folder.
No network calls are made for search, embedding, or reranking — models run on your CPU.
The extension never sends your notes anywhere.

Uninstall

VS Code: Extensions panel → Bramha → Uninstall.

To also delete the local data on macOS:

rm -rf "$HOME/Library/Application Support/Code/User/globalStorage/azul-bramha.bramha"

(On Windows, delete the folder %APPDATA%\Code\User\globalStorage\azul-bramha.bramha instead. On Linux, the folder is ~/.config/Code/User/globalStorage/azul-bramha.bramha.)
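The three locations above can also be resolved programmatically. The helper below is a sketch with a made-up function name; it assumes a standard VS Code install and simply encodes the paths listed in this section. It only returns the path and does not delete anything.

```python
from pathlib import PurePosixPath, PureWindowsPath

EXTENSION_ID = "azul-bramha.bramha"


def bramha_data_dir(platform, home, appdata=None):
    """Return the Bramha data folder for a given sys.platform value.

    platform: "darwin", "linux", or "win32" (as reported by sys.platform)
    home:     the user's home directory (used on macOS and Linux)
    appdata:  the value of %APPDATA% (required on Windows)
    """
    if platform == "darwin":
        base = PurePosixPath(home) / "Library/Application Support/Code/User/globalStorage"
        return str(base / EXTENSION_ID)
    if platform.startswith("linux"):
        base = PurePosixPath(home) / ".config/Code/User/globalStorage"
        return str(base / EXTENSION_ID)
    if platform == "win32":
        if appdata is None:
            raise ValueError("appdata is required on Windows")
        base = PureWindowsPath(appdata) / "Code/User/globalStorage"
        return str(base / EXTENSION_ID)
    raise ValueError(f"unsupported platform: {platform}")
```

On a real machine you would call it as bramha_data_dir(sys.platform, os.path.expanduser("~"), os.environ.get("APPDATA")) and inspect the result before removing the folder.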

License

MIT — see the LICENSE file bundled with the extension.

Gaurav-Mishra8032