Distyl
Distil your VS Code workspace into a prompt-ready context payload — automatically ranked, token-budgeted, clipboard-ready.
The problem
Every time you prompt an AI assistant, you do the same invisible ritual first: copy-paste code snippets, reference docs, error logs, git state, and background into a chat window. The curation happens in your head, and often poorly. Too little context gets generic answers; too much buries the signal in noise. Distyl automates that step with a single keystroke.
How it works
Four stages run on every Cmd+Shift+C:
Cmd+Shift+C → [Gather] → [Rank (MiniLM)] → [Pack (token budget)]
↓
AI chat ← clipboard ← [Preview panel shows what was sent]
- Gather — collectors pull chunks from your active file, recent edits, git diff/log, and terminal history
- Rank — local all-MiniLM-L6-v2 embeddings score each chunk against your prompt, with recency and directory-proximity boosts
- Pack — greedy token packer fits the top-scoring chunks into your chosen budget (4k / 8k / 16k tokens); see the packing sketch after this list
- Deliver — payload lands on your clipboard; the preview panel shows exactly what was sent and why
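The packing step is simple enough to sketch. Below is a minimal greedy token packer in TypeScript, assuming a hypothetical Chunk shape and a rough 4-characters-per-token estimate; Distyl's actual types and tokenizer may differ.

```ts
// Hypothetical chunk shape for illustration; not Distyl's actual types.
interface Chunk {
  source: string; // e.g. "active-file", "git-diff", "terminal"
  text: string;
  score: number;  // ranker output, higher is better
}

// Rough token estimate (~4 characters per token); a real packer would use
// a tokenizer matched to the target model.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Greedy packing: walk chunks from highest to lowest score and keep each
// one that still fits in the remaining token budget.
function packChunks(chunks: Chunk[], budget: number): Chunk[] {
  const packed: Chunk[] = [];
  let used = 0;
  for (const chunk of [...chunks].sort((a, b) => b.score - a.score)) {
    const cost = estimateTokens(chunk.text);
    if (used + cost <= budget) {
      packed.push(chunk);
      used += cost;
    }
  }
  return packed;
}
```

Greedy packing trades optimality for speed: the packer never splits a chunk, it just keeps the highest-scoring ones that fit the budget.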
VS Code extension install
- Install Distyl from the VS Code Marketplace
- Press Cmd+Shift+C (Mac) / Ctrl+Shift+C (Windows/Linux)
- Type what you're about to ask → Enter
- Paste into Claude / ChatGPT / any AI chat
Settings
| Setting | Default | Options |
| --- | --- | --- |
| distyl.budget | standard | focused (4k tokens), standard (8k tokens), deep (16k tokens) |
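For illustration, this is roughly how the extension could turn that setting into a token budget using the standard VS Code configuration API; the BUDGETS map and resolveBudget helper are hypothetical names, not part of Distyl's documented API.

```ts
import * as vscode from 'vscode';

// Token budgets from the table above; preset names mirror distyl.budget values.
const BUDGETS: Record<string, number> = {
  focused: 4_000,
  standard: 8_000,
  deep: 16_000,
};

// Read distyl.budget (default "standard") and map it to a token count.
function resolveBudget(): number {
  const preset = vscode.workspace
    .getConfiguration('distyl')
    .get<string>('budget', 'standard');
  return BUDGETS[preset] ?? BUDGETS.standard;
}
```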
CLI install
npm install -g distyl-cli
distyl -p "fix the auth bug" | claude
Options:
distyl -p "your prompt" # output packed context to stdout
distyl -p "..." --budget focused # focused | standard | deep (default: standard)
distyl -p "..." --clipboard # copy to clipboard instead of stdout
distyl -p "..." --baseline # use BaselineRanker (no model download)
distyl --help
How the ranker works
Distyl runs all-MiniLM-L6-v2 (a 22M-parameter sentence embedding model) entirely locally via @xenova/transformers: no API calls, no data leaves your machine. Each chunk is embedded, cosine similarity is computed against your prompt embedding, then two boosts are applied multiplicatively: 1.3× for chunks modified in the last 5 minutes, 1.2× for chunks in the same directory as your active file. The top 20 chunks above a 0.1 score noise floor are handed to the packer. Embeddings are cached in SQLite keyed by content hash, so repeated runs are near-instant.
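For a concrete picture, here is a sketch of that scoring loop. The pipeline call is standard @xenova/transformers usage; the chunk shape, the rank helper, and the exact order of boosts versus the noise floor are assumptions, not Distyl's actual code.

```ts
import { pipeline } from '@xenova/transformers';

// Local feature-extraction pipeline for all-MiniLM-L6-v2 (downloads weights on first run).
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

// Mean-pooled, L2-normalized sentence embedding.
async function embedText(text: string): Promise<number[]> {
  const output = await extractor(text, { pooling: 'mean', normalize: true });
  return Array.from(output.data as Float32Array);
}

// With normalized vectors, cosine similarity is just a dot product.
const cosine = (a: number[], b: number[]) => a.reduce((sum, v, i) => sum + v * b[i], 0);

interface RankedChunk { text: string; mtimeMs: number; dir: string; score: number }

async function rank(prompt: string, chunks: RankedChunk[], activeDir: string) {
  const promptVec = await embedText(prompt);
  for (const chunk of chunks) {
    let score = cosine(promptVec, await embedText(chunk.text));
    if (Date.now() - chunk.mtimeMs < 5 * 60_000) score *= 1.3; // recent-edit boost
    if (chunk.dir === activeDir) score *= 1.2;                 // same-directory boost
    chunk.score = score;
  }
  return chunks
    .filter((c) => c.score >= 0.1)           // drop chunks below the noise floor
    .sort((a, b) => b.score - a.score)
    .slice(0, 20);                           // top 20 go to the packer
}
```

In practice the per-chunk embeddings would come from the SQLite cache (keyed by content hash) rather than being recomputed on every run.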
The --baseline flag uses a source-priority heuristic (active file → recent edits → git diff hunks by recency) without any embeddings — useful for A/B comparison or instant results on first run.
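A minimal sketch of that source-priority ordering, with illustrative source labels and a hypothetical baselineOrder helper:

```ts
// Illustrative priorities for the no-embedding baseline:
// active file first, then recent edits, then git diff hunks.
const SOURCE_PRIORITY: Record<string, number> = {
  'active-file': 3,
  'recent-edit': 2,
  'git-diff': 1,
};

interface BaselineChunk { source: string; mtimeMs: number }

// Order by source priority, breaking ties by recency (newest first).
function baselineOrder(chunks: BaselineChunk[]): BaselineChunk[] {
  return [...chunks].sort(
    (a, b) =>
      (SOURCE_PRIORITY[b.source] ?? 0) - (SOURCE_PRIORITY[a.source] ?? 0) ||
      b.mtimeMs - a.mtimeMs,
  );
}
```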
Known limitations
- Single workspace folder (multi-root monorepos: picks the first folder)
- CLI recent-edits uses file mtime — approximate compared to VS Code's event stream
- First CLI run downloads MiniLM weights (~80 MB from HuggingFace); subsequent runs use the embedding cache
- Terminal collector requires VS Code shell integration (bash or zsh with shell integration enabled)
- Preview panel is read-only in V1 (manual chunk editing is planned for V1.1)
Development
npm install
npm run compile # one-shot build → dist/extension.js
npm run compile:cli # build CLI → dist/cli.js
npm run watch # rebuild on change
npm run check-types # tsc --noEmit
npm test # run test suite
Press F5 in VS Code to launch the Extension Development Host, then press Cmd+Shift+C and type your prompt.
Demo