Knowledge RAG VS Code Extension
Knowledge RAG turns any folder of documents into a searchable knowledge base directly inside VS Code. It ships with an embedded sentence-transformers model (all-MiniLM-L6-v2), ingests multi-format files, builds embeddings on your machine, and lets you query the corpus through a rich results panel or GitHub Copilot Chat without sending content to external services.
Highlights
- Local RAG pipeline - Analyze and index Markdown, text, Office, PDF, JSON/YAML/XML, and image files (via OCR) into <your-folder>/.knowledge-rag.
- Hands-free Python runtime - The extension bootstraps a virtual environment inside extension/.venv, installs the requirements listed in python/requirements.txt, and exposes a knowledgeRag.pythonPath setting for custom interpreters.
- GitHub Copilot integration - Send the top search hits to Copilot Chat, run the Knowledge RAG: Query Knowledge Base with GitHub Copilot command, or mention @knowledge-rag directly inside Copilot conversations.
- Comfortable UX - Dedicated output channel, Explorer context-menu entry, Command Palette commands, and keybindings (Ctrl+K Q / Ctrl+K Shift+Q) keep the workflow discoverable.
Requirements
- Visual Studio Code 1.88.0 or newer.
- Python 3.8+ available on your PATH (3.10+ recommended for faster inference). Configure knowledgeRag.pythonPath if you prefer a specific interpreter or virtual environment.
- About 1.5 GB of free disk space for the embedded model, Python environment, and generated embeddings.
- GitHub Copilot Chat extension (optional, only needed for Copilot features).
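The Python and disk-space requirements above can be checked before installing. The following is a minimal sketch, not part of the extension: the function names are hypothetical, and the thresholds are simply the figures quoted in the list above.

```python
import shutil
import sys
from pathlib import Path

MIN_PYTHON = (3, 8)      # minimum version quoted in the requirements list
REQUIRED_FREE_GB = 1.5   # approximate disk budget quoted in the requirements list


def check_python_version(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the interpreter is at least `minimum` (major, minor)."""
    return tuple(version_info[:2]) >= tuple(minimum)


def free_disk_gb(path="."):
    """Free space on the volume holding `path`, in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(Path(path)).free / 1e9


def preflight(path=".", required_gb=REQUIRED_FREE_GB):
    """Collect human-readable problems; an empty list means both checks passed."""
    problems = []
    if not check_python_version():
        problems.append("Python %d.%d+ is required" % MIN_PYTHON)
    if free_disk_gb(path) < required_gb:
        problems.append("less than %.1f GB free at %s" % (required_gb, path))
    return problems


if __name__ == "__main__":
    issues = preflight()
    print("OK" if not issues else "\n".join(issues))
```

Run it with the interpreter you intend to hand to the extension, from the drive where the extension and knowledge base live.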
Installation
Install the packaged extension
- Build the project (npm install && npm run compile) or use the pre-built knowledge-rag-<version>.vsix located in the extension folder.
- In VS Code open Extensions > ... > Install from VSIX... and pick the .vsix file.
- Reload VS Code when prompted. The Knowledge RAG output channel appears once the extension activates.
Run from source (for development)
- Install dependencies: cd extension && npm install
- Use npm run watch for incremental builds.
- From VS Code, run Run > Start Debugging (F5) to launch an Extension Development Host with Knowledge RAG preloaded.
Step-by-step usage
Follow these steps the first time you set up a workspace and any time your knowledge base changes.
1. Open the folder that contains your knowledge base. This is the root directory you want to index. Any .knowledge-rag folder created by previous runs can stay; re-runs are incremental.
2. Tell Knowledge RAG which folder to use.
   - Command Palette: Ctrl+Shift+P -> Knowledge RAG: Select Knowledge Base Folder.
   - Explorer: right-click a folder -> Knowledge RAG > Select Knowledge Base Folder.
   - The path is stored per-workspace (knowledgeRag.knowledgeBasePath), so you only have to do this once per project.
3. Let the extension prepare Python dependencies (first run only).
   - On activation, the extension looks for Python, creates extension/.venv, and installs everything in python/requirements.txt.
   - Watch the Knowledge RAG output channel for progress. Set knowledgeRag.pythonPath if you need a non-default interpreter. You can re-trigger installation at any time via Knowledge RAG: Start Knowledge Base Analysis (it validates the environment before running).
4. Analyze the knowledge base.
   - Command Palette -> Knowledge RAG: Start Knowledge Base Analysis.
   - A progress notification tracks ingestion. Supported file types are: txt, md, markdown, pdf, ppt, pptx, doc, docx, json, yaml, yml, xml, png, jpg, jpeg, gif, bmp.
   - Outputs land in <your-folder>/.knowledge-rag/ (embeddings.json, processing_tracker.json, analysis.log). The tracker automatically skips unchanged files; delete it to force a full rebuild.
5. Query the knowledge base inside VS Code.
   - Command Palette -> Knowledge RAG: Query Knowledge Base or press Ctrl+K Q (Cmd+K Q on macOS).
   - Enter a natural-language question. Results open in a Webview with similarity scores, previews, Open File buttons (which jump to the chunk location), and quick copy actions.
6. Use GitHub Copilot for richer answers (optional).
   - Command Palette -> Knowledge RAG: Query Knowledge Base with GitHub Copilot (Ctrl+K Shift+Q). The command bundles the top matches and streams the model's answer, keeping clickable citations.
   - Inside Copilot Chat you can also mention @knowledge-rag in any conversation. The participant runs the same retrieval pipeline and feeds the context back to Copilot.
7. Re-run analysis whenever source files change. The processing tracker ensures only modified files are re-embedded, so it is safe to trigger the analysis frequently.
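To illustrate why frequent re-runs are cheap, here is a sketch of hash-based change detection. It is an illustration under assumptions, not the extension's code: the real layout of processing_tracker.json is not documented here, so the tracker is modelled as a plain mapping from relative path to SHA-256 digest, and find_changed and file_digest are hypothetical names.

```python
import hashlib
from pathlib import Path

# File extensions taken from the supported-types list in this README.
SUPPORTED = {
    ".txt", ".md", ".markdown", ".pdf", ".ppt", ".pptx", ".doc", ".docx",
    ".json", ".yaml", ".yml", ".xml", ".png", ".jpg", ".jpeg", ".gif", ".bmp",
}


def file_digest(path):
    """SHA-256 of a file's bytes, read in blocks so large files stay cheap."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(65536), b""):
            h.update(block)
    return h.hexdigest()


def find_changed(root, tracker):
    """Return (changed_paths, new_tracker) for supported files under `root`.

    `tracker` maps relative POSIX paths to digests (assumed layout).
    Files inside .knowledge-rag are skipped so outputs are never re-ingested.
    """
    root = Path(root)
    changed, current = [], {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or ".knowledge-rag" in path.parts:
            continue
        if path.suffix.lower() not in SUPPORTED:
            continue
        rel = path.relative_to(root).as_posix()
        current[rel] = file_digest(path)
        if tracker.get(rel) != current[rel]:
            changed.append(rel)  # new file or modified content
    return changed, current
```

Passing an empty tracker marks every supported file as changed, which mirrors the documented behaviour of deleting the tracker to force a full rebuild.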
Commands & Keybindings
| Command | Description | Default keybinding |
| --- | --- | --- |
| Knowledge RAG: Select Knowledge Base Folder | Persist the folder that should be analyzed and queried. | n/a (Command Palette / Explorer context menu) |
| Knowledge RAG: Start Knowledge Base Analysis | Run the Python ingestion/embedding pipeline and update .knowledge-rag. | n/a |
| Knowledge RAG: Query Knowledge Base | Prompt for a question and show ranked results in a VS Code webview. | Ctrl+K Q / Cmd+K Q |
| Knowledge RAG: Query Knowledge Base with GitHub Copilot | Retrieve context then forward the prompt + sources to Copilot. | Ctrl+K Shift+Q / Cmd+K Shift+Q |
Settings
- knowledgeRag.knowledgeBasePath (string, workspace scope) - Set automatically via the select-folder command, but can be edited directly in Settings/settings.json.
- knowledgeRag.pythonPath (string, machine scope) - Override the interpreter used for dependency installation and script execution (e.g., C:\\Python311\\python.exe or /usr/bin/python3). Leave empty to let the extension resolve Python automatically.
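The fallback behaviour of knowledgeRag.pythonPath can be pictured as "use the override if set, otherwise search PATH". The sketch below is an assumption about that logic, not the extension's actual resolver: resolve_python is a hypothetical name, and the candidate order is a guess.

```python
import shutil

# Executable names tried in order when no override is configured.
# This order is an assumption; the extension's real lookup is not documented here.
CANDIDATES = ("python3", "python", "py")


def resolve_python(configured_path="", which=shutil.which):
    """Return the interpreter to use, or None when nothing can be found.

    `configured_path` stands in for the knowledgeRag.pythonPath setting.
    `which` is injectable so the lookup can be exercised without touching PATH.
    """
    override = configured_path.strip()
    if override:
        return override  # an explicit setting always wins
    for name in CANDIDATES:
        found = which(name)
        if found:
            return found
    return None
```

For example, resolve_python("/usr/bin/python3") returns the override unchanged, while resolve_python("") returns whatever the first matching name resolves to on the current PATH.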
Where your data lives
Inside the knowledge base folder the extension creates .knowledge-rag/ with:
- embeddings.json - Vector store with chunk metadata and embedding vectors.
- processing_tracker.json - Hashes + timestamps used to skip unchanged files.
- analysis.log - Detailed ingestion log you can review or attach to bug reports.
Delete this folder to reset the index, or export it alongside your documents to share an already-embedded knowledge base.
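To make the role of the vector store concrete, here is a sketch of ranking stored chunks against a query vector. It rests on assumptions: the record shape (a dict with "file" and "embedding" keys) is hypothetical because the real embeddings.json schema is not described here, and cosine similarity is a common choice for sentence-transformers embeddings rather than a confirmed detail of this extension.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, records, k=3):
    """Return the k best (score, record) pairs, highest similarity first.

    Each record is assumed to carry an "embedding" list (hypothetical schema).
    """
    scored = [(cosine(query_vec, r["embedding"]), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-dimensional vectors; all-MiniLM-L6-v2 produces 384-dimensional ones.
    store = [
        {"file": "a.md", "embedding": [1.0, 0.0]},
        {"file": "b.md", "embedding": [0.0, 1.0]},
        {"file": "c.md", "embedding": [1.0, 1.0]},
    ]
    for score, record in top_k([1.0, 0.0], store, k=2):
        print(round(score, 3), record["file"])
```

In a real query the query vector would come from the same embedding model that produced the stored vectors; mixing models makes the scores meaningless.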
Troubleshooting
- "Python not found." Install Python 3.8+ and/or set knowledgeRag.pythonPath. On Windows, ensure py.exe or python.exe is on PATH.
- Dependency install fails. Open the Knowledge RAG output channel for the pip error, make sure you have network access for the first install (only sentence-transformers dependencies need downloading), and confirm you have ~1 GB free disk space.
- Query returns no results. Verify you ran the analysis step and that the files you care about are not excluded. Check .knowledge-rag/embeddings.json to confirm embeddings exist.
- GitHub Copilot integration missing. Install/enable the official GitHub Copilot Chat extension and sign in. Without it the regular Query Knowledge Base command still works.
- OCR accuracy issues. Image ingestion relies on pytesseract. Install the system-level Tesseract OCR binary if you plan to embed screenshots or scans.
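Two of the checks above (is the Tesseract binary installed, and do embeddings exist) are easy to script. This is a small diagnostic sketch, not a tool shipped with the extension; diagnose is a hypothetical helper, and it only tests that embeddings.json exists and is non-empty without validating its contents.

```python
import shutil
from pathlib import Path


def diagnose(kb_folder, which=shutil.which):
    """Return a dict of boolean findings for a knowledge base folder.

    `which` is injectable so the PATH lookup can be simulated in tests.
    """
    embeddings = Path(kb_folder) / ".knowledge-rag" / "embeddings.json"
    return {
        # pytesseract is only a wrapper; it needs the `tesseract` executable.
        "tesseract_on_path": which("tesseract") is not None,
        # A missing or empty file suggests the analysis step has not run.
        "embeddings_present": embeddings.is_file() and embeddings.stat().st_size > 0,
    }


if __name__ == "__main__":
    for name, ok in diagnose(".").items():
        print(("ok      " if ok else "missing ") + name)
```

Run it from the knowledge base folder, or pass the folder path to diagnose directly.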
Development tips
- Run npm run lint or npm run typecheck before bundling.
- The extension entry point is src/extension.ts. Bundled output lives in dist/extension.js (via npm run bundle).
- Python scripts live under python/. Use python/DEPENDENCIES.md for vendor management details, and keep large models out of Git history unless they are already vendored (see python/model).
Happy querying! Let us know if you automate new workflows so we can document them here.