# RAGKnight

Self-contained semantic search and RAG for your codebase — no external dependencies.

RAGKnight brings local Retrieval-Augmented Generation (RAG) directly into VS Code. Index your workspace, search with semantic, hybrid, or keyword matching, and ask questions about your code — all powered by local models that run on your machine.

## Features

### 🔍 Three Search Modes

- **Semantic**: embedding-based vector similarity
- **Hybrid**: combines semantic and BM25 keyword ranking (the default)
- **Keyword**: BM25 full-text matching
### 💬 Chat Participant (`@rag`)

| Command | Description |
|---|---|
| `/search` | Search across your indexed codebase |
| `/ask` | Ask a question — retrieves context and generates an answer |
| `/index` | Index the current workspace |
| `/index-common` | Index a directory into the shared common knowledge base |
| `/status` | Show index status |
| `/learn` | Record a learning to cumulative knowledge |
| `/knowledge` | View all cumulative knowledge entries |
| `/clear` | Clear the workspace index |
### 🧠 Agentic Query Planning

Complex questions are automatically decomposed into sub-queries, executed in parallel, and merged for comprehensive answers. Uses Copilot for intelligent planning with a heuristic fallback.
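As a rough illustration (not the extension's actual code), a heuristic fallback planner might split a compound question on conjunctions and fan the sub-queries out in parallel:

```python
import concurrent.futures

def decompose(question: str) -> list[str]:
    """Heuristic fallback: split a compound question into sub-queries
    on common conjunctions. Purely illustrative; RAGKnight's real
    planner uses Copilot when it is available."""
    separators = [" and ", " versus ", "; "]
    parts = [question]
    for sep in separators:
        parts = [piece for chunk in parts for piece in chunk.split(sep)]
    return [p.strip().rstrip("?") + "?" for p in parts if p.strip()]

def run_plan(question: str, search) -> list:
    """Execute sub-queries in parallel, then merge results, dropping
    duplicates while preserving rank order."""
    sub_queries = decompose(question)
    with concurrent.futures.ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(search, sub_queries))
    merged, seen = [], set()
    for results in result_lists:
        for result in results:
            if result not in seen:
                seen.add(result)
                merged.append(result)
    return merged
```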
### 🔌 Pluggable Backends

- Generation: Copilot LM API or local Ollama (auto-fallback)
- Embeddings: Local sentence-transformers (all-MiniLM-L6-v2) or VS Code LM API
### 📦 Self-Contained

Everything installs automatically on first launch:
- Python backend with sentence-transformers
- Ollama for local LLM generation
- No API keys required — works fully offline
### 📚 Cumulative Knowledge

Record learnings that persist across sessions and get injected into every RAG prompt automatically — the system gets smarter the more you use it.
## Getting Started

- Install the extension
- Open a workspace and run RAGKnight: Setup from the Command Palette (or accept the automatic setup prompt)
- Run RAGKnight: Index Workspace to index your code
- Use `@rag /search your query` or `@rag /ask your question` in Copilot Chat
## Commands

| Command | Description |
|---|---|
| `RAGKnight: Setup` | One-time setup — installs Python backend and Ollama |
| `RAGKnight: Search Codebase` | Search your indexed code |
| `RAGKnight: Ask Question` | Ask a question with RAG-powered answers |
| `RAGKnight: Index Workspace` | Index the current workspace |
| `RAGKnight: Index Directory to Common` | Add a directory to shared knowledge |
| `RAGKnight: Show Status` | View index statistics |
| `RAGKnight: Pull Ollama Model` | Download an Ollama model |
| `RAGKnight: Change LLM Model` | Switch the Ollama generation model |
| `RAGKnight: Change Embedding Model` | Switch the sentence-transformers model |
| `RAGKnight: Select Copilot Model` | Pick from available Copilot models |
| `RAGKnight: Record Learning` | Add to cumulative knowledge |
| `RAGKnight: Show Knowledge` | View cumulative knowledge |
| `RAGKnight: Clear Workspace Index` | Clear the workspace index |
| `RAGKnight: Clear Common Index` | Clear the shared common index |
## Settings

| Setting | Default | Description |
|---|---|---|
| `ragknight.searchMode` | `hybrid` | Search mode: `semantic`, `hybrid`, or `bm25` |
| `ragknight.scope` | `all` | Search scope: `all`, `workspace`, or `common` |
| `ragknight.topK` | `10` | Number of search results |
| `ragknight.generationBackend` | `ollama` | LLM backend: `auto`, `copilot`, or `ollama` |
| `ragknight.embeddingBackend` | `local` | Embedding backend: `auto`, `copilot`, or `local` |
| `ragknight.ollamaModel` | `llama3.2` | Ollama model for generation |
| `ragknight.embeddingModel` | `all-MiniLM-L6-v2` | Sentence-transformers model |
| `ragknight.agenticMode` | `true` | Enable agentic query decomposition |
| `ragknight.chunkSize` | `512` | Characters per text chunk when indexing |
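For example, a `settings.json` entry that switches to keyword-only search with more results (the values below are illustrative, not recommendations):

```jsonc
{
  // Keyword-only (BM25) search across both workspace and common indexes
  "ragknight.searchMode": "bm25",
  "ragknight.scope": "all",
  "ragknight.topK": 20,
  // Generate answers locally via Ollama
  "ragknight.generationBackend": "ollama"
}
```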
## Requirements

- VS Code 1.93+
- Python 3.10+ must be available on your system PATH
- GitHub Copilot (optional) — enables Copilot LM API for generation and embeddings
## Architecture

RAGKnight uses a local Python backend with:
- LanceDB for vector storage (embedded, zero-config)
- sentence-transformers for embeddings (all-MiniLM-L6-v2, ~80MB, downloads on first use)
- Ollama for local LLM generation (bundled, auto-managed)
- BM25 for keyword search (pure Python, no dependencies)
Per-workspace indexes ensure each project has its own search space, while a shared common index lets you add reference material accessible from anywhere.
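For intuition, hybrid search can merge the BM25 and vector rankings with something like reciprocal rank fusion — a common fusion technique, used here as a sketch since RAGKnight's exact merging method isn't specified in this README:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked result lists (e.g. BM25 and vector search)
    into one: each document scores sum(1 / (k + rank)) across the lists
    it appears in, so items ranked well by either retriever rise to the top."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from the two retrievers:
bm25_hits = ["auth.py", "db.py", "utils.py"]
vector_hits = ["db.py", "main.py", "auth.py"]
```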
## License

MIT