# RagStack

Semantic code search powered by your own AI. Index any project, search with natural language. Runs on your infrastructure — your code never leaves your network.

## Features
- Semantic Search — Ask questions in plain English, find relevant code across your entire codebase
- One-Click Indexing — Click "Index" to send your project to your RAG server. Files are chunked, embedded, and stored in pgvector
- Server Selector — Switch between Local, LAN, Cloud, or Custom server with one click from the sidebar
- Click to Navigate — Search results link directly to the matching code in your editor
- Right-Click Search — Select any text, right-click, "RagStack: Search Selected Text"
- Multi-Project — Index multiple projects under different names, search across all of them
- Remote Friendly — Works over LAN or internet. Index from any machine, query from anywhere
- Private & Self-Hosted — Your code stays on your server. Uses Ollama for embeddings (no OpenAI API needed)
## Quick Start

1. Install the extension
2. Click the RagStack icon in the sidebar (magnifying glass)
3. Click the server button at the top to select your AI server
4. Click **Index** to index your current workspace (the first index takes a few minutes)
5. Type a question and press Enter — results appear instantly
## Codeva — Terminal Mode (like Claude Code)

Just type `codeva` in any VS Code terminal:

```
codeva
```

Or via the Command Palette: `Ctrl+Shift+P` → "Codeva: AI Terminal"

This opens an interactive AI coding assistant — ask questions, search code, index projects, get AI-powered answers with codebase context. No separate install needed.
```
╔══════════════════════════════════════╗
║     Codeva — AI Coding Assistant     ║
║         powered by RagStack          ║
╚══════════════════════════════════════╝

codeva> how does the wallet page fetch real balance?

Searching codebase...
Found 6 relevant chunks
Thinking...

The wallet page fetches the real balance via...

── Sources ──
 72%  src/pages/Wallet.tsx
 68%  src/api/agent.ts
```
Terminal commands:

| Command | Description |
|---------|-------------|
| `/search <query>` | Search code only (instant, no LLM) |
| `/index` | Index current directory |
| `/stats` | Show index statistics |
| `/server local\|lan\|cloud` | Switch AI server |
| `/model general\|code` | Switch LLM model |
| `/read <file>` | Display a file |
| `/help` | Show all commands |
| `/quit` | Exit |
| any text | RAG search + AI answer |
## Server Configuration
Click the server button in the sidebar to choose your AI stack:
| Option | URL | When to use |
|--------|-----|-------------|
| Local | `http://localhost:30810` | RAG server running on your machine |
| LAN | `http://192.168.x.x:30810` | RAG server on another machine in your network |
| Cloud | `https://your-domain.com/rag` | RAG server exposed via reverse proxy (Traefik, Nginx) |
| Custom | Any URL | Your own endpoint |
### Network Examples

Same machine (development): `ragstack.serverUrl = http://localhost:30810`

Office LAN (shared AI server): `ragstack.serverUrl = http://192.168.1.109:30810`

Remote / VPN / Cloud (works from anywhere): `ragstack.serverUrl = https://ai.mycompany.com/rag`

Kubernetes with Traefik: `ragstack.serverUrl = https://ai.mycompany.com/rag`
(Traefik strips the `/rag` prefix and routes to the RAG server NodePort.)
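The Traefik prefix-stripping could be expressed with an `IngressRoute` plus a `stripPrefix` middleware, roughly as below. This is a sketch only: the resource names, namespace-free layout, `websecure` entry point, and service name `rag-server` are assumptions, not part of RagStack; the port 8100 follows the Docker example later in this README.

```yaml
# Sketch: route https://ai.mycompany.com/rag/* to the RAG server,
# stripping the /rag prefix before forwarding.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: rag-stripprefix
spec:
  stripPrefix:
    prefixes:
      - /rag
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: rag-server
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`ai.mycompany.com`) && PathPrefix(`/rag`)
      kind: Rule
      middlewares:
        - name: rag-stripprefix
      services:
        - name: rag-server   # assumed K8s Service name
          port: 8100         # container port from the Docker example
```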
## Requirements

RagStack requires a running RAG server with:

- Ollama — local LLM runtime with the `nomic-embed-text` model
- pgvector — PostgreSQL with the vector extension
- RAG Server — FastAPI server (included in the RagStack repo)
### Quick Setup (Docker)

```bash
# Pull the embedding model
ollama pull nomic-embed-text

# Start pgvector
docker run -d --name pgvector -p 5432:5432 \
  -e POSTGRES_PASSWORD=password \
  pgvector/pgvector:pg17

# Start the RAG server
docker run -d --name rag-server -p 30810:8100 \
  -e DB_HOST=host.docker.internal \
  -e DB_PASSWORD=password \
  -e OLLAMA_HOST=http://host.docker.internal:11434 \
  your-rag-server-image
```
### Kubernetes Setup
Deploy Ollama, pgvector, and the RAG server as K8s services. Expose the RAG server via NodePort (30810) or through an Ingress/Traefik route. See the RagStack repo for full K8s manifests.
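One possible shape for the NodePort exposure described above (the Service and selector names are assumptions; the container port 8100 follows the Docker example, and the full manifests live in the RagStack repo):

```yaml
# Sketch: expose the RAG server on NodePort 30810.
apiVersion: v1
kind: Service
metadata:
  name: rag-server
spec:
  type: NodePort
  selector:
    app: rag-server      # assumed pod label
  ports:
    - port: 8100         # in-cluster service port
      targetPort: 8100   # FastAPI container port
      nodePort: 30810    # matches the default ragstack.serverUrl port
```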
## Settings

| Setting | Default | Description |
|---------|---------|-------------|
| `ragstack.serverUrl` | `http://localhost:30810` | RAG server URL (use the sidebar button to change) |
| `ragstack.topK` | `8` | Number of search results |
| `ragstack.projectName` | (folder name) | Project name for grouping indexed files |
| `ragstack.indexExtensions` | `.py,.ts,.tsx,.js,.jsx,.scss,.css,.sql,.yaml,.yml,.md,.json,.toml,.html,.go,.rs,.java,.cs,.rb,.php,.sh` | File extensions to index |
| `ragstack.skipDirs` | `node_modules,.git,dist,build,__pycache__,.venv,venv,.next,coverage,.turbo,.cache` | Directories to skip during indexing |
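Put together, a workspace `.vscode/settings.json` overriding a couple of these defaults might look like this (the values are illustrative, not recommendations):

```jsonc
{
  // Point the extension at a shared LAN server instead of localhost
  "ragstack.serverUrl": "http://192.168.1.109:30810",
  // Return more results per query than the default 8
  "ragstack.topK": 12
}
```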
## Commands

| Command | Shortcut | Description |
|---------|----------|-------------|
| RagStack: Search Code | `Ctrl+Shift+P` | Open search prompt |
| RagStack: Search Selected Text | Right-click menu | Search with selected text |
| RagStack: Index Project | Sidebar "Index" button | Index a project folder |
| RagStack: Clear Project Index | `Ctrl+Shift+P` | Delete all chunks for a project |
| Codeva: AI Terminal | `Ctrl+Shift+P` or type `codeva` | Open interactive AI terminal (like Claude Code) |
## How It Works

```
Your Code (VS Code)                       Your AI Server
┌──────────────┐      HTTPS/HTTP     ┌─────────────────┐
│ Index:       │ ──────────────────> │   RAG Server    │
│  file chunks │     POST /index     │    (FastAPI)    │
│              │                     │        │        │
│ Search:      │ ──────────────────> │        v        │
│  "how does   │    POST /retrieve   │     Ollama      │
│   auth work" │                     │   nomic-embed   │
│              │ <────────────────── │        │        │
│ Results:     │    JSON response    │        v        │
│  score + code│                     │    pgvector     │
└──────────────┘                     │ (cosine search) │
                                     └─────────────────┘
```
- Index: Each file is chunked (~800 chars), embedded into 768-dim vectors via Ollama's `nomic-embed-text`, and stored in pgvector. Unchanged chunks are skipped automatically on re-index.
- Search: Your query is embedded the same way, then pgvector finds the most similar code chunks using cosine similarity.
- Navigate: Click a result header to open the file and jump to the matching code in your editor.
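The index/search flow above can be sketched in a few lines of Python. This is an illustration of the mechanism (fixed-size ~800-char chunking plus cosine-similarity ranking), not the server's actual code: the real pipeline embeds with `nomic-embed-text` and ranks inside pgvector, while here a toy bag-of-words vector stands in for the embedding.

```python
import math
from collections import Counter

def chunk(text: str, size: int = 800) -> list[str]:
    """Split a file into fixed-size chunks, as the indexer does (~800 chars)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    """Toy stand-in for nomic-embed-text: a sparse bag-of-words 'vector'."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity, the metric pgvector uses to rank chunks."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, chunks: list[str], top_k: int = 8) -> list[str]:
    """Embed the query, then return the top_k most similar chunks."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]

docs = [
    "the wallet page fetches the balance from the backend api",
    "user login and auth token handling",
]
print(search("how does the wallet page fetch the balance", docs, top_k=1))
# → ['the wallet page fetches the balance from the backend api']
```

Because only the query is embedded at search time (each chunk's vector is precomputed at index time), search stays fast even on large codebases.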
## Supported Languages

TypeScript, JavaScript, Python, Go, Rust, Java, C#, Ruby, PHP, Shell, SQL, SCSS, CSS, HTML, YAML, JSON, TOML, Markdown — and any other text-based file via custom `ragstack.indexExtensions`.
## Support
## License
MIT