MCP Doc Search


Semantic documentation search for any monorepo.

Large repos often contain hundreds or thousands of markdown documentation files. This extension helps developers manage them through precise retrieval: find and include only the relevant sections you need, dramatically reducing context bloat and token usage in AI assistant conversations.

  • VS Code extension: type-ahead search in the command palette, auto-reindex on save, status bar indicator
  • MCP server: search_docs, list_docs, reindex_docs, get, multi_get, plus per-file set_context / list_contexts / remove_context tools so any MCP-compatible AI assistant can find and read the right document in a single call
  • Local embeddings: auto-downloads all-MiniLM-L6-v2 (ONNX, 22MB) on first use, then works fully offline — no API key required
  • Heading-aware chunking: splits markdown on #/## boundaries, skips code fences, prepends document title as breadcrumb context
  • Hybrid search: vector similarity + keyword re-ranking (+0.03 per matching term, camelCase-aware)
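
The hybrid re-ranking step can be sketched as follows. Names here are illustrative, not the extension's actual API; the only facts taken from this README are the +0.03 per-term bonus and camelCase awareness:

```typescript
// Sketch of hybrid re-ranking: vector similarity plus a flat
// +0.03 bonus per distinct query term found in the chunk.
// Function names are hypothetical; the real logic lives in searcher.ts.

// Split identifiers like "getUserProfile" into ["get", "user", "profile"].
function splitCamelCase(text: string): string[] {
  return text
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .toLowerCase()
    .split(/\W+/)
    .filter(Boolean);
}

function hybridScore(vectorScore: number, query: string, chunk: string): number {
  const chunkTerms = new Set(splitCamelCase(chunk));
  const queryTerms = new Set(splitCamelCase(query));
  let matched = 0;
  queryTerms.forEach((term) => {
    if (chunkTerms.has(term)) matched++;
  });
  return vectorScore + 0.03 * matched; // +0.03 per matching term
}
```

Because both sides are split on camelCase boundaries, a query like `userProfile` still matches a chunk mentioning `getUserProfile`.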

Quick start

Install the VS Code extension

Install from the VS Code Marketplace:

code --install-extension de-otio.mcp-doc-search

Or grab a per-platform VSIX from the latest GitHub Release:

code --install-extension mcp-doc-search-<target>-<version>.vsix

Configure for your repo

Open VS Code settings and set:

| Setting | Default | Description |
| --- | --- | --- |
| `docSearch.docGlob` | `doc/**/*.md` | Glob pattern for docs to index |
| `docSearch.indexDir` | `.doc-search-index` | Where to store the vector index (auto-added to `.gitignore`) |
| `docSearch.headingDepth` | `2` | Split on `#` only (1) or `#` and `##` (2) |
| `docSearch.embedProvider` | `local` | `local`, `ollama`, or `openai` |
| `docSearch.autoReindex` | `true` | Auto-reindex on file save |
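
In your workspace `settings.json`, these might look like the following (values shown are the defaults):

```json
{
  "docSearch.docGlob": "doc/**/*.md",
  "docSearch.indexDir": ".doc-search-index",
  "docSearch.headingDepth": 2,
  "docSearch.embedProvider": "local",
  "docSearch.autoReindex": true
}
```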

Use it

  • Cmd+Shift+P → "Doc Search: Reindex Documentation" — build the initial index (takes ~30s for large repos)
  • Cmd+Shift+P → "Doc Search: Search Documentation" — type-ahead semantic search, click a result to open it
  • Cmd+Shift+P → "Doc Search: Generate .mcp.json" — creates .mcp.json so any MCP client can use the same index

Understanding scores

Each result includes a score (0–1) computed from vector similarity plus keyword re-ranking:

| Score | Meaning |
| --- | --- |
| 0.8–1.0 | Highly relevant |
| 0.5–0.8 | Moderately relevant |
| 0.2–0.5 | Somewhat relevant |
| 0.0–0.2 | Low relevance |

Pass explain: true to search_docs to get a detailed breakdown:

  • vectorScore — raw cosine similarity from embeddings
  • keywordTermsMatched — query terms found in the chunk
  • keywordBonus — boost applied (+0.03 per matching term)
  • finalScore — combined score (same as score)
  • rank — position in result list (1-indexed)
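
As a sketch, the breakdown for one result of a two-term query might look like this (values are illustrative; note that `keywordBonus` is 0.03 × 2 matched terms, and `finalScore` = `vectorScore` + `keywordBonus`):

```json
{
  "vectorScore": 0.71,
  "keywordTermsMatched": ["authentication", "flow"],
  "keywordBonus": 0.06,
  "finalScore": 0.77,
  "rank": 1
}
```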

MCP integration

After running "Generate .mcp.json", connect any MCP-compatible client (Claude Code, Cursor, etc.). The MCP tools appear automatically:

search_docs("authentication flow")            → semantic search
search_docs("authentication", explain=true)   → same, with per-result score breakdown
list_docs()                                   → list every indexed file
get("doc/api.md")                             → read one file (full text)
multi_get("doc/**/auth*.md")                  → read many files in one call
reindex_docs(force=true)                      → full rebuild

# Per-file context notes the indexer carries alongside chunks
set_context("doc/api.md", "primary API reference")
list_contexts()
remove_context("doc/api.md")

Embedding providers

| Provider | Quality | Setup | Cost |
| --- | --- | --- | --- |
| `local` (default) | Good (384-dim) | None — ships with extension | Free |
| `ollama` | Better (768-dim) | `brew install ollama && ollama pull nomic-embed-text` | Free |
| `openai` | Best (1536-dim) | Set `docSearch.openaiApiKey` | ~$0.02/M tokens |

CLI

A standalone CLI is included — no MCP client required.

# Semantic search
mcp-doc-search search "authentication flow" --n 5
mcp-doc-search search "map view feed" --files          # one path per line
mcp-doc-search search "query" --min-score 0.7 --json   # JSON output

# Browse the index
mcp-doc-search list
mcp-doc-search list --json

# Rebuild the index
mcp-doc-search reindex
mcp-doc-search reindex --force   # re-embed every file

# Read files from the workspace
mcp-doc-search get doc/api.md
mcp-doc-search get doc/api.md --from-line 20 --max-lines 50

# Read multiple files (glob or comma list)
mcp-doc-search multi-get "doc/**/*.md" --files         # list matched paths
mcp-doc-search multi-get "doc/a.md,doc/b.md" --json

# Index health
mcp-doc-search status
mcp-doc-search status --json

# Per-file context notes carried alongside the index
mcp-doc-search context add doc/api.md "primary API reference"
mcp-doc-search context list
mcp-doc-search context remove doc/api.md

Flags: --json (machine-readable output), --files (paths only, for search/multi-get), --explain (score breakdown for search).

Environment: same as the MCP server — DOC_SEARCH_WORKSPACE, DOC_SEARCH_GLOB, DOC_SEARCH_INDEX_DIR, USE_OPENAI=1, OLLAMA_URL.

Exit codes: 0 = success, 1 = user error (bad args / missing file), 2 = engine error.

HTTP daemon mode

By default, each MCP client spawns the server as a short-lived stdio subprocess. The embed model takes ~1–2 s to load on cold start. Running a long-lived HTTP daemon amortises that cost across all clients.

Start the daemon

# One-shot foreground (useful for smoke-testing)
node dist/mcp-server.js --http --port 8181

# Detached daemon (parent exits, child runs in background)
node dist/mcp-server.js --http --port 8181 --daemon
# → MCP daemon started (PID: 12345, port: 8181)

# Verify it's up
curl http://localhost:8181/health
# → {"status":"ok","uptime":3.1}

Stop the daemon

node dist/mcp-server.js --stop
# → stopped (PID: 12345)

Point Claude Code at the HTTP endpoint

Edit your .mcp.json (or ~/.claude.json) to use the http transport:

{
  "mcpServers": {
    "doc-search": {
      "type": "http",
      "url": "http://localhost:8181/mcp"
    }
  }
}

For comparison, the default stdio transport spawns a new server process per client:

{
  "mcpServers": {
    "doc-search": {
      "type": "stdio",
      "command": "node",
      "args": ["/path/to/dist/mcp-server.js"],
      "env": { "DOC_SEARCH_WORKSPACE": "/path/to/your/repo" }
    }
  }
}

Idle model disposal

After 5 minutes of inactivity, the daemon automatically releases the embed pipeline from memory. The next request transparently reloads it (~1 s penalty), then stays fast again.
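
The pattern can be sketched as a small timer-reset wrapper. This is an illustration, not the daemon's actual internals; only the reset-on-request and release-on-idle behavior comes from the description above:

```typescript
// Sketch of idle-based disposal: each request pushes back a deadline;
// if the deadline fires, the loaded resource is released and lazily
// reloaded on the next request. Names are hypothetical.
class IdleDisposer<T> {
  private resource: T | null = null;
  private timer: ReturnType<typeof setTimeout> | null = null;

  constructor(
    private load: () => T,
    private idleMs: number,
  ) {}

  // Get the resource, loading it if necessary, and reset the idle timer.
  acquire(): T {
    if (this.resource === null) this.resource = this.load(); // cold start
    if (this.timer !== null) clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      this.resource = null; // release from memory after idleMs of inactivity
      this.timer = null;
    }, this.idleMs);
    return this.resource;
  }

  get loaded(): boolean {
    return this.resource !== null;
  }
}
```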

Development

npm install
npm run build       # bundle extension.js + mcp-server.js
npm test            # unit tests
npm run test:coverage  # coverage report
npm run package     # build .vsix for current platform

Platform-specific builds

LanceDB ships native binaries. Build for each platform:

npm run package:darwin-arm  # macOS Apple Silicon
npm run package:darwin-x64  # macOS Intel
npm run package:linux-x64   # Linux
npm run package:win-x64     # Windows

Architecture

src/
  core/          # Shared engine (no VS Code or MCP deps)
    types.ts     # DocChunk, SearchResult, EmbedProvider interfaces
    chunker.ts   # Markdown heading-aware chunking with fence detection
    embedder.ts  # LocalEmbedder, OllamaEmbedder, OpenAIEmbedder
    vectorstore.ts  # LanceDB wrapper (file-backed, cosine metric)
    searcher.ts  # Hybrid search: vector + keyword re-ranking
    indexer.ts   # Crawl, chunk, embed, upsert with mtime cache
  extension/     # VS Code extension shell
  mcp/           # MCP server: stdio + HTTP daemon transports
bin/             # Standalone CLI entry point
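
The mtime cache mentioned for indexer.ts can be sketched like this (names are hypothetical; the idea is simply that a file is re-chunked and re-embedded only when its on-disk modification time is newer than the cached one):

```typescript
// Illustrative sketch of an mtime cache for incremental reindexing.
type MtimeCache = Map<string, number>; // path -> last-indexed mtime (ms)

function filesNeedingReindex(
  cache: MtimeCache,
  files: Array<{ path: string; mtimeMs: number }>,
): string[] {
  return files
    .filter((f) => (cache.get(f.path) ?? -Infinity) < f.mtimeMs) // unseen or modified
    .map((f) => f.path);
}
```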

Three build outputs:

  • dist/extension.js — VS Code extension host
  • dist/mcp-server.js — standalone Node.js MCP server (stdio / HTTP daemon)
  • dist/mcp-doc-search.js — standalone CLI binary

Contributing

Contributions are welcome — see CONTRIBUTING.md for setup, test, and PR conventions. By participating you agree to abide by the Code of Conduct.

Security

If you believe you've found a security issue, please follow the disclosure process in SECURITY.md. Do not open a public GitHub issue for suspected vulnerabilities.

License

MIT — see LICENSE.
