LoreGraph

Evidence-backed repository intelligence for VS Code.

LoreGraph extracts the business concepts, configuration dependencies, and code relationships hidden inside complex enterprise codebases, builds them into a knowledge graph, and turns that graph into an interactive visual map and Docusaurus documentation — without ever inventing a fact.

Core thesis: No evidence, no documentation claim. LoreGraph is not a generic AI doc generator. Every entity, relationship, and sentence it produces links back to a specific file and line in your repository.

The problem: tribal knowledge

Large enterprise systems bury years of undocumented knowledge in config files, environment variables, feature flags, Spring classes, Kafka topics, database schemas, comments, acronyms, and half-stale docs. New engineers can't tell what a business term means, where it's implemented, which configs control it, or which docs to trust.

LoreGraph makes that knowledge explicit and auditable.

How it works

Repository Scanner → Config Parser → Code Usage Resolver → Domain Term Extractor
→ Annotation Extractor → Relationship Builder → Knowledge Graph → Confidence Scorer
→ Visual Graph Builder → Documentation Generator → Drift Detector

Scan the workspace (skipping node_modules, target, .env, secrets, etc.).
Parse code with a tree-sitter AST. Java and TypeScript/JS/JSX get framework-aware extraction (Spring, Nest, Kafka, REST); Python, Go, C, and C++ get structural extraction (classes/structs/interfaces, functions, inheritance, references); config, SQL, and Markdown files use lightweight parsers. Symbol extraction is scope-aware and import-resolved, and symbols are never inferred from comments or strings.
Build a knowledge graph where code symbols are identified by file + fully-qualified name (so same-named classes don't collapse) and every node and edge carries SourceRef evidence.
Score confidence: human annotations = high, multi-source = boosted, naming-only = low.
Trace multi-hop flows by traversing the graph (event: producer → topic → consumer; request: endpoint → controller → service → table).
Visualize the graph inside VS Code (React + React Flow) with an edge-aware layered layout (dagre).
Generate Docusaurus docs from graph-backed claims only, with C4-style Mermaid + sequence diagrams.
Flag missing/uncertain knowledge as explicit open questions.
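
The graph-building and flow-tracing steps above can be sketched roughly as follows. This is a minimal illustration, not LoreGraph's actual code: the `GraphNode` and `GraphEdge` shapes, the `symbolId` and `traceFlows` helpers, and the example paths and class names are all assumptions made for the example. Only the idea of file + fully-qualified-name identity and `SourceRef` evidence comes from the description above.

```typescript
// Hypothetical shapes; LoreGraph's real types are not shown in this README.
interface SourceRef {
  file: string; // repository-relative path
  line: number; // 1-based line number
}

interface GraphNode {
  id: string;
  label: string;
  evidence: SourceRef[]; // every node carries evidence
}

interface GraphEdge {
  from: string;
  to: string;
  relation: string;
  evidence: SourceRef[]; // every edge carries evidence
}

// Identify a code symbol by file + fully-qualified name, so two classes that
// share a name but live in different files do not collapse into one node.
function symbolId(file: string, fqn: string): string {
  return file + "#" + fqn;
}

// Depth-first traversal from `startId`. Returns every simple path (no node
// repeated) that ends at a node with no unvisited successors.
function traceFlows(edges: GraphEdge[], startId: string): string[][] {
  const paths: string[][] = [];
  const walk = (current: string, path: string[]): void => {
    const next = edges.filter(e => e.from === current && path.indexOf(e.to) === -1);
    if (next.length === 0) {
      paths.push(path);
      return;
    }
    for (const edge of next) {
      walk(edge.to, path.concat(edge.to));
    }
  };
  walk(startId, [startId]);
  return paths;
}

// Usage: an event flow of the form producer -> topic -> consumer.
const producerId = symbolId("billing/TradePublisher.java", "com.example.billing.TradePublisher");
const consumerId = symbolId("recon/TradeListener.java", "com.example.recon.TradeListener");
const topicId = "topic:trades";
const flowEdges: GraphEdge[] = [
  { from: producerId, to: topicId, relation: "publishes", evidence: [{ file: "billing/TradePublisher.java", line: 42 }] },
  { from: topicId, to: consumerId, relation: "consumes", evidence: [{ file: "recon/TradeListener.java", line: 17 }] },
];
const flows = traceFlows(flowEdges, producerId);
```

Keeping the file path in the identifier is what lets two classes with the same name coexist in the graph; the traversal only ever follows edges that already carry evidence.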

Anti-hallucination design

Documentation claims are generated only from extracted graph evidence.
Inferred meanings are hedged ("appears to", "is likely related to").
Insufficient evidence produces an open question, never an invented answer.
Every generated page has an Evidence section and a confidence badge.
Existing docs or comments and multiple independent sources raise confidence; evidence from naming alone stays low.
Human @lore annotations produce high-confidence, verified claims.
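
As a rough sketch of how these rules might combine, the example below scores a claim from its evidence and picks a hedged phrasing to match. The evidence kinds, the three-level scale, the "medium" tier, and the exact thresholds are assumptions for illustration; the README only states that annotations are high, multiple sources boost confidence, and naming alone stays low.

```typescript
type Confidence = "high" | "medium" | "low";

// Hypothetical evidence kinds; LoreGraph's actual categories may differ.
type EvidenceKind = "annotation" | "doc" | "code" | "config" | "naming";

interface Evidence {
  kind: EvidenceKind;
  file: string;
  line: number;
}

// Assumed policy: a human annotation is always high; two or more distinct
// non-naming kinds count as multi-source and are boosted to high; a single
// non-naming kind is medium; naming-only evidence stays low.
function scoreConfidence(evidence: Evidence[]): Confidence {
  if (evidence.some(e => e.kind === "annotation")) return "high";
  const kinds: EvidenceKind[] = [];
  for (const e of evidence) {
    if (e.kind !== "naming" && kinds.indexOf(e.kind) === -1) kinds.push(e.kind);
  }
  if (kinds.length >= 2) return "high";
  if (kinds.length === 1) return "medium";
  return "low";
}

// State well-supported claims plainly, hedge inferred ones, and emit an open
// question (never an invented answer) when there is no evidence at all.
function renderClaim(subject: string, target: string, evidence: Evidence[]): string {
  if (evidence.length === 0) {
    return "Open question: how does " + subject + " relate to " + target + "?";
  }
  const confidence = scoreConfidence(evidence);
  const phrase =
    confidence === "high" ? "relates to" :
    confidence === "medium" ? "appears to relate to" :
    "is likely related to";
  return subject + " " + phrase + " " + target + " [confidence: " + confidence + "]";
}

// Usage: a config key linked to a business term by its name only.
const namingOnly: Evidence[] = [{ kind: "naming", file: "application.yml", line: 12 }];
const hedged = renderClaim("recon.retry.max", "Reconciliation", namingOnly);
```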

Human-in-the-loop annotations

Drop @lore comments anywhere (//, #, *, --):

// @lore.term Reconciliation
// @lore.purpose Compares trade records across systems to identify breaks.
// @lore.owner Trade Operations Platform
// @lore.risk Changing retry settings may delay end-of-day break detection.

These become verified, high-confidence graph claims.
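
For illustration, annotations of this form could be picked up with a line-based pattern like the one below. This is a sketch under assumptions, not LoreGraph's extractor: the `LoreAnnotation` shape, the `parseLoreAnnotations` helper, and the exact grammar (whole-line comments only, alphabetic keys) are hypothetical.

```typescript
interface LoreAnnotation {
  key: string;   // e.g. "term", "purpose", "owner", "risk"
  value: string; // text after the key
  line: number;  // 1-based line where the annotation was found
}

// Matches a whole-line comment whose marker (//, #, *, or --) is followed by
// "@lore.<key> <value>". Trailing comments after code are not matched here;
// the grammar LoreGraph actually accepts is an assumption.
const LORE_PATTERN = /^\s*(?:\/\/|#|\*|--)\s*@lore\.([A-Za-z]+)\s+(.+?)\s*$/;

function parseLoreAnnotations(source: string): LoreAnnotation[] {
  const annotations: LoreAnnotation[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    const match = LORE_PATTERN.exec(text);
    if (match) {
      annotations.push({ key: match[1], value: match[2], line: index + 1 });
    }
  });
  return annotations;
}

// Usage: mixed comment styles in one snippet.
const sample = [
  "// @lore.term Reconciliation",
  "// @lore.purpose Compares trade records across systems to identify breaks.",
  "public class ReconciliationService {}",
  "-- @lore.owner Trade Operations Platform",
].join("\n");
const found = parseLoreAnnotations(sample);
```

Recording the line number alongside each annotation is what would let the resulting claim link back to its exact location in the repository.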

Visual Graph Explorer

Four modes, one interactive canvas:

Concept Graph — Obsidian-style map of business terms ↔ configs, services, topics, tables, docs.
Config Dependency Map — where each config key is defined, read, and documented.
Architecture Map — simplified C4 view: API → services → messaging → data & config.
Flow Map — directional event flows from Kafka producers to consumers.

Search the graph, filter by entity type (click the legend) or by confidence, focus a node, click evidence to jump to its source, and inspect any node's claims, related entities, and open questions.

Generated documentation

The LoreGraph: Generate Documentation Site command writes a ready-to-build Docusaurus project:

loregraph-docs/
  docs/
    intro.md
    onboarding/{start-here,knowledge-map}.md
    business-terms/{glossary.md, <term>.md ...}
    configuration/{overview,config-keys,feature-flags,environment-variables}.md
    architecture/{overview,service-map,concept-map}.md
    flows/{inferred-flows,kafka-flows}.md
    operations/{open-questions,docs-drift-report}.md
  docusaurus.config.ts
  sidebars.ts
  package.json

Every page includes title, status, confidence, summary, evidence-backed claims, related concepts/configs/code, open questions, and a "Needs Human Review" section where applicable.
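
A page with these parts might be assembled along the following lines. This is a simplified sketch: the `DocPage` and `PageClaim` shapes, the `renderPage` helper, the section headings, and the example file path are assumptions, and the real generator's output layout may differ.

```typescript
// Hypothetical page model covering a subset of the fields listed above.
interface PageClaim {
  text: string;
  file: string;
  line: number;
}

interface DocPage {
  title: string;
  status: "verified" | "inferred";
  confidence: "high" | "medium" | "low";
  summary: string;
  claims: PageClaim[];
  openQuestions: string[];
}

// Render a Markdown page with a title in the front matter, a status and
// confidence line, and an Evidence section citing file:line for each claim.
function renderPage(page: DocPage): string {
  const lines: string[] = [
    "---",
    "title: " + JSON.stringify(page.title), // JSON string is a valid YAML scalar
    "---",
    "",
    "**Status:** " + page.status + " | **Confidence:** " + page.confidence,
    "",
    page.summary,
    "",
    "## Evidence",
    "",
  ];
  for (const claim of page.claims) {
    lines.push("- " + claim.text + " (`" + claim.file + ":" + claim.line + "`)");
  }
  if (page.openQuestions.length > 0) {
    lines.push("", "## Open questions", "");
    for (const question of page.openQuestions) {
      lines.push("- " + question);
    }
  }
  return lines.join("\n") + "\n";
}

// Usage: a term page backed by one annotation-derived claim.
const examplePage: DocPage = {
  title: "Reconciliation",
  status: "verified",
  confidence: "high",
  summary: "Compares trade records across systems to identify breaks.",
  claims: [{ text: "Purpose stated by @lore.purpose", file: "src/ReconciliationService.java", line: 2 }],
  openQuestions: ["Which team owns the retry settings?"],
};
const rendered = renderPage(examplePage);
```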

Commands (Command Palette → "LoreGraph:")

Command	What it does
Analyze Repository	Scans, extracts, builds the graph, updates the sidebar
Open Graph Explorer	Opens the interactive visual graph
Generate Documentation Site	Generates the full Docusaurus site
Generate Business Glossary	Generates business term pages only
Explain Current File	Shows concepts/configs/evidence for the active file
Focus Current Symbol in Graph	Locates the symbol under the cursor in the graph
Detect Documentation Drift	Reports potentially stale/undocumented knowledge
Export Knowledge Graph	Exports the graph as JSON

Configuration

Setting	Default	Description
`loregraph.scan.maxFiles`	`5000`	Maximum number of files to scan.
`loregraph.scan.maxFileSizeKb`	`1024`	Skip files larger than this size (KB).
`loregraph.docs.outputDir`	`loregraph-docs`	Output directory for generated documentation.
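
To make the effect of the two scan limits concrete, here is a small sketch of how they might gate file selection. The defaults are taken from the table above; the `ScanSettings` and `FileInfo` shapes, the `selectFilesToScan` helper, the skip list, and the assumption that 1 KB means 1024 bytes are illustrative only.

```typescript
interface ScanSettings {
  maxFiles: number;
  maxFileSizeKb: number;
}

// Defaults copied from the settings table.
const DEFAULT_SCAN_SETTINGS: ScanSettings = { maxFiles: 5000, maxFileSizeKb: 1024 };

interface FileInfo {
  path: string;      // repository-relative, "/"-separated
  sizeBytes: number;
}

// Hypothetical skip list based on the examples named under "How it works".
const SKIPPED_SEGMENTS = ["node_modules", "target", ".env"];

// Drop skipped paths and oversized files, then cap the result at maxFiles.
// Assumes 1 KB = 1024 bytes.
function selectFilesToScan(files: FileInfo[], overrides: Partial<ScanSettings> = {}): FileInfo[] {
  const settings: ScanSettings = { ...DEFAULT_SCAN_SETTINGS, ...overrides };
  const eligible = files.filter(f => {
    const skipped = f.path.split("/").some(segment => SKIPPED_SEGMENTS.indexOf(segment) !== -1);
    return !skipped && f.sizeBytes <= settings.maxFileSizeKb * 1024;
  });
  return eligible.slice(0, settings.maxFiles);
}

// Usage: only the two source files pass with default settings.
const candidates: FileInfo[] = [
  { path: "node_modules/lib/index.js", sizeBytes: 10 },
  { path: "src/A.java", sizeBytes: 2048 },
  { path: "db/dump.sql", sizeBytes: 2 * 1024 * 1024 },
  { path: ".env", sizeBytes: 5 },
  { path: "src/B.ts", sizeBytes: 100 },
];
const selected = selectFilesToScan(candidates).map(f => f.path);
```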

Local-first, no external API

LoreGraph runs entirely offline. The LLMProvider interface exists for future AI providers, but the default LocalTemplateProvider generates all documentation deterministically from the grounded EvidenceBackedPrompt — which, by contract, only ever exposes graph evidence, never raw source.
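
The contract described here could look something like the sketch below. The names `LLMProvider`, `LocalTemplateProvider`, and `EvidenceBackedPrompt` come from the paragraph above, but their fields and method signatures are assumptions; in particular, `generate` is synchronous here for simplicity, whereas a provider that calls a remote model would likely be asynchronous.

```typescript
// Hypothetical field layout; only the type names appear in this README.
interface EvidenceItem {
  claim: string;
  file: string;
  line: number;
}

// The prompt exposes graph evidence (claims plus their locations) only.
// It has no field that could carry raw source text.
interface EvidenceBackedPrompt {
  subject: string;
  evidence: EvidenceItem[];
}

interface LLMProvider {
  generate(prompt: EvidenceBackedPrompt): string;
}

// Offline, deterministic provider: output depends only on the prompt, so the
// same evidence always produces the same documentation text.
class LocalTemplateProvider implements LLMProvider {
  generate(prompt: EvidenceBackedPrompt): string {
    if (prompt.evidence.length === 0) {
      return "Open question: no evidence found for " + prompt.subject + ".";
    }
    const bullets = prompt.evidence.map(e => "- " + e.claim + " (" + e.file + ":" + e.line + ")");
    return [prompt.subject, ""].concat(bullets).join("\n");
  }
}

// Usage
const provider: LLMProvider = new LocalTemplateProvider();
const examplePrompt: EvidenceBackedPrompt = {
  subject: "Reconciliation",
  evidence: [{ claim: "Owned by Trade Operations Platform", file: "src/ReconciliationService.java", line: 3 }],
};
const generated = provider.generate(examplePrompt);
```

Because any future provider would receive the same `EvidenceBackedPrompt`, the grounding constraint lives in the input type rather than in each provider's implementation.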

Roadmap

Real LLM provider integration (grounded by the same evidence contract)
GitHub PR documentation drift checks
Jira/Confluence import
CODEOWNERS-based ownership mapping (partial today)
Cross-file TypeScript reference resolution via module-path imports (AST today resolves by import/short name)
Docusaurus deployment workflow
Team review workflow for verifying generated claims
Semantic embeddings for better term clustering
Call-graph-precise request flows (method-level call edges, building on the current reference graph)

License

MIT

aminaos
