CortexMark — VS Code Extension
A VS Code extension that provides session-based batch processing for the CortexMark pipeline, with a Markdown preview panel, quality dashboard, real-time progress tracking, analysis module integration, and a chat panel.
Migration Notes
This extension was renamed from PhiniteLab PDF Pipeline to CortexMark.
- Old extension ID: phinitelab-pdf-pipeline-vscode
- New extension ID: cortexmark-vscode
- Old publisher: phinitelab
- New publisher: cortexmark
Because the Marketplace identity changed, existing installs may need a manual install/upgrade to the new cortexmark-vscode package. Session metadata is migrated from .phinitelab-pdf-pipeline/sessions.json to .cortexmark/sessions.json automatically when present.
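The metadata migration described above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the helper name `migrateSessions` is hypothetical, and copying the file (rather than moving it) and never overwriting an existing new file are assumptions.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Directory and file names taken from the migration notes.
const LEGACY_DIR = ".phinitelab-pdf-pipeline";
const CURRENT_DIR = ".cortexmark";
const SESSIONS_FILE = "sessions.json";

/**
 * Hypothetical helper: copies legacy session metadata to the new location.
 * Returns true if a migration happened, false if there was nothing to do.
 */
export function migrateSessions(workspaceRoot: string): boolean {
  const legacyPath = path.join(workspaceRoot, LEGACY_DIR, SESSIONS_FILE);
  const currentPath = path.join(workspaceRoot, CURRENT_DIR, SESSIONS_FILE);

  // Assumption: skip when no legacy file exists, and never overwrite
  // metadata that already lives in the new location.
  if (!fs.existsSync(legacyPath) || fs.existsSync(currentPath)) {
    return false;
  }

  fs.mkdirSync(path.dirname(currentPath), { recursive: true });
  fs.copyFileSync(legacyPath, currentPath);
  return true;
}
```

Calling the helper twice on the same workspace is harmless in this sketch: the second call finds the new file in place and returns false.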
Features
- Session management: create, activate, and delete pipeline sessions
- Batch PDF processing: add individual PDFs or entire folders to a session
- Pipeline execution: run the full pipeline or individual stages (convert, QA, diff)
- Analysis modules: run Cross-Reference, Algorithm Extraction, Notation Glossary, and Semantic Chunking analyses directly from the sidebar or chat
- Markdown preview: side-by-side WebView panel with rendered math formulas, QA badges, and content statistics (theorem/proof/definition/algorithm/formula/figure counts)
- Quality dashboard: sidebar panel showing pipeline metrics — average QA score with badge breakdown, cross-reference resolution rate, algorithm counts, notation statistics
- Progress visualization: notification-bar progress during pipeline and analysis execution with cancellation support
- Sidebar tree view: sessions, files (with status icons), actions, analysis tools, and output browsing
- Chat panel: command-driven panel with 11 commands (English + Turkish) for pipeline control and analysis
- Real-time logging: output channel shows pipeline progress as it runs
- Auto-detection: finds workspace .venv for Python execution
- File watchers: detect new PDFs in data/raw/ (optional auto-processing)
CortexMark
├── Sessions
│   └── ★ experiment1 (active)
│       ├── ○ paper.pdf (queued)
│       ├── ↻ textbook.pdf (processing)
│       ├── ✓ thesis.pdf (done)
│       └── ✗ broken.pdf (error)
├── Actions
│   ├── ▶ Run Full Pipeline
│   ├── ▶ Convert Only
│   ├── 📊 Generate QA Report
│   ├── 🔍 Compare Two Folders
│   └── ⚙ Open Config
├── Analysis
│   ├── 🔗 Cross References
│   ├── 💻 Algorithm Extraction
│   ├── 𝑥 Notation Glossary
│   ├── ✂ Semantic Chunking
│   └── ▶▶ Run All Analyses
├── Outputs
│   ├── raw_md/
│   ├── cleaned_md/
│   ├── chunks/
│   └── quality/
├── Dashboard (webview)
│   ├── Pipeline Overview (PDF/output counts)
│   ├── Quality (badges, avg score)
│   ├── Cross References (resolution rate)
│   ├── Algorithms (count, depth)
│   └── Notation (symbols, entries)
└── Chat (webview)
    └── /status /process /qa /crossref /algorithm
        /notation /chunk /analyze /preview /help
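The .venv auto-detection listed under Features could work along these lines. This is a sketch under stated assumptions: the function name `resolvePython` is hypothetical, the fallback to `python3` is an assumption, and only the standard venv layouts (`.venv/bin/python` on POSIX, `.venv\Scripts\python.exe` on Windows) are checked.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

/**
 * Hypothetical helper: picks the Python executable for the pipeline.
 * An explicitly configured path wins; an empty value triggers .venv lookup.
 */
export function resolvePython(
  workspaceRoot: string,
  configuredPath: string,
  platform: string = process.platform,
): string {
  if (configuredPath.trim() !== "") {
    return configuredPath;
  }

  // Standard virtual environment layouts created by `python -m venv`.
  const candidate =
    platform === "win32"
      ? path.join(workspaceRoot, ".venv", "Scripts", "python.exe")
      : path.join(workspaceRoot, ".venv", "bin", "python");

  // Assumption: fall back to `python3` on PATH when no .venv is present.
  return fs.existsSync(candidate) ? candidate : "python3";
}
```

The `platform` parameter exists only to make the sketch easy to exercise on any machine; a real caller would normally rely on the default.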
Commands
| Command | Description |
|---|---|
| Refresh | Refresh sidebar tree and dashboard |
| New Session | Create a new processing session |
| Delete Session | Remove a session and its data |
| Set as Active | Switch the active session |
| Process Active Session | Run pipeline on active session |
| Add PDFs... | Add PDF files to active session |
| Add PDF Folder... | Add a folder of PDFs |
| Run Full Pipeline | Execute all stages |
| Convert Only | Run convert stage only |
| Generate QA Report | Run quality analysis |
| Compare Two Folders | Diff two output directories |
| Open Config | Open configs/pipeline.yaml |
| Reveal in Explorer | Open output folder |
| Delete | Delete output file or folder |
| Run Cross-Reference Analysis | Detect and resolve cross-references |
| Run Algorithm Extraction | Extract pseudocode and algorithm blocks |
| Run Notation Glossary | Build mathematical symbol table |
| Run Semantic Chunking | Theorem-aware content splitting |
| Run All Analyses | Execute all 4 analysis modules sequentially |
| Preview Markdown | Open Markdown in side preview panel |
| Refresh Preview | Reload the preview content |
| Refresh Dashboard | Reload dashboard metrics |
Chat Commands
| Command | Turkish Alias | Description |
|---|---|---|
| status | durum | Refresh session status |
| process / run | çalıştır | Run pipeline on active session |
| qa | kalite | Generate QA report |
| crossref | çapraz referans | Cross-reference analysis |
| algorithm / algo | algoritma | Algorithm extraction |
| notation / glossary | notasyon | Notation glossary |
| chunk / semantic | bölümleme | Semantic chunking |
| analyze / all | analiz | Run all analyses |
| preview | önizleme | Preview active Markdown |
| help | yardım | Show available commands |
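One way to map the English names and Turkish aliases above onto a single canonical command is a flat lookup table. The sketch below is illustrative only: `resolveChatCommand` is a hypothetical name, the optional leading slash is inferred from the sidebar tree, and matching the whole input (with no arguments) is an assumption about how the chat panel parses text.

```typescript
// Canonical command names with their aliases, copied from the table.
const COMMANDS: ReadonlyArray<[string, string[]]> = [
  ["status", ["status", "durum"]],
  ["process", ["process", "run", "çalıştır"]],
  ["qa", ["qa", "kalite"]],
  ["crossref", ["crossref", "çapraz referans"]],
  ["algorithm", ["algorithm", "algo", "algoritma"]],
  ["notation", ["notation", "glossary", "notasyon"]],
  ["chunk", ["chunk", "semantic", "bölümleme"]],
  ["analyze", ["analyze", "all", "analiz"]],
  ["preview", ["preview", "önizleme"]],
  ["help", ["help", "yardım"]],
];

// Flatten into alias -> canonical name for constant-time lookup.
const lookup = new Map<string, string>();
COMMANDS.forEach(([canonical, names]) => {
  names.forEach((name) => lookup.set(name, canonical));
});

/**
 * Hypothetical helper: returns the canonical command for a chat input,
 * or undefined when the input matches no known command or alias.
 */
export function resolveChatCommand(input: string): string | undefined {
  const normalized = input
    .trim()
    .replace(/^\//, "") // accept "/status" as well as "status"
    .replace(/\s+/g, " ") // collapse repeated spaces in multi-word aliases
    .toLowerCase(); // locale-insensitive; does not handle Turkish dotted/dotless I
  return lookup.get(normalized);
}
```

With this table, `resolveChatCommand("/status")` and `resolveChatCommand("durum")` both resolve to `"status"`, while unrecognized input yields `undefined` so the caller can show the help text.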
Settings
| Setting | Default | Description |
|---|---|---|
| cortexmark.pythonPath | python3 | Python executable. Leave empty for workspace .venv auto-detection. |
| cortexmark.configPath | configs/pipeline.yaml | Pipeline config file path relative to workspace root. |
| cortexmark.defaultEngine | dual | Default conversion engine (docling, markitdown, or dual). |
| cortexmark.autoProcess | false | Automatically run the pipeline when new PDFs are detected. |
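The defaults in the table can be applied to raw configuration values as in the sketch below. The type and function names are hypothetical, and the raw object stands in for whatever the extension reads from VS Code (for example via `vscode.workspace.getConfiguration("cortexmark")`); falling back to the default on an invalid value is an assumption.

```typescript
type Engine = "docling" | "markitdown" | "dual";

// Hypothetical shape mirroring the four settings in the table.
interface CortexMarkSettings {
  pythonPath: string;
  configPath: string;
  defaultEngine: Engine;
  autoProcess: boolean;
}

// Default values as documented in the table.
const DEFAULTS: CortexMarkSettings = {
  pythonPath: "python3",
  configPath: "configs/pipeline.yaml",
  defaultEngine: "dual",
  autoProcess: false,
};

function isEngine(value: unknown): value is Engine {
  return value === "docling" || value === "markitdown" || value === "dual";
}

/** Merges raw values with the defaults, ignoring values of the wrong type. */
export function resolveSettings(raw: Record<string, unknown>): CortexMarkSettings {
  const pythonPath = raw["pythonPath"];
  const configPath = raw["configPath"];
  const defaultEngine = raw["defaultEngine"];
  const autoProcess = raw["autoProcess"];
  return {
    // An empty string is kept on purpose: it requests .venv auto-detection.
    pythonPath: typeof pythonPath === "string" ? pythonPath : DEFAULTS.pythonPath,
    configPath: typeof configPath === "string" ? configPath : DEFAULTS.configPath,
    defaultEngine: isEngine(defaultEngine) ? defaultEngine : DEFAULTS.defaultEngine,
    autoProcess: typeof autoProcess === "boolean" ? autoProcess : DEFAULTS.autoProcess,
  };
}
```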
Architecture
| File | Purpose |
|---|---|
| src/extension.ts | Activation, command registration (22 commands), file watchers, panel integration |
| src/sessionManager.ts | Session persistence (.cortexmark/sessions.json), event emitter |
| src/sessionTree.ts | Tree data provider (Sessions, Actions, Analysis, Outputs groups) |
| src/pipelineRunner.ts | Python subprocess spawning with progress bar, cancellation, and analysis module support |
| src/previewPanel.ts | Markdown preview WebView with QA badges, math rendering, and content statistics |
| src/dashboardPanel.ts | Quality metrics dashboard WebView with report parsing and badge visualization |
| src/chatView.ts | Chat panel with 11 commands (English + Turkish aliases) |
| src/types.ts | TypeScript interfaces (PdfFile, Session, FileStatus) |
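To give a feel for how the types in src/types.ts might relate to the sidebar tree, here is a rough sketch. Only the names PdfFile, Session, and FileStatus come from this README; every field, the icon map, and both helper functions are assumptions made for illustration, with statuses and icons taken from the tree shown earlier.

```typescript
// Status values as they appear next to files in the sidebar tree.
type FileStatus = "queued" | "processing" | "done" | "error";

// Hypothetical fields; the real interfaces may differ.
interface PdfFile {
  name: string;
  status: FileStatus;
}

interface Session {
  name: string;
  active: boolean;
  files: PdfFile[];
}

// Icons as shown in the sidebar tree.
const STATUS_ICON: Record<FileStatus, string> = {
  queued: "○",
  processing: "↻",
  done: "✓",
  error: "✗",
};

/** Builds a tree label such as "○ paper.pdf (queued)". */
export function describeFile(file: PdfFile): string {
  return `${STATUS_ICON[file.status]} ${file.name} (${file.status})`;
}

/** Counts the files of a session per status, e.g. for a progress summary. */
export function countByStatus(session: Session): Record<FileStatus, number> {
  const counts: Record<FileStatus, number> = {
    queued: 0,
    processing: 0,
    done: 0,
    error: 0,
  };
  for (const file of session.files) {
    counts[file.status] += 1;
  }
  return counts;
}
```

Modeling the status as a string union lets the compiler check that the icon map and the counters cover every status.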
Development
cd vscode-extension
npm install
npm run compile
Press F5 in VS Code to run the extension in an Extension Development Host.
Prerequisites
- Node.js 18+
- The Python pipeline package installed in the workspace (see main README)