PDF Processor for VS Code
Open PDF files directly in VS Code. Extract tables, text, and images. Compare two PDFs side by side. Edit the extracted HTML in a built-in Monaco editor. Everything runs locally with no file upload.

Install and use
- Install this extension.
- Right-click any .pdf file in Explorer.
- Pick "Open with PDF Processor."
The PDF opens in a VS Code custom editor with five views: PDF canvas, Doc (rendered HTML), Visual Diff (PDF vs extracted output), Editor (Monaco code editor on the HTML output), and Compare Diff (load a second PDF to diff).
Commands
| Command | What it does |
| --- | --- |
| Ginexys: Open with PDF Processor | Open a PDF in the custom editor |
| Ginexys: Extract PDF Text (Analyze) | Open the Analyze tab on a PDF |
| Ginexys: View PDF | Open in PDF canvas view |
| Ginexys: Edit Extracted PDF Content | Open the Monaco code editor on the extracted output |
Each command also appears in the Explorer context menu for .pdf files.
File types
.pdf
Key features
- Browser-native extraction. High-speed PDF parsing using PDF.js and local Web Workers. Native PDF text fidelity preserved with no rasterization.
- Structured output. Layout-aware conversion to HTML with CSS preservation. Tables, headings, lists, images, columns, zones.
- Visual diff. Side-by-side verification of the original PDF against the extracted output. IntersectionObserver-driven page sync.
- Compare Diff. Load two PDFs and diff them with split or unified view, word-level and character-level highlighting.
- Rich text editor. Built-in Monaco editor specialized for document refinement. Edit the extracted HTML and watch the rendered Doc view update live.
- Selection Mode. Treat the rendered HTML like CAD. Drag zones and regions, group them into columns, apply Bootstrap-style layouts.
- Export. Markdown, HTML, DOC, XML, JSON, plus Notion and Google Sheets (Pro).
- Zero data upload. All processing happens locally. No PDF ever leaves your machine.
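To make the word-level highlighting in Compare Diff concrete, here is a minimal sketch of one common way to compute such a diff: a longest-common-subsequence pass over whitespace-separated words. This is an illustration only and is not taken from the extension's source; the names `DiffOp` and `diffWords` are hypothetical, and the extension may use a different algorithm or library.

```typescript
// Illustrative sketch only: DiffOp and diffWords are hypothetical names,
// not part of the PDF Processor API.
type DiffOp = { kind: "same" | "added" | "removed"; word: string };

function diffWords(before: string, after: string): DiffOp[] {
  // Split on whitespace and drop empty tokens.
  const a = before.split(/\s+/).filter(Boolean);
  const b = after.split(/\s+/).filter(Boolean);

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both word lists, following the table to emit the edit script.
  const ops: DiffOp[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      ops.push({ kind: "same", word: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      ops.push({ kind: "removed", word: a[i] });
      i++;
    } else {
      ops.push({ kind: "added", word: b[j] });
      j++;
    }
  }
  // Whatever is left on either side is a pure removal or addition.
  while (i < a.length) ops.push({ kind: "removed", word: a[i++] });
  while (j < b.length) ops.push({ kind: "added", word: b[j++] });
  return ops;
}
```

A split view would render the `removed` words on the left and the `added` words on the right, while a unified view would interleave them in order. The same approach works at character level if the inputs are split into characters instead of words.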
Pro features
Some features require a Ginexys Pro account:
- Advance Extraction. AI-powered layout analysis via Docling and OpenRouter. Currently available by waitlist only.
- Analyze tab. Per-region confidence scores, zone flag inspector, tolerance controls, and per-page re-extraction.
- Export to Notion and Google Sheets. Direct integration with both platforms.
- Visual Diff report export. Full PDF diff report as a downloadable file.
Free extraction is unaffected. Pro features are marked with a gold PRO badge.
Requires
Ginexys Core Extension is a dependency. VS Code installs it automatically.
Part of the Ginexys pipeline
PDF Processor is the Extract step:
Extract    (PDF/image to structured data)    PDF Processor (this extension)
   v
Transform  (reshape, edit, clean)            TAFNE
   v
Engineer   (schematic / topology editor)     Schema Editor
Get the whole pipeline: Ginexys Developer Tools.
Privacy
By default, PDFs never leave your machine: extraction runs entirely in a local Web Worker, and the extension contains no telemetry. Pro features that route through the Ginexys backend (Docling, OpenRouter) run only when you explicitly enable Advance Extraction on a given file.
License
MIT. See LICENSE.
Source and issues
https://github.com/carnworkstudios/doc-extractor
Web version
https://ginexys.com/tools/pdf-processor/