PDF Xtract

AJAL R

Convert PDF files to TXT or JSON format — supports scanned PDFs with built-in OCR (Tesseract)

PDF Xtract

A Visual Studio Code extension that converts PDF files to TXT or JSON format — with built-in OCR for scanned and vector-graphics PDFs.

Features

  • Convert PDF to TXT: Extract plain text content from PDF files
  • Convert PDF to JSON: Extract PDF content with metadata in structured JSON format
  • Built-in OCR: Automatically uses Tesseract OCR for PDFs without embedded text (scanned docs, "Print to PDF", etc.)
  • Context Menu Integration: Right-click on any PDF file in the explorer to convert
  • Command Palette: Access conversion commands via Command Palette (Ctrl+Shift+P)
  • Progress Feedback: Visual progress indicator during conversion
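The automatic OCR fallback described above can be pictured as a simple check on how much text the first extraction pass returned. The following is a minimal sketch under stated assumptions, not the extension's actual logic: `needsOcr` and `MIN_CHARS_PER_PAGE` are hypothetical names, and the threshold value is illustrative only.

```typescript
// Hypothetical threshold: fewer visible characters per page than this
// suggests the PDF has no usable embedded text. Not a value from PDF Xtract.
const MIN_CHARS_PER_PAGE = 20;

// Hypothetical helper: decide whether plain extraction came back too sparse
// and the document should be handed to OCR instead.
function needsOcr(extractedText: string, pageCount: number): boolean {
  // Count only non-whitespace characters; scanned PDFs often yield
  // nothing but blank lines or form feeds.
  const visibleChars = extractedText.replace(/\s+/g, "").length;
  const pages = Math.max(pageCount, 1); // guard against a zero page count
  return visibleChars / pages < MIN_CHARS_PER_PAGE;
}

console.log(needsOcr("\n\n  \n", 3)); // whitespace only: true
console.log(needsOcr("Quarterly report. ".repeat(10), 1)); // 160 visible chars on one page: false
```

A per-page ratio is used here so that a long document with a few stray characters still counts as scanned; a real implementation might apply the check page by page.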

Usage

Method 1: Context Menu (Right-click)

  1. Right-click on any .pdf file in the VS Code Explorer
  2. Select either:
    • PDF: Convert to TXT - Extracts plain text
    • PDF: Convert to JSON - Extracts text with metadata

Method 2: Command Palette

  1. Press Ctrl+Shift+P (or Cmd+Shift+P on Mac)
  2. Type "PDF" and select:
    • PDF: Convert to TXT
    • PDF: Convert to JSON
  3. Select the PDF file you want to convert

Output Format

TXT Format

  • Simple plain text extraction
  • Preserves text content from all pages
  • Saved as filename.txt in the same directory
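The naming rule above (same directory, same base name, new extension) can be sketched as a small string transformation. `outputPathFor` is a hypothetical helper for illustration, not a function exported by the extension; applying the same rule to the JSON output is an assumption.

```typescript
type OutputFormat = "txt" | "json";

// Hypothetical helper: swap a trailing ".pdf" (any letter case) for the
// target extension, leaving the directory part of the path untouched.
function outputPathFor(pdfPath: string, format: OutputFormat): string {
  const base = pdfPath.replace(/\.pdf$/i, "");
  return `${base}.${format}`;
}

console.log(outputPathFor("/docs/report.pdf", "txt")); // /docs/report.txt
console.log(outputPathFor("C:\\docs\\Scan.PDF", "json")); // C:\docs\Scan.json
```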

JSON Format

Structured output with metadata:

{
  "metadata": {
    "filename": "document.pdf",
    "convertedAt": "2026-02-26T...",
    "totalPages": 5,
    "info": {
      "Title": "Document Title",
      "Author": "Author Name",
      ...
    }
  },
  "content": {
    "text": "Full text content...",
    "pages": 5,
    "version": "1.7"
  }
}
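For anyone consuming the JSON output from a script, the sample above suggests a shape along the following lines. This is a sketch inferred from the sample alone: the interface name `PdfXtractOutput` and the helper `summarize` are hypothetical, the keys under `info` are assumed to vary by document, and the sample values (including the timestamp) are made up for illustration.

```typescript
// Shape inferred from the sample output; not an official type definition.
interface PdfXtractOutput {
  metadata: {
    filename: string;
    convertedAt: string; // timestamp string
    totalPages: number;
    info: Record<string, unknown>; // e.g. Title, Author; keys vary by PDF
  };
  content: {
    text: string;
    pages: number;
    version: string;
  };
}

// Hypothetical helper: parse converter output and build a one-line summary,
// falling back to the filename when the PDF carries no Title.
function summarize(json: string): string {
  const doc = JSON.parse(json) as PdfXtractOutput;
  const rawTitle = doc.metadata.info["Title"];
  const title = typeof rawTitle === "string" ? rawTitle : doc.metadata.filename;
  const words = doc.content.text.split(/\s+/).filter(Boolean).length;
  return `${title}: ${doc.metadata.totalPages} pages, ${words} words`;
}

// Illustrative sample only; values are invented for this example.
const sample = JSON.stringify({
  metadata: {
    filename: "document.pdf",
    convertedAt: "2026-02-26T10:00:00.000Z",
    totalPages: 5,
    info: { Title: "Document Title", Author: "Author Name" },
  },
  content: { text: "Full text content", pages: 5, version: "1.7" },
});

console.log(summarize(sample)); // Document Title: 5 pages, 3 words
```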

Installation

From Source

  1. Clone or download this repository
  2. Copy the folder to your VS Code extensions directory:
    • Windows: %USERPROFILE%\.vscode\extensions
    • macOS/Linux: ~/.vscode/extensions
  3. Run npm install in the extension folder
  4. Restart VS Code

From VSIX Package (if available)

  1. Download the .vsix file
  2. In VS Code, go to Extensions view (Ctrl+Shift+X)
  3. Click the "..." menu at the top
  4. Select "Install from VSIX..."
  5. Choose the downloaded file

Development Setup

# Install dependencies
npm install

# Run the extension in development mode
# Press F5 in VS Code to open Extension Development Host

Requirements

  • Visual Studio Code 1.80.0 or higher
  • Node.js with npm, needed only to install dependencies when installing from source or developing

Dependencies

  • pdf-parse: PDF parsing library
  • Tesseract: OCR engine used for PDFs without embedded text

Known Limitations

  • Complex PDF layouts may not preserve exact formatting in text output
  • Scanned PDFs (images) contain no embedded text and are handled by the built-in OCR; recognition accuracy depends on scan quality
  • Very large PDFs may take longer to process

Release Notes

1.0.0

  • Initial release
  • PDF to TXT conversion
  • PDF to JSON conversion with metadata
  • Context menu integration

Contributing

Feel free to submit issues and enhancement requests!

License

MIT License
