The Document Analyzer is a Visual Studio Code extension that compares two documents (DOCX, TXT, or PDF) using a local Ollama language model. It extracts the text of both documents, asks the selected model to assess their similarity or how well their requirements match, and presents a detailed report with a matching percentage, matched items, unmatched items, and additional notes.
Features
Supported File Types: Compare documents in DOCX, TXT, or PDF formats.
Local LLM Analysis: Uses a local Ollama instance for secure, offline document comparison.
Progress Feedback: Displays a progress bar during analysis with status updates.
Split View Editing: Opens documents side-by-side for editing before analysis.
Detailed Results: Provides a markdown report with:
Matching percentage (0–100%).
Lists of matched and unmatched items.
Additional notes from the analysis.
User-Friendly Prompts: Clear instructions for selecting files, with supported formats (DOCX, TXT, PDF) displayed in prompts and file pickers.
Error Handling: Descriptive error messages, with details logged to an output channel for debugging.
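The supported-format check behind the file prompts can be sketched as follows. This is a minimal illustration, not the extension's actual code: `detectDocumentKind` and `chooseFirstDocument` are hypothetical names, and the sketch assumes the format is decided purely by file extension.

```typescript
// Hypothetical helpers; the extension's real implementation may differ.
type DocumentKind = "docx" | "txt" | "pdf";

const SUPPORTED_KINDS: readonly DocumentKind[] = ["docx", "txt", "pdf"];

/** Returns the document kind for a file path, or undefined if unsupported. */
function detectDocumentKind(filePath: string): DocumentKind | undefined {
  const dot = filePath.lastIndexOf(".");
  if (dot < 0 || dot === filePath.length - 1) {
    return undefined; // no extension at all
  }
  // Compare case-insensitively so "REPORT.DOCX" is accepted too.
  const ext = filePath.slice(dot + 1).toLowerCase();
  return SUPPORTED_KINDS.find((kind) => kind === ext);
}

/**
 * Mirrors the rule from the Usage section: a supported file in the active
 * editor becomes the first document; otherwise the caller shows a file picker.
 * Returns the path to use, or undefined when a picker is needed.
 */
function chooseFirstDocument(activeEditorPath: string | undefined): string | undefined {
  if (activeEditorPath !== undefined && detectDocumentKind(activeEditorPath) !== undefined) {
    return activeEditorPath;
  }
  return undefined;
}
```

Keeping the check in one pure function lets the prompts, the file pickers, and the text extractors agree on a single list of formats.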
Requirements
Visual Studio Code: Version 1.85.0 or higher.
Node.js: Version 16.x or higher for building the extension.
Ollama: A local Ollama server running at http://localhost:11434 with at least one model installed (e.g., llama3.2:latest).
Dependencies: The following Node.js packages are required (automatically installed during setup):
axios: For HTTP requests to the Ollama API.
mammoth: For extracting text from DOCX files.
pdf-parse: For extracting text from PDF files.
chalk: For colored console logging.
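One way to check the Ollama requirement is to query the server's `/api/tags` endpoint, which lists locally installed models, and read the model names from the response. The sketch below covers only the parsing step. The function names are hypothetical, and the response shape is assumed to be `{ "models": [{ "name": "..." }] }`.

```typescript
// Assumed shape of the JSON body returned by GET /api/tags on the Ollama
// server; only the `name` field is used here.
interface OllamaTagsResponse {
  models?: Array<{ name?: unknown }>;
}

const OLLAMA_BASE_URL = "http://localhost:11434";

/** Extracts model names from an already-parsed /api/tags response body. */
function parseModelNames(body: unknown): string[] {
  if (typeof body !== "object" || body === null) {
    return [];
  }
  const models = (body as OllamaTagsResponse).models;
  if (!Array.isArray(models)) {
    return [];
  }
  const names: string[] = [];
  for (const model of models) {
    // Skip malformed entries instead of failing the whole listing.
    if (model && typeof model.name === "string" && model.name.length > 0) {
      names.push(model.name);
    }
  }
  return names;
}

/** Produces a short status message about the installed models (hypothetical helper). */
function describeOllamaModels(body: unknown): string {
  const names = parseModelNames(body);
  return names.length > 0
    ? `Found ${names.length} model(s): ${names.join(", ")}`
    : `No models found at ${OLLAMA_BASE_URL}; install one with "ollama pull <model>".`;
}
```

With axios, the body could come from `axios.get(OLLAMA_BASE_URL + "/api/tags")`, passing the response's `data` property to `parseModelNames`.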
Usage
Open VS Code and ensure the extension is installed.
Run the Command:
Open the Command Palette (Ctrl+Shift+P or Cmd+Shift+P on macOS).
Select Document Analyzer: Analyze Document.
Select a Model:
Choose an Ollama model from the list (fetched from your local Ollama server).
Select Documents:
If a supported file (DOCX, TXT, PDF) is open in the active editor, it will be used as the first document.
Otherwise, a file picker will prompt for the first document, with a note about supported formats (DOCX, TXT, PDF).
Select the second document via a file picker, which also displays supported formats.
Edit Documents (Optional):
The selected documents open in a split view for editing.
Choose Analyze Now to proceed or Edit Documents to modify them first.
View Analysis:
A progress bar appears during analysis, showing steps like "Preparing documents..." and "Processing response...".
Results are displayed in a new markdown editor with:
Matching percentage.
Lists of matched and unmatched items.
Additional notes.
Results are also logged to the Document Analyzer output channel.
Review Logs:
Open the Output panel (Ctrl+Shift+U) and select Document Analyzer to view logs for debugging or details.
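The View Analysis step can be pictured as turning the model's reply into the markdown report. The following is a rough sketch under stated assumptions: it supposes the model is asked to answer in JSON with the hypothetical fields `matchingPercentage`, `matchedItems`, `unmatchedItems`, and `notes`, and the report headings are illustrative. The extension's actual prompt and output layout may differ.

```typescript
// Hypothetical result shape; field names are assumptions for illustration.
interface AnalysisResult {
  matchingPercentage: number; // clamped to 0..100
  matchedItems: string[];
  unmatchedItems: string[];
  notes: string;
}

/** Keeps only the string entries of a value that should be a string array. */
function toStringList(value: unknown): string[] {
  return Array.isArray(value)
    ? value.filter((v): v is string => typeof v === "string")
    : [];
}

/** Parses the model's raw JSON reply, falling back to empty defaults. */
function parseAnalysis(raw: string): AnalysisResult {
  let data: Record<string, unknown> = {};
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      data = parsed as Record<string, unknown>;
    }
  } catch (_err) {
    // Invalid JSON: keep the empty defaults.
  }
  const pct = Number(data["matchingPercentage"]);
  const notes = data["notes"];
  return {
    // Models sometimes return out-of-range values, so clamp to 0..100.
    matchingPercentage: isFinite(pct) ? Math.min(100, Math.max(0, Math.round(pct))) : 0,
    matchedItems: toStringList(data["matchedItems"]),
    unmatchedItems: toStringList(data["unmatchedItems"]),
    notes: typeof notes === "string" ? notes : "",
  };
}

/** Renders the result as a markdown report. */
function renderReport(result: AnalysisResult): string {
  const list = (items: string[]): string =>
    items.length > 0 ? items.map((item) => `- ${item}`).join("\n") : "- (none)";
  return [
    "# Document Analysis",
    "",
    `Matching percentage: ${result.matchingPercentage}%`,
    "",
    "## Matched items",
    list(result.matchedItems),
    "",
    "## Unmatched items",
    list(result.unmatchedItems),
    "",
    "## Additional notes",
    result.notes || "(none)",
  ].join("\n");
}
```

Parsing defensively matters here because a local model may return malformed or partial JSON. The sketch degrades to an empty report instead of throwing.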