📸 MCP Screenshot - VS Code Extension
Give AI agents visual superpowers to see, analyze, and document your applications like never before.
This isn't just another screenshot tool—it's the bridge that gives AI agents visual awareness of your development environment, enabling them to become intelligent documentation partners, UI analyzers, and visual debugging assistants.
🔗 Repository
This package is now maintained in its own repository: https://github.com/Digital-Defiance/vscode-mcp-screenshot
This repository is part of the AI Capabilities Suite on GitHub.
🤔 Why Do AI Agents Need to See Your Screen?
AI agents today are powerful but blind. They can read your code and suggest improvements, but they can't see what your application actually looks like or how users interact with it. This creates a massive gap in their ability to help you.
🎯 Revolutionary Use Cases
📚 "AI, create documentation for this feature"
You: "Document the new dashboard layout"
AI: *Captures screenshots, analyzes UI components*
AI: "Created comprehensive guide with 8 annotated screenshots showing navigation flow, key features, and user interactions..."
🐛 "AI, help debug this UI issue"
You: "The button alignment looks wrong"
AI: *Captures current state, analyzes layout*
AI: "Button is 3px off-center. Here's the CSS fix and a before/after comparison..."
🎥 "AI, create a demo walkthrough"
You: "Show how the authentication flow works"
AI: *Captures key screens, creates step-by-step guide*
AI: "Generated 6-step visual walkthrough with annotations and user journey map..."
🔍 "AI, analyze this design for accessibility"
You: "Check if this form is accessible"
AI: *Captures form, analyzes contrast and layout*
AI: "Found 3 accessibility issues: low contrast on labels, missing focus indicators, inadequate spacing..."
📊 "AI, compare these two implementations"
You: "Which design works better?"
AI: *Captures both versions, analyzes differences*
AI: "Version B has 23% better visual hierarchy and clearer call-to-action placement..."
🎨 "AI, help with responsive design"
You: "How does this look on different screen sizes?"
AI: *Captures multiple viewport sizes*
AI: "Mobile layout breaks at 768px - navigation overlaps content. Here's the media query fix..."
✨ What This Changes
Before: AI could only work with code and text descriptions
- ❌ "The button looks weird" → AI guesses what you mean
- ❌ "Create documentation" → AI writes generic text
- ❌ "Check the layout" → AI can't see the actual result
After: AI can see and analyze your actual application
- ✅ Visual debugging - AI sees exactly what's wrong
- ✅ Intelligent documentation - AI creates guides with real screenshots
- ✅ Design analysis - AI evaluates actual user interfaces
- ✅ Accessibility audits - AI checks real visual contrast and layout
- ✅ Responsive testing - AI captures and compares different screen sizes
🚀 Features
Screenshot Capabilities
- Full Screen Capture: Capture entire displays or specific monitors
- Window Capture: Target specific application windows
- Region Capture: Capture rectangular screen regions
- Multi-Format Support: PNG, JPEG, WebP, BMP with quality control
- PII Masking: Automatic detection and redaction of sensitive information
- Multi-Monitor Support: Works seamlessly with multiple displays
- Privacy Controls: Exclude sensitive windows and applications
- MCP Integration: Purpose-built for AI agent workflows
Language Server Protocol (LSP) Features
The extension includes a built-in Language Server that provides intelligent code assistance for screenshot-related operations:
- Hover Information: Get instant documentation when hovering over screenshot functions, configuration objects, and identifiers
- Code Lenses: Quick action buttons appear inline for capturing screenshots, listing displays, and listing windows
- Diagnostics: Real-time validation of screenshot parameters with helpful error messages and suggestions
- Code Completion: Smart autocomplete for screenshot configuration properties and parameter values
- AI Agent Commands: Programmatic command execution for automated screenshot workflows
- Multi-Language Support: Works with JavaScript, TypeScript, JSX, TSX, and JSON configuration files
Installation
- Install from VS Code Marketplace (coming soon)
- Or install from VSIX file:
code --install-extension mcp-screenshot-0.0.1.vsix
Usage
Commands
- MCP Screenshot: Capture Full Screen - Capture the entire screen
- MCP Screenshot: Capture Window - Select and capture a specific window
- MCP Screenshot: Capture Region - Capture a rectangular region
- MCP Screenshot: List Displays - Show all connected displays
- MCP Screenshot: List Windows - Show all visible windows
- MCP Screenshot: Open Settings - Configure extension settings
LSP Features
Hover over screenshot-related code to see documentation:
- Function Calls: Hover over captureFullScreen(), captureWindow(), or captureRegion() to see parameter documentation and examples
- Configuration Objects: Hover over screenshot configuration properties to see valid values and types
- Identifiers: Hover over display or window IDs to see information about that resource (when available)
All hover information is formatted as markdown with clear sections for parameters, return values, and usage examples.
Code Lenses
Code lenses appear as inline action buttons in your code:
- 📸 Capture Screenshot: Appears near screenshot capture functions - click to execute the capture
- 🖥️ List Displays: Appears near display enumeration code - click to see all connected displays
- 🪟 List Windows: Appears near window enumeration code - click to see all visible windows
Code lenses provide quick access to screenshot operations without leaving your editor.
Diagnostics
Real-time validation catches issues before runtime:
- Invalid Format: Warns when format is not 'png', 'jpeg', or 'webp' and suggests valid options
- Quality Range: Errors when quality parameter is outside 0-100 range
- Missing Parameters: Errors when required screenshot parameters are missing
- Deprecated APIs: Informational messages for deprecated screenshot APIs with migration guidance
All diagnostics include the exact location, clear messages, and suggested fixes.
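The rules above can be sketched as a small validator. This is an illustration only, not the extension's actual implementation: the `ScreenshotConfig` and `Diagnostic` types and the `validateConfig` function are hypothetical names, and the accepted formats and the 0-100 quality range are taken from the diagnostics list above. The "missing parameters" rule is omitted because this README does not say which parameters are required.

```typescript
// Hypothetical types for illustration; the extension's real types may differ.
type Severity = "error" | "warning";

interface Diagnostic {
  severity: Severity;
  property: string;
  message: string;
}

interface ScreenshotConfig {
  format?: string;
  quality?: number;
}

// Formats named by the "Invalid Format" diagnostic in this README.
const VALID_FORMATS: string[] = ["png", "jpeg", "webp"];

function validateConfig(config: ScreenshotConfig): Diagnostic[] {
  const diagnostics: Diagnostic[] = [];

  // Unknown format: reported as a warning with the valid options listed.
  if (config.format !== undefined && VALID_FORMATS.indexOf(config.format) === -1) {
    diagnostics.push({
      severity: "warning",
      property: "format",
      message: "Invalid format '" + config.format + "', use 'png', 'jpeg', or 'webp'",
    });
  }

  // Quality outside 0-100: reported as an error.
  if (config.quality !== undefined && (config.quality < 0 || config.quality > 100)) {
    diagnostics.push({
      severity: "error",
      property: "quality",
      message: "Quality must be between 0 and 100",
    });
  }

  return diagnostics;
}
```

Calling `validateConfig({ format: 'png', quality: 150 })` yields a single error for `quality`, which mirrors the kind of feedback the editor shows inline.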
Code Completion
Smart autocomplete for screenshot code:
- Configuration Properties: Type in a screenshot config object to see all valid properties with documentation
- Format Values: Autocomplete suggests 'png', 'jpeg', 'webp' when typing format parameters
- Quality Values: Autocomplete suggests common quality values (80, 90, 95, 100)
All completion items include documentation and insert with correct syntax.
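As a rough sketch of how prefix-based property completion might behave, the snippet below filters a fixed list of configuration properties by what the user has typed. The `CompletionEntry` type, the `completeProperty` function, and the detail strings are hypothetical; only the property names (`format`, `quality`, `enablePIIMasking`) come from the examples in this README.

```typescript
// Hypothetical completion model; the real language server may differ.
interface CompletionEntry {
  label: string;
  detail: string;
}

const CONFIG_PROPERTIES: CompletionEntry[] = [
  { label: "format", detail: "Image format: 'png', 'jpeg', or 'webp'" },
  { label: "quality", detail: "Quality value, for example 80, 90, 95, or 100" },
  { label: "enablePIIMasking", detail: "Detect and redact sensitive information" },
];

// Return every property whose name starts with the typed prefix
// (case-insensitive). An empty prefix returns all properties.
function completeProperty(prefix: string): CompletionEntry[] {
  const needle = prefix.toLowerCase();
  return CONFIG_PROPERTIES.filter(
    (entry) => entry.label.toLowerCase().indexOf(needle) === 0
  );
}
```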
AI Agent Commands
The LSP exposes commands for programmatic execution:
- mcp.screenshot.capture: Execute screenshot capture with parameters
- mcp.screenshot.listDisplays: Get list of available displays
- mcp.screenshot.listWindows: Get list of available windows
- mcp.screenshot.getCapabilities: Get screenshot system capabilities
Commands return structured results or errors for reliable automation.
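This README does not document the shape of those results, so callers may want to treat a command's return value as untrusted input. The sketch below narrows an unknown value into a success or error object. The `CaptureResult` shape (`status`, `filePath`, `message`) is an assumption made purely for illustration and is not the extension's documented API.

```typescript
// ASSUMPTION: this result shape is hypothetical. The README only states that
// commands return "structured results or errors".
interface CaptureSuccess {
  status: "success";
  filePath: string;
}

interface CaptureFailure {
  status: "error";
  message: string;
}

type CaptureResult = CaptureSuccess | CaptureFailure;

function isRecord(value: unknown): value is { [key: string]: unknown } {
  return typeof value === "object" && value !== null;
}

// Convert whatever the command returned into a well-typed result,
// falling back to an error when the value is not recognized.
function parseCaptureResult(raw: unknown): CaptureResult {
  if (isRecord(raw)) {
    const status = raw["status"];
    const filePath = raw["filePath"];
    const message = raw["message"];
    if (status === "success" && typeof filePath === "string") {
      return { status: "success", filePath: filePath };
    }
    if (status === "error" && typeof message === "string") {
      return { status: "error", message: message };
    }
  }
  return { status: "error", message: "Unrecognized command result" };
}
```

Inside an extension, the value returned by `vscode.commands.executeCommand('mcp.screenshot.capture', ...)` could be passed to `parseCaptureResult` before an agent acts on it.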
Keyboard Shortcuts
You can assign custom keyboard shortcuts to any command via VS Code's keyboard shortcuts settings.
Configuration
Configure the extension via VS Code settings:
{
  "mcpScreenshot.defaultFormat": "png",
  "mcpScreenshot.defaultQuality": 90,
  "mcpScreenshot.saveDirectory": "${workspaceFolder}/screenshots",
  "mcpScreenshot.enablePIIMasking": false,
  "mcpScreenshot.autoSave": true,
  "mcpScreenshot.autoStart": true
}
Settings
- mcpScreenshot.defaultFormat: Default image format (png, jpeg, webp, bmp)
- mcpScreenshot.defaultQuality: Default quality for lossy formats (1-100)
- mcpScreenshot.saveDirectory: Default directory for saving screenshots
- mcpScreenshot.enablePIIMasking: Enable PII detection and masking by default
- mcpScreenshot.autoSave: Automatically save screenshots to disk
- mcpScreenshot.autoStart: Automatically start MCP server when VS Code opens
- mcpScreenshot.serverCommand: Command to run MCP screenshot server
- mcpScreenshot.serverArgs: Arguments for MCP screenshot server command
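To illustrate how these settings might combine with fallback values, here is a minimal sketch that merges user-provided settings over a set of defaults and clamps quality to the 1-100 range listed above. The `ScreenshotSettings` type and `resolveSettings` function are hypothetical. The fallback values are copied from the sample configuration in this README, and treating them as the shipped defaults is an assumption. `serverCommand` and `serverArgs` are left out because no example values are given for them.

```typescript
// Hypothetical settings model based on the settings list in this README.
interface ScreenshotSettings {
  defaultFormat: "png" | "jpeg" | "webp" | "bmp";
  defaultQuality: number;
  saveDirectory: string;
  enablePIIMasking: boolean;
  autoSave: boolean;
  autoStart: boolean;
}

// ASSUMPTION: values copied from the sample configuration; the extension's
// real defaults may differ.
const FALLBACK_SETTINGS: ScreenshotSettings = {
  defaultFormat: "png",
  defaultQuality: 90,
  saveDirectory: "${workspaceFolder}/screenshots",
  enablePIIMasking: false,
  autoSave: true,
  autoStart: true,
};

function resolveSettings(user: Partial<ScreenshotSettings>): ScreenshotSettings {
  // User-provided values override the fallbacks.
  const merged: ScreenshotSettings = { ...FALLBACK_SETTINGS, ...user };
  // Keep quality inside the documented 1-100 range.
  merged.defaultQuality = Math.min(100, Math.max(1, Math.round(merged.defaultQuality)));
  return merged;
}
```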
Requirements
- Visual Studio Code 1.85.0 or higher
- Node.js 18.0.0 or higher
- Platform-specific dependencies:
  - Linux: X11 or Wayland, ImageMagick
  - macOS: screencapture (built-in)
  - Windows: screenshot-desktop library
🎮 Real-World Examples
AI-Powered Workflows
Documentation Generation
1. You: "@copilot Document the user registration flow"
2. AI: *Uses MCP Screenshot to capture each step*
3. AI: *Analyzes UI elements and user journey*
4. AI: *Generates markdown with embedded screenshots*
5. Result: Complete documentation with visual guides
Bug Report Creation
1. You: "@copilot This form validation isn't working right"
2. AI: *Captures current state and error conditions*
3. AI: *Analyzes expected vs actual behavior*
4. AI: *Creates detailed bug report with screenshots*
5. Result: Professional bug report ready for your team
Design Review & Feedback
1. You: "@copilot Review this new feature design"
2. AI: *Captures different states and interactions*
3. AI: *Analyzes usability and accessibility*
4. AI: *Provides specific improvement suggestions*
5. Result: Actionable design feedback with visual examples
Responsive Design Testing
1. You: "@copilot Check how this looks on mobile"
2. AI: *Captures multiple viewport sizes*
3. AI: *Identifies layout issues and breakpoints*
4. AI: *Suggests CSS improvements*
5. Result: Responsive design fixes with before/after comparisons
Manual Commands
Capture Full Screen
- Open Command Palette (Ctrl+Shift+P / Cmd+Shift+P)
- Run "MCP Screenshot: Capture Full Screen"
- Screenshot is saved to configured directory
Capture Specific Window
- Open Command Palette
- Run "MCP Screenshot: Capture Window"
- Select window from the list
- Choose whether to include window frame
- Screenshot is captured
Capture Region
- Open Command Palette
- Run "MCP Screenshot: Capture Region"
- Enter coordinates and dimensions
- Screenshot is captured
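When entering coordinates by hand, it is easy to request a region that extends past the edge of the display. The sketch below clips a requested region to the display bounds. It assumes pixel coordinates with the origin at the top-left corner, which this README does not state, and the `Region`, `DisplayBounds`, and `clipRegion` names are hypothetical rather than part of the extension's API.

```typescript
// Hypothetical types; coordinates are assumed to be pixels from the top-left.
interface Region {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface DisplayBounds {
  width: number;
  height: number;
}

// Return the part of the region that lies on the display,
// or null when the region does not overlap the display at all.
function clipRegion(region: Region, display: DisplayBounds): Region | null {
  const left = Math.max(0, region.x);
  const top = Math.max(0, region.y);
  const right = Math.min(display.width, region.x + region.width);
  const bottom = Math.min(display.height, region.y + region.height);

  if (right <= left || bottom <= top) {
    return null;
  }
  return { x: left, y: top, width: right - left, height: bottom - top };
}
```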
Using LSP Features in Code
Example 1: Hover Information
// Hover over captureFullScreen to see documentation
const screenshot = await captureFullScreen({
  format: 'png', // Hover to see valid formats
  quality: 90    // Hover to see valid range
});
Example 2: Code Lenses
// A code lens "📸 Capture Screenshot" appears above this function
async function takeScreenshot() {
  const result = await captureFullScreen({ format: 'png' });
  return result;
}

// A code lens "🖥️ List Displays" appears above this function
async function getDisplays() {
  const displays = await listDisplays();
  return displays;
}
Example 3: Diagnostics
// ❌ Error: Quality must be between 0 and 100
const screenshot = await captureFullScreen({
  format: 'png',
  quality: 150 // Diagnostic appears here
});

// ⚠️ Warning: Invalid format, use 'png', 'jpeg', or 'webp'
const screenshot2 = await captureFullScreen({
  format: 'gif' // Diagnostic appears here
});
Example 4: Code Completion
// Type inside the config object to see completions
const screenshot = await captureFullScreen({
  // Type 'f' to see 'format' completion
  // Type 'q' to see 'quality' completion
  // Type 'e' to see 'enablePIIMasking' completion
});
Example 5: AI Agent Command Execution
// AI agents can execute commands programmatically.
// Run this inside an extension, within an async function.
import * as vscode from 'vscode';

const result = await vscode.commands.executeCommand(
  'mcp.screenshot.capture',
  {
    type: 'fullscreen',
    format: 'png',
    quality: 90
  }
);

const displays = await vscode.commands.executeCommand(
  'mcp.screenshot.listDisplays'
);
Privacy & Security
- PII Masking: Automatically detect and redact emails, phone numbers, and credit cards
- Window Exclusion: Exclude password managers and authentication dialogs
- Path Validation: Restrict file saves to allowed directories
- Rate Limiting: Prevent capture spam
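As one possible reading of the rate-limiting item, the sketch below allows at most a fixed number of captures within a sliding time window. The README does not describe the extension's actual limits or algorithm, so the `CaptureRateLimiter` class, its parameters, and the sliding-window approach are all assumptions for illustration.

```typescript
// Hypothetical sliding-window limiter; not the extension's implementation.
class CaptureRateLimiter {
  private readonly maxCaptures: number;
  private readonly windowMs: number;
  private timestamps: number[] = [];

  constructor(maxCaptures: number, windowMs: number) {
    this.maxCaptures = maxCaptures;
    this.windowMs = windowMs;
  }

  // Returns true and records the capture if it is allowed at time nowMs,
  // or false if the window already holds maxCaptures captures.
  tryAcquire(nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Drop captures that have fallen out of the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.maxCaptures) {
      return false;
    }
    this.timestamps.push(nowMs);
    return true;
  }
}
```

The current time is passed in explicitly so the limiter can be exercised deterministically; a caller would typically supply `Date.now()`.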
Troubleshooting
Extension Not Starting
Check the Output panel (View → Output → MCP Screenshot) for error messages.
Permission Errors on Linux
Ensure X11 access:
xhost +local:
macOS Screen Recording Permission
Grant screen recording permission:
- Open System Settings → Privacy & Security (on macOS 12 and earlier: System Preferences → Security & Privacy → Privacy)
- Select "Screen Recording"
- Add Visual Studio Code
License
MIT License - See LICENSE file for details
Contributing
Contributions are welcome! Please see our contributing guidelines in the repository.
Supported File Types
The LSP features work in the following file types:
- JavaScript (.js)
- TypeScript (.ts)
- JSX (.jsx)
- TSX (.tsx)
- JSON (.json) - Configuration validation only
Changelog
0.1.0 (LSP Integration)
- Added Language Server Protocol support
- Hover information for screenshot APIs
- Code lenses for quick actions
- Real-time diagnostics and validation
- Code completion for configuration
- AI agent command execution
- Multi-language support (JS, TS, JSX, TSX, JSON)
0.0.1 (Initial Release)
- Full screen capture
- Window capture
- Region capture
- Multi-format support
- PII masking
- Multi-monitor support
- MCP integration