Local Agent Screen Viewer

Otimiza.pro

Real-time screen capture with Vision API for AI agents. GPU-accelerated (DXGI), change detection, HTTP server for autonomous agent vision.

Real-time screen capture with Vision API for AI agents. GPU-accelerated, change detection, and an HTTP server that lets any AI agent autonomously see your screen.

Built for Windows AI automation workflows — no manual screenshots needed.

Features

Screen Viewer (Webview Panel)

  • Live Desktop Stream in a VS Code panel at configurable frame rates (1-10 FPS)
  • GPU-Accelerated via DXGI Desktop Duplication (node-screenshots) — minimal CPU usage
  • Automatic Fallback to PowerShell-based capture when GPU is unavailable
  • Dark Theme HUD with live FPS counter, status indicator, pause/resume controls
  • HD Capture button for full-resolution PNG screenshots

Vision Server (HTTP API for AI Agents)

  • HTTP server on 127.0.0.1:7899 — any agent or tool can request screen frames
  • Change Detection — only returns frames when the screen actually changed (saves bandwidth + tokens)
  • 5 endpoints: /frame, /frame/changed, /frame/hd, /status, /capture
  • Zero user interaction — agents see the screen autonomously
  • Works with any language (Python, Node.js, Rust, etc.) via simple HTTP GET

AI Integration

  • AI-Ready Screenshots saved to ~/.local-agent/screenshots/
  • Compatible with the Local Agent Python framework for full computer-use automation
  • Frame data returned as JPEG/PNG for direct LLM vision input
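Many vision-capable LLM APIs accept images as base64 data URLs, so a frame fetched from the Vision Server usually needs one small conversion step before it can be attached to a prompt. The sketch below is one way to do that; the helper name `frame_to_data_url` is hypothetical, and the message format expected by any particular LLM provider is outside its scope.

```python
import base64


def frame_to_data_url(frame_bytes: bytes) -> str:
    """Encode raw JPEG or PNG bytes as a base64 data URL.

    Hypothetical helper: it is not part of the extension.
    """
    # Detect the image type from the file signature (magic bytes).
    if frame_bytes.startswith(b"\xff\xd8\xff"):
        mime = "image/jpeg"
    elif frame_bytes.startswith(b"\x89PNG\r\n\x1a\n"):
        mime = "image/png"
    else:
        raise ValueError("frame is neither JPEG nor PNG")
    encoded = base64.b64encode(frame_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The bytes passed in would typically be the response body of `GET /frame` (JPEG) or `GET /frame/hd` (PNG).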

Getting Started

Basic Usage (Screen Viewer)

  1. Open the Command Palette (Ctrl+Shift+P)
  2. Run Local Agent: Iniciar Screen Viewer
  3. A panel opens showing your live desktop
  4. Hover to reveal Pause and Capture HD buttons

Vision Server (for AI Agents)

  1. Ctrl+Shift+P > Local Agent: Iniciar Vision Server (API para Agente)
  2. The server starts on http://127.0.0.1:7899
  3. Any process can now request frames:
# Get current screen frame
curl http://127.0.0.1:7899/frame -o screen.jpg

# Get frame only if screen changed (304 if no change)
curl http://127.0.0.1:7899/frame/changed -o screen.jpg

# Full-resolution PNG
curl http://127.0.0.1:7899/frame/hd -o screen_hd.png

# Server status
curl http://127.0.0.1:7899/status

# Force immediate capture
curl -X POST http://127.0.0.1:7899/capture -o capture.jpg

# Python example
import httpx

r = httpx.get("http://127.0.0.1:7899/frame")
with open("screen.jpg", "wb") as f:
    f.write(r.content)

# Check if screen changed
r = httpx.get("http://127.0.0.1:7899/frame/changed")
if r.status_code == 200:
    print("Screen changed!", len(r.content), "bytes")
elif r.status_code == 304:
    print("No change")

Commands

| Command | Description |
| --- | --- |
| Local Agent: Iniciar Screen Viewer | Opens viewer panel + starts Vision Server |
| Local Agent: Parar Screen Viewer | Stops capture and closes panel |
| Local Agent: Capturar Tela para IA | Saves high-res PNG to ~/.local-agent/screenshots/ |
| Local Agent: Iniciar Vision Server | Starts HTTP API only (no webview panel) |
| Local Agent: Parar Vision Server | Stops the HTTP Vision Server |

Settings

| Setting | Default | Description |
| --- | --- | --- |
| localAgent.screenViewer.fps | 5 | Frames per second (1-10) |
| localAgent.screenViewer.quality | 70 | JPEG stream quality (1-100) |
| localAgent.screenViewer.scale | 0.5 | Image scale factor (0.1-1.0) |
| localAgent.visionServer.port | 7899 | Vision Server HTTP port |
| localAgent.visionServer.diffThreshold | 0.02 | Change detection threshold (0.001-0.5) |
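To get a feel for what the viewer settings mean in practice, the sketch below checks values against the documented ranges and works out the streamed frame size and frame interval. The class and function names are hypothetical, and the rounding used for the scaled size is an assumption; the extension may round differently.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ViewerSettings:
    """Hypothetical mirror of the localAgent.screenViewer.* settings."""
    fps: int = 5
    quality: int = 70
    scale: float = 0.5

    def validate(self) -> None:
        # Ranges are taken from the Settings table above.
        if not 1 <= self.fps <= 10:
            raise ValueError("fps must be between 1 and 10")
        if not 1 <= self.quality <= 100:
            raise ValueError("quality must be between 1 and 100")
        if not 0.1 <= self.scale <= 1.0:
            raise ValueError("scale must be between 0.1 and 1.0")

    def frame_interval_s(self) -> float:
        """Seconds between two captured frames at the configured rate."""
        return 1.0 / self.fps


def scaled_size(width: int, height: int, scale: float) -> Tuple[int, int]:
    """Approximate streamed frame size (rounding is an assumption)."""
    return max(1, round(width * scale)), max(1, round(height * scale))
```

With the defaults, a 1920x1080 display would be streamed at roughly 960x540, with one frame every 0.2 seconds.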

Vision Server API

GET /frame

Returns the latest screen frame as JPEG.

Response Headers:

  • X-Frame-Timestamp — Unix timestamp (ms) of the frame
  • X-Frame-Changed — "true" or "false" (change detection result)
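A client may want these two headers as typed values rather than raw strings. The following is a minimal sketch, assuming the timestamp header holds an integer number of milliseconds as described above; `FrameInfo` and `parse_frame_headers` are hypothetical names, not part of the extension.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass
class FrameInfo:
    captured_at: datetime  # timezone-aware, UTC
    changed: bool


def parse_frame_headers(headers: Mapping[str, str]) -> FrameInfo:
    """Convert the X-Frame-* response headers into typed values."""
    # X-Frame-Timestamp is documented as a Unix timestamp in milliseconds.
    ts_ms = int(headers["X-Frame-Timestamp"])
    # X-Frame-Changed is documented as the string "true" or "false".
    changed = headers["X-Frame-Changed"].strip().lower() == "true"
    captured_at = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return FrameInfo(captured_at=captured_at, changed=changed)
```

With httpx, `parse_frame_headers(r.headers)` should work directly because its header mapping is case-insensitive; a plain `dict` needs the exact header spelling shown here.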

GET /frame/changed

Returns the frame only if the screen changed since the last request. Returns 304 Not Modified if unchanged.
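An agent would typically call this endpoint in a loop and act only on 200 responses. The sketch below shows one possible polling loop; `poll_changed_frames` and `http_fetcher` are hypothetical names, and the standard library's `urllib` is used here instead of httpx purely to avoid a dependency. Note that `urllib` raises `HTTPError` for a 304, so the fetcher converts that back into a status code.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Iterator, Optional, Tuple

# A fetcher performs one GET /frame/changed and returns (status_code, body).
Fetcher = Callable[[], Tuple[int, bytes]]


def http_fetcher(url: str = "http://127.0.0.1:7899/frame/changed") -> Fetcher:
    """Build a fetcher for the default Vision Server address."""
    def fetch() -> Tuple[int, bytes]:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return resp.status, resp.read()
        except urllib.error.HTTPError as err:
            # urllib reports 304 Not Modified as an HTTPError.
            return err.code, b""
    return fetch


def poll_changed_frames(
    fetch: Fetcher,
    interval_s: float = 0.5,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield a frame each time the server reports a change (HTTP 200)."""
    polls = 0
    while max_polls is None or polls < max_polls:
        status, body = fetch()
        polls += 1
        if status == 200:
            yield body
        elif status != 304:
            raise RuntimeError(f"unexpected status {status}")
        sleep(interval_s)
```

A caller could iterate with `for frame in poll_changed_frames(http_fetcher()): ...`. Keeping the fetcher injectable also makes the loop easy to exercise without a running server.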

GET /frame/hd

Captures and returns a full-resolution PNG (on demand, not from the stream).

GET /status

Returns JSON with server and capture state:

{
  "server": "vision-server",
  "version": "0.2.0",
  "running": true,
  "capture": {
    "active": true,
    "backend": "GPU (DXGI)",
    "hasFrame": true,
    "frameCount": 1234,
    "changeDetected": true
  }
}
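Before requesting frames, an agent can use this payload to check that capture is actually running. The sketch below parses the fields shown in the example response; the `CaptureStatus` class, `parse_status`, and the readiness rule in `is_ready` are assumptions for illustration, not behavior defined by the extension.

```python
import json
from dataclasses import dataclass


@dataclass
class CaptureStatus:
    running: bool
    active: bool
    backend: str
    has_frame: bool
    frame_count: int


def parse_status(payload: str) -> CaptureStatus:
    """Parse the JSON body returned by GET /status."""
    data = json.loads(payload)
    capture = data.get("capture", {})
    return CaptureStatus(
        running=bool(data.get("running", False)),
        active=bool(capture.get("active", False)),
        backend=str(capture.get("backend", "unknown")),
        has_frame=bool(capture.get("hasFrame", False)),
        frame_count=int(capture.get("frameCount", 0)),
    )


def is_ready(status: CaptureStatus) -> bool:
    """Assumed readiness rule: server up, capture active, one frame available."""
    return status.running and status.active and status.has_frame
```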

POST /capture

Forces an immediate capture and returns the frame as JPEG.
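Unlike the other endpoints, this one uses POST, which is easy to get wrong from a script. A minimal sketch using only the standard library follows; the function names are hypothetical, and the empty request body is an assumption based on the documentation above listing no parameters.

```python
import urllib.request

BASE_URL = "http://127.0.0.1:7899"  # default port from the Settings table


def build_capture_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for the /capture endpoint."""
    # Assumption: no request body is needed, so an empty payload is sent.
    return urllib.request.Request(f"{base_url}/capture", data=b"", method="POST")


def capture_now(base_url: str = BASE_URL, timeout: float = 5.0) -> bytes:
    """Force a capture and return the JPEG bytes from the response."""
    request = build_capture_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read()
```

Calling `capture_now()` requires the Vision Server to be running; the returned bytes can be written straight to a `.jpg` file.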

Architecture

+-------------------------------------------+
|  VS Code Extension                        |
|  +----------------+  +----------------+   |
|  | ScreenCapture  |->| ChangeDetector |   |
|  | (DXGI / PS)    |  | (pixel sample) |   |
|  +----------------+  +-------+--------+   |
|                              |            |
|  +---------------------------v---------+  |
|  | VisionServer :7899                  |  |
|  | GET /frame | /frame/changed | /hd   |  |
|  | GET /status | POST /capture         |  |
|  +---------------------------+---------+  |
+------------------------------|------------+
                               | HTTP
           +-------------------v-------------------+
           |  Any AI Agent (Python, Node, etc.)    |
           |  GET http://127.0.0.1:7899/frame      |
           +---------------------------------------+
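The diagram labels the ChangeDetector as "pixel sample" but does not spell out the algorithm. The sketch below shows one common way such a detector can work: compare every Nth byte of two raw frame buffers and report a change when the differing fraction exceeds a threshold. The function names, the sampling step, the per-value tolerance, and the interpretation of the threshold as a fraction of samples are all assumptions, not the extension's actual implementation.

```python
def changed_fraction(prev: bytes, curr: bytes, step: int = 16, tolerance: int = 8) -> float:
    """Fraction of sampled positions whose values differ by more than tolerance."""
    # A size change (for example, a resolution switch) counts as a full change.
    if len(prev) != len(curr):
        return 1.0
    sampled = range(0, len(curr), step)
    if not sampled:
        return 0.0
    differing = sum(1 for i in sampled if abs(prev[i] - curr[i]) > tolerance)
    return differing / len(sampled)


def has_changed(prev: bytes, curr: bytes, threshold: float = 0.02) -> bool:
    """Report a change when the sampled difference exceeds the threshold."""
    return changed_fraction(prev, curr) > threshold
```

Sampling keeps the comparison cheap relative to a full-frame diff, at the cost of possibly missing very small changes that fall between sampled positions.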

Capture Backends

| Backend | Technology | Performance |
| --- | --- | --- |
| GPU (primary) | DXGI Desktop Duplication via node-screenshots | Fastest, minimal CPU |
| PowerShell (fallback) | System.Drawing screen capture | Works on all Windows |

Image resizing and JPEG encoding use sharp for optimal performance.

Requirements

  • Windows 10/11 (DXGI + PowerShell capture are Windows-specific)
  • VS Code 1.85+

Known Limitations

  • Windows only — DXGI and PowerShell capture are Windows-specific APIs
  • Primary monitor — captures the primary display only
  • Vision Server — binds to 127.0.0.1 (localhost only, not exposed to network)

Contributing

Contributions are welcome! Please open an issue or pull request on GitHub.

License

MIT - Anderson Belem (Otimiza.pro)
