Skip to content
| Marketplace
Sign in
Visual Studio Code>Visualization>WebSketchNew to Visual Studio Code? Get it now.
WebSketch

WebSketch

mcp-tool-shop

|
2 installs
| (0) | Free
Capture web pages as grammar-based IR for LLM consumption
Installation
Launch VS Code Quick Open (Ctrl+P), paste the following command, and press enter.
Copied to clipboard
More Info

English | 日本語 | 中文 | Español | Français | हिन्दी | Italiano | Português

WebSketch

WebSketch

Turn any web page into a structured tree that LLMs can actually understand.

CI Marketplace License: MIT

The Problem · The Solution · Quick Start · How It Reads · Use Cases · Ecosystem


The Problem

You want an LLM to reason about a web page — its layout, navigation, interactive elements, content hierarchy. Your current options all fall short:

Approach Tokens What you lose
Screenshot 1,000+ (vision) Can't read small text, guesses layout, zero interactivity info
Raw HTML 50,000–500,000 Drowns in div soup, inline styles, scripts, SVG noise
Readability extract 2,000–10,000 Strips all structure — no nav, no buttons, no forms
DOM dump 10,000–100,000 Class names, data attributes, framework artifacts everywhere

The core issue: none of these speak the language of UI. LLMs don't need <div class="sc-bdnxRM jJFqsI"> — they need NAV, BUTTON, CARD, LIST.

The Solution

WebSketch captures the page into a semantic tree using 23 UI primitives. One click in VS Code, paste into any LLM:

PAGE
├─ HEADER {sticky}
│  ├─ *LINK "Home"
│  ├─ *LINK "Products"
│  ├─ *LINK "Pricing"
│  └─ *INPUT <search> "Search..."
├─ SECTION <main>
│  ├─ TEXT <h1> "Welcome to Acme"
│  ├─ TEXT "Build faster with our platform..."
│  └─ *BUTTON "Get Started"
├─ LIST (3 items)
│  ├─ CARD
│  │  ├─ IMAGE "Feature icon"
│  │  └─ TEXT "Real-time sync across devices"
│  ├─ CARD
│  │  ├─ IMAGE "Feature icon"
│  │  └─ TEXT "99.9% uptime guarantee"
│  └─ CARD
│     ├─ IMAGE "Feature icon"
│     └─ TEXT "Enterprise-grade security"
└─ FOOTER
   ├─ NAV
   │  ├─ *LINK "Terms"
   │  ├─ *LINK "Privacy"
   │  └─ *LINK "Contact"
   └─ TEXT "© 2026 Acme Inc."

200–800 tokens. Not 50,000. Not a pixel grid. A clean tree that any text model can reason about.

Head-to-Head

Metric WebSketch Raw HTML Screenshot
Tokens 200–800 50,000+ 1,000+ (vision)
Structure Full semantic tree Nested div chaos Pixel grid
Text content Quoted, labeled Buried in markup OCR-dependent
Interactive elements Marked with * Hidden in attributes Invisible
Heading hierarchy <h1> through <h6> Lost in class names Guessed from size
Landmarks <main>, <nav>, <search> Requires DOM expertise Not available
Works with Any text LLM Nothing useful Vision models only

Quick Start

  1. Install from the VS Code Marketplace or run:
    ext install mcp-tool-shop.websketch-vscode
    
  2. Open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P)
  3. Run WebSketch: Capture URL
  4. Paste a URL and hit Enter
  5. Click Copy for LLM and paste into your prompt

The LLM tab is the default view. One click copies the tree to your clipboard, ready for any model.

How It Reads

Every line in the tree is information-dense and machine-parseable:

├─ *BUTTON <search> {sticky} "Find products"
│    │       │         │         │
│    │       │         │         └─ Actual visible text
│    │       │         └─ Flags (sticky, scrollable)
│    │       └─ Semantic hint (search, main, h1, aside...)
│    └─ Interactive (clickable/typeable)
│
└─ Tree structure shows parent-child nesting

The Grammar

Symbol Meaning Example
* prefix User can interact with it *LINK, *BUTTON, *INPUT
<semantic> HTML5/ARIA meaning preserved <h1>, <main>, <search>, <aside>
{flags} Layout behavior {sticky}, {scrollable}
"label" Visible text content "Sign up free", "Search..."
(N items) List item count LIST (12 items)
Indentation Parent-child hierarchy HEADER > NAV > LIST > LINK

The 23 Roles

Category Roles
Layout PAGE, HEADER, FOOTER, SECTION, NAV
Content TEXT, IMAGE, ICON, CARD, LIST, TABLE
Interactive BUTTON, LINK, INPUT, CHECKBOX, RADIO, FORM
Overlays MODAL, TOAST, DROPDOWN
Navigation PAGINATION
Fallback UNKNOWN

These 23 roles are a fixed vocabulary — the same across every website. LLMs learn them once and can reason about any page.

What Gets Captured

WebSketch doesn't just dump the DOM. It runs a 5-tier classifier on every visible element:

Tier Source Example
1. ARIA role role="navigation" → NAV
2. HTML tag <button>, <h1> → BUTTON, TEXT <h1>
3. Class heuristics .card, .modal, .toast → CARD, MODAL, TOAST
4. Structural analysis 3+ same-role siblings → LIST
5. Fallback Text-only elements → TEXT, SECTION, UNKNOWN

Then it cleans up:

  • Transparent table traversal — TR/TD/TH/LI are skipped, children promoted to the surface
  • Zero-content pruning — empty, non-interactive, invisible nodes dropped
  • Wrapper collapsing — meaningless single-child SECTION wrappers removed
  • Cascading prune — hollow wrapper chains with no content are eliminated entirely
  • Label extraction — visible text pulled from links, buttons, headings, images, inputs

The result: a clean tree with the minimum nodes needed to understand the page.

Use Cases

For Prompt Engineers

"Describe this page's layout" — Paste the tree. The LLM sees exact structure, headings, and navigation without drowning in HTML. Works with ChatGPT, Claude, Gemini, Llama — any text model.

"What can a user do on this page?" — Every interactive element is marked with *. Links, buttons, inputs, checkboxes — all labeled with their visible text. The LLM can enumerate every possible user action.

"Compare these two pages" — Two trees side by side. The LLM can diff structure, spot missing elements, compare navigation patterns — all in a few hundred tokens.

For Developers

"Generate a test plan for this UI" — The tree maps directly to test targets. *BUTTON "Submit", *INPUT <email> "Enter email", *LINK "Terms" — each is a testable interaction with its visible label.

"Build something that looks like this" — The semantic tree is close to a component hierarchy. HEADER > NAV > LIST > LINK maps directly to React/Vue/Svelte components. The LLM can scaffold a matching layout.

"Audit this page for accessibility" — Missing landmarks, unlabeled inputs, heading hierarchy gaps — all visible in the tree. Semantic hints like <main>, <nav>, <search> show what ARIA roles are present (or absent).

For AI Agents

MCP integration — Use websketch-mcp to give your AI agent the ability to capture and reason about any web page as part of its tool chain.

Automated monitoring — Use websketch-cli to capture pages on a schedule, diff the trees, and detect structural changes.

Four Views

Tab What it shows Best for
LLM (default) Indented semantic tree with labels, semantics, flags Pasting into LLM prompts
ASCII Box-drawing wireframe with spatial layout Visual layout understanding
Tree Collapsible node tree with color-coded role badges Debugging captures
JSON Full WebSketchCapture IR with syntax highlighting Programmatic use and pipelines

Commands

Command Description
WebSketch: Capture URL Prompt for URL, capture, and display
WebSketch: Capture URL from Clipboard Capture whatever URL is on your clipboard
WebSketch: Copy LLM Tree to Clipboard Copy the tree — paste straight into ChatGPT, Claude, etc.
WebSketch: Export LLM Tree Save as .md for prompt libraries or docs
WebSketch: Export Capture as JSON Full IR capture with bboxes, hashes, metadata
WebSketch: Export ASCII Wireframe Box-drawing layout view

Settings

Setting Default Description
websketch.chromePath Auto-detect Path to Chrome or Edge executable
websketch.viewportWidth 1280 Viewport width in pixels
websketch.viewportHeight 800 Viewport height in pixels
websketch.timeout 30000 Navigation timeout (ms)
websketch.waitAfterLoad 1000 Extra wait for JS rendering (ms)

Ecosystem

WebSketch is a family of tools built on a shared grammar:

Package What it does
@mcptoolshop/websketch-ir Core IR — grammar, validation, rendering, diffing, fingerprinting
websketch-vscode VS Code extension — capture pages from your editor (this repo)
websketch-cli Command-line capture and rendering
websketch-extension Chrome extension for in-browser capture
websketch-mcp MCP server for LLM agent integration

All tools produce the same WebSketchCapture IR, so outputs are interchangeable between pipelines.

Requirements

  • VS Code 1.85+
  • Chrome or Edge installed on your system

No bundled browser. No 200MB download. WebSketch uses puppeteer-core with whatever browser you already have.

License

MIT License — see LICENSE for details.

Part of MCP Tool Shop

  • Contact us
  • Jobs
  • Privacy
  • Manage cookies
  • Terms of use
  • Trademarks
© 2026 Microsoft