English | 日本語 | 中文 | Español | Français | हिन्दी | Italiano | Português

WebSketch
Turn any web page into a structured tree that LLMs can actually understand.
The Problem ·
The Solution ·
Quick Start ·
How It Reads ·
Use Cases ·
Ecosystem
The Problem
You want an LLM to reason about a web page — its layout, navigation, interactive elements, content hierarchy. Your current options all fall short:
| Approach |
Tokens |
What you lose |
| Screenshot |
1,000+ (vision) |
Can't read small text, guesses layout, zero interactivity info |
| Raw HTML |
50,000–500,000 |
Drowns in div soup, inline styles, scripts, SVG noise |
| Readability extract |
2,000–10,000 |
Strips all structure — no nav, no buttons, no forms |
| DOM dump |
10,000–100,000 |
Class names, data attributes, framework artifacts everywhere |
The core issue: none of these speak the language of UI. LLMs don't need <div class="sc-bdnxRM jJFqsI"> — they need NAV, BUTTON, CARD, LIST.
The Solution
WebSketch captures the page into a semantic tree using 23 UI primitives. One click in VS Code, paste into any LLM:
PAGE
├─ HEADER {sticky}
│ ├─ *LINK "Home"
│ ├─ *LINK "Products"
│ ├─ *LINK "Pricing"
│ └─ *INPUT <search> "Search..."
├─ SECTION <main>
│ ├─ TEXT <h1> "Welcome to Acme"
│ ├─ TEXT "Build faster with our platform..."
│ └─ *BUTTON "Get Started"
├─ LIST (3 items)
│ ├─ CARD
│ │ ├─ IMAGE "Feature icon"
│ │ └─ TEXT "Real-time sync across devices"
│ ├─ CARD
│ │ ├─ IMAGE "Feature icon"
│ │ └─ TEXT "99.9% uptime guarantee"
│ └─ CARD
│ ├─ IMAGE "Feature icon"
│ └─ TEXT "Enterprise-grade security"
└─ FOOTER
├─ NAV
│ ├─ *LINK "Terms"
│ ├─ *LINK "Privacy"
│ └─ *LINK "Contact"
└─ TEXT "© 2026 Acme Inc."
200–800 tokens. Not 50,000. Not a pixel grid. A clean tree that any text model can reason about.
Head-to-Head
| Metric |
WebSketch |
Raw HTML |
Screenshot |
| Tokens |
200–800 |
50,000+ |
1,000+ (vision) |
| Structure |
Full semantic tree |
Nested div chaos |
Pixel grid |
| Text content |
Quoted, labeled |
Buried in markup |
OCR-dependent |
| Interactive elements |
Marked with * |
Hidden in attributes |
Invisible |
| Heading hierarchy |
<h1> through <h6> |
Lost in class names |
Guessed from size |
| Landmarks |
<main>, <nav>, <search> |
Requires DOM expertise |
Not available |
| Works with |
Any text LLM |
Nothing useful |
Vision models only |
Quick Start
- Install from the VS Code Marketplace or run:
ext install mcp-tool-shop.websketch-vscode
- Open the Command Palette (
Ctrl+Shift+P / Cmd+Shift+P)
- Run WebSketch: Capture URL
- Paste a URL and hit Enter
- Click Copy for LLM and paste into your prompt
The LLM tab is the default view. One click copies the tree to your clipboard, ready for any model.
How It Reads
Every line in the tree is information-dense and machine-parseable:
├─ *BUTTON <search> {sticky} "Find products"
│ │ │ │ │
│ │ │ │ └─ Actual visible text
│ │ │ └─ Flags (sticky, scrollable)
│ │ └─ Semantic hint (search, main, h1, aside...)
│ └─ Interactive (clickable/typeable)
│
└─ Tree structure shows parent-child nesting
The Grammar
| Symbol |
Meaning |
Example |
* prefix |
User can interact with it |
*LINK, *BUTTON, *INPUT |
<semantic> |
HTML5/ARIA meaning preserved |
<h1>, <main>, <search>, <aside> |
{flags} |
Layout behavior |
{sticky}, {scrollable} |
"label" |
Visible text content |
"Sign up free", "Search..." |
(N items) |
List item count |
LIST (12 items) |
| Indentation |
Parent-child hierarchy |
HEADER > NAV > LIST > LINK |
The 23 Roles
| Category |
Roles |
| Layout |
PAGE, HEADER, FOOTER, SECTION, NAV |
| Content |
TEXT, IMAGE, ICON, CARD, LIST, TABLE |
| Interactive |
BUTTON, LINK, INPUT, CHECKBOX, RADIO, FORM |
| Overlays |
MODAL, TOAST, DROPDOWN |
| Navigation |
PAGINATION |
| Fallback |
UNKNOWN |
These 23 roles are a fixed vocabulary — the same across every website. LLMs learn them once and can reason about any page.
What Gets Captured
WebSketch doesn't just dump the DOM. It runs a 5-tier classifier on every visible element:
| Tier |
Source |
Example |
| 1. ARIA role |
role="navigation" |
→ NAV |
| 2. HTML tag |
<button>, <h1> |
→ BUTTON, TEXT <h1> |
| 3. Class heuristics |
.card, .modal, .toast |
→ CARD, MODAL, TOAST |
| 4. Structural analysis |
3+ same-role siblings |
→ LIST |
| 5. Fallback |
Text-only elements |
→ TEXT, SECTION, UNKNOWN |
Then it cleans up:
- Transparent table traversal —
TR/TD/TH/LI are skipped, children promoted to the surface
- Zero-content pruning — empty, non-interactive, invisible nodes dropped
- Wrapper collapsing — meaningless single-child
SECTION wrappers removed
- Cascading prune — hollow wrapper chains with no content are eliminated entirely
- Label extraction — visible text pulled from links, buttons, headings, images, inputs
The result: a clean tree with the minimum nodes needed to understand the page.
Use Cases
For Prompt Engineers
"Describe this page's layout" — Paste the tree. The LLM sees exact structure, headings, and navigation without drowning in HTML. Works with ChatGPT, Claude, Gemini, Llama — any text model.
"What can a user do on this page?" — Every interactive element is marked with *. Links, buttons, inputs, checkboxes — all labeled with their visible text. The LLM can enumerate every possible user action.
"Compare these two pages" — Two trees side by side. The LLM can diff structure, spot missing elements, compare navigation patterns — all in a few hundred tokens.
For Developers
"Generate a test plan for this UI" — The tree maps directly to test targets. *BUTTON "Submit", *INPUT <email> "Enter email", *LINK "Terms" — each is a testable interaction with its visible label.
"Build something that looks like this" — The semantic tree is close to a component hierarchy. HEADER > NAV > LIST > LINK maps directly to React/Vue/Svelte components. The LLM can scaffold a matching layout.
"Audit this page for accessibility" — Missing landmarks, unlabeled inputs, heading hierarchy gaps — all visible in the tree. Semantic hints like <main>, <nav>, <search> show what ARIA roles are present (or absent).
For AI Agents
MCP integration — Use websketch-mcp to give your AI agent the ability to capture and reason about any web page as part of its tool chain.
Automated monitoring — Use websketch-cli to capture pages on a schedule, diff the trees, and detect structural changes.
Four Views
| Tab |
What it shows |
Best for |
| LLM (default) |
Indented semantic tree with labels, semantics, flags |
Pasting into LLM prompts |
| ASCII |
Box-drawing wireframe with spatial layout |
Visual layout understanding |
| Tree |
Collapsible node tree with color-coded role badges |
Debugging captures |
| JSON |
Full WebSketchCapture IR with syntax highlighting |
Programmatic use and pipelines |
Commands
| Command |
Description |
WebSketch: Capture URL |
Prompt for URL, capture, and display |
WebSketch: Capture URL from Clipboard |
Capture whatever URL is on your clipboard |
WebSketch: Copy LLM Tree to Clipboard |
Copy the tree — paste straight into ChatGPT, Claude, etc. |
WebSketch: Export LLM Tree |
Save as .md for prompt libraries or docs |
WebSketch: Export Capture as JSON |
Full IR capture with bboxes, hashes, metadata |
WebSketch: Export ASCII Wireframe |
Box-drawing layout view |
Settings
| Setting |
Default |
Description |
websketch.chromePath |
Auto-detect |
Path to Chrome or Edge executable |
websketch.viewportWidth |
1280 |
Viewport width in pixels |
websketch.viewportHeight |
800 |
Viewport height in pixels |
websketch.timeout |
30000 |
Navigation timeout (ms) |
websketch.waitAfterLoad |
1000 |
Extra wait for JS rendering (ms) |
Ecosystem
WebSketch is a family of tools built on a shared grammar:
| Package |
What it does |
| @mcptoolshop/websketch-ir |
Core IR — grammar, validation, rendering, diffing, fingerprinting |
| websketch-vscode |
VS Code extension — capture pages from your editor (this repo) |
| websketch-cli |
Command-line capture and rendering |
| websketch-extension |
Chrome extension for in-browser capture |
| websketch-mcp |
MCP server for LLM agent integration |
All tools produce the same WebSketchCapture IR, so outputs are interchangeable between pipelines.
Requirements
- VS Code 1.85+
- Chrome or Edge installed on your system
No bundled browser. No 200MB download. WebSketch uses puppeteer-core with whatever browser you already have.
License
MIT License — see LICENSE for details.
Part of MCP Tool Shop