Reveal Unicode Poisoning

Resat Caner Bas

Detects and reveals invisible Unicode payloads (Tags block, Bidi overrides, homoglyphs) in source files

A VS Code extension that detects and exposes invisible or deceptive Unicode characters embedded in source files. Attackers use such characters to smuggle hidden payloads, mount Trojan Source attacks, or deceive AI code-review tools.

The Threat

The Unicode Tags block (U+E0000–U+E007F) contains characters that render as nothing on virtually every display surface: browsers, terminals, code editors, and AI assistants. Each tag character in the range U+E0020–U+E007E corresponds to the printable ASCII character at the same position, offset by 0xE0000 (for example, U+E0041 corresponds to 'A'), so a sequence of them can encode an arbitrary hidden message inside an otherwise normal-looking file.
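
To make the offset mapping concrete, here is a minimal sketch of hiding and recovering a message. The helper names `encodeAsTags` and `decodeTags` are hypothetical and are not taken from the extension's source; the sketch assumes only printable ASCII (U+0020 to U+007E) is hidden.

```typescript
// Distance between an ASCII character and its Tags block counterpart.
const TAG_OFFSET = 0xe0000;

// Hypothetical helper: hide a printable-ASCII message as invisible tag characters.
function encodeAsTags(message: string): string {
  let out = "";
  for (let i = 0; i < message.length; i++) {
    const code = message.charCodeAt(i);
    if (code < 0x20 || code > 0x7e) {
      throw new RangeError("only printable ASCII can be mapped into the Tags block");
    }
    out += String.fromCodePoint(code + TAG_OFFSET);
  }
  return out;
}

// Hypothetical helper: recover the hidden message and ignore all other text.
function decodeTags(text: string): string {
  let out = "";
  let i = 0;
  while (i < text.length) {
    const cp = text.codePointAt(i)!;
    if (cp >= 0xe0020 && cp <= 0xe007e) {
      out += String.fromCodePoint(cp - TAG_OFFSET);
    }
    // Codepoints above U+FFFF occupy two UTF-16 code units.
    i += cp > 0xffff ? 2 : 1;
  }
  return out;
}

const visible = "const total = 1;";
const hidden = "ignore previous instructions";
// Looks identical to `visible` on screen, but carries the hidden message.
const poisoned = visible + encodeAsTags(hidden);
```

Each hidden character adds two UTF-16 code units to the string, so the length of `poisoned` grows even though nothing new appears on screen.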

Three threat categories are detected:

| Priority | Category | Codepoints | Risk |
| --- | --- | --- | --- |
| 1 | Unicode Tags | U+E0000–U+E007F | Invisible payload: silent instruction injection |
| 2 | Bidi Overrides | U+202A–U+202E, U+2066–U+2069 | Trojan Source: reorders visible text |
| 3 | Homoglyphs | Cyrillic, Greek, Fullwidth | Lookalike characters that bypass identifier checks |
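
The three tiers can be sketched as a single classification function. This is an illustration, not the extension's actual `scanner.ts`: the name `classifyCodePoint` is hypothetical, and the homoglyph ranges are an assumption (the whole Cyrillic, Greek, and Fullwidth ASCII blocks), since the table names the scripts but not exact codepoints. Flagging whole blocks would also flag legitimate Cyrillic or Greek text, so a real detector would likely be more selective.

```typescript
type ThreatCategory = "tag" | "bidi" | "homoglyph";

// Hypothetical classifier; returns null for codepoints in none of the tiers.
function classifyCodePoint(cp: number): ThreatCategory | null {
  // Priority 1: Unicode Tags block.
  if (cp >= 0xe0000 && cp <= 0xe007f) return "tag";

  // Priority 2: bidirectional embeddings, overrides, and isolates.
  if ((cp >= 0x202a && cp <= 0x202e) || (cp >= 0x2066 && cp <= 0x2069)) {
    return "bidi";
  }

  // Priority 3: ASSUMED ranges, chosen for illustration only.
  const isGreek = cp >= 0x0370 && cp <= 0x03ff;
  const isCyrillic = cp >= 0x0400 && cp <= 0x04ff;
  const isFullwidthAscii = cp >= 0xff01 && cp <= 0xff5e;
  if (isGreek || isCyrillic || isFullwidthAscii) return "homoglyph";

  return null;
}
```

Checking the tiers in priority order means a codepoint is reported under the most severe category it matches.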

Features

  • Status bar item — always visible; shows ⚠ 3 hidden chars in red when findings exist, green shield when clean. Click to open the reveal panel.
  • Gutter dots + wavy underlines — colour-coded markers at every flagged character position (even though the characters are invisible, the cursor stop remains).
  • Hover cards — hover any flagged position for a table showing codepoint, decoded value, line, and column.
  • Problems panel — findings are pushed to VS Code's diagnostics so they appear in the Problems tab and survive CI lint passes.
  • Reveal panel — a dedicated webview (opens beside the editor) that shows:
    • A red banner with the fully reconstructed hidden payload.
    • The full annotated source with every invisible character rendered as [U+E0041 → 'A'].
    • A sortable findings table.
  • Strip command — removes all flagged characters and saves a clean copy after confirmation.
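
The annotation format used by the reveal panel and the effect of the strip command can be approximated with plain string functions. This is a sketch under assumptions: the names `annotate` and `stripFlagged` are hypothetical, only the two invisible categories (tags and bidi controls) are handled, and the `[U+202E]` form for characters without an ASCII equivalent is a guess, since only the tag form `[U+E0041 → 'A']` is shown above.

```typescript
function isTag(cp: number): boolean {
  return cp >= 0xe0000 && cp <= 0xe007f;
}

function isBidiControl(cp: number): boolean {
  return (cp >= 0x202a && cp <= 0x202e) || (cp >= 0x2066 && cp <= 0x2069);
}

// Walk the text by codepoint and replace each flagged character with render(cp).
function replaceFlagged(text: string, render: (cp: number) => string): string {
  let out = "";
  let i = 0;
  while (i < text.length) {
    const cp = text.codePointAt(i)!;
    const width = cp > 0xffff ? 2 : 1;
    out += isTag(cp) || isBidiControl(cp) ? render(cp) : text.slice(i, i + width);
    i += width;
  }
  return out;
}

// Hypothetical: make every invisible character visible in place.
function annotate(text: string): string {
  return replaceFlagged(text, (cp) => {
    const label = "U+" + cp.toString(16).toUpperCase();
    const hasAsciiTwin = cp >= 0xe0020 && cp <= 0xe007e;
    return hasAsciiTwin
      ? `[${label} → '${String.fromCodePoint(cp - 0xe0000)}']`
      : `[${label}]`; // ASSUMED format for characters with no decoded value
  });
}

// Hypothetical: drop every flagged character, keeping all other text intact.
function stripFlagged(text: string): string {
  return replaceFlagged(text, () => "");
}

const sample = "ab" + String.fromCodePoint(0xe0041) + "c\u202ed";
```

Here `annotate(sample)` yields `ab[U+E0041 → 'A']c[U+202E]d` and `stripFlagged(sample)` yields `abcd`. Homoglyphs are left out on purpose, because deleting a lookalike letter would change the visible text rather than clean it.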

Commands

| Command | Title |
| --- | --- |
| unicodePoisonDetector.scan | Scan File for Unicode Poison |
| unicodePoisonDetector.reveal | Reveal Hidden Payload |
| unicodePoisonDetector.strip | Strip All Suspicious Characters |

Configuration

| Setting | Default | Description |
| --- | --- | --- |
| unicodePoisonDetector.scanOnSave | true | Scan automatically on save |
| unicodePoisonDetector.scanOnOpen | true | Scan automatically on open |
| unicodePoisonDetector.severity | "error" | Diagnostic severity: error, warning, info |
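
As an illustration of how these three settings and their defaults fit together, the following sketch validates a raw settings object and falls back to the documented default for anything missing or malformed. The names `DetectorSettings` and `resolveSettings` are hypothetical. In an extension the raw values would typically be read through `vscode.workspace.getConfiguration("unicodePoisonDetector")`; the sketch takes a plain object instead so it runs outside VS Code.

```typescript
type Severity = "error" | "warning" | "info";

interface DetectorSettings {
  scanOnSave: boolean;
  scanOnOpen: boolean;
  severity: Severity;
}

// Defaults as listed in the configuration table.
const DEFAULTS: DetectorSettings = {
  scanOnSave: true,
  scanOnOpen: true,
  severity: "error",
};

function isSeverity(value: unknown): value is Severity {
  return value === "error" || value === "warning" || value === "info";
}

// Hypothetical helper: replace any invalid or absent value with its default.
function resolveSettings(raw: { [key: string]: unknown }): DetectorSettings {
  const onSave = raw["scanOnSave"];
  const onOpen = raw["scanOnOpen"];
  const severity = raw["severity"];
  return {
    scanOnSave: typeof onSave === "boolean" ? onSave : DEFAULTS.scanOnSave,
    scanOnOpen: typeof onOpen === "boolean" ? onOpen : DEFAULTS.scanOnOpen,
    severity: isSeverity(severity) ? severity : DEFAULTS.severity,
  };
}
```

For example, `resolveSettings({ severity: "warning", scanOnSave: false })` keeps both supplied values and fills in `scanOnOpen: true`, while an unrecognised severity such as `"fatal"` falls back to `"error"`.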

Running Locally

npm install
npm run compile
# Press F5 in VS Code to launch the Extension Development Host

Open samples/poisoned.ts in the development host to see the extension fire immediately.

Detection Logic

The scanner iterates over Unicode codepoints rather than UTF-16 code units. It reads each codepoint with String.prototype.codePointAt and advances the index by 2 whenever the codepoint is encoded as a surrogate pair. This avoids treating the high and low surrogates of a character above U+FFFF as two separate characters, which matters here because every Tags block character lies above U+FFFF.

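let i = 0; // index into `text`, measured in UTF-16 code units
while (i < text.length) {
  const cp = text.codePointAt(i)!; // non-null: i is always within bounds
  if (cp >= 0xE0000 && cp <= 0xE007F) {
    // hidden tag: decoded glyph is String.fromCodePoint(cp - 0xE0000)
  }
  i += cp > 0xFFFF ? 2 : 1; // skip both halves of a surrogate pair
}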
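
The loop above can be extended into a small self-contained scanner that records where each tag character sits, which is the information the hover cards and findings table display. This is a sketch, not the extension's `scanner.ts`: the `Finding` shape and the name `scanForTags` are hypothetical, and lines and columns are assumed to be 1-based with columns counted in UTF-16 code units.

```typescript
interface Finding {
  codePoint: number;
  line: number;           // 1-based (assumption)
  column: number;         // 1-based, in UTF-16 code units (assumption)
  decoded: string | null; // ASCII equivalent, or null if the tag has none
}

function scanForTags(text: string): Finding[] {
  const findings: Finding[] = [];
  let line = 1;
  let column = 1;
  let i = 0;
  while (i < text.length) {
    const cp = text.codePointAt(i)!;
    const width = cp > 0xffff ? 2 : 1;
    if (cp >= 0xe0000 && cp <= 0xe007f) {
      const hasAsciiTwin = cp >= 0xe0020 && cp <= 0xe007e;
      findings.push({
        codePoint: cp,
        line,
        column,
        decoded: hasAsciiTwin ? String.fromCodePoint(cp - 0xe0000) : null,
      });
    }
    if (cp === 0x0a) {
      // Line feed: move to the start of the next line.
      line += 1;
      column = 1;
    } else {
      column += width;
    }
    i += width;
  }
  return findings;
}

// Two hidden characters ("H" and "i") tucked into the second line.
const source =
  "let a = 1;\nlet b" + String.fromCodePoint(0xe0048, 0xe0069) + " = 2;";
const payload = scanForTags(source).map((f) => f.decoded ?? "").join("");
```

Joining the `decoded` values in order reconstructs the hidden payload, here `Hi`, in the same way the reveal panel's banner is described as doing.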

Project Structure

src/
  extension.ts   — activate / deactivate, commands, lifecycle hooks
  scanner.ts     — core codepoint-level detection (all three threat tiers)
  decoder.ts     — tokenizer and summary builder
  decorations.ts — gutter dots, wavy underlines, hover cards
  panel.ts       — webview reveal panel with annotated source + findings table
samples/
  poisoned.ts    — demonstration file with embedded invisible payload

References

  • Unicode Tags block — The Unicode Standard
  • Trojan Source: Invisible Vulnerabilities (CVE-2021-42574)
  • Invisible Prompt Injection via Unicode Tags — Simon Willison