TripleG3 Windows User

A Visual Studio Code extension that exposes confirmation-gated Windows desktop interaction tools to Copilot-compatible language model flows.

The extension contributes an @windowsUser chat participant and language model tools for:

Capturing all monitors or a selected monitor as an image.
Reading monitor layout and cursor position.
Moving and clicking the mouse.
Holding/releasing mouse buttons and dragging for sliders, selection, resizing, and drag-and-drop.
Pressing keyboard chords such as Ctrl+L, Tab, Enter, and F5.
Typing text into the focused application.
Reading or setting text clipboard contents when approved.
Opening browser/application URIs.
Listing Microsoft Edge profiles and opening Edge with a named profile such as Crystal.
Listing, focusing, maximizing, and capturing specific desktop windows.
Capturing low-cost visual context from a window, screen, or region as downscaled JPEG by default.
Scrolling browser pages, grids, and other scrollable panes with the mouse wheel.
Waiting for UI transitions.

The @windowsUser chat participant uses a monochrome VS Code-style icon at resources/windows-user.svg. It combines TripleG3 branding with a Windows monitor/control motif and is suitable for a future Activity Bar view if one is added.

Publisher

Published by TripleG3 (GitHub). TripleG3 describes its work as software architecture, consulting, and solutions, with public projects and showcases across .NET, C#, Azure, MAUI, Blazor, developer tooling, and Windows-focused automation libraries.

TripleG3's public site tagline is "Creative Engineers, supporting a modern world with innovative technology."

Important safety model

This extension can view and control the local Windows desktop. That is powerful and potentially risky, so it is intentionally designed with guardrails:

Tool calls require explicit user confirmation by default.
The extension is disabled in untrusted workspaces.
It does not expose arbitrary shell command execution.
It cannot bypass authentication, MFA, CAPTCHA, access controls, or site terms.
Screen captures may include private information; approve captures only when appropriate.
Text entry uses clipboard paste for reliability and restores the previous text clipboard by default.
Clipboard reads and writes require approval by default because clipboard contents may include secrets or private data.
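
The paste-and-restore behavior described above can be pictured with a minimal sketch. This is not the extension's actual implementation: the TextClipboard interface, the paste callback, and the typeViaPaste name are hypothetical, and the sketch is synchronous for brevity where real clipboard and input calls would likely be asynchronous.

```typescript
// Hypothetical clipboard abstraction; the extension's real internals are not shown in this README.
interface TextClipboard {
  read(): string;
  write(text: string): void;
}

// Place `text` on the clipboard, trigger a paste, then put the previous text back.
// The finally block restores the clipboard even if the paste step throws.
function typeViaPaste(
  clipboard: TextClipboard,
  paste: () => void, // assumed to send Ctrl+V to the focused application
  text: string,
  restore: boolean = true,
): void {
  const previous = clipboard.read();
  clipboard.write(text);
  try {
    paste();
  } finally {
    if (restore) {
      clipboard.write(previous);
    }
  }
}
```

Passing restore as false mirrors turning off windowsUser.restoreClipboardAfterType: the typed text is left on the clipboard.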

Privacy and data handling

TripleG3 Windows User is designed to run locally inside Visual Studio Code and to collect no information about you.

The extension itself does not collect, sell, or store user information.
The extension does not add custom telemetry or background data collection.
The extension does not proactively push screen captures, clipboard contents, window details, typed text, or other local data anywhere on its own.
The extension only responds to explicit VS Code extension APIs and language model tool calls that are available inside Visual Studio Code.
Copilot-compatible AI features can request these tools when you use the @windowsUser participant or related AI workflows. By default, tool calls require your confirmation before sensitive actions such as screen capture, clipboard access, browser launch, mouse input, keyboard input, or text entry.
If you approve a tool call, the requested result may be provided back to the active VS Code chat or AI experience. How that AI service processes, retains, or uses information after it receives the result is controlled by that AI service, its policies, and your account/configuration, not by this extension.

Review each confirmation carefully before approving it, especially when private information, secrets, customer data, regulated data, or other sensitive content is visible on screen or present in the clipboard.

Try it in VS Code

Open this folder in VS Code.
Run npm install.
Run npm run compile.
Press F5 to launch the Extension Development Host.
Run Windows User: Show Status from the Command Palette to verify the extension can activate and read monitor/cursor metadata.
Open Copilot Chat in the development host.
Ask the participant something like:

@windowsUser open https://example.com, capture the screen, and tell me what is visible

For JIRA workflows, set windowsUser.defaultJiraUrl in VS Code settings or include the JIRA URL in your prompt:

@windowsUser Open JIRA at https://your-company.atlassian.net, go to the current sprint board, filter to me, and summarize the visible issues.

The participant will use small steps: open URI, wait, capture screen, click/type when needed, then report back.

For browser-heavy workflows, the participant can prefer window-aware and low-cost visual actions over full-desktop guessing: list matching windows, focus the right browser/profile window, capture a downscaled JPEG of only that window or a region, scroll inside the relevant pane, and capture again.

For UI workflows that require regular pointer behavior, the participant can drag the mouse directly or use explicit mouse-button down/up actions. Prefer dragMouse for most drags because it always releases the button in a finally block.
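
To show why a drag that releases in a finally block is safer than separate down/up calls, here is a minimal sketch. The MouseDriver interface, dragWithRelease, and the recording driver are hypothetical names used only for illustration; they are not the extension's real API, and the sketch is synchronous for brevity.

```typescript
// Hypothetical input driver; the extension's real internals are not shown in this README.
type MouseButton = "left" | "right";

interface MouseDriver {
  moveTo(x: number, y: number): void;
  buttonDown(button: MouseButton): void;
  buttonUp(button: MouseButton): void;
}

interface Point {
  x: number;
  y: number;
}

// Press at `from`, move to `to`, and always release the button,
// even if the move in between throws.
function dragWithRelease(
  driver: MouseDriver,
  from: Point,
  to: Point,
  button: MouseButton = "left",
): void {
  driver.moveTo(from.x, from.y);
  driver.buttonDown(button);
  try {
    driver.moveTo(to.x, to.y);
  } finally {
    driver.buttonUp(button);
  }
}

// Recording fake driver for experimentation; `failOnMove` makes the Nth move throw.
function createRecordingDriver(failOnMove?: number): { driver: MouseDriver; log: string[] } {
  const log: string[] = [];
  let moves = 0;
  const driver: MouseDriver = {
    moveTo(x: number, y: number): void {
      moves += 1;
      if (moves === failOnMove) {
        throw new Error("simulated move failure");
      }
      log.push(`move ${x},${y}`);
    },
    buttonDown(button: MouseButton): void {
      log.push(`down ${button}`);
    },
    buttonUp(button: MouseButton): void {
      log.push(`up ${button}`);
    },
  };
  return { driver, log };
}
```

With explicit down/up actions, a failure between the two calls can leave the button held. The finally block removes that failure mode for the common case.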

Build and install Windows-targeted VSIX packages

This extension is published as Windows-targeted VSIX packages because the runtime tools depend on the local Windows desktop. The default package script builds both Windows x64 and Windows Arm64 artifacts.

npm install
npm run package

Install the generated package that matches your Windows architecture into VS Code:

code --install-extension .\tripleg3-windows-user-win32-x64-0.0.1.vsix

For Windows on Arm64, install tripleg3-windows-user-win32-arm64-0.0.1.vsix instead.

Reload VS Code, open Copilot Chat, and use @windowsUser.

The first Marketplace release is planned as a beta using VS Code Marketplace's Pre-Release channel. To build beta/pre-release VSIX artifacts locally, use npm run package:pre-release or the alias npm run package:beta.

For a quick non-Copilot smoke test, run Windows User: Show Status from the Command Palette. It should show a notification with the monitor count and cursor position, plus detailed JSON in the Windows User output channel.

To test named profile launching, run Windows User: Open Microsoft Edge Profile... from the Command Palette, choose a profile such as Crystal, and enter a URL.

Publish checklist

Before publishing to the Visual Studio Marketplace or an internal extension feed:

Confirm the publisher in package.json matches an existing VS Code Marketplace publisher.
Complete the Marketplace publisher and GitHub secret setup in docs/MARKETPLACE_PUBLISHING_SETUP.md.
Run npm run compile.
Run npm run package:beta and install the generated Windows-targeted .vsix locally.
Validate screen capture, window capture/focus, URI opening, keyboard, mouse, scrolling, and text-entry confirmations.
Review SECURITY.md and keep windowsUser.requireConfirmation enabled by default.
Update CHANGELOG.md for the release.
First release strategy: publish as a beta through the VS Code Marketplace Pre-Release channel. Keep preview: true and use npm run publish:pre-release or npm run publish:beta after authenticating vsce with a publisher token. The script publishes win32-x64 and win32-arm64 targets.
Do not publish a stable release until the beta/pre-release has been validated. When graduating to stable, update the changelog/version strategy and publish without --pre-release.
The GitHub Actions workflow builds and packages pull requests and, once the VSCE_PAT repository secret is configured, automatically publishes the beta on pushes to main.

Additional manual validation steps are in docs/TESTING.md.

Configuration

windowsUser.requireConfirmation: require confirmation before screen, mouse, keyboard, text, and URI actions. Default: true.
windowsUser.maxToolTurns: maximum tool-calling turns for one participant request. Default: 12.
windowsUser.defaultJiraUrl: optional JIRA URL for prompts that say "open JIRA" without a URL.
windowsUser.restoreClipboardAfterType: restore prior text clipboard after typing. Default: true.
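
As a quick reference, the settings and defaults listed above can be summarized in a typed sketch. The WindowsUserSettings interface and resolveSettings helper are hypothetical and only illustrate how user overrides would combine with the documented defaults; the extension itself reads these values through VS Code's configuration system.

```typescript
// Hypothetical shape mirroring the documented windowsUser.* settings.
interface WindowsUserSettings {
  requireConfirmation: boolean;
  maxToolTurns: number;
  defaultJiraUrl?: string; // optional; no default is documented
  restoreClipboardAfterType: boolean;
}

// Defaults as stated in this README.
const DEFAULT_SETTINGS: WindowsUserSettings = {
  requireConfirmation: true,
  maxToolTurns: 12,
  restoreClipboardAfterType: true,
};

// Overlay user-provided values on top of the documented defaults.
function resolveSettings(overrides: Partial<WindowsUserSettings>): WindowsUserSettings {
  return { ...DEFAULT_SETTINGS, ...overrides };
}

// Example: raise the turn limit and set a JIRA URL, keeping confirmation enabled.
const exampleSettings = resolveSettings({
  maxToolTurns: 20,
  defaultJiraUrl: "https://your-company.atlassian.net",
});
```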

Visual context strategy

Use captureVisualContext for live visual feedback loops. Its defaults and options keep captures cheap:

foreground or matching window capture instead of the whole desktop;
JPEG output instead of PNG;
downscaling to fit within 1280x720;
optional region capture when only part of a screen or window matters;
metadata-only capture with includeImage: false when the model only needs bounds, cursor, or window details.

Use full captureScreen or lossless PNG only when high-fidelity text or exact pixel inspection is needed.
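
The "fit within 1280x720" step can be illustrated with a small aspect-ratio calculation. This is a sketch under assumptions: fitWithin is a hypothetical helper, and the choice never to upscale smaller captures is an assumption rather than documented behavior.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scale `source` down so it fits inside `bounds` while preserving aspect ratio.
// Assumption: captures already smaller than the bounds are left at their original size.
function fitWithin(source: Size, bounds: Size = { width: 1280, height: 720 }): Size {
  const scale = Math.min(1, bounds.width / source.width, bounds.height / source.height);
  return {
    width: Math.max(1, Math.round(source.width * scale)),
    height: Math.max(1, Math.round(source.height * scale)),
  };
}

// A 2560x1440 monitor halves to 1280x720; a 1920x1200 window is limited by height.
const qhd = fitWithin({ width: 2560, height: 1440 }); // { width: 1280, height: 720 }
const wuxga = fitWithin({ width: 1920, height: 1200 }); // { width: 1152, height: 720 }
```

Because the scale factor is the smaller of the two axis ratios, the result never exceeds either bound, which keeps image payloads predictable regardless of monitor resolution.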

Limitations

The participant relies on screen captures and UI automation. Native APIs, browser APIs, or JIRA REST APIs are more reliable when available.
Visual interpretation depends on the selected Copilot language model's image support.
Some elevated/admin windows may ignore non-elevated input.
Secure desktop prompts, UAC, password managers, and browser security surfaces may not be controllable.
Windows display scaling can affect pixel coordinates; the tool reports physical screen bounds to help the model target clicks.
JPEG visual context is intentionally lossy; use PNG or a smaller region when text is too blurry.
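
Because captures may be downscaled while clicks target physical pixels, a point seen in an image has to be mapped back to screen coordinates. The sketch below shows one way to do that arithmetic. The Rect shape and imagePointToScreen helper are hypothetical, and the sketch assumes the image covers exactly the reported physical bounds.

```typescript
// Physical screen rectangle covered by a capture (hypothetical field names).
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface Point2D {
  x: number;
  y: number;
}

// Map a pixel position inside a (possibly downscaled) capture back to
// physical screen coordinates using the capture's physical bounds.
function imagePointToScreen(
  point: Point2D,
  imageSize: { width: number; height: number },
  physicalBounds: Rect,
): Point2D {
  const scaleX = physicalBounds.width / imageSize.width;
  const scaleY = physicalBounds.height / imageSize.height;
  return {
    x: Math.round(physicalBounds.left + point.x * scaleX),
    y: Math.round(physicalBounds.top + point.y * scaleY),
  };
}

// A 1280x720 capture of a 2560x1440 monitor placed to the right of a 1920-wide primary:
// the image center (640, 360) maps to the physical point (3200, 720).
const screenTarget = imagePointToScreen(
  { x: 640, y: 360 },
  { width: 1280, height: 720 },
  { left: 1920, top: 0, width: 2560, height: 1440 },
);
```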