Watson

Automate smarter locator discovery.

Watson is a Visual Studio Code extension that analyzes your application's source code, directly inside the editor, to map out the most reliable element identifiers (IDs, unique classes, custom data attributes, and more). Automated tests can then rely on stable selectors instead of brittle index-based locators.

Key Features (Baseline)

Element inventory – statically parse supported frontend stacks to list every element's id, name, CSS classes, and custom data-* attributes.
Locator quality metrics – compute counts for:
- Elements with IDs / without IDs
- Elements with names / without names
- Duplicate IDs
- Duplicate names
Actionable reports – export structured JSON/CSV plus human-friendly summaries to feed QA dashboards or CI logs.
Configurable gates – define per-metric thresholds (errors vs warnings) and emit GitHub-friendly annotations for CI pipelines.
Change tracking – compare every scan with the previous run to surface coverage drops and new duplicate selectors in both VS Code and the CLI.
Suggested locator guidance – flag elements missing resilient attributes (IDs, data-*, ARIA) and recommend additions in diagnostics and reports.
Accessibility signals – track ARIA role/name coverage to expose gaps that may affect assistive tech and selector reliability.
IDE & PR surfacing – VS Code code lenses flag duplicate selectors / missing locators inline, and optional PR-ready comments summarize scan results.
Optional telemetry – emit structured JSON snapshots for team dashboards when telemetry.enabled is true.

Why Watson?

Source-of-truth selectors – selectors are sourced directly from the codebase rather than rendered DOM snapshots.
Proactive stability – teams can fix fragile or missing locators before tests break.

Project Status

Watson is currently in the design phase as a VS Code extension. The extension scaffold (TypeScript project, command registration, and placeholder scan action), the configuration loader, and streaming file discovery are in place; the parsing engine and reporting pipeline are still being implemented.

Getting Started (planned)

Step	Description
1	Install the Watson VS Code extension (internal build for now).
2	Add a `watson.config.json` to your workspace or rely on the default patterns (see below).
3	Run `Watson: Scan Project` from the Command Palette to trigger the analyzer.
4	Review reports in the Problems pane, custom panels, or exported artifacts (coming soon).

Implementation details will be added as the scanner and reporting engine are built.

Testing & Continuous Integration

Unit tests: Run npm test (single run) or npm run test:watch (watch mode) to execute the Vitest suite, which covers parser helpers such as test-suite linkage.
CI pipeline: The GitHub Actions workflow (.github/workflows/ci.yml) installs dependencies, compiles the extension, and runs the Vitest suite on pushes to main and on every pull request, so regressions are caught before new builds are published.

Configuration (in progress)

Defaults: JSX/TSX + HTML entry points are included, node_modules and tests are excluded, and reports default to JSON + Markdown in watson-reports/.
Custom file: Drop a watson.config.json in the workspace root or set watson.config in VS Code settings. See watson.config.sample.json for the schema.
Command behavior: When Watson: Scan Project runs, it loads workspace settings first, then a config file, and finally falls back to defaults. A status notification describes which source was used.
Threshold overrides: Add thresholds.rules to append or override pass/fail logic. Each rule supports:
- metric: one of idCoverage, nameCoverage, duplicateIds, duplicateNames
- value: numeric requirement
- comparator: gte (default for coverage) or lte (default for duplicates)
- level: error (fails builds) or warning (surfaces but doesn’t block)
- messageTemplate: optional string with {actual} / {required} tokens
Change tracking: Toggle changeTracking.enabled (default true), then set coverageDropWarningThreshold (a fraction, so the default 0.01 equals 1 percentage point) and duplicateWarningLimit (max entries surfaced per run).
Locator guidance: Configure suggestedLocators with enabled, an attributes allowlist (supports data-* wildcards), and limit, which controls how many suggestions surface per scan.

{
  "thresholds": {
    "duplicateIds": 0,
    "minIdCoverage": 0.5,
    "rules": [
      { "metric": "nameCoverage", "value": 0.25, "level": "warning" },
      { "metric": "duplicateNames", "value": 3, "comparator": "lte" }
    ]
  },
  "changeTracking": {
    "enabled": true,
    "coverageDropWarningThreshold": 0.02,
    "duplicateWarningLimit": 10
  },
  "suggestedLocators": {
    "enabled": true,
    "attributes": ["id", "name", "data-*", "data-testid", "aria-label"],
    "limit": 15
  },
  "outputs": {
    "formats": ["json", "markdown", "github"]
  }
}
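As a rough illustration of how the rule options above could be evaluated, the TypeScript sketch below applies the default comparator for each metric and fills in the {actual} / {required} tokens. The type names, the evaluateRule and evaluateRules helpers, the fallback message, and the choice to treat a rule without a level as an error are assumptions made for this example, not Watson's actual implementation.

```typescript
// Hypothetical types; the field names mirror the rule options listed above.
type Metric = "idCoverage" | "nameCoverage" | "duplicateIds" | "duplicateNames";
type Comparator = "gte" | "lte";
type Level = "error" | "warning";

interface ThresholdRule {
  metric: Metric;
  value: number;
  comparator?: Comparator;
  level?: Level;
  messageTemplate?: string;
}

interface RuleResult {
  metric: Metric;
  passed: boolean;
  level: Level;
  message: string;
}

// Coverage metrics default to "gte"; duplicate counts default to "lte".
function defaultComparator(metric: Metric): Comparator {
  return metric === "idCoverage" || metric === "nameCoverage" ? "gte" : "lte";
}

function evaluateRule(rule: ThresholdRule, actual: number): RuleResult {
  const comparator = rule.comparator ?? defaultComparator(rule.metric);
  const level = rule.level ?? "error"; // assumption: a rule without a level is an error
  const passed = comparator === "gte" ? actual >= rule.value : actual <= rule.value;
  // Assumed fallback message when no messageTemplate is configured.
  const template =
    rule.messageTemplate ?? `${rule.metric}: actual {actual}, required {required}`;
  const message = template
    .split("{actual}").join(String(actual))
    .split("{required}").join(String(rule.value));
  return { metric: rule.metric, passed, level, message };
}

function evaluateRules(
  rules: ThresholdRule[],
  metrics: Record<Metric, number>,
): RuleResult[] {
  return rules.map((rule) => evaluateRule(rule, metrics[rule.metric]));
}

// The two rules from the sample config, checked against made-up scan metrics.
const results = evaluateRules(
  [
    { metric: "nameCoverage", value: 0.25, level: "warning" },
    { metric: "duplicateNames", value: 3, comparator: "lte" },
  ],
  { idCoverage: 0.6, nameCoverage: 0.2, duplicateIds: 0, duplicateNames: 2 },
);
console.log(results);
```

With these made-up metrics, the nameCoverage rule fails as a warning and the duplicateNames rule passes.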

File Discovery (in progress)

Glob awareness: Include/exclude lists are respected via minimatch, and command output surfaces how many files matched.
Streaming traversal: A stack-based iterator walks workspace folders without loading the entire tree into memory and respects cancellation tokens for large repos.
User feedback: Progress notifications update every ~100 files, and sample relative paths are shown once the placeholder scan completes.
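The traversal described above might look roughly like the following sketch. To keep it self-contained, the directory reader is injected and demonstrated against an in-memory tree; in the extension it would presumably wrap a real file system call. The walkFiles name, the DirEntry and WalkOptions shapes, and the isExcluded predicate (a stand-in for minimatch-based filtering) are hypothetical.

```typescript
// Hypothetical shapes for illustration; Watson's real types may differ.
interface DirEntry {
  name: string;
  isDirectory: boolean;
}

interface CancellationLike {
  isCancellationRequested: boolean; // same property name as VS Code's CancellationToken
}

interface WalkOptions {
  isExcluded?: (relativePath: string) => boolean; // stand-in for glob filtering
  onProgress?: (filesSeen: number) => void;
  progressEvery?: number;
}

function* walkFiles(
  root: string,
  readDir: (absoluteDir: string) => DirEntry[],
  token: CancellationLike,
  options: WalkOptions = {},
): Generator<string> {
  const { isExcluded = () => false, onProgress, progressEvery = 100 } = options;
  const stack: string[] = [""]; // relative directories still to visit
  let filesSeen = 0;

  while (stack.length > 0) {
    if (token.isCancellationRequested) return; // stop early on cancellation
    const relDir = stack.pop() as string;
    const absDir = relDir === "" ? root : `${root}/${relDir}`;

    for (const entry of readDir(absDir)) {
      const relPath = relDir === "" ? entry.name : `${relDir}/${entry.name}`;
      if (isExcluded(relPath)) continue;
      if (entry.isDirectory) {
        stack.push(relPath); // defer the subdirectory instead of recursing
      } else {
        filesSeen += 1;
        if (onProgress && filesSeen % progressEvery === 0) onProgress(filesSeen);
        yield relPath; // files are handed out one at a time
      }
    }
  }
}

// In-memory tree used only for this demonstration.
const tree: Record<string, DirEntry[]> = {
  "/repo": [
    { name: "src", isDirectory: true },
    { name: "node_modules", isDirectory: true },
    { name: "index.html", isDirectory: false },
  ],
  "/repo/src": [{ name: "App.tsx", isDirectory: false }],
  "/repo/node_modules": [{ name: "lib.js", isDirectory: false }],
};
const readTree = (dir: string): DirEntry[] => tree[dir] ?? [];

const found = Array.from(
  walkFiles("/repo", readTree, { isCancellationRequested: false }, {
    isExcluded: (rel) => rel.split("/").indexOf("node_modules") !== -1,
  }),
);
console.log(found);
```

Because the walker is a generator, callers can consume paths lazily and stop at any point, which is one way to avoid holding the whole tree in memory.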

Parser Adapters (scaffolding)

React/JSX: Uses Babel parser + traverse to collect every JSX opening element with literal attribute values, including IDs, names, classes, and data-*.
HTML templates: Parses static HTML via node-html-parser, reusing the same ElementRecord schema.
Vue SFCs: Extracts <template> blocks with @vue/compiler-sfc and feeds them through the HTML parser, labeling elements with the vue framework tag.
Parser stack decision: Babel was selected over SWC and the ESLint parser because Watson needs Babel's mature plugin ecosystem (JSX, TypeScript, decorators) and detailed AST node metadata for locator tracing, while node-html-parser and @vue/compiler-sfc keep template parsing lightweight without introducing separate runtime dependencies.

Runtime & Packaging

Minimum runtime: Watson targets Node.js 20.10+ to align with the VS Code 1.90 engine baseline and to guarantee stable fs/util APIs used by the extension and CLI tooling.
Packaging: Run npm run package to build dist/ and invoke vsce package --no-dependencies, producing a .vsix that can be side-loaded or published to the Marketplace. The workflow pins VS Code's engine/version to ensure compatibility checks pass during packaging.

Element Catalog & Metrics (in progress)

Normalization: All parser outputs funnel through a catalog that standardizes attribute casing (className → class, etc.) and stores file metadata.
Coverage metrics: Initial engine counts elements with/without IDs or names and reports duplicate ID/name values across files with file references.
Future metrics: Stability scoring, ARIA coverage, and change tracking will build on top of this catalog.
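A minimal sketch of the normalization and counting steps above, assuming a simplified record shape. The ElementRecord name comes from the parser section, but its fields here, the alias table (including htmlFor → for), and the helper names are illustrative guesses rather than Watson's real schema.

```typescript
// Assumed, simplified record shape for this example only.
interface ElementRecord {
  file: string;
  tag: string;
  attributes: Record<string, string>;
}

// className → class is from the description above; htmlFor → for is an assumed extra.
const ATTRIBUTE_ALIASES: Record<string, string> = { classname: "class", htmlfor: "for" };

function normalizeAttributes(raw: Record<string, string>): Record<string, string> {
  const normalized: Record<string, string> = {};
  for (const key of Object.keys(raw)) {
    const lower = key.toLowerCase();
    normalized[ATTRIBUTE_ALIASES[lower] ?? lower] = raw[key];
  }
  return normalized;
}

// Maps each attribute value used more than once to the files it appears in.
function collectDuplicates(records: ElementRecord[], attribute: string): Map<string, string[]> {
  const seen = new Map<string, string[]>();
  for (const record of records) {
    const value = record.attributes[attribute];
    if (value === undefined) continue;
    const files = seen.get(value) ?? [];
    files.push(record.file);
    seen.set(value, files);
  }
  const duplicates = new Map<string, string[]>();
  seen.forEach((files, value) => {
    if (files.length > 1) duplicates.set(value, files);
  });
  return duplicates;
}

function computeMetrics(records: ElementRecord[]) {
  const withId = records.filter((r) => "id" in r.attributes).length;
  const withName = records.filter((r) => "name" in r.attributes).length;
  return {
    withId,
    withoutId: records.length - withId,
    withName,
    withoutName: records.length - withName,
    duplicateIds: collectDuplicates(records, "id"),
    duplicateNames: collectDuplicates(records, "name"),
  };
}

// Made-up catalog entries from a JSX file and two HTML files.
const catalog: ElementRecord[] = [
  { file: "src/App.tsx", tag: "button", attributes: normalizeAttributes({ id: "save", className: "btn" }) },
  { file: "public/index.html", tag: "button", attributes: normalizeAttributes({ ID: "save" }) },
  { file: "public/login.html", tag: "input", attributes: normalizeAttributes({ name: "email" }) },
];
const metrics = computeMetrics(catalog);
console.log(metrics.withId, metrics.withoutId, metrics.duplicateIds);
```

Normalizing before counting means the id in the JSX file and the upper-case ID in the HTML file are treated as the same attribute, so the shared value is reported as a duplicate with both file references.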

Thresholds & Reporting (in progress)

Configurable gates: watson.config.json defines minimum coverage levels (as fractions, for example 0.5 for 50%) and maximum duplicate counts; the command evaluates them after each scan.
Custom rules: Additional threshold rules may be layered on with severity (error vs warning) and custom messages to reflect team-specific guardrails.
Change tracking: Each scan is compared to the previous snapshot (stored in watson-reports/previous-scan.json). Coverage drops beyond changeTracking.coverageDropWarningThreshold and newly introduced duplicates raise warnings in VS Code, CLI output, and Markdown.
Accessibility signals: Each report highlights ARIA role/name coverage so teams can quickly spot gaps affecting assistive tech and selector reliability.
Report outputs: JSON and Markdown summaries are produced by default. GitHub annotation text is added when outputs.formats includes github, and a PR-ready comment block is added when it includes pr-comment. CSV inventory export is still planned.
VS Code surfacing: Duplicate selectors trigger Problems entries, change tracking produces warnings on regressions, and locator guidance surfaces informational diagnostics recommending data-*/ARIA hooks.
CLI parity: Running the CLI from the workspace root emits the same artifacts and prints change-tracking and locator guidance summaries. It exits with code 1 when thresholds or locator regressions fail, and with code 2 on configuration errors.
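The change-tracking comparison described in this section could be sketched as follows. The ScanSnapshot shape is a guess at what watson-reports/previous-scan.json might contain, and compareSnapshots, newEntries, and the warning wording are hypothetical; only the two option names come from the configuration section.

```typescript
// Hypothetical snapshot shape; the real previous-scan.json schema may differ.
interface ScanSnapshot {
  idCoverage: number; // fraction between 0 and 1
  nameCoverage: number;
  duplicateIds: string[];
  duplicateNames: string[];
}

interface ChangeReport {
  coverageDrops: string[];
  newDuplicates: string[];
}

// Returns values present in the current scan but not in the previous one.
function newEntries(previous: string[], current: string[], label: string): string[] {
  const known = new Set(previous);
  return current.filter((value) => !known.has(value)).map((value) => `${label}:${value}`);
}

function compareSnapshots(
  previous: ScanSnapshot,
  current: ScanSnapshot,
  coverageDropWarningThreshold = 0.01,
  duplicateWarningLimit = Number.POSITIVE_INFINITY,
): ChangeReport {
  const coverageDrops: string[] = [];
  for (const metric of ["idCoverage", "nameCoverage"] as const) {
    const drop = previous[metric] - current[metric];
    // Only drops beyond the configured threshold produce a warning.
    if (drop > coverageDropWarningThreshold) {
      coverageDrops.push(`${metric} dropped by ${(drop * 100).toFixed(1)} percentage points`);
    }
  }
  // Cap the number of surfaced duplicates at the configured limit.
  const newDuplicates = [
    ...newEntries(previous.duplicateIds, current.duplicateIds, "id"),
    ...newEntries(previous.duplicateNames, current.duplicateNames, "name"),
  ].slice(0, duplicateWarningLimit);
  return { coverageDrops, newDuplicates };
}

// Made-up snapshots using the threshold and limit from the sample config.
const report = compareSnapshots(
  { idCoverage: 0.6, nameCoverage: 0.3, duplicateIds: ["save"], duplicateNames: [] },
  { idCoverage: 0.55, nameCoverage: 0.3, duplicateIds: ["save", "cancel"], duplicateNames: [] },
  0.02,
  10,
);
console.log(report);
```

In this made-up run, ID coverage falls by more than the 0.02 threshold and one new duplicate ID appears, so both lists in the report contain a single entry.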

Usage Examples (planned)

# Scan a React project and emit JSON + markdown summaries
watson scan ./app --format json,md --out ./reports

# Fail CI if duplicate IDs are detected
watson scan ./app --threshold duplicateIds=0

Support Watson

Run Watson: Support via Buy Me a Coffee from the Command Palette or click the $(coffee) Watson Support status bar item to open buymeacoffee.com/pabustan in your browser.
A quick “Thank you” toast appears inside VS Code after the link opens successfully.
If you prefer a direct link, visit https://buymeacoffee.com/pabustan anytime.

Roadmap Highlights

Locator stability scoring based on uniqueness, specificity, and change frequency.
Change tracking with commit-to-commit diffing and alerts.
Suggested locator generation for elements missing resilient attributes.
IDE/PR integrations for inline feedback.
Accessibility signal overlays (ARIA role/name completeness).

Contributing

Fork the repository and create a feature branch.
Add or update tests/specs relevant to your change.
Open a pull request describing the motivation and approach.

License

Watson is released under the MIT License.

Good Watson

Edwin R.D. Pabustan