Evaluator

Evaluator Harness Agent

Agentic development flow inside GitHub Copilot Chat: auto-scans your project, generates PRDs, and orchestrates implementation with quality gates.
Installation
Launch VS Code Quick Open (Ctrl+P), paste the following command, and press enter.

Features

Evaluator lives inside Copilot Chat as @evaluator and guides you through a structured development pipeline:

  1. Project Scanning — Automatically detects your stack (Java/Maven, Java/Gradle, Python, React, Angular, TypeScript/Node) and runs preflight checks.
  2. PRD Generation — Generates a Product Requirements Document from your prompt, leveraging project context.
  3. Spec & Task Generation — Breaks the approved PRD into technical specifications and granular implementation tasks.
  4. Autonomous Implementation — Implements tasks with a validation loop (lint, tests, code review) and a scoring system that ensures quality before opening a PR.
  5. Definition of Done Validation — Validates the repository against a configurable Definition of Done checklist.
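The stack detection in step 1 could plausibly work by looking for well-known marker files in the project root. The listing does not document the actual rules, so the following is only a minimal sketch: the marker table and the `detect_stacks` function are hypothetical, not the extension's real implementation.

```python
import json

# Hypothetical marker files (assumption): the extension's real detection
# rules are not documented in this listing.
MARKERS = {
    "pom.xml": "Java/Maven",
    "build.gradle": "Java/Gradle",
    "build.gradle.kts": "Java/Gradle",
    "pyproject.toml": "Python",
    "requirements.txt": "Python",
    "angular.json": "Angular",
    "tsconfig.json": "TypeScript/Node",
}


def detect_stacks(files: dict[str, str]) -> list[str]:
    """Return stack labels for a mapping of root-level file names to contents."""
    found: list[str] = []
    for name, stack in MARKERS.items():
        if name in files and stack not in found:
            found.append(stack)

    # React has no dedicated config file, so look for it among the
    # dependencies declared in package.json.
    if "package.json" in files:
        try:
            manifest = json.loads(files["package.json"])
        except json.JSONDecodeError:
            manifest = {}
        if isinstance(manifest, dict):
            deps = {
                **manifest.get("dependencies", {}),
                **manifest.get("devDependencies", {}),
            }
            if "react" in deps and "React" not in found:
                found.append("React")
    return found
```

Taking file contents as a plain dictionary keeps the sketch free of filesystem access, which makes it easy to try against a few sample projects.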

Requirements

  • VS Code 1.99+
  • Active GitHub Copilot Chat subscription

Usage

Open Copilot Chat (Ctrl+Alt+I on Windows/Linux, Ctrl+Cmd+I on macOS) and type:

Command                         Description
@evaluator /start <prompt>      Start the pipeline: scans project, runs preflight, generates PRD
@evaluator /status              Show current orchestrator state and progress
@evaluator /approve             Approve the PRD and generate SPECs + TASKs
@evaluator /implement           Approve SPECs/TASKs and start autonomous implementation
@evaluator /reject <feedback>   Reject with feedback: regenerates the current artifact
@evaluator /dod                 Validate repository against the Definition of Done

Typical Workflow

@evaluator /start Build a user settings page with dark mode toggle
  → reviews PRD
@evaluator /approve
  → reviews SPECs and TASKs
@evaluator /implement
  → autonomous implementation with quality gates
  → PR opened automatically when score >= 80%
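
The workflow above can be read as a small state machine driven by the chat commands. The sketch below illustrates that reading only. The state names, the transition table, and the `next_state` function are assumptions made for illustration; the orchestrator's real internal states are not documented here.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state names (assumption), one per review stage.
    IDLE = "idle"
    PRD_REVIEW = "prd_review"
    SPEC_REVIEW = "spec_review"
    IMPLEMENTING = "implementing"


# (current state, command) -> next state. /reject regenerates the current
# artifact, so it is modeled as staying in the same review state.
TRANSITIONS = {
    (State.IDLE, "/start"): State.PRD_REVIEW,
    (State.PRD_REVIEW, "/approve"): State.SPEC_REVIEW,
    (State.PRD_REVIEW, "/reject"): State.PRD_REVIEW,
    (State.SPEC_REVIEW, "/implement"): State.IMPLEMENTING,
    (State.SPEC_REVIEW, "/reject"): State.SPEC_REVIEW,
}

# Commands assumed to be valid at any time without advancing the pipeline.
READ_ONLY = {"/status", "/dod"}


def next_state(state: State, command: str) -> State:
    """Return the state reached by issuing `command` in `state`."""
    if command in READ_ONLY:
        return state
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"{command} is not valid in state {state.name}") from None
```

Under this model, issuing `/approve` before `/start` raises a `ValueError`, which matches the ordering the workflow implies.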

Scoring System

Evaluator scores implementations across multiple pillars:

Backend — Test coverage (25%), Contract adherence (25%), Security/best practices (25%), Performance (25%)

Frontend — Figma fidelity (25%), Accessibility (20%), E2E tests (25%), Responsiveness/UX (15%), Performance (15%)

Threshold: score >= 80% opens a PR automatically. Below that, Evaluator enters a self-correction loop (up to 3 attempts).
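
To make the arithmetic concrete, the following sketch computes a weighted score from the pillar weights listed above and applies the 80% threshold with a bounded correction loop. The key names, `weighted_score`, and `run_quality_gate` are hypothetical. It also assumes one reading of "up to 3 attempts": an initial evaluation followed by at most three corrections.

```python
# Pillar weights in percent, taken from the lists above. Key names are illustrative.
BACKEND_WEIGHTS = {
    "test_coverage": 25,
    "contract_adherence": 25,
    "security": 25,
    "performance": 25,
}
FRONTEND_WEIGHTS = {
    "figma_fidelity": 25,
    "accessibility": 20,
    "e2e_tests": 25,
    "responsiveness_ux": 15,
    "performance": 15,
}

THRESHOLD = 80.0      # score >= 80% opens a PR
MAX_CORRECTIONS = 3   # assumption: three corrections after the first evaluation


def weighted_score(pillar_scores: dict, weights: dict) -> float:
    """Combine per-pillar scores (0-100) into one percentage."""
    missing = weights.keys() - pillar_scores.keys()
    if missing:
        raise ValueError(f"missing pillar scores: {sorted(missing)}")
    return sum(pillar_scores[name] * w for name, w in weights.items()) / 100


def run_quality_gate(evaluate, correct, weights):
    """Evaluate, then correct and re-evaluate until the threshold is met
    or the correction budget is exhausted.

    `evaluate` returns a dict of pillar scores and `correct` applies one
    round of fixes; both are caller-supplied callables.
    Returns (passed, final_score, corrections_used).
    """
    score = weighted_score(evaluate(), weights)
    corrections = 0
    while score < THRESHOLD and corrections < MAX_CORRECTIONS:
        correct()
        corrections += 1
        score = weighted_score(evaluate(), weights)
    return score >= THRESHOLD, score, corrections
```

For example, frontend pillar scores of 90, 80, 70, 100, and 60 (in the order listed) give 22.5 + 16 + 17.5 + 15 + 9 = 80, which just meets the threshold.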

Supported Stacks

  • Java (Maven & Gradle)
  • Python
  • React
  • Angular
  • TypeScript / Node.js

License

MIT
