AISense

DeepSharpᴵᴼ

Smart Prompt Compression & Model Routing for GitHub Copilot Chat


AISense is a VS Code extension that sits between you and GitHub Copilot Chat. It compresses your prompts, classifies intent, and routes each request to the cheapest model that fits — saving tokens and money without sacrificing answer quality.

[Image: AISense overview]

Features

Prompt Compression

An 8-rule pipeline strips noise from your prompts before they reach the model:

| Rule | What it does |
| --- | --- |
| Code Fence Slim | Keeps head + tail of long code blocks, elides the middle |
| Declarative Voice | Converts "Could you please…" → direct commands |
| Instruction Dedup | Removes exact-duplicate sentences |
| Whitespace | Collapses blank lines, strips trailing whitespace |
| Markdown Noise | Strips decorative emoji from headings, removes empty bullets |
| Vague Language | Flags vague phrases ("some examples", "fairly short") as advisory notes |
| Conciseness Hint | Appends a brevity hint to save output tokens |
| Hallucination Guard | Appends a grounding hint to reduce hallucination |

Every rule can be toggled individually via settings.
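
To make the pipeline idea concrete, here is a minimal sketch of three of the rules above chained together with per-rule toggles. The Rule interface, the rule ids, and the compress function are hypothetical names chosen for illustration; the extension's actual implementation and setting keys may differ.

```typescript
// Hypothetical rule shape: an id plus a pure string-to-string transform.
interface Rule {
  id: string;
  apply: (prompt: string) => string;
}

// Declarative Voice: turn "Could you please explain" into "Explain".
const declarativeVoice: Rule = {
  id: "declarativeVoice",
  apply: (p) =>
    p.replace(
      /\b(?:could|can|would) you (?:please )?(\w)/gi,
      (_m: string, first: string) => first.toUpperCase()
    ),
};

// Instruction Dedup: drop sentences that exactly repeat an earlier one.
// A sentence ends at ., ! or ? followed by whitespace or the end of the line.
const instructionDedup: Rule = {
  id: "instructionDedup",
  apply: (p) => {
    const seen = new Set<string>();
    return p
      .split("\n")
      .map((line) => {
        const sentences: string[] = line.match(/\S.*?(?:[.!?]+(?=\s|$)|$)/g) || [];
        return sentences
          .map((s) => s.trim())
          .filter((s) => {
            if (s === "" || seen.has(s)) return false;
            seen.add(s);
            return true;
          })
          .join(" ");
      })
      .join("\n");
  },
};

// Whitespace: strip trailing spaces and collapse runs of blank lines.
const whitespace: Rule = {
  id: "whitespace",
  apply: (p) => p.replace(/[ \t]+$/gm, "").replace(/\n{3,}/g, "\n\n").trim(),
};

// Run every rule that has not been switched off, in order.
function compress(
  prompt: string,
  rules: Rule[],
  enabled: { [id: string]: boolean } = {}
): string {
  return rules
    .filter((r) => enabled[r.id] !== false)
    .reduce((text, r) => r.apply(text), prompt);
}
```

Passing { instructionDedup: false } as the third argument skips that rule while the others still run, which mirrors the individual toggles described above.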

Intent-Based Model Routing

A keyword-based classifier detects what you're asking for (code, refactor, reasoning, translate, explain, Q&A) and a first-match-wins policy selects the cheapest model that fits:

qa + short prompt     → copilot-fast
translate             → gpt-4o-mini
large code task       → gpt-5.2-codex
deep reasoning        → claude-opus-4.6

The policy is fully customisable via .aisense/modelPolicy.yaml in your workspace.
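
As an illustration of what a keyword-based classifier can look like, the sketch below checks a prompt against a small keyword table and falls back to qa. The keyword lists and their order are assumptions made for this example and are not taken from the extension.

```typescript
type Intent = "code" | "refactor" | "reasoning" | "translate" | "explain" | "qa";

// Hypothetical keyword table; earlier entries win when several match.
const KEYWORDS: Array<[Intent, string[]]> = [
  ["refactor", ["refactor", "rename", "clean up"]],
  ["translate", ["translate"]],
  ["reasoning", ["why", "trade-off", "prove", "compare"]],
  ["explain", ["explain", "what does", "walk me through"]],
  ["code", ["implement", "write a function", "fix", "bug"]],
];

// Pre-compile one case-insensitive, whole-word pattern per intent.
const PATTERNS: Array<[Intent, RegExp]> = KEYWORDS.map(
  ([intent, words]): [Intent, RegExp] => [
    intent,
    new RegExp("\\b(?:" + words.join("|") + ")\\b", "i"),
  ]
);

function classifyIntent(prompt: string): Intent {
  for (const [intent, pattern] of PATTERNS) {
    if (pattern.test(prompt)) return intent;
  }
  return "qa"; // fallback when no keyword matches
}
```

Whole-word matching keeps "improve" from triggering the "prove" keyword, at the cost of missing inflected forms such as "refactoring".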

Local Model Routing

Route simple intents (Q&A, explain, translate) to a local model for 0 credits. Works with any OpenAI-compatible endpoint:

  • Ollama (default)
  • Docker Model Runner
  • LM Studio
  • llama.cpp / vLLM
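
For orientation, an OpenAI-compatible endpoint accepts a POST to its chat/completions path with a model id and a list of messages. The sketch below only builds such a request. The LocalConfig and buildLocalChatRequest names are hypothetical, and the default values are the ones listed in the Configuration table.

```typescript
// Hypothetical config shape mirroring aisense.local.endpoint and aisense.local.model.
interface LocalConfig {
  endpoint: string; // e.g. http://localhost:11434/v1
  model: string;    // e.g. llama3.2
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: { [key: string]: string }; body: string };
}

// Build (but do not send) a non-streaming chat completion request.
function buildLocalChatRequest(cfg: LocalConfig, prompt: string): ChatRequest {
  const base = cfg.endpoint.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: base + "/chat/completions",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: cfg.model,
        messages: [{ role: "user", content: prompt }],
        stream: false,
      }),
    },
  };
}
```

The result can be passed to fetch(request.url, request.init); in the OpenAI response format the reply text is found at choices[0].message.content.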

[Image: Companion panel local model]

Agent Hand-off

When your prompt is actionable (create/edit/refactor), AISense shows a "Run with Copilot" button that re-issues the compressed prompt to the default Copilot Chat agent — which has file and terminal tools that chat participants don't. A scope-guard suffix prevents the agent from running unnecessary builds or tests.

Dashboard

A built-in dashboard tracks your savings in real time:

  • Total tokens saved (input + output)
  • Estimated credit savings vs your baseline model
  • Per-request breakdown with intent, model, and compression ratio
  • Scope-guard savings from agent hand-offs
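
The figures above boil down to simple arithmetic over a log of requests. The sketch below shows one way to compute them; the RequestRecord fields, the credit unit, and the definition of savedFraction (tokens removed divided by original tokens) are assumptions for this example, and it covers input tokens only.

```typescript
// Hypothetical per-request log entry.
interface RequestRecord {
  intent: string;
  model: string;
  originalTokens: number;   // prompt tokens before compression
  compressedTokens: number; // prompt tokens actually sent
  credits: number;          // credits charged by the routed model
  baselineCredits: number;  // credits the baseline model would have charged
}

interface SavingsSummary {
  totalTokensSaved: number;
  totalCreditsSaved: number;
  perRequest: Array<{ intent: string; model: string; savedFraction: number }>;
}

function summarize(records: RequestRecord[]): SavingsSummary {
  let totalTokensSaved = 0;
  let totalCreditsSaved = 0;
  const perRequest = records.map((r) => {
    const saved = r.originalTokens - r.compressedTokens;
    totalTokensSaved += saved;
    totalCreditsSaved += r.baselineCredits - r.credits;
    return {
      intent: r.intent,
      model: r.model,
      // Guard against empty prompts to avoid dividing by zero.
      savedFraction: r.originalTokens > 0 ? saved / r.originalTokens : 0,
    };
  });
  return { totalTokensSaved, totalCreditsSaved, perRequest };
}
```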

[Image: Dashboard]

Companion Panel

A sidebar panel in the Activity Bar shows your most recent requests with model, intent, and savings at a glance. It also shows the local model status and includes a quick toggle.

[Image: Companion panel]

Instruction File Linter

Automatically lints *.instructions.md, copilot-instructions.md, and *.prompt.md files for:

  • Token bloat (configurable threshold, default 500 tokens)
  • Duplicate sentences
  • "Act as a…" role prompts (which increase hallucination risk)

The linter offers a Compress code action to fix issues inline.
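
The three checks can be approximated in a few lines. The sketch below is an illustration rather than the extension's linter: the Finding type and function names are hypothetical, and the token count uses a rough four-characters-per-token heuristic instead of a real tokenizer.

```typescript
type Finding = {
  rule: "token-bloat" | "duplicate-sentence" | "role-prompt";
  message: string;
};

// Rough estimate (assumption): about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function lintInstructions(text: string, maxTokens: number = 500): Finding[] {
  const findings: Finding[] = [];

  // 1. Token bloat: the file exceeds the configured threshold.
  const tokens = estimateTokens(text);
  if (tokens > maxTokens) {
    findings.push({
      rule: "token-bloat",
      message: "about " + tokens + " tokens, threshold is " + maxTokens,
    });
  }

  // 2. Duplicate sentences: exact repeats after trimming.
  const seen = new Set<string>();
  const sentences: string[] = text.match(/\S.*?(?:[.!?]+(?=\s|$)|$)/gm) || [];
  sentences.forEach((raw) => {
    const sentence = raw.trim();
    if (sentence === "") return;
    if (seen.has(sentence)) {
      findings.push({ rule: "duplicate-sentence", message: "repeated: " + sentence });
    } else {
      seen.add(sentence);
    }
  });

  // 3. Role prompts such as "Act as a senior engineer".
  if (/\bact as an?\b/i.test(text)) {
    findings.push({ rule: "role-prompt", message: 'contains an "Act as a…" role prompt' });
  }

  return findings;
}
```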

Status Bar Token Counter

A live token counter in the status bar shows the token count of your current selection (or whole file). Click it to compress.

[Image: Status bar]

Custom Agent Template

Install the "AISense Saver" agent template into your workspace via the command palette. It acts as a cost-optimised agent that compresses every prompt and prefers local models for simple tasks.

Getting Started

Prerequisites

  • VS Code 1.95+
  • GitHub Copilot Chat extension

Installation

  1. Download the .vsix from a release, or build it from source:

    pnpm install
    pnpm run build
    
  2. Install in VS Code:

    code --install-extension aisense-*.vsix
    
  3. Open any workspace and type @aisense in Copilot Chat.

Local Model Setup (Optional)

  1. Install Ollama and pull a model:
    ollama pull llama3.2
    
  2. Enable local routing:
    { "aisense.local.enabled": true }
    
  3. The Companion panel shows a green dot when the local model is reachable.

Commands

| Command | Description |
| --- | --- |
| AISense: Run API Smoke Test | Verify Copilot API access |
| AISense: Compress Selection | Compress the selected text and show a diff |
| AISense: Open Dashboard | Open the savings dashboard |
| AISense: Run Demo (Goldset) | Run the built-in demo prompts |
| AISense: Install "Saver" Custom Agent | Install the agent template into your workspace |
| AISense: Refresh Model Pricing | Auto-discover model pricing from the Copilot API |

Configuration

All settings live under aisense.*. Key settings:

| Setting | Default | Description |
| --- | --- | --- |
| aisense.enabled | true | Master switch |
| aisense.routing.enabled | true | Enable compress + route pipeline |
| aisense.routing.showHeader | true | Show intent/model/savings header in replies |
| aisense.routing.showCompressedPrompt | true | Show the compressed prompt in replies |
| aisense.routing.forwardHistory | true | Forward conversation history for multi-turn context |
| aisense.local.enabled | false | Route eligible intents to a local model |
| aisense.local.endpoint | http://localhost:11434/v1 | OpenAI-compatible endpoint URL |
| aisense.local.model | llama3.2 | Model ID for local inference |
| aisense.linter.enabled | true | Lint instruction files for token bloat |
| aisense.statusBar.enabled | true | Show live token counter |

See the full settings reference in the VS Code Settings UI under AISense.

Model Policy

The routing policy is defined in .aisense/modelPolicy.yaml in your workspace root. This is where you control which model handles which intent.

version: 1
defaultModel: gpt-4o-mini

rules:
  - intent: qa
    model: gpt-4o-mini
    reason: "all Q&A goes to the cheapest model"
  - intent: code
    maxTokens: 8000
    model: copilot-fast
    reason: "small code task — fast & cheap is enough"
  - intent: code
    model: claude-sonnet-4.6
    reason: "larger code task — needs more context"
  - intent: reasoning
    model: claude-opus-4.6
    reason: "deep reasoning — premium model"

Rules are evaluated top-to-bottom, first match wins. Use maxTokens to split by prompt size (e.g. small code tasks → cheap model, large ones → premium). If no rule matches, defaultModel is used.

Available intents: code, refactor, reasoning, translate, explain, qa.
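
The evaluation order described above can be sketched as a short loop. The type and function names are hypothetical, the policy object mirrors the YAML example, and treating maxTokens as an inclusive upper bound on prompt tokens is an assumption.

```typescript
interface PolicyRule {
  intent: string;
  maxTokens?: number; // optional upper bound on prompt size (assumed inclusive)
  model: string;
  reason?: string;
}

interface ModelPolicy {
  version: number;
  defaultModel: string;
  rules: PolicyRule[];
}

// Walk the rules top to bottom and return the first one that matches.
function selectModel(policy: ModelPolicy, intent: string, promptTokens: number): string {
  for (const rule of policy.rules) {
    const intentMatches = rule.intent === intent;
    const sizeMatches = rule.maxTokens === undefined || promptTokens <= rule.maxTokens;
    if (intentMatches && sizeMatches) return rule.model;
  }
  return policy.defaultModel; // no rule matched
}

// The example policy from above, written as an object literal.
const examplePolicy: ModelPolicy = {
  version: 1,
  defaultModel: "gpt-4o-mini",
  rules: [
    { intent: "qa", model: "gpt-4o-mini" },
    { intent: "code", maxTokens: 8000, model: "copilot-fast" },
    { intent: "code", model: "claude-sonnet-4.6" },
    { intent: "reasoning", model: "claude-opus-4.6" },
  ],
};
```

Under these assumptions a 500-token code prompt resolves to copilot-fast, a 20,000-token code prompt falls through to claude-sonnet-4.6, and a translate prompt uses defaultModel because no rule names that intent. Rule order matters: placing the unbounded code rule first would shadow the maxTokens rule.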

The file is checked on every @aisense call and cached by modification time, so after you edit it the next call picks up the changes immediately.
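
Caching by modification time usually means comparing the file's current mtime with the one recorded at the last load and re-reading only when they differ. The sketch below shows the pattern with injected functions so it stays independent of the file system; the MtimeCache class is a hypothetical illustration, not the extension's code.

```typescript
// Generic cache that reloads a value only when its source's mtime changes.
class MtimeCache<T> {
  private readonly readMtime: () => number;
  private readonly load: () => T;
  private cached: { mtimeMs: number; value: T } | undefined;

  constructor(readMtime: () => number, load: () => T) {
    this.readMtime = readMtime;
    this.load = load;
  }

  get(): T {
    const mtimeMs = this.readMtime(); // cheap check on every call
    if (this.cached === undefined || this.cached.mtimeMs !== mtimeMs) {
      // First call, or the file changed since the last load: re-read it.
      this.cached = { mtimeMs, value: this.load() };
    }
    return this.cached.value;
  }
}
```

In a Node.js context, readMtime could return fs.statSync(policyPath).mtimeMs and load could read and parse the YAML file, where policyPath stands in for the path to .aisense/modelPolicy.yaml.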

Development

pnpm install
pnpm run watch    # incremental rebuild
pnpm run test     # run vitest suite

VS Code Tasks

The workspace ships with pre-configured tasks (defined in .vscode/tasks.json and AISense.code-workspace):

| Task | Description |
| --- | --- |
| npm: build (default build) | Production bundle via esbuild |
| npm: watch | Incremental rebuild in watch mode (background task) |
| AISense: Package (.vsix) | Build + package into a .vsix (runs npm: build first) |
| AISense: Install (Stable) | Package + install into VS Code Stable |
| AISense: Install (Insiders) | Package + install into VS Code Insiders |
| AISense: Install (Stable + Insiders) | Package + install into both Stable and Insiders in parallel |
| AISense: Uninstall (Stable) | Remove the extension from VS Code Stable |
| AISense: Uninstall (Insiders) | Remove the extension from VS Code Insiders |
| AISense: Uninstall (Stable + Insiders) | Remove the extension from both Stable and Insiders |

Additional npm scripts available from the command line:

| Script | Command | Description |
| --- | --- | --- |
| compile | pnpm run compile | Type-check with tsc --noEmit (no output) |
| test | pnpm run test | Run the vitest suite once |
| test:watch | pnpm run test:watch | Run vitest in watch mode |

Press F5 in VS Code to launch the Extension Development Host.

Disclaimer

AISense provides estimates of token and credit usage based on approximate tokenization and built-in pricing data. It is not a replacement for monitoring your actual usage and costs in your GitHub billing dashboard. Credit calculations may be inaccurate due to tokenizer differences, outdated pricing data, or API changes. You are solely responsible for tracking your own Copilot credit consumption. The author assumes no liability for miscalculations, unexpected charges, or any financial impact resulting from the use of this extension.

No guarantee of savings. The savings figures shown in the dashboard and chat headers are estimates based on counterfactual comparisons. Actual savings depend on your usage patterns, model availability, and Copilot plan. The author of AISense does not guarantee any specific reduction in credit consumption.

Third-party services. AISense routes your prompts to GitHub Copilot models (provided by GitHub/Microsoft) and, if enabled, to a local model endpoint you configure. The author of AISense has no control over these services, their availability, pricing, data handling, or terms of use. Your use of these services is governed by their respective terms and privacy policies.

License

MIT — Copyright (c) 2026 Thomas van Veen
