AISense
Smart Prompt Compression & Model Routing for GitHub Copilot Chat
AISense is a VS Code extension that sits between you and GitHub Copilot Chat. It compresses your prompts, classifies intent, and routes each request to the cheapest model that fits — saving tokens and money without sacrificing answer quality.

Features
Prompt Compression
An 8-rule pipeline strips noise from your prompts before they reach the model:
| Rule | What it does |
| --- | --- |
| Code Fence Slim | Keeps head + tail of long code blocks, elides the middle |
| Declarative Voice | Converts "Could you please…" → direct commands |
| Instruction Dedup | Removes exact-duplicate sentences |
| Whitespace | Collapses blank lines, strips trailing whitespace |
| Markdown Noise | Strips decorative emoji from headings, removes empty bullets |
| Vague Language | Flags vague phrases ("some examples", "fairly short") as advisory notes |
| Conciseness Hint | Appends a brevity hint to save output tokens |
| Hallucination Guard | Appends a grounding hint to reduce hallucination |
Every rule can be toggled individually via settings.
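As a rough illustration of how a toggleable rule pipeline like this can be structured, the sketch below implements two of the rules (Whitespace and Instruction Dedup) and applies only the enabled ones in order. The rule ids, the `compress` function, and the shape of the `enabled` map are hypothetical and not taken from the AISense source; the sentence splitting is deliberately naive.

```typescript
// Illustrative sketch only: rule ids and the settings shape are assumptions.
type Rule = { id: string; apply: (prompt: string) => string };

const rules: Rule[] = [
  {
    id: "whitespace",
    // Strip trailing spaces/tabs on each line, then collapse 3+ newlines into one blank line.
    apply: (p) => p.replace(/[ \t]+$/gm, "").replace(/\n{3,}/g, "\n\n"),
  },
  {
    id: "instructionDedup",
    // Drop any sentence that exactly repeats an earlier one (naive punctuation-based split).
    apply: (p) => {
      const segments: string[] = p.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) || [];
      const seen = new Set<string>();
      let out = "";
      for (const segment of segments) {
        const key = segment.trim();
        if (key !== "" && seen.has(key)) continue; // repeated sentence: skip it
        seen.add(key);
        out += segment;
      }
      return out;
    },
  },
];

// Run every rule that has not been explicitly switched off, in declaration order.
function compress(prompt: string, enabled: Record<string, boolean> = {}): string {
  return rules
    .filter((rule) => enabled[rule.id] !== false)
    .reduce((text, rule) => rule.apply(text), prompt);
}
```

Calling `compress(prompt, { instructionDedup: false })` would run only the whitespace rule, which mirrors the per-rule toggles described above.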
Intent-Based Model Routing
A keyword-based classifier detects what you're asking for (code, refactor, reasoning, translate, explain, Q&A) and a first-match-wins policy selects the cheapest model that fits:
qa + short prompt → copilot-fast
translate → gpt-4o-mini
large code task → gpt-5.2-codex
deep reasoning → claude-opus-4.6
The policy is fully customisable via .aisense/modelPolicy.yaml in your workspace.
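A keyword-based classifier of this kind can be sketched in a few lines. The keyword lists and their order below are invented for illustration (the actual keywords AISense uses are not documented here); only the six intent names come from this README.

```typescript
type Intent = "code" | "refactor" | "reasoning" | "translate" | "explain" | "qa";

// Hypothetical keyword patterns, checked in order; the first pattern that matches wins.
const keywordTable: Array<[Intent, RegExp]> = [
  ["refactor", /\b(refactor|rename|extract|clean up)\b/i],
  ["translate", /\b(translate|translation)\b/i],
  ["explain", /\b(explain|what does|how does)\b/i],
  ["reasoning", /\b(why|trade-?offs?|compare|design)\b/i],
  ["code", /\b(write|implement|create|fix|function|class)\b/i],
];

function classifyIntent(prompt: string): Intent {
  for (const [intent, pattern] of keywordTable) {
    if (pattern.test(prompt)) return intent;
  }
  return "qa"; // assumption: fall back to Q&A when no keyword matches
}
```

Because the table is scanned in order, a prompt such as "Refactor this class" is classified as refactor rather than code, even though it also contains a code keyword.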
Local Model Routing
Route simple intents (Q&A, explain, translate) to a local model for 0 credits. Works with any OpenAI-compatible endpoint:
- Ollama (default)
- Docker Model Runner
- LM Studio
- llama.cpp / vLLM
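All of these servers accept the OpenAI chat-completions request shape, so a request to a local model might be assembled roughly as follows. The helper name `buildLocalChatRequest` is hypothetical; the endpoint and model values are the defaults listed under Configuration, and the sketch assumes the endpoint URL already includes the `/v1` prefix.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: builds the URL and fetch options for an OpenAI-compatible server.
function buildLocalChatRequest(endpoint: string, model: string, prompt: string) {
  const url = endpoint.replace(/\/+$/, "") + "/chat/completions"; // tolerate a trailing slash
  const messages: ChatMessage[] = [{ role: "user", content: prompt }];
  return {
    url,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}

const request = buildLocalChatRequest("http://localhost:11434/v1", "llama3.2", "What is a closure?");
// To send it (requires a running local server): const response = await fetch(request.url, request.init);
```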

Agent Hand-off
When your prompt is actionable (create/edit/refactor), AISense shows a "Run with Copilot" button that re-issues the compressed prompt to the default Copilot Chat agent — which has file and terminal tools that chat participants don't. A scope-guard suffix prevents the agent from running unnecessary builds or tests.
Dashboard
A built-in dashboard tracks your savings in real time:
- Total tokens saved (input + output)
- Estimated credit savings vs your baseline model
- Per-request breakdown with intent, model, and compression ratio
- Scope-guard savings from agent hand-offs

Companion Panel
A sidebar panel in the Activity Bar shows your most recent requests with model, intent, and savings at a glance. It also shows the local model status and a quick toggle.

Instruction File Linter
Automatically lints *.instructions.md, copilot-instructions.md, and *.prompt.md files for:
- Token bloat (configurable threshold, default 500 tokens)
- Duplicate sentences
- "Act as a…" role prompts (which increase hallucination risk)
The linter offers a Compress code action to fix issues inline.
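The three checks listed above could look roughly like the sketch below. It is not the extension's linter: the token count uses a crude four-characters-per-token estimate as a stand-in for a real tokenizer, and the issue kinds, function names, and messages are made up for illustration.

```typescript
interface LintIssue {
  kind: "token-bloat" | "duplicate-sentence" | "role-prompt";
  message: string;
}

// Assumption: roughly 4 characters per token. A real tokenizer will give different counts.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function lintInstructions(text: string, maxTokens = 500): LintIssue[] {
  const issues: LintIssue[] = [];

  const tokens = estimateTokens(text);
  if (tokens > maxTokens) {
    issues.push({ kind: "token-bloat", message: `~${tokens} tokens exceeds the ${maxTokens}-token threshold` });
  }

  // Naive sentence split on end punctuation or line breaks; exact matches count as duplicates.
  const seen = new Set<string>();
  for (const raw of text.split(/[.!?]\s+|\n+/)) {
    const sentence = raw.trim().replace(/[.!?]+$/, "");
    if (sentence === "") continue;
    if (seen.has(sentence)) {
      issues.push({ kind: "duplicate-sentence", message: `repeated sentence: "${sentence}"` });
    }
    seen.add(sentence);
  }

  if (/\bact as an?\b/i.test(text)) {
    issues.push({ kind: "role-prompt", message: 'contains an "Act as a…" role prompt' });
  }
  return issues;
}
```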
Status Bar Token Counter
A live token counter in the status bar shows the token count of your current selection (or of the whole file when nothing is selected). Click it to compress.

Custom Agent Template
Install the "AISense Saver" agent template into your workspace via the command palette. It acts as a cost-optimised agent that compresses every prompt and prefers local models for simple tasks.
Getting Started
Prerequisites
- VS Code 1.95+
- GitHub Copilot Chat extension
Installation
Install the .vsix from a release or build from source:
pnpm install
pnpm run build
Install in VS Code:
code --install-extension aisense-*.vsix
Open any workspace and type @aisense in Copilot Chat.
Local Model Setup (Optional)
- Install Ollama and pull a model:
ollama pull llama3.2
- Enable local routing:
{ "aisense.local.enabled": true }
- The Companion panel shows a green dot when the local model is reachable.
Commands
| Command | Description |
| --- | --- |
| AISense: Run API Smoke Test | Verify Copilot API access |
| AISense: Compress Selection | Compress the selected text and show a diff |
| AISense: Open Dashboard | Open the savings dashboard |
| AISense: Run Demo (Goldset) | Run the built-in demo prompts |
| AISense: Install "Saver" Custom Agent | Install the agent template into your workspace |
| AISense: Refresh Model Pricing | Auto-discover model pricing from the Copilot API |
Configuration
All settings live under aisense.*. Key settings:
| Setting | Default | Description |
| --- | --- | --- |
| aisense.enabled | true | Master switch |
| aisense.routing.enabled | true | Enable compress + route pipeline |
| aisense.routing.showHeader | true | Show intent/model/savings header in replies |
| aisense.routing.showCompressedPrompt | true | Show the compressed prompt in replies |
| aisense.routing.forwardHistory | true | Forward conversation history for multi-turn context |
| aisense.local.enabled | false | Route eligible intents to a local model |
| aisense.local.endpoint | http://localhost:11434/v1 | OpenAI-compatible endpoint URL |
| aisense.local.model | llama3.2 | Model ID for local inference |
| aisense.linter.enabled | true | Lint instruction files for token bloat |
| aisense.statusBar.enabled | true | Show live token counter |
See the full settings reference in the VS Code Settings UI under AISense.
Model Policy
The routing policy is defined in .aisense/modelPolicy.yaml in your workspace root. This is where you control which model handles which intent.
version: 1
defaultModel: gpt-4o-mini
rules:
  - intent: qa
    model: gpt-4o-mini
    reason: "all Q&A goes to the cheapest model"
  - intent: code
    maxTokens: 8000
    model: copilot-fast
    reason: "small code task — fast & cheap is enough"
  - intent: code
    model: claude-sonnet-4.6
    reason: "larger code task — needs more context"
  - intent: reasoning
    model: claude-opus-4.6
    reason: "deep reasoning — premium model"
Rules are evaluated top to bottom, and the first match wins. Use maxTokens to split by prompt size (e.g. small code tasks → cheap model, large ones → premium). If no rule matches, defaultModel is used.
Available intents: code, refactor, reasoning, translate, explain, qa.
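The first-match-wins evaluation can be sketched as below, using a policy object that mirrors the YAML example. The type and function names are hypothetical, and the sketch assumes maxTokens is an inclusive upper bound on the prompt's token count, which the README does not state explicitly.

```typescript
type Intent = "code" | "refactor" | "reasoning" | "translate" | "explain" | "qa";

interface PolicyRule {
  intent: Intent;
  maxTokens?: number; // assumption: the rule applies when promptTokens <= maxTokens
  model: string;
}

interface ModelPolicy {
  version: number;
  defaultModel: string;
  rules: PolicyRule[];
}

// Walk the rules top to bottom and return the model of the first rule that matches.
function selectModel(policy: ModelPolicy, intent: Intent, promptTokens: number): string {
  for (const rule of policy.rules) {
    const sizeFits = rule.maxTokens === undefined || promptTokens <= rule.maxTokens;
    if (rule.intent === intent && sizeFits) return rule.model;
  }
  return policy.defaultModel; // no rule matched
}

const examplePolicy: ModelPolicy = {
  version: 1,
  defaultModel: "gpt-4o-mini",
  rules: [
    { intent: "qa", model: "gpt-4o-mini" },
    { intent: "code", maxTokens: 8000, model: "copilot-fast" },
    { intent: "code", model: "claude-sonnet-4.6" },
    { intent: "reasoning", model: "claude-opus-4.6" },
  ],
};
```

With this policy, a 500-token code prompt resolves to copilot-fast, a 20,000-token code prompt falls through to claude-sonnet-4.6, and a translate prompt (which no rule covers) gets the default model.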
The file is checked on every @aisense call and cached by modification time, so an edit is picked up on the next call.
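Caching by modification time usually means comparing the file's current mtime with the one recorded at the last load and re-parsing only when they differ. The sketch below shows that pattern in a generic form; `createMtimeCache` is a hypothetical helper, and the `stat` and `load` callbacks are injected so the example does not depend on the file system or on a particular YAML parser.

```typescript
// Hypothetical helper: returns a getter that reloads a file only when its mtime changes.
function createMtimeCache<T>(
  stat: (path: string) => number, // e.g. (p) => fs.statSync(p).mtimeMs in Node
  load: (path: string) => T,      // e.g. read the file and parse the YAML
): (path: string) => T {
  const cache = new Map<string, { mtimeMs: number; value: T }>();
  return (path: string): T => {
    const mtimeMs = stat(path);
    const hit = cache.get(path);
    if (hit !== undefined && hit.mtimeMs === mtimeMs) return hit.value; // unchanged: reuse
    const value = load(path);
    cache.set(path, { mtimeMs, value });
    return value;
  };
}
```

The stat call still runs on every request, but the comparatively expensive read and parse happens only after the file has actually been edited.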
Development
pnpm install
pnpm run watch # incremental rebuild
pnpm run test # run vitest suite
VS Code Tasks
The workspace ships with pre-configured tasks (defined in .vscode/tasks.json and AISense.code-workspace):
| Task | Description |
| --- | --- |
| npm: build (default build) | Production bundle via esbuild |
| npm: watch | Incremental rebuild in watch mode (background task) |
| AISense: Package (.vsix) | Build + package into a .vsix (runs npm: build first) |
| AISense: Install (Stable) | Package + install into VS Code Stable |
| AISense: Install (Insiders) | Package + install into VS Code Insiders |
| AISense: Install (Stable + Insiders) | Package + install into both Stable and Insiders in parallel |
| AISense: Uninstall (Stable) | Remove the extension from VS Code Stable |
| AISense: Uninstall (Insiders) | Remove the extension from VS Code Insiders |
| AISense: Uninstall (Stable + Insiders) | Remove the extension from both Stable and Insiders |
Additional npm scripts available from the command line:
| Script | Command | Description |
| --- | --- | --- |
| compile | pnpm run compile | Type-check with tsc --noEmit (no output) |
| test | pnpm run test | Run the vitest suite once |
| test:watch | pnpm run test:watch | Run vitest in watch mode |
Press F5 in VS Code to launch the Extension Development Host.
Disclaimer
AISense provides estimates of token and credit usage based on approximate tokenization and built-in pricing data. It is not a replacement for monitoring your actual usage and costs in your GitHub billing dashboard. Credit calculations may be inaccurate due to tokenizer differences, outdated pricing data, or API changes. You are solely responsible for tracking your own Copilot credit consumption. The author assumes no liability for miscalculations, unexpected charges, or any financial impact resulting from the use of this extension.
No guarantee of savings. The savings figures shown in the dashboard and chat headers are estimates based on counterfactual comparisons. Actual savings depend on your usage patterns, model availability, and Copilot plan. The author of AISense does not guarantee any specific reduction in credit consumption.
Third-party services. AISense routes your prompts to GitHub Copilot models (provided by GitHub/Microsoft) and, if enabled, to a local model endpoint you configure. The author of AISense has no control over these services, their availability, pricing, data handling, or terms of use. Your use of these services is governed by their respective terms and privacy policies.
License
MIT — Copyright (c) 2026 Thomas van Veen