AISense

Smart Prompt Compression & Model Routing for GitHub Copilot Chat

AISense is a VS Code extension that sits between you and GitHub Copilot Chat. It compresses your prompts, classifies intent, and routes each request to the cheapest model that fits — saving tokens and money without sacrificing answer quality.

Features

Prompt Compression

An 8-rule pipeline strips noise from your prompts before they reach the model:

| Rule | What it does |
| --- | --- |
| Code Fence Slim | Keeps head + tail of long code blocks, elides the middle |
| Declarative Voice | Converts "Could you please…" → direct commands |
| Instruction Dedup | Removes exact-duplicate sentences |
| Whitespace | Collapses blank lines, strips trailing whitespace |
| Markdown Noise | Strips decorative emoji from headings, removes empty bullets |
| Vague Language | Flags vague phrases ("some examples", "fairly short") as advisory notes |
| Conciseness Hint | Appends a brevity hint to save output tokens |
| Hallucination Guard | Appends a grounding hint to reduce hallucination |

Every rule can be toggled individually via settings.
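
To make the idea of a toggleable rule pipeline concrete, here is a minimal sketch of two of the rules (Whitespace and Instruction Dedup) chained together. The `Rule` type, the function names, and the sentence-splitting regex are illustrative assumptions, not the extension's actual implementation.

```typescript
// Hypothetical shape of a compression rule; names are assumptions for illustration.
interface Rule {
  name: string;
  enabled: boolean;
  apply: (text: string) => string;
}

// Whitespace: strip trailing spaces/tabs, then collapse 3+ newlines into one blank line.
const whitespace = (text: string): string =>
  text
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, ""))
    .join("\n")
    .replace(/\n{3,}/g, "\n\n");

// Instruction Dedup: drop sentences that exactly repeat an earlier sentence.
// Naive splitting on ., ! and ? (an assumption; real prompts need more care).
const instructionDedup = (text: string): string => {
  const chunks = text.match(/[^.!?]+(?:[.!?]+|$)\s*/g);
  if (chunks === null) return text;
  const seen = new Set<string>();
  return chunks
    .filter((chunk) => {
      const key = chunk.trim();
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    })
    .join("");
};

// Run every enabled rule in order, feeding each rule the previous rule's output.
function compress(prompt: string, rules: Rule[]): string {
  return rules
    .filter((rule) => rule.enabled)
    .reduce((text, rule) => rule.apply(text), prompt);
}

const rules: Rule[] = [
  { name: "whitespace", enabled: true, apply: whitespace },
  { name: "instructionDedup", enabled: true, apply: instructionDedup },
];

console.log(compress("Use tabs.   \n\n\n\nUse tabs. Keep it short.", rules));
```

Setting `enabled: false` on an entry skips that rule, which is roughly what a per-rule setting would do.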

Intent-Based Model Routing

A keyword-based classifier detects what you're asking for (code, refactor, reasoning, translate, explain, Q&A) and a first-match-wins policy selects the cheapest model that fits:

qa + short prompt     → copilot-fast
translate             → gpt-4o-mini
large code task       → gpt-5.2-codex
deep reasoning        → claude-opus-4.6

The policy is fully customisable via .aisense/modelPolicy.yaml in your workspace.
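
A keyword-based classifier of this kind might look like the sketch below. The keyword lists and their ordering are assumptions made up for illustration; the extension's real lists are not documented here. The first pattern that matches decides the intent, and anything unmatched falls back to `qa`.

```typescript
type Intent = "code" | "refactor" | "reasoning" | "translate" | "explain" | "qa";

// Hypothetical keyword patterns, checked in order.
const KEYWORDS: Array<[Intent, RegExp]> = [
  ["translate", /\b(translate|translation)\b/i],
  ["refactor", /\b(refactor|rename|clean up|simplify)\b/i],
  ["code", /\b(implement|write|create|fix|function|class)\b/i],
  ["reasoning", /\b(why|prove|trade-?offs?|design|compare)\b/i],
  ["explain", /\b(explain|describe|walk me through)\b/i],
];

// Return the first intent whose pattern matches; default to "qa".
function classifyIntent(prompt: string): Intent {
  for (const [intent, pattern] of KEYWORDS) {
    if (pattern.test(prompt)) return intent;
  }
  return "qa";
}

console.log(classifyIntent("Translate this comment to German"));
console.log(classifyIntent("What is a closure?"));
```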

Local Model Routing

Route simple intents (Q&A, explain, translate) to a local model at no credit cost. Works with any OpenAI-compatible endpoint:

Ollama (default)
Docker Model Runner
LM Studio
llama.cpp / vLLM
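
For orientation, a request to an OpenAI-compatible endpoint is a JSON POST to the `/chat/completions` path under the configured base URL. The sketch below only builds such a request; the `LocalModelConfig` and `ChatRequest` types and the `buildLocalChatRequest` helper are hypothetical names, and the extension may construct its requests differently.

```typescript
// Mirrors the aisense.local.* settings; the interface name is an assumption.
interface LocalModelConfig {
  endpoint: string; // e.g. "http://localhost:11434/v1"
  model: string; // e.g. "llama3.2"
}

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) a chat completion request for the local endpoint.
function buildLocalChatRequest(config: LocalModelConfig, prompt: string): ChatRequest {
  const base = config.endpoint.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${base}/chat/completions`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: config.model,
      messages: [{ role: "user", content: prompt }],
      stream: false,
    }),
  };
}

const request = buildLocalChatRequest(
  { endpoint: "http://localhost:11434/v1", model: "llama3.2" },
  "What does git stash do?",
);
// With a local server running, this could be sent with:
//   const response = await fetch(request.url, request);
console.log(request.url);
```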

Agent Hand-off

When your prompt is actionable (create/edit/refactor), AISense shows a "Run with Copilot" button that re-issues the compressed prompt to the default Copilot Chat agent — which has file and terminal tools that chat participants don't. A scope-guard suffix prevents the agent from running unnecessary builds or tests.

Dashboard

A built-in dashboard tracks your savings in real time:

Total tokens saved (input + output)
Estimated credit savings vs your baseline model
Per-request breakdown with intent, model, and compression ratio
Scope-guard savings from agent hand-offs
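
As a rough illustration of the arithmetic behind figures like these, the sketch below aggregates per-request records. The record fields and the credit numbers are hypothetical, and it covers input tokens only; the dashboard's actual accounting may differ.

```typescript
// Hypothetical per-request record; field names are assumptions.
interface RequestRecord {
  intent: string;
  model: string;
  originalTokens: number; // prompt size before compression
  compressedTokens: number; // prompt size after compression
  baselineCredits: number; // estimated cost had the baseline model been used
  actualCredits: number; // estimated cost of the model actually chosen
}

// Fraction of the prompt removed by compression (0 = nothing removed).
const compressionRatio = (r: RequestRecord): number =>
  r.originalTokens === 0 ? 0 : 1 - r.compressedTokens / r.originalTokens;

// Sum token and credit savings across all requests.
function summarize(records: RequestRecord[]): { tokensSaved: number; creditsSaved: number } {
  let tokensSaved = 0;
  let creditsSaved = 0;
  for (const r of records) {
    tokensSaved += r.originalTokens - r.compressedTokens;
    creditsSaved += r.baselineCredits - r.actualCredits;
  }
  return { tokensSaved, creditsSaved };
}

const records: RequestRecord[] = [
  { intent: "code", model: "copilot-fast", originalTokens: 1000, compressedTokens: 750, baselineCredits: 1, actualCredits: 0.25 },
  { intent: "qa", model: "llama3.2", originalTokens: 200, compressedTokens: 200, baselineCredits: 1, actualCredits: 0 },
];

console.log(summarize(records), compressionRatio(records[0]));
```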

Companion Panel

A sidebar panel in the Activity Bar shows your most recent requests with model, intent, and savings at a glance. It also shows the local model status and a quick toggle.

Instruction File Linter

Automatically lints *.instructions.md, copilot-instructions.md, and *.prompt.md files for:

Token bloat (configurable threshold, default 500 tokens)
Duplicate sentences
"Act as a…" role prompts (which increase hallucination risk)

The linter offers a Compress code action to fix issues inline.
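
The three checks can be sketched as a single function. This is an illustration under assumptions: the token estimate uses a rough four-characters-per-token heuristic rather than a real tokenizer, and the `LintIssue` type, the `lintInstructions` name, and the regexes are hypothetical.

```typescript
interface LintIssue {
  rule: "token-bloat" | "duplicate-sentence" | "role-prompt";
  message: string;
}

// Rough heuristic (about 4 characters per token); not a real tokenizer.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function lintInstructions(text: string, maxTokens = 500): LintIssue[] {
  const issues: LintIssue[] = [];

  // 1. Token bloat against a configurable threshold.
  const tokens = estimateTokens(text);
  if (tokens > maxTokens) {
    issues.push({ rule: "token-bloat", message: "about " + tokens + " tokens, limit is " + maxTokens });
  }

  // 2. Exact-duplicate sentences (naive split on sentence punctuation and newlines).
  const seen = new Set<string>();
  for (const sentence of text.split(/[.!?]+\s+|\n+/)) {
    const key = sentence.trim().replace(/[.!?]+$/, "");
    if (key.length === 0) continue;
    if (seen.has(key)) {
      issues.push({ rule: "duplicate-sentence", message: "repeated: " + key });
    }
    seen.add(key);
  }

  // 3. "Act as a…" role prompts.
  if (/\bact as an?\b/i.test(text)) {
    issues.push({ rule: "role-prompt", message: "role prompt found" });
  }

  return issues;
}

console.log(lintInstructions("Act as a senior engineer. Use tabs. Use tabs."));
```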

Status Bar Token Counter

A live token counter in the status bar shows the token count of your current selection (or whole file). Click it to compress.

Custom Agent Template

Install the "AISense Saver" agent template into your workspace via the command palette. It acts as a cost-optimised agent that compresses every prompt and prefers local models for simple tasks.

Getting Started

Prerequisites

VS Code 1.95+
GitHub Copilot Chat extension

Installation

Install the .vsix from a release or build from source:
```
pnpm install
pnpm run build
```

Install in VS Code:

code --install-extension aisense-*.vsix

Open any workspace and type @aisense in Copilot Chat.

Local Model Setup (Optional)

Install Ollama and pull a model:
```
ollama pull llama3.2
```
Enable local routing:
```
{ "aisense.local.enabled": true }
```
The Companion panel shows a green dot when the local model is reachable.

Commands

| Command | Description |
| --- | --- |
| `AISense: Run API Smoke Test` | Verify Copilot API access |
| `AISense: Compress Selection` | Compress the selected text and show a diff |
| `AISense: Open Dashboard` | Open the savings dashboard |
| `AISense: Run Demo (Goldset)` | Run the built-in demo prompts |
| `AISense: Install "Saver" Custom Agent` | Install the agent template into your workspace |
| `AISense: Refresh Model Pricing` | Auto-discover model pricing from the Copilot API |

Configuration

All settings live under aisense.*. Key settings:

| Setting | Default | Description |
| --- | --- | --- |
| `aisense.enabled` | `true` | Master switch |
| `aisense.routing.enabled` | `true` | Enable compress + route pipeline |
| `aisense.routing.showHeader` | `true` | Show intent/model/savings header in replies |
| `aisense.routing.showCompressedPrompt` | `true` | Show the compressed prompt in replies |
| `aisense.routing.forwardHistory` | `true` | Forward conversation history for multi-turn context |
| `aisense.local.enabled` | `false` | Route eligible intents to a local model |
| `aisense.local.endpoint` | `http://localhost:11434/v1` | OpenAI-compatible endpoint URL |
| `aisense.local.model` | `llama3.2` | Model ID for local inference |
| `aisense.linter.enabled` | `true` | Lint instruction files for token bloat |
| `aisense.statusBar.enabled` | `true` | Show live token counter |

See the full settings reference in the VS Code Settings UI under AISense.

Model Policy

The routing policy is defined in .aisense/modelPolicy.yaml in your workspace root. This is where you control which model handles which intent.

version: 1
defaultModel: gpt-4o-mini

rules:
  - intent: qa
    model: gpt-4o-mini
    reason: "all Q&A goes to the cheapest model"
  - intent: code
    maxTokens: 8000
    model: copilot-fast
    reason: "small code task — fast & cheap is enough"
  - intent: code
    model: claude-sonnet-4.6
    reason: "larger code task — needs more context"
  - intent: reasoning
    model: claude-opus-4.6
    reason: "deep reasoning — premium model"

Rules are evaluated top-to-bottom, first match wins. Use maxTokens to split by prompt size (e.g. small code tasks → cheap model, large ones → premium). If no rule matches, defaultModel is used.

Available intents: code, refactor, reasoning, translate, explain, qa.
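
The first-match-wins evaluation can be sketched as follows, using the example policy above as plain data. The type names and `selectModel` are hypothetical, and treating `maxTokens` as an inclusive upper bound on prompt size is an assumption.

```typescript
type Intent = "code" | "refactor" | "reasoning" | "translate" | "explain" | "qa";

interface PolicyRule {
  intent: Intent;
  maxTokens?: number; // assumed to be an inclusive upper bound on prompt tokens
  model: string;
  reason?: string;
}

interface ModelPolicy {
  version: number;
  defaultModel: string;
  rules: PolicyRule[];
}

// Walk the rules top to bottom and return the first one that matches.
function selectModel(policy: ModelPolicy, intent: Intent, promptTokens: number): string {
  for (const rule of policy.rules) {
    if (rule.intent !== intent) continue;
    if (rule.maxTokens !== undefined && promptTokens > rule.maxTokens) continue;
    return rule.model;
  }
  return policy.defaultModel; // no rule matched
}

// The example policy from this section, expressed as an object.
const policy: ModelPolicy = {
  version: 1,
  defaultModel: "gpt-4o-mini",
  rules: [
    { intent: "qa", model: "gpt-4o-mini" },
    { intent: "code", maxTokens: 8000, model: "copilot-fast" },
    { intent: "code", model: "claude-sonnet-4.6" },
    { intent: "reasoning", model: "claude-opus-4.6" },
  ],
};

console.log(selectModel(policy, "code", 500)); // small code task
console.log(selectModel(policy, "code", 20000)); // large code task
console.log(selectModel(policy, "translate", 10)); // no rule, falls back to defaultModel
```

Because the size-limited `code` rule comes first, small code prompts never reach the second `code` rule; swapping the two would send every code prompt to the larger model.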

The file is checked on every @aisense call and re-read only when its modification time has changed, so edits take effect on the next call.
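
Caching keyed on modification time can be sketched with Node's `fs` module. `createMtimeCache` is a hypothetical helper, and the `parse` callback stands in for whatever YAML parser is used; the extension's own loader may be structured differently.

```typescript
import * as fs from "node:fs";

// Returns a loader that re-parses the file only when its mtime changes.
function createMtimeCache<T>(path: string, parse: (raw: string) => T): () => T {
  let entry: { mtimeMs: number; value: T } | null = null;
  return () => {
    const { mtimeMs } = fs.statSync(path); // cheap check on every call
    if (entry === null || entry.mtimeMs !== mtimeMs) {
      // First call, or the file was edited since the last read.
      entry = { mtimeMs, value: parse(fs.readFileSync(path, "utf8")) };
    }
    return entry.value;
  };
}

// Usage sketch (assumes the policy file exists and that a YAML parser is available):
//   const loadPolicy = createMtimeCache(".aisense/modelPolicy.yaml", parseYaml);
//   const policy = loadPolicy();
```

The `stat` call is much cheaper than reading and parsing the file, so unchanged policies cost almost nothing per request.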

Development

pnpm install
pnpm run watch    # incremental rebuild
pnpm run test     # run vitest suite

VS Code Tasks

The workspace ships with pre-configured tasks (defined in .vscode/tasks.json and AISense.code-workspace):

| Task | Description |
| --- | --- |
| `npm: build` (default build) | Production bundle via esbuild |
| `npm: watch` | Incremental rebuild in watch mode (background task) |
| `AISense: Package (.vsix)` | Build + package into a `.vsix` (runs `npm: build` first) |
| `AISense: Install (Stable)` | Package + install into VS Code Stable |
| `AISense: Install (Insiders)` | Package + install into VS Code Insiders |
| `AISense: Install (Stable + Insiders)` | Package + install into both Stable and Insiders in parallel |
| `AISense: Uninstall (Stable)` | Remove the extension from VS Code Stable |
| `AISense: Uninstall (Insiders)` | Remove the extension from VS Code Insiders |
| `AISense: Uninstall (Stable + Insiders)` | Remove the extension from both Stable and Insiders |

Additional package scripts are available from the command line:

| Script | Command | Description |
| --- | --- | --- |
| `compile` | `pnpm run compile` | Type-check with `tsc --noEmit` (no output) |
| `test` | `pnpm run test` | Run the vitest suite once |
| `test:watch` | `pnpm run test:watch` | Run vitest in watch mode |

Press F5 in VS Code to launch the Extension Development Host.

Disclaimer

AISense provides estimates of token and credit usage based on approximate tokenization and built-in pricing data. It is not a replacement for monitoring your actual usage and costs in your GitHub billing dashboard. Credit calculations may be inaccurate due to tokenizer differences, outdated pricing data, or API changes. You are solely responsible for tracking your own Copilot credit consumption. The author assumes no liability for miscalculations, unexpected charges, or any financial impact resulting from the use of this extension.

No guarantee of savings. The savings figures shown in the dashboard and chat headers are estimates based on counterfactual comparisons. Actual savings depend on your usage patterns, model availability, and Copilot plan. The author of AISense does not guarantee any specific reduction in credit consumption.

Third-party services. AISense routes your prompts to GitHub Copilot models (provided by GitHub/Microsoft) and, if enabled, to a local model endpoint you configure. The author of AISense has no control over these services, their availability, pricing, data handling, or terms of use. Your use of these services is governed by their respective terms and privacy policies.

AISense

DeepSharpᴵᴼ
