SmartRoute — AI Token Optimizer for VS Code

Route smarter. Spend less. Code faster. Target: 40–60% reduction in LLM token spend within 90 days, with zero degradation in output quality.



What is SmartRoute?

SmartRoute is an intelligent AI dispatch layer built directly into VS Code. Instead of always hitting the most expensive frontier model, SmartRoute automatically:

  1. Classifies your prompt (boilerplate vs. architecture vs. complex bug fix)
  2. Routes it to the cheapest model capable of answering it well
  3. Compresses context to strip noise before sending tokens
  4. Tracks every interaction in a real-time cost dashboard
  5. Learns from your feedback to improve routing accuracy over time

The result? You keep Copilot-quality answers while dramatically cutting API costs — whether you're a solo developer or running an engineering org.
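The classify-then-route flow above can be pictured with a minimal sketch. The heuristics and model identifiers here are illustrative only, not SmartRoute's actual internals:

```typescript
// Toy sketch of classify → route. Real classification uses a trained
// model with confidence scores; this regex heuristic is a stand-in.
type Tier = 1 | 2 | 3;

function classify(prompt: string): Tier {
  if (/architecture|design|brainstorm/i.test(prompt)) return 3;
  if (/refactor|feature|review|test/i.test(prompt)) return 2;
  return 1; // boilerplate, docs, simple fixes
}

const TIER_MODEL: Record<Tier, string> = {
  1: "claude-haiku-4-5",  // cheapest capable
  2: "claude-sonnet-4-6",
  3: "claude-opus-4-6",   // frontier, only when needed
};

function dispatch(prompt: string): { tier: Tier; model: string } {
  const tier = classify(prompt);
  return { tier, model: TIER_MODEL[tier] };
}
```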


Features

⚡ Smart 3-Tier Routing

Every prompt is classified and dispatched to the cheapest capable model automatically.

| Tier | Task Types | Models Used |
| --- | --- | --- |
| Tier 1 — Simple | boilerplate, explain, docs, simple bug fix | Claude Haiku 4.5, GPT-4o mini, Gemini Flash |
| Tier 2 — Medium | feature build, refactor, review, test writing | Claude Sonnet 4.6, GPT-4o, Gemini Pro |
| Tier 3 — Complex | architecture, brainstorm, novel debugging | Claude Opus 4.6, o3, Gemini Ultra |
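Read as data, each tier is an ordered candidate list, and routing picks the first candidate whose provider you have a key for. A sketch under that assumption (model identifiers are placeholders, not SmartRoute's actual catalog):

```typescript
// The tier table rendered as data, plus a provider-aware lookup.
interface Candidate { provider: string; model: string; }

const TIER_CANDIDATES: Record<number, Candidate[]> = {
  1: [
    { provider: "anthropic", model: "claude-haiku-4-5" },
    { provider: "openai",    model: "gpt-4o-mini" },
    { provider: "google",    model: "gemini-flash" },
  ],
  2: [
    { provider: "anthropic", model: "claude-sonnet-4-6" },
    { provider: "openai",    model: "gpt-4o" },
    { provider: "google",    model: "gemini-pro" },
  ],
  3: [
    { provider: "anthropic", model: "claude-opus-4-6" },
    { provider: "openai",    model: "o3" },
    { provider: "google",    model: "gemini-ultra" },
  ],
};

// First candidate whose provider has a configured API key wins.
function pickModel(tier: number, configured: Set<string>): string | undefined {
  return TIER_CANDIDATES[tier]?.find(c => configured.has(c.provider))?.model;
}
```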

🗜️ Context Compression

Before sending any prompt, SmartRoute:

  • Strips all code comments (configurable)
  • Normalizes whitespace and blank lines
  • Deduplicates unchanged files across conversation turns
  • Manages a sliding history window (3 / 6 / 10 turns per tier)

This alone can reduce per-request token count by 20–35%.
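The comment-stripping and whitespace steps can be sketched naively as below. This is a simplification: a real implementation must be syntax-aware so it doesn't mangle `//` inside string literals or URLs.

```typescript
// Naive compression pass: drop line comments, trim trailing whitespace,
// and collapse runs of blank lines. Illustrative only — not syntax-aware.
function compressContext(source: string): string {
  return source
    .split("\n")
    .map(line => line.replace(/\/\/.*$/, "").replace(/\s+$/, ""))
    .join("\n")
    .replace(/\n{2,}/g, "\n\n") // collapse blank-line runs to one
    .trim();
}
```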

📊 Real-Time Usage Dashboard

Run SmartRoute: Open Dashboard from the Command Palette to see:

  • Total tokens consumed today / this week / this month
  • Cost breakdown by model and task type
  • Estimated savings vs. always using Tier 3
  • Model distribution pie chart (powered by Chart.js)

🔁 Quality Feedback Loop

Every response includes a 👍 / 👎 button. A thumbs-down automatically:

  • Re-routes the same prompt to the next tier up
  • Records the calibration event
  • Gradually improves the classifier's thresholds
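The escalation behavior amounts to "bump one tier, cap at 3, log the event." A minimal sketch, with all names hypothetical:

```typescript
type Tier = 1 | 2 | 3;

// On a thumbs-down, escalate one tier (capped at Tier 3) and record a
// calibration event for later threshold tuning. Names illustrative.
interface CalibrationEvent { prompt: string; fromTier: Tier; toTier: Tier; }

const calibrationLog: CalibrationEvent[] = [];

function onThumbsDown(prompt: string, tier: Tier): Tier {
  const next = Math.min(tier + 1, 3) as Tier;
  calibrationLog.push({ prompt, fromTier: tier, toTier: next });
  return next; // caller re-sends the same prompt at this tier
}
```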

🛡️ Admin Panel

Manage everything from SmartRoute: Admin Settings:

  • Add / rotate API keys (stored in VS Code's encrypted SecretStorage — never written to disk)
  • Set daily token budgets per developer
  • Override routing for specific task types (e.g. always use Opus for architecture)
  • Configure an optional org telemetry endpoint (aggregated counts only, zero prompt content)

🔬 Research Agent

The built-in Research Sub-Agent fetches live best-practice documentation weekly from Anthropic, OpenAI, and Google developer portals, so your system prompts stay optimal as models evolve.


Getting Started

1. Install the Extension

Search for SmartRoute in the VS Code Extensions panel, or install via command palette:

ext install livemint-engineering.smartroute

2. Add Your API Key(s)

Open the Admin panel:

Ctrl+Shift+P → SmartRoute: Admin Settings

Enter at least one API key (Anthropic, OpenAI, or Google). You only need the providers you want to use — SmartRoute will route within your available providers.

3. Open the Chat

Ctrl+Shift+A    (Mac: Cmd+Shift+A)

Or click the ⚡ SmartRoute icon in the Activity Bar.

Start typing — SmartRoute handles everything else.


Chat Hint Prefixes

Prefix your message to override the auto-classifier:

| Prefix | Routes to |
| --- | --- |
| `@research` | Tier 3 — brainstorm |
| `@plan` | Tier 3 — architecture |
| `@build` | Tier 2 — feature build |
| `@fix` | Auto-classified bug fix |
| `@review` | Tier 2 — code review |

Example:

@plan Design a multi-tenant auth system with JWT and refresh tokens

Without a prefix, the classifier decides automatically.
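The prefix lookup might look like the sketch below; the mapping values are placeholders, not the extension's real routing keys. Unknown prefixes fall through to the auto-classifier:

```typescript
// Map chat-hint prefixes to a routing decision. Unrecognized prefixes
// leave the message untouched for auto-classification.
const PREFIX_ROUTES: Record<string, string> = {
  "@research": "tier3:brainstorm",
  "@plan":     "tier3:architecture",
  "@build":    "tier2:feature",
  "@fix":      "auto:bugfix",
  "@review":   "tier2:review",
};

function parseHint(message: string): { route: string | null; body: string } {
  const match = message.match(/^(@\w+)\s+(.*)$/s);
  if (match && PREFIX_ROUTES[match[1]]) {
    return { route: PREFIX_ROUTES[match[1]], body: match[2] };
  }
  return { route: null, body: message }; // auto-classify
}
```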


Commands

| Command | Shortcut | Description |
| --- | --- | --- |
| `SmartRoute: Open Chat` | `Ctrl+Shift+A` | Open the AI chat panel |
| `SmartRoute: Open Dashboard` | — | View token usage & cost savings |
| `SmartRoute: Admin Settings` | — | Manage API keys, budgets, overrides |
| `SmartRoute: Refresh Prompt Knowledge Base` | — | Force-fetch latest provider docs |

Architecture

SmartRoute is a multi-agent system built with TypeScript inside the VS Code extension host:

Your Prompt
    │
    ▼
┌──────────────────────┐
│   Orchestrator Agent  │  ← entry point, coordinates all agents
└──────────┬───────────┘
           │
    ┌──────▼────────────┐
    │  Classifier Agent  │  → { taskType, tier, confidence }
    └──────┬────────────┘
           │
    ┌──────▼────────────────────┐
    │  Prompt Engineer Agent    │  → optimized system prompt
    │  (+ Research Sub-Agent)   │  → fetches live provider docs
    └──────┬────────────────────┘
           │
    ┌──────▼──────────────┐
    │   Context Manager    │  → compress + sliding history window
    └──────┬──────────────┘
           │
    ┌──────▼──────────────┐     ┌─────────────────────────┐
    │    Model Router      │────▶│  Anthropic / OpenAI /   │
    │   (Tier 1 / 2 / 3)   │     │  Google / OpenRouter     │
    └──────┬──────────────┘     └─────────────────────────┘
           │
    ┌──────▼──────────────┐
    │   Usage Logger       │  → local JSON (.smartroute/usage.json)
    └──────┬──────────────┘
           │
    ┌──────▼──────────────┐
    │   Feedback Agent     │  → 👍/👎 re-routing + calibration
    └─────────────────────┘
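For reference, a single Usage Logger entry might look like the fragment below. The field names here are hypothetical — inspect your own .smartroute/usage.json for the actual schema:

```jsonc
{
  "timestamp": "2026-01-15T10:32:00Z",
  "taskType": "refactor",
  "tier": 2,
  "model": "claude-sonnet-4-6",
  "inputTokens": 1840,
  "outputTokens": 620,
  "estimatedCostUsd": 0.0138
}
```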

Configuration Reference

All settings are under smartroute.* in VS Code settings:

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `defaultProvider` | string | `"anthropic"` | Preferred provider when routing confidence is equal |
| `compressionEnabled` | boolean | `true` | Strip comments & normalize whitespace before sending |
| `dailyTokenBudget` | number | `500000` | Per-developer daily token cap (`0` = unlimited) |
| `researchAgentEnabled` | boolean | `true` | Allow weekly best-practice doc fetching |
| `telemetryEndpoint` | string | `""` | Optional org-level aggregated telemetry URL |
| `routingOverrides` | object | `{}` | Pin task types to models, e.g. `{"architecture": "claude-opus-4-6"}` |
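For example, a workspace settings.json that pins architecture work to Opus and caps daily spend might look like this (values illustrative):

```jsonc
{
  "smartroute.defaultProvider": "anthropic",
  "smartroute.compressionEnabled": true,
  "smartroute.dailyTokenBudget": 250000,
  "smartroute.routingOverrides": {
    "architecture": "claude-opus-4-6"
  }
}
```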

Privacy & Security

  • No prompt content is ever logged — only metadata (token counts, model, task type, timestamp)
  • API keys live exclusively in VS Code's encrypted SecretStorage and are never written to disk or sent anywhere
  • Usage data is stored locally at .smartroute/usage.json in your workspace (add to .gitignore)
  • Org telemetry (if configured) sends aggregated token counts only — zero prompt content, zero user identifiers
  • The extension makes no outbound requests except to your configured AI provider APIs and (optionally) provider documentation URLs for the Research Agent

Requirements

  • VS Code 1.85.0 or later
  • At least one API key from:
    • Anthropic (Claude models)
    • OpenAI (GPT models)
    • Google AI Studio (Gemini models)
    • OpenRouter (multi-provider gateway)

Release Notes

See CHANGELOG.md for the full history.

0.1.0 — Initial Release

  • 3-tier smart routing across Anthropic, OpenAI, Google, and OpenRouter
  • Context compression & conversation windowing
  • Real-time usage dashboard
  • Quality feedback loop with automatic re-routing
  • Admin panel with secure API key management
  • Research agent for live prompt best-practices

Contributing

Issues and pull requests are welcome at github.com/anshu-agi/smartroute.


License

MIT © anshu-agi