Copilot Token Awareness

Raj Uppadhyay

Live token count and cost estimate for GitHub Copilot context

Stop being surprised by your AI credit bill.
This extension gives you a live, transparent estimate of how many tokens GitHub Copilot will consume - and what that costs in AI credits - before you send a single message. Use it to understand where your context budget goes, choose the right model, and get more out of every credit.


Screenshot

The Breakdown panel shows Ask/Agent mode, every context source contributing tokens, a step-by-step cost calculation, and a link to official GitHub pricing.


Why This Exists

GitHub Copilot Chat and Agent mode are billed in AI credits (1 credit = $0.01 USD, per GitHub's billing documentation). Each request sends a large bundle of context to the model - your active file, snippets from open tabs, a system prompt, custom instructions, and workspace-index retrievals - all before you type a single word of your question.

Most developers have no visibility into this until they see a usage report. This extension surfaces those numbers in real time so you can:

  • Understand what is driving your token count
  • Choose the most cost-effective model for the task
  • Optimise by closing unused tabs or splitting large files
  • Budget AI credit usage across a team or project

Important: Code completions and next-edit suggestions are not billed in AI credits and are excluded from all estimates in this extension. Only Ask mode and Agent mode chat interactions are in scope.
Source: GitHub Copilot billing - Code completions


Features

| Feature | Details |
| --- | --- |
| Live status bar | 🔢 [Ask] ~23,129 tokens \| ~$0.0694 - updates as you type, edit files, or switch tabs |
| Ask / Agent mode | Toggle between modes in the status bar or Breakdown panel; each mode uses different estimation assumptions |
| Breakdown panel | Click the status bar to see a per-source token table, step-by-step cost calculation, and all assumptions used |
| Transparent assumptions | Every value used in the estimate is shown with an explanation - nothing is hidden |
| 23 built-in models | All current Copilot models with exact pricing from the official docs |
| Included-model detection | GPT-4.1 and GPT-5 mini are flagged as included models that don't consume credits within your plan allowance |
| Custom instructions detection | .github/copilot-instructions.md and *.instructions.md files are auto-detected and counted |
| Optimisation tips | Rule-based hints: too many tabs open, active file too large, approaching context window limit |
| User-patchable pricing | Override any model's multipliers or add new models without waiting for an extension update |

How the Estimate Is Built

Copilot does not expose what it sends to the model. This extension reconstructs a realistic estimate using Copilot's published architecture and empirical measurement. Every assumption is shown in the Estimation Assumptions section of the Breakdown panel.

Context sources (in order)

| Source | How it is estimated | Notes |
| --- | --- | --- |
| System prompt | Fixed budget per mode (Ask: ~5,000 tokens, Agent: ~14,000 tokens) | Ask mode contains behaviour rules only. Agent mode adds full tool definitions, skill descriptions, and security policies. Override with askSystemPromptTokenBudget or agentSystemPromptTokenBudget. |
| Custom instructions | Actual file content tokenised | Copilot auto-injects .github/copilot-instructions.md, .copilot-instructions.md, and any *.instructions.md file into every request. |
| Active file | Full file content tokenised | Copilot always includes the entire active file in Ask and Agent requests. |
| Selected text | Full selection content tokenised | Included when you have a selection active in the editor. |
| Open tabs | 20% of each tab's tokens (Ask) / 15% (Agent) | Copilot uses a Jaccard-similarity algorithm to extract best-match snippets (~60 lines each), not the full file. The snippet ratio approximates this. Override with tabSnippetRatio. |
| Workspace index retrieval | Fixed overhead: 1,500 tokens (Ask) / 3,000 tokens (Agent) | Copilot's semantic index may pull relevant files that are not currently open as tabs. Override with retrievalOverheadTokens. |
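
The sources above add up to a single Turn-1 total. The following is a minimal sketch of that sum, not the extension's actual code: the type and function names (ModeDefaults, ContextInput, estimateContextTokens) are hypothetical, the per-tab rounding is an assumption, and the numbers come from the table above.

```typescript
// Hypothetical shapes for illustration; the extension's internals may differ.
type ChatMode = "ask" | "agent";

interface ModeDefaults {
  systemPromptTokens: number;
  tabSnippetRatio: number;
  retrievalOverheadTokens: number;
}

// Per-mode defaults as listed in the context-sources table.
const MODE_DEFAULTS: Record<ChatMode, ModeDefaults> = {
  ask: { systemPromptTokens: 5000, tabSnippetRatio: 0.2, retrievalOverheadTokens: 1500 },
  agent: { systemPromptTokens: 14000, tabSnippetRatio: 0.15, retrievalOverheadTokens: 3000 },
};

interface ContextInput {
  customInstructionTokens: number; // tokenised instruction files
  activeFileTokens: number;        // full active file
  selectionTokens: number;         // 0 when nothing is selected
  openTabTokens: number[];         // full token count of each open tab
}

function estimateContextTokens(mode: ChatMode, input: ContextInput): number {
  const d = MODE_DEFAULTS[mode];
  // Only a fraction of each tab is counted (assumption: rounded per tab).
  const tabTokens = input.openTabTokens.reduce(
    (sum, t) => sum + Math.round(t * d.tabSnippetRatio),
    0,
  );
  return (
    d.systemPromptTokens +
    input.customInstructionTokens +
    input.activeFileTokens +
    input.selectionTokens +
    tabTokens +
    d.retrievalOverheadTokens
  );
}

const example: ContextInput = {
  customInstructionTokens: 300,
  activeFileTokens: 4000,
  selectionTokens: 0,
  openTabTokens: [2000, 1000],
};
console.log(estimateContextTokens("ask", example));   // 11400
console.log(estimateContextTokens("agent", example)); // 21750
```

With the same workspace state, Agent mode lands well above Ask mode almost entirely because of the larger system prompt and retrieval overhead.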

What is NOT included (by design)

| Item | Why excluded |
| --- | --- |
| Your prompt text | Typically 10–200 tokens; add it mentally for a tighter estimate |
| Conversation history | Only Turn 1 can be estimated before the chat starts; each subsequent turn adds ~500–2,000 tokens |
| Code completions | Not billed in AI credits - separate unlimited quota |
| Cached tokens | Repeat turns benefit from Copilot's prompt cache (10× cheaper); the Turn-1 disclaimer covers this |

Ask mode vs Agent mode

|  | Ask mode | Agent mode |
| --- | --- | --- |
| System prompt | ~5,000 tokens | ~14,000 tokens |
| Tab snippet ratio | 20% of file tokens | 15% of file tokens |
| Retrieval overhead | ~1,500 tokens | ~3,000 tokens |
| Estimate variance | ±10% | ±15% |
| Use case | Single-turn Q&A, explain, review | Multi-step tasks, file edits, terminal |

Switch modes using the Ask / Agent toggle buttons at the top of the Breakdown panel, or via the copilotTokenAwareness.chatMode setting.


Cost Calculation

Input cost = total_tokens × input_multiplier × token_unit_price_usd

Where:

  • total_tokens = sum of all context sources above
  • input_multiplier = model-specific value from the GitHub Copilot pricing table ($/M tokens ÷ 10)
  • token_unit_price_usd = 0.00001 (i.e. 1,000 token units = $0.01 = 1 AI credit)

The Breakdown panel also shows a worst-case output ceiling (expandable) that assumes Copilot fills the entire remaining context window - this almost never happens but shows the absolute maximum exposure.
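
As a worked example, here is a small sketch of the formula above. The function names are hypothetical. The input formula comes from this section; the worst-case output formula (remaining context × output multiplier × unit price) is an assumption that mirrors it, and 160K is taken to mean 160,000 tokens.

```typescript
// Default of copilotTokenAwareness.tokenUnitPriceUsd.
const TOKEN_UNIT_PRICE_USD = 0.00001;

// Multiplier = published $/M-token price divided by 10.
function multiplierFromPricePerMillion(usdPerMillion: number): number {
  return usdPerMillion / 10;
}

// Input cost = total_tokens × input_multiplier × token_unit_price_usd.
function inputCostUsd(
  totalTokens: number,
  inputMultiplier: number,
  unitPriceUsd: number = TOKEN_UNIT_PRICE_USD,
): number {
  return totalTokens * inputMultiplier * unitPriceUsd;
}

// Assumed shape of the worst-case ceiling: the reply fills all remaining context.
function worstCaseOutputCostUsd(
  totalTokens: number,
  contextWindow: number,
  outputMultiplier: number,
  unitPriceUsd: number = TOKEN_UNIT_PRICE_USD,
): number {
  const remaining = Math.max(0, contextWindow - totalTokens);
  return remaining * outputMultiplier * unitPriceUsd;
}

// 1 AI credit = $0.01.
function usdToCredits(usd: number): number {
  return usd / 0.01;
}

// Claude Sonnet 4.6: $3.00 input, $15.00 output per million tokens.
const inputMultiplier = multiplierFromPricePerMillion(3.0);   // 0.3
const outputMultiplier = multiplierFromPricePerMillion(15.0); // 1.5
const cost = inputCostUsd(23129, inputMultiplier);
console.log(cost.toFixed(4));               // "0.0694"
console.log(usdToCredits(cost).toFixed(2)); // "6.94"
console.log(worstCaseOutputCostUsd(23129, 160000, outputMultiplier).toFixed(2)); // "2.05"
```

The first result matches the ~23,129 tokens / ~$0.0694 status bar example in the Features table, which uses the default model.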


Supported Models

All pricing is sourced directly from the official GitHub Copilot models and pricing page.

Anthropic

| Model | Input $/M | Output $/M | Context |
| --- | --- | --- | --- |
| Claude Haiku 4.5 | $1.00 | $5.00 | 160K |
| Claude Sonnet 4 | $3.00 | $15.00 | 160K |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 160K |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 160K |
| Claude Opus 4.5 | $5.00 | $25.00 | 234K |
| Claude Opus 4.6 | $5.00 | $25.00 | 234K |
| Claude Opus 4.7 | $5.00 | $25.00 | 234K |
| Claude Opus 4.8 | $5.00 | $25.00 | 232K |

Anthropic models include a cache-write cost in addition to cached-input pricing. The extension uses the non-cached rate (accurate for Turn 1).

Google

| Model | Input $/M | Output $/M | Context |
| --- | --- | --- | --- |
| Gemini 2.5 Pro | $1.25 | $10.00 | 173K |
| Gemini 3 Flash (Preview) | $0.50 | $3.00 | 173K |
| Gemini 3.1 Pro (Preview) | $2.00 | $12.00 | 173K |
| Gemini 3.5 Flash | $1.50 | $9.00 | 192K |

OpenAI

| Model | Input $/M | Output $/M | Context | Notes |
| --- | --- | --- | --- | --- |
| GPT-4.1 | $2.00 | $8.00 | 128K | ⭐ Included model |
| GPT-5 mini | $0.25 | $2.00 | 192K | ⭐ Included model |
| GPT-5.2 | $1.75 | $14.00 | 192K |  |
| GPT-5.2-Codex | $1.75 | $14.00 | 400K |  |
| GPT-5.3-Codex | $1.75 | $14.00 | 400K |  |
| GPT-5.4 | $2.50 | $15.00 | 400K |  |
| GPT-5.4 mini | $0.75 | $4.50 | 400K |  |
| GPT-5.4 nano | $0.20 | $1.25 | 400K |  |
| GPT-5.5 | $5.00 | $30.00 | 400K |  |

⭐ Included models (GPT-4.1, GPT-5 mini) do not consume AI credits within your plan's monthly allowance. The extension flags these with a green banner and notes that the cost shown is the overage rate only.

Fine-tuned (GitHub) & Microsoft

| Model | Input $/M | Output $/M | Context | Notes |
| --- | --- | --- | --- | --- |
| Raptor mini (Preview) | $0.25 | $2.00 | 264K | Uses GPT-5 mini pricing |
| MAI-Code-1-Flash | $0.75 | $4.50 | 128K | Microsoft |

Settings Reference

| Setting | Default | Description |
| --- | --- | --- |
| copilotTokenAwareness.chatMode | ask | Chat mode to estimate for: ask or agent. Controls system-prompt budget, snippet ratio, and retrieval overhead. |
| copilotTokenAwareness.model | claude-sonnet-4-6 | Model used for cost calculation. Selectable from all 23 built-in models. |
| copilotTokenAwareness.askSystemPromptTokenBudget | 5000 | Override system-prompt token estimate for Ask mode only. 5000 = use mode default (~5,000 tokens). |
| copilotTokenAwareness.agentSystemPromptTokenBudget | 14000 | Override system-prompt token estimate for Agent mode only. 14000 = use mode default (~14,000 tokens). |
| copilotTokenAwareness.systemPromptTokenBudget | 0 | (Deprecated) Legacy single-value override for both modes. Ignored when a mode-specific setting is set. |
| copilotTokenAwareness.tabSnippetRatio | 0 | Fraction of each open tab's tokens included as snippets (0.0–1.0). 0 = use mode default (Ask: 0.20, Agent: 0.15). |
| copilotTokenAwareness.retrievalOverheadTokens | 0 | Fixed token budget for workspace-index retrieval. 0 = use mode default (Ask: 1,500, Agent: 3,000). |
| copilotTokenAwareness.includeOpenTabs | true | Include open editor tabs in the token estimate. |
| copilotTokenAwareness.maxTabsToInclude | 5 | Maximum number of open tabs to include. |
| copilotTokenAwareness.tokenUnitPriceUsd | 0.00001 | Price per token unit in USD. Update if GitHub changes pricing. |
| copilotTokenAwareness.modelOverrides | {} | Override multipliers or context window of any built-in model. |
| copilotTokenAwareness.customModels | [] | Add models not yet built into the extension. |
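
Several settings use 0 as a "use mode default" sentinel. The sketch below shows one plausible way to resolve them; it is an illustration only. The TabSettings shape and helper names are hypothetical, and which tabs are kept when maxTabsToInclude truncates the list is an assumption (here: the first N in the order given). In a real extension these values would typically be read through the VS Code configuration API rather than passed in as a plain object.

```typescript
type ChatMode = "ask" | "agent";

// Mode defaults from the settings table.
const TAB_RATIO_DEFAULT: Record<ChatMode, number> = { ask: 0.2, agent: 0.15 };
const RETRIEVAL_DEFAULT: Record<ChatMode, number> = { ask: 1500, agent: 3000 };

// Hypothetical snapshot of the relevant settings.
interface TabSettings {
  tabSnippetRatio: number;         // 0 = use mode default
  retrievalOverheadTokens: number; // 0 = use mode default
  includeOpenTabs: boolean;
  maxTabsToInclude: number;
}

function resolveTabRatio(mode: ChatMode, s: TabSettings): number {
  return s.tabSnippetRatio > 0 ? s.tabSnippetRatio : TAB_RATIO_DEFAULT[mode];
}

function resolveRetrievalOverhead(mode: ChatMode, s: TabSettings): number {
  return s.retrievalOverheadTokens > 0 ? s.retrievalOverheadTokens : RETRIEVAL_DEFAULT[mode];
}

// Assumption: keep the first maxTabsToInclude tabs in the order supplied.
function tabsToCount(tabTokens: number[], s: TabSettings): number[] {
  return s.includeOpenTabs ? tabTokens.slice(0, s.maxTabsToInclude) : [];
}

const defaults: TabSettings = {
  tabSnippetRatio: 0,
  retrievalOverheadTokens: 0,
  includeOpenTabs: true,
  maxTabsToInclude: 5,
};
console.log(resolveTabRatio("agent", defaults));        // 0.15
console.log(resolveRetrievalOverhead("ask", defaults)); // 1500
console.log(tabsToCount([10, 20, 30, 40, 50, 60, 70], defaults).length); // 5
```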

Keeping pricing current

GitHub may update multipliers or add new models at any time. You don't need to wait for an extension update.

Override a built-in model's multiplier:

// settings.json
"copilotTokenAwareness.modelOverrides": {
  "claude-sonnet-4-6": { "inputMultiplier": 0.35 }
}

Add a brand-new model:

"copilotTokenAwareness.customModels": [
  {
    "id": "my-new-model",
    "displayName": "My New Model",
    "inputMultiplier": 0.2,
    "outputMultiplier": 1.0,
    "contextWindow": 200000
  }
]

Update the base token unit price:

"copilotTokenAwareness.tokenUnitPriceUsd": 0.000012
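
To illustrate how these settings could combine with the built-in table, here is a rough sketch of a merge step. It is not the extension's implementation: ModelPricing, ModelOverride, and resolveModels are hypothetical names, the field names follow the customModels example above, and letting a custom model replace a built-in one with the same id is an assumption.

```typescript
interface ModelPricing {
  id: string;
  displayName: string;
  inputMultiplier: number;
  outputMultiplier: number;
  contextWindow: number;
}

// An override may patch any field except the id.
type ModelOverride = Partial<Omit<ModelPricing, "id">>;

function resolveModels(
  builtIn: ModelPricing[],
  overrides: Record<string, ModelOverride>,
  customModels: ModelPricing[],
): Map<string, ModelPricing> {
  const resolved = new Map<string, ModelPricing>();
  for (const model of builtIn) {
    // Fields present in the override win; everything else keeps the built-in value.
    resolved.set(model.id, { ...model, ...(overrides[model.id] ?? {}) });
  }
  for (const model of customModels) {
    // Assumption: on an id clash the custom definition replaces the built-in one.
    resolved.set(model.id, model);
  }
  return resolved;
}

const models = resolveModels(
  [{ id: "claude-sonnet-4-6", displayName: "Claude Sonnet 4.6", inputMultiplier: 0.3, outputMultiplier: 1.5, contextWindow: 160000 }],
  { "claude-sonnet-4-6": { inputMultiplier: 0.35 } },
  [{ id: "my-new-model", displayName: "My New Model", inputMultiplier: 0.2, outputMultiplier: 1.0, contextWindow: 200000 }],
);
console.log(models.get("claude-sonnet-4-6")?.inputMultiplier); // 0.35
console.log(models.size);                                      // 2
```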

Commands

| Command | Description |
| --- | --- |
| Copilot Token Awareness: Show Breakdown | Open the Breakdown panel |
| Copilot Token Awareness: Reset Session Totals | Clear the session-level cumulative counter |

Token Counting

The extension counts tokens with tiktoken (cl100k_base encoding), running entirely in WebAssembly inside VS Code - no data is sent to any external service. If the WASM module fails to load, it falls back to a character-based heuristic (~4 chars/token) and the status bar shows a warning.
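
The fallback path can be sketched roughly as follows. This is an assumption-laden illustration: the Encoder type is a hypothetical stand-in for whatever encode function the tokenizer exposes, rounding up with Math.ceil is a guess, and the usedFallback flag simply models the condition under which a warning would be shown.

```typescript
// Hypothetical stand-in for a tokenizer's encode function.
type Encoder = (text: string) => number[];

interface TokenCount {
  tokens: number;
  usedFallback: boolean; // true when the ~4 chars/token heuristic was used
}

// Character-based heuristic: roughly 4 characters per token (rounding is an assumption).
function heuristicTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

function countTokens(text: string, encoder?: Encoder): TokenCount {
  if (encoder) {
    try {
      return { tokens: encoder(text).length, usedFallback: false };
    } catch (_err) {
      // Tokenizer failed at runtime; fall through to the heuristic.
    }
  }
  return { tokens: heuristicTokenCount(text), usedFallback: true };
}

console.log(countTokens("abcdefgh")); // { tokens: 2, usedFallback: true }
```

Passing the encoder in as a parameter keeps the counting logic testable without loading the WASM module.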

Tokenizer note: cl100k_base is OpenAI's tokenizer. Anthropic (Claude) and Google (Gemini) use their own tokenizers. For typical English and source code the counts are within ±5%, which is within the stated estimate variance. The variance percentages shown in the UI (±10% Ask, ±15% Agent) account for this approximation.


Accuracy & Disclaimer

This extension is an awareness tool, not a billing meter. Estimates are based on:

  • Copilot's published architecture and empirically observed behaviour
  • Turn 1 only (conversation history is not pre-knowable)
  • Non-cached token rates (cached turns are cheaper; the disclaimer in the panel notes this)
  • The assumption that Copilot uses the entire active file and snippets from tabs (actual selection may vary by feature and version)

Expected accuracy: ±10% for Ask mode, ±15% for Agent mode.

For official billing information, plan allowances, and current per-token rates, always refer to:
📄 GitHub Copilot - Models and Pricing


License

MIT © Raj Uppadhyay
