AI Contribution Tracker
Heuristic, privacy-friendly VS Code extension that estimates how much of your code is likely AI‑assisted (Copilot, inline completions, pastes from AI chat, etc.) versus manually authored. It classifies every insertion (keystroke, paste, completion) in real time and exposes granular buckets for deeper analysis.
✨ Core Capabilities
- Real‑time character classification (per keystroke / paste / completion)
- Granular buckets for paste vs inline vs typed vs Copilot Ask
- Commit history aggregation (session snapshots on reset/commit)
- Explorer view: live session stats + recent commits
- Recording & replay demo (import/export deterministic editing sessions)
- Copilot Ask / Agent mode detection with duplicate suppression
- JSON export for offline analysis
- Debug view (recent classified changes + duplicate detection insight)
- File type separation: Tracks code vs documentation files separately
📥 Installation (User)
- Open VS Code → Extensions view (Ctrl+Shift+X).
- Search: AI Contribution Tracker.
- Click Install, then reload if prompted.
- (Optional) Sign in on first Git commit (AAD/Microsoft) to enable commit summary upload.
- View stats in Explorer under AI Assistance Stats.
Direct browser link (opens in VS Code): Marketplace Page
Privacy note: No sign-in = no uploads. Without signing in, all tracking stays local.
See the detailed Installation Guide for more context.
📁 File Type Support
The extension now tracks AI assistance separately for different file types:
Code Files (tracked): js, ts, jsx, tsx, py, java, c, cpp, cs, php, rb, go, rs, swift, kt, scala, dart, vue, svelte, html, css, scss, sass, less, xml, json, yaml, yml, toml, sql, sh, bash, ps1, bat, cmd
Documentation Files (tracked): md, markdown, txt, rst, adoc, asciidoc, org, tex, latex, rtf, doc, docx, odt, pdf, readme
Special Detection: Files named README, CHANGELOG, LICENSE, CONTRIBUTING (regardless of extension) are treated as documentation
Display: The Explorer view shows separate AI percentages for code vs documentation files, allowing you to see how AI assistance varies between different types of content.
Storage: All file type statistics are preserved in commit history and uploaded to external storage systems (when configured).
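The file type rules above can be sketched as a small lookup. This is an illustrative sketch only: the function name getFileType appears later in this README, but its real signature and return values are not documented here, so the "code" / "documentation" / "unknown" labels, the case-insensitive matching, and the path handling are all assumptions.

```typescript
// Hypothetical sketch of extension-based file type detection.
// The extension lists are copied from this README; everything else is assumed.
type FileType = "code" | "documentation" | "unknown";

const CODE_EXTENSIONS = new Set([
  "js", "ts", "jsx", "tsx", "py", "java", "c", "cpp", "cs", "php", "rb", "go",
  "rs", "swift", "kt", "scala", "dart", "vue", "svelte", "html", "css", "scss",
  "sass", "less", "xml", "json", "yaml", "yml", "toml", "sql", "sh", "bash",
  "ps1", "bat", "cmd",
]);

const DOC_EXTENSIONS = new Set([
  "md", "markdown", "txt", "rst", "adoc", "asciidoc", "org", "tex", "latex",
  "rtf", "doc", "docx", "odt", "pdf", "readme",
]);

// Special names are treated as documentation regardless of extension.
const DOC_BASENAMES = new Set(["readme", "changelog", "license", "contributing"]);

function getFileType(filePath: string): FileType {
  // Take the last path segment, accepting both "/" and "\" separators.
  const fileName = (filePath.split(/[\\/]/).pop() ?? "").toLowerCase();
  const dot = fileName.lastIndexOf(".");
  // A leading dot (e.g. ".gitignore") is not treated as an extension separator.
  const baseName = dot > 0 ? fileName.slice(0, dot) : fileName;
  const extension = dot > 0 ? fileName.slice(dot + 1) : "";

  if (DOC_BASENAMES.has(baseName)) return "documentation";
  if (DOC_EXTENSIONS.has(extension)) return "documentation";
  if (CODE_EXTENSIONS.has(extension)) return "code";
  return "unknown"; // unknown file types are skipped by the tracker
}
```

Checking the special names first is what lets a file such as CHANGELOG.json count as documentation even though json is in the code list.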
📊 Tracked Session Fields
Each field is a raw character count (no percentages in the tree details):
| Field | Meaning | Typical Examples / When It Grows | Exclusions / Edge Notes |
| --- | --- | --- | --- |
| totalCharacters | All newly inserted characters counted toward session total | You accept an inline AI suggestion (adds 120 chars) → +120; you paste code (80 chars) → +80 | Does NOT grow on deletions, ignored undo/redo, skipped file-replace, duplicate large insertion |
| likelyAiCharacters | Inline code the heuristics treat as clearly AI | Long multi-line suggestion inserted instantly; agent applies code block | Paste paths & small manual typing go elsewhere |
| pastelikelyAiCharacters | Pasted code judged strongly AI-like | Paste a well‑formed function block from an AI chat response | Only for paste events; inline accepts never land here |
| pastepossiblyAiCharacters | Pasted code that might be AI but less certain | Paste medium (40–80 char) snippet after short pause | Mid-confidence paste bucket; still included in AI total |
| pasteFromCopilotCharacters | Pasted block that exactly matches Copilot Ask/chat output | Copy from Copilot chat Insert / Ask panel and paste into file | Shortcut recognition; bypasses normal paste scoring |
| pastelikelyHumanCharacters | Pasted text that looks manually copied | Copy from StackOverflow / another local file and paste | Still increases totalCharacters but NOT AI sum |
| typeCharacters | Individual or tiny bursts of manual typing | Type letters one-by-one, auto-pair insertion of () | Prevents normal typing from diluting AI signal |
| idecompleteCharacters | IDE (non-AI) basic completion acceptances | Press Tab to accept a short property / variable suggestion | Treated as human; not counted as AI |
| deletedCharacters | Characters removed (churn metric) | Highlight 30 chars and press Delete → +30 here | Not subtracted from totals; purely informational |
Derived (not separately stored in the tree):
- AI Character Total = likelyAiCharacters + pastelikelyAiCharacters + pastepossiblyAiCharacters + pasteFromCopilotCharacters
- AI % = AI Character Total / totalCharacters × 100 (shown in root summary)
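As a worked example of the two derived values, the sketch below sums the four AI buckets and divides by totalCharacters. The field names come from the table above; the SessionCounts type, the function names, the sample numbers, and the choice to report 0 for an empty session are assumptions made for illustration.

```typescript
// Minimal sketch of the derived AI metrics (not the extension's actual code).
interface SessionCounts {
  totalCharacters: number;
  likelyAiCharacters: number;
  pastelikelyAiCharacters: number;
  pastepossiblyAiCharacters: number;
  pasteFromCopilotCharacters: number;
}

function aiCharacterTotal(s: SessionCounts): number {
  return (
    s.likelyAiCharacters +
    s.pastelikelyAiCharacters +
    s.pastepossiblyAiCharacters +
    s.pasteFromCopilotCharacters
  );
}

function aiPercentage(s: SessionCounts): number {
  // Assumption: an empty session reports 0 rather than dividing by zero.
  if (s.totalCharacters === 0) return 0;
  return (aiCharacterTotal(s) / s.totalCharacters) * 100;
}

// Made-up numbers: 300 + 100 + 50 + 50 = 500 AI characters out of 1000.
const sampleCounts: SessionCounts = {
  totalCharacters: 1000,
  likelyAiCharacters: 300,
  pastelikelyAiCharacters: 100,
  pastepossiblyAiCharacters: 50,
  pasteFromCopilotCharacters: 50,
};

console.log(aiPercentage(sampleCounts)); // 50
```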
Field Interaction Details
- A single change increments exactly one of the mutually exclusive buckets likelyAiCharacters, pastelikelyAiCharacters, pastepossiblyAiCharacters, pasteFromCopilotCharacters, pastelikelyHumanCharacters, typeCharacters, idecompleteCharacters (plus totalCharacters).
- The paste classification path decides among three outcomes (likely AI / possibly AI / likely human), producing exactly one increment.
- The Copilot Ask paste shortcut (pasteFromCopilotCharacters) bypasses the normal paste confidence scoring.
- Deletions never decrement any bucket; they only increase deletedCharacters.
- Auto-pair insertions (like (), {}) count toward typeCharacters to avoid false AI inflation.
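The interaction rules above amount to a simple invariant: every insertion adds its length to one bucket and to totalCharacters, and deletions touch only deletedCharacters. A minimal sketch, assuming hypothetical helper names (emptySession, recordInsertion, recordDeletion) that are not taken from the extension's source:

```typescript
// Illustrative bookkeeping only; helper names and types are assumptions.
type Bucket =
  | "likelyAiCharacters"
  | "pastelikelyAiCharacters"
  | "pastepossiblyAiCharacters"
  | "pasteFromCopilotCharacters"
  | "pastelikelyHumanCharacters"
  | "typeCharacters"
  | "idecompleteCharacters";

type SessionStats = Record<Bucket, number> & {
  totalCharacters: number;
  deletedCharacters: number;
};

function emptySession(): SessionStats {
  return {
    totalCharacters: 0,
    likelyAiCharacters: 0,
    pastelikelyAiCharacters: 0,
    pastepossiblyAiCharacters: 0,
    pasteFromCopilotCharacters: 0,
    pastelikelyHumanCharacters: 0,
    typeCharacters: 0,
    idecompleteCharacters: 0,
    deletedCharacters: 0,
  };
}

// One insertion → exactly one bucket, plus the session total.
function recordInsertion(stats: SessionStats, bucket: Bucket, length: number): void {
  stats[bucket] += length;
  stats.totalCharacters += length;
}

// Deletions never decrement a bucket; they only grow the churn counter.
function recordDeletion(stats: SessionStats, length: number): void {
  stats.deletedCharacters += length;
}

const stats = emptySession();
recordInsertion(stats, "likelyAiCharacters", 120); // accepted inline suggestion
recordInsertion(stats, "typeCharacters", 2);       // auto-paired "()"
recordDeletion(stats, 30);                          // selection deleted
```

With this bookkeeping the seven insertion buckets always sum to totalCharacters, which is what keeps the AI % denominator stable.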
🚫 Ignored / Filtered Events
To reduce noise and false positives, several categories of changes are skipped entirely (no counters touched); deletions are the one exception and increment only deletedCharacters:
| Category | Detection Logic | Rationale |
| --- | --- | --- |
| Undo / Redo | event.reason === Undo/Redo | Reversing previous user intent; not new authored content |
| Non-code/doc files | Extension filter (getFileType) | Tracks both code and documentation files; skips unknown file types |
| Find & Replace batches | Multiple changes sharing identical replacement text (isFindReplaceOperation) | Bulk mechanical edits shouldn't inflate AI usage |
| Duplicate large insertions | New text (len > 20) already present in original doc (isDuplicateChange) | Avoid double counting agent-mode 'apply' after generation |
| Likely file replace operations | Large change covering ≥80% of file or massive (>10KB) from start (isLikelyFileReplace) | Formatting / regeneration noise |
| Empty changes | No contentChanges | No-op events |
| Copilot Ask accumulation buffer edits | Documents with scheme vscode-chat-code-block | Chat buffer capture only; not user file content |
| Deletions (re AI totals) | change.text.length === 0 | Count only in deletedCharacters to keep AI % denominator stable |
These safeguards ensure percentages reflect authored/accepted content rather than maintenance actions.
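A few of these filters can be sketched as plain predicates. The names isFindReplaceOperation, isDuplicateChange, and isLikelyFileReplace and the 20-character, 80%, and 10KB thresholds come from the table; the TextChange shape, the reading of 10KB as 10 × 1024 characters, the shouldIgnore wrapper, and the rule that one matching change drops the whole event are assumptions for illustration.

```typescript
// Hypothetical shapes and wiring; only the names and thresholds are from the README.
interface TextChange {
  text: string;        // inserted or replacement text
  rangeOffset: number; // start offset of the replaced range
  rangeLength: number; // number of characters replaced
}

function isFindReplaceOperation(changes: TextChange[]): boolean {
  // Multiple changes sharing identical replacement text.
  return changes.length > 1 && changes.every((c) => c.text === changes[0]?.text);
}

function isDuplicateChange(change: TextChange, originalText: string): boolean {
  // New text longer than 20 characters that already exists in the document.
  return change.text.length > 20 && originalText.includes(change.text);
}

function isLikelyFileReplace(change: TextChange, originalLength: number): boolean {
  const coversMostOfFile =
    originalLength > 0 && change.rangeLength >= 0.8 * originalLength;
  // Assumption: "10KB" is read as 10 * 1024 characters.
  const massiveFromStart = change.rangeOffset === 0 && change.text.length > 10 * 1024;
  return coversMostOfFile || massiveFromStart;
}

function shouldIgnore(
  changes: TextChange[],
  originalText: string,
  reason?: "undo" | "redo",
): boolean {
  if (reason === "undo" || reason === "redo") return true; // reverses prior intent
  if (changes.length === 0) return true;                    // no-op event
  if (isFindReplaceOperation(changes)) return true;         // bulk mechanical edit
  // Assumption: a single matching change causes the whole event to be skipped.
  return changes.some(
    (c) =>
      isDuplicateChange(c, originalText) ||
      isLikelyFileReplace(c, originalText.length),
  );
}
```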
🧠 Classification Model (Heuristic)
Factors applied per change:
- Size & speed of insertion
- Multi‑line & structural completeness (functions, classes, imports)
- Code pattern recognition (syntax signatures)
- Timing relative to last keystroke (pauses → suggestions)
- Paste pattern indicators (indent consistency, external markers, duplication)
- Explicit input source correlation (type vs completion vs paste vs copilot)
Paste events are classified into three buckets (likely AI / possibly AI / likely human), each with its own counter so weighting can be applied later.
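To make the factor-based approach concrete, here is a toy scorer for an inline (non-paste) change. Only the fixed threshold of 60 and the "likely-ai", "multi-line", and "large-insertion" labels appear elsewhere in this README; the weights, the 100-character and 500 ms cutoffs, the "human" label, the other factor labels, and the use of >= at the threshold are invented for illustration and will not match the extension's real tuning.

```typescript
// Toy heuristic scorer; all weights and cutoffs are assumptions.
interface ChangeSignal {
  text: string;                 // inserted text
  msSinceLastKeystroke: number; // pause before the insertion
}

interface ScoredChange {
  classification: "likely-ai" | "human";
  confidence: number; // 0..100
  factors: string[];
}

const CONFIDENCE_THRESHOLD = 60; // fixed value stated in this README

function classifyInlineChange(signal: ChangeSignal): ScoredChange {
  let confidence = 0;
  const factors: string[] = [];

  if (signal.text.length >= 100) {
    confidence += 40;
    factors.push("large-insertion");
  }
  if (signal.text.includes("\n")) {
    confidence += 25;
    factors.push("multi-line");
  }
  if (/\b(function|class|import)\b/.test(signal.text)) {
    confidence += 20;
    factors.push("structural-pattern");
  }
  if (signal.msSinceLastKeystroke >= 500) {
    confidence += 15;
    factors.push("pause-before-insert");
  }

  confidence = Math.min(confidence, 100);
  return {
    classification: confidence >= CONFIDENCE_THRESHOLD ? "likely-ai" : "human",
    confidence,
    factors,
  };
}

// A long multi-line function inserted after a pause vs. a single typed letter.
const suggestion = "function add(a, b) {\n  return a + b;\n}\n".repeat(4);
const accepted = classifyInlineChange({ text: suggestion, msSinceLastKeystroke: 900 });
const typed = classifyInlineChange({ text: "a", msSinceLastKeystroke: 50 });
```

Returning the list of contributing factors alongside the score mirrors the "factors" array in the export format and makes individual decisions easier to audit in the debug view.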
🔐 Privacy & Philosophy
- No network calls for core tracking; the only upload is the commit summary sent after sign-in
- Only aggregates anonymous character counts & heuristic flags
- Export is user-initiated JSON
- No personal identifiers, no code content transmitted
🗂 Explorer View
The AI Assistance Stats panel shows:
- Root: AI % summary + total & likely AI counts
- Expanded session: all raw character buckets above + session duration + change count
- Commit History: last 10 session snapshots (percentage + counts)
🛠 Commands
| Command | Purpose |
| --- | --- |
| AI Tracker: Reset Current Session Stats | Starts a new session & pushes prior snapshot to history |
| AI Tracker: Export AI Tracking Data | Opens JSON export (session + commit history + summary) |
| AI Tracker: Show Debug Information | Shows recent changes & duplicate metrics |
| AI Tracker: Start Recording Text Changes (AI Tracker) | Begin capturing a deterministic replayable session |
| AI Tracker: Stop & Save Recording (AI Tracker) | Stop and optionally save recording JSON |
| AI Tracker: Import & Replay Recording (AI Tracker) | Load prior recording & fast replay with live classification |
| AI Tracker: Show Stat Demo (AI Tracker) | Run bundled multi‑phase demo (agent mode, inline, manual typing) |
Recording JSON schema: ai-tracker-recording@1 (deterministic event list with timing deltas).
⚙️ Configuration
No user-configurable settings: heuristics and thresholds are fixed (confidence threshold = 60). After sign-in (AAD prompt on the first commit), the extension automatically uploads commit summary metrics to the configured Kusto endpoint on each commit; without sign-in, nothing is uploaded.
🧪 Recording & Replay
Use the recording commands to capture an editing session with its input classification preserved. Replaying a recording regenerates the edits and exercises the heuristics deterministically, which makes it useful for tuning and regression testing. The built‑in demo showcases:
- Agent-mode style bulk insertions
- Inline suggestion acceptance
- Pure human typing with occasional IDE completions
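The idea of a deterministic event list with timing deltas can be sketched as follows. Only the schema identifier ai-tracker-recording@1 comes from this README; the field names (deltaMs, offset, deleteLength, text), the SessionRecording type, and the replayRecording function are hypothetical and are not the extension's actual format.

```typescript
// Hypothetical recording shape; the real ai-tracker-recording@1 fields may differ.
interface RecordedEvent {
  deltaMs: number;      // time since the previous event
  offset: number;       // where the edit starts in the document
  deleteLength: number; // characters removed at offset
  text: string;         // characters inserted at offset
}

interface SessionRecording {
  schema: "ai-tracker-recording@1";
  events: RecordedEvent[];
}

// Replays edits against a plain string. Because each event carries its own
// offset and delta, the same recording always produces the same result.
function replayRecording(
  recording: SessionRecording,
  initialText = "",
): { text: string; elapsedMs: number } {
  let text = initialText;
  let elapsedMs = 0;
  for (const event of recording.events) {
    elapsedMs += event.deltaMs;
    text =
      text.slice(0, event.offset) +
      event.text +
      text.slice(event.offset + event.deleteLength);
  }
  return { text, elapsedMs };
}

const demoRecording: SessionRecording = {
  schema: "ai-tracker-recording@1",
  events: [
    { deltaMs: 0, offset: 0, deleteLength: 0, text: "hello" },
    { deltaMs: 120, offset: 5, deleteLength: 0, text: " world" },
    { deltaMs: 300, offset: 0, deleteLength: 1, text: "H" },
  ],
};

// Export/import round trip through JSON, then replay.
const imported = JSON.parse(JSON.stringify(demoRecording)) as SessionRecording;
const replayed = replayRecording(imported);
console.log(replayed.text, replayed.elapsedMs); // Hello world 420
```

In a regression test, each replayed event would also be passed to the classifier so that bucket totals can be compared between heuristic versions.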
🧩 Export Data Structure
Example (fields trimmed & illustrative only):
{
"currentSession": {
"totalCharacters": 1234,
"likelyAiCharacters": 480,
"pastelikelyAiCharacters": 90,
"pastepossiblyAiCharacters": 60,
"pasteFromCopilotCharacters": 150,
"pastelikelyHumanCharacters": 320,
"typeCharacters": 100,
"idecompleteCharacters": 34,
"deletedCharacters": 55,
"sessionStartTime": 1733945233123,
"lastActivity": 1733945299555,
"changes": [ { "timestamp": 1733945..., "classification": "likely-ai", "confidence": 82, "characterCount": 120, "factors": ["multi-line","large-insertion"] } ],
"aiPercentage": 61.2
},
"commitHistory": [
{
"timestamp": 1733941000123,
"totalCharacters": 800,
"likelyAiCharacters": 300,
"pastepossiblyAiCharacters": 40,
"pastelikelyAiCharacters": 70,
"pasteFromCopilotCharacters": 110,
"pastelikelyHumanCharacters": 160,
"typeCharacters": 60,
"idecompleteCharacters": 20,
"deletedCharacters": 25
}
],
"summary": {
"totalCommits": 5,
"averageAIPercentage": 58.4,
"totalCharactersTracked": 6230,
"totallikelyAIAssistedCharacters": 3560
},
"exportedAt": "2025-08-19T12:00:00.000Z"
}
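Because the export keeps raw buckets, alternative weightings can be recomputed offline. The sketch below discounts the possibly‑AI paste bucket; the 0.5 weight, the ExportedSession type, and the weightedAiPercentage helper are assumptions for illustration, and the numbers are the currentSession values from the example above.

```typescript
// Offline re-weighting sketch; the weight and helper names are hypothetical.
interface ExportedSession {
  totalCharacters: number;
  likelyAiCharacters: number;
  pastelikelyAiCharacters: number;
  pastepossiblyAiCharacters: number;
  pasteFromCopilotCharacters: number;
}

function weightedAiPercentage(s: ExportedSession, possiblyAiWeight: number): number {
  if (s.totalCharacters === 0) return 0;
  const weightedAi =
    s.likelyAiCharacters +
    s.pastelikelyAiCharacters +
    s.pasteFromCopilotCharacters +
    possiblyAiWeight * s.pastepossiblyAiCharacters;
  return (weightedAi / s.totalCharacters) * 100;
}

// Trimmed copy of the export example, parsed the way a saved file would be.
const exported = JSON.parse(`{
  "currentSession": {
    "totalCharacters": 1234,
    "likelyAiCharacters": 480,
    "pastelikelyAiCharacters": 90,
    "pastepossiblyAiCharacters": 60,
    "pasteFromCopilotCharacters": 150
  }
}`) as { currentSession: ExportedSession };

// Count possibly-AI pastes at half weight: (480 + 90 + 150 + 30) / 1234.
const halfWeighted = weightedAiPercentage(exported.currentSession, 0.5);
console.log(halfWeighted.toFixed(1)); // 60.8
```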
🧬 Heuristic Highlight
Simplified scoring ideas applied (paste path has separate logic):
- Size (thresholded tiers)
- Line count / multi-line penalty or boost
- Structural completeness (function / class / import patterns)
- Code pattern presence (keywords, braces balance)
- Timing gap since last keystroke
- Source channel (type / completion / paste / copilot)
- Paste anomaly signals (mixed indentation, external markers, duplicated blocks)
Edge handling:
- Deletes are counted separately (deletedCharacters) and are not subtracted from totals
- Duplicate insert suppression (agent mode double apply)
- Find/replace batches and likely file-replace operations are ignored
🚀 Getting Started (Development)
npm install
npm run compile
# Press F5 in VS Code to launch Extension Development Host
Run tests:
npm test
Package VSIX:
npm install -g @vscode/vsce
vsce package
❗ Known Limitations
| Area | Note |
| --- | --- |
| Heuristics | Not a ground‑truth measurement; directional only |
| Small AI edits | Very short inline accepts may classify as human |
| Large human paste | May occasionally score as AI (mitigated by paste buckets) |
| Cross-session persistence | In‑workspace state stored; no external sync yet |
| Language coverage | Generic patterns; no deep language semantics per LSP yet |
🛣 Roadmap (Indicative)
- Automatic Git commit hooks / PR correlation
- Remote aggregation adapter (when storage mode != none)
- Weighting model for possibly‑AI vs likely‑AI
- Improved snippet vs AI disambiguation
- Rich visualization (charts / timelines)
- Optional anonymized telemetry for heuristic tuning
🤝 Contributing
- Fork
- Branch (feat/... or fix/...)
- Implement + add/adjust tests
- Run lint & tests
- PR with concise rationale
📄 License
MIT (see LICENSE).
🙋 FAQ
Q: Why show raw counts instead of percentages everywhere?
Percentages can obscure distribution. Raw buckets let you recompute alternative weightings later.
Q: Can I change the AI confidence threshold?
Not currently; it is fixed (60) to keep comparisons consistent.
Q: Does it send my code anywhere?
Only aggregated commit summary metrics (no raw code) are sent to Kusto after sign-in; JSON export is manual and local.
Q: How are AI characters summed?
likelyAi + pastelikelyAi + pastepossiblyAi + pasteFromCopilot.
Happy tracking.