Claude Code at the tip of your cursor
CodeSpark is a first-class Claude Code experience inside VS Code. It runs the Claude Code CLI under the hood, giving you the same models and tools.

When you hand everything to an agent, you don't just lose control; you lose context. The agent builds its understanding of the codebase, but you don't build yours. Every file you navigate to, every change you reason about, every decision you make strengthens your own mental model. That isn't overhead: it's how you stay effective. CodeSpark keeps you in those low-level, practical interactions where learning happens, while removing the mechanical friction that slows you down.
Prerequisites
Getting started
How it works

Inline agent

| Mac | Windows / Linux | What it does |
|---|---|---|
| Cmd+I | Ctrl+I | Open the inline agent: describe a change and it edits the file at your cursor |
| Cmd+Shift+I | Ctrl+Shift+I | Open the research agent: attaches the current file and cursor position as context when the panel is already open |
These shortcuts may conflict with other extensions (e.g. GitHub Copilot uses the same bindings). To rebind them, open the command palette and search for "Preferences: Open Keyboard Shortcuts (JSON)", then add your preferred bindings:
Mac — Cmd+Shift+P > "Preferences: Open Keyboard Shortcuts (JSON)"
[
{ "key": "cmd+i", "command": "codeSpark.invoke", "when": "editorTextFocus" },
{ "key": "cmd+shift+i", "command": "codeSpark.openResearch" }
]
Windows / Linux — Ctrl+Shift+P > "Preferences: Open Keyboard Shortcuts (JSON)"
[
{ "key": "ctrl+i", "command": "codeSpark.invoke", "when": "editorTextFocus" },
{ "key": "ctrl+shift+i", "command": "codeSpark.openResearch" }
]
Inline agent performance
The inline agent is optimized for low-latency edits (~1.5–2s typical). Here's how:
Long-lived MCP server. The MCP server that bridges the Claude CLI and VS Code uses Streamable HTTP transport, started once at extension activation. Each CLI invocation connects to the already-running server instead of spawning a new process, eliminating ~300ms of MCP boot overhead per edit.
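A minimal sketch of the long-lived-server pattern described above, assuming a hypothetical `getBridge` helper (the names `McpBridge`, `getBridge`, and the port number are illustrative, not CodeSpark's actual API):

```typescript
// The bridge is created once, at extension activation; every later CLI
// invocation reuses the same instance instead of booting a new process.
interface McpBridge {
  port: number;
  startedAt: number;
}

let bridge: McpBridge | null = null;

function getBridge(): McpBridge {
  if (bridge === null) {
    // Stand-in for starting the Streamable HTTP MCP server and listening.
    bridge = { port: 39217, startedAt: Date.now() };
  }
  // Subsequent calls skip the startup path entirely.
  return bridge;
}
```

The singleton guard is what eliminates the per-edit boot cost: only the first call pays for startup.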
Session pre-population. When you press Cmd+I, the CLI process is spawned immediately and a session file is pre-built with fake Read tool results containing the current file content. This puts the file in context without requiring an actual Read tool call, and an assistant prefill message primes the model to go straight to edit_file without explanatory text.
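The pre-built session can be pictured as a short synthetic transcript. This is a hypothetical sketch (the `Message` shape and `prepopulateSession` helper are illustrative; the real session file format belongs to the Claude CLI):

```typescript
type Message =
  | { role: "user"; content: string }
  | { role: "tool"; name: string; content: string }
  | { role: "assistant"; content: string };

function prepopulateSession(
  filePath: string,
  fileText: string,
  instruction: string
): Message[] {
  return [
    { role: "user", content: `Edit ${filePath}.` },
    // Fake Read result: the file content enters context without the model
    // ever issuing a real Read tool call.
    { role: "tool", name: "Read", content: fileText },
    { role: "user", content: instruction },
    // Assistant prefill: primes the model to go straight to the edit,
    // skipping explanatory preamble.
    { role: "assistant", content: "Applying the edit now." },
  ];
}
```

The fake tool result and the prefill together remove one model round-trip and one burst of throwaway prose from every edit.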
Prompt cache warming. The Anthropic API caches prompt prefixes — system prompt, tool definitions, and conversation history — so repeated edits process only the new instruction. When the estimated token count exceeds the caching threshold (4,096 for Haiku), a lightweight pre-warm message is sent to the CLI while the user types their prompt. By the time the real instruction is submitted, the cache is hot and the API skips reprocessing the prefix.
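The pre-warm decision reduces to a threshold check. A sketch, assuming a crude 4-characters-per-token estimate (the real extension may count tokens differently; `shouldPrewarm` is a hypothetical helper):

```typescript
// Anthropic prompt caching only applies above a minimum prefix length;
// for Haiku that threshold is 4,096 tokens.
const HAIKU_CACHE_MIN_TOKENS = 4096;

// Rough heuristic: ~4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function shouldPrewarm(
  systemPrompt: string,
  toolDefs: string,
  fileText: string
): boolean {
  // The cacheable prefix is everything that repeats between edits.
  const prefixTokens = estimateTokens(systemPrompt + toolDefs + fileText);
  return prefixTokens >= HAIKU_CACHE_MIN_TOKENS;
}
```

Below the threshold a pre-warm buys nothing, since the API would not cache the prefix anyway; above it, warming while the user types hides the prefix-processing latency entirely.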
All edits go through VS Code. File modifications use the WorkspaceEdit API via an IPC server, keeping edits in the undo stack and integrated with the editor. The diff between before/after text (via the diff library) determines which lines changed, driving both the focus scroll and the post-edit dimming effect.
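The line-level comparison can be sketched as follows. CodeSpark uses the diff library for this; the naive per-line comparison below (a hypothetical `changedLines` helper) only illustrates how changed line numbers can drive the focus scroll and dimming:

```typescript
// Compare before/after text line by line and return 1-based numbers of
// lines that differ (including lines added or removed at the end).
function changedLines(before: string, after: string): number[] {
  const a = before.split("\n");
  const b = after.split("\n");
  const changed: number[] = [];
  const max = Math.max(a.length, b.length);
  for (let i = 0; i < max; i++) {
    if (a[i] !== b[i]) changed.push(i + 1);
  }
  return changed;
}
```

A real diff also tracks insertions and deletions that shift later lines, which is why the extension relies on a proper diff algorithm rather than positional comparison.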
