DeepSeek Inline Completion for VS CodeAI-powered inline (ghost-text) code completions using the DeepSeek API. As you type, the extension sends surrounding context to DeepSeek's chat models and renders suggestions directly in your editor — just like GitHub Copilot's inline completions. Table of Contents
Features
Requirements
InstallationFrom Source (Development)
Then press F5 in VS Code to launch the Extension Development Host with the extension loaded. Packaging as
|
| Setting | Type | Default | Range | Description |
|---|---|---|---|---|
deepseek.enabled |
boolean |
true |
— | Master switch to enable/disable all completions |
deepseek.model |
string |
"deepseek-v4-flash" |
"deepseek-v4-flash", "deepseek-v4-pro" |
Which DeepSeek model to query |
deepseek.maxTokens |
number |
256 |
16–4096 |
Maximum tokens in the completion response |
deepseek.temperature |
number |
0.2 |
0–2 |
Sampling temperature; lower = more deterministic |
deepseek.debounceMs |
number |
300 |
100–2000 |
Delay (ms) after last keystroke before requesting |
deepseek.contextLines |
number |
50 |
10–200 |
Lines of context sent above and below cursor |
deepseek.enabledLanguages |
string[] |
[] |
e.g. ["javascript","python"] |
Language IDs to enable; empty = all languages |
Example settings.json
{
"deepseek.enabled": true,
"deepseek.model": "deepseek-v4-pro",
"deepseek.maxTokens": 512,
"deepseek.temperature": 0.1,
"deepseek.debounceMs": 500,
"deepseek.contextLines": 80,
"deepseek.enabledLanguages": [
"javascript",
"typescript",
"javascriptreact",
"typescriptreact",
"python",
"rust",
"go"
]
}
Note:
deepseek.apiKeyis not in the example above — keys are managed via SecretStorage, notsettings.json.
Commands
| Command ID | Title | Description |
|---|---|---|
deepseek.toggleCompletion |
DeepSeek: Toggle Inline Completion | Toggles deepseek.enabled on/off. Displays a confirmation message. |
deepseek.setApiKey |
DeepSeek: Set API Key | Opens a masked input box to enter/update your API key. The value is stored in VS Code's global settings. |
Access both via the Command Palette (Ctrl+Shift+P).
How It Works
Request Lifecycle
User types a character
│
▼
VS Code fires InlineCompletionContext (triggerKind: Automatic)
│
▼
Provider debounces for deepseek.debounceMs (default 300ms)
│
├── User types again → debounce timer resets
│
▼ (no new keystrokes during debounce window)
Extract prefix (code before cursor) and suffix (code after cursor)
│
▼
Build FIM-style prompt with <PREFIX> / <CURSOR> / <SUFFIX>
│
▼
POST to https://api.deepseek.com/v1/chat/completions
│
├── New keystroke → AbortController aborts the in-flight request
│
▼
Parse response, extract completion text, strip markdown fences
│
▼
Return vscode.InlineCompletionItem → rendered as ghost text
Prompt Structure
The extension does not rely on native Fill-in-the-Middle (FIM) API support. Instead, it constructs a structured chat message:
System message:
You are an expert inline code completion assistant.
You complete code in the {language} programming language.
RULES:
1. Return ONLY the code that should appear at the cursor position.
2. Do NOT repeat the prefix or suffix.
3. Do NOT include any explanations, markdown fences, or commentary.
4. Complete the current statement/expression logically.
5. If the cursor is mid-identifier, complete that identifier naturally.
6. If there is nothing meaningful to complete, return an empty response.
7. Maintain the same indentation style as the surrounding code.
8. Multi-line completions are acceptable when appropriate.
User message:
Complete the code at the <CURSOR> position in the following {language} file.
<PREFIX>
{last N lines of code before cursor}
</PREFIX>
<CURSOR>
<SUFFIX>
{next N lines of code after cursor}
</SUFFIX>
Return ONLY the code that replaces <CURSOR>. Do not wrap in backticks.
Context is truncated: prefix keeps the tail (closest to cursor, up to ~2000 chars), suffix keeps the head (closest to cursor, up to ~1000 chars).
Response Extraction
The raw model response goes through a cleanup pipeline in extractCompletion():
- Trim whitespace.
- Strip markdown code fences (
```) if the model wrapped the output. - Strip the prefix if the model accidentally repeated it.
- Return
nullif nothing meaningful remains (prevents showing empty/whitespace ghost text).
Architecture
File Map
deepseek-inline-completion/
│
├── package.json # Extension manifest, contributes configuration & commands
├── tsconfig.json # TypeScript compiler configuration
├── .vscodeignore # Files excluded from .vsix packaging
├── .gitignore
│
├── .vscode/
│ ├── launch.json # Debug launch config ("Run Extension", "Run Extension (no build)")
│ └── tasks.json # Build task definitions (compile, watch)
│
├── src/
│ ├── extension.ts # Activation entry point — registers provider + commands
│ ├── deepseekClient.ts # HTTPS client for DeepSeek API (chat/completions)
│ └── inlineCompletionProvider.ts # VS Code InlineCompletionItemProvider implementation
│
└── out/ # Compiled JavaScript output (gitignored)
├── extension.js
├── deepseekClient.js
└── inlineCompletionProvider.js
Data Flow
┌─────────────────────────────────────────────────────────────────┐
│ extension.ts (activate) │
│ ├── Registers DeepSeekInlineCompletionProvider │
│ ├── Registers commands: toggleCompletion, setApiKey │
│ └── Listens for configuration changes │
└──────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ inlineCompletionProvider.ts │
│ ├── provideInlineCompletionItems(document, position, context) │
│ │ ├── Checks: enabled? apiKey? language filter? │
│ │ ├── Automatic trigger → debouncedComplete() │
│ │ └── Explicit trigger → getCompletions() immediately │
│ ├── getCompletions() │
│ │ ├── Aborts previous request │
│ │ ├── Gets prefix (code before cursor) │
│ │ ├── Gets suffix (code after cursor) │
│ │ ├── Calls client.complete(prefix, suffix, lang, signal) │
│ │ └── Returns InlineCompletionItem[] │
│ └── dispose() — cleanup on deactivate │
└──────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ deepseekClient.ts │
│ ├── complete(prefix, suffix, language, signal) │
│ │ ├── Reads config (apiKey, model, maxTokens, temperature) │
│ │ ├── Builds FIM prompt via buildFimPrompt() │
│ │ ├── POST via https.request() to api.deepseek.com │
│ │ ├── Parses JSON response │
│ │ └── Calls extractCompletion() to clean output │
│ ├── httpsRequest() — raw HTTPS with AbortSignal support │
│ ├── buildSystemPrompt() — system role instructions │
│ ├── buildFimPrompt() — constructs <PREFIX>/<CURSOR>/<SUFFIX> │
│ ├── extractCompletion() — strips fences, prefix repeats │
│ └── truncateCode() — limits context to max character count │
└─────────────────────────────────────────────────────────────────┘
Class Diagram
┌──────────────────────────────────────────┐
│ InlineCompletionProvider │
├──────────────────────────────────────────┤
│ - client: DeepSeekClient │
│ - debounceTimer: NodeJS.Timeout? │
│ - abortController: AbortController? │
├──────────────────────────────────────────┤
│ + provideInlineCompletionItems(...) │
│ - debouncedComplete(...) │
│ - getCompletions(...) │
│ - getPrefix(...) │
│ - getSuffix(...) │
│ + dispose() │
└──────────────┬───────────────────────────┘
│ uses
▼
┌──────────────────────────────────────────┐
│ DeepSeekClient │
├──────────────────────────────────────────┤
│ - apiHost: string │
│ - apiPath: string │
│ - userAgent: string │
├──────────────────────────────────────────┤
│ + complete(prefix, suffix, lang, signal) │
│ - httpsRequest(body, apiKey, signal) │
│ - buildSystemPrompt(language) │
│ - buildFimPrompt(prefix, suffix, lang) │
│ - extractCompletion(raw, prefix) │
│ - truncateCode(code, maxChars) │
│ - logError(message) │
└──────────────────────────────────────────┘
Configuration Recipes
Minimal latency (fast completions)
{
"deepseek.debounceMs": 150,
"deepseek.maxTokens": 128,
"deepseek.contextLines": 30,
"deepseek.temperature": 0
}
Best for fast typists who want suggestions to appear almost instantly. Shorter context and fewer tokens reduce API latency.
High quality (thoughtful completions, multi-line)
{
"deepseek.model": "deepseek-v4-pro",
"deepseek.debounceMs": 600,
"deepseek.maxTokens": 1024,
"deepseek.contextLines": 120,
"deepseek.temperature": 0.3
}
Best for complex codebases where you want full function bodies, closing braces, and multi-line completions. More context helps the model understand intent.
Python-only, conservative
{
"deepseek.enabledLanguages": ["python"],
"deepseek.model": "deepseek-v4-pro",
"deepseek.temperature": 0.1,
"deepseek.maxTokens": 256,
"deepseek.contextLines": 60
}
Troubleshooting
No suggestions appear at all
- Check
deepseek.enabled— RunDeepSeek: Toggle Inline Completionand verify the notification says "Enabled". - Check your API key — Run
DeepSeek: Set API Keyand ensure a valid key is stored. A migration message in the Extension Host output ([DeepSeek Inline Completion] Migrated API key from settings to SecretStorage.) confirms a legacy key was imported. - Check
deepseek.enabledLanguages— If non-empty, verify the current file's language ID is in the list. Check the language ID in the VS Code status bar (bottom-right). - Check Output panel — Open
View → Output, select the "Extension Host" channel from the dropdown, and look for[DeepSeek Inline Completion]log messages.
Error: "DeepSeek API error 401"
Your API key is invalid or expired. Generate a new key at platform.deepseek.com/api_keys and update via DeepSeek: Set API Key.
Error: "DeepSeek API error 429"
You've hit DeepSeek's rate limit. Increase deepseek.debounceMs to send fewer requests, or upgrade your DeepSeek API tier.
Suggestions are slow to appear
- Increase
deepseek.debounceMs(counter-intuitive but reduces cancelled requests). - Decrease
deepseek.contextLinesto send less data. - Decrease
deepseek.maxTokensfor faster generation. - Check your network latency to
api.deepseek.com.
Suggestions are inaccurate or irrelevant
- Set
deepseek.modelto"deepseek-v4-pro"for higher-quality code completions. - Lower
deepseek.temperatureto0or0.1for more deterministic output. - Increase
deepseek.contextLinesto give the model more surrounding context.
Extension won't activate (no logs in Extension Host output)
- Verify VS Code version ≥ 1.82.0 (
Help → About). - Verify
npm run compilecompleted without errors. - Check that
out/directory contains compiled.jsfiles.
Development
Project Setup
cd deepseek-inline-completion
npm install
npm run compile
Build & Run
| Task | Command | Description |
|---|---|---|
| Compile once | npm run compile |
Runs tsc -p ./ |
| Watch mode | npm run watch |
Recompiles on file changes |
| Launch extension | Press F5 | Opens Extension Development Host |
| Launch (no rebuild) | Select "Run Extension (no build)" in Run & Debug | Skips the pre-launch build task |
Debugging
- Press F5 to launch the Extension Development Host.
- Set breakpoints in
src/*.tsfiles. - Open a file and start typing — breakpoints in
provideInlineCompletionItemsandcompletewill be hit. - View logs in the original VS Code window's Debug Console.
- View extension host logs via
View → Output → "Extension Host".
Adding Features
- Support streaming completions — Modify
deepseekClient.tsto setstream: trueand parse Server-Sent Events (SSE). Requires switching fromhttps.requestto streaming response handling. - Cache recent completions — Add an LRU cache in
inlineCompletionProvider.tskeyed by(document URI, position, prefix hash). - Multi-line ghost text rendering — Already supported; the provider returns multi-line strings and VS Code renders them automatically.
- Context from multiple files — Enhance
getPrefix()/getSuffix()to include imports or related file snippets by resolving imports from the current document. - Status bar indicator — Add a status bar item in
extension.tsshowing when a request is in-flight, similar to Copilot's spinner.
FAQ
How is this different from GitHub Copilot?
Copilot uses OpenAI Codex models and sends extensive context (neighboring files, recently viewed tabs). This extension is a lightweight alternative that uses DeepSeek models with only local context (prefix/suffix from the current file). It's fully open-source and you control the API key.
Does it work offline?
No — every completion requires a network call to api.deepseek.com. There is no local model.
What does it cost?
You pay DeepSeek directly for API usage. See DeepSeek Pricing. deepseek-v4-flash is the most cost-effective option; deepseek-v4-pro costs more but delivers higher-quality completions.
Can I use it alongside Copilot?
Yes. VS Code supports multiple inline completion providers. Copilot's suggestions take priority; this extension's suggestions appear when Copilot has none. You can toggle either independently.
What languages are supported?
Any language VS Code recognizes (all language IDs). Both deepseek-v4-flash and deepseek-v4-pro support dozens of programming languages. Use v4-pro for complex or less common languages.
How do I report a bug or request a feature?
File an issue on the project repository with:
- VS Code version (
Help → About) - Extension settings (
deepseek.*from yoursettings.json, with API key redacted) - Steps to reproduce
- Relevant Extension Host logs
License
MIT