DeskAI — Local AI Coding Assistant

100% offline AI coding assistant. Full project generation, agentic file editing & terminal control powered by local LLMs (Qwen3, DeepSeek-Coder) running entirely on your machine. Works on Windows, macOS and Linux. No API keys.

DeskAI — Intelligent Local Coding Assistant

100% offline AI coding assistant powered by local LLMs running entirely on your machine. No API keys. No data sent to the cloud. Full project generation, agentic file editing, and terminal control — all on your hardware.


Features

🤖 Agent Mode — Full Project Generation

Let the AI plan and build entire projects autonomously:

  • Creates files and folders directly on disk
  • Runs terminal commands (npm install, build, etc.)
  • Reads, edits, and verifies files
  • Recovers from errors automatically
  • Persistent task memory (scratchpad) survives context resets
  • "Continue task" button resumes interrupted sessions
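
The loop described above can be sketched roughly as follows. This is a minimal illustration, not DeskAI's actual implementation: the `ToolCall`, `ModelStep`, `Scratchpad` and `runAgent` names are hypothetical, and `nextStep` stands in for a call to the local model. The scratchpad is a plain object, so it outlives any single model call, and the default iteration cap of 40 mirrors the documented `deskai.maxAgentIterations` default.

```typescript
// Hypothetical types: DeskAI's real tool schema is not documented here.
type ToolCall = { tool: string; args: Record<string, string> };
type ModelStep = { done: boolean; call?: ToolCall };
type Tool = (args: Record<string, string>) => string;

interface Scratchpad {
  task: string;
  notes: string[]; // one entry per completed tool call or recorded error
}

// Runs the plan/act loop until the model reports completion or the cap is hit.
function runAgent(
  nextStep: (pad: Scratchpad) => ModelStep, // stands in for a call to the local model
  tools: Record<string, Tool>,
  pad: Scratchpad,
  maxIterations = 40, // same value as the deskai.maxAgentIterations default
): "done" | "paused" {
  for (let i = 0; i < maxIterations; i++) {
    const step = nextStep(pad);
    if (step.done || !step.call) return "done";
    const tool = tools[step.call.tool];
    try {
      if (!tool) throw new Error(`unknown tool: ${step.call.tool}`);
      pad.notes.push(`${step.call.tool}: ${tool(step.call.args)}`);
    } catch (err) {
      // Record the failure so the next model call can try to recover from it.
      pad.notes.push(`error: ${(err as Error).message}`);
    }
  }
  return "paused"; // the caller can resume later by calling runAgent with the same pad
}
```

Because errors are written to the scratchpad instead of aborting the loop, the next step sees what went wrong, which is one plausible reading of "recovers from errors automatically".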

💬 Ask Mode — Instant Code Help

  • Code explanation, refactoring, debugging
  • Ask about selected code via right-click → DeskAI: Ask About Selection
  • Attach files as context with the 📎 button

📎 File Attachment

  • Attach single files or entire folders as context
  • File content is processed on the local server side, so it never floods the chat UI
  • Large files handled gracefully with truncation and read_file fallback
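
One way to picture "truncation and read_file fallback" is the sketch below. It assumes a simple character budget; the `attachFile` helper, the `Attachment` shape, the 8000-character default and the hint wording are all hypothetical. Only the `read_file` tool name comes from this page.

```typescript
// Hypothetical helper: DeskAI's real limits and message format are not documented.
interface Attachment {
  path: string;
  text: string; // what is actually placed in the model context
  truncated: boolean;
}

function attachFile(path: string, content: string, maxChars = 8000): Attachment {
  if (content.length <= maxChars) {
    return { path, text: content, truncated: false };
  }
  const head = content.slice(0, maxChars);
  // Tell the model how to fetch the rest instead of inlining the whole file.
  const omitted = content.length - maxChars;
  const hint = `\n[truncated ${omitted} chars; call read_file("${path}") for the rest]`;
  return { path, text: head + hint, truncated: true };
}
```

Small files pass through unchanged; large ones keep their first `maxChars` characters plus a pointer the model can follow with a tool call.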

⚡ Local LLM Powered

  • Uses llama-server under the hood (downloaded automatically on first launch)
  • Supports GGUF models — import any compatible model
  • Recommended models auto-downloaded on first launch

Recommended Models

| Model | RAM Required | Best For |
|-------|--------------|----------|
| Qwen3-27B Q4_K_M ⭐ | 20 GB+ | Full project generation, best quality |
| DeepSeek-Coder-V2-Lite 16B | 12 GB+ | Great for 15 GB RAM machines |
| Qwen2.5-Coder-14B Q4_K_M | 12 GB+ | Solid coding baseline |

Models are downloaded automatically on first use. You can also import any .gguf file via the model picker.
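
As a rough guide to reading the table above, the sketch below picks the first listed model that fits a given amount of RAM. The thresholds are copied from the table; the selection rule and the `pickModel` name are assumptions, not a description of how DeskAI chooses a model.

```typescript
// RAM thresholds copied from the table above; the selection rule itself is an assumption.
const RECOMMENDED = [
  { name: "Qwen3-27B Q4_K_M", minRamGb: 20 },
  { name: "DeepSeek-Coder-V2-Lite 16B", minRamGb: 12 },
  { name: "Qwen2.5-Coder-14B Q4_K_M", minRamGb: 12 },
];

// Returns the first listed model that fits, or undefined below the 12 GB minimum.
function pickModel(totalRamGb: number): string | undefined {
  return RECOMMENDED.find((m) => totalRamGb >= m.minRamGb)?.name;
}
```

In Node.js, total RAM in GB can be obtained with `os.totalmem() / 1024 ** 3` and passed in as `totalRamGb`.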


Getting Started

  1. Install the extension
  2. Open the DeskAI panel in the secondary sidebar (right side)
  3. Wait for the model to load (the first launch downloads the model, roughly 8–16 GB)
  4. Switch to Agent mode for full project generation
  5. Type your task and press Enter

Showcase / Demo Mode

For full project generation without confirmation dialogs:

  • Set deskai.autoApproveWrites to true: files are written directly, without confirmation popups
  • Set deskai.hiddenTerminal to true: commands run silently in the background

Settings

| Setting | Default | Description |
|---------|---------|-------------|
| `deskai.autoApproveWrites` | `false` | Write files directly without confirmation dialogs |
| `deskai.hiddenTerminal` | `false` | Run shell commands silently in background |
| `deskai.maxAgentIterations` | `40` | Max tool-call iterations before pausing |
| `deskai.contextSize` | `32768` | Model context window in tokens |
| `deskai.temperature` | `0.3` | Generation temperature (0 = deterministic) |
| `deskai.modelPath` | | Custom path to a `.gguf` model file |
| `deskai.modelsSearchPaths` | `[]` | Extra directories to scan for models |
| `deskai.llamaServerPath` | | Custom path to llama-server executable |
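
The settings and their defaults can be written down as a typed object, which makes the showcase/demo overrides described earlier easy to see. This is only a sketch: the keys and defaults are taken from the settings table, while the `DeskAiSettings` interface, the `resolveSettings` helper and the empty-string placeholders for the two path settings (which have no documented default) are assumptions.

```typescript
// Keys and defaults are taken from the settings table; the merge helper is hypothetical.
interface DeskAiSettings {
  "deskai.autoApproveWrites": boolean;
  "deskai.hiddenTerminal": boolean;
  "deskai.maxAgentIterations": number;
  "deskai.contextSize": number;
  "deskai.temperature": number;
  "deskai.modelPath": string;
  "deskai.modelsSearchPaths": string[];
  "deskai.llamaServerPath": string;
}

const DEFAULTS: DeskAiSettings = {
  "deskai.autoApproveWrites": false,
  "deskai.hiddenTerminal": false,
  "deskai.maxAgentIterations": 40,
  "deskai.contextSize": 32768,
  "deskai.temperature": 0.3,
  "deskai.modelPath": "", // no documented default; empty string assumed
  "deskai.modelsSearchPaths": [],
  "deskai.llamaServerPath": "", // no documented default; empty string assumed
};

// Overrides for the showcase/demo mode: no write popups, no visible terminal.
const DEMO_OVERRIDES: Partial<DeskAiSettings> = {
  "deskai.autoApproveWrites": true,
  "deskai.hiddenTerminal": true,
};

// User values win over defaults; anything not overridden keeps its default.
function resolveSettings(user: Partial<DeskAiSettings>): DeskAiSettings {
  return { ...DEFAULTS, ...user };
}
```

In practice the same keys go into VS Code's `settings.json`, for example `"deskai.autoApproveWrites": true`.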


Requirements

  • Windows 10/11 (primary), macOS, Linux
  • RAM: 12 GB minimum (16 GB+ recommended; Qwen3-27B needs 20 GB+)
  • Disk: 10–20 GB free for model storage
  • VS Code 1.85.0+

How It Works

  1. On first open, DeskAI downloads llama-server and a GGUF model
  2. llama-server runs locally on port 11434 with a 32k context window
  3. The chat panel communicates with the local server via HTTP
  4. In Agent mode, the model calls tools (write_file, run_terminal_command, etc.) to implement tasks
  5. All data stays on your machine — nothing is sent to the internet during inference
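
Step 3 can be illustrated with a small client for the local server. llama-server exposes an OpenAI-compatible `/v1/chat/completions` endpoint; whether DeskAI's chat panel uses that exact endpoint is an assumption, as are the `127.0.0.1` host and the `buildChatRequest` and `askLocalModel` names. The port comes from the list above, and the temperature default matches `deskai.temperature`.

```typescript
// Minimal fetch-like signature so the sketch does not depend on a particular runtime.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const BASE_URL = "http://127.0.0.1:11434"; // port from the list above; host is assumed

// Builds the HTTP request without sending it, which keeps it easy to inspect.
function buildChatRequest(prompt: string, temperature = 0.3) {
  return {
    url: `${BASE_URL}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        messages: [{ role: "user", content: prompt }],
        temperature,
      }),
    },
  };
}

// Sends the prompt to the local server and returns the first reply's text.
async function askLocalModel(prompt: string, fetchImpl: FetchLike): Promise<string> {
  const { url, init } = buildChatRequest(prompt);
  const res = await fetchImpl(url, init);
  if (!res.ok) throw new Error(`llama-server returned HTTP ${res.status}`);
  const data = (await res.json()) as { choices: { message: { content: string } }[] };
  return data.choices[0]?.message.content ?? "";
}
```

With a runtime that provides a global `fetch`, a call would look like `askLocalModel("Explain this function", fetch)`. The request only ever targets the loopback address, which is consistent with step 5.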

Commands

| Command | Description |
|---------|-------------|
| DeskAI: Open Chat | Open the chat panel |
| DeskAI: Ask About Selection | Ask about selected code (right-click menu) |
| DeskAI: Refresh Project Context | Re-scan workspace for context |
| DeskAI: Import Custom Model (.gguf) | Import a local model file |
| DeskAI: Stop Server | Stop the llama-server process |
| DeskAI: Open Debug Log | View the extension debug log |

Privacy

  • All inference is local — your code never leaves your machine
  • The only network requests are:
    • Initial download of llama-server binary (from GitHub Releases)
    • Initial download of the selected GGUF model (from Hugging Face)
  • After first setup, the extension works completely offline

License

MIT — see LICENSE
