Voice Agent

Aditya Kangune | 3 installs | Free
Voice input for AI coding assistants — speak your prompts instead of typing. Offline by default, optional OpenAI Whisper API for faster transcription.

Works with Claude Code, Cursor, GitHub Copilot, and any other VS Code AI tool. Press Ctrl+Alt+V, speak naturally, and your words appear in the chat input automatically. No clicking, no manual stop. Just speak and pause.

Features

  • OpenAI Whisper API: fastest and most accurate option — just add your API key and go
  • Fully offline fallback: local Whisper model, speech never leaves your machine
  • Accent-aware: handles Indian, Chinese, Japanese, and other non-native English accents well
  • Smart auto-stop: automatically detects when you stop speaking
  • Works everywhere: Claude Code, Cursor, Copilot Chat, any terminal or editor
  • No subscriptions: pay only for what you use (~$0.006/min with OpenAI, roughly $0.06 for a 10-minute session), or use fully offline for free

Setup

Step 1: Install Python dependencies

pip install faster-whisper sounddevice numpy webrtcvad-wheels openai
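
If it is unclear whether the install succeeded in the Python that VS Code uses, a quick check like the one below can help. This is only a sketch, not part of the extension: the mapping from pip package names to import names is filled in from the packages listed above, and `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# pip package name -> module name it installs
REQUIRED_MODULES = {
    "faster-whisper": "faster_whisper",
    "sounddevice": "sounddevice",
    "numpy": "numpy",
    "webrtcvad-wheels": "webrtcvad",
    "openai": "openai",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names whose import module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if importlib.util.find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages: pip install " + " ".join(missing))
else:
    print("All Voice Agent dependencies are importable.")
```

Run it with the same interpreter the extension uses. The check only confirms that each module can be located; it does not verify that native libraries such as PortAudio (needed by sounddevice) load correctly.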

Step 2: Add your OpenAI API key (recommended for best results)

Create an API key at platform.openai.com/api-keys. Creating a key is free; transcription is billed per minute of audio.

Open VS Code Settings (Ctrl+,), search for "Voice Agent", and set:

  • voice-agent.provider to openai
  • voice-agent.openaiApiKey to your key (sk-...)

OpenAI Whisper gives significantly better accuracy than the local models and is typically faster (about 1s versus 2-4s for base and small). At ~$0.006/min, 10 minutes of dictation costs about $0.06 and a full hour about $0.36. Strongly recommended over local mode.
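
To see what the per-minute rate means for your own usage, the arithmetic can be sketched as below. The rate is the approximate figure quoted in this README, `estimate_cost_usd` is a hypothetical helper, and the estimate ignores any rounding the API applies when billing.

```python
WHISPER_API_USD_PER_MINUTE = 0.006  # approximate rate quoted in this README


def estimate_cost_usd(minutes: float, rate: float = WHISPER_API_USD_PER_MINUTE) -> float:
    """Estimated API cost in US dollars for the given minutes of audio."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * rate


for minutes in (1, 10, 60):
    print(f"{minutes:>3} min of audio: ~${estimate_cost_usd(minutes):.3f}")
```

Only the audio you actually record counts toward the minutes, so short prompts cost a fraction of a cent each.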

Usage

  1. Press Ctrl+Alt+V (Mac: Cmd+Alt+V) or click the 🎤 Voice button in the status bar
  2. Speak your prompt naturally
  3. Pause for ~1 second — transcription starts automatically
  4. Your words are pasted directly into the active input

Press the button again at any time to cancel.
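
The pause-based stop in step 3 can be pictured with the sketch below. The extension's actual logic is not shown on this page, so everything here is an assumption: audio is split into 30 ms frames, each frame already carries a speech/silence flag (the kind of per-frame result a voice activity detector such as webrtcvad produces), and `find_stop_frame` is a hypothetical name.

```python
from typing import Iterable, Optional


def find_stop_frame(speech_flags: Iterable[bool],
                    frame_ms: int = 30,
                    pause_ms: int = 1000) -> Optional[int]:
    """Return the index of the frame at which recording would stop.

    speech_flags holds one boolean per audio frame (True = speech).
    Recording stops once speech has been heard and at least pause_ms of
    consecutive silence follows it. Returns None if that never happens.
    """
    frames_needed = -(-pause_ms // frame_ms)  # ceiling division
    heard_speech = False
    silent_run = 0
    for index, is_speech in enumerate(speech_flags):
        if is_speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= frames_needed:
                return index
    return None
```

Silence before the first word is ignored, so the recording does not end while you are still gathering your thoughts; only a pause after speech triggers the stop.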

Settings

Setting                  | Default | Description
voice-agent.provider     | local   | openai for cloud (recommended), local for offline
voice-agent.openaiApiKey | (empty) | OpenAI API key for cloud transcription
voice-agent.model        | base    | Local model size: tiny, base, small, medium
voice-agent.language     | en      | Spoken language code (e.g. ja, zh, hi, fr)
voice-agent.pythonPath   | (empty) | Custom Python path if needed

Provider Comparison

Provider      | Quality   | Speed | Cost        | Privacy
OpenAI API    | Excellent | ~1s   | ~$0.006/min | Cloud
Local (small) | Better    | ~4s   | Free        | 100% offline
Local (base)  | Good      | ~2s   | Free        | 100% offline
Local (tiny)  | Basic     | ~1s   | Free        | 100% offline
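
For readers curious how the two providers differ in code, here is a rough sketch using the libraries from the Setup step. It is not the extension's implementation: the function names are hypothetical, the CPU/int8 options for faster-whisper and the `whisper-1` model name for the OpenAI API are assumptions, and the audio is assumed to be already saved to a file.

```python
from pathlib import Path


def transcribe_local(audio_path: Path, model_size: str = "base", language: str = "en") -> str:
    """Transcribe offline with faster-whisper (the model downloads on first use)."""
    from faster_whisper import WhisperModel  # imported lazily; only needed for local mode

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(str(audio_path), language=language)
    return " ".join(segment.text.strip() for segment in segments)


def transcribe_openai(audio_path: Path, api_key: str, language: str = "en") -> str:
    """Transcribe in the cloud with the OpenAI audio transcription endpoint."""
    from openai import OpenAI  # imported lazily; only needed for cloud mode

    client = OpenAI(api_key=api_key)
    with audio_path.open("rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="whisper-1", file=audio_file, language=language
        )
    return result.text


def transcribe(audio_path: Path, provider: str = "local", **options) -> str:
    """Dispatch on the provider name, mirroring the voice-agent.provider values."""
    if provider == "local":
        return transcribe_local(audio_path, **options)
    if provider == "openai":
        return transcribe_openai(audio_path, **options)
    raise ValueError(f"unknown provider: {provider!r}")
```

In local mode the audio file never leaves the machine; in openai mode the file is uploaded for transcription, which is the privacy trade-off listed in the table.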

Status Bar

State                  | Meaning
🎤 Voice (highlighted) | Ready, using OpenAI API
🎤 Voice (plain)       | Ready, using local Whisper
⟳ Loading model...     | First-time model load (local only)
⏺ Listening...         | Recording, speak now
⟳ Transcribing...      | Processing your speech

Troubleshooting

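  • Nothing happens: make sure Python dependencies are installed (pip install faster-whisper sounddevice numpy webrtcvad-wheels openai), and set voice-agent.pythonPath if they live in a different Python than the one VS Code finds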
  • OpenAI error: check your API key is correct and has credits
  • Slow first use: local Whisper downloads a model on first run (~150MB for base), cached after that
  • No speech detected: speak closer to the mic; check microphone permissions in Windows Settings → Privacy → Microphone
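
When no speech is detected, it can help to confirm that Python sees a microphone at all. The sketch below uses sounddevice's `query_devices()`; the helper names are hypothetical and this is a diagnostic aid, not something the extension ships.

```python
from typing import Iterable, List, Mapping


def input_device_names(devices: Iterable[Mapping]) -> List[str]:
    """Return the names of devices that expose at least one input channel."""
    return [d["name"] for d in devices if d.get("max_input_channels", 0) > 0]


def list_microphones() -> List[str]:
    """Ask sounddevice for every audio device and keep only the inputs."""
    import sounddevice as sd  # imported lazily so the helper above works without it

    return input_device_names(sd.query_devices())
```

Calling `list_microphones()` in a Python shell should return at least one name. An empty list suggests that no input device is visible to Python, for example because of a permissions or driver problem, rather than an issue with the extension.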

License

MIT
