VoiceDev — Voice-Native Development

Voice-native development for VS Code — speak to navigate, commit, and control your workflow.

⚠️ v0.1.0-preview — Early development. Expect rapid improvements.

🎥 See It In Action

Watch VoiceDev in action — see voice-driven git workflows, navigation, and AI-assisted development.

Why VoiceDev?

Development is full of small, repetitive actions — saving files, committing code, navigating to a line, running a command. These tiny interruptions pull you out of flow.

VoiceDev makes voice a first-class way to drive your development workflows. Not just dictation — real workflows, entirely by voice:

🗣️ "git status"  →  "git commit message fixed the auth bug"  →  "git push"

A full commit cycle without touching a menu, a palette, or a terminal prompt.

🗣️ "open file server.ts"  →  "go to line 42"  →  "format document"  →  "save all"

Navigate, edit, and save — all spoken.

🗣️ "ask copilot explain this file"  →  "copilot commit"

AI-assisted development, triggered by voice.

Core Beliefs

Workflows over keystrokes — voice shines at chaining intent, not replacing a keyboard
Privacy as a feature — cloud speed or local privacy, always your choice
Accessible by nature — every speakable workflow is one less barrier
Progressive disclosure — simple on day one, powerful by day thirty

📖 Read the full philosophy in VISION.md

Features

🗣️ 30+ Voice Commands

Speak naturally — VoiceDev understands you even if your wording isn't exact, thanks to fuzzy matching with confidence scoring.

Editor

"save all" · "format document" · "new terminal" · "close editor"

Git — full workflow by voice

"git status" · "git add all" · "git diff" · "git log"
"git commit message fixed the login bug" — wildcard captures your message
"git pull" · "git push" (with confirmation) · force push blocked for safety

Navigation

"open file server.ts" — fuzzy file search
"go to line 42" · "go to top" · "go to bottom" · "go to symbol"

Copilot CLI

"ask copilot explain this error" · "copilot commit" · "copilot suggest how to list docker containers"

Copilot Chat

"copilot chat explain this file" · "ask copilot in chat how to fix this" · "open copilot chat"

System

"help" · "open command center" · "show shortcuts"

🎯 Smart Matching

Wildcard patterns — say "git commit message fixed the auth bug" and VoiceDev extracts "fixed the auth bug" as the commit message. 9 commands support dynamic argument capture.
Fuzzy matching — "format the document" still triggers format-document. No need to memorize exact phrases.
Confidence scoring — if VoiceDev isn't confident enough in a command match, it falls back to dictation instead of guessing wrong.
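
The three matching behaviors above can be sketched together as follows. This is a minimal illustration, not VoiceDev's actual matcher: the `COMMANDS` table, the `matchUtterance` function, the word-overlap score, and the 0.6 threshold are all assumptions made for the example.

```typescript
// Illustrative sketch only: the command table, the word-overlap score, and the
// 0.6 threshold are assumptions, not VoiceDev's actual matcher.
interface CommandPattern {
  id: string;
  phrase: string; // a trailing "*" captures the rest of the utterance
}

type MatchResult =
  | { kind: "command"; id: string; confidence: number; argument: string | null }
  | { kind: "dictation"; text: string };

const COMMANDS: CommandPattern[] = [
  { id: "format-document", phrase: "format document" },
  { id: "save-all", phrase: "save all" },
  { id: "git-commit", phrase: "git commit message *" },
];

function tokenize(text: string): string[] {
  return text.trim().split(/\s+/).filter((word) => word.length > 0);
}

function unique(words: string[]): string[] {
  return words.filter((word, index) => words.indexOf(word) === index);
}

// Dice coefficient over distinct words: 1 for identical sets, 0 for disjoint ones.
function similarity(a: string[], b: string[]): number {
  const left = unique(a);
  const right = unique(b);
  if (left.length === 0 || right.length === 0) return 0;
  const shared = left.filter((word) => right.indexOf(word) !== -1).length;
  return (2 * shared) / (left.length + right.length);
}

function matchUtterance(utterance: string, threshold = 0.6): MatchResult {
  const rawWords = tokenize(utterance);
  const words = rawWords.map((word) => word.toLowerCase());
  let best: MatchResult | null = null;
  let bestScore = 0;

  for (const command of COMMANDS) {
    const pattern = tokenize(command.phrase.toLowerCase());
    const isWildcard = pattern[pattern.length - 1] === "*";
    let score = 0;
    let argument: string | null = null;

    if (isWildcard) {
      // Wildcard prefixes must match exactly in this sketch, so the boundary
      // of the captured argument is unambiguous.
      const prefix = pattern.slice(0, -1);
      const head = words.slice(0, prefix.length);
      score = head.join(" ") === prefix.join(" ") ? 1 : 0;
      argument = rawWords.slice(prefix.length).join(" ");
    } else {
      score = similarity(words, pattern);
    }

    if (score > bestScore) {
      bestScore = score;
      best = { kind: "command", id: command.id, confidence: score, argument };
    }
  }

  // Below the threshold, treat the speech as dictation rather than guess.
  if (best !== null && bestScore >= threshold) return best;
  return { kind: "dictation", text: utterance };
}
```

Under these assumptions, "format the document" scores 0.8 against "format document" and runs as a command, while a sentence that shares no words with any command scores 0 and is returned as dictation.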

📝 Voice Dictation

When your speech doesn't match a command, VoiceDev inserts it as text — into the active editor at your cursor, or into the terminal if one is focused.
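
The routing rule described above can be pictured as a small decision function. The sketch below uses stand-in interfaces (`TextTarget`, `FocusState`, `routeDictation`), all of which are hypothetical; in a real extension they would presumably wrap VS Code API calls such as `TextEditor.edit` and `Terminal.sendText`.

```typescript
// Minimal stand-ins for the editor and terminal. All names are hypothetical
// and exist only to illustrate the routing decision.
interface TextTarget {
  insert(text: string): void;
}

interface FocusState {
  focused: "editor" | "terminal" | "none";
  editor?: TextTarget;
  terminal?: TextTarget;
}

type Route = "editor" | "terminal" | "dropped";

function routeDictation(text: string, state: FocusState): Route {
  // A focused terminal takes priority over the editor.
  if (state.focused === "terminal" && state.terminal) {
    state.terminal.insert(text);
    return "terminal";
  }
  // Otherwise fall back to the active editor, if there is one.
  if (state.editor) {
    state.editor.insert(text);
    return "editor";
  }
  // No target available: nothing is inserted.
  return "dropped";
}
```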

🔒 Privacy-First Provider Choice

Your voice, your rules. Choose between fast cloud transcription or fully offline local mode:

Provider	Speed	Cost	Privacy	Setup
Groq (default)	⚡ Very Fast	🆓 Free tier	☁️ Cloud	API key only
Voxtral (Mistral)	⚡ Very Fast	🆓 Free tier	☁️ Cloud	API key only
Local	🐌 Slower	🆓 Zero cost	🔒 Fully offline	One-time setup (~2 min)
OpenAI	🐢 Medium	💰 Paid	☁️ Cloud	API key only

Local mode runs via faster-whisper — your voice never leaves your machine. No API keys, no cloud, no limits.

🔊 Audio & Visual Feedback

Start/stop chimes so you know when VoiceDev is listening
Status bar shows recording timer, transcription spinner, and provider name
Toast notifications for command execution, errors, and results
Command Center webview for browsing all available commands

Roadmap

VoiceDev is in active development. Here's where we are and where we're heading.

🟢 Now (v0.1.0-preview)

✅ 30+ voice commands — editor, git, navigation, Copilot CLI, and Copilot Chat
✅ Multi-provider speech-to-text (Groq, Mistral Voxtral, OpenAI, Local)
✅ Privacy-first offline mode via faster-whisper
✅ Fuzzy matching and wildcard pattern extraction
✅ Audio feedback and status bar integration

🔵 Next

🔗 Action Chaining — compose multi-step voice sequences into reusable workflows. Say "git workflow" to run diff → stage → commit → push in one voice command. Chain any combination of built-in commands with abort-on-failure safety.
🎙️ Custom Macros — record a voice phrase and map it to one or more commands. Say "deploy staging" to trigger your own sequence of git pull → build → deploy. Define macros in settings or through the Command Center.
AI-powered developer workflows (inline completion, coding assists)
Real-time translation from multiple languages into English
More VS Code-native actions by voice (search extensions, update settings, open projects)
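
Action Chaining is still on the roadmap, so there is no shipped API to show. As a rough sketch of what "abort-on-failure" could mean, the hypothetical `runChain` below runs steps in order and stops at the first one that fails; the `Step` and `ChainResult` shapes are assumptions for illustration only.

```typescript
// Hypothetical shape for a chained workflow. This feature is on the roadmap,
// so nothing here reflects a shipped VoiceDev API.
interface Step {
  name: string;
  run: () => Promise<boolean>; // resolves to true on success
}

interface ChainResult {
  completed: string[];
  failedAt: string | null;
}

async function runChain(steps: Step[]): Promise<ChainResult> {
  const completed: string[] = [];
  for (const step of steps) {
    let ok = false;
    try {
      ok = await step.run();
    } catch {
      ok = false; // a thrown error counts as a failed step
    }
    if (!ok) {
      // Abort: the remaining steps are never started.
      return { completed, failedAt: step.name };
    }
    completed.push(step.name);
  }
  return { completed, failedAt: null };
}
```

In this sketch, a chain of diff, stage, commit, and push whose stage step fails would report `failedAt: "stage"` and never reach commit or push.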

🟣 Exploring

Voice-based web search with responses inside the IDE
Multi-IDE expansion (Open VSX, Zed, and beyond)
Conversational mode — multi-turn voice interactions
Developer dictionary — learn your codebase vocabulary
Full cross-platform production release

📖 See VISION.md for the full philosophy and detailed roadmap

Requirements

VS Code 1.85.0 or higher
Microphone for voice input

Cloud Providers (Default)

Groq API key (free at console.groq.com) - for fast cloud speech-to-text
OpenRouter API key (optional) - for AI-powered commit messages

Local Provider (Privacy-First Option)

Python 3.9-3.12 - must be in PATH
ffmpeg - must be in PATH (download from ffmpeg.org)
Disk space: ~150MB for model download (one-time)
RAM: 4GB minimum, 8GB+ recommended

Why choose local? Your voice never leaves your machine. Zero API costs, unlimited usage, works offline.

Installation

Install from VS Code Marketplace (coming soon)
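Or install the .vsix file manually: run Extensions: Install from VSIX... from the Command Palette, or run code --install-extension path/to/file.vsix in a terminal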

Quick Start

Option A: Groq (Fast & Free - Recommended for Getting Started)

Get a free API key from console.groq.com
Open Command Palette (Cmd/Ctrl + Shift + P)
Run VoiceDev: Set API Key, select "Groq", and paste your key
Press Ctrl+Shift+V to start recording
Speak a command like "save all" or dictate text

Option B: Mistral (Fast & Free - Voxtral powered)

Get a free API key from console.mistral.ai
Open VS Code Settings (Cmd/Ctrl + ,)
Search for "voicedev.stt.provider" and select "mistral"
Open Command Palette and run VoiceDev: Set API Key, select "Mistral"
Press Ctrl+Shift+V to start recording
Speak a command or dictate text

Option C: Local (Private & Offline - Zero Cost)

Install Python 3.9-3.12 and ffmpeg (if not already installed)
Open VS Code Settings (Cmd/Ctrl + ,)
Search for "voicedev.stt.provider"
Select "local" from the dropdown
Press Ctrl+Shift+V to start recording (the first run sets up the environment, which takes about 2 minutes)
Your voice stays on your machine - no API keys, no cloud, no limits

Configuration

Customize VoiceDev through your VS Code settings. Add these configurations to your settings.json:

Basic Configuration

{
	"voicedev.stt.provider": "groq",
	"voicedev.llm.provider": "openrouter",
	"voicedev.llm.model": "anthropic/claude-3-haiku-20240307",
	"voicedev.audio.feedbackSounds": true
}

Configuration Options

Setting	Description	Values	Default
`voicedev.stt.provider`	Speech-to-text provider	`"groq"`, `"mistral"`, `"openai"`, `"local"`	`"groq"`
`voicedev.llm.provider`	AI model provider	`"openrouter"`, `"openai"`	`"openrouter"`
`voicedev.llm.model`	AI model for commit messages	Model identifier string	`"anthropic/claude-3-haiku-20240307"`
`voicedev.audio.feedbackSounds`	Enable/disable audio feedback	`true`, `false`	`true`
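
To make the defaults in the table concrete, here is a small sketch that resolves a raw settings object against them. Only the setting keys, allowed values, and defaults come from the table; `resolveSettings`, `pick`, and the `VoiceDevSettings` type are hypothetical names. A real extension would more likely read these values through `vscode.workspace.getConfiguration`.

```typescript
// Sketch of resolving VoiceDev settings against the documented defaults.
// Function and type names are hypothetical; the keys, allowed values, and
// defaults are taken from the configuration table.
type SttProvider = "groq" | "mistral" | "openai" | "local";
type LlmProvider = "openrouter" | "openai";

interface VoiceDevSettings {
  sttProvider: SttProvider;
  llmProvider: LlmProvider;
  llmModel: string;
  feedbackSounds: boolean;
}

const STT_PROVIDERS: SttProvider[] = ["groq", "mistral", "openai", "local"];
const LLM_PROVIDERS: LlmProvider[] = ["openrouter", "openai"];
const DEFAULT_MODEL = "anthropic/claude-3-haiku-20240307";

// Returns the value if it is one of the allowed options, else the fallback.
function pick<T extends string>(value: unknown, allowed: T[], fallback: T): T {
  for (const option of allowed) {
    if (option === value) return option;
  }
  return fallback;
}

function resolveSettings(raw: Record<string, unknown>): VoiceDevSettings {
  const model = raw["voicedev.llm.model"];
  const sounds = raw["voicedev.audio.feedbackSounds"];
  return {
    sttProvider: pick(raw["voicedev.stt.provider"], STT_PROVIDERS, "groq"),
    llmProvider: pick(raw["voicedev.llm.provider"], LLM_PROVIDERS, "openrouter"),
    llmModel: typeof model === "string" && model.length > 0 ? model : DEFAULT_MODEL,
    feedbackSounds: typeof sounds === "boolean" ? sounds : true,
  };
}
```

Falling back to the default on an unrecognized value is a choice made for this sketch; an extension could equally surface a validation error instead.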

How to Apply Configuration

Open VS Code Settings:
- Windows/Linux: Ctrl + ,
- macOS: Cmd + ,
Click the "Open Settings (JSON)" icon in the top-right corner
Add or modify the VoiceDev configuration section
Save the file - changes take effect immediately

Provider-Specific Settings

Groq Provider

{
	"voicedev.stt.provider": "groq"
}

Note: Use the VoiceDev: Set API Key command to securely store your Groq API key.

Mistral Provider

{
	"voicedev.stt.provider": "mistral"
}

Note: Use the VoiceDev: Set API Key command to securely store your Mistral API key. Powered by Mistral's Voxtral-mini model for fast, accurate transcription.

Local Provider

{
	"voicedev.stt.provider": "local",
	"voicedev.stt.local.pythonPath": "path-to-python-executable"
}

OpenAI Provider

{
	"voicedev.stt.provider": "openai"
}

Note: Use the VoiceDev: Set API Key command to securely store your OpenAI API key. The OpenAI provider integration is planned for a future release.

For Developers

Interested in contributing to VoiceDev? Check out our QUICKSTART.md guide for detailed setup instructions, development workflow, and information about working with the local/offline speech-to-text provider.

Reporting Issues

We welcome bug reports and feature requests from both technical contributors and non-technical users! When creating a new issue, you'll be guided through our structured templates:

🐛 Bug Report: For reporting bugs and unexpected behavior
🚀 Feature Request: For suggesting new features and improvements
💬 General Issue: For questions, discussions, or other topics

Tips for effective issue reporting:

Search existing issues before creating a new one
Provide clear, step-by-step reproduction instructions for bugs
Include version information (VoiceDev, VS Code, OS)
Add screenshots or recordings when helpful
Be specific about expected vs actual behavior

Our templates include sections for both non-technical users (basic information) and technical contributors (detailed logs, technical insights).

Troubleshooting

Local Provider Issues

"Python 3.9+ is required to use Local STT"

Ensure Python 3.9-3.12 is installed and in your system PATH
Run python --version in terminal to verify
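
If you want to check the version string by hand, the supported range from the requirements (3.9 through 3.12) can be expressed as a short predicate. `isSupportedPython` is a hypothetical helper written for this README, not part of VoiceDev; it takes the text printed by `python --version`.

```typescript
// Hypothetical helper: parses the output of `python --version` and checks it
// against the 3.9-3.12 range stated in the requirements.
function isSupportedPython(versionOutput: string): boolean {
  const match = /Python\s+(\d+)\.(\d+)/.exec(versionOutput);
  if (match === null) return false; // not a recognizable version string
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return major === 3 && minor >= 9 && minor <= 12;
}
```

Note that versions newer than 3.12 are rejected as well as older ones, which matches the stated range rather than the "3.9+" wording of the error message.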

"ffmpeg not found in PATH"

Install ffmpeg from ffmpeg.org
On Windows: choco install ffmpeg (with Chocolatey)
On macOS: brew install ffmpeg
Verify with ffmpeg -version

"Setup failed" or timeout during first run

Check your internet connection (downloads ~150MB model)
Run VoiceDev: Clear API Key and try again
Ensure enough disk space (~200MB total)

Cloud Provider Issues

"Invalid API key" or transcription fails

Verify your API key in your provider's console (console.groq.com for Groq, console.mistral.ai for Mistral)
Run VoiceDev: Set API Key to update
Check your internet connection

Known Limitations

Local Provider

Performance: ~3-5x slower than cloud providers (runs on CPU)
First run: One-time setup takes ~2 minutes + ~150MB download
Best for: Short commands and dictation (under 30 seconds)
Accuracy: Slightly lower with CPU int8 quantization vs full GPU models
Platform: Windows fully supported; macOS/Linux support in progress

All Providers

Maximum recording length: 30 seconds (auto-stop)
Requires clear audio input for best accuracy
Background noise may affect transcription quality
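
The 30-second cap can be thought of as a clamp on elapsed recording time. The sketch below is illustrative only: `recordingStatus`, the `RecordingStatus` shape, and the "0:07" label format are assumptions, and only the 30-second limit comes from this section.

```typescript
// Hypothetical recording clock. Only the 30-second cap comes from the README;
// the function name and label format are assumptions.
const MAX_RECORDING_MS = 30000;

interface RecordingStatus {
  elapsedLabel: string; // for example "0:07", as a status bar timer might show
  shouldStop: boolean;
}

function recordingStatus(startedAtMs: number, nowMs: number): RecordingStatus {
  // Clamp elapsed time to the range [0, MAX_RECORDING_MS].
  const elapsedMs = Math.min(Math.max(nowMs - startedAtMs, 0), MAX_RECORDING_MS);
  const seconds = Math.floor(elapsedMs / 1000);
  const elapsedLabel = "0:" + (seconds < 10 ? "0" : "") + seconds;
  return { elapsedLabel, shouldStop: elapsedMs >= MAX_RECORDING_MS };
}
```

Passing the timestamps in as arguments, rather than reading the clock inside the function, keeps the rule easy to test without waiting 30 seconds.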

Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

License

BSD 3-Clause License - See LICENSE file for details.

Voice-native development — read the vision 🎙️

VoiceDev (Preview)

Mohit Sharma
