VoiceAssistant enables voice-controlled interactions inside VS Code using a webview-based speech recognizer (Web Speech API) with optional Google Cloud Speech fallback.
## Features

- Real-time speech transcription using the Web Speech API inside a secure webview
- Git integration (uses the built-in Git API, falling back to terminal `git`)
- Simple UI to start/stop listening and view transcripts
## Getting started

1. `npm install`
2. `npm run compile`
3. Press `F5` to run the extension in a new Extension Development Host window.
## Packaging

1. Install `vsce` globally if you don't have it (it is also listed in `devDependencies`): `npm i -g vsce`
2. Run `npm run package` to create a `.vsix` file for distribution.
## Google Cloud Speech (optional, recommended for multi-language)

To enable robust multi-language recognition, set `voiceAssistant.useGoogleSpeech` to `true` and point `voiceAssistant.googleCredentialFile` at your Google Cloud service-account JSON key (with Speech-to-Text permissions).

Configure `voiceAssistant.recognitionLanguages` with an array of IETF language tags the assistant should consider (default: `["en-US","hi-IN"]`). The extension sends recorded audio to Google Speech-to-Text, which uses the provided languages to detect and transcribe speech automatically.

Note: Google STT requires credentials and network access. The extension ships with a local integration built on `@google-cloud/speech` to simplify usage.
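A possible `settings.json` fragment enabling this mode (the credential path is a placeholder; the setting names come from the description above):

```json
{
  "voiceAssistant.useGoogleSpeech": true,
  "voiceAssistant.googleCredentialFile": "/path/to/service-account.json",
  "voiceAssistant.recognitionLanguages": ["en-US", "hi-IN"]
}
```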
## OpenAI integration (code generation)

To enable OpenAI as a generation provider, set `voiceAssistant.useOpenAI` to `true`. Supply your OpenAI API key either via the `voiceAssistant.openaiApiKey` setting or the `OPENAI_API_KEY` environment variable.

Once enabled, voice prompts like "Create a button that does X" or "Generate a component for login" will use OpenAI (if available) to produce code suggestions. Generated files are shown in a preview, and you are prompted to Apply and optionally Commit & Push the changes.
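The key lookup order (setting first, then environment variable) can be sketched as follows; the helper name and types here are illustrative, not the extension's actual code:

```typescript
// Hypothetical helper: the explicit setting takes precedence over the
// OPENAI_API_KEY environment variable; blank values are treated as unset.
function resolveOpenAiKey(
  settingValue: string | undefined,
  env: Record<string, string | undefined>
): string | undefined {
  const fromSetting = settingValue?.trim();
  if (fromSetting) {
    return fromSetting;
  }
  const fromEnv = env["OPENAI_API_KEY"]?.trim();
  return fromEnv || undefined;
}
```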
If you want fully automated operation, the following settings control auto behavior:

- `voiceAssistant.allowExternalCodeUpload` (default: `false`): must be enabled to allow sending prompts and code to external AI providers.
- `voiceAssistant.autoApplyGeneratedCode` (default: `false`): if enabled, generated code is auto-applied when the number of generated files is at most `voiceAssistant.autoApplyMaxFiles`.
- `voiceAssistant.autoApplyMaxFiles` (default: `5`): maximum number of files allowed for auto-apply.
- `voiceAssistant.autoCommitAndPush` (default: `false`): when enabled and auto-apply runs, commits and pushes are performed automatically.
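How these settings interact can be sketched as follows (the helper and type names are hypothetical; only the setting semantics come from the list above):

```typescript
// Hypothetical sketch of the auto-apply decision described above.
interface AutoBehaviorSettings {
  autoApplyGeneratedCode: boolean; // voiceAssistant.autoApplyGeneratedCode
  autoApplyMaxFiles: number;       // voiceAssistant.autoApplyMaxFiles
  autoCommitAndPush: boolean;      // voiceAssistant.autoCommitAndPush
}

type Action = "preview" | "auto-apply" | "auto-apply-and-push";

function decideAction(s: AutoBehaviorSettings, generatedFileCount: number): Action {
  // Fall back to the interactive preview unless auto-apply is enabled
  // and the generated change set is small enough.
  if (!s.autoApplyGeneratedCode || generatedFileCount > s.autoApplyMaxFiles) {
    return "preview";
  }
  return s.autoCommitAndPush ? "auto-apply-and-push" : "auto-apply";
}
```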
Safety note: fully automatic modes can modify many files and push changes without review. Keep `allowExternalCodeUpload` disabled if you do not want code or prompts sent externally, and enable `autoApplyGeneratedCode` only when you trust the provider and the prompt context. Sending code and prompts to external services may expose sensitive content; use include/exclude rules to limit what is uploaded.
## Notes & Limitations

- The Web Speech API requires the webview environment to support it (it should be available in VS Code stable's webviews). If it is unavailable, use the Google Cloud fallback (not implemented client-side by default).
- Intent parsing is rule-based and meant for an MVP. For advanced natural-language understanding, integrate an NLU provider.
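A rule-based parser of the kind described can be sketched like this; the patterns and intent names are illustrative, not the extension's actual rules:

```typescript
// Minimal rule-based intent parsing: try each regex in order and
// return the first match. Rules here are examples only.
interface Intent {
  name: string;
  target?: string;
}

const RULES: Array<{ pattern: RegExp; name: string }> = [
  { pattern: /^(?:create|generate) (?:a |an )?(.+)$/i, name: "generate" },
  { pattern: /^commit (?:and push )?(.*)$/i, name: "commit" },
  { pattern: /^stop listening$/i, name: "stop" },
];

function parseIntent(transcript: string): Intent | undefined {
  const text = transcript.trim();
  for (const rule of RULES) {
    const m = rule.pattern.exec(text);
    if (m) {
      return { name: rule.name, target: m[1] || undefined };
    }
  }
  return undefined; // unrecognized utterance
}
```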
## Contributing

Open issues, suggest features, and send pull requests.