Speech To Text STT (Multi-Provider)
This extension provides a simple interface within VS Code to record audio from your microphone and transcribe it to text using various Speech-to-Text providers.
Features
- Record Audio: Capture audio directly from your selected microphone.
- Multi-Provider Support: Choose between:
- ElevenLabs
- OpenAI (Whisper)
- Groq (Whisper models)
- Transcription: Convert recorded audio to text using your chosen provider.
- History: View a history of your transcriptions within the VS Code sidebar.
- Actions:
- Copy transcription text to clipboard.
- Clear transcription history.
- (Optional) Automatically copy transcription to clipboard.
- (Optional) Automatically insert transcription into the active editor.
- Status Bar Integration: Start/stop recording and see the current status directly from the status bar.
- Output Channel: View detailed logs and messages in the "Speech To Text STT" output channel.
Configuration
Configure the extension via VS Code settings (File > Preferences > Settings, or Cmd+,
/ Ctrl+,
). Search for "Speech To Text STT".
Required:
Select Provider:
speech-to-text-stt.transcriptionProvider
: Choose your desired transcription service (elevenlabs
, openai
, or groq
). Default: elevenlabs
.
Configure Selected Provider: You only need to configure the provider you selected above.
- ElevenLabs:
speech-to-text-stt.elevenlabs.apiKey
: Your ElevenLabs API key.
- (Optional)
speech-to-text-stt.elevenlabs.modelId
: Model ID (default: eleven_multilingual_v2
).
- (Optional)
speech-to-text-stt.elevenlabs.languageCode
: BCP-47 language code (e.g., en
, es
). Leave empty for auto-detect.
- (Optional)
speech-to-text-stt.elevenlabs.numSpeakers
: Number of speakers hint (1-32).
- OpenAI:
speech-to-text-stt.openai.apiKey
: Your OpenAI API key.
- (Optional)
speech-to-text-stt.openai.modelId
: Model ID (default: whisper-1
).
- (Optional)
speech-to-text-stt.openai.language
: ISO-639-1 language code (e.g., en
, es
).
- (Optional)
speech-to-text-stt.openai.prompt
: Text to guide the model.
- (Optional)
speech-to-text-stt.openai.temperature
: Sampling temperature (0-1).
- Groq:
speech-to-text-stt.groq.apiKey
: Your Groq API key.
- (Optional)
speech-to-text-stt.groq.modelId
: Model ID (default: whisper-large-v3-turbo
).
- (Optional)
speech-to-text-stt.groq.language
: ISO-639-1 language code.
- (Optional)
speech-to-text-stt.groq.prompt
: Text to guide the model.
- (Optional)
speech-to-text-stt.groq.temperature
: Sampling temperature (0-1, default 0).
General Options:
speech-to-text-stt.copyToClipboardAfterTranscription
: Automatically copy transcription to clipboard (default: true
).
speech-to-text-stt.insertIntoEditorAfterTranscription
: Automatically insert transcription into the active editor (default: false
).
Usage
- Configure: Set your desired provider and its API key in the VS Code settings.
- Open the View: Click the "Speech To Text STT" icon in the activity bar (default icon may vary).
- Select Microphone (Optional): Click "⚙️ Select Microphone..." in the view to choose a specific input device if needed.
- Start Recording: Click "▶️ Start Recording" in the view or the
$(record) STT: Idle
item in the status bar.
- Stop & Transcribe: Click "⏹️ Stop Recording & Transcribe" in the view or the
$(debug-pause) STT: Recording...
item in the status bar.
- View History: Transcriptions appear in the view's history section.
- Copy/Clear: Use the icons or context menus in the history view to copy text or clear the history.
Requirements
- FFmpeg is required for audio recording from your microphone. This extension uses FFmpeg to capture audio input.
Installing FFmpeg
The extension will automatically search for FFmpeg in common installation locations. If not found, you'll need to install it manually:
After installing, ensure FFmpeg is available in your system PATH. You can verify by running ffmpeg -version
in a terminal.
Note: If you install FFmpeg after launching VS Code, you may need to restart VS Code for the extension to detect it.
For detailed information about how this extension uses FFmpeg, detection methods, and advanced installation guidance, see the FFmpeg Documentation.
Contributing
Contributions, issues, and feature requests are welcome. Please check the GitHub repository.
License
MIT