# ECodeWhisper

Eager Code Whisper - a voice-to-text extension for VS Code and Cursor using faster-whisper, with TEN VAD for automatic silence detection.
## Features
- Real-time transcription: Stream audio to faster-whisper server via HTTP API
- Smart silence detection: TEN VAD automatically stops recording after configurable silence duration
- Low latency: Optimized Python backend with async I/O
- Local processing: All transcription happens on your machine - no cloud services
- Partial results: See transcription as you speak (configurable)
- Auto-insert: Automatically insert transcription at cursor position
## Project Structure

```
e-codewhisper/
├── src/
│   ├── client/
│   │   └── extension.ts        # TypeScript VS Code extension
│   └── server/
│       └── python/
│           └── codewhisper.py  # Python backend with TEN VAD
├── resources/
│   ├── compose.yml             # Docker compose for faster-whisper
│   └── install-stt-service.sh  # Systemd service installer
├── images/
│   └── codewhisper.svg         # Extension icon
├── out/                        # Compiled extension (generated)
├── package.json                # VS Code extension manifest
├── pyproject.toml              # Python dependencies (uv)
├── tsconfig.json               # TypeScript config
└── README.md                   # Documentation
```
## Requirements

### System Requirements

- OS: Linux (Ubuntu 22.04+ recommended)
- Python: 3.12 or higher
- Audio: ALSA (`arecord` command available)
- GPU: NVIDIA GPU with CUDA support (for the faster-whisper server)

### Dependencies

The Python backend depends on `numpy`, `httpx`, and `ten-vad` (see `pyproject.toml`); building the extension requires Node.js and npm.
## Installation

### 1. Install Python Dependencies

Using uv (recommended):

```bash
cd ~/.vscode/extensions/codewhisper-*  # or ~/.cursor/extensions/codewhisper-*
uv sync
```

Or using pip:

```bash
pip install numpy httpx
pip install git+https://github.com/TEN-framework/ten-vad.git
```
### 2. Install faster-whisper Server (Optional, Automated)

The extension includes an automated installer for the faster-whisper Docker service:

```bash
# Make the script executable
chmod +x resources/install-stt-service.sh

# Install with defaults (small model, English)
./resources/install-stt-service.sh

# Or customize the installation
./resources/install-stt-service.sh \
  --model large-v3 \
  --language es \
  --port 4445 \
  --device cuda
```

This will:

- Create `~/.automation/stt/compose.yml`
- Create the systemd service `automation-stt`
- Start the service automatically on boot
#### Manual Installation

If you prefer manual installation:

```bash
# Create the directory
mkdir -p ~/.automation/stt/models
cd ~/.automation/stt

# Copy the compose file
cp /path/to/codewhisper/resources/compose.yml .

# Start the service
docker compose up -d

# Check logs
docker logs -f codewhisper-stt
```
#### Systemd Service (Manual)

Create `/etc/systemd/system/automation-stt.service`:

```ini
[Unit]
Description=Faster Whisper for IDEs
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
User=YOUR_USERNAME
WorkingDirectory=/home/YOUR_USERNAME/.automation/stt
ExecStart=/usr/bin/docker compose up -d
ExecStop=/usr/bin/docker compose down

[Install]
WantedBy=multi-user.target
```

Then enable it:

```bash
sudo systemctl daemon-reload
sudo systemctl enable automation-stt
sudo systemctl start automation-stt
```
### 3. Verify Installation

```bash
# Check service health
curl http://localhost:4445/health

# List available models
curl http://localhost:4445/v1/models
```
## Usage

### Keyboard Shortcuts

| Action           | Windows/Linux      | macOS             |
|------------------|--------------------|-------------------|
| Toggle Recording | `Ctrl+Shift+Space` | `Cmd+Shift+Space` |
| Cancel Recording | `Escape`           | `Escape`          |
### Recording Flow

1. Press `Ctrl+Shift+Space` to start recording
2. Speak your text
3. Recording stops automatically when:
   - the VAD detects silence (configurable duration),
   - you press `Ctrl+Shift+Space` again, or
   - you press `Escape` to cancel
4. The transcription is inserted at the cursor position (or copied to the clipboard)
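The auto-stop behavior can be sketched as follows. This is an illustrative model, not the actual backend code; it assumes TEN VAD emits one speech probability per ~16 ms frame:

```python
# Sketch of the silence-based auto-stop logic (hypothetical helper).
# Recording stops once per-frame speech probabilities stay below
# vadThreshold for vadSilenceThreshold seconds in a row.
FRAME_SECONDS = 0.016  # assumed VAD hop size (~16 ms at 16 kHz)

def frames_until_autostop(probabilities, vad_threshold=0.5,
                          silence_threshold=1.5):
    """Return the number of frames consumed before auto-stop, or None."""
    silence = 0.0
    for i, p in enumerate(probabilities):
        # Reset the silence timer whenever a frame looks like speech.
        silence = silence + FRAME_SECONDS if p < vad_threshold else 0.0
        if silence >= silence_threshold:
            return i + 1
    return None  # silence threshold never reached; keep recording
```

Raising `vadSilenceThreshold` gives you longer pauses mid-sentence before the recording stops.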
### Status Bar
The status bar shows the current state:
- 🎤 CodeWhisper - Ready to record
- ⏺️ Recording... - Actively recording
- 🔄 Transcribing... - Processing audio
- 🔄 Connecting... - Connecting to server
## Configuration

Open the VS Code/Cursor settings and search for "CodeWhisper":

| Setting | Default | Description |
|---------|---------|-------------|
| `codewhisper.pythonPath` | `python3` | Path to a Python 3.12+ interpreter |
| `codewhisper.whisperEndpoint` | `http://localhost:4445/v1/audio/transcriptions` | HTTP API endpoint |
| `codewhisper.model` | `small` | Whisper model (`tiny`, `base`, `small`, `medium`, `large-v2`, `large-v3`) |
| `codewhisper.language` | `en` | Language code (`en`, `es`, `fr`, etc.) |
| `codewhisper.vadSilenceThreshold` | `1.5` | Seconds of silence before auto-stop (0.3-5.0) |
| `codewhisper.vadThreshold` | `0.5` | VAD probability threshold (0.1-0.9) |
| `codewhisper.autoInsert` | `true` | Auto-insert at the cursor position |
| `codewhisper.showPartialResults` | `true` | Show partial transcriptions |
### Example settings.json

```json
{
  "codewhisper.model": "large-v3",
  "codewhisper.language": "es",
  "codewhisper.vadSilenceThreshold": 2.0,
  "codewhisper.autoInsert": true
}
```
Architecture
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────────┐
│ VS Code/ │ │ Python Backend │ │ faster-whisper │
│ Cursor │────▶│ (codewhisper) │────▶│ Server (Docker) │
│ Extension │ │ + TEN VAD │ │ HTTP API │
└─────────────────┘ └──────────────────┘ └─────────────────────┘
│ │
│ spawn process │ audio capture
│ JSON messages │ VAD processing
└────────────────────────┘
## Components

### TypeScript Extension (`src/client/extension.ts`)

- VS Code integration
- Status bar management
- Configuration handling
- Spawns the Python process

### Python Backend (`src/server/python/codewhisper.py`)

- Audio capture via `arecord`
- TEN VAD for voice activity detection
- HTTP client for the faster-whisper server (OpenAI-compatible API)
- JSON message protocol with the extension
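The JSON message protocol between the backend and the extension can be sketched like this; the message types shown are illustrative, not the actual `codewhisper.py` schema:

```python
# Sketch: a line-delimited JSON protocol over stdout. Each message is one
# JSON object per line, which the extension parses as it arrives.
import json
import sys

def send(msg_type: str, **payload) -> str:
    """Serialize one protocol message as a single JSON line and emit it."""
    line = json.dumps({"type": msg_type, **payload})
    print(line, file=sys.stdout, flush=True)
    return line

# Messages the backend could emit (hypothetical types):
# send("status", state="recording")
# send("partial", text="hello wor")
# send("final", text="hello world")
```

Flushing after every line matters here: the extension reads the spawned process's stdout incrementally, so buffered output would delay partial results.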
### faster-whisper Server (Docker)

- GPU-accelerated transcription
- OpenAI-compatible API
## Troubleshooting

### "Cannot connect to faster-whisper server"

```bash
# Check whether the container is running
docker ps | grep codewhisper-stt

# Check logs
docker logs codewhisper-stt

# Restart the service
sudo systemctl restart automation-stt
```

### "ten-vad not installed"

```bash
pip install git+https://github.com/TEN-framework/ten-vad.git
```

### "arecord: command not found"

```bash
sudo apt install alsa-utils
```

### "No speech detected"

- Check microphone permissions
- Decrease `codewhisper.vadThreshold` if quiet speech is not being detected
- Increase `codewhisper.vadThreshold` if background noise triggers false detections
- Test the microphone: `arecord -d 3 test.wav && aplay test.wav`

### High latency

- Use a smaller model (`tiny` or `base`)
- Ensure the GPU is being used (check `WHISPER_DEVICE=cuda`)
- Check GPU memory with `nvidia-smi`
## Development

Use `make` for common tasks:

```bash
make help       # Show all available commands
make dev        # Install with dev dependencies
make build      # Compile the TypeScript extension
make test       # Run Python tests
make lint       # Run the ruff linter
make format     # Format code with ruff
make typecheck  # Run the ty type checker
make package    # Build the VSIX package
make clean      # Remove build artifacts
```
### Manual Commands

```bash
# Install Node.js dependencies
npm install

# Compile TypeScript
npm run compile

# Package the extension
npm run vsce:package
```

### Python Development

```bash
# Install with dev dependencies
uv sync --group dev

# Run tests
uv run pytest src/server/python/tests/ -v

# Type checking
uv run ty check src/server/python/

# Linting
uv run ruff check src/server/python/
```
## License

MIT License - see [LICENSE](LICENSE) for details.
## Credits