Cloud TTS for VSCode
Right-click selected text (in an editor or in the integrated terminal) → "Read Aloud" → audio plays on macOS, Windows, and Linux.
Starts free with your System Voice (no API key, offline). For human-sounding speech, plug in Gemini, OpenAI, or ElevenLabs — pick whichever you like the sound of best.
Voices
- System Voice (default) — free, no key, offline. Uses your OS built-in synthesizer: macOS say, Windows SAPI, Linux espeak-ng/espeak/spd-say. Robotic, but costs nothing and works everywhere.
The cloud providers below sound far more human:
- Gemini — natural prosody, supports natural-language style prompts ("Say sarcastically:") and audio tags ([whispers], [shouting])
  - gemini-2.5-flash-preview-tts (default) — faster, cheaper. ~$10/1M chars
  - gemini-2.5-pro-preview-tts — better prosody. ~$80/1M chars (≈8× flash)
  - gemini-3.1-flash-tts-preview — newest (Apr 2026), best controllability and expressivity, 100+ languages. ~$12/1M chars
- OpenAI — fast, cheap, decent quality
  - gpt-4o-mini-tts (default) — newer, supports instructions. ~$0.015/min audio
  - tts-1 — cheap and fast, no instructions. $15/1M chars
  - tts-1-hd — higher-quality variant of tts-1. $30/1M chars
- ElevenLabs — top-tier voice cloning quality (most expensive)
  - eleven_multilingual_v2 (default) — highest quality multilingual. ~$120/1M chars (API)
  - eleven_turbo_v2_5 — lower latency, slightly less expressive. ~$60/1M chars (API)
  - eleven_flash_v2_5 — lowest latency, cheapest. ~$60/1M chars (API)
Prices are approximate ballparks for rough comparison — see each provider's pricing page for exact rates and any free-tier quotas.
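To put the per-character prices in perspective, here is a minimal sketch that turns a selection length into a rough dollar figure. The rates are the ballpark numbers from the list above; the helper name estimateCostUsd is hypothetical and is not part of the extension. gpt-4o-mini-tts is left out because it is quoted per minute of audio rather than per character.

```typescript
// Ballpark prices copied from the list above, in USD per 1M characters.
// gpt-4o-mini-tts is omitted: it is priced per minute of audio, not per character.
const USD_PER_MILLION_CHARS: Record<string, number> = {
  "gemini-2.5-flash-preview-tts": 10,
  "gemini-2.5-pro-preview-tts": 80,
  "gemini-3.1-flash-tts-preview": 12,
  "tts-1": 15,
  "tts-1-hd": 30,
  "eleven_multilingual_v2": 120,
  "eleven_turbo_v2_5": 60,
  "eleven_flash_v2_5": 60,
};

// Hypothetical helper: rough cost of reading `text` aloud with `model`.
// Returns undefined for models that have no per-character rate in the table.
// Uses text.length (UTF-16 code units) as an approximation of billed characters.
function estimateCostUsd(model: string, text: string): number | undefined {
  const rate = USD_PER_MILLION_CHARS[model];
  if (rate === undefined) return undefined;
  return (text.length / 1e6) * rate;
}
```

Under these rates a 2,000-character selection comes to roughly $0.03 with tts-1 and roughly $0.24 with eleven_multilingual_v2.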
Install
cd ~/Documents/vsc-extensions/cloud-tts
npx --yes @vscode/vsce package --allow-missing-repository
code --install-extension cloud-tts-*.vsix # stable VSCode
code-insiders --install-extension cloud-tts-*.vsix # Insiders (if you use it)
Re-run these commands after editing the extension to update the installed copy, then reload any open windows: Cmd+Shift+P (Ctrl+Shift+P on Windows/Linux) → Developer: Reload Window.
Platform notes
- macOS / Windows — System Voice and audio playback work out of the box (say / built-in SAPI; afplay / MediaPlayer).
- Linux — System Voice needs espeak-ng (or espeak/spd-say); cloud-audio playback needs ffplay (ffmpeg) or mpg123. Install via your package manager, e.g. sudo apt install espeak-ng ffmpeg.
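The Linux requirement boils down to "the first tool from each list that is installed". A minimal sketch of that lookup follows; the tool names come from the notes above, while the preference order and the helper names (firstAvailable, missingToolHint) are assumptions for illustration, not the extension's actual code.

```typescript
// Tool names come from the platform notes above; the preference order is an assumption.
const LINUX_SYNTHESIZERS = ["espeak-ng", "espeak", "spd-say"];
const LINUX_PLAYERS = ["ffplay", "mpg123"];

// Hypothetical helper: return the first candidate that `isInstalled` accepts,
// or undefined when none of them is available.
function firstAvailable(
  candidates: readonly string[],
  isInstalled: (command: string) => boolean,
): string | undefined {
  for (const command of candidates) {
    if (isInstalled(command)) return command;
  }
  return undefined;
}

// Hypothetical helper: human-readable hint for a missing Linux dependency.
function missingToolHint(kind: string, candidates: readonly string[]): string {
  return `${kind} needs one of: ${candidates.join(", ")}`;
}
```

The isInstalled predicate is injected so the lookup stays easy to test; a real check might search the directories on PATH for an executable with that name.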
First-time setup
Nothing to configure — System Voice is the default and needs no key. To upgrade to a human-sounding cloud voice:
- Set an API key: Cmd+Shift+P → Cloud TTS: Set API Key… → pick provider → paste key. The input field is masked, and the key is stored in VSCode's encrypted SecretStorage; it is never written to settings.json and never synced. (System Voice is keyless, so it isn't listed here.)
- Pick a provider: Cmd+Shift+P → Cloud TTS: Switch Provider, or set cloudTts.provider in Settings.
- (Optional) tune voice/model: Cmd+Shift+P → Cloud TTS: Open Settings.
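For readers curious how key storage of this kind typically looks, here is a minimal sketch written against the shape of VS Code's SecretStorage (ExtensionContext.secrets exposes get, store, and delete). The key naming scheme and the helper names (secretKeyFor, saveApiKey, clearAllApiKeys) are hypothetical; the extension's real identifiers are not documented here.

```typescript
// The subset of VS Code's SecretStorage used below. ExtensionContext.secrets
// satisfies this shape, and a plain in-memory object works for tests.
interface SecretStore {
  get(key: string): PromiseLike<string | undefined>;
  store(key: string, value: string): PromiseLike<void>;
  delete(key: string): PromiseLike<void>;
}

type CloudProvider = "gemini" | "openai" | "elevenlabs";
const CLOUD_PROVIDERS: CloudProvider[] = ["gemini", "openai", "elevenlabs"];

// Hypothetical naming scheme for the stored secrets (an assumption).
function secretKeyFor(provider: CloudProvider): string {
  return `cloudTts.${provider}.apiKey`;
}

// Store a pasted key. Returns false and leaves any existing key untouched
// when the input is empty or whitespace only.
async function saveApiKey(
  secrets: SecretStore,
  provider: CloudProvider,
  rawKey: string,
): Promise<boolean> {
  const key = rawKey.trim();
  if (key === "") return false;
  await secrets.store(secretKeyFor(provider), key);
  return true;
}

// Remove the stored key of every cloud provider.
async function clearAllApiKeys(secrets: SecretStore): Promise<void> {
  await Promise.all(CLOUD_PROVIDERS.map((p) => secrets.delete(secretKeyFor(p))));
}
```

Because nothing here touches the settings API, a key saved this way never appears in settings.json, which matches the behaviour described above.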
Usage
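There are no default keybindings: every action runs via the right-click menu (for selections) or the Command Palette (Cmd+Shift+P, or Ctrl+Shift+P on Windows/Linux). Bind your own shortcuts in Code → Settings → Keyboard Shortcuts (File → Preferences → Keyboard Shortcuts on Windows/Linux) if you want them.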
| Action | How |
| --- | --- |
| Read selection (editor) | Right-click → Read Aloud |
| Read selection (terminal) | Right-click → Read Aloud |
| Stop playback | Cloud TTS: Stop Playback (Command Palette) |
| Quick switch provider | Cloud TTS: Switch Provider |
| Set / change API key | Cloud TTS: Set API Key… (masked input) |
| Delete all stored keys | Cloud TTS: Clear All API Keys |
| Open settings | Cloud TTS: Open Settings |
| Cancel during synthesis | Cancel button on the progress notification |
Terminal selection is read via a brief clipboard round-trip (the only way without an official API). The clipboard is restored to its previous content immediately after.
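The round-trip can be sketched as save, copy, read, restore. The version below is an illustration under stated assumptions, not the extension's actual code: the Clipboard interface mirrors the readText/writeText pair of vscode.env.clipboard, and copySelection stands in for whatever copies the terminal selection (in VS Code, for example, running the built-in workbench.action.terminal.copySelection command). The function name readTerminalSelection is hypothetical.

```typescript
// Same shape as vscode.env.clipboard (text only).
interface Clipboard {
  readText(): PromiseLike<string>;
  writeText(value: string): PromiseLike<void>;
}

// Hypothetical sketch of a clipboard round-trip.
async function readTerminalSelection(
  clipboard: Clipboard,
  copySelection: () => PromiseLike<void>,
): Promise<string> {
  const previous = await clipboard.readText(); // remember the user's clipboard text
  try {
    await clipboard.writeText(""); // so an empty selection reads back as ""
    await copySelection();         // terminal selection -> clipboard
    return await clipboard.readText();
  } finally {
    await clipboard.writeText(previous); // restore even if a step above throws
  }
}
```

One caveat of any text-based approach like this: only textual clipboard content can be saved and restored.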
Settings reference
Top-level
| Setting | Default | Notes |
| --- | --- | --- |
| cloudTts.provider | gemini | Active provider |
API keys are not in Settings — use Cloud TTS: Set API Key… instead (encrypted).
Gemini
| Setting | Default | Notes |
| --- | --- | --- |
| cloudTts.gemini.model | gemini-2.5-flash-preview-tts | Or gemini-2.5-pro-preview-tts for better prosody, gemini-3.1-flash-tts-preview for the newest (Apr 2026) |
| cloudTts.gemini.voice | Kore | 30 voices: Zephyr, Puck, Charon, Kore, Fenrir, Leda, Orus, Aoede, … |
| cloudTts.gemini.stylePrompt | (empty) | Prepended to text. e.g. "Say in a calm, neutral tone:" |
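As a concrete reading of "prepended to text", the sketch below shows how a style prompt and a selection might be combined before synthesis. The helper name buildGeminiInput and the single-space separator are assumptions; the extension may join the two differently.

```typescript
// Hypothetical helper mirroring the description of cloudTts.gemini.stylePrompt:
// when a prompt is set, it goes in front of the selected text.
function buildGeminiInput(stylePrompt: string, text: string): string {
  const prompt = stylePrompt.trim();
  // An empty (or whitespace-only) prompt leaves the text unchanged.
  return prompt === "" ? text : `${prompt} ${text}`;
}
```

With the example prompt from the table, the selection "Build passed." would be sent as "Say in a calm, neutral tone: Build passed."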
OpenAI
| Setting | Default | Notes |
| --- | --- | --- |
| cloudTts.openai.model | gpt-4o-mini-tts | Or tts-1, tts-1-hd |
| cloudTts.openai.voice | nova | alloy, ash, ballad, cedar, coral, echo, fable, marin, onyx, nova, sage, shimmer, verse |
| cloudTts.openai.instructions | (empty) | gpt-4o-mini-tts only. e.g. "Speak cheerfully." |
ash / ballad / cedar / coral / marin / sage / verse only work with gpt-4o-mini-tts.
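The voice/model restriction can be expressed as a small lookup. This is a sketch built only from the voice list and the note above; the function name isOpenAiVoiceSupported is hypothetical, and the extension may validate differently or not at all.

```typescript
// All voices listed for cloudTts.openai.voice.
const OPENAI_VOICES: readonly string[] = [
  "alloy", "ash", "ballad", "cedar", "coral", "echo", "fable",
  "marin", "onyx", "nova", "sage", "shimmer", "verse",
];

// Voices the note above lists as working only with gpt-4o-mini-tts.
const MINI_TTS_ONLY_VOICES: readonly string[] = [
  "ash", "ballad", "cedar", "coral", "marin", "sage", "verse",
];

// Hypothetical check: is `voice` usable with `model` according to the note above?
function isOpenAiVoiceSupported(model: string, voice: string): boolean {
  if (OPENAI_VOICES.indexOf(voice) === -1) return false; // unknown voice
  if (model === "gpt-4o-mini-tts") return true;          // accepts every listed voice
  return MINI_TTS_ONLY_VOICES.indexOf(voice) === -1;     // tts-1 / tts-1-hd
}
```

For example, nova passes for tts-1, while cedar passes only when the model is gpt-4o-mini-tts.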
ElevenLabs
| Setting | Default | Notes |
| --- | --- | --- |
| cloudTts.elevenlabs.model | eleven_multilingual_v2 | Or eleven_turbo_v2_5, eleven_flash_v2_5 |
| cloudTts.elevenlabs.voiceId | 21m00Tcm4TlvDq8ikWAM (Rachel) | Browse https://elevenlabs.io/app/voice-library for IDs |
Limitations
- Linux needs a TTS engine / audio player installed (see Platform notes); macOS and Windows work out of the box.
- No streaming: cloud audio plays only after the full clip has been synthesized, so long selections take a few seconds before audio starts. (System Voice starts speaking immediately.)
- Terminal selection requires a clipboard round-trip (clipboard is restored after).
Releasing (maintainer)
.github/workflows/publish.yml auto-publishes to both the VS Code Marketplace and the Open VSX Registry whenever package.json#version changes on main. To cut a release:
- Bump version in package.json (update README if needed).
- Commit + push to main (or merge a PR).
- Watch the run at Actions → Publish to Marketplace & Open VSX.
The two registries publish independently — if one fails, the other still goes out.
Pushes that don't change version are a silent no-op, so it is safe to edit README, code, etc. on main without triggering a publish.
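The publish rule is "act only when the version field differs". The contents of publish.yml are not shown here, so the following is only an illustration of that rule, with a hypothetical function name (versionChanged) that compares two package.json snapshots, for instance the one before and the one after a push.

```typescript
// Hypothetical check: true only when the new package.json carries a string
// version that differs from the previous snapshot's version.
function versionChanged(previousPackageJson: string, currentPackageJson: string): boolean {
  const previous: unknown = JSON.parse(previousPackageJson).version;
  const current: unknown = JSON.parse(currentPackageJson).version;
  return typeof current === "string" && current !== previous;
}
```

A README-only push leaves both snapshots with the same version, so the check returns false and nothing is published.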
One-time setup: VSCE_PAT secret
The workflow needs an Azure DevOps PAT stored as a GitHub repo secret.
1. Create the PAT at https://dev.azure.com → user icon → Personal access tokens → New Token.
   - Organization: All accessible organizations
   - Scopes: Custom defined → check Marketplace: Manage
2. Add it to GitHub: Repo Settings → Secrets and variables → Actions → New repository secret.
   - Name: VSCE_PAT
   - Value: the token from step 1
3. (Optional) verify locally: VSCE_PAT=<token> npx @vscode/vsce verify-pat geryit
One-time setup: OVSX_PAT secret
The workflow also needs an Open VSX access token.
1. Sign the Open VSX Publisher Agreement (one-time), then create a token at https://open-vsx.org/user-settings/tokens.
2. Add it to GitHub: Repo Settings → Secrets and variables → Actions → New repository secret.
   - Name: OVSX_PAT
   - Value: the token from step 1
3. (Optional) publish locally: OVSX_PAT=<token> npx ovsx publish
Verified-publisher warning: Open VSX shows a "not a verified publisher of the namespace" notice until the geryit namespace is owned. Request ownership via an issue at EclipseFdn/open-vsx.org (one-time, after signing the agreement).