Cloud TTS (Gemini / OpenAI / ElevenLabs)

geryit | 70 installs | (2) | Free
Right-click selected text (editor or terminal) to read it aloud using a cloud TTS provider of your choice.

Cloud TTS for VSCode

Right-click selected text (in an editor or in the integrated terminal) → "Read Aloud" → audio plays on macOS, Windows, and Linux.

Starts free with your System Voice (no API key, offline). For human-sounding speech, plug in Gemini, OpenAI, or ElevenLabs — pick whichever you like the sound of best.

Voices

  • System Voice (default) — free, no key, offline. Uses your OS built-in synthesizer: macOS say, Windows SAPI, Linux espeak-ng/espeak/spd-say. Robotic, but costs nothing and works everywhere.

The cloud providers below produce more natural, human-sounding speech than the OS voices:

  • Gemini — natural prosody, supports natural-language style prompts ("Say sarcastically:") and audio tags ([whispers], [shouting])
    • gemini-2.5-flash-preview-tts (default) — faster, cheaper. ~$10/1M chars
    • gemini-2.5-pro-preview-tts — better prosody. ~$80/1M chars (≈8× flash)
    • gemini-3.1-flash-tts-preview — newest (Apr 2026), best controllability and expressivity, 100+ languages. ~$12/1M chars
  • OpenAI — fast, cheap, decent quality
    • gpt-4o-mini-tts (default) — newer, supports instructions. ~$0.015/min audio
    • tts-1 — cheap and fast, no instructions. $15/1M chars
    • tts-1-hd — higher-quality variant of tts-1. $30/1M chars
  • ElevenLabs — top-tier voice cloning quality (most expensive)
    • eleven_multilingual_v2 (default) — highest quality multilingual. ~$120/1M chars (API)
    • eleven_turbo_v2_5 — lower latency, slightly less expressive. ~$60/1M chars (API)
    • eleven_flash_v2_5 — lowest latency, cheapest. ~$60/1M chars (API)

Prices are approximate ballparks for rough comparison — see each provider's pricing page for exact rates and any free-tier quotas.
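
To make the ballpark figures concrete, the arithmetic can be sketched as follows. The rates are copied from the list above; the names RATES_PER_MILLION_CHARS and estimateCostUsd are illustrative and not part of the extension. gpt-4o-mini-tts is left out because it is priced per minute of audio rather than per character.

```typescript
// Approximate USD per 1M characters, copied from the list above.
const RATES_PER_MILLION_CHARS: Record<string, number> = {
  "gemini-2.5-flash-preview-tts": 10,
  "gemini-2.5-pro-preview-tts": 80,
  "gemini-3.1-flash-tts-preview": 12,
  "tts-1": 15,
  "tts-1-hd": 30,
  "eleven_multilingual_v2": 120,
  "eleven_turbo_v2_5": 60,
  "eleven_flash_v2_5": 60,
};

// Rough cost of synthesizing `charCount` characters with `model`.
function estimateCostUsd(model: string, charCount: number): number {
  const rate = RATES_PER_MILLION_CHARS[model];
  if (rate === undefined) {
    throw new Error(`No per-character rate listed for model: ${model}`);
  }
  return (charCount * rate) / 1_000_000;
}

// A 2,000-character selection:
console.log(estimateCostUsd("gemini-2.5-flash-preview-tts", 2000)); // 0.02
console.log(estimateCostUsd("eleven_multilingual_v2", 2000)); // 0.24
```

At these rates a typical selection costs a few cents or less, so the choice between providers is mostly about how they sound.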

Install

cd ~/Documents/vsc-extensions/cloud-tts                      # your local copy of the extension source (adjust the path)
npx --yes @vscode/vsce package --allow-missing-repository    # builds a cloud-tts-<version>.vsix in this folder
code           --install-extension cloud-tts-*.vsix   # stable VSCode
code-insiders  --install-extension cloud-tts-*.vsix   # Insiders (if you use it)

Re-run these commands after editing the source to update the installed extension, then reload open windows with Cmd+Shift+P (Ctrl+Shift+P on Windows/Linux) → Developer: Reload Window.

Platform notes

  • macOS / Windows — System Voice and audio playback work out of the box (say / built-in SAPI; afplay / MediaPlayer).
  • Linux — System Voice needs espeak-ng (or espeak/spd-say); cloud-audio playback needs ffplay (ffmpeg) or mpg123. Install via your package manager, e.g. sudo apt install espeak-ng ffmpeg.
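
As a rough illustration of the Linux requirements, the sketch below picks the first available tool from each candidate list and reports what is missing. The candidate order simply follows the notes above; the function names and the isInstalled callback are hypothetical, and the extension's real lookup logic may differ.

```typescript
// Linux candidates, in the order the platform notes list them.
const LINUX_SYSTEM_VOICE = ["espeak-ng", "espeak", "spd-say"];
const LINUX_AUDIO_PLAYERS = ["ffplay", "mpg123"];

// Returns the first candidate that `isInstalled` reports as available.
function pickCommand(
  candidates: string[],
  isInstalled: (cmd: string) => boolean,
): string | undefined {
  for (const cmd of candidates) {
    if (isInstalled(cmd)) return cmd;
  }
  return undefined;
}

// Lists which capabilities have no usable tool installed.
function missingLinuxTools(isInstalled: (cmd: string) => boolean): string[] {
  const missing: string[] = [];
  if (pickCommand(LINUX_SYSTEM_VOICE, isInstalled) === undefined) {
    missing.push("System Voice: install espeak-ng (or espeak / spd-say)");
  }
  if (pickCommand(LINUX_AUDIO_PLAYERS, isInstalled) === undefined) {
    missing.push("Cloud audio playback: install ffmpeg (for ffplay) or mpg123");
  }
  return missing;
}

// Example: a machine with only espeak and mpg123 installed.
const installedTools = new Set(["espeak", "mpg123"]);
const hasTool = (cmd: string) => installedTools.has(cmd);
console.log(pickCommand(LINUX_SYSTEM_VOICE, hasTool)); // espeak
console.log(pickCommand(LINUX_AUDIO_PLAYERS, hasTool)); // mpg123
console.log(missingLinuxTools(hasTool).length); // 0
```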

First-time setup

Nothing to configure — System Voice is the default and needs no key. To upgrade to a human-sounding cloud voice:

  1. Set an API key: Cmd+Shift+P → Cloud TTS: Set API Key… → pick provider → paste key. The input field is masked, and the key is stored in VSCode's encrypted SecretStorage — never written to settings.json, never synced. (System Voice is keyless, so it isn't listed here.)
  2. Pick a provider: Cmd+Shift+P → Cloud TTS: Switch Provider, or set cloudTts.provider in Settings.
  3. (Optional) tune voice/model: Cmd+Shift+P → Cloud TTS: Open Settings.

| Provider | Get a key at |
|---|---|
| Gemini | https://aistudio.google.com/apikey |
| OpenAI | https://platform.openai.com/api-keys |
| ElevenLabs | https://elevenlabs.io/app/settings/api-keys |
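
For readers curious how per-provider keys can live in SecretStorage, here is a minimal sketch. It is not the extension's actual code: the SecretStore interface is a structural stand-in for VS Code's context.secrets (which exposes get, store, and delete), and the key naming scheme, provider identifiers, and function names are assumptions made for illustration.

```typescript
// Structural stand-in for VS Code's SecretStorage (context.secrets).
interface SecretStore {
  get(key: string): PromiseLike<string | undefined>;
  store(key: string, value: string): PromiseLike<void>;
  delete(key: string): PromiseLike<void>;
}

// Assumed provider identifiers, mirroring the cloudTts.* setting names.
type CloudProvider = "gemini" | "openai" | "elevenlabs";
const CLOUD_PROVIDERS: CloudProvider[] = ["gemini", "openai", "elevenlabs"];

// Hypothetical naming scheme; the extension's real secret names are not documented.
function secretKeyFor(provider: CloudProvider): string {
  return `cloudTts.apiKey.${provider}`;
}

async function setApiKey(
  secrets: SecretStore,
  provider: CloudProvider,
  key: string,
): Promise<void> {
  const trimmed = key.trim();
  if (trimmed === "") throw new Error("API key must not be empty");
  await secrets.store(secretKeyFor(provider), trimmed);
}

async function getApiKey(
  secrets: SecretStore,
  provider: CloudProvider,
): Promise<string | undefined> {
  return secrets.get(secretKeyFor(provider));
}

// Counterpart of "Clear All API Keys": remove every provider's secret.
async function clearAllApiKeys(secrets: SecretStore): Promise<void> {
  await Promise.all(CLOUD_PROVIDERS.map((p) => secrets.delete(secretKeyFor(p))));
}

// In-memory fake so the sketch runs outside VS Code.
function createMemoryStore(): SecretStore {
  const data = new Map<string, string>();
  return {
    get: async (k) => data.get(k),
    store: async (k, v) => { data.set(k, v); },
    delete: async (k) => { data.delete(k); },
  };
}
```

Inside an extension, context.secrets could be passed as the store, and a masked prompt can be shown with vscode.window.showInputBox({ password: true }).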

Usage

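There are no default keybindings: every action runs via the right-click menu (for selections) or the Command Palette (Cmd+Shift+P, or Ctrl+Shift+P on Windows/Linux). To bind your own shortcuts, open Code → Settings → Keyboard Shortcuts on macOS, or File → Preferences → Keyboard Shortcuts on Windows/Linux.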

| Action | How |
|---|---|
| Read selection (editor) | Right-click → Read Aloud |
| Read selection (terminal) | Right-click → Read Aloud |
| Stop playback | Cloud TTS: Stop Playback (Command Palette) |
| Quick switch provider | Cloud TTS: Switch Provider |
| Set / change API key | Cloud TTS: Set API Key… (masked input) |
| Delete all stored keys | Cloud TTS: Clear All API Keys |
| Open settings | Cloud TTS: Open Settings |
| Cancel during synthesis | Cancel button on the progress notification |

VS Code has no official API for reading the terminal selection, so it is read via a brief clipboard round-trip: the selection is copied to the clipboard, read back, and the clipboard is restored to its previous content immediately afterward.
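
The round-trip can be sketched like this. It is an assumption about the general technique, not the extension's source: ClipboardLike mirrors the readText / writeText shape of vscode.env.clipboard, and copySelection stands for whatever copies the terminal selection (in VS Code, plausibly the built-in workbench.action.terminal.copySelection command).

```typescript
// Structural stand-in for vscode.env.clipboard.
interface ClipboardLike {
  readText(): PromiseLike<string>;
  writeText(value: string): PromiseLike<void>;
}

// Reads the terminal selection by borrowing the clipboard, then puts
// the user's previous clipboard content back.
async function readTerminalSelection(
  clipboard: ClipboardLike,
  copySelection: () => PromiseLike<void>,
): Promise<string> {
  const previous = await clipboard.readText(); // remember the user's clipboard
  try {
    await copySelection(); // clipboard now holds the terminal selection
    return await clipboard.readText();
  } finally {
    await clipboard.writeText(previous); // restore even if copying failed
  }
}
```

The try / finally matters: if the copy step throws, the user's clipboard is still restored.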

Settings reference

Top-level

| Setting | Default | Notes |
|---|---|---|
| cloudTts.provider | gemini | Active provider |

API keys are not in Settings — use Cloud TTS: Set API Key… instead (encrypted).

Gemini

| Setting | Default | Notes |
|---|---|---|
| cloudTts.gemini.model | gemini-2.5-flash-preview-tts | Or gemini-2.5-pro-preview-tts for better prosody, gemini-3.1-flash-tts-preview for the newest (Apr 2026) |
| cloudTts.gemini.voice | Kore | 30 voices: Zephyr, Puck, Charon, Kore, Fenrir, Leda, Orus, Aoede, … |
| cloudTts.gemini.stylePrompt | (empty) | Prepended to text. e.g. "Say in a calm, neutral tone:" |
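
To show what "prepended to text" amounts to, here is a small sketch. It assumes the style prompt and the selected text are joined with a single space; the extension's exact separator is not documented, and applyStylePrompt is a hypothetical name.

```typescript
// Assumption: prompt and text are joined with one space.
// An empty (or whitespace-only) style prompt leaves the text unchanged.
function applyStylePrompt(stylePrompt: string, text: string): string {
  const style = stylePrompt.trim();
  return style === "" ? text : `${style} ${text}`;
}

console.log(applyStylePrompt("Say in a calm, neutral tone:", "Build finished."));
// Say in a calm, neutral tone: Build finished.
console.log(applyStylePrompt("", "[whispers] Tests are still running."));
// [whispers] Tests are still running.
```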

OpenAI

| Setting | Default | Notes |
|---|---|---|
| cloudTts.openai.model | gpt-4o-mini-tts | Or tts-1, tts-1-hd |
| cloudTts.openai.voice | nova | alloy, ash, ballad, cedar, coral, echo, fable, marin, onyx, nova, sage, shimmer, verse |
| cloudTts.openai.instructions | (empty) | gpt-4o-mini-tts only. e.g. "Speak cheerfully." |

ash / ballad / cedar / coral / marin / sage / verse only work with gpt-4o-mini-tts.
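
The compatibility rules above can be expressed as a small check. This is an illustrative sketch built only from the table and note in this section; validateOpenAiSettings and the OpenAiTtsSettings shape are hypothetical, and the extension may validate differently or not at all.

```typescript
// Voices listed in the settings table above.
const OPENAI_VOICES = new Set([
  "alloy", "ash", "ballad", "cedar", "coral", "echo", "fable",
  "marin", "onyx", "nova", "sage", "shimmer", "verse",
]);

// Voices that only work with gpt-4o-mini-tts.
const MINI_TTS_ONLY_VOICES = new Set([
  "ash", "ballad", "cedar", "coral", "marin", "sage", "verse",
]);

interface OpenAiTtsSettings {
  model: string;
  voice: string;
  instructions: string;
}

// Returns a list of human-readable problems; an empty list means the combination is fine.
function validateOpenAiSettings(s: OpenAiTtsSettings): string[] {
  const problems: string[] = [];
  const isMiniTts = s.model === "gpt-4o-mini-tts";
  if (!OPENAI_VOICES.has(s.voice)) {
    problems.push(`Unknown voice: ${s.voice}`);
  } else if (!isMiniTts && MINI_TTS_ONLY_VOICES.has(s.voice)) {
    problems.push(`Voice ${s.voice} only works with gpt-4o-mini-tts`);
  }
  if (!isMiniTts && s.instructions.trim() !== "") {
    problems.push(`Instructions are ignored by ${s.model}`);
  }
  return problems;
}

console.log(validateOpenAiSettings({ model: "tts-1", voice: "nova", instructions: "" }).length); // 0
console.log(validateOpenAiSettings({ model: "tts-1", voice: "ash", instructions: "Speak cheerfully." }).length); // 2
```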

ElevenLabs

| Setting | Default | Notes |
|---|---|---|
| cloudTts.elevenlabs.model | eleven_multilingual_v2 | Or eleven_turbo_v2_5, eleven_flash_v2_5 |
| cloudTts.elevenlabs.voiceId | 21m00Tcm4TlvDq8ikWAM (Rachel) | Browse https://elevenlabs.io/app/voice-library for IDs |
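
For orientation, this is roughly how the two settings map onto an ElevenLabs text-to-speech request: the voice ID goes in the URL path and the model in the JSON body as model_id, with the key sent in the xi-api-key header. The sketch only builds a request description and sends nothing; buildElevenLabsRequest and the HttpRequest shape are hypothetical, and the extension's actual request may include other options.

```typescript
interface ElevenLabsSettings {
  model: string;   // cloudTts.elevenlabs.model
  voiceId: string; // cloudTts.elevenlabs.voiceId
}

interface HttpRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) a text-to-speech request for the given settings.
function buildElevenLabsRequest(
  settings: ElevenLabsSettings,
  apiKey: string,
  text: string,
): HttpRequest {
  return {
    method: "POST",
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(settings.voiceId)}`,
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ text, model_id: settings.model }),
  };
}

const exampleRequest = buildElevenLabsRequest(
  { model: "eleven_multilingual_v2", voiceId: "21m00Tcm4TlvDq8ikWAM" },
  "<your-api-key>",
  "Hello from the terminal.",
);
console.log(exampleRequest.url);
// https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM
```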

Limitations

  • Linux needs a TTS engine / audio player installed (see Platform notes); macOS and Windows work out of the box.
  • No streaming — waits for full clip before playback. Long selections take a few seconds before audio starts. (System Voice starts speaking immediately.)
  • Terminal selection requires a clipboard round-trip (clipboard is restored after).

Releasing (maintainer)

.github/workflows/publish.yml auto-publishes to both the VS Code Marketplace and the Open VSX Registry whenever package.json#version changes on main. To cut a release:

  1. Bump version in package.json (update README if needed).
  2. Commit + push to main (or merge a PR).
  3. Watch the run at Actions → Publish to Marketplace & Open VSX.

The two registries publish independently — if one fails, the other still goes out.

Pushes that don't touch version no-op silently — safe to edit README, code, etc. on main without triggering a publish.
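
The two behaviours described above (publish only when the version changes, and let each registry succeed or fail on its own) can be sketched as follows. This models the logic only; the real workflow lives in publish.yml and how it detects a version change is not shown here. shouldPublish, publishAll, and the target names are illustrative.

```typescript
interface PublishTarget {
  name: string;
  publish: () => Promise<void>;
}

interface PublishResult {
  registry: string;
  ok: boolean;
  error?: string;
}

// A push triggers a release only when package.json#version differs
// from the version in the previous commit.
function shouldPublish(previousVersion: string | undefined, currentVersion: string): boolean {
  return previousVersion !== currentVersion;
}

// Runs every target and records each outcome separately, so a failure
// in one registry does not stop the other from publishing.
async function publishAll(targets: PublishTarget[]): Promise<PublishResult[]> {
  const runOne = async (t: PublishTarget): Promise<PublishResult> => {
    try {
      await t.publish();
      return { registry: t.name, ok: true };
    } catch (e) {
      return { registry: t.name, ok: false, error: String(e) };
    }
  };
  return Promise.all(targets.map(runOne));
}

console.log(shouldPublish("1.2.0", "1.2.0")); // false
console.log(shouldPublish("1.2.0", "1.2.1")); // true
```

In this model a README-only push returns false from shouldPublish and nothing else runs, which matches the silent no-op described above.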

One-time setup: VSCE_PAT secret

The workflow needs an Azure DevOps PAT stored as a GitHub repo secret.

  1. Create the PAT at https://dev.azure.com → user icon → Personal access tokens → New Token.
    • Organization: All accessible organizations
    • Scopes: Custom defined → check Marketplace: Manage
  2. Add it to GitHub: Repo Settings → Secrets and variables → Actions → New repository secret.
    • Name: VSCE_PAT
    • Value: the token from step 1
  3. (Optional) verify locally: VSCE_PAT=<token> npx @vscode/vsce verify-pat geryit.

One-time setup: OVSX_PAT secret

The workflow also needs an Open VSX access token.

  1. Sign the Open VSX Publisher Agreement (one-time), then create a token at https://open-vsx.org/user-settings/tokens.
  2. Add it to GitHub: Repo Settings → Secrets and variables → Actions → New repository secret.
    • Name: OVSX_PAT
    • Value: the token from step 1
  3. (Optional) publish locally: OVSX_PAT=<token> npx ovsx publish.

Verified-publisher warning: Open VSX shows a "not a verified publisher of the namespace" notice until the geryit namespace is owned. Request ownership via an issue at EclipseFdn/open-vsx.org (one-time, after signing the agreement).
