Mantra

Publisher: mishra7yash

Control VS Code with your voice. Extremely accurate, absurdly fast.

Code with your thoughts, not your keyboard. Extremely accurate, absurdly fast.

Mantra listens to your voice and instantly edits code, runs IDE commands, executes terminal commands, and interacts with AI agents — all hands-free. Works in VS Code and Cursor.

Free to use — no API keys or setup required.

Discord: https://discord.gg/fmWCScWuUn

Please read the Privacy and Data Handling section before use. For best results, use a good-quality, well-positioned desktop microphone.


How It Works

  1. Speech-to-text: Audio is streamed for real-time transcription with built-in turn detection. The most frequent identifiers from your open file are sent as keyterms to bias recognition toward your code's vocabulary. Live partial transcripts appear in the status bar as you speak.
  2. Context-aware routing: When the IDE is the active window, the transcript goes through the full pipeline: commands, text operations, and LLM routing. When another app is active, only the system commands, unfocused commands, and "ask Claude ..." phrases described below apply.
  3. LLM routing (IDE focused): The transcript, editor context, and terminal history are sent to a fast LLM. The LLM classifies the instruction and returns one of four types:
    • command: runs an IDE command (130+ supported)
    • modification: applies a small, targeted edit to the selected text (changes are highlighted in green/red). Only available when text is selected in the editor, either manually or via a voice "select" command. Without a selection, code-edit requests are routed to the agent.
    • terminal: translates natural language to a shell command and executes it
    • agent: forwards the transcript to the selected AI agent (Claude Code or Cursor). This is the default for any non-trivial task. When no agent is selected, requests are answered by a built-in Q&A agent shown in the Output panel.
  4. Pre-LLM shortcuts: Before step 3 runs, common phrases like "undo", "save", "scroll down", "enter", "delete", "focus editor", "ask Claude ...", keyboard shortcuts, and symbol navigation ("go to the handleCommand function") are matched and handled instantly without waiting for the LLM.
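
The routing order described above can be sketched roughly as follows. This is a minimal illustration, not Mantra's actual implementation: the `route` function, the `SHORTCUTS` set, and the `classify` callback (a stand-in for the LLM call) are hypothetical names, and the fallback for unknown labels is an assumption.

```python
from typing import Callable

# Hypothetical subset of phrases handled without an LLM call.
SHORTCUTS = {"undo", "save", "scroll down", "enter", "delete", "focus editor"}
LLM_TYPES = {"command", "modification", "terminal", "agent"}


def route(transcript: str, has_selection: bool,
          classify: Callable[[str], str]) -> str:
    """Return the name of the handler a transcript would be dispatched to."""
    phrase = transcript.strip().lower().rstrip(".!?")
    # Pre-LLM shortcuts are checked first, so they never wait on the LLM.
    if phrase in SHORTCUTS:
        return "shortcut"
    if phrase.startswith("ask claude"):
        return "agent"
    # Otherwise the stand-in classifier picks one of the four types.
    kind = classify(transcript)
    if kind not in LLM_TYPES:
        return "agent"  # assumption: unknown labels fall back to the agent
    # Edits need a selection; without one they go to the agent instead.
    if kind == "modification" and not has_selection:
        return "agent"
    return kind
```

For example, `route("change this to a while loop", False, classify)` resolves to the agent even if the classifier answers "modification", because no text is selected.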

Setup

1) Install Mantra

Install from the VS Code Marketplace or the Open VSX Registry. Also works in Cursor — install via Extensions: Install from VSIX or search "Mantra" in the Cursor extension marketplace.

2) Start!

Run "Mantra: Start Recording" from the Command Palette, press Ctrl+Shift+1, or click Hands-Free Mode in the Mantra sidebar panel.

No API keys needed — everything works out of the box.

Cursor users: The agent backend is automatically set to Cursor so voice commands route to the Cursor composer panel by default.


Recording Modes

Hands-Free Mode

Press Ctrl+Shift+1 or click Hands-Free Mode in the sidebar. Mantra listens continuously, detects speech turns automatically, and processes each one. Live partial transcripts appear in the bottom-left status bar as you speak. Recording stops automatically after 5 minutes of silence.

Push to Talk

Hold the Push to Talk button in the sidebar. Mantra records while you hold, then transcribes and processes the audio when you release. Live partial transcripts also appear in the status bar.


What You Can Say

Code editing (requires text selection)

Code edits only happen when you have text selected in the editor — either selected manually or via a voice "select" command. Select the lines you want to change, then speak:

  • "change this to a while loop"
  • "rename this variable to count"
  • "add a docstring to this function"
  • "remove the comments"
  • "for i in range len nums print nums i" (raw code dictation)

Without a selection, code-edit requests are routed to the agent (if active) or answered as a question.

Voice selection

Say "select" to select code by description, then follow up with an edit:

  • "select this function" → selects the enclosing function
  • "select the inner for loop" → selects just the nested loop, not the outer one
  • "select the if statement" → selects the full if/elif/else chain
  • "highlight the try block" → selects the try/except block
  • "select lines 10 to 20" → selects exact line range

Once selected, your next voice command can edit it: "select this function" → "add a docstring" performs the edit on the selected code.
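
The exact line-range form ("select lines 10 to 20") is simple enough to recognize with a pattern. The sketch below shows one way that could be done; `parse_line_range` and `LINE_RANGE` are hypothetical names, it assumes the transcript contains digits rather than number words, and descriptive selections such as "select the inner for loop" would still need the selection model.

```python
import re
from typing import Optional, Tuple

# Matches "line 7" or "lines 10 to 20" anywhere in the transcript.
LINE_RANGE = re.compile(
    r"\blines?\s+(\d+)(?:\s+(?:to|through)\s+(\d+))?", re.IGNORECASE
)


def parse_line_range(transcript: str) -> Optional[Tuple[int, int]]:
    """Return a 1-based inclusive (start, end) range, or None if absent."""
    m = LINE_RANGE.search(transcript)
    if m is None:
        return None
    start = int(m.group(1))
    end = int(m.group(2)) if m.group(2) else start
    # Tolerate a reversed range such as "lines 20 to 10".
    if start > end:
        start, end = end, start
    return start, end
```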

Agent tasks

When an agent is active, complex or unselected code tasks go to the agent automatically:

  • "create a terminal-based tic tac toe game"
  • "add a helper function to validate user input"
  • "add authentication"
  • "refactor this to use async/await"

IDE commands

  • "undo", "redo", "save", "format document"
  • "close this file", "open utils dot java"
  • "select lines 4 to 19", "go to line 20"
  • "delete", "delete line", "delete this line"
  • "scroll down", "scroll up 5 lines", "page up"
  • "toggle sidebar", "zen mode", "zoom in"
  • "focus editor", "focus terminal", "focus explorer"
  • "next tab", "previous tab", "first tab", "tab three"

Symbol navigation

Navigate directly to any function, class, method, or symbol in the current file:

  • By name: "go to the handleCommand function", "go to function processActivityFile", "jump to class GameBoard"
  • By description (semantic): "go to the function that resets the board", "go to the error handler", "jump to the input validation function"

Name-based matching uses fast token fuzzy matching (no LLM call). If the name doesn't match closely enough, or if you describe what a symbol does instead of saying its name, Mantra falls back to an LLM-powered semantic match.

Relative navigation also works via the LLM: "go to the next else if", "jump to the catch block" — the LLM finds the target line in the visible code and navigates there.
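
As a rough illustration of name-based token matching, the sketch below splits identifiers into lowercase tokens and scores each symbol by how many of its tokens were spoken. The function names, the scoring rule, and the 0.6 threshold are assumptions for illustration; Mantra's actual matcher may differ. A `None` result corresponds to the case where the semantic fallback would take over.

```python
import re
from typing import List, Optional


def tokens(name: str) -> List[str]:
    """Split camelCase, PascalCase, and snake_case into lowercase tokens."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name).replace("_", " ")
    return spaced.lower().split()


def match_symbol(spoken: str, symbols: List[str],
                 threshold: float = 0.6) -> Optional[str]:
    """Return the symbol whose tokens best overlap the spoken words."""
    # Tokenize each spoken word too, so "handleCommand" and
    # "handle command" are treated the same way.
    said = {t for word in spoken.split() for t in tokens(word)}
    best, best_score = None, 0.0
    for sym in symbols:
        toks = set(tokens(sym))
        if not toks:
            continue
        score = len(toks & said) / len(toks)  # fraction of tokens spoken
        if score > best_score:
            best, best_score = sym, score
    # Below the threshold, the caller would fall back to a semantic match.
    return best if best_score >= threshold else None
```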

Opening files

  • "open script dot py" → opens script.py
  • "open main" → fuzzy-matches main.py, main.ts, etc.
  • "open auth dot controller dot ts" → opens auth.controller.ts

File names are fuzzy-matched against the workspace. You can say the extension ("dot py") or omit it. When the IDE is focused, "open X" tries to match a workspace file first; only if no file matches does it try to open a macOS app.
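
The spoken-filename handling can be approximated as below. This is a sketch under assumptions: `spoken_to_filename` and `find_file` are hypothetical helpers, the input is the phrase after "open", and the standard library's `difflib.get_close_matches` stands in for whatever fuzzy matcher Mantra actually uses.

```python
import difflib
from pathlib import PurePosixPath
from typing import List, Optional


def spoken_to_filename(spoken: str) -> str:
    """Turn 'auth dot controller dot ts' into 'auth.controller.ts'."""
    words = spoken.lower().split()
    return "".join("." if w == "dot" else w for w in words)


def find_file(spoken: str, workspace: List[str]) -> Optional[str]:
    """Match a spoken name against workspace paths, or return None."""
    query = spoken_to_filename(spoken)
    by_name = {PurePosixPath(p).name.lower(): p for p in workspace}
    # 1) Exact file name, extension included.
    if query in by_name:
        return by_name[query]
    # 2) Extension omitted: compare against the name without its suffix.
    for name, path in by_name.items():
        if PurePosixPath(name).stem == query:
            return path
    # 3) Fuzzy match; no result means the caller can try opening an app.
    close = difflib.get_close_matches(query, list(by_name), n=1, cutoff=0.6)
    return by_name[close[0]] if close else None
```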

Terminal commands

  • "run this file" → python3 main.py
  • "create a virtual environment" → python3 -m venv venv
  • "install the requests library" → pip3 install requests
  • "check git status" → git status

Terminal commands are executed by default. If you want to just type without executing, say something like "create a virtual environment but don't run it".

Keyboard shortcuts (macOS, any app)

Any modifier+key combo spoken naturally is executed via the system:

  • "command B" → Cmd+B
  • "control shift P" → Ctrl+Shift+P
  • "command shift F" → Cmd+Shift+F
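
A spoken combo like the ones above can be parsed by peeling off modifier words until a single key remains. The sketch below is illustrative only: `parse_shortcut` and the `MODIFIERS` table are hypothetical, and the "option" entry is an assumption that does not appear in the examples.

```python
from typing import List, Optional

# Spoken modifier word -> display name. "option" is an assumed extra entry.
MODIFIERS = {"command": "Cmd", "control": "Ctrl",
             "shift": "Shift", "option": "Option"}


def parse_shortcut(spoken: str) -> Optional[str]:
    """Return e.g. 'Cmd+Shift+F', or None if this is not a key combo."""
    words = spoken.lower().split()
    mods: List[str] = []
    while words and words[0] in MODIFIERS:
        mods.append(MODIFIERS[words.pop(0)])
    # Require at least one modifier followed by exactly one key word.
    if not mods or len(words) != 1:
        return None
    key = words[0]
    key = key.upper() if len(key) == 1 else key.capitalize()
    return "+".join(mods + [key])
```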

Stop / Resume

Say "pause", "stop", or "stop listening" to stop. Say "resume" or use Ctrl+Shift+1 to start again. Hands-free mode also auto-stops after 5 minutes of no speech.


Agent Integration

Mantra supports multiple AI agent backends:

  • Cursor — sends prompts to the Cursor composer panel. Auto-selected when running in Cursor IDE. Only available in Cursor (grayed out in VS Code).
  • Claude Code (Terminal) — sends prompts to the Claude Code CLI running in a terminal. Requires the claude CLI in your PATH.
  • Claude Code (Extension) — sends prompts to the Claude Code VS Code extension sidebar panel. No CLI needed — just install the extension.
  • Q&A — built-in Q&A agent shown in the Output panel. No external agent needed.

Select your preferred mode from the Agent dropdown in the Mantra sidebar. When running in Cursor, the agent defaults to Cursor automatically.

When an agent is active, it becomes the default destination for any non-trivial request and for all code-edit requests when no text is selected. You don't need to say "ask Claude" — just speak naturally and tasks are automatically routed to the agent.

Sending prompts to the agent

Say "ask Claude to refactor this function", "ask agent how to fix that", or just "add an AI opponent", "improve the performance", "add authentication" — it routes to the agent automatically.

When "Send Context to Agent" is enabled, the current editor state (filename, cursor position, selected text), activity log, terminal history, and workspace file listing are written to a context file. The first message in each session instructs the agent to read this file before responding; follow-up messages send just the raw transcript.

Common phrases like "ask Claude ...", "ask agent ...", "ask LLM ...", "ask AI ..." are intercepted before the LLM for instant routing. These also work when the IDE is not focused.

How agent modes work

  • Terminal mode types prompts into the Claude CLI terminal without pressing Enter. You review and say "enter" to send.
  • Extension mode pastes prompts into the Claude Code extension sidebar without submitting. You review and say "enter" to send.
  • Cursor mode pastes prompts into the Cursor composer panel without submitting. You review and say "enter" to send.

All modes prepend context on the first message and support follow-up messages in the same conversation (no new tabs or panels created on follow-ups).

While the agent is running

  • Commands still work normally — "save file", "undo", "focus terminal" all execute as usual
  • Questions and conversation go to the agent — "how do I fix this error?" types into the agent
  • "enter" — submits the current input
  • "up" / "down" — arrow keys for navigating selection menus (Terminal mode)
  • "yes" / "ok" / "go ahead" — confirms the current selection
  • "focus editor" / "go back" — switches back to the editor
  • "focus agent" / "open claude" — switches to (or opens) the agent panel

Agent voice commands

  • "new conversation" / "clear conversation"
  • "resume conversation" (Terminal mode)
  • "accept changes" / "reject changes" (for proposed diffs)
  • "stop" / "interrupt" / "cancel" (sends Ctrl+C, Terminal mode)

System Commands (Any App)

These commands work regardless of which app is in the foreground. They are processed before any IDE or LLM logic.

Mouse

Say | Action
"click" | Left click at current mouse position
"double click" | Double click at current mouse position
"right click" | Right click at current mouse position
"move mouse up/down/left/right [N]" | Move mouse N pixels (default 50)

Open & switch apps

Say | Action
"open Safari", "open Chrome", "open Slack" | Open or focus the named app
"open VS Code", "open Cursor", "open IDE" | Open IDE
"switch to Safari", "switch to Terminal" | Bring the named app to front

Polite phrasing works too — "could you please open Safari" is handled correctly.

Browser navigation

Say | Action
"back" / "go back" | Cmd+[ (browser back)
"forward" / "go forward" | Cmd+] (browser forward)
"refresh" / "reload" | Cmd+R
"hard refresh" | Cmd+Shift+R
"new tab" | Cmd+T
"close tab" | Cmd+W
"reopen tab" | Cmd+Shift+T
"address bar" / "url bar" | Cmd+L
"bookmark" | Cmd+D

Key presses

Say | Action
"press enter", "press escape", "press tab" | Sends that key
"press up", "press down", "press left", "press right" | Arrow keys
"press page up", "press page down", "press home", "press end" | Navigation keys

Type text

Say | Action
"type hello world" | Types the text via clipboard paste

Window management

Say | Action
"minimize" | Cmd+M
"close window" | Cmd+W
"full screen" | Cmd+Ctrl+F
"next window" / "previous window" | Cmd+` / Cmd+Shift+`
"hide" / "hide app" | Cmd+H
"mission control" | Ctrl+Up

System

Say | Action
"spotlight" / "search computer" | Cmd+Space
"screenshot" | Cmd+Shift+3 (full screen)
"screenshot selection" | Cmd+Shift+4 (area)
"lock screen" | Cmd+Ctrl+Q

Unfocused Commands (When Another App Is Active)

When the IDE is not the frontmost window, these commands route keystrokes to whatever app you're using (Safari, Terminal.app, Finder, etc.).

Arrow keys & repetition

Say | Action
"up", "down", "left", "right" | Arrow key
"up 5 times", "down three times" | Repeat arrow key N times
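
The repeat phrases accept both digits and number words. A minimal sketch of parsing them follows; `parse_arrow`, the `NUMBER_WORDS` table, and its one-to-ten range are assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Assumed vocabulary; a real recognizer may accept more number words.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}
ARROW_REPEAT = re.compile(r"^(up|down|left|right)(?:\s+(\w+)\s+times?)?$")


def parse_arrow(transcript: str) -> Optional[Tuple[str, int]]:
    """Return (direction, presses), or None if this is not an arrow phrase."""
    m = ARROW_REPEAT.match(transcript.strip().lower())
    if m is None:
        return None
    direction, count = m.group(1), m.group(2)
    if count is None:
        return direction, 1  # a bare "up" is a single press
    n = int(count) if count.isdigit() else NUMBER_WORDS.get(count)
    if not n:
        return None  # unknown word or a zero count
    return direction, n
```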

Scrolling

Say | Action
"scroll up" / "scroll down" | Smooth scroll (15 arrow presses)
"scroll up a lot" / "scroll down a lot" | Big scroll (2 page jumps)
"page up" / "page down" | Single page jump
"scroll to top" / "scroll to bottom" | Cmd+Home / Cmd+End

Basic keys

Say | Action
"enter" / "return" / "submit" | Enter key
"escape" / "cancel" / "dismiss" | Escape key
"tab" | Tab key
"space" | Space key
"delete" / "backspace" | Backspace key

Tab switching

Say | Action
"next tab" / "previous tab" | Ctrl+Tab / Ctrl+Shift+Tab
"first tab", "second tab", ..., "ninth tab" | Cmd+1 through Cmd+9
"last tab" | Cmd+9

Text editing

Say | Action
"undo" / "redo" | Cmd+Z / Cmd+Shift+Z
"copy" / "paste" / "cut" | Cmd+C / Cmd+V / Cmd+X
"select all" | Cmd+A
"find" / "search" | Cmd+F
"save" | Cmd+S
"close" / "quit" | Cmd+W / Cmd+Q
"zoom in" / "zoom out" / "reset zoom" | Cmd+= / Cmd+- / Cmd+0

Developer tools (browser)

Say | Action
"dev tools" / "inspect element" | Cmd+Option+I
"console" / "open console" | Cmd+Option+J
"view source" | Cmd+Option+U

Terminal.app / iTerm shortcuts

Say | Action
"clear" / "clear terminal" | Cmd+K
"interrupt" / "control c" / "kill process" | Ctrl+C
"exit terminal" / "control d" | Ctrl+D
"suspend" / "control z" | Ctrl+Z
"reverse search" / "control r" | Ctrl+R
"beginning of line" / "control a" | Ctrl+A
"end of line" / "control e" | Ctrl+E
"clear line" / "control u" | Ctrl+U
"delete word" / "control w" | Ctrl+W

Finder shortcuts

Say | Action
"show hidden files" | Cmd+Shift+.
"go to folder" | Cmd+Shift+G
"new folder" | Cmd+Shift+N
"get info" / "file info" | Cmd+I

Sidebar Panel

Mantra adds a panel to the activity bar. From the sidebar you can:

  • Hands-Free Mode / Stop — toggle button to start or stop continuous listening
  • Push to Talk — hold the button to record, release to transcribe and process
  • Stop / Stop & Transcribe (Ctrl+Shift+2 / Ctrl+Shift+3) — while recording, stop (discard audio) or stop and transcribe what's been said so far
  • Test Microphone — verify your mic is working with a live volume meter
  • Activity Log — scrollable history of every transcript, command, code edit, terminal action, and agent interaction. Code modifications include:
    • Show diff — toggle to view exactly what changed (green/red highlighting)
    • Open in tab — open the diff in a full editor tab
    • Undo / Redo — revert or re-apply a specific change
  • Focus — quick buttons to switch between Editor, Terminal, Agent, Explorer, Search, and Source Control
  • Settings
    • Microphone — pick your input device
    • Agent — choose Cursor, Claude Code (Terminal), Claude Code (Extension), or Q&A. Cursor is auto-selected and only available in Cursor IDE.
    • Commands-Only Mode — bypass the LLM entirely (ON/OFF toggle)
    • Send Context to Agent — include editor state, activity log, terminal history, and workspace files in agent prompts (ON by default)
    • All Settings / Keyboard Shortcuts
  • Router Prompt / Selection Model Prompt — view and edit the LLM system prompts directly in the sidebar

Commands-Only Mode

Toggle via the sidebar or Command Palette. When enabled:

  • No LLM calls — speech is transcribed but only matched against pre-mapped commands and text operations
  • What works: all 130+ IDE commands, text operations, keyboard shortcuts, system commands, focus/navigation, and pause/resume
  • What doesn't work: code edits, questions, terminal command generation, and agent forwarding

Useful for low-latency command execution with no LLM calls beyond speech-to-text.


Agent Context

When "Send Context to Agent" is enabled (the default), Mantra writes a context file before each agent prompt containing:

  • Editor state — current filename, language, cursor position, selected text (if any)
  • Activity log — timestamped history of commands, edits, and transcripts
  • Terminal history — recent shell commands and their output
  • Workspace files — listing of files and folders in the project

The first message in each session includes a single-line instruction telling the agent to read the context file. Follow-up messages send just the raw transcript. The context file is updated before every message, after every log entry, and after every terminal command.
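
The first-message behavior can be pictured with a small session object. This is a sketch only: the `AgentSession` class, the wording of the instruction line, and the context file path used in the example are hypothetical and not taken from Mantra.

```python
class AgentSession:
    """Tracks whether the context instruction has already been sent."""

    def __init__(self, context_path: str, send_context: bool = True) -> None:
        self.context_path = context_path  # hypothetical file location
        self.send_context = send_context  # mirrors "Send Context to Agent"
        self.first_message = True

    def build_prompt(self, transcript: str) -> str:
        """Prefix the first prompt with a single-line read instruction."""
        if self.send_context and self.first_message:
            prompt = f"Read {self.context_path} before responding. {transcript}"
        else:
            prompt = transcript  # follow-ups carry only the raw transcript
        self.first_message = False
        return prompt
```

Starting a new conversation would correspond to creating a fresh `AgentSession`, so the next prompt carries the instruction again.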

Selection model: When the transcript contains a selection keyword ("select", "highlight", "lines X to Y"), Mantra runs a separate lightweight LLM call to determine the exact lines to select — "select the inner for loop", "select this function", "highlight the try block". This lets you select code by natural language description with precision, including nested constructs.


Commands & Keyboard Shortcuts

Action | Shortcut
Start Recording | Ctrl+Shift+1
Stop Listening | Ctrl+Shift+2
Stop & Transcribe | Ctrl+Shift+3
Open Settings | Ctrl+Shift+4
Select Microphone | Ctrl+Shift+5
Push to Talk | Sidebar button (hold)

All shortcuts can be customized in File > Preferences > Keyboard Shortcuts (search "mantra"), or via the Keyboard Shortcuts button in the sidebar.


Settings

Open Settings > Extensions > Mantra to adjust:

  • Agent Backend — Cursor, Claude Code (Terminal), Claude Code (Extension), or Q&A. Defaults to Cursor in Cursor IDE, Q&A elsewhere.
  • Reasoning Effort — Low (default), medium, or high
  • Router Prompt — Customize the LLM system prompt (also editable in the sidebar)
  • Selection Model Prompt — Customize the prompt used for voice-based code selection (also editable in the sidebar)
  • Memory Manager Prompt — Customize the prompt used for the memory manager
  • Commands Only — Bypass the LLM entirely
  • Send Context to Agent — Include editor state, activity log, terminal history, and workspace files in agent prompts (default: on)
  • Microphone Input — Set via Command Palette > "Mantra: Select Microphone". Advanced users can paste raw FFmpeg input args.

Privacy and Data Handling

  • Secret scrubbing: Before any data leaves your machine, Mantra scrubs secrets from terminal history, file content, and context files. This includes API keys, tokens, passwords, PEM blocks, JWTs, connection strings with credentials, CLI flag arguments (--pat, --token, --password), export assignments with secret-like names, output of dangerous commands (env, printenv, cat .env), and high-entropy strings that look like tokens. Scrubbing happens at every external boundary — the context file, the LLM prompt, and agent dispatches.
  • Sensitive file detection: If the active editor contains a sensitive file (.env, .pem, credentials, etc.) or inline secrets are detected, Mantra warns you before sending anything to the LLM.
  • What goes to the STT service: The most frequent identifiers from your open file are sent as keyterms to bias recognition toward your code's vocabulary. Audio is streamed live (not stored as a file). Mantra's account with the STT service has opted out of data sharing.
  • What goes to the LLM service: The current file's contents (scrubbed), file name, cursor context, and terminal history (scrubbed). The LLM service does not retain model inputs or outputs.
  • Everything else: Apart from the STT and LLM API usage described above, Mantra runs entirely locally and does not collect, save, or share any of your data.
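
To give a feel for pattern-based secret scrubbing, here is a minimal sketch covering four of the categories listed above (PEM blocks, CLI flag arguments, export assignments, and connection-string credentials). The patterns, the `scrub` function, and the `[REDACTED]` placeholder are assumptions for illustration; they are not Mantra's actual rules and are far from exhaustive.

```python
import re

# PEM blocks are replaced whole.
PEM = re.compile(r"-----BEGIN [A-Z ]+-----.*?-----END [A-Z ]+-----", re.DOTALL)

# For these patterns, group 1 is kept and the value after it is redacted.
PREFIXED = [
    # CLI flags: --token VALUE or --token=VALUE
    re.compile(r"(--(?:pat|token|password)[= ]\s*)\S+"),
    # export assignments with secret-like variable names
    re.compile(r"(export\s+\w*(?:KEY|TOKEN|SECRET|PASSWORD)\w*=)\S+",
               re.IGNORECASE),
    # connection strings: scheme://user:password@host
    re.compile(r"(\w+://[^\s:/@]+:)[^\s@]+(?=@)"),
]


def scrub(text: str) -> str:
    """Replace secret-looking values with a placeholder."""
    text = PEM.sub("[REDACTED]", text)
    for pattern in PREFIXED:
        text = pattern.sub(lambda m: m.group(1) + "[REDACTED]", text)
    return text
```

Keeping the prefix (the flag or variable name) preserves enough context for the LLM to understand what the command did while the value itself never leaves the machine.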

Troubleshooting

  • No mic on macOS — Allow the IDE (VS Code or Cursor) under System Settings > Privacy & Security > Microphone.
  • Mouse click not working — Allow the IDE under System Settings > Privacy & Security > Accessibility.
  • "Command not found: claude" — Only needed for Claude Code Terminal mode. Add the CLI to your PATH: echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc && source ~/.zshrc. Alternatively, use Extension mode or Cursor mode.
  • Cursor agent grayed out — The Cursor agent option is only available when running inside Cursor IDE. It appears as "Cursor (Not available)" in VS Code.
  • Ghost transcriptions ("two", "four") — These are filtered automatically. If ambient noise is high, adjust your microphone position.
  • File not found — Include punctuation words when speaking filenames: "open auth dot controller dot ts". You can also say the name without an extension and Mantra will fuzzy-match it.
  • Logs — Check View > Output > Mantra for detailed logs. The sidebar Activity Log also shows a history of all transcripts and actions.

Supported IDE Commands

Over 130 pre-mapped commands:

save, save all, new file, close file, close other files, close all files, reopen closed editor, undo, redo, cut, copy, paste, select all, toggle line comment, toggle block comment, format document, format selection, rename symbol, quick fix, organize imports, expand selection, shrink selection, select next occurrence, duplicate line down, duplicate line up, move line up, move line down, add cursor above, add cursor below, fold all, unfold all, toggle word wrap, find, replace, find in files, replace in files, next tab, previous tab, tab one through tab nine, page up, page down, go to definition, peek definition, go to references, go to implementation, jump to bracket, focus editor, focus sidebar, focus panel, toggle output, toggle sidebar, toggle panel, toggle zen mode, split editor, toggle minimap, zoom in, zoom out, reset zoom, toggle terminal, focus terminal, new terminal, next terminal, previous terminal, focus agent, new conversation, accept changes, reject changes, focus explorer, focus search, focus source control, focus debug, focus extensions, show command palette, quick open, toggle breakpoint, start debugging, stop debugging, continue debugging, step over, step into, step out, stage file, stage all, unstage file, commit, push, pull, checkout branch, show diff, stash, pop stash, toggle fullscreen, show problems, show notifications, clear notifications, reveal in finder, copy file path, copy relative path, markdown preview, run task, run build task, run test task, clear terminal, terminal scroll up, terminal scroll down.

Additional text operations (no LLM needed): go to line N, go to symbol by name, select/copy/cut/delete line N, select/copy/cut/delete lines A to B, scroll up/down, page up/down, new line above/below, indent, outdent, delete, paste, kill process, run last command.


WSL2 (Windows Subsystem for Linux) — Quick Setup

Mantra works in a Remote - WSL window. Use WSLg (audio bridge) and a Pulse-enabled FFmpeg.

Recommended (WSLg enabled)

  1. Windows PowerShell (Admin)
    wsl --update
    wsl --shutdown
    
  2. In WSL Ubuntu
    sudo apt update && sudo apt install -y ffmpeg pulseaudio-utils
    export MANTRA_FFMPEG_PATH=/usr/bin/ffmpeg
    code .
    
  3. In VS Code: Command Palette > "Mantra: Select Microphone" > choose a device.
  4. Start recording.

Alternative (no WSLg)

  • Open the folder directly in Windows VS Code, or
  • Use the Windows mic from WSL:
    export MANTRA_FFMPEG_PATH="/mnt/c/ffmpeg/bin/ffmpeg.exe"
    export MANTRA_AUDIO_INPUT='-f dshow -i audio=Microphone (Your Device Name)'
    code .
    

Troubleshooting (WSL)

  • "PulseAudio: Connection refused" — WSLg not active: wsl --update then wsl --shutdown.
  • "Unknown input format 'pulse'" — Wrong FFmpeg: install Ubuntu ffmpeg and set MANTRA_FFMPEG_PATH=/usr/bin/ffmpeg.
  • No mics in picker — Enable WSLg and relaunch VS Code from the WSL shell (code .).

Persist FFmpeg path

echo 'export MANTRA_FFMPEG_PATH=/usr/bin/ffmpeg' >> ~/.bashrc
