MR Multi Model Ai Coder

The Claude Code experience, on your own keys or accounts, your own CLIs, and a pile of completely free models.

📚 Providers · Features · Commands · User documentation · Architecture · Analytics

[Screenshot: empty-state welcome, provider cards, and model picker]

Ever been locked out of Claude in the middle of your work?

You're deep in a problem. Claude is helping you ship. Then — nothing. You're locked out.

Your account got flagged for "unusual activity."
You hit a daily usage limit right when you needed it most.
The service is down for an hour.
You signed in from a second machine and suddenly everything breaks.
You share one account to save money, and one flag means everyone loses access.

Whatever the reason, your whole day stops while you figure out what to do next. You lose your train of thought. You lose the conversation you were halfway through. And the problem you were about to solve is still sitting there.

This extension makes that problem go away.

Use your own Claude account when it's working. The moment it isn't — click once and switch to a completely free model. Your chat keeps going. The files you had open, the context, the history, everything comes along. You never lose your flow.

You get eight live provider chips in one extension:

Your own Claude account for the hard stuff
Kilo Code — one CLI install, 9 free models routed automatically (no credit card)
OpenRouter: 14 free models behind one key (GPT-OSS, Qwen Coder, Llama, Gemma, …). Free-tier caps apply; a one-time $10 load raises the cap from 50 to 1000 requests per day.
Ollama Local — models run entirely on your machine. No account, no rate limits, works offline.
Qwen — DashScope API or the Qwen CLI for the Alibaba ecosystem
OpenAI Codex — GPT-4.1, o3, o4-mini via API key or the Codex CLI with OAuth
Google Gemini — Gemini 2.5 Pro/Flash with 1M context via API key or Gemini CLI
HuggingFace — 22 open-weight agentic coders (Qwen3, DeepSeek V3/R1, Llama 4, Mistral) via HF Inference API

Mix and match however you want. Pay nothing you don't need to.

Full setup, model tables, and config walkthroughs for every provider in PROVIDERS.md.

Free models, out of the box

Three separate free paths. No credit card on any of them.

Kilo Code (npm install -g @kilocode/cli → kilo login)

Stepfun Step 3.5 Flash · Grok Code Fast · Nvidia Nemotron 3 Super 120B
ByteDance Seed 2.0 Pro · Google Lyria 3 · Kilo Auto (best-available router)

OpenRouter (sign up free at openrouter.ai/keys — no credit card)

Featured (less throttled, fully agentic): Nemotron 3 Super 120B · Nemotron Nano 30B / 12B VL / 9B · GPT-OSS 120B / 20B · MiniMax M2.5 · Trinity Large
Plus: Qwen3 Coder 480B · Qwen3 Next 80B · Gemma 4 31B · Llama 3.3 70B · GLM 4.5 Air · Gemma 3 27B · Hermes 3 405B · more
Free tier: 50 req/day (1000/day after a one-time $10 load, never expires)

Ollama Local (install, ollama pull qwen3.6, done — 100% offline)

Qwen 3.6 (27B / 35B / MoE, 256K context) · Qwen 3 Coder Next 80B · Qwen 2.5 Coder (3B → 32B) · Llama 3.2/3.3 · Gemma 3 · DeepSeek R1 · Phi 4
Zero rate limits. Zero cost. Zero data leaves your machine.

Qwen — DashScope free trial (API) or the Qwen CLI (ModelScope OAuth)

Mix and match. Use Claude for the hard reasoning, flip to a free model for routine edits, pay for nothing you don't need.

iFlow support is currently paused while the iFlow CLI's sign-in flow stabilises — it'll come back once their auth migration is complete.

What you get

Everything Claude Code users expect, plus a few things they don't:

Multi-tab chats — every conversation is a real editor tab. Drag, split, pin.
Full agentic loop — Read / Write / Edit / Bash / Grep / Glob, each gated by per-turn approval
Inline file diff in the approval card — see exactly what will change before you say yes (+added, −removed, colour-coded, collapsible)
Automatic checkpoints + rewind — any turn can be undone and the files come back
Hot-swap provider or model mid-conversation — no reset, context replays cleanly to the new backend
Smart @file mentions — fuzzy workspace search, case-insensitive, recently-open files ranked first
Attach from any folder — native OS file picker, like uploading to a web app. Downloads, Desktop, other drives.
Paste screenshots with Ctrl+V — auto-saved under .mrcoder/pasted-images/ and attached
Proper Markdown rendering with syntax-aware code blocks and one-click Copy code
Copy / Edit-and-resend / Regenerate on every message
↑/↓ history — cycle through your previous prompts terminal-style
TodoWrite as a live checklist — watch the agent's plan update in real time
Live usage readout — input / output tokens, context-window %, estimated cost per turn and session
Developer Analytics Dashboard: productivity score, streak tracking, peak hours, efficiency trends, session quality, cost intelligence, model ranking, and 12 KPI cards with period-over-period deltas. All local, fully isolated. Full spec in ANALYTICS_FEATURES.md.
Cost cap per session, four permission modes (plan / default / acceptEdits / bypassPermissions)
MCP servers on the Claude backend, same config shape as Claude Code
/init wizard — one command generates a CODER.md and wires it into every provider's memory file (CLAUDE.md, KILO.md, IFLOW.md, QWEN.md, CODEX.md, GEMINI.md, AGENTS.md) so your whole team's conventions ride along automatically
Scheduled messages — /schedule walks you through picking a date and time (24-hour) step-by-step. Missed while VS Code was closed? You get a notification with Send Now / Reschedule / Delete — never a silent auto-send

Haptic Feedback & Progress Music

Every chat event has a sound cue — submit click, response ding, error thud, approval tone, task-complete chime. All work out of the box with built-in synthesized tones. You can replace any of them with your own audio files (.wav, .mp3, .ogg, .m4a) via Settings or Ctrl+Shift+P → Configure Haptic Sound.

Progress Music

Play background music while the AI is generating:

Put your audio files in a folder
Ctrl+Shift+P → Configure Progress Music Folder → select the folder
Enable it: set mrMultiCoder.hapticFeedback.onProgress.enabled to true

Files play sequentially and loop until the response finishes. A single file loops continuously. Volume is independent from event sounds (onProgress.volume, default 0.3).

Multiple tabs? Only one tab plays at a time. When that tab finishes, music automatically resumes in the next still-generating tab.
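The playback rules above can be sketched as two small pure functions. This is an illustration only, not the extension's actual code: `nextTrackIndex`, `pickPlayingTab`, and the `TabState` shape are hypothetical names chosen for the example.

```typescript
// Hypothetical sketch of the progress-music rules; not the extension's code.

interface TabState {
  id: string;
  generating: boolean; // true while the AI is still producing a response
}

// Files play in order and wrap around; a single file simply repeats.
function nextTrackIndex(current: number, trackCount: number): number {
  if (trackCount <= 0) {
    throw new Error("playlist is empty");
  }
  return (current + 1) % trackCount;
}

// Only one tab plays at a time: keep the current tab while it is still
// generating, otherwise hand over to the next tab that is still generating.
function pickPlayingTab(tabs: TabState[], currentId: string | null): string | null {
  const current = tabs.find((t) => t.id === currentId);
  if (current && current.generating) {
    return current.id;
  }
  const next = tabs.find((t) => t.generating);
  return next ? next.id : null; // null means the music stops
}
```

With a three-file folder, `nextTrackIndex(2, 3)` wraps back to the first file, and once every tab has finished `pickPlayingTab` returns `null`.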

Reset everything: Ctrl+Shift+P → Reset Haptic Feedback — returns all sounds and progress settings to defaults.

Full settings reference and examples in COMMANDS.md § Haptic Feedback.

Install

Easiest — VS Code:

Open Extensions (Ctrl+Shift+X / Cmd+Shift+X)
Search MR Multi coder
Click Install

Command line:

code --install-extension mr-innovations.mr-multi-coder

Marketplace page: https://marketplace.visualstudio.com/items?itemName=mr-innovations.mr-multi-coder

Quick start

Click the MR icon in the Activity Bar (left edge) — that's your Conversations sidebar.
Start a new chat from any of three places:
- New Conversation button in the sidebar
- MR icon in the top-right of any open file's editor toolbar
- ✱ MR Multi coder entry in the status bar
Pick a provider from the chip at the bottom of the chat.
Configure credentials the first time — either an API key (MR Multi coder: Set Claude API Key) or just install the relevant CLI on your PATH.
Type a task. Enter sends, Shift+Enter makes a newline, ↑ cycles through previous prompts.

Every conversation is a standalone editor tab. Open five in parallel if you want.

Providers

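The table below covers seven live providers plus one paused provider (iFlow). HuggingFace, the eighth chip, is not listed here yet; see PROVIDERS.md. Full setup steps, model tables, free-tier caveats, Ollama install / configure walkthrough, and troubleshooting live in PROVIDERS.md.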

| Provider | Auth | Cost | How it runs |
| --- | --- | --- | --- |
| Claude | Anthropic API key | Paid | `@anthropic-ai/claude-agent-sdk`: full agent, MCP, checkpoints |
| Kilo Code | CLI OAuth (`kilo login`) | Free | Background `kilo run` subprocess. Routes to 9 free models. |
| Qwen | DashScope key or `qwen` CLI | Free trial → paid | API: streaming SSE. CLI: subprocess. |
| OpenRouter | OpenRouter key (`sk-or-v1-…`) | Free tier | Direct HTTP to `openrouter.ai/api/v1`. 14 free models. |
| Ollama (Local) | None (localhost) | Free forever | HTTP to your own Ollama server. Offline-capable. |
| OpenAI Codex | OpenAI API key or `codex` CLI | Paid (free trial) | API: streaming SSE. CLI: subprocess. o4-mini, o3, GPT-4.1. |
| Google Gemini | Google AI Studio key or `gemini` CLI | Free tier → paid | API: streaming SSE. CLI: subprocess. 1M context. |
| ~~iFlow~~ | n/a | n/a | Paused: auth migration in progress |

Note on CLI providers — we spawn them with stdin/stdout pipes (not a pseudo-terminal). If a CLI insists on a real TTY or needs interactive first-time login, run it once in VS Code's integrated terminal to complete that step, then try again.
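To make the pipes-versus-TTY point concrete, here is a minimal Node.js sketch of launching a child process with piped stdio. It is an assumption-laden illustration, not the extension's launcher: `runWithPipes` is a hypothetical helper, and it uses the synchronous `spawnSync` only to keep the example short (a background subprocess would normally use the asynchronous `spawn`).

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: run a command with piped stdin/stdout/stderr
// (no pseudo-terminal), feed `input` on stdin, and return the child's stdout.
function runWithPipes(command: string, args: string[], input: string): string {
  const result = spawnSync(command, args, {
    input, // written to the child's stdin pipe
    stdio: ["pipe", "pipe", "pipe"],
    encoding: "utf8",
  });
  if (result.error) {
    throw result.error; // for example, the CLI is not on PATH
  }
  return result.stdout;
}

// A child started this way sees no TTY on stdout, which is why a CLI that
// insists on a real terminal may refuse to run until you log in once manually.
const probe = "process.stdout.write(String(Boolean(process.stdout.isTTY)))";
const sawTty = runWithPipes(process.execPath, ["-e", probe], "");
console.log(sawTty); // prints "false"
```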

New to Ollama? It's fully local — no account, no key, unlimited free iterations. Install in one command, pull a ~5 GB model, you're off. Full walkthrough in PROVIDERS.md § Ollama.
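If you want to check that your local server is reachable before pointing the extension at it, a request like the following should work. Ollama serves an OpenAI-compatible API under `/v1`; whether the extension calls exactly this route is an assumption, and `buildOllamaChatRequest` is a hypothetical helper written for this example.

```typescript
interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: build a non-streaming chat request for an
// OpenAI-compatible endpoint such as http://localhost:11434/v1.
function buildOllamaChatRequest(endpoint: string, model: string, prompt: string): ChatRequest {
  const base = endpoint.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${base}/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
        stream: false,
      }),
    },
  };
}

// Usage (needs a running Ollama server with the model already pulled):
//   const { url, init } = buildOllamaChatRequest(
//     "http://localhost:11434/v1", "qwen2.5-coder:7b", "Say hello");
//   const response = await fetch(url, init);
```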

Key commands

Open the Command Palette (Ctrl+Shift+P) and type MR Multi coder:

New Conversation (Ctrl+Shift+Alt+A)
Set Claude API Key / Set Qwen API Key / Set OpenRouter API Key / Set OpenAI Codex API Key / Set Gemini API Key
Select Provider… / Select Model… / Select Permission Mode…
Add Selection to Chat (Ctrl+Shift+L)
Manage MCP Servers / Toggle MCP Server…
Compact Conversation · Rewind to Checkpoint… · Fork Session with Summary…
Configure Haptic Sound · Configure Progress Music Folder · Reset Haptic Feedback

Slash commands inside the chat: /init, /help, /compact, /clear, /fork <summary>, /cost, /cancel, /schedule.

Full command reference with examples, haptic feedback settings, and slash commands: COMMANDS.md.

Scheduled messages

Type /schedule and the extension guides you step-by-step:

Message — what to send (or type /schedule my message here to skip this)
Date — pick Today, Tomorrow, or enter a specific date
Time — enter in 24-hour format (e.g. 14:30)

/schedule list          — view pending schedules
/schedule cancel <id>   — cancel a schedule

If VS Code is closed when the scheduled time passes, the message is not auto-sent. Instead, a notification shows you what was missed and offers three actions: Send Now, Reschedule, or Delete.
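The scheduling rules can be summarised in a short sketch. Everything here is hypothetical (the `ScheduledMessage` shape, `parseTime24`, and `partitionOnStartup` are names invented for illustration); it only shows the two behaviours described above: strict 24-hour time parsing, and never auto-sending a schedule that came due while VS Code was closed.

```typescript
// Hypothetical sketch; the extension's internal types may differ.
interface ScheduledMessage {
  id: string;
  text: string;
  dueAt: Date;
}

// The three actions offered for a missed schedule.
const MISSED_ACTIONS = ["Send Now", "Reschedule", "Delete"] as const;

// Parse a 24-hour "HH:MM" string such as "14:30"; returns null if invalid.
function parseTime24(value: string): { hours: number; minutes: number } | null {
  const match = /^([01]?\d|2[0-3]):([0-5]\d)$/.exec(value.trim());
  if (!match) {
    return null;
  }
  return { hours: Number(match[1]), minutes: Number(match[2]) };
}

// On startup, split pending schedules into upcoming and missed.
// Missed ones are surfaced in a notification, never sent silently.
function partitionOnStartup(pending: ScheduledMessage[], now: Date) {
  const missed = pending.filter((m) => m.dueAt.getTime() <= now.getTime());
  const upcoming = pending.filter((m) => m.dueAt.getTime() > now.getTime());
  return { missed, upcoming };
}
```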

Key settings (under `mrMultiCoder.*`)

| Setting | Default | What it does |
| --- | --- | --- |
| `provider` | `claude` | Active backend |
| `permissionMode` | `default` | Approval behaviour |
| `claude.model` | `claude-sonnet-4-6` | Model for the Claude provider |
| `claude.backend` | `anthropic` | Anthropic / Bedrock / Vertex / Foundry |
| `qwen.authMode` | `api` | `api` (DashScope) or `cli` |
| `openrouter.model` | `openai/gpt-oss-20b:free` | Default OpenRouter model |
| `ollama.endpoint` | `http://localhost:11434/v1` | Your Ollama server URL |
| `ollama.model` | `qwen2.5-coder:7b` | Default Ollama tag |
| `codex.authMode` | `api` | `api` (OpenAI key) or `cli` (Codex CLI) |
| `codex.model` | `o4-mini` | Default OpenAI model |
| `gemini.authMode` | `api` | `api` (AI Studio key) or `cli` (Gemini CLI) |
| `gemini.model` | `gemini-2.5-flash` | Default Gemini model |
| `maxToolRounds` | `0` | Max tool-call rounds per turn; `0` = provider default |
| `mcpServers` | `{}` | Same JSON shape as Claude Code's `settings.json` |
| `loadProjectMemory` | `true` | Auto-load `CODER.md`, `CLAUDE.md`, `AGENTS.md`, `.mrcoder/MEMORY.md` |
| `autoCompactThreshold` | `0.8` | Trigger `/compact` at this fraction of context |
| `toolResultTruncateBytes` | `50000` | Cap large tool results |
| `costCapUsdPerSession` | `0` | Block sends past this cost; `0` disables |
| `telemetry.enabled` | `false` | Opt-in only, no content / paths / keys |
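Three of these settings use `0` or a fraction in ways that are easy to misread, so here is a rough sketch of how they could be interpreted. The function names are hypothetical and the exact comparisons (for example, whether a send is blocked at or just past the cap) are assumptions, not the extension's documented behaviour.

```typescript
// Hypothetical interpretations of three mrMultiCoder.* settings.

// autoCompactThreshold: compact once usage reaches this fraction of the window.
function shouldAutoCompact(usedTokens: number, contextWindow: number, threshold = 0.8): boolean {
  return contextWindow > 0 && usedTokens / contextWindow >= threshold;
}

// costCapUsdPerSession: 0 disables the cap; otherwise block once it is reached
// (assumption: "reached" rather than "strictly exceeded").
function isSendBlocked(sessionCostUsd: number, capUsd = 0): boolean {
  return capUsd > 0 && sessionCostUsd >= capUsd;
}

// maxToolRounds: 0 means "defer to the provider's own default".
function effectiveToolRounds(setting: number, providerDefault: number): number {
  return setting > 0 ? setting : providerDefault;
}
```

For example, with a 200,000-token window and the default threshold, `shouldAutoCompact(170000, 200000)` is true, while `isSendBlocked(12.5)` stays false because the default cap of `0` disables the check.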

Team conventions (`/init`)

Run /init in any chat. It asks you a handful of questions (max file size, testing policy, commit style, off-limits paths, etc.), writes a single CODER.md at the repo root, and wires a reference to it into CLAUDE.md, KILO.md, IFLOW.md, QWEN.md, CODEX.md, GEMINI.md, and AGENTS.md so every provider reads the same conventions.

Change the rules later? Edit CODER.md — everything else picks it up automatically.
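Conceptually, the wiring step adds a pointer to CODER.md in each provider's memory file and leaves the rest of the file alone. The sketch below illustrates that idea only; the reference wording, `wireReference`, and `planInitWrites` are hypothetical, and the real `/init` output may look different.

```typescript
// Memory files named in this README.
const MEMORY_FILES = [
  "CLAUDE.md", "KILO.md", "IFLOW.md", "QWEN.md", "CODEX.md", "GEMINI.md", "AGENTS.md",
];

// Hypothetical wording for the pointer line.
const REFERENCE_LINE = "See CODER.md for project conventions.";

// Append the pointer once; running it again leaves the file unchanged.
function wireReference(existing: string): string {
  if (existing.includes(REFERENCE_LINE)) {
    return existing;
  }
  const body = existing.trimEnd();
  return body.length === 0 ? `${REFERENCE_LINE}\n` : `${body}\n\n${REFERENCE_LINE}\n`;
}

// Compute the new content of every memory file from its current content
// (a missing file is treated as empty).
function planInitWrites(current: Record<string, string>): Record<string, string> {
  const planned: Record<string, string> = {};
  for (const name of MEMORY_FILES) {
    planned[name] = wireReference(current[name] ?? "");
  }
  return planned;
}
```

Because each file only holds a pointer, editing CODER.md is enough to change the rules everywhere.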

Web Remote Access

Access your workspace from any browser — phone, tablet, or another computer — with full authentication and per-user access control.

Open MR Multi coder Settings → Web Remote Access section
Add users (email + password), assign project access (all or specific folders)
Click Start Server — a login-protected VS Code Web instance launches

What you get:

Two-port architecture — auth proxy on public port (18462), VS Code Server on internal-only port (18463). No unauthenticated traffic reaches the server.
JWT session auth with HttpOnly cookies, bcrypt password hashing, brute-force rate limiting
Per-user project access — restrict users to specific workspace folders
Extension allowlist — only MR Multi coder is installable in the web instance (uses VS Code 1.96+ extensions.allowed)
Auto-installs MR Multi coder into the web VS Code instance so remote users get the full AI coding experience
Works on LAN, VPN, or with a reverse proxy for public access
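Per-user project access boils down to a folder check made for each authenticated user. The following is a simplified sketch under stated assumptions: the `RemoteUser` shape and `canAccessFolder` are hypothetical, and a production check would also need to resolve `..` segments and symlinks before comparing paths.

```typescript
// Hypothetical user record: access to everything, or to listed folders only.
interface RemoteUser {
  email: string;
  projects: "all" | string[];
}

// Normalise separators and strip trailing slashes so prefixes compare cleanly.
function normalisePath(p: string): string {
  return p.replace(/\\/g, "/").replace(/\/+$/, "");
}

// A folder is allowed if it is one of the user's project roots or lies inside one.
function canAccessFolder(user: RemoteUser, folder: string): boolean {
  if (user.projects === "all") {
    return true;
  }
  const target = normalisePath(folder);
  return user.projects.some((project) => {
    const root = normalisePath(project);
    return target === root || target.startsWith(root + "/");
  });
}
```

Note the `root + "/"` comparison: it keeps a user restricted to `/work/app` out of a sibling folder such as `/work/app2`.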

Full setup, security model, and troubleshooting in REMOTE-ACCESS.md.

Security

API keys live in VS Code's SecretStorage — never in settings.json, never synced, never logged.
CLI providers are auto-detected on your PATH. Nothing is bundled.
Workspace trust is respected throughout: in untrusted or virtual workspaces, the file-edit and Bash tools refuse to run.
Web Remote Access uses JWT + bcrypt + rate limiting — see REMOTE-ACCESS.md for the full security model.

Org policy (optional)

Commit a .mrcoder/policy.json to your repo to lock behaviour for contributors:

{
  "forcedPermissionMode": "plan",
  "forcedDisallowedTools": ["Bash"],
  "costCapUsdPerSession": 5,
  "disableCliProviders": false,
  "lockProvider": "claude"
}
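One way to read this file is as a set of overrides layered on top of each contributor's own settings. The sketch below illustrates that reading; the precedence rules, the `UserSettings` shape, and `applyPolicy` are assumptions made for the example, not a description of how the extension actually merges the policy.

```typescript
type PermissionMode = "plan" | "default" | "acceptEdits" | "bypassPermissions";

// Fields taken from the policy.json example above; all optional here.
interface OrgPolicy {
  forcedPermissionMode?: PermissionMode;
  forcedDisallowedTools?: string[];
  costCapUsdPerSession?: number;
  disableCliProviders?: boolean;
  lockProvider?: string;
}

// Hypothetical subset of a contributor's own settings.
interface UserSettings {
  provider: string;
  permissionMode: PermissionMode;
  costCapUsdPerSession: number;
}

// Assumed precedence: any value present in the policy wins over the user's.
function applyPolicy(user: UserSettings, policy: OrgPolicy) {
  return {
    provider: policy.lockProvider ?? user.provider,
    permissionMode: policy.forcedPermissionMode ?? user.permissionMode,
    costCapUsdPerSession: policy.costCapUsdPerSession ?? user.costCapUsdPerSession,
    disallowedTools: policy.forcedDisallowedTools ?? [],
    cliProvidersDisabled: policy.disableCliProviders ?? false,
  };
}

const examplePolicy: OrgPolicy = {
  forcedPermissionMode: "plan",
  forcedDisallowedTools: ["Bash"],
  costCapUsdPerSession: 5,
  disableCliProviders: false,
  lockProvider: "claude",
};

const effective = applyPolicy(
  { provider: "ollama", permissionMode: "acceptEdits", costCapUsdPerSession: 0 },
  examplePolicy,
);
console.log(effective.provider, effective.permissionMode); // prints "claude plan"
```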

Documentation

PROVIDERS.md — every provider in one place: model tables, free-tier caveats, Ollama install + configure walkthrough, switching mid-chat, FAQ.
FEATURES.md — complete catalog of every feature, every slash command, every setting.
COMMANDS.md — every command, keybinding, slash command, and haptic feedback setting with examples.
DOCUMENTATION.md — day-to-day user guide: install, configure providers, attachments, rewind, troubleshooting.
ARCHITECTURE.md — how the extension is built: bundle layout, protocol, per-tab lifecycle, security posture.
ANALYTICS_FEATURES.md — developer analytics dashboard: productivity score, streak tracking, cost intelligence, model ranking, export.
REMOTE-ACCESS.md — web remote access: setup, security model, per-user project access, troubleshooting.
CHANGELOG.md — what changed in each release.

Feedback

Bug, missing feature, weird edge case? File it on the Marketplace page, join the Telegram community, or reach out to the publisher.

Telegram community: https://t.me/+Bj5YpMhXYHwzNzc1 — fastest route for questions, issue reports, and suggestions.
Marketplace: https://marketplace.visualstudio.com/items?itemName=mr-innovations.mr-multi-coder
Publisher: MR Innovations

Legal

MIT licensed — see LICENSE.

Not affiliated with Anthropic. "Claude Code" is Anthropic's trademark. This extension is independent and uses Anthropic's official Agent SDK for the Claude provider. "Kilo Code", "iFlow", and "Qwen" are independent CLIs from their respective authors.
