Xiaomi MiMo for Copilot Chat

Pick MiMo-V2.5-Pro-UltraSpeed, MiMo V2.5 Pro & V2.5 from the Copilot Chat model picker — with thinking mode, vision, and agent tools.

Love Xiaomi MiMo's reasoning capabilities but don't want to leave Copilot Chat? This extension drops MiMo-V2.5-Pro-UltraSpeed, MiMo V2.5 Pro & V2.5 straight into the model selector — with thinking mode, vision (V2.5), tool calling, and your own API key.

Why this extension?

Don't replace Copilot — power it up. No new sidebar, no new chat UI to learn. Just new models in the picker you already use.
Agent mode, tool calling, instructions, MCP, skills — all of it still works. Copilot's entire stack, now running on MiMo.
Prompt caching that actually works. The extension sends prompt_cache_hit_tokens back to the API so MiMo can validate and continue its server-side cache — no warm-up waste, no surprise re-computation.
Reasoning token tracking. Every response logs reasoning_effort feedback and exact reasoning token counts so you can see how much thinking the model invested.
MiMo V2.5 supports vision. Drop screenshots, UI mockups, or diagrams into chat — V2.5 can see and understand them.
MiMo V2.5 Pro excels at deep reasoning. Complex refactors, multi-step debugging, algorithm design — tasks that need serious thinking.
BYOK, pay MiMo directly. Your API key, your bill, your rate limits. Stored in the OS keychain, never on disk.

Features

MiMo-V2.5-Pro-UltraSpeed, MiMo V2.5 Pro & V2.5 in the model picker

All three models show up alongside GPT-4o, Claude, and friends in Copilot Chat's model selector. 917K token context on all three. Switch models mid-chat without losing history.

Prompt Caching with Full Feedback Loop

Most "compatible" extensions blindly forward API responses. This one closes the loop: it reads prompt_tokens_details.cached_tokens from each response and feeds it back to the API on the next request, ensuring MiMo's server-side prompt cache stays warm across multi-turn conversations. The result — dramatically lower costs and latency on long agent sessions.
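As a rough illustration of the read side of that loop, the sketch below pulls the cached-token count out of a response's usage object and derives a hit rate. The `Usage` shape is an assumption built around the `prompt_tokens_details.cached_tokens` field named above, and the helper name `cacheStats` is hypothetical; this is not the extension's actual code.

```typescript
// Assumed shape of the usage object in an OpenAI-compatible chat response.
// Only prompt_tokens_details.cached_tokens is named in this README;
// the other fields are illustrative.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

interface CacheStats {
  hit: number;     // prompt tokens served from the server-side cache
  ratePct: number; // hit / prompt, as a rounded percentage
}

// Hypothetical helper: a missing field is treated as a cache miss.
function cacheStats(usage: Usage): CacheStats {
  const hit = usage.prompt_tokens_details?.cached_tokens ?? 0;
  const ratePct =
    usage.prompt_tokens > 0
      ? Math.round((hit / usage.prompt_tokens) * 100)
      : 0;
  return { hit, ratePct };
}

const firstTurn = cacheStats({
  prompt_tokens: 14973,
  completion_tokens: 130,
  prompt_tokens_details: { cached_tokens: 12288 },
});
console.log(`cache: hit=${firstTurn.hit} rate=${firstTurn.ratePct}%`);
```

Treating an absent `cached_tokens` as zero keeps the logged rate conservative when a response omits the field.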

Real-world cache performance (from actual usage logs):

[02:10:37.592] tokens: prompt=14973 completion=130 | cache: hit=12288 rate=82% | reasoning=83 | chars/tok=3.17
[02:10:43.595] tokens: prompt=15179 completion=230 | cache: hit=14912 rate=98% | reasoning=111 | chars/tok=2.58
[02:10:48.067] tokens: prompt=17110 completion=245 | cache: hit=15168 rate=89% | reasoning=42 | chars/tok=2.23
[02:10:53.388] tokens: prompt=17635 completion=174 | cache: hit=17088 rate=97% | reasoning=77 | chars/tok=1.99
[02:11:06.325] tokens: prompt=18068 completion=521 | cache: hit=17600 rate=97% | reasoning=108 | chars/tok=1.83
[02:13:50.253] tokens: prompt=19077 completion=57  | cache: hit=18048 rate=95% | reasoning=24 | chars/tok=1.75
[02:13:53.348] tokens: prompt=19201 completion=142 | cache: hit=19072 rate=99% | reasoning=23 | chars/tok=1.69

In this session, cache hit rates range from 82% to 99%, and five of the seven turns reach 95% or higher. On those turns the model skips re-reading almost all of the conversation history and processes only the new content.
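If you want a single number for a whole session, log lines in the format shown above can be summed into a token-weighted hit rate. The following is a minimal sketch that assumes the log format stays exactly as printed here; `parseTurn` and `sessionHitRate` are hypothetical names and not part of the extension.

```typescript
interface TurnStats {
  prompt: number;
  completion: number;
  cacheHit: number;
  reasoning: number;
}

// Matches the "tokens: ... | cache: ... | reasoning=..." portion of a log line.
const LINE_RE =
  /prompt=(\d+)\s+completion=(\d+)\s*\|\s*cache: hit=(\d+)\s+rate=\d+%\s*\|\s*reasoning=(\d+)/;

// Returns null for lines that are not token-usage lines.
function parseTurn(line: string): TurnStats | null {
  const m = LINE_RE.exec(line);
  if (!m) return null;
  return {
    prompt: Number(m[1]),
    completion: Number(m[2]),
    cacheHit: Number(m[3]),
    reasoning: Number(m[4]),
  };
}

// Token-weighted hit rate: total cached prompt tokens / total prompt tokens.
function sessionHitRate(lines: string[]): number {
  let prompt = 0;
  let hit = 0;
  for (const line of lines) {
    const turn = parseTurn(line);
    if (turn) {
      prompt += turn.prompt;
      hit += turn.cacheHit;
    }
  }
  return prompt > 0 ? hit / prompt : 0;
}

const sample = [
  "[02:10:37.592] tokens: prompt=14973 completion=130 | cache: hit=12288 rate=82% | reasoning=83 | chars/tok=3.17",
  "[02:13:53.348] tokens: prompt=19201 completion=142 | cache: hit=19072 rate=99% | reasoning=23 | chars/tok=1.69",
];
console.log(`session hit rate: ${(sessionHitRate(sample) * 100).toFixed(1)}%`);
```

Weighting by tokens rather than averaging the per-turn percentages gives large prompts more influence, which is closer to how the savings show up on a bill.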

Thinking Mode with Reasoning Token Visibility

Full support for MiMo's reasoning_content. Watch the model's thought process in real-time as it tackles complex problems. Every response also reports the exact number of reasoning tokens consumed — so you know how much "thinking" each answer cost you.
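To make the split between thinking and answer concrete, here is a minimal sketch of accumulating a streamed response. It assumes an OpenAI-style streaming delta in which `reasoning_content` arrives alongside `content`; the `Delta` type and the `accumulate` helper are hypothetical, and the real wire format may differ.

```typescript
// Assumed shape of one streamed delta. reasoning_content is the field
// named in this README; the rest follows common OpenAI-style streaming.
interface Delta {
  content?: string | null;
  reasoning_content?: string | null;
}

interface Accumulated {
  reasoning: string; // the model's visible thought process
  answer: string;    // the final reply text
}

// Hypothetical helper: routes each fragment to the matching buffer.
function accumulate(deltas: Iterable<Delta>): Accumulated {
  const out: Accumulated = { reasoning: "", answer: "" };
  for (const d of deltas) {
    if (d.reasoning_content) out.reasoning += d.reasoning_content;
    if (d.content) out.answer += d.content;
  }
  return out;
}

const demo = accumulate([
  { reasoning_content: "Check the loop bounds. " },
  { reasoning_content: "Off by one at i <= n." },
  { content: "Change `i <= n` to `i < n`." },
]);
console.log(`reasoning chars=${demo.reasoning.length}`);
console.log(demo.answer);
```

Keeping the two buffers separate is what lets a chat UI render the reasoning in its own collapsible area while the answer streams normally.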

Vision Support (MiMo V2.5)

Drop a screenshot or UI mockup into chat and V2.5 will analyze it directly. Perfect for understanding code screenshots, design specs, and visual debugging.

Inherits Every Copilot Capability

Because this plugs into Copilot's native provider API, you get the full stack for free:

Agent mode — autonomous multi-step tasks
Tool calling — file edits, terminal, workspace search, Git, tests
Instructions & skills — all your .instructions.md, AGENTS.md, and skills just work
Prompt caching stats — MiMo's cache hit rate logged in the output channel so you can see the savings

Multi-Region API Support

Choose the API endpoint that matches your subscription plan:

Default API (no plan)
Token Plan — China
Token Plan — Singapore
Token Plan — Europe (Amsterdam)

Secure by Default

API key lives in VS Code's SecretStorage (OS keychain on macOS / Windows / Linux). Never in settings.json, never in your Git history.
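For readers curious what keychain-backed storage looks like in code, the sketch below shows the general pattern. `SecretStore` is a locally declared subset of the `get`, `store`, and `delete` methods on VS Code's `SecretStorage`, so the example runs without the `vscode` module. The slot name `mimo-copilot.apiKey`, the helper names, and the `sk-example` placeholder key are all hypothetical.

```typescript
// Local subset of vscode.SecretStorage (get / store / delete only).
interface SecretStore {
  get(key: string): PromiseLike<string | undefined>;
  store(key: string, value: string): PromiseLike<void>;
  delete(key: string): PromiseLike<void>;
}

// Hypothetical slot name; the extension's real key name is not documented here.
const API_KEY_SLOT = "mimo-copilot.apiKey";

// Stores a trimmed key; rejects blank input instead of saving it.
async function setApiKey(secrets: SecretStore, raw: string): Promise<boolean> {
  const key = raw.trim();
  if (key.length === 0) return false;
  await secrets.store(API_KEY_SLOT, key);
  return true;
}

async function getApiKey(secrets: SecretStore): Promise<string | undefined> {
  return secrets.get(API_KEY_SLOT);
}

async function clearApiKey(secrets: SecretStore): Promise<void> {
  await secrets.delete(API_KEY_SLOT);
}

// In-memory stand-in so the sketch can run outside VS Code.
class MemorySecrets implements SecretStore {
  private readonly data = new Map<string, string>();
  async get(key: string): Promise<string | undefined> {
    return this.data.get(key);
  }
  async store(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
  async delete(key: string): Promise<void> {
    this.data.delete(key);
  }
}

async function demoSecrets(): Promise<void> {
  const secrets = new MemorySecrets();
  await setApiKey(secrets, "  sk-example  "); // placeholder, not a real key
  console.log(await getApiKey(secrets));
  await clearApiKey(secrets);
  console.log(await getApiKey(secrets));
}
void demoSecrets();
```

Inside an extension, the `secrets` property of the `ExtensionContext` passed to `activate` provides these three methods, so it can be passed where `MemorySecrets` is used above.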

Zero Runtime Dependencies

Pure VS Code API + Node.js built-ins. No Python, no Docker, no local proxy server to babysit.

Getting Started

Prerequisites

VS Code 1.116 or later. This extension relies on non-public Copilot Chat APIs that may break on newer VS Code versions — report an issue if you hit one.
GitHub Copilot subscription (Free / Pro / Enterprise — the free tier works)
MiMo API key from platform.xiaomimimo.com

Usage

Install from the VS Code Marketplace
Run MiMo: Set API Key from the Command Palette (Ctrl+Shift+P, or Cmd+Shift+P on macOS)
Paste your API key
Open Copilot Chat, click the model picker, pick MiMo-V2.5-Pro-UltraSpeed, MiMo V2.5 Pro, or MiMo V2.5
That's it — chat away

Models

Model	Best For	Vision
MiMo-V2.5-Pro-UltraSpeed	Fast Pro reasoning for latency-sensitive agent tasks	❌
MiMo V2.5 Pro	Complex refactors, agent tasks, deep reasoning, algorithm design	❌
MiMo V2.5	Fast everyday coding, quick edits, image analysis, UI understanding	✅

All three support thinking mode, tool calling, and 917K token context.

Settings

Setting	Default	Description
`mimo-copilot.baseUrl`	`https://api.xiaomimimo.com/v1`	API endpoint — select a preset or enter a custom URL
`mimo-copilot.maxTokens`	`0`	Max output tokens (`0` = API default, capped at 131072). Useful for cost control
`mimo-copilot.modelIdOverrides`	`{...}`	Override API model IDs — only needed for compatible third-party APIs
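One plausible reading of the `mimo-copilot.maxTokens` rule in the table is sketched below: `0` means the request omits the limit so the API default applies, and any positive value is clamped to 131072. The helper name `resolveMaxTokens` is hypothetical, and treating negative or non-finite values like `0` is an assumption rather than documented behavior.

```typescript
// Cap stated in the settings table.
const MAX_OUTPUT_CAP = 131072;

// Returns undefined when the request should omit the limit and let the
// API apply its own default. Assumption: invalid values behave like 0.
function resolveMaxTokens(setting: number): number | undefined {
  if (!Number.isFinite(setting) || setting <= 0) return undefined;
  return Math.min(Math.floor(setting), MAX_OUTPUT_CAP);
}

for (const value of [0, 4096, 500000]) {
  console.log(`${value} -> ${resolveMaxTokens(value) ?? "API default"}`);
}
```

Returning `undefined` rather than a sentinel number makes it easy to leave the field out of the request body entirely.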

Commands

Command	Description
`MiMo: Set API Key`	Configure your MiMo API key
`MiMo: Get API Key`	Open the MiMo platform to get an API key
`MiMo: Clear API Key`	Remove the stored API key
`MiMo: Open Settings`	Open MiMo extension settings
`MiMo: Show Logs`	View extension logs for debugging

License

MIT

Sdcb AI
