OIOXO — Private On-Device Coding AI & Token Saver

OIOXO | 1 install | (0) | Free
Coding intelligence that runs on your machine, not in someone else's cloud — and a context engine that cuts up to 90% of the tokens your AI tools burn. Your code never leaves the device.
Installation
Launch VS Code Quick Open (Ctrl+P), paste the following command, and press Enter.

Your code is your most valuable asset. OIOXO is built on one principle: it stays on your machine.

OIOXO brings coding intelligence to VS Code without the cloud in the middle — a model that downloads once, encrypted, and runs on this device, grounded in your actual codebase. And for the AI tools you already pay for, OIOXO becomes a token saver: it cuts what they read and what they write, so your subscriptions and API keys go several times further.


Private, local coding — the default

  • The model runs here. OIOXO's own models (0.4–4.7 GB) are matched to your machine's real capabilities — memory, free RAM, GPU — download once encrypted, unlock per session, and run locally. Your code, your questions, and the answers never leave the device.
  • Grounded in your project, verifiably. Every answer is built on the OIOXO context engine — the minimal relevant slice of your codebase, found by lexical search and the import graph. Replies show exactly which files they were grounded in.
  • Made for working code. Streamed answers with proper code blocks — Copy, Insert at cursor, New file — and right-click Explain / Improve / Add selection from any editor. Ctrl+Alt+O opens the panel.
  • Your standards travel with the repo. .oioxo/rules.md and .oioxo/memory.md are honored in every conversation.
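
The grounding step described above (lexical search plus the import graph) can be pictured with a small sketch. This is not OIOXO's engine: the `SourceFile` shape, the scoring rule, and the one-hop import expansion are all assumptions chosen to keep the idea visible.

```typescript
// Hypothetical sketch of "lexical search + import graph" context selection.
// None of these names come from OIOXO; they exist only for illustration.
type SourceFile = { path: string; text: string; imports: string[] };

// Crude lexical score: how many distinct query terms occur in the file text.
function lexicalScore(file: SourceFile, terms: string[]): number {
  const haystack = file.text.toLowerCase();
  return terms.filter((term) => haystack.includes(term)).length;
}

// Take the best lexical matches as seeds, then add the files they import directly.
function selectSlice(files: SourceFile[], query: string, maxSeeds = 2): string[] {
  const terms = query.toLowerCase().split(/\W+/).filter((term) => term.length > 2);
  const seeds = files
    .map((file) => ({ file, score: lexicalScore(file, terms) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxSeeds)
    .map((entry) => entry.file);

  const known = new Set(files.map((file) => file.path));
  const slice = new Set<string>();
  for (const seed of seeds) {
    slice.add(seed.path);
    for (const dep of seed.imports) {
      if (known.has(dep)) slice.add(dep); // one hop along the import graph
    }
  }
  return Array.from(slice);
}

const project: SourceFile[] = [
  { path: "src/auth.ts", text: "export function login(user) { return hashPassword(user) }", imports: ["src/crypto.ts"] },
  { path: "src/crypto.ts", text: "export function hashPassword(u) { /* ... */ }", imports: [] },
  { path: "src/billing.ts", text: "export function charge(card) {}", imports: [] },
];

// The returned paths are what a reply could list as "grounded in".
console.log(selectSlice(project, "How does login work?"));
```

Here the billing file never reaches the model, which is the point of a minimal slice: the question matches one file lexically and pulls in only that file's direct import.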

The token saver — for everything else you use

Most teams already pay for Copilot, Claude Code, Cursor, or raw API keys. OIOXO makes all of them cheaper, in three independent ways:

  1. Minimal context for your agents. Copilot, Claude Code, Cursor, Windsurf, Gemini CLI and Codex are wired automatically the moment you open a folder. Instead of reading whole files, they receive only the relevant slice. Measured on a real production codebase, that is 90–92% fewer context tokens per question.
  2. Leaner requests and replies on your own keys. When you route OIOXO through a key you own, prompts are compressed before they leave (duplicate and stale context removed) and replies are kept lean — output tokens, the expensive ones, drop sharply. Three professional levels: Disabled · Balanced · Aggressive.
  3. Local-first answers. Questions can be answered by the on-device model first, escalating to your paid model only when genuinely needed — every non-escalated question is an external call that never happened.
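
As a rough illustration of the second point, removing duplicate and stale context before a prompt leaves the machine could look like the sketch below. The `ContextChunk` shape and the rule "an older snapshot of the same path is stale" are assumptions, not OIOXO's documented behavior.

```typescript
// Hypothetical sketch of prompt compression by dropping stale and duplicate context.
type ContextChunk = { path: string; capturedAt: number; text: string };

function compressContext(chunks: ContextChunk[]): ContextChunk[] {
  // Keep only the newest snapshot of each path; older snapshots count as stale.
  const newestByPath = new Map<string, ContextChunk>();
  for (const chunk of chunks) {
    const current = newestByPath.get(chunk.path);
    if (!current || chunk.capturedAt > current.capturedAt) {
      newestByPath.set(chunk.path, chunk);
    }
  }

  // Drop any chunk whose text exactly repeats one that was already kept.
  const seenText = new Set<string>();
  const kept: ContextChunk[] = [];
  newestByPath.forEach((chunk) => {
    if (!seenText.has(chunk.text)) {
      seenText.add(chunk.text);
      kept.push(chunk);
    }
  });
  return kept;
}

const chunks: ContextChunk[] = [
  { path: "a.ts", capturedAt: 1, text: "const a = 1" },      // stale: superseded below
  { path: "a.ts", capturedAt: 2, text: "const a = 2" },
  { path: "copy/a.ts", capturedAt: 2, text: "const a = 2" }, // duplicate text
  { path: "b.ts", capturedAt: 1, text: "const b = 1" },
];

console.log(compressContext(chunks).map((chunk) => chunk.path));
```

A real implementation would likely work on token counts and fuzzy overlap rather than exact string equality; exact matching is used here only because it is easy to verify.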

None of this token trimming ever applies to OIOXO's own models; they always run at full quality. The status bar keeps honest score, for example: OIOXO · 41k saved.
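
The local-first routing and the status bar counter can be sketched together. Everything here is hypothetical: the `confident` flag standing in for "genuinely needed", the token estimate, and the rounding used for the label are assumptions made for the example, not OIOXO's actual accounting.

```typescript
// Hypothetical sketch: answer locally first, escalate only when the local draft is not confident.
type Answer = { text: string; confident: boolean };
type Model = (question: string) => Answer;

function createRouter(local: Model, paid: Model, estimateTokens: (question: string) => number) {
  let savedTokens = 0;
  return {
    ask(question: string): Answer {
      const draft = local(question);
      if (draft.confident) {
        // No external call was made, so count the tokens it would have cost.
        savedTokens += estimateTokens(question);
        return draft;
      }
      return paid(question); // escalation to the paid model
    },
    statusLabel(): string {
      const shown = savedTokens >= 1000 ? `${Math.round(savedTokens / 1000)}k` : String(savedTokens);
      return `OIOXO · ${shown} saved`;
    },
  };
}

// Stub models: the "local" one is only confident on short questions.
const router = createRouter(
  (question) => ({ text: "local: " + question, confident: question.length < 40 }),
  (question) => ({ text: "paid: " + question, confident: true }),
  (question) => Math.ceil(question.length / 4), // rough characters-per-token guess
);

router.ask("What does parseConfig return?");
router.ask("Design a migration plan for moving the billing service to a new schema");
console.log(router.statusLabel());
```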

When you want a frontier model

Bring your own key — OpenAI, Anthropic, Google, Groq, Mistral, xAI, DeepSeek, OpenRouter, Together, Fireworks, Cerebras, a local Ollama, or any OpenAI-compatible endpoint. Keys are held in VS Code secret storage and travel only to the provider you chose; OIOXO never sees them and never proxies your traffic.
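
A minimal sketch of that flow, assuming an OpenAI-compatible chat completions endpoint: the key is read from a store shaped like the `get`/`store` part of VS Code's `SecretStorage`, and the request goes straight to the chosen provider's base URL. The `Provider` type, the `apiKey.<id>` naming, and the injected `send` function are assumptions for illustration; the model name is a placeholder.

```typescript
// Minimal shape of the part of VS Code's SecretStorage (context.secrets) used here.
interface SecretStore {
  get(key: string): PromiseLike<string | undefined>;
  store(key: string, value: string): PromiseLike<void>;
}

// Hypothetical provider description; not an OIOXO type.
type Provider = { id: string; baseUrl: string; model: string };
type RequestInitLike = { method: string; headers: Record<string, string>; body: string };
type HttpSend = (url: string, init: RequestInitLike) => PromiseLike<{ json(): PromiseLike<unknown> }>;

// Build an OpenAI-compatible chat request aimed directly at the provider.
function buildChatRequest(provider: Provider, apiKey: string, prompt: string) {
  const init: RequestInitLike = {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: "Bearer " + apiKey },
    body: JSON.stringify({ model: provider.model, messages: [{ role: "user", content: prompt }] }),
  };
  return { url: provider.baseUrl.replace(/\/+$/, "") + "/chat/completions", init };
}

// Read the key from secret storage and call the provider; nothing passes through a proxy.
// In Node 18+ or a VS Code extension host, the global fetch could be passed as `send`.
async function askProvider(secrets: SecretStore, provider: Provider, prompt: string, send: HttpSend): Promise<string> {
  const apiKey = await secrets.get("apiKey." + provider.id);
  if (!apiKey) throw new Error("No key stored for " + provider.id);
  const { url, init } = buildChatRequest(provider, apiKey, prompt);
  const response = await send(url, init);
  const data = (await response.json()) as { choices: { message: { content: string } }[] };
  return data.choices[0].message.content;
}

// Example: a local Ollama server exposing its OpenAI-compatible API.
const ollama: Provider = { id: "ollama", baseUrl: "http://localhost:11434/v1", model: "llama3" };
console.log(buildChatRequest(ollama, "placeholder-key", "Explain this function").url);
```

Because the URL is derived only from the provider's own base URL, the key in the `Authorization` header has exactly one destination.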

Private by architecture, not by promise

  • The index, the grounding, and the on-device models run 100% locally.
  • Models ship encrypted and unlock per session — at rest, they are inert data.
  • The only thing OIOXO's servers ever receive is a number: how many tokens you saved.

Pricing

  • Free — the on-device models, the assistant, your own keys, and a generous monthly saved-tokens allowance.
  • OIOXO Pro — $3.99/mo — unlimited token-saving across every tool, every project, every machine. → oioxo.com

Requirements

  • VS Code 1.85+ · Node.js 18+ (the agent-facing engine runs via npx oioxo-mcp)
  • On-device model: ~1 GB free RAM for the smallest tier; larger tiers are offered when your machine genuinely fits them
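
Matching a model tier to the machine can be pictured as picking the largest tier whose memory requirement fits. Only the 0.4 GB and 4.7 GB sizes and the ~1 GB figure for the smallest tier come from this page; the tier names, the middle tier, and the other RAM thresholds below are invented for the sketch.

```typescript
// Hypothetical tier table. Only 0.4 GB, 4.7 GB and the ~1 GB RAM figure appear in the listing.
type Tier = { name: string; downloadGb: number; minFreeRamGb: number };

const TIERS: Tier[] = [
  { name: "small", downloadGb: 0.4, minFreeRamGb: 1 },
  { name: "medium", downloadGb: 2.0, minFreeRamGb: 4 }, // invented middle tier
  { name: "large", downloadGb: 4.7, minFreeRamGb: 8 },  // invented RAM threshold
];

// Return the most demanding tier that still fits in free RAM, or undefined if none fits.
function pickTier(freeRamGb: number, tiers: Tier[] = TIERS): Tier | undefined {
  return tiers
    .filter((tier) => tier.minFreeRamGb <= freeRamGb)
    .sort((a, b) => b.minFreeRamGb - a.minFreeRamGb)[0];
}

// In Node.js, free RAM in GB could be obtained as os.freemem() / 1024 ** 3.
console.log(pickTier(1.5)?.name);
console.log(pickTier(16)?.name);
```

A fuller version would also weigh GPU availability, which the page lists among the capabilities considered; it is left out here to keep the selection rule easy to follow.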

From the makers of the OIOXO IDE — the editor that runs its models in your browser.
