`MOCHI`

local-first · multi-agent · memory-aware

A local-first AI coding agent for VS Code with workspace tools, session memory, and approval-aware edits.

Mochi is an experimental VS Code extension that brings an OpenAI Agents SDK runtime into a local editor chat panel. It can inspect your workspace, reason over the active editor context, edit files, run approved commands, and preserve lightweight memory across sessions.

The project is intentionally local-first. Runtime state, memory, and traces are stored on your machine, while workspace tools operate only against the selected local folder.

Demo

https://github.com/user-attachments/assets/01e88781-2600-4f24-8cb4-a177271787ab

If the video does not render in your Markdown viewer, open media/video-demo.mp4 directly.

Features

Editor-native chat panel with streamed assistant replies.
Multiple chat sessions with independent history and input drafts.
Workspace tools for listing files, reading files, writing files, creating directories, and applying focused edits.
Command execution with explicit in-chat approval before running local commands.
Approval cards for destructive actions such as deleting files, deleting directories, or clearing existing file content.
Current-window memory for continuing work across turns without treating every message as a new durable fact.
Rolling session summaries that compact older history while keeping recent turns available for context.
Workspace memory that records detected project facts and suggested verification commands.
Project instruction loading from MOCHI.md, AGENTS.md, and CLAUDE.md style files when present.
Memory snapshots, run traces, MemoryCommit records, and memory events for debugging what Mochi remembered, skipped, archived, and changed.
Memory controls for viewing local memory, enabling private current-window mode, isolating a window from other sessions, disabling persistent memory reads, and clearing current or all local memory.
Private mode is exposed directly in the chat panel as a current-window toggle; the slash menu stays small and only exposes high-frequency shortcuts.
Long-Term Memory records are stored locally, including non-private window_archive records created when a window is archived and deleted.
Root-agent orchestration with delegated subagents for repository guidance, coding, plan review, and code review.
Role-specific tool permissions so exploratory and review agents stay read-only while the coding agent can edit.
Lightweight local skills that inject task-specific workflow guidance only when relevant.
Markdown rendering in assistant replies, including headings, lists, code blocks, inline code, links, and quotes.
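
As an illustration of the rolling session summaries listed above, the following is a minimal sketch, not Mochi's actual implementation. The names `compactHistory` and `naiveSummarize` and the `{ summary, turns }` state shape are assumptions made for this example; a real summarizer would presumably call the model rather than truncate text.

```javascript
// Hypothetical sketch: keep the last `keepRecent` turns verbatim and fold
// everything older into a running summary string.
function compactHistory(state, keepRecent, summarize) {
  const { summary, turns } = state;
  if (turns.length <= keepRecent) return state; // nothing old enough to compact
  const older = turns.slice(0, turns.length - keepRecent);
  const recent = turns.slice(turns.length - keepRecent);
  return { summary: summarize(summary, older), turns: recent };
}

// Placeholder summarizer (assumption): one truncated line per older turn.
function naiveSummarize(previousSummary, olderTurns) {
  const lines = olderTurns.map((turn) => `${turn.role}: ${turn.text.slice(0, 60)}`);
  return [previousSummary, ...lines].filter(Boolean).join("\n");
}

const exampleState = {
  summary: "",
  turns: [
    { role: "user", text: "Rename the config loader." },
    { role: "assistant", text: "Renamed loadConfig to readConfig." },
    { role: "user", text: "Now update the tests." },
  ],
};
console.log(compactHistory(exampleState, 2, naiveSummarize));
```

Keeping the most recent turns untouched means the agent still sees exact wording for the work in progress, while older turns only cost a few summary lines.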

Requirements

VS Code 1.90.0 or newer.
An OpenAI API key or a Google AI Studio Gemini API key.
Node.js and npm for local development from source.

Marketplace users can configure model credentials from VS Code:

Mochi: Configure Model Credentials

Mochi stores the API key in VS Code Secret Storage and stores non-sensitive model settings in VS Code Settings.

For local development or advanced setup, Mochi can also read model provider configuration from your shell environment or from ~/.openai-env. The setup script supports OpenAI and Gemini through an OpenAI-compatible endpoint:

export MOCHI_MODEL_PROVIDER="openai"
export OPENAI_API_KEY="sk-..."
export MOCHI_OPENAI_API_KEY="sk-..."
export OPENAI_BASE_URL="https://api.openai.com/v1"
export OPENAI_MODEL="gpt-4.1-mini"
export OPENAI_API_FORMAT="chat_completions"

The runtime also accepts plain .env-style lines such as OPENAI_API_KEY="sk-...", which makes the same file work on Windows, macOS, and Linux.
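
To show why one file can serve both shell `export` lines and plain .env-style lines, here is a minimal parsing sketch. It is an assumption about how such a file could be read, not the runtime's actual parser; `parseEnvText` is a hypothetical name.

```javascript
// Hypothetical helper: parse env-file text, tolerating an optional `export ` prefix,
// surrounding quotes, blank lines, and `#` comments.
function parseEnvText(text) {
  const result = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$/.exec(line);
    if (!match) continue; // skip anything that is not KEY=value
    let value = match[2].trim();
    const quote = value[0];
    if (value.length >= 2 && (quote === '"' || quote === "'") && value.endsWith(quote)) {
      value = value.slice(1, -1); // drop one layer of matching quotes
    }
    result[match[1]] = value;
  }
  return result;
}

const sampleEnvText = [
  'export MOCHI_MODEL_PROVIDER="openai"',
  'OPENAI_MODEL="gpt-4.1-mini"',
  "# comments and blank lines are ignored",
].join("\n");
console.log(parseEnvText(sampleEnvText));
```

Because the parser never invokes a shell, the same file content behaves identically on Windows, macOS, and Linux.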

Quick Start

Install Mochi from the VS Code Marketplace:

https://marketplace.visualstudio.com/items?itemName=zee.mochi-local-agent

Configure model credentials:

Mochi: Configure Model Credentials

If you open Mochi before configuring credentials, it will prompt you to configure them. If you try to send a message without an API key, Mochi will show the same configuration prompt again.

Then open the Command Palette in VS Code and run:

Local Agent: Open Chat

For local development, install JavaScript dependencies:

npm install

Then configure model credentials:

npm run setup:model

Alternative setup helpers:

Windows, macOS, Linux: node ./scripts/setup_model.js
macOS, Linux shells only: ./scripts/setup_model.sh

If you do not need a local proxy, choose n when the setup script asks about proxy configuration. Mochi reads ~/.openai-env directly at runtime. On Windows you usually only need to restart the Extension Development Host after setup; no source step is required.

Start the extension from source:

Open this repository in VS Code.
Press F5.
Select Run Local Agent Extension if VS Code asks for a launch target.
In the Extension Development Host window, run Local Agent: Open Chat.
Send ping or ask Mochi to inspect the workspace.
If you want Mochi to work on a different folder, run Local Agent: Select Workspace Folder from the Extension Development Host.

Usage

Open the Mochi panel from the Command Palette:

Local Agent: Open Chat

Mochi can answer questions, inspect files, make workspace edits, and use the active editor selection as context. When an action is risky, the chat panel shows an approval card before the runtime proceeds.
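
The approval flow described above can be pictured as a gate in front of tool execution. The sketch below is illustrative only: the action kinds, `runWithApproval`, and the `requestApproval` callback (standing in for the chat panel's approval card) are hypothetical names, not Mochi's API.

```javascript
// Hypothetical action kinds that need approval, based on the risky actions this README lists.
const RISKY_ACTIONS = new Set(["run_command", "delete_file", "delete_directory", "clear_file"]);

// `requestApproval` stands in for the approval card; it resolves to true or false.
// `execute` performs the action and may throw.
async function runWithApproval(action, execute, requestApproval) {
  if (RISKY_ACTIONS.has(action.kind)) {
    const approved = await requestApproval(action);
    if (!approved) return { status: "denied", action };
  }
  try {
    return { status: "success", action, result: await execute(action) };
  } catch (error) {
    return { status: "failure", action, error: String(error) };
  }
}

// Usage: a denied deletion never reaches `execute`.
runWithApproval(
  { kind: "delete_file", path: "old.txt" },
  async () => "deleted",
  async () => false
).then((outcome) => console.log(outcome.status)); // logs "denied"
```

Returning a structured outcome instead of throwing on denial lets the caller tell success, failure, and denial apart.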

For complex work, Mochi may delegate bounded subtasks to specialized subagents. Delegation remains visible in run traces, and subagents receive selected memory and skills rather than unrestricted long-term memory.
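
One way to picture role-specific tool permissions for delegated subagents is a lookup from role to allowed tools. This is a sketch under assumptions: the role names and tool names below are invented for illustration and may not match Mochi's real registry.

```javascript
// Hypothetical tool names (assumptions, not Mochi's actual tool identifiers).
const READ_TOOLS = ["list_files", "read_file"];
const WRITE_TOOLS = ["write_file", "create_directory", "apply_edit"];

// Hypothetical role names: exploratory and review roles are read-only, the coder can edit.
const ROLE_TOOLS = new Map([
  ["repo_guide", READ_TOOLS],
  ["plan_reviewer", READ_TOOLS],
  ["code_reviewer", READ_TOOLS],
  ["coder", [...READ_TOOLS, ...WRITE_TOOLS]],
]);

function toolsForRole(role) {
  const tools = ROLE_TOOLS.get(role);
  if (!tools) throw new Error(`Unknown role: ${role}`);
  return [...tools]; // copy so callers cannot mutate the registry
}

function canUseTool(role, tool) {
  return toolsForRole(role).includes(tool);
}

console.log(canUseTool("coder", "write_file")); // true
console.log(canUseTool("code_reviewer", "write_file")); // false
```

Rejecting unknown roles outright, rather than defaulting to some tool set, keeps a typo from silently granting or removing permissions.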

Memory is user-visible and controllable. The product model has exactly three layers: Current Window Memory, Long-Term Memory, and Runtime Trace. Current Window Memory keeps the active chat coherent; Long-Term Memory stores durable local records such as window_archive; Runtime Trace records evidence about tools, approvals, commands, subagents, and failures.

The chat panel has a Private toggle that works like a browser's private window. A Private window keeps its own current-window context while it is open, but it does not read saved memory from other windows and does not archive into Long-Term Memory. Deleting the artifacts of a non-private window first archives the safe current-window context into a local Long-Term Memory record with kind: "window_archive", then deletes the window artifacts. Deleting a Private window records a blocked memory event instead.
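
The archive-or-block rule for window deletion can be sketched as a single branch. Only the record kind "window_archive" comes from this README; the function name, the event type strings, and the `win` and `store` shapes are assumptions for illustration.

```javascript
// Hypothetical shapes: `win` is { id, private, summary }, `store` is { records: [], events: [] }.
function deleteWindowArtifacts(win, store, now = new Date()) {
  const timestamp = now.toISOString();
  if (win.private) {
    // Private windows never write to Long-Term Memory; only a blocked event is recorded.
    store.events.push({ type: "archive_blocked", windowId: win.id, reason: "private", timestamp });
    return { archived: false };
  }
  const record = { kind: "window_archive", windowId: win.id, summary: win.summary, timestamp };
  store.records.push(record);
  store.events.push({ type: "archive_created", windowId: win.id, timestamp });
  return { archived: true, record };
}

const exampleStore = { records: [], events: [] };
deleteWindowArtifacts({ id: "w1", private: false, summary: "Refactored the loader." }, exampleStore);
deleteWindowArtifacts({ id: "w2", private: true, summary: "Scratch work." }, exampleStore);
console.log(exampleStore.records.length, exampleStore.events.map((event) => event.type));
```

In this sketch both paths leave a memory event behind, so a later inspection can see that a decision was made even when nothing was archived.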

Run Mochi: Open Memory Controls to inspect the current memory state. Use Mochi: Delete Current Window Artifacts to archive and then delete a non-private window, or to discard a Private window without archiving it. The older granular controls remain available from Memory Controls and the Command Palette during the transition.

Internally, the current implementation still keeps separate stores for session, task-like working state, workspace, user, long-term memory, and memory events. The memory model is documented in doc/memory-model.md and doc/memory-model.zh.md, which define the three layers, current storage locations, lifecycle, archive/delete rules, Private mode, and Memory Controller boundaries. User preference, project fact/convention, decision, and window archive are Long-Term Memory record kinds, not separate layers. doc/memory-v2.md tracks the JSON-backed implementation plan and its current status.

The current workflow is optimized for local development:

Open a workspace folder.
Start Mochi from the Extension Development Host.
Ask for a code change, explanation, or project review.
Review any approval cards for command execution or destructive file operations.
Use memory snapshots when you want to inspect what Mochi stored.

Commands

| Command | Purpose |
| --- | --- |
| `Local Agent: Open Chat` | Open or focus the Mochi chat panel. |
| `Local Agent: Quick Ask` | Send a quick prompt to Mochi. |
| `Local Agent: Ask About Selection` | Prefill chat with the current editor selection. |
| `Local Agent: Replace Selection With Last Reply` | Insert the latest assistant reply into the active editor selection. |
| `Local Agent: Select Workspace Folder` | Choose which folder Mochi should treat as the active workspace. |
| `Local Agent: Open Memory Snapshot` | Open a compact memory and trace snapshot. |
| `Local Agent: Open Raw Memory Snapshot` | Open the raw stored memory snapshot. |
| `Mochi: Open Memory Controls` | Inspect current memory state and available memory commands. |
| `Mochi: Toggle Current Window Private Mode` | Prevent reading saved memory and memory from other sessions in this window. |
| `Mochi: Toggle Current Window Memory Isolation` | Prevent or allow reading memory from other sessions. |
| `Mochi: Toggle Current Window Persistent Memory Reads` | Prevent or allow reading persistent memory in the current window. |
| `Mochi: Delete Current Window Artifacts` | Archive and delete a non-private window's chat, working state, trace, and routing artifacts; Private windows are discarded without archive. |
| `Mochi: Clear Current Window Memory` | Clear summaries, working state, traces, and routing memory for the current window while keeping chat messages. |
| `Mochi: Clear Current Session Summary Memory` | Clear the current session summary and compaction memory. |
| `Mochi: Clear Current Window Working State` | Transitional internal command for clearing current working-state records. |
| `Mochi: Clear Current Workspace Memory` | Clear detected workspace facts and verification hints. |
| `Mochi: Clear User Memory` | Clear saved user preferences. |
| `Mochi: Clear Current Trace Memory` | Clear latest run trace and routing state. |
| `Mochi: Clear All Local Memory` | Clear all local memory categories while keeping chat sessions and messages. |

For the full command reference, see doc/commands-and-capabilities.md.

Slash Shortcuts

The chat input supports a small / shortcut menu. It is intentionally not a duplicate of the full command palette.

Current shortcuts:

/help
/new
/memory
/clear-private-window
/model

Safety Model

Mochi treats the workspace as shared state:

File mutations are serialized per target path.
Writes refuse stale edits when a file changed after Mochi read it.
Destructive file actions require approval.
Local command execution requires approval.
Tool results are recorded so Mochi can distinguish success, failure, denial, and skipped work.
Run traces capture tool calls, approvals, changed paths, command evidence, and verification status.
Current-window isolation prevents cross-session memory recall.
Persistent memory reads can be disabled per current window.
Private mode blocks persistent memory reads, cross-session recall, and Long-Term Memory archive writes for the current window.
Memory events record completed, skipped, and blocked memory decisions such as non-private archive creation or Private archive blocking.
Users can clear the current window's memory or all local Mochi memory.

This makes the extension useful for real local work while keeping potentially surprising actions visible.
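
Two of the rules above, per-path serialization and stale-edit refusal, can be sketched together. This is a minimal illustration under stated assumptions, not Mochi's implementation: it uses an in-memory `files` map instead of the real workspace, a promise chain per path as the queue, and a SHA-256 content hash as the staleness check. All function names are hypothetical.

```javascript
const crypto = require("node:crypto");

function hashContent(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// One promise chain per path: each new task waits for the previous one on that path.
const pathQueues = new Map();

function serializeByPath(path, task) {
  const previous = pathQueues.get(path) ?? Promise.resolve();
  const run = previous.catch(() => {}).then(() => task()); // a failed task does not block the queue
  pathQueues.set(path, run);
  return run;
}

// In-memory stand-in for workspace files (assumption for this sketch).
const files = new Map([["notes.txt", "v1"]]);

function readFileSnapshot(path) {
  const text = files.get(path) ?? "";
  return { text, hash: hashContent(text) };
}

// Write only if the file still matches what the caller read earlier.
function writeIfFresh(path, expectedHash, newText) {
  return serializeByPath(path, async () => {
    const current = files.get(path) ?? "";
    if (hashContent(current) !== expectedHash) return { status: "stale", path };
    files.set(path, newText);
    return { status: "written", path };
  });
}

// Usage: two edits based on the same read; the second is refused as stale.
const snapshot = readFileSnapshot("notes.txt");
Promise.all([
  writeIfFresh("notes.txt", snapshot.hash, "v2"),
  writeIfFresh("notes.txt", snapshot.hash, "v3"),
]).then((results) => console.log(results.map((result) => result.status)));
```

Serializing per path rather than globally lets edits to unrelated files proceed concurrently while still preventing two writers from interleaving on the same file.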

Project Structure

src/extension/   VS Code activation, commands, webview UI, and chat controller
src/runtime/     OpenAI Agents SDK runtime, tools, prompts, memory, and tracing
scripts/         Model provider setup helper
doc/             Architecture notes, feature notes, roadmap, and command reference
media/           Extension and README assets

Development

Use the VS Code launch configuration for the main extension path:

.vscode/launch.json -> Run Local Agent Extension

The JavaScript runtime is the only product runtime path. Use the launch configuration above for local development and testing.

Documentation

doc/current-features-and-usage.md
doc/current-architecture.md
doc/memory-model.md
doc/memory-model.zh.md
doc/memory-v2.md
doc/commands-and-capabilities.md
doc/roadmap.md
doc/development-log.md
doc/ultimate-goal.md

License

Mochi is released under the MIT License. See LICENSE for details.

Security

Do not commit real API keys.
Store local credentials in ~/.openai-env or an ignored .env file.
Rotate any key that was exposed.
Review approval cards before allowing file deletion, directory deletion, file clearing, or command execution.