On-Device Model

A chat sidebar that talks directly to Apple's on-device language model - free, private, offline, and entirely contained in your editor.
macOS 26+, Apple Silicon only. Requires Apple Intelligence to be enabled in System Settings.
Install
Open the Extensions view in VS Code (Ctrl/Cmd+Shift+X), search for On-Device Model, hit Install - or grab it from the Marketplace.
An On-Device Model icon then turns up in the Activity Bar. Click it for the chat sidebar.
Using it
Type a prompt into the box at the bottom, press Enter to send (Shift+Enter for a newline), and the model's reply streams into the transcript above. Replies render as Markdown so code, lists, and tables come out formatted. Click the + in the view's title bar to start a fresh conversation, or the red Stop button while a reply is streaming to interrupt the model.
Three toggles sit just above the input. They persist across reloads and workspaces.
- Selection chip - shows the character count of your current selection in the active editor. When on, the selected text (not the whole file) is attached to your next prompt, so the model can answer questions about it or propose a replacement. The selection range is captured at submit time, so the applied edit still lands in the right place even if you move your cursor while the model is responding.
- Auto-apply - when on, proposed edits get written to disk as soon as the model finishes. Off by default; with it off you get an Apply / Reject card on each edit, and an Undo link after applying.
- Web search - lets the model decide for itself when to look something up on DuckDuckGo for current information. Each search shows up in the bubble with the query the model chose; expanding it reveals the summary the search returned to the model.
The sidebar tells you when the model isn't available - if Apple Intelligence is off, the model is still downloading, or the device isn't eligible.
Honest expectations
Apple's on-device model is around 3 billion parameters and trained as a general-purpose assistant - not for code. It's great for short tasks (summarise this paragraph, suggest a variable name, draft a commit message, explain a regex, refactor variables in one file) and useless for anything that needs deep code reasoning or a long context window. For real coding help, you still want a frontier model.
What it gives you:
- Free. No API key, no quota.
- Private. Prompts never leave your Mac, except when you opt in to web search.
- Offline. Works on the train (with web search off).
- Snappy. First-token latency is excellent on Apple Silicon.
How it works
The extension spawns a small Swift CLI bundled inside the .vsix. The CLI imports FoundationModels, owns one LanguageModelSession, and streams the output of respond back over stdout as line-delimited JSON. The TypeScript side parses each line into a webview message; the React sidebar folds streaming chunks into the latest assistant message in real time.
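The parsing and folding steps can be sketched roughly as below. This is a minimal illustration, not the extension's actual code: the event shape (CliEvent with "chunk" and "done"), the LineDecoder class, and the foldChunk helper are all hypothetical names, and it assumes each chunk carries only the newly generated text rather than a cumulative snapshot.

```typescript
// Hypothetical event shapes; the real CLI's JSON schema may differ.
type CliEvent =
  | { type: "chunk"; text: string }
  | { type: "done" };

interface ChatMessage {
  role: "user" | "assistant";
  text: string;
}

// Buffers raw stdout text and returns one parsed event per complete line.
class LineDecoder {
  private buffer = "";

  push(data: string): CliEvent[] {
    this.buffer += data;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""), so keep it for the next push.
    this.buffer = lines.pop() ?? "";
    const events: CliEvent[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        events.push(JSON.parse(trimmed) as CliEvent);
      } catch {
        // Skip malformed lines rather than crash the sidebar.
      }
    }
    return events;
  }
}

// Appends a chunk to the latest assistant message, or starts a new one.
// Assumes chunks are deltas, not cumulative snapshots.
function foldChunk(transcript: ChatMessage[], event: CliEvent): ChatMessage[] {
  if (event.type !== "chunk") return transcript;
  const last = transcript[transcript.length - 1];
  if (last !== undefined && last.role === "assistant") {
    return [...transcript.slice(0, -1), { ...last, text: last.text + event.text }];
  }
  return [...transcript, { role: "assistant", text: event.text }];
}
```

Buffering the trailing partial line matters because stdout data arrives in arbitrary pieces, so a single JSON object can be split across two reads. Returning a new array from foldChunk keeps the update friendly to React state, which compares by reference.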
The web-search tool is registered on the session via Apple's Tool protocol with a @Generable argument schema, so the model decides when to call it and emits a grammar-constrained query. Inside the tool, DuckDuckGo's Instant Answer JSON is tried first; if it's empty, the HTML interface is scraped for the top result snippets. The raw results are then handed to a stateless side LanguageModelSession whose only instructions are to summarise, so the noisy raw text never enters the conversational transcript - the main session sees a clean two- or three-sentence summary.
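The control flow inside the tool can be sketched as follows. The real tool lives in the Swift CLI, so this TypeScript version only illustrates the order of the steps; the SearchSteps interface, its three function names, and the "No results found." fallback string are all hypothetical, and the network and model calls are injected rather than implemented.

```typescript
// All three steps are injected so the sketch does not depend on any real
// network or model API. Every name here is hypothetical.
interface SearchSteps {
  // Stand-in for the Instant Answer lookup; resolves to "" when nothing came back.
  instantAnswer: (query: string) => Promise<string>;
  // Stand-in for the HTML fallback; resolves to the top result snippets.
  scrapeSnippets: (query: string) => Promise<string[]>;
  // Stand-in for the stateless side session that only summarises.
  summarise: (rawText: string) => Promise<string>;
}

async function searchAndSummarise(query: string, steps: SearchSteps): Promise<string> {
  // Try the structured answer first.
  let raw = (await steps.instantAnswer(query)).trim();
  if (raw === "") {
    // Fall back to scraped snippets only when the first source is empty.
    const snippets = await steps.scrapeSnippets(query);
    raw = snippets.join("\n").trim();
  }
  if (raw === "") return "No results found.";
  // Only the summary is returned to the main session, never the raw text.
  return steps.summarise(raw);
}
```

Keeping the summariser separate from the conversational session is what stops the raw search text from entering the main transcript and using up its limited context.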
File edits proposed by the model are extracted from a fenced code block in the response. When a selection was attached, the model is told to return only the replacement text for the selection; the webview surfaces an inline Apply / Reject card and applies the change via a WorkspaceEdit to the original selection range (captured at submit time, so it survives later cursor moves). The Undo link runs the built-in undo command on the editor.
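A simplified sketch of the extract-and-apply step is shown below. It is not the extension's code: extractFencedBlock, applyReplacement, and CapturedSelection are hypothetical names, and it works on a plain string with character offsets so it runs outside VS Code, whereas the extension applies the change through a WorkspaceEdit.

```typescript
// Three backticks, written with escapes so this sample nests safely inside a fenced block.
const FENCE = "\u0060\u0060\u0060";

// Returns the body of the first fenced code block in a reply, or null if there is none.
function extractFencedBlock(reply: string): string | null {
  const open = reply.indexOf(FENCE);
  if (open === -1) return null;
  // The body starts after the opening fence line (which may carry a language tag).
  const bodyStart = reply.indexOf("\n", open);
  if (bodyStart === -1) return null;
  const close = reply.indexOf("\n" + FENCE, bodyStart);
  if (close === -1) return null;
  return reply.slice(bodyStart + 1, close);
}

// Hypothetical stand-in for the selection range captured at submit time.
interface CapturedSelection {
  start: number; // inclusive character offset
  end: number;   // exclusive character offset
}

// Replaces the captured range with the proposed text, ignoring the current cursor.
function applyReplacement(
  documentText: string,
  selection: CapturedSelection,
  replacement: string
): string {
  return (
    documentText.slice(0, selection.start) +
    replacement +
    documentText.slice(selection.end)
  );
}
```

Because the range is stored when the prompt is submitted, the replacement targets the original span even if the cursor has moved by the time the reply arrives.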
Support
If this is useful to you and you'd like to support its development, you can buy me a coffee on Ko-fi - always optional, always appreciated.

License
MIT - see LICENSE.