NESweep autocompletion

SR_team

Next Edit Autocomplete using Sweep Model

NESweep: Next Edit autocompletion for VS Code

NESweep is a fork of Sweep Next Edit that retargets the extension at a local OpenAI-compatible /v1/completions server (e.g. llama.cpp's llama-server) running an edit-prediction model. The upstream uvx sweep-autocomplete Python child process, which falls back to CPU and is too slow for next-edit latency, is removed.

Features

  • Local OpenAI-compatible backend. Posts to /v1/completions on any server you bring up (llama.cpp, vLLM, SGLang, Ollama with the OpenAI shim).
  • SweepAI + Zed Zeta-2 / Zeta-2.1 models. Format auto-detected from sweep.modelName. Zeta-2.1 returns up to three edits per request (cursor area + up to two windows around nearby diagnostics).
  • LSP-diagnostics aware. Cursor-radius filter, cascading-error suppression below a root-cause line, and user-configurable regex rewrites on the messages (clang / clang-tidy presets included).
  • Per-language workspace rules. .vscode/nes-<languageId>.md is editable from the NESweep status-bar menu with a configurable soft-cap warning when the file grows large enough to bloat latency.
  • Cache-friendly + persistent. Stable content is emitted first and volatile content last to maximize prefix-cache hits; recent files, edits, and cursor positions survive a window reload via workspaceState, so the model has context immediately after restart.
  • Status-bar menu + trace logging. Toggle, snooze, ping server, edit instructions. Set the NESweep output channel to Trace (Developer: Set Log Level… → NESweep) for full request/response visibility.
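
The diagnostics handling described in the list above can be pictured with a small sketch. This is not the extension's actual code: the Diag shape and the function names filterByRadius and applyTransforms are hypothetical, and treating a radius of 0 as "filter off, keep everything" is an assumption about what "0 disables" means. Cascading-error suppression is not covered here.

```typescript
// Hypothetical shape; the extension's real diagnostic type may differ.
interface Diag {
  line: number; // 0-based line of the diagnostic
  message: string;
}

// Keep only diagnostics within ±radius lines of the cursor.
// Assumption: radius <= 0 switches the filter off and keeps every diagnostic.
function filterByRadius(diags: Diag[], cursorLine: number, radius: number): Diag[] {
  if (radius <= 0) return diags;
  return diags.filter((d) => Math.abs(d.line - cursorLine) <= radius);
}

// Apply {regex: replacement} rewrites to one diagnostic message, in insertion order.
// Assumption: each pattern is applied globally.
function applyTransforms(message: string, transforms: Record<string, string>): string {
  let out = message;
  for (const [pattern, replacement] of Object.entries(transforms)) {
    out = out.replace(new RegExp(pattern, "g"), replacement);
  }
  return out;
}

// Example rule (illustrative, not the shipped clang preset):
// strip a trailing "[check-name]" suffix from a message.
const exampleTransforms: Record<string, string> = {
  "\\s*\\[[\\w.-]+\\]$": "",
};
```

A rule like the one in exampleTransforms shortens messages before they reach the prompt, which matters most for the small checkpoints where every prompt token counts.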

Settings

| Key | Default | Purpose |
| --- | --- | --- |
| sweep.serverUrl | http://localhost:8080 | /v1/completions base URL |
| sweep.modelName | sweepai/sweep-next-edit | model field in the request body; substring-matched to pick the prompt format |
| sweep.completionTimeoutMs | 10000 | Per-request timeout (ms) |
| sweep.diagRadius | 12 | ±N lines around cursor; 0 disables |
| sweep.broadBefore | 125 | Lines of broad context before cursor |
| sweep.broadAfter | 75 | Lines of broad context after cursor |
| sweep.rulesMaxChars | 3000 | Soft cap on per-language workspace-rules file size; overflow surfaces as a diagnostic + red background in the editor |
| sweep.injectInlineDiagnostics | false | Inline BUG: comments next to diagnosed lines in the prompt; recommended for 0.5B / 1.5B sweep checkpoints |
| sweep.inlineDiagnosticsMarker | BUG: LSP error here | Marker phrase used by the inline injection + response-side strip anchor |
| sweep.diagnosticsMessageTransforms | clang preset | {regex: replacement} rewrites applied to every diagnostic message after the built-in normalisations |
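
As a rough illustration of how the first three settings could map onto a request, here is a minimal sketch. It assumes the standard OpenAI completions body (model, prompt) and response shape (choices[0].text), plus a runtime with a global fetch such as Node 18+. The NesSettings interface, the buildRequest and complete helpers, and the max_tokens and temperature values are hypothetical and not taken from the extension.

```typescript
// Hypothetical settings bag mirroring three of the documented keys.
interface NesSettings {
  serverUrl: string;
  modelName: string;
  completionTimeoutMs: number;
}

const defaults: NesSettings = {
  serverUrl: "http://localhost:8080",
  modelName: "sweepai/sweep-next-edit",
  completionTimeoutMs: 10000,
};

// Pure helper: derive the endpoint URL and JSON body from the settings.
function buildRequest(settings: NesSettings, prompt: string) {
  const url = settings.serverUrl.replace(/\/+$/, "") + "/v1/completions";
  // max_tokens and temperature are illustrative values, not the extension's.
  const body = { model: settings.modelName, prompt, max_tokens: 256, temperature: 0 };
  return { url, body };
}

// POST the prompt and abort once completionTimeoutMs has elapsed.
async function complete(settings: NesSettings, prompt: string): Promise<string> {
  const { url, body } = buildRequest(settings, prompt);
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), settings.completionTimeoutMs);
  try {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
      signal: controller.signal,
    });
    if (!res.ok) throw new Error(`server returned ${res.status}`);
    const json = (await res.json()) as { choices?: { text?: string }[] };
    return json.choices?.[0]?.text ?? "";
  } finally {
    clearTimeout(timer);
  }
}
```

Keeping buildRequest separate from the network call makes the URL and body easy to inspect in trace logs or unit tests without a running server.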

Setup

Run any supported edit-prediction GGUF behind an OpenAI-compatible /v1/completions server. Examples with llama.cpp:

# Sweep next-edit (default; 7B works without the inline-diagnostics hack)
llama-server -hf sweepai/sweep-next-edit-7b-gguf --ctx-size 32768

# Sweep 1.5B (smaller, faster — turn on sweep.injectInlineDiagnostics)
llama-server -hf sweepai/sweep-next-edit-1.5b-gguf --ctx-size 32768

# Zeta-2 (Zed's SeedCoder-8B, single-region)
llama-server -hf bartowski/zed-industries_zeta-2-GGUF --ctx-size 16384

# Zeta-2.1 (Zed's SeedCoder-8B, multi-region)
llama-server -hf bartowski/zed-industries_zeta-2.1-GGUF --ctx-size 16384

Then point sweep.modelName at the right name. Detection rules:

  • zeta-2.1 / zeta2.1 / zeta-2-1 / zeta_2_1 → Zeta-2.1 multi-region
  • zeta2 / zeta-2 / seedcoder → Zeta-2 single-region
  • everything else → Sweep layout (default)
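
The detection rules above amount to an ordered substring match. The sketch below shows one way to write it; the PromptFormat type and detectFormat name are hypothetical, and lowercasing the model name before matching is an assumption rather than documented behaviour.

```typescript
type PromptFormat = "zeta-2.1" | "zeta-2" | "sweep";

const ZETA_21_MARKERS = ["zeta-2.1", "zeta2.1", "zeta-2-1", "zeta_2_1"];
const ZETA_2_MARKERS = ["zeta2", "zeta-2", "seedcoder"];

function detectFormat(modelName: string): PromptFormat {
  // Assumption: matching is case-insensitive.
  const name = modelName.toLowerCase();
  // Check the 2.1 spellings first: "zeta-2.1" also contains "zeta-2".
  if (ZETA_21_MARKERS.some((m) => name.includes(m))) return "zeta-2.1";
  if (ZETA_2_MARKERS.some((m) => name.includes(m))) return "zeta-2";
  // Everything else falls through to the Sweep layout.
  return "sweep";
}
```

The order of the checks matters: testing the Zeta-2 markers first would classify every Zeta-2.1 name as single-region.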

Sweep's GGUF advertises 32k natively; the full prompt routinely runs 15–20k tokens for non-trivial files, so a smaller --ctx-size truncates real prompts. Zeta-2 / 2.1's editable regions are much tighter (±15 lines around cursor + tiny ±2-line halos for diagnostic regions on 2.1), so those prompts are smaller.
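
To make the size difference concrete, the line windows can be sketched as a clamped range around the cursor. The LineRange type and lineWindow helper are hypothetical, and clamping to the document bounds is an assumption about how edges are handled.

```typescript
// Inclusive, 0-based line range.
interface LineRange {
  start: number;
  end: number;
}

// Window of `before` lines above and `after` lines below the cursor,
// clamped to the document (assumed behaviour at file edges).
function lineWindow(cursorLine: number, before: number, after: number, lineCount: number): LineRange {
  const last = Math.max(0, lineCount - 1);
  return {
    start: Math.max(0, cursorLine - before),
    end: Math.min(last, cursorLine + after),
  };
}

// Sweep broad context with the default sweep.broadBefore / sweep.broadAfter values.
const sweepBroad = lineWindow(200, 125, 75, 1000); // lines 75..275

// Zeta editable region: ±15 lines around the cursor.
const zetaEditable = lineWindow(200, 15, 15, 1000); // lines 185..215
```

With the defaults, the Sweep window spans about 200 lines against roughly 30 for the Zeta editable region, which is consistent with the smaller --ctx-size used in the Zeta examples.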

Build & install the extension:

bun install
bun run build
bunx @vscode/vsce package --no-dependencies --skip-license
code --install-extension nesweep-*.vsix --force

Credits

  • Original Sweep Next Edit by SweepAI.
  • Sweep prompt format ported from cursortab.nvim.
  • Zeta-2 / Zeta-2.1 model card: zed-industries on Hugging Face.

License

GNU Affero General Public License v3.0 or later — see LICENSE.

The upstream repository sweepai/vscode-nes does not ship a LICENSE file, but its initial commit (fcdfb50 — init: Base vscode foundation based on zed impl) is a line-for-line TypeScript translation of zed-industries/zed/crates/zeta/src/sweep_ai.rs — the wire-protocol structs, the ActionType enum with its SCREAMING_SNAKE_CASE serde rename, the brotli (quality=11, lgwin=22) params, the hardcoded https://autocomplete.sweep.dev/... endpoint, even the // TODO-fenced privacy_mode_enabled: false were carried over verbatim. The Rust file was removed from Zed in commit 42583c1 on 2025-12-04, but at the time of the initial commit it was AGPL-3.0 as part of the Zed editor. Translating an AGPL work into another language produces a derivative work covered by the same license, so AGPL-3.0 attaches to the entire combined codebase regardless of whether the upstream author shipped a LICENSE file. This fork makes that licensing explicit.

Copyright attribution:

  • Zed Industries, Inc. — original sweep_ai.rs (AGPL-3.0), ported in src/api/schemas.ts, src/core/constants.ts, and parts of src/api/client.ts.
  • SweepAI and the upstream sweepai/vscode-nes contributors — VS Code-side glue (extension activation, inline-edit provider, document tracker, telemetry plumbing), itself a combined work covered by the same AGPL terms.
  • This fork's authors — all subsequent commits.