Gneiss — AI Writing Assistant


Yiding Song

1 install | Free
Fill in the gaps in your writing with AI agents.
## Installation

Launch VS Code Quick Open (Ctrl+P), paste the extension's install command, and press Enter.

# Gneiss — AI Writing Assistant

A VSCode extension where you control the argument's skeleton and delegate low-level tasks to AI agents that 'fill in the blanks'. Targeted towards academic, formal, and research writing — Gneiss works with LaTeX, Markdown, and any other file type. Our vision is to recentre humans in the writing process, integrating LLMs as 'blanks' in your text so you can write at the speed of thought without sacrificing control.

For example, as you begin drafting the motivation for a paper, you might write:

The seminal result by ⟨citation⟩ introduced the transformer architecture, which ⟨one-sentence summary of key contribution⟩.

Then press Option+S to fill in the blanks sequentially in a token-efficient manner. Gneiss automatically delegates agents to complete these blanks with smart context construction: agents can browse the web, query files in your repo, and reference BibTeX citations to fill in your blanks (for Gneiss' first-class file access and BibTeX support, see here).

When we ran Gneiss on the above example, it produced:

The seminal result by Vaswani et al. (2017) introduced the transformer architecture, which is a network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.

## Getting Started

  1. Install the Gneiss extension.
  2. Configure your API key (see Configuration below).
  3. Open a document, press Cmd+[ to create a blank, type an instruction, and click Fill.
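Step 2 can also be done in your settings.json — a minimal sketch using the setting names documented under Configuration below. The key value here is a placeholder; the ANTHROPIC_API_KEY environment variable is the more secure option:

```json
// settings.json — sketch only; replace the placeholder key with your own
{
  "gneiss.llm.defaultProvider": "claude",
  "gneiss.llm.claude.apiKey": "sk-ant-..."
}
```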

## Usage

### Creating Blanks

Press Cmd+[ to insert a blank at your cursor. A ⟨⟩ pair appears and you can type a natural-language instruction inside it.

For example, you might write:

The seminal result by ⟨citation⟩ introduced the transformer architecture, which ⟨one-sentence summary of key contribution⟩.

You can also select text first, then press Cmd+[ — the selection becomes the instruction.

Blanks are directly editable: click into the ⟨instruction⟩ text to modify it at any time. CodeLens actions (Fill and Delete) appear above lines with pending blanks.

### Filling Blanks

Click the Fill CodeLens above any blank, or use one of the batch fill modes:

| Mode | Shortcut | Best for |
| --- | --- | --- |
| Single | Click "Fill" CodeLens | One blank at a time |
| Complete | Option+A | All blanks in one API call — most token-efficient |
| Sequential | Option+S | Blanks that depend on earlier ones (fills top-to-bottom in a single conversation) |
| Parallel | Option+D | Independent blanks — fastest wall-clock time |

Each fill produces an inline diff. Press Cmd+Y to accept or Cmd+N to reject.

### References

Type @ inside a blank to reference files from your workspace. The agent receives their content as context.

| Syntax | What it does |
| --- | --- |
| `@path/to/file.tex` | Include the file's content in the agent's context |
| `@papers/` | Browse files in a directory |
| `@*.bib` | Glob-match files |
| `@references.bib:nanda2023` | Include a specific BibTeX entry from a `.bib` file |
| `@cite:nanda2023` | Find a cite key across all `.bib` files in the workspace |
| `@"path with spaces/refs.bib":key` | Quoted path syntax for paths containing spaces |

Autocomplete suggestions appear as you type, with icons distinguishing file types. For .bib files, typing : after the filename triggers cite key completions showing key, title, author, and year.
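Putting the reference syntax together, a blank in a LaTeX draft might combine a cite-key lookup with a file reference. The cite key and path below are hypothetical placeholders:

```latex
% Hypothetical cite key and path — substitute your own
The transformer ⟨@cite:vaswani2017 one-clause summary of the paper's key
contribution, ending with a \cite command⟩, as discussed in
⟨@sections/background.tex name the part of this file that covers attention⟩.
```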

### @model: Override

Override the LLM provider and model for individual blanks:

```
⟨@model:claude:claude-sonnet-4-5 synthesise the argument from sections 2 and 3⟩
⟨@model:ollama:llama3:8b year of publication⟩
```

This lets you mix models within a document — e.g. a fast/cheap model for simple lookups and a more capable model for complex synthesis. Blanks without @model: use your global default. The directive is stripped before reaching the model.

Autocomplete helps here too: typing @model: suggests providers (claude, ollama), and typing further (e.g. @model:claude:) suggests available models.

### Agent Tools

The filling agent has access to tools it can use autonomously:

  • **Web search** — the agent can search the web for facts, citations, and context (Claude uses native web search; Ollama requires an API key from ollama.com/settings/keys)
  • **read_file** — read any file in your workspace (text files, .bib, and PDFs)
  • **search_files** — grep-style search across workspace files
  • **list_directory** — browse directories

You don't need to @-reference everything — the agent can explore your workspace on its own. But explicit @ references are more reliable and avoid extra tool-call latency.

### File Access Control

Create a `.gneissignore` file in your workspace root to exclude files from agent access. It uses regex patterns, one per line:

```
# Ignore build artifacts
build/.*
dist/.*

# Ignore drafts
drafts/.*\.tex
```

Files matching .gitignore patterns are also excluded by default (controlled by the gneiss.repo.respectGitignore setting).

## Keybindings

All keybindings are remappable via VSCode's standard keybindings.json.

| Action | Binding |
| --- | --- |
| Open blank at cursor / wrap selection | Cmd+[ |
| Cancel blank | Escape |
| Fill all blanks (complete) | Option+A |
| Fill all blanks (sequential) | Option+S |
| Fill all blanks (parallel) | Option+D |
| Accept inline diff | Cmd+Y |
| Reject inline diff | Cmd+N |
| Open chat panel | Cmd+L |

Note: Cmd+[ overrides VSCode's "Outdent Line" — use Shift+Tab for outdenting instead.
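If you prefer to keep VSCode's default Outdent Line binding, you can remap the blank-creation shortcut in keybindings.json. The command identifier below is a hypothetical placeholder — look up the real ID via Preferences: Open Keyboard Shortcuts and search for "gneiss":

```json
// keybindings.json — sketch only; "gneiss.createBlank" is an assumed command ID
[
  {
    "key": "cmd+shift+[",
    "command": "gneiss.createBlank",
    "when": "editorTextFocus"
  }
]
```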

## Configuration

### Claude

Your API key is detected automatically from the ANTHROPIC_API_KEY environment variable, or set it in settings: gneiss.llm.claude.apiKey.

| Setting | Default | Description |
| --- | --- | --- |
| `gneiss.llm.claude.model` | `claude-haiku-4-5` | Model to use |
| `gneiss.llm.claude.maxTokens` | `64000` | Max output tokens per fill. When extended thinking is enabled, the 5k thinking budget is carved out of this value. Model limits: haiku/sonnet/opus-4-5 = 64k, opus-4-6 = 128k |

### Ollama

| Setting | Default | Description |
| --- | --- | --- |
| `gneiss.llm.ollama.model` | `qwen3:4b` | Model to use (must be pulled locally) |
| `gneiss.llm.ollama.baseUrl` | `http://localhost:11434` | Ollama server URL |
| `gneiss.llm.ollama.apiKey` | (unset) | API key for web search (from ollama.com/settings/keys) — not needed for local inference |
| `gneiss.llm.ollama.keepAlive` | `60m` | How long to keep model loaded. Use -1 for infinite |
| `gneiss.llm.ollama.numCtx` | `32768` | Context window size. Must be consistent across requests for KV cache reuse |
| `gneiss.llm.ollama.maxTokens` | `-1` | Max output tokens (-1 = unlimited) |
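A local-only Ollama setup using the settings above might look like this sketch (the model name is just an example — any locally pulled model works, and no API key is needed for local inference):

```json
// settings.json — local Ollama inference
{
  "gneiss.llm.defaultProvider": "ollama",
  "gneiss.llm.ollama.model": "qwen3:4b",
  "gneiss.llm.ollama.baseUrl": "http://localhost:11434"
}
```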

### General

| Setting | Default | Description |
| --- | --- | --- |
| `gneiss.llm.defaultProvider` | `claude` | `claude` or `ollama` |
| `gneiss.agents.maxConcurrent` | `3` | Max parallel fill agents |
| `gneiss.agents.groupingDistance` | `30` | Max lines apart to group nearby blanks into one API call |
| `gneiss.agents.fillOnClose` | `false` | Auto-fill blanks immediately when instruction is entered |
| `gneiss.agents.skipThinking` | `false` | Disable extended thinking to reduce latency |
| `gneiss.repo.respectGitignore` | `true` | Exclude `.gitignore`-matched files from agent access |
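For example, to run more agents in parallel and trade extended thinking for lower latency, a settings.json sketch using the documented setting names:

```json
// settings.json — faster, more parallel fills
{
  "gneiss.agents.maxConcurrent": 6,
  "gneiss.agents.skipThinking": true
}
```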