Gneiss — AI Writing Assistant
A VSCode extension where you control the argument's skeleton and delegate low-level tasks to AI agents that 'fill in the blanks'. Targeted at academic, formal, and research writing, Gneiss works with LaTeX, Markdown, and any other file type. Our vision is to recentre humans in the writing process, integrating LLMs as 'blanks' in your text so you can write at the speed of thought without sacrificing control.
For example, as you begin drafting the motivation for a paper, you might write:
The seminal result by ⟨citation⟩ introduced the transformer architecture, which ⟨one-sentence summary of key contribution⟩.
Then press Option+S to fill in the blanks sequentially in a token-efficient manner. Gneiss automatically delegates agents to complete these blanks with smart context construction: agents can browse the web, query files in your repo, and reference BibTeX citations to fill in your blanks (for Gneiss' first-class file access and BibTeX support, see here).
When we ran Gneiss on the above example, it produced:
The seminal result by Vaswani et al. (2017) introduced the transformer architecture, which is a network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
Getting Started
- Install the Gneiss extension.
- Configure your API key (see Configuration below).
- Open a document, press Cmd+[ to create a blank, type an instruction, and click Fill.
Usage
Creating Blanks
Press Cmd+[ to insert a blank at your cursor. A ⟨⟩ pair appears and you can type a natural-language instruction inside it.
For example, you might write:
The seminal result by ⟨citation⟩ introduced the transformer architecture, which ⟨one-sentence summary of key contribution⟩.
You can also select text first, then press Cmd+[ — the selection becomes the instruction.
Blanks are directly editable: click into the ⟨instruction⟩ text to modify it at any time. CodeLens actions (Fill and Delete) appear above lines with pending blanks.
Filling Blanks
Click the Fill CodeLens above any blank, or use one of the batch fill modes:
| Mode | Shortcut | Best for |
|------|----------|----------|
| Single | Click "Fill" CodeLens | One blank at a time |
| Complete | Option+A | All blanks in one API call — most token-efficient |
| Sequential | Option+S | Blanks that depend on earlier ones (fills top-to-bottom in a single conversation) |
| Parallel | Option+D | Independent blanks — fastest wall-clock time |
Each fill produces an inline diff. Press Cmd+Y to accept or Cmd+N to reject.
References
Type @ inside a blank to reference files from your workspace. The agent receives their content as context.
| Syntax | What it does |
|--------|--------------|
| @path/to/file.tex | Include the file's content in the agent's context |
| @papers/ | Browse files in a directory |
| @*.bib | Glob-match files |
| @references.bib:nanda2023 | Include a specific BibTeX entry from a .bib file |
| @cite:nanda2023 | Find a cite key across all .bib files in the workspace |
| @"path with spaces/refs.bib":key | Quoted path syntax for paths containing spaces |
Autocomplete suggestions appear as you type, with icons distinguishing file types. For .bib files, typing : after the filename triggers cite key completions showing key, title, author, and year.
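The reference syntax above can be sketched as a small parser. This is a hypothetical re-implementation for illustration only (the function name `parse_ref` and the exact grammar are assumptions, not the extension's actual code): a quoted path may contain spaces, and an optional `:key` suffix selects a BibTeX entry.

```python
import re

# Hypothetical sketch: '@' followed by either a quoted path (may contain
# spaces) or a bare path (no spaces or colons), plus an optional ':key'.
REF = re.compile(r'@(?:"(?P<qpath>[^"]+)"|(?P<path>[^\s:]+))(?::(?P<key>\S+))?')

def parse_ref(token):
    """Return (path, cite_key) for an @-reference token, or None."""
    m = REF.fullmatch(token)
    if not m:
        return None
    return (m.group("qpath") or m.group("path"), m.group("key"))
```

For example, `parse_ref('@references.bib:nanda2023')` yields the pair `('references.bib', 'nanda2023')`, while a trailing-slash directory reference like `@papers/` parses with no cite key.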
@model: Override
Override the LLM provider and model for individual blanks:
⟨@model:claude:claude-sonnet-4-5 synthesise the argument from sections 2 and 3⟩
⟨@model:ollama:llama3:8b year of publication⟩
This lets you mix models within a document — e.g. a fast/cheap model for simple lookups and a more capable model for complex synthesis. Blanks without @model: use your global default. The directive is stripped before reaching the model.
Autocomplete helps here too: typing @model: suggests providers (claude, ollama), and typing further (e.g. @model:claude:) suggests available models.
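Since the directive is stripped before the instruction reaches the model, its handling can be sketched as below. This is a minimal illustration under stated assumptions (the helper name `parse_model_directive` is hypothetical); note the model segment may itself contain a colon, as in `llama3:8b`, so only the first colon after `@model:` separates provider from model.

```python
def parse_model_directive(instruction):
    """Split '@model:provider:model rest...' into (provider, model, rest).

    Hypothetical sketch: the provider is everything up to the first ':',
    the model runs to the first space (and may contain further colons),
    and the remainder is the instruction passed to the model.
    """
    if not instruction.startswith("@model:"):
        return (None, None, instruction)
    head, _, rest = instruction[len("@model:"):].partition(" ")
    provider, _, model = head.partition(":")
    return (provider, model, rest)
```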
Agent Tools
The filling agent has access to tools it can use autonomously:
- **Web search** — the agent can search the web for facts, citations, and context (Claude uses native web search; Ollama requires an API key from ollama.com/settings/keys)
- **read_file** — read any file in your workspace (text files, .bib, and PDFs)
- **search_files** — grep-style search across workspace files
- **list_directory** — browse directories
You don't need to @-reference everything — the agent can explore your workspace on its own. But explicit @ references are more reliable and avoid extra tool-call latency.
File Access Control
Create a .gneissignore file in your workspace root to exclude files from agent access. It uses regex patterns, one per line:
# Ignore build artifacts
build/.*
dist/.*
# Ignore drafts
drafts/.*\.tex
Files matching .gitignore patterns are also excluded by default (controlled by the gneiss.repo.respectGitignore setting).
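The semantics described above can be sketched in a few lines. This is a hypothetical re-implementation for illustration, not the extension's code, and it assumes each pattern must fully match the workspace-relative path (the actual matching rule may differ):

```python
import re

def load_patterns(text):
    """Parse .gneissignore text: one regex per line; blank lines and
    '#' comment lines are skipped."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(re.compile(line))
    return patterns

def is_excluded(rel_path, patterns):
    """True if any pattern fully matches the workspace-relative path."""
    return any(p.fullmatch(rel_path) for p in patterns)
```

Under this sketch, `build/.*` would exclude `build/out.pdf` but leave a top-level `main.tex` visible to agents.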
Keybindings
All keybindings are remappable via VSCode's standard keybindings.json.
| Action | Binding |
|--------|---------|
| Open blank at cursor / wrap selection | Cmd+[ |
| Cancel blank | Escape |
| Fill all blanks (complete) | Option+A |
| Fill all blanks (sequential) | Option+S |
| Fill all blanks (parallel) | Option+D |
| Accept inline diff | Cmd+Y |
| Reject inline diff | Cmd+N |
| Open chat panel | Cmd+L |
Note: Cmd+[ overrides VSCode's "Outdent Line" — use Shift+Tab for outdenting instead.
Configuration
Claude
Your API key is detected automatically from the ANTHROPIC_API_KEY environment variable, or set it in settings: gneiss.llm.claude.apiKey.
| Setting | Default | Description |
|---------|---------|-------------|
| gneiss.llm.claude.model | claude-haiku-4-5 | Model to use |
| gneiss.llm.claude.maxTokens | 64000 | Max output tokens per fill. When extended thinking is enabled, the 5k thinking budget is carved out of this value. Model limits: haiku/sonnet/opus-4-5 = 64k, opus-4-6 = 128k |
Ollama
| Setting | Default | Description |
|---------|---------|-------------|
| gneiss.llm.ollama.model | qwen3:4b | Model to use (must be pulled locally) |
| gneiss.llm.ollama.baseUrl | http://localhost:11434 | Ollama server URL |
| gneiss.llm.ollama.apiKey | | API key for web search (from ollama.com/settings/keys) — not needed for local inference |
| gneiss.llm.ollama.keepAlive | 60m | How long to keep the model loaded. Use -1 for infinite |
| gneiss.llm.ollama.numCtx | 32768 | Context window size. Must be consistent across requests for KV cache reuse |
| gneiss.llm.ollama.maxTokens | -1 | Max output tokens (-1 = unlimited) |
General
| Setting | Default | Description |
|---------|---------|-------------|
| gneiss.llm.defaultProvider | claude | claude or ollama |
| gneiss.agents.maxConcurrent | 3 | Max parallel fill agents |
| gneiss.agents.groupingDistance | 30 | Max lines apart to group nearby blanks into one API call |
| gneiss.agents.fillOnClose | false | Auto-fill blanks immediately when the instruction is entered |
| gneiss.agents.skipThinking | false | Disable extended thinking to reduce latency |
| gneiss.repo.respectGitignore | true | Exclude .gitignore-matched files from agent access |