Skip to content
| Marketplace
Sign in
Visual Studio Code>Programming Languages>TinyFalconNew to Visual Studio Code? Get it now.
TinyFalcon

TinyFalcon

Preview

Falcon Rising

|
5 installs
| (0) | Free
Your data, your compute. Local-first agentic coding assistant. Requires Ollama (free) running locally — works fully offline.
Installation
Launch VS Code Quick Open (Ctrl+P), paste the following command, and press enter.
Copied to clipboard
More Info

TinyFalcon

Your data, your compute. A local-first agentic coding assistant for VS Code.

TinyFalcon plans tasks, edits files, runs terminal commands, reads the results, and iterates — all driven by a local LLM served by Ollama. Designed from the ground up for small local models (7B–32B), with FSM-enforced safety gates that keep the agent honest on hardware that doesn't run frontier models.

For developers in regulated industries, air-gapped environments, or anyone who wants AI assistance without streaming their codebase to a third party.


⚠️ Requires Ollama (free)

Before TinyFalcon can do anything, you need:

  1. Ollama installed and running (ollama serve).

  2. At least one model pulled. Start with qwen2.5-coder:7b:

    ollama pull qwen2.5-coder:7b
    

If Ollama isn't running when you send your first prompt, TinyFalcon shows a one-click error pane with the install link and copy-buttons for the commands above — but it's faster if you set them up first.


Install

  1. Install Ollama and run ollama serve.
  2. Pull a model: ollama pull qwen2.5-coder:7b.
  3. Install TinyFalcon from the VS Code Marketplace (this listing).
  4. Click the TinyFalcon icon in the activity bar.
  5. Send your first prompt — TinyFalcon will plan, write code, run it, and iterate until it's done.

Screenshots

TinyFalcon writing and testing code via the agent loop

A typical TinyFalcon run: the agent wrote a binary-tree inversion program (write_file), ran the test suite (run_terminal), and confirmed everything passed — all from a single prompt.


What makes TinyFalcon different

Other agentic VS Code tools (Copilot, Cursor, Cline's cloud mode) need to stream your code to a third party. Local-first alternatives (Cline + Ollama, Continue, Aider) were built around cloud frontier models and retrofitted for local — pointed at a 7B–32B model they tend to malformed-tool-call, infinite-loop, or give up early.

TinyFalcon goes the other direction. The agent loop, tool schemas, and recovery logic are all designed for small local models on developer-grade hardware:

  • Deterministic safety gates. The FSM blocks success-claiming replies after failures and identical tool retries — fixes the two most common small-model failure modes without depending on the model to "behave".
  • Status-led tool results. Tool results lead with a clear SUCCEEDED / FAILED verdict so the model can't mistake "ran the tool" for "the command worked".
  • Workspace ledger. A live snapshot of files written and recent commands is injected each turn so the model has ground truth, not hallucinated paths.
  • Cross-turn memory. Multi-turn conversations preserve history and the workspace state across user prompts.
  • Per-model prompt packs. Built-in packs tuned for Qwen, DeepSeek, and Gemma quirks; auto-selected from the model tag.
  • Friendly first-run errors. No cryptic "OllamaUnreachable" — actionable buttons (Install Ollama, copy ollama serve, copy ollama pull <model>, retry).

How it works

TinyFalcon runs a recursive agentic loop:

  1. Ingest — Workspace context + your prompt.
  2. Plan — The local model proposes a tool call (read file, write file, run command).
  3. Act — The extension executes via the VS Code API.
  4. Observe — The agent reads the result: file contents, exit codes, stderr.
  5. Refine — Loop, or finish with a reply summarizing what it did.

What's different from "ReAct-style" loops written for cloud models: every step is gated by a small FSM that enforces invariants the model can't be trusted to enforce itself. The model picks the actions; the FSM enforces correctness.


Supported Models

Any model Ollama can serve will work. Recommended starting points:

Model tag Pack Notes
qwen2.5-coder:7b qwen Best balance of size/quality for most hardware. Default starting point.
qwen2.5-coder:14b / :32b qwen Higher quality on stronger GPUs.
deepseek-coder:6.7b deepseek Strong code-focused alternative.
gemma2:9b gemma Good general-purpose backup.

Switch models in Settings → TinyFalcon → Ollama: Model. The matching prompt pack is selected automatically.

Not yet supported (post-v1): Reasoning models like deepseek-r1 — their <think> block prefixes break the JSON parser. Tracked separately.


Settings

Setting Default What it does
tinyfalcon.ollama.host http://localhost:11434 Where to reach Ollama.
tinyfalcon.ollama.model qwen2.5-coder:7b Which Ollama model to use.
tinyfalcon.agent.promptPack auto Which prompt pack: auto, default, qwen, deepseek, gemma.
tinyfalcon.agent.maxIterations 25 Max plan-act-observe cycles per prompt before giving up.
tinyfalcon.tools.enabled true Enable agentic tool use. Disable to fall back to plain chat.
tinyfalcon.tools.autoApprove.readFile true Auto-approve read_file (non-destructive).
tinyfalcon.tools.autoApprove.writeFile false Auto-approve write_file. Off by default for safety.
tinyfalcon.tools.autoApprove.runTerminal false Auto-approve run_terminal. Off by default for safety.

Privacy

TinyFalcon's default mode is fully local — your code, prompts, and the model's outputs stay on your machine. No telemetry. No accounts. No usage metering.

This isn't an absolute claim — cloud-fallback support may be added later as an explicit opt-in. The current release (v0.1.0) has no such mode and makes no network calls beyond localhost:11434.


If something doesn't work, please use the Q&A tab on the marketplace listing.


License

Proprietary. Use granted under the terms in the LICENSE file (linked from the marketplace Resources sidebar); redistribution, reverse engineering, and derivative works are not permitted.


Your data, your compute. TinyFalcon is small by design, not by ambition.

  • Contact us
  • Jobs
  • Privacy
  • Manage cookies
  • Terms of use
  • Trademarks
© 2026 Microsoft