Navni AI — AI Code Assistant
Free, private, local-first AI assistant for VS Code. Use Ollama for 100% private coding, or bring your own API key for cloud providers. No telemetry, no subscriptions.

Version 1.0.0 | Changelog | GitHub
Why Navni AI?
- 🔒 100% Private — Run locally with Ollama. Your code never leaves your machine.
- 💰 Free Forever — No subscription required. Bring Your Own Key for cloud APIs.
- ⚡ Inline Completions — Tab-autocomplete ghost text, like Copilot, but local and free.
- 🧠 Context Engine — Background indexing with semantic search across your entire project.
- 🤖 6 Specialized Agents — Agent, Chat, Generator, Reviewer, Tester, Debug Helper.
Features
⚡ Inline Code Completions (Tab-Autocomplete)
- Ghost text suggestions as you type — press Tab to accept
- Uses fast FIM (Fill-in-Middle) models for instant completions
- Separate model setting — use a small 1.3B model for speed while chat uses a larger model
- Debounced, cached, and cancellation-aware
- Streaming responses with smooth "Thinking..." animation
- Direct Ollama streaming — bypasses CLI for ultra-low latency (~200ms first token)
- Markdown rendering with syntax-highlighted code blocks
- Mermaid diagram rendering — AI can generate visual diagrams inline
- Stream metadata — see model name, token count, and tok/s after each response
- Stop generation button to cancel anytime
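The inline-completion bullets above describe requests as debounced, cached, and cancellation-aware. A minimal sketch of how those three ideas can fit together follows. It is not Navni's actual code: `DebouncedCompleter`, `Fetcher`, `remember`, `peek`, and the cache size of 100 are hypothetical names and values chosen for illustration. Only the 300 ms default mirrors the documented `cognify.completion.debounceMs` setting.

```typescript
// Hypothetical sketch of a debounced, cached, cancellable completion request.
// None of these names come from Navni's source.

type Fetcher = (prefix: string, signal: AbortSignal) => Promise<string>;

class DebouncedCompleter {
  private readonly fetcher: Fetcher;
  private readonly debounceMs: number;
  private readonly maxEntries: number;
  private readonly cache = new Map<string, string>();
  private timer: ReturnType<typeof setTimeout> | undefined;
  private controller: AbortController | undefined;
  private settlePending: ((value: string | undefined) => void) | undefined;

  constructor(fetcher: Fetcher, debounceMs = 300, maxEntries = 100) {
    this.fetcher = fetcher;
    this.debounceMs = debounceMs;
    this.maxEntries = maxEntries;
  }

  /** Store a completion, evicting the oldest entry when the cache is full. */
  remember(prefix: string, completion: string): void {
    this.cache.delete(prefix);
    this.cache.set(prefix, completion);
    if (this.cache.size > this.maxEntries) {
      const oldest = this.cache.keys().next().value;
      if (oldest !== undefined) this.cache.delete(oldest);
    }
  }

  /** Return a cached completion without triggering a request. */
  peek(prefix: string): string | undefined {
    return this.cache.get(prefix);
  }

  /** Resolves with a completion, or undefined if superseded or failed. */
  request(prefix: string): Promise<string | undefined> {
    // A new keystroke supersedes whatever was waiting or in flight.
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.controller?.abort();
    this.settlePending?.(undefined);
    this.settlePending = undefined;

    const cached = this.cache.get(prefix);
    if (cached !== undefined) return Promise.resolve(cached);

    return new Promise((resolve) => {
      this.settlePending = resolve;
      // Wait for a typing pause before contacting the model.
      this.timer = setTimeout(() => {
        const controller = new AbortController();
        this.controller = controller;
        this.fetcher(prefix, controller.signal)
          .then((text) => {
            if (controller.signal.aborted) return resolve(undefined);
            this.remember(prefix, text);
            resolve(text);
          })
          .catch(() => resolve(undefined));
      }, this.debounceMs);
    });
  }
}
```

In this sketch the fetcher receives an `AbortSignal`, so a real implementation could pass it straight to `fetch` and have the HTTP request dropped as soon as the user types again.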
🔧 Smart Apply & Ripple Detection
- Smart Apply — Apply code from chat to files using VS Code's native diff editor
- Next Edit detection — When you apply a change, Navni finds all callers/usages and suggests bulk updates
- File Operations — Create, modify, and delete files directly from chat
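To make the idea of finding callers and usages concrete, here is a deliberately naive sketch that scans in-memory file contents for whole-word matches of a symbol. It is an assumption-laden illustration, not how Navni does it: the extension may well rely on a language server or its index instead. `findUsages`, `Usage`, and `escapeRegExp` are hypothetical names.

```typescript
// Hypothetical sketch: text-based usage search with word boundaries.
// A real implementation would likely use language-aware tooling instead.

interface Usage {
  file: string;
  line: number;   // 1-based
  column: number; // 1-based
  text: string;   // trimmed source line
}

function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function findUsages(files: Record<string, string>, symbol: string): Usage[] {
  // \b keeps "getUser" from matching inside "getUserName".
  const pattern = new RegExp(`\\b${escapeRegExp(symbol)}\\b`, "g");
  const usages: Usage[] = [];
  for (const [file, source] of Object.entries(files)) {
    source.split(/\r?\n/).forEach((text, index) => {
      for (const match of text.matchAll(pattern)) {
        usages.push({
          file,
          line: index + 1,
          column: (match.index ?? 0) + 1,
          text: text.trim(),
        });
      }
    });
  }
  return usages;
}
```

A text search like this reports definitions, comments, and string literals as well as real call sites, which is one reason a production tool would prefer semantic information.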
🌿 Conversation Branching
- Edit previous messages to fork into new threads
- Navigate branches with left/right arrows
- Full history preserved — never lose context
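One way to picture conversation branching is as a tree of messages in which editing a message adds a sibling instead of overwriting the original. The sketch below is an illustration under that assumption. `Conversation`, `MessageNode`, and the method names are hypothetical and do not describe Navni's internal data model.

```typescript
// Hypothetical sketch: a message tree where edits fork new branches.

type Role = "user" | "assistant";

interface MessageNode {
  id: number;
  role: Role;
  text: string;
  parent: MessageNode | null;
  children: MessageNode[];
}

class Conversation {
  private nextId = 1;
  readonly roots: MessageNode[] = [];

  /** Add a message under `parent` (or as a new root when parent is null). */
  append(parent: MessageNode | null, role: Role, text: string): MessageNode {
    const node: MessageNode = { id: this.nextId++, role, text, parent, children: [] };
    (parent ? parent.children : this.roots).push(node);
    return node;
  }

  /** Editing forks: the new text becomes a sibling, the original is kept. */
  edit(node: MessageNode, newText: string): MessageNode {
    return this.append(node.parent, node.role, newText);
  }

  /** All alternative versions of this message, including itself. */
  siblings(node: MessageNode): MessageNode[] {
    return node.parent ? node.parent.children : this.roots;
  }

  /** Move left (-1) or right (+1) among branches, clamped at the ends. */
  navigate(node: MessageNode, direction: -1 | 1): MessageNode {
    const all = this.siblings(node);
    const target = all.indexOf(node) + direction;
    return all[Math.max(0, Math.min(all.length - 1, target))];
  }

  /** Messages from the root down to `node`: the context of one thread. */
  thread(node: MessageNode): MessageNode[] {
    const path: MessageNode[] = [];
    for (let cur: MessageNode | null = node; cur !== null; cur = cur.parent) {
      path.unshift(cur);
    }
    return path;
  }
}
```

Because `edit` never deletes anything, every earlier branch, including its replies, stays reachable through `navigate`.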
🧠 Context Engine
- Background indexing with file watchers for real-time updates
- Semantic search across your entire project
- Auto-attach active editor context on every message
- Status bar indicator showing index health
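Semantic search is commonly built by comparing embedding vectors, so a small ranking sketch may help make the feature concrete. This README does not say which embedding model or similarity measure the Context Engine uses. Cosine similarity, the `IndexedChunk` shape, and the `search` function below are all assumptions, and the embeddings are taken as already computed.

```typescript
// Hypothetical sketch: rank indexed chunks by cosine similarity to a query.
// How Navni actually embeds and scores files is not documented here.

interface IndexedChunk {
  file: string;
  embedding: number[];
}

interface SearchHit {
  file: string;
  score: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // zero vectors score 0
}

function search(index: IndexedChunk[], query: number[], topK = 3): SearchHit[] {
  return index
    .map((chunk) => ({ file: chunk.file, score: cosineSimilarity(chunk.embedding, query) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, topK);
}
```

A linear scan like this is fine for small projects. Larger indexes usually move to an approximate nearest-neighbor structure.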
🤖 Multi-Agent System
| Agent | Purpose |
|---|---|
| ⚡ Agent | General-purpose with file operations and workspace awareness |
| 💬 Chat | Lightweight conversations without file context |
| ⚡ Generator | Create new code, scaffolds, and boilerplate |
| 🔬 Reviewer | Code analysis, security review, best practices |
| 🧪 Tester | Generate unit tests and test plans |
| 🐛 Debug Helper | Stack trace analysis, error fixing in 7 languages |
📊 Additional Features
- Agentic Task Planning — Complex requests are auto-decomposed into steps
- Project Intelligence — Auto-detect frameworks, tech stack, architecture
- Multi-Session Chat — Multiple independent sessions with persistence
- Diff Engine — Preview all changes before applying
- MCP Integration — Connect to Model Context Protocol servers
Supported Providers
| Provider | Type | Cost |
|---|---|---|
| Ollama | Local | Free |
| Groq | Cloud | Free tier |
| Google Gemini | Cloud | Free tier |
| OpenAI | Cloud | Paid |
| OpenRouter | Cloud | Pay-per-use |
| Cerebras | Cloud | Free tier |
Quick Start
1. Install the Navni CLI
pip install cognify-code
2. Set up a model
# Local (free, private)
ollama pull deepseek-coder:6.7b
# For inline completions (optional, recommended)
ollama pull deepseek-coder:1.3b
3. Install this extension and start coding!
The sidebar opens automatically. Type a message or just start coding to see inline completions.
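After pulling the models, it can be useful to confirm that Ollama is running and actually has them. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models. It assumes Ollama is listening on its default address, `http://localhost:11434`. `checkOllama` and `missingModels` are hypothetical helper names and are not part of the extension or the CLI.

```typescript
// Hypothetical helper: verify that the Quick Start models are installed.
// Assumes Ollama's default local address; adjust if yours differs.

const OLLAMA_URL = "http://localhost:11434";

/** Names in `required` that do not appear in `installed`. */
function missingModels(installed: string[], required: string[]): string[] {
  const have = new Set(installed);
  return required.filter((name) => !have.has(name));
}

/** Ask the local Ollama server which required models still need pulling. */
async function checkOllama(required: string[]): Promise<string[]> {
  const response = await fetch(`${OLLAMA_URL}/api/tags`);
  if (!response.ok) {
    throw new Error(`Ollama responded with HTTP ${response.status}`);
  }
  const body = (await response.json()) as { models: { name: string }[] };
  return missingModels(body.models.map((m) => m.name), required);
}
```

Calling `checkOllama(["deepseek-coder:6.7b", "deepseek-coder:1.3b"])` resolves to the models that still need an `ollama pull`. A rejected promise usually means the Ollama server is not running.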
Keyboard Shortcuts
| Command | Windows/Linux | Mac |
|---|---|---|
| Review File | Ctrl+Shift+R | Cmd+Shift+R |
| Generate Code | Ctrl+Shift+G | Cmd+Shift+G |
| Explain Selection | Ctrl+Shift+E | Cmd+Shift+E |
| Open Chat | Ctrl+Shift+C | Cmd+Shift+C |
Right-click selected code for Review Selection, Explain Selection, and Edit Code with AI.
Settings
| Setting | Description | Default |
|---|---|---|
| cognify.provider | LLM provider | ollama |
| cognify.model | Chat model | deepseek-coder:6.7b |
| cognify.completion.enabled | Enable inline completions | true |
| cognify.completion.model | Completion model (fast, small) | (uses chat model) |
| cognify.completion.debounceMs | Typing pause before completing | 300 |
| cognify.autoContext | Auto-include relevant files | true |
| cognify.maxContextTokens | Max context tokens | 8000 |
| cognify.cliPath | Custom CLI path (Android/Termux) | (auto-detect) |
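The defaults in the table can be written down as a typed object, which also makes the "(uses chat model)" fallback explicit. This is only a sketch of the documented behavior. `CognifySettings`, `DEFAULTS`, `resolveSettings`, and `completionModel` are hypothetical names, and the way the extension resolves unset values internally is an assumption.

```typescript
// Hypothetical sketch: documented settings with their defaults.
// Keys and default values come from the table; everything else is assumed.

interface CognifySettings {
  "cognify.provider": string;
  "cognify.model": string;
  "cognify.completion.enabled": boolean;
  "cognify.completion.model"?: string; // unset: fall back to the chat model
  "cognify.completion.debounceMs": number;
  "cognify.autoContext": boolean;
  "cognify.maxContextTokens": number;
  "cognify.cliPath"?: string; // unset: auto-detect
}

const DEFAULTS: CognifySettings = {
  "cognify.provider": "ollama",
  "cognify.model": "deepseek-coder:6.7b",
  "cognify.completion.enabled": true,
  "cognify.completion.debounceMs": 300,
  "cognify.autoContext": true,
  "cognify.maxContextTokens": 8000,
};

/** Merge user overrides on top of the documented defaults. */
function resolveSettings(overrides: Partial<CognifySettings>): CognifySettings {
  return { ...DEFAULTS, ...overrides };
}

/** The model used for inline completions, falling back to the chat model. */
function completionModel(settings: CognifySettings): string {
  return settings["cognify.completion.model"] || settings["cognify.model"];
}
```

With no overrides, `completionModel` returns the chat model. Setting `cognify.completion.model` to a smaller model such as `deepseek-coder:1.3b` changes only the completion path.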
Troubleshooting
"cognify not found"
pip install cognify-code
cognify --version
Slow completions
- Use a small model for completions: set cognify.completion.model to deepseek-coder:1.3b
- Lower cognify.completion.debounceMs to 200
Connection errors
cognify status
Contributing
Contributions welcome! Visit our GitHub repository.
License
MIT License — see LICENSE for details.