Local AI

AI chat for your code, powered by a local Ollama LLM — fully offline, no cloud APIs, no data leaving your machine.

Features

💬 Ask

Chat with AI about your code
Streaming responses — tokens appear as the model generates them
Markdown rendering — formatted output with syntax-highlighted code blocks
Copy / Apply / Save buttons on every code block
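
As a rough illustration of how streaming can work: Ollama's `/api/chat` endpoint streams newline-delimited JSON, one object per line, each carrying a fragment of the reply in `message.content`. The sketch below is not the extension's actual code; `NdjsonTokenParser` and `ChatChunk` are hypothetical names. It shows one way to turn raw network chunks, which may split a JSON line in the middle, into displayable tokens.

```typescript
// Sketch only: the extension's real streaming code is not shown in this README.
// Subset of one streamed line from Ollama's /api/chat endpoint.
interface ChatChunk {
  message?: { role: string; content: string };
  done: boolean;
}

// Hypothetical helper: buffers raw text and emits tokens once a full line arrives.
class NdjsonTokenParser {
  private buffer = "";

  // Feed one raw chunk; returns the tokens completed by this chunk.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""); keep it for the next chunk.
    this.buffer = lines.pop() ?? "";
    const tokens: string[] = [];
    for (const line of lines) {
      if (line.trim() === "") continue;
      const parsed = JSON.parse(line) as ChatChunk;
      if (parsed.message?.content) tokens.push(parsed.message.content);
    }
    return tokens;
  }
}
```

Each returned token can be appended to the chat view as soon as it arrives, which is what makes the response appear incrementally.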

📎 Context Attach

Auto-context — active file is automatically attached when no text is selected
Selection context — highlight code in the editor and it automatically appears as a context chip
📎 button — attach current file, selection, a folder (recursive), or browse individual files
Drag & drop — drag files from Explorer or Finder directly into the chat
Right-click — right-click any file/folder in Explorer → Add to AI Chat Context

🗂️ Conversation History

Conversations are saved per workspace
Use the 🕐 button to browse, switch, or delete past chats
Auto-titled from the first message

🔧 Model Management

Dropdown shows installed models with parameter size and quantization badges (e.g. 3.2B · Q4_K_M)
Library models show size info and description before you pull
One-click ↓ pull for any model with live progress bar
Hover over an installed model → ✕ to uninstall (two-step confirmation)
Connection indicator: green = Ollama running, red = offline
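
The badge text can be derived from Ollama's `GET /api/tags` response, where each installed model carries a `details` object with `parameter_size` and `quantization_level`. A minimal sketch follows, assuming the extension reads those two fields; `formatBadge` and `InstalledModel` are illustrative names, not the extension's own.

```typescript
// Subset of one entry in the "models" array returned by GET /api/tags.
interface InstalledModel {
  name: string;
  details?: { parameter_size?: string; quantization_level?: string };
}

// Hypothetical helper: joins the available fields, skipping any that are missing.
function formatBadge(model: InstalledModel): string {
  const parts = [model.details?.parameter_size, model.details?.quantization_level];
  return parts.filter((p): p is string => Boolean(p)).join(" · ");
}
```

Skipping missing fields keeps the dropdown readable for models whose metadata is incomplete.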

Keyboard Shortcuts

Action	macOS	Windows / Linux
Open Chat	`Cmd+Shift+L`	`Ctrl+Shift+L`
Send message	`Enter`	`Enter`
New line	`Shift+Enter`	`Shift+Enter`

Prerequisites

1. Install Ollama

# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows / macOS — download from https://ollama.com/download

⚠️ Do not use brew install ollama — the Homebrew package is missing the llama-server binary.

2. Pull a model

# Lightweight & fast (good for low-resource machines)
ollama pull gemma:2b
ollama pull llama3.2:3b
ollama pull phi3:mini

# Coding-focused
ollama pull qwen2.5-coder:7b
ollama pull codellama:7b
ollama pull deepseek-coder:6.7b

# Balanced quality / speed
ollama pull mistral:7b
ollama pull llama3.1:8b

3. Start Ollama

ollama serve
# "address already in use" → Ollama is already running, that's fine
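
To confirm that the server is reachable, any cheap request against the API works. The sketch below assumes the check is done with `GET /api/tags` (the endpoint that lists installed models); `isOllamaRunning` and `FetchLike` are hypothetical names, and the fetch function is passed in so the logic can be exercised without a running server.

```typescript
// Minimal fetch-like signature; the global fetch in Node 18+ or a browser satisfies it.
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

// Hypothetical helper: true if the server answers with a 2xx status, false otherwise.
async function isOllamaRunning(baseUrl: string, fetchFn: FetchLike): Promise<boolean> {
  try {
    // Strip trailing slashes so "http://host:11434/" and "http://host:11434" behave the same.
    const res = await fetchFn(`${baseUrl.replace(/\/+$/, "")}/api/tags`);
    return res.ok;
  } catch {
    // A refused connection or DNS failure means the server is not reachable.
    return false;
  }
}
```

For example, `isOllamaRunning("http://localhost:11434", fetch)` resolves to `true` when `ollama serve` is up, which corresponds to the green connection indicator.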

Configuration

Open Settings (Cmd+, on macOS, Ctrl+, on Windows / Linux) and search for localAIPrompt:

Setting	Default	Description
`localAIPrompt.ollamaUrl`	`http://localhost:11434`	Ollama API base URL
`localAIPrompt.model`	`gemma:2b`	Model to use
`localAIPrompt.systemPrompt`	(coding assistant)	System prompt sent to the model
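
One way these settings could be resolved against their defaults is sketched below. It is an assumption about structure, not the extension's source: `resolveSettings` and `LocalAISettings` are hypothetical names, and the default system prompt is passed in as a parameter because its text is not given here.

```typescript
interface LocalAISettings {
  ollamaUrl: string;
  model: string;
  systemPrompt: string;
}

// Defaults taken from the settings table above.
const DEFAULT_OLLAMA_URL = "http://localhost:11434";
const DEFAULT_MODEL = "gemma:2b";

// Hypothetical helper: `get` looks up one key under the "localAIPrompt" section.
// Unset or blank values fall back to the defaults.
function resolveSettings(
  get: (key: string) => string | undefined,
  defaultSystemPrompt: string
): LocalAISettings {
  const pick = (key: string, fallback: string): string => {
    const value = get(key);
    return value !== undefined && value.trim() !== "" ? value : fallback;
  };
  return {
    ollamaUrl: pick("ollamaUrl", DEFAULT_OLLAMA_URL),
    model: pick("model", DEFAULT_MODEL),
    systemPrompt: pick("systemPrompt", defaultSystemPrompt),
  };
}
```

Inside a VS Code extension, `get` would typically wrap `vscode.workspace.getConfiguration("localAIPrompt").get<string>(key)`.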

Run Locally (Development)

cd local-ai-prompt
npm install
npm run compile
# Press F5 in VS Code → opens Extension Development Host

Package as VSIX

npm install -g @vscode/vsce
vsce package

Install: Extensions → ... → Install from VSIX…

Troubleshooting

Error	Fix
Red connection dot	Run `ollama serve`
`model 'xxx' not found`	Run `ollama pull <model-name>` or use the dropdown
`llama-server binary not found`	Reinstall Ollama via official installer, not Homebrew
`address already in use`	Ollama already running — no action needed

License

MIT

Local AI Prompt

Linh Hoang