Ollama Cloud VSCode Extension
An AI coding assistant for Visual Studio Code, powered by Ollama Cloud and local Ollama. Similar to Cline, this extension provides code assistance, file editing, and command execution, each subject to your approval.
Features
- 🤖 AI Chat Interface: Interactive chat with Ollama Cloud and local Ollama models
- 📝 Smart Code Editing: AI can read, write, and modify files with your approval
- 🔧 Command Execution: Execute terminal commands suggested by the AI
- 🔄 Diff View: Review changes before applying them
- 🎯 Multiple Models: Support for various Ollama models (Llama, CodeLlama, Mistral, etc.)
- ⚡ Real-time Streaming: See AI responses as they're generated
- 🎨 VSCode Integration: Native VSCode UI with dark/light theme support
- 🧠 Enhanced Context Awareness: Tracks tasks, files, and commands for better continuity
- 💾 Session Persistence: Auto-saves and restores conversations across VSCode restarts
- 📊 Token Usage Tracking: Monitor your Ollama Cloud API usage with visual indicators
- 🗂️ Workspace Indexing: Automatically understands your project structure
- 🔁 Automatic Retry: Smart retry logic with exponential backoff for rate limits
- ⏱️ Configurable Timeouts: Adjust request timeouts for slower connections
What's New in v0.1.11
🔍 Web Research Capabilities
- Search the Web: AI can now search the internet using DuckDuckGo (no API key required)
- Deep Research: Automatically fetches and analyzes content from top search results
- URL Fetching: Extract and analyze text from any webpage
- File Downloads: Download files from URLs for analysis
- Commands: searchWeb, researchTopic, fetchUrl
👋 Interactive Walkthrough & Onboarding
- Welcome Tour: Step-by-step introduction to all features on first use
- Tips & Tricks: Comprehensive tips panel with categorized advice
- Keyboard Shortcuts: Quick reference guide for all shortcuts
- Feature Discovery: Interactive tour of chat, code actions, autocomplete, terminal, research, and notebooks
🔧 Development Mode
- Hot Reload: Automatic file watching with reload prompts
- Dev Panel: Quick actions for reload, clear cache, and logs
- Output Channel: Detailed development logs
- Statistics: Track active watchers and extension state
💾 State Manager with Debounced Persistence
- Fast In-Memory Reads: Instant access to state data
- Debounced Writes: 500ms batched writes for performance
- Batch Operations: Efficient bulk state updates
- Secrets Support: Secure credential storage
- Statistics Tracking: Monitor cache sizes and pending changes
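The debounced-write idea above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual state manager: the class name `DebouncedStore`, the `persist` callback, and the `pendingCount` getter are hypothetical, and only the 500 ms delay comes from the notes above.

```typescript
// Sketch only: `DebouncedStore` and `Persist` are hypothetical names.
type Persist = (snapshot: Record<string, unknown>) => void;

class DebouncedStore {
  private cache = new Map<string, unknown>();
  private dirty = new Set<string>();
  private timer: ReturnType<typeof setTimeout> | undefined;
  private persist: Persist;
  private delayMs: number;

  constructor(persist: Persist, delayMs = 500) {
    this.persist = persist;
    this.delayMs = delayMs;
  }

  // Reads never touch storage; they are served from memory.
  get(key: string): unknown {
    return this.cache.get(key);
  }

  // Each write restarts the timer, so a burst of writes becomes one flush.
  set(key: string, value: unknown): void {
    this.cache.set(key, value);
    this.dirty.add(key);
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flush(), this.delayMs);
  }

  // Persist all changed keys in a single batch and cancel any pending timer.
  flush(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.dirty.size === 0) return;
    const snapshot: Record<string, unknown> = {};
    this.dirty.forEach((key) => {
      snapshot[key] = this.cache.get(key);
    });
    this.dirty.clear();
    this.persist(snapshot);
  }

  // Number of keys changed since the last flush.
  get pendingCount(): number {
    return this.dirty.size;
  }
}
```

Calling `flush()` explicitly, for example on deactivation, would avoid losing changes that are still waiting for the timer.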
📓 Jupyter Notebook Support
- Cell Operations: Explain, fix, optimize, and generate notebook cells
- Output Integration: Analyzes cell outputs for better context
- Full Context: Extracts entire notebook structure
- Commands: explainNotebookCell, fixNotebookCell, optimizeNotebookCell, generateNotebookCell
🔗 URI Handler for Deep Linking
- Deep Links: Open extension features via URLs
- Supported URIs:
vscode://ollama-cloud/chat?message=Hello&autoSend=true
vscode://ollama-cloud/explain?file=/path/to/file.ts&line=10
vscode://ollama-cloud/fix?file=/path/to/file.ts&line=10
vscode://ollama-cloud/generate?type=test&file=/path/to/file.ts
vscode://ollama-cloud/model?select=chat
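As a rough sketch of how such links could be broken into an action and its parameters, the snippet below uses the standard `URL` class. The `DeepLink` interface and `parseDeepLink` function are hypothetical names for illustration; the extension's real handler is not shown in this README and may work differently.

```typescript
// Sketch only: `DeepLink` and `parseDeepLink` are hypothetical, not the extension's API.
interface DeepLink {
  action: string;                 // e.g. "chat", "explain", "fix"
  params: Record<string, string>; // decoded query parameters
}

function parseDeepLink(uri: string): DeepLink | undefined {
  let url: URL;
  try {
    url = new URL(uri);
  } catch {
    return undefined; // not a parseable URI
  }
  // Only accept links addressed to this extension.
  if (url.protocol !== "vscode:" || url.hostname !== "ollama-cloud") {
    return undefined;
  }
  const action = url.pathname.replace(/^\/+/, "");
  if (action === "") return undefined;

  const params: Record<string, string> = {};
  url.searchParams.forEach((value, key) => {
    params[key] = value;
  });
  return { action, params };
}
```

Query values arrive as strings, so a handler would still need to convert fields such as `line` or `autoSend` before using them.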
📋 Enhanced Terminal Integration
- Clipboard-Based Capture: Preserves your clipboard while capturing terminal output
- Error Analysis: Explain terminal errors with full context
- Command Suggestions: AI suggests commands based on your description
- OS-Aware: Adapts to Windows, macOS, and Linux
🎯 Code Action Provider
- Right-Click Menu: Access AI features directly from code
- Actions: Add to Chat, Explain Code, Improve Code, Fix with Ollama
- Smart Context: Auto-expands 3 lines above/below for better understanding
- Diagnostic Integration: Fixes errors based on VS Code diagnostics
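The three-line context expansion mentioned above amounts to widening a line range and clamping it to the document bounds. The sketch below assumes zero-based, inclusive line numbers; `LineRange` and `expandRange` are hypothetical names, not taken from the extension.

```typescript
// Sketch only: zero-based, inclusive line range (an assumption for this example).
interface LineRange {
  start: number;
  end: number;
}

// Widen a selection by `padding` lines on each side without leaving the document.
function expandRange(selection: LineRange, lineCount: number, padding = 3): LineRange {
  return {
    start: Math.max(0, selection.start - padding),
    end: Math.min(lineCount - 1, selection.end + padding),
  };
}
```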
⚡ Enhanced Autocomplete
- LRU Cache: Smart caching with max 100 entries
- Performance Tracking: Monitor cache hit rates and request counts
- Automatic Cleanup: Periodic maintenance every 60 seconds
- Optimized Debounce: 250ms for faster responses
- Statistics: getStats() method for debugging
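For readers unfamiliar with the technique, an LRU cache with a size cap and hit-rate statistics can be sketched as below. Only the 100-entry limit and the existence of a `getStats()` method come from the notes above; the class name `LruCache` and the shape of the returned statistics are assumptions.

```typescript
// Sketch only: the stats shape returned by getStats() is an assumption.
class LruCache<V> {
  // A Map iterates in insertion order, so the first key is the least recently used.
  private entries = new Map<string, V>();
  private hits = 0;
  private requests = 0;
  private maxEntries: number;

  constructor(maxEntries = 100) {
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    this.requests++;
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    this.hits++;
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  getStats(): { size: number; requests: number; hits: number; hitRate: number } {
    const hitRate = this.requests === 0 ? 0 : this.hits / this.requests;
    return { size: this.entries.size, requests: this.requests, hits: this.hits, hitRate };
  }
}
```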
What's New in v0.1.10
Enhanced Token Usage Display
- Top-Position Token Usage: Token usage now appears prominently at the top of the chat interface
- Model-Specific Styling: Local models show green styling, cloud models show blue styling with gradient backgrounds
- Visual Progress Bars: Real-time token usage tracking with color-coded progress bars (green/yellow/red)
- Monthly Usage Tracking: Monitor your Ollama Cloud API usage with detailed breakdowns
Advanced File Editing Features
- Model-Specific Content Fixes: Automatically fixes common issues with different AI models:
- Removes escape characters for Gemini, Llama, Mistral models
- Strips markdown codeblock markers for DeepSeek, Llama, Mistral models
- Converts HTML entities for DeepSeek models
- Cleans up whitespace for models like "minsteral"/"minstral"
- Handles JSON and YAML file formatting
- Enhanced File Path Validation: Improved security with better pattern matching and validation
- Better Error Handling: More robust error messages and validation
Improved User Interface
- Action Approval Cards: New styled cards for file edits and command execution with gradient backgrounds
- File Read Notifications: Visual indicators when files are being read
- Enhanced Environment Info: Better workspace information display with colored chips
- Improved Markdown Rendering: Better table support and formatting
Advanced Context Management
- File Context Tracking: Tracks files that are read, edited, or mentioned
- Task Context Awareness: Maintains context about files created/modified during tasks
- Session Restoration: Improved session management and restoration
Code Quality Improvements
- TypeScript Best Practices: Better type definitions and error handling
- Code Organization: Cleaner separation of concerns
- Performance Optimizations: More efficient file operations and UI updates
What's New in v0.1.9
Bug Fixes
- Session Restore Now Works: Restoring a previous session now properly displays all chat messages in the chat window
- Default Model Settings: The ollamaCloud.chatModel setting now properly sets the selected model in the dropdown
- Premium Models Indicator: Premium models (70B+, Mixtral, Claude, GPT-4, etc.) are clearly marked with a 💎 icon
What's New in v0.1.8
Action Approval UI (Cline-Style)
- "Ollama wants to edit {filename}": Beautiful purple/indigo gradient cards appear when AI suggests file edits
- "Ollama wants to run command": Amber/yellow gradient cards for command execution requests
- "Ollama is reading file": Blue notification cards when AI reads files
- Apply/Skip Buttons: User-friendly buttons to approve or skip each action
- Task Completion Banner: Green gradient banner shows summary when all actions complete
Model Dropdown Enhancements
- Source Indicators: 💻 for local models, ☁️ for cloud models
- Premium Model Indicator: 💎 badge for large/expensive models (70B+, Mixtral, Claude, GPT-4)
- Smart Sorting: Local models appear first in the dropdown
Custom System Prompt
- New Setting: ollamaCloud.customSystemPrompt with dynamic placeholders
- Placeholders: {{OS}}, {{SHELL}}, {{WORKSPACE}}, {{WORKSPACE_PATH}}, {{FILE_TREE}}, {{NPM_SCRIPTS}}, {{MODE}}
- Full Control: Replace the entire system prompt or leave empty for default
Session Restore Improvements
- No Auto-Restore: Sessions no longer automatically restore on startup
- User Choice: Prompt with "Restore Session" and "Start Fresh" buttons
Other Improvements
- Increased Timeout: Default timeout increased from 30s to 120s for large models
- No Placeholder Code: AI now provides complete, copy-paste ready code (no more "// ... rest of code")
What's New in v0.1.4
- Full Cross-Platform Support: Commands now work seamlessly on Windows, macOS, and Linux
- Platform-Aware Shell Selection: Automatically uses the appropriate shell (PowerShell on Windows, zsh on macOS, bash on Linux)
- Smart Command Chaining: Uses correct command separators for each platform (; for PowerShell, && for Unix shells)
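The platform rules above can be expressed as two small functions. This is an illustrative sketch: `shellFor` and `chainCommands` are hypothetical names, and the platform strings follow Node's `process.platform` values as an assumption about how the extension detects the OS.

```typescript
// Sketch only: function names are hypothetical; platform strings mirror process.platform.
type Platform = "win32" | "darwin" | "linux";

// Pick the shell described in the release notes for each platform.
function shellFor(platform: Platform): string {
  switch (platform) {
    case "win32":
      return "powershell";
    case "darwin":
      return "zsh";
    default:
      return "bash";
  }
}

// Join commands with the separator the target shell understands.
function chainCommands(commands: string[], platform: Platform): string {
  const separator = platform === "win32" ? "; " : " && ";
  return commands.join(separator);
}
```

One behavioral difference is worth keeping in mind: `&&` stops at the first failing command, whereas `;` in PowerShell runs the next command regardless of the previous exit status.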
What's New in v0.1.3
Improved AI Response Reliability
- Official Ollama SDK: Now uses the official ollama npm package for better compatibility and reliability
- Streaming Responses: See AI output in real-time as it's generated (configurable)
- Context Window Management: Proper num_ctx parameter support (default 32768) for longer conversations
- Retry Logic: Automatic retry with exponential backoff for rate-limited requests
- Request Cancellation: Working Cancel button to abort long-running requests
- Actual Token Counts: Real token usage from API instead of estimates
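Retry with exponential backoff, as mentioned in the list above, generally looks like the sketch below. The base delay (1 s), cap (30 s), and attempt limit (4) are assumptions chosen for illustration, and `backoffDelay` and `withRetry` are hypothetical names; the README does not state the values the extension actually uses.

```typescript
// Sketch only: delay values and attempt limit are assumptions, not the extension's settings.

// Delay doubles with each attempt (0-based) and is capped at maxMs.
function backoffDelay(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Run `task`, retrying only when `isRetryable` accepts the error (e.g. a rate limit).
async function withRetry<T>(
  task: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      const outOfAttempts = attempt + 1 >= maxAttempts;
      if (outOfAttempts || !isRetryable(error)) throw error;
      await sleep(backoffDelay(attempt));
    }
  }
}
```

Passing `sleep` as a parameter keeps the waiting logic replaceable, which makes the retry loop easy to exercise without real delays.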
New Configuration Options
- enableStreaming - Toggle streaming responses on/off
- contextWindow - Set the context window size (2048-131072)
- requestTimeout - Set request timeout in milliseconds (5000-600000)
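The documented ranges can be enforced with a simple clamp, sketched below. Whether the extension clamps out-of-range values or rejects them is not stated here, so treat the behavior as an assumption; `LIMITS` and `clampSetting` are hypothetical names.

```typescript
// Sketch only: ranges come from the list above; clamping (vs. rejecting) is an assumption.
const LIMITS = {
  contextWindow: { min: 2048, max: 131072 },
  requestTimeout: { min: 5000, max: 600000 },
} as const;

// Force a numeric setting into its documented range.
function clampSetting(name: keyof typeof LIMITS, value: number): number {
  const { min, max } = LIMITS[name];
  return Math.min(max, Math.max(min, value));
}
```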
Installation
From VSIX (Local Installation)
- Download the .vsix file
- Open VSCode
- Go to Extensions (Ctrl+Shift+X)
- Click the "..." menu at the top
- Select "Install from VSIX..."
- Choose the downloaded file
From Source
- Clone this repository
- Run npm install
- Run npm run compile
- Press F5 to open a new VSCode window with the extension loaded
Setup
Option 1: Ollama Cloud (Recommended for beginners)
Get Ollama Cloud API Key
- Sign up at ollama.com
- Go to your account settings and generate an API key
Configure the Extension
- Open VSCode Settings (Ctrl+,)
- Search for "Ollama Cloud"
- Enter your API key in ollamaCloud.cloudApiKey
Option 2: Local Ollama
Install Ollama
- Download from ollama.com
- Run ollama serve to start the local server
Pull a Model
ollama pull llama3.1
Configure the Extension
- The extension will automatically detect local models
- No API key needed for local models
Option 3: Both (Hybrid)
Use both local and cloud models! The extension automatically routes requests to the appropriate endpoint based on where each model is available.
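Conceptually, the routing described above is a lookup of the model name against the models each endpoint offers. The sketch below is an illustration only: `routeModel` is a hypothetical name, and preferring the local endpoint when a model exists in both places is an assumption, not documented behavior.

```typescript
// Sketch only: preferring "local" when both endpoints have the model is an assumption.
type Endpoint = "local" | "cloud";

function routeModel(
  model: string,
  localModels: ReadonlySet<string>,
  cloudModels: ReadonlySet<string>,
): Endpoint | undefined {
  if (localModels.has(model)) return "local";
  if (cloudModels.has(model)) return "cloud";
  return undefined; // model is not available anywhere
}
```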
Usage
Opening the Chat
- Click the Ollama Cloud icon in the Activity Bar (left sidebar)
- Or use the keyboard shortcut: Ctrl+Shift+O (Windows/Linux) or Cmd+Shift+O (Mac)
- Or open Command Palette (Ctrl+Shift+P) and run "Ollama Cloud: Open Chat"
Chatting with the AI
Simply type your question or request in the chat input and press Enter. The AI can help with:
- Writing new code
- Explaining existing code
- Debugging errors
- Refactoring code
- Creating new files
- Running commands
- And much more!
Operating Modes
ACT Mode (Default)
- AI can suggest file edits and commands
- You approve each action before it's executed
- Best for getting things done
PLAN Mode
- AI explains what should be done without executing
- Great for understanding complex tasks
- Use for learning and planning
File Editing
When the AI suggests file changes, it will format them like this:
// File: src/example.js
function hello() {
  console.log("Hello, World!");
}
You'll be prompted to approve the changes before they're applied. If "Show Diff" is enabled in settings, you'll see a side-by-side comparison.
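To make the format concrete, here is one way a block with a leading `// File:` comment could be split into a target path and file content. This is a sketch under the assumption that the header is always the first line; `FileEdit` and `parseFileEdit` are hypothetical names, and the extension's own parser may accept other variations.

```typescript
// Sketch only: assumes the "// File:" header is the first line of the block.
interface FileEdit {
  path: string;
  content: string;
}

function parseFileEdit(block: string): FileEdit | undefined {
  const lines = block.split(/\r?\n/);
  const header = (lines[0] ?? "").trim();
  const match = /^\/\/\s*File:\s*(.+)$/.exec(header);
  if (!match) return undefined; // no file header, so nothing to apply
  return {
    path: (match[1] ?? "").trim(),
    content: lines.slice(1).join("\n"),
  };
}
```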
Command Execution
When the AI suggests commands, they'll be formatted like:
npm install axios
You'll be asked to confirm before the command is executed.
Configuration
| Setting | Description | Default |
| --- | --- | --- |
| ollamaCloud.cloudApiKey | Your Ollama Cloud API key | (empty) |
| ollamaCloud.localEndpoint | Local Ollama server URL | http://localhost:11434 |
| ollamaCloud.chatModel | AI model for chat | llama3.1 |
| ollamaCloud.autocompleteModel | AI model for autocomplete | ministral-3:3b |
| ollamaCloud.defaultMode | Default operating mode | act |
| ollamaCloud.customSystemPrompt | Custom system prompt with placeholders | (empty) |
| ollamaCloud.temperature | Response creativity (0-2) | 0.7 |
| ollamaCloud.maxTokens | Maximum response length | 4096 |
| ollamaCloud.contextWindow | Context window size | 32768 |
| ollamaCloud.requestTimeout | Request timeout (ms) | 120000 |
| ollamaCloud.enableStreaming | Enable streaming responses | true |
| ollamaCloud.autoApprove | Auto-approve AI actions | false |
| ollamaCloud.showDiff | Show diff before applying changes | true |
| ollamaCloud.enableAutocomplete | Enable AI code completion | true |
| ollamaCloud.enableCodeLens | Show AI action buttons in code | true |
Custom System Prompt Placeholders
When customSystemPrompt is set, the following placeholders are replaced with actual values:
| Placeholder | Description |
| --- | --- |
| {{OS}} | Operating system (Windows, macOS, Linux) |
| {{SHELL}} | Shell type (PowerShell, Bash, etc.) |
| {{WORKSPACE}} | Current workspace name |
| {{WORKSPACE_PATH}} | Full path to workspace |
| {{FILE_TREE}} | Project file structure |
| {{NPM_SCRIPTS}} | Available npm scripts |
| {{MODE}} | Current mode (ACT or PLAN) |
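Placeholder substitution of this kind can be sketched with a single regular-expression replace. The function name `renderPrompt` is hypothetical, and leaving unknown placeholders untouched is an assumption about the desired behavior rather than something this README specifies.

```typescript
// Sketch only: unknown placeholders are left as-is (an assumption).
function renderPrompt(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (whole: string, name: string) => {
    const value = values[name];
    return value !== undefined ? value : whole;
  });
}
```

For example, a template such as "You are on {{OS}} using {{SHELL}}" would be filled in from a values object keyed by `OS` and `SHELL`.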
Available Models
Cloud Models (Ollama Cloud)
- llama3.1 - Llama 3.1 (recommended, default chat model)
- llama3.2 - Llama 3.2
- ministral-3:3b - Fast, efficient for autocomplete
- gemma3:4b - Google's Gemma 3
Local Models (requires local Ollama)
- codellama - Specialized for coding
- mistral - Fast and efficient
- mixtral - Mixture of experts model
- qwen2.5-coder - Specialized coding model
- phi3 - Microsoft's Phi-3
Keyboard Shortcuts
| Shortcut | Action |
| --- | --- |
| Ctrl+Shift+O | Open Chat |
| Ctrl+Shift+N | New Task |
| Ctrl+K | Inline Chat (with selection) |
| Ctrl+Shift+E | Explain Code |
| Ctrl+Shift+T | Generate Tests |
| Ctrl+Shift+F | Fix Code |
| Ctrl+Shift+R | Review Code |
| Ctrl+Shift+M | Modernize Code |
| Enter | Send message |
| Shift+Enter | New line in message |
Commands
Chat & Core
- Ollama Cloud: Open Chat - Open the chat interface
- Ollama Cloud: New Task - Start a new conversation
- Ollama Cloud: Clear History - Clear chat history
- Ollama Cloud: Select Model - Choose a different AI model
Code Actions
- Ollama Cloud: Explain Code - Explain selected code
- Ollama Cloud: Fix Code - Fix issues in selected code
- Ollama Cloud: Improve Code - Suggest code improvements
- Ollama Cloud: Add to Chat - Add selected code to chat
- Ollama Cloud: Generate Tests - Generate tests for selected code
- Ollama Cloud: Review Code - Get a code review
- Ollama Cloud: Modernize Code - Update code to modern standards
Terminal Integration
- Ollama Cloud: Add Terminal Output - Add selected terminal output to chat
- Ollama Cloud: Explain Terminal Error - Explain terminal errors
- Ollama Cloud: Suggest Terminal Command - Get command suggestions
Web Research
- Ollama Cloud: Search Web - Search the internet and add results to chat
- Ollama Cloud: Research Topic - Deep research with content fetching
- Ollama Cloud: Fetch URL - Fetch and analyze URL content
Jupyter Notebooks
- Ollama Cloud: Explain Notebook Cell - Explain current notebook cell
- Ollama Cloud: Fix Notebook Cell - Fix errors in notebook cell
- Ollama Cloud: Optimize Notebook Cell - Optimize cell for performance
- Ollama Cloud: Generate Notebook Cell - Generate cell from description
Walkthrough & Help
- Ollama Cloud: Show Welcome - Show welcome tour
- Ollama Cloud: Show Tips - Show tips and tricks
- Ollama Cloud: Show Shortcuts - Show keyboard shortcuts
Development
- Ollama Cloud: Toggle Dev Mode - Enable/disable development mode
- Ollama Cloud: Show Dev Stats - Show development statistics
Tips
- Be Specific: The more specific your request, the better the AI can help
- Provide Context: Mention file names, error messages, or relevant code
- Review Changes: Always review AI-suggested changes before applying
- Use New Task: Start a new task for unrelated questions so earlier context does not carry over
- Experiment with Models: Different models excel at different tasks
- Adjust Timeout: Increase requestTimeout for larger models or slower connections
- Use Streaming: Keep streaming enabled to see responses as they generate
Troubleshooting
"Invalid API key" Error
- Verify your API key in settings
- Make sure you have an active Ollama Cloud account
"Request timed out" Error
- Increase requestTimeout in settings (default 120000 ms)
- Try a smaller/faster model
- Check your internet connection
Extension Not Loading
- Check the Output panel (View → Output → Ollama Cloud)
- Try reloading VSCode (Ctrl+Shift+P → "Reload Window")
Slow Responses
- Try a smaller model (e.g., ministral-3:3b instead of llama3.1)
- Enable streaming to see partial responses
- Check your internet connection
- Reduce maxTokens in settings
Local Ollama Not Detected
- Make sure Ollama is running (ollama serve)
- Check that the localEndpoint setting matches your Ollama server
Privacy & Security
- Your code is sent to Ollama Cloud or your local Ollama for processing
- API keys are stored locally in VSCode settings
- No data is stored by this extension beyond session persistence
- Review Ollama's privacy policy for cloud usage details
Development
Building from Source
# Install dependencies
npm install
# Compile TypeScript
npm run compile
# Watch for changes
npm run watch
# Package extension
npm run package
Running Tests
npm test
Contributing
Contributions are welcome! Please feel free to submit issues or pull requests.
License
MIT License - See LICENSE file for details
Credits
Inspired by the Cline VSCode extension. Built with ❤️ for the developer community.
Support
Note: This extension works with both Ollama Cloud (requires API key) and local Ollama (free, requires local installation). Visit ollama.com to get started.