Ollama Cloud VSCode Extension
An AI-powered coding assistant for Visual Studio Code with multi-provider support. Works with Ollama Cloud, local Ollama, Anthropic Claude, OpenAI ChatGPT, and more - similar to Cline. This extension provides intelligent code assistance, file editing, and command execution capabilities across multiple AI providers.
Features
- 🤖 Multi-Provider AI Chat: Interactive chat with Ollama (local/cloud), Anthropic Claude, OpenAI ChatGPT, and more
- 📝 Smart Code Editing: AI can read, write, and modify files with your approval
- 🔧 Command Execution: Execute terminal commands suggested by the AI
- 🔄 Diff View: Review changes before applying them
- 🎯 Multiple Models: Support for various models across all providers (Llama, Claude 3.5, GPT-4, etc.)
- ⚡ Real-time Streaming: See AI responses as they're generated
- 🎨 VSCode Integration: Native VSCode UI with dark/light theme support
- 🧠 Enhanced Context Awareness: Tracks tasks, files, and commands for better continuity
- 💾 Session Persistence: Auto-saves and restores conversations across VSCode restarts
- 📊 Usage Tracking: Monitor API usage with visual indicators for each provider
- 🗂️ Workspace Indexing: Automatically understands your project structure
- 🔁 Automatic Retry: Smart retry logic with exponential backoff for rate limits
- ⏱️ Configurable Timeouts: Adjust request timeouts for slower connections
- 🔄 Provider Switching: Easily switch between AI providers based on your needs
What's New in v0.1.15
🤖 Multi-Provider AI Support (Major Feature)
- Anthropic Claude: Full integration with Claude 3.5 Sonnet, Opus, and Haiku models
- OpenAI ChatGPT: Support for GPT-4 Turbo, GPT-4, and GPT-3.5 Turbo models
- Ollama Compatibility: Continued support for local and cloud Ollama models
- Unified Interface: Seamless switching between providers without changing workflows
⚙️ Flexible Configuration
- Provider Selection: Choose your preferred AI provider in settings (ollamaCloud.apiProvider)
- API Key Management: Individual API key configuration for each provider
- Model Awareness: Provider-specific model selection with appropriate recommendations
- Backward Compatible: Existing Ollama workflows continue to work unchanged
🎯 Advanced Features
- Cline-Inspired Architecture: Unified API handler system similar to popular AI assistants
- Enterprise Ready: Designed to work with multiple AI providers for redundancy and flexibility
- Cost Optimization: Choose providers based on pricing, performance, or availability
- Future Proof: Easy to add new providers as they become available
What's New in v0.1.18
🧪 Enhanced Testing Framework
📝 Comprehensive Test Coverage
- File Editing Tests: Added 50+ new tests covering file creation, editing, error handling, and reliability scenarios
- Command Execution Tests: Enhanced command executor testing with security, concurrency, and error recovery tests
- Chat Integration Tests: Added tests for various file creation formats and messaging workflows
- Cross-Platform Compatibility: Comprehensive testing across Windows, macOS, and Linux environments
🔍 Reliability Verification
- Error Handling Tests: Comprehensive tests for detailed error message generation and recovery
- Path Handling Tests: Verification of cross-platform path normalization and special character handling
- Concurrency Tests: Tests for simultaneous file operations and command executions
- Edge Case Coverage: Tests for empty content, long paths, invalid inputs, and boundary conditions
🛡️ Security & Safety Tests
- Command Injection Prevention: Tests verifying safe handling of potentially malicious command inputs
- Path Sanitization: Tests ensuring proper validation of file paths and directory traversal prevention
- Resource Management: Tests for proper cleanup and resource handling under various conditions
🛠️ File Editing Reliability Improvements
🔧 Enhanced File Operations
- Robust File Parsing: Improved regex patterns to handle various AI model output formats for file creation
- Simplified Approval Flow: Streamlined file editing process for immediate execution without complex approvals
- Better Error Handling: Comprehensive error logging with detailed messages and stack traces for debugging
- Cross-Platform Path Support: Enhanced path normalization and workspace context handling
⚡ Improved Command Execution
- Enhanced Error Recovery: Better command execution with detailed error reporting and workspace context
- Reliable File Operations: Automatic directory creation and proper path resolution for all file operations
- Immediate Feedback: Clear success/failure messages for all file editing operations
🎯 System Prompt Optimization
- Action-Oriented Instructions: Updated AI system prompt with clearer, more actionable file editing guidance
- Multiple Format Support: AI can now recognize and process various file creation formats
- Complete Code Generation: Emphasis on providing full, copy-paste ready code without placeholders
🎯 Granular Controls
- Autocomplete Streaming: Added enableStreamingForAutocomplete setting for granular control over streaming
- Configurable Delays: New autocompleteDebounceDelay and autocompletePreviewDelay settings
- Adaptive Performance: Added enableAdaptivePerformance setting for automatic performance tuning
- Enhanced Configuration: Users can now fine-tune autocomplete behavior to their preferences
📊 Enhanced User Feedback
- Preview Completions: Implemented immediate feedback with short preview completions
- Progress Indicators: Added visual progress notifications for longer autocomplete operations
- Response Time Tracking: Enhanced logging with response time measurements
- Silent Error Handling: Improved error handling that doesn't interrupt user workflow
- Memory Pressure Detection: Automatic cache size adjustment based on system memory usage
- Adaptive Debounce Timing: Dynamic debounce time based on API response performance
- Resource Management: Enhanced cleanup and timeout handling for optimal resource usage
- Performance Monitoring: Advanced tracking and optimization of autocomplete performance
What's New in v0.1.16
🐛 Bug Fix - Version Number Correction
🔧 Fixed Version Bump Issue
- Corrected Version Progression: Fixed incorrect version bump during automated build process
- Proper Version Chain: Ensured correct version progression from 0.1.15 to 0.1.16
- Build Script Alignment: Verified build script properly increments patch version
- Package Consistency: Confirmed package.json and package-lock.json version synchronization
🧪 Testing Verification
- All Tests Pass: Verified all 215 unit tests continue to pass
- Integration Testing: Confirmed session management, state persistence, and web research features work correctly
- Version-Specific Tests: Validated tutorial experience shows correctly per version update
What's New in v0.1.12
🛠️ Enhancement - Tutorial Experience Improvement
👋 Improved Welcome Tutorial Behavior
- Version-Based Tutorial Display: Tutorial now only shows once per extension version update, not every time VSCode opens
- Automatic Version Tracking: Extension automatically tracks which version was last shown the tutorial
- Cleaner User Experience: Removed "Don't Show Again" option since tutorial is now version-aware
- Updated Messaging: Final tour step clarifies that tutorial only shows per update
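The once-per-version behavior can be sketched as a comparison between the running version and the last version for which the tutorial was shown. This is only an illustration: `KeyValueStore`, `LAST_TUTORIAL_KEY`, and `shouldShowTutorial` are hypothetical names, and the storage the extension actually uses is not described here.

```typescript
// Hypothetical sketch: `KeyValueStore` stands in for whatever persistent
// storage the extension really uses to remember the last tutorial version.
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

const LAST_TUTORIAL_KEY = "lastTutorialVersion"; // assumed key name

/** Returns true once per version and records the version so later calls return false. */
function shouldShowTutorial(store: KeyValueStore, currentVersion: string): boolean {
  if (store.get(LAST_TUTORIAL_KEY) === currentVersion) {
    return false;
  }
  store.set(LAST_TUTORIAL_KEY, currentVersion);
  return true;
}
```

Because the stored value changes only when the tutorial is shown, an explicit "Don't Show Again" flag becomes unnecessary in this design.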
🎉 Major Feature Release - Complete Implementation
This release represents a complete overhaul of the extension with enterprise-grade features, comprehensive testing, and production-ready code quality.
🔍 Web Research Capabilities
- Search the Web: AI can now search the internet using DuckDuckGo (no API key required)
- Deep Research: Automatically fetches and analyzes content from top search results
- URL Fetching: Extract and analyze text from any webpage
- File Downloads: Download files from URLs for analysis
- Commands:
ollama-cloud.searchWeb - Search and add results to chat
ollama-cloud.researchTopic - Deep research with content fetching
ollama-cloud.fetchUrl - Fetch and analyze URL content
👋 Interactive Walkthrough & Onboarding
- Welcome Tour: Step-by-step introduction to all features on first use
- Tips & Tricks: Comprehensive tips panel with categorized advice
- Keyboard Shortcuts: Quick reference guide for all shortcuts
🔧 Development Mode
- Hot Reload: Automatic file watching with reload prompts
- Dev Panel: Quick actions for reload, clear cache, and logs
- Output Channel: Detailed development logs
- Statistics: Track active watchers and extension state
- Commands:
ollama-cloud.toggleDevMode - Enable/disable development mode
ollama-cloud.showDevStats - Show development statistics
💾 State Manager with Debounced Persistence
- Fast In-Memory Reads: Instant access to state data
- Debounced Writes: 500ms batched writes for performance
- Batch Operations: Efficient bulk state updates
- Secrets Support: Secure credential storage
- Statistics Tracking: Monitor cache sizes and pending changes
- Singleton Pattern: Thread-safe initialization with guards
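A minimal sketch of debounced persistence, assuming a synchronous `persist` callback: reads are served from memory, and writes are collected and flushed as one batch after a quiet period (500 ms here, matching the figure above). `DebouncedStore` and `persist` are hypothetical names, not the extension's actual classes.

```typescript
// Sketch only: `DebouncedStore` and `persist` are illustrative names.
type Persist = (batch: Record<string, unknown>) => void;

class DebouncedStore {
  private cache = new Map<string, unknown>();
  private pending = new Map<string, unknown>();
  private timer: ReturnType<typeof setTimeout> | undefined;
  private persist: Persist;
  private delayMs: number;

  constructor(persist: Persist, delayMs = 500) {
    this.persist = persist;
    this.delayMs = delayMs;
  }

  /** Reads come straight from memory. */
  get<T>(key: string): T | undefined {
    return this.cache.get(key) as T | undefined;
  }

  /** Updates memory immediately and restarts the write timer. */
  set(key: string, value: unknown): void {
    this.cache.set(key, value);
    this.pending.set(key, value);
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
    }
    this.timer = setTimeout(() => this.flush(), this.delayMs);
  }

  /** Writes all pending changes as a single batch. */
  flush(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.pending.size === 0) {
      return;
    }
    const batch: Record<string, unknown> = {};
    this.pending.forEach((value, key) => {
      batch[key] = value;
    });
    this.pending.clear();
    this.persist(batch);
  }

  /** Number of keys waiting to be written. */
  get pendingCount(): number {
    return this.pending.size;
  }
}
```

Calling `flush()` on deactivation would ensure pending changes are not lost when the timer has not fired yet.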
📓 Jupyter Notebook Support
- Cell Operations: Explain, fix, optimize, and generate notebook cells
- Output Integration: Analyzes cell outputs for better context
- Full Context: Extracts entire notebook structure
- Commands:
ollama-cloud.explainNotebookCell - Explain current notebook cell
ollama-cloud.fixNotebookCell - Fix errors in notebook cell
ollama-cloud.optimizeNotebookCell - Optimize cell for performance
ollama-cloud.generateNotebookCell - Generate cell from description
🔗 URI Handler for Deep Linking
- Deep Links: Open extension features via URLs
- Supported URIs:
vscode://ollama-cloud/chat?message=Hello&autoSend=true
vscode://ollama-cloud/explain?file=/path/to/file.ts&line=10
vscode://ollama-cloud/fix?file=/path/to/file.ts&line=10
vscode://ollama-cloud/generate?type=test&file=/path/to/file.ts
vscode://ollama-cloud/model?select=chat
- URI Generation: Utilities for creating and sharing deep links
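As an illustration of how links in this format break down into an action and its parameters, the sketch below parses them with the standard `URL` class. `parseDeepLink` and `DeepLink` are hypothetical names; inside VS Code, a URI handler receives an already-parsed URI object rather than a string, so treat this only as a model of the format.

```typescript
// Hypothetical helper: models the deep-link format, not the extension's handler.
interface DeepLink {
  action: string;
  params: Record<string, string>;
}

function parseDeepLink(uri: string): DeepLink | undefined {
  let url: URL;
  try {
    url = new URL(uri);
  } catch {
    return undefined; // not a valid URI at all
  }
  if (url.protocol !== "vscode:" || url.hostname !== "ollama-cloud") {
    return undefined; // not addressed to this extension
  }
  // "/chat" -> "chat"
  const action = url.pathname.replace(/^\/+/, "");
  const params: Record<string, string> = {};
  url.searchParams.forEach((value, key) => {
    params[key] = value;
  });
  return { action, params };
}
```

Note that all parameter values arrive as strings, so flags such as `autoSend=true` and numbers such as `line=10` need explicit conversion before use.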
📋 Enhanced Terminal Integration
- Clipboard-Based Capture: Preserves your clipboard while capturing terminal output
- Error Analysis: Explain terminal errors with full context
- Command Suggestions: AI suggests commands based on your description
- OS-Aware: Adapts to Windows, macOS, and Linux
- Commands:
ollama-cloud.addTerminalOutput - Add selected terminal output to chat
ollama-cloud.explainTerminalError - Explain terminal errors
ollama-cloud.suggestTerminalCommand - Get command suggestions
🎯 Code Action Provider
- Right-Click Menu: Access AI features directly from code
- Actions: Add to Chat, Explain Code, Improve Code, Fix with Ollama
- Smart Context: Auto-expands 3 lines above/below for better understanding
- Diagnostic Integration: Fixes errors based on VS Code diagnostics
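The "3 lines above/below" expansion amounts to padding the selected range and clamping it to the document bounds. A minimal sketch, assuming zero-based line numbers; `expandRange` is a hypothetical name:

```typescript
// Hypothetical helper: pads a zero-based line range and clamps it to the document.
function expandRange(
  startLine: number,
  endLine: number,
  lineCount: number,
  padding = 3,
): { start: number; end: number } {
  return {
    start: Math.max(0, startLine - padding),
    end: Math.min(lineCount - 1, endLine + padding),
  };
}
```

For a selection of lines 10 to 12 in a 100-line file this yields lines 7 to 15; near the top or bottom of a file the range simply stops at the boundary.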
⚡ Enhanced Autocomplete
- LRU Cache: Smart caching with max 100 entries
- Performance Tracking: Monitor cache hit rates and request counts
- Automatic Cleanup: Periodic maintenance every 60 seconds
- Optimized Debounce: 250ms for faster responses
- Statistics: getStats() method for debugging
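An LRU cache with a size limit and hit-rate statistics can be sketched with a `Map`, which iterates in insertion order. This is an assumption-laden illustration, not the extension's code: `LruCache` is a hypothetical class, and the shape returned by `getStats()` here is invented for the example.

```typescript
// Sketch only: the fields returned by getStats() are an assumed shape.
class LruCache<V> {
  private entries = new Map<string, V>();
  private hits = 0;
  private requests = 0;
  private maxEntries: number;

  constructor(maxEntries = 100) {
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    this.requests++;
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    this.hits++;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  getStats(): { size: number; requests: number; hitRate: number } {
    return {
      size: this.entries.size,
      requests: this.requests,
      hitRate: this.requests === 0 ? 0 : this.hits / this.requests,
    };
  }
}
```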
🧪 Testing & Quality
100% Test Coverage
- 215 Tests Passing: All unit tests passing with 0 failures
- Test Suites:
- OllamaCloudClient (token usage, sessions, context management)
- SessionManager (persistence, retrieval, validation)
- CommandExecutor (cross-platform, error handling, special cases)
- ChatViewProvider (message handling, lifecycle)
- FileEditor (path validation, content fixes)
- AutocompleteProvider (caching, performance, languages)
- StateManager (debounced persistence, batch operations)
- Integration tests
📦 Technical Improvements
Architecture
- Singleton Patterns: Thread-safe initialization for all managers
- Debounced Operations: Performance optimization for state writes
- LRU Caching: Memory-efficient caching with automatic eviction
- Proper Disposal: Resource cleanup on deactivation
- Type Safety: Full TypeScript strict mode compliance
Code Quality
- JSDoc Comments: Comprehensive documentation
- Error Handling: Graceful failure recovery
- Performance Monitoring: Built-in metrics and statistics
- Resource Management: Proper cleanup and disposal patterns
Bundle Size
- 404 KiB: Optimized webpack bundle
- 9 New Modules: Web research, walkthrough, dev mode, notebook, URI handler, state manager, terminal integration, code actions, enhanced autocomplete
What's New in v0.1.10
Enhanced Token Usage Display
- Top-Position Token Usage: Token usage now appears prominently at the top of the chat interface
- Model-Specific Styling: Local models show green styling, cloud models show blue styling with gradient backgrounds
- Visual Progress Bars: Real-time token usage tracking with color-coded progress bars (green/yellow/red)
- Monthly Usage Tracking: Monitor your Ollama Cloud API usage with detailed breakdowns
Advanced File Editing Features
- Model-Specific Content Fixes: Automatically fixes common issues with different AI models:
- Removes escape characters for Gemini, Llama, Mistral models
- Strips markdown codeblock markers for DeepSeek, Llama, Mistral models
- Converts HTML entities for DeepSeek models
- Cleans up whitespace for models like "minsteral"/"minstral"
- Handles JSON and YAML file formatting
- Enhanced File Path Validation: Improved security with better pattern matching and validation
- Better Error Handling: More robust error messages and validation
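One of the fixes listed above, converting HTML entities back to plain characters, can be sketched as a single-pass replacement. `decodeHtmlEntities` is a hypothetical helper covering only five common entities; which entities the extension handles, and how it decides when to apply the fix, is not stated here.

```typescript
// Hypothetical helper: decodes a small, fixed set of HTML entities.
const HTML_ENTITIES: Record<string, string> = {
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
  "&#39;": "'",
  "&amp;": "&",
};

function decodeHtmlEntities(content: string): string {
  // A single pass avoids double-decoding text such as "&amp;lt;".
  return content.replace(/&(?:lt|gt|quot|#39|amp);/g, (entity) => HTML_ENTITIES[entity]);
}
```

Decoding in one pass matters for source code: text that legitimately contains `&lt;` (written as `&amp;lt;` by the model) stays `&lt;` instead of collapsing to `<`.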
Improved User Interface
- Action Approval Cards: New styled cards for file edits and command execution with gradient backgrounds
- File Read Notifications: Visual indicators when files are being read
- Enhanced Environment Info: Better workspace information display with colored chips
- Improved Markdown Rendering: Better table support and formatting
Advanced Context Management
- File Context Tracking: Tracks files that are read, edited, or mentioned
- Task Context Awareness: Maintains context about files created/modified during tasks
- Session Restoration: Improved session management and restoration
Code Quality Improvements
- TypeScript Best Practices: Better type definitions and error handling
- Code Organization: Cleaner separation of concerns
- Performance Optimizations: More efficient file operations and UI updates
What's New in v0.1.9
Bug Fixes
- Session Restore Now Works: Restoring a previous session now properly displays all chat messages in the chat window
- Default Model Settings: The ollamaCloud.chatModel setting now properly sets the selected model in the dropdown
- Premium Models Indicator: Premium models (70B+, Mixtral, Claude, GPT-4, etc.) are clearly marked with a 💎 icon
What's New in v0.1.8
Action Approval UI (Cline-Style)
- "Ollama wants to edit {filename}": Beautiful purple/indigo gradient cards appear when AI suggests file edits
- "Ollama wants to run command": Amber/yellow gradient cards for command execution requests
- "Ollama is reading file": Blue notification cards when AI reads files
- Apply/Skip Buttons: User-friendly buttons to approve or skip each action
- Task Completion Banner: Green gradient banner shows summary when all actions complete
Model Dropdown Enhancements
- Source Indicators: 💻 for local models, ☁️ for cloud models
- Premium Model Indicator: 💎 badge for large/expensive models (70B+, Mixtral, Claude, GPT-4)
- Smart Sorting: Local models appear first in the dropdown
Custom System Prompt
- New Setting: ollamaCloud.customSystemPrompt with dynamic placeholders
- Placeholders: {{OS}}, {{SHELL}}, {{WORKSPACE}}, {{WORKSPACE_PATH}}, {{FILE_TREE}}, {{NPM_SCRIPTS}}, {{MODE}}
- Full Control: Replace the entire system prompt or leave empty for default
Session Restore Improvements
- No Auto-Restore: Sessions no longer automatically restore on startup
- User Choice: Prompt with "Restore Session" and "Start Fresh" buttons
Other Improvements
- Increased Timeout: Default timeout increased from 30s to 120s for large models
- No Placeholder Code: AI now provides complete, copy-paste ready code (no more "// ... rest of code")
What's New in v0.1.4
- Full Cross-Platform Support: Commands now work seamlessly on Windows, macOS, and Linux
- Platform-Aware Shell Selection: Automatically uses the appropriate shell (PowerShell on Windows, zsh on macOS, bash on Linux)
- Smart Command Chaining: Uses correct command separators for each platform (; for PowerShell, && for Unix shells)
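The platform rules above can be expressed as a small lookup. This is a sketch under assumptions: `shellFor` and `chainCommands` are hypothetical names, the platform strings follow Node's `process.platform` values, and the shell executable names are illustrative.

```typescript
// Sketch only: function names and shell executable names are assumptions.
type Platform = "win32" | "darwin" | "linux";

function shellFor(platform: Platform): { shell: string; separator: string } {
  switch (platform) {
    case "win32":
      return { shell: "powershell", separator: "; " };
    case "darwin":
      return { shell: "zsh", separator: " && " };
    case "linux":
      return { shell: "bash", separator: " && " };
  }
}

/** Joins commands with the separator appropriate for the platform's shell. */
function chainCommands(commands: string[], platform: Platform): string {
  return commands.join(shellFor(platform).separator);
}
```

One difference worth keeping in mind: `&&` runs the next command only if the previous one succeeded, while `;` runs it regardless, so a chained sequence can behave differently on PowerShell after a failure.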
What's New in v0.1.3
Improved AI Response Reliability
- Official Ollama SDK: Now uses the official ollama npm package for better compatibility and reliability
- Streaming Responses: See AI output in real-time as it's generated (configurable)
- Context Window Management: Proper num_ctx parameter support (default 32768) for longer conversations
- Retry Logic: Automatic retry with exponential backoff for rate-limited requests
- Request Cancellation: Working Cancel button to abort long-running requests
- Actual Token Counts: Real token usage from API instead of estimates
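Retry with exponential backoff generally means doubling the wait after each failed attempt, up to a cap. The sketch below illustrates the idea; the names, the base delay, the cap, the retry count, and the choice to treat HTTP status 429 as the rate-limit signal are all assumptions rather than the extension's actual values.

```typescript
// Hypothetical helpers: delays, retry count, and the 429 check are assumptions.

/** Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs. */
function backoffDelay(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

/** Treats an error object carrying `status: 429` as a rate-limit error. */
function isRateLimited(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { status?: number }).status === 429;
}

async function retryWithBackoff<T>(
  request: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await request();
    } catch (err) {
      // Give up on other errors, or once the retry budget is spent.
      if (!isRateLimited(err) || attempt >= maxRetries) {
        throw err;
      }
      await sleep(backoffDelay(attempt));
    }
  }
}
```

With these defaults the waits are 1 s, 2 s, and 4 s before the fourth and final attempt. Passing `sleep` as a parameter keeps the function easy to test without real delays.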
New Configuration Options
enableStreaming - Toggle streaming responses on/off
contextWindow - Set the context window size (2048-131072)
requestTimeout - Set request timeout in milliseconds (5000-600000)
Installation
From VSIX (Local Installation)
- Download the .vsix file
- Open VSCode
- Go to Extensions (Ctrl+Shift+X)
- Click the "..." menu at the top
- Select "Install from VSIX..."
- Choose the downloaded file
From Source
- Clone this repository
- Run npm install
- Run npm run compile
- Press F5 to open a new VSCode window with the extension loaded
Setup
Option 1: Ollama Cloud (Recommended for beginners)
Get Ollama Cloud API Key
- Sign up at ollama.com
- Go to your account settings and generate an API key
Configure the Extension
- Open VSCode Settings (Ctrl+,)
- Search for "Ollama Cloud"
- Enter your API key in ollamaCloud.cloudApiKey
Option 2: Local Ollama
Install Ollama
- Download from ollama.com
- Run ollama serve to start the local server
Pull a Model
ollama pull llama3.1
Configure the Extension
- The extension will automatically detect local models
- No API key needed for local models
Option 3: Both (Hybrid)
Use both local and cloud models! The extension automatically routes requests to the appropriate endpoint based on where each model is available.
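Routing by availability can be pictured as a lookup against the two model lists. The sketch below is an assumption about how such routing might work: `routeModel` is a hypothetical name, and preferring the local endpoint when a model exists in both places is a choice made for this example, not documented behavior.

```typescript
// Hypothetical sketch: the local-first preference is an assumption.
type Endpoint = "local" | "cloud";

function routeModel(
  model: string,
  localModels: Set<string>,
  cloudModels: Set<string>,
): Endpoint | undefined {
  if (localModels.has(model)) {
    return "local";
  }
  if (cloudModels.has(model)) {
    return "cloud";
  }
  return undefined; // model is not available anywhere
}
```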
Usage
Opening the Chat
- Click the Ollama Cloud icon in the Activity Bar (left sidebar)
- Or use the keyboard shortcut: Ctrl+Shift+O (Windows/Linux) or Cmd+Shift+O (Mac)
- Or open Command Palette (Ctrl+Shift+P) and run "Ollama Cloud: Open Chat"
Chatting with the AI
Simply type your question or request in the chat input and press Enter. The AI can help with:
- Writing new code
- Explaining existing code
- Debugging errors
- Refactoring code
- Creating new files
- Running commands
- And much more!
Operating Modes
ACT Mode (Default)
- AI can suggest file edits and commands
- You approve each action before it's executed
- Best for getting things done
PLAN Mode
- AI explains what should be done without executing
- Great for understanding complex tasks
- Use for learning and planning
File Editing
When the AI suggests file changes, it will format them like this:
// File: src/example.js
function hello() {
  console.log("Hello, World!");
}
Approval Workflow
By default, you'll be prompted to approve the changes before they're applied. These will appear as clickable cards in the chat interface with options to Apply, Skip, or review the Diff (when enabled).
If you'd prefer to skip manual approval, you can enable Auto-Approve in settings (ollamaCloud.autoApprove). ⚠️ Warning: With auto-approve enabled, the AI can directly modify your files and execute commands without confirmation.
Command Execution
When the AI suggests commands, they'll be formatted like:
npm install axios
Approval Workflow
These commands will appear as clickable cards in the chat interface. By default, you'll need to explicitly approve each command before it runs. Click Run to execute or Skip to ignore.
If you'd prefer to skip manual approval, you can enable Auto-Approve in settings (ollamaCloud.autoApprove). ⚠️ Warning: With auto-approve enabled, the AI can directly execute terminal commands without confirmation.
Configuration
| Setting | Description | Default |
| --- | --- | --- |
| ollamaCloud.cloudApiKey | Your Ollama Cloud API key | (empty) |
| ollamaCloud.localEndpoint | Local Ollama server URL | http://localhost:11434 |
| ollamaCloud.chatModel | AI model for chat | llama3.1 |
| ollamaCloud.autocompleteModel | AI model for autocomplete | ministral-3:3b |
| ollamaCloud.defaultMode | Default operating mode | act |
| ollamaCloud.customSystemPrompt | Custom system prompt with placeholders | (empty) |
| ollamaCloud.temperature | Response creativity (0-2) | 0.7 |
| ollamaCloud.maxTokens | Maximum response length | 4096 |
| ollamaCloud.contextWindow | Context window size | 32768 |
| ollamaCloud.requestTimeout | Request timeout (ms) | 120000 |
| ollamaCloud.enableStreaming | Enable streaming responses | true |
| ollamaCloud.autoApprove | Auto-approve AI actions | false |
| ollamaCloud.showDiff | Show diff before applying changes | true |
| ollamaCloud.enableAutocomplete | Enable AI code completion | true |
| ollamaCloud.includeFileContext | Automatically include context from open files in chat messages | true |
| ollamaCloud.enableCodeLens | Show AI action buttons in code | true |
| ollamaCloud.enableStreamingForAutocomplete | Enable streaming for autocomplete requests | false |
| ollamaCloud.autocompleteDebounceDelay | Delay before sending autocomplete requests (ms) | 250 |
| ollamaCloud.autocompletePreviewDelay | Delay before showing autocomplete preview (ms) | 1000 |
| ollamaCloud.enableAdaptivePerformance | Enable adaptive performance tuning | true |
Custom System Prompt Placeholders
When using customSystemPrompt, you can use these placeholders that will be replaced with actual values:
| Placeholder | Description |
| --- | --- |
| {{OS}} | Operating system (Windows, macOS, Linux) |
| {{SHELL}} | Shell type (PowerShell, Bash, etc.) |
| {{WORKSPACE}} | Current workspace name |
| {{WORKSPACE_PATH}} | Full path to workspace |
| {{FILE_TREE}} | Project file structure |
| {{NPM_SCRIPTS}} | Available npm scripts |
| {{MODE}} | Current mode (ACT or PLAN) |
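Placeholder replacement of this kind is typically a template substitution. A minimal sketch, assuming placeholders are upper-case names wrapped in double braces; `renderSystemPrompt` is a hypothetical function, and leaving unknown placeholders untouched is a choice made for this example.

```typescript
// Hypothetical helper: substitutes {{NAME}} placeholders from a value map.
function renderSystemPrompt(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (match: string, name: string) =>
    // Unknown placeholders are left as written (an assumption for this sketch).
    Object.prototype.hasOwnProperty.call(values, name) ? values[name] : match,
  );
}

// Example: a custom prompt using three of the placeholders listed above.
const examplePrompt = renderSystemPrompt(
  "You are assisting on {{OS}} using {{SHELL}} in {{MODE}} mode.",
  { OS: "Linux", SHELL: "Bash", MODE: "ACT" },
);
```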
Project Context Files
The AI can automatically read project-specific context from special markdown files in your workspace. Create one of these files to provide the AI with project-specific information. The extension supports multiple popular AI context formats.
Supported File Names (Ordered by Priority)
.ollamacloud.md (Primary - Highest priority)
.ollamacloud-context.md
.github/copilot-instructions.md (GitHub standard location)
.github/copilot_instructions.md
.copilot-instructions.md
.copilot_instructions.md
copilot-instructions.md
.claude/context.md
.claude/instructions.md
claude-context.md
.project-context.md
.context.md
PROJECT.md
README.context.md
PROJECT_CONTEXT.md
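Priority-ordered lookup can be sketched as "return the first name in the list that exists". In the sketch below, `CONTEXT_FILES` copies the list above, while `pickContextFile` and its `exists` callback are hypothetical; how the extension actually checks the workspace is not described here.

```typescript
// File names copied from the priority list above, highest priority first.
const CONTEXT_FILES: string[] = [
  ".ollamacloud.md",
  ".ollamacloud-context.md",
  ".github/copilot-instructions.md",
  ".github/copilot_instructions.md",
  ".copilot-instructions.md",
  ".copilot_instructions.md",
  "copilot-instructions.md",
  ".claude/context.md",
  ".claude/instructions.md",
  "claude-context.md",
  ".project-context.md",
  ".context.md",
  "PROJECT.md",
  "README.context.md",
  "PROJECT_CONTEXT.md",
];

/** Hypothetical helper: returns the highest-priority context file that exists. */
function pickContextFile(exists: (relativePath: string) => boolean): string | undefined {
  return CONTEXT_FILES.find((name) => exists(name));
}
```

Taking the file-existence check as a callback keeps the priority logic independent of the file system, which makes it straightforward to unit test.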
This extension automatically reads context files used by other AI assistants:
- GitHub Copilot: Supports .github/copilot-instructions.md format
- Anthropic Claude: Supports .claude/context.md format
- Generic AI tools: Supports common context file names
You can use existing context files from other AI tools, or create a new one tailored for Ollama Cloud.
Example Project Context Files
Create .ollamacloud.md in your project root:
# Project Information
- **Project Name**: MyApp
- **Framework**: React with TypeScript
- **Build Tool**: Vite
- **CSS Framework**: Tailwind CSS
- **Backend API**: REST API at /api/v1
# Important Guidelines
- Always use functional components with React hooks
- Follow the existing folder structure pattern
- Use Tailwind classes for styling instead of CSS files
- Maintain consistent error handling with try/catch blocks
- Keep components small and focused (max 150 lines)
# Architecture Notes
- Authentication is handled via JWT tokens
- State management uses React Context + useReducer
- All API calls go through the `/src/api/client.ts` wrapper
- Environment variables are prefixed with `VITE_`
# Common Patterns to Avoid
- ❌ Don't use Redux (we migrated away from it)
- ❌ Don't create new CSS files (use Tailwind only)
- ❌ Don't use class components (legacy only)
# Copilot Instructions for This Project
## Coding Standards
- Follow the existing code style
- Use TypeScript with strict typing
- Include JSDoc comments for exported functions
- Handle errors gracefully with try/catch
## Project Structure
- `/src` - Source code
- `/test` - Unit tests
- `/dist` - Built output
- Configuration files in root
## Key Dependencies
- vscode - VS Code Extension API
- ollama - Ollama client library
- axios - HTTP client for API calls
Claude Format (.claude/context.md)
# Claude Context
This project is a VS Code extension for AI-assisted coding.
## Core Principles
- User privacy is paramount
- Transparency in all AI interactions
- Security through explicit approval workflows
- Performance through caching and debouncing
## Technical Constraints
- Must work with both local and cloud AI models
- All file operations require user confirmation
- Network calls must handle timeouts gracefully
- Extension must work offline when possible
Best Practices
- Single Context File: Use only one context file to avoid conflicts
- Keep it Concise: Focus on the most important information
- Regular Updates: Keep context files current with project changes
- Team Consensus: Ensure team agrees on context guidelines
- Security Review: Review context files for sensitive information
The AI will automatically include this context information in all conversations to provide more accurate and project-specific responses.
Available Models
Cloud Models (Ollama Cloud)
- llama3.1 - Latest Llama model (recommended)
- llama3.2 - Llama 3.2
- ministral-3:3b - Fast, efficient for autocomplete
- gemma3:4b - Google's Gemma 3
Local Models (requires local Ollama)
- codellama - Specialized for coding
- mistral - Fast and efficient
- mixtral - Mixture of experts model
- qwen2.5-coder - Specialized coding model
- phi3 - Microsoft's Phi-3
Keyboard Shortcuts
| Shortcut | Action |
| --- | --- |
| Ctrl+Shift+O | Open Chat |
| Ctrl+Shift+N | New Task |
| Ctrl+K | Inline Chat (with selection) |
| Ctrl+Shift+E | Explain Code |
| Ctrl+Shift+T | Generate Tests |
| Ctrl+Shift+F | Fix Code |
| Ctrl+Shift+R | Review Code |
| Ctrl+Shift+M | Modernize Code |
| Enter | Send message |
| Shift+Enter | New line in message |
Commands
Chat & Core
- Ollama Cloud: Open Chat - Open the chat interface
- Ollama Cloud: New Task - Start a new conversation
- Ollama Cloud: Clear History - Clear chat history
- Ollama Cloud: Select Model - Choose a different AI model
Code Actions
- Ollama Cloud: Explain Code - Explain selected code
- Ollama Cloud: Fix Code - Fix issues in selected code
- Ollama Cloud: Improve Code - Suggest code improvements
- Ollama Cloud: Add to Chat - Add selected code to chat
- Ollama Cloud: Generate Tests - Generate tests for selected code
- Ollama Cloud: Review Code - Get a code review
- Ollama Cloud: Modernize Code - Update code to modern standards
Terminal Integration
- Ollama Cloud: Add Terminal Output - Add selected terminal output to chat
- Ollama Cloud: Explain Terminal Error - Explain terminal errors
- Ollama Cloud: Suggest Terminal Command - Get command suggestions
Web Research
- Ollama Cloud: Search Web - Search the internet and add results to chat
- Ollama Cloud: Research Topic - Deep research with content fetching
- Ollama Cloud: Fetch URL - Fetch and analyze URL content
Jupyter Notebooks
- Ollama Cloud: Explain Notebook Cell - Explain current notebook cell
- Ollama Cloud: Fix Notebook Cell - Fix errors in notebook cell
- Ollama Cloud: Optimize Notebook Cell - Optimize cell for performance
- Ollama Cloud: Generate Notebook Cell - Generate cell from description
Walkthrough & Help
- Ollama Cloud: Show Welcome - Show welcome tour
- Ollama Cloud: Show Tips - Show tips and tricks
- Ollama Cloud: Show Shortcuts - Show keyboard shortcuts
Development
- Ollama Cloud: Toggle Dev Mode - Enable/disable development mode
- Ollama Cloud: Show Dev Stats - Show development statistics
Tips
- Be Specific: The more specific your request, the better the AI can help
- Provide Context: Mention file names, error messages, or relevant code
- Review Changes: Always review AI-suggested changes before applying
- Use New Task: Start a new task for unrelated questions so earlier context doesn't carry over
- Experiment with Models: Different models excel at different tasks
- Adjust Timeout: Increase requestTimeout for larger models or slower connections
- Use Streaming: Keep streaming enabled to see responses as they generate
Troubleshooting
"Invalid API key" Error
- Verify your API key in settings
- Make sure you have an active Ollama Cloud account
"Request timed out" Error
- Increase requestTimeout in settings (default 120000 ms)
- Try a smaller/faster model
- Check your internet connection
Extension Not Loading
- Check the Output panel (View → Output → Ollama Cloud)
- Try reloading VSCode (Ctrl+Shift+P → "Reload Window")
Slow Responses
- Try a smaller model (e.g., ministral-3:3b instead of llama3.1)
- Enable streaming to see partial responses
- Check your internet connection
- Reduce maxTokens in settings
Local Ollama Not Detected
- Make sure Ollama is running (ollama serve)
- Check the localEndpoint setting matches your Ollama server
Privacy & Security
- Your code is sent to the AI provider you select (Ollama Cloud, Anthropic, or OpenAI) or to your local Ollama server for processing
- API keys are stored locally in VSCode settings
- No data is stored by this extension beyond session persistence
- Review Ollama's privacy policy for cloud usage details
Development
Building from Source
# Install dependencies
npm install
# Compile TypeScript
npm run compile
# Watch for changes
npm run watch
# Package extension
npm run package
Running Tests
npm test
Contributing
Contributions are welcome! Please feel free to submit issues or pull requests.
License
MIT License - See LICENSE file for details
Credits
Inspired by the Cline VSCode extension. Built with ❤️ for the developer community.
Support
Note: This extension works with both Ollama Cloud (requires API key) and local Ollama (free, requires local installation). Visit ollama.com to get started.