Ollama Cloud BETA

JKagiDesigns LLC

AI-powered coding assistant using Ollama Cloud - similar to Cline, Copilot, etc. BETA

Ollama Cloud VSCode Extension

An AI-powered coding assistant for Visual Studio Code with multi-provider support. Works with Ollama Cloud, local Ollama, Anthropic Claude, OpenAI ChatGPT, and more - similar to Cline. This extension provides intelligent code assistance, file editing, and command execution capabilities across multiple AI providers.

Features

  • 🤖 Multi-Provider AI Chat: Interactive chat with Ollama (local/cloud), Anthropic Claude, OpenAI ChatGPT, and more
  • 📝 Smart Code Editing: AI can read, write, and modify files with your approval
  • 🔧 Command Execution: Execute terminal commands suggested by the AI
  • 🔄 Diff View: Review changes before applying them
  • 🎯 Multiple Models: Support for various models across all providers (Llama, Claude 3.5, GPT-4, etc.)
  • ⚡ Real-time Streaming: See AI responses as they're generated
  • 🎨 VSCode Integration: Native VSCode UI with dark/light theme support
  • 🧠 Enhanced Context Awareness: Tracks tasks, files, and commands for better continuity
  • 💾 Session Persistence: Auto-saves and restores conversations across VSCode restarts
  • 📊 Usage Tracking: Monitor API usage with visual indicators for each provider
  • 🗂️ Workspace Indexing: Automatically understands your project structure
  • 🔁 Automatic Retry: Smart retry logic with exponential backoff for rate limits
  • ⏱️ Configurable Timeouts: Adjust request timeouts for slower connections
  • 🔄 Provider Switching: Easily switch between AI providers based on your needs

What's New in v0.1.15

🤖 Multi-Provider AI Support (Major Feature)

🌐 Cross-Platform AI Integration

  • Anthropic Claude: Full integration with Claude 3.5 Sonnet, Opus, and Haiku models
  • OpenAI ChatGPT: Support for GPT-4 Turbo, GPT-4, and GPT-3.5 Turbo models
  • Ollama Compatibility: Continued support for local and cloud Ollama models
  • Unified Interface: Seamless switching between providers without changing workflows

⚙️ Flexible Configuration

  • Provider Selection: Choose your preferred AI provider in settings (ollamaCloud.apiProvider)
  • API Key Management: Individual API key configuration for each provider
  • Model Awareness: Provider-specific model selection with appropriate recommendations
  • Backward Compatible: Existing Ollama workflows continue to work unchanged

🎯 Advanced Features

  • Cline-Inspired Architecture: Unified API handler system similar to popular AI assistants
  • Enterprise Ready: Designed to work with multiple AI providers for redundancy and flexibility
  • Cost Optimization: Choose providers based on pricing, performance, or availability
  • Future Proof: Easy to add new providers as they become available

What's New in v0.1.18

🧪 Enhanced Testing Framework

📝 Comprehensive Test Coverage

  • File Editing Tests: Added 50+ new tests covering file creation, editing, error handling, and reliability scenarios
  • Command Execution Tests: Enhanced command executor testing with security, concurrency, and error recovery tests
  • Chat Integration Tests: Added tests for various file creation formats and messaging workflows
  • Cross-Platform Compatibility: Comprehensive testing across Windows, macOS, and Linux environments

🔍 Reliability Verification

  • Error Handling Tests: Comprehensive tests for detailed error message generation and recovery
  • Path Handling Tests: Verification of cross-platform path normalization and special character handling
  • Concurrency Tests: Tests for simultaneous file operations and command executions
  • Edge Case Coverage: Tests for empty content, long paths, invalid inputs, and boundary conditions

🛡️ Security & Safety Tests

  • Command Injection Prevention: Tests verifying safe handling of potentially malicious command inputs
  • Path Sanitization: Tests ensuring proper validation of file paths and directory traversal prevention
  • Resource Management: Tests for proper cleanup and resource handling under various conditions

🛠️ File Editing Reliability Improvements

🔧 Enhanced File Operations

  • Robust File Parsing: Improved regex patterns to handle various AI model output formats for file creation
  • Simplified Approval Flow: Streamlined file editing process for immediate execution without complex approvals
  • Better Error Handling: Comprehensive error logging with detailed messages and stack traces for debugging
  • Cross-Platform Path Support: Enhanced path normalization and workspace context handling

⚡ Improved Command Execution

  • Enhanced Error Recovery: Better command execution with detailed error reporting and workspace context
  • Reliable File Operations: Automatic directory creation and proper path resolution for all file operations
  • Immediate Feedback: Clear success/failure messages for all file editing operations

🎯 System Prompt Optimization

  • Action-Oriented Instructions: Updated AI system prompt with clearer, more actionable file editing guidance
  • Multiple Format Support: AI can now recognize and process various file creation formats
  • Complete Code Generation: Emphasis on providing full, copy-paste ready code without placeholders

⚙️ Performance and User Experience Improvements

🎯 Granular Controls

  • Autocomplete Streaming: Added enableStreamingForAutocomplete setting for granular control over streaming
  • Configurable Delays: New autocompleteDebounceDelay and autocompletePreviewDelay settings
  • Adaptive Performance: Added enableAdaptivePerformance setting for automatic performance tuning
  • Enhanced Configuration: Users can now fine-tune autocomplete behavior to their preferences

📊 Enhanced User Feedback

  • Preview Completions: Implemented immediate feedback with short preview completions
  • Progress Indicators: Added visual progress notifications for longer autocomplete operations
  • Response Time Tracking: Enhanced logging with response time measurements
  • Silent Error Handling: Improved error handling that doesn't interrupt user workflow

⚡ Performance Tuning

  • Memory Pressure Detection: Automatic cache size adjustment based on system memory usage
  • Adaptive Debounce Timing: Dynamic debounce time based on API response performance
  • Resource Management: Enhanced cleanup and timeout handling for optimal resource usage
  • Performance Monitoring: Advanced tracking and optimization of autocomplete performance

What's New in v0.1.16

🐛 Bug Fix - Version Number Correction

🔧 Fixed Version Bump Issue

  • Corrected Version Progression: Fixed incorrect version bump during automated build process
  • Proper Version Chain: Ensured correct version progression from 0.1.15 to 0.1.16
  • Build Script Alignment: Verified build script properly increments patch version
  • Package Consistency: Confirmed package.json and package-lock.json version synchronization

🧪 Testing Verification

  • All Tests Pass: Verified all 215 unit tests continue to pass
  • Integration Testing: Confirmed session management, state persistence, and web research features work correctly
  • Version-Specific Tests: Validated tutorial experience shows correctly per version update

What's New in v0.1.12

🛠️ Enhancement - Tutorial Experience Improvement

👋 Improved Welcome Tutorial Behavior

  • Version-Based Tutorial Display: Tutorial now only shows once per extension version update, not every time VSCode opens
  • Automatic Version Tracking: Extension automatically tracks which version was last shown the tutorial
  • Cleaner User Experience: Removed "Don't Show Again" option since tutorial is now version-aware
  • Updated Messaging: Final tour step clarifies that tutorial only shows per update
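The version-aware behavior above can be sketched as a single comparison. This is a minimal illustration, not the extension's code: `shouldShowTutorial` is a hypothetical helper, and the stored value is assumed to come from persisted extension state.

```typescript
// Hypothetical helper; the extension's real state keys and API are not shown in this README.
// `lastShownVersion` is assumed to be read from persisted state (undefined on first run).
function shouldShowTutorial(lastShownVersion: string | undefined, currentVersion: string): boolean {
  // Show on first run (nothing stored) and after every version change.
  return lastShownVersion !== currentVersion;
}
```

After showing the tutorial, the caller would store `currentVersion` so the next launch of the same version skips it.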

🎉 Major Feature Release - Complete Implementation

This release represents a complete overhaul of the extension with enterprise-grade features, comprehensive testing, and production-ready code quality.

🔍 Web Research Capabilities

  • Search the Web: AI can now search the internet using DuckDuckGo (no API key required)
  • Deep Research: Automatically fetches and analyzes content from top search results
  • URL Fetching: Extract and analyze text from any webpage
  • File Downloads: Download files from URLs for analysis
  • Commands:
    • ollama-cloud.searchWeb - Search and add results to chat
    • ollama-cloud.researchTopic - Deep research with content fetching
    • ollama-cloud.fetchUrl - Fetch and analyze URL content

👋 Interactive Walkthrough & Onboarding

  • Welcome Tour: Step-by-step introduction to all features on first use
  • Tips & Tricks: Comprehensive tips panel with categorized advice
  • Keyboard Shortcuts: Quick reference guide for all shortcuts

🔧 Development Mode

  • Hot Reload: Automatic file watching with reload prompts
  • Dev Panel: Quick actions for reload, clear cache, and logs
  • Output Channel: Detailed development logs
  • Statistics: Track active watchers and extension state
  • Commands:
    • ollama-cloud.toggleDevMode - Enable/disable development mode
    • ollama-cloud.showDevStats - Show development statistics

💾 State Manager with Debounced Persistence

  • Fast In-Memory Reads: Instant access to state data
  • Debounced Writes: 500ms batched writes for performance
  • Batch Operations: Efficient bulk state updates
  • Secrets Support: Secure credential storage
  • Statistics Tracking: Monitor cache sizes and pending changes
  • Singleton Pattern: Guarded initialization that ensures a single shared instance
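A rough sketch of how in-memory reads and debounced, batched writes can fit together is shown below. `DebouncedStore` and the `persist` callback are hypothetical names chosen for illustration; only the 500 ms delay comes from the list above.

```typescript
// Sketch only: `DebouncedStore` and `persist` are hypothetical, not the extension's API.
type Persist = (batch: Record<string, unknown>) => void;

class DebouncedStore {
  private cache = new Map<string, unknown>();
  private pending = new Map<string, unknown>();
  private timer: ReturnType<typeof setTimeout> | undefined;
  private persist: Persist;
  private delayMs: number;

  constructor(persist: Persist, delayMs = 500) {
    this.persist = persist;
    this.delayMs = delayMs;
  }

  // Reads are served from memory and never wait on storage.
  get<T>(key: string): T | undefined {
    return this.cache.get(key) as T | undefined;
  }

  // Writes update memory immediately and restart the debounce timer.
  set(key: string, value: unknown): void {
    this.cache.set(key, value);
    this.pending.set(key, value);
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flush(), this.delayMs);
  }

  // Persist all pending changes as one batch (also useful on shutdown).
  flush(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.pending.size === 0) return;
    const batch: Record<string, unknown> = {};
    this.pending.forEach((value, key) => {
      batch[key] = value;
    });
    this.pending.clear();
    this.persist(batch);
  }
}
```

Several `set` calls made within the delay window result in one `persist` call carrying the latest value for each key.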

📓 Jupyter Notebook Support

  • Cell Operations: Explain, fix, optimize, and generate notebook cells
  • Output Integration: Analyzes cell outputs for better context
  • Full Context: Extracts entire notebook structure
  • Commands:
    • ollama-cloud.explainNotebookCell - Explain current notebook cell
    • ollama-cloud.fixNotebookCell - Fix errors in notebook cell
    • ollama-cloud.optimizeNotebookCell - Optimize cell for performance
    • ollama-cloud.generateNotebookCell - Generate cell from description

🔗 URI Handler for Deep Linking

  • Deep Links: Open extension features via URLs
  • Supported URIs:
    • vscode://ollama-cloud/chat?message=Hello&autoSend=true
    • vscode://ollama-cloud/explain?file=/path/to/file.ts&line=10
    • vscode://ollama-cloud/fix?file=/path/to/file.ts&line=10
    • vscode://ollama-cloud/generate?type=test&file=/path/to/file.ts
    • vscode://ollama-cloud/model?select=chat
  • URI Generation: Utilities for creating and sharing deep links
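Links in the format above can be assembled with a small helper. The sketch below is an assumption about how such a utility might look; `buildDeepLink` is a hypothetical name, and the README does not document the extension's own URI generation functions.

```typescript
// Hypothetical helper for building links in the format listed above.
type DeepLinkAction = "chat" | "explain" | "fix" | "generate" | "model";

function buildDeepLink(
  action: DeepLinkAction,
  params: Record<string, string | number | boolean>,
): string {
  // Percent-encode keys and values so spaces and special characters survive.
  const query = Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(String(params[key]))}`)
    .join("&");
  const base = `vscode://ollama-cloud/${action}`;
  return query.length > 0 ? `${base}?${query}` : base;
}
```

For example, `buildDeepLink("chat", { message: "Hello", autoSend: true })` yields the first URI in the list. Note that `encodeURIComponent` also encodes `/`, so file paths appear as `%2Fpath%2Fto%2Ffile.ts` in generated links.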

📋 Enhanced Terminal Integration

  • Clipboard-Based Capture: Preserves your clipboard while capturing terminal output
  • Error Analysis: Explain terminal errors with full context
  • Command Suggestions: AI suggests commands based on your description
  • OS-Aware: Adapts to Windows, macOS, and Linux
  • Commands:
    • ollama-cloud.addTerminalOutput - Add selected terminal output to chat
    • ollama-cloud.explainTerminalError - Explain terminal errors
    • ollama-cloud.suggestTerminalCommand - Get command suggestions

🎯 Code Action Provider

  • Right-Click Menu: Access AI features directly from code
  • Actions: Add to Chat, Explain Code, Improve Code, Fix with Ollama
  • Smart Context: Auto-expands 3 lines above/below for better understanding
  • Diagnostic Integration: Fixes errors based on VS Code diagnostics
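The "Smart Context" expansion can be pictured as padding the selected line range and clamping it to the document. This is an illustrative sketch; `LineRange` and `expandRange` are hypothetical names, and line numbers are assumed to be zero-based and inclusive.

```typescript
// Hypothetical types and helper; line numbers are zero-based and inclusive.
interface LineRange {
  start: number;
  end: number;
}

function expandRange(selection: LineRange, lineCount: number, padding = 3): LineRange {
  return {
    // Never go above the first line of the document.
    start: Math.max(0, selection.start - padding),
    // Never go past the last line of the document.
    end: Math.min(lineCount - 1, selection.end + padding),
  };
}
```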

⚡ Enhanced Autocomplete

  • LRU Cache: Smart caching with max 100 entries
  • Performance Tracking: Monitor cache hit rates and request counts
  • Automatic Cleanup: Periodic maintenance every 60 seconds
  • Optimized Debounce: 250ms for faster responses
  • Statistics: getStats() method for debugging
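One common way to build an LRU cache with hit-rate statistics is to rely on the insertion order of a `Map`. The class below is a sketch under that assumption; apart from the 100-entry limit and the `getStats()` name mentioned above, every detail (class name, fields, return shape) is hypothetical.

```typescript
// Sketch only: not the extension's implementation.
class LruCache<V> {
  private entries = new Map<string, V>();
  private hits = 0;
  private requests = 0;
  private maxEntries: number;

  constructor(maxEntries = 100) {
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    this.requests++;
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    this.hits++;
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // A Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  getStats(): { size: number; requests: number; hitRate: number } {
    const hitRate = this.requests === 0 ? 0 : this.hits / this.requests;
    return { size: this.entries.size, requests: this.requests, hitRate };
  }
}
```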

🧪 Testing & Quality

100% Test Coverage

  • 215 Tests Passing: All unit tests passing with 0 failures
  • Test Suites:
    • OllamaCloudClient (token usage, sessions, context management)
    • SessionManager (persistence, retrieval, validation)
    • CommandExecutor (cross-platform, error handling, special cases)
    • ChatViewProvider (message handling, lifecycle)
    • FileEditor (path validation, content fixes)
    • AutocompleteProvider (caching, performance, languages)
    • StateManager (debounced persistence, batch operations)
    • Integration tests

📦 Technical Improvements

Architecture

  • Singleton Patterns: Guarded, single-instance initialization for all managers
  • Debounced Operations: Performance optimization for state writes
  • LRU Caching: Memory-efficient caching with automatic eviction
  • Proper Disposal: Resource cleanup on deactivation
  • Type Safety: Full TypeScript strict mode compliance

Code Quality

  • JSDoc Comments: Comprehensive documentation
  • Error Handling: Graceful failure recovery
  • Performance Monitoring: Built-in metrics and statistics
  • Resource Management: Proper cleanup and disposal patterns

Bundle Size

  • 404 KiB: Optimized webpack bundle
  • 9 New Modules: Web research, walkthrough, dev mode, notebook, URI handler, state manager, terminal integration, code actions, enhanced autocomplete

What's New in v0.1.10

Enhanced Token Usage Display

  • Top-Position Token Usage: Token usage now appears prominently at the top of the chat interface
  • Model-Specific Styling: Local models show green styling, cloud models show blue styling with gradient backgrounds
  • Visual Progress Bars: Real-time token usage tracking with color-coded progress bars (green/yellow/red)
  • Monthly Usage Tracking: Monitor your Ollama Cloud API usage with detailed breakdowns

Advanced File Editing Features

  • Model-Specific Content Fixes: Automatically fixes common issues with different AI models:
    • Removes escape characters for Gemini, Llama, Mistral models
    • Strips markdown codeblock markers for DeepSeek, Llama, Mistral models
    • Converts HTML entities for DeepSeek models
    • Cleans up whitespace for models like "minsteral"/"minstral"
    • Handles JSON and YAML file formatting
  • Enhanced File Path Validation: Improved security with better pattern matching and validation
  • Better Error Handling: More robust error messages and validation
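Two of the fixes listed above, stripping code block markers and converting HTML entities, might look roughly like the following. The function names, the regular expression, and the substring-based model matching are all assumptions made for illustration; the extension's actual rules are not shown in this README.

```typescript
// Sketch only: names and matching rules are hypothetical.
// Matches content wrapped in a single fenced block, with an optional language tag.
const FENCE = /^\s*`{3}[\w+-]*\r?\n([\s\S]*?)\r?\n?`{3}\s*$/;

function stripCodeFence(content: string): string {
  const match = FENCE.exec(content);
  return match !== null && match[1] !== undefined ? match[1] : content;
}

const ENTITIES: Record<string, string> = {
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
  "&#39;": "'",
  "&amp;": "&",
};

function decodeHtmlEntities(content: string): string {
  // A single pass avoids double-decoding input such as "&amp;lt;".
  return content.replace(/&(?:lt|gt|quot|#39|amp);/g, (entity) => {
    const decoded = ENTITIES[entity];
    return decoded !== undefined ? decoded : entity;
  });
}

// Apply fixes by model family, following the list above.
function fixModelOutput(model: string, content: string): string {
  const name = model.toLowerCase();
  let fixed = content;
  if (/deepseek|llama|mistral/.test(name)) fixed = stripCodeFence(fixed);
  if (/deepseek/.test(name)) fixed = decodeHtmlEntities(fixed);
  return fixed;
}
```

Output from a model that matches none of the patterns passes through unchanged.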

Improved User Interface

  • Action Approval Cards: New styled cards for file edits and command execution with gradient backgrounds
  • File Read Notifications: Visual indicators when files are being read
  • Enhanced Environment Info: Better workspace information display with colored chips
  • Improved Markdown Rendering: Better table support and formatting

Advanced Context Management

  • File Context Tracking: Tracks files that are read, edited, or mentioned
  • Task Context Awareness: Maintains context about files created/modified during tasks
  • Session Restoration: Improved session management and restoration

Code Quality Improvements

  • TypeScript Best Practices: Better type definitions and error handling
  • Code Organization: Cleaner separation of concerns
  • Performance Optimizations: More efficient file operations and UI updates

What's New in v0.1.9

Bug Fixes

  • Session Restore Now Works: Restoring a previous session now properly displays all chat messages in the chat window
  • Default Model Settings: The ollamaCloud.chatModel setting now properly sets the selected model in the dropdown
  • Premium Models Indicator: Premium models (70B+, Mixtral, Claude, GPT-4, etc.) are clearly marked with a 💎 icon

What's New in v0.1.8

Action Approval UI (Cline-Style)

  • "Ollama wants to edit {filename}": Beautiful purple/indigo gradient cards appear when AI suggests file edits
  • "Ollama wants to run command": Amber/yellow gradient cards for command execution requests
  • "Ollama is reading file": Blue notification cards when AI reads files
  • Apply/Skip Buttons: User-friendly buttons to approve or skip each action
  • Task Completion Banner: Green gradient banner shows summary when all actions complete

Model Dropdown Enhancements

  • Source Indicators: 💻 for local models, ☁️ for cloud models
  • Premium Model Indicator: 💎 badge for large/expensive models (70B+, Mixtral, Claude, GPT-4)
  • Smart Sorting: Local models appear first in the dropdown

Custom System Prompt

  • New Setting: ollamaCloud.customSystemPrompt with dynamic placeholders
  • Placeholders: {{OS}}, {{SHELL}}, {{WORKSPACE}}, {{WORKSPACE_PATH}}, {{FILE_TREE}}, {{NPM_SCRIPTS}}, {{MODE}}
  • Full Control: Replace the entire system prompt or leave empty for default

Session Restore Improvements

  • No Auto-Restore: Sessions no longer automatically restore on startup
  • User Choice: Prompt with "Restore Session" and "Start Fresh" buttons

Other Improvements

  • Increased Timeout: Default timeout increased from 30s to 120s for large models
  • No Placeholder Code: AI now provides complete, copy-paste ready code (no more "// ... rest of code")

What's New in v0.1.4

Cross-Platform Command Execution

  • Full Cross-Platform Support: Commands now work seamlessly on Windows, macOS, and Linux
  • Platform-Aware Shell Selection: Automatically uses the appropriate shell (PowerShell on Windows, zsh on macOS, bash on Linux)
  • Smart Command Chaining: Uses correct command separators for each platform (; for PowerShell, && for Unix shells)
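The shell selection and separator rules above can be expressed as two small functions. This is a sketch: `defaultShell` and `chainCommands` are hypothetical names, and the platform strings are assumed to follow Node's `process.platform` values.

```typescript
// Hypothetical helpers; platform values mirror Node's process.platform.
type Platform = "win32" | "darwin" | "linux";

function defaultShell(platform: Platform): string {
  switch (platform) {
    case "win32":
      return "powershell";
    case "darwin":
      return "zsh";
    default:
      return "bash";
  }
}

function chainCommands(commands: string[], platform: Platform): string {
  // PowerShell uses ";" while Unix shells use "&&", as described above.
  const separator = platform === "win32" ? "; " : " && ";
  return commands.join(separator);
}
```

The two separators are not fully equivalent: `;` runs the next command even if the previous one failed, whereas `&&` stops at the first failure.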

What's New in v0.1.3

Improved AI Response Reliability

  • Official Ollama SDK: Now uses the official ollama npm package for better compatibility and reliability
  • Streaming Responses: See AI output in real-time as it's generated (configurable)
  • Context Window Management: Proper num_ctx parameter support (default 32768) for longer conversations
  • Retry Logic: Automatic retry with exponential backoff for rate-limited requests
  • Request Cancellation: Working Cancel button to abort long-running requests
  • Actual Token Counts: Real token usage from API instead of estimates
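Retry with exponential backoff generally follows the pattern sketched below. The base delay, delay cap, attempt count, and the `isRateLimited` predicate are illustrative assumptions; the README does not state the values the extension uses.

```typescript
// Sketch only: delays and attempt counts are assumed, not taken from the extension.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  // Delay doubles with each attempt and is capped at maxMs.
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

async function withRetry<T>(
  request: () => Promise<T>,
  isRateLimited: (error: unknown) => boolean,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await request();
    } catch (error) {
      // Give up on non-rate-limit errors or when attempts are exhausted.
      if (!isRateLimited(error) || attempt >= maxAttempts - 1) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The `sleep` parameter is injectable so the retry loop can be exercised in tests without real delays.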

New Configuration Options

  • enableStreaming - Toggle streaming responses on/off
  • contextWindow - Set the context window size (2048-131072)
  • requestTimeout - Set request timeout in milliseconds (5000-600000)
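The documented ranges can be enforced when settings are read. The sketch below clamps out-of-range values; whether the extension clamps or rejects them is not stated, so treat that behavior, along with the `RequestSettings` and `normalizeSettings` names, as assumptions. The defaults come from the Configuration section.

```typescript
// Sketch only: clamping is an assumed policy; names are hypothetical.
interface RequestSettings {
  enableStreaming: boolean;
  contextWindow: number;
  requestTimeout: number; // milliseconds
}

const clamp = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

function normalizeSettings(input: Partial<RequestSettings>): RequestSettings {
  return {
    enableStreaming: input.enableStreaming ?? true,
    // Ranges as documented above: 2048-131072 and 5000-600000.
    contextWindow: clamp(input.contextWindow ?? 32768, 2048, 131072),
    requestTimeout: clamp(input.requestTimeout ?? 120000, 5000, 600000),
  };
}
```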

Installation

From VSIX (Local Installation)

  1. Download the .vsix file
  2. Open VSCode
  3. Go to Extensions (Ctrl+Shift+X)
  4. Click the "..." menu at the top
  5. Select "Install from VSIX..."
  6. Choose the downloaded file

From Source

  1. Clone this repository
  2. Run npm install
  3. Run npm run compile
  4. Press F5 to open a new VSCode window with the extension loaded

Setup

Option 1: Ollama Cloud (Recommended for beginners)

  1. Get Ollama Cloud API Key

    • Sign up at ollama.com
    • Go to your account settings and generate an API key
  2. Configure the Extension

    • Open VSCode Settings (Ctrl+,)
    • Search for "Ollama Cloud"
    • Enter your API key in ollamaCloud.cloudApiKey

Option 2: Local Ollama

  1. Install Ollama

    • Download from ollama.com
    • Run ollama serve to start the local server
  2. Pull a Model

    ollama pull llama3.1
    
  3. Configure the Extension

    • The extension will automatically detect local models
    • No API key needed for local models

Option 3: Both (Hybrid)

Use both local and cloud models! The extension automatically routes requests to the appropriate endpoint based on where each model is available.
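The routing decision can be thought of as a lookup against the two model lists. This sketch assumes a local copy is preferred when a model is available in both places; the README does not specify the tie-breaking rule, and `routeModel` is a hypothetical name.

```typescript
// Sketch only: preferring local over cloud is an assumption.
type Endpoint = "local" | "cloud";

function routeModel(
  model: string,
  localModels: string[],
  cloudModels: string[],
): Endpoint | undefined {
  if (localModels.indexOf(model) !== -1) return "local";
  if (cloudModels.indexOf(model) !== -1) return "cloud";
  // The model is not available anywhere.
  return undefined;
}
```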

Usage

Opening the Chat

  • Click the Ollama Cloud icon in the Activity Bar (left sidebar)
  • Or use the keyboard shortcut: Ctrl+Shift+O (Windows/Linux) or Cmd+Shift+O (Mac)
  • Or open Command Palette (Ctrl+Shift+P) and run "Ollama Cloud: Open Chat"

Chatting with the AI

Simply type your question or request in the chat input and press Enter. The AI can help with:

  • Writing new code
  • Explaining existing code
  • Debugging errors
  • Refactoring code
  • Creating new files
  • Running commands
  • And much more!

Operating Modes

ACT Mode (Default)

  • AI can suggest file edits and commands
  • You approve each action before it's executed
  • Best for getting things done

PLAN Mode

  • AI explains what should be done without executing
  • Great for understanding complex tasks
  • Use for learning and planning

File Editing

When the AI suggests file changes, it will format them like this:

// File: src/example.js
function hello() {
  console.log("Hello, World!");
}
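A block in this format can be split into a path and its content by reading the `// File:` header line. The parser below is a simplified sketch; `parseFileBlock` is a hypothetical name, and the extension's own patterns (which handle several output formats) are not shown in this README.

```typescript
// Sketch only: handles just the "// File: <path>" header format shown above.
interface FileEdit {
  path: string;
  content: string;
}

function parseFileBlock(block: string): FileEdit | undefined {
  const lines = block.split(/\r?\n/);
  const header = /^\/\/\s*File:\s*(.+?)\s*$/.exec(lines[0] ?? "");
  if (header === null || header[1] === undefined) return undefined;
  // Everything after the header line is the file content.
  return { path: header[1], content: lines.slice(1).join("\n") };
}
```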

Approval Workflow

By default, you'll be prompted to approve the changes before they're applied. These will appear as clickable cards in the chat interface with options to Apply, Skip, or review the Diff (when enabled).

If you'd prefer to skip manual approval, you can enable Auto-Approve in settings (ollamaCloud.autoApprove). ⚠️ Warning: With auto-approve enabled, the AI can directly modify your files and execute commands without confirmation.

Command Execution

When the AI suggests commands, they'll be formatted like:

npm install axios

Approval Workflow

These commands will appear as clickable cards in the chat interface. By default, you'll need to explicitly approve each command before it runs. Click Run to execute or Skip to ignore.

If you'd prefer to skip manual approval, you can enable Auto-Approve in settings (ollamaCloud.autoApprove). ⚠️ Warning: With auto-approve enabled, the AI can directly execute terminal commands without confirmation.

Configuration

| Setting | Description | Default |
| --- | --- | --- |
| ollamaCloud.cloudApiKey | Your Ollama Cloud API key | (empty) |
| ollamaCloud.localEndpoint | Local Ollama server URL | http://localhost:11434 |
| ollamaCloud.chatModel | AI model for chat | llama3.1 |
| ollamaCloud.autocompleteModel | AI model for autocomplete | ministral-3:3b |
| ollamaCloud.defaultMode | Default operating mode | act |
| ollamaCloud.customSystemPrompt | Custom system prompt with placeholders | (empty) |
| ollamaCloud.temperature | Response creativity (0-2) | 0.7 |
| ollamaCloud.maxTokens | Maximum response length | 4096 |
| ollamaCloud.contextWindow | Context window size | 32768 |
| ollamaCloud.requestTimeout | Request timeout (ms) | 120000 |
| ollamaCloud.enableStreaming | Enable streaming responses | true |
| ollamaCloud.autoApprove | Auto-approve AI actions | false |
| ollamaCloud.showDiff | Show diff before applying changes | true |
| ollamaCloud.enableAutocomplete | Enable AI code completion | true |
| ollamaCloud.includeFileContext | Automatically include context from open files in chat messages | true |
| ollamaCloud.enableCodeLens | Show AI action buttons in code | true |
| ollamaCloud.enableStreamingForAutocomplete | Enable streaming for autocomplete requests | false |
| ollamaCloud.autocompleteDebounceDelay | Delay before sending autocomplete requests (ms) | 250 |
| ollamaCloud.autocompletePreviewDelay | Delay before showing autocomplete preview (ms) | 1000 |
| ollamaCloud.enableAdaptivePerformance | Enable adaptive performance tuning | true |

Custom System Prompt Placeholders

When using customSystemPrompt, you can use these placeholders that will be replaced with actual values:

| Placeholder | Description |
| --- | --- |
| {{OS}} | Operating system (Windows, macOS, Linux) |
| {{SHELL}} | Shell type (PowerShell, Bash, etc.) |
| {{WORKSPACE}} | Current workspace name |
| {{WORKSPACE_PATH}} | Full path to workspace |
| {{FILE_TREE}} | Project file structure |
| {{NPM_SCRIPTS}} | Available npm scripts |
| {{MODE}} | Current mode (ACT or PLAN) |
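Placeholder substitution of this kind is typically a single regular-expression replace. The sketch below leaves unrecognized placeholders untouched, which is an assumption rather than documented behavior; `renderSystemPrompt` is a hypothetical name.

```typescript
// Sketch only: how the extension treats unknown placeholders is not documented.
function renderSystemPrompt(template: string, context: Record<string, string>): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (match: string, name: string) => {
    const value = context[name];
    // Leave placeholders without a known value as they are.
    return value !== undefined ? value : match;
  });
}
```

For example, the template `You are on {{OS}} using {{SHELL}}.` with `{ OS: "Linux", SHELL: "Bash" }` renders as `You are on Linux using Bash.`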

Project Context Files

The AI can automatically read project-specific context from special markdown files in your workspace. Create one of these files to provide the AI with project-specific information. The extension supports multiple popular AI context formats.

Supported File Names (Ordered by Priority)

Ollama Cloud Native Formats

  1. .ollamacloud.md (Primary - Highest priority)
  2. .ollamacloud-context.md

GitHub Copilot Compatible Formats

  1. .github/copilot-instructions.md (GitHub standard location)
  2. .github/copilot_instructions.md
  3. .copilot-instructions.md
  4. .copilot_instructions.md
  5. copilot-instructions.md

Anthropic Claude Compatible Formats

  1. .claude/context.md
  2. .claude/instructions.md
  3. claude-context.md

Generic Formats

  1. .project-context.md
  2. .context.md
  3. PROJECT.md
  4. README.context.md
  5. PROJECT_CONTEXT.md
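Priority-based selection can be sketched as a scan over the names above in order. This assumes the groups rank in the order they are listed (Ollama Cloud first, generic last) and that only the first match is used; `pickContextFile` is a hypothetical name, and paths are relative to the workspace root.

```typescript
// Sketch only: cross-group ordering and first-match-wins are assumptions.
const CONTEXT_FILES: readonly string[] = [
  ".ollamacloud.md",
  ".ollamacloud-context.md",
  ".github/copilot-instructions.md",
  ".github/copilot_instructions.md",
  ".copilot-instructions.md",
  ".copilot_instructions.md",
  "copilot-instructions.md",
  ".claude/context.md",
  ".claude/instructions.md",
  "claude-context.md",
  ".project-context.md",
  ".context.md",
  "PROJECT.md",
  "README.context.md",
  "PROJECT_CONTEXT.md",
];

// `existingFiles` lists workspace-relative paths that are present on disk.
function pickContextFile(existingFiles: readonly string[]): string | undefined {
  for (const candidate of CONTEXT_FILES) {
    if (existingFiles.indexOf(candidate) !== -1) return candidate;
  }
  return undefined;
}
```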

Format Compatibility

This extension automatically reads context files used by other AI assistants:

  • GitHub Copilot: Supports .github/copilot-instructions.md format
  • Anthropic Claude: Supports .claude/context.md format
  • Generic AI tools: Supports common context file names

You can use existing context files from other AI tools, or create a new one tailored for Ollama Cloud.

Example Project Context Files

Ollama Cloud Format (.ollamacloud.md)

Create .ollamacloud.md in your project root:

# Project Information
- **Project Name**: MyApp
- **Framework**: React with TypeScript
- **Build Tool**: Vite
- **CSS Framework**: Tailwind CSS
- **Backend API**: REST API at /api/v1

# Important Guidelines
- Always use functional components with React hooks
- Follow the existing folder structure pattern
- Use Tailwind classes for styling instead of CSS files
- Maintain consistent error handling with try/catch blocks
- Keep components small and focused (max 150 lines)

# Architecture Notes
- Authentication is handled via JWT tokens
- State management uses React Context + useReducer
- All API calls go through the `/src/api/client.ts` wrapper
- Environment variables are prefixed with `VITE_`

# Common Patterns to Avoid
- ❌ Don't use Redux (we migrated away from it)
- ❌ Don't create new CSS files (use Tailwind only)
- ❌ Don't use class components (legacy only)

GitHub Copilot Format (.github/copilot-instructions.md)

# Copilot Instructions for This Project

## Coding Standards
- Follow the existing code style
- Use TypeScript with strict typing
- Include JSDoc comments for exported functions
- Handle errors gracefully with try/catch

## Project Structure
- `/src` - Source code
- `/test` - Unit tests
- `/dist` - Built output
- Configuration files in root

## Key Dependencies
- vscode - VS Code Extension API
- ollama - Ollama client library
- axios - HTTP client for API calls

Claude Format (.claude/context.md)

# Claude Context

This project is a VS Code extension for AI-assisted coding.

## Core Principles
- User privacy is paramount
- Transparency in all AI interactions
- Security through explicit approval workflows
- Performance through caching and debouncing

## Technical Constraints
- Must work with both local and cloud AI models
- All file operations require user confirmation
- Network calls must handle timeouts gracefully
- Extension must work offline when possible

Best Practices

  1. Single Context File: Use only one context file to avoid conflicts
  2. Keep it Concise: Focus on the most important information
  3. Regular Updates: Keep context files current with project changes
  4. Team Consensus: Ensure team agrees on context guidelines
  5. Security Review: Review context files for sensitive information

The AI will automatically include this context information in all conversations to provide more accurate and project-specific responses.

Available Models

Cloud Models (Ollama Cloud)

  • llama3.1 - Llama 3.1 (recommended)
  • llama3.2 - Llama 3.2
  • ministral-3:3b - Fast, efficient for autocomplete
  • gemma3:4b - Google's Gemma 3

Local Models (requires local Ollama)

  • codellama - Specialized for coding
  • mistral - Fast and efficient
  • mixtral - Mixture of experts model
  • qwen2.5-coder - Specialized coding model
  • phi3 - Microsoft's Phi-3

Keyboard Shortcuts

| Shortcut | Action |
| --- | --- |
| Ctrl+Shift+O | Open Chat |
| Ctrl+Shift+N | New Task |
| Ctrl+K | Inline Chat (with selection) |
| Ctrl+Shift+E | Explain Code |
| Ctrl+Shift+T | Generate Tests |
| Ctrl+Shift+F | Fix Code |
| Ctrl+Shift+R | Review Code |
| Ctrl+Shift+M | Modernize Code |
| Enter | Send message |
| Shift+Enter | New line in message |

Commands

Chat & Core

  • Ollama Cloud: Open Chat - Open the chat interface
  • Ollama Cloud: New Task - Start a new conversation
  • Ollama Cloud: Clear History - Clear chat history
  • Ollama Cloud: Select Model - Choose a different AI model

Code Actions

  • Ollama Cloud: Explain Code - Explain selected code
  • Ollama Cloud: Fix Code - Fix issues in selected code
  • Ollama Cloud: Improve Code - Suggest code improvements
  • Ollama Cloud: Add to Chat - Add selected code to chat
  • Ollama Cloud: Generate Tests - Generate tests for selected code
  • Ollama Cloud: Review Code - Get a code review
  • Ollama Cloud: Modernize Code - Update code to modern standards

Terminal Integration

  • Ollama Cloud: Add Terminal Output - Add selected terminal output to chat
  • Ollama Cloud: Explain Terminal Error - Explain terminal errors
  • Ollama Cloud: Suggest Terminal Command - Get command suggestions

Web Research

  • Ollama Cloud: Search Web - Search the internet and add results to chat
  • Ollama Cloud: Research Topic - Deep research with content fetching
  • Ollama Cloud: Fetch URL - Fetch and analyze URL content

Jupyter Notebooks

  • Ollama Cloud: Explain Notebook Cell - Explain current notebook cell
  • Ollama Cloud: Fix Notebook Cell - Fix errors in notebook cell
  • Ollama Cloud: Optimize Notebook Cell - Optimize cell for performance
  • Ollama Cloud: Generate Notebook Cell - Generate cell from description

Walkthrough & Help

  • Ollama Cloud: Show Welcome - Show welcome tour
  • Ollama Cloud: Show Tips - Show tips and tricks
  • Ollama Cloud: Show Shortcuts - Show keyboard shortcuts

Development

  • Ollama Cloud: Toggle Dev Mode - Enable/disable development mode
  • Ollama Cloud: Show Dev Stats - Show development statistics

Tips

  1. Be Specific: The more specific your request, the better the AI can help
  2. Provide Context: Mention file names, error messages, or relevant code
  3. Review Changes: Always review AI-suggested changes before applying
  4. Use New Task: Start a new task for unrelated questions so earlier context does not carry over
  5. Experiment with Models: Different models excel at different tasks
  6. Adjust Timeout: Increase requestTimeout for larger models or slower connections
  7. Use Streaming: Keep streaming enabled to see responses as they generate

Troubleshooting

"Invalid API key" Error

  • Verify your API key in settings
  • Make sure you have an active Ollama Cloud account

"Request timed out" Error

  • Increase requestTimeout in settings (default 120000 ms)
  • Try a smaller/faster model
  • Check your internet connection

Extension Not Loading

  • Check the Output panel (View → Output → Ollama Cloud)
  • Try reloading VSCode (Ctrl+Shift+P → "Reload Window")

Slow Responses

  • Try a smaller model (e.g., ministral-3:3b instead of llama3.1)
  • Enable streaming to see partial responses
  • Check your internet connection
  • Reduce maxTokens in settings

Local Ollama Not Detected

  • Make sure Ollama is running (ollama serve)
  • Check the localEndpoint setting matches your Ollama server

Privacy & Security

  • Your code is sent to the provider you select (Ollama Cloud, local Ollama, Anthropic, or OpenAI) for processing
  • API keys are stored locally in VSCode settings
  • No data is stored by this extension beyond session persistence
  • Review Ollama's privacy policy for cloud usage details

Development

Building from Source

# Install dependencies
npm install

# Compile TypeScript
npm run compile

# Watch for changes
npm run watch

# Package extension
npm run package

Running Tests

npm test

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

MIT License - See LICENSE file for details

Credits

Inspired by the Cline VSCode extension. Built with ❤️ for the developer community.

Support

  • Report issues on GitLab
  • Visit jkagidesigns.com for more projects

Note: This extension works with both Ollama Cloud (requires API key) and local Ollama (free, requires local installation). Visit ollama.com to get started.
