HalCode AI Assistant
Your personal AI coding partner with deep context awareness and agentic workflows - built for VS Code.
⚡ Quick Start (v3.0.17+)
Using Ollama? Just 3 steps:
1. Install Ollama - Download from ollama.ai
2. Start Ollama - Run `ollama serve` (or `ollama signin` for cloud models)
3. Select Model - Open HalCode settings and pick a model from the dropdown
That's it! No API keys needed. Start chatting! 🚀
🌟 Features
🚀 Ollama Support - Three Great Options! (v3.0.16+)
HalCode now supports three Ollama options to fit every budget and use case:
1️⃣ Local Ollama (Completely Free!)
Perfect for: Casual coding, learning, privacy-focused users
- ✅ Zero Cost - Free after model download
- ✅ Completely Private - Everything runs on your machine
- ✅ Works with AI Assistant - Great for conversation and simple tasks
- ✅ Recommended Models: `llama3:8b`, `phi3`, `gemma:7b`
- ❌ Limitation: File creation requires Ollama Cloud or Claude/Gemini
Setup:
ollama pull llama3:8b
ollama serve
2️⃣ Ollama Cloud (Cheapest for Heavy Use!)
Perfect for: File creation, fast responses, budget-conscious users
- ✅ Free Tier Available - $0/month with usage limits
- ✅ Lightning Fast - Datacenter-grade GPUs (1-5s responses)
- ✅ Works with All Features - File creation, editing, everything!
- ✅ Flexible Pricing - Free, $20/mo Pro, or $100/mo Max
- ✅ Privacy First - Ollama doesn't log your data
- ✅ Zero API Key Management - Just run `ollama signin`
Setup:
ollama signin
# Select a model in HalCode settings
Available Cloud Models:
| Model | Best For | Speed |
|-------|----------|-------|
| `deepseek-v3.1:671b-cloud` | ⚡ Ultra-fast, instant responses | 1-5s |
Available Local Models:
| Model | Best For | Speed | Size |
|-------|----------|-------|------|
| `llama3:8b` | ⭐ RECOMMENDED - Best balance | 5-10s | 4.7GB |
| `phi3` | ⭐ RECOMMENDED - Efficient & fast | 2-5s | 2.3GB |
| `gemma:7b` | ⭐ RECOMMENDED - Strong reasoning | 5-10s | 5GB |
| `deepseek-r1:14b` | Advanced reasoning | 10-20s | 9GB |
| `deepseek-r1:32b` | High-quality reasoning | 20-40s | 20GB |
| `deepseek-r1:70b` | State-of-the-art reasoning | 40-80s | 45GB |
| `deepseek-r1:671b` | Maximum reasoning capability | 2-5min | 400GB+ |
Want to try other models? Just run ollama pull <model-name> and select it in HalCode!
3️⃣ Claude & Gemini (Best Quality)
Perfect for: Complex tasks, Agentic/Multi-Agent modes, maximum reliability
- ✅ Highest Quality - Most capable models
- ✅ All Features Work - File creation, Agentic, Multi-Agent
- ✅ Gemini Free Tier - $0/month with usage limits
- ✅ Claude Paid - Best for serious development
💰 Complete Pricing Comparison
| Feature | Local Ollama | Ollama Cloud Free | Ollama Cloud Pro | Claude | Gemini |
|---------|--------------|-------------------|------------------|--------|--------|
| Cost | Free | Free | $20/mo | Pay-per-use | Free + paid |
| Speed | Slow (60-120s) | Fast (1-5s) | Fast (1-5s) | Fast (2-5s) | Fast (1-3s) |
| AI Assistant | ✅ Works | ✅ Works | ✅ Works | ✅ Works | ✅ Works |
| File Creation | ❌ No | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
| Agentic Mode | ❌ No | 🔄 Coming | 🔄 Coming | ✅ Works | ✅ Works |
| Multi-Agent | ❌ No | 🔄 Coming | 🔄 Coming | ✅ Works | ✅ Works |
| Privacy | ✅ Local | ✅ No logging | ✅ No logging | ❌ Cloud | ❌ Cloud |
| Best For | Casual use | Budget users | Power users | Complex tasks | Free tier users |
🎯 Which Option Should You Choose?
I'm on a tight budget:
→ Start with Ollama Cloud Free ($0) - Fast, works with file creation
I want completely free:
→ Use Local Ollama ($0) - But only for AI Assistant mode
I need file creation and speed:
→ Use Ollama Cloud Pro ($20/mo) - Best value for heavy use
I need the best quality:
→ Use Claude or Gemini - For complex tasks and Agentic modes
I want to try everything:
→ Start with Ollama Cloud Free, upgrade as needed
✅ Follow-Up File Modification Fix (v3.0.1) - AI Assistant Now Executes Follow-Up Changes!
When AI suggests modifying a file, it now actually does it!
- ✅ Follow-Up Modification Detection - AI Assistant detects when it suggests file changes
- ✅ Automatic Approval Setup - Sets up pending request for follow-up modifications
- ✅ Proper Execution - When you say "yes", the file actually gets modified
- ✅ Works with Indexed Folders - Integrates with your configured indexed folders
- ✅ No More Empty Promises - AI Assistant now follows through on suggestions
What This Fixes:
- AI Assistant was suggesting file modifications but not executing them
- Follow-up requests like "integrate this component" now actually work
- Approval flow properly triggers for suggested changes
- Files are written to disk when user approves
✅ Task List System (v3.0.0) - Real-Time Progress Tracking!
Never feel stuck again - see exactly what's happening at every step!
- ✅ Real-Time Task Tracking - Tasks extracted from AI plans and displayed with checkboxes
- ✅ Progress Visibility - Watch tasks update from NOT_STARTED → IN_PROGRESS → COMPLETE
- ✅ AI Assistant Fix - Fixed file detection regex to work with all AI response formats
- ✅ Approval Flows - Both AI Assistant and Agentic Workflow now properly ask for approval
- ✅ Instant Feedback - Users see exactly what's being created and when it's done
- ✅ No More Confusion - Clear visual indicators prevent the "is it frozen?" feeling
What This Fixes:
- AI Assistant now properly detects files from AI responses (was too strict with regex)
- Both workflows show clear approval prompts before executing
- Users get real-time progress updates with visual checkmarks
- Task completion is tracked and displayed in chat
- Fallback pattern matching for various AI response formats
Example Flow:
User: "Create a React component with styling"
AI: [Shows plan with task list]
[ ] Create component file
[ ] Add styling
[ ] Update imports
User: "yes"
AI: [Executes and updates tasks]
[x] Create component file
[x] Add styling
[x] Update imports
✅ All tasks complete!
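For readers curious how the checklist tracking could work under the hood, here is a minimal TypeScript sketch of extracting checkbox tasks from an AI plan. The `Task` shape and regex are illustrative assumptions, not HalCode's actual implementation.
```typescript
// Sketch: extract "[ ]" / "[x]" checklist items from an AI plan (illustrative only).
type TaskStatus = "NOT_STARTED" | "IN_PROGRESS" | "COMPLETE";

interface Task {
  description: string;
  status: TaskStatus;
}

function extractTasks(planText: string): Task[] {
  const tasks: Task[] = [];
  // Matches an optional list bullet, then "[ ]" or "[x]", then the task text.
  const checkbox = /^\s*(?:[-*]\s*)?\[( |x|X)\]\s+(.+)$/;
  for (const line of planText.split("\n")) {
    const match = checkbox.exec(line);
    if (match) {
      tasks.push({
        description: match[2].trim(),
        status: match[1].trim() === "" ? "NOT_STARTED" : "COMPLETE",
      });
    }
  }
  return tasks;
}

// Example: two NOT_STARTED tasks and one COMPLETE task.
extractTasks("- [ ] Create component file\n- [ ] Add styling\n- [x] Update imports");
```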
🎯 Token Budget Enforcement (v2.6.10 - Rate Limit Fix! 🎯)
Semantic search now respects token budgets - no more 429 rate limit errors!
- ✅ Automatic token budget enforcement (190K limit per message)
- ✅ Stops adding files when token limit is reached
- ✅ Eliminates 429 "rate limit exceeded" errors
- ✅ Smarter file selection based on relevance
- ✅ Works with both Gemini and Claude
- ✅ Improved response times with optimized context
What This Fixes:
- No more rate limit errors blocking your work
- Consistent performance even with large codebases
- Better token efficiency across all workflows
- Faster responses due to optimized context size
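Conceptually, the enforcement is a greedy cut-off over relevance-ranked files. A rough sketch, assuming a chars/4 token estimate rather than the extension's real tokenizer:
```typescript
// Sketch: stop adding context files once an estimated token budget is used up.
// The 190K figure comes from the description above; the estimator is a rough
// heuristic, not HalCode's actual tokenizer.
const MAX_CONTEXT_TOKENS = 190_000;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4); // roughly 4 characters per token
}

function selectFilesWithinBudget(
  rankedFiles: { path: string; content: string }[], // already sorted by relevance
  budget: number = MAX_CONTEXT_TOKENS
): { path: string; content: string }[] {
  const selected: { path: string; content: string }[] = [];
  let used = 0;
  for (const file of rankedFiles) {
    const cost = estimateTokens(file.content);
    if (used + cost > budget) break; // budget reached: stop adding files
    selected.push(file);
    used += cost;
  }
  return selected;
}
```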
🔄 Claude Embedding Fallback (v2.6.8 - Flexible & Resilient! 🔄)
Semantic search now works with Claude embeddings as a fallback!
- ✅ Automatic fallback from Gemini to Claude embeddings
- ✅ Works even if Gemini API is unavailable
- ✅ Configurable embedding provider (auto/gemini/claude)
- ✅ Claude-only users no longer need Gemini API key
- ✅ Seamless provider switching with zero downtime
- ✅ Better resilience and flexibility
Provider Options:
- Auto (Default) - Try Gemini first, fall back to Claude
- Gemini Only - Use only Gemini embeddings
- Claude Only - Use only Claude embeddings
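A minimal sketch of how the fallback order could be implemented, with hypothetical `embedWithGemini`/`embedWithClaude` helpers standing in for the real provider calls:
```typescript
type EmbeddingProvider = "auto" | "gemini" | "claude";

// Hypothetical stand-ins for the real Gemini/Claude embedding requests.
async function embedWithGemini(text: string): Promise<number[]> { /* ... */ return []; }
async function embedWithClaude(text: string): Promise<number[]> { /* ... */ return []; }

async function embed(text: string, provider: EmbeddingProvider): Promise<number[]> {
  if (provider === "gemini") return embedWithGemini(text);
  if (provider === "claude") return embedWithClaude(text);
  // "auto" (default): try Gemini first, fall back to Claude if it is unavailable.
  try {
    return await embedWithGemini(text);
  } catch {
    return embedWithClaude(text);
  }
}
```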
⚡ Optimized Chat Stream Speed (v2.6.7 - Smoother Responses! ⚡)
AI responses now stream at a more readable pace!
- ✅ Improved streaming speed for better readability
- ✅ Responses stream at 20ms per character (slowed from 10ms for a more natural pace)
- ✅ More natural reading experience
- ✅ Smoother visual feedback while AI generates responses
✅ Organic AI Conversation & File Creation (v2.6.4 - Natural Flow! 🎯)
Claude and Gemini now operate more organically between chat and file creation/editing!
- ✅ Seamless conversation flow between chat and file operations
- ✅ Natural planning before file creation/editing
- ✅ Organic context switching between chat and code generation
- ✅ Both Claude and Gemini work smoothly together
- ✅ Full support for both providers in all workflows
Provider Support:
| Feature | Claude | Gemini |
|---------|--------|--------|
| AI Assistant (File Creation) | ✅ ORGANIC! | ✅ ORGANIC! |
| Agentic Mode | ✅ Works | ✅ Works |
| Regular Chat | ✅ Works | ✅ Works |
🧠 Improved Gemini Memory (v2.6.1 - Better Context! 💭)
Now Testing & Improving One Feature at a Time
Gemini now remembers your entire conversation history, just like Claude! This means:
- ✅ Gemini recalls earlier messages in the chat
- ✅ Better context awareness across multi-turn conversations
- ✅ Consistent behavior between Claude and Gemini providers
- ✅ Improved agentic workflow reliability
Note: We're actively testing and improving AI Assistant features one at a time to ensure the best user experience. Each improvement is published incrementally.
🔄 Checkpoint System (v2.6.0 - Undo Changes! 🎯)
Available in: AI Assistant Workflow Only
- Save Snapshots - Automatic checkpoints after each file operation
- Revert Anytime - Click [Revert] to undo changes and go back to any checkpoint
- No Manual Cleanup - No need to manually delete files you don't like
- Safe Experimentation - Try different approaches without fear
- Clear Tracking - See exactly what changed at each checkpoint
Example Workflow:
User: "Create a shopping cart, payment system, and user profile"
AI: "⚙️ Creating Cart.jsx... ✅ Done!"
[CHECKPOINT 1: Created Cart.jsx] [↩️ Revert]
AI: "⚙️ Creating Payment.jsx... ✅ Done!"
[CHECKPOINT 2: Created Payment.jsx] [↩️ Revert]
AI: "⚙️ Creating Profile.jsx... ✅ Done!"
[CHECKPOINT 3: Created Profile.jsx] [↩️ Revert]
User: "I don't like Profile.jsx, redo it differently"
User: Clicks [↩️ Revert] on CHECKPOINT 2
→ Profile.jsx is deleted
→ Conversation continues from CHECKPOINT 2
→ User can ask AI to try a different approach
Note: Checkpoints are only available in the AI Assistant workflow. Multi-Agent and Agentic workflows do not create checkpoints.
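Conceptually, each checkpoint only needs to remember which files it created so that a revert can delete everything made after it. A simplified sketch, with a data shape and revert logic that are assumptions rather than the shipped implementation:
```typescript
import * as fs from "fs";

// Sketch: record which files each checkpoint created, and delete everything
// created after the checkpoint the user reverts to (illustrative only).
interface Checkpoint {
  id: number;
  label: string;
  createdFiles: string[];
}

const checkpoints: Checkpoint[] = [];

function addCheckpoint(label: string, createdFiles: string[]): void {
  checkpoints.push({ id: checkpoints.length + 1, label, createdFiles });
}

function revertTo(checkpointId: number): void {
  // Undo checkpoints (and their files) made after the target checkpoint.
  while (checkpoints.length > checkpointId) {
    const undone = checkpoints.pop()!;
    for (const file of undone.createdFiles) {
      if (fs.existsSync(file)) fs.unlinkSync(file);
    }
  }
}
```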
💬 AI Assistant File Writing (v2.5.0 - Game Changer! 🎯)
- 🚫 NEVER Write Code in Chat - All code written directly to disk, no pasting needed
- ✅ ALWAYS Ask for Approval - "Should I go ahead?" before creating files
- ⚙️ Natural Progress Messages - "Creating Cart.jsx... ✅ Done!" (conversational, not robotic)
- 🧠 Remember Context Automatically - No need to click "reply on last message"
- AI remembers what it created in this conversation
- Naturally references previous work
- Internal task list tracks all changes
- 📝 Task Tracking - Preserves context across multiple messages
- 🎯 Perfect Workflow - Works seamlessly with Multi-Agent and Agentic Agent
Example:
User: "Add a shopping cart component"
AI: "Great! Let me add that for you..."
User: "yes"
AI: "⚙️ Creating Cart.jsx... ✅ Done!"
AI: "⚙️ Creating Cart.css... ✅ Done!"
AI: "All set! What's next?"
🤖 Enhanced Agentic Workflow (v2.4.0 - Production Ready! 🚀)
- 📊 Detailed Progress Tracking: See exactly what the AI is doing at each stage
- Step-by-step progress indicators (Analyzing → Generating → Validating → Applying)
- File-by-file updates showing each file being written
- Clear visual feedback with animated spinners and checkmarks
- 🔄 Stay in Agentic Mode: Continue working seamlessly after task completion
- Post-completion menu with helpful next actions
- Install dependencies, run tests, start dev server - all from the AI
- Context preservation - AI remembers what it just built
- 💡 Better User Experience: More transparency and control
- See which files are being created/modified in real-time
- Success confirmations for each operation
- AI suggests logical next steps based on what was built
- ⚡ Improved Workflow: No more black box - know exactly what's happening
- Clear status messages at every stage
- Progress indicators show active work
- Seamless transitions between tasks
⚠️ Proactive Code Analysis (v2.3.0! 🚀)
- 🤖 AI Watches Your Code: Automatically analyzes code as you type
- ⚡ Real-Time Warnings: Yellow squiggly lines show potential issues
- 💡 Quick Fixes: Click the lightbulb to apply AI-suggested fixes
- 🔍 Smart Detection: Missing imports, error handling, deep nesting, long functions
- 📊 Professional UI: Looks like native VS Code diagnostics
- 🎯 Hover for Details: See explanations and suggestions on hover
- ⚙️ Non-Intrusive: Only shows when there are actual issues
- 🆓 Free Tier: 10 analyses/day | 💎 Pro: Unlimited
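Features like this surface their findings through VS Code's standard diagnostics API, which is why the warnings look native. A simplified sketch, with a single placeholder heuristic standing in for the AI-backed analysis:
```typescript
import * as vscode from "vscode";

// Sketch: publish analysis results as VS Code diagnostics. The one heuristic
// below is illustrative, not the extension's real analysis.
const diagnostics = vscode.languages.createDiagnosticCollection("halcode");

function analyzeDocument(doc: vscode.TextDocument): void {
  const issues: vscode.Diagnostic[] = [];
  for (let line = 0; line < doc.lineCount; line++) {
    const text = doc.lineAt(line).text;
    if (text.includes("JSON.parse(") && !doc.getText().includes("try")) {
      const range = new vscode.Range(line, 0, line, text.length);
      issues.push(
        new vscode.Diagnostic(
          range,
          "JSON.parse without error handling may throw on bad input",
          vscode.DiagnosticSeverity.Warning
        )
      );
    }
  }
  diagnostics.set(doc.uri, issues);
}
```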
💡 Inline Code Completion (v2.2.4! ⚡)
- ✨ GitHub Copilot-Style Completions: Get code suggestions as you type
- 🆓 FREE with Gemini Flash: Uses Google's free Gemini API (no cost!)
- ⚡ Smart & Fast: 300ms debounce, 5-minute cache, <500ms latency
- 🧠 Context-Aware: Analyzes 20 lines before + 5 lines after cursor
- 📝 Multi-Line Suggestions: Up to 3 lines of intelligent code completion
- 🎯 Works Everywhere: All file types supported
- ⚙️ Fully Configurable: Enable/disable, adjust timing, customize behavior
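The timing and context-window numbers above translate into a small debounce-and-cache loop. A rough sketch, where `requestCompletion` is a hypothetical stand-in for the Gemini Flash call:
```typescript
// Hypothetical stand-in for the Gemini Flash completion request.
async function requestCompletion(prefix: string, suffix: string): Promise<string> { return ""; }

const cache = new Map<string, { value: string; expires: number }>();
let debounceTimer: ReturnType<typeof setTimeout> | undefined;

function onType(lines: string[], cursorLine: number, show: (s: string) => void): void {
  if (debounceTimer) clearTimeout(debounceTimer);
  debounceTimer = setTimeout(async () => {
    // Context window described above: 20 lines before the cursor, 5 after.
    const prefix = lines.slice(Math.max(0, cursorLine - 20), cursorLine + 1).join("\n");
    const suffix = lines.slice(cursorLine + 1, cursorLine + 6).join("\n");
    const key = prefix + "\u0000" + suffix;
    const hit = cache.get(key);
    if (hit && hit.expires > Date.now()) return show(hit.value);
    const completion = await requestCompletion(prefix, suffix);
    cache.set(key, { value: completion, expires: Date.now() + 5 * 60_000 }); // 5-minute cache
    show(completion);
  }, 300); // 300ms debounce
}
```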
🔧 Enhanced Terminal Integration (NEW in v2.2.4! 🖥️)
- ▶️ Run Commands from AI: AI can execute terminal commands with your approval
- 📊 Real-Time Output: See command output streaming live
- ✅ Interactive Approval: User approves commands before execution
- 🛡️ Safety Checks: Prevents dangerous commands (rm -rf, format, etc.)
- 🔍 Error Parsing: Automatically detects and highlights errors
- 🚀 Smart Detection: Auto-detects npm, yarn, pnpm, pip, cargo
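The safety check can be thought of as a blocklist applied before a command is even offered for approval. An illustrative sketch; the patterns are examples, not the extension's actual list:
```typescript
// Sketch: reject obviously destructive commands before they reach approval.
const DANGEROUS_PATTERNS: RegExp[] = [
  /\brm\s+-rf\s+[\/~]/,      // recursive delete of root or home
  /\bmkfs(\.|\s)/,           // formatting a filesystem
  /\bformat\s+[a-z]:/i,      // Windows drive format
  /\bdd\s+if=.*of=\/dev\//,  // raw writes to devices
];

function isCommandSafe(command: string): boolean {
  return !DANGEROUS_PATTERNS.some((pattern) => pattern.test(command));
}

// isCommandSafe("npm install") -> true  (goes on to user approval)
// isCommandSafe("rm -rf /")    -> false (blocked before approval)
```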
🔍 Workspace-Wide Search (NEW in v2.2.4! 🔎)
- ⚡ Search ALL Files Instantly: Not limited to indexed folders
- 🎯 Fuzzy Search: Find code even with typos
- 🔧 Smart Search Types:
- Find API endpoints (Express, Spring, Flask, etc.)
- Find function definitions
- Find class definitions
- Find imports/requires
- Find TODO/FIXME comments
- 📊 AI-Friendly Output: Results formatted for AI consumption
- 📁 Grouped by File: Easy to navigate results with context
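Each smart search type boils down to a pattern run across every workspace file, not just indexed folders. A simplified sketch using the standard VS Code workspace API; the regexes are illustrative:
```typescript
import * as vscode from "vscode";

// Sketch: express the "smart search types" above as regexes over workspace files.
const SEARCH_TYPES: Record<string, RegExp> = {
  apiEndpoints: /\b(app|router)\.(get|post|put|delete|patch)\s*\(/,
  functionDefs: /\b(function\s+\w+|const\s+\w+\s*=\s*(async\s*)?\()/,
  todos: /\b(TODO|FIXME)\b/,
};

async function searchWorkspace(type: keyof typeof SEARCH_TYPES) {
  const results: { file: string; line: number; text: string }[] = [];
  const files = await vscode.workspace.findFiles("**/*.{js,ts,py,java}", "**/node_modules/**");
  for (const uri of files) {
    const doc = await vscode.workspace.openTextDocument(uri);
    doc.getText().split("\n").forEach((text, line) => {
      if (SEARCH_TYPES[type].test(text)) {
        results.push({ file: uri.fsPath, line: line + 1, text: text.trim() });
      }
    });
  }
  return results; // grouped by file and formatted before being shown in chat
}
```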
🗣️ Conversational AI Agents (v2.2.3 💬)
- 👥 Human-Like Communication: AI agents now talk like real colleagues, not robots
- 🎯 Professional & Warm: Maintains expertise while being approachable and friendly
- 💭 Natural Conversations: Uses "I" and "you" naturally, explains reasoning conversationally
- 🤝 Contextual Responses: Agents acknowledge your requests before diving into work
- ✨ Personality-Driven: Each agent (Architect, Developer, QA) has distinct communication style
- 🔄 Smooth Handoffs: Natural transitions between agents ("Let me hand this over to QA...")
- 📢 No More Robot Speak: Removed formal "SYSTEM" messages and robotic status updates
- 💡 Better Understanding: Natural language makes intent and progress clearer
- ⚡ Smart File Chunking: Large files automatically split into manageable chunks (70% memory reduction)
- 🧠 Persistent Embedding Cache: Embeddings saved to disk for instant semantic search across sessions
- 📊 Progressive Indexing: Priority-based file processing (high priority files first)
- ⚠️ Token Budget Protection: Automatic limits prevent memory issues on huge projects
- 10x Faster Searches: Pre-computed embeddings make semantic search nearly instant
- Scalable: Now handles projects with 10,000+ files efficiently
- Smart Summarization: Very large files (>500KB) summarized with signatures extracted
- Detailed Statistics: See exactly what was indexed and what was skipped
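The chunking idea is straightforward: split big files into overlapping line windows so nothing has to be loaded or embedded in one piece. A sketch with assumed chunk and overlap sizes:
```typescript
// Sketch: split a large file into overlapping line-based chunks before embedding.
// The chunk size and overlap are illustrative values.
interface Chunk {
  filePath: string;
  startLine: number;
  text: string;
}

function chunkFile(filePath: string, content: string, chunkLines = 120, overlap = 20): Chunk[] {
  const lines = content.split("\n");
  const chunks: Chunk[] = [];
  for (let start = 0; start < lines.length; start += chunkLines - overlap) {
    chunks.push({
      filePath,
      startLine: start + 1,
      text: lines.slice(start, start + chunkLines).join("\n"),
    });
    if (start + chunkLines >= lines.length) break; // last chunk reached
  }
  return chunks;
}
```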
🎨 Code Generation Intelligence (v2.2.0)
- 🧠 Context-Aware Code Generation: AI automatically includes relevant context and patterns from your codebase
- 📦 Smart Import Suggestions: Detects missing imports and suggests correct paths with confidence levels
- 🔧 Auto-Fix Imports: Automatically adds missing imports to your file with one click
- 🎨 Pattern Recognition: Learns your coding style (indentation, semicolons, naming conventions)
- 🔄 Refactoring Intelligence: Suggests extract function/component, detects duplication, analyzes impact
- 🔍 Duplicate Code Detection: Finds similar code patterns across your codebase (80%+ similarity)
- 🎯 Quick Actions: Five new buttons for intelligent code generation:
- 📦 Suggest Imports - Find and suggest missing imports
- 🔧 Fix Imports - Automatically add missing imports
- 🎨 Analyze Patterns - Show code style and pattern analysis
- 🔄 Suggest Refactoring - Get refactoring suggestions for selected code
- 🔍 Find Duplicates - Find similar code patterns
- Framework-Aware: Detects React, Vue, Angular, Svelte and generates appropriate code
- Style Matching: Generates code that matches your project's existing style
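Pattern recognition of this kind can start from very simple counts over an existing file. An illustrative sketch covering just indentation and semicolon style; the heuristics are assumptions, not the real analysis:
```typescript
// Sketch: infer a couple of the style signals mentioned above from a source file.
interface StyleProfile {
  indent: "tabs" | "2-spaces" | "4-spaces";
  semicolons: boolean;
}

function detectStyle(source: string): StyleProfile {
  const lines = source.split("\n");
  const tabs = lines.filter((l) => l.startsWith("\t")).length;
  const fourSpaces = lines.filter((l) => /^    \S/.test(l)).length;
  const twoSpaces = lines.filter((l) => /^  \S/.test(l)).length;
  const semis = lines.filter((l) => l.trim().endsWith(";")).length;
  return {
    indent: tabs > twoSpaces + fourSpaces ? "tabs" : fourSpaces > twoSpaces ? "4-spaces" : "2-spaces",
    semicolons: semis > lines.length / 4, // rough majority signal
  };
}
```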
🧠 Real-time Dependency Analysis (v2.0.0)
- ⚡ Real-time File Watching: Automatic detection of file changes with incremental re-indexing
- 📊 Dependency Graph: Visual Mermaid diagrams showing import/export relationships
- 💥 Impact Analysis: Risk assessment (Low/Medium/High/Critical) with affected file counts
- 🔍 Reference Finder: Find all files that import a specific file instantly
- 🎯 Quick Actions: Four buttons for instant dependency analysis:
- 📊 Dependencies - Show what this file imports
- 🔍 References - Show what files import this file
- 💥 Impact - Analyze risk and affected files
- 🕸️ Graph - Visual dependency diagram
- Smart Recommendations: Get actionable advice based on impact analysis
- Circular Dependency Detection: Identifies problematic dependency chains
- Cross-platform: Works on Windows, Mac, and Linux with proper path handling
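At its core, the dependency graph is a map from each file to the modules it imports, which also answers "who imports this file?". A simplified sketch; the regex handles only ES-style imports and path resolution is omitted:
```typescript
// Sketch: build a simple import graph and list direct dependents (illustrative only).
type DependencyGraph = Map<string, Set<string>>; // file -> modules it imports

function buildGraph(files: Map<string, string>): DependencyGraph {
  const graph: DependencyGraph = new Map();
  const importRe = /import\s+(?:[\w{}*,\s]+\s+from\s+)?["']([^"']+)["']/g;
  for (const [path, content] of files) {
    const imports = new Set<string>();
    for (const match of content.matchAll(importRe)) imports.add(match[1]);
    graph.set(path, imports);
  }
  return graph;
}

function findDependents(graph: DependencyGraph, targetModule: string): string[] {
  // Direct dependents: every file whose import list mentions the target module.
  // (Resolving relative specifiers to absolute paths is skipped here.)
  return [...graph.entries()]
    .filter(([, imports]) => imports.has(targetModule))
    .map(([file]) => file);
}
```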
💬 Persistent Chat History & Conversations
- Multiple Conversations: Create unlimited separate conversations, each with its own context
- Auto-Save: All conversations automatically saved and restored across VS Code restarts
- Conversation List: Browse all your conversations in the sidebar with timestamps
- Click to Load: Instantly load any previous conversation to continue where you left off
- Auto-Generated Titles: Conversations automatically titled from your first message
🔍 Search & Export
- Search Conversations: Powerful search across all conversation titles and message content
- Export to Multiple Formats:
- 📄 Markdown (.md) - Perfect for documentation and sharing
- 📦 JSON (.json) - For backup and programmatic access
- 🌐 HTML (.html) - Styled webpage for browser viewing
- Clean Filenames: Automatically sanitized filenames for easy organization
🧠 Deep Context Awareness
- 200K+ Token Context: Understands your entire codebase, not just snippets
- Style Guide Integration: Automatically reads and follows your `styleguide.md`, `CLAUDE.md`, or `.github/copilot-instructions.md`
- Project Structure Analysis: Knows your file organization and dependencies
- Related Files Detection: Understands imports and file relationships
- Real-time Updates: Index automatically updates when files change
🔀 Git Integration
- AI-Generated Commit Messages: Analyzes your changes and generates meaningful commit messages
- Git Status at a Glance: Branch indicator in header with real-time change detection
- View Changes in Chat: See diffs directly in the chat interface with syntax highlighting
- Commit History: View recent commits without leaving the chat
- Quick Git Actions: Status, Commit, Changes, and History buttons for instant access
- Smart Commit Workflow: AI analyzes diffs, suggests commit message, waits for approval
- Conventional Commits: Follows standard format (feat, fix, docs, refactor, etc.)
- Optional Feature: Only shows when Git repository is detected - non-intrusive!
🤖👥 Multi-Agent Collaboration (v1.0.22)
- Three AI Agents Working Together: Complex tasks handled by specialized agents
- 🏗️ Agent 1 (Architect): Plans architecture, designs file structure, identifies dependencies
- 💻 Agent 2 (Developer): Reviews plan and implements code for each file
- 🧪 Agent 3 (QA Engineer): Reviews implementation, writes tests, identifies issues
- User Approval Required: Review the plan before implementation begins
- Automatic File Creation: Generated files automatically saved to your indexed folders
- Real-Time Progress: Watch agents collaborate with live status updates
- Visual Agent Cards: See each agent's status and current task in real-time
🤖 Agentic Workflows
- Plan Before Execute: AI creates a detailed plan and waits for your approval before making changes
- Multi-Step Refactoring: Handles complex, multi-file refactors with confidence
- Test-First Approach: Suggests tests before implementation
- Minimal Changes: Only modifies what's necessary
- Terminal Command Execution: Automatically proposes commands like `npm install` after file changes (requires approval)
💻 Interactive Chat Interface
- Sidebar Chat: Always-accessible AI assistant in your VS Code sidebar
- Context-Aware Responses: Include current file context with one click
- Multiple AI Providers: Switch between Claude and Gemini with separate API keys
- Code Analysis: Right-click any file to get instant AI analysis
💬 Conversational AI Experience
Experience AI agents that communicate like real colleagues:
🏗️ Architect Agent:
"Alright, I've analyzed your request and here's what I'm thinking we should do...
I'll need to create about 15 files for this. The structure will be component-based
so it's easy to maintain.
What do you think? Should I hand this off to the developer to build it out?"
💻 Developer Agent:
"Alright, I've reviewed the plan and I'm ready to implement this. Let me build
this out for you...
Done! I've implemented everything from the plan. Let me hand this over to QA
to make sure everything looks good."
🧪 QA Agent:
"Nice work! I've reviewed everything and it's looking solid. All the files are
in place, dependencies check out, and the code is clean.
This is ready to go! 🎉"
⚙️ Highly Customizable
- Custom System Prompts: Define your AI's personality and coding style
- Multiple AI Providers: Support for Claude and Gemini with independent API keys
- Configurable Models: Choose the best model for your needs
- Your Coding Style: Teach the AI your preferences once, use forever
🚀 Getting Started
1. Installation
- Open VS Code
- Press `Ctrl+Shift+X` (or `Cmd+Shift+X` on Mac) to open Extensions
- Search for "Personalized AI Coding Assistant"
- Click Install
2. Set Your API Keys
Add Claude API Key (Optional)
- Press `Ctrl+Shift+P` (or `Cmd+Shift+P` on Mac)
- Type "AI Assistant: Set API Key"
- Select Claude (Anthropic)
- Enter your Claude API key (starts with `sk-ant-`)
Add Gemini API Key (Optional - FREE TIER!)
- Press `Ctrl+Shift+P`
- Type "AI Assistant: Set API Key"
- Select Gemini (Google)
- Enter your Gemini API key
Note: You can add both keys and switch between them anytime! Each provider has its own separate key storage.
Setup Ollama Cloud (Optional - FREE & FAST!)
- Sign up at ollama.com (free account)
- Open a terminal and run `ollama signin`
- Follow the prompts to authenticate with your ollama.com account
- Done! Your device is now linked to your Ollama Cloud account
- In HalCode settings, select any cloud model from the dropdown
That's it! No API keys to manage - Ollama handles authentication automatically.
Setup Local Ollama (Optional - COMPLETELY FREE!)
Recommended Models for HalCode:
- `llama3:8b` - ⭐ Best overall, great instruction following
- `phi3` - ⭐ Excellent quality, very efficient
- `gemma:7b` - ⭐ Strong reasoning, good code generation
Setup Steps:
1. Pull a model: `ollama pull llama3:8b`
2. Start Ollama: `ollama serve`
3. In HalCode settings, select your model from the dropdown
Important Notes:
- ✅ Works great with AI Assistant mode (conversation, simple tasks)
- ❌ Does NOT work with file creation (use Ollama Cloud instead)
- ✅ Completely free after download
- ✅ Runs entirely on your machine (100% private)
- ⚠️ Slower than cloud options (60-120 seconds per response)
3. Configuration
Set Your API Key (Legacy)
- Press `Ctrl+Shift+P` (or `Cmd+Shift+P` on Mac)
- Type "AI Assistant: Set API Key"
- Select your provider and enter the API key
Configure Your System Prompt
- Press `Ctrl+Shift+P`
- Type "AI Assistant: Configure System Prompt"
- Enter your personalized instructions
Example System Prompt:
You are my Senior Engineering Partner, specialized in TypeScript and React.
MODE: Undoubted Expert. Respond with certainty and precision.
AGENTIC BEHAVIOR:
1. Plan First: Outline steps before coding
2. Wait for Approval: Don't code until I say "GO"
3. Minimal Change: Only modify what's necessary
4. Tests First: Suggest tests before implementation
CODING STYLE:
- Language: TypeScript with strict mode
- Use functional components and hooks
- Prefer composition over inheritance
- Include JSDoc comments for public APIs
- Use meaningful variable names
4. Inline Completion Settings (Optional)
Inline completion is enabled by default with these settings:
{
"halcode.inlineCompletion.enabled": true,
"halcode.inlineCompletion.debounceMs": 300,
"halcode.inlineCompletion.maxLines": 3
}
To customize:
- Press `Ctrl+,` to open Settings
- Search for "halcode inline"
- Adjust settings as needed
5. Index Your Project Folder
Important: To use dependency analysis and context features, you need to index your project folder first!
- Press `Ctrl+Shift+P`
- Type "AI Assistant: Index Folder for AI Operations"
- Select your project folder
- Wait for indexing to complete (shows progress notification)
- NEW! Click the X button to cancel indexing anytime
What gets indexed:
- All code files (JS, TS, Python, Java, etc.)
- Project structure and file relationships
- Import/export statements
- File dependencies
Real-time Updates:
- Index automatically updates when you edit, create, or delete files
- No need to re-index manually!
6. Usage
Open the Chat
- Click the AI icon in the Activity Bar (left sidebar)
- Or press `Ctrl+Shift+P` and run "AI Assistant: Open Chat"
Use Inline Completion (NEW! ⚡)
- Start typing in any code file
- Wait 300ms after you stop typing
- See gray suggestion appear inline
- Press Tab to accept the suggestion
- Keep typing to ignore it
Tips:
- Works best with function definitions, variable declarations, and common patterns
- Uses FREE Gemini Flash API (no cost!)
- Caches common patterns for 5 minutes (faster responses)
- Disable anytime in settings: `halcode.inlineCompletion.enabled`
Run Terminal Commands (NEW! 🖥️)
- Press `Ctrl+Shift+P`
- Type "AI Assistant: Run Terminal Command"
- Enter your command (e.g., `npm test`)
- Click "Yes" to approve execution
- See output in a new document
Or let the AI run commands:
- Ask the AI: "Run npm install"
- AI will request approval before executing
- You see the full output in the chat
Search Your Workspace (NEW! 🔎)
- Press `Ctrl+Shift+P`
- Choose from:
- "AI Assistant: Search Workspace" - Search for any pattern
- "AI Assistant: Find API Endpoints" - Find all API routes
- "AI Assistant: Find TODOs" - Find all TODO/FIXME comments
- See results grouped by file with context
Or ask the AI:
- "Search for all API endpoints"
- "Find all TODO comments"
- "Search for function definitions"
Analyze Dependencies (v2.0.0 🎉)
- Open any file in your indexed folder
- Click the dependency buttons in the chat interface:
- 📊 Dependencies - See what this file imports
- 🔍 References - See what files import this file
- 💥 Impact - Get risk analysis and affected file count
- 🕸️ Graph - View visual dependency diagram
Example Impact Analysis:
## 💥 Impact Analysis: App.js
Risk Level: 🟢 LOW
Impact Score: 2 files affected
### 📊 Dependencies
- Direct: 6 files
- Indirect: 0 files
- Total: 6 files this file depends on
### 🔗 Dependents
- Direct: 2 files
- Transitive: 0 files
- Total: 2 files depend on this file
### 💡 Recommendation
✅ Low risk. Safe to modify with standard testing.
Create a New Conversation
- Click the "✨ New Chat" button in the chat header
- Start a fresh conversation with a clean slate
- Your previous conversation is automatically saved
Manage Your Conversations
- Look at the "Conversations" panel in the sidebar
- See all your conversations with timestamps
- Click any conversation to load it instantly
- Each conversation maintains its own context
Search Your Conversations
- Click the 🔍 Search icon in the Conversations panel header
- Enter a search term (searches titles and message content)
- Results filter in real-time
- Click a result to load that conversation
Export a Conversation
- Right-click any conversation in the Conversations panel
- Click "Export Conversation"
- Choose format:
- Markdown (.md) - Great for documentation
- JSON (.json) - For backup and sharing
- HTML (.html) - View in your browser
- File saves to your chosen location
Switch AI Providers
- Look at the provider dropdown in the chat header
- Select Claude or Gemini
- Instantly switches to the selected provider
- Both providers work independently with their own API keys
Analyze Current File
- Open any file
- Press `Ctrl+Shift+P`
- Run "AI Assistant: Analyze Current File"
Refactor with Plan
- Open the file you want to refactor
- Press `Ctrl+Shift+P`
- Run "AI Assistant: Refactor with Agentic Plan"
- Describe your refactoring goal
- Review the plan
- Click "GO" to execute
Use Git Integration (v1.0.23+)
Git features automatically appear when you're in a Git repository!
See Git Status: Look for the branch indicator in the chat header (top right)
- Shows current branch name
- Red dot (●) indicates uncommitted changes
- Click to see full Git status
Quick Git Actions: Use the Git buttons below the input area:
- 🔀 Status - View current Git status with changed files
- 📝 Commit - AI generates commit message from your changes
- 📊 Changes - View diffs with syntax highlighting
- 📜 History - See recent commits
AI Commit Workflow:
- Make code changes
- Click 📝 Commit button
- AI analyzes changes and suggests commit message
- Review the suggested message and file list
- Reply "yes" or "commit" to proceed
- Done! Changes committed with meaningful message
No Git Repository?: Git features automatically hide - no setup needed!
🚀 Three-Workflow Approach (v2.5.0+)
HalCode now supports three complementary AI workflows for maximum productivity:
1️⃣ Multi-Agent Workflow - Large Projects
- Architect plans the structure
- Developer implements the code
- QA reviews and tests
- Best for: Complex projects, team collaboration, thorough planning
- Supported Providers: Claude, Gemini
- Ollama Support: Coming soon (experimental phase)
2️⃣ Agentic Agent Workflow - Automation & Verification
- Automated step-by-step execution
- Detailed progress tracking
- Terminal command support (npm install, tests, etc.)
- Error detection and fixing
- Best for: Running npm install, testing, verification, automation
- Supported Providers: Claude, Gemini
- Ollama Support: Coming soon (experimental phase)
3️⃣ AI Assistant Workflow - Details & Polish
- Natural, conversational tone
- Files written directly to disk
- Context preserved across messages
- No code in chat
- Best for: Filling in details, small features, refinements, follow-ups
- Supported Providers: Claude, Gemini, Ollama (local & cloud) ✅
🎯 Provider Support Matrix
| Feature | Claude | Gemini | Ollama Local | Ollama Cloud |
|---------|--------|--------|--------------|--------------|
| AI Assistant | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| Agentic Mode | ✅ Full | ✅ Full | ❌ Not yet | 🔄 Planned |
| Multi-Agent | ✅ Full | ✅ Full | ❌ Not yet | 🔄 Planned |
| Cost | Pay-per-use | Free tier + paid | Free | Free + paid |
| Speed | Fast | Fast | Slow | Fast |
| Privacy | Cloud | Cloud | Local | Cloud |
Legend:
- ✅ Full = Fully supported and tested
- 🔄 Planned = Coming in future release
- ❌ Not yet = Not supported (use Claude/Gemini instead)
🎯 Recommended Workflow
For Building New Projects:
- Use Multi-Agent (Claude/Gemini) to create the large project structure
- Use Agentic Agent (Claude/Gemini) to run npm install and verify the build
- Use AI Assistant (any provider) to fill in details and add features
For Quick Changes:
- Use AI Assistant with your preferred provider (Claude, Gemini, or Ollama)
For Complex Refactoring:
- Use Multi-Agent (Claude/Gemini) to plan the refactoring
- Use Agentic Agent (Claude/Gemini) to execute changes
- Use AI Assistant (any provider) to polish and finalize
For Budget-Conscious Users:
- Use Ollama Cloud Free with AI Assistant mode
- No API keys needed, just run `ollama signin`
- Perfect for casual coding tasks
💡 Best Practices for AI Assistant
🎯 Recommended Workflow for Best Results
For Quick Changes (AI Assistant Direct):
- Ask AI Assistant to make a change
- If AI instantly says "I'll create/edit X files", reply "yes"
- AI writes files directly to disk ✅
For Larger Projects (Use Agentic Agent):
- Ask AI Assistant for a detailed plan
- AI provides a comprehensive plan with multiple steps
- Reply to the plan message and type "please implement"
- Click the "Agentic" button in the toolbar
- Agentic Agent reviews the plan and asks for approval
- Reply "yes" to execute
- Agentic Agent handles all the work automatically ✅
Example Workflow:
You: "What can we do to improve this site?"
AI: [Provides detailed improvement plan]
You: [Click Reply on that message]
You: "please implement"
You: [Click Agentic button]
Agentic: "Here's what I'll do... Should we proceed?"
You: "yes"
Agentic: [Executes entire plan automatically] ✅
Natural Conversation Flow
- AI Assistant now feels like talking to a colleague
- No need to click "reply on last message" - context is preserved
- Ask follow-up questions naturally
- AI remembers what it created
- NEW: Gemini now remembers entire conversations (v2.6.1)
Keep the AI on Track
When using Agentic Agent mode, if something interrupts the flow or you encounter an issue:
- Click "Reply on last message" - This keeps the conversation context
- Add context to your message - Explain what happened or what you need
- Send the message - The AI will continue from where it left off
Example:
- AI creates files and asks "What would you like me to do next?"
- You get an error when running `npm install`
- Instead of starting a new conversation:
- Click "Reply on last message"
- Type: "I got an error: [paste error]. Can you help me fix this?"
- Send it
- AI continues in the same context and helps you fix it
Why This Matters
- Preserves Context: AI remembers what it just built
- Faster Fixes: No need to re-explain the project
- Better Flow: Keeps the agentic workflow going smoothly
- Fewer Mistakes: AI has full context of what was done
Pro Tips
- ✅ Use "Reply on last message" for follow-up questions
- ✅ Paste full error messages for better debugging
- ✅ Ask for specific fixes, not just "it doesn't work"
- ✅ Let the AI suggest next steps from the post-completion menu
- ❌ Don't start a new conversation mid-task
- ❌ Don't lose context by asking unrelated questions
📋 Commands
| Command | Description |
|---------|-------------|
| AI Assistant: Open Chat | Open the AI chat interface |
| AI Assistant: Analyze Current File | Get AI analysis of the active file |
| AI Assistant: Refactor with Agentic Plan | Perform planned refactoring |
| AI Assistant: Configure System Prompt | Set your personalized instructions |
| AI Assistant: Set API Key | Configure your AI provider API key |
⚙️ Settings
| Setting | Description | Default |
|---------|-------------|---------|
| `personalizedAI.apiProvider` | AI provider (claude, gemini, openai) | claude |
| `personalizedAI.model` | Specific model to use | claude-3-5-sonnet-20241022 |
| `personalizedAI.maxContextTokens` | Maximum context tokens | 150000 |
| `personalizedAI.agenticMode` | Enable plan-before-execute workflow | true |
| `personalizedAI.primaryLanguage` | Your primary programming language | TypeScript |
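For reference, these keys live under the `personalizedAI` namespace, so extension code reads them through the standard VS Code configuration API. A minimal sketch using the defaults from the table above:
```typescript
import * as vscode from "vscode";

// Sketch: reading the settings listed above; fallback values are the table's defaults.
const config = vscode.workspace.getConfiguration("personalizedAI");

const provider = config.get<string>("apiProvider", "claude");
const model = config.get<string>("model", "claude-3-5-sonnet-20241022");
const maxContextTokens = config.get<number>("maxContextTokens", 150000);
const agenticMode = config.get<boolean>("agenticMode", true);

console.log(`Using ${provider}/${model} with a ${maxContextTokens}-token context`);
```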
🎯 Best Practices
1. Create a Style Guide
Create a styleguide.md or CLAUDE.md in your project root:
# Project Style Guide
## Code Style
- Use TypeScript strict mode
- Prefer functional programming
- Maximum line length: 100 characters
## Naming Conventions
- Variables: camelCase
- Classes: PascalCase
- Constants: UPPER_SNAKE_CASE
## Architecture
- Feature-based folder structure
- Separate business logic from UI
- Use dependency injection
2. Use Context Wisely
- Click "Send + Context" to include current file context
- The AI will read your style guides automatically
- Keep related files in the same directory
3. Leverage Agentic Mode
- For complex changes, use "Refactor with Agentic Plan"
- Review the plan carefully before approving
- The AI will show you a diff before applying changes
🔒 Privacy & Security
- API Keys: Stored securely in VS Code's secret storage
- Code Privacy: Your code is only sent to the AI provider you choose
- No Telemetry: This extension doesn't collect or send usage data
🛠️ Development
Want to customize or contribute?
# Clone the repository
git clone <your-repo-url>
# Install dependencies
npm install
# Compile TypeScript
npm run compile
# Run in development mode
# Press F5 in VS Code to launch Extension Development Host
📝 License
MIT License - See LICENSE file for details
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
💡 Tips
- Start Simple: Begin with basic chat interactions to understand the AI's style
- Refine Your Prompt: Iterate on your system prompt to match your preferences
- Use Context: Always include context for better, more relevant responses
- Review Changes: Always review AI-generated code before committing
- Iterate: The AI learns from conversation - don't hesitate to ask for refinements
- Use Git Integration: Let AI generate meaningful commit messages - saves time and improves commit quality!
🆘 Troubleshooting Ollama Cloud
"Model not found" error
Solution: Run ollama signin in your terminal to authenticate with Ollama Cloud
"Rate limit exceeded" error
Solution: You've hit the free tier usage limits. Either:
- Wait for the hourly/weekly limit to reset
- Upgrade to Ollama Cloud Pro ($20/mo) for higher limits
- Use a local model instead
Cloud model is slow
Solution:
- Check your internet connection
- Try a smaller model (e.g., `deepseek-r1:1.5b`)
- Upgrade to Ollama Cloud Pro for faster speeds
"Connection refused" error
Solution: Make sure Ollama is running:
ollama serve
🆘 Support
Having issues? Please open an issue on GitHub with:
- VS Code version
- Extension version
- Steps to reproduce
- Error messages (if any)
- Which AI provider you're using (Claude, Gemini, or Ollama)
📦 Version History
v3.0.11 (Latest) - Ollama Cloud & Reasoning Models 🚀
- NEW: DeepSeek-V3.1 Cloud support - Ultra-fast instant responses via Ollama cloud API
- NEW: Full DeepSeek-R1 reasoning model support (1.5B through 671B variants)
- IMPROVED: Enhanced Ollama integration with better error handling
- IMPROVED: Model loading detection and pre-loading for reasoning models
- IMPROVED: Increased timeout to 10 minutes for complex reasoning tasks
- IMPROVED: Better error messages for Ollama connectivity issues
- FEATURE: Easy model switching between fast (V3.1 Cloud) and reasoning (R1) models
v3.0.10 - Ollama Local Model Support 🦙
- NEW: Ollama integration for running AI models locally without API keys
- NEW: Support for multiple open-source models (Llama 2, Mistral, CodeLlama, Phi, Gemma)
- NEW: Model switching capability between different AI providers
- IMPROVED: Conversation history with local model support
- FEATURE: Free, private, offline-capable AI inference
v3.0.0 - Task List System & AI Assistant Fix ✅
- NEW: Real-time task tracking extracted from AI plans
- NEW: Progress visibility (NOT_STARTED → IN_PROGRESS → COMPLETE)
- FIXED: AI Assistant file detection regex for all response formats
- IMPROVED: Approval flows for both AI Assistant and Agentic workflows
- IMPROVED: Instant feedback with visual checkmarks
v2.6.5 - Bug Fixes & Improvements 🐛
- FIXED: Agentic Agent menu persistence bug (menu no longer reappears after closing)
- FIXED: AI Assistant file approval flow (now consistently asks for approval before writing files)
- IMPROVED: Better keyword detection for file creation requests (works with natural language like "change color")
- IMPROVED: Two-step approval flow for AI Assistant (analyze → ask approval → write files)
- NOTE: Next features in development: AI Assistant task lists, phased task execution, chat history view fixes
v2.6.4 - Organic AI Conversation 🎯
- IMPROVED: Claude and Gemini now operate more organically between chat and file creation
- IMPROVED: Natural conversation flow when switching between chat and code generation
- IMPROVED: Seamless context awareness across chat and file operations
v2.6.1 - Gemini Memory Improvements 💭
- IMPROVED: Gemini now remembers entire conversation history (like Claude)
- IMPROVED: Better context awareness across multi-turn conversations
- IMPROVED: Consistent behavior between Claude and Gemini providers
- IMPROVED: Enhanced agentic workflow reliability with Gemini
- NEW: Fallback handling for isolated messages in agentic workflows
- NOTE: Currently testing and improving features one at a time for best UX
v2.6.0 - Checkpoint System 🎯
- NEW: Checkpoint system for AI Assistant workflow (undo changes anytime!)
- NEW: Interactive [↩️ Revert] buttons on each checkpoint
- NEW: Safe experimentation - try different approaches without fear
- NEW: Automatic checkpoints after file operations
- IMPROVED: Better error handling for checkpoint revert operations
- NOTE: Checkpoints available in AI Assistant workflow only
v2.5.0 - AI Assistant File Writing 🚀
- NEW: AI Assistant can write files directly to disk
- NEW: Natural progress messages ("Creating Cart.jsx... ✅ Done!")
- NEW: Automatic context preservation across messages
- NEW: Task tracking for multi-step operations
- IMPROVED: Seamless workflow with Multi-Agent and Agentic modes
v2.4.0 - Enhanced Agentic Workflow 🤖
- NEW: Detailed progress tracking with step-by-step indicators
- NEW: File-by-file updates showing each file being written
- NEW: Post-completion menu with helpful next actions
- IMPROVED: Better transparency and control over workflow execution
v2.3.0 - Proactive Code Analysis ⚠️
- NEW: Real-time code analysis as you type
- NEW: Yellow squiggly lines for potential issues
- NEW: Quick fixes with lightbulb suggestions
- NEW: Smart detection of missing imports, error handling, deep nesting
v2.2.3 - Conversational AI 💬
- NEW: Human-like conversational AI agents (Architect, Developer, QA)
- NEW: Professional yet warm communication style
- NEW: Natural language responses with "I" and "you"
- NEW: Contextual acknowledgments before responses
- NEW: Personality-driven agent interactions
- IMPROVED: Removed robotic "SYSTEM" messages
- IMPROVED: Natural approval requests and handoffs
- IMPROVED: Better user engagement and understanding
- NEW: Smart file chunking for large files (70% memory reduction)
- NEW: Persistent embedding cache (10x faster searches)
- NEW: Progressive indexing with priority-based processing
- NEW: Token budget protection for huge projects
- NEW: Smart summarization for very large files (>500KB)
- IMPROVED: Handles 10,000+ file projects efficiently
- IMPROVED: Detailed indexing statistics and skip reporting
v2.2.0 - Code Generation Intelligence 🎨
- NEW: Advanced code generation with pattern recognition
- NEW: Intelligent import suggestions
- NEW: Duplicate code detection
- NEW: Refactoring suggestions
- IMPROVED: Better code quality and consistency
v1.0.23 - Git Integration 🔀
- NEW: AI-generated commit messages from code changes
- NEW: Git status indicator in chat header
- NEW: View diffs directly in chat with syntax highlighting
- NEW: Quick Git actions (Status, Commit, Changes, History)
- NEW: Real-time Git status monitoring
- NEW: Conventional Commits format support
- FEATURE: Smart commit workflow with AI analysis and approval
💎 Free vs Pro
| Feature | Free Tier | Pro ($9.99/mo) |
|---------|-----------|----------------|
| Chat Messages | 50/day | ♾️ Unlimited |
| Inline Completions | 20/day | ♾️ Unlimited |
| Terminal Commands | 10/day | ♾️ Unlimited |
| Workspace Search | ♾️ Unlimited | ♾️ Unlimited |
| Git Integration | ♾️ Unlimited | ♾️ Unlimited |
| Multi-Agent Workflows | ❌ 7-day trial | ✅ Unlimited |
| Agentic Workflows | ❌ 7-day trial | ✅ Unlimited |
| Folder Indexing | ❌ 7-day trial | ✅ Unlimited |
| Custom Coding Rules | ❌ 7-day trial | ✅ Unlimited |
| 200K Context Engine | ✅ Yes | ✅ Yes |
| Conversational AI | ✅ Yes | ✅ Yes |
🎉 7-Day Pro Trial - Try all Pro features free for 7 days!
🆚 HalCode vs Competitors
| Feature | Enterprise Tools* | HalCode Pro | Winner |
|---------|-------------------|-------------|--------|
| Inline Completion | ✅ | ✅ | 🤝 TIE |
| Terminal Integration | ✅ | ✅ | 🤝 TIE |
| Workspace Search | ✅ | ✅ | 🤝 TIE |
| 200K Context Engine | ✅ | ✅ | 🤝 TIE |
| Multi-Agent System | ✅ | ✅ | 🤝 TIE |
| Conversational AI | ❌ | ✅ | 🏆 HALCODE |
| Pricing | $50-100/mo | $9.99/mo | 🏆 HALCODE |
| Customization | Limited | Full Control | 🏆 HALCODE |
| Free Tier | ❌ No | ✅ Yes | 🏆 HALCODE |
💰 Save 80-90% compared to enterprise AI coding tools!
*Compared to similar enterprise AI coding assistants
v1.0.22 - Multi-Agent Enhancements 🤖
- NEW: Real-time agent status cards with visual feedback
- NEW: Agent progress tracking with pulsing animations
- NEW: Color-coded agent states (working, complete, error)
- IMPROVED: Enhanced multi-agent collaboration UI/UX
- IMPROVED: Better error handling and retry mechanisms
- IMPROVED: Deferred heavy initialization for faster startup
- IMPROVED: LRU cache eviction for embeddings (100 entry limit)
- IMPROVED: Chat history limited to last 50 messages on restore
- IMPROVED: Expanded file indexing exclusions
- IMPROVED: Better .vscodeignore configuration
v1.0.20 - UI/UX Improvements ✨
- NEW: Smooth animations and transitions
- NEW: Message timestamps
- NEW: Toast notifications for user feedback
- NEW: Progress indicators for long operations
- NEW: Enhanced code blocks with copy buttons
- IMPROVED: Better message styling and layout
v1.0.9 - Multi-Agent Collaboration 🤖👥
- NEW: Three-agent system (Architect, Developer, QA)
- NEW: Collaborative code generation workflow
- NEW: User approval before implementation
- NEW: Automatic file creation
Earlier Versions
- v1.0.8: Chat history, search, and export features
- v1.0.7: Conversation management
- v1.0.6: Deep context awareness
- v1.0.5: Style guide integration
- v1.0.0: Initial release
Built with ❤️ for developers who want AI that truly understands their code