VibeCoding
ViveCode Labs | Free

Local AI coding assistant for VS Code with Ollama, Hugging Face, and OpenAI-compatible backends.
VibeCoding – Local AI Coding Assistant for VS Code

Free. No token limits. Your AI, your choice.

VibeCoding brings Claude-like AI code assistance to VS Code without paying for tokens or hitting subscription limits. Choose your AI backend—Ollama, LM Studio, or Hugging Face Inference—and start coding with intelligence.

🎯 Features

  • Zero Cost: Local AI models, no API tokens, no per-request billing
  • Flexible Backends: Switch between Ollama, LM Studio, or Hugging Face Inference
  • Code Generation: AI writes functions, refactors, debugs, and explains code
  • File System Tools: AI can create folders, write files, and run terminal commands
  • Privacy-First: All conversations stay on your machine
  • Multi-Model Support: Use any open-source LLM (Llama 3.2, Mistral, Phi-3, TinyLlama, etc.)

⚡ Quick Start (5 Minutes)

1. Install One Backend

Choose ONE of these (all free, all local):

Ollama (Easiest for Mac/Linux)

  • Download: ollama.com
  • Run: ollama run llama3.2
  • That's it! VibeCoding auto-detects it.
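Before opening VS Code, you can sanity-check the Ollama API from a terminal. Port 11434 is Ollama's documented default API port (an assumption here; the quick-start above doesn't state it):

```shell
# Confirm the local Ollama API that VibeCoding auto-detects is up.
# /api/tags lists installed models as JSON on Ollama's default port.
if curl -sf http://localhost:11434/api/tags >/dev/null 2>&1; then
  echo "ollama: reachable"
else
  echo "ollama: not reachable (start it with 'ollama serve')"
fi
```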

LM Studio (Best for Windows/Mac)

  • Download: lmstudio.ai
  • Download a model, click "Start Server"
  • VibeCoding auto-connects to localhost:1234
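Once "Start Server" is clicked, you can verify the endpoint from a terminal. `/v1/models` is the standard model-listing route on LM Studio's OpenAI-compatible server:

```shell
# List the models LM Studio is serving on its default port.
# Prints a fallback message if the server hasn't been started yet.
curl -sf http://localhost:1234/v1/models || echo "LM Studio server not running"
```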

Hugging Face Inference (Advanced)

  • See INSTALLATION.md for setup

2. Install VibeCoding

  1. Open VS Code
  2. Go to Extensions (Cmd+Shift+X / Ctrl+Shift+X)
  3. Search for VibeCoding
  4. Click Install

3. Start Chatting

  1. Open the VibeCoding chat panel (search "Open VibeCoding Chat" in command palette)
  2. Pick your backend in settings (if not auto-detected)
  3. Ask it to code!

Example prompts:

  • "Write a function that sorts an array by date"
  • "Explain this code to me"
  • "Write unit tests for this function"
  • "Create a new folder structure for a React app"
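The same prompts can be sent straight to a backend from the terminal to confirm it responds before using the chat panel. This sketch targets LM Studio's OpenAI-compatible endpoint; the model name is a placeholder for whichever model you actually loaded:

```shell
# Hypothetical smoke test against LM Studio's OpenAI-compatible API.
# Replace "your-loaded-model" with the model name shown in the app.
curl -s http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "your-loaded-model",
        "messages": [
          {"role": "user", "content": "Write a function that sorts an array by date"}
        ]
      }' || echo "backend not reachable"
```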

🔧 Configuration

All settings are in Settings → VibeCoding:

| Setting | Default | Options |
| --- | --- | --- |
| Provider | Ollama | ollama, lmStudio, hfAgent |
| Ollama Model | llama3.2 | Any model you've pulled |
| LM Studio URL | http://localhost:1234/v1 | Your LM Studio server |
| HF Agent URL | http://localhost:8000 | Your HF server |
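In settings.json form, a typical setup might look like the sketch below. The setting IDs are assumptions inferred from the table above; check the extension's settings UI for the real keys:

```json
{
  // Hypothetical setting IDs; confirm against VibeCoding's settings UI.
  "vibecoding.provider": "ollama",
  "vibecoding.ollamaModel": "llama3.2",
  "vibecoding.lmStudioUrl": "http://localhost:1234/v1",
  "vibecoding.hfAgentUrl": "http://localhost:8000"
}
```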

📚 Documentation

  • QUICKSTART.md — 30-second setup for impatient developers
  • INSTALLATION.md — Detailed backend setup, system requirements, troubleshooting
  • ONBOARDING.md — First-time user guide, best practices, examples
  • ARCHITECTURE.md — How VibeCoding works, extension structure, developer guide
  • PUBLISHING_GUIDE.md — How to publish similar extensions to VS Code Marketplace

💡 Why VibeCoding?

Problem: AI code assistance tools charge per token or limit free usage.

Solution: Run AI models locally on your hardware. No limits, no billing, full control.

| Feature | VibeCoding | GitHub Copilot | Cursor |
| --- | --- | --- | --- |
| Cost | Free | $10–20/mo | $20/mo |
| Token Limits | Unlimited | 50 req/mo (free) | Generous free tier |
| Model Choice | Any OSS model | GPT-4o | Claude |
| Privacy | Local (on your machine) | Sent to GitHub | Some local caching |
| Works Offline | ✅ Yes | ❌ No | ❌ No |

🚀 Supported Models

VibeCoding works with any open-source LLM. Popular choices:

| Model | Size | Speed | Quality | Download |
| --- | --- | --- | --- | --- |
| Llama 3.2 | 3B–70B | ⚡⚡⚡ | ⭐⭐⭐ | Ollama: ollama run llama3.2 |
| Mistral | 7B–72B | ⚡⚡ | ⭐⭐⭐⭐ | LM Studio / Ollama |
| Phi-3 | 3.8B–14B | ⚡⚡⚡ | ⭐⭐⭐ | LM Studio / Ollama |
| TinyLlama | 1.1B | ⚡⚡⚡⚡ | ⭐⭐ | Ollama: ollama run tinyllama |

Recommendation for First Time:

  • Mac M1/M2/M3: Start with ollama run llama3.2 (fast, accurate, and Ollama uses the Apple-silicon GPU automatically)
  • Windows/Linux GPU: LM Studio + Mistral (best balance of speed and quality)
  • Limited RAM (<8GB): TinyLlama (small, still surprisingly smart)

🐛 Troubleshooting

"Provider not available" error

  • Is your backend running? Check:
    • Ollama: Is it started? (ollama serve in terminal)
    • LM Studio: Did you click "Start Server"?
    • HF Agent: Did you start the Python server? (python scripts/hf_server.py)
  • Is it on the right port?
    • Ollama: http://localhost:11434
    • LM Studio: http://localhost:1234/v1
    • HF Agent: http://localhost:8000
  • Check VS Code settings: VibeCoding → Provider → select correct backend
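All three default endpoints can be probed in one pass from a terminal (the paths are illustrative health-check routes, not documented VibeCoding behavior):

```shell
# Probe each backend's default port; "down" means that backend needs
# starting (or can be ignored if you use a different one).
for url in http://localhost:11434 \
           http://localhost:1234/v1/models \
           http://localhost:8000; do
  if curl -sf --max-time 2 "$url" >/dev/null 2>&1; then
    echo "up:   $url"
  else
    echo "down: $url"
  fi
done
```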

"No model loaded" error

  • Ollama: Pull a model first: ollama run llama3.2
  • LM Studio: Download a model inside the app, click "Start Server"
  • HF Agent: Specify a model: python scripts/hf_server.py --model TinyLlama/TinyLlama-1.1B-Chat-v1.0

AI responses are slow

  • Try a smaller model (TinyLlama instead of 70B Llama)
  • Check if CPU is maxed out (GPU not being used)
  • Lower max_tokens in chat settings; a high cap lets the model generate long responses that take proportionally longer
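For the Ollama backend, `ollama ps` reports whether the loaded model landed on GPU or CPU (the PROCESSOR column). This is standard Ollama CLI, independent of VibeCoding:

```shell
# Show loaded models and whether they run on GPU or CPU.
# Guarded so the snippet degrades gracefully if Ollama isn't installed.
if command -v ollama >/dev/null 2>&1; then
  ollama ps
else
  echo "ollama not installed"
fi
```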

Terminal commands fail

  • The AI has safety restrictions to prevent dangerous commands
  • Blocked: rm -rf /, sudo, format, etc.
  • Ask it to explain the command instead, then run it yourself
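The guard might work along these lines. This is a hypothetical sketch of the idea, not VibeCoding's actual implementation; the pattern list comes from the blocked examples above:

```shell
# Hypothetical sketch of a dangerous-command guard (not VibeCoding's
# real code). Returns 0 (blocked) for patterns like those listed above.
is_blocked() {
  case "$1" in
    *"rm -rf /"* | sudo\ * | sudo | *format*) return 0 ;;
    *) return 1 ;;
  esac
}

is_blocked "sudo apt upgrade" && echo "blocked" || echo "allowed"  # prints "blocked"
is_blocked "ls -la"           && echo "blocked" || echo "allowed"  # prints "allowed"
```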

📖 Resources

  • VS Code Extension API: https://code.visualstudio.com/api
  • Ollama Docs: https://ollama.ai
  • LM Studio: https://lmstudio.ai
  • Hugging Face Models: https://huggingface.co/models?pipeline_tag=text-generation
  • Official Website: https://vivecode.dev

📝 License

MIT License – Use freely, modify, distribute.

🤝 Contributing

Found a bug? Want to add a feature? Open an issue or PR on GitHub.


Ready to code with local AI?

  1. Install a backend (Ollama recommended for Mac)
  2. Install VibeCoding from the VS Code Marketplace
  3. Open chat and start building

No tokens. No limits. Just AI.
