VibeCoding – Local AI Coding Assistant for VS Code
Free. No token limits. Your AI, your choice.
VibeCoding brings Claude-like AI code assistance to VS Code without paying for tokens or hitting subscription limits. Choose your AI backend—Ollama, LM Studio, or Hugging Face Inference—and start coding with intelligence.
🎯 Features
- Zero Cost: Local AI models, no API tokens, no per-request billing
- Flexible Backends: Switch between Ollama, LM Studio, or Hugging Face Inference
- Code Generation: AI writes functions, refactors, debugs, and explains code
- File System Tools: AI can create folders, write files, and run terminal commands
- Privacy-First: All conversations stay on your machine
- Multi-Model Support: Use any open-source LLM (Llama 3.2, Mistral, Phi-3, TinyLlama, etc.)
⚡ Quick Start (5 Minutes)
1. Install One Backend
Choose ONE of these (all free, all local):
Ollama (Easiest for Mac/Linux)
- Download: ollama.com
- Run: `ollama run llama3.2`
- That's it! VibeCoding auto-detects it.
LM Studio (Best for Windows/Mac)
- Download: lmstudio.ai
- Download a model, click "Start Server"
- VibeCoding auto-connects to `localhost:1234`
Hugging Face Inference (Advanced)
- Start the bundled Python server: `python scripts/hf_server.py` (serves on `http://localhost:8000` by default)
- Select the `hfAgent` provider in VibeCoding's settings
2. Install VibeCoding
- Open VS Code
- Go to Extensions (Cmd+Shift+X / Ctrl+Shift+X)
- Search for VibeCoding
- Click Install
3. Start Chatting
- Open the VibeCoding chat panel (search "Open VibeCoding Chat" in command palette)
- Pick your backend in settings (if not auto-detected)
- Ask it to code!
Example prompts:
- "Write a function that sorts an array by date"
- "Explain this code to me"
- "Write unit tests for this function"
- "Create a new folder structure for a React app"
🔧 Configuration
All settings are in Settings → VibeCoding:
| Setting | Default | Options |
| --- | --- | --- |
| Provider | Ollama | `ollama`, `lmStudio`, `hfAgent` |
| Ollama Model | `llama3.2` | Any model you've pulled |
| LM Studio URL | `http://localhost:1234/v1` | Your LM Studio server |
| HF Agent URL | `http://localhost:8000` | Your HF server |
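The same options can be set directly in `settings.json`. The key names below are an assumption based on the setting labels above, not confirmed IDs; check the extension's contributed settings for the exact keys:

```json
{
  // Hypothetical key names -- verify against the extension's settings UI.
  "vibecoding.provider": "ollama",
  "vibecoding.ollamaModel": "llama3.2",
  "vibecoding.lmStudioUrl": "http://localhost:1234/v1",
  "vibecoding.hfAgentUrl": "http://localhost:8000"
}
```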
📚 Documentation
💡 Why VibeCoding?
Problem: AI code assistance tools charge per token or limit free usage.
Solution: Run AI models locally on your hardware. No limits, no billing, full control.
| Feature | VibeCoding | GitHub Copilot | Cursor |
| --- | --- | --- | --- |
| Cost | Free | $10–20/mo | $20/mo |
| Token Limits | Unlimited | 50 req/mo (free) | Generous free tier |
| Model Choice | Any OSS model | GPT-4o | Claude |
| Privacy | Local (on your machine) | Sent to GitHub | Some local caching |
| Works Offline | ✅ Yes | ❌ No | ❌ No |
🚀 Supported Models
VibeCoding works with any open-source LLM. Popular choices:
| Model | Size | Speed | Quality | Download |
| --- | --- | --- | --- | --- |
| Llama 3.2 | 3B–70B | ⚡⚡⚡ | ⭐⭐⭐ | Ollama: `ollama run llama3.2` |
| Mistral | 7B–72B | ⚡⚡ | ⭐⭐⭐⭐ | LM Studio / Ollama |
| Phi-3 | 3.8B–14B | ⚡⚡⚡ | ⭐⭐⭐ | LM Studio / Ollama |
| TinyLlama | 1.1B | ⚡⚡⚡⚡ | ⭐⭐ | Ollama: `ollama run tinyllama` |
Recommendation for First Time:
- Mac M1/M2/M3: Start with `ollama run llama3.2` (fast, accurate, auto-optimizes for GPU)
- Windows/Linux GPU: LM Studio + Mistral (best balance of speed and quality)
- Limited RAM (<8GB): TinyLlama (small, still surprisingly smart)
🐛 Troubleshooting
"Provider not available" error
- Is your backend running? Check:
  - Ollama: Is it started? (`ollama serve` in a terminal)
  - LM Studio: Did you click "Start Server"?
  - HF Agent: Did you start the Python server? (`python scripts/hf_server.py`)
- Is it on the right port?
  - Ollama: `http://localhost:11434`
  - LM Studio: `http://localhost:1234/v1`
  - HF Agent: `http://localhost:8000`
- Check VS Code settings: VibeCoding → Provider → select correct backend
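If you're not sure which port is live, a quick probe from a terminal narrows it down. A minimal sketch: the Ollama and LM Studio paths are their standard model-list endpoints, while the HF Agent path assumes the bundled server answers on its root URL.

```shell
# Probe each default endpoint; a reply means that backend is reachable.
probe() { curl -s --max-time 2 "$1" > /dev/null && echo "$2: up" || echo "$2: not responding"; }

probe http://localhost:11434/api/tags "Ollama"      # Ollama's model-list endpoint
probe http://localhost:1234/v1/models "LM Studio"   # OpenAI-compatible model list
probe http://localhost:8000           "HF Agent"    # assumes the server answers on /
```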
"No model loaded" error
- Ollama: Pull a model first: `ollama run llama3.2`
- LM Studio: Download a model inside the app, then click "Start Server"
- HF Agent: Specify a model: `python scripts/hf_server.py --model TinyLlama/TinyLlama-1.1B-Chat-v1.0`
AI responses are slow
- Try a smaller model (TinyLlama instead of 70B Llama)
- Check if CPU is maxed out (GPU not being used)
- Lower `max_tokens` in the chat settings to cap response length (long generations take proportionally longer)
Terminal commands fail
- The AI has safety restrictions to prevent dangerous commands
- Blocked: `rm -rf /`, `sudo`, `format`, etc.
- Ask it to explain the command instead, then run it yourself
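The guard can be pictured as a simple pattern blocklist. This is a hypothetical sketch of that idea, not the extension's actual implementation; the patterns come from the blocked list above:

```shell
# Hypothetical sketch of a command blocklist -- the real extension's
# filtering logic may differ. Returns 0 (blocked) or 1 (allowed).
is_blocked() {
  case "$1" in
    *"rm -rf /"*|"sudo "*|*"format"*) return 0 ;;  # dangerous patterns (coarse: "format" also matches e.g. "formatter")
    *) return 1 ;;
  esac
}

is_blocked "sudo rm -rf /" && echo "blocked" || echo "allowed"  # prints "blocked"
```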
📖 Resources
📝 License
MIT License – Use freely, modify, distribute.
🤝 Contributing
Found a bug? Want to add a feature? Open an issue or PR on GitHub.
Ready to code with local AI?
- Install a backend (Ollama recommended for Mac)
- Install VibeCoding from the VS Code Marketplace
- Open chat and start building
No tokens. No limits. Just AI.