# Groq AI Assistant for VS Code

A high-performance, premium AI coding assistant designed for speed and precision. Leveraging the Groq LPU™ Inference Engine, this extension provides sub-second responses and a workspace-aware developer experience.
## Installation & Setup

### For Users
- Install the extension from the VS Code Marketplace.
- Open Settings (`Ctrl+,`) and search for **Groq Chat**.
- Enter your Groq API Key (obtainable from the Groq Console).
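Alternatively, you can set the key directly in `settings.json` (the key value below is a placeholder; the setting names match the Configuration Options table):

```json
{
  "groqChat.apiKey": "gsk_your_key_here",
  "groqChat.model": "llama-3.3-70b-versatile"
}
```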
### For Developers

- Clone the repository: `git clone https://github.com/Nishant7Adhikari/Grok-Extension.git`
- Run `npm install` to install dependencies.
- Press `F5` in VS Code to launch the extension in a development host.
## ✨ Key Features

### 🚀 Ultra-Low Latency Streaming
Experience true real-time interaction. Responses are streamed token by token using Server-Sent Events (SSE), so output starts appearing immediately even for complex architectural queries.
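As an illustration, parsing the `data:` lines of an SSE response can be sketched like this. The chunk shape below assumes the OpenAI-compatible format Groq's API serves; the names are illustrative, not taken from the extension's source:

```typescript
// Minimal SSE "data:" line parser for a streamed chat completion.
// Assumes the OpenAI-compatible chunk shape; the extension's actual
// implementation may differ.
interface StreamChunk {
  choices: { delta: { content?: string } }[];
}

// Extract the text deltas from a raw SSE buffer.
function extractDeltas(sseBuffer: string): string[] {
  const deltas: string[] = [];
  for (const line of sseBuffer.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blanks and comments
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "[DONE]") break;            // end-of-stream sentinel
    const chunk = JSON.parse(payload) as StreamChunk;
    const content = chunk.choices[0]?.delta.content;
    if (content !== undefined) deltas.push(content);
  }
  return deltas;
}
```

Appending each delta to the chat view as it arrives is what makes the output feel instantaneous.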
### 🧠 Intelligent Context Engine

- **Active File Prioritization**: Automatically injects the contents of your active editor into the context.
- **Workspace Scanning**: Identifies and indexes relevant neighboring files to provide project-wide insights.
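The prioritization above can be sketched as a small pure function. The names and the character budget here are illustrative assumptions, not the extension's actual internals:

```typescript
interface WorkspaceFile {
  path: string;
  content: string;
}

// Assemble a context string: the active file first, then neighboring
// files, truncated to a fixed character budget. Illustrative sketch only.
function buildContext(
  active: WorkspaceFile,
  neighbors: WorkspaceFile[],
  budget: number
): string {
  let context = "";
  for (const file of [active, ...neighbors]) {
    const block = `// ${file.path}\n${file.content}\n`;
    if (context.length + block.length > budget) break; // stop at budget
    context += block;
  }
  return context;
}
```

Putting the active file first means that when the budget runs out, it is the distant neighbors that get dropped, not the code you are editing.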
### 🌐 Integrated Web Search
Toggle real-time internet access within the chat. Perfect for checking live documentation, current API statuses, or trending tech stacks without leaving your IDE.
### 🛡️ Safe Diff Preview
Apply AI-generated code with confidence. Every "Apply" action opens a native VS Code diff view, allowing you to review, edit, and approve changes side-by-side.
### ⚡ Snappy Inline Completions
Low-latency ghost-text completions with an optimized 300ms debounce and multi-line prediction capabilities.
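The debounce pattern can be sketched with a generic helper like the one below (a sketch of the technique, not the extension's code; `fetchGhostText` is a hypothetical name):

```typescript
// Generic trailing-edge debounce: the wrapped function only fires once
// no new call has arrived for `delayMs` milliseconds.
function debounce<T extends unknown[]>(
  fn: (...args: T) => void,
  delayMs: number
): (...args: T) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    if (timer !== undefined) clearTimeout(timer); // cancel the pending call
    timer = setTimeout(() => fn(...args), delayMs);
  };
}

// Usage sketch: a completion request would fire 300 ms after the last keystroke.
// const requestCompletion = debounce(fetchGhostText, 300);
```

Debouncing keeps the extension from issuing an API request on every keystroke while still feeling instant once you pause.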
### ⌨️ Command Palette & Keyboard Shortcuts

| Feature | Keybinding (Windows/Linux) | Keybinding (macOS) |
| --- | --- | --- |
| Open Chat Panel | `Ctrl+Alt+G` | `Cmd+Alt+G` |
| Accept Completion | `Tab` | `Tab` |
| Quick Toggle Menu | Click Status Bar | Click Status Bar |
| Show Commands | `Ctrl+Shift+P` | `Cmd+Shift+P` |
## 🛠️ Troubleshooting & Best Practices

### 🚿 Managing Rate Limits
Groq provides high-speed inference but has specific Tokens Per Minute (TPM) and Requests Per Minute (RPM) limits.
- **Problem**: The cooldown timer resets or stays in a loop.
- **Solution**: This usually happens if you retry precisely as the timer hits zero while your token bucket is still empty. The extension now backs off progressively, adding 5 s per failed attempt. To avoid the loop, wait 2–3 seconds after the timer clears before sending a new message.
- **Tip**: Disable Internet Search for local logic questions to significantly reduce token consumption.
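The "+5 s per failure" rule above can be modelled as a small pure function (a sketch under that assumption; the base cooldown of 1 s is illustrative, not the extension's exact value):

```typescript
// Cooldown length after `failures` consecutive rate-limit errors.
// Mirrors the documented rule: a base cooldown plus 5 s per failure.
// The 1 s base is an assumed illustrative value.
function cooldownMs(failures: number, baseMs: number = 1000): number {
  return baseMs + failures * 5000;
}
```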
### 🍱 Optimizing Smart Context

- **Problem**: The AI forgets recent code or gives irrelevant answers.
- **Solution**: The extension prioritizes the active file. If you are working across multiple files, briefly click into the relevant file tab to bring it into the "active" context, then return to the chat.
- **Tip**: Close unrelated large files (such as logs or build artifacts) to keep them from filling up the AI's context window.
### 🌐 Search Accuracy

- **Problem**: Search doesn't find a specific recent event.
- **Solution**: Web search uses DuckDuckGo's Instant Answer API, which is best for documentation, stock prices, and major headlines. For highly niche or very local news, include specific city or target names in your prompt.
- **Problem**: The chat panel shows a logo but no input or messages.
- **Solution**: Ensure your API Key is set in Settings. If the panel is still blank, use the `Ctrl+Alt+G` shortcut to force-initialize the view, or reload VS Code (**Developer: Reload Window**).
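For reference, a query against DuckDuckGo's Instant Answer endpoint is just a GET request with `format=json`. This URL-building helper is a sketch, not the extension's code:

```typescript
// Build a DuckDuckGo Instant Answer API URL for a search query.
// The endpoint and parameters follow DuckDuckGo's public API;
// the helper itself is an illustrative sketch.
function buildDuckDuckGoUrl(query: string): string {
  const params = new URLSearchParams({
    q: query,        // the search terms
    format: "json",  // ask for a JSON payload
    no_html: "1",    // strip HTML from answer text
  });
  return `https://api.duckduckgo.com/?${params.toString()}`;
}
```

The response's `AbstractText` field, when present, carries the instant-answer summary the chat can surface.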
## ⚙️ Configuration Options

| Setting | Description | Default |
| --- | --- | --- |
| `groqChat.apiKey` | Your Groq Cloud API Key. | `""` |
| `groqChat.model` | The AI model to use. | `llama-3.3-70b-versatile` |
| `groqChat.enableChat` | Globally enable/disable the chat UI. | `true` |
| `groqChat.enableInlineSuggestions` | Toggle predictive ghost-text. | `true` |
## Author
Nishant Adhikari
## License
This project is licensed under the MIT License. See the LICENSE file for the full text.
Disclaimer: Groq is a trademark of Groq, Inc. This extension is an independent community project.