AI Code Assistance

Alex Mendes

AI Code Assistance — sidebar chat with real tool-calling (read/write files, run commands), slash commands for EDA, ML, RAG, fine-tuning and agents, and role-based routing. Bring your own keys: OpenRouter, OpenAI, Groq, Gemini, or local Ollama/LM Studio. Pick which model each task uses.

AI Code Assistance — General Coding + Data + Data Science + AI Engineering

A vibe-coding assistant for VS Code that goes beyond simple chat. AI Code Assistance covers four areas:

  • General coding — explain, refactor, generate, fix code in any language
  • Data analytics — inventory datasets, frame business questions, drill into root causes, generate decks
  • Data Science — clean, EDA, build/train/tune ML models, explain them with SHAP/LIME
  • AI Engineering — production RAG, prompt engineering, LoRA fine-tuning, LLM evaluation, agents

It is powered by a local backend (a small FastAPI router you run on your machine) that talks directly to the model provider of your choice. Bring your own keys for OpenRouter, OpenAI, Groq, or Gemini, or run fully local with Ollama / LM Studio. There is no Colab, no tunnel, and no third-party server in the middle.

The assistant responds in whatever language you write in — English, Portuguese, Italian, Spanish, French, or any other. Just type naturally.


Architecture

   ┌──────────────────────┐          ┌──────────────────────┐        ┌──────────────────┐
   │ VS Code (you)        │  HTTP/   │ Local backend        │  HTTPS │ Provider          │
   │  ┌────────────────┐  │  SSE     │ (FastAPI router on   │ ─────▶ │ OpenRouter/OpenAI │
   │  │ AI Code chat   │◄─┼──────────┼─ 127.0.0.1:8001)     │        │ Groq/Gemini  …or… │
   │  └────────────────┘  │          │  role-based routing  │ ─────▶ │ Ollama / LM Studio│
   │  - 14 slash skills   │          │  per-task model cfg  │        │ (local, no key)   │
   │  - Tool calling      │          └──────────────────────┘        └──────────────────┘
   │  - Voice + vision    │
   └──────────────────────┘

Each task type (planner / coder / vision / general / agentic) is routed to the model you assigned to it. Tools execute only on your machine: the backend just emits the model's intents.
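
The routing idea above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the `ROLE_MODELS` mapping, the `Route` type, the `resolve_route` function, the placeholder local model name, and the fallback to the `general` role are all assumptions. Only the five role names and the coder model name come from this page.

```python
from dataclasses import dataclass

# The five roles named in this README.
ROLES = ("planner", "coder", "vision", "general", "agentic")


@dataclass(frozen=True)
class Route:
    provider: str
    model: str


# Hypothetical per-role assignments; the real config schema may differ.
ROLE_MODELS = {
    "coder": Route("openrouter", "qwen/qwen-2.5-coder-32b-instruct"),
    "general": Route("ollama", "example-local-model"),  # placeholder model name
}


def resolve_route(role: str, table: dict[str, Route] = ROLE_MODELS) -> Route:
    """Return the provider/model assigned to a role.

    Falling back to the "general" role is an assumption made for this sketch.
    """
    if role not in ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return table.get(role, table["general"])
```

In this sketch, a role without its own assignment (for example `vision`) simply reuses the `general` route, so a single configured model is enough to get started.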


What you can do — the four areas

🧑‍💻 General coding (works everywhere)

  • Open Chat: dedicated sidebar chat with an AI Code Assistance icon in the Activity Bar (same UX as Copilot Chat / Claude Code). Click and use, no setup ritual.
  • Inline commands:
    • AI Code Assistance: Explain Selection (Ctrl+Alt+E)
    • AI Code Assistance: Refactor Selection (Ctrl+Alt+R)
    • AI Code Assistance: Generate at Cursor
  • Tool calling: the model can read/write files, run bash, grep, glob, fetch the web, etc. Every tool call shows a confirmation modal with Accept / Always / Deny. For dangerous bash patterns (rm -rf /, dd, fork bombs), the "Always" option is blocked as a safety net, so such commands cannot be auto-approved.
  • MCP support: connect any MCP server via .mcp.json in your workspace root — both stdio and HTTP transports supported.
  • Voice input and image attachments for vision-capable models.
  • Persistent chat history across sessions.
  • Multilingual: responds in the language you write in — no configuration needed.
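
The safety net for dangerous bash patterns mentioned in the list above could look something like the sketch below. The regular expressions and the `approval_options` function are hypothetical; the extension's real pattern list is not documented on this page, and these three patterns only cover the examples it names.

```python
import re

# Illustrative patterns only; the extension's real list is not documented here.
DANGEROUS_PATTERNS = [
    re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f[a-zA-Z]*\s+/(\s|$)"),  # rm -rf /
    re.compile(r"\bdd\s+if="),                                       # dd reading from a device or file
    re.compile(r":\(\)\s*\{\s*:\|:&\s*\};\s*:"),                     # classic bash fork bomb
]


def approval_options(command: str) -> list[str]:
    """Return the buttons a confirmation prompt would offer for a bash command."""
    dangerous = any(p.search(command) for p in DANGEROUS_PATTERNS)
    # "Always" is withheld for dangerous commands, so each run must be confirmed.
    return ["Accept", "Deny"] if dangerous else ["Accept", "Always", "Deny"]
```

A command such as `rm -rf ./build` does not match these patterns and would still offer "Always"; a production list would need to be broader and more careful than this.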

📊 Data analytics — /explore, /question, /chart, /drill, /report

| Slash | What it does |
| --- | --- |
| /explore | Inventory a dataset, profile every column, surface quality blockers (nulls, duplicates, outliers), recommend feasible analyses |
| /question | Turn a vague business problem into 5-10 testable analytical questions ranked by Impact × Feasibility, with hypotheses for the top 3 |
| /chart | Generate Storytelling-with-Data styled charts: action titles, max 2 colors + gray, direct labels, no clutter |
| /drill | Root cause investigation with "peel the onion" methodology: confirm → decompose → isolate → repeat, until you find the specific actionable cause |
| /report | Consolidate the full analysis into a business-ready final_report.md + executable_analysis.ipynb |
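
As a rough illustration of the Impact × Feasibility ranking that /question describes, the sketch below scores and sorts a few questions. The `Question` class, the 1 to 5 scales, and the sample questions are all invented for this example; the skill itself may score questions differently.

```python
from dataclasses import dataclass


@dataclass
class Question:
    text: str
    impact: int       # 1 (low) to 5 (high); the scale is an assumption
    feasibility: int  # 1 (hard) to 5 (easy); the scale is an assumption

    @property
    def score(self) -> int:
        return self.impact * self.feasibility


def rank_questions(questions: list[Question]) -> list[Question]:
    """Sort questions from highest to lowest Impact x Feasibility score."""
    return sorted(questions, key=lambda q: q.score, reverse=True)


# Made-up sample questions, for illustration only.
questions = [
    Question("Did churn rise in one region only?", impact=4, feasibility=5),
    Question("Is pricing the main churn driver?", impact=5, feasibility=2),
    Question("Do support tickets predict churn?", impact=3, feasibility=4),
]
ranked = rank_questions(questions)
```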

🔬 Data Science (ML) — /engineer, /eda, /model, /explain

| Slash | What it does |
| --- | --- |
| /engineer | Data cleaning, missing values, outlier detection (IQR / Z-score), duplicates, feature engineering driven by business value |
| /eda | Exploratory analysis: descriptive stats, distributions, correlations (Pearson/Spearman), pattern discovery, hypothesis formulation |
| /model | Model selection (linear / trees / boosting / NN), train/val/test splits, hyperparameter tuning, evaluation metrics matched to the business goal |
| /explain | Multi-method feature importance: built-in + permutation + SHAP + LIME + Partial Dependence + interactions + stability checks |
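
To make the IQR outlier rule mentioned under /engineer concrete, here is a small standard-library sketch. The function name `iqr_outliers` and the conventional multiplier `k = 1.5` are choices made for this example, not something the extension prescribes; code generated by the skill would more likely use pandas or NumPy.

```python
import statistics


def iqr_outliers(values: list[float], k: float = 1.5) -> list[float]:
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]
```

For a sample such as `[10, 12, 11, 13, 12, 11, 95]`, the value 95 falls far above the upper fence and is the only one flagged.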

🤖 AI Engineering — /rag, /prompt, /finetune, /eval, /agent

| Slash | What it does |
| --- | --- |
| /rag | Production RAG pipelines: chunking strategies, embedding selection, hybrid retrieval (semantic + BM25), reranking, prompt template, evaluation with Ragas |
| /prompt | Production prompt engineering: Chain-of-Thought, few-shot, structured outputs (JSON/XML), function calling schemas, anti-jailbreak, robustness testing |
| /finetune | LoRA / QLoRA fine-tuning end-to-end: dataset prep, Unsloth/transformers setup, hyperparams, eval against base, deployment as merged model or adapter, GGUF quantization |
| /eval | LLM evaluation systems: reference-based + rubric-based + reference-free, LLM-as-judge done right, Ragas / DeepEval / PromptFoo / LangSmith, CI integration |
| /agent | Agent architecture: tool catalog design, ReAct / plan-and-execute / reflexion, memory tiers, error handling, framework picking (LangGraph / LlamaIndex / AutoGen / CrewAI / Smolagents) |
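
Hybrid retrieval, as listed under /rag, needs some way to merge the semantic ranking with the BM25 ranking. One common option is reciprocal rank fusion (RRF), sketched below. This page does not say which fusion method the skill recommends, so treat RRF, the function name, and the constant `k = 60` as assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


semantic = ["a", "b", "c"]   # hypothetical semantic-search result order
bm25 = ["b", "d", "a"]       # hypothetical BM25 result order
fused = reciprocal_rank_fusion([semantic, bm25])
```

Documents that rank well in both lists rise to the top, which is why RRF is a popular default when the two retrievers produce scores on incompatible scales.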

Type /skills or /help in the chat to list all 14 skills with descriptions.


Choose which model runs each task

AI Code Assistance routes every request to one of five roles (planner, coder, vision, general, agentic), and you decide which model each role uses. There are three interchangeable ways to do this; all of them share one config file, which is hot-reloaded with no restart:

  • ⚙️ Configure Models panel (AI Code Assistance: Configure Models) — a panel with a model dropdown per task, a token field per provider (saved to your local .env, never shared), and base-URL fields for Ollama / LM Studio.
  • Chat commands — /models to see assignments, /model coder openrouter qwen/qwen-2.5-coder-32b-instruct to pin one, /model now <provider> <model> to force a model for the current chat.
  • Edit ai-code-assistance.models.json by hand. It holds only model names (no secrets), so you can commit and share it; teammates add their own keys.
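
The chat commands in the list above follow a simple shape, `/model <role|now> <provider> <model>`. The sketch below shows one way such a command could be parsed. It is not the extension's parser: `ModelCommand`, `parse_model_command`, and the strict four-token rule are assumptions made for this example.

```python
from typing import NamedTuple

# The five roles named in this README.
ROLES = {"planner", "coder", "vision", "general", "agentic"}


class ModelCommand(NamedTuple):
    scope: str      # a role name, or "now" for the current chat only
    provider: str
    model: str


def parse_model_command(line: str) -> ModelCommand:
    """Parse "/model <role|now> <provider> <model>" into its parts."""
    parts = line.split()
    if len(parts) != 4 or parts[0] != "/model":
        raise ValueError("expected: /model <role|now> <provider> <model>")
    _, scope, provider, model = parts
    if scope != "now" and scope not in ROLES:
        raise ValueError(f"unknown role: {scope!r}")
    return ModelCommand(scope, provider, model)
```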

Quick start

  1. Install — search AI Code Assistance in the Extensions panel (Ctrl+Shift+X).

  2. Run the backend locally (from the project repo):

    cp .env.example .env          # set AI_TOKEN + at least one provider key
    uvicorn backend.main:app --host 127.0.0.1 --port 8001
    
  3. Configure VS Code — open Settings (Ctrl+,) and search AI Code Assistance:

    • aiCodeAssistance.backendUrl → http://127.0.0.1:8001
    • aiCodeAssistance.token → the AI_TOKEN from your .env
  4. Click the AI Code Assistance icon in the Activity Bar (left rail) — the chat opens in the sidebar.

  5. Type /skills to see everything you can do, and click the ⚙️ to pick your models.
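
If the chat does not connect after these steps, it can help to check the backend by hand. The sketch below builds a request to the /health endpoint listed under Settings. It assumes the bearer token is sent in a standard `Authorization: Bearer` header, which this page implies but does not spell out; `build_health_request` is a hypothetical helper, and whether /health actually requires the token is not confirmed here.

```python
import urllib.request


def build_health_request(backend_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the backend's /health endpoint."""
    url = backend_url.rstrip("/") + "/health"
    # Assumption: the bearer token travels in a standard Authorization header.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Usage, with the backend from step 2 running (replace the token with your AI_TOKEN):
#   req = build_health_request("http://127.0.0.1:8001", "your-AI_TOKEN")
#   with urllib.request.urlopen(req, timeout=5) as resp:
#       print(resp.status)
```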


Settings

| Setting | Default | Purpose |
| --- | --- | --- |
| aiCodeAssistance.backendUrl | "" | Backend URL, e.g. http://127.0.0.1:8001 |
| aiCodeAssistance.token | "" | Bearer token (AI_TOKEN), sent on every request (machine-scoped, not synced) |
| aiCodeAssistance.toolsEnabled | true | Allow the model to call tools (file I/O, bash, web, MCP) |
| aiCodeAssistance.permissionMode | ask | Tool-call approval mode: ask / auto-edit / auto / plan |
| aiCodeAssistance.healthPollSeconds | 10 | How often to poll /health for the status bar indicator |

MCP Servers

Create a .mcp.json file at your workspace root to connect external MCP servers:

{
  "mcpServers": {
    "my-server": {
      "command": "npx",
      "args": ["-y", "@my-org/mcp-server"],
      "env": { "API_KEY": "your-key-here" }
    },
    "http-server": {
      "url": "http://localhost:3000/mcp",
      "headers": { "Authorization": "Bearer TOKEN" }
    }
  }
}

The ⚙️ Configure Models panel shows active MCP servers and has an "Edit .mcp.json" button.
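
The example above mixes the two supported transports: an entry with `command` is a stdio server, and an entry with `url` is an HTTP server. A quick way to sanity-check a .mcp.json before loading it is sketched below. Inferring the transport from which key is present is an assumption based on that example, and `classify_mcp_servers` is a hypothetical helper, not part of the extension.

```python
import json


def classify_mcp_servers(config_text: str) -> dict[str, str]:
    """Map each server name in a .mcp.json document to "stdio" or "http"."""
    config = json.loads(config_text)
    transports: dict[str, str] = {}
    for name, entry in config.get("mcpServers", {}).items():
        if "command" in entry:
            transports[name] = "stdio"   # launched as a local subprocess
        elif "url" in entry:
            transports[name] = "http"    # reached over HTTP
        else:
            raise ValueError(f"server {name!r} has neither 'command' nor 'url'")
    return transports
```

Called with the text of the example file, this returns one transport label per server name and raises a `ValueError` for any entry that has neither key.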

Keybindings

| Action | Shortcut |
| --- | --- |
| Focus AI Code Assistance chat (sidebar) | Ctrl+Shift+M (Cmd+Shift+M) |
| Explain selection | Ctrl+Alt+E |
| Refactor selection | Ctrl+Alt+R |

Privacy & security

  • Tools require explicit per-session approval. "Always" decisions live in RAM and reset when VS Code closes.
  • Token is machine-scoped — does not sync across devices via Settings Sync.
  • The backend runs on your own machine and calls your own provider account with your own keys. No third-party server in the middle. Provider keys live in your local .env (gitignored) — never in the shareable model config.
  • File system access and shell execution happen only on your machine — the backend just forwards the model's intent.

Why this stack?

  • Your models, your bill. Mix cloud (OpenRouter / OpenAI / Groq / Gemini) and local (Ollama / LM Studio) per task.
  • Right model per task. A cheap fast model for general chat, a strong coder for code, a vision model for screenshots — configured independently.
  • Shareable, secret-free config. The model config file carries model choices; teammates plug in their own keys.
  • Editor stays local. All file I/O and shell execution happen on your machine, behind a permission prompt.
  • Multilingual. The assistant responds in the language you write in, automatically.

Limitations

  • You run the backend yourself (a single uvicorn command) and supply at least one provider key, or a local engine (Ollama / LM Studio). The extension does not include hosted inference.
  • AI Engineering skills generate code; they don't run training/RAG/eval pipelines for you. They tell you what to write and why.

Issues & contributing

Bug reports, feature requests, and PRs are welcome at github.com/mendesalex89/AI_Assistant.

License

MIT — see LICENSE.
