VS Code extension for the R4 / RiVault AI Platform. Provides native editor
integration with Gateway-authenticated LLM endpoints, endpoint management,
job monitoring, and model browsing.
Features
- @r4 Chat Participant — Chat with R4 platform LLM models directly in
VS Code Chat. Supports streaming responses and conversation history.
- Endpoint Management — Sidebar tree view to list, provision, and
tear down inference endpoints.
- Job Monitoring — Sidebar tree view to list jobs, view logs, and cancel
running/pending jobs.
- Model Browser — Browse the UMRS model registry with scope-aware
filtering (system/project/user).
- Status Bar — Connection indicator showing Gateway health.
- Agent Window MCP Tools — Exposes R4 Platform and RAG MCP servers to
Copilot agent mode and the VS Code Agents window.
- Native Model Provider — Registers R4 Platform LLM endpoints as native
language models, selectable in the Copilot chat/edit/agent model picker.
Supports streaming and tool calling.
Chat Commands
| Command | Description |
|---|---|
| @r4 | Chat with the platform LLM |
| @r4 /models | List available models |
| @r4 /endpoints | List active endpoints |
| @r4 /jobs | List recent jobs |
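The command table can be read as a small dispatch step. The sketch below is illustrative only: `routeChatCommand` and the `ChatRoute` labels are hypothetical names, and it assumes the slash command arrives without its leading slash, which is how VS Code's chat API is understood to report it.

```typescript
// Hypothetical route labels; the extension's real handler names are not documented here.
type ChatRoute = "chat" | "models" | "endpoints" | "jobs";

// Maps a slash command (without the leading "/") to a route.
// No command, or an unrecognized one, is treated as plain chat in this sketch.
function routeChatCommand(command: string | undefined): ChatRoute {
  switch (command) {
    case "models":
      return "models";
    case "endpoints":
      return "endpoints";
    case "jobs":
      return "jobs";
    default:
      return "chat";
  }
}
```

For example, `@r4 /jobs` would resolve to the `"jobs"` route, while a bare `@r4` message falls through to `"chat"`.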
Setup
- Install the extension
- Run R4: Login to Gateway from the Command Palette
- Enter your Gateway URL and API token
- Ensure r4-mcp is available on your PATH, or set r4.mcp.command to the
executable path
Alternatively, supply the Gateway URL through the r4.gatewayUrl setting in
VS Code or, as a fallback, the R4_GATEWAY_URL environment variable.
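The lookup order for the Gateway URL (setting first, then environment variable) might look like the following minimal sketch. `resolveGatewayUrl` is a hypothetical helper, and the trailing-slash normalization is an added assumption rather than documented behavior.

```typescript
// The environment is passed in explicitly so the sketch stays testable;
// inside an extension this would typically be process.env.
type Env = Record<string, string | undefined>;

// Assumption: trailing slashes are stripped so paths can be appended safely.
function stripTrailingSlash(url: string): string {
  return url.replace(/\/+$/, "");
}

// Returns the r4.gatewayUrl setting if it is non-empty, otherwise
// R4_GATEWAY_URL from the environment, otherwise undefined.
function resolveGatewayUrl(setting: string, env: Env): string | undefined {
  const fromSetting = setting.trim();
  if (fromSetting !== "") return stripTrailingSlash(fromSetting);
  const fromEnv = (env["R4_GATEWAY_URL"] ?? "").trim();
  return fromEnv !== "" ? stripTrailingSlash(fromEnv) : undefined;
}
```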
After login, VS Code discovers two MCP servers from this extension:
- R4 Platform: endpoint, job, workflow, model, and platform health tools
- R4 RAG: search_docs, list_documents, and rag_status
These servers appear in Copilot agent mode and the Agents window
customizations panel. The extension passes Gateway credentials to the spawned
MCP process through environment variables, not command-line arguments.
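A rough sketch of what "credentials through environment variables, not command-line arguments" could mean in practice. R4_GATEWAY_URL and R4_PROJECT_ID are names used in this README; `R4_API_TOKEN`, the `"platform"`/`"rag"` argument, and `buildMcpLaunchSpec` itself are hypothetical placeholders, not the extension's confirmed interface.

```typescript
interface McpLaunchSpec {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds the launch description for one MCP server.
// Secrets are placed in env only, so they do not show up in process listings.
function buildMcpLaunchSpec(
  server: "platform" | "rag", // hypothetical selector argument
  command: string,            // value of r4.mcp.command, e.g. "r4-mcp"
  gatewayUrl: string,
  token: string,
  projectId?: string,
): McpLaunchSpec {
  const env: Record<string, string> = {
    R4_GATEWAY_URL: gatewayUrl,
    R4_API_TOKEN: token, // hypothetical variable name
  };
  if (projectId) env["R4_PROJECT_ID"] = projectId;
  return { command, args: [server], env };
}
```

Keeping the token out of `args` matters because command-line arguments are commonly visible to other local processes, whereas the child's environment is not exposed in the same way.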
Using R4 Models in Copilot
There are two ways to drive Copilot chat / edit / agent on R4 Platform models
(e.g. Kimi-K2.6, GLM-5.1, DeepSeek-V4-Pro). Both require a model with a
ready endpoint — the model picker only lists models that currently have a
warm/active endpoint.
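The "ready endpoint" requirement amounts to a filter over the model list. The sketch below assumes a hypothetical `endpointStatus` field and status values; the Gateway's actual response shape is not documented in this README.

```typescript
// Hypothetical shape: the field name and status values are assumptions.
interface ModelEntry {
  id: string;
  endpointStatus: "ready" | "provisioning" | "stopped" | "none";
}

// Returns the ids that would be offered in the model picker:
// only models whose endpoint is currently ready.
function pickerModelIds(models: ModelEntry[]): string[] {
  return models
    .filter((m) => m.endpointStatus === "ready")
    .map((m) => m.id);
}
```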
Option A — Native provider (this extension)
- Run R4: Login to Gateway.
- Open the Copilot model picker → models appear under the R4 Platform
vendor. Pick one for chat, edit, or agent mode.
Option B — Copilot BYOK via Custom Endpoint (no extension required)
This uses VS Code's built-in Custom Endpoint provider (which replaced the
deprecated OpenAI Compatible provider). You do not create
chatLanguageModels.json by hand — VS Code opens it for you during the flow.
- Issue a long-lived JWT (admin):
... admin user add ... --issue-token --token-days 365 --gateway-url <gateway> --profile ai4s.
- In VS Code, run Chat: Manage Language Models (Command Palette), or open
the Chat model picker → Manage Language Models (gear icon).
- Select Add Models → Custom Endpoint.
- Enter a group name (e.g. R4 AI4S), a display name, and paste the JWT as the
API key. Select API type Chat Completions.
- VS Code opens chatLanguageModels.json. Paste the config below and save. The
url must be the full chat-completions URL
(<gateway>/gw/llm/v1/chat/completions), and each id must match the model
id returned by GET <gateway>/gw/llm/v1/models.
[
  {
    "name": "R4 AI4S",
    "vendor": "customendpoint",
    "apiKey": "YOUR_R4_JWT",
    "apiType": "chat-completions",
    "models": [
      {
        "id": "Kimi-K2.6",
        "name": "Kimi-K2.6 (R4)",
        "url": "https://data0.ai.r-ccs.riken.jp/ai4s/gw/llm/v1/chat/completions",
        "toolCalling": true,
        "vision": false,
        "maxInputTokens": 253952,
        "maxOutputTokens": 8192
      },
      {
        "id": "DeepSeek-V4-Pro",
        "name": "DeepSeek-V4-Pro (R4)",
        "url": "https://data0.ai.r-ccs.riken.jp/ai4s/gw/llm/v1/chat/completions",
        "toolCalling": true,
        "vision": false,
        "maxInputTokens": 1040000,
        "maxOutputTokens": 8192
      },
      {
        "id": "GLM-5.1",
        "name": "GLM-5.1 (R4)",
        "url": "https://data0.ai.r-ccs.riken.jp/ai4s/gw/llm/v1/chat/completions",
        "toolCalling": true,
        "vision": false,
        "maxInputTokens": 194560,
        "maxOutputTokens": 8192
      }
    ]
  }
]
- Pick the model from the chat model picker. If it does not appear, restart
VS Code.
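The two constraints on the config entries (a full chat-completions URL, and ids that match the Gateway's model list) can be checked mechanically before saving. This is a minimal sketch: `checkCustomModels` is a hypothetical helper, and `servedIds` is assumed to be the list of ids already extracted from GET <gateway>/gw/llm/v1/models.

```typescript
interface CustomModel {
  id: string;
  url: string;
}

const CHAT_COMPLETIONS_SUFFIX = "/gw/llm/v1/chat/completions";

// Returns human-readable problems; an empty array means the entries look consistent.
function checkCustomModels(models: CustomModel[], servedIds: string[]): string[] {
  const problems: string[] = [];
  for (const m of models) {
    // The url must end with the full chat-completions path, not just the gateway base.
    if (m.url.slice(-CHAT_COMPLETIONS_SUFFIX.length) !== CHAT_COMPLETIONS_SUFFIX) {
      problems.push(`${m.id}: url is not the full chat-completions URL`);
    }
    // The id must be one the Gateway actually serves.
    if (servedIds.indexOf(m.id) === -1) {
      problems.push(`${m.id}: id not returned by /gw/llm/v1/models`);
    }
  }
  return problems;
}
```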
Config notes:
- toolCalling: true is required; models without it are hidden from the
agent-mode picker.
- maxInputTokens should track each model's served context window (query
GET <gateway>/gw/llm/v1/models and read max_model_len), leaving headroom
for output. Current ai4s ceilings: DeepSeek-V4-Pro 1,048,576 (1M),
Kimi-K2.6 262,144, GLM-5.1 202,752. Only DeepSeek serves a 1M window; the
other two are capped at their trained context and cannot be raised to 1M.
- With Option A (native provider) the context window is derived from
max_model_len automatically, so no per-model token config is needed.
- apiKey is sent as Authorization: Bearer <JWT>, which the Gateway
expects.
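The sample maxInputTokens values appear to follow a simple rule: the served context window minus the tokens reserved for output. That rule is inferred from the numbers in the config rather than stated explicitly, and `maxInputTokensFor` is a hypothetical helper for working it out.

```typescript
// Input budget = served context window (max_model_len) minus reserved output tokens.
function maxInputTokensFor(maxModelLen: number, maxOutputTokens: number): number {
  if (maxOutputTokens >= maxModelLen) {
    throw new RangeError("maxOutputTokens must be smaller than max_model_len");
  }
  return maxModelLen - maxOutputTokens;
}

maxInputTokensFor(262144, 8192);  // 253952, the Kimi-K2.6 value in the config
maxInputTokensFor(202752, 8192);  // 194560, the GLM-5.1 value in the config
maxInputTokensFor(1048576, 8192); // 1040384; the config uses the rounder 1040000
```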
Note: Both paths drive Copilot chat, edit, and agent (tool calling)
only. Neither drives inline ghost-text completions, which stay on Copilot's
own model.
Settings
| Setting | Default | Description |
|---|---|---|
| r4.gatewayUrl | "" | Gateway URL. Falls back to R4_GATEWAY_URL env. |
| r4.defaultModel | "" | Default model for chat. Uses first available if empty. |
| r4.autoRefreshInterval | 30 | Polling interval in seconds (0 to disable). |
| r4.mcp.enabled | true | Expose R4 MCP servers to Copilot agent mode and Agents window. |
| r4.mcp.command | "r4-mcp" | Command used to launch R4 MCP servers. |
| r4.mcp.projectId | "" | Default RAG project ID. Falls back to R4_PROJECT_ID. |
| r4.mcp.timeout | 30 | Gateway request timeout for R4 MCP tools, in seconds. |
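As a rough illustration of how these defaults and special values combine, the sketch below models the settings as a plain object. The interface, field names, and helpers are hypothetical; only the default values and the "0 disables" and "first available" rules come from the table.

```typescript
interface R4Settings {
  gatewayUrl: string;
  defaultModel: string;
  autoRefreshInterval: number; // seconds; 0 disables polling
  mcpEnabled: boolean;
  mcpCommand: string;
  mcpProjectId: string;
  mcpTimeout: number; // seconds
}

// Defaults as listed in the settings table.
const DEFAULT_SETTINGS: R4Settings = {
  gatewayUrl: "",
  defaultModel: "",
  autoRefreshInterval: 30,
  mcpEnabled: true,
  mcpCommand: "r4-mcp",
  mcpProjectId: "",
  mcpTimeout: 30,
};

// Overlays user-provided values on top of the defaults.
function withDefaults(overrides: Partial<R4Settings>): R4Settings {
  return { ...DEFAULT_SETTINGS, ...overrides };
}

// Polling delay in milliseconds, or undefined when polling is disabled (interval 0).
function refreshDelayMs(settings: R4Settings): number | undefined {
  return settings.autoRefreshInterval > 0
    ? settings.autoRefreshInterval * 1000
    : undefined;
}

// Uses r4.defaultModel if set, otherwise the first available model.
function chooseModel(settings: R4Settings, available: string[]): string | undefined {
  return settings.defaultModel !== "" ? settings.defaultModel : available[0];
}
```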
Architecture
The extension is a thin client over Gateway HTTP APIs:
VS Code Extension → Gateway HTTP (/gw/*) → UMRS
- No direct UMRS or LiteLLM access
- API token stored in VS Code SecretStorage (never plaintext)
- Graceful offline degradation when Gateway is unreachable
- All MCP tool calls pass through Gateway policy and audit
- Agent Window integration uses VS Code's MCP provider API; it does not bypass
Gateway or call UMRS/LiteLLM directly.
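The "thin client" rule can be sketched as a request builder that refuses any path outside /gw/. `buildGatewayRequest` is a hypothetical helper, and the Bearer header is assumed to apply to the extension's stored token in the same way this README describes it for the Copilot BYOK key.

```typescript
interface GatewayRequest {
  url: string;
  headers: Record<string, string>;
}

// Builds a request description for a Gateway path.
// Paths outside /gw/ are rejected, so the client cannot address UMRS or LiteLLM directly.
function buildGatewayRequest(baseUrl: string, path: string, token: string): GatewayRequest {
  if (path.indexOf("/gw/") !== 0) {
    throw new Error(`refusing non-Gateway path: ${path}`);
  }
  return {
    url: baseUrl.replace(/\/+$/, "") + path,
    headers: { Authorization: `Bearer ${token}` },
  };
}
```

For instance, `buildGatewayRequest(base, "/gw/llm/v1/models", token)` yields a URL under the Gateway prefix, while any other path throws before a request is made.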
Development
cd vscode-extension
npm install
npm run compile # Type-check + build
npm run watch # Watch mode (esbuild)
Press F5 in VS Code to launch the Extension Development Host.
Packaging & Installing the .vsix
To build a distributable .vsix and install it into VS Code:
cd vscode-extension
npm install # first time only
# Build the production bundle and package the .vsix.
# (uses @vscode/vsce; npx fetches it on demand — no global install needed)
npx --yes @vscode/vsce package --allow-missing-repository --skip-license
# → produces r4-platform-<version>.vsix in this folder
Install the packaged extension:
# From the command line
code --install-extension r4-platform-0.1.1.vsix
# To remove a previously installed copy first
code --uninstall-extension r4-platform.r4-platform
Or install from the VS Code UI: Extensions view → ... menu →
Install from VSIX... → select the .vsix file.
After installing, reload the window (Developer: Reload Window) and run
R4: Login to Gateway to authenticate.
Notes:
- The .vsix is a build artifact and is git-ignored; rebuild it from source
rather than committing it.
- npm run package runs the same production build that vsce package
invokes via vscode:prepublish, so type errors fail the package step.
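The file name and extension id used in the commands above follow from the manifest fields: vsce names the package `<name>-<version>.vsix`, and VS Code identifies an extension as `<publisher>.<name>`. A small sketch, with hypothetical helper names and a publisher value inferred from the uninstall command rather than read from package.json:

```typescript
interface Manifest {
  publisher: string;
  name: string;
  version: string;
}

// Default output file name produced by vsce package.
function vsixFileName(m: Manifest): string {
  return `${m.name}-${m.version}.vsix`;
}

// Identifier expected by code --uninstall-extension.
function extensionId(m: Manifest): string {
  return `${m.publisher}.${m.name}`;
}

// Values inferred from the commands in this section.
const manifest: Manifest = { publisher: "r4-platform", name: "r4-platform", version: "0.1.1" };
vsixFileName(manifest); // "r4-platform-0.1.1.vsix"
extensionId(manifest);  // "r4-platform.r4-platform"
```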
Project Structure
src/
  extension.ts          - Entry point, wires all components
  gateway-client.ts     - HTTP client for all Gateway communication
  auth.ts               - SecretStorage-based auth management
  status-bar.ts         - Connection status indicator
  chat-participant.ts   - @r4 Chat Participant
  views/
    endpoints.ts        - Endpoint tree view
    jobs.ts             - Job tree view
    models.ts           - Model tree view
resources/
  r4-icon.svg           - Activity bar icon