LingHub AI Router — VS Code Extension
Cost-optimized multi-LLM orchestration for developers. Route requests intelligently between OpenAI, Anthropic Claude, Google Gemini, AWS Bedrock, and local models (Ollama / vLLM) — all from within VS Code.
Links: GitHub · Issues · Platform docs
Included
- Sidebar Webview UI
- Fast / Cheap / Smart / Max modes
- Non-stream and stream chat
- Cost and context display
- Multi-file agent prompt flow
- AWS Spot / vLLM OpenAI-compatible provider with routing and cloud fallback
- Diff / Apply
- Unified diff patch preview / apply
- AutoFix loop
- Git branch / commit / push / rollback
- GitHub pull request creation
- Context selection engine
Setup
- Open this folder in VS Code.
- Run:
npm install
npm run compile
- Press F5 to start the extension development host.
Tests (extension logic)
Pure TypeScript helpers (patch parsing, glob, PR policy) are covered by Vitest:
pnpm test # once
pnpm run test:watch
pnpm run test:coverage # Vitest + @vitest/coverage-v8 (counts only the src files actually referenced)
VS Code integration tests (Extension Host, @vscode/test-cli + @vscode/test-electron):
pnpm run test:vscode # launches stable VS Code after compile (downloaded on first run)
# Skip the probe tests (for CI workspaces with AWS/Ollama enabled)
# LINGHUB_VSCODE_TEST_SKIP_NETWORK=1 pnpm run test:vscode
From the monorepo root (linghub-platform/):
pnpm run test:linghub-vscode # Vitest only (from monorepo root)
pnpm run test:coverage:vscode-extension
pnpm run test:vscode-extension # Extension Host (may require a GUI / xvfb)
Test files live in src/test/suite/*.mocha.ts (*.test.ts files are Vitest-only). For details on integration-test command selection and coverage, see docs/operations-guide.md §11.
Package extension (VSIX for internal install)
npm run vsix
Produces linghub-platform-vscode-<version>.vsix in the repo root (server/, web/, etc. are excluded via .vscodeignore). Install in VS Code: Extensions: Install from VSIX…. See docs/operations-guide.md §2.
For downloadable assets via GitHub Releases, push a tag like vscode-extension-v0.9.2 (must match package.json version).
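For example, assuming package.json is currently at version 0.9.2:
git tag vscode-extension-v0.9.2
git push origin vscode-extension-v0.9.2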
The workflow .github/workflows/vscode-extension-release.yml builds and uploads the VSIX automatically.
Optional Marketplace publish automation is available in .github/workflows/vscode-extension-marketplace.yml (requires repository secret VSCE_PAT).
Required settings (cloud router)
Set these for the primary LingHub Router (OpenAI-compatible) endpoint:
- `linghub.apiBase` (e.g. https://your-router/v1/chat/completions)
- `linghub.apiKey`
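A minimal settings.json sketch; the URL is the placeholder from above, so substitute your real router endpoint and key:

```jsonc
{
  // Primary LingHub Router (OpenAI-compatible) endpoint; both values are placeholders
  "linghub.apiBase": "https://your-router/v1/chat/completions",
  "linghub.apiKey": "sk-your-key"
}
```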
The review gate (apply / AutoFix review) always uses this cloud router, never AWS vLLM.
Optional: AWS vLLM (OpenAI-compatible)
When enabled, eligible requests can go to your AWS-hosted vLLM first, with automatic fallback to the cloud router.
| Setting | Purpose |
| --- | --- |
| `linghub.providers.awsOpenAI.enabled` | Master switch |
| `linghub.providers.awsOpenAI.baseUrl` | API base (e.g. https://your-alb.region.elb.amazonaws.com); /v1/chat/completions is appended if missing |
| `linghub.providers.awsOpenAI.modelName` | Model id sent to vLLM |
| `linghub.providers.awsOpenAI.apiKey` | Optional Bearer token (leave empty for internal ALB) |
| `linghub.providers.awsOpenAI.timeoutMs` / `timeoutAutofixMs` | Request timeouts |
| `linghub.providers.awsOpenAI.cooldownMs` | Pause AWS attempts after failures |
| `linghub.providers.awsOpenAI.healthPath` | Optional GET path for the startup probe (e.g. /health); empty = skip |
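A hedged settings.json sketch with the provider enabled; the base URL mirrors the placeholder above, and the model id and timeout values are illustrative rather than recommended defaults:

```jsonc
{
  "linghub.providers.awsOpenAI.enabled": true,
  "linghub.providers.awsOpenAI.baseUrl": "https://your-alb.region.elb.amazonaws.com",
  "linghub.providers.awsOpenAI.modelName": "your-vllm-model-id", // whatever model your vLLM serves
  "linghub.providers.awsOpenAI.apiKey": "",                      // empty for an internal ALB
  "linghub.providers.awsOpenAI.timeoutMs": 30000,                // illustrative value
  "linghub.providers.awsOpenAI.timeoutAutofixMs": 90000,         // illustrative value
  "linghub.providers.awsOpenAI.cooldownMs": 60000,               // illustrative value
  "linghub.providers.awsOpenAI.healthPath": "/health"            // empty string skips the probe
}
```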
Routing flags
- `linghub.routing.preferAwsForCheap`: fast/cheap modes may use AWS
- `linghub.routing.preferAwsForSmart`: smart mode may use AWS (max never uses AWS)
- `linghub.routing.allowAwsForAutofix`: single-file AutoFix/repair generations only (default off)
- `linghub.routing.longPromptCharThreshold`: prompts larger than this are forced to the cloud router
- `linghub.routing.cloudTimeoutMs` / `cloudTimeoutAutofixMs`: cloud router timeouts
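For example, to let AWS serve cheap and smart traffic while keeping AutoFix on the cloud router (the threshold and timeout values below are illustrative, not defaults):

```jsonc
{
  "linghub.routing.preferAwsForCheap": true,
  "linghub.routing.preferAwsForSmart": true,
  "linghub.routing.allowAwsForAutofix": false,        // default off
  "linghub.routing.longPromptCharThreshold": 20000,   // illustrative; larger payloads force the cloud router
  "linghub.routing.cloudTimeoutMs": 60000,            // illustrative
  "linghub.routing.cloudTimeoutAutofixMs": 120000     // illustrative
}
```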
Fallback behavior
- If AWS fails (timeout, HTTP error, invalid JSON), the extension posts a visible status and retries on the cloud router (if configured).
- Audit events record `llmBackend` (cloud-router | aws-openai-compatible | local-llm), `llmLatencyMs`, and optional `llmFallbackFrom` / `llmFallbackReason` (no prompts or secrets).
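As an illustration, a timeout on AWS followed by a cloud retry could produce an audit line along these lines; only the llm* field names come from the list above, and the surrounding shape and values are hypothetical:

```jsonc
// Hypothetical audit JSONL line; only the llm* field names are documented above
{"llmBackend": "cloud-router", "llmLatencyMs": 2431, "llmFallbackFrom": "aws-openai-compatible", "llmFallbackReason": "timeout"}
```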
Local LLM setup (OS-specific)
- Command palette: LingHub: Open Local LLM Setup Guide opens a QuickPick for macOS / Windows / WSL / Linux (choose your OS), or follows `linghub.localLLM.setupTargetOs` (auto uses runtime detection, including WSL env vars). It shows a Markdown preview; Ollama installation is not automated.
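A minimal settings.json sketch; auto is the documented detection behavior, and any OS-specific values are assumed to mirror the QuickPick options:

```jsonc
{
  // "auto" uses runtime detection (including WSL env vars); other accepted values are assumed to match the QuickPick choices
  "linghub.localLLM.setupTargetOs": "auto"
}
```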
Known limitations
- Local Ollama is used for non-stream routes only; streaming stays AWS → cloud (SSE wire format).
- Streaming fallback is supported, but very old vLLM builds may stream differently; if streaming fails, use non-stream mode or disable AWS for streaming.
- AutoFix with review gate on still requires cloud credentials even if generation used AWS.
- Cost figures for AWS are labeled estimates (`AWS_VLLM_ESTIMATE` bucket), not invoice-accurate GPU billing.
Optional
- `linghub.githubToken`
- `linghub.defaultBaseBranch`
Operations & production hardening
See docs/operations-guide.md (Japanese) for:
- Configuration validation, JSON export for tickets, and AWS health probes
- Settings for patch limits, protected paths / globs, PR title/body policy, test command override & timeouts
- Security audit Output channel (`linghub.security.auditChannelEnabled`) and audit JSONL
Customer-facing security narrative: docs/security-policy.md.
Important
This remains an MVP-oriented repo; tune workspace settings and tenant policy for your environment. The operations guide lists the knobs that matter for production.