Operium Orchestra

AI coding assistant for VS Code with local models via Orchestra Runtime, Ollama, and LM Studio, plus cloud GigaChat via Operium.

Features

Default agent workflow with Plan and IDE-context toggles
Local GGUF models through Orchestra Runtime
Local models through Ollama
OpenAI-compatible local models through LM Studio
Cloud GigaChat access through Operium
Editor-context attachments from the active file or selection
Embeddings-first RAG with @codebase, @folder, @docs, and the codebase_search agent tool
Streaming responses in a dedicated sidebar UI
One-click “Install in Orchestra” from Operium Hub. The deep link vscode://OperiumControlHub.operium-orchestra/install-model?manifest_url=… checks the origin of manifest_url against orchestra.hubAllowedOrigins, downloads the GGUF file with SHA-256 checksum verification, and places it under orchestra.modelsPath (default ~/.orchestra/models). To smoke-test it, install the built VSIX (or run the Extension Development Host), then open this link in a browser: vscode://OperiumControlHub.operium-orchestra/install-model?manifest_url=<url-encoded HTTPS manifest URL>.
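
The deep-link check described above can be sketched as follows. This is a minimal illustration, not the extension's actual code: the helper names `isAllowedManifestUrl` and `buildInstallLink` are hypothetical, the manifest path is made up, and the exact-origin comparison and the HTTPS requirement are assumptions about how the allowlist is applied.

```typescript
// Only the deep-link shape and the setting name come from this README;
// the helpers below are hypothetical.
const DEEP_LINK_BASE = "vscode://OperiumControlHub.operium-orchestra/install-model";

/** Returns true when the manifest URL is HTTPS and its origin is allowlisted (assumed rule). */
function isAllowedManifestUrl(manifestUrl: string, allowedOrigins: string[]): boolean {
  let parsed: URL;
  try {
    parsed = new URL(manifestUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  return parsed.protocol === "https:" && allowedOrigins.includes(parsed.origin);
}

/** Builds the install deep link with a URL-encoded manifest_url parameter. */
function buildInstallLink(manifestUrl: string): string {
  return `${DEEP_LINK_BASE}?manifest_url=${encodeURIComponent(manifestUrl)}`;
}

const allowedOrigins = ["https://hub.operium.ru"]; // default of orchestra.hubAllowedOrigins
const manifestUrl = "https://hub.operium.ru/models/example/manifest.json"; // made-up path

console.log(isAllowedManifestUrl(manifestUrl, allowedOrigins));
console.log(buildInstallLink(manifestUrl));
```

Comparing whole origins (scheme, host, and port) rather than string prefixes avoids accepting look-alike hosts such as `https://hub.operium.ru.example.com`.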

How Runtime Works

Operium Orchestra does not install Orchestra Runtime automatically during extension installation.

The Runtime is downloaded on demand from inside the extension when you press the one-click install action in the onboarding flow. If a local Runtime binary is already available, the extension reuses it instead of downloading a new one.

Supported Backends

Orchestra Runtime for local GGUF models via llama.cpp
Ollama for local coding and chat models
LM Studio for OpenAI-compatible local endpoints
Operium API for cloud access to GigaChat

Configuration

| Setting | Default | Description |
| --- | --- | --- |
| `orchestra.hubUrl` | `https://hub.operium.ru/models` | Operium Hub catalog URL (opened by the sidebar “model catalog” button) |
| `orchestra.hubAllowedOrigins` | `["https://hub.operium.ru"]` | Allowlisted `origin` values for Hub `install-model` manifest URLs |
| `orchestra.modelsPath` | empty (falls back to `~/.orchestra/models`) | Directory where Hub installs store downloaded model files |
| `orchestra.operiumUrl` | `https://operium.ru` | Operium instance URL for cloud access |
| `orchestra.runtimeUrl` | `http://localhost:8100` | Orchestra Runtime server URL |
| `orchestra.ollamaUrl` | `http://localhost:11434` | Ollama server URL |
| `orchestra.lmStudioUrl` | `http://localhost:1234` | LM Studio server URL |
| `orchestra.defaultModel` | auto | Default model to use |
| `orchestra.systemPromptExtra` | empty | Extra instructions appended to the system prompt in all modes |
| `orchestra.diskConfigPath` | empty | Legacy setting; the project config now lives in `.orchestra/config.yaml` |
| `orchestra.chatTemperature` | `0.7` | Sampling temperature for chat requests |
| `orchestra.requestTimeoutMs` | `7200000` | HTTP timeout in milliseconds for streaming and compaction requests (2 hours by default) |
| `orchestra.embeddingBaseUrl` | `http://localhost:1234` | Base URL of an OpenAI-compatible `/v1/embeddings` endpoint |
| `orchestra.embeddingModel` | empty | Embedding model used for codebase/docs RAG |
| `orchestra.embeddingProvider` | `auto` | Embeddings provider: `auto`, `openai`, or native `ollama` |
| `orchestra.ragRetrievalMode` | `auto` | Retrieval mode: `auto`, `hybrid`, `text`, or `embeddings` |
| `orchestra.codebaseAutoIndex` | `true` | Automatically refresh `.orchestra/index` on workspace changes |
| `orchestra.codebasePaths` | `[]` | Optional workspace-relative roots for `@codebase`; empty means the whole workspace |
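
Two of the defaults above are easy to misread, so here is a small sketch of how they might be interpreted. The helpers `resolveModelsPath` and `timeoutHours` are hypothetical and not part of the extension. The home directory is passed in explicitly to keep the example self-contained, and the path is joined with `/` for brevity.

```typescript
// Defaults copied from the settings table; the helpers are illustrative only.
const DEFAULT_MODELS_SUBDIR = ".orchestra/models";
const DEFAULT_REQUEST_TIMEOUT_MS = 7200000;

/** An empty orchestra.modelsPath falls back to <home>/.orchestra/models. */
function resolveModelsPath(configured: string, homeDir: string): string {
  const trimmed = configured.trim();
  return trimmed === "" ? `${homeDir}/${DEFAULT_MODELS_SUBDIR}` : trimmed;
}

/** Converts a millisecond timeout into hours. */
function timeoutHours(ms: number): number {
  return ms / (60 * 60 * 1000);
}

console.log(resolveModelsPath("", "/home/dev"));
console.log(resolveModelsPath("/data/models", "/home/dev"));
console.log(timeoutHours(DEFAULT_REQUEST_TIMEOUT_MS));
```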

RAG Index

Orchestra stores the local retrieval index in .orchestra/index. It always supports text retrieval over indexed chunks, and adds semantic vector search when orchestra.embeddingModel is configured. The index is safe to delete or gitignore.

Use @codebase <query>, @folder <path> <query>, @docs <query>, or let the agent call codebase_search when it needs semantic project context.
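
The mention syntax above can be illustrated with a small parser. This is a sketch under assumptions: `parseMention` and the `Mention` type are hypothetical names, and the extension's real parser may differ, for example by accepting quoted paths that contain spaces.

```typescript
// Hypothetical types and parser for the @codebase / @folder / @docs syntax.
type Mention =
  | { kind: "codebase" | "docs"; query: string }
  | { kind: "folder"; path: string; query: string };

function parseMention(input: string): Mention | null {
  const match = /^@(codebase|folder|docs)\s+([\s\S]+)$/.exec(input.trim());
  if (!match) return null;
  const kind = match[1];
  const rest = match[2].trim();
  if (kind === "folder") {
    // @folder takes a path first, then the query; paths with spaces are not handled here.
    const firstSpace = rest.search(/\s/);
    if (firstSpace === -1) return null; // a path without a query is incomplete
    return {
      kind: "folder",
      path: rest.slice(0, firstSpace),
      query: rest.slice(firstSpace).trim(),
    };
  }
  return { kind: kind as "codebase" | "docs", query: rest };
}

console.log(parseMention("@codebase where is the auth token refreshed"));
console.log(parseMention("@folder src/utils how are dates parsed"));
```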

In monorepos, set orchestra.codebasePaths or YAML orchestra.rag.codebasePaths to keep @codebase focused on the relevant package.
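
To make the monorepo scoping concrete, the sketch below filters workspace-relative file paths against a `codebasePaths` list. The function names are hypothetical and the matching rule (a path-segment prefix match) is an assumption. Only the "empty means the whole workspace" behaviour comes from the settings table.

```typescript
/** Normalizes separators and strips a leading "./" and trailing slashes. */
function normalizeRelative(p: string): string {
  return p.replace(/\\/g, "/").replace(/^\.\//, "").replace(/\/+$/, "");
}

/** Returns true when the file falls under one of the configured roots. */
function isUnderCodebasePaths(relativeFile: string, codebasePaths: string[]): boolean {
  if (codebasePaths.length === 0) return true; // empty list: index the whole workspace
  const file = normalizeRelative(relativeFile);
  return codebasePaths.some((root) => {
    const r = normalizeRelative(root);
    // Match on whole path segments so "packages/api" does not match "packages/api-client".
    return file === r || file.startsWith(r + "/");
  });
}

const roots = ["packages/api"]; // example value for orchestra.codebasePaths
console.log(isUnderCodebasePaths("packages/api/src/server.ts", roots));
console.log(isUnderCodebasePaths("packages/api-client/index.ts", roots));
```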

QA

Use docs/manual-qa.md for the current Extension Development Host smoke tests, RAG checks, AgentRunState/guardrail checks, provider matrix, and release gate.

Project config (`.orchestra/config.yaml`)

Orchestra uses one project-local config file in the workspace root. It is created automatically the first time a model is used in the project, or manually by running the orchestra.initWorkspace command.

Models are discovered live from Orchestra Runtime, Ollama, LM Studio, and cloud providers, so they no longer need to be listed in YAML.
rules: inline rule strings plus uses: paths to rule files; individual files can be toggled in Settings → Правила (Rules).
orchestra: project defaults such as defaultModel, provider URLs, sampling, context ignores, snippets, and tool policies.
The “Rule from answer” button appends the fragment to .orchestra/rules/from-chat.md.

Precedence is intentionally simple:

VS Code defaults / orchestra.* settings
Project .orchestra/config.yaml
Current UI state for the active session
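
One way to read this list is as layers in which each later entry overrides the earlier ones. That direction is an assumption, since the README does not state it explicitly. Under that assumption, resolution could look like the sketch below. The `Layer` interface and `resolveSettings` are hypothetical and cover only three keys.

```typescript
// Hypothetical subset of settings; each layer may leave any key unset.
interface Layer {
  defaultModel?: string;
  chatTemperature?: number;
  ollamaUrl?: string;
}

/** Assumed precedence: session UI state > project config > VS Code settings. */
function resolveSettings(vscodeSettings: Layer, projectConfig: Layer, sessionState: Layer): Layer {
  return {
    defaultModel:
      sessionState.defaultModel ?? projectConfig.defaultModel ?? vscodeSettings.defaultModel,
    chatTemperature:
      sessionState.chatTemperature ?? projectConfig.chatTemperature ?? vscodeSettings.chatTemperature,
    ollamaUrl: sessionState.ollamaUrl ?? projectConfig.ollamaUrl ?? vscodeSettings.ollamaUrl,
  };
}

const resolved = resolveSettings(
  { chatTemperature: 0.7, ollamaUrl: "http://localhost:11434" }, // VS Code defaults
  { chatTemperature: 0.2 },                                      // .orchestra/config.yaml
  { defaultModel: "example-model" },                             // made-up model picked in the UI
);
console.log(resolved);
```

Using `??` rather than `||` keeps legitimate falsy values, such as a temperature of `0`, from being skipped.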

Support

Website: operium.ru
Email: support@operium.ru

