Llama.cpp Provider for GitHub Copilot Chat

This extension integrates Llama.cpp models into GitHub Copilot Chat in VS Code.

Features

  • Integrates a Llama.cpp server into VS Code's language model chat.
  • Streams responses as they are generated.
  • Handles tool calling for function invocations (see the sketch after this list).
  • Manages multiple models served by the Llama.cpp server.
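
To make the streaming and tool-calling items concrete, here is a minimal sketch of how streamed OpenAI-compatible chunks might be accumulated, assuming the server emits `choices[].delta` objects whose `tool_calls` arrive as indexed fragments. The type and function names are illustrative, not taken from the extension's source.

```typescript
// Sketch: accumulate text and tool calls from OpenAI-compatible streaming
// deltas. Names (StreamDelta, Accumulated, accumulate) are illustrative.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface StreamDelta {
  content?: string;
  tool_calls?: ToolCallDelta[];
}

interface Accumulated {
  text: string;
  toolCalls: Map<number, { id: string; name: string; args: string }>;
}

function accumulate(acc: Accumulated, delta: StreamDelta): void {
  // Plain text tokens are appended as they arrive.
  if (delta.content) acc.text += delta.content;
  // Tool-call fragments are keyed by index; merge each fragment into its entry.
  for (const tc of delta.tool_calls ?? []) {
    const entry = acc.toolCalls.get(tc.index) ?? { id: "", name: "", args: "" };
    if (tc.id) entry.id = tc.id;
    if (tc.function?.name) entry.name += tc.function.name;
    if (tc.function?.arguments) entry.args += tc.function.arguments;
    acc.toolCalls.set(tc.index, entry);
  }
}
```

Fragments are merged by `index` because an OpenAI-compatible stream may split a single call's `id`, `name`, and JSON `arguments` across many chunks.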

Requirements

  • VS Code version 1.104.0 or higher.
  • A running Llama.cpp server exposing its OpenAI-compatible API (see the example command below).
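
For reference, llama.cpp's bundled llama-server binary exposes the OpenAI-compatible API out of the box; a typical launch (the model path here is a placeholder) looks like:

llama-server -m ./models/your-model.gguf --port 8080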

Stack

  • TypeScript: A typed superset of JavaScript.
  • VS Code API: APIs for building extensions.

Design

The extension is built around a base provider class that speaks the OpenAI-compatible chat API. The Llama.cpp provider extends this base to connect to a local Llama.cpp server, handling model fetching, message conversion, and streaming responses. Tool calling is supported through the OpenAI-compatible function-calling format.
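
As a rough sketch of that layering (class names and endpoints shown here illustrate the pattern and are not the extension's actual source), the base class can own the HTTP and SSE plumbing while the Llama.cpp subclass only supplies the server URL:

```typescript
// Illustrative sketch of the provider layering described above.
interface ModelInfo {
  id: string;
}

abstract class OpenAICompatProvider {
  constructor(protected readonly baseUrl: string) {}

  // Fetch available models from the OpenAI-compatible /v1/models endpoint.
  async fetchModels(): Promise<ModelInfo[]> {
    const res = await fetch(`${this.baseUrl}/v1/models`);
    if (!res.ok) throw new Error(`Model fetch failed: ${res.status}`);
    const body = (await res.json()) as { data: ModelInfo[] };
    return body.data;
  }

  // Stream a chat completion, yielding the raw JSON payload of each SSE event.
  async *streamChat(payload: object): AsyncGenerator<string> {
    const res = await fetch(`${this.baseUrl}/v1/chat/completions`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ ...payload, stream: true }),
    });
    if (!res.ok || !res.body) throw new Error(`Chat request failed: ${res.status}`);
    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffered = "";
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      buffered += decoder.decode(value, { stream: true });
      let nl: number;
      while ((nl = buffered.indexOf("\n")) >= 0) {
        const line = buffered.slice(0, nl).trim();
        buffered = buffered.slice(nl + 1);
        if (line.startsWith("data:") && !line.includes("[DONE]")) {
          yield line.slice("data:".length).trim();
        }
      }
    }
  }
}

// The Llama.cpp provider only has to point the base class at the local server.
class LlamaCppProvider extends OpenAICompatProvider {
  constructor(serverUrl = "http://localhost:8080") {
    super(serverUrl);
  }
}
```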

Setting Up the Project

  1. Clone the repository.
git clone https://github.com/your-org/llama-vscode-chat.git
  2. Install dependencies.
npm install
  3. Compile the extension.
npm run compile
  4. Open the project in VS Code and press F5 to run the extension in an Extension Development Host.

Usage

Install the extension from the Marketplace. Configure the Llama.cpp server URL via the Command Palette. Select the Llama.cpp provider in the chat interface. Start chatting with the served models.
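
If the server URL lives in user settings, the extension might read it with VS Code's standard configuration API; the section and key names below are hypothetical, so check the extension's contributed configuration for the real ones.

```typescript
import * as vscode from "vscode";

// Hypothetical setting names; the extension's actual keys may differ.
const serverUrl = vscode.workspace
  .getConfiguration("llamaCppProvider")
  .get<string>("serverUrl", "http://localhost:8080");
```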

References

  • Llama.cpp Documentation
  • VS Code Extension API