Llama.cpp Provider for GitHub Copilot Chat

This extension integrates Llama.cpp models into GitHub Copilot Chat in VS Code.

Features

  • Integrates a Llama.cpp server into VS Code's language model chat.
  • Streams responses as they are generated.
  • Handles tool calling for function invocations (see the sketch after this list).
  • Manages multiple models served by the Llama.cpp server.
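
To make the streaming and tool-calling items concrete, here is a minimal sketch of how streamed OpenAI-compatible chunks might be accumulated, assuming the server emits `choices[].delta` objects whose `tool_calls` arrive as indexed fragments. The type and function names are illustrative, not taken from the extension's source.

```typescript
// Sketch: accumulate text and tool calls from OpenAI-compatible streaming
// deltas. Names (StreamDelta, Accumulated, accumulate) are illustrative.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface StreamDelta {
  content?: string;
  tool_calls?: ToolCallDelta[];
}

interface Accumulated {
  text: string;
  toolCalls: Map<number, { id: string; name: string; args: string }>;
}

function accumulate(acc: Accumulated, delta: StreamDelta): void {
  // Plain text tokens are appended as they arrive.
  if (delta.content) acc.text += delta.content;
  // Tool-call fragments are keyed by index; merge each fragment into its entry.
  for (const tc of delta.tool_calls ?? []) {
    const entry = acc.toolCalls.get(tc.index) ?? { id: "", name: "", args: "" };
    if (tc.id) entry.id = tc.id;
    if (tc.function?.name) entry.name += tc.function.name;
    if (tc.function?.arguments) entry.args += tc.function.arguments;
    acc.toolCalls.set(tc.index, entry);
  }
}
```

Fragments are merged by `index` because an OpenAI-compatible stream may split a single call's `id`, `name`, and JSON `arguments` across many chunks.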

Requirements

  • VS Code version 1.104.0 or higher.
  • A running Llama.cpp server exposing its OpenAI-compatible API (see the example command below).
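
For reference, llama.cpp's bundled llama-server binary exposes the OpenAI-compatible API out of the box; a typical launch (the model path here is a placeholder) looks like:

llama-server -m ./models/your-model.gguf --port 8080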

Stack

  • TypeScript: A typed superset of JavaScript.
  • VS Code API: APIs for building extensions.

Design

The extension is built around a base provider class that speaks the OpenAI-compatible chat API. The Llama.cpp provider extends this base to connect to a local Llama.cpp server, handling model fetching, message conversion, and streaming responses. Tool calling is supported through the OpenAI-compatible function-calling format.
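
As a rough sketch of that layering (class names and endpoints shown here illustrate the pattern and are not the extension's actual source), the base class can own the HTTP and SSE plumbing while the Llama.cpp subclass only supplies the server URL:

```typescript
// Illustrative sketch of the provider layering described above.
interface ModelInfo {
  id: string;
}

abstract class OpenAICompatProvider {
  constructor(protected readonly baseUrl: string) {}

  // Fetch available models from the OpenAI-compatible /v1/models endpoint.
  async fetchModels(): Promise<ModelInfo[]> {
    const res = await fetch(`${this.baseUrl}/v1/models`);
    if (!res.ok) throw new Error(`Model fetch failed: ${res.status}`);
    const body = (await res.json()) as { data: ModelInfo[] };
    return body.data;
  }

  // Stream a chat completion, yielding the raw JSON payload of each SSE event.
  async *streamChat(payload: object): AsyncGenerator<string> {
    const res = await fetch(`${this.baseUrl}/v1/chat/completions`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ ...payload, stream: true }),
    });
    if (!res.ok || !res.body) throw new Error(`Chat request failed: ${res.status}`);
    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffered = "";
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      buffered += decoder.decode(value, { stream: true });
      let nl: number;
      while ((nl = buffered.indexOf("\n")) >= 0) {
        const line = buffered.slice(0, nl).trim();
        buffered = buffered.slice(nl + 1);
        if (line.startsWith("data:") && !line.includes("[DONE]")) {
          yield line.slice("data:".length).trim();
        }
      }
    }
  }
}

// The Llama.cpp provider only has to point the base class at the local server.
class LlamaCppProvider extends OpenAICompatProvider {
  constructor(serverUrl = "http://localhost:8080") {
    super(serverUrl);
  }
}
```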

Setting Up the Project

  1. Clone the repository.
git clone https://github.com/your-org/llama-vscode-chat.git
  2. Install dependencies.
npm install
  3. Compile the extension.
npm run compile
  4. Open the project in VS Code and press F5 to run the extension in an Extension Development Host.

Usage

Install the extension from the Marketplace. Configure the Llama.cpp server URL via the Command Palette. Select the Llama.cpp provider in the chat interface. Start chatting with the served models.
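
If the server URL lives in user settings, the extension might read it with VS Code's standard configuration API; the section and key names below are hypothetical, so check the extension's contributed configuration for the real ones.

```typescript
import * as vscode from "vscode";

// Hypothetical setting names; the extension's actual keys may differ.
const serverUrl = vscode.workspace
  .getConfiguration("llamaCppProvider")
  .get<string>("serverUrl", "http://localhost:8080");
```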

References

  • Llama.cpp Documentation
  • VS Code Extension API