Ollama Copilot: Your AI-Powered Coding Companion
Ollama Copilot is an AI-powered coding assistant for Visual Studio Code (VSCode), designed to boost productivity by offering intelligent code suggestions and configurations tailored to your current project's context. It harnesses recent advances in large language models (LLMs) to understand your coding needs, providing precise snippets, configurations, and insights that streamline your development workflow.
Workflow Graph
│
├── Start: Generate Retrieval Query
│ └── Retrieve Documents
│ ├───[Documents Found?]
│ │ ├── No ─────[Under Max Attempts?]
│ │ │ └── Yes ──> Transform Retrieval Query ────┐
│ │ │ └── No ───┐ │
│ │ └── Yes ───────────────────────────────────┐ │
│ └───[Loop Back After Transformation] <─────────┼──────────┘
│ │
├── Grade Documents <──────────────────────────────────┘
│ └───[Documents Relevant?]
│ ├─── No ──> Web Search ──┐
│ └─── Yes ──────────────────────────────────────┐
│ │
├── Context Management <───────────────────────────────┘
│ ├───[Context Exceeds Limit?]
│ │ ├── Yes ──> Summarize Context ─────────────┐
│ │ └── No ────────────────────────────────────┼──────────┐
│ └───[Loop Back If Summarization Needed] <──────┘ │
│ │
├── Generate <────────────────────────────────────────────────────┘
│ └───[Need To Extract Code?]
│ ├─── Yes ──> Extract Code ──┐
│ └─── No ───────────────────────────────────────┐
│ │
└── END <──────────────────────────────────────────────┘
Insights:
- Generate Retrieval Query: Begins by creating a document retrieval query from user input.
- Retrieve Documents: Fetches documents; if none are found and retrieval attempts remain, the query is refined and retried, otherwise the results move on to grading.
- Transform Retrieval Query: Refines the query for improved retrieval, then retries document fetching.
- Grade Documents: Evaluates document relevance; non-relevant results lead to a web search, while relevant ones move to context management.
- Web Search: Supplements context with web information when needed.
- Context Management: Prepares information for output; overly large contexts trigger context summarization.
- Summarize Context: Reduces context size, looping back for further refinement if necessary.
- Generate: Finalizes output based on context and user query, concluding the workflow.
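The control flow above can be sketched as a simple loop. This is a minimal illustration of the workflow, not the extension's actual implementation; the helper functions (retrieve, transform_query, grade, web_search, summarize, generate) are hypothetical stand-ins for the real pipeline stages:

```python
# Minimal sketch of the self-corrective RAG workflow described above.
# All helper callables are hypothetical stand-ins for the real stages.

MAX_ATTEMPTS = 3        # mirrors copilot.maxRetrievalAttempts
CONTEXT_LIMIT = 4096    # mirrors copilot.maxTokenLimit (here: characters)

def run_workflow(user_input, retrieve, transform_query, grade,
                 web_search, summarize, generate):
    query = user_input
    docs = []
    # Retry retrieval, refining the query each time, until documents
    # are found or the attempt budget is exhausted.
    for _ in range(MAX_ATTEMPTS):
        docs = retrieve(query)
        if docs:
            break
        query = transform_query(query)
    # Grade the results; fall back to a web search when nothing
    # retrieved is relevant.
    relevant = [d for d in docs if grade(d)]
    if not relevant:
        relevant = web_search(query)
    # Summarize repeatedly until the context fits the limit.
    context = "\n".join(relevant)
    while len(context) > CONTEXT_LIMIT:
        context = summarize(context)
    return generate(context, user_input)
```

With stub functions plugged in, the loop refines a failing query once, grades the retrieved documents, and generates from the surviving context.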
Features
- Intelligent Suggestions: Utilizes LLMs for context-aware code suggestions.
- Self-Corrective RAG: Improves query relevance and accuracy through optimized retrieval and generation.
- Model Customization: Select AI models to match your coding style or project requirements.
- Elasticsearch Vector Store: Manages document embeddings for enhanced context understanding.
- Project Structure Monitoring: Efficiently tracks project changes to keep suggestions up to date.
- File Parsing & Embeddings: Generates embeddings from files, stored in Elastic Search for quick access.
- Worker Pools: Ensures smooth performance through optimized file processing.
- System Templates: Offers customizable interaction templates to align with your workflow.
- Web Search Integration: Expands the scope of information retrieval via Tavily AI [optional].
Prerequisites
Before you install and use Ollama Copilot, ensure you have the following setup in your development environment:
- VSCode Version: 1.86.0 or higher.
- Ollama Service: The Ollama service must be installed and running on your system. Installation instructions are provided below.
- Elasticsearch Instance: Required by the vector store for semantic similarity searches.
Setup Guide
- Ollama Service Installation: Follow the instructions below to install and run the Ollama service on your system.
- Base URL Configuration: Set the Ollama service URL through the VSCode command palette.
- Extension Activation: Enable Ollama Copilot via the command palette.
- Select Model: Select the main model via the command palette.
- Run Elasticsearch Node: Follow the instructions below to run an Elasticsearch node on your system.
- Elasticsearch URL Configuration: Set the Elasticsearch URL through the VSCode command palette.
- Select Embeddings Model: Select the embeddings model via the command palette.
- Generate Embeddings: Generate the document embeddings for the vector store.
- Invoke Suggestions: Type "copilot" to receive AI-driven coding suggestions.
- Customize Settings: Options to tailor the Copilot to suit your project.
Installation
macOS and Linux
For macOS and Linux users, Ollama can be installed natively with the following steps:
- Download Ollama: Go to the official Ollama website to download the software.
- Install Dependencies: Ensure all necessary dependencies are installed on your system.
- Run Ollama: Use the terminal to start the Ollama service by executing:
ollama serve
- Run Elasticsearch Node: Set up an Elasticsearch server as described in the official Elastic documentation.
Windows, macOS, and Linux (Docker)
For users preferring Docker, or Windows users:
Prerequisites: Make sure Docker is installed on your system. Visit the Docker documentation for installation instructions.
Install Ollama Service: Pull and run the official Ollama image (the standard command from the Ollama Docker documentation):
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
This starts the Ollama service, binding it to port 11434 on localhost by default.
Run Elasticsearch Node:
- Open a terminal or command prompt.
- Pull the Elasticsearch Docker image by running the following command:
docker pull docker.elastic.co/elasticsearch/elasticsearch:8.12.2
- Start a single-node Elasticsearch container with the following command:
docker run -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" -e "xpack.security.http.ssl.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.12.2
This command starts an Elasticsearch container and exposes port 9200. The security features are intentionally disabled to simplify setup; this configuration is intended for local development environments only.
- Verify that the Elasticsearch instance is up by running the following command:
curl -X GET "http://localhost:9200"
Configuration
Configure the "Ollama Copilot" extension in VS Code to connect with your Ollama service:
{
"copilot.baseUrl": "http://localhost:11434",
"copilot.model": "llama2",
"copilot.modelTemperature": "0.2", // string
"copilot.systemPrompt": "Comprehensive coding assistant...",
"copilot.cpuCores": 4,
"copilot.elasticSearchUrl": "http://127.0.0.1:9200",
"copilot.vectorStoreIndexName": "copilot_vector_store",
"copilot.tavilySearchApiKey": "<<TAVILY_API_KEY>>",
"copilot.maxTokenLimit": 4096,
"copilot.embeddingsModel": "nomic-embed-text",
"copilot.maxRetrievalAttempts": 1,
"copilot.documentRetrievalThreshold": 0.001,
"copilot.printDebuggingLogs": false,
"copilot.kNearestNeighbors": 3,
"copilot.similarityMeasure": "cosine",
"copilot.outputType": "code"
}
Extension Settings
This extension contributes the following settings:
- copilot.baseUrl: Base URL for the Ollama service, allowing the extension to communicate with your Ollama instance.
- copilot.model: The AI model used to generate code suggestions. The selection can be updated at any time to match your project's requirements.
- copilot.modelTemperature: Adjusts the creativity and variability of the AI's code suggestions, balancing innovative and practical solutions. Note that the value is given as a string (e.g. "0.2").
- copilot.systemPrompt: The system prompt that guides the AI to generate responses aligned with your project context and coding standards.
- copilot.cpuCores: Number of CPU cores dedicated to processing tasks such as hashing and embeddings generation.
- copilot.elasticSearchUrl: URL of the Elasticsearch instance used for vector storage and retrieval.
- copilot.vectorStoreIndexName: Name of the Elasticsearch index in which document embeddings are stored and retrieved.
- copilot.tavilySearchApiKey: API key used to authenticate Tavily AI web search queries [optional].
- copilot.maxTokenLimit: Upper limit on the input token count accepted by the LLM; it does not directly affect response size.
- copilot.embeddingsModel: Model used to generate document and query embeddings, enhancing the context awareness of suggestions.
- copilot.maxRetrievalAttempts: Maximum number of document retrieval attempts before the workflow moves on.
- copilot.documentRetrievalThreshold: Minimum similarity score a document must reach to be considered relevant, ensuring high-quality suggestions.
- copilot.printDebuggingLogs: Enables debugging logs in the console, aiding troubleshooting and understanding the AI's decision-making process.
- copilot.kNearestNeighbors: Number of nearest neighbors (k) returned per k-NN query against Elasticsearch, affecting the breadth of retrieved context.
- copilot.similarityMeasure: Similarity measure (e.g. "cosine") used by the Elasticsearch vector store to compare embeddings.
- copilot.outputType: Type of output generated by the AI: "code" for code snippets or configurations relevant to the development context, or "text" for prose responses.
Note
For efficient document retrieval and enhanced suggestions, ensure the code is well-commented. These comments are leveraged to generate precise embeddings, significantly improving the relevance of suggestions provided.
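As a toy illustration of why comments matter (this is not the extension's actual parser), a retrieval pipeline might collect comment text when building the input that gets embedded, so descriptive comments directly shape the vectors used for retrieval:

```python
import re

def embedding_input(source: str) -> str:
    """Collect comment text from Python-style source so it can be
    embedded alongside the code. Toy example only: it recognizes
    '#' comments and ignores string literals."""
    comments = re.findall(r"#\s*(.+)", source)
    return " ".join(comments)
```

Well-commented code therefore yields richer embedding input than bare statements, improving the chance that relevant files are retrieved.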
Known Issues
For any known issues or troubleshooting, please check the GitHub repository issues section.
Release Notes
1.4.0
- Elasticsearch Integration: Transitioned from Redis to Elasticsearch for the Vector Store, enhancing document embedding management and retrieval.
- Extended Model Selection: Expanded the range of AI models available for generating code suggestions, including specialized models for embeddings.
- Advanced Configuration Options: Introduced additional settings for the Elasticsearch URL, Tavily Search API integration, and more, offering greater flexibility in customizing the extension.
- Performance Optimization: Improved worker pool management and CPU usage configuration, ensuring smoother performance and faster response times.
- Enhanced Debugging Support: Added options to enable detailed debugging logs, assisting users in diagnosing and resolving issues more effectively.
- Refined User Interface: Updates to the command palette options and settings menu, making it easier for users to navigate and adjust the extension's features.
- Flexible Output Customization: Introduced a setting that lets users tailor the output to their needs, choosing between code snippets and textual explanations.
Enjoy coding with Ollama Copilot, your AI-powered coding assistant!