Ollama Assistant

Godwin Kimani | 254 installs | (0) | Free

VS Code extension for interacting with Ollama AI models
Installation
Launch VS Code Quick Open (Ctrl+P), paste the following command, and press Enter.

Ollama Assistant

What does this extension do?

Enhance your coding workflow with effortless access to Ollama models from the comfort of your Visual Studio Code editor.

How to Use

  1. Install Ollama:

    • Ensure you have Ollama installed and running on your system. Follow the installation instructions on the Ollama website for your operating system.
    • Make sure your Ollama server is accessible at the configured API URL (default is http://localhost:11434/api/generate).
  2. Install the Extension:

    • Open Visual Studio Code.
    • Go to the Extensions view (Ctrl+Shift+X or Cmd+Shift+X).
    • Search for Ollama Assistant (or vscode-ollama-assistant).
    • Click Install.
  3. Open the Ollama Assistant:

    • Command Palette: Press Ctrl+Shift+P (or Cmd+Shift+P on macOS) to open the Command Palette.

    • Type Ollama: Open Ollama Assistant and select the command.

    • Status Bar: Click the Ollama status bar item (robot icon) in the bottom-left corner of the VS Code window.

    • Sidebar: Find the Ollama icon in the Explorer sidebar. Click on it to open the Ollama view.

  4. Start Chatting:

    • Once the Ollama view is open (either as a panel or in the sidebar), you will see a chat interface.
    • Type your prompt in the input field at the bottom, then click Send or press Enter.
    • Ollama will process your prompt and display the response in the chat window (a sketch of the underlying API request follows this list).
  5. Clear History:

    • Click the Clear History button in the chat interface to clear the current chat history. This will also clear the saved history for the current VS Code session.
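
Under the hood, the chat panel talks to the Ollama HTTP API at the configured ollama.apiUrl. The extension's own source isn't reproduced here; the sketch below shows a minimal non-streaming call to the public Ollama /api/generate endpoint with the same parameters the settings expose. The mapping from settings to request fields (noted in the comments) and the use of Node 18+'s global fetch are assumptions for illustration only.

// Minimal sketch (illustrative only): a non-streaming request to the Ollama
// /api/generate endpoint. The request fields follow the public Ollama API;
// the mapping to the extension's settings is an assumption.
async function generate(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {   // ollama.apiUrl
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama2",        // ollama.model
      prompt,                 // the text typed into the chat input
      stream: false,          // ask for a single JSON object instead of a stream
      options: {
        temperature: 0.8,     // ollama.temperature
        top_p: 0.9,           // ollama.topP
        top_k: 40,            // ollama.topK
        num_predict: 256      // ollama.numPredict
      }
    })
  });
  const data = (await res.json()) as { response: string };
  return data.response;       // the model's reply, shown in the chat window
}

The same request can also be issued from a terminal (for example with curl) to confirm that the server at ollama.apiUrl is reachable before using the extension.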

Extension Settings

You can configure the Ollama Assistant extension settings in your VS Code settings (File > Preferences > Settings, or Code > Settings > Settings on macOS). Search for Ollama to find the extension settings.

Here's a breakdown of the available settings:

Each setting below is listed with its type and default value.

  • ollama.apiUrl (string, default http://localhost:11434/api/generate): URL for the Ollama API. Specify the URL of your running Ollama API server; modify this if your Ollama server is running on a different host or port.
  • ollama.model (string, default llama2): Default model to use for Ollama. Set the default Ollama model to be used for chat interactions, and ensure the model is available in your Ollama installation (you can pull models using ollama pull <model_name> in your terminal).
  • ollama.temperature (number, default 0.8): Temperature for the Ollama model (0 to 1, higher for more creative). Controls the randomness of the model's output: values closer to 1 make the output more random and creative, while values closer to 0 make it more deterministic and focused.
  • ollama.numCores (integer, default -1): Number of CPU cores to use for Ollama. The default of -1 allows Ollama to use all available cores, which is generally recommended for optimal performance; you can limit this if you want to reserve CPU resources for other tasks.
  • ollama.topP (number, default 0.9): Top-P sampling parameter (nucleus sampling, 0 to 1). Controls the cumulative probability of the tokens to consider during generation: a value of 0.9 means the model will consider the smallest set of tokens whose cumulative probability exceeds 0.9. Lower values make the output more focused.
  • ollama.topK (integer, default 40): Top-K sampling parameter (integer, e.g. 40). Limits the model to the top K most likely tokens at each step of generation; lower values make the output more focused and less diverse.
  • ollama.numPredict (integer, default 256): Maximum number of tokens to predict. Sets the maximum length of the response generated by the Ollama model; increase this value if you expect longer responses.

Example Settings in settings.json:

{
    "ollama.apiUrl": "http://my-ollama-server:11434/api/generate",
    "ollama.model": "mistral",
    "ollama.temperature": 0.7,
    "ollama.numCores": 4,
    "ollama.topP": 0.85,
    "ollama.topK": 30,
    "ollama.numPredict": 512
}
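
These are standard VS Code settings, so an extension reads them through the VS Code configuration API. The snippet below is a sketch of how that typically looks, not the actual Ollama Assistant source; the keys and defaults match the table above, and everything else is illustrative.

import * as vscode from "vscode";

// Sketch: reading the Ollama Assistant settings with the standard VS Code
// configuration API. Keys and defaults match the table above.
const config = vscode.workspace.getConfiguration("ollama");
const apiUrl = config.get<string>("apiUrl", "http://localhost:11434/api/generate");
const model = config.get<string>("model", "llama2");
const temperature = config.get<number>("temperature", 0.8);
const topP = config.get<number>("topP", 0.9);
const topK = config.get<number>("topK", 40);
const numPredict = config.get<number>("numPredict", 256);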

Issues and Feature Requests

If you encounter any issues or have feature requests, please create an issue on GitHub.

Experience Ollama in other ways

  • OllamaRAG: a basic RAG tool that lets you upload code files, process them, and query them using Ollama models.
  • OllamaCoder: a basic, interactive code analysis and assistant interface for Ollama, designed for developers to get instant code reviews, suggestions, and improvements using coding language models.