Wingman - AI Coding Assistant
The Wingman-AI extension brings high-quality, AI-assisted coding right to your computer. It's 100% free and private, which means data never leaves your machine!

🚀 Getting Started

Choosing an AI Provider

We recommend starting with Ollama using the Deepseek model(s); see why here or here.
That's it! On launch, this extension validates that the models in its VSCode settings are configured correctly. If you wish to customize which models run, see the FAQ section.

Features

Code Completion

The AI looks for natural pauses in typing to decide when to offer code suggestions (keep in mind that the speed is limited by your machine). The code completion feature also analyzes comments you type and generates suggestions based on that context.

Code Completion Disable / HotKey

We understand that the code completion feature can sometimes be too aggressive, which may strain your system's resources during local development. To address this, there is an option to disable automatic code completion. Because on-demand completion is still useful, there is also a hotkey that lets you trigger code completion manually. When you need assistance, simply press the hotkey.

Interactive Chat

Talk to the AI naturally! It uses open files as context to answer your question, or you can select a section of code to use as context. Chat will also analyze comments you type and ge

AI Providers

Ollama

Ollama is a free and open-source AI model provider that lets users run their own local models.

Why Ollama?

Ollama was chosen for its simplicity: users can pull a number of models in different configurations and update them at will. Ollama pulls models optimized for your system architecture; however, if you do not have a GPU-accelerated machine, models will be slower.

Setting up Ollama

Follow the directions on the Ollama website. Ollama has a number of open-source models available that are capable of writing high-quality code. See getting started for how to pull and customize models.

Supported Models

The extension uses a separate model for chat and code completion. Different types of models have different strengths, so mixing and matching offers the best result.
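Because chat and code completion each use their own model, both need to be available locally before the extension can use them. The following is a minimal sketch, not the extension's actual validation code, of how such a check could work against the JSON returned by Ollama's `GET /api/tags` endpoint, which lists locally pulled models. The function name `missing_models` and the sample payload are hypothetical; the model names are the defaults listed under Troubleshooting.

```python
from typing import Iterable, List


def missing_models(tags_response: dict, configured: Iterable[str]) -> List[str]:
    """Return the configured model names that Ollama has not pulled yet.

    `tags_response` is the parsed JSON body of Ollama's GET /api/tags,
    which lists local models as {"models": [{"name": "..."}, ...]}.
    """
    available = {m["name"] for m in tags_response.get("models", [])}
    return [name for name in configured if name not in available]


# Hypothetical sample payload; a real one would come from
# http://localhost:11434/api/tags on a default Ollama install.
sample = {
    "models": [
        {"name": "deepseek-coder:6.7b-base-q8_0"},
    ]
}

configured = [
    "deepseek-coder:6.7b-base-q8_0",      # code completion model
    "deepseek-coder:6.7b-instruct-q8_0",  # chat model
]

print(missing_models(sample, configured))
```

Any name the function returns could then be fetched with `ollama pull <name>`.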
NOTE - You can use any quantization of a supported model; you are not limited to a specific one. Example: deepseek-coder:6.7b-instruct-q4_0

Supported Models for Code Completion:
Supported Models for Chat:
OpenAI

OpenAI is supported! You can use the following models:
NOTE - Unlike with Ollama, your data is not private and will not be sanitized prior to being sent.

Anthropic

Anthropic is supported! You can use the following models:
NOTE - Unlike with Ollama, your data is not private and will not be sanitized prior to being sent.

Hugging Face

Hugging Face supports hosting and training models, and also supports running many models (under 10GB) for free! All you have to do is create a free account.

Setting up Hugging Face

Once you have a Hugging Face account and an API key, open the VSCode settings pane for this extension, "Wingman" (see FAQ). Then select "HuggingFace" as the AI Provider and add your API key under the HuggingFace section:
Supported Models

The extension uses a separate model for chat and code completion. Different types of models have different strengths, so mixing and matching offers the best result.

Supported Models for Code Completion:
Supported Models for Chat:
NOTE - Unlike with Ollama, your data is not private and will not be sanitized prior to being sent.

FAQ
Troubleshooting

This extension leverages Ollama due to its simplicity and its ability to deliver the right container optimized for your running environment. However, good AI performance relies on your machine specs, so if you cannot use GPU acceleration, responses may be slow.

During startup, the extension verifies the models you have configured in the VSCode settings pane for this extension. The extension has the following defaults:

Code Model - deepseek-coder:6.7b-base-q8_0
Chat Model - deepseek-coder:6.7b-instruct-q8_0

These models need enough RAM to run correctly; you should have at least 12GB of RAM on your machine if you are running them. If you don't have enough RAM, choose a smaller model, but be aware that it won't perform as well. Also see information on model Quantization.

Release Notes

To see the latest release notes, check out our releases page.

If you like the extension, please leave a review! If you don't, open an issue and we'd be happy to assist!

Enjoy!