VSCode N-Gram Code Suggester

⚠️ Experimental Project – Not for Production Use
This repository demonstrates a research prototype that implements a simple n‑gram language model for code completion in VS Code. It is not guaranteed to be stable, fast, or secure enough for production workloads. Use at your own risk.

Overview

VSCode N‑Gram Code Suggester is a proof‑of‑concept that combines:

Component	Description
VS Code extension	Hooks into the editor’s Inline Suggest API and uses the trained n‑gram model to surface context‑aware completions.
Model file	A lightweight model that stores trigram/tag frequencies.

The idea is to show that even a single‑sentence context can yield useful suggestions, without the heavy machinery of large neural models.

Current functional

Autocompletion generation based on a pre-trained model
Project context awareness (open documents are indexed and included in autocompletion generation)
Support for model training and autocompletion generation for: C#, JavaScript, TypeScript, Python

⚠️ Important about configuration
If you use a large model (more than 2 million patterns), you may need to disable "Fuzzy search" and "Use Smoothing" for better performance. If suggestions are still slow, enable "Use Trigger Characters" option. Pre-trained model in builded extension is so large.
For own smaller models, keep these settings enabled for better suggestion quality.

Extension Settings

Below is a quick reference to all user‑configurable options for the extension.
Add any of these to your workspace or user settings.json to tweak the behaviour.

Setting	Type	Default	Constraints	Description
codeSuggester.modelPath	string	`./models/model.json.gz`	–	Path to the trained model file. Supports plain `.json` or gzipped `.json.gz`.
codeSuggester.maxSuggestions	number	`5`	`1 – 10`	Maximum number of suggestions displayed in the IntelliSense list.
codeSuggester.maxFuzzyChecks	number	`2000`	`≥ 1000`	Maximum number of fuzzy‑search checks performed. Higher values give better matches but can be slow on large models.
codeSuggester.minConfidence	number	`0.2`	`0.0 – 1.0`	Minimum confidence threshold for a suggestion to be shown.
codeSuggester.enableFuzzyMatching	boolean	`false`	–	Turns on fuzzy matching for similar code patterns. ⚠️Use only on small models⚠️
codeSuggester.useSmoothing	boolean	`false`	–	Enables smoothing algorithms to better handle rare n‑grams. ⚠️Use only on small models⚠️
codeSuggester.useTriggerCharacters	boolean	`false`	–	When enabled, suggestions are only triggered when the cursor is placed on a trigger character (`. , ( ) [ { : ; =`). Useful if auto‑suggestions feel sluggish.
codeSuggester.useProjectContext	boolean	`true`	-	Use project context from open files for suggestions
codeSuggester.updateOnFileChange	boolean	`false`	-	Update project model when files are modified (may impact performance)

Model Training

For information on training your own models, please refer to the full documentation on GitHub.

Build Extension from sources

For information how to build extension from sources, please refer to the full documentation on GitHub.

License

Distributed under the MIT License. See the LICENSE file for details.

Code Suggester

nb47