ML Guard

A VS Code extension that helps AI/ML engineers, data scientists, and MLOps engineers detect common mistakes in Python ML projects.

Features

Workspace Scanning: Analyze all Python files in your workspace
File Scanning: Scan individual Python files
ML-Specific Checks: Detect data leakage, missing validation, and other poor practices
Scoring System: Get a score from 0-100 based on code quality
Auto Fixes: Apply safe automatic fixes for common issues
Export Reports: Generate JSON, Markdown, or HTML reports
Dashboard: Interactive webview dashboard

Installation

Clone this repository
Run npm install in the extension directory
Press F5 to launch the Extension Development Host
The extension is loaded in the Extension Development Host window

Usage

Commands

ML Guard: Scan Workspace - Scan all Python files in workspace
ML Guard: Scan Current File - Scan the currently open Python file
ML Guard: Export Report - Export scan results
ML Guard: Auto Fix Safe Issues - Apply automatic fixes
ML Guard: Open Dashboard - Open the interactive dashboard

Alternatively, right-click any Python file in the Explorer and select "Scan with ML Guard".

Configuration

Configure ML Guard through VS Code settings:

mlguard.pythonPath: Path to Python executable
mlguard.autoScan: Automatically scan files on save
mlguard.severityFilter: Filter issues by severity
mlguard.enableFastAPIChecks: Enable FastAPI checks
mlguard.enableTorchChecks: Enable PyTorch checks

Checks Performed

Data Science

Target column used in features
Data leakage after scaling
Missing null handling
Duplicate rows ignored
No feature encoding
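
The "Data leakage after scaling" item most likely refers to fitting a scaler on the full dataset before it is split; that reading is an assumption, since this README does not define the check. The pure-Python sketch below shows the leak-free order: split first, compute scaling statistics on the training rows only, then apply them to both splits. The helper names (fit_scaler, transform) and the data are hypothetical and are not part of ML Guard.

```python
import statistics

def fit_scaler(values):
    """Return (mean, std) computed from the given values only."""
    return statistics.fmean(values), statistics.pstdev(values)

def transform(values, mean, std):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]

# Hypothetical feature column; the last two rows are the held-out test set.
feature = [1.0, 2.0, 3.0, 4.0, 100.0, 200.0]
train, test = feature[:4], feature[4:]

# Leaky: statistics computed on all rows, so test values shape the scaler.
leaky_mean, leaky_std = fit_scaler(feature)

# Leak-free: fit on the training split only, then reuse for the test split.
mean, std = fit_scaler(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)

print(f"leaky mean={leaky_mean:.2f}, train-only mean={mean:.2f}")
```

In a scikit-learn project the same ordering applies: call train_test_split first, then fit the scaler on the training split and only transform the test split.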

ML Checks

Missing train_test_split
No random_state
No cross validation
No metrics calculation
Accuracy as the only metric on imbalanced data
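
To illustrate why relying on accuracy alone is flagged for imbalanced data, here is a small dependency-free sketch. The labels are made up, and the accuracy and recall helpers are hand-written stand-ins for whatever metrics library a project actually uses.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)

# Hypothetical labels: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
# A "model" that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy={acc:.2f} recall={rec:.2f}")  # accuracy=0.95 recall=0.00
```

The classifier never finds a positive example, yet its accuracy looks strong. Reporting recall, precision, or F1 alongside accuracy exposes the problem.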

Deep Learning

model.train()/eval() misuse
No torch.no_grad()
GPU not used when available
Missing seed
No checkpoint saving
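
This README does not describe how ML Guard detects these issues. Purely as an illustration of how a static check for "No torch.no_grad()" could work, the sketch below uses Python's standard ast module to look for a with block whose context manager is a call to an attribute named no_grad. The function name uses_no_grad is hypothetical, and the approach is an assumption rather than the extension's actual logic.

```python
import ast

def uses_no_grad(source: str) -> bool:
    """Return True if the source has a `with <obj>.no_grad():` block."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.With):
            continue
        for item in node.items:
            call = item.context_expr
            if (
                isinstance(call, ast.Call)
                and isinstance(call.func, ast.Attribute)
                and call.func.attr == "no_grad"
            ):
                return True
    return False

# The snippets are only parsed, never executed, so torch need not be installed.
flagged = """
model.eval()
preds = model(batch)
"""

clean = """
model.eval()
with torch.no_grad():
    preds = model(batch)
"""

print(uses_no_grad(flagged), uses_no_grad(clean))  # False True
```

The clean snippet also shows the pattern the check encourages: switch to evaluation mode with model.eval() and wrap inference in torch.no_grad(). This sketch does not recognize the decorator form @torch.no_grad(), which a fuller check would need to handle.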

FastAPI Model Serving

Model loaded on every request
No startup preload
No health endpoint
No exception handling
Blocking inference
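
The first two items concern loading the model inside the request path. As a framework-free sketch of the alternative, the example below caches a loaded model so that repeated requests reuse it. The names load_model_from_disk, get_model, and handle_request are hypothetical, and the "model" is a dummy function.

```python
import functools

LOAD_COUNT = 0  # Counts how often the expensive load actually runs.

def load_model_from_disk():
    """Stand-in for an expensive model load (hypothetical)."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return lambda x: x * 2  # Dummy "model" that doubles its input.

@functools.lru_cache(maxsize=1)
def get_model():
    """Load the model on first use and reuse it for every later call."""
    return load_model_from_disk()

def handle_request(value):
    """Request handler that reuses the cached model instead of reloading."""
    model = get_model()
    return model(value)

results = [handle_request(v) for v in (1, 2, 3)]
print(results, LOAD_COUNT)  # [2, 4, 6] 1
```

In a FastAPI application, one common arrangement is to call a loader like get_model() from a startup or lifespan handler so the first request does not pay the loading cost.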

MLOps

requirements.txt missing
No Dockerfile
No tests
No logging
No MLflow/model registry

Performance

pandas iterrows loops
Nested loops
Repeated file reads
Unnecessary copies
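
As an example of the kind of rewrite the "Nested loops" item points toward, the sketch below replaces a quadratic membership scan with a set lookup. Which nested loops ML Guard actually flags is not specified here, so the scenario (finding IDs shared between two lists) and the function names are assumptions.

```python
def common_ids_nested(a, b):
    """O(len(a) * len(b)) worst case: may scan all of b for each element of a."""
    result = []
    for x in a:
        for y in b:
            if x == y:
                result.append(x)
                break
    return result

def common_ids_set(a, b):
    """O(len(a) + len(b)): one pass to build the set, one pass to filter."""
    b_set = set(b)
    return [x for x in a if x in b_set]

train_ids = [1, 2, 3, 4, 5]
test_ids = [4, 5, 6]
print(common_ids_nested(train_ids, test_ids), common_ids_set(train_ids, test_ids))
# [4, 5] [4, 5]
```

Both functions return the same result. The set-based version avoids rescanning the second list for every element of the first, which matters once the lists grow large.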

Publishing

Update package.json with your publisher name
Run vsce package to create a .vsix file
Run vsce publish to publish to the VS Code Marketplace

License

MIT