Kabrino - Cloudera AI Manager for VS Code
Kabrino is a Visual Studio Code extension that integrates with Cloudera AI (formerly Cloudera Machine Learning, CML) to manage projects, jobs, experiments, and AutoML workflows directly from your editor.
Features
- 🔌 Connect to CML: Configure your CML instance with base URL and API key
- 📁 Browse Projects: View all accessible CML projects in a tree view
- 🎯 Select Active Project: Choose which project to work with
- 📂 File Management: Browse, upload, download, and manage project files
- 🚀 Run Python Files: Execute Python files directly in CML jobs
- 📊 Monitor Jobs: View jobs and their execution history
- 🧪 Experiments: Create, view, and manage ML experiments with run tracking
- 🤖 AutoML Projects: Create complete AutoML projects with pre-configured workflows
- ⚡ Real-time Status: Track job execution with live status updates
- 📝 DSN Management: Switch between different CML instances seamlessly
Getting Started
Prerequisites
- Access to a Cloudera Machine Learning instance
- A valid CML API key
Installation
- Install the extension from the VS Code Marketplace (or build from source)
- Reload VS Code to activate the extension
Configuration
- Click on the Charmeleon icon in the Activity Bar
- Click the gear icon (⚙️) to configure connection
- Enter your CML credentials:
- Base URL: Your CML instance URL (e.g., https://ml-xxx.cloudera.site)
- API Key: Your CML API key
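A malformed base URL is an easy mistake to make at this step. As a minimal sketch (the helper `normalize_base_url` is hypothetical and not part of Kabrino), you could check a URL in Python before pasting it into the configuration:

```python
from urllib.parse import urlparse


def normalize_base_url(raw: str) -> str:
    """Return a cleaned-up base URL, or raise ValueError if it looks wrong."""
    # Drop surrounding whitespace and any trailing slash.
    url = raw.strip().rstrip("/")
    parsed = urlparse(url)
    # Require an explicit http(s) scheme and a host name.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not a valid base URL: {raw!r}")
    return url


print(normalize_base_url(" https://ml-xxx.cloudera.site/ "))
```

Whether the extension itself applies a similar normalization is not documented here, so entering the URL with its scheme and without a trailing slash is the safest choice.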
How to Get Your API Key
- Log in to your CML instance
- Click on your user profile (top right corner)
- Navigate to "Settings" or "API Keys"
- Generate a new API key
- Copy and paste it into the Kabrino configuration
Usage
Selecting a Project
- Open the "CML Projects" view in the Kabrino sidebar
- Click on any project to select it as the active project
- All other views (Files, Jobs, Experiments) will update to show content for the selected project
- Click on a project name to view detailed information including:
- Project metadata (name, description, owner, creation date)
- Team collaborators with their roles
- Environment variables (view, add, delete)
Managing Project Environment Variables
Control environment variables for your CML projects:
- View Variables: Click on a project to see all configured environment variables
- Add Variable: Click the "+" button and enter key-value pairs
- Delete Variable: Click the trash icon next to any variable
- Create with Variables: Include environment variables when creating new projects
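The sketch below shows how a Python script might read such variables at run time. It assumes that project environment variables are exposed to the job process as ordinary environment variables; the names `DATA_BUCKET` and `BATCH_SIZE` and the helper `get_setting` are hypothetical.

```python
import os
from typing import Optional


def get_setting(name: str, default: Optional[str] = None) -> str:
    """Read an environment variable, falling back to a default if given."""
    value = os.environ.get(name, default)
    if value is None:
        # Fail early with a clear message instead of passing None around.
        raise KeyError(f"Environment variable {name} is not set")
    return value


# Hypothetical variable names, for illustration only.
bucket = get_setting("DATA_BUCKET", "local-data")
batch_size = int(get_setting("BATCH_SIZE", "32"))
print(bucket, batch_size)
```

Supplying defaults keeps the script runnable on your local machine, where the project variables are usually absent.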
Managing Files
The "CML Files" view shows all files in your selected project:
- Browse Files: Navigate through project directories
- Upload Files:
- Click the upload button (☁️↑) to select and upload multiple files at once
- Drag & drop one or multiple files from your system directly into folders
- Right-click on a folder → "Upload File to Folder"
- Download Files: Click on any file to open and download it locally
- Multi-Selection: Hold Ctrl (Windows) or Cmd (Mac) and click to select multiple files/folders
- Delete Files:
- Select one or multiple items and press the Delete key (labeled Supr on Spanish keyboards)
- Or right-click → "Delete File" / "Delete Folder"
- Bulk Operations: Select multiple items with Ctrl+Click, then upload or delete them all at once
- Refresh: Click the refresh icon to reload the file tree
Running Python Files
There are two ways to run a Python file in CML:
- From the context menu: right-click on any .py file in the editor and select "Run Python File in CML"
- From the editor toolbar: open any .py file and click the play icon (▶️)
The extension will:
- Upload the file to your CML project
- Create or find a suitable job
- Execute the job
- Monitor the execution
- Notify you when complete
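The monitoring step can be pictured as a polling loop. This is only a sketch of the general pattern, not Kabrino's implementation: the `get_status` callable, the terminal status names, and the interval and timeout defaults are all assumptions.

```python
import time
from typing import Callable

# Assumed terminal status names; not taken from the Kabrino source.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED"}


def wait_for_run(
    get_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 600.0,
) -> str:
    """Poll get_status() until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Run still in status {status!r} after {timeout}s")
        time.sleep(interval)


# Usage with a fake status source standing in for a real API call.
fake_statuses = iter(["SCHEDULING", "RUNNING", "SUCCEEDED"])
print(wait_for_run(lambda: next(fake_statuses), interval=0))
```

Passing the status source in as a callable keeps the loop independent of any particular API client, which also makes it easy to test with canned values.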
Managing Jobs
- View all jobs in the "CML Jobs" panel
- Expand a job to see its execution history (up to 20 recent runs, sorted by date)
- See real-time status with emoji indicators:
- ✅ Success
- ❌ Failed
- ⏹️ Stopped
- 🔄 Running
- ⏳ Scheduling
- ⚪ Unknown
- Each run shows execution duration or "⏱️ Running" for active runs
- Right-click on a job to:
- Run Job: Execute the job and monitor in real-time
- Stop Job: Stop a currently running job
- View Details: Open detailed information and full run history
- Delete Job: Remove the job from the project
- Right-click on a specific run to:
- Stop Run: Stop an active execution (a run whose status is ENGINE_SCHEDULING, SCHEDULING, RUNNING, or STARTING)
- Jobs automatically expand when executed to show the new run
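The rules in this list can be sketched as a few small Python helpers. The stoppable statuses come from the list above; the names `SUCCEEDED`, `FAILED`, and `STOPPED`, the `created_at` field, and the newest-first ordering are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Icon per status; any status missing from this table maps to "unknown".
STATUS_ICONS = {
    "SUCCEEDED": "✅",   # assumed status name
    "FAILED": "❌",      # assumed status name
    "STOPPED": "⏹️",     # assumed status name
    "RUNNING": "🔄",
    "SCHEDULING": "⏳",
}
UNKNOWN_ICON = "⚪"

# Statuses in which a run can still be stopped (from the list above).
STOPPABLE = {"ENGINE_SCHEDULING", "SCHEDULING", "RUNNING", "STARTING"}


def icon_for(status: str) -> str:
    return STATUS_ICONS.get(status.upper(), UNKNOWN_ICON)


def can_stop(status: str) -> bool:
    return status.upper() in STOPPABLE


def recent_runs(runs: list, limit: int = 20) -> list:
    """Return at most `limit` runs, newest first (ordering is an assumption)."""
    return sorted(runs, key=lambda r: r["created_at"], reverse=True)[:limit]


# Usage with made-up runs, one per hour.
start = datetime(2024, 1, 1)
runs = [{"id": i, "created_at": start + timedelta(hours=i)} for i in range(25)]
print(len(recent_runs(runs)), icon_for("running"), can_stop("STARTING"))
```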
Working with Experiments
The "CML Experiments" view lets you manage ML experiments:
- View Experiments: See all experiments in the selected project
- View Runs: Expand an experiment to see its runs
- Run Status: Track experiment runs with status indicators
- Create Experiments: Use the API to programmatically create experiments
Creating AutoML Projects
Quickly set up a complete AutoML workflow:
- Right-click on the "CML Projects" view
- Select "Create AutoML Project"
- Kabrino will:
- Create a new CML project
- Install required AutoML packages
- Upload helper scripts (feature selection, hyperparameter tuning, model selection)
- Create the Driver AutoML job
- Upload AutoML configuration YAML
- Your AutoML project is ready to use!
Managing Multiple CML Instances
Switch between different CML environments:
- Click the gear icon (⚙️) to configure connection
- Enter a DSN name for the connection
- Configure base URL and API key
- Use the dropdown in the Projects view to switch between saved DSNs
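Conceptually, each saved connection is a DSN name pointing at a base URL and an API key, with one DSN active at a time. The sketch below models that idea in Python; the class names and fields are hypothetical and do not describe how Kabrino actually stores its connections.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Connection:
    base_url: str
    api_key: str


@dataclass
class ConnectionStore:
    """Hypothetical in-memory model of saved connections keyed by DSN name."""
    connections: Dict[str, Connection] = field(default_factory=dict)
    active_dsn: str = ""

    def save(self, dsn: str, connection: Connection) -> None:
        self.connections[dsn] = connection
        # The first saved connection becomes the active one.
        if not self.active_dsn:
            self.active_dsn = dsn

    def switch(self, dsn: str) -> Connection:
        if dsn not in self.connections:
            raise KeyError(f"No saved connection named {dsn!r}")
        self.active_dsn = dsn
        return self.connections[dsn]


# Usage with placeholder values.
store = ConnectionStore()
store.save("dev", Connection("https://ml-dev.example.com", "dev-key"))
store.save("prod", Connection("https://ml-prod.example.com", "prod-key"))
print(store.switch("prod").base_url)
```

Keying connections by DSN name means switching instances only changes which entry is active; the saved credentials themselves stay untouched.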
Commands
All commands are available through the Command Palette (Ctrl+Shift+P / Cmd+Shift+P):
Configuration
Kabrino: Configure CML Connection - Set up your CML credentials
Projects
Kabrino: Refresh Projects - Reload the projects list
Kabrino: Select Project - Choose a project to work with
Kabrino: Create AutoML Project - Set up a complete AutoML workflow
Files
Kabrino: Refresh Files - Reload the file tree
Kabrino: Upload File to CML - Upload a file to the selected project
Kabrino: Download File from CML - Download a file from CML
Kabrino: Delete File from CML - Remove a file from the project
Jobs
Kabrino: Refresh Jobs - Reload the jobs list
Kabrino: Run Active Python File in CML - Execute the currently open Python file
Kabrino: Stop Job - Terminate a running job
Experiments
Kabrino: Refresh Experiments - Reload the experiments list
Kabrino: View Experiment Runs - Show all runs for an experiment
Extension Settings
This extension contributes the following settings:
kabrino.baseUrl: CML instance base URL
kabrino.apiKey: CML API key (stored securely)
kabrino.dsn: Data Source Name for the current connection
kabrino.connections: Saved CML connections (multiple instances support)
Known Issues
- Large Python files may take time to upload
- Job execution depends on CML cluster resources
- Experiment runs with large datasets may take time to load (timeout set to 10 minutes)
- File operations on very large files may be slow
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
See LICENSE file for details.