# ECIT Lakehouse Studio
A VS Code extension for developing Microsoft Fabric notebooks and exploring Fabric lakehouses. All code execution runs on Fabric via the Livy Sessions API — no local Spark engine required.
> **Note:** This extension was originally built as internal tooling at ECIT and is now available for broader use. While it has been used in production internally, it has not yet been widely tested outside our environments. Feedback and bug reports are welcome; issues are typically addressed quickly in frequent updates.
## Features
- **Lakehouse Explorer** — Browse schemas, tables, columns, and files. Create schemas and delete tables directly from the tree view
- **Livy Spark Execution** — Run notebooks and SQL on Fabric directly from VS Code
- **SQL Intellisense** — Auto-completion for table/column names in `spark.sql()` and `.sparksql` files, with cross-lakehouse support for 3-part naming (`lakehouse.schema.table`)
- **DataFrame Viewer** — View query results in a grid panel with Filter, Copy, and Excel export
- **Notebook Development** — Native Fabric `.py` notebook files serialize as real VS Code notebooks with cells, syntax highlighting, and remote execution — no conversion needed
- **Fabric Compatibility** — Supports `%run`, `%pip install`, `notebookutils.notebook.run()`, and `notebookutils.notebook.runMultiple()` via Livy
- **Capacity Management** — Monitor, resume, and pause Fabric capacity from the explorer
- **Schema Compare** — Compare schemas between two lakehouse connections (no Spark needed)
- **Files Browser** — Browse, upload, download, rename, and delete files in OneLake
- **Multi-Connection** — Run DEV and PRD sessions simultaneously with isolated connections
- **Git or Remote** — Works with a git-synced Fabric workspace (recommended) or directly against the workspace via the Remote Notebooks view
- **Authentication** — Azure CLI / Interactive or Service Principal, with secrets stored securely in VS Code
- **AI-Assisted Development** — Work with Claude Code, GitHub Copilot, or any VS Code AI tool directly alongside your Fabric notebooks — something not possible in the Fabric portal
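The cross-lakehouse 3-part naming mentioned above can be sketched in a few lines. This is an illustrative helper, not the extension's actual resolution logic; the default schema name `dbo` is an assumption:

```python
def parse_table_reference(ref: str, default_lakehouse: str, default_schema: str = "dbo"):
    """Resolve a 1-, 2-, or 3-part table reference into (lakehouse, schema, table).

    Illustrative sketch of the lakehouse.schema.table convention; the
    defaults here are assumptions, not the extension's actual behavior.
    """
    parts = ref.split(".")
    if len(parts) == 3:   # lakehouse.schema.table
        return tuple(parts)
    if len(parts) == 2:   # schema.table in the active lakehouse
        return (default_lakehouse, parts[0], parts[1])
    return (default_lakehouse, default_schema, parts[0])  # bare table name

print(parse_table_reference("sales_lh.silver.customer", "dev_lh"))
# → ('sales_lh', 'silver', 'customer')
```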
## Feature Demos
- Remote Spark (Livy session) — Select between environments
- Lakehouse Explorer — Browse schemas, tables, columns, and files
- Edit and Execute Notebooks — Native Fabric notebooks with remote Spark execution
- Spark SQL Files — Standalone `.sparksql` with intellisense and F5 execution
- DataFrame Viewer — Query results with Filter, Copy, and Excel export
- SQL Intellisense — Auto-completion for tables, columns, and functions
- Schema Compare — Compare schemas between two connections
- Capacity Management — Monitor, resume, and pause Fabric capacity

## Quick Start
1. Install the extension from the VS Code marketplace
2. Install the prerequisites — Python, the Python packages, and the Azure CLI (see Requirements below)
3. Open a folder containing Fabric `.Notebook` files (or any workspace)
4. Add a connection — click the "+" in the Lakehouse Studio sidebar
5. Start a Spark session — click "Spark" in the status bar (lower left)
6. Build intellisense — expand the Tables section in the Lakehouse Explorer to initiate schema sync
7. Run queries — press `F5` in a `.sparksql` file or `Ctrl+Enter` in a notebook cell
8. Apply the recommended settings — see Recommended VS Code Settings for the best experience
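Under the hood, starting a Spark session means creating a Livy session against the Fabric workspace. The sketch below shows roughly what such a request could look like; the endpoint path and payload keys are assumptions based on public Livy conventions and the Fabric Livy Sessions API, not the extension's internal code:

```python
# Hedged sketch: assemble a Fabric Livy session request. The endpoint path,
# API version, and payload keys are assumptions for illustration only.
def build_livy_session_request(workspace_id: str, lakehouse_id: str,
                               name: str = "lakehouse-studio-session"):
    url = (
        "https://api.fabric.microsoft.com/v1/workspaces/"
        f"{workspace_id}/lakehouses/{lakehouse_id}"
        "/livyApi/versions/2023-12-01/sessions"
    )
    # The Entra access token goes in the Authorization header when POSTing.
    payload = {"name": name, "conf": {}}
    return url, payload

url, payload = build_livy_session_request("my-workspace-guid", "my-lakehouse-guid")
print(url)
```

No network call is made here; a real client would POST `payload` to `url` with a bearer token and then poll the session state until it is idle.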
## Requirements
| Requirement | Notes |
| --- | --- |
| Python 3.11+ | For Pylance intellisense (not for execution) |
| Azure CLI | Required for interactive authentication (`az login`) |
Install the Azure CLI (if not already installed):

```shell
winget install -e --id Microsoft.AzureCLI
```
Install the Python packages (for intellisense only, not execution):

```shell
python -m pip install pyspark==3.5.1 requests pytz pandas openpyxl
```
## Authentication
Two authentication methods are supported. In both cases, the user or service principal must have at least Contributor role on the Fabric workspace(s) used in the connection.
### Azure CLI / Interactive (recommended)
Uses `DefaultAzureCredential` from `@azure/identity`, which tries a chain of credential sources (environment variables, managed identity, the Azure CLI, and others). Best for individual developers.
1. Run `az login` in your terminal
2. When adding a connection, select "Azure CLI / Interactive"
3. Only Workspace ID and Lakehouse ID are required
> Tip: If the interactive login popup doesn't appear or gets stuck, run `az login` manually in the VS Code terminal, then switch back to the DataFrame Results pane.
### Service Principal
Uses `ClientSecretCredential` for automated scenarios or shared team environments.
1. Create an App Registration in Microsoft Entra ID
2. Grant it Contributor access to the Fabric workspace
3. When adding a connection, select "Service Principal" and enter the Tenant ID, Client ID, and Client Secret
The client secret is stored securely in VS Code's secret storage.
## Connection Setup
Click **Add Connection** in the Lakehouse Studio sidebar:
| Field | Required | Description |
| --- | --- | --- |
| Connection Name | Yes | Friendly name (e.g., `my-lakehouse-dev`) |
| Environment | Yes | Dev or Prd (auto-detected from name) |
| Workspace ID | Yes | GUID from the Fabric portal URL |
| Lakehouse ID | Yes | GUID from the Fabric portal URL |
| Auth Method | Yes | Azure CLI / Interactive or Service Principal |
| Tenant ID | SP only | Azure tenant GUID |
| Client ID | SP only | App registration GUID |
| Client Secret | SP only | App registration secret |
| Key Vault Name | Optional | Populate fields from Azure Key Vault |
| Capacity fields | Optional | Enable resume/pause from the explorer |
| Environment Variables | Optional | Key-value pairs injected into `os.environ` in Livy sessions |
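Inside a Livy session, the connection's Environment Variables are readable through `os.environ` like any other environment variable. A small sketch (the key name `DATA_ENV` and the table names are hypothetical, chosen only for illustration):

```python
import os

# In a real Livy session the connection injects this value; here we
# simulate it locally so the snippet runs standalone.
os.environ.setdefault("DATA_ENV", "dev")

data_env = os.environ.get("DATA_ENV", "dev")
# Pick a target lakehouse based on the injected environment flag.
table = f"{'prd' if data_env == 'prd' else 'dev'}_lakehouse.bronze.customer"
print(data_env, table)
```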
## Workspace Structure
The extension creates a `local_development/` folder in your workspace root (add it to `.gitignore`):
```
your-repo/
├── local_development/              # Auto-created, git-ignored
│   ├── active_connection.json
│   ├── workspace_imports/          # Auto-generated Python modules
│   └── connections/
│       └── {uuid}/
│           └── schema_reference.json   # Table/column metadata
├── *.Notebook/                     # Fabric notebook folders
│   └── notebook-content.py
└── utility/                        # Shared Python modules (optional)
```
The extension auto-discovers `.Notebook` folders anywhere in your workspace. Override this with the `lakehouseStudio.notebookSearchPath` setting.
## Key Commands
| Command | Keybinding | Description |
| --- | --- | --- |
| New Spark SQL Query | `Ctrl+N` | Open a new `.sparksql` tab |
| Execute Spark SQL | `F5` | Execute the `.sparksql` file |
| Run Selection | `Shift+Enter` | Execute selected Python code |
| Run Cell | `Ctrl+Enter` | Execute the current cell |
| Preview Table | `Ctrl+3` | Preview the table under the cursor |
| Toggle Spark Session | Status bar | Start, stop, or switch sessions |
## Spark SQL Files (`.sparksql`)
Standalone SQL files with an SSMS-like experience:
- Syntax highlighting for Spark SQL
- Intellisense for tables, columns, and functions
- Press `F5` to execute and see results in the DataFrame Viewer
- Multiple statements separated by semicolons (`;`)
- Multiple SELECT statements run in parallel
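To make the multi-statement behavior concrete, here is a naive sketch of splitting a `.sparksql` script into individually runnable statements. This is not the extension's actual parser; a real implementation must also skip semicolons inside string literals and comments:

```python
def split_statements(script: str) -> list[str]:
    """Naively split a .sparksql script on semicolons (illustrative only)."""
    return [s.strip() for s in script.split(";") if s.strip()]

script = """
CREATE TABLE IF NOT EXISTS bronze.demo (id INT);
SELECT * FROM bronze.demo;
SELECT COUNT(*) FROM bronze.demo
"""
statements = split_statements(script)
# SELECTs have no side effects on each other, so they are candidates
# for the parallel execution described above.
selects = [s for s in statements if s.upper().startswith("SELECT")]
print(len(statements), len(selects))  # → 3 2
```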
## SQL Cells in Notebooks (`%%sql`)
Write SQL directly in notebook cells:
```sql
%%sql
SELECT
    a.customer_id
    , a.name
FROM bronze.customer a
WHERE a.active = true
```

Type `%%sql` on the first line to get full SQL highlighting and intellisense.
## DataFrame Viewer
Two result modes, controlled by the "All Rows" toggle:
- **Fast mode** (default) — JSON via Livy stdout, capped at 500 rows
- **Full mode** — Parquet via OneLake, up to 100K rows with a true row count

Results are cached per `.sparksql` tab for instant switching.
## Settings
| Setting | Default | Description |
| --- | --- | --- |
| `lakehouseStudio.livyStatementTimeout` | `30` | Livy statement timeout (minutes) |
| `lakehouseStudio.notebookSearchPath` | `""` | Notebook search path (empty = auto-discover) |
| `lakehouseStudio.notebookWarnings` | `false` | Show destination/schema/key column warnings |
| `lakehouseStudio.autoSwitchToDevOnStartup` | `true` | Auto-switch away from production on startup |
| `lakehouseStudio.utilityModulesPath` | `utility` | Path for shared Python utility modules |
## Recommended VS Code Settings
For the best experience with notebooks and SQL, add or adjust these in your `settings.json` (`Ctrl+Shift+P` → "Preferences: Open User Settings (JSON)"):
```json
{
  "editor.autoIndent": "keep",
  "workbench.editor.enablePreview": false,
  "python.analysis.autoIndent": false,
  "editor.autoIndentOnPaste": false
}
```
| Setting | Why |
| --- | --- |
| `editor.autoIndent: "keep"` | Line breaks maintain the indentation level (SSMS-like behavior) |
| `workbench.editor.enablePreview: false` | Each file opens in its own tab instead of reusing a preview tab |
| `python.analysis.autoIndent: false` | Prevents Pylance from overriding indentation on new lines |
| `editor.autoIndentOnPaste: false` | Pasted code keeps its original indentation |
## Capacity Management
Configure the capacity fields (Subscription ID, Capacity Name, Resource Group) on a connection to enable:
- Status indicator — green (Active), orange (Paused/Resuming)
- Resume/Pause — right-click the connection in the explorer
Required permissions: the user (interactive) or service principal must have the Contributor role on the Fabric capacity resource in Azure (the `Microsoft.Fabric/capacities` Azure resource, not the Fabric workspace).
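Resume and pause on `Microsoft.Fabric/capacities` are Azure Resource Manager operations. The sketch below only builds the ARM request URLs so it runs standalone; the `api-version` value is an assumption and should be checked against the Microsoft.Fabric ARM reference:

```python
# Hedged sketch of the ARM endpoints behind capacity resume/pause.
# The api-version below is an assumption, not taken from the extension.
ARM_BASE = "https://management.azure.com"
API_VERSION = "2023-11-01"  # assumed; verify against the ARM reference

def capacity_action_url(subscription_id: str, resource_group: str,
                        capacity_name: str, action: str) -> str:
    if action not in ("resume", "suspend"):
        raise ValueError("action must be 'resume' or 'suspend'")
    return (
        f"{ARM_BASE}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Fabric/capacities/{capacity_name}"
        f"/{action}?api-version={API_VERSION}"
    )

# A real client POSTs to this URL with a bearer token from Entra ID.
print(capacity_action_url("sub-guid", "rg-fabric", "mycapacity", "resume"))
```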
## Changelog
### 1.7.3
- Subtle executing-query text: the timer, pipe, and message now match the muted "Showing X rows" style instead of bright secondary text
- Enhanced column refresh on `Ctrl+Shift+R`: detects `schema.table` references in the active editor (`.sparksql` files and Python `spark.sql()` blocks) and refreshes their columns from OneLake — no more stale columns after table schema changes
### 1.7.2
- Brighter "Executing query..." text in the DataFrame Viewer: switched from the faded hint color to the secondary text color for better visibility
- Preview Table (`Ctrl+3`) now supports backtick-quoted table names (e.g. `` schema.`some_table` ``) — common for Spark tables with special characters
### 1.7.1
- Fix multi-line `notebookutils.notebook.run(...)` hijacking: calls spanning multiple lines (e.g. with keyword args like `timeout_seconds`, `arguments={}`) are now normalized and intercepted correctly
- Fix keyword-arg support in `notebook.run` parameter passing: handles both positional (`name, timeout, {params}`) and keyword (`name, arguments={params}`) signatures
- Fix Shift+Enter (`executeCodeInCell`): now delegates to `_executeViaLivy`, giving it full preprocessing (inlining, hijacking, `%pip` transform) instead of raw execution
- Fix notebook auto-discovery for nested folder structures: `findFirstNotebookRoot` now finds the common parent when notebooks are spread across sibling subfolders (e.g. `nb_silver/`, `nb_gold/`)
- Fix multiple `from workspace.xxx import *` in the same cell: all imports are now inlined, not just the first
- Fix progress grid duration: shows each notebook's individual run time instead of cumulative elapsed time
### 1.7.0
- `display(df)` support: injected at Livy session startup, routes DataFrame results to the DataFrame Viewer panel (works with both Ctrl+Enter and Shift+Enter)
- Auto-load modules on session start: the new `autoLoadModules` setting loads configured modules (e.g. `nb_dataplatform_functions`) automatically when a Spark session starts, and pre-fills them in new scratchpad notebooks
- Simplified "Open with Read Code": parquet/CSV/Excel snippets now use simple relative `Files/` paths instead of `FILES_BASE_PATH` with `IS_FABRIC` conditionals
- Removed the `scratchpadImports` setting (replaced by `autoLoadModules`) and the `notebookWarnings` setting
- Settings reorganized: `utilityModulesPath` and `autoLoadModules` are grouped together for discoverability
### 1.6.3
- Redesigned Add/Edit Connection form: authentication moved to top, workspace and lakehouse pickers via Fabric API ("Select..." buttons) replace manual GUID entry
- Connection name is now optional ("Connection Friendly Name") — defaults to the lakehouse display name from Fabric API
- Unified Add and Edit connection forms into a single codebase, eliminating ~400 lines of duplication
- Renamed right-click menu "Refresh intellisense" to "Rebuild Intellisense"
- Post-save modal dialog guides new users on starting a Spark session and rebuilding intellisense
### 1.6.2
- Feature demo GIFs added for all major features (Livy sessions, Lakehouse Explorer, notebooks, Spark SQL, DataFrame Viewer, intellisense, Schema Compare, capacity management)
- README overhaul: added internal tooling disclaimer, authentication requirements, recommended VS Code settings, and login troubleshooting tip
- Removed old Setup-guide.md and screenshot attachments — all setup info now lives in README
- Included docs folder in .vsix package so GIFs render on the marketplace
## License
MIT