ECIT Fabric Development
VS Code extension for local development of Microsoft Fabric notebooks with lakehouse exploration.
Note: This extension was built for ECIT Data & AI's internal (local) Fabric development workflow. It requires a specific workspace structure and an Azure Service Principal setup, and is not intended for public use.
What It Does
- Lakehouse Explorer - Browse schemas, tables, columns, and files from Fabric lakehouses
- Local Spark Development - Run notebooks locally against OneLake data without cloud execution costs
- SQL Intellisense - Auto-completion for table/column names in spark.sql() and .sparksql files
- DataFrame Viewer - View query results in a panel (like SSMS results grid)
- Notebook Development - Create, run, duplicate, and manage Fabric notebooks locally
- Table Caching - Cache OneLake tables to local RAM for faster development queries
Requirements
A detailed setup guide is available in the docs section (in Danish). Additional requirements apply; contact the author for details.
Software
| Requirement | Version | Notes |
| --- | --- | --- |
| Java JDK | 11 | Required for Spark |
| Python | 3.11 | With pip and the packages listed below |
Python Packages required
```
python -m pip install azure-storage-file-datalake delta-spark==3.2.0 azure-identity azure-keyvault-secrets ipykernel pyzmq pyodbc pyarrow==14.0.2 pandas pyspark==3.5.1 websocket-client websocket deltalake openpyxl fsspec "numpy<2"
```
Note: The extension uses ipykernel directly via ZeroMQ - no Jupyter VS Code extension required.
Workspace Structure
The extension requires a local_development folder in your workspace, which the extension creates - example:
```
your-repo/
├── local_development/              # Required - extension searches for this
│   ├── schemas/
│   │   └── schema_reference.json   # Auto-generated by "Rebuild intellisense"
│   └── temp_notebooks/             # Auto-generated working directory
├── notebooks/                      # Your Fabric notebook .py files
│   ├── bronze_customer.py
│   └── silver_sales.py
└── pipelines/                      # Fabric pipeline JSON files (optional)
```
The local_development folder can be at any depth - the extension searches up to 5 levels deep.
Azure Service Principal
Each lakehouse connection requires a Service Principal with access to OneLake:
- Create an App Registration in Azure Entra ID
- Grant permissions to the Fabric workspace (Contributor or higher)
- Create a client secret and note the values:
  - Tenant ID
  - Client ID (Application ID)
  - Client Secret
When adding a connection in the extension, you'll enter these credentials. The client secret is stored securely in VS Code's secret storage.
If the same service principal is used across environments, connections can be switched without restarting Spark.
Notebooks should be Python files (.py) in Fabric's format with cell markers:
```python
# Fabric notebook source
# METADATA ********************
# META {
#   "kernel_info": { "name": "synapse_pyspark" },
#   "dependencies": {}
# }
# CELL ********************
df = spark.sql("SELECT * FROM bronze.customer")
display(df)
```
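The cell markers above are what the extension works from when turning a .py file into a notebook. As a rough illustration (this is a minimal sketch, not the extension's actual parser, which also handles METADATA blocks and %%sql cells), splitting a Fabric-format file into cells can look like this:

```python
def split_fabric_cells(source: str) -> list[str]:
    """Split a Fabric-format notebook source into cell bodies.

    Content before the first "# CELL" marker (the header and metadata)
    is ignored; each marker starts a new cell.
    """
    cells: list[str] = []
    current: list[str] | None = None  # None until the first cell marker
    for line in source.splitlines():
        if line.startswith("# CELL"):
            if current is not None:
                cells.append("\n".join(current).strip())
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        cells.append("\n".join(current).strip())
    return cells
```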
Workspace Notebook Resolution
Function notebooks (e.g., nb_dataplatform_functions) are resolved from the customer's workspace .Notebook folders. When a notebook uses %run nb_xxx, the extension:
- Finds nb_xxx.Notebook in the configured search path
- Generates a Python stub in local_development/workspace_stubs/workspace/
- Resolves dependencies recursively (e.g., nb_extract_bc_functions → nb_dataplatform_functions → nb_dataplatform_config)
All %run references are uniformly converted to from workspace.xxx import *.
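The %run-to-import conversion can be sketched as a simple line rewrite (a hedged sketch: the module naming under workspace/ matches the stub layout described above, but the extension's real rewriting logic is more involved):

```python
import re

# Matches a bare "%run nb_xxx" magic on its own line.
RUN_MAGIC = re.compile(r"^%run\s+(\w+)\s*$", re.MULTILINE)

def rewrite_run_magics(cell_source: str) -> str:
    """Rewrite %run magics to star-imports from the generated stubs."""
    return RUN_MAGIC.sub(
        lambda m: f"from workspace.{m.group(1)} import *", cell_source
    )
```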
Key Vault integration:
- If the connection has keyVaultName set, the extension injects KEY_VAULT_NAME via os.environ
- The get_key_vault_name() function checks os.environ first, then falls back to a module variable
- In the Fabric portal, use %run nb_dataplatform_config to set KEY_VAULT_NAME
Getting Started
- Open the ECIT Fabric sidebar (data lake icon in activity bar)
- Click Add Connection (+) and enter:
- Connection name
- Workspace ID and Lakehouse ID (from Fabric portal URL)
- Tenant ID, Client ID, Client Secret (can be retrieved from Key Vault)
- Key Vault name (for the get_secret() function, and if KEY_VAULT_NAME needs replacement)
- Right-click connection → Rebuild intellisense to fetch table schemas
- Click Spark in the status bar (lower left) to start the kernel
Note: The local_development folder is created automatically by the extension.
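Given the Workspace ID and Lakehouse ID from step 2, table references resolve to OneLake ABFSS paths. A sketch of the mapping, assuming the standard OneLake URI scheme (workspace and lakehouse addressed by their GUIDs from the Fabric portal URL):

```python
def table_abfss_path(workspace_id: str, lakehouse_id: str,
                     schema: str, table: str) -> str:
    """Build the OneLake ABFSS URI for a lakehouse table."""
    return (
        f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse_id}/Tables/{schema}/{table}"
    )
```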
Key Commands
| Command | Keybinding | Description |
| --- | --- | --- |
| New Spark SQL Query | Ctrl+N | Open a new .sparksql tab |
| Execute Spark SQL | F5 | Execute the .sparksql file |
| Run Selection | Shift+Enter | Execute selected Python code |
| Run Cell | Ctrl+Enter | Execute the current cell |
| Preview Table | Ctrl+3 | Preview the table at the cursor |
| Reload Schemas | Ctrl+Shift+R | Refresh intellisense from OneLake |
| Convert TEMP tables | Ctrl+Shift+C | Convert between SQL/Python formats |
Features
Spark SQL Files (.sparksql)
Standalone SQL files with SSMS-like experience:
- Syntax highlighting for Spark SQL
- Intellisense for tables, columns, and functions
- Press F5 to execute and see results in the DataFrame Viewer
- Multiple statements separated by blank lines
SQL Cells in Notebooks (%%sql)
Write SQL directly in notebook cells using the %%sql magic, just like in Fabric and Databricks:
```sql
%%sql
SELECT
    a.customer_id
    , a.name
FROM bronze.customer a
WHERE a.active = true
```
How it works:
- Type %%sql on the first line of any cell → the language switches to Spark SQL
- Full SQL syntax highlighting and intellisense
- On execution, the SQL is wrapped in spark.sql() automatically
- On save, the cell is serialized with a # %% [sql] marker
Under the hood: %%sql is syntactic sugar - the SQL is wrapped in spark.sql("""...""") before execution, with table references translated to ABFSS paths for local development.
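The wrapping step can be sketched like this (a minimal sketch: the exact shape of the generated code, and the separate ABFSS path translation, are assumptions about the extension's internals):

```python
def wrap_sql_cell(cell_source: str) -> str:
    """Strip the leading %%sql magic and wrap the SQL body in spark.sql()."""
    magic, _, body = cell_source.partition("\n")
    if magic.strip() != "%%sql":
        raise ValueError("not a %%sql cell")
    return f'spark.sql("""{body}""")'
```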
Files Browser
Browse, upload, download, and manage files in OneLake Files section:
- Right-click to upload, download, rename, or delete
- "Open with read code" generates a ready-to-run code snippet
Table Caching
Cache tables to local RAM for fast queries during development:
- Right-click table → Cache table
- Cached tables show a cloud icon in the explorer
- Significant speedup for repeated queries
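Caching a table in Spark boils down to standard Spark SQL statements (CACHE TABLE / UNCACHE TABLE). A sketch of what a re-cache might issue; the exact statements the extension runs are an assumption:

```python
def recache_statements(table: str) -> list[str]:
    """Return the Spark SQL statements for a re-cache of a table.

    UNCACHE TABLE IF EXISTS and CACHE TABLE are standard Spark SQL;
    running them in order refreshes the in-memory copy.
    """
    return [
        f"UNCACHE TABLE IF EXISTS {table}",
        f"CACHE TABLE {table}",
    ]
```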
Changelog
1.3.3
- Fixed CTRL+3 shortcut
- Ensured Fabric Spark (Livy sessions) are reused when executing multiple notebooks
- Changed the bootstrap notebook concept: bootstrap notebooks now run on the fly when they occur in notebooks (%run ...)
1.3.2
- Simplified resolving of KEY_VAULT_NAME and BASE_PATH to function calls backed by environment variables
1.3.1
- Fixed bug where BASE_PATH got overridden by our nb_dataplatform_functions during notebook runs
1.3.0
- Typewriter loading animation to reduce perceived query wait time
1.2.9
- Make cell execution timeout configurable (was 10 min hardcoded before)
- Fixed table version history bug
- Added "interrupt kernel" functionality by clicking on the spinning Spark in left lower corner
- Added "Loading data..." text instead of the timer when the timer stops and data is loading in the view
1.2.8
- Removed pipeline explorer
- Added folder support for notebooks
1.2.7
- Fixed %%sql cells so they deserialize correctly back to Fabric format
1.2.6
- Integrated bootstrap functions notebooks in extension with KEY_VAULT_NAME injection
- Ensure kernel is closed/disposed if we close VS Code
- Minor fixes to intellisense refresh and bug fixes
- Simplified query execution graphic and added query timer (mm:ss)
- Added %%sql cell support
- %run statements get resolved by dynamically looking up the notebook locally
- Removed option to run notebooks remote because we now have Livy sessions (fabric kernel)
- Added multiple sequential notebook runs with status/progress bar and run order
1.2.5
- Added dependencies for ZeroMQ (cmake-ts)
1.2.4
- Fixed ZeroMQ external issue
1.2.3
- Fabric Local Kernel: Self-managed ipykernel replaces Jupyter extension dependency
- No more idle timeout (kernel stays alive until you stop it)
- Start/stop Spark via status bar click (lower left corner)
- No Jupyter VS Code extension required
- Better intellisense refresh on column changes to existing tables (code action and refresh when tree opens)
1.2.2
- Fixed F5 (run selected SQL) not sending the query directly to the kernel
1.2.1
- Fixed SHIFT+ENTER selection run against kernel
1.2.0
- Replaced cell-based Python with a .fabpy format that uses VS Code's notebook API (like Jupyter) to provide a real notebook experience
- Added direct kernel access to interactive window (Spark session)
- Added error showing in dataframe results for errors in sparksql files
- Added remote execution to Livy (spark) sessions while following cell executions
1.1.5
- Improved intellisense in multiple SELECT statements on same editor
- Remind user of semicolon SELECT termination
1.1.4
- Increased buffer to minimize flicker when scrolling fast in query results
- Removed parameter passing from remote notebook runs (wasn't used)
- Fixed warnings for bk_ columns in notebook tree view
1.1.3
- Fixed jar path when the user has Danish special characters in their name
1.1.2
- Fixed abfss translation when table names have leading backticks
1.1.1
- Adjusted connection screen/view in light mode
1.1.0
- Automatic pull from remote repos when a repo with connections is open
- Re-cache tables (UNCACHE + CACHE)
1.0.9
- Fixed the Copy SQL button, which didn't work when the SQL used f-strings (spark.sql(f"""..."""))
- Ensured that when copying SQL, the original .py file keeps it selected for easy copy-back after editing
1.0.8
- Added translation to abfss:// paths inside Python files
- Fixed BASE_PATH so we write to the correct connection's lakehouse from local notebooks
1.0.7
- Delete schema, Delete table and Create schema commands added
- Fixed some places where schema.table wasn't properly translated to abfss://
1.0.6
- Adjusted NULL background color in dark theme
- Added more decimals to webview (data results)
1.0.5
- Increased visibility on cell markers in light theme on Python cells
- Added more checks to notebooks to ensure schema, table and business keys are defined correctly
1.0.4
- Fixed runtime translation of spark.sql(f"""...""") strings when parameters are tables that need resolving before running the code
1.0.3
- Changed so semicolon terminates statements in Spark SQL tabs
1.0.2
- Fixed selection that couldn't be seen on the black background in active cells
- Added examples for spark functions with tooltips
- Added "Delete cell" option in Python cells
1.0.0
- Fabric capacity state and resume/pause
- Allow CRUD operations directly in Spark SQL cells
- Adjusted intellisense so it automatically rebuilds everything on a first-time refresh
- Added key vault retrieval of values for setting up connections
0.9.9
- Removed local metastore - SQL is now translated to ABFSS paths at runtime
- Simpler setup (no metastore sync step needed)
- Table caching now uses ABFSS paths directly
- Connection switching no longer restarts Spark (same service principal)
- Play button moved to title bar; plug icon switches connections
0.9.8
- Fixed so worker and driver use the same Python
- Deletes any misplaced local_development folders
- Added REFRESH TABLE command when hitting CTRL+SHIFT+R
0.9.7
- Autocreates local_development folder (no manual copy around)
- When SQL contains UNION ALL, intellisense now correctly checks whether columns exist on the right side of the UNION when tables on both sides are aliased identically
0.9.6
- Multi connection support for lakehouses
- Bug fixes where the orange color was too dark in certain places
- This version requires re-setup of connections initially after updating
0.9.4
- Added better lineage tracking (right click notebook -> Show lineage) based on name conventions
- Adjusted "spinner" graphic to Azure-style (3 dots) when waiting for results
- Added LIMIT X handling to display less or more than 1000 rows
0.9.3
- VS Code theme-aware data grid (light/dark)
- .sparksql file format for pure Spark SQL
- Multi-result support for multiple SELECT statements
- Built-in SQL syntax highlighting (removed Inline SQL dependency)
0.9.0
- Files browser with upload/download operations
- Code snippets for reading CSV, Parquet, Excel files
0.8.7
- Remote notebook execution via Fabric REST API
- Pipeline Explorer with remote execution
0.7.0
- Initial release: Lakehouse explorer, intellisense, local notebook development
License
Internal use.