Spark Declarative Pipeline (SDP) Designer

Gergely Szécsényi

Low-code VS Code extension to design and visualize Apache Spark Declarative Pipelines — build tables, views and streaming entities, generate PySpark/SQL source, and navigate dependency DAGs (Databricks-compatible).

Spark Pipeline Designer -- Build Spark Declarative Pipelines in VS Code

Go from idea to running pipeline in minutes. SDP Designer is a low-code, click-based pipeline builder for Apache Spark Declarative Pipelines, right inside VS Code.

No boilerplate. No hassle. Just open VS Code, point at your project, and start building.

SDP Designer


Why SDP Designer?

What started as a simple DAG visualizer has evolved into a full pipeline designer. Whether you're a data engineer, analyst, or just getting started with declarative pipelines, Designer lets you:

  • Build pipelines visually -- create tables, views, and streaming tables with a few clicks
  • Generate boilerplate automatically -- Python and SQL source files are created for you
  • Preview code before finalizing -- see exactly what gets generated before committing
  • Work offline with Databricks -- edit your project locally, sync back via Git when you're online
  • Navigate instantly -- click any node in the DAG to jump to its source code

Getting Started

Starting from scratch

  1. Install the extension from the VS Code Marketplace
  2. Open an empty folder as a workspace in VS Code
  3. Click the SDP Designer icon in the Activity Bar
  4. Click the + New Pipeline... entry in the Apache Spark Pipelines panel
  5. Follow the prompts to name your pipeline and set up the project structure
  6. Start building -- add entities, define dependencies, and preview your pipeline
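The prompts in step 5 scaffold a pipeline spec file. A minimal sketch of what such a YAML might look like — the file name and key names here are assumptions modeled on the Databricks-style spec the extension targets, not the extension's exact schema:

```yaml
# pipeline.yml -- illustrative sketch, key names are assumptions
name: sales_pipeline
libraries:
  - glob:
      include: transformations/**
```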

Working with an existing pipeline

  1. Install the extension from the VS Code Marketplace
  2. Open a workspace containing your Spark pipeline project (.yml, .py, or .sql files)
  3. Click the SDP Designer icon in the Activity Bar
  4. Your pipelines are auto-detected -- select one from the list
  5. Explore the DAG, click nodes to jump to source, or add new entities
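Auto-detection presumably amounts to scanning the workspace for pipeline spec and source files. A rough sketch of that idea — the function and patterns are illustrative, not the extension's actual logic:

```python
from pathlib import Path

# File patterns a pipeline workspace scan might look for (illustrative).
PIPELINE_PATTERNS = ("*.yml", "*.yaml", "*.py", "*.sql")

def find_pipeline_files(workspace: str) -> dict[str, list[Path]]:
    """Group candidate pipeline files in a workspace by file suffix."""
    root = Path(workspace)
    found: dict[str, list[Path]] = {}
    for pattern in PIPELINE_PATTERNS:
        for path in sorted(root.rglob(pattern)):
            found.setdefault(path.suffix, []).append(path)
    return found
```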

Features

Visual Pipeline Designer

Create new pipeline entities directly from the graph. Pick a name, choose the entity type, select dependencies, and Designer generates the source file and places it in the right directory.

Interactive DAG Visualization

Explore your pipeline as an interactive dependency graph with full support for horizontal and vertical layouts, dark and light themes, search, and node navigation.

Full Python and SQL Support

Designer understands both PySpark decorators and SQL DDL statements. Mix and match languages in the same pipeline -- Designer handles the dependency resolution across both.

Databricks Compatible

Designer recognizes Databricks declarations (@dlt.*, @dp.*, @sdp.*). Work on your Databricks project offline in VS Code using Git synchronization, then push your changes when you're back online.

Smart Code Generation

Generated code follows your project structure. Target folders are resolved from the `libraries` globs in your pipeline YAML, so new files land exactly where they belong.
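How a target folder falls out of a glob is easy to sketch: given a glob like `transformations/**`, the directory prefix before the first wildcard is a natural destination for new files. A minimal sketch of that idea — purely illustrative, not the extension's actual resolution code:

```python
from pathlib import PurePosixPath

def target_folder(glob_pattern: str) -> str:
    """Return the directory prefix of a glob pattern, i.e. the path
    segments before the first wildcard -- a plausible place to put
    newly generated source files."""
    parts = []
    for segment in PurePosixPath(glob_pattern).parts:
        if "*" in segment or "?" in segment:
            break
        parts.append(segment)
    return str(PurePosixPath(*parts)) if parts else "."
```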


Supported Entity Types

Type              | Python Decorator                  | SQL Statement
------------------|-----------------------------------|-----------------------------
Table             | @dp.table(name="...")             | CREATE TABLE ...
View              | @dp.view(name="...")              | CREATE VIEW ...
Materialized View | @dp.materialized_view(name="...") | CREATE MATERIALIZED VIEW ...
Temporary View    | @dp.temporary_view(name="...")    | --
Streaming Table   | @dp.streaming_table(name="...")   | --

Decorators from @dlt.*, @dp.*, and @sdp.* namespaces are all supported.


Example

from pyspark.sql import SparkSession
from pyspark import pipelines as dp

@dp.materialized_view(name="sales_summary")
def create_sales_summary(spark: SparkSession):
    return spark.sql("""
        SELECT region, SUM(amount) AS total
        FROM raw_sales
        GROUP BY region
    """)

@dp.table(name="customers_enriched")
def enrich_customers(spark: SparkSession):
    return spark.sql("""
        SELECT c.*, o.order_count
        FROM raw_customers c
        LEFT JOIN order_counts o ON c.id = o.customer_id
    """)

Validation

Before creating an entity, Designer checks:

  • Name -- must be a valid identifier, not a SQL reserved word, and unique in the pipeline
  • Dependencies -- warns about references to entities that don't exist yet
  • Circular dependencies -- rejects additions that would create a cycle in the DAG
  • Schema -- validates field names and types when provided
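The circular-dependency check above is a classic graph problem: adding an edge creates a cycle exactly when the new source is already reachable from the new target. A minimal sketch of that check — the names are illustrative, not the extension's internals:

```python
def creates_cycle(deps: dict[str, set[str]], new_src: str, new_dst: str) -> bool:
    """Return True if adding the dependency new_src -> new_dst would
    close a cycle, i.e. new_src is already reachable from new_dst by
    following existing dependency edges."""
    stack, seen = [new_dst], set()
    while stack:
        node = stack.pop()
        if node == new_src:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(deps.get(node, ()))
    return False
```

Rejecting the edge up front keeps the DAG valid without ever needing a full topological re-sort.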

Roadmap ideas

  • Unity Catalog integration
  • Run and dry-run support
  • Multi-cloud catalog support
  • Diagram export (PNG, SVG, draw.io)
  • Custom code templates

Because this is a side project and not sponsored by any company, I prioritize features based on user feedback and my own use cases. If there's a feature you'd like to see, please open an issue or reach out!

Install from VS Code Marketplace
