HydraKernel

Run Jupyter notebook branches in parallel from a shared setup.

HydraKernel turns a notebook into a lightweight workflow engine. Instead of manually duplicating notebooks or launching multiple scripts, define common setup cells once and execute independent branches simultaneously using separate Python processes.

Perfect for scenario analysis, parameter sweeps, model runs, sensitivity studies, and any workflow where several independent computations share the same initialization code.

Why HydraKernel?

A common notebook workflow looks like this:

# load data
import pandas as pd
df = pd.read_csv("large_dataset.csv")

# preprocess
df = preprocess(df)

# build model inputs
inputs = build_inputs(df)

followed by multiple independent experiments:

# baseline

# central

# conservative

Normally, you either:

Run them sequentially
Duplicate notebook cells
Create separate scripts
Build a custom workflow pipeline

HydraKernel automates this: the setup is written once, and each experiment runs as its own branch.

Features

Shared Setup Cells

Mark cells that should be included in every branch:

# hydra: setup

import pandas as pd
df = pd.read_csv("data.csv")

Parallel Branch Execution

Create independent branches:

# hydra: branch baseline

run_model("baseline")

# hydra: branch central

run_model("central")

# hydra: branch conservative

run_model("conservative")

HydraKernel automatically generates temporary scripts and executes them simultaneously.
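
As a rough illustration of the idea (not HydraKernel's actual implementation), the sketch below writes one temporary script per branch, made of the setup cells followed by the branch cell, and starts all of them at once. The names build_scripts and run_parallel and the hydra_<name>.py file naming are assumptions made for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Dict, List


def build_scripts(
    setup_cells: List[str], branches: Dict[str, str], out_dir: Path
) -> Dict[str, Path]:
    """Write one script per branch: all setup cells, then the branch cell."""
    scripts = {}
    for name, body in branches.items():
        path = out_dir / f"hydra_{name}.py"  # hypothetical naming scheme
        path.write_text("\n\n".join(setup_cells + [body]) + "\n", encoding="utf-8")
        scripts[name] = path
    return scripts


def run_parallel(scripts: Dict[str, Path]) -> Dict[str, int]:
    """Start every script first, then wait for all and collect exit codes."""
    procs = {
        name: subprocess.Popen([sys.executable, str(path)])
        for name, path in scripts.items()
    }
    return {name: proc.wait() for name, proc in procs.items()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        scripts = build_scripts(
            ["x = 5\ny = 10"],
            {
                "baseline": "print('baseline', x + y)",
                "central": "print('central', x * y)",
            },
            Path(tmp),
        )
        print(run_parallel(scripts))
```

Because every process is started before any of them is waited on, total wall time is close to that of the slowest branch rather than the sum of all branches.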

Live Status Tracking

HydraKernel displays branch execution status:

Branch Status
-------------
🟢 baseline          Running
⏳ central           Queued
✅ conservative      Done
❌ failed_case       Failed
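
One way such a status table could be driven is sketched below. It assumes a worker pool of fixed size, which is what would leave some branches in the Queued state while others run. The Status enum, run_with_status, max_workers, and on_change are illustrative names, not part of HydraKernel.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from enum import Enum
from typing import Callable, Dict, List, Optional


class Status(Enum):
    QUEUED = "Queued"
    RUNNING = "Running"
    DONE = "Done"
    FAILED = "Failed"


def run_with_status(
    commands: Dict[str, List[str]],
    max_workers: int = 2,
    on_change: Optional[Callable[[str, Status], None]] = None,
) -> Dict[str, Status]:
    """Run each command in a bounded pool and track its status."""
    status = {name: Status.QUEUED for name in commands}

    def set_status(name: str, value: Status) -> None:
        status[name] = value
        if on_change is not None:
            on_change(name, value)  # e.g. redraw the status table

    def worker(name: str, argv: List[str]) -> None:
        set_status(name, Status.RUNNING)
        code = subprocess.run(argv, stdout=subprocess.DEVNULL).returncode
        # A non-zero exit code is treated as a failed branch.
        set_status(name, Status.DONE if code == 0 else Status.FAILED)

    # Leaving the with-block waits for every submitted branch to finish.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name, argv in commands.items():
            pool.submit(worker, name, argv)
    return status
```

A caller would pass a callback as on_change to refresh the display whenever a branch moves from one state to the next.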

Consolidated Output Logging

All branch output is streamed into a dedicated HydraKernel output panel:

[baseline] starting...
[central] starting...

[baseline] complete
[baseline] finished with code 0

[central] complete
[central] finished with code 0
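
A minimal sketch of how output from several processes can be merged into one prefixed log like the one above. It assumes one reader thread per branch and a shared lock around the sink; stream_branch, stream_all, and emit are hypothetical names used only for this example.

```python
import subprocess
import sys
import threading
from typing import Callable, Dict, List


def stream_branch(name: str, argv: List[str], emit: Callable[[str], None]) -> int:
    """Run one branch and forward each output line with a [name] prefix."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so tracebacks are logged too
        text=True,
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        emit(f"[{name}] {line.rstrip()}")
    code = proc.wait()
    emit(f"[{name}] finished with code {code}")
    return code


def stream_all(
    commands: Dict[str, List[str]], emit: Callable[[str], None] = print
) -> Dict[str, int]:
    """Run all branches concurrently, one reader thread per branch."""
    codes: Dict[str, int] = {}
    lock = threading.Lock()

    def safe_emit(text: str) -> None:
        with lock:  # one whole line at a time, never interleaved fragments
            emit(text)

    def worker(name: str, argv: List[str]) -> None:
        codes[name] = stream_branch(name, argv, safe_emit)

    threads = [threading.Thread(target=worker, args=item) for item in commands.items()]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return codes
```

Lines from different branches can still appear in any relative order; only the order within a single branch is preserved.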

Setup Caching (Experimental)

Mark a setup cell with:

# hydra: setup
# hydra: cache

HydraKernel executes setup once, serializes compatible Python objects, and loads them into every branch.

Useful when setup is expensive, so that the dataset is read once instead of being reloaded for every branch:

# hydra: setup
# hydra: cache

import pandas as pd
df = pd.read_parquet("50GB_dataset.parquet")
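
The caching step described above could look roughly like the following sketch. It is an assumption about the general approach, not HydraKernel's code: save_cache and load_cache are hypothetical helpers, and the stdlib pickle module stands in when cloudpickle is not installed.

```python
import os
import pickle
import tempfile
from typing import Any, Dict, List, Tuple

try:
    import cloudpickle as pickler  # also handles lambdas and local functions
except ImportError:
    pickler = pickle  # stdlib stand-in so the sketch runs without cloudpickle


def save_cache(namespace: Dict[str, Any], path: str) -> Tuple[List[str], List[str]]:
    """Serialize what can be serialized; return (cached, skipped) names."""
    blobs: Dict[str, bytes] = {}
    skipped: List[str] = []
    for name, value in namespace.items():
        if name.startswith("_"):
            continue  # ignore private and dunder names
        try:
            blobs[name] = pickler.dumps(value)
        except Exception:
            skipped.append(name)  # e.g. open files, sockets, locks
    with open(path, "wb") as fh:
        pickle.dump(blobs, fh)
    return sorted(blobs), sorted(skipped)


def load_cache(path: str) -> Dict[str, Any]:
    """Rebuild the cached variables; a branch script would call this first."""
    with open(path, "rb") as fh:
        blobs = pickle.load(fh)
    return {name: pickle.loads(blob) for name, blob in blobs.items()}
```

Pickling each variable separately means a single non-serializable object is skipped on its own rather than aborting the whole cache. A branch would then need to recreate any skipped object itself.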

Example

# hydra: setup

x = 5
y = 10

# hydra: branch baseline

print("baseline", x + y)

# hydra: branch central

print("central", x * y)

Output (branches run in parallel, so the line order may vary):

[baseline] baseline 15
[central] central 50

Requirements

Visual Studio Code
Jupyter extension for VS Code
Python 3.9+
Optional: cloudpickle for setup caching

Install the optional caching dependency:

pip install cloudpickle

Usage

Open a Jupyter notebook.
Mark shared cells with:

# hydra: setup

Mark branch cells with:

# hydra: branch <name>

Open the Command Palette and run:

HydraKernel: Run Branches

Watch branches execute in parallel.
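
The markers used in the steps above could be recognized with a small parser like the one below. The marker spellings come from this README; the matching rules (tolerant whitespace, markers only in the comment lines before the first line of code) and the name classify_cell are assumptions for illustration.

```python
import re
from typing import Optional, Tuple

# Marker spellings follow this README; the exact matching rules are assumed.
SETUP_RE = re.compile(r"^#\s*hydra:\s*setup\s*$")
BRANCH_RE = re.compile(r"^#\s*hydra:\s*branch\s+(\S+)\s*$")


def classify_cell(source: str) -> Tuple[str, Optional[str]]:
    """Return ("setup", None), ("branch", name), or ("plain", None)."""
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines before the marker
        if not stripped.startswith("#"):
            break  # markers are assumed to precede the first line of code
        if SETUP_RE.match(stripped):
            return ("setup", None)
        match = BRANCH_RE.match(stripped)
        if match:
            return ("branch", match.group(1))
    return ("plain", None)
```

Under these rules, a cell whose first line is `# hydra: branch baseline` is classified as the branch named baseline, while a marker that appears after code is ignored.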

Current Limitations

Branches execute as separate Python processes, so they do not share in-memory state with each other or with the notebook kernel.
Cached objects must be serializable.
Open file handles, sockets, GPU contexts, and some external resources cannot be cached.
Notebook cell outputs are currently displayed in the HydraKernel output panel rather than written back into notebook cells.

Roadmap

Planned

Stop running branches
Branch progress bars
Automatic interpreter detection
Branch groups
Run selected branches only
Notebook output integration
Distributed execution support

Future

Remote cluster execution
SLURM integration
Parameter sweep generation
Dependency graphs
Branch result comparison tools

Release Notes

0.0.1

Initial release.

Shared setup cells
Parallel branch execution
Live output streaming
Status tracking
Experimental setup caching

HydraKernel: for when you wish you could just stop restarting your notebooks.

hydrakernel

Lara Ferreira Bezerra