Deploying to Databricks
This extension provides a set of tasks to help with your CI/CD deployments if you are using notebooks, Python, JARs or Scala. These tasks are based on the PowerShell module azure.databricks.cicd.tools, available through the PowerShell Gallery (PSGallery). The module has much more functionality if you require it.
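If you want that extra functionality, the module installs straight from the gallery with the standard PowerShell cmdlets (-Scope CurrentUser avoids needing admin rights):

```powershell
# Install the underlying module from the PowerShell Gallery
Install-Module -Name azure.databricks.cicd.tools -Scope CurrentUser
```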
Now works with Service Principal Authentication (PREVIEW)
Azure DevOps Tasks
You will find the new tasks under the Deploy tab, or by searching for "Databricks".
Deploying Files to DBFS
Use this to deploy a file or pattern of files to DBFS. Typically this is used for JARs, .py files, or data files such as CSVs. Large files are now supported.
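If you prefer to script the same step yourself, the underlying module exposes an equivalent cmdlet. The sketch below assumes Add-DatabricksDBFSFile and its documented parameters; verify the names with Get-Help on your installed version:

```powershell
# Sketch: push all JARs from a local build folder up to DBFS.
# $token is a Databricks bearer token; Region is your workspace's Azure region.
Add-DatabricksDBFSFile -BearerToken $token -Region "westeurope" `
    -LocalRootFolder "build/libs" -FilePattern "*.jar" `
    -TargetLocation "/libraries"
```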
Deploying Notebooks
Use this to deploy a folder of notebooks from your repo to your Databricks Workspace.
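The module equivalent is sketched below, assuming the Import-DatabricksFolder cmdlet and parameter names from the module's docs; check Get-Help Import-DatabricksFolder on your version:

```powershell
# Sketch: copy a local folder of notebooks into the workspace
Import-DatabricksFolder -BearerToken $token -Region "westeurope" `
    -LocalPath "notebooks" -DatabricksPath "/Shared/MyProject"
```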
Deploying Secrets
Use this to deploy secrets to your Databricks Workspace. If the secret scope does not exist, it will be created for you (note that all users will be granted access to the scope).
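The scripted equivalent, assuming the module's Set-DatabricksSecret cmdlet (parameter names from its docs; confirm with Get-Help):

```powershell
# Sketch: create or update a secret; the scope is created if it is missing
Set-DatabricksSecret -BearerToken $token -Region "westeurope" `
    -ScopeName "MyScope" -SecretName "StorageKey" -SecretValue $storageKey
```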
Deploying Clusters
Use the Databricks UI to get the JSON settings for your cluster (click on the cluster and look in the top-right corner for the JSON link). Copy the JSON into a file and store it in your git repo. Remove the cluster_id field (it will be ignored if left in); the cluster name is used as the unique key.
If a cluster with this name exists it will be updated; if not, it will be created.
Note that if any settings are changed (even tags), the cluster will be restarted when the task executes.
Your file should look something like:
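The sketch below uses placeholder values (cluster name, Spark version, node type); take the real values from the JSON you exported, with cluster_id removed:

```json
{
  "cluster_name": "ci-deployed-cluster",
  "spark_version": "7.3.x-scala2.12",
  "node_type_id": "Standard_DS3_v2",
  "num_workers": 2,
  "autotermination_minutes": 60,
  "custom_tags": {
    "environment": "test"
  }
}
```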
Add the task named "Databricks Cluster", setting the path to your file and the authentication details.
Bulk Export Scripts from your Workspace
Use the option in the Databricks UI to link your notebooks to a git repo, or export existing notebooks using this PowerShell module: https://github.com/DataThirstLtd/azure.databricks.cicd.tools.
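As a sketch of the module route (assuming an Export-DatabricksFolder cmdlet as described in the module's docs; confirm the exact name and parameters with Get-Command and Get-Help):

```powershell
# Sketch: pull a workspace folder of notebooks down to a local path,
# ready to commit into your repo
Export-DatabricksFolder -BearerToken $token -Region "westeurope" `
    -ExportPath "/Shared/MyProject" -LocalOutputPath "C:\src\notebooks"
```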
Libraries & Jobs
These tasks are built on the same azure.databricks.cicd.tools PowerShell module, which has much more functionality if you require it, covering Libraries, Jobs and further cluster management.
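Once the module is installed (see the Install-Module snippet above), its full command surface is easy to explore with standard PowerShell:

```powershell
# List everything the module exposes (Jobs, Libraries, cluster management, ...)
Get-Command -Module azure.databricks.cicd.tools

# For example, narrow down to the job-related cmdlets
Get-Command -Module azure.databricks.cicd.tools -Name "*Job*"
```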