VS Code Extension for Spark Labs
Spark Labs is a Spark-based development environment in which you can interactively program, debug, submit, and test Spark applications on a live Spark cluster running on the IBM Analytics Engine (IAE) service.
It is available as a VS Code extension that you install on your local system to access the Spark IDE from Visual Studio Code. This reduces development time and makes the environment easier to use.
The Remote - SSH extension is required for this extension to work.
IAE Environments
- Software Environment (Cloud Pak for Data): An on-premises or private cloud setup where IBM's analytics engine software runs directly on user-managed infrastructure.
- SaaS (Software as a Service) Environment: A cloud-based solution where analytics engine applications are hosted by IBM Cloud.
Features
What you can do in Visual Studio Code with this extension:
- Connect to the Spark Labs console
- Create and delete Spark Labs from the extension
- Securely connect to the runtimes of your Spark Labs console via SSH over secure WebSockets
- Develop and debug the files inside your Spark Labs Git project remotely
Installation
Marketplace
You can use the VS Code Marketplace to download and install this extension directly in VS Code.
From extension package
- Ask your administrator to enable VS Code support for your cluster.
- Download the Spark Labs extension, then install the downloaded .vsix file in VS Code: Cmd/Ctrl + Shift + P -> Extensions: Install from VSIX
- Optional, but recommended: Install the Remote SSH extension in VS Code. If you're not using the Remote SSH extension, you'll need an SSH client on your machine.
Set up SSH on your machine
Generating an SSH key pair on Mac and Linux
Important: Before generating a new key pair, check the ~/.ssh directory for any existing key pairs.
To generate an SSH key pair on macOS and Linux:
- Open the Terminal
- Execute this command:
ssh-keygen -t rsa -N ""
After you generate a key pair, specify the location of the private key in the spark-labs.privateKeyPath setting and the location of the public key in the spark-labs.publicKeyPath setting of the extension.
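The check-then-generate steps above can be sketched as a small shell function. This is only a sketch and not part of the extension: the function name ensure_ssh_key and the default key path are assumptions for illustration. Only the -t rsa and -N "" flags come from the command above; -f is the standard ssh-keygen option for choosing the output file.

```shell
#!/bin/sh
# Hypothetical helper (not part of the extension): generate an RSA key pair
# only if no private key already exists at the given path.
ensure_ssh_key() {
    key_path="${1:-$HOME/.ssh/id_rsa}"   # assumed default location

    if [ -f "$key_path" ]; then
        # Never overwrite an existing key; just report it.
        echo "existing key found: $key_path"
        return 0
    fi

    mkdir -p "$(dirname "$key_path")"
    # Same flags as the command above, plus -f to set the output file.
    ssh-keygen -t rsa -N "" -f "$key_path"
}

# Usage (the path is an example):
#   ensure_ssh_key "$HOME/.ssh/id_cpd_rsa"
```

The resulting private key path and the matching .pub file are the values you would then enter in spark-labs.privateKeyPath and spark-labs.publicKeyPath.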
Generating an SSH key pair on Windows
Important: Check whether your home directory contains the .ssh directory, or run dir "%USERPROFILE%/.ssh" to check if any keys exist.
To generate an SSH key pair on Windows:
On Windows 10 or newer with the OpenSSH Client feature enabled:
- Open a new cmd command prompt and use the ssh-keygen utility.
- Execute this command:
ssh-keygen -t rsa -N ""
If you are using an older Windows version:
- Install a third-party tool, such as PuTTY, to generate an SSH key.
- If you're using PuTTY, use the puttygen utility to generate an SSH key pair.
Note: PuTTYgen does not generate OpenSSH keys by default. To export an OpenSSH key, convert the key: select Conversions -> Export OpenSSH key.
Optional: Using a separate SSH key for Spark Labs
To use a separate SSH key for Spark Labs, add a new entry to your SSH configuration in ~/.ssh/config:
Host cpdenv
    HostName localhost
    # Same port as spark-labs.localPort
    Port 5681
    # Path to your private key
    IdentityFile ~/.ssh/id_cpd_rsa
Then enter the host cpdenv in the spark-labs.localHost setting.
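If you script your machine setup, the entry above can be appended only when it is missing. The following is a minimal sketch under stated assumptions: the function name add_spark_labs_host is hypothetical, and the port 5681 and key path ~/.ssh/id_cpd_rsa are simply the example values from the entry above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the extension): append the "Host cpdenv"
# entry to an SSH config file unless that host alias is already defined.
add_spark_labs_host() {
    config_file="${1:-$HOME/.ssh/config}"
    local_port="${2:-5681}"   # example value; must match spark-labs.localPort

    # Skip if a "Host cpdenv" line already exists (simple, case-sensitive check).
    if [ -f "$config_file" ] &&
       grep -Eq '^[[:space:]]*Host[[:space:]]+cpdenv([[:space:]]|$)' "$config_file"; then
        echo "Host cpdenv already present in $config_file"
        return 0
    fi

    mkdir -p "$(dirname "$config_file")"
    # The tilde is written literally; ssh expands it when reading the file.
    cat >> "$config_file" <<EOF

Host cpdenv
    HostName localhost
    Port $local_port
    IdentityFile ~/.ssh/id_cpd_rsa
EOF
    echo "Added Host cpdenv to $config_file"
}

# Usage:
#   add_spark_labs_host "$HOME/.ssh/config" 5681
```

Running the function a second time leaves the file unchanged, so it is safe to call from a repeated setup script.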
Extension Settings
This extension contributes the following settings:
- spark-labs.privateKeyPath: Path to the private key file for the SSH connection. Optional from version 2.0.3 onwards, so it may be left empty.
- spark-labs.publicKeyPath: Path to the public key file for the SSH connection. Optional from version 2.0.3 onwards, so it may be left empty.
- spark-labs.autoOpenRuntime: Automatically open the runtime once it has loaded, without asking for extra confirmation.
- spark-labs.autoUpdateHostkey: Automatically update the host key in the known_hosts file by fetching it over a secure TLS connection. Before the known_hosts file is modified, a backup is created at ~/.ssh/.known_hosts.cpd-backup. This setting should always be enabled, especially when working with multiple clusters.
- spark-labs.localPort: The local port to listen on.
- spark-labs.localHost: The SSH host that Remote SSH connects to. Can be any host specified in your SSH client config file. The HostName configured for that host should always point to localhost, and its Port should match spark-labs.localPort. This lets you use your own SSH settings, e.g. a specific private key.
- spark-labs.logLevel: The log level of the extension.
- spark-labs.verifySSL: (Optional) Whether to verify the certificate of the SSL connection. Default: true.
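The spark-labs.localHost description requires that the chosen SSH host resolve to localhost and to the port set in spark-labs.localPort. One way to check this is to inspect the output of ssh -G, which prints the configuration the OpenSSH client would use for a host. The sketch below reads that output on standard input; the function name check_local_host is hypothetical, and 5681 is only the example port from the SSH configuration shown earlier, not a documented default.

```shell
#!/bin/sh
# Hypothetical helper (not part of the extension): read `ssh -G <host>` output
# on stdin and verify that the host resolves to localhost on the expected port.
check_local_host() {
    expected_port="$1"
    awk -v port="$expected_port" '
        $1 == "hostname" { host = $2 }
        $1 == "port"     { p = $2 }
        END {
            if (host == "localhost" && p == port) { print "ok"; exit 0 }
            print "mismatch: hostname=" host " port=" p
            exit 1
        }'
}

# Usage (requires the OpenSSH client; cpdenv is the example host alias):
#   ssh -G cpdenv | check_local_host 5681
```

The function returns a non-zero exit status on a mismatch, so it can gate a setup script before you try to open a runtime.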
Manage Connection Page
- Host Address: Hostname of the Spark Labs console. You can find it in the address bar of your browser when you log in to the Software or SaaS environment. For example: cpd.xxx.xxx.xxx.com.
- Environment Type: Environment where Spark Labs is running, either Software or SaaS. Refer to the IAE Environments section.
- Cloud Resource Name (CRN): Only visible and used for SaaS. For details, see https://cloud.ibm.com/docs/account?topic=account-crn
- Username: Username that you use to log in to the Software or SaaS environment.
- API Key: API key used to authenticate to your Software or SaaS environment.