DataFlint Copilot
Intelligent Apache Spark optimization and debugging — right in your editor.
DataFlint Copilot connects VS Code to the DataFlint
platform so you can spot performance issues, understand Spark execution
plans, and act on recommendations without leaving your code. It also lets AI
assistants such as Claude, Cursor, and GitHub Copilot analyze your Spark
jobs alongside you.
What you get
- Inline performance highlights — CodeLens and hover tooltips show
bottlenecks, alerts, and suggested fixes anchored to the lines that produced
them.
- Issues view — a dedicated DataFlint activity-bar panel lists every
detected issue grouped by file, with one-click navigation and quick fixes.
- AI assistant integration — your AI agent can query Spark applications,
execution plans, jobs, alerts, and curated Spark expertise.
- Secure, browser-based sign-in — no credentials to copy or configure.
Getting started
- Install DataFlint Copilot from the VS Code Marketplace.
- Open any workspace. The welcome wizard appears automatically on first run;
you can re-open it any time via DataFlint: Show Welcome.
- Click Sign in with DataFlint. Your browser opens, you authenticate, and
the wizard transitions to a "ready" state.
- Open a Spark file (Python, Scala, Java, or SQL). Highlights appear above
relevant lines and the DataFlint sidebar fills with issues as analysis
runs.
Authentication
DataFlint Copilot uses OAuth 2.0 with PKCE in the system browser. The
extension never sees your password, and tokens are stored in VS Code's
secret storage.
To sign out or switch accounts, run DataFlint: Logout, or use
Switch account inside the welcome wizard.
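The extension's own sign-in code is not shown here. As a minimal sketch of the PKCE step that any OAuth 2.0 client performs (RFC 7636, S256 method), the snippet below generates a code verifier and its challenge, then shows the check an authorization server would run. The function names are illustrative and are not part of DataFlint Copilot.

```python
import base64
import hashlib
import secrets


def challenge_for(code_verifier: str) -> str:
    """Derive the S256 code challenge: BASE64URL(SHA256(verifier)), no padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for one sign-in attempt."""
    # 32 random bytes encode to a 43-character URL-safe verifier
    # (RFC 7636 allows 43 to 128 characters).
    raw = secrets.token_bytes(32)
    code_verifier = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return code_verifier, challenge_for(code_verifier)


def verify_pkce(code_verifier: str, code_challenge: str) -> bool:
    """Server-side check: recompute the challenge and compare in constant time."""
    return secrets.compare_digest(challenge_for(code_verifier), code_challenge)


if __name__ == "__main__":
    verifier, challenge = make_pkce_pair()
    # The challenge goes out with the authorization request; the verifier is
    # sent only later, with the token request, so an intercepted authorization
    # code is useless on its own.
    print("challenge:", challenge)
    print("verified:", verify_pkce(verifier, challenge))
```

Because the verifier never leaves the client until the token exchange, this flow needs no client secret, which is why it suits desktop tools that sign in through the system browser.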
Issues view
The DataFlint view in the activity bar contains an Issues panel that
groups every detected performance issue by file. From there you can jump to
the line that produced an issue, reveal a file in the Explorer, or refresh
the tree.
Use it from your AI agent
DataFlint Copilot plugs into VS Code Chat, Cursor, Claude Desktop, and other
MCP-aware agents automatically. Open the agent panel and ask in plain
English:
- "Find my slowest Spark job in the last week and explain why it's slow."
- "Analyze application etl-prod for performance issues and suggest fixes."
- "Show me the execution plan of job 42 and point out the expensive stages."
- "What's costing the most in my Spark workloads, and how do I reduce it?"
The agent pulls applications, jobs, execution plans, alerts, and curated
Spark expertise from DataFlint on your behalf — no copy-pasting required.
Slash-command prompts
DataFlint also ships pre-built prompts that show up directly in the agent
panel's autocomplete — type / and look for the DataFlint entries. They
cover end-to-end workflows out of the box:
- Performance analysis — pick an application and job, then walk through
alerts and fixes.
- Execution-plan deep dives — explain why a specific node (Exchange,
Join, Aggregate, …) is expensive on the line that produced it.
- Spark expertise topics — guided explanations of AQE, partitioning,
shuffle, streaming, BigQuery / Iceberg connectors, and more.
The list is served live from the platform, so new prompts appear in
autocomplete automatically as we ship them — no extension update needed.
Commands
All commands are listed in the palette under the DataFlint category.
| Command | Description |
| --- | --- |
| DataFlint: Login | Open the welcome wizard and start sign-in. |
| DataFlint: Show Welcome | Re-open the welcome wizard. |
| DataFlint: Logout | Clear tokens and reset the session. |
| DataFlint: Toggle Highlight | Show or hide inline highlights. |
| DataFlint: Filter Highlights by Query | Filter highlights by text. |
| DataFlint: Fix Performance Issue | Run the suggested fix on a highlight. |
| DataFlint: Show Issues in File | List DataFlint issues in the current file. |
| DataFlint: Reveal All Highlighted Files | Open every file that has highlights. |
| DataFlint: Refresh Issues Tree | Re-fetch the Issues view. |
| DataFlint: Restart MCP Server | Restart the MCP server. |
| DataFlint: Check Server Status | Print the current connection status. |
| DataFlint: Show Logs | Open the DataFlint output channel. |
| DataFlint: Show Debug Information | Print extension diagnostics for support. |
Settings
All settings live under dataflint-copilot.*.
| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| dataflint-copilot.privacy.sendSourceCode | boolean | false | Share anonymized code snippets to improve highlight accuracy. See Privacy. |
| dataflint-copilot.logging.level | "debug" \| "info" \| "warn" \| "error" | "info" | Verbosity of the DataFlint output channel. |
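As a small illustration of how these two settings fit together, the sketch below builds a fragment you could paste into a VS Code settings.json. The setting keys, types, and defaults come from the table above; the helper function itself is hypothetical and not part of the extension.

```python
import json

# Allowed values for dataflint-copilot.logging.level, per the settings table.
ALLOWED_LEVELS = ("debug", "info", "warn", "error")


def dataflint_settings(send_source_code: bool = False,
                       logging_level: str = "info") -> dict:
    """Build a settings fragment using the documented keys and defaults."""
    if logging_level not in ALLOWED_LEVELS:
        raise ValueError(f"logging level must be one of {ALLOWED_LEVELS}")
    return {
        "dataflint-copilot.privacy.sendSourceCode": send_source_code,
        "dataflint-copilot.logging.level": logging_level,
    }


if __name__ == "__main__":
    # Example: keep source-code sharing off, raise log verbosity for debugging.
    print(json.dumps(dataflint_settings(logging_level="debug"), indent=2))
```

Raising the level to "debug" is mainly useful before collecting logs with DataFlint: Show Logs.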
Privacy & telemetry
- Telemetry. The extension reports anonymized error and usage telemetry
to keep the product reliable. No source code or file contents are included
by default.
- Source-code sharing (opt-in). When
dataflint-copilot.privacy.sendSourceCode is enabled, code snippets are
obfuscated before they leave your machine: variable names become
v1, v2, …, function names become fn1, fn2, …, string literals become
"str_1", "str_2", …, and comments are stripped. Only the resulting
structure is shared.
- Authentication. Tokens are stored in VS Code's secret storage, never on
disk in plain text, and never logged.
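To make the obfuscation rules above concrete, here is a rough sketch of the same idea for Python source, built on the standard ast module. It is not DataFlint's actual implementation, handles only simple cases (no imports, keyword arguments, or attribute names), and every name in it is illustrative. Note that ast.unparse prints strings with single quotes.

```python
import ast
import builtins


def _is_builtin(name: str) -> bool:
    return hasattr(builtins, name)


class Obfuscator(ast.NodeTransformer):
    """Rename variables to v1.., functions to fn1.., strings to str_1.."""

    def __init__(self) -> None:
        self.var_names: dict[str, str] = {}
        self.fn_names: dict[str, str] = {}
        self.str_count = 0

    def _var(self, name: str) -> str:
        return self.var_names.setdefault(name, f"v{len(self.var_names) + 1}")

    def _fn(self, name: str) -> str:
        return self.fn_names.setdefault(name, f"fn{len(self.fn_names) + 1}")

    def visit_FunctionDef(self, node):
        node.name = self._fn(node.name)
        return self.generic_visit(node)

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_arg(self, node):
        node.arg = self._var(node.arg)
        return node

    def visit_Call(self, node):
        # Treat a plain called name as a function unless it is a builtin
        # or was already seen as a variable.
        func = node.func
        if (isinstance(func, ast.Name) and not _is_builtin(func.id)
                and func.id not in self.var_names):
            self._fn(func.id)
        return self.generic_visit(node)

    def visit_Name(self, node):
        if _is_builtin(node.id):
            return node  # leave print, len, ... readable
        if node.id in self.fn_names:
            node.id = self.fn_names[node.id]
        else:
            node.id = self._var(node.id)
        return node

    def visit_Constant(self, node):
        if isinstance(node.value, str):
            self.str_count += 1
            node.value = f"str_{self.str_count}"
        return node


def obfuscate(source: str) -> str:
    tree = ast.parse(source)  # comments are dropped at parse time
    ob = Obfuscator()
    # Register defined functions first so earlier references map consistently.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            ob._fn(node.name)
    return ast.unparse(ob.visit(tree))


SAMPLE = '''
# Load events and keep only purchases
def load_purchases(path):
    events = read_table(path)  # hypothetical helper
    return filter_rows(events, "purchase")

result = load_purchases("s3://bucket/events")
'''

if __name__ == "__main__":
    print(obfuscate(SAMPLE))
```

In the output, the control flow and call structure survive, while identifiers, string contents, and comments do not, which is the property the opt-in sharing mode describes.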
Requirements
- VS Code 1.99.0 or higher (also works in Cursor and other VS Code-based
editors that support the MCP API).
- DataFlint account — Cloud or a self-hosted DataFlint instance.
- Spark workload — Python, Scala, Java, or SQL Spark applications.
Troubleshooting
The welcome wizard never moves past sign-in.
Re-open the wizard with DataFlint: Show Welcome and try again. If your
firewall blocks loopback connections from VS Code, allow them and retry.
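If you suspect loopback traffic is being blocked, a quick check like the following can help narrow it down. It is a generic sketch that opens a temporary TCP listener on 127.0.0.1 and connects to it from the same process; it does not reproduce the extension's sign-in flow, and a firewall rule scoped to VS Code specifically could still behave differently.

```python
import socket


def loopback_works(timeout: float = 2.0) -> bool:
    """Return True if a TCP round trip over 127.0.0.1 succeeds."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
            server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
            server.listen(1)
            server.settimeout(timeout)
            port = server.getsockname()[1]
            with socket.create_connection(("127.0.0.1", port),
                                          timeout=timeout) as client:
                conn, _ = server.accept()
                with conn:
                    conn.settimeout(timeout)
                    client.sendall(b"ping")
                    return conn.recv(4) == b"ping"
    except OSError:
        # Covers refused connections, timeouts, and bind failures.
        return False


if __name__ == "__main__":
    print("loopback OK" if loopback_works() else "loopback blocked or failing")
```

If this reports a failure, the problem is likely at the operating-system or firewall level and not specific to the extension.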
No highlights appear.
- Check you're signed in (DataFlint: Check Server Status).
- Confirm DataFlint has data for the application open in your editor.
- Toggle highlights off and on (DataFlint: Toggle Highlight).
- Check the output channel (DataFlint: Show Logs) for analysis errors.
Support
When filing a bug, please attach the output of DataFlint: Show Debug Information.
License
Proprietary. See the LICENSE file bundled with the extension for details.
Apache, Apache Spark, and Spark are trademarks of the Apache Software
Foundation. DataFlint is not affiliated with the Apache Software Foundation.