Scientific Data Viewer - VSCode Extension
A powerful VSCode extension for viewing and analyzing scientific data files including NetCDF, Zarr, HDF5, and more. This extension provides an intuitive interface for exploring scientific datasets directly within VSCode, eliminating the need for external tools.
🚀 Features
- Multi-format Support: View NetCDF (.nc, .netcdf), Zarr (.zarr), and HDF5 (.h5, .hdf5) files
- Custom Editors: Direct file opening with dedicated NetCDF and HDF5 editors
- Interactive Data Explorer: Browse file structure, dimensions, variables, and attributes
- Enhanced Variable Information: View variable dimension names, data types, shapes, and memory usage
- Data Visualization: Create plots and visualizations directly in VSCode (experimental, disabled by default)
- Advanced Python Integration: Automatic Python environment detection and management
- File Tree Integration: Right-click on supported files in the explorer to open them
- Command Palette Integration: Multiple commands for data viewer operations
- Real-time Configuration: Immediate application of setting changes without restart
- Status Bar Integration: Shows current Python interpreter status
- Comprehensive Logging: Detailed logging system for debugging and monitoring
- Human-readable File Sizes: Display file and variable sizes in appropriate units (B, kB, MB, GB, TB)
- Error Handling: Robust error handling with user-friendly messages
- Experimental Features: Configurable experimental features with clear warnings
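To illustrate the human-readable size display listed above, here is a minimal Python sketch. It is not the extension's actual implementation: the function name `format_size`, the decimal base (1 kB = 1000 B), and the one-decimal rounding are assumptions made for the example.

```python
# Hypothetical helper: converts a byte count to one of the units B, kB, MB, GB, TB.
# Assumption: decimal units (1 kB = 1000 B); the extension may use a different base.
UNITS = ("B", "kB", "MB", "GB", "TB")


def format_size(num_bytes: int) -> str:
    """Return a short human-readable string for a non-negative byte count."""
    if num_bytes < 0:
        raise ValueError("size must be non-negative")
    size = float(num_bytes)
    index = 0
    # Divide by 1000 until the value fits the current unit or TB is reached.
    while size >= 1000 and index < len(UNITS) - 1:
        size /= 1000
        index += 1
    if index == 0:
        return f"{num_bytes} B"
    return f"{size:.1f} {UNITS[index]}"


if __name__ == "__main__":
    for value in (512, 1500, 2_000_000):
        print(value, "->", format_size(value))
```

Values larger than 1000 TB stay in TB, since TB is the largest unit the feature list mentions.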
📦 Installation
Quick Install (Recommended)
Install from VSCode Marketplace:
- Open VSCode
- Go to Extensions view (Ctrl+Shift+X)
- Search for "Scientific Data Viewer"
- Click Install
Install Python dependencies:
pip install xarray netCDF4 zarr h5py numpy matplotlib
Manual Install
Download the extension:
Install the .vsix file:
- Open VSCode
- Go to Extensions view (Ctrl+Shift+X)
- Click the "..." menu and select "Install from VSIX..."
- Select the downloaded .vsix file
⚙️ Prerequisites
Before using this extension, you need:
- Python 3.13+ installed on your system
- Required Python packages:
- xarray
- netCDF4
- zarr
- h5py
- numpy
- matplotlib
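Before opening a file, it can help to confirm that the packages listed above are importable from the interpreter you plan to use. The sketch below relies only on the standard library; the function name `missing_modules` is hypothetical and this is not how the extension itself performs the check.

```python
# Hypothetical check: reports which of the required packages cannot be imported
# from the current interpreter. Uses find_spec, so nothing is actually imported.
from importlib.util import find_spec

REQUIRED_MODULES = ("xarray", "netCDF4", "zarr", "h5py", "numpy", "matplotlib")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that the current interpreter cannot find."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

Run it with the same interpreter you select in VSCode, since each environment has its own set of installed packages.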
🎯 Usage
Opening Data Files
Direct File Opening:
- Double-click on any supported file (.nc, .netcdf, .zarr, .h5, .hdf5)
- Files open directly in the Scientific Data Viewer
From File Explorer:
- Right-click on any supported file (.nc, .netcdf, .zarr, .h5, .hdf5)
- Select "Open in Scientific Data Viewer"
From Command Palette:
- Press Ctrl+Shift+P
- Type "Open Scientific Data Viewer"
- Select a file from the file picker
Auto-detection:
- Open any supported file in VSCode
- The extension will detect it and offer to open it in the data viewer
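The supported extensions listed above map to three formats. As a rough sketch of extension-based detection (the function name `detect_format` and the case-insensitive matching are assumptions, not the extension's documented behavior):

```python
# Hypothetical mapping from file suffix to format name, based on the
# extensions listed in this README.
from pathlib import Path

SUPPORTED_SUFFIXES = {
    ".nc": "NetCDF",
    ".netcdf": "NetCDF",
    ".zarr": "Zarr",
    ".h5": "HDF5",
    ".hdf5": "HDF5",
}


def detect_format(path):
    """Return the format name for a supported path, or None otherwise."""
    # Assumption: suffix matching ignores case (e.g. ".NC" counts as NetCDF).
    suffix = Path(path).suffix.lower()
    return SUPPORTED_SUFFIXES.get(suffix)


if __name__ == "__main__":
    for name in ("sample_data.nc", "sample_data.zarr", "notes.txt"):
        print(name, "->", detect_format(name))
```

Note that a Zarr store is usually a directory rather than a single file, so only the suffix of the directory name is inspected here.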
Configuring Python Environment
Automatic Detection:
- The extension will automatically detect Python installations
- It will check for required packages and prompt to install missing ones
Manual Configuration:
- Press Ctrl+Shift+P
- Type "Python: Select Interpreter"
- Choose your preferred Python environment
- The extension will automatically detect it and use it
Settings:
- Open VSCode Settings (Ctrl+,)
- Search for "Scientific Data Viewer"
- Configure Python path and other options
Exploring Data
The data viewer shows:
- File Information: Format, size, and basic metadata
- Dimensions: Dataset dimensions and their sizes
- Variables: All data variables with their types, shapes, dimension names, and memory usage
- ~~Visualization: Interactive plots and charts~~
The data representation is based entirely on xarray's native Dataset HTML representation.
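The memory usage shown for a variable can be estimated from its shape and data type. The sketch below assumes the usual in-memory size of a dense array (number of elements times bytes per element); the helper name `variable_nbytes` is hypothetical, and the extension may compute the figure differently.

```python
# Hypothetical estimate of a variable's uncompressed in-memory size.
# Assumption: dense array, so size = number of elements * bytes per element.
from math import prod


def variable_nbytes(shape, itemsize):
    """Return the byte count for an array of `shape` with `itemsize` bytes per element."""
    # prod(()) is 1, so a scalar variable occupies exactly `itemsize` bytes.
    return prod(shape) * itemsize


if __name__ == "__main__":
    # Example: a float32 variable (4 bytes per element) with dimensions
    # (time=365, lat=180, lon=360).
    print(variable_nbytes((365, 180, 360), 4))  # 94608000
```

On-disk size is often smaller than this estimate when the file uses compression.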
Creating Visualizations (⚠️ EXPERIMENTAL)
- Select a variable from the dropdown or click on it in the variables list
- Choose a plot type (Line Plot, Heatmap, Histogram)
- Click "Create Plot" to generate the visualization
⚙️ Configuration
The extension can be configured through VSCode settings:
- scientificDataViewer.autoRefresh: Automatically refresh data when files change
- scientificDataViewer.maxFileSize: Maximum file size (MB) to load automatically
- scientificDataViewer.defaultView: Default view mode (default)
- scientificDataViewer.allowMultipleTabsForSameFile: Allow opening multiple tabs for the same file (Experimental)
- scientificDataViewer.plottingCapabilities: Enable plotting capabilities (Experimental)
Available Commands
Access these commands via the Command Palette (Ctrl+Shift+P):
- Open Scientific Data Viewer: Open a file in the data viewer
- Refresh Python Environment: Manually refresh the Python environment
- Show Extension Logs: View detailed extension logs
- Show Settings: Open Scientific Data Viewer settings
Feature Flags
The extension includes configuration options that act as feature flags to control specific behaviors:
- scientificDataViewer.allowMultipleTabsForSameFile (Experimental): Allow opening multiple tabs for the same file
- scientificDataViewer.plottingCapabilities (Experimental): Enable plotting capabilities
- Settings UI: Each setting appears as a checkbox in VSCode Settings
- Real-time Updates: Configuration changes take effect immediately
🔧 Troubleshooting
Common Issues
Python not found:
- Ensure Python is installed and in your PATH
- Use the "Python: Select Interpreter" command to manually set the path
Missing packages:
- Install required packages:
pip install xarray netCDF4 zarr h5py numpy matplotlib
- Or let the extension install them automatically
Large files not loading:
- Increase the scientificDataViewer.maxFileSize setting
- Consider using data slicing for very large datasets
Permission errors:
- Ensure the extension has permission to read your data files
- Check file permissions and VSCode workspace settings
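For the large-file case above, you can check ahead of time whether a file falls under the configured limit. This sketch assumes that the limit is expressed in decimal megabytes (1 MB = 1,000,000 bytes) and that a file exactly at the limit still loads; both are assumptions, and the function names are hypothetical.

```python
# Hypothetical pre-check against a maxFileSize-style limit expressed in MB.
import os

BYTES_PER_MB = 1000 * 1000  # assumption: decimal megabytes


def within_limit(size_bytes, max_file_size_mb):
    """Return True if a file of `size_bytes` is at or below the limit."""
    return size_bytes <= max_file_size_mb * BYTES_PER_MB


def file_within_limit(path, max_file_size_mb):
    """Same check, reading the size of an existing file from disk."""
    return within_limit(os.path.getsize(path), max_file_size_mb)


if __name__ == "__main__":
    print(within_limit(50_000_000, 100))   # 50 MB file, 100 MB limit
    print(within_limit(150_000_000, 100))  # 150 MB file, 100 MB limit
```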
Getting Help
- Check the logs: open the Command Palette (Ctrl+Shift+P) and run "Scientific Data Viewer: Show Extension Logs"
- Report issues: Create an issue on the GitHub repository
- Ask questions: Use the GitHub Discussions section
🛠️ Development
Quick Start for Developers
Clone and setup:
git clone https://github.com/etienneschalk/scientific-data-viewer.git
cd scientific-data-viewer
./setup.sh
Open in VSCode:
code .
Run extension:
- Press F5 to launch the Extension Development Host
- Test with sample data files
Development Installation
Clone the repository:
git clone https://github.com/etienneschalk/scientific-data-viewer.git
cd scientific-data-viewer
Install dependencies:
npm install
Compile the extension:
npm run compile
Install Python dependencies (if not already installed):
pip install xarray netCDF4 zarr h5py numpy matplotlib
Open in VSCode:
code .
Run the extension:
- Press F5 to open a new Extension Development Host window
- Or use Ctrl+Shift+P and run "Developer: Reload Window"
Production Installation
Package the extension:
npm run package
Install the .vsix file:
- Open VSCode
- Go to Extensions view (Ctrl+Shift+X)
- Click the "..." menu and select "Install from VSIX..."
- Select the generated .vsix file
Project Structure
src/
├── extension.ts # Main extension entry point and command registration
├── dataProcessor.ts # Python integration and data processing
├── dataViewerPanel.ts # Webview panel for data visualization
├── pythonManager.ts # Advanced Python environment management
└── logger.ts # Comprehensive logging utilities
Python Scripts
The extension uses several Python scripts for data processing:
- get_data_info.py: Extracts file metadata, dimensions, variables, and their properties
- get_data_slice.py: Retrieves specific data slices from variables
- create_plot.py: Generates visualizations using matplotlib
- get_html_representation.py: Creates HTML representation of xarray datasets
- get_text_representation.py: Creates text representation of datasets
- get_show_versions.py: Shows Python package versions for debugging
- create_sample_data.py: Generates sample data files for testing
- test_data_structure.py: Tests data structure and format detection
Disclaimer: most visualization scripts are experimental and produce unusable plots!
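When reporting an issue, it is useful to include the installed versions of the required packages. The sketch below shows one way to gather them with the standard library. It is not the content of get_show_versions.py, and the function name `collect_versions` is hypothetical.

```python
# Hypothetical version report built on importlib.metadata (standard library).
from importlib.metadata import PackageNotFoundError, version

PACKAGES = ("xarray", "netCDF4", "zarr", "h5py", "numpy", "matplotlib")


def collect_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, installed in collect_versions().items():
        print(f"{name}: {installed or 'not installed'}")
```

If xarray is installed, its own `xarray.show_versions()` function prints a more detailed report.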
Building
# Compile TypeScript
npm run compile
# Watch for changes
npm run watch
# Run tests
npm test
# Lint code
npm run lint
Testing
Unit Tests:
npm test
Integration Tests:
- Open the extension in development mode
- Test with sample data files
- Verify Python integration works correctly
Debugging
- Set breakpoints in your TypeScript code
- Press F5 to launch the Extension Development Host
- Use the debug console to inspect variables and step through code
📦 Publishing
Preparing for Publication
- Update version in package.json
- Update CHANGELOG.md with new features and fixes
- Test thoroughly with various file types and sizes
- Update documentation if needed
Publishing to VSCode Marketplace
Install vsce (if not already installed):
npm install -g @vscode/vsce
Log in to Azure DevOps:
vsce login <publisher-name>
Package the extension:
vsce package
Publish:
vsce publish
Manual Publishing
Create a Personal Access Token:
- Go to Azure DevOps
- Create a new Personal Access Token with Marketplace permissions
Log in:
vsce login <publisher-name>
# Enter your Personal Access Token when prompted
Publish:
vsce publish
🤝 Contributing
We welcome contributions! Please see our Contributing Guidelines for details.
Development Setup
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
📁 Project Structure
scientific-data-viewer/
├── src/ # TypeScript source code
│ ├── extension.ts # Main extension entry point
│ ├── dataProvider.ts # Tree view provider for file explorer
│ ├── dataProcessor.ts # Python integration and data processing
│ ├── dataViewerPanel.ts # Webview panel for data visualization
│ ├── pythonManager.ts # Python environment management
│ └── logger.ts # Logging utilities
├── python/ # Python scripts for data processing
│ ├── get_data_info.py # Extract file metadata and variable info
│ ├── get_data_slice.py # Extract data slices from variables
│ ├── create_plot.py # Generate visualizations
│ └── get_html_representation.py # Generate HTML representation
├── test/ # Test files
│ ├── runTest.ts # Test runner
│ └── suite/ # Test suites
├── sample-data/ # Sample data files for testing
│ ├── sample_data.nc # NetCDF sample file
│ ├── sample_data.h5 # HDF5 sample file
│ ├── sample_data.zarr/ # Zarr sample dataset
│ └── create_sample_data.py # Script to generate test data
├── out/ # Compiled JavaScript output
├── node_modules/ # Node.js dependencies
├── .vscode/ # VSCode configuration
│ ├── launch.json # Debug configuration
│ ├── tasks.json # Build tasks
│ └── settings.json # Workspace settings
├── package.json # Extension manifest and dependencies
├── package-lock.json # Dependency lock file
├── tsconfig.json # TypeScript configuration
├── tsconfig.test.json # Test TypeScript configuration
├── .eslintrc.json # ESLint configuration
├── language-configuration.json # Language configuration
├── README.md # Main documentation
├── QUICKSTART.md # Quick start guide
├── DEVELOPMENT.md # Development guide
├── CONTRIBUTING.md # Contribution guidelines
├── PUBLISHING.md # Publishing guide
├── CHANGELOG.md # Version history
└── setup.sh # Setup script