DescribeML is a VSCode language plugin to describe machine-learning datasets.
Precisely describe your data's provenance, composition, and social concerns in a structured format.
Make it easier for others to reproduce your experiments when you cannot share your data.
Check out a quick presentation of the tool:
The easiest way to install the plugin is through the Visual Studio Code Marketplace. Just type "describeML" in the Extensions tab, and that's it!
Alternatively, you can install it manually using the packaged release of the plugin, which is at the root of this repository.
The file is DescribeML-0.0.9.vsix.
Open your terminal (or the terminal inside VSCode) and install the file with VSCode's code command-line tool.
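As a minimal sketch of that manual install: it assumes the `code` command is on your PATH, that you run it from the repository root where the packaged release lives, and it uses VS Code's standard `--install-extension` flag. The two checks are only there to give a readable message when one of those assumptions does not hold.

```shell
# Assumption: run from the repository root, where the packaged release lives.
VSIX="DescribeML-0.0.9.vsix"

if ! command -v code >/dev/null 2>&1; then
  # The VS Code command-line launcher is not available in this shell.
  echo "The 'code' command is not on your PATH."
elif [ ! -f "$VSIX" ]; then
  # The packaged release is not in the current directory.
  echo "Cannot find $VSIX in $(pwd)."
else
  # Install the packaged extension into VS Code.
  code --install-extension "$VSIX"
fi
```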
Troubleshooting: If you cannot see syntax highlighting in the example files (e.g., Melanoma.descml) as shown in the image below, reload the VSCode editor and run the code --install command again.
Great! That's it.
For more information, check out the quick presentation video!
DescribeML is part of an ongoing research project to improve dataset documentation for machine learning. The core of our proposal is a domain-specific language (preprint here) that allows data creators to describe relevant aspects of their data for the machine learning field and beyond. The Critical Dataset Studios of the Knowing Machines project have compiled an excellent list of current documentation practices.
The complete grammar, in Extended Backus-Naur Form (EBNF), can be seen in src/language-server/dataset-descriptor.langium.
To contribute to or dive into the plugin or the language, you may need a few extra steps so that your setup matches the exact version of Langium, the base framework we used:
1 - Run "npm install" to install the dependencies.
2 - Go to the /node_modules folder and delete the "langium" and "langium-cli" folders.
3 - Copy the "langium" and "langium-cli" folders from /packages to /node_modules.
4 - Copy the folder /packages/langium-vscode into your VSCode extensions folder (typically /.vscode/extensions).
5 - Install the Langium plugin through the VSCode UI.
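Steps 1 to 4 above could be scripted roughly as follows. This is a sketch under assumptions, not something shipped with the repository: the function name `setup_langium` is hypothetical, it must be called from the repository root, and the extensions folder defaults to `$HOME/.vscode/extensions`, which may differ on your system (pass another path as the first argument).

```shell
# Hypothetical helper covering steps 1-4; call it from the repository root.
setup_langium() {
  # Assumption: per-user VS Code extensions folder; override with the first argument.
  ext_dir="${1:-$HOME/.vscode/extensions}"

  # Step 1: install the dependencies.
  npm install || return 1

  # Step 2: remove the Langium packages that npm installed.
  rm -rf node_modules/langium node_modules/langium-cli

  # Step 3: replace them with the copies kept in /packages.
  cp -R packages/langium packages/langium-cli node_modules/ || return 1

  # Step 4: copy the Langium VS Code extension into the extensions folder.
  mkdir -p "$ext_dir"
  cp -R packages/langium-vscode "$ext_dir"/ || return 1
}
```

After running `setup_langium`, step 5 (installing the Langium plugin through the VSCode UI) still has to be done by hand.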
Testing the extensions under the hood
This repo comes with a ready-made debug configuration. Just open the Debug view in VSCode and launch the Extension config. Before launching, please check that port 6009 is free.
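One way to check the port before launching is sketched below. It assumes `lsof` is installed; the helper name `port_status` is hypothetical, and without elevated privileges `lsof` may not see listeners owned by other users, so treat "free" as a hint rather than a guarantee.

```shell
# Hypothetical helper: reports whether something is listening on a TCP port.
port_status() {
  port="${1:-6009}"
  if ! command -v lsof >/dev/null 2>&1; then
    # lsof is not installed, so the state of the port cannot be determined here.
    echo "unknown"
  elif lsof -nP -iTCP:"$port" -sTCP:LISTEN >/dev/null 2>&1; then
    # At least one visible process is listening on the port.
    echo "busy"
  else
    echo "free"
  fi
}

port_status 6009
```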
For more information about how the framework works and how the language can be extended, please refer to https://github.com/langium/langium or the VSCode extension API documentation https://code.visualstudio.com/api