BigCode.Classify - a Visual Studio Code extension for Algorithm Classification using flattened AST.
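The extension classifies a program from its flattened AST, i.e., the parse tree serialized into a linear token sequence. As an illustration only (the extension's own flattening, and the languages it supports, may differ), here is a minimal pre-order flattening sketch in Python using the standard `ast` module:

```python
import ast

def flatten_ast(source: str) -> list[str]:
    """Flatten a parse tree into a token sequence by pre-order traversal.

    Illustrative sketch only: the extension's actual flattening scheme
    is not shown in this README and may differ.
    """
    def walk(node):
        yield type(node).__name__          # emit the node type as a token
        for child in ast.iter_child_nodes(node):
            yield from walk(child)         # then recurse into children
    return list(walk(ast.parse(source)))

print(flatten_ast("x = 1 + 2"))
# → ['Module', 'Assign', 'Name', 'Store', 'BinOp', 'Constant', 'Add', 'Constant']
```

A token sequence like this is what the neural network consumes, with one attention score per token.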
Before using the extension, you need to get the Docker image of
INFO. You can also list these models using the following command:
INFO. You can also assign a key binding to this command via Code > Preferences > Keyboard Shortcuts, or by adding the following to your `keybindings.json`:
INFO. Token colors are assigned according to each token's attention score, i.e., how much the neural network relied on that token when predicting the classification. The closer a token's color is to red, the more important it is (red = score 1); the closer to blue, the less important (blue = score 0). This spectrum corresponds to the hue axis of the HSB color scheme.
INFO. The WebView tab is titled after the source code file, annotated with a number indicating which model was used to generate the attention scores.
INFO. At the end of the web page, you can see the classification results, i.e., the probability that the current program belongs to each class. For our Dataset 1 of 10 sorting problems collected from GitHub, the classes have the following meaning:
A generated PNG image highlights the predicted label in red among the 10 probability bars.
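The highlighting step amounts to picking the class with the highest probability and coloring its bar red. A minimal sketch (the probabilities and the color assignment here are made up for illustration; the extension's actual plotting code is not shown):

```python
def bar_colors(probs: dict[str, float]) -> dict[str, str]:
    """Color the predicted (highest-probability) class red, the rest blue.

    Sketch only: illustrates the argmax-and-highlight idea, not the
    extension's real rendering code.
    """
    predicted = max(probs, key=probs.get)  # class with the highest probability
    return {label: ("red" if label == predicted else "blue") for label in probs}

# Example with made-up probabilities over three of the ten sorting classes:
probs = {"quick-sort": 0.72, "merge-sort": 0.18, "heap-sort": 0.10}
print(bar_colors(probs))
# → {'quick-sort': 'red', 'merge-sort': 'blue', 'heap-sort': 'blue'}
```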
The token colors follow the hue, saturation, and brightness (HSB) spectrum, with saturation = 1 and brightness = 1. The hue encodes "hotness": blue corresponds to score 0 and red to score 1, and any color in between is determined by the attention score.
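Assuming a linear interpolation along the hue axis from blue (240°) at score 0 to red (0°) at score 1 (the extension's exact formula may differ), the mapping can be sketched with the standard `colorsys` module:

```python
import colorsys

def score_to_rgb(score: float) -> tuple[float, float, float]:
    """Map an attention score in [0, 1] to an RGB color on the HSB hue axis.

    Assumed linear mapping: score 0 -> blue (hue 240 deg), score 1 -> red
    (hue 0 deg), with saturation = brightness = 1. This is a sketch; the
    extension's actual formula is not shown in this README.
    """
    hue = (1.0 - score) * (240.0 / 360.0)  # fraction of the full hue circle
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)

print(score_to_rgb(1.0))  # → (1.0, 0.0, 0.0), pure red
print(score_to_rgb(0.5))  # green region, mid-importance
print(score_to_rgb(0.0))  # blue, unimportant
```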
Dataset 1: 10 sorting problems collected from GitHub: insertion-sort, merge-sort, topological-sort, heap-sort, bubble-sort, radix-sort, shell-sort, quick-sort, selection-sort, bucket-sort.
Dataset 2: 104 programming problems, comprising 52,000 C++ files, from the paper Convolutional Neural Networks over Tree Structures for Programming Language Processing, AAAI 2016.
Nghi D. Q. BUI, Yijun YU, Lingxiao JIANG. "Bilateral Dependency Neural Networks for Cross-Language Algorithm Classification", In the 26th edition of the IEEE International Conference on Software Analysis, Evolution and Reengineering, Research Track, Hangzhou, China, February 24-27, 2019.
Nghi D. Q. BUI, Lingxiao JIANG, and Yijun YU. "Cross-Language Learning for Program Classification Using Bilateral Tree-Based Convolutional Neural Networks", In the proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI) Workshop on NLP for Software Engineering, New Orleans, Louisiana, USA, 2018.
Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi. "Learning to Represent Programs with Graphs", In: 6th International Conference on Learning Representations (ICLR), 2018.
Y. Li, D. Tarlow, M. Brockschmidt, R. Zemel. "Gated Graph Sequence Neural Networks", In: 4th International Conference on Learning Representations (ICLR), 2016.
Lili Mou, Ge Li, Lu Zhang, Tao Wang, Zhi Jin: "Convolutional Neural Networks over Tree Structures for Programming Language Processing". In: AAAI 2016: 1287-1293