VSCode Extension for Parssist - A parser assistant

A parser generator that takes an extended BNF grammar and builds a parser (with a certain algorithm). For now, it can only generate parsers written in Java. In the future, it should be possible to generate parsers in other languages as well. See the open issues for more information.

Getting Started

Parssist takes a grammar and a lexical definition, described here (grammar) and here (lexer).
By default, you can define your grammar in a file called
Features

Generating a Parser

You can generate a parser by pressing
Alternatively, you can press the button in the status bar.

Syntax Highlighting

The extension provides syntax highlighting for the grammar and the lexical definition.

Extension Settings

You can configure the extension by pressing
This extension contributes the following settings:
Requirements

For now, the extension only supports generating parsers in Java, so you need to have Java installed on your machine. In addition, Parssist uses Gradle to build the parser.

Release Notes

1.0.0

Initial release of Parssist. Date of release: 2024-07-10

1.0.1

Bug fixes and improvements. Date of release: 2024-07-10

1.0.2

Added logo and updated readme. Date of release: 2024-07-10

Lexer Documentation

Introduction

The lexer file defines the information required for lexical analysis and thus for tokenization.

Explanation

The simplest form of tokenization is based on regular expressions. Symbol groups are mapped to their corresponding patterns; together they define the alphabet and the vocabulary of the grammar.
:= Assignment Operator

The usual form of a lexer definition rule is as follows:
NONTERMINAL

Non-terminal elements should be mapped with the keyword NONTERMINAL to tell the generator which elements are non-terminals. The mapped elements can be listed individually or combined, as usual, with the regex pipe operator.

Ignorables

The grammar may contain symbols (such as the empty symbol) that have no influence on parsing. Well-known examples are spaces and line breaks. These are called ignorables in Parssist.
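To illustrate the idea of regex-based tokenization with ignorables, here is a minimal sketch in Python. It is not Parssist's implementation: the group names (`NUMBER`, `IDENT`, `PLUS`, `WHITESPACE`), their patterns, and the `tokenize` helper are all hypothetical and chosen only for this example.

```python
import re

# Hypothetical symbol groups mapped to regex patterns (illustrative only).
TOKEN_PATTERNS = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("PLUS", r"\+"),
    ("WHITESPACE", r"\s+"),
]

# Groups that are matched but never passed on to the parser.
IGNORABLES = {"WHITESPACE"}


def tokenize(text):
    """Split text into (group, lexeme) pairs, dropping ignorables."""
    master = re.compile(
        "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS)
    )
    tokens = []
    pos = 0
    while pos < len(text):
        match = master.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        # lastgroup is the name of the symbol group that matched.
        if match.lastgroup not in IGNORABLES:
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("x + 42\n")` yields the identifier, the plus sign, and the number; the spaces and the line break are consumed but not reported.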
Comments

The lexer file supports only line comments, which must begin with a hash sign (#). They are ignored during lexical analysis.
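As a rough sketch of how a lexer file with line comments could be read, the Python snippet below skips blank lines and lines that begin with `#`, and splits the remaining lines at the `:=` operator. The exact rule syntax is an assumption: the `name := pattern` shape, the sample rules, and the `parse_lexer_rules` helper are hypothetical and may differ from what Parssist actually accepts.

```python
def parse_lexer_rules(source):
    """Return a dict mapping rule names to patterns (assumed `name := pattern` lines)."""
    rules = {}
    for raw_line in source.splitlines():
        line = raw_line.strip()
        # Line comments start with '#'; blank lines carry no rule.
        if not line or line.startswith("#"):
            continue
        name, separator, pattern = line.partition(":=")
        if not separator:
            raise ValueError(f"missing ':=' in rule: {line!r}")
        rules[name.strip()] = pattern.strip()
    return rules


# Hypothetical lexer file content, for illustration only.
SAMPLE = "\n".join([
    "# numbers and operators",
    r"NUMBER := \d+",
    r"PLUS := \+",
    "",
    "NONTERMINAL := Expr|Term",
])
```

Only whole-line comments are skipped here, because a `#` inside a pattern could be a legitimate part of the regex.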
Grammar Documentation

Introduction

The input grammar format is a simple variation of EBNF. The BNF format used corresponds to the 4-tuple (V, A, P, S): V is the set of non-terminals (vocabulary), A is the alphabet, P is the set of production rules, and S is the start symbol.

Basic principle

Explanation of the 4-tuple definition (V, A, P, S)

Vocabulary (V)

Non-terminals must begin with a capital letter. A symbol that appears on the left-hand side of a production is thereby initialized as a non-terminal. Because the application can then tell which symbols have been initialized as non-terminals, it is not necessary to specify the set explicitly. This may change later for type-1 languages, where terminals may also appear on the left-hand side. At the moment, all non-terminals have to be defined in the lexer file! There is no automatic recognition yet.

Alphabet (A)

The alphabet defines the terminal symbols. These are also defined in the lexer file.

Production rules (P)

The syntax of the production rules is defined as follows:

Production Rule Arrow Operator

Production rules are therefore generally declared with an arrow (

Empty Symbol Operator

The empty symbol, which is normally written as an epsilon, is marked with a dollar sign ($).

Pipe Operator

In the example above you can see that production rules are not limited to just one definition rule. This is a modification that we use to avoid making the syntax unnecessarily long. The pipe symbol (

Start symbol (S)

At the moment, the first non-terminal defined is automatically used as the start symbol.
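The 4-tuple (V, A, P, S) can be sketched as a small data structure. The Python example below is an illustration under stated assumptions, not Parssist's code: productions are given directly as Python lists rather than parsed from grammar text, the `build_grammar` helper and the sample symbols are hypothetical, and V is derived from the left-hand sides as the basic principle describes (whereas Parssist currently expects non-terminals to be declared in the lexer file).

```python
EMPTY = "$"  # the empty symbol (epsilon), written as a dollar sign


def build_grammar(productions):
    """Build (V, A, P, S) from a list of (left_side, alternatives) pairs.

    Each alternative is a list of symbols; several alternatives for one
    left side correspond to rules joined with the pipe operator.
    """
    if not productions:
        raise ValueError("a grammar needs at least one production")

    # V: every symbol on a left-hand side is a non-terminal.
    nonterminals = []
    for left, _ in productions:
        if not left[:1].isupper():
            raise ValueError(f"non-terminal {left!r} must begin with a capital letter")
        if left not in nonterminals:
            nonterminals.append(left)

    # A: every other symbol, except the empty symbol, is a terminal.
    terminals = []
    for _, alternatives in productions:
        for alternative in alternatives:
            for symbol in alternative:
                if symbol != EMPTY and symbol not in nonterminals and symbol not in terminals:
                    terminals.append(symbol)

    # S: the first defined non-terminal is the start symbol.
    start = productions[0][0]
    return nonterminals, terminals, productions, start


# Hypothetical grammar, for illustration only.
SAMPLE_PRODUCTIONS = [
    ("Expr", [["Term", "plus", "Expr"], ["Term"]]),
    ("Term", [["number"], [EMPTY]]),
]
```

For the sample productions, `build_grammar` reports `Expr` and `Term` as the vocabulary, `plus` and `number` as the alphabet, and `Expr` as the start symbol.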
Open Issues

Parssist is still in development and there are many open issues. The top priority is to implement the following features:
For more information

Thank you for reading. I hope you enjoy using Parssist!