# LBNF

In order to write your own grammar for BNFC, sooner or later you need to know enough about LBNF, the domain-specific language in which such a grammar is written.

Let us look at the example [`cpp.cf`](https://github.com/BNFC/bnfc/blob/master/examples/cpp/cpp.cf). The first line is

    PDefs. Program ::= [Def] ;

Part of what we read here is user defined, while the rest is part of LBNF.

    Program ::= [Def]

is a production rule, or rule for short. `PDefs` is the name of the rule. [^nameoftherules] The names `PDefs`, `Program` and `Def` are user defined. The symbols ` . ` and ` ::= ` and ` [ ` and ` ] ` and ` ; ` are keywords of LBNF.

The second rule is

    DFun. Def ::= Type Id "(" [Arg] ")" "{" [Stm] "}" ;

Here we see for the first time the difference between terminals and non-terminals. (Terminals are tokens (strings) that appear in the program. The non-terminals are the symbols that appear immediately to the left of some `::=`.)

The following are user defined non-terminals.

    Def Type Id Arg Stm

Their meaning is defined further down in the file, where you find rules that have one of these non-terminals as left-hand side (as in `Arg ::= ...` in line 7 of `cpp.cf`).

On the other hand, the symbols

    ( ) { }

are so-called terminal symbols. As opposed to non-terminals, terminals actually appear in the program. Terminals can be recognised by being placed inside double quotes ` " " `.

**Exercise:** Go through [`cpp.cf`](https://github.com/BNFC/bnfc/blob/master/examples/cpp/cpp.cf) and explain the following notation, consulting the linked sections of the LBNF documentation.

- [terminator and separator](https://bnfc.readthedocs.io/en/latest/lbnf.html#terminators-and-separators)
- [coercions](https://bnfc.readthedocs.io/en/latest/lbnf.html#coercions)
- [rules](https://bnfc.readthedocs.io/en/latest/lbnf.html#token)
- [token](https://bnfc.readthedocs.io/en/latest/lbnf.html#the-token-rule) [^token]
- [comment](https://bnfc.readthedocs.io/en/latest/lbnf.html#the-comment-rule)
- [predefined basic types](https://bnfc.readthedocs.io/en/latest/lbnf.html#predefined-basic-types)

[^example]: For example, on the way to building a C++ compiler, we already had a glimpse of a compiler that translates regular expressions into DFAs and of another one that translates certain context-free grammars into deterministic pushdown automata.

## References

- [LBNF documentation](https://bnfc.readthedocs.io/en/latest/lbnf.html)

## Epilogue

This may be a good opportunity to reflect upon the distinction between object-language and meta-language. We met this distinction last semester in Programming Languages. In our current assignment, C++ is the object-language and LBNF is the meta-language. [^metameta]

It is often important to keep different levels of language apart. This is the reason why we use different terminology (in English as a "meta-meta-meta-language" :-)) to talk about LBNF and C++. For example, C++ has "identifiers", "expressions", "statements" etc., while LBNF has "terminals", "non-terminals", "production rules" etc. Sometimes the same terminology is used for both languages with different meanings, and then one needs to be careful. So while the types of C++ are "int", "bool" etc., you may find that non-terminals of the grammar such as "Exp" and "Id" are also referred to as "types". Moreover, these non-terminals are often implemented as types of the meta-meta-language (Haskell for us).

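
To make the last point more tangible, here is a minimal hand-written sketch of how the first two rules of `cpp.cf` might show up as Haskell types, in the style of the abstract syntax that BNFC generates. It is not the generated code itself: the definitions of `Type`, `Arg` and `Stm` (and the constructor names `TInt`, `TBool`, `ADecl`, `SReturnVoid`) are simplified stand-ins assumed for illustration, and the real rules are in `cpp.cf`.

```haskell
module Main where

-- A token type such as Id becomes a wrapper around the matched string.
newtype Id = Id String
  deriving (Show, Eq)

-- PDefs. Program ::= [Def] ;
-- The non-terminal Program becomes a type, the rule name PDefs a constructor.
data Program = PDefs [Def]
  deriving (Show, Eq)

-- DFun. Def ::= Type Id "(" [Arg] ")" "{" [Stm] "}" ;
-- Only the non-terminals become fields; terminals like "(" leave no trace.
data Def = DFun Type Id [Arg] [Stm]
  deriving (Show, Eq)

-- Hypothetical stand-ins (assumed here, not copied from cpp.cf).
data Type = TInt | TBool
  deriving (Show, Eq)

data Arg = ADecl Type Id
  deriving (Show, Eq)

data Stm = SReturnVoid
  deriving (Show, Eq)

-- Abstract syntax tree for the C++ program:  int main() { }
example :: Program
example = PDefs [DFun TInt (Id "main") [] []]

main :: IO ()
main = print example
-- prints: PDefs [DFun TInt (Id "main") [] []]
```

Note how the word "type" now means three things at once: `int` is a type of C++, `Type` is a non-terminal of the LBNF grammar, and `Type` is also a Haskell type whose values are tree nodes.
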
The use of English words that have different meanings in the three levels below it (Haskell, LBNF, C++) can be confusing, in particular if these meanings overlap to some extent. On the other hand, developing a totally precise technical language would make things even more difficult (for different reasons). So newcomers have to be patient, read a lot, and take the time to get familiar with the terminology. And, of course, implementing a compiler will help a lot to understand how all these (and more) different levels of language interact.[^example]

[^nameoftherules]: The names of the rules will be important later, as they label the nodes of the abstract syntax trees which form the hub of a modern compiler.

[^token]: The following is relevant: ![](https://i.imgur.com/MixSX9H.png) For example, in `cpp.cf`, the type `Id` defined via `token Id (letter (letter | digit | '_')*)` cannot parse `string`.

[^metameta]: There are also various meta-meta-languages (or should I say (meta-)*languages) involved. For example, [LBNF is itself defined in LBNF](https://github.com/BNFC/bnfc/blob/master/source/src/BNFC.cf), while the parser generator, as well as the generated parser, are implemented in Haskell.