# LBNF
In order to write your own grammar for BNFC, sooner or later you need to know enough about LBNF, the domain-specific language in which such a grammar is written.
Let us look at the example [`cpp.cf`](https://github.com/BNFC/bnfc/blob/master/examples/cpp/cpp.cf).
The first line is
`PDefs. Program ::= [Def] ;`
Some of what we read here is user-defined, while the rest is part of LBNF.
`Program ::= [Def]`
is a production rule, or rule for short. `PDefs` is the name of the rule.[^nameoftherules] The names `PDefs`, `Program` and `Def` are user-defined. The symbols ` . ` and ` ::= ` and ` [ ` and ` ] ` and ` ; ` are keywords of LBNF.
The second rule is
`DFun. Def ::= Type Id "(" [Arg] ")" "{" [Stm] "}" ;`
Here we see for the first time the difference between terminals and non-terminals. Terminals are tokens (strings) that appear in the program, whereas non-terminals are the symbols that appear immediately to the left of some `::=`.
The following are user-defined non-terminals: `Def`, `Type`, `Id`, `Arg` and `Stm`.
Their meaning is defined further down in the file, where you find rules that have any of these non-terminals as left-hand sides (as in `Arg ::= ...` in line 7 of `cpp.cf`).
On the other hand, the symbols `(`, `)`, `{` and `}` are so-called terminal symbols. As opposed to non-terminals, terminals actually appear in the program. Terminals can be recognised by being placed inside double quotes ` " " `.
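As a minimal sketch of this distinction, consider the following toy rules. They are hypothetical and not taken from `cpp.cf`: the labels `SAssign`, `EInt`, `EVar` and the categories `Stm` and `Exp` are made up for illustration, while `Ident` and `Integer` are two of LBNF's predefined basic types.

```
-- Hypothetical toy rules, not part of cpp.cf.
SAssign. Stm ::= Ident "=" Exp ";" ;   -- "=" and ";" (quoted) are terminals
EInt.    Exp ::= Integer ;             -- Stm and Exp are non-terminals
EVar.    Exp ::= Ident ;               -- Ident and Integer are predefined token types
```

Note the two roles of the semicolon in the first rule: the quoted `";"` is a terminal that must appear in the program, whereas the final unquoted `;` belongs to LBNF and ends the rule.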
**Exercise:** Go through [`cpp.cf`](https://github.com/BNFC/bnfc/blob/master/examples/cpp/cpp.cf) and explain the following notation, consulting the linked sections of the LBNF documentation.
- [terminator and separator](https://bnfc.readthedocs.io/en/latest/lbnf.html#terminators-and-separators)
- [coercions](https://bnfc.readthedocs.io/en/latest/lbnf.html#coercions)
- [rules](https://bnfc.readthedocs.io/en/latest/lbnf.html#token)
- [token](https://bnfc.readthedocs.io/en/latest/lbnf.html#the-token-rule) [^token]
- [comment](https://bnfc.readthedocs.io/en/latest/lbnf.html#the-comment-rule)
- [predefined basic types](https://bnfc.readthedocs.io/en/latest/lbnf.html#predefined-basic-types)
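
To get a feeling for what to look for, here is a hedged sketch of how these constructs might appear together in a small grammar. It is a hypothetical toy grammar for comma-separated sums, not an excerpt from `cpp.cf`; all labels and categories in it (`Prog`, `EAdd`, `EInt`, `EVar`, `Name`) are assumptions made for illustration.

```
-- Hypothetical toy grammar (not cpp.cf): comma-separated sums.
Prog.  Program ::= [Exp] ;
separator Exp "," ;                      -- how the items of [Exp] are separated

EAdd.  Exp  ::= Exp "+" Exp1 ;
EInt.  Exp1 ::= Integer ;                -- Integer is a predefined basic type
EVar.  Exp1 ::= Name ;
coercions Exp 1 ;                        -- links Exp and Exp1, adds parentheses

token Name (letter (letter | digit)*) ;  -- a user-defined token type

comment "//" ;                           -- line comments of the object language
comment "/*" "*/" ;                      -- block comments of the object language
```

When working through the exercise, it may help to compare each of these lines with its counterpart in `cpp.cf` and to work out which ordinary rules each shorthand stands for.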
[^example]: For example, on the way to building a C++ compiler, we have already had a glimpse of a compiler that translates regular expressions into DFAs and of another one that translates certain context-free grammars into deterministic pushdown automata.
## References
- [LBNF documentation](https://bnfc.readthedocs.io/en/latest/lbnf.html)
## Epilogue
This may be a good opportunity to reflect upon the distinction between object-language and meta-language. We met this distinction last semester in Programming Languages. In our current assignment, C++ is the object-language and LBNF is the meta-language.[^metameta] It is often important to keep different levels of language apart. This is the reason why we use different terminology (in English as a "meta-meta-meta-language" :-)) to talk about LBNF and C++. For example, C++ has "identifiers", "expressions", "statements" etc., while LBNF has "terminals", "non-terminals", "production rules" etc.
Sometimes the same terminology is used for both languages with different meanings, and then one needs to be careful. So while the types of C++ are "int", "bool" etc., you may find that non-terminals of the grammar such as "Exp" and "Id" are also referred to as "types". Moreover, these non-terminals are often implemented as types of the meta-meta-language (Haskell for us). The use of English words that have different meanings in the three levels below it (Haskell, LBNF, C++) can be confusing, in particular if these meanings overlap to some extent. On the other hand, developing a totally precise technical language would make things even more difficult (for different reasons). So newcomers have to be patient, read a lot and take the time to get familiar. And, of course, implementing a compiler will help a lot to understand how all these (and more) different levels of languages interact.[^example]
[^nameoftherules]: The names of the rules will be important later as the nodes in the abstract syntax trees which form the hub of a modern compiler.
[^token]: The following is relevant:

For example, in `cpp.cf`, the type `Id` defined via `token Id (letter (letter | digit | '_')*)` cannot parse `string`.
[^metameta]: There are also various meta-meta-languages (or should I say (meta-)*languages) involved. For example, [LBNF is itself defined in LBNF](https://github.com/BNFC/bnfc/blob/master/source/src/BNFC.cf), while the parser generator, as well as the generated parser, are implemented in Haskell.