# Lecture 2 ###### tags: `PSD` ### Chapter 3: From Concrete Syntax to abstract syntax **Lexers, Parsers and Generators** A *lexer* or *scanner* is a program that reads characters from a text file and assembles them into a stream of *lexical tokens or lexemes*. It usually ignores the amount of whitespace, carriage returns, tabulation characters and page breaks between non-black symbols. A *parser* is a program that accepts a stream of lexical tokens from a lexer, and builds an abstract syntax tree (AST) representing that stream of tokens. A *lexer generator* is a program that convets a parser specification into a lexer (which recognizes tokens described by the regular expression) // fslex A *parser generator* is a program that converts a parser specification into a parser. // fsyacc ![](https://i.imgur.com/uwhNbcr.png) This is a bottom-uo parser: LR parser. characterized by reading characters from the Left and making derivations from the Right-most nonterminal. **Grammars in parser specifications** A context free grammar has four components: 1. terminal symbol: e.g. x, 12, "foo" 2. nonterminal symbol: A 3. start symbol: S 4. grammar rules or productions of the form A::= tnseq. where tnseq is a sequence of terminal/nonterimal symbols. An informal grammar for our expression language ```F# Expr ::= NAME |INT |- INT |(Expr) |let NAME = Expr in Expr end |Epxr * Expr |Epxr + Expr |Epxr - Expr ``` **Parser specification for expressions** A complete parser specification for the expression language is found in Exprpar.fsy First it declares the tokens/terminal symbolsm CSTINT, NAME, and so on... followed by a declaration of operators associativity and precedence. ![](https://i.imgur.com/UIwH9OI.png) After this then the nonterminal Main is the start symbol. And gives rules to both nonterminals Main and Expr ![](https://i.imgur.com/YD0QgcD.png) The expressions in curly braces on the right describe what result will be produced by a succesful parse. The purpose of parsing is to produce an abstract syntax representation by reading a text file (concrete syntax) To summarize, the curly braces contain F# expressions that produce the result of a successful parse. They are sometimes called semantic actions. ### BCD This is about the math behind. See dmat notes instead.