<div align="center"><h3>Lab Report</h3></div> <div align="center"><h3><i>for</i></h3></div> <hr> <div align="center"><h1 style="border-bottom: none">Compiler Laboratory</h1></div> <hr> <div align="center"><h3><i>Submitted in requirement for the course</i></h3></div> <div align="center"><h2 style="border-bottom: none">CSN - 362</h2></div> <div align="center"><h5>OF BACHELOR OF TECHNOLOGY IN COMPUTER SCIENCE AND ENGINEERING</h5></div> <div align="center"><h3><i>by</i></h3></div> <div align="center"><h3>Unmesh Kumar</h3></div> <div align="center"><h3>(Enrollment Number : 18114079)</h3></div> <div align="center"><h3><i>Submitted to</i></h3></div> <div align="center"><h3>Dr. Pradumn K. Pandey</h3></div> <img src="https://i.imgur.com/wqo6WXg.jpg" width="250" height="250" align="center" /> <div align="center"><h5>Department of Computer Science and Engineering</h5></div> <div align="center"><h5>Indian Institute of Technology (IIT) Roorkee</h5></div> <div style="page-break-after: always; break-after: page;"></div> <div align="center"><h1>Contents</h1></div> [TOC] <div style="page-break-after: always; break-after: page;"></div> # Assignment 1 ### 1.1 Problem Statement Write a program to identify tokens available in C/C++. ### 1.2 Resources ``` lex Flex for Windows ``` ### 1.3 Files run.l - program file of program to read a program file and output identified tokens. ### 1.4 Program Logic ![](https://i.imgur.com/CF1WDkN.jpg) To categorise multiple tokens in a particular C/C++ programme into their respective kinds, i.e. •Keywords •Identifiers •Constants •Strings •Special Symbols() •Operatorsrun .l takes in the file name of the program file as an argument. It contains predefined keywords (Clanguage), operators (C style), delimiters (()[];,.¡space¿) and comment tokens. The program file is readcharacter by character. A buffer of 32 bytes is used to store the intermediate token value. The procedurefor detecting tokens is as follows: 1. Is a single line comment if a token starts with / and the next character is also a /. Continue receivinginput until you reach a newline, at which point a new token will appear. All of the information gatheredduring this procedure is discarded. 2. This is a block comment if the token starts with / and the following character is *. Continue receivinginput until you reach a */, at which point a new token will appear. All of the information gatheredduring this procedure is discarded. 3. Continue collecting input and putting it in the buffer if an alphanumeric character is found. If any ofthe characters encountered are letters, the token is an identifier, and the identifier flag is set to true. 4. If a delimiter or operator is encountered, it marks the end of the token. If identifier flag is set, thenthe token is matched with the list of keywords. If successful, the token is identified as that keyword.If all keywords fail to match, then the token is identified as identifier. If the identifier flag is not set,then the token is identified as number. A new token begins from the next character. ### 1.5 How to Run and Example * Statements in the input File ```cpp int x=1; ``` * Instructions to be typed in Terminal ``` $ lex run.I $ gcc lex.yy.c $ a ``` Output:: ``` keyword identifier operator constant special character ``` <div style="page-break-after: always; break-after: page;"></div> # Assignment 2 ### 2.1 Problem Statement Write a C program to recognize strings under 'a*', 'a*b+', 'abb'. ### 2.2 Resources * C ### 2.3 Files `recognize_string.c` A main c program which will find whether our string matches which of the patterns of the strings pattern a*, abb or a*b+. ### 2.4 Program Logic With the use of transition diagram we we find the input of the state. If any pattern is matched then the string is saif to follow one of three patterns. The patterns can be recognized using following given rule: ``` 'a*' matches empty string and any string which starts with character 'a' 'a*b+' matches every length of the string which has repeated 'b' and every string which starts with repeated 'a's followed by repeated b's 'abb' matches exactly the string 'abb' ``` ### 2.5 How to Run and Example * Input examaples ``` aaaabbbbb aabk ``` * Terminal output ``` $ gcc recognize_string.c -o recognize_string.exe $ recognize_string Enter a String: aaaabbbbb aaaabbbbb is accepted under rule 'a*b+' $ gcc recognize_string.c -o recognize_string.exe $ recognize_string Enter a string: aabk aabk is not recognized ``` <div style="page-break-after: always; break-after: page;"></div> # Assignment 3 ### 3.1 Problem Statement Write a C program to find first terminals of any input grammar. ### 3.2 Resources * C ### 3.3 Files first.c - program file of program that takes grammar as input and find FIRST for all terminals ### 3.4 Program Logic #### Assumptions * Epsilon is represented by ‘#’. * Productions are of the form A=B, where ‘A’ is a single Non-Terminal and ‘B’ can be any combination of Terminals and Non- Terminals. * L.H.S. of the first production rule is the start symbol. * Grammer is not left recursive. * Each production of a non terminal is entered on a different line. * Only Upper Case letters are Non-Terminals and everything else is a terminal. * We Don't use ‘!’ or ‘$’ as they are reserved for special purposes. #### Explanation Our takes number of production rules and then each production rule one by one as input. 1. If X is terminal, FIRST(X) = X. 2. If X→εis a production, then addεto FIRST(X). 3. If X is a non-terminal, and X→Y1 Y2 . . . Yk is a production, andεis in all of FIRST(Y1), . . . ,FIRST(Yk), then addεto FIRST(X). 4. If X is a non-terminal, and X→Y1 Y2 . . . Yk is a production, then add a to FIRST(X) if for some i,a is in FIRST(Yi), andεis in all of FIRST(Y1), . . . , FIRST(Y(i-1)) ### 3.5 How to Run and Example * Input example ``` E -> TR R -> +T R| # T -> F Y Y -> *F Y | # F -> (E) | i ``` * Terminal output ``` $ gcc first.c -o first.exe $ first First(E)= { (, i, } First(R)= { +, #, } First(T)= { (, i, } First(Y)= { *, #, } First(F)= { (, i, } ``` <div style="page-break-after: always; break-after: page;"></div> # Assignment 4 ### 4.1 Problem Statement Write a C++ program for constructing of LL(1) parsing. ### 4.2 Resources * C++ ### 4.3 Files `parser.cpp`- file contains the program to parse the input string. Input is given from the terminal. ### 4.4 Program Logic ``` 1.We Read the input string 2.predictive parsing table parse the given input using stack 3.If stack [i] matches with token input string pop the token else shift it repeat the process until it reaches to$ ``` The main while loop of the program: ### 4.5 How to Run and Example * Input example ``` i*i+i ``` * Terminal output ``` $ g++ parser.cpp $ ./a.out Enter the input string: i*i+i Stack INPUT $bt i*i+i$ $bcf i*i+i$ $bci i*i+i$ $bc *i+i$ $bcf* *i+i$ $bcf i+i$ $bci i+i$ $bc +i$ $b +i$ $bt+ +i$ $bt i$ $bcf i$ $bci i$ $bc $ $b $ $ $ SUCCESS ``` <div style="page-break-after: always; break-after: page;"></div> # Assignment 5 ### 5.1 Problem Statement Write a C++ program to identify LR(0) cannonical items and produce LR(1)(SLR) parsing table. E-> E+T|T T-> T*F|F F-> (E)|id Two levels of output: 1. LR(0) set of items 1. Parsing Table ### 5.2 Resources ``` * C++14 * lr_zero.cpp ``` ### 5.3 Files `lr_zero.cpp` ### 5.4 Program Logic The rules for the grammar are defined in the `vector<vector<char>>` type variable `grammar`. ```cpp= std::vector<Rule> grammar = { {'S', 'E', '$'}, {'E', 'E', '+', 'T'}, {'E', 'T'}, {'T', 'T' , '*', 'F'}, {'T', 'F'}, {'F', '(', 'E', ')'}, {'F', 'i'} }; ``` We first find the `first` and `follow` of the input grammar using the `get_first` and `get_follow` methods defined in the program. Then LR(0) set of items are stored in the `items` variable. Then we insert the symbols in the grammar and print the table using `print_tbl` method. ### 5.5 How to Run and Example * Input example No input example is needed. * Terminal output ``` $ g++ lr_zero.cpp $ ./a.out LR(0) items are: 0) E -> .E+T E -> .T F -> .(E) F -> .i S -> .E T -> .F T -> .T*F 1) E -> .E+T E -> .T F -> .(E) F -> .i F -> (.E) T -> .F T -> .T*F 2) E -> E.+T S -> E. 3) T -> F. 4) E -> T. T -> T.*F 5) F -> i. 6) E -> E+.T F -> .(E) F -> .i T -> .F T -> .T*F 7) E -> E+T. T -> T.*F 8) F -> .(E) F -> .i T -> T*.F 9) T -> T*F. 10) E -> E.+T F -> (E.) 11) F -> (E). State| i| )| (| *| +| $| E| T| F ----------------------------------------------------------------------------------------------------------- 0| S-5| | S-1| | | | 2| 4| 3 1| S-5| | S-1| | | | 10| 4| 3 2| | | | | S-6| Accept| | | 3| | R-4| | R-4| R-4| R-4| | | 4| | R-2| | S-8| R-2| R-2| | | 5| | R-6| | R-6| R-6| R-6| | | 6| S-5| | S-1| | | | | 7| 3 7| | R-1| | S-8| R-1| R-1| | | 8| S-5| | S-1| | | | | | 9 9| | R-3| | R-3| R-3| R-3| | | 10| | S-11| | | S-6| | | | 11| | R-5| | R-5| R-5| R-5| | | ```