正規語言與自動機

# 正規語言與自動機 ![](https://i.imgur.com/cTQrZcc.jpg) ## 複習部分 ### Set : Collection of Elements :::warning 只有兩種Membership,X $\in$ S或X $\ni$ S ::: #### Notation \begin{cases} S = \{0,1,2\}\\ S = \{i:i > 0,i\ is\ even\} \end{cases} 兩種表示方法都可以使用,如果語言是有限的就用第一種,如果是無限的就用第二種。 #### Set Operation \begin{cases} 聯集 : S_1 \bigcup S_2 = \{ X \in S_1\ or\ X \in S_2 \}\\ 交集 : S_1 \bigcap S_2 = \{ X \in S_1 \ \And \ X \in S_2 \}\\ 差集 : S_1 - S_2 = \{X \in S_1 \ \And \ X \not\in S_2\}\\ 非 : \overline{S_1} = \{X : X \in U,X \not\in S \}\\ \Phi = 空集合 \end{cases} :::warning $U$為宇集 ::: 運算特性 : \begin{cases} S_1 \bigcup \Phi = S_1\\ S_1 - \Phi = S_1\\ \overline{\Phi} = U\\ \overline{\overline{S_1}} = S_1\\ DeMorgan's Law : \\ \qquad\overline{S_1 \bigcup S_2} = \overline{S_1} \bigcap \overline{S_2}\\ \qquad\overline{S_1 \bigcap S_2} = \overline{S_1} \bigcup\overline\{S_2\} \end{cases} #### Subset $S_0 \subseteq S_1 代表 S_0 \subset S_1 或 S_0 = S_1$ #### Disjoint 表示沒有任何交集 $S_1 \bigcap S_2 = \Phi$ \begin{cases} finite\ set \rightarrow \mid S \mid : size \ of\ S,number\ of\ elements \\ infinite \ set \\ \end{cases} **Powerset of S（$2^S$）** : S的所有子集合,將S中任意元素取出,並使之成為集合。 Example: \begin{cases} S = \{a,b,c\}\\ 2^S = \{ \Phi ,\{a\},\{b\},\{c\},\{a,b\},\{b,c\},\{a,c\},\{a,b,c\} \} \\ \mid S\mid = 8\ (size\ of\ powerset) \end{cases} 每一種元素都有選或不選 $\Rightarrow$ 三個elements的set的size為$2^S = 2^3 = 8$ :::warning 空集合為所有集合的subset ::: #### Cartesian Product(笛卡爾積) \begin{matrix} S = S_1 \times S_2 = \{ (x,y):x\in S_1,y\in S_2\} \end{matrix} $從S_1內挑一個,S_2內挑一個配對$ :::warning $有順序性,（1,3）\not = （3,1）,因此S_1 \times S_2 \not = S_2 \times S_1$ ::: #### Partition of Set 1. $把set切割成n份(s_1,s_2,\cdot\cdot\cdot,s_n),每一份s_i都是subset,但彼此要disjoint(Mutually Disjoint)$ 2. $S_1 \bigcup S_2 \bigcup\cdot\cdot\cdot\bigcup S_n = S$ 3. $S_i\ not\ equal\ to\ \Phi$ ### Function \begin{split} &f:S_1 \rightarrow S_2\\ &S_1 = (domain\ set),S_2 = (range\ set)\\ &total\ function:domain\ set = S_1\\ &partial\ function:domain\ set\ \subset S_1 \end{split} :::warning partial 表示domain set有些值無對應解 ::: #### Order $比較function的大小,看數字時函數成長快慢$ \begin{split} &f(n) = O(g(n)),if\ f(n) \le c\mid g(n)\ \mid\\ &f(n) = \Omega(g(n)),if\ f(n) \ge c\mid g(n) \ \mid\\ &f(n) = \Theta(g(n)),if\ c_1\mid g(n) \ \mid \le f(n) \le c_2\mid g(n) \ \mid\\ \end{split} **Examples:** \begin{split} &f(n) = 2n^2 + 3n,f(n) = O(g(n))\\ &g(n) = n^3,g(n) = \Omega(h(n))\\ &h(n) = 10n^2+100n,h(n) = \Theta(f(n)) \end{split} ### Relation \begin{cases} function : 一個或多個x可以得到一個y,但不能一對多\\ relation : 允許一對多 \end{cases} **Equivalence Relations(對等)** \begin{matrix} R_1 \equiv R_2 \end{matrix} 需要滿足三個條件： 1. Reflexivity(反身性)：$X \equiv X$ 2. Symmetry(對稱性)：$X \equiv Y \Rightarrow Y \equiv X$ 3. Transitivity(遞移性)：$X \equiv Y,Y \equiv Z \Rightarrow X \equiv Z$ **Example** \begin{cases} X \equiv Y\\ if\ X\ mod\ 3 =\ Y\ mod\ 3\\ And\ 2\ mod\ 3 = 2,5\ mod\ 3 = 2\\ \Rightarrow2 \equiv5,0\equiv36 \end{cases} ### Graphs \begin{matrix} G(V,E) \end{matrix} 由於之後自動機可能有拜訪路徑,因此複習這個 **名詞** 1. $Walk : 從V_i \to V_j經過一些點的路徑$ 2. $Path : 沒有重複走任何一條edge的Walk$ 3. $Simple\ Path : 沒有重複經過任意vertice的path$ 4. $Cycle : 從V_i出發,走回自己的path$ 5. $Simple\ Cycle : 從V_i出發,走回自己的Simple\ Path$ 6. $loop : 從自己走回自己的edge$ ## Chapter 1 Introduction ### String 用alphabet組合而成的finite Sequence \begin{split} &v = aba\\ &prefix = a,ab\\ &suffix = a,ba\\ &empty\ string = \lambda\ (\lambda V = V\lambda = V)\\ &V^R = aba(Reverse)\\ &W^R = aaaba\\ &V^2 = abaaba = VV\\ &V^0 = \lambda\\ &\mid V \mid = length\ of\ V,\mid\lambda\mid = 0\\ &\mid\ V\mid + \mid W \mid = \mid V + W\ \mid\\ &\Sigma^* = 所有可能的組合\\ &\Sigma^+ = 所有可能的組合,但不包含\lambda \end{split} ### Grammar \begin{split} &G = (V,T,S,P)\\ &V = Variable\\ &T = Terminal\\ &S = Initial\ State\\ &P = Production\ rules \end{split} **Example** \begin{split} &V = \{S\}\\ &T = \{a,b\}\\ &P = \{S\to aSb,S\to \lambda\} \end{split} 除了不能無中生有之外,S部分可以隨意寫。一個Language L 由Grammar產生 **Notation** \begin{matrix} L(G) \end{matrix} ### Derivation(推導) 簡單講就是去套用rule **Example** \begin{matrix} S \Rightarrow aaSbb \Rightarrow aa\lambda bb(aabb) \end{matrix} 依據Production的過程可能產生一個句子aabb,aSb.aaSbb皆為sentential form,表句子內至少包含一個Variable。 \begin{matrix} \to 簡化為 S \Rightarrow^*aabb,意指S經過數步推導可得aabb \end{matrix} :::warning 確保L(G)的每個element可由Grammar產生避免L內有無法產生出的,要回頭證明,也要小心不要產生多餘的。 ::: **反過來,用句子找文法** **Example** \begin{matrix} L=\{a^nb^{n+1} ,n \ge 0\} = \{b,abb,aabbb,....\} \end{matrix} **Solution** \begin{cases} S \to Ab\\ A \to aAb\ |\ \lambda \end{cases} ### Automata(自動機) 自動機是一個數位的抽象模型,包含： \begin{cases} input\\ control\ unit(state\ diagram)\\ Possibly,storage\ mechanism\\ Possibly,output\ mechanism \end{cases} 一般可以設計成accepter input = any w,output is either "yes" or "no" 和transducer : any output strings :::warning Automata can be represented by a state diagram ::: 又分成兩種 \begin{cases} deterministic（dfa） : 狀態轉移唯一\\ non-deterministic(nfa) : 狀態轉移不唯一 \end{cases} Grammar用途：希望能用於生成識別字（identifiers）,範例如下： \begin{split} &<id>\to<letter><rest>\\ &<letter>\to a|b|...|z\\ &<rest>\to <letter><rest>|<digit><rest>|\lambda\\ &<digit>\to 0|1|...|9 \end{split} ## Chapter 2 Finite Automata ### DFA(Deterministic Finite Automata) \begin{matrix} M = \{Q,\Sigma,\delta,q_0,F\} \end{matrix} \begin{split} &Q : set\ of\ All\ state\\ &\Sigma : All\ input\ alphabets\\ &\delta : Transition\ function\\ &q_0 : Initial\ state\\ &F : Final\ state\\ &\delta:Q\times\Sigma\to Q(做笛卡爾積) \end{split} ![](https://i.imgur.com/rUumlom.jpg) \begin{matrix} \delta (q_0,0) = q_0\\ \delta (q_0,1) = q_1\\ \delta (q_1,0) = q_0\\ \delta (q_1,1) = q_2\\ \end{matrix} $\delta 為 total\ function,任意狀況及任意條件都有下一步的結果,而且只有一種狀況。畫的時候,每條路都要畫出來。$ 每一個state都恰有$\mid\Sigma\mid$條出路(每個alphabet都有一條路)。 **Transition table** | | 0 | 1 | | --- | -------- | -------- | |$q_0$|$q_0$|$q_1$| |$q_1$|$q_0$|$q_2$| |$q_2$|$q_2$|$q_1$| :::warning 碰到一種alphabet只有一種結果 ::: 當$L$停在$final\ state$時,表示可以被成功辨識$(accept)$,反之則無法辨識。 ### Regular Language($rl$) \begin{matrix} Def\ :if\ \And\ only\ if\ there\ exists\ a\ dfa\ that\ accepts\ L \end{matrix} 意指可以被$dfa$接受的語言即為正規語言。$dfa$本身只能夠讀$input$做狀態轉移,不能有$output$。 \begin{matrix} L = \{(ab)^n:n \gt 0\}\ is\ regular\ language \end{matrix} ![](https://i.imgur.com/xFGsyYO.jpg) \begin{matrix} L = \{awa:w \in \Sigma ^*\}\ is\ regular \end{matrix} ![](https://i.imgur.com/xGn3zTl.jpg) ### NFA(Non-deterministic Finite Automata) \begin{matrix} M = \{Q,\Sigma,\delta,q_0,F\} \end{matrix} \begin{split} &Q : the\ set\ of\ all\ state\\ &\Sigma : all\ input\ alphabet\\ &\delta : Q \times(\Sigma \bigcup \{\lambda\})\to 2^Q \end{split} $\to$可以不看輸入就進行狀態轉移,且transition function在收到input轉移到的state不只一個, 只要包含在$2^Q$即可。 :::warning 為$partial\ function$,指有一些alphabet在transition function是沒有定義的,且包含$\lambda$ move。 ::: **小統整** \begin{cases} dfa:從q_i出發走到唯一狀況q_j（只有一個final\ state）,中間所經過的path\ length只和輸入長度有關\\ nfa:從q_i出發走到不只一個狀況(不只一個final\ state),中間所經過路徑跟輸入的長度和可走路徑有關。\\ \Lambda:number\ of\ \lambda\ moves。對於連續的\lambda至多只要考慮\Lambda次就好,再多對結果沒有影響。 \end{cases} ### NFA和DFA轉換 \begin{matrix} nfa \approx dfa\\ M_1\approx M_2\\ L(M_1) = L (M_2) \end{matrix} ![](https://i.imgur.com/htSuPoh.jpg) $\to$但nfa可以在空集合時進行狀態轉移,因此轉移過程中,要在a的兩側加上n個$\lambda$,以確保狀態都被保留。 **Example** ![](https://i.imgur.com/NxdZ9Oe.jpg) ### Minimal DFA 要把沒辦法區分的state合併，使用minimal dfa寫程式起來才會短 \begin{split} &if\ \ \delta ^ *(p,w)\in F \to \delta ^ *(q,w)\in F\\ &and\ \ \delta ^ *(p,w)\not\in F \to \delta ^ *(q,w)\not\in F\\ &For\ all\ w \in \Sigma ^* \end{split} 如果不滿足條件則為$distinguishable$(可區分的) **Example** ![](https://i.imgur.com/kx3QONr.jpg) | | $q_0$ | $q_1$ | $q_2$ | $q_3$ | $q_4$ | | --- | --- | --- | --- | --- | --- | | $q_0$ | | | | | | | $q_1$ | $X_1$ | | | | | | $q_2$ | $X_0$ | $X_0$ | | | | | $q_3$ | $X_1$ | | $X_0$ | | | | $q_4$ | $X_0$ | $X_0$ | $X_0$ | $X_1$ | | 由Table可發現，可以將$q_1跟q_3$合併(因為兩者無法區分)。 **cheat sheet** \begin{cases} 1.如果一個是final\ state一個不是,直接可被區分出來。(X_0)\\ 2.如果一個可以走到final\ state一個不行，則看走幾步之後可以被區分出來(X_n,如果走n步區分出來的話)\\ 3.檢查收到輸出到達的state是否相同。 \end{cases} ## Chapter 3 Regular Language ### outline 1. set 2. Regular Grammar 3. Regular Expression ### Regular Expression(rex) 代表可以透過運算式展開,得到句子的集合。 1. $\Phi.\lambda.alphabet \in \Sigma$ 2. 透過組合產生更長的算式;如$union(+),concatenation(\cdot),closure (*)$ \begin{matrix} Ex:(a+b+c)^*\cdot(c+\Phi)\ is\ rex,\Sigma\{a,b,c\} \end{matrix} \begin{cases} r = \Phi,L(r) = \{\}\\ r = \lambda,L(r) = \{lambda\}\\ r = a,L(r) = \{a\} \end{cases} $\to$一個$language$是一個$Set\ of\ Sentence$ \begin{cases} r_1 + r_2 = L(r_1 + r_2) = L(r_1) + L(r_2)\\ r_1 \cdot r_2 = L(r_1r_2) = L(r_1)\cdot L(r_2)\\ r^*,L(r_1^*) = L(r_1)^* = L^0(r_1) \bigcup L^1(r_1) \bigcup L^2(r_1) \bigcup...\bigcup L^n(r_1) \end{cases} \begin{split} L(r) &= L(a^*)\cdot L(a+b)\\ &=(L^0(a)\bigcup L^1(a) \bigcup...\bigcup L^n(a))\cdot L(a+b)\\ &=\{a^n,n\ge 0\}\cdot \{a,b\}\\ &=\{a,b,aa,ab,aaa,aab\} \end{split} **Example** \begin{split} &(ab)^* : L=\{(ab)^n:n \ge 0\}\\ &(a+b):L=\{a,b\}\\ &(a+b)^*:L=\{a,b\}^* \ \ 任意由a,b組合成的字串\\ &a(bb)^*:L=\{ab^{2n}:n\ge 0\}\\ &a^*(a+b):L=\{a^na^ib^j:n,i,j\ge0,i+j=1\}\\ &(aa)^*(bb)^*b:L\{a^{2n}b^{2n+1}:n,m\ge0\} \end{split} $rex\approx regular\ language,\\只要可以用rex表達就是regular\ language;\\ 只要是regular\ language都可以用rex表達$ $proof:rex\Rightarrow regular\ language$ $regular\ language表示可以用dfa或是nfa來表達。$ $試證:rex可以轉成對應的nfa或dfa。$ ![](https://i.imgur.com/FG7fQ3B.jpg) \begin{split} &三種情況基本上都可以換成nfa\\ &證明:r_1+r_2,r_1\cdot r_2,r_1^*,所有nfa都能轉換成只有一個final\ state的狀態。\\ &1.把有多個final\ state的final\ state換成一班state，無條件接到新的final\ state,且新的final\ state只有一個。\\ \end{split} ![](https://i.imgur.com/l0vFDPW.jpg) $2.三種狀態證明$ ![](https://i.imgur.com/aZEwoKQ.jpg) ### Generalized Transition Graph(GTG) \begin{gather} r.l\leftrightarrow nfa \leftrightarrow rex \end{gather} \begin{cases} GTG:generalized\ transition\ graph,transition\ graph\ where\ edges\ are\ labelded\ with\ rex\\ complete\ GTG:all\ edges\ are\ presented \end{cases} **步驟** \begin{gather} r.l \Rightarrow nfa \Rightarrow complete\ GTG \Rightarrow reduce\ to\ two\ states \Rightarrow rex\\ \end{gather} :::warning 不能刪除initial state和final state ::: 簡單來說,檢查沒被刪掉的邊,看有沒有受到影響,如果受到影響就更新。要把所有可能的路線納入考慮,如: $q_0為initial\ state,q_2為final\ state,q_1為一般state,只能刪除q_1,\\ 則要檢查(1)q_0\to q_1^* \to q_0(2)q_0 \to q_2(3)q_2 \to q_0(4)q_2 \to q_1^* \to q_2$ 當只剩下兩個$state$時,便可以將$rex$讀出。 ## Regular Grammar \begin{split} &G = (V,T,S,P)\\ &V = Variable\\ &T = Terminal\\ &S = Start\ Symbol\\ &P = Production\ Rule \end{split} $linear\ grammar$ : 至多一個變數在右手邊，如$X\to Y$，最後會發現X也要限定為單一Variable。又可分成$right\ linear\ grammar\ 和\ left\ linear\ grammar$ :::warning $linear\ grammar一定是屬於right\ linear\ grammar或是left\ linear\ grammar,不可混搭。$ ::: ### clousure properties $L_1\ and \ L_2\mbox{ are regular}$ $\mbox{so are the languages that result from the following operation}$ \begin{align} &L_1 \cup L_2 = L(r_1 + r_2)\\ &L_1 \cdot L_2 = L(r_1 \cdot r_2)(\mbox{concatenation})\\ &\bar{L_1}\\ &L_1^{*} \end{align} #### proof(1) : L bar is regular \begin{gather} L_1\ is\ regular\ \to \bar{L_1}\ is\ regular \end{gather} $\mbox{there exist a machine M that L = L(M),}$ $\mbox{turn final into non-final;non-final into final,}$ $\mbox{then we got M'}$ #### proof(2) : intersection is also closure properties \begin{gather} L_1、L_2\ is\ regular\ \to L_1\cap L_2\ is\ regular \end{gather} $\mbox{by DeMorgan's Law} :　L_1\cap L_2 = \overline{\bar{L_1}\cup \bar{L_2}}$ $\bar{L_1}\ is\ regular ,\bar{L_1} \cup \bar{L_2}\ is\ regular,then\ \overline{\bar{L_1}\cup \bar{L_2}}\ is\ regular$ :::warning 由上述證明可知，遇到這些狀況仍然可以維持closure ::: #### proof(3) : Reverse \begin{gather} L\ is\ regular\ \to L^R\ is\ regular\\ L = L(M),L^R = L(M') \end{gather} \begin{cases} initial\ state \to final\ state\\ final\ state \to initial\ state\\ reverse\ all\ edges \end{cases} $\to Reverse可以維持closure的性質$ :::warning M只能有一個final ::: #### proof(4) : Homo morphism \begin{align} &\Sigma\ and\ \Gamma\ are\ alphabets\\ &{\it h} : \Sigma \to \Gamma\\ &{\it h}(L) = \{{\it h(w) : w\in L}\}\ homorphic\ image(同態映射) \end{align} **example** \begin{cases} L = \{a,b\}\\ {\it h}(a) = ab\\ {\it h}(b) = bbc \end{cases} $\to \ {\it h}(L) = {abbbc}$ #### Elementary Question about Regular Languages 1. $\mbox{given any w,is there exist an algorithm to determine whether or not w is in L?}$ **如果L是regular,只要看最後是否會進final state就可以確定是否是某個Language** 2. $\mbox{Determine L is Empty,Finite or Infinite?}$ **L是regular,就可以從M去觀察。** **(1) "L is empty"表示L無法容納任何句子,r = $\phi$可以用DFA表示,可以解決** **(2) "Infinite"只要DFA有迴圈,很容易就可以讓可辨識的語言變成無窮多** 3. $\mbox{given }L_1\mbox{ and }L_2,L_1 = L_2?$ $L = (L_1\cap \bar{L_2}) \cup (\bar{L_1} \cap L_2)\\ while\ L = \phi \Leftrightarrow L_1 = L_2$ ### identifying non-regular languages \begin{gather} regular \leftrightarrow dfa(finite\ automata) \end{gather} 一個dfa或是nfa能記住的資訊有限,當語言要求超過所能記住的,就沒辦法表達,即non-regular。 **example** $L = \{a^nb^n : n \geq 0\} \to not\ regular$ :::warning 因為不是right-linear grammar 或是 left-linear grammar ::: #### Pigeon principle(鴿籠原理) \begin{gather} n\ objects \xrightarrow{into} m\ boxes \end{gather} $while\ n \gt m$, $\mbox{then at least ine box must have more than one object in it}$ $\mbox{In a transition graph with n vertices(states),any walk of length n or longer must repeat some vertices}$ #### Pumping Lemma \begin{gather} \mbox{L is an infinite regular language} \Rightarrow \mbox{Satisfying Pumping Lemma} \end{gather} 指從initial state到final state上出現迴圈,一定有重複造訪的state,必為infinite language :::warning Satisfying Pumping Lemma is a necessary condition(必要條件), but not a sufficient condition(充分條件), 因此pumping lemma是用來證明**不是**regular language,不能用來證明**是**。 ::: #### Common mistake while using Pumping Lemma 1. use Pumping Lemma to prove L is regular 2. Select one w that is not belonging to L(選擇不屬於語言的句子來證明) 3. Given more restrictions on decomposition(加上額外限制) ### Context-Free languages $a^nb^n\mbox{相當於括號對稱型，在程式語言一定會用到，因此程式語言不能限定只用regular language}$ \begin{gather} \mbox{context-free grammar}\to \mbox{context-free language} \end{gather} #### Context Free Grammar \begin{align} &G:(Variable,Terminal,Start Symbol,Production Rule)\\ &P: A \to x\ ,A \in V\ ,x \in (V\cup T)^* \end{align} **example** $V = \{S\}$ $T = \{a,b\}$ $S \to aSa | bSb | \lambda$ :::warning Variable、Terminal可以任意混合,順序也沒有限制。 Context-Free為linear,幾乎包含所有linear的語言，但Context-Free沒有很好的Closure Properties。 $\mbox{Regular Language} \subset \mbox{Context-Free Language}$ ::: #### Derivation 可以永遠從左邊推到底(left-most),或先從右邊推到底(right-most),當leaf Node都是terminal時即代表推導完成。 #### 辨識Context-Free的機器不是一台DFA 可以用窮舉法辨識，但雖然說是窮舉法，有些不符合的分支根本不用考慮。當長度已經跟要辨識的句子一樣長了就該停下來了，但如果句子中有$\lambda\ production(A \to \lambda)$或是$unit\ production(A \to B)$可能讓文法停不下來(因為Variable的推導結果會讓長度誤判) ### Simple Grammar $\mbox{for any context-free grammar,there exists a parsing algorithm with time complexity }|w|^3$ **不是一個好的grammar,希望找出一個linear-time parser, 但context-free是不可能到linear,所以要用到Simple Grammar** \begin{align} &Simple\ Grammar \to G(V,T,S,P)\\ &P : A \to aX\\ & A \in V,a \in T,X\in V^* (\lambda \in T)\\ &\mbox{and pair(A,a) occur at most once in P}(一個Variable推導出的terminal至多只能出現一次) \end{align} :::warning S-grammar可以很確定下一步會推出什麼,不會有多種情況 ::: #### Ambiguous 無法利用演算法偵測是否出現Ambiguous,有一些語言本身就有交集,也不一定可以轉換成不會Ambiguous的寫法。如果regular grammar寫不好,也會產生Ambiguous。 ### Simplification of context-free grammar 1. $\lambda\mbox{ production}$ 2. $\mbox{unit production}$ 3. $\mbox{useless production(non-terminal、unreachable)}$ :::warning 處理順序不能調換,因為在做前兩步有可能會產生useless production ::: #### Removing $\lambda$ Production $\mbox{direct :}$\begin{gather} A \to \lambda \end{gather} $\mbox{indirect :}$\begin{gather} S \to A\\ A \to \lambda \end{gather} $\mbox{def : nullable variable}$ \begin{gather} A \xrightarrow* \lambda \end{gather} \begin{cases} S \to aS_1b\\ S_1 \to aS_1b | \lambda \end{cases} **把$\lambda$ production拿掉之後** \begin{cases} S \to aS_1b | ab\\ S_1 \to aS_1b| ab \end{cases} :::warning 把$\lambda$的狀況直接考慮進去,滿足$S_1 \to \lambda$的事實 ::: **步驟** 1. $\mbox{collect }V_N,V_N = \phi$ 2. $\mbox{add a variable A }\to V_N,\mbox{if A}\to\lambda\ or\ A \to A_1A_2 \cdots A_n,A_1A_2\cdots A_n \in V_N$ 3. $\mbox{eliminate }\lambda\mbox{ production}$ 4. $\mbox{use substitution rules}$ #### Removing Unit Production \begin{gather} A \to B,B \to C \end{gather} A的內容由C決定,可以直接把A轉成C的結果 **步驟** 1. $\mbox{draw the dependency graph}$ 2. $\mbox{construct a new grammar by removing all unit-productions}$ 3. $\mbox{use subsitution rules}$ **example** \begin{align} &S \to aA\ |\ B\\ &A \to a\ |\ bc\ |\ B\\ &B \to A\ |\ bb\\ &\mbox{after transfer}\cdots\\ &S \to aA\ |\ a\ |\ bc\ |\ bb\\ &A \to a\ |\ bc\ |\ bb\\ &B \to bb\ |\ a\ |\ bc \end{align} :::warning 可能不只會轉換一次,要檢查是否所有可能狀況都有考慮到 ::: #### Removing Useless Production 1. $V = \phi$ 2. $\mbox{add a variable A to }V_1,\mbox{ if } \\A\to x\mbox{ (at least one of x are all terminals)} \\ A \to xBy\mbox{ (x and y are all terminals and }B \in V_1\mbox{)}$只是說明那些Variable可以產生terminals string 3. $\mbox{Eliminate any productions containing Variables not in }V_1$ 4. $\mbox{Eliminate variable that are not reachable from S by using a dependency graph}$ **example** \begin{align} &S \to aS\ |\ A\ |\ C\\ &A \to a\\ &B \to aa\\ &C \to aCb\\ \\ &\mbox{collect }V_1 = \{A,B,S\} \to\mbox{ Eliminate }C\\ &\mbox{B is unreachable from S}\to\mbox{ Eliminate }B \end{align} #### Chomsky Normal Form \begin{gather} S \to AS\ |\ a \end{gather} 文法不能產生$\lambda$,而且要經過Simplify, 只能產生恰好兩個Variable或是單一的terminal #### Greibach Normal Form \begin{align} &A \to aX\\ &a \in T\\ &X \in V^* \end{align} 文法要先產生一個單一個terminal, 再產生任意數目的Variable(也可以沒有) ### Pushdown Automata Pushdown Automata適用於辨識Context-free grammar的機器,跟Finite Aotomata一樣有non-deterministic跟deterministic,但前提是所有的context-free grammar都必須轉換成Greibach Normal Form。 Pushdown Automata在做狀態轉移還有資料輸出。 :::warning 在$Finite\ Automata$的時候我們都會說$\mbox{nfa}$可辨識的句子量$\approx\mbox{ dfa}$可辨識的句子量;但在$Pushdowm\ Automata$,$\mbox{non-deterministic}$跟$\mbox{deterministic}$的能力是不對等的(npda比較強)。 ::: #### Non-deterministic Pushdown Automata \begin{align} &non\mbox{-}deterministic\ pushdown\ automata: M = (Q,\Sigma,\Gamma,\delta,q_0,z,F)\\ & Q: \mbox{a finite set of states}\\ & \Sigma : \mbox{An input alphabet} \\ &\delta : Q\times (\Sigma \cup \{\lambda\})\times T \to 2^{Q\times \Gamma^* } \\ &q_0: \mbox{an initial state}\\ &\Gamma :可以Output到Stack的Symbol \\ &z:stack的起始狀態\\ &F:\mbox{a set of final states} \end{align} 既然是non-deterministic,在進行狀態轉移時可能會有不只一種狀況。 :::warning 在便是期間可能會有一些沒有被定義的步數,指的是辨識發生錯誤,無法繼續往下走,稱為dead state ::: 在進行狀態轉移時除了要注意input alphabet之外,還要檢查現在的stack狀態才能決定要轉移到哪一個state。 Transition rule如下: \begin{gather} \delta(q_0,a,0) = \{(q_1,10),(q_3,\lambda)\} \end{gather} 如上述,指的是$q_0$碰到a的輸入,且stack最上面是0,這時候可能有兩種狀況,第一種狀況為轉移到$q_1$,在stack輸入10(先輸入0再輸入1);或是轉移到$q_3$,不放任何東西到stack :::warning 任何context-free的語言都可以找到npda,狀態轉移都是在模仿文法推導 ::: #### Deterministic Pushdown Automata 跟npda的差異只有"**狀態轉移只有一種可能狀況**",dpda還是可以接受 $\lambda\ movement$。因為大部分推導狀況跟npda相去不遠,不多做贅述。