# 自動機理論與正規語言
<!-- ---
###### tags: `NTNU CSIE`, `大三上課程`, `選修`, `大三上修習` -->
---
> [TOC]
---
## Chapter 1 - Introduction to the Theory of Computation
### 1.1 - Mathematical Preliminaries & Notation
- Order of magnitude
- $|f(n)| ≤ c |g(n)| → f(n) = O(g(n))$
- $|f(n)| ≥ c |g(n)| → f(n) = Ω(g(n))$
- $c_1 |g(n)| ≤ |f(n)| ≤ c_2 |g(n)| → f(n) = Θ(g(n))$
Ex:
> 
- Equivalence
$x≡y$
### 1.2 - Three Basic Concepts
- Languages
- $Σ^*$:≥ 0.
- $Σ^+$:≥ 1.
- Definition of **Grammar**
$G=(V,\ T,\ S,\ P)$
- $V$:a finite set of objects called **variables**.
- $T$:a finite set of objects called **terminal symbols**.
- $S$:a special symbol called the **start variable**.($S∈V$)
- $P$:a finite set of **productions**.
$V$ & $T$ are nonempty & disjoint.
## Chapter 2 - Finite Automata
### 2.1 - Deterministic Finite Accepters(DFA)
- Definition of **DFA**
$M=(Q,\ Σ,\ δ,\ q_0,\ F)$
- $Q$:a finite set of **internal states**.
- $Σ$:a finite set of symbols called the **input alphabet**.
- $q_0$:the **initial state**.($q_0∈Q$)
- $F$:a set of **final states**.($F⊆Q$)
- $δ$:a total function called the **transition function**.($Q×Σ→Q$)
- <span style="color: blue">冰塊的筆記
每個 state 皆須有接受所有 input alphabets 並轉換到下個 state 的路徑,且不得有 $λ$。</span>
### 2.2 - Nondeterministic Finite Accepters(NFA)
- Definition of **NFA**
$M=(Q,\ Σ,\ δ,\ q_0,\ F)$
- $Q$:a finite set of **internal states**.
- $Σ$:a finite set of symbols called the **input alphabet**.
- $q_0$:the **initial state**.($q_0∈Q$)
- $F$:a set of **final states**.($F⊆Q$)
- $δ$:the **transition function**.($Q×(Σ∪\{λ\})→2^Q$)
- <span style="color: blue">冰塊的筆記
每個 state 不一定要有接受所有 input alphabets 並轉換到下個 state 的路徑,且可以接受 λ 作為 input alphabet。</span>
### 2.3 - Equivalence of Deterministic and Nondeterministic Finite Accepters
- Convert NFA to DFA
路徑的格式變成:$δ({q_0}, a) = \{q_1, q_2\}$。
用空集合 state 表示不存在的 states:$Ø$。
## Chapter 3 - Regular Languages and Regular Grammars
[正規語言 - 維基百科](https://zh.wikipedia.org/wiki/%E6%AD%A3%E5%88%99%E8%AF%AD%E8%A8%80)
> 正規語言又稱正則語言是滿足下述相互等價的一組條件的一類形式語言:
> - 可以被確定有限狀態自動機辨識(DFA)
> - 可以被非確定有限狀態自動機辨識(NFA)
> - 可以被唯讀圖靈機辨識
> - 可以用正規表示式描述(Regular Expression, RE)
> - 可以用正則文法生成(Regular grammar)
> - 可以用字首文法生成
### 3.1 - Regular Expressions
### 3.2 - Connections Between Regular Expressions & Regular Languages
- For every regular language there is a regular expression.
For every regular expression there is a regular language.
- If $r$ is a regular expression, then there exists some nfa that accepts $L(r)$. Consequently, $L(r)$ is a regular language.
- Generalized Transition Graph(GTG)
### 3.3 - Regular Grammars
- Right- & Left-linear grammar
皆為 Regular grammar 的一種。
- Right-linear grammar
$G=(V,\ T,\ S,\ P)$ with all productions in form $A→xB$ or $A→x$, where $A,B∈V$ and $x∈T^*$.
- Left-linear grammar
$G=(V,\ T,\ S,\ P)$ with all productions in form $A→Bx$ or $A→x$, where $A,B∈V$ and $x∈T^*$.
> 
## Chapter 4 - Properties of Regular Languages
### 4.1 - Closure Properties of Regular Languages
Closed under union $(L_1∪L_2)$, intersection $(L_1∩L_2)$, concatenation $(L_1L_2)$, complementation $(\bar{L})$, star-closure $(L_1^*)$, reversal $(L^R)$, homomorphism $h(L)$ and right quotient $(L_1/L_2)$.
### 4.2 - Elementary Questions about Regular Languages
- Membership question:membership algorithm.
- Standard representation
### 4.3 - Identifying Nonregular Languages
- Pigeonhole principle
- <span style="color: #CC0000">Pumping lemma for **regular** languages
Let $L$ be an infinite regular language. Then there exists some positive integer m s.t. any $w∈L$ with $|w|≥m$ can be decomposed as $w=xyz$, with $|xy|≤m$ and $|y|≥1$, s.t. $w_i=xy^iz$ which is also in $L$ for all i= 0, 1, 2, ... .</span>
<span style="color: blue">冰塊的筆記 - 重點:
- $w∈L$ + $|w|≥m$
- $w=xyz$ + $|xy|≤m$ + $|y|≥1$
- $w_i=xy^iz$</span>
## Chapter 5 - Pushdown Automata
### 5.1 - Context-Free Grammar(CFG)
- Definition of **CFG**
$G=(V,\ T,\ S,\ P)$ with all productions in $P$ have the form $A→x$, where $A∈V$ and $x∈(V∪T)^*$
- A language L is said to be context-free if and only if there is a **CFG** $G$ s.t. $L=L(G)$.
- Every regular grammar is context-free, so a regular language is also a context-free one.
### 5.2 - Parsing and Ambiguity
- Definition of **simple grammar** or **s-grammar**
$G=(V,\ T,\ S,\ P)$ if all its productions are of the form $A→ax$, where $A∈V$, $a∈T$, $x∈V^*$, and any pair $(A,a)$ occurs at most once in $P$.
- <span style="color: blue">冰塊的筆記
只要存在一個 unambiguous grammar,則一 language 即為 unambiguous;但若每個可生成一 language 的 grammar 皆為 ambiguous,則此 language 為 ambiguous。</span>
### 5.3 - Context-Free Grammars and Programming Languages
## Chapter 6 - Simplification of Context-Free Grammars and Normal Forms
### 6.1 - Methods for Transforming Grammars
- λ-free languages
- Definition of **useful / useless**
A variable is useful if and only if it occurs in at least one derivation, otherwise it's useless.
- λ-production:$A→λ$
- unit-production:$A→B$
### 6.2 - Two Important Normal Forms
- Chomsky Normal Form
A CFG if all productions are of the form $A→BC$ or $A→a$, where $A, B, C$ are in $V$, and $a$ is in $T$.
- Greibach Normal Form
A CFG if all productions are of the form $A→ax$, where $a∈T$ and $x∈V^*$.
## Chapter 7 - Pushdown Automata
### 7.1 - Nondeterministic Pushdown Automata(NPDA)
- Definition of **NPDA**
$M=(Q,\ Σ,\ Γ,\ δ,\ q_0,\ z,\ F)$
- $Q$:a finite set of **internal states of the control unit**.
- $Σ$:the **input alphabet**.
- $Γ$:a finite set of symbols called **stack alphabet**.
- $δ$:$Q×(Σ∪\{λ\})×Γ$ → finite subsets of $Q×Γ^*$ is the **transition function**.
- $q_0$:the **initial state of the control unit**.($q_0∈Q$)
- $z$:the **stack start symbol**.($z∈Γ$)
- $F$:a set of **final states**.($F⊆Q$)
- Example of **NPDA**
> 
### 7.2 - Pushdown Automata & Context-free Languages
- <span style="color: #CC0000">To show a language $L$ is context-free
→ Find an npda $M$ such that $L=L(M)$.</span>
### 7.3 - Deterministic Pushdown Automata(DPDA)& Deterministic Context-free Languages
- Definition of **DPDA**
Same with npda, but s.t. to the restrictions that, for every $q∈Q$, $a∈Σ∪\{λ\}$ and $b∈Γ$,
1. $δ(q,a,b)$ contains at most one element.
1. If $δ(q,λ,b)$ is not empty, then $δ(q,c,b)$ must be empty for every $c∈Σ$.
<span style="color: blue">冰塊白話文:<br> 1. 每個階段的 stack 字符對應任一個輸入字符皆只能有一種結果。<br> 2. 如果階段 A 的某個 stack 字符可不經輸入字符直接跳轉到階段 B,<br> 則跳轉到階段 B 必定不再需要、也不能有任何輸入字符作為管道。<br>其實就像當初 nfa 跟 dfa 的差異一樣。</span>
- <span style="color: #CC0000">To show a language $L$ is deterministic context-free
→ Find a dpdp $M$ such that $L=L(M)$.</span>
## Chapter 8 - Properties of Context-Free Languages
### 8.1 - Two Pumping Lemmas
- <span style="color: #CC0000">Pumping lemma for **context-free** languages
- Let $L$ be an infinite context-free language. Then there exists some positive integer m, s.t. any $w∈L$ with $|w|≥m$ can be decomposed as $w=uvxyz$, with $|vxy|≤m$ and $|vy|≥1$, s.t. $uv^ixy^iz∈L$, for all i= 0, 1, 2, ... .</span>
<span style="color: blue">冰塊的筆記:
- $w∈L$ + $|w|≥m$
- $w=uvxyz$ + $|vxy|≤m$ + $|vy|≥1$
- $w_i=uv^ixy^iz$</span>
- $S \overset{\tiny{*}} \longrightarrow uAz \overset{\tiny{*}} \longrightarrow uvAyz \overset{\tiny{*}} \longrightarrow uvxyz$
- <span style="color: #CC0000">Pumping lemma for **linear** context-free languages
- Let $L$ be an infinite linear language. Then there exists some positive integer m, s.t. any $w∈L$ with $|w|≥m$ can be decomposed as $w=uvxyz$, with $|uvxy|≤m$ and $|vy|≥1$, s.t. $uv^ixy^iz∈L$, for all i= 0, 1, 2, ... .</span>
<span style="color: blue">冰塊的筆記:
- $w∈L$ + $|w|≥m$
- $w=uvxyz$ + $|uvxy|≤m$ + $|vy|≥1$
- $w_i=uv^ixy^iz$</span>
### 8.2 - Closure Properties and Decision Algorithms for Context-Free Languages
- Closed under union $(L_1∪L_2)$, concatenation $(L_1L_2)$, star-closure $(L_1^*)$.
- Not closed under complementation $(\bar{L})$ and intersection $(L_1∩L_2)$ but regular intersection, which $L_1$ is CFL and $L_2$ is RL.
## Chapter 9 - Turing Machines
### 9.1 - The Standard Turing Machine
- Definition of **Turing Machine**
$M=(Q,\ Σ,\ Γ,\ δ,\ q_0,\ □,\ F)$
- $Q$:a finite set of **internal states**.
- $Σ$:a finite set of **input alphabet**, not including $□$.($Σ⊆Γ-\{□\}$)
- $Γ$:a finite set of symbols called **tape alphabet**.
- $δ$:the **transition function**.($Q×Γ→Q×Γ×\{L,R\}$)
- $q_0$:the **initial state**.($q_0∈Q$)
- $□$:the special symbol called the **blank**.($□∈Γ$)
- $F$:a set of **final states**.($F⊆Q$)
### 9.2 - Combining Turing Machines for Complicated Tasks
未讀
### 9.3 - Turing's Thesis
> 任何在算法上可計算的問題同樣可由圖靈機計算。
> Any computation that can be carried out by mechanical means can be performed by some TM.
## Chapter 10 - Other Models of Turing Machines
### 10.1 - Minor Variations on the Turing Machine Theme
未讀
### 10.2 - Turing Machines with more Complex Storage
未讀
### 10.3 - Nondeterministic Turing Machines
- Definition of **Nondeterministic Turing Machine**
$M=(Q,\ Σ,\ Γ,\ δ,\ q_0,\ □,\ F)$
- $Q$:a finite set of **internal states**.
- $Σ$:a finite set of **input alphabet**, not including $□$.($Σ⊆Γ-\{□\}$)
- $Γ$:a finite set of symbols called **tape alphabet**.
- $δ$:the **transition function**.($Q×Γ→\style{color: red}{2^{Q×Γ×\{L,R\}}}$)
- $q_0$:the **initial state**.($q_0∈Q$)
- $□$:the special symbol called the **blank**.($□∈Γ$)
- $F$:a set of **final states**.($F⊆Q$)
基本上跟 [9.1 的 Definition of **Turing Machine**](https://hackmd.io/FRXC435QTQeEjgmP_ceEiQ?both#91---The-Standard-Turing-Machine) 一樣,只差在<span style="color: red">紅色</span>部分。
<span style="color: blue">冰塊的筆記:一如往常,Nondeterministic 就是所有 $δ$ 的對象會變成一個容許多個 states 的 set,而非單一 state。</span>
### 10.4 - A Universal Turing Machine
- Definition
An automaton that, given as input the description of any TM $M$ and a string $w$, can simulate the computation of $M$ on $w$.
- Illustration
> 
### 10.5 - Linear Bounded Automata
- Definition of **LBA**
A nondeterministic TM $M=(Q,\ Σ,\ Γ,\ δ,\ q_0,\ □,\ F)$, s.t. to the restriction that $Σ$ must contain two special symbols $[$ and $]$, such that $δ(q_i,[)$ can contain only elements of the form $(q_j,[,R)$, and $δ(q_i,])$ can contain only elements of the form $(q_j,],L)$.
基本上跟 [10.3 的 Definition of **Nondeterministic Turing Machine**](https://hackmd.io/FRXC435QTQeEjgmP_ceEiQ?both#103---Nondeterministic-Turing-Machines) 一樣,
<span style="color: red">但要再在 $Σ$ 增加兩個特殊符號:$[$ 和 $]$,且 $δ(q_i,\ [)$ 的 state set 只能是 $(q_j,\ [,\ R)$ 格式、$δ(q_i,\ ])$ 的 state set 只能是 $(q_j,\ ],\ L)$ 格式。</span>
<span style="color: blue">冰塊白話文:其實效果就是讓這個 TM 維持在 $[$ 和 $]$ 之間移動。</span>
## 最後統整
- 三個 **Pumping Lemma** 的比較
| RLs | CFLs | Linear CFLs |
| :---------: | :-------------: | :-------------: |
| $\|w\|≥m$ | $\|w\|≥m$ | $\|w\|≥m$ |
| $w=xyz$ | $w=uvxyz$ | $w=uvxyz$ |
| $\|xy\|≤m$ | $\|vxy\|≤m$ | $\|uvxy\|≤m$ |
| $\|y\|≥1$ | $\|vy\|≥1$ | $\|vy\|≥1$ |
| $w_i=xy^iz$ | $w_i=uv^ixy^iz$ | $w_i=uv^ixy^iz$ |
- To show a language $L$ is context-free:
- Show that it's regular(Find its RE or something else)
- Find an npda $M$ such that $L=L(M)$
- Find an dpda $M$ such that $L=L(M)$ → deterministic context-free
- Use Pumping Lemma
- Closure properties