---
tags: [2025_Fall]
---
# 自動機與形式語言筆記
<!-- HackMD ID:pVyh2UKlRniZCGACJ4xFVw -->
## 0 Introduction
<style>
.red {color: red;}
</style>
最終目標是介紹 Complexity,但會先從 Automata、Computability 開始。
### Notions
#### 基本概念
* **集合 (Set)**:無序的獨特物件組合。Subset: $\subseteq$, Proper subset: $\subset$.
* **序列 (Sequence)**:有順序的物件排列。(本筆記中盡量 0-based)
* **元組 (Tuple)**:有限序列,K 個元素稱 K-元組。
* **笛卡兒積 (Cartesian product/ Cross product)**:$A \times B = \{(a, b) | a \in A, b \in B\}$,所有有序對的集合。
* **冪集 (Power Set)**:$P(A) = \{S | S \subseteq A\}$,集合 $A$ 所有子集的集合。
#### 函數與關係
* **函數 (Function)**:每個輸入對應唯一輸出。
* **述詞 (Predicate / Property)**:range 為 {TRUE, FALSE} 的函數。
* **關係 (Relation)**:Domain 為 a set of k tuples ($A \times \ldots \times A$) 的 predicate,比如大於(2>1 是 TRUE)。用集合表示會更方便,例如 $P:D \to \{\text{TRUE}, \text{FALSE}\}$ 寫成 $(D,S)$,其中 $S=\{a\in D | P(a) = \text{TRUE}\}$。
* **等價關係 (Equivalence Relation)**:二元關係 $R$ 須滿足:
1. **自反性 (Reflexive)**:$\forall x, xRx$。
2. **對稱性 (Symmetric)**:$\forall x, y, xRy \Rightarrow yRx$。
3. **遞移性 (Transitive)**:$\forall x, y, z, xRy \land yRz \Rightarrow xRz$。
#### 圖 (Graph)
這裡未指名時通常是無向。
#### Strings 與 Languages
很重要
* **Alphabet** 是任何 **nonempty finite** set
* Alphabet 的成員是 **Symbol**
* **String** (over an alphabet A) 是 **finite** sequence of symbols from A。(本筆記中盡量 1-based)
* $\epsilon$ 是空字串。
* **Shortlex order**: 和字典序一樣,但是先比長度再比字典序,例如 $(\epsilon, 0, 1, 00, 01, \ldots)$ 是 over $\{0, 1\}$ 的 shortlex order。
* **Language** 是某個 alphabet 上的 string 的集合,可能(很常)是 infinite。
### Definitions, Theorems, and Proofs
* **Definition**: 定義概念。
* **Statement**: 命題,未必是 TRUE 或 FALSE。
* **Proof**: 講述命題為真的邏輯論述。
* **Theorem**: 已被證明的 statement。
* **Lemma**: 輔助用的 theorem。
* **Corollary**: 從 theorem 推導出的 statement。
### Types of Proofs
* **Proof by Construction**: 直接構造出例子。
* 比如要求證明對於大於 2 的偶數 n,存在一個 n 個 node 的 3-regular Graph,就直接寫出對於每個 n 要如何建構 Graph。
* **Proof by Contradiction**: 反證法。
* 假設命題為 FALSE,然後推導出矛盾。
* **Proof by Induction**: 歸納法。
* basis: 基礎 case。
* inductive step: 假設對於 n 成立 (**Induction Hypothesis**),證明對於 n+1 也成立。
## 1 Regular Languages
### Finite Automata
#### Definition of Finite Automaton
$(Q, \Sigma, \delta, q_0, F)$ 是一個 finite automaton,其中:
* $Q$ 是 **states** 的有限集合。
* $\Sigma$ 是作為 **alphabet** 的有限集合。
* $\delta: Q \times \Sigma \to Q$ 是 **transition function**。
* $q_0 \in Q$ 是 **start state**。
* $F \subseteq Q$ 是 **accept (final) states** 的集合。
運作方式:自動機從初始狀態 $q_0$ 開始,依序讀取輸入字串中的每個符號。根據轉移函數 $\delta$,從當前狀態和讀取的符號決定下一個狀態。讀取完所有符號後,若最終狀態屬於接受狀態集合 $F$,則該字串被接受;否則,該字串被拒絕。
> $\delta$ is a total function, so for every state and input symbol, there is always a defined next state.
**機器 M 所辨識的語言 (language of machine M)**: 所有被 M 接受的字串所組成的集合 A,寫成 $L(M)=A$,稱為 M recognizes A (accepts 也行,但是通常用在 string)。
#### Definition of Computation
Let
* $M=(Q, \Sigma, \delta, q_0, F)$ be a finite automaton.
* $w = a_1 a_2 \ldots a_n$ be a string over $\Sigma$.
M accepts w $\iff$ there exists a sequence of states $r_0, r_1, \ldots, r_n \in Q$ such that:
1. $r_0=q_0$
2. $r_{i+1} = \delta(r_{i}, a_{i+1}), \forall i \in [0, n-1]$
3. $r_n \in F$
M recognizes language A $\iff A =\{w| M \text{ accepts } w\}$.
A **Regular Language** 是可以被某個 finite automaton 所接受的 language。
#### Regular Operations
Let A and B be languages.
* **Union**: $A \cup B = \{w | w \in A \lor w \in B\}$
* **Concatenation**: $A \circ B = \{w_1 w_2 | w_1 \in A \land w_2 \in B\}$
也可以省略不寫 $\circ$,直接寫成 $AB$。
* **Kleene Star**: $A^* = \{w_1w_2\ldots w_k|k\ge 0 \text{ and each }w_i\in A\}$,也可以寫成 $\bigcup_{n=0}^{\infty} A^n = \{\epsilon\} \cup A \cup A^2 \cup A^3 \cup \ldots$,其中 $A^0=\{\epsilon\}, A^n=\{wv|w\in A^{n-1},v\in A\}$。
**Theorem**: The set of regular languages is closed under all three of the regular operations.
* For union, prove by construction:
Let $A_1$ and $A_2$ be regular languages recognized by finite automata $M_1=(Q_1, \Sigma, \delta_1, q_{01}, F_1)$ and $M_2=(Q_2, \Sigma, \delta_2, q_{02}, F_2)$, we can construct a new finite automaton $M=(Q, \Sigma, \delta, q_0, F)$ that recognizes $A_1 \cup A_2$.
* $Q = Q_1 \times Q_2$ (Cartesian product)
* $\Sigma$ remains the same, but this theorem is still true if the alphabets are different.
* $\delta((q_1, q_2), a) = (\delta_1(q_1, a), \delta_2(q_2, a))$
* $q_0 = (q_{01}, q_{02})$
* $F = (F_1\times Q_2)\cup(Q_1\times F_2) = \{(q_1, q_2) | q_1 \in F_1 \lor q_2 \in F_2\}$
* For concatenation: 無法簡單構造,因為不知道 $A_1$ 和 $A_2$ 中間要在哪裡切,所以需要 Nondeterminism。Kleene Star 也是類似的情況。
### Nondeterminism
#### Definition of Nondeterministic Finite Automaton (NFA)
Transition function $\delta$ 由 $Q \times \Sigma \to Q$ 改成 $Q\times \Sigma_\epsilon\to P(Q)$,其中 $P(Q)$ 是 $Q$ 的 Power set,$\Sigma_\epsilon=\Sigma\cup\{\epsilon\}$。也就是可能轉移到輸出的子集中的任一個 state,甚至輸入的 symbol 可以是 $\epsilon$。
其實也能在使輸入維持 $Q\times \Sigma\to P(Q)$,只要把轉移後再經過若干 $\epsilon$ 轉移能到的狀態都新增到輸出中就行了。
轉移條件改為 $r_{i+1} \in \delta(r_{i}, a_{i+1}), \forall i \in [0, n-1]$
**Nondeterminism is a generalization of determinism**: Every DFA (deterministic finite automaton) is automatically an NFA (nondeterministic finite automaton).
> For Nondeterministic Automata, $\delta$ is a total function but the output can be empty set, so we may skip some transitions when drawing.
#### Equivalence of NFAs and DFAs
NFA 與 DFA 是等價的,也就是對於任意 NFA,都可以找到與其 recognizes 相同 language 的 DFA。對於有 $k$ 個狀態的 NFA $N=(Q,\Sigma,\delta,q_0,F)$,對應的 DFA $D=(Q',\Sigma,\delta',q_0',F')$ 的定義如下:
* 先定義輔助用的 $E(R)=\{q|q$ can be reached from $R$ by traveling along 0 or more $\epsilon$ arrows $\}$,表示從狀態集合 $R$ 可以到達的所有狀態。
* state: $Q'=P(Q)$
* transition function: $\delta'(R,a)=\bigcup_{r\in R}E(\delta(r,a))$
* start state: $q_0' = E(\{q_0\})$
* accept states: $F' = \{R \in Q' | R \cap F \neq \varnothing\}$
因此,以後並不太需要在意 NFA、DFA 的差別。
**Corollary**: A language is regular $\iff$ an NFA recognize it.
$\Rightarrow$:Regular 有 DFA recognize,而 DFA 是 NFA 的特例。
$\Leftarrow$:NFA recognize 某 language,則對應的 DFA 也 recognize 相同 language。
#### Closure under regular operations
Let
* $N_1=(Q_1, \Sigma, \delta_1, q_{01}, F_1)$ be an NFA recognizing language $A_1$.
* $N_2=(Q_2, \Sigma, \delta_2, q_{02}, F_2)$ be an NFA recognizing language $A_2$.
For each operation, construct a NFA $N=(Q,\Sigma,\delta,q_0,F)$ that recognizes the resulting languages, $A_1\cup A_2, A_1\circ A_2, A_1^*$:
* **Union**: 額外建立一個初始狀態 $q_0$,可以經過 $\epsilon$ 隨意到 $q_{01}$ 或 $q_{02}$。
* $Q=\{q_0\}\cup Q_1\cup Q_2$
* $\delta(q,a)=$
* $\delta_1(q,a)$ if $q\in Q_1$
* $\delta_2(q,a)$ if $q\in Q_2$
* $\{q_{01},q_{02}\}$ if $q=q_0 \land a=\epsilon$
* $\varnothing$ if $q=q_0 \land a\ne \epsilon$
* $F=F_1\cup F_2$
* **Concatenation**: 因為先經過 $N_1$ 再經過 $N_2$,$M$ accept 的狀態必須是 $N_1, N_2$ 都 accept 過,所以把所有 $N_1$ 的 accept state 以 $\epsilon$ 接到 $N_2$ 的 start state。細節是 $N_1$ 的 accept state 也可能繼續在 $Q_1$ 裡面走,不一定都直接到 $q_{02}$。
* $Q=Q_1\cup Q_2$
* $\delta(q,a)=$
* $\delta_1(q,a)$ if $q\in Q_1 \land \lnot(q \in F_1 \land a=\epsilon)$
* $\delta_1(q,a)\cup\{q_{02}\}$ if $q\in F_1 \land a=\epsilon$
* $\delta_2(q,a)$ if $q\in Q_2$
* $q_0=q_{01}$
* $F=F_2$
* **Kleene Star**: 所有原本的 accept state 可以回到 start state 繼續,且新 start state 也是一種 accept state。
* $Q=\{q_0\}\cup Q_1$
* $\delta(q,a)=$
* $\delta_1(q,a)$ if $(q\in Q_1 \land q \notin F_1) \lor (q\in F_1 \land a \ne \epsilon)$
* $\delta_1(q,a)\cup\{q_{01}\}$ if $q\in F_1 \land a=\epsilon$
* $\{q_{01}\}$ if $q=q_0 \land a=\epsilon$
* $\varnothing$ if $q=q_0 \land a\ne \epsilon$
* $F=\{q_0\}\cup F_1$
另外,事實上對於一些其他 operation 也是 closed,比如:
* Intersection: $A_1\cap A_2$。使用 product automaton (之前在 Union 使用的方法),使得狀態同時在 $N_1$ 和 $N_2$ 的 accept states 才算 accept。
* Set Difference: $A_1-A_2$。一樣用 product automaton,讓狀態在 $N_1$ 的 accept states 但不在 $N_2$ 的 accept states 才算 accept。
* Complement: $A_1^c=\Sigma^*-A_1$。顯然 $\Sigma^*$ 是 regular,而 Set Difference 下也 closed,所以 $A_1^c$ 也是 regular。
### Regular Expressions
Symbol 或 Alphabet 經過 Regular Operations 即為 Regular Expression。每個 Expression 只是一個記號,代表一個 set of strings 也就是 language,說成 "The expression **describes** the language"。
Let $\Sigma$ be an alphabet. A regular expression $R$ over $\Sigma$ is defined as follows (以下 $\to$ 符號代表 expression 描述的 language):
* $\varnothing\to\varnothing$
* $\epsilon\to\{\epsilon\}$
* $a$ for some $a\in \Sigma\to\{a\}$
* $R_1 \cup R_2$ for regular expressions $R_1, R_2\to L(R_1) \cup L(R_2)$
* $R_1 \circ R_2$ for regular expressions $R_1, R_2\to L(R_1) \circ L(R_2)$
* $R_1^*$ for regular expression $R_1\to L(R_1)^*$
若沒有括號,順序是 Kleene Star > Concatenation > Union,分別可以類比為次方、乘法、加法。
#### 速記
* $R^+=RR^*$,至少重複一次,$R^+\cup\epsilon=R^*$
* $R^n=R^{n-1}\circ R, R^0=\epsilon$
* $L(R)$ 是 $R$ 所代表的 language。
#### Examples
只挑稍微難理解的
* $1^*\varnothing$: 因為是指任何數量的 1 接上「什麼東西都不能」,所以 $1^*\varnothing=\varnothing$,事實上任何 expression 接上 $\varnothing$ 都是 $\varnothing$。
* 任何 expression 與 $\varnothing$ 的 Union 是本身。
* 類似的,任何 expression 與 $\epsilon$ 的 Concatenation 是本身。
* $\varnothing^*=\epsilon$,因為 $L(\varnothing^*) = L(\varnothing)^0 \cup L(\varnothing)^1 \cup L(\varnothing)^2 \cup \ldots = \{\epsilon\} \cup \varnothing \cup \varnothing \cup \ldots = \{\epsilon\}$
注意正規表達式本身是 notation,不直接等於它描述的 language。因此,應寫 $\varnothing^*=\epsilon$(表示這兩個正規表達式描述相同 language),或 $L(\varnothing^*)=\{\epsilon\}$(表示 $\varnothing^*$ 所描述的 language 是只有 epsilon 的集合)。
#### Equivalence with Finite Automata
A language is regular $\iff$ A regular expression describes it.
$\Leftarrow$: Proof by Construction. Given a regular expression $R$, we can construct a NFA $M=(Q,\Sigma,\delta,q_0,F)$ that recognizes the language $L(R)$. (若很明顯會省略)
* **Base cases**
* $R=a, a\in\Sigma$: $\delta(q_0,a)=\{q_1\}, F=\{q_1\}$
* $R=\epsilon$: $F=\{q_0\}$
* $R=\varnothing$: $F=\varnothing$
* **Inductive steps**. For Union, Concatenation, and Kleene Star, use the proof that the set of regular languages is close under those operations.
Let $R_1$ and $R_2$ are regular expressions such that $L(R_1), L(R_2)$ are both regular languages so there exist their respective DFAs. For $(R_1\cup R_2), (R_1\circ R_2), (R_1^*)$, we can use the construction from the proof for closure under regular operations to construct a DFA $M$ that recognizes $L(R_1\cup R_2), L(R_1\circ R_2), L(R_1^*)$ respectively.
$\Rightarrow$: Proof by Construction. Convert a DFA to a regular expression. First to a **special form** of **generalized nondeterministic finite automaton (GNFA)** then to a regular expression.
##### GNFA
**GNFA**: NFA but can have regular expressions as the input to the transition function.
Use a **special form** of GNFA:
* Start state
* It has a transition going to any other state
* It has no transition coming in from any other state
* Accept state
* There is only a single accept state
* It is not the same as the start state
* It has a transition coming in from any other state
* It has no transition going to any other state
* Except for the start and accept states
* One transition goes from every state to every other state
* One transition goes from each state to itself
(可以把**非** special form 的 GNFA 轉換成 special form 的 GNFA)
**Definition of GNFA (Special Form)**: $(Q,\Sigma,\delta,q_{start},q_{accept})$ where:
* $Q,\Sigma$: same as NFA
* $\delta:$<span class="red">$(Q-\{q_{accept}\})\times(Q-\{q_{start}\})$</span>$\to\mathcal{R}$, where $\mathcal{R}$ is the set of regular expressions over $\Sigma$. (從狀態、symbol 對應到下一個狀態改為狀態對對應到 regular expression) No transition from $q_{accept}$ and no transition to $q_{start}$.
* $q_{start},q_{accept}\in Q$ are the start and accept states, respectively.
Computation 非常簡單,略過。
##### 證明過程本體
* Convert a DFA to a GNFA (Special Form)
* Add a new start state with an $\epsilon$ transition to the original start state.
* Add a new accept state with an $\epsilon$ transition from every original accept state.
* 對於每一對狀態(包含自環)的兩個方向,分別使用 Union 整合所有 transition
* 對於每一對狀態(包含自環)的兩個方向,對沒有 transition 的以 label $\varnothing$ 連接
* Convert a GNFA (Special Form) to a regular expression: Remove states until only the start and accept states remain.
* Select any state $q_{rip}\in Q$ different from $q_{start},q_{accept}$.
* Let G' be the GNFA $(Q',\Sigma, \delta',q_{start},q_{accept})$
* $Q' = Q - \{q_{rip}\}$
* $\delta'(q_1,q_2) = \delta(q_1,q_{rip})\delta(q_{rip},q_{rip})^*\delta(q_{rip},q_2)\cup\delta(q_1,q_2)$ for any $q_1\in Q'-\{q_{accept}\}$ and any $q_2\in Q'-\{q_{start}\}$.
* 實際操作時要記得自環也要按照同樣的方法處理。
### Nonregular Languages
* $B={0^n1^n|n\ge 0}$: Nonregular
* $C=\{w|w \text{ has an equal number of 0s and 1s}\}$: Nonregular
* $D=\{w|w \text{ has an equal number of occurrences of 01 and 10 as substrings}\}$: **Regular!**
因為其實只要看開頭與結尾即可,若一樣則 01 與 10 數量相同,若 0 開頭 1 結尾則 01 多了一個,反之亦然。重點是因為兩者數量頂多差 1,所以能以有限種狀態表達。
#### The Pumping Lemma for Regular Languages
$A$ is a regular language $\Rightarrow$ There exists a **pumping length** $p\ge 1$ such that for any string $w\in A$ with $|w|\ge p$, we can write $w=xyz$ such that:
* $xy^iz\in A\quad\forall i\ge 0$
* $|y|\ge 1$
* $|xy|\le p$
[**The converse is not true!!!**](https://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages#Invalidity_of_the_lemma_converse) It can only be used to prove a language is nonregular, but some nonregular languages may satisfy the property. To prove a language is regular, the typical way is to construct an DFA/NFA or a regular expression.
##### Proof by Construction
Let
* $M=(Q,\Sigma,\delta,q_0,F)$ be a DFA recognizing language $A$.
* $p=|Q|$ be the pumping length.
* $w=a_1a_2\ldots a_n$ be a string in $A$ of length $n\ge p$.
* $r_0, r_1, \ldots, r_n$ be the states visited by $M$ on input $w$. ($r_0=q_0$, $r_{i+1}=\delta(r_{i},a_{i+1}) \forall i\in [0,n-1]$, and $r_n\in F$)
取前 $p+1=|Q|+1$ 個,因為 $M$ 只有 $|Q|$ 種狀態,by the pigeonhole principle, there exists $i,j$ such that $0\le i<j\le p$ and $r_i=r_j$.
* $x=a_1\ldots a_i, y=a_{i+1}\ldots a_j, z=a_{j+1}\ldots a_n$.
圖示: $r_0\xrightarrow{x}r_i\xrightarrow{y}r_j\xrightarrow{z}r_n$
Since $r_i=r_j$, repeating $y$ any number of times will not change the state, so $L(xy^*z)\subseteq A$.
Also, $|y|=j-i\ge 1$ because $i<j$, and $|xy|\le p$ because $j\le p$.
##### Using Pumping Lemma to Prove Nonregularity
* **P (A is a regular language) $\Rightarrow$ Q (The Pumping Lemma Condition):**
If a language $A$ is regular, then **there exists** a pumping length $p \ge 1$ such that for **every string** $w \in A$ with $|w| \ge p$, we **can** write $w = xyz$ satisfying the three conditions:
* **Contrapositive Logic for Non-regularity ($\lnot Q \Rightarrow \lnot P$):**
The negation of the Pumping Lemma Condition ($\lnot Q$): **For every** $p \ge 1$, **there exists** a string $w \in A$ with $|w| \ge p$ such that $w$ **cannot** be written as $xyz$ satisfying the three conditions.
##### Example: Prove $B=\{0^n1^n|n\ge 0\}$ is nonregular
Assume $B$ is regular. Let $p$ be the pumping length.
Select $w=0^p1^p$. This $w \in B$ and $|w|=2p \ge p$.
By the Pumping Lemma, $w=xyz$ must satisfy the three conditions.
Since $|xy|\le p$, $x$ and $y$ can only contain '0's. So $y=0^k$ for some $k \ge 1$.
Consider $xy^2z$. This string will have $p+k$ '0's and $p$ '1's.
Since $k \ge 1$, the number of '0's ($p+k$) is not equal to the number of '1's ($p$).
Therefore, $xy^2z \notin B$. This contradicts the Pumping Lemma. Thus, $B$ is nonregular.
For $\{w|w \text{ has an equal number of 0s and 1s}\}$, we can use a similar argument.
##### Example: Prove $C=\{0^n1^m|n\ne m\land n,m\ge 0\}$ is nonregular
(Note: While $L(0^*1^*) - B = E$ can prove this using the closure property of regular languages under set difference, here's the direct Pumping Lemma proof.)
Assume $E$ is regular. Let $p \ge 1$ be the pumping length.
Select $w=0^p1^{p+p!}$. This $w \in E$ (since $p \ne p+p!$) and $|w|=2p+p! \ge p$.
By the Pumping Lemma, $w=xyz$ must satisfy the three conditions.
Since $|xy|\le p$, $x$ and $y$ can only contain '0's. Let $y=0^k$ for some $k \ge 1$.
Consider $xy^iz$. The number of '0's will be $p+(i-1)k$, while the number of '1's remains $p+p!$.
Choose $i = 1 + \frac{p!}{k}$. (Since $1 \le k \le p$, $k$ divides $p!$, so $i$ is an integer $\ge 1$.)
For this $i$, $xy^iz = 0^{p+p!}1^{p+p!}$.
In this string, the number of '0's equals the number of '1's.
Therefore, $xy^iz \notin E$ (as $n=m$). This contradicts the Pumping Lemma. Thus, $E$ is nonregular.
##### Example: Prove $D=\{1^{n^2}|n\ge 0\}$ is nonregular
Assume $D$ is regular. Let $p \ge 1$ be the pumping length.
Select $w=1^{p^2}$. This $w \in D$ and $|w|=p^2 \ge p$.
By the Pumping Lemma, $w=xyz$ must satisfy the three conditions.
Since $|y|\ge 1$, $|xy^2z|=|xyz|+|y|>p^2$.
Since $|xy|\le p\Rightarrow|y|\le p$, $|xy^2z|=|xyz|+|y|\le p^2+p<(p+1)^2$.
Since $|xy^2z|$ is not a perfect square, $xy^2z \notin D$.
This contradicts the Pumping Lemma. Thus, $D$ is nonregular.
### Takeaways
* NFA、DFA 是等價的
* 一個 language 是 regular,若且唯若一個 FA recognize 該 language
* 一個 language 是 regular,若且唯若一個 regular expression 描述該 language
* Regular languages 在一些 Operations 下是封閉的
* Pumping Lemma 可以用來證明某些 language 是 nonregular
## 2 Context-Free Languages
* Context-Free Grammar: 比 Regular Expression 更強大的描述語言的方式。
* Context-Free Language(CFL): 由 Context-Free Grammar 生成的語言,Regular Language 是 Context-Free Language 的子集。
* Pushdown Automaton(PDA): 一種有 stack 的 automaton,可以辨識 Context-Free Language。
### Context-Free Grammar (CFG)
主要由許多**規則**組成,把 symbol 分為 **variable** (包含 start variable) 與 **terminal**,規則的形式為 $V \to w$,其中 $V$ 是 variable,$w$ 是 terminal 或 variable 的串接。
衍生或生成 (derivation, generation) 一個字串的過程是:
* 寫下 start variable
* 對於每個 variable,找到左邊相同的規則
* 把變數替換為右側,直到整個字串只剩下 terminal 為止
所有某 grammar $G$ 能生成的 strings 的集合稱為 language of the grammar,一樣是 $L(G)$。
能被某 context-free grammar $G$ 生成的 language 稱為 context-free language。
若同個 variable 有多個規則,可以以 | 串接,比如 $A\to 0A1 | B$
#### Formal Definition of CFG
$(\mathcal{V},\Sigma, R, S)$ 是一個 context-free grammar,其中:
* $\mathcal{V}$ 是 **variable** 的有限集合。
* $\Sigma$ 是 **terminal** 的有限集合,$\mathcal{V} \cap \Sigma = \varnothing$。
* $R$ 是**規則**的有限集合,每條規則的形式為 $v \to w$,其中 $v \in \mathcal{V}$,$w \in (\mathcal{V} \cup \Sigma)^*$。
* $S \in \mathcal{V}$ 是 start variable,**通常就是第一條規則左側的變數**。
以下大寫代表 variable,小寫代表 terminal。
**yield** 指替換一步,比如 $uAv$ 以 $A\to w$ 替換為 $uwv$,$uAv$ yields $uwv$,記為 $uAv \Rightarrow uwv$。
**derivation** 指替換任意步,記為 $uAv \xRightarrow{*} uwv$,表示 $uAv$ 經過任意次替換後變成 $uwv$。
因此 language of a grammar 寫為 $\{w\in\Sigma^*|S \xRightarrow{*} w \land w\text{ contains only terminals}\}$。
#### Examples of CFG
* $\{0^n1^n|n\ge 0\}$: $S\to 0S1|\epsilon$
* 對 a 的加與乘運算:
* $\text{<EXPR>} \to \text{<EXPR>}+\text{<TERM>} | \text{<TERM>}$
* $\text{TERM} \to \text{<TERM>}\times\text{<FACTOR>} | \text{<FACTOR>}$
* $\text{FACTOR} \to ( \text{<EXPR>} )|a$
其中第二個例子解釋先乘後加的規則可以由生成規則定義,由生成結構判斷順序:比如 $a+a\times a$ 與 $(a+a)\times a$ 都只有一種生成方式,所以很明顯要哪個先。
#### Designing CFG
* 分開定義,比如 $S\to S_1|S_2$,$S_1,S_2$ 再各自生成 language。
* 對於 regular language,可以轉換 DFA 為 CFG:
* 對每個 DFA 的 state $q_i$ 建立一個 CFG 的 variable $A_i$
* 對每個 $\delta(q_i,a)=q_j$ 的 transition,加入規則 $A_i\to aA_j$
* 對每個 accept state $q_{\text{acc}}$ 加入規則 $A_{\text{acc}}\to \epsilon$
* $A_0$ 作為 start variable
* 遞迴結構:以 $A\to UAV$ 等方式定義
#### Parse Tree
每個 variable 與 terminal 作為節點,以 start variable 為根,每個 variable yield 的 variable 或 terminal 為其子節點,terminal 是 leaf,通常直接畫在最下面一排。
比如上方的「對 a 的加與乘運算」:
```text
<EXPR>
/ \
<EXPR> <TERM>
| / \
| <TERM> <FACTOR>
| | |
| | ( <EXPR> )
| | / \
| | <EXPR> <TERM>
| | | |
a + a x ( a + a )
```
#### Ambiguity of CFG
例子的第二個使加號由 $\text{<EXPR>}$ 生成,而乘號以 $\text{<TERM>}$ 生成。$a+a\times a$ 因為加法外無括號,必然由 start variable $\text{<EXPR>}$ 生成,而不是 $\text{<TERM>}$ 先生成乘號再生成加號。
如果沒有括號則同個 string 能以多種方式 derive,會覺得它是模糊的。
但注意若只是順序不同,不視為不同方式。一律由 **leftmost derivation** (每步都替換最左邊的 variable) 來避免替換順序造成的歧義。
如果同個 string 能以某 grammar 透過多種不同的 **leftmost derivation** 得到,則 the string is derived **ambiguously** in the grammar。
若對於某 grammar,存在一個 string which is derived ambiguously,則 the grammar is **ambiguous**。
有些 CFL 能由 CFG 生成,但必須由 ambiguous CFG 生成,比如 $\{0^i1^j2^k|i=j\lor j=k\}$,稱為 **inherently ambiguous**。
#### Chomsky Normal Form
特殊形式,加上限制後可更簡化,很有用處,~~雖然這學期的課程並沒有用到~~。只包含三類規則:
* $A\to BC$。
* $A\to a$。
* $S\to \epsilon$。$S$ 是 start variable
Start variable 不出現在任意規則的右側。
所有 CFG 都可以被轉換為 Chomsky Normal Form。以下 $W$ 是 variable 或 terminal 的字串。
* **加上新的 start variable $S'$,以及 $S'\to S$。**
能避免 start variable 在任意規則右側。
* **刪除所有 $\epsilon$ 規則 ($A\to\epsilon$) ,除了 $S\to\epsilon$**
對於其他 variable 替換為包含 $A$ 的規則 $B\to W$,對於 $W$ 中 $A$ 的集合的每個子集,加入刪除後的規則。
比如 $B\to uAvAw$ 變成 $B\to uvAw|uAvw|uvw$。
若 $B\to A$,會變成 $B\to A|\epsilon$,留到之後再處理 $B$。但若之前刪過 $B\to\epsilon$,則不必加入 $B\to\epsilon$,因為其下游都已考慮過。
* **刪除所有單一變數規則 ($A\to B$)**
對於所有 $B\to W$ 規則,加入 $A\to W$ 規則。
* **其他轉換**
* 處理右側長度 $k>2$ 的規則 $A\to W_1W_2\ldots W_k$,新增一些中繼變數 $A_1\ldots A_{k-2}$,將其轉換為
* $A\to W_1A_1$
* $A_1\to W_2A_2$
* $\vdots$
* $A_{k-2}\to W_{k-1}W_k$
* 處理 terminal 出現在一類規則的右側,對 $u$ 新增一個中繼變數 $A_u$,替換右側的 $u$,並新增規則 $A_u\to u$。
### Pushdown Automata (PDA)
FA 可視為 State Control 保有狀態,有一個 arrow 指向 input tape,並且只能往右移動。
PDA 作為 FA 的修改版,再加上一個 stack 作為額外的記憶體。
Deterministic PDA 與 Nondeterministic PDA 的能力不等價,但兩者都能 recognize 所有 Context-Free Language,所以以下 PDA 指的都是 Nondeterministic PDA,稱 Deterministic PDA 為 DPDA。
#### Formal Definition of PDA
> 註:設 stack 的頂端在左側。
PDA: $M=(Q, \Sigma, \Gamma, \delta, q_0, F)$ where
* $Q$ 是有限狀態集合。
* $\Sigma$ 是有限集合,作為 input alphabet。
* $\Gamma$ 是有限集合,作為 stack alphabet。
* $\delta: Q \times \Sigma_\epsilon \times \Gamma_\epsilon \to P(Q \times \Gamma_\epsilon)$。
* $q_0 \in Q$ 是初始狀態。
* $F \subseteq Q$ 是 accept states 的集合。
**Computation** 課本的定義稍微怪怪的,因此我自己寫。
Let:
* $M=(Q, \Sigma, \Gamma, \delta, q_0, F)$ be a PDA.
* $w = a_1 a_2 \ldots a_m$ be a string over $\Sigma$.
$M$ accepts $w$ $\iff$ there exists a sequence of **instantaneous descriptions (IDs)** $(q_0, w_0, t_0), (q_1, w_1, t_1), \ldots, (q_n, w_n, t_n)\in Q\times \Sigma^*\times \Gamma^*$ such that:
* **Initial ID:** $q_0$ is the start state, $w_0 = w$ is the full input string, and $t_0=\epsilon$ (Initial stack is empty)。
* **Transition:** For each $i \in [0, n-1]$, $(q_i, w_i, t_i) \Rightarrow (q_{i+1}, w_{i+1}, t_{i+1})$ must satisfy the following:
* Let $t_i = x t'$, where $x \in \Gamma_\epsilon$.
* Let $w_i = c w'$, where $c \in \Sigma_\epsilon$.
* There must exist a pair $\delta(q_i, c, x)\ni(q_{i+1}, y)$ such that $w_{i+1}=w'$ ($c$ is read) and $t_{i+1}=yt'$(stack: $x t'$ to $y t'$).
* **Final ID:**
* $w_n=\epsilon$ (**the entire input string has been consumed**)
* $q_n \in F$ (in an accepting state).
* **There is no requirement on the final stack $t_n$.**
在圖示時,會在邊上標記 $a,x\to y$,代表讀取 $a$,stack 頂端要是 $x$,移除 $x$ 並將 $y$ 放到 stack 頂端。
##### Shorthand for PDA
使 $\delta$ 能 push string,由右而左放入 stack,方便表示 transition。對於簡寫 $\delta(q,a,s)\ni (r,w),\ w=a_1a_2\ldots a_n$,新增 state $q_1\ldots q_{n-1}$,以及 transition:
* $\delta(q,a,s)\ni (q_1,a_n)$
* $\delta(q_1,\epsilon,$<span class="red">$\ \epsilon$</span>$)\ni (q_2,a_{n-1})$
* $\vdots$
* $\delta(q_{n-1},\epsilon,$<span class="red">$\ \epsilon$</span>$)\ni (r,a_1)$
或是 $q\xrightarrow{a,s\to r}r$ 變成:$q\xrightarrow{a,s\to a_n}q_1\xrightarrow{\epsilon,\epsilon\to a_{n-1}}q_2\xrightarrow{\epsilon,\epsilon\to a_{n-2}}\cdots\xrightarrow{\epsilon,\epsilon\to a_1}r$。
實際上維基上的定義及其他課本會使 transition 能直接 push 一個字串進 stack,能用 shorthand 所以是等價的。
另外,因為狀態是有限的,所以自然能 push 的 string 必須是有限的。
#### Equivalence of PDA and CFG
A language is context-free $\iff$ some PDA recognizes it.
$\Rightarrow$: Proof by Construction. Given a context-free grammar $G=(\mathcal{V},\Sigma, R, S)$, we can construct a PDA $M=(Q, \Sigma, \Gamma, \delta, q_0, F)$ that recognizes the language $L(G)$.
* $Q=\{q_{\text{start}},q_{\text{loop}},q_{\text{acc}}\}\cup E$,其中 $E$ 是用於 shorthand 新增的額外狀態。
* $F=\{q_{\text{acc}}\}$。
* $\delta(q_{\text{start}},\epsilon,\epsilon)\ni (q_{\text{loop}},S\$)$
一開始放入 $\$$,確保 accept 時 stack 實質上為空。
放入 $S$ 啟動 derivation。
* $\delta(q_{\text{loop}},\epsilon,A)=\{(q_{\text{loop}},w)|A\to w\text{ is a rule in }R\}$
依規則展開 $A$。
* $\delta(q_{\text{loop}},a,a)=\{(q_{\text{loop}},\epsilon)\}$
讀取 input,若與 stack 頂端的 terminal 相同,則繼續比對。
* $\delta(q_{\text{loop}},\epsilon,\$)=\{(q_{\text{acc}},\epsilon)\}$
Input 與 stack 都讀完,代表輸入與 $S$ derive 的其中一個字串相同。
$\Leftarrow$: Proof by Construction. Given a PDA $M=(Q, \Sigma, \Gamma, \delta, q_0, F)$, we can construct a context-free grammar $G=(\mathcal{V},\Sigma, R, S)$ that generates the language $L(M)$.
首先介紹 Special Form of PDA,以及經過一些看起來很蠢的轉換後把任何 PDA 轉換成這個形式:
* **在 Accept 時 stack 必須是空的**
* 新增 state $q_{\text{pop}}$
* 從每個原本的 accept state 新增 transition $\epsilon,\epsilon\to\epsilon$ 到 $q_{\text{pop}}$
* $\forall x \in \Gamma,\ \delta(q_{\text{pop}},\epsilon,x)\ni(q_{\text{pop}},\epsilon)$,使 $q_{\text{pop}}$ 能把 stack pop 到空
* **只有一個 accept state**
* 新增 start state 與 $S'\xrightarrow{\epsilon,\epsilon\to \$}S$
* 新增 accept state 與 $q_{\text{pop}}\xrightarrow{\epsilon,\$\to \epsilon}q_{\text{acc}}$,順便確保 stack 排空
* **每個 transition 必須 push 或 pop 一個字元**
* 把 $p\xrightarrow{a,b\to c}q$ 替換為 $p\xrightarrow{a,b\to \epsilon}s\xrightarrow{\epsilon,\epsilon\to c}q$,$s$ 是新增狀態
* 把 $p\xrightarrow{a,\epsilon\to \epsilon}q$ 替換為 $p\xrightarrow{a,\epsilon\to b}s\xrightarrow{\epsilon,b\to \epsilon}q$,$s$ 是新增狀態,$b$ 是任意的 stack symbol
對於 $M$ 的每對 states $p,q$,Let $A_{pq}\in \mathcal{V}$,想要使 $A_{pq}$ derives $x \iff$ x 能使機器從 ($p$, empty stack) 走到 ($q$, empty stack)。
這些字串 $\forall w\in \Gamma^*$ 也能使機器從 ($p$, $w$) 走到 ($q$, $w$)。
Construction for $A_{pq}$:
任意 $A_{pp}$ 不經任何轉移可產生 $\epsilon$,即 $A_{pp}\to \epsilon$。
任意 $A_{pq}$ 產生的非空字串 $x$ 在 special form of PDA 上第一步必然是 push,且最後一步是 pop,以維持空 stack,可以分成 push 與 pop 的字元是否相同的兩種情況:
* **兩者相同:假設 $p\xrightarrow{a}r\cdots s\xrightarrow{b}q$,則新增規則 $A_{pq}\to aA_{rs}b$**
$\forall p,q,r,s\in Q,\ u\in \Gamma,\text{ and }a,b\in \Sigma_\epsilon$, if $\delta(p,a,\epsilon)\ni (r,u) \land \delta(s,b,u)\ni (q,\epsilon)$, put $A_{pq}\to aA_{rs}b$ in $G$.
如果 $A_{rs}$ 是 $\varnothing$ 也沒差,反正是 Nondeterministic。
* **兩者不同:在路徑上必然在經過其中一個 state $r$ 時 stack 是空的,新增規則 $A_{pq}\to A_{pr}A_{rq}$**
$\forall p,q,r\in Q$, put $A_{pq}\to A_{pr}A_{rq}$ in $G$.
一樣,如果 $A_{pr}$ 或 $A_{rq}$ 是 $\varnothing$ 也沒差。
兩個方向的證明都是很 trivial 的 induction。
**Regular languages 是 context-free languages 的子集**,因為任何 FA 都自動是 PDA,只要完全不使用 stack 即可。
### Non-Context-Free Languages
Some Non-Context-Free Languages: $\{a^n b^n c^n | n \geq 0\}, \{a^ib^jc^k|0\le i\le j\le k\}, \{ww|w\in \{0,1\}^*\}$.
#### The Pumping Lemma for Context-Free Languages (CFL)
> 括號內是 Pumping Lemma for RL
$A$ is a context-free language $\Rightarrow$ There exists a **pumping length** $p\ge 1$ such that every string $s\in A$ with $|s|\ge p$ can be written as $s=wvxyz$ ($=xyz$), satisfying:
* $\forall i\ge 0,\ wv^ixy^iz\in A$ ($xy^iz\in A$)
* $|vy|\ge 1$ ($|y|\ge 1$)
* $|vxy|\le p$ ($|xy|\le p$)
證明:
設 $G$ 為一個 CFG,其生成語言為 CFL $A$。令 $b$ 為 $G$ 中所有規則右邊部分的 symbol 數的最大值。若 $b=1$,則 $A$ 為 alphabet 的子集,對於生成的 $s$,令 $v=s$ 且 $w=x=y=z=\epsilon$ 就滿足條件。以下假設 $b \ge 2$。
在 parse tree 中,是已經 Nondeterministically 從眾多替換選項中選出一個,因此所有節點都至多有 $b$ 個子節點。對於一個高度 $h$ (深度定義改為由 start variable 走幾步,相當於邊的數量、一般定義的高度減一),至多有 $b^h$ 個葉節點,也就是最終的 terminal 數量、字串長度。換句話說,至少 $b^h+1$ 長的字串必須來自高度至少 $h+1$ 的 parse tree。
設 $p=b^{|\mathcal{V}|+1}$,對於某字串 $s\in A, |s|\ge p$,其任意 parse tree 的高度至少 $|\mathcal{V}|+1$,取其中**節點數最少**的。
在此 parse tree 中,至少有個葉節點,其從根節點到葉節點的路徑長至少為 $|\mathcal{V}|+1$。取此路徑上最底下的 $|\mathcal{V}|+2$ 個節點,也就是 $|\mathcal{V}|+1$ 個 variable 與最後的 terminal。在 variables 中,根據鴿籠原理,必然有某個 variable $A$ 重複出現,稱上方的為 $A_u$,下方的為 $A_d$。
使 $A_d$ 的子樹葉節點為 $x$,$A_u$ 的子樹葉節點為 $vxy$,$s$ 的剩餘部分給 $w,z$。以下分別檢測三個條件:
* $\forall i\ge 0,\ wv^ixy^iz\in A$
* 對於 $i>1$,把 $A_d$ 的子樹遞迴地替換為 $A_u$ 的子樹。
* 對於 $i=0$,把 $A_u$ 的子樹替換為 $A_d$ 的子樹,變成 $wxz$。
* $|vy|\ge 1$
* 此條件僅在 $v=y=\epsilon$ 時不成立,但此時 $A_u$ 子樹葉節點與 $A_d$ 子樹葉節點所代表的字串相同。若以 $A_d$ 的子樹替換 $A_u$ 的子樹,則總節點數變少,與節點數最少的假設矛盾。
* $|vxy|\le p$
* 由於代表 $vxy$ 的 $A_u$ 的子樹高度至多為 $|\mathcal{V}|+1$,因此至多有 $b^{|\mathcal{V}|+1}=p$ 個葉節點。
#### Example: Prove $A=\{0^n1^n2^n|n\ge 0\}$ is non-CFL
For every $p\ge 1$, select $s=0^p1^p2^p$.
* 若 $v, y$ 都只有一種 symbol,則必然 $wv^2xy^2z$ 的三種 symbol 數量不同。
* 若 $v, y$ 其一包含兩種 symbol,則 $wv^2xy^2z$ 的順序不對。
* 若 $v, y$ 其一包含三種 symbol,則除了順序不對以外,其長度至少 $p+2$,違反 $|vxy|\le p$。
## 3 The Church-Turing Thesis
判斷能否以 algorithm 解決的問題,等價於能否被 Turing Machine 解決的問題。Turing Machine 更接近現代電腦,有 memory 而不是 stack。
### Turing Machine
Memory 是一個無限長的 tape,指針可以左右移動、讀寫。與 PDA 一樣可能停不下來。特別的是,就算在 input 的中間,只要進到 accept 或 reject state 就會停下來。
#### Formal Definition of Turing Machine
A Turing Machine is a 7-tuple $(Q, \Sigma, \Gamma, \delta, q_0, q_{\text{accept}}, q_{\text{reject}})$ where:
* $Q$ is a finite set of states. (The same)
* $\Sigma$ is the input alphabet **not containing the blank symbol $\sqcup$**.
* $\Gamma$ is the tape alphabet, where $\sqcup \in \Gamma$ and $\Sigma \subseteq \Gamma$.
* $\delta: Q \times\,$<span class="red">$\Gamma$</span>$\ \to Q \times\,$<span class="red">$\Gamma \times \{L,R\}$</span> is the transition function.
* $q_0 \in Q$ is the start state.
* $q_{\text{accept}} \in Q$ is the accept state.
* $q_{\text{reject}} \in Q$ is the reject state, where $q_{\text{reject}} \ne q_{\text{accept}}$.
#### Computation of Turing Machine
* 首先把輸入放在 tape 上的左側,其他位置為 blank symbol $\sqcup$。這也是輸入不能有 blank symbol 的原因,否則不知道是內容還是字串結束。
* 指針指向 tape 的最左側。
* 根據 $\delta$ 進行 transition。唯一的例外是在最左端時,若要往左移動則停在原地。
* 直到進入 accept 或 reject state 為止,稱為 halt。也有可能停不下來。
#### Formal Computation of Turing Machine
##### Configuration
使用 configuration 作為狀態,包含:current state($q\in Q$), tape contents, head position。$u\in \Gamma^*$ 是目前指針前的字串,$v\in \Gamma^*$ 是指針後的字串(含目前指向的字元),但只包含到最後一個非 blank symbol 為止。寫成:$u\ q\ v$。
##### Yielding
Suppose
* $a,b,c \in \Gamma$
* $u,v \in \Gamma^*$ (Can be empty)
* $q_i, q_j$ are states
it writes and moves:
* $\delta(q_i,b) = (q_j,c,L)$: $ua\ q_i\ bv \to ua\ q_i\ cv \to u\ q_j\ acv$
* $\delta(q_i,b) = (q_j,c,R)$: $ua\ q_i\ bv \to ua\ q_i\ cv \to uac\ q_j\ v$
And special cases:
* left end: $\delta(q_i,b) = (q_j,c,L)$: $q_i\ bv \to q_i\ cv \to q_j\ c v$
* $ua\ q_i$ is equivalent to $ua\ q_i\ \sqcup$, always has blank symbols to read or move to on the right.
##### More about Configuration
* Initial configuration on input $w$: $q_0\ w$
* In accepting and rejecting configurations, $q=q_{\text{accept}}$ and $q=q_{\text{reject}}$ respectively. They are halting configurations and do not yield to any other configuration.
##### Computation
$\delta$ is defined equivalently as $Q' \times \Gamma \to Q \times \Gamma \times \{L,R\}$ where $Q' = Q - \{q_{\text{accept}}, q_{\text{reject}}\}$. We can simply let $q_{\text{accept}}$ and $q_{\text{reject}}$ only have transitions to the same state, so they are equivalent.
A turing machine $M$ accepts input $w$ if there exists a sequence of configurations $C_1, C_2, \ldots, C_k$ such that:
* $C_1$ is the initial configuration on input $w$.
* For each $i \in [1, k-1]$, $C_i \to C_{i+1}$.
* $C_k$ is an accepting configuration.
#### The language of a Turing Machine
##### Turing-recognizable
Definition of recognizing is the same, but three outcomes are possible: accept, reject, or **loop**(doesn't halt).
A language is **Turing-recognizable** (or recursively enumerable) if some Turing machine recognizes it.
##### Turing-decidable
It's hard to say if a machine is looping or just taking a long time, so we prefer machines that halt on all inputs. Turing machines are **deciders** if they halt on all inputs.
A Turing machine $M$ **decides** a language $A$ if
* **$M$ is a decider**.
* $M$ recognizes $A$.
In other words, for any input string $w$:
* If $w \in A$, then $M$ accepts $w$.
* If $w \notin A$, then $M$ rejects $w$.
A language is **Turing-decidable** (or recursive, decidable) if some Turing machine decides it.
By definition, a language is Turing-decidable $\Rightarrow$ it's Turing-recognizable, so the set of Turing-decidable languages is a **proper subset** of the set of Turing-recognizable languages.
We mainly focus on **Turing-decidable languages**.
#### Examples of Turing Machines
##### Example: A Turing Machine that decides $\{0^{2^n}|n\ge 0\}$
```text
if number of 0s is 0:
reject
while 1:
if number of 0s is 1:
accept
elif number of 0s is odd:
reject
else:
half the number of 0s
```
Actually we need to count the number of 0s while halving:
1. Sweep left to right trying to **cross off** every second 0 (counting 0s at the same time)
2. **Accept** if there is only one 0
3. **Reject** if there is an odd number of 0s (of course not 1)
4. Return to the left end and repeat step 1
Actual design:
$\Sigma=\{0\}$, $\Gamma=\{0,\mathsf{X},\sqcup\}$
States and transitions:
> $q\xrightarrow{a\to \{R,L\}}q'$ means $q\xrightarrow{a\to a,\{R,L\}}q'$, not changing the tape content.
* $q_1$: Start state
* $\xrightarrow{0\to \sqcup,R}q_2$: 左端的 $\sqcup$ 標示第一個 0,作為最左端的標記,它本質還是 0。
* $\xrightarrow{\mathsf{X}\to R}q_{\text{reject}}$: not in $\Sigma$.
* $\xrightarrow{\sqcup\to R}q_{\text{reject}}$: input 沒有 0,reject。
* $q_2$: **After** reading the first 0 and several x's. (odd)
* $\xrightarrow{0\to \mathsf{X},R}q_3$: crossing off every second 0.
* $\xrightarrow{\mathsf{X}\to R}q_2$: skip x's.
* $\xrightarrow{\sqcup\to R}q_{\text{accept}}$: reach the right end with only one 0 (The $\sqcup$ at the left end), accept.
* $q_3$: **After** reading even number of 0s and several x's. (even)
* $\xrightarrow{0\to R}q_4$: skip the next 0.
* $\xrightarrow{\mathsf{X}\to R}q_3$: skip x's.
* $\xrightarrow{\sqcup\to L}q_5$: reach the right end with even number of 0s, go back to left end.
* $q_4$: **After** reading odd number ($\ge 3$) of 0s and several x's. (odd)
* $\xrightarrow{0\to \mathsf{X},R}q_3$: crossing off every second 0.
* $\xrightarrow{\mathsf{X}\to R}q_4$: skip x's.
* $\xrightarrow{\sqcup\to R}q_{\text{reject}}$: reach the right end with odd number of 0s, reject.
* $q_5$: Moving back to the left end, and move right to enter $q_2$.
* $\xrightarrow{0\to L}q_5$: move left through 0s.
* $\xrightarrow{\mathsf{X}\to L}q_5$: move left through x's.
* $\xrightarrow{\sqcup\to R}q_2$: reach the left end, move right to enter $q_2$.
> Turing Machines here are deterministic as DFA, $\delta$ is a **total function** (each input in domain has its output).
##### Example: A Turing Machine that decides $\{w\#w|w \in \{0,1\}^*\}$
> $\{w\#w^R|w \in \{0,1\}^*\}$ or $\{ww^R|w \in \{0,1\}^*\}$ are simply recognized by a PDA, but not this one.
```text
while 1:
find the first unmarked symbol before #
if no unmarked symbol before #:
check if there is any unmarked symbol after #
if yes:
reject
else:
accept
mark the symbol
find the corresponding unmarked symbol after #
if not found:
reject
mark the symbol
```
States and transitions (red means rejecting cases):
* $q_1$: Start state: pointing to the first unmarked symbol before #.
* $\xrightarrow{0\to \mathsf{X},R}q_{01}$: mark the first unmarked 0 before #.
* $\xrightarrow{1\to \mathsf{X},R}q_{11}$: mark the first unmarked 1 before #.
* $\xrightarrow{\#\to R}q_4$: no unmarked symbol before #, check after #.
* <span class="red">$\mathsf{X}, \sqcup$: impossible</span>
* $q_{01}$: After marking 0 before #, find corresponding unmarked 0 after #.
* $\xrightarrow{0,1\to R}q_{01}$: skip unmarked
* $\xrightarrow{\#\to R}q_{02}$: reach #
* <span class="red">$\mathsf{X}, \sqcup$: impossible</span>
* $q_{02}$: After reaching # from $q_{01}$, find unmarked 0 after #.
* $\xrightarrow{0\to \mathsf{X},L}q_2$: found unmarked 0, mark it and go back.
* $\xrightarrow{\mathsf{X}\to R}q_{02}$: skip marked
* <span class="red">$1$: not match</span>
* <span class="red">$\sqcup$: the second part is shorter</span>
* <span class="red">$\#$: impossible</span>
* $q_{11}, q_{12}$: Similar to $q_{01}, q_{02}$ but for 1.
* $q_2$: Moving back to the left end to enter $q_1$.
* $\xrightarrow{0,1,\mathsf{X}\to L}q_2$: move left
* $\xrightarrow{\#\to L}q_2$: move left
* <span class="red">$\sqcup$: impossible</span>
* $q_3$: Continue to move left to the left end.
* $\xrightarrow{0,1\to L}q_3$: move left
* $\xrightarrow{\mathsf{X}\to R}q_1$: return to $q_1$
* <span class="red">$\#, \sqcup$: impossible</span>
* $q_4$: After reaching # from $q_1$, check if there is any unmarked symbol after #.
* $\xrightarrow{\mathsf{X}\to R}q_4$: skip marked
* $\xrightarrow{\sqcup\to R}q_{\text{accept}}$: no unmarked symbol after #, accept.
* <span class="red">0, 1: the second part is longer</span>
* <span class="red">$\#$: impossible</span>
##### Example: A Turing Machine that decides $\{a^ib^jc^k|i\times j=k\land i,j,k\ge 1\}$
```text
scan the input to see if it is in the form a^+b^+c^+
while 1:
cross off the first a
if no a to cross off:
check if there is any unmarked c
if yes:
reject (too many c's)
else:
accept
else:
while any b remains:
cross off the first b
cross off the first c
if no c to cross off:
reject (too few c's)
fill back all b's
return to the left end
```
##### Example: A Turing Machine that decides $\{\#x_1\#x_2\ldots\#x_n|\text{ each } x_i\in\{0,1\}^*\land x_i\ne x_j \text{ for each }i\ne j\}$
```text
for i in range(1,n+1):
if i+1 > n:
accept (no more x_j to compare)
for j in range(i+1,n):
if x_i == x_j:
reject
```
Idea: What's important is how to remember which $x_i$ and $x_j$ are being compared. We use $\dot\#$ to mark $x_i$ and $x_j$ being compared, then compare them bit by bit.
At the end, $i=n$ and we cannot find any $\#$ after the $\dot\#$, so we accept.
### Variants of Turing Machine
Similar to DFA and NFA, some variants of Turing Machine are equivalent in power.
> Invariance to certain changes in the definition is called **robustness**.
A simple variant is allowing to stay in the same cell. Since we can move right then left (on any symbol and not changing tape content), it's equivalent.
#### Multitape Turing Machine
k tapes. Each tape has its own head and can read/write/move independently.
$\delta: Q \times \Gamma^k \to Q \times \Gamma^k \times \{L,R\}^k$.
The input is placed on the first tape, others are blank.
It's equivalent to single-tape Turing Machine. A single-tape TM is automatically a multitape TM that doesn't use extra tapes. It remains to prove the other direction.
##### Proof for converting a multitape TM $M$ to a single-tape TM $S$
* $S$ has more states to simulate $M$.
* $S$ uses a special symbol $\#$ to separate the contents of different tapes.
* $S$ uses special symbols (dot on each symbol) to mark the head position on each tape.
* $S$ setups $\#\dot{w_1}w_2\ldots w_k\#\dot\sqcup\#\dot\sqcup\#\ldots\#$ on its tape.
* $S$ simulates each transition of $M$ by:
* First scan: Scanning its tape to read the symbols under each head (This requires many extra states).
* Second scan: Scanning its tape again to perform the write and move operations (This also requires many extra states).
* In addition, if $S$ needs to write a symbol at the end of a tape, it shifts the contents to the right to make space. (This also requires many extra states.)
#### Nondeterministic Turing Machine
Similar to NFA.
$\delta: Q \times \Gamma \to P(Q \times \Gamma \times \{L,R\})$
##### Proof for simulating a NTM $N$ by a Multitape DTM $D$
Tapes of $D$:
* Tape 1 contains the input and is read-only.
* Tape 2 is used to simulate the work tape of $N$.
* Tape 3 is used to iterate through all possible computation paths of $N$.
We cannot use DFS, so use BFS to simulate $N$'s computation.
Suppose $b$ is the size of the largest set in the range of $\delta$ of $N$. Tape 3 contains a sequence of numbers in $\{1,2,\ldots,b\}$, representing the choices made at each step of $N$'s computation. Tape 3's contents iterate through shortlex order of $\{1,2,\ldots,b\}^*$ for each round of simulation (e.g. $(\epsilon,1,2,11,12,\ldots)$ for $b=2$).
* Init
* Set input on Tape 1.
* Set Tape 2,3 to be empty.
* Each branch
* Copy Tape 1 to Tape 2.
* Use Tape 2 to simulate $N$'s computation.
* Use Tape 3 to determine which transition to take at each step.
* Try next branch if:
1. No more symbol on Tape 3. We arrive the desired depth, but not found an accept state.
2. The choice on Tape 3 is invalid ($\delta$ returns empty set). We cannot proceed further.
3. Reach an reject state.
* Accept if reach an accept state.
* Increment Tape 3 to try the next branch.
##### Turing-Recongizable and Decidable
A language is Turing-recognizable $\iff$ some NTM recognizes it. (Shown above)
A language is Turing-decidable $\iff$ some NTM decides it.
A NTM is a decider $\iff$ all its computation branches halt on all inputs. So we need to modify the above simulation slightly so that if $N$ is a decider, $D$ also halts on all inputs.
#### Enumerators
等價於 2-tape TM,其中一個作為 output tape。
從空 input 開始 compute,output tape 上以特殊符號 $\#$ 隔開的字串集合就是這個 enumerator 所 enumerates 的語言,有可能不 halt,所以有無限的 language。
Some enumerator enumerates language $A$ $\iff$ some Turing machine recognizes $A$.
Proof idea:
* $\Rightarrow$: Given an enumerator $E$ for language $A$, construct a TM $M$ by simulating $E$ and compare the input with each output string. Accept if a match is found.
* $\Leftarrow$: Given a TM $M$ that recognizes language $A$, construct an enumerator $E$ by generating all strings in $\Sigma^*$ in some order and simulating $M$ on each string. Output the string if $M$ accepts it.
#### Other Models of Computation
Turing Machine 最重要的特性是 unrestricted access to unlimited memory,與 FA、PDA 不同。
另外也有很多 Computaional models,但幾乎都是等價於 Turing Machine。
### The Definition of Algorithm
#### Hilbert's Problems
1900 年提出,其中第 10 題是:
> Given a Diophantine equation with any number of unknown quantities and with rational integral numerical coefficients: To devise a process according to which it can be determined in a finite number of operations whether the equation is solvable in rational integers.
rational integral: 整數, Diophantine equation: 整係數多項式方程式,finite number of operations: algorithm。
給定一個多變數整係數多項式方程式,是否存在一個演算法能在有限步驟內判斷它是否有整數解?
把多項式 somehow encode 成 string,所有多項式可以視為 language:
* $D=\{\langle p\rangle|p\text{ is a ... polynomial with an integral root}\}$
* Turing-recognizable but not decidable.
* $D_1=\{\langle p\rangle|p\text{ is a ... polynomial over x with an integral root}\}$
* Turing-decidable.
Turing Machines:
* $M_1$ that recognizes $D_1$ but is not a decider:
* Try $0,\pm 1,\pm 2,\ldots$ one by one, accept if find an integral root.
* Doesn't halt if no integral root.
* $M$ that recognizes $D$ but is not a decider:
* Similarly, try all possible integer assignments to all variables one by one, accept if find an integral root.
* Doesn't halt if no integral root.
* $M_1'$ that decides $D_1$:
* 若 $|x|\ge k\cdot c_{\text{max}}/c_1$ (k: number of terms, $c_{\text{max}}$: max absolute coefficient, $c_1$: leading coefficient),則 $\left|c_1 x^k\right|>\left|\sum_{i=0}^{k-1} c_{\text{max}}x^i\right|$,所以不可能是根。
* 嘗試範圍內的整數即可。
* 但多變數時沒有這個性質。
#### Terminology for Describing Turing Machines
實際上不太可能把詳細的指針如何移動全部以 Formal definitiion 描述出來,因此會用 pseudocode 或是更低階一點把大致的流程寫出來。
## 4 Decidability
### 4.1 Decidable Languages
#### Decidable Problems Concerning Regular Languages
**$A_{\text{DFA}}$ (Acceptance problem for DFAs)**:
$$A_{\text{DFA}} = \{\langle B, w \rangle | B \text{ is a DFA that accepts string } w\}$$
* 每個 string 是 DFA $B$ 與 string $w$ 的 encoding。
* 若 input 不符合 encoding 格式,TM reject。
* 若符合格式,檢查 DFA $B$ 是否 accept $w$。
**Theorem 4.1**: $A_{\text{DFA}}$ is decidable.
**Proof**: 設計 TM $M$ 來 decide $A_{\text{DFA}}$:
* $M$ 的 input: $\langle B, w \rangle$ (encoding of DFA $B$ and string $w$)
* $M$ **simulate** $B$ on input $w$
* 若 simulation 在 accept state 結束 → $M$ accept $\langle B, w \rangle$
* 若 simulation 在 non-accept state 結束 → $M$ reject $\langle B, w \rangle$
注意區分:
* $M$ 的 input string: $\langle B, w \rangle$
* $B$ 的 input string: $w$
---
**$A_{\text{NFA}}$ (Acceptance problem for NFAs)**:
$$A_{\text{NFA}} = \{\langle B, w \rangle | B \text{ is an NFA that accepts string } w\}$$
**Theorem 4.2**: $A_{\text{NFA}}$ is decidable.
**Proof**: 設計 TM $N$ 來 decide $A_{\text{NFA}}$:
1. Convert NFA $B$ to equivalent DFA $C$
2. Run TM $M$ (from Theorem 4.1) on input $\langle C, w \rangle$
3. 若 $M$ accepts → $N$ accepts;若 $M$ rejects → $N$ rejects
可以把 $M$ 和 $N$ 想成 functions:
* $N$ 的 input: $(B, w)$
* $N$ 執行: $C \leftarrow \text{Convert}(B)$,然後 call $M(C, w)$
* $N$ return $M$ 的結果
---
**$A_{\text{REX}}$ (Acceptance problem for regular expressions)**:
$$A_{\text{REX}} = \{\langle R, w \rangle | R \text{ is a regex that generates string } w\}$$
**Theorem 4.3**: $A_{\text{REX}}$ is decidable.
**Proof**: 設計 TM $P$ 來 decide $A_{\text{REX}}$:
1. Convert regex $R$ to equivalent NFA $A$
2. Run TM $N$ on input $\langle A, w \rangle$
3. Return $N$ 的結果
---
**$E_{\text{DFA}}$ (Emptiness testing for DFAs)**:
$$E_{\text{DFA}} = \{\langle A \rangle | A \text{ is a DFA and } L(A) = \varnothing\}$$
**Theorem 4.4**: $E_{\text{DFA}}$ is decidable.
**Proof**: 設計 TM $T$ 來 decide $E_{\text{DFA}}$:
1. Mark start state of $A$
2. Repeat: mark any state reachable from marked states (類似 marking algorithm)
3. 直到 no new states get marked
4. 若 no accept state is marked → $T$ accepts (language is empty)
5. 若 any accept state is marked → $T$ rejects (language is not empty)
---
**$EQ_{\text{DFA}}$ (Equivalence testing for DFAs)**:
$$EQ_{\text{DFA}} = \{\langle A, B \rangle | A, B \text{ are DFAs and } L(A) = L(B)\}$$
**Theorem 4.5**: $EQ_{\text{DFA}}$ is decidable.
**Proof**: 設計 TM $F$ 來 decide $EQ_{\text{DFA}}$:
1. Construct DFA $C$ such that $L(C) = (L(A) \cap \overline{L(B)}) \cup (\overline{L(A)} \cap L(B))$
* $C$ accepts strings accepted by **either** $A$ or $B$ but **not both**
* 比照 regular languages 的 closure properties (union, intersection, complement) 的證明,使用 cross-product construction,accept state 定義為:exactly one of $A$, $B$ is in accept state
2. Run TM $T$ on input $\langle C \rangle$
3. 若 $T$ accepts (即 $L(C) = \varnothing$) → $F$ accepts (即 $L(A) = L(B)$)
4. 若 $T$ rejects → $F$ rejects
**關鍵**: $L(C) = \varnothing \Leftrightarrow L(A) = L(B)$
#### Decidable Problems Concerning Context-Free Languages
**$A_{\text{CFG}}$ (Acceptance problem for CFGs)**:
$$A_{\text{CFG}} = \{\langle G, w \rangle | G \text{ is a CFG that generates string } w\}$$
**Theorem 4.7**: $A_{\text{CFG}}$ is decidable.
**Proof**: 設計 TM $S$ 來 decide $A_{\text{CFG}}$:
1. Convert $G$ to equivalent grammar in **Chomsky Normal Form** (CNF)
2. 若 $|w| = n$,則任何 derivation of $w$ 恰有 $2n - 1$ steps
* 若 $n = 0$ (empty string),則檢查 1 step derivation
3. List all derivations with $2n - 1$ steps
4. 若任何 derivation generates $w$ → $S$ accepts
5. 若沒有 → $S$ rejects
**關鍵**: CNF 保證 derivation steps 有界,所以可以窮舉。若是一般的形式,TM 可能不會 halt。
---
**$E_{\text{CFG}}$ (Emptiness testing for CFGs)**:
$$E_{\text{CFG}} = \{\langle G \rangle | G \text{ is a CFG and } L(G) = \varnothing\}$$
**Theorem 4.8**: $E_{\text{CFG}}$ is decidable.
**Proof**: 設計 TM $R$ 來 decide $E_{\text{CFG}}$:
1. Mark all terminal symbols in $G$
2. Repeat: mark variable $A$ if $A \to u$ where all symbols in $u$ are already marked, and mark variable $A$ if $A \to BC$ where $B,C$ are already marked
3. 直到 no new variables get marked
4. 若 start variable is marked → $R$ rejects (can generate some string)
5. 若 start variable is not marked → $R$ accepts (cannot generate any string)
**關鍵**: 從結果開始倒推是否能產生全部都是 terminal symbols 的字串。
如果使用上一個 TM $S$,則有無限的字串需要檢查,無法 halt。
---
**$EQ_{\text{CFG}}$ (Equivalence testing for CFGs)**:
$$EQ_{\text{CFG}} = \{\langle G, H \rangle | G, H \text{ are CFGs and } L(G) = L(H)\}$$
**問題**: $EQ_{\text{CFG}}$ is **NOT decidable** (will prove in Section 5.2)!
與 regular languages 不同,CFLs **not closed under complement or intersection**,無法用類似 $EQ_{\text{DFA}}$ 的方法。
---
**Theorem 4.9**: Every context-free language is decidable.
**Proof**: Let $G$ be a CFG for CFL $A$. 設計 TM $M_G$ 來 decide $A$:
* $M_G$ 的 input: string $w$
* $M_G$ call TM $S$ (from Theorem 4.7) on input $\langle G, w \rangle$
* 若 $S$ accepts → $M_G$ accepts
* 若 $S$ rejects → $M_G$ rejects
**關鍵**: 利用 $A_{\text{CFG}}$ decidable 的結果。
**不可行的方法**: Convert PDA to TM
* PDA 是 nondeterministic,而且某些 branches 可能 loop forever
* 轉換後的 TM 也可能有 nondeterministic branches loop forever
* 因此轉換後的 TM 不是 decider
### 4.2 Undecidability
#### The Undecidable Language $A_{\text{TM}}$
**$A_{\text{TM}}$ (Acceptance problem for TMs)**:
$$A_{\text{TM}} = \{\langle M, w \rangle | M \text{ is a TM and } M \text{ accepts } w\}$$
**性質**:
* $A_{\text{TM}}$ is **Turing-recognizable** but **undecidable**
**為何 Turing-recognizable?**
設計 TM $U$ (Universal TM):
* Input: $\langle M, w \rangle$
* Simulate $M$ on input $w$
* 若 $M$ accepts → $U$ accepts
* 若 $M$ rejects → $U$ rejects
* 若 $M$ loops on $w$ → $U$ also loops (無法 decide)
**為何 undecidable?**
無法判斷 $M$ 是否會在 $w$ 上 halt 還是只是跑很久,這就是著名的 **Halting Problem**。
#### Diagonalization Method
在證明 $A_{\text{TM}}$ undecidable 之前,先介紹 diagonalization method。
**定義 Review (Functions)**:
* **One-to-one (Injective)**: $f: A \to B$ 若 $f(a_1) \neq f(a_2)$ whenever $a_1 \neq a_2$
* 每個 $b \in B$ 最多被 map 一次
* **Onto (Surjective)**: $f: A \to B$ 若 $\forall b \in B, \exists a \in A$ s.t. $f(a) = b$
* 每個 $b \in B$ 至少被 map 一次
* **Correspondence (Bijective)**: Both one-to-one and onto
* 每個 $b \in B$ 恰好被 map 一次
**定義 (Same Size)**: $A$ and $B$ have the **same size** if $\exists$ correspondence $f: A \to B$
**範例**: $\mathbb{N}$ (natural numbers) 和 $E$ (even natural numbers) have the same size
* $\mathbb{N} = \{1, 2, 3, 4, 5, \ldots\}$
* $E = \{2, 4, 6, 8, 10, \ldots\}$
* Correspondence: $f(n) = 2n$
* 雖然 $E \subset \mathbb{N}$,但根據定義它們 same size
**定義 (Countable)**:
* A set $A$ is **countable** if:
* $A$ is finite, OR
* $A$ has the same size as $\mathbb{N}$
* An infinite set is **uncountable** if 不存在 correspondence with $\mathbb{N}$
---
**範例**: $\mathbb{Q}^+$ (positive rational numbers) is countable
* 用 zigzag pattern 列舉:$\frac{1}{1}, \frac{2}{1}, \frac{1}{2}, \frac{3}{1}, \frac{1}{3}, \frac{2}{2}, \ldots$
* 可以跳過重複的(如 $\frac{2}{2} = \frac{1}{1}$)
* 可以建立 correspondence with $\mathbb{N}$
---
**Theorem**: $\mathbb{R}$ (real numbers) is uncountable
**Proof** (Diagonalization):
Assume $\exists$ correspondence $f: \mathbb{N} \to \mathbb{R}$,列出:
* $f(1) = 1.\textbf{1}4514\ldots$
* $f(2) = 1.9\textbf{1}9810\ldots$
* $f(3) = 0.12\textbf{3}45\ldots$
* $\vdots$
構造 $x \in \mathbb{R}$ 使得 $x \neq f(n)$ for all $n$:
* $x$ 的第 $i$ 位 digit $\neq f(i)$ 的第 $i$ 位 digit
* 避免選 0 或 9(防止 0.999... = 1.000... 的情況)
* 則 $x \in \mathbb{R}$ 但 $x$ 沒有被任何 $f(n)$ map 到,矛盾!
---
**Theorem**: Set of all TMs is **countable**
* 每個 TM 都可以 encode 成 string
* 這個 string $\in\Sigma^*$,而 $\Sigma^*$ is countable (可用 shortlex order 列舉)
* 因此 set of all TMs is countable
也可以得出 Turing-recognizable languages is countable。其他 Regular languages、CFLs 也是 countable。
---
**Theorem**: Set of all languages is **uncountable**
**Proof**: Set of all languages has the same size as set of all **infinite binary sequences**
* 給定 $\Sigma^* = \{s_0, s_1, s_2, \ldots\}$ (shortlex order).
* Language $A$ 可表示為 characteristic sequence $\chi_A$:
* $\chi_A[i] = 1$ if $s_i \in A$
* $\chi_A[i] = 0$ if $s_i \notin A$
* 例: $A = \{0, 00, 01, 000, 001\}$ over $\Sigma = \{0, 1\}$
* $\Sigma^* = \{\epsilon, 0, 1, 00, 01, 10, 11, 000, 001, \ldots\}$
* $\chi_A = 010110011\ldots$
* Correspondence $\chi$: Languages $\to$ Infinite binary sequences
但與 real number 相似,Infinite binary sequences 與 $\mathbb{N}$ 之間不存在 correspondence(用 diagonalization 證明),所以 set of all languages is uncountable。
---
**Corollary 4.18**: Some languages are **Turing-unrecognizable**
**Proof**:
* Set of all TMs: countable
* Set of all languages: uncountable
* Each TM recognizes only 1 language
* $\therefore$ 存在無法被任何 TM recognize 的 languages
#### Proof that $A_{\text{TM}}$ is Undecidable
**Theorem 4.11**: $A_{\text{TM}} = \{\langle M, w \rangle | M \text{ is a TM and } M \text{ accepts } w\}$ is undecidable.
**Proof** (By contradiction):
假設 $\exists$ decider $H$ for $A_{\text{TM}}$:
* Input: $\langle M, w \rangle$
* 若 $M$ accepts $w$ → $H$ accepts
* 若 $M$ does not accept $w$ (reject or loop) → $H$ rejects
構造 TM $D$:
```text
D on input ⟨M⟩:
Run H on input ⟨M, ⟨M⟩⟩
Output the OPPOSITE of what H outputs:
If M accepts ⟨M⟩ → H accepts, D rejects
If M rejects ⟨M⟩ → H rejects, D accepts
```
**關鍵步驟**: 執行 $D$ on input $\langle D \rangle$:
* 若 $D$ accepts $\langle D \rangle$
* → $H$ accepts $\langle D, \langle D \rangle \rangle$
* → $D$ rejects $\langle D \rangle$ ⚡ 矛盾
* 若 $D$ rejects $\langle D \rangle$
* → $H$ rejects $\langle D, \langle D \rangle \rangle$
* → $D$ accepts $\langle D \rangle$ ⚡ 矛盾
$\therefore$ $H$ 不存在,$A_{\text{TM}}$ is undecidable.
**Diagonalization 視角**:
建立表格,row = TMs $M_1, M_2, \ldots$,column = encodings $\langle M_1 \rangle, \langle M_2 \rangle, \ldots$
Entry $(i, j)$ = "accept" if $M_i$ accepts $\langle M_j \rangle$,否則空白
若 $H$ exists,可填滿整個表格(用 "accept" 或 "reject")
$D$ 的行為:flip diagonal entries
* 若 $M_i$ accepts $\langle M_i \rangle$ → $D$ rejects $\langle M_i \rangle$
* 若 $M_i$ rejects $\langle M_i \rangle$ → $D$ accepts $\langle M_i \rangle$
但 $D$ 本身也在列表中,$D$ on $\langle D \rangle$ 該是什麼?矛盾!
#### Co-Turing-Recognizable Languages
**定義**: Language $A$ is **co-Turing-recognizable** if $\overline{A}$ is Turing-recognizable
**Theorem 4.22**: A language is **decidable** iff it is both **Turing-recognizable** and **co-Turing-recognizable**
**Proof** ($\Rightarrow$):
* 若 $A$ decidable → $A$ Turing-recognizable (trivial)
* $\overline{A}$ also decidable (可將 decider 的 accept/reject 對調)
* $\therefore$ $\overline{A}$ Turing-recognizable
* $\therefore$ $A$ co-Turing-recognizable
**Proof** ($\Leftarrow$):
* 假設 $A$ 和 $\overline{A}$ 都 Turing-recognizable
* 存在 $M_1$ recognizes $A$,$M_2$ recognizes $\overline{A}$
* 設計 decider $M$ for $A$:
```text
M on input w:
Run M₁ and M₂ in parallel on w
If M₁ accepts → M accepts
If M₂ accepts → M rejects
```
* 因為 $w \in A$ or $w \in \overline{A}$(必居其一)
* $\therefore$ $M_1$ or $M_2$ 必有一個會 accept
* $\therefore$ $M$ always halts (是 decider)
* Correctness: $M$ accepts all strings in $A$, rejects all strings in $\overline{A}$
**Corollary 4.23**: $\overline{A_{\text{TM}}}$ is **Turing-unrecognizable**
**Proof**:
* $A_{\text{TM}}$ is undecidable (Theorem 4.11)
* $A_{\text{TM}}$ is Turing-recognizable (Universal TM)
* 若 $\overline{A_{\text{TM}}}$ Turing-recognizable
* → By Theorem 4.22, $A_{\text{TM}}$ decidable ⚡ 矛盾
* $\therefore$ $\overline{A_{\text{TM}}}$ is Turing-unrecognizable
## 5 Reducibility
### 5.1 Undecidable Problems from Language Theory
#### Concept of Reducibility
**Reducibility**: 將問題 $A$ convert 到問題 $B$,使得解決 $B$ 可以解決 $A$。
**嚴謹定義**: A is reducible to B
$$\exists f: \Sigma^* \to \Sigma^* \text{ such that } \forall w: w \in A \iff f(w) \in B$$
因為這章我們關注 decidability,因此 $f$ 必須是 **computable function**,即 $\exists$ TM 可以對於任意 $w$ 在有限步驟內計算出 $f(w)$。計作 $A \leq_m B$。
> 在關注 Time Complexity 是否是 P 時,就需要 $f$ 是 polynomial-time computable function。計作 $A \leq_p B$。
若 $A$ is reducible to $B$,則:
* Solving $A$ cannot be harder than solving $B$
* $B$ is at least as hard as $A$
* 記作 $A \leq_m B$ (many-to-one reducible)
**邏輯方向**:
* 若 $A \leq_m B$ 且 $B$ decidable → $A$ decidable
* 若 $A \leq_m B$ 且 $A$ undecidable → $B$ undecidable
我們會用第二點來證明很多問題是 undecidable。
---
#### $HALT_{\text{TM}}$ (Halting Problem)
$$HALT_{\text{TM}} = \{\langle M, w \rangle | M \text{ is a TM and } M \text{ halts on } w\}$$
**Theorem 5.1**: $HALT_{\text{TM}}$ is undecidable.
**Proof** (By Reducibility):
> Review: $A_{\text{TM}} = \{\langle M, w \rangle | M \text{ is a TM and } M \text{ accepts } w\}$
We will reduce $A_{\text{TM}}$ to $HALT_{\text{TM}}$.
假設 $\exists$ decider $R$ for $HALT_{\text{TM}}$。
構造 TM $S$ to decide $A_{\text{TM}}$:
```text
S on input ⟨M, w⟩:
Run R on input ⟨M, w⟩
If R rejects (M loops):
S rejects (because M never accepts w)
If R accepts (M halts):
Simulate M on w
If M accepts → S accepts
If M rejects → S rejects
```
**邏輯**: 若 $R$ 存在,可用它決定 $M$ 是否會 halt,進而決定 $M$ 是否會 accept
---
#### $E_{\text{TM}}$ (Emptiness Testing for TMs)
$$E_{\text{TM}} = \{\langle M \rangle | M \text{ is a TM and } L(M) = \varnothing\}$$
**Theorem 5.2**: $E_{\text{TM}}$ is undecidable.
**Proof** (By Reducibility):
假設 $\exists$ decider $R$ for $E_{\text{TM}}$
構造 TM $S$ to decide $A_{\text{TM}}$ on input $\langle M, w \rangle$:
1. Construct TM $M_1$:
```text
M₁ on input x:
If x ≠ w: reject
If x = w: run M on w
accept if M accepts w
```
1. Run $R$ on input $\langle M_1 \rangle$
2. 若 $R$ accepts (即 $L(M_1) = \varnothing$) → $S$ rejects (M does not accept w)
3. 若 $R$ rejects (即 $L(M_1) \neq \varnothing$) → $S$ accepts (M accepts w)
**關鍵性質**: $L(M_1) = \varnothing \Leftrightarrow M$ does not accept $w$
---
#### $REG_{\text{TM}}$ (Regularity Testing for TMs)
$$REG_{\text{TM}} = \{\langle M \rangle | M \text{ is a TM and } L(M) \text{ is regular}\}$$
**Theorem 5.3**: $REG_{\text{TM}}$ is undecidable.
**Proof** (By Reducibility):
假設 $\exists$ decider $R$ for $REG_{\text{TM}}$
構造 TM $S$ to decide $A_{\text{TM}}$ on input $\langle M, w \rangle$:
1. Construct TM $M_2$:
```text
M₂ on input x:
If x = 0ⁿ1ⁿ (n ≥ 0):
accept
Else:
run M on w
accept if M accepts w
```
1. Run $R$ on input $\langle M_2 \rangle$
2. 若 $R$ accepts (即 $L(M_2)$ is regular) → $S$ accepts (M accepts w)
3. 若 $R$ rejects (即 $L(M_2)$ is not regular) → $S$ rejects (M does not accept w)
**關鍵性質**:
* 若 $M$ does not accept $w$: $L(M_2) = \{0^n 1^n | n \geq 0\}$ (非 regular)
* 若 $M$ accepts $w$: $L(M_2) = \Sigma^*$ (regular)
* 故 $L(M_2)$ is regular $\Leftrightarrow$ $M$ accepts $w$
---
#### $EQ_{\text{TM}}$ (Equivalence Testing for TMs)
$$EQ_{\text{TM}} = \{\langle M_1, M_2 \rangle | M_1, M_2 \text{ are TMs and } L(M_1) = L(M_2)\}$$
**Theorem 5.4**: $EQ_{\text{TM}}$ is undecidable.
**Proof** (By Reducibility from $E_{\text{TM}}$):
假設 $\exists$ decider $R$ for $EQ_{\text{TM}}$
構造 TM $S$ to decide $E_{\text{TM}}$ on input $\langle M \rangle$:
1. Construct $M_3$: reject all inputs (即 $L(M_3) = \varnothing$)
2. Run $R$ on input $\langle M, M_3 \rangle$
3. 若 $R$ accepts (即 $L(M) = L(M_3) = \varnothing$) → $S$ accepts
4. 若 $R$ rejects (即 $L(M) \neq \varnothing$) → $S$ rejects
### Computation Histories
**Accepting Computation History**:
* TM $M$ 在 input $w$ 上執行時產生的 configuration sequence
* $C_1 \to C_2 \to \cdots \to C_k$,其中:
* $C_1$ = start configuration
* 每個轉移 $C_i \to C_{i+1}$ 遵循 $M$ 的規則
* $C_k$ = accepting configuration
**Rejecting Computation History**: 最後是 rejecting configuration
**嚴格定義**:
* **Computation history** for $M$ on $w$: 有限 configuration sequence
* 若 $M$ 不 halt on $w$ (loop forever) → **沒有 computation history**
* 若 $M$ halts → 存在唯一的 computation history (deterministic case)
**重要性質**:
* **Deterministic TM**: 最多一個 computation history
* 若 $M$ accepts $w$ → 有 1 個 accepting history
* 若 $M$ rejects $w$ → 有 1 個 rejecting history
* 若 $M$ loops on $w$ → 沒有 history
* **Nondeterministic TM**: 可能有多個 computation histories
* 不同的 branch 產生不同的 history
* 只要**任何一個** branch accepts,就認為 NTM accepts
**強調**: 本課程重點在 **deterministic TM**,故通常最多一個 history
### Linear Bounded Automata
#### Definition
**Linear Bounded Automaton (LBA)**: 是受限的 TM,tape head 不能移出 input 範圍
* Tape 長度受限於 input length:若 $|w| = n$,tape 最多 $n$ 個格子
* Left/right boundary: tape head 不能超出 $[0, n-1]$
* 其他方面與 TM 相同
其實很強,以上 $A_\text{DFA}, A_\text{CFG}, E_\text{DFA}, E_\text{CFG}$ 都可以用 LBA 決定。
而且每個 CFL 都可以被某個 LBA decide。
第九章會介紹不能被 LBA decide 的 decidable language。
**記號**: $A_{\text{LBA}} = \{\langle M, w \rangle | M \text{ is an LBA and } M \text{ accepts } w\}$
#### Decidability of $A_{\text{LBA}}$
**Lemma 5.8**: 設 LBA $M$ 有 $q$ 個 states 和 $g$ 個 tape symbols
Configuration 數量: $q \cdot n \cdot g^n$ (distinctly)
* Tape content: $g^n$ 種可能
* Head position: $n$ 種位置
* State: $q$ 種狀態
**Theorem 5.9**: $A_{\text{LBA}}$ is decidable.
**Proof**:
設計 TM $L$ to decide $A_{\text{LBA}}$:
```text
L on input ⟨M, w⟩:
Simulate M on w for at most q·|w|·g^|w| steps
If M halts and accepts → L accepts
If M halts and rejects → L rejects
If M doesn't halt within limit:
L rejects (M is looping)
```
**關鍵**: 因為 configuration 數有限,超過 $q \cdot n \cdot g^n$ 步後若未 halt 必定在 loop,可以 reject
---
#### Undecidability of $E_{\text{LBA}}$
$$E_{\text{LBA}} = \{\langle M \rangle | M \text{ is an LBA and } L(M) = \varnothing\}$$
**重要對比**: $A_{\text{LBA}}$ decidable,但 $E_{\text{LBA}}$ undecidable!
**Theorem 5.10**: $E_{\text{LBA}}$ is undecidable.
**Proof** (By Reducibility from $A_{\text{TM}}$):
假設 $\exists$ decider $R$ for $E_{\text{LBA}}$
構造 LBA $M'$ to decide $A_{\text{TM}}$ on input $\langle M, w \rangle$:
1. 設計 LBA $B$ 來 recognize accepting computation histories of $M$ on $w$
* Input 格式: $\#C_1 \# C_2 \# \cdots \# C_k\#$ (configuration sequence)
* $B$ 驗證:
* $C_1$ 是 $M$ 在 $w$ 上的 start configuration
* 每個 $C_i$ 到 $C_{i+1}$ 是合法轉移,在兩個 configuration 間 zig zag
* $C_k$ 是 accept configuration
2. 構造 TM $S$:
```text
S on input ⟨M, w⟩:
Construct LBA B (as described above)
Run R on input ⟨B⟩
If R accepts (L(B) = ∅):
S rejects (M does not accept w)
If R rejects (L(B) ≠ ∅):
S accepts (M accepts w)
```
**邏輯**: $L(B) = \varnothing \Leftrightarrow$ no accepting computation history exists $\Leftrightarrow$ M does not accept w
---
### $\text{ALL}_{\text{CFG}}$ is Undecidable
$$\text{ALL}_{\text{CFG}} = \{\langle G \rangle | G \text{ is a CFG and } L(G) = \Sigma^*\}$$
> 無法用 $\text{E}_{\text{CFG}}$ 的方法證明,因為 CFLs is not closed under complement
**Theorem 5.13**: $\text{ALL}_{\text{CFG}}$ is undecidable.
**Proof** (Reduction from $A_{\text{TM}}$):
假設 $\exists$ decider $R$ for $\text{ALL}_{\text{CFG}}$,我們可以用它 decide $A_{\text{TM}}$ ($\langle M, w \rangle$)
**核心想法**: 構造 CFG $G_{M,w}$,使得
$$L(G_{M,w}) = \Sigma^* \Leftrightarrow M \text{ does not accept } w$$
接著用一個 TM 模擬與 $G_{M,w}$ 等價的 PDA。
換句話說,**$G_{M,w}$ 的設計**是生成所有**不是** accepting computation history 的 strings
> $G_{M,w}$ 生成所有滿足以下**至少一個條件**的 strings:
>
> 1. $C_1$ 不是 $M$ 在 $w$ 上的 start configuration
> 2. $C_k$ 不是 accepting configuration
> 3. 某個 $C_i$ 到 $C_{i+1}$ 的轉移不符合 $M$ 的規則
**重要性質**:
* 若 $M$ **不** accept $w$ → 不存在 accepting computation history → 所有 string 都滿足至少一個條件 → $L(G_{M,w}) = \Sigma^*$
* 若 $M$ **accepts** $w$ → 存在 accepting computation history → 該 history 不滿足任何條件 → 被排除在 $L(G_{M,w})$ 外 → $L(G_{M,w}) \neq \Sigma^*$
**構造 $G_{M,w}$ (從 PDA 轉換)**:
因為 CFG 和 PDA 等價,我們構造一個 PDA $D_{M,w}$ 來解釋 $G_{M,w}$ 的構造,只是為了容易理解,理論上也能直接構造 CFG。
**Input 格式**: $\#C_1\#C_2^R\#C_3\#C_4^R\#\cdots\#C_k\#$
> 注意偶數位置的 configuration 是**反轉**的 (reverse order),因為 PDA 是 stack (LIFO),要比較 $C_i$ 和 $C_{i+1}$ 的對應位置,比如都是 (state, tape position, tape symbol) 的順序
**PDA $D_{M,w}$ 的運作**:
Non-deterministically **branch** 到三個檢查 (對應三個條件,用 OR 連接):
1. **Branch 1**: 檢查 $C_1$ 不是 start configuration
2. **Branch 2**: 檢查 $C_k$ 不是 accepting configuration
3. **Branch 3**: Non-deterministically 選某個 $i$,檢查 $C_i$ 不 yield $C_{i+1}$
**重點**: PDA 是 non-deterministic,**只要有一個 branch accept**,整個 string 就被 accept,代表不是一個 accepting computation history。
---
### $\text{EQ}_{\text{CFG}}$ is Undecidable
Reduction from $\text{ALL}_{\text{CFG}}$.
* 構造一個 CFG $H$,使得 $L(H) = \Sigma^*$。
* 然後檢查 $L(G) = L(H)$。
---
### Simple Undecidable Problem: Post Correspondence Problem (PCP)
**定義**: 給定一個骨牌 (dominoes) 的組合 $P = \{[t_1/b_1], [t_2/b_2], \ldots, [t_k/b_k]\}$,每個骨牌上下各有一個 string。
**Match**: 一個序列 $i_1, i_2, \ldots, i_\ell$ (可重複、可改變順序),使得上下字串的串接結果一樣:
$$t_{i_1} t_{i_2} \cdots t_{i_\ell} = b_{i_1} b_{i_2} \cdots b_{i_\ell}$$
**PCP Language**:
$$\text{PCP} = \{\langle P \rangle | P \text{ has a match}\}$$
**Theorem**: PCP is **undecidable**.
(證明略,課堂未涵蓋)
**意義**: 這是一個非常具體的演算法問題 (不像 $A_{\text{TM}}$ 那麼抽象),但仍然是 undecidable
* 給定骨牌集合,無法用 TM 判斷是否存在 match
* 可能永遠跑下去,無法確定是真的不存在還是還在找
## 7 Time Complexity
### 7.1 Measuring Complexity
Complexity Theory 研究**資源使用量** (resource usage),主要是 **time** 與 **space**。本章聚焦 time complexity。
對於 **decidable** language,我們關心:用多少 **time** (steps) 來 decide?
#### 範例: $A = \{0^n 1^n | n \geq 0\}$
**Single-tape TM $M_1$**:
1. **Stage 1**: 掃描 tape,檢查格式是否為 $0^*1^*$。Reject 若有 1 在 0 右側。
* **Complexity**: $O(n)$ (掃描一遍 + 回到最左端)
2. **Stage 2-3**: Repeat:若還有 0 和 1,就掃描一遍,cross off 一個 0 和一個 1。
* 最多 $n/2$ 次掃描,每次 $O(n)$ → **Complexity**: $O(n^2)$
3. **Stage 4**: 檢查是否還有 0 或 1 剩餘。
* **Complexity**: $O(n)$
**總複雜度**: $O(n) + O(n^2) + O(n) = O(n^2)$
---
#### Formal Definition
**Definition 7.1** (Running Time):
設 $M$ 是一個 **deterministic TM** that halts on all inputs。$M$ 的 **running time** (或 **time complexity**) 是函數 $f: \mathbb{N} \to \mathbb{N}$:
$$f(n) = \text{maximum number of steps } M \text{ uses on any input of length } n$$
* 考慮**最壞情況** (worst case)
* 若 $M$ 的 running time 是 $f(n)$,也說 "$M$ runs in time $f(n)$" 或 "$M$ is an $f(n)$ time TM"
---
#### Asymptotic Analysis
**Big O Notation**:
**Definition 7.2**: 設 $f, g: \mathbb{N} \to \mathbb{R}_{\geq 0}$
$$f(n) = O(g(n)) \iff \exists c > 0, n_0 \in \mathbb{N} \text{ s.t. } \forall n \geq n_0, f(n) \leq c \cdot g(n)$$
* $g(n)$ 是 $f(n)$ 的 **asymptotic upper bound**
* 忽略常數係數與低階項
* 只關心當 $n$ 很大時的行為
**範例**:
* $f_1(n) = 5n^3 + 2n^2 + 22n + 6 = O(n^3)$ (也是 $O(n^4)$,但不是 $O(n^2)$)
* $f_2(n) = 3n\log n + 5n\log\log n + 2 = O(n\log n)$
* $\log$ 的 base 不重要 (只差常數倍)
---
**Small o Notation**:
**Definition 7.3**: 設 $f, g: \mathbb{N} \to \mathbb{R}_{\geq 0}$
$$f(n) = o(g(n)) \iff \lim_{n\to\infty} \frac{f(n)}{g(n)} = 0$$
* $g(n)$ 成長速度**嚴格快於** $f(n)$
* 更強的條件
**範例**:
* $n^{0.5} = o(n)$ ($\sqrt{n}$ 相對於 $n$ 可忽略)
* $n = o(n\log\log n)$
* $n\log n = o(n^2)$
---
#### 改進的 TM: $M_2$ for $A = \{0^n 1^n | n \geq 0\}$
**Idea**: 每次掃描**減半** 0 和 1 的數量,而不是只 cross off 一對
**Algorithm**:
1. **Stage 1**: 同 $M_1$,檢查格式 $O(n)$
2. **Stage 2-4**: Repeat:
* 檢查剩餘的 0 和 1 總數是否為偶數,若為奇數 reject
* Cross off **every other** 0 (從第 1 個開始)
* Cross off **every other** 1 (從第 1 個開始)
* 若只剩 1 個 0,accept
* 若剩餘的 0 和 1 數量為 0 但另一個不是,reject
**Complexity 分析**:
* 每次掃描:$O(n)$
* 掃描次數:$O(\log n)$ (每次減半)
* **總複雜度**: $O(n\log n)$
---
#### 再改進: **Two-tape TM $M_3$** for $A$
**Idea**: 用第二條 tape 暫存 0,然後比對 1
**Algorithm**:
1. **Stage 1**: 檢查格式 $O(n)$
2. **Stage 2**: 掃描 tape 1 的所有 0,複製到 tape 2,直到遇到第一個 1
* **Complexity**: $O(n)$
3. **Stage 3**: 掃描 tape 1 的 1,每個 1 cross off tape 2 的一個 0
* **Complexity**: $O(n)$
4. **Stage 4**: 檢查 tape 2 是否為空
* **Complexity**: $O(n)$
**總複雜度**: $O(n) + O(n) + O(n) + O(n) = O(n)$
---
#### Time Complexity Class
**Definition 7.7**:
$$\text{TIME}(t(n)) = \{L | L \text{ is a language decidable by an } O(t(n)) \text{ time TM}\}$$
**範例**:
* $A = \{0^n 1^n | n \geq 0\} \in \text{TIME}(n^2)$ (by $M_1$)
* $A \in \text{TIME}(n\log n)$ (by $M_2$)
* $A \in \text{TIME}(n)$ (by $M_3$)
---
**重要觀察**:
* **Computability** (第 3-4 章): Single-tape TM = Multi-tape TM = NTM (等價)
* **Complexity** (第 7 章): 不同 model 的 complexity **可能不同**!
* Single-tape TM $M_1$: $O(n^2)$
* Single-tape TM $M_2$: $O(n\log n)$
* Two-tape TM $M_3$: $O(n)$
* 課本證明:single-tape TM 對此問題**至少需要** $O(n\log n)$ time
**結論**:
> When we talk about **computability**, all reasonable models are **equivalent**.
> When we talk about **complexity**, different models may have **different** time complexity.
#### Complexity Relationships Among Models
回顧 Chapter 3:不同 model (single-tape TM, multi-tape TM, NTM) 在 **computability** 上等價,但在 **complexity** 上可能不同。
---
**Multi-tape vs Single-tape TM**:
**Theorem 7.8**: 每個 $O(t(n))$ time multi-tape TM 都有等價的 $O(t(n)^2)$ time single-tape TM。
**Proof Idea**:
按照 Chapter 3 的 construction (用特殊符號分隔多條 tape),分析時間:
* **Active portion**: M 在 $t(n)$ steps 內最多使用 tape 上的 $t(n)$ 個位置
* **Simulation**: Single-tape TM $S$ 模擬 multi-tape TM $M$ 的每一步:
* **Two passes**: 第一次掃描讀取、第二次掃描寫入
* 若 $M$ 有 $k$ 條 tape,每條 active portion 至多 $t(n)$,則 $S$ 的 active portion 為 $O(k \cdot t(n)) = O(t(n))$
* 每步模擬時間: $O(t(n))$
* **Total time**: $M$ 執行 $t(n)$ steps,$S$ 每步 $O(t(n))$ → 總時間 $O(t(n)^2)$
**結論**: Complexity relationship 是 **quadratic** (二次方)。
* Multi-tape TM 跑 $O(n)$ → Single-tape TM 跑 $O(n^2)$
* Multi-tape TM 跑 $O(n^2)$ → Single-tape TM 跑 $O(n^4)$
---
**NTM vs DTM**:
**Running Time of NTM**:
**Definition 7.9**: 設 NTM $N$ 是 decider,其 running time $f(n)$ 定義為:
$$f(n) = \text{maximum number of steps on any branch of computation on any input of length } n$$
* 取所有 branches 中**最長路徑**的 steps
* 從 root 到 leaf (accept/reject) 的最大長度
---
**Theorem 7.11**: 每個 $O(t(n))$ time nondeterministic single-tape TM 都有等價的 $O(2^{O(t(n))})$ time deterministic single-tape TM。
**Proof Idea**:
按照 Chapter 3 的 construction (3-tape DTM 模擬 NTM,tape 3 記錄 branch choices),分析時間:
* 設 $b$ 為 $N$ 的 transition function 的最大分支數 (maximum number of legal choices)
* **Computation tree**:
* 最多有 $b^{t(n)}$ 個 leaves
* 總 nodes 數: 最多 leaves 數量的兩倍,$O(b^{t(n)})$
* **Simulation time**:
* 每個 node 從 root 走到該 node: $O(t(n))$
* 總時間: $O(t(n) \cdot b^{t(n)}) = O(2^{O(t(n))})$ (忽略 base $b$)
**結論**: Complexity relationship 是 **exponential** (指數)。
---
**重要觀察**:
* **Computability**: Single-tape TM = Multi-tape TM = NTM (等價)
* **Complexity**:
* Single-tape vs Multi-tape: **quadratic** ($n^2$)
* DTM vs NTM: **exponential** ($2^n$)
* Multi-tape 比 single-tape 「強」一點 (平方倍),但 NTM 比 DTM 「強很多」(指數倍)
---
### 7.2 The Class P
#### Polynomial Time 的重要性
**為何關注 polynomial time?**
1. **Growth rate**: 當 $n$ 很大時,polynomial 與 exponential 差距極大
* $n^{100}$ vs $1.1^n$: 當 $n \to \infty$,exponential 成長更快
2. **Model equivalence**: Deterministic models 之間通常是 **polynomially equivalent**
* 不同 deterministic TM variants 互相模擬的複雜度差距是 polynomial
3. **Efficiency**: Polynomial time 被視為 **efficient** (可接受的速度)
**注意**: 實際應用中,$n^2$ vs $n^3$ 仍有差異,但理論上都視為「fast」。
---
#### Definition of Class P
**Definition 7.12**:
$$\mathbf{P} = \{L | L \text{ is decidable in polynomial time on a deterministic single-tape TM}\}$$
* $\mathbf{P}$ 是 **languages** 的集合
* 可在 deterministic single-tape TM 上以 polynomial time decide
* 因為 deterministic models 是 polynomially equivalent,所以其他 deterministic models (multi-tape, RAM) 也適用
**直覺**: $\mathbf{P}$ 代表「可有效解決」的問題。
---
#### Encoding 的注意事項
**Reasonable encoding**:
* Graph: 列出 vertices 與 edges
* Numbers: 用 binary 表示
**Unreasonable encoding**:
* **Unary encoding** of number: 用 $n$ 個 1 表示數字 $n$
* $32$ 用 binary: `100000` (6 bits)
* $32$ 用 unary: `11111111111111111111111111111111` (32 bits)
* **影響**: 若用 unary,complexity 會被「人為降低」
**Pitfall**:
若 TM 的 input 為 $a_1,\ldots,a_n$,running time 是 $O(\max_i a_i)$:
數字 $a_i$ 的長度大約是 $\log a_i$,意思是長度 $L$ 的數字大約 $2^L$,所以以總輸入長度 $\sum_i O(\log a_i)$ 表示,running time 會是 **exponential** in input length。
---
#### Examples of Problems in P
**PATH**:
**Theorem 7.14**: $\text{PATH} \in \mathbf{P}$
$$\text{PATH} = \{\langle G, s, t \rangle | G \text{ is a directed graph with a directed path from } s \text{ to } t\}$$
**Algorithm**:
1. Mark node $s$
2. Repeat: 若存在 edge 從 marked node 到 unmarked node,mark 該 node
3. 直到無法 mark 更多 nodes
4. 若 $t$ 被 marked → accept;否則 reject
**Complexity**:
* 最多 $m$ 次 iteration ($m$ = number of nodes)
* 每次掃描所有 edges
* 總時間: $O(m \cdot |E|)$ = polynomial
---
**RELPRIME**:
**Theorem 7.15**: $\text{RELPRIME} \in \mathbf{P}$
$$\text{RELPRIME} = \{\langle x, y \rangle | x, y \text{ are relatively prime}\}$$
**Algorithm**: Euclidean algorithm
* 每步至少減半其中一個數字
* Iterations: $O(\min(\log x, \log y))$
**Complexity**: $O(\log x + \log y)$ = polynomial in input length
**注意**: 雖然有 $\log x$,但因為 $x$ 的 encoding 長度本身就是 $\log x$,所以仍是 polynomial。
---
**Every CFL**:
**Theorem 7.16**: Every context-free language is in $\mathbf{P}$
$$A_{\text{CFG}} = \{\langle G, w \rangle | G \text{ is a CFG that generates string } w\}$$
**錯誤嘗試**:
之前證明 decidability of CFL 時:
* 用 Chomsky Normal Form,嘗試所有 $2n-1$ steps 的 derivations
* **問題**: $k$ steps 的 derivations 數量可能是 **exponential** in $k$
**正確方法**: **Dynamic Programming** (CYK algorithm)
**Algorithm** (Pseudocode):
其實和之前證明 $E_{\text{CFG}}$ decidable 的方法類似,都是用逆推的。
設 `table[i][j]` = set of variables that can generate substring `w[i..j]`
```python
# Initialization: length 1 substrings
for i in range(1, n+1):
table[i][i] = {A | A → w[i] is a rule}
# DP: length 2 to n substrings
for length in range(2, n+1):
for i in range(1, n-length+2):
j = i + length - 1
table[i][j] = ∅
for k in range(i, j):
for each rule A → BC:
if B in table[i][k] and C in table[k+1][j]:
add A to table[i][j]
# Check if start variable can generate entire string
if S in table[1][n]:
accept
else:
reject
```
**Complexity**:
* 4 層 loop: $O(n) \times O(n) \times O(n) \times O(r) = O(n^3\cdot r)$,其中 $r$ = number of rules in $G$
* Polynomial time!
**結論**: $\text{CFL} \subseteq \mathbf{P}$
---
### 7.3 The Class NP
#### Motivating Examples
**兩個問題**:
1. **HAMPATH** (Hamiltonian Path): Directed graph 中是否存在經過每個 node 恰好一次的 path?
2. **COMPOSITES**: 給定 $x$,是否存在 $a, b > 1$ 使得 $x = a \times b$?
這兩個問題都在 $\mathbf{NP}$ 中。
---
#### Verifier 與 Certificate
**Definition 7.18** (Verifier):
A **verifier** for language $A$ is an algorithm $V$ where:
$$A = \{w | V \text{ accepts } \langle w, c \rangle \text{ for some string } c\}$$
* $c$ 稱為 **certificate** (或 **proof**) of membership
* $V$ 用來「驗證」$w \in A$
**Polynomial time verifier**:
* $V$ runs in polynomial time **in the length of $w$** (不考慮 $c$ 的長度)
**Polynomially verifiable**: Language $A$ 有 polynomial time verifier。
---
**Examples**:
* **HAMPATH**: Certificate = Hamiltonian path itself
* Verifier: 檢查是否為 valid path & 經過每個 node 恰好一次
* **COMPOSITES**: Certificate = divisor $a$ of $x$ (where $1 < a < x$)
* Verifier: 檢查 $x \mod a = 0$
---
#### Definition of NP
**Definition 7.19**:
$$\mathbf{NP} = \text{class of languages that have polynomial time verifiers}$$
**注意**:
* **NP ≠ "non-polynomial"**!
* **NP = "Nondeterministic Polynomial time"**
---
**Equivalent Definition**:
**Theorem 7.20**: Language $A \in \mathbf{NP}$ $\iff$ $A$ is decided by some nondeterministic polynomial time TM.
**兩個定義**:
1. **Original**: $A$ is polynomially **verifiable** by a deterministic TM
2. **Equivalent**: $A$ is **decidable** in polynomial time by a nondeterministic TM
---
**Proof** ($\Rightarrow$):
設 $A \in \mathbf{NP}$,存在 polynomial time verifier $V$ runs in time $n^k$。
構造 NTM $N$:
1. **Stage 1**: Nondeterministically select string $c$ of length $\leq n^k$
2. **Stage 2**: Run $V$ on $\langle w, c \rangle$
3. 若任一個組合使 $V$ accepts → $N$ accepts;否則 $N$ rejects
* $N$ 利用 nondeterminism 「猜」certificate $c$
* 只要有一個 branch 找到正確的 $c$,$N$ 就 accept
---
**Proof** ($\Leftarrow$):
設 $A$ is decided by NTM $N$ in polynomial time。
構造 verifier $V$:
* **Certificate**: Encode the sequence of nondeterministic choices
* 記錄每步選哪個 branch (e.g., "step 1: option 2, step 2: option 3, ...")
* **Verifier**: Simulate $N$ on $w$,按照 $c$ 的指示走對應 branch
* 若該 branch accepts → $V$ accepts;否則 $V$ rejects
**關鍵**: Certificate 就是「正確的 nondeterministic choices」。
---
#### Examples of NP Problems
##### HAMPATH
**Theorem**: $\text{HAMPATH} \in \mathbf{NP}$
**Method 1** (Verifier):
* **Certificate**: Sequence of nodes $p_1, p_2, \ldots, p_m$
* **Verifier**:
* 檢查無重複
* 檢查 $p_1 = s$,$p_m = t$
* 檢查每對 $(p_i, p_{i+1})$ 是 edge
* Polynomial time!
**Method 2** (NTM):
構造 NTM $N$:
1. Nondeterministically create list $p_1, \ldots, p_m$ (where $m$ = number of nodes)
2. 若有重複 → reject
3. 若 $p_1 \neq s$ 或 $p_m \neq t$ → reject
4. 檢查每對 $(p_i, p_{i+1})$ 是否為 edge,若有缺失 → reject
5. Otherwise, Accept
---
##### CLIQUE
**Theorem 7.24**: $\text{CLIQUE} \in \mathbf{NP}$
$$\text{CLIQUE} = \{\langle G, k \rangle | G \text{ is an undirected graph with a } k\text{-clique}\}$$
* **$k$-clique**: subnet of $k$ 個 nodes,是個完全圖(每對都有 edge 連接)
**Method 1** (Verifier):
* **Certificate**: Subset $C$ of $k$ nodes
* **Verifier**:
* 檢查 $|C| = k$
* 檢查 $C$ 中每對 nodes 都有 edge
* Polynomial time!
**Method 2** (NTM):
1. Nondeterministically select subset $C$ of $k$ nodes
2. 檢查 $G$ 是否包含 $C$ 中所有 node pairs 的 edges
3. 若是 → accept;否則 reject
> 若 $k$ 是常數,則列舉組合是 $\binom{|V|}{k} = O(|V|^k)$,每個組合檢查 edges 是 $O(k^2)$,總時間是 $O(k^2|V|^k)$ polynomial in $|V|$。
> 但若 $k$ 是 input 的一部分,則因為 Pitfall 提到的 encoding 問題,$O(|V|^k)$ 使得整體時間 exponential in input length of $k$。(可以注意到 $k\le |V|$,所以 $O(k^2)$ 不是主因)
---
##### SUBSET-SUM
**Theorem 7.25**: $\text{SUBSET-SUM} \in \mathbf{NP}$
$$\text{SUBSET-SUM} = \{\langle S, t \rangle | S = \{x_1, \ldots, x_k\}, \exists Y \subseteq S: \sum_{y \in Y} y = t\}$$
* 允許 **multiset** (可重複使用元素)
**Example**: $S = \{4, 11, 16, 21, 27\}$,$t = 25$ → Yes ($4 + 21 = 25$)
**Method 1** (Verifier):
* **Certificate**: Subset $C \subseteq S$
* **Verifier**:
* 檢查 $C$ 中所有元素都在 $S$ 中
* 檢查 $\sum_{c \in C} c = t$
* Polynomial time!
**Method 2** (NTM):
1. Nondeterministically select subset $C \subseteq S$
2. 檢查 $\sum_{c \in C} c = t$
3. 若是 → accept;否則 reject
---
#### co-NP
某個 problem 的 complement 在 $\mathbf{NP}$ 中,則該 problem 屬於 **co-NP**。
比如 $\overline{\text{HAMPATH}}$: $G$ 是否不存在一個 Hamiltonian path?光是 certificate 就很難描述。
co-NP-complete: 若一個問題的 complement 是 NP-complete,則該問題是 co-NP-complete。
目前還不知道 $\mathbf{NP} = \text{co-}\mathbf{NP}$ 是否成立。但如果證明任一個 co-NP-complete 問題是 NP 或不是 NP,就知道是否 $\mathbf{NP} = \text{co-}\mathbf{NP}$,因為 NP-complete 問題的 complement 也可以互相 polynomial time reduce。
#### P vs NP
**觀察**:
* $\mathbf{P}$ = languages **decidable** in polynomial time (by DTM)
* $\mathbf{NP}$ = languages **verifiable** in polynomial time (by DTM)
* 或等價地:languages **decidable** in polynomial time (by NTM)
**顯然**: $\mathbf{P} \subseteq \mathbf{NP}$
* 若 $A \in \mathbf{P}$,可直接 decide,不需 certificate
* Verifier 可忽略 certificate,直接 decide $w$
**問題**: $\mathbf{P} = \mathbf{NP}$?
* **Most people believe**: $\mathbf{P} \neq \mathbf{NP}$
* 但至今**無人證明**!
* Clay Mathematics Institute 列為 [Millennium Prize Problems](https://en.wikipedia.org/wiki/P_versus_NP_problem) (獎金 $1M)
---
### 7.4 NP-Completeness (簡介)
#### Definition of NP-Complete
**Definition 7.34**:
Language $B$ is **NP-complete** if:
1. $B \in \mathbf{NP}$
2. Every $A \in \mathbf{NP}$ is **polynomial time reducible** to $B$
**直覺**: NP-complete 是 $\mathbf{NP}$ 中「最難」的問題。
**重要性**:
* 若任一 NP-complete problem $B \in \mathbf{P}$ → 則 $\mathbf{P} = \mathbf{NP}$
* 至今無人找到 polynomial time algorithm 解決任何 NP-complete problem
---
#### Examples of NP-Complete Problems
**一些 NP-complete problems**:
1. **SAT** (Boolean satisfiability): 給定 Boolean formula,是否存在 satisfying assignment?
* Cook-Levin Theorem: SAT 是第一個被證明的 NP-complete problem
2. **3SAT**: SAT 的特殊情況 (每個 clause 恰好 3 literals),2SAT 則在 P 中
3. **CLIQUE**
4. **HAMPATH**
5. **UHAMPATH**: Undirected 版本的 HAMPATH
6. **SUBSET-SUM**
7. **VERTEX-COVER**: 給定 graph $G$ 與 $k$,是否存在 $k$ 個 nodes 使得每條 edge 至少一端被選中?
**意義**: 許多實際問題都是 NP-complete,目前只能用 heuristics 或 approximation algorithms 處理。
---
### Chapter 7 總結
1. **Complexity measures**: Time complexity = number of steps
2. **Big O & Small o**: Asymptotic analysis 忽略常數與低階項
3. **Model relationships**:
* Multi-tape vs Single-tape: quadratic ($O(n^2)$)
* NTM vs DTM: exponential ($O(2^n)$)
4. **Class P**: Polynomial time decidable (efficiently solvable)
5. **Class NP**: Polynomial time verifiable
6. **P vs NP**: 至今未解決
7. **NP-Complete**: NP 中最難的問題,若任一 NP-complete $\in$ P 則 P = NP
---
## Course Summary
### 課程架構回顧
本課程分為三大部分:
#### Part 1: Automata and Languages
* **Regular Languages**:
* **DFA** (Deterministic Finite Automaton)
* **NFA** (Nondeterministic Finite Automaton)
* **Regular Expressions**
* **等價性**: DFA ≡ NFA ≡ Regular Expressions
* 都能 recognize/generate **Regular Languages**
* **Context-Free Languages**:
* **CFG** (Context-Free Grammar)
* **(N)PDA** (Pushdown Automaton,預設指 nondeterministic)
* **等價性**: CFG ≡ Nondeterministic PDA
* 都能 recognize/generate **Context-Free Languages**
* **注意**: DPDA (Deterministic PDA) **不等價於** PDA
* DPDA 能力弱於 PDA (課程未涵蓋,作為補充材料)
#### Part 2: Computability Theory
* **Turing Machine**:
* **Deterministic TM**
* **Nondeterministic TM**
* **等價性**: DTM ≡ NTM (computability 角度)
* 能 recognize **Turing-recognizable languages**
* 若 TM 是 **decider** (總是 halt) → recognize **Decidable languages**
* **Language Hierarchy**:
$$\text{Regular} \subset \text{Context-Free} \subset \text{Decidable} \subset \text{Turing-Recognizable} \subset \text{All Languages}$$
* 還有 **Turing-unrecognizable languages** (在 Turing-recognizable 之外)
* **Decidability** (Chapter 4):
* 證明某些 languages 是 **decidable**
* **Undecidability** (Chapter 5):
* **Diagonalization method**: 證明 $A_{\text{TM}}$ undecidable
* **Reduction**: 從 $A_{\text{TM}}$ reduce 到其他 languages,證明它們 undecidable
| | FA | CFG | TM | LBA |
|-------|-----|-----|----|------|
| $A$ | D | D | UD | D |
| $E$ | D | D | UD | UD |
| $EQ$ | D | UD | UD | (UD) |
| $ALL$ | (D) | UD | | |
(括號代表課堂沒證但確實是這樣)
TM 的 HALT, REG 也都是 UD。
#### Part 3: Complexity Theory (Chapter 7)
關注「需要多少**時間**解決問題」:
* **Class P**: Polynomial time decidable (deterministic)
* 被視為「可有效解決」的問題
* 例: PATH, RELPRIME, Every CFL
* **Class NP**: Polynomial time verifiable (deterministic) 或 Polynomial time decidable (nondeterministic)
* 例: HAMPATH, CLIQUE, SUBSET-SUM
* **P vs NP**:
* $\mathbf{P} \subseteq \mathbf{NP}$ (顯然)
* $\mathbf{P} = \mathbf{NP}$? **未解決** (Millennium Prize Problem)
* **NP-Complete**: NP 中最難的問題
* 例: SAT, 3SAT, CLIQUE, HAMPATH, SUBSET-SUM, VERTEX-COVER
* 若任一 NP-complete problem $\in \mathbf{P}$ → $\mathbf{P} = \mathbf{NP}$
### 課程意義與應用
#### 為何學習 Theory of Computation?
1. **畢業要求**: CSIE 學生必修
2. **基礎知識**: 計算理論的基礎
3. **實際應用**: 雖然畢業生問卷顯示「最不實用」,但:
* 基礎知識有廣泛影響
* 研究領域可能需要 (cryptography, systems engineering)
* Model-based design 的理論基礎
#### Model-Based Design
對於 **safety-critical systems** (醫療設備、自動駕駛、航空器):
* 不能直接實作後測試
* 需從 **model of computation** 開始:
* **Finite State Machine** (FSM): 良好起點
* **Transition System**
* **Hybrid System**: FSM + differential equations
* **驗證**: 確保系統不會到達 unsafe states
* **流程**: Model → Verify → Implementation
**課程教授的 models of computation 是這些應用的理論基礎**。
### 面對計算困難問題的策略
許多現實問題是 computationally hard (NP-complete),可採用以下策略:
1. **犧牲最優解**: 使用 approximation algorithms
2. **考慮其他計算模型**: 非傳統計算 (quantum computing, DNA computing)
3. **不考慮 worst case**: 平均情況或實際情況分析
4. **Heuristics**: 啟發式演算法
### AI 與計算的局限性
* **AI 與電腦很強大**,但存在**根本限制**:
* Undecidable problems
* NP-complete problems (目前無 polynomial time solution)
* **理解限制**才能**善用**:
* 知道什麼能做、什麼不能做
* 選擇合適的方法與工具
**作為 CSIE 學生,應理解計算的能力與局限,才能做出真正有價值的貢獻。**