--- tags: [2025_Fall] --- # 自動機與形式語言筆記 <!-- HackMD ID:pVyh2UKlRniZCGACJ4xFVw --> ## 0 Introduction <style> .red {color: red;} </style> 最終目標是介紹 Complexity,但會先從 Automata、Computability 開始。 ### Notions #### 基本概念 * **集合 (Set)**:無序的獨特物件組合。Subset: $\subseteq$, Proper subset: $\subset$. * **序列 (Sequence)**:有順序的物件排列。(本筆記中盡量 0-based) * **元組 (Tuple)**:有限序列,K 個元素稱 K-元組。 * **笛卡兒積 (Cartesian product/ Cross product)**:$A \times B = \{(a, b) | a \in A, b \in B\}$,所有有序對的集合。 * **冪集 (Power Set)**:$P(A) = \{S | S \subseteq A\}$,集合 $A$ 所有子集的集合。 #### 函數與關係 * **函數 (Function)**:每個輸入對應唯一輸出。 * **述詞 (Predicate / Property)**:range 為 {TRUE, FALSE} 的函數。 * **關係 (Relation)**:Domain 為 a set of k tuples ($A \times \ldots \times A$) 的 predicate,比如大於(2>1 是 TRUE)。用集合表示會更方便,例如 $P:D \to \{\text{TRUE}, \text{FALSE}\}$ 寫成 $(D,S)$,其中 $S=\{a\in D | P(a) = \text{TRUE}\}$。 * **等價關係 (Equivalence Relation)**:二元關係 $R$ 須滿足: 1. **自反性 (Reflexive)**:$\forall x, xRx$。 2. **對稱性 (Symmetric)**:$\forall x, y, xRy \Rightarrow yRx$。 3. **遞移性 (Transitive)**:$\forall x, y, z, xRy \land yRz \Rightarrow xRz$。 #### 圖 (Graph) 這裡未指名時通常是無向。 #### Strings 與 Languages 很重要 * **Alphabet** 是任何 **nonempty finite** set * Alphabet 的成員是 **Symbol** * **String** (over an alphabet A) 是 **finite** sequence of symbols from A。(本筆記中盡量 1-based) * $\epsilon$ 是空字串。 * **Shortlex order**: 和字典序一樣,但是先比長度再比字典序,例如 $(\epsilon, 0, 1, 00, 01, \ldots)$ 是 over $\{0, 1\}$ 的 shortlex order。 * **Language** 是某個 alphabet 上的 string 的集合,可能(很常)是 infinite。 ### Definitions, Theorems, and Proofs * **Definition**: 定義概念。 * **Statement**: 命題,未必是 TRUE 或 FALSE。 * **Proof**: 講述命題為真的邏輯論述。 * **Theorem**: 已被證明的 statement。 * **Lemma**: 輔助用的 theorem。 * **Corollary**: 從 theorem 推導出的 statement。 ### Types of Proofs * **Proof by Construction**: 直接構造出例子。 * 比如要求證明對於大於 2 的偶數 n,存在一個 n 個 node 的 3-regular Graph,就直接寫出對於每個 n 要如何建構 Graph。 * **Proof by Contradiction**: 反證法。 * 假設命題為 FALSE,然後推導出矛盾。 * **Proof by Induction**: 歸納法。 * basis: 基礎 case。 * inductive step: 假設對於 n 成立 (**Induction Hypothesis**),證明對於 n+1 也成立。 ## 1 Regular Languages ### Finite Automata #### Definition of Finite Automaton $(Q, \Sigma, \delta, q_0, F)$ 是一個 finite automaton,其中: * $Q$ 是 **states** 的有限集合。 * $\Sigma$ 是作為 **alphabet** 的有限集合。 * $\delta: Q \times \Sigma \to Q$ 是 **transition function**。 * $q_0 \in Q$ 是 **start state**。 * $F \subseteq Q$ 是 **accept (final) states** 的集合。 運作方式:自動機從初始狀態 $q_0$ 開始,依序讀取輸入字串中的每個符號。根據轉移函數 $\delta$,從當前狀態和讀取的符號決定下一個狀態。讀取完所有符號後,若最終狀態屬於接受狀態集合 $F$,則該字串被接受;否則,該字串被拒絕。 > $\delta$ is a total function, so for every state and input symbol, there is always a defined next state. **機器 M 所辨識的語言 (language of machine M)**: 所有被 M 接受的字串所組成的集合 A,寫成 $L(M)=A$,稱為 M recognizes A (accepts 也行,但是通常用在 string)。 #### Definition of Computation Let * $M=(Q, \Sigma, \delta, q_0, F)$ be a finite automaton. * $w = a_1 a_2 \ldots a_n$ be a string over $\Sigma$. M accepts w $\iff$ there exists a sequence of states $r_0, r_1, \ldots, r_n \in Q$ such that: 1. $r_0=q_0$ 2. $r_{i+1} = \delta(r_{i}, a_{i+1}), \forall i \in [0, n-1]$ 3. $r_n \in F$ M recognizes language A $\iff A =\{w| M \text{ accepts } w\}$. A **Regular Language** 是可以被某個 finite automaton 所接受的 language。 #### Regular Operations Let A and B be languages. * **Union**: $A \cup B = \{w | w \in A \lor w \in B\}$ * **Concatenation**: $A \circ B = \{w_1 w_2 | w_1 \in A \land w_2 \in B\}$ 也可以省略不寫 $\circ$,直接寫成 $AB$。 * **Kleene Star**: $A^* = \{w_1w_2\ldots w_k|k\ge 0 \text{ and each }w_i\in A\}$,也可以寫成 $\bigcup_{n=0}^{\infty} A^n = \{\epsilon\} \cup A \cup A^2 \cup A^3 \cup \ldots$,其中 $A^0=\{\epsilon\}, A^n=\{wv|w\in A^{n-1},v\in A\}$。 **Theorem**: The set of regular languages is closed under all three of the regular operations. * For union, prove by construction: Let $A_1$ and $A_2$ be regular languages recognized by finite automata $M_1=(Q_1, \Sigma, \delta_1, q_{01}, F_1)$ and $M_2=(Q_2, \Sigma, \delta_2, q_{02}, F_2)$, we can construct a new finite automaton $M=(Q, \Sigma, \delta, q_0, F)$ that recognizes $A_1 \cup A_2$. * $Q = Q_1 \times Q_2$ (Cartesian product) * $\Sigma$ remains the same, but this theorem is still true if the alphabets are different. * $\delta((q_1, q_2), a) = (\delta_1(q_1, a), \delta_2(q_2, a))$ * $q_0 = (q_{01}, q_{02})$ * $F = (F_1\times Q_2)\cup(Q_1\times F_2) = \{(q_1, q_2) | q_1 \in F_1 \lor q_2 \in F_2\}$ * For concatenation: 無法簡單構造,因為不知道 $A_1$ 和 $A_2$ 中間要在哪裡切,所以需要 Nondeterminism。Kleene Star 也是類似的情況。 ### Nondeterminism #### Definition of Nondeterministic Finite Automaton (NFA) Transition function $\delta$ 由 $Q \times \Sigma \to Q$ 改成 $Q\times \Sigma_\epsilon\to P(Q)$,其中 $P(Q)$ 是 $Q$ 的 Power set,$\Sigma_\epsilon=\Sigma\cup\{\epsilon\}$。也就是可能轉移到輸出的子集中的任一個 state,甚至輸入的 symbol 可以是 $\epsilon$。 其實也能在使輸入維持 $Q\times \Sigma\to P(Q)$,只要把轉移後再經過若干 $\epsilon$ 轉移能到的狀態都新增到輸出中就行了。 轉移條件改為 $r_{i+1} \in \delta(r_{i}, a_{i+1}), \forall i \in [0, n-1]$ **Nondeterminism is a generalization of determinism**: Every DFA (deterministic finite automaton) is automatically an NFA (nondeterministic finite automaton). > For Nondeterministic Automata, $\delta$ is a total function but the output can be empty set, so we may skip some transitions when drawing. #### Equivalence of NFAs and DFAs NFA 與 DFA 是等價的,也就是對於任意 NFA,都可以找到與其 recognizes 相同 language 的 DFA。對於有 $k$ 個狀態的 NFA $N=(Q,\Sigma,\delta,q_0,F)$,對應的 DFA $D=(Q',\Sigma,\delta',q_0',F')$ 的定義如下: * 先定義輔助用的 $E(R)=\{q|q$ can be reached from $R$ by traveling along 0 or more $\epsilon$ arrows $\}$,表示從狀態集合 $R$ 可以到達的所有狀態。 * state: $Q'=P(Q)$ * transition function: $\delta'(R,a)=\bigcup_{r\in R}E(\delta(r,a))$ * start state: $q_0' = E(\{q_0\})$ * accept states: $F' = \{R \in Q' | R \cap F \neq \varnothing\}$ 因此,以後並不太需要在意 NFA、DFA 的差別。 **Corollary**: A language is regular $\iff$ an NFA recognize it. $\Rightarrow$:Regular 有 DFA recognize,而 DFA 是 NFA 的特例。 $\Leftarrow$:NFA recognize 某 language,則對應的 DFA 也 recognize 相同 language。 #### Closure under regular operations Let * $N_1=(Q_1, \Sigma, \delta_1, q_{01}, F_1)$ be an NFA recognizing language $A_1$. * $N_2=(Q_2, \Sigma, \delta_2, q_{02}, F_2)$ be an NFA recognizing language $A_2$. For each operation, construct a NFA $N=(Q,\Sigma,\delta,q_0,F)$ that recognizes the resulting languages, $A_1\cup A_2, A_1\circ A_2, A_1^*$: * **Union**: 額外建立一個初始狀態 $q_0$,可以經過 $\epsilon$ 隨意到 $q_{01}$ 或 $q_{02}$。 * $Q=\{q_0\}\cup Q_1\cup Q_2$ * $\delta(q,a)=$ * $\delta_1(q,a)$ if $q\in Q_1$ * $\delta_2(q,a)$ if $q\in Q_2$ * $\{q_{01},q_{02}\}$ if $q=q_0 \land a=\epsilon$ * $\varnothing$ if $q=q_0 \land a\ne \epsilon$ * $F=F_1\cup F_2$ * **Concatenation**: 因為先經過 $N_1$ 再經過 $N_2$,$M$ accept 的狀態必須是 $N_1, N_2$ 都 accept 過,所以把所有 $N_1$ 的 accept state 以 $\epsilon$ 接到 $N_2$ 的 start state。細節是 $N_1$ 的 accept state 也可能繼續在 $Q_1$ 裡面走,不一定都直接到 $q_{02}$。 * $Q=Q_1\cup Q_2$ * $\delta(q,a)=$ * $\delta_1(q,a)$ if $q\in Q_1 \land \lnot(q \in F_1 \land a=\epsilon)$ * $\delta_1(q,a)\cup\{q_{02}\}$ if $q\in F_1 \land a=\epsilon$ * $\delta_2(q,a)$ if $q\in Q_2$ * $q_0=q_{01}$ * $F=F_2$ * **Kleene Star**: 所有原本的 accept state 可以回到 start state 繼續,且新 start state 也是一種 accept state。 * $Q=\{q_0\}\cup Q_1$ * $\delta(q,a)=$ * $\delta_1(q,a)$ if $(q\in Q_1 \land q \notin F_1) \lor (q\in F_1 \land a \ne \epsilon)$ * $\delta_1(q,a)\cup\{q_{01}\}$ if $q\in F_1 \land a=\epsilon$ * $\{q_{01}\}$ if $q=q_0 \land a=\epsilon$ * $\varnothing$ if $q=q_0 \land a\ne \epsilon$ * $F=\{q_0\}\cup F_1$ 另外,事實上對於一些其他 operation 也是 closed,比如: * Intersection: $A_1\cap A_2$。使用 product automaton (之前在 Union 使用的方法),使得狀態同時在 $N_1$ 和 $N_2$ 的 accept states 才算 accept。 * Set Difference: $A_1-A_2$。一樣用 product automaton,讓狀態在 $N_1$ 的 accept states 但不在 $N_2$ 的 accept states 才算 accept。 * Complement: $A_1^c=\Sigma^*-A_1$。顯然 $\Sigma^*$ 是 regular,而 Set Difference 下也 closed,所以 $A_1^c$ 也是 regular。 ### Regular Expressions Symbol 或 Alphabet 經過 Regular Operations 即為 Regular Expression。每個 Expression 只是一個記號,代表一個 set of strings 也就是 language,說成 "The expression **describes** the language"。 Let $\Sigma$ be an alphabet. A regular expression $R$ over $\Sigma$ is defined as follows (以下 $\to$ 符號代表 expression 描述的 language): * $\varnothing\to\varnothing$ * $\epsilon\to\{\epsilon\}$ * $a$ for some $a\in \Sigma\to\{a\}$ * $R_1 \cup R_2$ for regular expressions $R_1, R_2\to L(R_1) \cup L(R_2)$ * $R_1 \circ R_2$ for regular expressions $R_1, R_2\to L(R_1) \circ L(R_2)$ * $R_1^*$ for regular expression $R_1\to L(R_1)^*$ 若沒有括號,順序是 Kleene Star > Concatenation > Union,分別可以類比為次方、乘法、加法。 #### 速記 * $R^+=RR^*$,至少重複一次,$R^+\cup\epsilon=R^*$ * $R^n=R^{n-1}\circ R, R^0=\epsilon$ * $L(R)$ 是 $R$ 所代表的 language。 #### Examples 只挑稍微難理解的 * $1^*\varnothing$: 因為是指任何數量的 1 接上「什麼東西都不能」,所以 $1^*\varnothing=\varnothing$,事實上任何 expression 接上 $\varnothing$ 都是 $\varnothing$。 * 任何 expression 與 $\varnothing$ 的 Union 是本身。 * 類似的,任何 expression 與 $\epsilon$ 的 Concatenation 是本身。 * $\varnothing^*=\epsilon$,因為 $L(\varnothing^*) = L(\varnothing)^0 \cup L(\varnothing)^1 \cup L(\varnothing)^2 \cup \ldots = \{\epsilon\} \cup \varnothing \cup \varnothing \cup \ldots = \{\epsilon\}$ 注意正規表達式本身是 notation,不直接等於它描述的 language。因此,應寫 $\varnothing^*=\epsilon$(表示這兩個正規表達式描述相同 language),或 $L(\varnothing^*)=\{\epsilon\}$(表示 $\varnothing^*$ 所描述的 language 是只有 epsilon 的集合)。 #### Equivalence with Finite Automata A language is regular $\iff$ A regular expression describes it. $\Leftarrow$: Proof by Construction. Given a regular expression $R$, we can construct a NFA $M=(Q,\Sigma,\delta,q_0,F)$ that recognizes the language $L(R)$. (若很明顯會省略) * **Base cases** * $R=a, a\in\Sigma$: $\delta(q_0,a)=\{q_1\}, F=\{q_1\}$ * $R=\epsilon$: $F=\{q_0\}$ * $R=\varnothing$: $F=\varnothing$ * **Inductive steps**. For Union, Concatenation, and Kleene Star, use the proof that the set of regular languages is close under those operations. Let $R_1$ and $R_2$ are regular expressions such that $L(R_1), L(R_2)$ are both regular languages so there exist their respective DFAs. For $(R_1\cup R_2), (R_1\circ R_2), (R_1^*)$, we can use the construction from the proof for closure under regular operations to construct a DFA $M$ that recognizes $L(R_1\cup R_2), L(R_1\circ R_2), L(R_1^*)$ respectively. $\Rightarrow$: Proof by Construction. Convert a DFA to a regular expression. First to a **special form** of **generalized nondeterministic finite automaton (GNFA)** then to a regular expression. ##### GNFA **GNFA**: NFA but can have regular expressions as the input to the transition function. Use a **special form** of GNFA: * Start state * It has a transition going to any other state * It has no transition coming in from any other state * Accept state * There is only a single accept state * It is not the same as the start state * It has a transition coming in from any other state * It has no transition going to any other state * Except for the start and accept states * One transition goes from every state to every other state * One transition goes from each state to itself (可以把**非** special form 的 GNFA 轉換成 special form 的 GNFA) **Definition of GNFA (Special Form)**: $(Q,\Sigma,\delta,q_{start},q_{accept})$ where: * $Q,\Sigma$: same as NFA * $\delta:$<span class="red">$(Q-\{q_{accept}\})\times(Q-\{q_{start}\})$</span>$\to\mathcal{R}$, where $\mathcal{R}$ is the set of regular expressions over $\Sigma$. (從狀態、symbol 對應到下一個狀態改為狀態對對應到 regular expression) No transition from $q_{accept}$ and no transition to $q_{start}$. * $q_{start},q_{accept}\in Q$ are the start and accept states, respectively. Computation 非常簡單,略過。 ##### 證明過程本體 * Convert a DFA to a GNFA (Special Form) * Add a new start state with an $\epsilon$ transition to the original start state. * Add a new accept state with an $\epsilon$ transition from every original accept state. * 對於每一對狀態(包含自環)的兩個方向,分別使用 Union 整合所有 transition * 對於每一對狀態(包含自環)的兩個方向,對沒有 transition 的以 label $\varnothing$ 連接 * Convert a GNFA (Special Form) to a regular expression: Remove states until only the start and accept states remain. * Select any state $q_{rip}\in Q$ different from $q_{start},q_{accept}$. * Let G' be the GNFA $(Q',\Sigma, \delta',q_{start},q_{accept})$ * $Q' = Q - \{q_{rip}\}$ * $\delta'(q_1,q_2) = \delta(q_1,q_{rip})\delta(q_{rip},q_{rip})^*\delta(q_{rip},q_2)\cup\delta(q_1,q_2)$ for any $q_1\in Q'-\{q_{accept}\}$ and any $q_2\in Q'-\{q_{start}\}$. * 實際操作時要記得自環也要按照同樣的方法處理。 ### Nonregular Languages * $B={0^n1^n|n\ge 0}$: Nonregular * $C=\{w|w \text{ has an equal number of 0s and 1s}\}$: Nonregular * $D=\{w|w \text{ has an equal number of occurrences of 01 and 10 as substrings}\}$: **Regular!** 因為其實只要看開頭與結尾即可,若一樣則 01 與 10 數量相同,若 0 開頭 1 結尾則 01 多了一個,反之亦然。重點是因為兩者數量頂多差 1,所以能以有限種狀態表達。 #### The Pumping Lemma for Regular Languages $A$ is a regular language $\Rightarrow$ There exists a **pumping length** $p\ge 1$ such that for any string $w\in A$ with $|w|\ge p$, we can write $w=xyz$ such that: * $xy^iz\in A\quad\forall i\ge 0$ * $|y|\ge 1$ * $|xy|\le p$ [**The converse is not true!!!**](https://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages#Invalidity_of_the_lemma_converse) It can only be used to prove a language is nonregular, but some nonregular languages may satisfy the property. To prove a language is regular, the typical way is to construct an DFA/NFA or a regular expression. ##### Proof by Construction Let * $M=(Q,\Sigma,\delta,q_0,F)$ be a DFA recognizing language $A$. * $p=|Q|$ be the pumping length. * $w=a_1a_2\ldots a_n$ be a string in $A$ of length $n\ge p$. * $r_0, r_1, \ldots, r_n$ be the states visited by $M$ on input $w$. ($r_0=q_0$, $r_{i+1}=\delta(r_{i},a_{i+1}) \forall i\in [0,n-1]$, and $r_n\in F$) 取前 $p+1=|Q|+1$ 個,因為 $M$ 只有 $|Q|$ 種狀態,by the pigeonhole principle, there exists $i,j$ such that $0\le i<j\le p$ and $r_i=r_j$. * $x=a_1\ldots a_i, y=a_{i+1}\ldots a_j, z=a_{j+1}\ldots a_n$. 圖示: $r_0\xrightarrow{x}r_i\xrightarrow{y}r_j\xrightarrow{z}r_n$ Since $r_i=r_j$, repeating $y$ any number of times will not change the state, so $L(xy^*z)\subseteq A$. Also, $|y|=j-i\ge 1$ because $i<j$, and $|xy|\le p$ because $j\le p$. ##### Using Pumping Lemma to Prove Nonregularity * **P (A is a regular language) $\Rightarrow$ Q (The Pumping Lemma Condition):** If a language $A$ is regular, then **there exists** a pumping length $p \ge 1$ such that for **every string** $w \in A$ with $|w| \ge p$, we **can** write $w = xyz$ satisfying the three conditions: * **Contrapositive Logic for Non-regularity ($\lnot Q \Rightarrow \lnot P$):** The negation of the Pumping Lemma Condition ($\lnot Q$): **For every** $p \ge 1$, **there exists** a string $w \in A$ with $|w| \ge p$ such that $w$ **cannot** be written as $xyz$ satisfying the three conditions. ##### Example: Prove $B=\{0^n1^n|n\ge 0\}$ is nonregular Assume $B$ is regular. Let $p$ be the pumping length. Select $w=0^p1^p$. This $w \in B$ and $|w|=2p \ge p$. By the Pumping Lemma, $w=xyz$ must satisfy the three conditions. Since $|xy|\le p$, $x$ and $y$ can only contain '0's. So $y=0^k$ for some $k \ge 1$. Consider $xy^2z$. This string will have $p+k$ '0's and $p$ '1's. Since $k \ge 1$, the number of '0's ($p+k$) is not equal to the number of '1's ($p$). Therefore, $xy^2z \notin B$. This contradicts the Pumping Lemma. Thus, $B$ is nonregular. For $\{w|w \text{ has an equal number of 0s and 1s}\}$, we can use a similar argument. ##### Example: Prove $C=\{0^n1^m|n\ne m\land n,m\ge 0\}$ is nonregular (Note: While $L(0^*1^*) - B = E$ can prove this using the closure property of regular languages under set difference, here's the direct Pumping Lemma proof.) Assume $E$ is regular. Let $p \ge 1$ be the pumping length. Select $w=0^p1^{p+p!}$. This $w \in E$ (since $p \ne p+p!$) and $|w|=2p+p! \ge p$. By the Pumping Lemma, $w=xyz$ must satisfy the three conditions. Since $|xy|\le p$, $x$ and $y$ can only contain '0's. Let $y=0^k$ for some $k \ge 1$. Consider $xy^iz$. The number of '0's will be $p+(i-1)k$, while the number of '1's remains $p+p!$. Choose $i = 1 + \frac{p!}{k}$. (Since $1 \le k \le p$, $k$ divides $p!$, so $i$ is an integer $\ge 1$.) For this $i$, $xy^iz = 0^{p+p!}1^{p+p!}$. In this string, the number of '0's equals the number of '1's. Therefore, $xy^iz \notin E$ (as $n=m$). This contradicts the Pumping Lemma. Thus, $E$ is nonregular. ##### Example: Prove $D=\{1^{n^2}|n\ge 0\}$ is nonregular Assume $D$ is regular. Let $p \ge 1$ be the pumping length. Select $w=1^{p^2}$. This $w \in D$ and $|w|=p^2 \ge p$. By the Pumping Lemma, $w=xyz$ must satisfy the three conditions. Since $|y|\ge 1$, $|xy^2z|=|xyz|+|y|>p^2$. Since $|xy|\le p\Rightarrow|y|\le p$, $|xy^2z|=|xyz|+|y|\le p^2+p<(p+1)^2$. Since $|xy^2z|$ is not a perfect square, $xy^2z \notin D$. This contradicts the Pumping Lemma. Thus, $D$ is nonregular. ### Takeaways * NFA、DFA 是等價的 * 一個 language 是 regular,若且唯若一個 FA recognize 該 language * 一個 language 是 regular,若且唯若一個 regular expression 描述該 language * Regular languages 在一些 Operations 下是封閉的 * Pumping Lemma 可以用來證明某些 language 是 nonregular ## 2 Context-Free Languages * Context-Free Grammar: 比 Regular Expression 更強大的描述語言的方式。 * Context-Free Language(CFL): 由 Context-Free Grammar 生成的語言,Regular Language 是 Context-Free Language 的子集。 * Pushdown Automaton(PDA): 一種有 stack 的 automaton,可以辨識 Context-Free Language。 ### Context-Free Grammar (CFG) 主要由許多**規則**組成,把 symbol 分為 **variable** (包含 start variable) 與 **terminal**,規則的形式為 $V \to w$,其中 $V$ 是 variable,$w$ 是 terminal 或 variable 的串接。 衍生或生成 (derivation, generation) 一個字串的過程是: * 寫下 start variable * 對於每個 variable,找到左邊相同的規則 * 把變數替換為右側,直到整個字串只剩下 terminal 為止 所有某 grammar $G$ 能生成的 strings 的集合稱為 language of the grammar,一樣是 $L(G)$。 能被某 context-free grammar $G$ 生成的 language 稱為 context-free language。 若同個 variable 有多個規則,可以以 | 串接,比如 $A\to 0A1 | B$ #### Formal Definition of CFG $(\mathcal{V},\Sigma, R, S)$ 是一個 context-free grammar,其中: * $\mathcal{V}$ 是 **variable** 的有限集合。 * $\Sigma$ 是 **terminal** 的有限集合,$\mathcal{V} \cap \Sigma = \varnothing$。 * $R$ 是**規則**的有限集合,每條規則的形式為 $v \to w$,其中 $v \in \mathcal{V}$,$w \in (\mathcal{V} \cup \Sigma)^*$。 * $S \in \mathcal{V}$ 是 start variable,**通常就是第一條規則左側的變數**。 以下大寫代表 variable,小寫代表 terminal。 **yield** 指替換一步,比如 $uAv$ 以 $A\to w$ 替換為 $uwv$,$uAv$ yields $uwv$,記為 $uAv \Rightarrow uwv$。 **derivation** 指替換任意步,記為 $uAv \xRightarrow{*} uwv$,表示 $uAv$ 經過任意次替換後變成 $uwv$。 因此 language of a grammar 寫為 $\{w\in\Sigma^*|S \xRightarrow{*} w \land w\text{ contains only terminals}\}$。 #### Examples of CFG * $\{0^n1^n|n\ge 0\}$: $S\to 0S1|\epsilon$ * 對 a 的加與乘運算: * $\text{<EXPR>} \to \text{<EXPR>}+\text{<TERM>} | \text{<TERM>}$ * $\text{TERM} \to \text{<TERM>}\times\text{<FACTOR>} | \text{<FACTOR>}$ * $\text{FACTOR} \to ( \text{<EXPR>} )|a$ 其中第二個例子解釋先乘後加的規則可以由生成規則定義,由生成結構判斷順序:比如 $a+a\times a$ 與 $(a+a)\times a$ 都只有一種生成方式,所以很明顯要哪個先。 #### Designing CFG * 分開定義,比如 $S\to S_1|S_2$,$S_1,S_2$ 再各自生成 language。 * 對於 regular language,可以轉換 DFA 為 CFG: * 對每個 DFA 的 state $q_i$ 建立一個 CFG 的 variable $A_i$ * 對每個 $\delta(q_i,a)=q_j$ 的 transition,加入規則 $A_i\to aA_j$ * 對每個 accept state $q_{\text{acc}}$ 加入規則 $A_{\text{acc}}\to \epsilon$ * $A_0$ 作為 start variable * 遞迴結構:以 $A\to UAV$ 等方式定義 #### Parse Tree 每個 variable 與 terminal 作為節點,以 start variable 為根,每個 variable yield 的 variable 或 terminal 為其子節點,terminal 是 leaf,通常直接畫在最下面一排。 比如上方的「對 a 的加與乘運算」: ```text <EXPR> / \ <EXPR> <TERM> | / \ | <TERM> <FACTOR> | | | | | ( <EXPR> ) | | / \ | | <EXPR> <TERM> | | | | a + a x ( a + a ) ``` #### Ambiguity of CFG 例子的第二個使加號由 $\text{<EXPR>}$ 生成,而乘號以 $\text{<TERM>}$ 生成。$a+a\times a$ 因為加法外無括號,必然由 start variable $\text{<EXPR>}$ 生成,而不是 $\text{<TERM>}$ 先生成乘號再生成加號。 如果沒有括號則同個 string 能以多種方式 derive,會覺得它是模糊的。 但注意若只是順序不同,不視為不同方式。一律由 **leftmost derivation** (每步都替換最左邊的 variable) 來避免替換順序造成的歧義。 如果同個 string 能以某 grammar 透過多種不同的 **leftmost derivation** 得到,則 the string is derived **ambiguously** in the grammar。 若對於某 grammar,存在一個 string which is derived ambiguously,則 the grammar is **ambiguous**。 有些 CFL 能由 CFG 生成,但必須由 ambiguous CFG 生成,比如 $\{0^i1^j2^k|i=j\lor j=k\}$,稱為 **inherently ambiguous**。 #### Chomsky Normal Form 特殊形式,加上限制後可更簡化,很有用處,~~雖然這學期的課程並沒有用到~~。只包含三類規則: * $A\to BC$。 * $A\to a$。 * $S\to \epsilon$。$S$ 是 start variable Start variable 不出現在任意規則的右側。 所有 CFG 都可以被轉換為 Chomsky Normal Form。以下 $W$ 是 variable 或 terminal 的字串。 * **加上新的 start variable $S'$,以及 $S'\to S$。** 能避免 start variable 在任意規則右側。 * **刪除所有 $\epsilon$ 規則 ($A\to\epsilon$) ,除了 $S\to\epsilon$** 對於其他 variable 替換為包含 $A$ 的規則 $B\to W$,對於 $W$ 中 $A$ 的集合的每個子集,加入刪除後的規則。 比如 $B\to uAvAw$ 變成 $B\to uvAw|uAvw|uvw$。 若 $B\to A$,會變成 $B\to A|\epsilon$,留到之後再處理 $B$。但若之前刪過 $B\to\epsilon$,則不必加入 $B\to\epsilon$,因為其下游都已考慮過。 * **刪除所有單一變數規則 ($A\to B$)** 對於所有 $B\to W$ 規則,加入 $A\to W$ 規則。 * **其他轉換** * 處理右側長度 $k>2$ 的規則 $A\to W_1W_2\ldots W_k$,新增一些中繼變數 $A_1\ldots A_{k-2}$,將其轉換為 * $A\to W_1A_1$ * $A_1\to W_2A_2$ * $\vdots$ * $A_{k-2}\to W_{k-1}W_k$ * 處理 terminal 出現在一類規則的右側,對 $u$ 新增一個中繼變數 $A_u$,替換右側的 $u$,並新增規則 $A_u\to u$。 ### Pushdown Automata (PDA) FA 可視為 State Control 保有狀態,有一個 arrow 指向 input tape,並且只能往右移動。 PDA 作為 FA 的修改版,再加上一個 stack 作為額外的記憶體。 Deterministic PDA 與 Nondeterministic PDA 的能力不等價,但兩者都能 recognize 所有 Context-Free Language,所以以下 PDA 指的都是 Nondeterministic PDA,稱 Deterministic PDA 為 DPDA。 #### Formal Definition of PDA > 註:設 stack 的頂端在左側。 PDA: $M=(Q, \Sigma, \Gamma, \delta, q_0, F)$ where * $Q$ 是有限狀態集合。 * $\Sigma$ 是有限集合,作為 input alphabet。 * $\Gamma$ 是有限集合,作為 stack alphabet。 * $\delta: Q \times \Sigma_\epsilon \times \Gamma_\epsilon \to P(Q \times \Gamma_\epsilon)$。 * $q_0 \in Q$ 是初始狀態。 * $F \subseteq Q$ 是 accept states 的集合。 **Computation** 課本的定義稍微怪怪的,因此我自己寫。 Let: * $M=(Q, \Sigma, \Gamma, \delta, q_0, F)$ be a PDA. * $w = a_1 a_2 \ldots a_m$ be a string over $\Sigma$. $M$ accepts $w$ $\iff$ there exists a sequence of **instantaneous descriptions (IDs)** $(q_0, w_0, t_0), (q_1, w_1, t_1), \ldots, (q_n, w_n, t_n)\in Q\times \Sigma^*\times \Gamma^*$ such that: * **Initial ID:** $q_0$ is the start state, $w_0 = w$ is the full input string, and $t_0=\epsilon$ (Initial stack is empty)。 * **Transition:** For each $i \in [0, n-1]$, $(q_i, w_i, t_i) \Rightarrow (q_{i+1}, w_{i+1}, t_{i+1})$ must satisfy the following: * Let $t_i = x t'$, where $x \in \Gamma_\epsilon$. * Let $w_i = c w'$, where $c \in \Sigma_\epsilon$. * There must exist a pair $\delta(q_i, c, x)\ni(q_{i+1}, y)$ such that $w_{i+1}=w'$ ($c$ is read) and $t_{i+1}=yt'$(stack: $x t'$ to $y t'$). * **Final ID:** * $w_n=\epsilon$ (**the entire input string has been consumed**) * $q_n \in F$ (in an accepting state). * **There is no requirement on the final stack $t_n$.** 在圖示時,會在邊上標記 $a,x\to y$,代表讀取 $a$,stack 頂端要是 $x$,移除 $x$ 並將 $y$ 放到 stack 頂端。 ##### Shorthand for PDA 使 $\delta$ 能 push string,由右而左放入 stack,方便表示 transition。對於簡寫 $\delta(q,a,s)\ni (r,w),\ w=a_1a_2\ldots a_n$,新增 state $q_1\ldots q_{n-1}$,以及 transition: * $\delta(q,a,s)\ni (q_1,a_n)$ * $\delta(q_1,\epsilon,$<span class="red">$\ \epsilon$</span>$)\ni (q_2,a_{n-1})$ * $\vdots$ * $\delta(q_{n-1},\epsilon,$<span class="red">$\ \epsilon$</span>$)\ni (r,a_1)$ 或是 $q\xrightarrow{a,s\to r}r$ 變成:$q\xrightarrow{a,s\to a_n}q_1\xrightarrow{\epsilon,\epsilon\to a_{n-1}}q_2\xrightarrow{\epsilon,\epsilon\to a_{n-2}}\cdots\xrightarrow{\epsilon,\epsilon\to a_1}r$。 實際上維基上的定義及其他課本會使 transition 能直接 push 一個字串進 stack,能用 shorthand 所以是等價的。 另外,因為狀態是有限的,所以自然能 push 的 string 必須是有限的。 #### Equivalence of PDA and CFG A language is context-free $\iff$ some PDA recognizes it. $\Rightarrow$: Proof by Construction. Given a context-free grammar $G=(\mathcal{V},\Sigma, R, S)$, we can construct a PDA $M=(Q, \Sigma, \Gamma, \delta, q_0, F)$ that recognizes the language $L(G)$. * $Q=\{q_{\text{start}},q_{\text{loop}},q_{\text{acc}}\}\cup E$,其中 $E$ 是用於 shorthand 新增的額外狀態。 * $F=\{q_{\text{acc}}\}$。 * $\delta(q_{\text{start}},\epsilon,\epsilon)\ni (q_{\text{loop}},S\$)$ 一開始放入 $\$$,確保 accept 時 stack 實質上為空。 放入 $S$ 啟動 derivation。 * $\delta(q_{\text{loop}},\epsilon,A)=\{(q_{\text{loop}},w)|A\to w\text{ is a rule in }R\}$ 依規則展開 $A$。 * $\delta(q_{\text{loop}},a,a)=\{(q_{\text{loop}},\epsilon)\}$ 讀取 input,若與 stack 頂端的 terminal 相同,則繼續比對。 * $\delta(q_{\text{loop}},\epsilon,\$)=\{(q_{\text{acc}},\epsilon)\}$ Input 與 stack 都讀完,代表輸入與 $S$ derive 的其中一個字串相同。 $\Leftarrow$: Proof by Construction. Given a PDA $M=(Q, \Sigma, \Gamma, \delta, q_0, F)$, we can construct a context-free grammar $G=(\mathcal{V},\Sigma, R, S)$ that generates the language $L(M)$. 首先介紹 Special Form of PDA,以及經過一些看起來很蠢的轉換後把任何 PDA 轉換成這個形式: * **在 Accept 時 stack 必須是空的** * 新增 state $q_{\text{pop}}$ * 從每個原本的 accept state 新增 transition $\epsilon,\epsilon\to\epsilon$ 到 $q_{\text{pop}}$ * $\forall x \in \Gamma,\ \delta(q_{\text{pop}},\epsilon,x)\ni(q_{\text{pop}},\epsilon)$,使 $q_{\text{pop}}$ 能把 stack pop 到空 * **只有一個 accept state** * 新增 start state 與 $S'\xrightarrow{\epsilon,\epsilon\to \$}S$ * 新增 accept state 與 $q_{\text{pop}}\xrightarrow{\epsilon,\$\to \epsilon}q_{\text{acc}}$,順便確保 stack 排空 * **每個 transition 必須 push 或 pop 一個字元** * 把 $p\xrightarrow{a,b\to c}q$ 替換為 $p\xrightarrow{a,b\to \epsilon}s\xrightarrow{\epsilon,\epsilon\to c}q$,$s$ 是新增狀態 * 把 $p\xrightarrow{a,\epsilon\to \epsilon}q$ 替換為 $p\xrightarrow{a,\epsilon\to b}s\xrightarrow{\epsilon,b\to \epsilon}q$,$s$ 是新增狀態,$b$ 是任意的 stack symbol 對於 $M$ 的每對 states $p,q$,Let $A_{pq}\in \mathcal{V}$,想要使 $A_{pq}$ derives $x \iff$ x 能使機器從 ($p$, empty stack) 走到 ($q$, empty stack)。 這些字串 $\forall w\in \Gamma^*$ 也能使機器從 ($p$, $w$) 走到 ($q$, $w$)。 Construction for $A_{pq}$: 任意 $A_{pp}$ 不經任何轉移可產生 $\epsilon$,即 $A_{pp}\to \epsilon$。 任意 $A_{pq}$ 產生的非空字串 $x$ 在 special form of PDA 上第一步必然是 push,且最後一步是 pop,以維持空 stack,可以分成 push 與 pop 的字元是否相同的兩種情況: * **兩者相同:假設 $p\xrightarrow{a}r\cdots s\xrightarrow{b}q$,則新增規則 $A_{pq}\to aA_{rs}b$** $\forall p,q,r,s\in Q,\ u\in \Gamma,\text{ and }a,b\in \Sigma_\epsilon$, if $\delta(p,a,\epsilon)\ni (r,u) \land \delta(s,b,u)\ni (q,\epsilon)$, put $A_{pq}\to aA_{rs}b$ in $G$. 如果 $A_{rs}$ 是 $\varnothing$ 也沒差,反正是 Nondeterministic。 * **兩者不同:在路徑上必然在經過其中一個 state $r$ 時 stack 是空的,新增規則 $A_{pq}\to A_{pr}A_{rq}$** $\forall p,q,r\in Q$, put $A_{pq}\to A_{pr}A_{rq}$ in $G$. 一樣,如果 $A_{pr}$ 或 $A_{rq}$ 是 $\varnothing$ 也沒差。 兩個方向的證明都是很 trivial 的 induction。 **Regular languages 是 context-free languages 的子集**,因為任何 FA 都自動是 PDA,只要完全不使用 stack 即可。 ### Non-Context-Free Languages Some Non-Context-Free Languages: $\{a^n b^n c^n | n \geq 0\}, \{a^ib^jc^k|0\le i\le j\le k\}, \{ww|w\in \{0,1\}^*\}$. #### The Pumping Lemma for Context-Free Languages (CFL) > 括號內是 Pumping Lemma for RL $A$ is a context-free language $\Rightarrow$ There exists a **pumping length** $p\ge 1$ such that every string $s\in A$ with $|s|\ge p$ can be written as $s=wvxyz$ ($=xyz$), satisfying: * $\forall i\ge 0,\ wv^ixy^iz\in A$ ($xy^iz\in A$) * $|vy|\ge 1$ ($|y|\ge 1$) * $|vxy|\le p$ ($|xy|\le p$) 證明: 設 $G$ 為一個 CFG,其生成語言為 CFL $A$。令 $b$ 為 $G$ 中所有規則右邊部分的 symbol 數的最大值。若 $b=1$,則 $A$ 為 alphabet 的子集,對於生成的 $s$,令 $v=s$ 且 $w=x=y=z=\epsilon$ 就滿足條件。以下假設 $b \ge 2$。 在 parse tree 中,是已經 Nondeterministically 從眾多替換選項中選出一個,因此所有節點都至多有 $b$ 個子節點。對於一個高度 $h$ (深度定義改為由 start variable 走幾步,相當於邊的數量、一般定義的高度減一),至多有 $b^h$ 個葉節點,也就是最終的 terminal 數量、字串長度。換句話說,至少 $b^h+1$ 長的字串必須來自高度至少 $h+1$ 的 parse tree。 設 $p=b^{|\mathcal{V}|+1}$,對於某字串 $s\in A, |s|\ge p$,其任意 parse tree 的高度至少 $|\mathcal{V}|+1$,取其中**節點數最少**的。 在此 parse tree 中,至少有個葉節點,其從根節點到葉節點的路徑長至少為 $|\mathcal{V}|+1$。取此路徑上最底下的 $|\mathcal{V}|+2$ 個節點,也就是 $|\mathcal{V}|+1$ 個 variable 與最後的 terminal。在 variables 中,根據鴿籠原理,必然有某個 variable $A$ 重複出現,稱上方的為 $A_u$,下方的為 $A_d$。 使 $A_d$ 的子樹葉節點為 $x$,$A_u$ 的子樹葉節點為 $vxy$,$s$ 的剩餘部分給 $w,z$。以下分別檢測三個條件: * $\forall i\ge 0,\ wv^ixy^iz\in A$ * 對於 $i>1$,把 $A_d$ 的子樹遞迴地替換為 $A_u$ 的子樹。 * 對於 $i=0$,把 $A_u$ 的子樹替換為 $A_d$ 的子樹,變成 $wxz$。 * $|vy|\ge 1$ * 此條件僅在 $v=y=\epsilon$ 時不成立,但此時 $A_u$ 子樹葉節點與 $A_d$ 子樹葉節點所代表的字串相同。若以 $A_d$ 的子樹替換 $A_u$ 的子樹,則總節點數變少,與節點數最少的假設矛盾。 * $|vxy|\le p$ * 由於代表 $vxy$ 的 $A_u$ 的子樹高度至多為 $|\mathcal{V}|+1$,因此至多有 $b^{|\mathcal{V}|+1}=p$ 個葉節點。 #### Example: Prove $A=\{0^n1^n2^n|n\ge 0\}$ is non-CFL For every $p\ge 1$, select $s=0^p1^p2^p$. * 若 $v, y$ 都只有一種 symbol,則必然 $wv^2xy^2z$ 的三種 symbol 數量不同。 * 若 $v, y$ 其一包含兩種 symbol,則 $wv^2xy^2z$ 的順序不對。 * 若 $v, y$ 其一包含三種 symbol,則除了順序不對以外,其長度至少 $p+2$,違反 $|vxy|\le p$。 ## 3 The Church-Turing Thesis 判斷能否以 algorithm 解決的問題,等價於能否被 Turing Machine 解決的問題。Turing Machine 更接近現代電腦,有 memory 而不是 stack。 ### Turing Machine Memory 是一個無限長的 tape,指針可以左右移動、讀寫。與 PDA 一樣可能停不下來。特別的是,就算在 input 的中間,只要進到 accept 或 reject state 就會停下來。 #### Formal Definition of Turing Machine A Turing Machine is a 7-tuple $(Q, \Sigma, \Gamma, \delta, q_0, q_{\text{accept}}, q_{\text{reject}})$ where: * $Q$ is a finite set of states. (The same) * $\Sigma$ is the input alphabet **not containing the blank symbol $\sqcup$**. * $\Gamma$ is the tape alphabet, where $\sqcup \in \Gamma$ and $\Sigma \subseteq \Gamma$. * $\delta: Q \times\,$<span class="red">$\Gamma$</span>$\ \to Q \times\,$<span class="red">$\Gamma \times \{L,R\}$</span> is the transition function. * $q_0 \in Q$ is the start state. * $q_{\text{accept}} \in Q$ is the accept state. * $q_{\text{reject}} \in Q$ is the reject state, where $q_{\text{reject}} \ne q_{\text{accept}}$. #### Computation of Turing Machine * 首先把輸入放在 tape 上的左側,其他位置為 blank symbol $\sqcup$。這也是輸入不能有 blank symbol 的原因,否則不知道是內容還是字串結束。 * 指針指向 tape 的最左側。 * 根據 $\delta$ 進行 transition。唯一的例外是在最左端時,若要往左移動則停在原地。 * 直到進入 accept 或 reject state 為止,稱為 halt。也有可能停不下來。 #### Formal Computation of Turing Machine ##### Configuration 使用 configuration 作為狀態,包含:current state($q\in Q$), tape contents, head position。$u\in \Gamma^*$ 是目前指針前的字串,$v\in \Gamma^*$ 是指針後的字串(含目前指向的字元),但只包含到最後一個非 blank symbol 為止。寫成:$u\ q\ v$。 ##### Yielding Suppose * $a,b,c \in \Gamma$ * $u,v \in \Gamma^*$ (Can be empty) * $q_i, q_j$ are states it writes and moves: * $\delta(q_i,b) = (q_j,c,L)$: $ua\ q_i\ bv \to ua\ q_i\ cv \to u\ q_j\ acv$ * $\delta(q_i,b) = (q_j,c,R)$: $ua\ q_i\ bv \to ua\ q_i\ cv \to uac\ q_j\ v$ And special cases: * left end: $\delta(q_i,b) = (q_j,c,L)$: $q_i\ bv \to q_i\ cv \to q_j\ c v$ * $ua\ q_i$ is equivalent to $ua\ q_i\ \sqcup$, always has blank symbols to read or move to on the right. ##### More about Configuration * Initial configuration on input $w$: $q_0\ w$ * In accepting and rejecting configurations, $q=q_{\text{accept}}$ and $q=q_{\text{reject}}$ respectively. They are halting configurations and do not yield to any other configuration. ##### Computation $\delta$ is defined equivalently as $Q' \times \Gamma \to Q \times \Gamma \times \{L,R\}$ where $Q' = Q - \{q_{\text{accept}}, q_{\text{reject}}\}$. We can simply let $q_{\text{accept}}$ and $q_{\text{reject}}$ only have transitions to the same state, so they are equivalent. A turing machine $M$ accepts input $w$ if there exists a sequence of configurations $C_1, C_2, \ldots, C_k$ such that: * $C_1$ is the initial configuration on input $w$. * For each $i \in [1, k-1]$, $C_i \to C_{i+1}$. * $C_k$ is an accepting configuration. #### The language of a Turing Machine ##### Turing-recognizable Definition of recognizing is the same, but three outcomes are possible: accept, reject, or **loop**(doesn't halt). A language is **Turing-recognizable** (or recursively enumerable) if some Turing machine recognizes it. ##### Turing-decidable It's hard to say if a machine is looping or just taking a long time, so we prefer machines that halt on all inputs. Turing machines are **deciders** if they halt on all inputs. A Turing machine $M$ **decides** a language $A$ if * **$M$ is a decider**. * $M$ recognizes $A$. In other words, for any input string $w$: * If $w \in A$, then $M$ accepts $w$. * If $w \notin A$, then $M$ rejects $w$. A language is **Turing-decidable** (or recursive, decidable) if some Turing machine decides it. By definition, a language is Turing-decidable $\Rightarrow$ it's Turing-recognizable, so the set of Turing-decidable languages is a **proper subset** of the set of Turing-recognizable languages. We mainly focus on **Turing-decidable languages**. #### Examples of Turing Machines ##### Example: A Turing Machine that decides $\{0^{2^n}|n\ge 0\}$ ```text if number of 0s is 0: reject while 1: if number of 0s is 1: accept elif number of 0s is odd: reject else: half the number of 0s ``` Actually we need to count the number of 0s while halving: 1. Sweep left to right trying to **cross off** every second 0 (counting 0s at the same time) 2. **Accept** if there is only one 0 3. **Reject** if there is an odd number of 0s (of course not 1) 4. Return to the left end and repeat step 1 Actual design: $\Sigma=\{0\}$, $\Gamma=\{0,\mathsf{X},\sqcup\}$ States and transitions: > $q\xrightarrow{a\to \{R,L\}}q'$ means $q\xrightarrow{a\to a,\{R,L\}}q'$, not changing the tape content. * $q_1$: Start state * $\xrightarrow{0\to \sqcup,R}q_2$: 左端的 $\sqcup$ 標示第一個 0,作為最左端的標記,它本質還是 0。 * $\xrightarrow{\mathsf{X}\to R}q_{\text{reject}}$: not in $\Sigma$. * $\xrightarrow{\sqcup\to R}q_{\text{reject}}$: input 沒有 0,reject。 * $q_2$: **After** reading the first 0 and several x's. (odd) * $\xrightarrow{0\to \mathsf{X},R}q_3$: crossing off every second 0. * $\xrightarrow{\mathsf{X}\to R}q_2$: skip x's. * $\xrightarrow{\sqcup\to R}q_{\text{accept}}$: reach the right end with only one 0 (The $\sqcup$ at the left end), accept. * $q_3$: **After** reading even number of 0s and several x's. (even) * $\xrightarrow{0\to R}q_4$: skip the next 0. * $\xrightarrow{\mathsf{X}\to R}q_3$: skip x's. * $\xrightarrow{\sqcup\to L}q_5$: reach the right end with even number of 0s, go back to left end. * $q_4$: **After** reading odd number ($\ge 3$) of 0s and several x's. (odd) * $\xrightarrow{0\to \mathsf{X},R}q_3$: crossing off every second 0. * $\xrightarrow{\mathsf{X}\to R}q_4$: skip x's. * $\xrightarrow{\sqcup\to R}q_{\text{reject}}$: reach the right end with odd number of 0s, reject. * $q_5$: Moving back to the left end, and move right to enter $q_2$. * $\xrightarrow{0\to L}q_5$: move left through 0s. * $\xrightarrow{\mathsf{X}\to L}q_5$: move left through x's. * $\xrightarrow{\sqcup\to R}q_2$: reach the left end, move right to enter $q_2$. > Turing Machines here are deterministic as DFA, $\delta$ is a **total function** (each input in domain has its output). ##### Example: A Turing Machine that decides $\{w\#w|w \in \{0,1\}^*\}$ > $\{w\#w^R|w \in \{0,1\}^*\}$ or $\{ww^R|w \in \{0,1\}^*\}$ are simply recognized by a PDA, but not this one. ```text while 1: find the first unmarked symbol before # if no unmarked symbol before #: check if there is any unmarked symbol after # if yes: reject else: accept mark the symbol find the corresponding unmarked symbol after # if not found: reject mark the symbol ``` States and transitions (red means rejecting cases): * $q_1$: Start state: pointing to the first unmarked symbol before #. * $\xrightarrow{0\to \mathsf{X},R}q_{01}$: mark the first unmarked 0 before #. * $\xrightarrow{1\to \mathsf{X},R}q_{11}$: mark the first unmarked 1 before #. * $\xrightarrow{\#\to R}q_4$: no unmarked symbol before #, check after #. * <span class="red">$\mathsf{X}, \sqcup$: impossible</span> * $q_{01}$: After marking 0 before #, find corresponding unmarked 0 after #. * $\xrightarrow{0,1\to R}q_{01}$: skip unmarked * $\xrightarrow{\#\to R}q_{02}$: reach # * <span class="red">$\mathsf{X}, \sqcup$: impossible</span> * $q_{02}$: After reaching # from $q_{01}$, find unmarked 0 after #. * $\xrightarrow{0\to \mathsf{X},L}q_2$: found unmarked 0, mark it and go back. * $\xrightarrow{\mathsf{X}\to R}q_{02}$: skip marked * <span class="red">$1$: not match</span> * <span class="red">$\sqcup$: the second part is shorter</span> * <span class="red">$\#$: impossible</span> * $q_{11}, q_{12}$: Similar to $q_{01}, q_{02}$ but for 1. * $q_2$: Moving back to the left end to enter $q_1$. * $\xrightarrow{0,1,\mathsf{X}\to L}q_2$: move left * $\xrightarrow{\#\to L}q_2$: move left * <span class="red">$\sqcup$: impossible</span> * $q_3$: Continue to move left to the left end. * $\xrightarrow{0,1\to L}q_3$: move left * $\xrightarrow{\mathsf{X}\to R}q_1$: return to $q_1$ * <span class="red">$\#, \sqcup$: impossible</span> * $q_4$: After reaching # from $q_1$, check if there is any unmarked symbol after #. * $\xrightarrow{\mathsf{X}\to R}q_4$: skip marked * $\xrightarrow{\sqcup\to R}q_{\text{accept}}$: no unmarked symbol after #, accept. * <span class="red">0, 1: the second part is longer</span> * <span class="red">$\#$: impossible</span> ##### Example: A Turing Machine that decides $\{a^ib^jc^k|i\times j=k\land i,j,k\ge 1\}$ ```text scan the input to see if it is in the form a^+b^+c^+ while 1: cross off the first a if no a to cross off: check if there is any unmarked c if yes: reject (too many c's) else: accept else: while any b remains: cross off the first b cross off the first c if no c to cross off: reject (too few c's) fill back all b's return to the left end ``` ##### Example: A Turing Machine that decides $\{\#x_1\#x_2\ldots\#x_n|\text{ each } x_i\in\{0,1\}^*\land x_i\ne x_j \text{ for each }i\ne j\}$ ```text for i in range(1,n+1): if i+1 > n: accept (no more x_j to compare) for j in range(i+1,n): if x_i == x_j: reject ``` Idea: What's important is how to remember which $x_i$ and $x_j$ are being compared. We use $\dot\#$ to mark $x_i$ and $x_j$ being compared, then compare them bit by bit. At the end, $i=n$ and we cannot find any $\#$ after the $\dot\#$, so we accept. ### Variants of Turing Machine Similar to DFA and NFA, some variants of Turing Machine are equivalent in power. > Invariance to certain changes in the definition is called **robustness**. A simple variant is allowing to stay in the same cell. Since we can move right then left (on any symbol and not changing tape content), it's equivalent. #### Multitape Turing Machine k tapes. Each tape has its own head and can read/write/move independently. $\delta: Q \times \Gamma^k \to Q \times \Gamma^k \times \{L,R\}^k$. The input is placed on the first tape, others are blank. It's equivalent to single-tape Turing Machine. A single-tape TM is automatically a multitape TM that doesn't use extra tapes. It remains to prove the other direction. ##### Proof for converting a multitape TM $M$ to a single-tape TM $S$ * $S$ has more states to simulate $M$. * $S$ uses a special symbol $\#$ to separate the contents of different tapes. * $S$ uses special symbols (dot on each symbol) to mark the head position on each tape. * $S$ setups $\#\dot{w_1}w_2\ldots w_k\#\dot\sqcup\#\dot\sqcup\#\ldots\#$ on its tape. * $S$ simulates each transition of $M$ by: * First scan: Scanning its tape to read the symbols under each head (This requires many extra states). * Second scan: Scanning its tape again to perform the write and move operations (This also requires many extra states). * In addition, if $S$ needs to write a symbol at the end of a tape, it shifts the contents to the right to make space. (This also requires many extra states.) #### Nondeterministic Turing Machine Similar to NFA. $\delta: Q \times \Gamma \to P(Q \times \Gamma \times \{L,R\})$ ##### Proof for simulating a NTM $N$ by a Multitape DTM $D$ Tapes of $D$: * Tape 1 contains the input and is read-only. * Tape 2 is used to simulate the work tape of $N$. * Tape 3 is used to iterate through all possible computation paths of $N$. We cannot use DFS, so use BFS to simulate $N$'s computation. Suppose $b$ is the size of the largest set in the range of $\delta$ of $N$. Tape 3 contains a sequence of numbers in $\{1,2,\ldots,b\}$, representing the choices made at each step of $N$'s computation. Tape 3's contents iterate through shortlex order of $\{1,2,\ldots,b\}^*$ for each round of simulation (e.g. $(\epsilon,1,2,11,12,\ldots)$ for $b=2$). * Init * Set input on Tape 1. * Set Tape 2,3 to be empty. * Each branch * Copy Tape 1 to Tape 2. * Use Tape 2 to simulate $N$'s computation. * Use Tape 3 to determine which transition to take at each step. * Try next branch if: 1. No more symbol on Tape 3. We arrive the desired depth, but not found an accept state. 2. The choice on Tape 3 is invalid ($\delta$ returns empty set). We cannot proceed further. 3. Reach an reject state. * Accept if reach an accept state. * Increment Tape 3 to try the next branch. ##### Turing-Recongizable and Decidable A language is Turing-recognizable $\iff$ some NTM recognizes it. (Shown above) A language is Turing-decidable $\iff$ some NTM decides it. A NTM is a decider $\iff$ all its computation branches halt on all inputs. So we need to modify the above simulation slightly so that if $N$ is a decider, $D$ also halts on all inputs. #### Enumerators 等價於 2-tape TM,其中一個作為 output tape。 從空 input 開始 compute,output tape 上以特殊符號 $\#$ 隔開的字串集合就是這個 enumerator 所 enumerates 的語言,有可能不 halt,所以有無限的 language。 Some enumerator enumerates language $A$ $\iff$ some Turing machine recognizes $A$. Proof idea: * $\Rightarrow$: Given an enumerator $E$ for language $A$, construct a TM $M$ by simulating $E$ and compare the input with each output string. Accept if a match is found. * $\Leftarrow$: Given a TM $M$ that recognizes language $A$, construct an enumerator $E$ by generating all strings in $\Sigma^*$ in some order and simulating $M$ on each string. Output the string if $M$ accepts it. #### Other Models of Computation Turing Machine 最重要的特性是 unrestricted access to unlimited memory,與 FA、PDA 不同。 另外也有很多 Computaional models,但幾乎都是等價於 Turing Machine。 ### The Definition of Algorithm #### Hilbert's Problems 1900 年提出,其中第 10 題是: > Given a Diophantine equation with any number of unknown quantities and with rational integral numerical coefficients: To devise a process according to which it can be determined in a finite number of operations whether the equation is solvable in rational integers. rational integral: 整數, Diophantine equation: 整係數多項式方程式,finite number of operations: algorithm。 給定一個多變數整係數多項式方程式,是否存在一個演算法能在有限步驟內判斷它是否有整數解? 把多項式 somehow encode 成 string,所有多項式可以視為 language: * $D=\{\langle p\rangle|p\text{ is a ... polynomial with an integral root}\}$ * Turing-recognizable but not decidable. * $D_1=\{\langle p\rangle|p\text{ is a ... polynomial over x with an integral root}\}$ * Turing-decidable. Turing Machines: * $M_1$ that recognizes $D_1$ but is not a decider: * Try $0,\pm 1,\pm 2,\ldots$ one by one, accept if find an integral root. * Doesn't halt if no integral root. * $M$ that recognizes $D$ but is not a decider: * Similarly, try all possible integer assignments to all variables one by one, accept if find an integral root. * Doesn't halt if no integral root. * $M_1'$ that decides $D_1$: * 若 $|x|\ge k\cdot c_{\text{max}}/c_1$ (k: number of terms, $c_{\text{max}}$: max absolute coefficient, $c_1$: leading coefficient),則 $\left|c_1 x^k\right|>\left|\sum_{i=0}^{k-1} c_{\text{max}}x^i\right|$,所以不可能是根。 * 嘗試範圍內的整數即可。 * 但多變數時沒有這個性質。 #### Terminology for Describing Turing Machines 實際上不太可能把詳細的指針如何移動全部以 Formal definitiion 描述出來,因此會用 pseudocode 或是更低階一點把大致的流程寫出來。 ## 4 Decidability ### 4.1 Decidable Languages #### Decidable Problems Concerning Regular Languages **$A_{\text{DFA}}$ (Acceptance problem for DFAs)**: $$A_{\text{DFA}} = \{\langle B, w \rangle | B \text{ is a DFA that accepts string } w\}$$ * 每個 string 是 DFA $B$ 與 string $w$ 的 encoding。 * 若 input 不符合 encoding 格式,TM reject。 * 若符合格式,檢查 DFA $B$ 是否 accept $w$。 **Theorem 4.1**: $A_{\text{DFA}}$ is decidable. **Proof**: 設計 TM $M$ 來 decide $A_{\text{DFA}}$: * $M$ 的 input: $\langle B, w \rangle$ (encoding of DFA $B$ and string $w$) * $M$ **simulate** $B$ on input $w$ * 若 simulation 在 accept state 結束 → $M$ accept $\langle B, w \rangle$ * 若 simulation 在 non-accept state 結束 → $M$ reject $\langle B, w \rangle$ 注意區分: * $M$ 的 input string: $\langle B, w \rangle$ * $B$ 的 input string: $w$ --- **$A_{\text{NFA}}$ (Acceptance problem for NFAs)**: $$A_{\text{NFA}} = \{\langle B, w \rangle | B \text{ is an NFA that accepts string } w\}$$ **Theorem 4.2**: $A_{\text{NFA}}$ is decidable. **Proof**: 設計 TM $N$ 來 decide $A_{\text{NFA}}$: 1. Convert NFA $B$ to equivalent DFA $C$ 2. Run TM $M$ (from Theorem 4.1) on input $\langle C, w \rangle$ 3. 若 $M$ accepts → $N$ accepts;若 $M$ rejects → $N$ rejects 可以把 $M$ 和 $N$ 想成 functions: * $N$ 的 input: $(B, w)$ * $N$ 執行: $C \leftarrow \text{Convert}(B)$,然後 call $M(C, w)$ * $N$ return $M$ 的結果 --- **$A_{\text{REX}}$ (Acceptance problem for regular expressions)**: $$A_{\text{REX}} = \{\langle R, w \rangle | R \text{ is a regex that generates string } w\}$$ **Theorem 4.3**: $A_{\text{REX}}$ is decidable. **Proof**: 設計 TM $P$ 來 decide $A_{\text{REX}}$: 1. Convert regex $R$ to equivalent NFA $A$ 2. Run TM $N$ on input $\langle A, w \rangle$ 3. Return $N$ 的結果 --- **$E_{\text{DFA}}$ (Emptiness testing for DFAs)**: $$E_{\text{DFA}} = \{\langle A \rangle | A \text{ is a DFA and } L(A) = \varnothing\}$$ **Theorem 4.4**: $E_{\text{DFA}}$ is decidable. **Proof**: 設計 TM $T$ 來 decide $E_{\text{DFA}}$: 1. Mark start state of $A$ 2. Repeat: mark any state reachable from marked states (類似 marking algorithm) 3. 直到 no new states get marked 4. 若 no accept state is marked → $T$ accepts (language is empty) 5. 若 any accept state is marked → $T$ rejects (language is not empty) --- **$EQ_{\text{DFA}}$ (Equivalence testing for DFAs)**: $$EQ_{\text{DFA}} = \{\langle A, B \rangle | A, B \text{ are DFAs and } L(A) = L(B)\}$$ **Theorem 4.5**: $EQ_{\text{DFA}}$ is decidable. **Proof**: 設計 TM $F$ 來 decide $EQ_{\text{DFA}}$: 1. Construct DFA $C$ such that $L(C) = (L(A) \cap \overline{L(B)}) \cup (\overline{L(A)} \cap L(B))$ * $C$ accepts strings accepted by **either** $A$ or $B$ but **not both** * 比照 regular languages 的 closure properties (union, intersection, complement) 的證明,使用 cross-product construction,accept state 定義為:exactly one of $A$, $B$ is in accept state 2. Run TM $T$ on input $\langle C \rangle$ 3. 若 $T$ accepts (即 $L(C) = \varnothing$) → $F$ accepts (即 $L(A) = L(B)$) 4. 若 $T$ rejects → $F$ rejects **關鍵**: $L(C) = \varnothing \Leftrightarrow L(A) = L(B)$ #### Decidable Problems Concerning Context-Free Languages **$A_{\text{CFG}}$ (Acceptance problem for CFGs)**: $$A_{\text{CFG}} = \{\langle G, w \rangle | G \text{ is a CFG that generates string } w\}$$ **Theorem 4.7**: $A_{\text{CFG}}$ is decidable. **Proof**: 設計 TM $S$ 來 decide $A_{\text{CFG}}$: 1. Convert $G$ to equivalent grammar in **Chomsky Normal Form** (CNF) 2. 若 $|w| = n$,則任何 derivation of $w$ 恰有 $2n - 1$ steps * 若 $n = 0$ (empty string),則檢查 1 step derivation 3. List all derivations with $2n - 1$ steps 4. 若任何 derivation generates $w$ → $S$ accepts 5. 若沒有 → $S$ rejects **關鍵**: CNF 保證 derivation steps 有界,所以可以窮舉。若是一般的形式,TM 可能不會 halt。 --- **$E_{\text{CFG}}$ (Emptiness testing for CFGs)**: $$E_{\text{CFG}} = \{\langle G \rangle | G \text{ is a CFG and } L(G) = \varnothing\}$$ **Theorem 4.8**: $E_{\text{CFG}}$ is decidable. **Proof**: 設計 TM $R$ 來 decide $E_{\text{CFG}}$: 1. Mark all terminal symbols in $G$ 2. Repeat: mark variable $A$ if $A \to u$ where all symbols in $u$ are already marked, and mark variable $A$ if $A \to BC$ where $B,C$ are already marked 3. 直到 no new variables get marked 4. 若 start variable is marked → $R$ rejects (can generate some string) 5. 若 start variable is not marked → $R$ accepts (cannot generate any string) **關鍵**: 從結果開始倒推是否能產生全部都是 terminal symbols 的字串。 如果使用上一個 TM $S$,則有無限的字串需要檢查,無法 halt。 --- **$EQ_{\text{CFG}}$ (Equivalence testing for CFGs)**: $$EQ_{\text{CFG}} = \{\langle G, H \rangle | G, H \text{ are CFGs and } L(G) = L(H)\}$$ **問題**: $EQ_{\text{CFG}}$ is **NOT decidable** (will prove in Section 5.2)! 與 regular languages 不同,CFLs **not closed under complement or intersection**,無法用類似 $EQ_{\text{DFA}}$ 的方法。 --- **Theorem 4.9**: Every context-free language is decidable. **Proof**: Let $G$ be a CFG for CFL $A$. 設計 TM $M_G$ 來 decide $A$: * $M_G$ 的 input: string $w$ * $M_G$ call TM $S$ (from Theorem 4.7) on input $\langle G, w \rangle$ * 若 $S$ accepts → $M_G$ accepts * 若 $S$ rejects → $M_G$ rejects **關鍵**: 利用 $A_{\text{CFG}}$ decidable 的結果。 **不可行的方法**: Convert PDA to TM * PDA 是 nondeterministic,而且某些 branches 可能 loop forever * 轉換後的 TM 也可能有 nondeterministic branches loop forever * 因此轉換後的 TM 不是 decider ### 4.2 Undecidability #### The Undecidable Language $A_{\text{TM}}$ **$A_{\text{TM}}$ (Acceptance problem for TMs)**: $$A_{\text{TM}} = \{\langle M, w \rangle | M \text{ is a TM and } M \text{ accepts } w\}$$ **性質**: * $A_{\text{TM}}$ is **Turing-recognizable** but **undecidable** **為何 Turing-recognizable?** 設計 TM $U$ (Universal TM): * Input: $\langle M, w \rangle$ * Simulate $M$ on input $w$ * 若 $M$ accepts → $U$ accepts * 若 $M$ rejects → $U$ rejects * 若 $M$ loops on $w$ → $U$ also loops (無法 decide) **為何 undecidable?** 無法判斷 $M$ 是否會在 $w$ 上 halt 還是只是跑很久,這就是著名的 **Halting Problem**。 #### Diagonalization Method 在證明 $A_{\text{TM}}$ undecidable 之前,先介紹 diagonalization method。 **定義 Review (Functions)**: * **One-to-one (Injective)**: $f: A \to B$ 若 $f(a_1) \neq f(a_2)$ whenever $a_1 \neq a_2$ * 每個 $b \in B$ 最多被 map 一次 * **Onto (Surjective)**: $f: A \to B$ 若 $\forall b \in B, \exists a \in A$ s.t. $f(a) = b$ * 每個 $b \in B$ 至少被 map 一次 * **Correspondence (Bijective)**: Both one-to-one and onto * 每個 $b \in B$ 恰好被 map 一次 **定義 (Same Size)**: $A$ and $B$ have the **same size** if $\exists$ correspondence $f: A \to B$ **範例**: $\mathbb{N}$ (natural numbers) 和 $E$ (even natural numbers) have the same size * $\mathbb{N} = \{1, 2, 3, 4, 5, \ldots\}$ * $E = \{2, 4, 6, 8, 10, \ldots\}$ * Correspondence: $f(n) = 2n$ * 雖然 $E \subset \mathbb{N}$,但根據定義它們 same size **定義 (Countable)**: * A set $A$ is **countable** if: * $A$ is finite, OR * $A$ has the same size as $\mathbb{N}$ * An infinite set is **uncountable** if 不存在 correspondence with $\mathbb{N}$ --- **範例**: $\mathbb{Q}^+$ (positive rational numbers) is countable * 用 zigzag pattern 列舉:$\frac{1}{1}, \frac{2}{1}, \frac{1}{2}, \frac{3}{1}, \frac{1}{3}, \frac{2}{2}, \ldots$ * 可以跳過重複的(如 $\frac{2}{2} = \frac{1}{1}$) * 可以建立 correspondence with $\mathbb{N}$ --- **Theorem**: $\mathbb{R}$ (real numbers) is uncountable **Proof** (Diagonalization): Assume $\exists$ correspondence $f: \mathbb{N} \to \mathbb{R}$,列出: * $f(1) = 1.\textbf{1}4514\ldots$ * $f(2) = 1.9\textbf{1}9810\ldots$ * $f(3) = 0.12\textbf{3}45\ldots$ * $\vdots$ 構造 $x \in \mathbb{R}$ 使得 $x \neq f(n)$ for all $n$: * $x$ 的第 $i$ 位 digit $\neq f(i)$ 的第 $i$ 位 digit * 避免選 0 或 9(防止 0.999... = 1.000... 的情況) * 則 $x \in \mathbb{R}$ 但 $x$ 沒有被任何 $f(n)$ map 到,矛盾! --- **Theorem**: Set of all TMs is **countable** * 每個 TM 都可以 encode 成 string * 這個 string $\in\Sigma^*$,而 $\Sigma^*$ is countable (可用 shortlex order 列舉) * 因此 set of all TMs is countable 也可以得出 Turing-recognizable languages is countable。其他 Regular languages、CFLs 也是 countable。 --- **Theorem**: Set of all languages is **uncountable** **Proof**: Set of all languages has the same size as set of all **infinite binary sequences** * 給定 $\Sigma^* = \{s_0, s_1, s_2, \ldots\}$ (shortlex order). * Language $A$ 可表示為 characteristic sequence $\chi_A$: * $\chi_A[i] = 1$ if $s_i \in A$ * $\chi_A[i] = 0$ if $s_i \notin A$ * 例: $A = \{0, 00, 01, 000, 001\}$ over $\Sigma = \{0, 1\}$ * $\Sigma^* = \{\epsilon, 0, 1, 00, 01, 10, 11, 000, 001, \ldots\}$ * $\chi_A = 010110011\ldots$ * Correspondence $\chi$: Languages $\to$ Infinite binary sequences 但與 real number 相似,Infinite binary sequences 與 $\mathbb{N}$ 之間不存在 correspondence(用 diagonalization 證明),所以 set of all languages is uncountable。 --- **Corollary 4.18**: Some languages are **Turing-unrecognizable** **Proof**: * Set of all TMs: countable * Set of all languages: uncountable * Each TM recognizes only 1 language * $\therefore$ 存在無法被任何 TM recognize 的 languages #### Proof that $A_{\text{TM}}$ is Undecidable **Theorem 4.11**: $A_{\text{TM}} = \{\langle M, w \rangle | M \text{ is a TM and } M \text{ accepts } w\}$ is undecidable. **Proof** (By contradiction): 假設 $\exists$ decider $H$ for $A_{\text{TM}}$: * Input: $\langle M, w \rangle$ * 若 $M$ accepts $w$ → $H$ accepts * 若 $M$ does not accept $w$ (reject or loop) → $H$ rejects 構造 TM $D$: ```text D on input ⟨M⟩: Run H on input ⟨M, ⟨M⟩⟩ Output the OPPOSITE of what H outputs: If M accepts ⟨M⟩ → H accepts, D rejects If M rejects ⟨M⟩ → H rejects, D accepts ``` **關鍵步驟**: 執行 $D$ on input $\langle D \rangle$: * 若 $D$ accepts $\langle D \rangle$ * → $H$ accepts $\langle D, \langle D \rangle \rangle$ * → $D$ rejects $\langle D \rangle$ ⚡ 矛盾 * 若 $D$ rejects $\langle D \rangle$ * → $H$ rejects $\langle D, \langle D \rangle \rangle$ * → $D$ accepts $\langle D \rangle$ ⚡ 矛盾 $\therefore$ $H$ 不存在,$A_{\text{TM}}$ is undecidable. **Diagonalization 視角**: 建立表格,row = TMs $M_1, M_2, \ldots$,column = encodings $\langle M_1 \rangle, \langle M_2 \rangle, \ldots$ Entry $(i, j)$ = "accept" if $M_i$ accepts $\langle M_j \rangle$,否則空白 若 $H$ exists,可填滿整個表格(用 "accept" 或 "reject") $D$ 的行為:flip diagonal entries * 若 $M_i$ accepts $\langle M_i \rangle$ → $D$ rejects $\langle M_i \rangle$ * 若 $M_i$ rejects $\langle M_i \rangle$ → $D$ accepts $\langle M_i \rangle$ 但 $D$ 本身也在列表中,$D$ on $\langle D \rangle$ 該是什麼?矛盾! #### Co-Turing-Recognizable Languages **定義**: Language $A$ is **co-Turing-recognizable** if $\overline{A}$ is Turing-recognizable **Theorem 4.22**: A language is **decidable** iff it is both **Turing-recognizable** and **co-Turing-recognizable** **Proof** ($\Rightarrow$): * 若 $A$ decidable → $A$ Turing-recognizable (trivial) * $\overline{A}$ also decidable (可將 decider 的 accept/reject 對調) * $\therefore$ $\overline{A}$ Turing-recognizable * $\therefore$ $A$ co-Turing-recognizable **Proof** ($\Leftarrow$): * 假設 $A$ 和 $\overline{A}$ 都 Turing-recognizable * 存在 $M_1$ recognizes $A$,$M_2$ recognizes $\overline{A}$ * 設計 decider $M$ for $A$: ```text M on input w: Run M₁ and M₂ in parallel on w If M₁ accepts → M accepts If M₂ accepts → M rejects ``` * 因為 $w \in A$ or $w \in \overline{A}$(必居其一) * $\therefore$ $M_1$ or $M_2$ 必有一個會 accept * $\therefore$ $M$ always halts (是 decider) * Correctness: $M$ accepts all strings in $A$, rejects all strings in $\overline{A}$ **Corollary 4.23**: $\overline{A_{\text{TM}}}$ is **Turing-unrecognizable** **Proof**: * $A_{\text{TM}}$ is undecidable (Theorem 4.11) * $A_{\text{TM}}$ is Turing-recognizable (Universal TM) * 若 $\overline{A_{\text{TM}}}$ Turing-recognizable * → By Theorem 4.22, $A_{\text{TM}}$ decidable ⚡ 矛盾 * $\therefore$ $\overline{A_{\text{TM}}}$ is Turing-unrecognizable ## 5 Reducibility ### 5.1 Undecidable Problems from Language Theory #### Concept of Reducibility **Reducibility**: 將問題 $A$ convert 到問題 $B$,使得解決 $B$ 可以解決 $A$。 **嚴謹定義**: A is reducible to B $$\exists f: \Sigma^* \to \Sigma^* \text{ such that } \forall w: w \in A \iff f(w) \in B$$ 因為這章我們關注 decidability,因此 $f$ 必須是 **computable function**,即 $\exists$ TM 可以對於任意 $w$ 在有限步驟內計算出 $f(w)$。計作 $A \leq_m B$。 > 在關注 Time Complexity 是否是 P 時,就需要 $f$ 是 polynomial-time computable function。計作 $A \leq_p B$。 若 $A$ is reducible to $B$,則: * Solving $A$ cannot be harder than solving $B$ * $B$ is at least as hard as $A$ * 記作 $A \leq_m B$ (many-to-one reducible) **邏輯方向**: * 若 $A \leq_m B$ 且 $B$ decidable → $A$ decidable * 若 $A \leq_m B$ 且 $A$ undecidable → $B$ undecidable 我們會用第二點來證明很多問題是 undecidable。 --- #### $HALT_{\text{TM}}$ (Halting Problem) $$HALT_{\text{TM}} = \{\langle M, w \rangle | M \text{ is a TM and } M \text{ halts on } w\}$$ **Theorem 5.1**: $HALT_{\text{TM}}$ is undecidable. **Proof** (By Reducibility): > Review: $A_{\text{TM}} = \{\langle M, w \rangle | M \text{ is a TM and } M \text{ accepts } w\}$ We will reduce $A_{\text{TM}}$ to $HALT_{\text{TM}}$. 假設 $\exists$ decider $R$ for $HALT_{\text{TM}}$。 構造 TM $S$ to decide $A_{\text{TM}}$: ```text S on input ⟨M, w⟩: Run R on input ⟨M, w⟩ If R rejects (M loops): S rejects (because M never accepts w) If R accepts (M halts): Simulate M on w If M accepts → S accepts If M rejects → S rejects ``` **邏輯**: 若 $R$ 存在,可用它決定 $M$ 是否會 halt,進而決定 $M$ 是否會 accept --- #### $E_{\text{TM}}$ (Emptiness Testing for TMs) $$E_{\text{TM}} = \{\langle M \rangle | M \text{ is a TM and } L(M) = \varnothing\}$$ **Theorem 5.2**: $E_{\text{TM}}$ is undecidable. **Proof** (By Reducibility): 假設 $\exists$ decider $R$ for $E_{\text{TM}}$ 構造 TM $S$ to decide $A_{\text{TM}}$ on input $\langle M, w \rangle$: 1. Construct TM $M_1$: ```text M₁ on input x: If x ≠ w: reject If x = w: run M on w accept if M accepts w ``` 1. Run $R$ on input $\langle M_1 \rangle$ 2. 若 $R$ accepts (即 $L(M_1) = \varnothing$) → $S$ rejects (M does not accept w) 3. 若 $R$ rejects (即 $L(M_1) \neq \varnothing$) → $S$ accepts (M accepts w) **關鍵性質**: $L(M_1) = \varnothing \Leftrightarrow M$ does not accept $w$ --- #### $REG_{\text{TM}}$ (Regularity Testing for TMs) $$REG_{\text{TM}} = \{\langle M \rangle | M \text{ is a TM and } L(M) \text{ is regular}\}$$ **Theorem 5.3**: $REG_{\text{TM}}$ is undecidable. **Proof** (By Reducibility): 假設 $\exists$ decider $R$ for $REG_{\text{TM}}$ 構造 TM $S$ to decide $A_{\text{TM}}$ on input $\langle M, w \rangle$: 1. Construct TM $M_2$: ```text M₂ on input x: If x = 0ⁿ1ⁿ (n ≥ 0): accept Else: run M on w accept if M accepts w ``` 1. Run $R$ on input $\langle M_2 \rangle$ 2. 若 $R$ accepts (即 $L(M_2)$ is regular) → $S$ accepts (M accepts w) 3. 若 $R$ rejects (即 $L(M_2)$ is not regular) → $S$ rejects (M does not accept w) **關鍵性質**: * 若 $M$ does not accept $w$: $L(M_2) = \{0^n 1^n | n \geq 0\}$ (非 regular) * 若 $M$ accepts $w$: $L(M_2) = \Sigma^*$ (regular) * 故 $L(M_2)$ is regular $\Leftrightarrow$ $M$ accepts $w$ --- #### $EQ_{\text{TM}}$ (Equivalence Testing for TMs) $$EQ_{\text{TM}} = \{\langle M_1, M_2 \rangle | M_1, M_2 \text{ are TMs and } L(M_1) = L(M_2)\}$$ **Theorem 5.4**: $EQ_{\text{TM}}$ is undecidable. **Proof** (By Reducibility from $E_{\text{TM}}$): 假設 $\exists$ decider $R$ for $EQ_{\text{TM}}$ 構造 TM $S$ to decide $E_{\text{TM}}$ on input $\langle M \rangle$: 1. Construct $M_3$: reject all inputs (即 $L(M_3) = \varnothing$) 2. Run $R$ on input $\langle M, M_3 \rangle$ 3. 若 $R$ accepts (即 $L(M) = L(M_3) = \varnothing$) → $S$ accepts 4. 若 $R$ rejects (即 $L(M) \neq \varnothing$) → $S$ rejects ### Computation Histories **Accepting Computation History**: * TM $M$ 在 input $w$ 上執行時產生的 configuration sequence * $C_1 \to C_2 \to \cdots \to C_k$,其中: * $C_1$ = start configuration * 每個轉移 $C_i \to C_{i+1}$ 遵循 $M$ 的規則 * $C_k$ = accepting configuration **Rejecting Computation History**: 最後是 rejecting configuration **嚴格定義**: * **Computation history** for $M$ on $w$: 有限 configuration sequence * 若 $M$ 不 halt on $w$ (loop forever) → **沒有 computation history** * 若 $M$ halts → 存在唯一的 computation history (deterministic case) **重要性質**: * **Deterministic TM**: 最多一個 computation history * 若 $M$ accepts $w$ → 有 1 個 accepting history * 若 $M$ rejects $w$ → 有 1 個 rejecting history * 若 $M$ loops on $w$ → 沒有 history * **Nondeterministic TM**: 可能有多個 computation histories * 不同的 branch 產生不同的 history * 只要**任何一個** branch accepts,就認為 NTM accepts **強調**: 本課程重點在 **deterministic TM**,故通常最多一個 history ### Linear Bounded Automata #### Definition **Linear Bounded Automaton (LBA)**: 是受限的 TM,tape head 不能移出 input 範圍 * Tape 長度受限於 input length:若 $|w| = n$,tape 最多 $n$ 個格子 * Left/right boundary: tape head 不能超出 $[0, n-1]$ * 其他方面與 TM 相同 其實很強,以上 $A_\text{DFA}, A_\text{CFG}, E_\text{DFA}, E_\text{CFG}$ 都可以用 LBA 決定。 而且每個 CFL 都可以被某個 LBA decide。 第九章會介紹不能被 LBA decide 的 decidable language。 **記號**: $A_{\text{LBA}} = \{\langle M, w \rangle | M \text{ is an LBA and } M \text{ accepts } w\}$ #### Decidability of $A_{\text{LBA}}$ **Lemma 5.8**: 設 LBA $M$ 有 $q$ 個 states 和 $g$ 個 tape symbols Configuration 數量: $q \cdot n \cdot g^n$ (distinctly) * Tape content: $g^n$ 種可能 * Head position: $n$ 種位置 * State: $q$ 種狀態 **Theorem 5.9**: $A_{\text{LBA}}$ is decidable. **Proof**: 設計 TM $L$ to decide $A_{\text{LBA}}$: ```text L on input ⟨M, w⟩: Simulate M on w for at most q·|w|·g^|w| steps If M halts and accepts → L accepts If M halts and rejects → L rejects If M doesn't halt within limit: L rejects (M is looping) ``` **關鍵**: 因為 configuration 數有限,超過 $q \cdot n \cdot g^n$ 步後若未 halt 必定在 loop,可以 reject --- #### Undecidability of $E_{\text{LBA}}$ $$E_{\text{LBA}} = \{\langle M \rangle | M \text{ is an LBA and } L(M) = \varnothing\}$$ **重要對比**: $A_{\text{LBA}}$ decidable,但 $E_{\text{LBA}}$ undecidable! **Theorem 5.10**: $E_{\text{LBA}}$ is undecidable. **Proof** (By Reducibility from $A_{\text{TM}}$): 假設 $\exists$ decider $R$ for $E_{\text{LBA}}$ 構造 LBA $M'$ to decide $A_{\text{TM}}$ on input $\langle M, w \rangle$: 1. 設計 LBA $B$ 來 recognize accepting computation histories of $M$ on $w$ * Input 格式: $\#C_1 \# C_2 \# \cdots \# C_k\#$ (configuration sequence) * $B$ 驗證: * $C_1$ 是 $M$ 在 $w$ 上的 start configuration * 每個 $C_i$ 到 $C_{i+1}$ 是合法轉移,在兩個 configuration 間 zig zag * $C_k$ 是 accept configuration 2. 構造 TM $S$: ```text S on input ⟨M, w⟩: Construct LBA B (as described above) Run R on input ⟨B⟩ If R accepts (L(B) = ∅): S rejects (M does not accept w) If R rejects (L(B) ≠ ∅): S accepts (M accepts w) ``` **邏輯**: $L(B) = \varnothing \Leftrightarrow$ no accepting computation history exists $\Leftrightarrow$ M does not accept w --- ### $\text{ALL}_{\text{CFG}}$ is Undecidable $$\text{ALL}_{\text{CFG}} = \{\langle G \rangle | G \text{ is a CFG and } L(G) = \Sigma^*\}$$ > 無法用 $\text{E}_{\text{CFG}}$ 的方法證明,因為 CFLs is not closed under complement **Theorem 5.13**: $\text{ALL}_{\text{CFG}}$ is undecidable. **Proof** (Reduction from $A_{\text{TM}}$): 假設 $\exists$ decider $R$ for $\text{ALL}_{\text{CFG}}$,我們可以用它 decide $A_{\text{TM}}$ ($\langle M, w \rangle$) **核心想法**: 構造 CFG $G_{M,w}$,使得 $$L(G_{M,w}) = \Sigma^* \Leftrightarrow M \text{ does not accept } w$$ 接著用一個 TM 模擬與 $G_{M,w}$ 等價的 PDA。 換句話說,**$G_{M,w}$ 的設計**是生成所有**不是** accepting computation history 的 strings > $G_{M,w}$ 生成所有滿足以下**至少一個條件**的 strings: > > 1. $C_1$ 不是 $M$ 在 $w$ 上的 start configuration > 2. $C_k$ 不是 accepting configuration > 3. 某個 $C_i$ 到 $C_{i+1}$ 的轉移不符合 $M$ 的規則 **重要性質**: * 若 $M$ **不** accept $w$ → 不存在 accepting computation history → 所有 string 都滿足至少一個條件 → $L(G_{M,w}) = \Sigma^*$ * 若 $M$ **accepts** $w$ → 存在 accepting computation history → 該 history 不滿足任何條件 → 被排除在 $L(G_{M,w})$ 外 → $L(G_{M,w}) \neq \Sigma^*$ **構造 $G_{M,w}$ (從 PDA 轉換)**: 因為 CFG 和 PDA 等價,我們構造一個 PDA $D_{M,w}$ 來解釋 $G_{M,w}$ 的構造,只是為了容易理解,理論上也能直接構造 CFG。 **Input 格式**: $\#C_1\#C_2^R\#C_3\#C_4^R\#\cdots\#C_k\#$ > 注意偶數位置的 configuration 是**反轉**的 (reverse order),因為 PDA 是 stack (LIFO),要比較 $C_i$ 和 $C_{i+1}$ 的對應位置,比如都是 (state, tape position, tape symbol) 的順序 **PDA $D_{M,w}$ 的運作**: Non-deterministically **branch** 到三個檢查 (對應三個條件,用 OR 連接): 1. **Branch 1**: 檢查 $C_1$ 不是 start configuration 2. **Branch 2**: 檢查 $C_k$ 不是 accepting configuration 3. **Branch 3**: Non-deterministically 選某個 $i$,檢查 $C_i$ 不 yield $C_{i+1}$ **重點**: PDA 是 non-deterministic,**只要有一個 branch accept**,整個 string 就被 accept,代表不是一個 accepting computation history。 --- ### $\text{EQ}_{\text{CFG}}$ is Undecidable Reduction from $\text{ALL}_{\text{CFG}}$. * 構造一個 CFG $H$,使得 $L(H) = \Sigma^*$。 * 然後檢查 $L(G) = L(H)$。 --- ### Simple Undecidable Problem: Post Correspondence Problem (PCP) **定義**: 給定一個骨牌 (dominoes) 的組合 $P = \{[t_1/b_1], [t_2/b_2], \ldots, [t_k/b_k]\}$,每個骨牌上下各有一個 string。 **Match**: 一個序列 $i_1, i_2, \ldots, i_\ell$ (可重複、可改變順序),使得上下字串的串接結果一樣: $$t_{i_1} t_{i_2} \cdots t_{i_\ell} = b_{i_1} b_{i_2} \cdots b_{i_\ell}$$ **PCP Language**: $$\text{PCP} = \{\langle P \rangle | P \text{ has a match}\}$$ **Theorem**: PCP is **undecidable**. (證明略,課堂未涵蓋) **意義**: 這是一個非常具體的演算法問題 (不像 $A_{\text{TM}}$ 那麼抽象),但仍然是 undecidable * 給定骨牌集合,無法用 TM 判斷是否存在 match * 可能永遠跑下去,無法確定是真的不存在還是還在找 ## 7 Time Complexity ### 7.1 Measuring Complexity Complexity Theory 研究**資源使用量** (resource usage),主要是 **time** 與 **space**。本章聚焦 time complexity。 對於 **decidable** language,我們關心:用多少 **time** (steps) 來 decide? #### 範例: $A = \{0^n 1^n | n \geq 0\}$ **Single-tape TM $M_1$**: 1. **Stage 1**: 掃描 tape,檢查格式是否為 $0^*1^*$。Reject 若有 1 在 0 右側。 * **Complexity**: $O(n)$ (掃描一遍 + 回到最左端) 2. **Stage 2-3**: Repeat:若還有 0 和 1,就掃描一遍,cross off 一個 0 和一個 1。 * 最多 $n/2$ 次掃描,每次 $O(n)$ → **Complexity**: $O(n^2)$ 3. **Stage 4**: 檢查是否還有 0 或 1 剩餘。 * **Complexity**: $O(n)$ **總複雜度**: $O(n) + O(n^2) + O(n) = O(n^2)$ --- #### Formal Definition **Definition 7.1** (Running Time): 設 $M$ 是一個 **deterministic TM** that halts on all inputs。$M$ 的 **running time** (或 **time complexity**) 是函數 $f: \mathbb{N} \to \mathbb{N}$: $$f(n) = \text{maximum number of steps } M \text{ uses on any input of length } n$$ * 考慮**最壞情況** (worst case) * 若 $M$ 的 running time 是 $f(n)$,也說 "$M$ runs in time $f(n)$" 或 "$M$ is an $f(n)$ time TM" --- #### Asymptotic Analysis **Big O Notation**: **Definition 7.2**: 設 $f, g: \mathbb{N} \to \mathbb{R}_{\geq 0}$ $$f(n) = O(g(n)) \iff \exists c > 0, n_0 \in \mathbb{N} \text{ s.t. } \forall n \geq n_0, f(n) \leq c \cdot g(n)$$ * $g(n)$ 是 $f(n)$ 的 **asymptotic upper bound** * 忽略常數係數與低階項 * 只關心當 $n$ 很大時的行為 **範例**: * $f_1(n) = 5n^3 + 2n^2 + 22n + 6 = O(n^3)$ (也是 $O(n^4)$,但不是 $O(n^2)$) * $f_2(n) = 3n\log n + 5n\log\log n + 2 = O(n\log n)$ * $\log$ 的 base 不重要 (只差常數倍) --- **Small o Notation**: **Definition 7.3**: 設 $f, g: \mathbb{N} \to \mathbb{R}_{\geq 0}$ $$f(n) = o(g(n)) \iff \lim_{n\to\infty} \frac{f(n)}{g(n)} = 0$$ * $g(n)$ 成長速度**嚴格快於** $f(n)$ * 更強的條件 **範例**: * $n^{0.5} = o(n)$ ($\sqrt{n}$ 相對於 $n$ 可忽略) * $n = o(n\log\log n)$ * $n\log n = o(n^2)$ --- #### 改進的 TM: $M_2$ for $A = \{0^n 1^n | n \geq 0\}$ **Idea**: 每次掃描**減半** 0 和 1 的數量,而不是只 cross off 一對 **Algorithm**: 1. **Stage 1**: 同 $M_1$,檢查格式 $O(n)$ 2. **Stage 2-4**: Repeat: * 檢查剩餘的 0 和 1 總數是否為偶數,若為奇數 reject * Cross off **every other** 0 (從第 1 個開始) * Cross off **every other** 1 (從第 1 個開始) * 若只剩 1 個 0,accept * 若剩餘的 0 和 1 數量為 0 但另一個不是,reject **Complexity 分析**: * 每次掃描:$O(n)$ * 掃描次數:$O(\log n)$ (每次減半) * **總複雜度**: $O(n\log n)$ --- #### 再改進: **Two-tape TM $M_3$** for $A$ **Idea**: 用第二條 tape 暫存 0,然後比對 1 **Algorithm**: 1. **Stage 1**: 檢查格式 $O(n)$ 2. **Stage 2**: 掃描 tape 1 的所有 0,複製到 tape 2,直到遇到第一個 1 * **Complexity**: $O(n)$ 3. **Stage 3**: 掃描 tape 1 的 1,每個 1 cross off tape 2 的一個 0 * **Complexity**: $O(n)$ 4. **Stage 4**: 檢查 tape 2 是否為空 * **Complexity**: $O(n)$ **總複雜度**: $O(n) + O(n) + O(n) + O(n) = O(n)$ --- #### Time Complexity Class **Definition 7.7**: $$\text{TIME}(t(n)) = \{L | L \text{ is a language decidable by an } O(t(n)) \text{ time TM}\}$$ **範例**: * $A = \{0^n 1^n | n \geq 0\} \in \text{TIME}(n^2)$ (by $M_1$) * $A \in \text{TIME}(n\log n)$ (by $M_2$) * $A \in \text{TIME}(n)$ (by $M_3$) --- **重要觀察**: * **Computability** (第 3-4 章): Single-tape TM = Multi-tape TM = NTM (等價) * **Complexity** (第 7 章): 不同 model 的 complexity **可能不同**! * Single-tape TM $M_1$: $O(n^2)$ * Single-tape TM $M_2$: $O(n\log n)$ * Two-tape TM $M_3$: $O(n)$ * 課本證明:single-tape TM 對此問題**至少需要** $O(n\log n)$ time **結論**: > When we talk about **computability**, all reasonable models are **equivalent**. > When we talk about **complexity**, different models may have **different** time complexity. #### Complexity Relationships Among Models 回顧 Chapter 3:不同 model (single-tape TM, multi-tape TM, NTM) 在 **computability** 上等價,但在 **complexity** 上可能不同。 --- **Multi-tape vs Single-tape TM**: **Theorem 7.8**: 每個 $O(t(n))$ time multi-tape TM 都有等價的 $O(t(n)^2)$ time single-tape TM。 **Proof Idea**: 按照 Chapter 3 的 construction (用特殊符號分隔多條 tape),分析時間: * **Active portion**: M 在 $t(n)$ steps 內最多使用 tape 上的 $t(n)$ 個位置 * **Simulation**: Single-tape TM $S$ 模擬 multi-tape TM $M$ 的每一步: * **Two passes**: 第一次掃描讀取、第二次掃描寫入 * 若 $M$ 有 $k$ 條 tape,每條 active portion 至多 $t(n)$,則 $S$ 的 active portion 為 $O(k \cdot t(n)) = O(t(n))$ * 每步模擬時間: $O(t(n))$ * **Total time**: $M$ 執行 $t(n)$ steps,$S$ 每步 $O(t(n))$ → 總時間 $O(t(n)^2)$ **結論**: Complexity relationship 是 **quadratic** (二次方)。 * Multi-tape TM 跑 $O(n)$ → Single-tape TM 跑 $O(n^2)$ * Multi-tape TM 跑 $O(n^2)$ → Single-tape TM 跑 $O(n^4)$ --- **NTM vs DTM**: **Running Time of NTM**: **Definition 7.9**: 設 NTM $N$ 是 decider,其 running time $f(n)$ 定義為: $$f(n) = \text{maximum number of steps on any branch of computation on any input of length } n$$ * 取所有 branches 中**最長路徑**的 steps * 從 root 到 leaf (accept/reject) 的最大長度 --- **Theorem 7.11**: 每個 $O(t(n))$ time nondeterministic single-tape TM 都有等價的 $O(2^{O(t(n))})$ time deterministic single-tape TM。 **Proof Idea**: 按照 Chapter 3 的 construction (3-tape DTM 模擬 NTM,tape 3 記錄 branch choices),分析時間: * 設 $b$ 為 $N$ 的 transition function 的最大分支數 (maximum number of legal choices) * **Computation tree**: * 最多有 $b^{t(n)}$ 個 leaves * 總 nodes 數: 最多 leaves 數量的兩倍,$O(b^{t(n)})$ * **Simulation time**: * 每個 node 從 root 走到該 node: $O(t(n))$ * 總時間: $O(t(n) \cdot b^{t(n)}) = O(2^{O(t(n))})$ (忽略 base $b$) **結論**: Complexity relationship 是 **exponential** (指數)。 --- **重要觀察**: * **Computability**: Single-tape TM = Multi-tape TM = NTM (等價) * **Complexity**: * Single-tape vs Multi-tape: **quadratic** ($n^2$) * DTM vs NTM: **exponential** ($2^n$) * Multi-tape 比 single-tape 「強」一點 (平方倍),但 NTM 比 DTM 「強很多」(指數倍) --- ### 7.2 The Class P #### Polynomial Time 的重要性 **為何關注 polynomial time?** 1. **Growth rate**: 當 $n$ 很大時,polynomial 與 exponential 差距極大 * $n^{100}$ vs $1.1^n$: 當 $n \to \infty$,exponential 成長更快 2. **Model equivalence**: Deterministic models 之間通常是 **polynomially equivalent** * 不同 deterministic TM variants 互相模擬的複雜度差距是 polynomial 3. **Efficiency**: Polynomial time 被視為 **efficient** (可接受的速度) **注意**: 實際應用中,$n^2$ vs $n^3$ 仍有差異,但理論上都視為「fast」。 --- #### Definition of Class P **Definition 7.12**: $$\mathbf{P} = \{L | L \text{ is decidable in polynomial time on a deterministic single-tape TM}\}$$ * $\mathbf{P}$ 是 **languages** 的集合 * 可在 deterministic single-tape TM 上以 polynomial time decide * 因為 deterministic models 是 polynomially equivalent,所以其他 deterministic models (multi-tape, RAM) 也適用 **直覺**: $\mathbf{P}$ 代表「可有效解決」的問題。 --- #### Encoding 的注意事項 **Reasonable encoding**: * Graph: 列出 vertices 與 edges * Numbers: 用 binary 表示 **Unreasonable encoding**: * **Unary encoding** of number: 用 $n$ 個 1 表示數字 $n$ * $32$ 用 binary: `100000` (6 bits) * $32$ 用 unary: `11111111111111111111111111111111` (32 bits) * **影響**: 若用 unary,complexity 會被「人為降低」 **Pitfall**: 若 TM 的 input 為 $a_1,\ldots,a_n$,running time 是 $O(\max_i a_i)$: 數字 $a_i$ 的長度大約是 $\log a_i$,意思是長度 $L$ 的數字大約 $2^L$,所以以總輸入長度 $\sum_i O(\log a_i)$ 表示,running time 會是 **exponential** in input length。 --- #### Examples of Problems in P **PATH**: **Theorem 7.14**: $\text{PATH} \in \mathbf{P}$ $$\text{PATH} = \{\langle G, s, t \rangle | G \text{ is a directed graph with a directed path from } s \text{ to } t\}$$ **Algorithm**: 1. Mark node $s$ 2. Repeat: 若存在 edge 從 marked node 到 unmarked node,mark 該 node 3. 直到無法 mark 更多 nodes 4. 若 $t$ 被 marked → accept;否則 reject **Complexity**: * 最多 $m$ 次 iteration ($m$ = number of nodes) * 每次掃描所有 edges * 總時間: $O(m \cdot |E|)$ = polynomial --- **RELPRIME**: **Theorem 7.15**: $\text{RELPRIME} \in \mathbf{P}$ $$\text{RELPRIME} = \{\langle x, y \rangle | x, y \text{ are relatively prime}\}$$ **Algorithm**: Euclidean algorithm * 每步至少減半其中一個數字 * Iterations: $O(\min(\log x, \log y))$ **Complexity**: $O(\log x + \log y)$ = polynomial in input length **注意**: 雖然有 $\log x$,但因為 $x$ 的 encoding 長度本身就是 $\log x$,所以仍是 polynomial。 --- **Every CFL**: **Theorem 7.16**: Every context-free language is in $\mathbf{P}$ $$A_{\text{CFG}} = \{\langle G, w \rangle | G \text{ is a CFG that generates string } w\}$$ **錯誤嘗試**: 之前證明 decidability of CFL 時: * 用 Chomsky Normal Form,嘗試所有 $2n-1$ steps 的 derivations * **問題**: $k$ steps 的 derivations 數量可能是 **exponential** in $k$ **正確方法**: **Dynamic Programming** (CYK algorithm) **Algorithm** (Pseudocode): 其實和之前證明 $E_{\text{CFG}}$ decidable 的方法類似,都是用逆推的。 設 `table[i][j]` = set of variables that can generate substring `w[i..j]` ```python # Initialization: length 1 substrings for i in range(1, n+1): table[i][i] = {A | A → w[i] is a rule} # DP: length 2 to n substrings for length in range(2, n+1): for i in range(1, n-length+2): j = i + length - 1 table[i][j] = ∅ for k in range(i, j): for each rule A → BC: if B in table[i][k] and C in table[k+1][j]: add A to table[i][j] # Check if start variable can generate entire string if S in table[1][n]: accept else: reject ``` **Complexity**: * 4 層 loop: $O(n) \times O(n) \times O(n) \times O(r) = O(n^3\cdot r)$,其中 $r$ = number of rules in $G$ * Polynomial time! **結論**: $\text{CFL} \subseteq \mathbf{P}$ --- ### 7.3 The Class NP #### Motivating Examples **兩個問題**: 1. **HAMPATH** (Hamiltonian Path): Directed graph 中是否存在經過每個 node 恰好一次的 path? 2. **COMPOSITES**: 給定 $x$,是否存在 $a, b > 1$ 使得 $x = a \times b$? 這兩個問題都在 $\mathbf{NP}$ 中。 --- #### Verifier 與 Certificate **Definition 7.18** (Verifier): A **verifier** for language $A$ is an algorithm $V$ where: $$A = \{w | V \text{ accepts } \langle w, c \rangle \text{ for some string } c\}$$ * $c$ 稱為 **certificate** (或 **proof**) of membership * $V$ 用來「驗證」$w \in A$ **Polynomial time verifier**: * $V$ runs in polynomial time **in the length of $w$** (不考慮 $c$ 的長度) **Polynomially verifiable**: Language $A$ 有 polynomial time verifier。 --- **Examples**: * **HAMPATH**: Certificate = Hamiltonian path itself * Verifier: 檢查是否為 valid path & 經過每個 node 恰好一次 * **COMPOSITES**: Certificate = divisor $a$ of $x$ (where $1 < a < x$) * Verifier: 檢查 $x \mod a = 0$ --- #### Definition of NP **Definition 7.19**: $$\mathbf{NP} = \text{class of languages that have polynomial time verifiers}$$ **注意**: * **NP ≠ "non-polynomial"**! * **NP = "Nondeterministic Polynomial time"** --- **Equivalent Definition**: **Theorem 7.20**: Language $A \in \mathbf{NP}$ $\iff$ $A$ is decided by some nondeterministic polynomial time TM. **兩個定義**: 1. **Original**: $A$ is polynomially **verifiable** by a deterministic TM 2. **Equivalent**: $A$ is **decidable** in polynomial time by a nondeterministic TM --- **Proof** ($\Rightarrow$): 設 $A \in \mathbf{NP}$,存在 polynomial time verifier $V$ runs in time $n^k$。 構造 NTM $N$: 1. **Stage 1**: Nondeterministically select string $c$ of length $\leq n^k$ 2. **Stage 2**: Run $V$ on $\langle w, c \rangle$ 3. 若任一個組合使 $V$ accepts → $N$ accepts;否則 $N$ rejects * $N$ 利用 nondeterminism 「猜」certificate $c$ * 只要有一個 branch 找到正確的 $c$,$N$ 就 accept --- **Proof** ($\Leftarrow$): 設 $A$ is decided by NTM $N$ in polynomial time。 構造 verifier $V$: * **Certificate**: Encode the sequence of nondeterministic choices * 記錄每步選哪個 branch (e.g., "step 1: option 2, step 2: option 3, ...") * **Verifier**: Simulate $N$ on $w$,按照 $c$ 的指示走對應 branch * 若該 branch accepts → $V$ accepts;否則 $V$ rejects **關鍵**: Certificate 就是「正確的 nondeterministic choices」。 --- #### Examples of NP Problems ##### HAMPATH **Theorem**: $\text{HAMPATH} \in \mathbf{NP}$ **Method 1** (Verifier): * **Certificate**: Sequence of nodes $p_1, p_2, \ldots, p_m$ * **Verifier**: * 檢查無重複 * 檢查 $p_1 = s$,$p_m = t$ * 檢查每對 $(p_i, p_{i+1})$ 是 edge * Polynomial time! **Method 2** (NTM): 構造 NTM $N$: 1. Nondeterministically create list $p_1, \ldots, p_m$ (where $m$ = number of nodes) 2. 若有重複 → reject 3. 若 $p_1 \neq s$ 或 $p_m \neq t$ → reject 4. 檢查每對 $(p_i, p_{i+1})$ 是否為 edge,若有缺失 → reject 5. Otherwise, Accept --- ##### CLIQUE **Theorem 7.24**: $\text{CLIQUE} \in \mathbf{NP}$ $$\text{CLIQUE} = \{\langle G, k \rangle | G \text{ is an undirected graph with a } k\text{-clique}\}$$ * **$k$-clique**: subnet of $k$ 個 nodes,是個完全圖(每對都有 edge 連接) **Method 1** (Verifier): * **Certificate**: Subset $C$ of $k$ nodes * **Verifier**: * 檢查 $|C| = k$ * 檢查 $C$ 中每對 nodes 都有 edge * Polynomial time! **Method 2** (NTM): 1. Nondeterministically select subset $C$ of $k$ nodes 2. 檢查 $G$ 是否包含 $C$ 中所有 node pairs 的 edges 3. 若是 → accept;否則 reject > 若 $k$ 是常數,則列舉組合是 $\binom{|V|}{k} = O(|V|^k)$,每個組合檢查 edges 是 $O(k^2)$,總時間是 $O(k^2|V|^k)$ polynomial in $|V|$。 > 但若 $k$ 是 input 的一部分,則因為 Pitfall 提到的 encoding 問題,$O(|V|^k)$ 使得整體時間 exponential in input length of $k$。(可以注意到 $k\le |V|$,所以 $O(k^2)$ 不是主因) --- ##### SUBSET-SUM **Theorem 7.25**: $\text{SUBSET-SUM} \in \mathbf{NP}$ $$\text{SUBSET-SUM} = \{\langle S, t \rangle | S = \{x_1, \ldots, x_k\}, \exists Y \subseteq S: \sum_{y \in Y} y = t\}$$ * 允許 **multiset** (可重複使用元素) **Example**: $S = \{4, 11, 16, 21, 27\}$,$t = 25$ → Yes ($4 + 21 = 25$) **Method 1** (Verifier): * **Certificate**: Subset $C \subseteq S$ * **Verifier**: * 檢查 $C$ 中所有元素都在 $S$ 中 * 檢查 $\sum_{c \in C} c = t$ * Polynomial time! **Method 2** (NTM): 1. Nondeterministically select subset $C \subseteq S$ 2. 檢查 $\sum_{c \in C} c = t$ 3. 若是 → accept;否則 reject --- #### co-NP 某個 problem 的 complement 在 $\mathbf{NP}$ 中,則該 problem 屬於 **co-NP**。 比如 $\overline{\text{HAMPATH}}$: $G$ 是否不存在一個 Hamiltonian path?光是 certificate 就很難描述。 co-NP-complete: 若一個問題的 complement 是 NP-complete,則該問題是 co-NP-complete。 目前還不知道 $\mathbf{NP} = \text{co-}\mathbf{NP}$ 是否成立。但如果證明任一個 co-NP-complete 問題是 NP 或不是 NP,就知道是否 $\mathbf{NP} = \text{co-}\mathbf{NP}$,因為 NP-complete 問題的 complement 也可以互相 polynomial time reduce。 #### P vs NP **觀察**: * $\mathbf{P}$ = languages **decidable** in polynomial time (by DTM) * $\mathbf{NP}$ = languages **verifiable** in polynomial time (by DTM) * 或等價地:languages **decidable** in polynomial time (by NTM) **顯然**: $\mathbf{P} \subseteq \mathbf{NP}$ * 若 $A \in \mathbf{P}$,可直接 decide,不需 certificate * Verifier 可忽略 certificate,直接 decide $w$ **問題**: $\mathbf{P} = \mathbf{NP}$? * **Most people believe**: $\mathbf{P} \neq \mathbf{NP}$ * 但至今**無人證明**! * Clay Mathematics Institute 列為 [Millennium Prize Problems](https://en.wikipedia.org/wiki/P_versus_NP_problem) (獎金 $1M) --- ### 7.4 NP-Completeness (簡介) #### Definition of NP-Complete **Definition 7.34**: Language $B$ is **NP-complete** if: 1. $B \in \mathbf{NP}$ 2. Every $A \in \mathbf{NP}$ is **polynomial time reducible** to $B$ **直覺**: NP-complete 是 $\mathbf{NP}$ 中「最難」的問題。 **重要性**: * 若任一 NP-complete problem $B \in \mathbf{P}$ → 則 $\mathbf{P} = \mathbf{NP}$ * 至今無人找到 polynomial time algorithm 解決任何 NP-complete problem --- #### Examples of NP-Complete Problems **一些 NP-complete problems**: 1. **SAT** (Boolean satisfiability): 給定 Boolean formula,是否存在 satisfying assignment? * Cook-Levin Theorem: SAT 是第一個被證明的 NP-complete problem 2. **3SAT**: SAT 的特殊情況 (每個 clause 恰好 3 literals),2SAT 則在 P 中 3. **CLIQUE** 4. **HAMPATH** 5. **UHAMPATH**: Undirected 版本的 HAMPATH 6. **SUBSET-SUM** 7. **VERTEX-COVER**: 給定 graph $G$ 與 $k$,是否存在 $k$ 個 nodes 使得每條 edge 至少一端被選中? **意義**: 許多實際問題都是 NP-complete,目前只能用 heuristics 或 approximation algorithms 處理。 --- ### Chapter 7 總結 1. **Complexity measures**: Time complexity = number of steps 2. **Big O & Small o**: Asymptotic analysis 忽略常數與低階項 3. **Model relationships**: * Multi-tape vs Single-tape: quadratic ($O(n^2)$) * NTM vs DTM: exponential ($O(2^n)$) 4. **Class P**: Polynomial time decidable (efficiently solvable) 5. **Class NP**: Polynomial time verifiable 6. **P vs NP**: 至今未解決 7. **NP-Complete**: NP 中最難的問題,若任一 NP-complete $\in$ P 則 P = NP --- ## Course Summary ### 課程架構回顧 本課程分為三大部分: #### Part 1: Automata and Languages * **Regular Languages**: * **DFA** (Deterministic Finite Automaton) * **NFA** (Nondeterministic Finite Automaton) * **Regular Expressions** * **等價性**: DFA ≡ NFA ≡ Regular Expressions * 都能 recognize/generate **Regular Languages** * **Context-Free Languages**: * **CFG** (Context-Free Grammar) * **(N)PDA** (Pushdown Automaton,預設指 nondeterministic) * **等價性**: CFG ≡ Nondeterministic PDA * 都能 recognize/generate **Context-Free Languages** * **注意**: DPDA (Deterministic PDA) **不等價於** PDA * DPDA 能力弱於 PDA (課程未涵蓋,作為補充材料) #### Part 2: Computability Theory * **Turing Machine**: * **Deterministic TM** * **Nondeterministic TM** * **等價性**: DTM ≡ NTM (computability 角度) * 能 recognize **Turing-recognizable languages** * 若 TM 是 **decider** (總是 halt) → recognize **Decidable languages** * **Language Hierarchy**: $$\text{Regular} \subset \text{Context-Free} \subset \text{Decidable} \subset \text{Turing-Recognizable} \subset \text{All Languages}$$ * 還有 **Turing-unrecognizable languages** (在 Turing-recognizable 之外) * **Decidability** (Chapter 4): * 證明某些 languages 是 **decidable** * **Undecidability** (Chapter 5): * **Diagonalization method**: 證明 $A_{\text{TM}}$ undecidable * **Reduction**: 從 $A_{\text{TM}}$ reduce 到其他 languages,證明它們 undecidable | | FA | CFG | TM | LBA | |-------|-----|-----|----|------| | $A$ | D | D | UD | D | | $E$ | D | D | UD | UD | | $EQ$ | D | UD | UD | (UD) | | $ALL$ | (D) | UD | | | (括號代表課堂沒證但確實是這樣) TM 的 HALT, REG 也都是 UD。 #### Part 3: Complexity Theory (Chapter 7) 關注「需要多少**時間**解決問題」: * **Class P**: Polynomial time decidable (deterministic) * 被視為「可有效解決」的問題 * 例: PATH, RELPRIME, Every CFL * **Class NP**: Polynomial time verifiable (deterministic) 或 Polynomial time decidable (nondeterministic) * 例: HAMPATH, CLIQUE, SUBSET-SUM * **P vs NP**: * $\mathbf{P} \subseteq \mathbf{NP}$ (顯然) * $\mathbf{P} = \mathbf{NP}$? **未解決** (Millennium Prize Problem) * **NP-Complete**: NP 中最難的問題 * 例: SAT, 3SAT, CLIQUE, HAMPATH, SUBSET-SUM, VERTEX-COVER * 若任一 NP-complete problem $\in \mathbf{P}$ → $\mathbf{P} = \mathbf{NP}$ ### 課程意義與應用 #### 為何學習 Theory of Computation? 1. **畢業要求**: CSIE 學生必修 2. **基礎知識**: 計算理論的基礎 3. **實際應用**: 雖然畢業生問卷顯示「最不實用」,但: * 基礎知識有廣泛影響 * 研究領域可能需要 (cryptography, systems engineering) * Model-based design 的理論基礎 #### Model-Based Design 對於 **safety-critical systems** (醫療設備、自動駕駛、航空器): * 不能直接實作後測試 * 需從 **model of computation** 開始: * **Finite State Machine** (FSM): 良好起點 * **Transition System** * **Hybrid System**: FSM + differential equations * **驗證**: 確保系統不會到達 unsafe states * **流程**: Model → Verify → Implementation **課程教授的 models of computation 是這些應用的理論基礎**。 ### 面對計算困難問題的策略 許多現實問題是 computationally hard (NP-complete),可採用以下策略: 1. **犧牲最優解**: 使用 approximation algorithms 2. **考慮其他計算模型**: 非傳統計算 (quantum computing, DNA computing) 3. **不考慮 worst case**: 平均情況或實際情況分析 4. **Heuristics**: 啟發式演算法 ### AI 與計算的局限性 * **AI 與電腦很強大**,但存在**根本限制**: * Undecidable problems * NP-complete problems (目前無 polynomial time solution) * **理解限制**才能**善用**: * 知道什麼能做、什麼不能做 * 選擇合適的方法與工具 **作為 CSIE 學生,應理解計算的能力與局限,才能做出真正有價值的貢獻。**