owned this note
owned this note
Published
Linked with GitHub
正規語言--曾文貴
=====
`NCTU` `CS`
[回主頁](https://hackmd.io/s/ByOm-sFue)
> 廢話區
> > GG 我走錯教室了 [name=Gavin Lee]
>
> > 我走進陳的教室,我以為我是曾的,第一節下課後我回到曾的教室,上完回宿舍後,我發現其實我是陳的@@a [name=Gavin Lee]
>
> > 到此一遊~~OuO [name=Taylor Huang]
# Table of content
[TOC]
# Syllabus
- [課程網頁](http://people.cs.nctu.edu.tw/~wgtzeng/courses/FL2017SpringUnder/index.html)
- Score
- HW 25%
- Mid-term 50%
- Final 25%
# Ch1 Introduction & Notations in Formal Language
## Mathematical Preliminaries and Notations
- Set:
- Collection of Elements
- Union: $A\cup B:=\{x|x\in A \lor x\in B \}$
- Intersection: $A\cap B:=\{x|x\in A \land x\in B\}$
- Cartesian Product: $A\times B:=\{(a,b)|a\in A \land b \in B\}$
- Power Set: $2^S:=\{A\subseteq S\}$
- See also: https://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_theory
- Relation:
- A relation R between set A,B is a subset of $A\times B$
- $(a,b)\in R$, denoted as $aRb$
- Function:
- A function $F:D→C$ is:
- A relation between D and C
- $\forall x\in D\exists!y\in C:xFy$
- $\forall x:F(x):=y \iff xFy$
- Cardinality:
- The cardinality of a set $S$ is denoted as $|S|$
- Consider set A,B:
- $|A|\leq|B|\iff \exists f:A→B, f$ **injection**
- $|A|\geq|B|\iff \exists f:A→B, f$ **surjection**
- $|A|=|B|\iff \exists f:A→B, f$ **bijection**
- $\forall n\in N^0:\{0\leq x<n|x\in N^0 \}:=n$
## Language
- Alphabet: A set of symbols, usually denoted as $\sum$
- String: A finite sequence of symbols
- $|w|:=|Domain(w)|$
>Regarded as the length of a string
- Concatenation: $w_1 * w_2 = w_1w_2$
- A reverse $w^R$ of string $w$ is $\prod_{k=0..n}a_{n-k}$ where $w=\prod_{k=0..n}a_{k}$
- repeated: $a^n = a*a*...*a$ (n times)
- $S=P*Q$, for some P and Q, $\iff$ such P/Q is a prefix/suffix of S
>prefix ($\{\lambda, a_1, a_2, ...\}$), suffix ($\{\lambda, a_n, a_{n-1}, ...\}$), postfix (?)
- Empty string: The string containing no symbol
- $|\lambda| = 0$
- Identity: $\lambda w = w \lambda = w$
- Language: Collection of strings
>*Alphabet → Strings → Languages*
>
- Kleene Star Notation:
- $\sum^*:= \bigcup_{k∈N^0}{\sum^k}$
>Set of all strings from a specified alphabet $\sum$
- $\forall L\subseteq\sum^*:L^*:=\bigcup_{k\in N^0}L^k$
- Other Terminology:
- $\sum^+ := \sum^* - \{\lambda\}$
>Set of all non-empty strings
- Complementation: $L^c := \sum^* - L$
- Concatenation: $L_1L_2 := \{xy: x∈ L_1, y∈ L_2\}$
- $L^n :=\prod_{k=1..n}L = \{x_1x_2...x_n: x_j\in L\}$ (任取n個排組):
>e.g. $\{ab,c\}^3 = \{ababab,ababc,abcab,cabab,abcc,cabc,ccab,ccc\}$
- A Language L over $\sum$ is a subset of $\sum^*$
- Boundary Definition:
- $\forall s\in\sum^*:$
- $s^0 := \lambda$
- $s^1:=s$
- $\forall L\subseteq\sum^*:$
- $L^0 := \{ \lambda \}$
- $L^1:=L$
> **Remark:**
> - We use "." as the concatenation operator instead of "+"
> - String may be of infinite length, but this course assumes all strings are finite.
> - In Formal-Language-Theory Custom: $L^2 = L \cdot L \neq L \times L$
> - $s^0$, for string s, is an empty string
> - $L^0$, for language L, is an language containing only the empty string
## Grammer
***$G = (V, T, P, S)$***
- Definition:
- $V$: A set of **Variables**
- $V\cap T=\emptyset$
- $T$: A set $\sum$ of **Terminals** (Symbols)
- $P$: A set of **Production Rules**
- $S$: An initial variable called **Start**
- $S\in V$
- $\forall x,y\in (V\cup T)^*:$
- $(x\Rightarrow y) \iff \exists (a→b)\in P,x_0,x_1,y_0,y_1:(x=x_0ax_1\land y=y_0by_1)$
- $(x\Rightarrow^*y)\iff\exists x_n(x\Rightarrow x_1\Rightarrow x_2 \Rightarrow...\Rightarrow y)$
- $L(G) := \{w\in\sum^*|S \Rightarrow^* w\}$
> **Remark:**
> - S can only be a single element(A grammer contains a single starting)
> - V, T are mutually exclusive
> - e.g. $G = (\{S,A,B\},\{a,b\}, \{S → aA, A→bS, S→\lambda \}, S)$
> - Multiple grammar with different Starts may be merged as one by sourcing from another Variable and maps the new Start to the old Starts in Production Rules
## Automaton
- Abstraction:
- Input
- Control unit
- Storage
- Output
- Operation in descrete time frame
## Language representation
- Mathematical Custom
- $L=\{<Enum>\}$
- $L = \{x|P(x)\}$
- Formal Language Custom (easier for computer to understand)
- Grammer: language generator
- Automata: language acceptor
# CH2 Finite Automata
:::info
**Main concept:** A way to **describe a machine** that have:
- Finite States
- Inputs
- Outputs
- Storage
:::
## Deterministic Finite Accepter (D.F.A.):
$M=(Q,\sum,\delta,q_0,F)$
- Definition:
- Q is a finite set of **Internal States**
- $\sum$ is a finite set of symbols called **Input Alphabet**
- $\delta:Q\times\sum→Q$ called **Transition Function**
- $q_0\in Q$ called **Initial State** (only 1)
- $F\subseteq Q$ is a **set** of **Final States** (can have many)
>- A machine with finite states that the next state is completely determined by the present state and the input symbol
>- States are not seen as memory in such a DFA model
- Extended Transistion Function:
- $\forall q\in Q, s\in\sum^*:$
- $\delta^*(q,\lambda):=q$
- $\delta^*(q,s):=\delta(\delta^*(q,s_0),s_1)$, for $|s_1|=1, s=s_0s_1$
- Acceptance:
- $\forall s\in\sum^*:s$ is accepted $\iff\delta^*(q0,s)\in F$
>**Accepted strings**: inputs that lead to a final state
>**Rejected strings**: inputs that lead to a non-final state
- $L(M):=\{s\in\sum^*:\delta^*(q0,s)\in F\}$
- Regular Language
- $\forall L\in\sum^*:L$ is called **Regular**$\iff\exists M\in DFA:L=L(M)$
## Nondeterministic Finite Accepter (N.F.A.):
$M=(Q,\sum,\delta,q_0,F)$
- Definition:
- Q is a finite set of **Internal States**
- $\sum$ is a finite set of symbols called **Input Alphabet**
- $\delta:Q\times(\sum \cup\{\lambda\} )→2^Q$ called **Transition Function**
- $q_0\in Q$ called **Initial State**
- $F\subseteq Q$ is a set of **Final States**
>A machine with finite states that have some or none possible next states
- Extended Trainsition Function:
- Collection of states that can be traversed from $q_0$ with mark $s$ is denoted $\delta^*(q_0,s)$
- Accpetance:
- $\forall s\in\sum^*: s$ is accepted $\iff\delta^*(q_0,s)\cap F\neq\phi$
- $L(M):=\{s\in\sum^*|\delta^*(q_0,s)\neq\phi\}$
## Equivaluence of Finite Automata
- Finite Automata $M_0,M_1$ are **equivalent** $\iff L(M_0)=L(M_1)$
- $\forall M_0\in NFA,\exists M_1\in DFA:M_0,M_1$**are equivalent**
- Strategy:
- Take $M_1.Q=2^{M_0.Q}$
- $\forall q_1\in M_1.Q:M_1.\delta(q_1,s)=\cup_{q_0\in q_1}{M_0.\delta(q_0,s)}$
- Then find $L(M_0)=L(M_1)$
## Minimization of Deterministic Finite Accepter
- Minimum DFA
- Definition
- M is a minimum DFA$\iff$ M contains minimum number of states, for all DFA that accepts L(M)
- Existence:
- By the well-ordering property of Natural numbers, such DFA exists for all language
- Nondistinguishable States
- Two states $q_0, q_1$ are **Nondistinguishable**, *if and only if* $\forall s\in\sum^*:\delta^*(q_0,s)\in F\iff\delta^*(q_1,s)\in F$
- Reduction:
- Strategy:
- Remove all unreachable states
- Merge all nondistinguishable states
- This always takes us to the minimum DFA
- Yet to be proven by the course
- See also: https://en.wikipedia.org/wiki/DFA_minimization#Minimum_DFA