# Theory of Computation Notes
## Part 0:
**Some notes on sets, sequences, and tuples**:
A *proper* subset is a subset that is not equal to the original set.
The number of occurrences of an element is irrelevant in a set, but if we want to take multiplicity into account, we use *multisets*.
Order does not matter in sets {}, but it does in sequences (). A finite sequence of $n$ elements is called an *n-tuple*.
**Cartesian product**: If $A = \{ 1, 2 \}$ and $B = \{ x, y, z \}$, then $A \times B = \{ (1, x), (1, y), (1, z), (2, x), (2, y), (2, z) \}$
How we describe the add function formally:
$add: \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$ is a binary function; it takes a 2-tuple as input.
- The number space $\mathbb{Z}_m$ is the set $\{ 0, 1, \dots, m-1 \}$ (modular arithmetic)
- A function does **NOT** necessarily fill its range; when it does, it is known as *onto*.
- A function that takes a *single* argument is a *unary function*.
A *predicate* can only be {true, false} (assuming the law of the excluded middle). It is a function which returns a boolean. We can express a predicate as a set of mappings instead of a function when it is nicely defined that way. For example, instead of listing *beats* for every outcome of Rock, Paper, Scissors, we can write:
beats = {(Scissors, Paper), (Paper, Rock), (Rock, Scissors)}
### equivalence relation
A binary relation is an equivalence relation if it is reflexive ($xRx$ for all $x$), symmetric ($xRy$ implies $yRx$ for all $x$,$y$), and transitive (for all $x$, $y$, $z$, $xRy$, $yRz$ implies $xRz$).
❓ Why is congruence mod $m$ an equivalence relation? (Check: $m$ divides $x - x = 0$; if $m \mid x - y$ then $m \mid y - x$; if $m \mid x - y$ and $m \mid y - z$ then $m \mid x - z$.)
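A brute-force check in Python (the modulus and sample range below are arbitrary choices for illustration):
```python
# Check that "congruent mod m" is reflexive, symmetric, and transitive
# on a finite sample of integers (m = 5 and the range are arbitrary).
m = 5
xs = range(-20, 21)
R = lambda x, y: (x - y) % m == 0

assert all(R(x, x) for x in xs)                          # reflexive
assert all(R(y, x) for x in xs for y in xs if R(x, y))   # symmetric
assert all(R(x, z) for x in xs for y in xs for z in xs
           if R(x, y) and R(y, z))                       # transitive
print("congruence mod", m, "passes all three checks on the sample")
```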
### graphs
points are called *nodes* or *vertices*; lines are *edges*; number of edges at a node is that node's *degree*.
### booleans
Booleans: negation NOT, conjunction AND, disjunction OR, exclusive or XOR, equality, implication (true on every input except $1 \to 0$).
Equality is (A implies B) AND (B implies A). This makes sense because (1, 1, 0, 1) AND (1, 0, 1, 1) is (1, 0, 0, 1): true exactly when we have the inputs (0, 0) or (1, 1).
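A four-row truth-table check of that identity in Python:
```python
# Verify: (A == B) agrees with (A implies B) AND (B implies A) on all inputs.
implies = lambda p, q: (not p) or q

for A in (False, True):
    for B in (False, True):
        assert (A == B) == (implies(A, B) and implies(B, A))
print("equality == (A->B) AND (B->A) on all four input pairs")
```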
(Set functions: union U , intersection, complement)
_See page 16 for reference._
## Part 1: Automata and Languages
### 1.1: Finite Automata
#### Formal Definition
A **finite automaton** is a 5-tuple $(Q, \Sigma, \delta, q_{0}, F)$ where:
1. $Q$ is a finite set of **states**
2. $\Sigma$ is the **alphabet**
3. $\delta: Q \times \Sigma \to Q$ is the **transition function**
4. $q_{0} \in Q$ is the **start state**
5. $F \subseteq Q$ is the **set of accept states**
_See page 36 for example_
In other words, a finite automaton is composed of a finite set of states. The transition function dictates where to move given a state and a symbol.
Each finite automaton has a start state and a set of accept states, taking an input string and accepting or rejecting.
**Language of machine $M$**: the set of strings that $M$ accepts. $L(M) = A$, or $M$ recognizes $A$
#### formal definition of computation (when a DFA $M$ accepts $w$)
Let $w = w_{1} w_{2} w_{3} \dots w_{n}$ where $w_{i} \in \Sigma$.
$M$ accepts $w$ if a sequence of states $r_0, r_1, \dots, r_n \in Q$ exists such that:
1. $r_{0} = q_{0}$
Machine starts in start state
2. $\delta ( r_{i}, w_{i + 1}) = r_{i + 1}$ for $i = 0, \dots, n - 1$
Machine goes between states according to transition function
3. $r_{n} \in F$
Machine ends in accept state
(unintuitive) Recognition is defined over *all* acceptances: $M$ recognizes $A$ when $A$ is exactly the set of strings $M$ accepts
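A minimal sketch of this acceptance definition in Python; the machine below is an invented example that accepts binary strings ending in 1:
```python
# A DFA as plain data, following the 5-tuple definition above.
# This example machine accepts binary strings that end in 1.
delta = {("q0", "0"): "q0", ("q0", "1"): "q1",
         ("q1", "0"): "q0", ("q1", "1"): "q1"}
q0, F = "q0", {"q1"}

def accepts(w):
    r = q0                      # 1. start in the start state
    for symbol in w:
        r = delta[(r, symbol)]  # 2. follow the transition function
    return r in F               # 3. accept iff we end in an accept state

print(accepts("0101"))  # True
print(accepts("0110"))  # False
```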
**Regular language**: a language that some finite automaton recognizes
#### Regular Operators
We construct three operations on languages to aid in studying them.
##### Union
$A \cup B = \{ x : x \in A \lor x \in B \}$
If $A = \{ 1, 2 \}$ and $B = \{ 3, 4 \}$, $A \cup B = \{ 1, 2, 3, 4 \}$.
##### Concatenation
$A \circ B = \{ xy : x \in A \land y \in B \}$
If $A = \{ 1, 2 \}$ and $B = \{ 3, 4 \}$, $A \circ B = \{ 13, 14, 23, 24 \}$.
##### Star
$A ^ \ast = \{ x_{1} x_{2} \dots x_{k} : k \geq 0 \land x_{i} \in A \}$
If $A = \{ 1, 2 \}$, $A ^ \ast = \{ \epsilon, 1, 2, 11, 12, 21, 22, 111, 112, \dots \}$.
### 1.2: Nondeterminism
> There's some annoying skipping around that happens so this is *slightly* out of order
#### Closure Under Regular Operators
We show that regular languages are closed under the regular operators.
##### Union
We can construct $M$ such that it recognizes $A_{1} \cup A_{2}$ where $M = (Q, \Sigma, \delta, q_{0}, F)$ and:
$$
L(M_{1}) = A_{1}, M_{1} = (Q_{1}, \Sigma_{1}, \delta_{1}, q_{1}, F_1) \\
L(M_{2}) = A_{2}, M_{2} = (Q_{2}, \Sigma_{2}, \delta_{2}, q_{2}, F_2)
$$
The trick is to run both $M_{1}$ and $M_{2}$ at the same time by using pairs of states.
1. $Q = Q_{1} \times Q_{2}$.
2. $\Sigma = \Sigma_{1} \cup \Sigma_{2}$.
3. For $(r_{1}, r_{2}) \in Q$ and $a \in \Sigma$:
$$
\delta((r_{1}, r_{2}), a) = (\delta_{1}(r_{1}, a), \delta_{2}(r_{2}, a))
$$
We track where $M_{1}$ and $M_{2}$ are by using tuples of the two.
4. $q_{0}$ is $(q_{1}, q_{2})$.
5. $F = (F_{1} \times Q_{2}) \cup (Q_{1} \times F_{2})$
We accept when either component of the state pair is an accept state of $M_{1}$ or $M_{2}$, respectively.
We take the product with the *entire* other state set because we don't care what the other component is.
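Here's the product construction as a Python sketch. The two input DFAs are invented examples: $M_1$ accepts strings with an even number of 0s, $M_2$ accepts strings ending in 1.
```python
from itertools import product

# Two example DFAs over {"0", "1"} (invented for illustration).
d1 = {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"}
d2 = {("a", "0"): "a", ("a", "1"): "b", ("b", "0"): "a", ("b", "1"): "b"}
Q1, q1, F1 = {"e", "o"}, "e", {"e"}   # M1: even number of 0s
Q2, q2, F2 = {"a", "b"}, "a", {"b"}   # M2: ends in 1
Sigma = {"0", "1"}

# The product machine M runs both components in lockstep.
Q = set(product(Q1, Q2))
delta = {((r1, r2), a): (d1[(r1, a)], d2[(r2, a)])
         for (r1, r2) in Q for a in Sigma}
q0 = (q1, q2)
F = {(r1, r2) for (r1, r2) in Q if r1 in F1 or r2 in F2}  # union: either accepts

def accepts(w):
    r = q0
    for a in w:
        r = delta[(r, a)]
    return r in F

print(accepts("010"))  # True: even number of 0s, so M1 accepts
print(accepts("10"))   # False: odd 0s, and doesn't end in 1
```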
### formal definition of NFA
An NFA has the same formal definition as a DFA, except the transition function $\delta$ takes (current state, an input symbol or $\epsilon$) and returns the set of possible next states.
1. $Q$ is a finite set of states
*where $\Sigma$ is a finite alphabet,*
2. $\Sigma_{\epsilon}$ is $\Sigma \cup \{ \epsilon \}$
3. $\delta: Q \times \Sigma_{\epsilon} \to P(Q)$
*reads: the transition function takes every combination of (state, character or $\epsilon$) to an element of the power set of $Q$, aka a subset of $Q$. $P(Q)$ is the range.* :face_with_cowboy_hat: Not all subsets are necessarily reachable.
4. $q_0 \in Q$ is the start state
5. The set of accept states $F$ is a subset $\subseteq$ of $Q$
### How to express any NFA as an equivalent DFA
Key idea: we will write each deterministic state as a set of nondeterministic states.
> tracking finger positions in NFA is tracking hand positions in DFA. Many *ones* becomes one *many*.
Let nondeterministic $N = (Q, \Sigma, \delta, q_0, F)$
and deterministic $M = (Q', \Sigma', \delta', q_0', F')$.
Filling out our formal definition boilerplate :face_with_rolling_eyes:
---
1. Our set of states for M, $Q'$, is the set of all subsets of $Q$: the powerset of $Q$.
$Q' = P(Q)$
2. Our alphabet $\Sigma'$ will be the same.
3. Our transition function $\delta'$ will return the set which is the union of all output states specified by the original $\delta$. Essentially, we're just collecting the NFA state outputs and putting them in a single set. This is expressed as:
$\delta'(R, a) = \bigcup_{r \in R} \delta(r, a) \quad$ for each $R \in Q'$ and $a \in \Sigma$
4. Similarly, our start state $q_0'$ is the SET which contains $q_0$: $\{q_0\}$.
5. Our accept states $F'$ for M are the sets which contain at least one accept state of N:
$F' = \{ R \in Q' | \text{R contains an accept state of N}\}$
---
This definition is not quite complete, because we haven't dealt with the empty string $\epsilon$.
We can define $E(R)$ (where R is a set of states, or subset of Q) as the set of states which can be reached from R by traveling among 0 or more $\epsilon$ arrows.
Then we can wrap this function around our existing expression $\bigcup_{r \in R} \delta(r, a)$ to define the transition function $\delta'$ as
$\delta'(R, a) = \{ q \in Q \mid q \in E(\delta(r, a)) \text{ for some } r \in R \}$
And also wrap this around our start state: $q_0' = E(\{ q_0 \})$
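A sketch of the whole construction, $E(R)$ included, in Python. The dict encoding (keys are (state, symbol), with "" standing in for $\epsilon$) and the example NFA, which accepts strings ending in 01, are my own choices:
```python
# Subset construction: simulate the DFA M whose states are sets of NFA states.
# NFA encoded as {(state, symbol): set of states}; "" plays the role of epsilon.
nfa_delta = {("s", "0"): {"s", "a"}, ("s", "1"): {"s"},
             ("a", "1"): {"f"}, ("f", ""): {"s"}}
nfa_start, nfa_accept = "s", {"f"}

def E(R):
    """States reachable from R along 0 or more epsilon arrows."""
    seen, stack = set(R), list(R)
    while stack:
        r = stack.pop()
        for q in nfa_delta.get((r, ""), set()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return frozenset(seen)

def dfa_step(R, a):
    """delta'(R, a): union of delta(r, a) over r in R, then epsilon-close."""
    out = set()
    for r in R:
        out |= nfa_delta.get((r, a), set())
    return E(out)

def accepts(w):
    R = E({nfa_start})           # q0' = E({q0})
    for a in w:
        R = dfa_step(R, a)
    return bool(R & nfa_accept)  # F': R contains an accept state of N

print(accepts("1101"))  # True
print(accepts("110"))   # False
```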
### NFAs can read strings in A1 or in A2 (union)
> union with NFAs is so easy it's basically cheating
We add a new start state with $\epsilon$-arrows to the start states of $N_1$ and $N_2$; the accept states of the new machine are the accept states of both.
Summary diagram:

### NFAs can read A1A2 strings (concatenation)
> this one is like playing mini golf
Summary diagram:

To construct an NFA N which recognizes A1A2, we combine the two NFAs which recognize A1 and A2. We link the accept states of $N_1$ with an $\epsilon$ to the start of $N_2$, to tentatively enter $N_2$ after we've gone through $N_1$. The accept states of N are now the accept states of $N_2$.
We can also define N by fleshing out our boilerplate definition again :face_with_rolling_eyes:
---
*Q*: states of N are the states of N1 $\cup$ states of N2
$\Sigma$: the alphabet of N is still $\Sigma$
$q_0$: the start state for N is the same as the start state for N1
$F$: the accept states for N is the same as the accept states of N2
$\delta$: the transition function for N is the same as N1's when we're within N1 and the same as N2's when we're within N2. The exception is the $\epsilon$ bridges between the two: when the state $q$ in $\delta(q, \epsilon)$ is an accept state of N1, the output also includes the start state of N2.
---
### NFAs can read any string made of A1 (star)
> just add an $\epsilon$ loopback from every accept state to a "hub" state
:question: part I was confused about: what if we had a string that was like A1 + "blahblah" and the NFA had copies still on the accept state after reading the A1 part? And then I realized that because the $\epsilon$ is the only path out, those copies would die! These $\epsilon$s are not like the other $\epsilon$s because they will never be splitters :crying_cat_face:
[I think I have an answer to exercise 1.11](/JxZ74aD4SWuR7-67zrlRjQ)
Summary diagram:

If anyone else wants formal definition practice be my guest, I am tired :sleeping:
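In lieu of more formal definitions, here's a compact Python sketch of all three constructions on $\epsilon$-NFAs. The tuple encoding (delta, start, accept set) and the state-tagging trick that keeps the two machines disjoint are my own:
```python
# Regular operations as epsilon-NFA constructions. An NFA here is a triple
# (delta, start, accepts), delta: {(state, symbol): set}; "" is epsilon.

def tag(nfa, t):
    """Rename every state q to (t, q) so two machines can't collide."""
    d, s, F = nfa
    return ({((t, q), a): {(t, r) for r in R} for (q, a), R in d.items()},
            (t, s), {(t, q) for q in F})

def union(n1, n2):
    (d1, s1, F1), (d2, s2, F2) = tag(n1, 1), tag(n2, 2)
    return {**d1, **d2, ("u0", ""): {s1, s2}}, "u0", F1 | F2

def concat(n1, n2):
    (d1, s1, F1), (d2, s2, F2) = tag(n1, 1), tag(n2, 2)
    d = {**d1, **d2}
    for q in F1:                            # eps bridges: N1 accept -> N2 start
        d.setdefault((q, ""), set()).add(s2)
    return d, s1, F2

def star(n1):
    d, s1, F1 = tag(n1, 1)
    d = dict(d)
    d[("s0", "")] = {s1}                    # new (accepting) start state
    for q in F1:                            # eps loopback to the old start
        d.setdefault((q, ""), set()).add(s1)
    return d, "s0", F1 | {"s0"}

def eps_closure(d, R):
    R, stack = set(R), list(R)
    while stack:
        q = stack.pop()
        for r in d.get((q, ""), set()):
            if r not in R:
                R.add(r)
                stack.append(r)
    return R

def accepts(nfa, w):
    d, s, F = nfa
    R = eps_closure(d, {s})
    for a in w:
        R = eps_closure(d, set().union(*[d.get((q, a), set()) for q in R]))
    return bool(R & F)

n1 = ({("p", "a"): {"pf"}}, "p", {"pf"})  # accepts exactly "a"
n2 = ({("q", "b"): {"qf"}}, "q", {"qf"})  # accepts exactly "b"
m = star(union(n1, n2))                   # should recognize (a ∪ b)*
print(accepts(m, "abba"), accepts(m, ""), accepts(m, "abc"))  # True True False
```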
## regular expressions
Regular expressions describe regular languages.
$(0 \cup 1)0^{*}$ describes the language of strings starting with a 0 or a 1, followed by any number of 0s. The asterisk is the star operator applied to the set $\{ 0 \}$.
$(0 \cup 1)^{*}$ is any number of 0s and 1s.
Notation: if $\Sigma$ is an alphabet, then $\Sigma$ as a _regular expression_ is shorthand for the union of all the length-1 strings in that alphabet. The regex $\Sigma^{*}$ is all strings over that alphabet.
Formal definition: $R$ is a regular expression if it is
1. $a$ for some $a$ in an alphabet $\Sigma$
2. $\epsilon$
3. $\emptyset$
4. $R_1 \cup R_2$, where $R_1$ and $R_2$ are regular expressions
5. $R_1 \circ R_2$
6. $R_1^{*}$
Through 1 & 4, we see that $R$ can describe any subset of an alphabet $\Sigma$.
❗ Note that starred terms can repeat 0 times:
$0^{*}10^{*}$ describes any string that has exactly a single 1
$\Sigma^{*}001\Sigma^{*}$ describes any string containing 001
$(0 \cup \epsilon)1^{*} = 01^{*} \cup 1^{*}$ (we're adding "0" or "" before any number of 1s)
>Confused about why concatenating A with $\emptyset$ gives $\emptyset$?
The answer is that we've defined $A \circ B=\{ab:a\in A\text{ and }b \in B\}$
When there are no elements in B, there are no objects $ab$, so $A \circ B = \emptyset$.
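The same examples in Python's `re` syntax, which writes $\cup$ as `|` and $(x \cup \epsilon)$ as `x?`:
```python
import re

# fullmatch asks whether the regex describes the *whole* string.
assert re.fullmatch(r"0*10*", "0001000")           # exactly one 1
assert not re.fullmatch(r"0*10*", "010100")        # two 1s: no match
assert re.fullmatch(r"(0|1)*001(0|1)*", "110010")  # contains 001
assert re.fullmatch(r"0?1*", "0111")               # (0 ∪ ε)1*
assert not re.fullmatch(r"0?1*", "0011")           # second 0 breaks it
print("all regex examples behave as described")
```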
Eric's bad notes on chapter 2
# context free grammars
First used in the study of human languages.
You write a grammar in order to create a parser
A grammar consists of substitution rules.
It looks like this:
[variable] --> [string]
The string can contain other variables as well as terminals.
The grammar describes the language of all the terminal-only strings it can eventually produce.
Using the rules A -> 0A1, A -> B, and B -> #, a derivation looks like:
A -> 0A1 -> 00A11 -> 000A111 -> 000B111 -> 000#111
Formally, a context free grammar consists of:
1. $V$, the finite set of variables
2. $\Sigma$, the finite set of terminals (disjoint from $V$; this plays the role of the alphabet)
3. $R$, the finite set of rules, each mapping a variable to a string of variables and terminals
4. $S \in V$, the start variable
### example: the grammar for nicely nested parentheses
S -> aSb | SS | $\epsilon$ (with a and b standing in for open and close parentheses). The final result of a derivation must be a string of terminals, as all variables must be removed; in the above case, the only variable to remove is S.
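A brute-force way to watch the substitution rules work: enumerate everything this grammar derives, up to a length bound. The worklist search and the caps that keep it finite are my own:
```python
# Enumerate strings derivable from S -> aSb | SS | epsilon, up to a bound.
# A sentential form with no variables left is a finished string.
rules = {"S": ["aSb", "SS", ""]}
LIMIT = 4  # max number of terminals (arbitrary)

seen, frontier, results = {"S"}, ["S"], set()
while frontier:
    form = frontier.pop()
    i = next((j for j, c in enumerate(form) if c in rules), None)
    if i is None:
        results.add(form)        # all variables gone
        continue
    for rhs in rules[form[i]]:   # expand the leftmost variable
        new = form[:i] + rhs + form[i + 1:]
        terminals = len(new) - new.count("S")
        # generous caps keep the search finite without losing short strings
        if terminals <= LIMIT and new.count("S") <= LIMIT and new not in seen:
            seen.add(new)
            frontier.append(new)

print(sorted(results, key=len))  # ['', 'ab', 'aabb', 'abab'] (tie order varies)
```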
We can easily union context free grammars by adding a rule
S -> S1 | S2 | S3
where S1, S2, S3 are the start variables of the subgrammars.
Context free grammars are equivalent to nondeterministic pushdown automata.
Generally we want to avoid ambiguity, or two different parse trees leading to the same expression. To avoid confusing different parse trees with different derivations, we will always use the "leftmost" derivation of strings, where at every step the leftmost variable is replaced.
Thus, a string is ambiguously derived if, even when looking only at leftmost derivations, it has more than 1 of them.
Some languages are inherently ambiguous: their strings can be generated in more than 1 way.
$\{0^i 1^j 2^k \mid i = j \text{ or } j = k\}$ is inherently ambiguous: every grammar for it generates some strings (those with $i = j = k$) in more than one way.
Converting to Chomsky normal form requires removing $\epsilon$ rules and unit rules. Instead of having A -> B, we map every rule B -> u to A -> u.
Then, for long rules like A -> u1u2u3, we break them down into A -> u1A1, A1 -> u2A2, and so on.
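A sketch of just that rule-splitting step (the fresh-variable naming is mine; a full CNF conversion also needs the $\epsilon$-rule and unit-rule cleanup above):
```python
from itertools import count

fresh_ids = count(1)

def binarize(head, body):
    """Split A -> u1 u2 ... uk into a chain of two-symbol rules."""
    rules = []
    while len(body) > 2:
        fresh = f"{head}_{next(fresh_ids)}"
        rules.append((head, [body[0], fresh]))  # A -> u1 A1
        head, body = fresh, body[1:]            # continue from A1 -> u2 ...
    rules.append((head, body))
    return rules

print(binarize("A", ["u1", "u2", "u3"]))
# [('A', ['u1', 'A_1']), ('A_1', ['u2', 'u3'])]
```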
Stacks are nice because they can hold an unlimited amount of information. We can recognize nonregular languages like $0^n 1^n$ with stacks.
A pushdown automaton is a finite automaton with a stack.
$\Sigma$ is the input alphabet and $\Gamma$ is the stack alphabet.
Reading:
This means that the domain of the transition function is $Q \times \Sigma_{\epsilon} \times \Gamma_{\epsilon}$: the current state, the next input symbol, and the top symbol of the stack determine the next move. An $\epsilon$ in the input component means "don't read a symbol"; an $\epsilon$ in the stack component means "don't pop the stack".
Writing:
A pushdown automaton can enter a new state and write a symbol on top of the stack. This means that our $\delta$ function returns a set of pairs: a member of $Q$ (a new state) and a member of $\Gamma_{\epsilon}$ (something to push onto the stack, or $\epsilon$ for nothing).
So our formal definition of a pushdown automaton is similar to a finite automaton's, with extra machinery for reading and writing the stack, and a stack alphabet $\Gamma$:
$Q$: set of states
$\Sigma$: input alphabet
$\Gamma$: stack alphabet
$\delta: Q \times \Sigma_{\epsilon} \times \Gamma_{\epsilon} \to P(Q \times \Gamma_{\epsilon})$: transition function
$q_0 \in Q$: start state
$F \subseteq Q$: set of accept states
There's a nice diagram for recognizing $0^n 1^n$.
There's a nice diagram for recognizing $ww^R$.
![[Pasted image 20210305130011.png]]
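Here is the $0^n 1^n$ idea as ordinary Python, with a list standing in for the stack; the \$ bottom marker anticipates the trick described just below:
```python
# Recognize { 0^n 1^n : n >= 0 } with a stack: push a $ marker first,
# push a 0 for each 0 read, pop one 0 per 1, and accept only if exactly
# the $ marker remains at the end.
def accepts(w):
    stack = ["$"]
    seen_one = False
    for c in w:
        if c == "0" and not seen_one:
            stack.append("0")
        elif c == "1" and stack[-1] == "0":
            seen_one = True
            stack.pop()
        else:
            return False        # a 0 after a 1, or popping past the marker
    return stack == ["$"]

print(accepts("000111"))  # True
print(accepts("0101"))    # False
```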
If a language is context free, we can use a pushdown automaton to recognize it.
### Using a pushdown automaton to simulate a context free grammar, nondeterministically
We cheat by pushing a $ onto the stack first, so when we see it again later we can tell the stack is empty.
We use the stack to store intermediate strings, but remove any terminal symbols before the first variable (and compare them one-by-one with the input). If it doesn't match, fail this branch.
If this intermediate string we've stored in the stack has been completely processed, we will reach the $ symbol we put at the end of the stack and enter the accept state.
![[Pasted image 20210305130459.png]]
Example:
![[Pasted image 20210305130900.png]]
Chapter 3 notes
# Chapter 3
The Turing machine is a finite automaton with an unlimited memory.
This is how we defined a finite automaton:
1. $Q$ is a set of **states**
2. $\Sigma$ is the **alphabet**
3. $\delta: Q \times \Sigma \to Q$ is the **transition function**
4. $q_{0} \in Q$ is the **start state**
5. $F \subseteq Q$ is the **set of accept states**
The Turing machine is similar but can read and write onto a tape, move back and forth, and has infinite space.
If the Turing machine never enters an accept or reject state, it will go on forever, never halting.
We can give formal definitions for Turing machines but they tend to be very complex and boring.
Because we get to move the head around the tape, the transition function for a Turing machine is not just (state, letter) --> (state) as with finite automata. Instead we have $\delta: Q \times \Gamma \to Q \times \Gamma \times \{L, R\}$: from the current state and the letter the head is reading, the machine moves to a new state, writes a letter over the one it read, and then moves the head left or right on the tape.
A configuration consists of the current state, the tape contents, and the head position. This is the convention for writing a configuration compactly:
$1011q_{7}01111$ means the tape reads 101101111, the state is $q_{7}$, and the head is on the first "0" after the state symbol. The state is written immediately to the left of the symbol the head is on.
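A tiny stepper that prints configurations in exactly that convention; the example machine, which flips 0s and 1s while scanning right and then accepts, is invented:
```python
# Print each configuration as: tape-left-of-head + state + head-and-rightward.
delta = {("q0", "0"): ("q0", "1", "R"),
         ("q0", "1"): ("q0", "0", "R"),
         ("q0", "_"): ("qacc", "_", "R")}

def run(w):
    tape, head, state = list(w) + ["_"], 0, "q0"
    while state not in ("qacc", "qrej"):
        print("".join(tape[:head]) + state + "".join(tape[head:]))
        state, tape[head], move = delta[(state, tape[head])]
        head += 1 if move == "R" else -1
        if head == len(tape):
            tape.append("_")    # the tape is unbounded to the right
    print("".join(tape[:head]) + state + "".join(tape[head:]))

run("1011")  # q01011_, 0q0011_, 01q011_, 010q01_, 0100q0_, 0100_qacc_
```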


When we start a TM on an input, it will either accept, reject, or loop. We don't like looping. We say a machine *decides* a language when it accepts or rejects (never loops) on every input. A language is decidable if some Turing machine decides it. (This is also known as a recursive language, because such a machine can handle any input.)
There is a Turing machine described which recognizes the language of all strings of 0s whose length is a power of 2. It works by crossing off every other 0 on each pass, halving the count, and continuing only if the number of 0s left was even.
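The same halving idea as ordinary code rather than a tape-walking machine (so this is the algorithm, not the TM itself):
```python
# A count of 0s is a power of 2 iff repeatedly "crossing off" every other
# 0 halves it down to a single 0 without the count ever being odd.
def power_of_two_zeros(w):
    if not w or set(w) != {"0"}:
        return False
    n = len(w)
    while n > 1:
        if n % 2 == 1:
            return False  # odd number of 0s remain: reject
        n //= 2           # cross off every other 0
    return True

for k in (1, 2, 3, 4, 8, 12):
    print("0" * k, power_of_two_zeros("0" * k))  # True for 1, 2, 4, 8
```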
There is a Turing machine which recognizes matching sequences w1 and w2 separated by a # sign. Go look at it.
That machine looks complicated but really has two simple parts. This part is simply for checking the 0 or 1 that we replaced with the blank against the symbol directly after the pound sign:

And this recursive loop does the bulk of the check work after that:

We can also construct a Turing machine to check multiplication, $i \times j = k$, for a string $a^i b^j c^k$, by crossing off one $c$ per $b$ for each $a$ we cross off, and then accepting if everything is crossed off exactly.
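And the cross-off bookkeeping for the multiplication checker, again as plain code standing in for the tape walk:
```python
import re

# Accept a^i b^j c^k iff i * j = k: for each a, cross off one c per b,
# and accept only if the c's run out exactly when the a's do.
def checks_out(w):
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    if not m:
        return False
    i, j, k = (len(g) for g in m.groups())
    remaining = k
    for _ in range(i):    # one pass over the b's per a
        remaining -= j    # cross off one c for each b
        if remaining < 0:
            return False
    return remaining == 0

print(checks_out("aabbbcccccc"))  # True:  2 * 3 = 6
print(checks_out("aabbbccccc"))   # False: 2 * 3 != 5
```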
So far we have been replacing the first symbol with a blank character _ to mark the leftmost edge. A neater way of detecting the left edge of the tape is to write a special symbol, attempt to move left, and then check whether the symbol under the head has changed.
Some important things that make Turing machines so powerful:
All nondeterministic Turing machines can be simulated by deterministic ("normal") ones.
All multitape Turing machines can be simulated by single-tape Turing machines.
This builds up to the Church-Turing thesis, which says that anything computable by a mechanical, systematic, finitely describable procedure can be computed by a Turing machine.