A context-free grammar providers a simple and mathematically percise mechanism for describing the methods by which phrases in some natural language are built from smaller blocks, capturing the "block structure" of sentences in a natural way.
The formalism of context-free grammars was developed in the 1956 by Noam Chomsky.
Context-Free Grammar 擁有足夠強的表達力來表示大多數程式設計語言的語法;實際上,幾乎所有程式設計語言都是通過 Context-Free Grammar 來定義的。
相比 Context-Sensitive,正如因為規則變化時,不會因為上下文的關係而有所不同,所以才會稱呼為 Context-Free。
the non-terminal and the string of to place .
Design a context-free grammar for this language .
我們可以利用該 Grammar 來產生符合該文法的字串:
在產生的過程,為了跟 Grammar 上的箭頭 做區分,記號上會使用 ,並讀作 "yields";而整個字串的產生過程就稱為 "String Derivation"。
Context-Free Grammar 中的 Productions 有兩種常見書寫方式,一種為 Backus-Naur Form;另一種為 Chomsky Normal Forms,各有不同的優缺點:
<variable>
is a non-terminal symbol.<expression>
consists of one or more sequences of terminal and non-terminal symbol and each sequences is separated by the |
symbol.BNF is easily understood by humans, useful for Parser Generators, Uses LL, LR, LALR Parsing algorithms.
is the start variable, are any two non-terminal symbol and is a terminal.
CNF is often difficult to understand, useful for theorem proving, Uses the Cocke-Younger-Kasami Parsing algorithms.
A languages which is generated from a Context-Free Grammar is a Context-Free Languages.
通常最好的 Context-Free Language 的範例就是 Balanced Parenthesis:
或是 Arithmetic Expressions(例如加法與乘法):
All Regular Languages are Context-Free Languages.
Corollary: The class of regular language is a subset of context-free language.
But, some context-free language are not regular, for example:
By following the derivation of a string, we can construct a parsing tree.
Theorem: If , can be generated by using left-most derivation, right-most derivation or both the methods.
形容有一個字串有一個以上的 parsing tree 可以應對(在同一種 most derivation)。
A grammar is ambiguous, if there are some strings that have more than one distinct parsing tree.
Theorem: Having different derivation methods does not necessarily imply generating distinct parsing tress.
Ambiguous grammar cause that we have multiple ways to execute the program segment.
Theorem: There are some context-free languages that can only be generated by ambiguous grammars.
形容某一種語言用文法表示時必是 Ambiguous。
天生就存在 Ambiguous Grammars 的 Context-Free Languages,請避免設計這樣的語言,因為無解。
最好的例子就是四則運算:
同樣產生出來的字串都是 "id + id * id",但到底是先乘還是先加,在 CFG 中是沒有明確的被定義,所以這樣就是一種 Ambiguous。
A context-free grammar is right-linear, if each grammar rule is in either one of the following formats:
A context-free grammar is left-linear if and only if all its grammar rules obey the following formats:
CFL are closed under union, Kleene star, concatenation.
The intersection of a CFL and a regular language is a CFL (not regular).
CFL are not closed under intersection, complement, difference.
DCFL are closed under complement, concatenation, Kleene star.
The intersection of a DCFL and a regular language is a DCFL (not regular).
DCFL are not closed under union. (common CFL)
DCFL are not closed under intersection. (non-CFL)
Definition: A context-free grammar is regular if and only if it is right(left)-linear.
Theorem: A regular grammar can be modified into a left-linear grammar.
Theorem: Regular languages are generated by regular grammars.
So, this context-free grammar is right-linear.
Pushdown Automata were introduced A. G. Oettinger in 1961.
The machines accepting context-free languages.
When the stack is empty, the state is in a final state, and the input tape is empty, accepts the string.
input:
Possiable actions in one step:
An FA is a PDA without the stack, therefore FA forms a proper subset of PDA.
The current configuration of a Pushdown Automata can be described by using: .
_
: The position of the read head.A string is accepted by PDA , if
A pushdown automata is sextuple representation:
transition relation:
The input symbol may be ; The top stack string may be .
A pushdown automata uses a transition relation to direct its compution, it is a relation:
The meaning of transition rule:
PDA 一次就是 push 一層或是 pop 一層,而 push/pop 的內容可以是多個 symbol,例如 "aa";而一般來說 stack 的最底層會是 "$" 或 " "。
The language accepted by a PDA is denoted by .
is the set of strings accepted by . A language is called Context-Free Language if and only if there exists some PDA such that, .
Pushdown Automata: Language recognizer.
Context-Free Grammar: Language generator.
意思是從狀態 到狀態 ,必須要讀取到字符 ,且 stack pop 為 ,然後 push 。
, Design a Pushdown Automata for this language:
, and: 、, , , and the transition relation:
該語言為:
若畫圖為:
A Pushdown Automata is non-determinisitc in nature, because the partially defined transition relations.
Definition: If and only if there exists a sequence of transitions leading into a final configuration.
There is a language , . 這種語言只存在 Non-determinstic 版本的 PDA ,其定義如下:
其中 這一條是 non-derminstic 的關鍵,它表達為:「在不讀任何 symbol 且不參考及 pop 任何 stack 情況下改變狀態」,這個步驟是很不穩定的,有可能你的 字串,假設長度是 20,所以對半切應該要在第 20 個時走這一個規則,但是有可能在讀到第 5 個字符時就跳到 狀態,這樣就會讓原本應該要是屬於該語言的字串被誤判為不存在;不過因為是存在一種機率這個是可以判定正確,所以這樣的設計是合法的。
Try "aabbaa":
Definition: is a Deterministic Push Down Automata, if it satisfies the following condition:
For each configuration of , there is exactly one configuraion can succeed it.
NPDA 在實務上太難達成了(因為猜錯的機率太大),所以 DPDA 比較能符合在實務上的機械實現。
Constraints on the transition relation :
The computation of a DPDA is controlled by a transition relation, not a function:
有存在一種方法讓無窮迴圈去掉,讓字串讀完且停在非 final 的 state 上,讓 PDA 在實務上實現成為可能。
Need to signal the end of the input string and the bottom of the stack.
$
,來做記號代表為 stack 的最底層。Most computer programming languages are DCFL.
DCFL can be pared by LR(k) parser in , In General CFL can be parsed within .
The equivalence problem of two DCFL is solvable: DCFL .
DCFL is closed under complement.
Theorem: If is a DCFL, can be generated by an unambiguous CFG.
Unambiguous CFG 不一定是 Deterministic CFL。
Theorem: Deterministic Context-Free Language form a proper subset of Context-Free Language.
However, Deterministic Context-Free Language is still a proper super-set of regular languages.
Given a context-free language , there exists (called a "pumping length"), such that:
For any string with , can be split into , , , and , such that:
Aka. strong version of classic pumping lemmaf for context-free languages.
Given a context-free language , such that for any string with and every way of "marking" or more of the symbols in . The string can be written as .
With strings , , , and , such that:
https://math.stackexchange.com/questions/217741/how-do-i-show-language-of-prime-number-of-xs-not-context-free
https://cs.stackexchange.com/questions/140991/proof-that-aibjck-mid-i-j-k-in-mathbbn-ikj-is-not-context-free-usi
https://cs.stackexchange.com/questions/64056/prove-if-l-0m1n-mid-m-neq-n-is-regular-or-not/64074#64074
https://cs.stackexchange.com/questions/150051/is-anbmck-n-neq-m-and-m-neq-k-context-free
計算理論