# Constituency Grammars ###### tags: `NLP` `Coref Resolution` From **[Chapter 12: Constituency Grammars](https://web.stanford.edu/~jurafsky/slp3/12.pdf)** for Sequence Processing in **Speech and Language Processing**, Dan Jurafsky and James h. Martin, *3rd ed.* - The word **syntax** refers to the way words are arranged together. - We explore syntactic phenomena that goes beyond simpler approaches, and formal models for capturing them in a computationally useful manner. - This chapter focuses on context-free grammars, the backbone of many formal models of the syntax of natural language. - They are powerful enough to express sophisticated relations among the words in a sentence, yet - Computationally tractable enough that efficient algorithms exist for parsing sentences with them. ## Constituency - Syntactic constituency is the idea that groups of words can behave as single units, or **constituents**. - Consider the **noun phrase**, a sequence of words surrounding at least one noun. - These words form constituents, and one evidence is that they can all appear in similar syntactic environments, for example, *before a verb*. - On the other hand, while the whole noun phrase can occur before a verb, this is not true of each of the individual words that make up a noun phrase. ![](https://i.imgur.com/pOhzlhK.png) - Other kinds of evidence for constituency come from what are called **preposed** or **postposed** constructions. - For example, the prepositional phrase *on September seventeenth* can be placed in a number of different locations in the following examples, including at the beginning (preposed) or at the end (postposed): > *On September seventeenth*, I'd like to fly from Atlanta to Denver > I'd like to fly *on September seventeenth* from Atlanta to Denver > I'd like to fly from Atlanta to Denver *on September seventeenth* - But again, while the entire phrase can be placed differently, the individual words making up the phrase cannot be (an asterisk (*) marks fragments that are not grammatical English sentences): > *++On September++, I'd like to fly ++seventeenth++ from Atlanta to Denver > *++On++ I'd like to fly ++September seventeenth++ from Atlanta to Denver > *I'd like to fly ++on September++ from Atlanta to Denver ++seventeenth++ ## Context-Free Grammars - The most widely used formal system for modeling constituent structure in English and other natural languages is the **Context-Free Grammar**, or **CFG**. - Context-free grammars are also called **Phrase-Structure Grammars**, and the formalism is equivalent to **Backus-Naur From**, or **BNF**. - A context-free grammar consists of a set of **rules** or **productions**, each of which expresses the ways that symbols of the language can be grouped and ordered together, and a **lexicon** of words and symbols. - For example, the following productions express that an **NP** (**noun phrase**) can be composed of either a $ProperNoun$ or a determiner ($Det$) followed by a $Nominal$; a $Nominal$ in turn can consist of one or more $Nouns$. \begin{align} NP \; &\rightarrow \; Det Nominal \\ NP \; &\rightarrow \; ProperNoun \\ Nominal \; &\rightarrow \; Noun \; | \; Nominal Noun \end{align} - Context-free rules can be hierarchically embedded, so we can combine the previous rules with others, like the following, that express facts about the lexicon: \begin{align} Det \; &\rightarrow \; a \\ Det \; &\rightarrow \; the \\ Noun \; &\rightarrow \; flight \\ \end{align} - The symbols that are used in a CFG are divided into two classes. - The symbols that correspond to words in the language ("the", "nightclub") are called **terminal** symbols; the lexicon is the set of rules that introduce these terminal symbols. - The symbols that express abstractions over these terminals are called **non-terminals**. - In each context-free rule, - The item to the right of the arrow ($\rightarrow$) is an ordered list of one or more terminals and non-terminals; - To the left of the arrow is a single non-terminal symbol expressing some cluster or generalization. - The non-terminal associated with each word in the lexicon is its lexical category, or part of speech. - A CFG can be thought of in two ways: as a device for *generating sentences* and as a device for *assigning a structure to a given sentence*. - Viewing a CFG as a generator, we can read the $\rightarrow$ as "rewrite the symbol on the left with the string of symbols in the right". ![](https://i.imgur.com/2md8Faz.png) - We say the string *a flight* can be derived from the non-terminal *NP*. Thus, a CFG can be used to generate a set of strings. - This sequence of rule expansions is called a **derivation** of the string of words. - Is it common to represent a derivation by a **parse tree** (*Fig. 12.1* shows the tree representation of this derivation). ![](https://i.imgur.com/H0bZnO1.png) - In the parse tree shown in Fig. 12.1, we can say that the node *NP* **dominates** all the nodes in the tree (*Det, Nom, Noun, a, flight*). - We can say further that it immediately dominates the nodes *Det* and *Nom*. - The formal language defined by a CFG is the set of strings that are derivable from the designated **start symbol**. - Each grammar must have one designated start symbol, which is often called *S*. - *S* is usually interpreted as the "sentence" node. - The set of strings that are derivable from *S* is the set of sentences in some simplified version of English. - We add a few additional rules to the inventory: - The following rule expresses the fact that a sentence can consist of a noun phrase followed by a **verb phrase**: \begin{align} S \; \rightarrow \; \text{NP} \; \text{VP} \quad \text{I prefer a morning flight} \end{align} - A verb phrase in English consists of a verb followed by assorted other things. - For example, one kind of verb phrase consists of a verb followed by a noun phrase: \begin{align} VP \; \rightarrow \; \text{Verb} \; \text{NP} \quad \text{prefer a morning flight} \end{align} - Or the verb may be followed by a noun phrase and a prepositional phrase: \begin{align} VP \; \rightarrow \; \text{Verb} \; \text{NP} \; \text{PP} \quad \text{leave Boston in the morning} \end{align} - Or the verb phrase may have a verb followed by a prepositional phrase alone: \begin{align} VP \; \rightarrow \; \text{Verb} \; \text{PP} \quad \text{leaving on Thursday} \end{align} - A prepositional phrase generally has a preposition followed by a noun phrase. - For example, a common type of prepositional phrase in the ATIS corpus is used to indicate location or direction: \begin{align} PP \; \rightarrow \; \text{Preposition} \; \text{NP} \quad \text{from Los Angeles} \end{align} - The *NP* inside a *PP* need not be a location; *PPs* are often used with times and dates, and with other nouns as well; they can be arbitrarily complex: ![](https://i.imgur.com/gPGlfI5.png) - *Fig. 12.2* gives a sample lexicon, and *Fig. 12.3* summarizes the grammar rules we've seen so far, which we'll call $\mathscr{L}_0$. - Note that we can use the or-symbol $|$ to indicate that a non-terminal has alternate possible expansions. ![](https://i.imgur.com/4HbeRo9.png) - We can use this grammar to generate sentences of this "ATIS-language". - We start with *S*, expand it to *NP* *VP*, then choose a random expansion of *NP* (i.e, to *I*), and a random expansion of *VP* (i.e. *Verb NP*), and so on until we generate the string *I prefer a morning flight*. - *Fig. 12.4* shows a parse tree that represents a complete derivation of *I prefer a morning flight*. ![](https://i.imgur.com/Cdm32LB.png) - We can also represent a parse tree in a more compact format called bracketed notation. - Here's the bracketed representation of the parse tree of *Fig. 12.4*: ![](https://i.imgur.com/LzSKQCi.png) - A CFG like that of $\mathscr{L}_0$ defines a formal language. A formal language is a set of strings. - Sentences (strings of words) that can be derived by a grammar are in the formal language defined by that grammar, and are called **grammatical** sentences. - Sentences that cannot be derived by a given formal grammar are not in the language defined by that grammar and are referred to as **ungrammatical**. - This hard line between "in" and "out" characterizes all formal languages but is only a very simplified model of how natural languages really work. - This is because determining whether a given sentence is part of a given natural language often depends on the context. - In linguistics, the use of formal languages to model natural languages is called **generative grammar** since the language is defined by the set of possible sentences "generated" by the grammar. ### Formal Definition of Context-Free Grammar - A context-free grammar $G$ is defined by four parameters: $N$, $\Sigma$, $R$, $S$ (technically this is a "4-tuple"). - $N$: a set of **non-terminal symbols** (or **variables**) - $\Sigma$: a set of **terminal symbols** (disjoint from $N$) - $R$: a set of **rules** or productions, each of the form $A \rightarrow \beta$, where - $A$ is a non-terminal, - $\beta$ is a string of symbols from the infinite set of strings $(\Sigma \cup N)^*$ - $S$: a designated **start symbol** and a member of $N$ - For the remainder of the section we adhere to the following conventions when discussing the formal properties of context-free grammars: ![](https://i.imgur.com/xLp8uNy.png) - A language is defined through the concept of derivation. One string derives another one if it can be rewritten as the second one by some series of rule applications. - More formally, if $A \rightarrow \beta$ is a production of $R$ and $\alpha$ and $\gamma$ are any strings in the set $(\Sigma \cup N)^*$, then we say that $\alpha A \gamma$ **directly derives** $\alpha \beta \gamma$, or $\alpha A \gamma \Rightarrow \alpha \beta \gamma$ - Derivation is then a generalization of direct derivation: \begin{gather} \text{Let } \; \alpha_1, \alpha_2,..., \alpha_m \text{ be strings in } (\Sigma \cup N)^* \text{, $m \geq 1$, such that} \\ \alpha_1 \Rightarrow \alpha_2, \alpha_2 \Rightarrow \alpha_3,..., \alpha_{m-1} \Rightarrow \alpha_m \\ \text{We say that } \alpha_1 \textbf{ derives } \alpha_m \text{, or } \alpha_1 \Rightarrow^* \alpha_m \end{gather} - We can the formally define the language $\mathscr{L}_G$ generated by a grammar $G$ as the set of strings composed of terminal symbols that can be derived from the designated start symbol $S$. \begin{align} \mathscr{L}_G = \{w|w \text{ is in } \Sigma * \text{ and } S \Rightarrow^* w\} \end{align} - The problem of mapping from a string of words to its parse tress is called **syntactic parsing**. ## Some Grammar Rules for English Consulting a good reference grammar of English is strongly advised, such as [**The Cambridge Grammar of the English Language**](https://www.goodreads.com/book/show/934984.The_Cambridge_Grammar_of_the_English_Language). ### Sentence-Level Constructions - Among the large number of constructions for English sentences, four are particularly common and important: **declaratives**, **imperatives**, **yes-no questions**, and **wh-questions**. - Sentences with **declarative** structure have a subject noun phrase followed by a verb phrase. - Sentences with this structure have a great number of different uses. - Here are a number of example from the ATIS domain: > I want a flight from Ontario to Chicago > The flight should be eleven a.m. tomorrow > The return flight should leave at around seven p.m. - Sentences with **imperative** structure often begin with a verb phrase and have no subject. - They are called imperative because they are almost always used for commands and suggestions. - In the ATIS domain they are commands to the system: > Show the lowest fare > Give me Sunday's flights arriving in Las Vegas from New York City > List all flights between five and seven p.m. - We can model this sentence structure with another rule for the expansion of $S$: \begin{align} S \rightarrow \text{VP} \end{align} - Sentences with **yes-no question** structure begin with an auxiliary verb, followed by a subject $NP$, followed by a $VP$. - They are often (though not always) used to ask questions. - Here are some example; note that the third example is not at all a question but a request (**pragmatic** functions): > Do any of these flights have stops? > Does American's flight eighteen twenty five serve dinner? > Can you give me the same information for United? - Here's the rule: \begin{align} S \rightarrow \text{Aux } \text{ NP } \text{ VP} \end{align} - The most complex sentence-level structures are the various **wh-** structures. - They are so named because one of their constituents is a **wh-phrase**, that is, one that includes a **wh-word**. - *who, whose, when, where, what, which, how, why* - These may be boardly grouped into two classes of sentence-level structures. - The **wh-subject-question** structure is identical to the declarative structure, except that the first noun phrase contains some wh-word. - Here are some examples: > What airlines fly from Burbank to Denver? > Which flights depart Burbank after noon and arrive in Denver by six p.m? > Whose flights serve breakfast? - Here is a rule: \begin{align} S \rightarrow \text{Wh-NP } \text{ VP} \end{align} - In the **wh-non-subject-question** structure, the wh-phrase is not the subject of the sentence, and so the sentence includes another subject. - In these types of sentences the auxiliary appears before the subject $NP$, just as in the yes-no question structures. - Here is an example followed by a sample rule: > What flights do you have from Burbank to Tacoma Washington? \begin{align} S \; \rightarrow \; \text{Wh-NP } \text{ Aux } \text{ NP } \text{ VP} \end{align} - Cosntructions like the **wh-non-subject-question** contain what are called **long-distance dependencies**. - Because the *Wh-NP what flights* is far away from the predicate that it is semantically related to, the main verb *have* in the *VP*. ### Clauses and Sentences - There is more to $S$ than just standing alone as a unit of discourse. - $S$ rules are intended to account for entire sentences that stand alone as fundamental units of discourse. - $S$ can also occur on the right-hand side of grammar rules and hence can be embedded within larger sentences. - What differentiates sentence constructions (i.e., the $S$ rules) from the rest of the grammar is the notion that they are in some sense *complete*. - In this way they correspond to the notion of a **clause**. - Which traditional grammars often describe as forming a complete thought. - One way of making the ntion of "complete thought" more precise is to say $S$ is a node of the parse tree below which the main verb of the $S$ has all of its **arguments**. ### The Noun Phrase - Our $\mathscr{L}_0$ grammar introduced three of the most frequent types of noun phrases that occur in English: pronouns, proper nouns and the $NP \rightarrow \text{Det } \text{ Nominal}$ construction. - We focus on the $NP \rightarrow \text{Det } \text{ Nominal}$ construction since that is where the bulk of the syntactic complexity resides. - These noun phrases consist of a head, the central noun in the noun phrase, along with various modifiers that can occur before or after the head noun. #### The Determiner - Noun phrases can begin with simple lexical determiners: > a stop, the flights, this flight, those flights, any flights, some flights - The role of the determiner can also be filled by more complex expressions: > United's flight > United's pilot's union > Denver's mayor's mother's canceled flight - In these examples, the role of the determiner is filled by a possessive expression consisting of a noun phrase followed by an *'s* as a possessive marker, as in the following rule: \begin{align} Det \rightarrow \text{NP } \text{'s} \end{align} - The fact that this rule is recursive (since an *NP* can start with a *Det* helps us model the last two examples above, - In which a sequence of possessive expression serves as a determiner. - Under some circumtances determiners are optional in English. For example, determiners may be omitted if the noun they modify is plural. > Show me *flights* from San Fransisco to Denver on weekdays - **Mass nouns** also don't require determination. > Does this flight serve *dinner*? #### The Nominal - The nominal construction follows the determiner and contains any pre- and post- head noun modifiers. - As indicated in grammar $\mathscr{L}_0$, in its simplest form a nominal can consist of a single noun. \begin{align} \text{Nominal } \rightarrow \text{ Noun} \end{align} - This rule also provides the basis for the bottom of various recursive rules used to capture more complex nominal constructions. #### Before the Head Noun - A number of different kinds of word classes can appear before the head noun but after the determiner (the "postdeterminers") in a nominal. - These include **cardinal numbers**, **ordinal numbers**, **quantifiers**, and adjectives. - Example of cardinal numbers: > two friends, one stop - Ordinal numbers include *first*, *second*, *third*, and so on, but also words like *next*, *last*, *past*, *other*, and *another*: > the first one, the next day, the second leg > the last flight, the other American flight - Some quantifiers (*many, (a) few, several*) occur only with plural count nouns: > many fares - Adjectives occur after quantifiers but before nouns. > a *first-class* fare, a *non-stop* flight > the *longest* layover, the *earliest* lunch flight - Adjectives can also be grouped into a phrase called an **adjective phrase** or AP. - APs can have an adverb before the adjective: > the *least expensive* fare #### After the Head Noun - A head noun can be followed by postmodifiers. - Three kinds of nominal postmodifiers are common in English: - Prepositional phrases > all flights *from Cleveland* - Non-finite clauses > any flights *arriving after elevent a.m.* - Relative clauses > a flight *that serves breakfast* - Here are some examples of prepositional phrase postmodifiers, with brackets inserted to show the boundaries of each PP; note that two or more PPs can be strung together within a single NP: > all flights *[from Cleveland] [to Newark]* > arrival *[in San Jose] [before seven p.m.]* > a reservation *[on flight six oh six] [from Tampa] [to Montreal]* - Here's a new nominal rule to account for postnominal *PPs*: \begin{align} \text{Nominal } \rightarrow \text{Nominal } \text{ PP} \end{align} - The three most common kinds of non-finite postmodifiers are the gerundive (*-ing*), *-ed*, and infinitive forms. - **Gerundive** postmodifiers consist of a verb phrase that begins with the gerundive (*-ing*) form of the verbs. > any of those *[leaving on Thursday]* > any flights *[arriving after eleven a.m.]* > flights *[arriving within thirty minutes of each other]* - We can define the $Nominals$ with gerundive modifiers as follows, making use of a new non-terminal $GerundVP$: \begin{align} \text{Nominal } \rightarrow \text{ Nominal } \text{ GerundVP} \end{align} - We can make rules for $GerundVP$ constituents by duplicating all of our VP productions, substituting $GerundV$ for $V$. \begin{align} \text{GerundVP } &\rightarrow \text{ GerundV} \\ &| \text{ GerundV NP } | \text{ GerundV NP PP } | \text{ GerundV PP} \\ \end{align} - $GerundV$ can then be defined as \begin{align} \text{GerundV } \rightarrow \text{ being } | \text{ arriving } | \text{ leaving } | \text{ ...} \end{align} - The phrases in italics below are examples of the two other common kinds of non-finite clauses, infinitives and *-ed* forms: > the last flight *to arrive in Boston* > I need to have dinner *served* > Which is the aircraft *used by this flight*? - A postnominal relative clause (more correctly a **restrictive relative clause**), is a clause that often begins with a **relative pronoun** (*that* and *who* are the most common). - The relative pronoun functions as the subject of the embedded verb in the following examples: > a flight *that serves breakfast* > flights *that leave in the morning* > the one *that leaves at ten thirty five* - We might add rules like the following to deal with these: \begin{align} \text{Nominal } &\rightarrow \text{ Nominal RelClause} \\ \text{RelClause } &\rightarrow (\text{who } | \text{ that}) \text{ VP} \end{align} - The relative pronoun may also function as the object of the embedded verb, as in the following example. > the earliest American Airlines flight that I can get - Various postmonimal modifiers can be combined: > a flight *[from Phoenix to Detroit] [leaving Monday evening]* > evening flights *[from Nashville to Houston] [that serve dinner]* > a friend *[living in Denver] [that would like to visit me in DC]* #### Before the Noun Phrase - Word classes that modify and appear before *NPs* are called **predeterminers**. - Many of these have to do with number or amount; a common predeterminer is *all*: > all the flights, all flights, all non-stop flights - The example noun phrase given in *Fig. 12.5* illustrates some of the complexity that arises when these rules are combined. ![](https://i.imgur.com/FMcalTn.png) ### The Verb Phrase - The verb phrase consists of the verb and a number of other constituents. - In the simple rules so far, these other constituents include *NPs* and *PPs* and combinations of the two: \begin{align} \textit{VP } &\rightarrow \textit{ Verb} \quad \text{disappear} \\ \textit{VP } &\rightarrow \textit{ Verb NP} \quad \text{prefer a morning flight} \\ \textit{VP } &\rightarrow \textit{ Verb NP PP} \quad \text{leave Boston in the morning} \\ \textit{VP } &\rightarrow \textit{ Verb PP} \quad \text{leaving on Thursday} \end{align} - Verb phrases can be significantly more complicated. Many other kinds of constituents, such an an entire embedded sentence, can follow the verb. - These are called **sentential complements**. ![](https://i.imgur.com/Crqjf1S.png) - Here's a rule for these: \begin{align} \textit{VP } \rightarrow \textit{ Verb S} \end{align} - Similarly, another potential constituent of the *VP* is another VP. This is often the case for verbs like *want, would like, try, intent, need*: > I want $[_{VP}$ to fly from Milwaukee to Orlando$]$ > Hi, I want $[_{VP}$ to arrange three flights$]$ - While a verb phrase can have many possible kinds of constituents, not every verb is compatible with every verb phrase. - For example, the verb *want* can be used either with an *NP* complement (*I want a flight...*) or with an infinitive *VP* complement (*I want to fly to...*). - By contrast, a verb like *find* cannot take this sort of *VP* complement (**I found to fly to Dallas*). - Traditional grammar distinguishes between - **Transitive** verbs like *find*, which take a direct object *NP* (*I found a flight*), and - **Intransitive** verbs like *disappear*, which do not (**I disappeared a flight*). - We say traditional grammars **subcategorize** verbs into these two categories. - Modern grammars distinguish as many as 100 subcategories. - For example, a verb like *find* **subcategorizes for** an *NP*, and a verb like *want* subcategorizes for either an *NP* or a non-finite *VP*. - We also call these constituents the **complements** of the verb (hence the use of the term **sentential complement**). - We say that *want* can take a *VP* complement. - These possible sets of complements are called the **subcategorization frame** of the verb. - Subcategorization frames for a set of example verbs are given in *Fig. 12.6*. ![](https://i.imgur.com/NrJrWPZ.png) - We can capture the association between verbs and their complements by making separate subtypes of the class Verb: \begin{align} \textit{Verb-with-NP-complement } &\rightarrow \text{ find | leave | repeat | ...} \\ \textit{Verb-with-S-complement } &\rightarrow \text{ think | believe | say | ...} \\ \textit{Verb-with-Inf-VP-complement } &\rightarrow \text{ want | try | need | ...} \end{align} - Each *VP* rule could then be modified to require the appropriate verb subtype: \begin{align} \textit{VP } &\rightarrow \textit{ Verb-with-no-complement } \text{ disappear} \\ \textit{VP } &\rightarrow \textit{ Verb-with-NP-comp NP } \text{ prefer a morning flight} \\ \textit{VP } &\rightarrow \textit{ Verb-with-S-comp S } \text{ said there were two flights} \end{align} - A problem with this approach is the significant increase in the number of rules and the associated loss of generality. ### Coordination - The major phrase types discussed can be conjoined with conjunctions like and, or, and but to form larger constructions of the same type. - For example, a coordinate noun phrase can consist of two other noun phrases separated by a conjunction: > Please repeat $[_{NP}[_{NP}$ the flights$]$ *and* $[_{NP}$ the costs$]]$ > I need to know $[_{NP}[_{NP}$ the aircraft$]$ *and* $[_{NP}$ the flight number$]]$ - Here's a rule that allows these structures: \begin{align} \textit{NP } \rightarrow \textit{ NP and NP} \end{align} - Note that the ability to form coordinate phrases though conjunctions is often used as a test for constituency. - Consider the following examples, which differ from the ones given above in that they lack the second determiner: > Please repeat $[_{Nom}[_{Nom}$ flights$]$ *and* $[_{Nom}$ costs$]]$ > I need to know $[_{Nom}[_{Nom}$ aircraft$]$ *and* $[_{Nom}$ flight number$]]$ - The fact that these phrases can be conjoined is evidence for the presence of the underlying $\textit{Nominal}$ constituent we have been making use of. Here's a rule for this: \begin{align} \textit{Nominal } \rightarrow \textit{ Nominal and Nominal} \end{align} - The following examples illustrate conjunctions involving $\textit{VPs}$ and $\textit{Ss}$: > What flights do you have $[_{VP}[_{VP}$ leaving Denver$]$ *and* $[_{VP}$ arriving in San Francisco$]]$ > $[_S[_S$ I'm interested in a flight from Dallas to Washington$]$ *and* $[_S$ I'm also interested in going to Baltimore$]]$ - The rules for $\textit{VP}$ and $\textit{S}$ conjunctions mirro the $\textit{NP}$ one given above: \begin{align} \textit{VP } &\rightarrow \textit{ VP and VP} \\ \textit{S } &\rightarrow \textit{ S and S} \end{align} - Since all the major phrase types can be conjoined in this fashion, it is also possible to represent this conjuntion fact more generally. - A number of grammar formalisms such as GPSG do this using **metarules** like: \begin{align} \textit{X } \rightarrow \textit{ X and X} \end{align} - This metarule states that any non-terminal can be conjoined with the same non-terminal to yield a constituent of the same type. - The variable $\textit{X}$ must be designated as a variable that stands for any non-terminal rather than a non-terminal itself.