# Constituency Grammars
###### tags: `NLP` `Coref Resolution`
From **[Chapter 12: Constituency Grammars](https://web.stanford.edu/~jurafsky/slp3/12.pdf)** for Sequence Processing in **Speech and Language Processing**, Dan Jurafsky and James h. Martin, *3rd ed.*
- The word **syntax** refers to the way words are arranged together.
- We explore syntactic phenomena that goes beyond simpler approaches, and formal models for capturing them in a computationally useful manner.
- This chapter focuses on context-free grammars, the backbone of many formal models of the syntax of natural language.
- They are powerful enough to express sophisticated relations among the words in a sentence, yet
- Computationally tractable enough that efficient algorithms exist for parsing sentences with them.
## Constituency
- Syntactic constituency is the idea that groups of words can behave as single units, or **constituents**.
- Consider the **noun phrase**, a sequence of words surrounding at least one noun.
- These words form constituents, and one evidence is that they can all appear in similar syntactic environments, for example, *before a verb*.
- On the other hand, while the whole noun phrase can occur before a verb, this is not true of each of the individual words that make up a noun phrase.

- Other kinds of evidence for constituency come from what are called **preposed** or **postposed** constructions.
- For example, the prepositional phrase *on September seventeenth* can be placed in a number of different locations in the following examples, including at the beginning (preposed) or at the end (postposed):
> *On September seventeenth*, I'd like to fly from Atlanta to Denver
> I'd like to fly *on September seventeenth* from Atlanta to Denver
> I'd like to fly from Atlanta to Denver *on September seventeenth*
- But again, while the entire phrase can be placed differently, the individual words making up the phrase cannot be (an asterisk (*) marks fragments that are not grammatical English sentences):
> *++On September++, I'd like to fly ++seventeenth++ from Atlanta to Denver
> *++On++ I'd like to fly ++September seventeenth++ from Atlanta to Denver
> *I'd like to fly ++on September++ from Atlanta to Denver ++seventeenth++
## Context-Free Grammars
- The most widely used formal system for modeling constituent structure in English and other natural languages is the **Context-Free Grammar**, or **CFG**.
- Context-free grammars are also called **Phrase-Structure Grammars**, and the formalism is equivalent to **Backus-Naur From**, or **BNF**.
- A context-free grammar consists of a set of **rules** or **productions**, each of which expresses the ways that symbols of the language can be grouped and ordered together, and a **lexicon** of words and symbols.
- For example, the following productions express that an **NP** (**noun phrase**) can be composed of either a $ProperNoun$ or a determiner ($Det$) followed by a $Nominal$; a $Nominal$ in turn can consist of one or more $Nouns$.
\begin{align}
NP \; &\rightarrow \; Det Nominal \\
NP \; &\rightarrow \; ProperNoun \\
Nominal \; &\rightarrow \; Noun \; | \; Nominal Noun
\end{align}
- Context-free rules can be hierarchically embedded, so we can combine the previous rules with others, like the following, that express facts about the lexicon:
\begin{align}
Det \; &\rightarrow \; a \\
Det \; &\rightarrow \; the \\
Noun \; &\rightarrow \; flight \\
\end{align}
- The symbols that are used in a CFG are divided into two classes.
- The symbols that correspond to words in the language ("the", "nightclub") are called **terminal** symbols; the lexicon is the set of rules that introduce these terminal symbols.
- The symbols that express abstractions over these terminals are called **non-terminals**.
- In each context-free rule,
- The item to the right of the arrow ($\rightarrow$) is an ordered list of one or more terminals and non-terminals;
- To the left of the arrow is a single non-terminal symbol expressing some cluster or generalization.
- The non-terminal associated with each word in the lexicon is its lexical category, or part of speech.
- A CFG can be thought of in two ways: as a device for *generating sentences* and as a device for *assigning a structure to a given sentence*.
- Viewing a CFG as a generator, we can read the $\rightarrow$ as "rewrite the symbol on the left with the string of symbols in the right".

- We say the string *a flight* can be derived from the non-terminal *NP*. Thus, a CFG can be used to generate a set of strings.
- This sequence of rule expansions is called a **derivation** of the string of words.
- Is it common to represent a derivation by a **parse tree** (*Fig. 12.1* shows the tree representation of this derivation).

- In the parse tree shown in Fig. 12.1, we can say that the node *NP* **dominates** all the nodes in the tree (*Det, Nom, Noun, a, flight*).
- We can say further that it immediately dominates the nodes *Det* and *Nom*.
- The formal language defined by a CFG is the set of strings that are derivable from the designated **start symbol**.
- Each grammar must have one designated start symbol, which is often called *S*.
- *S* is usually interpreted as the "sentence" node.
- The set of strings that are derivable from *S* is the set of sentences in some simplified version of English.
- We add a few additional rules to the inventory:
- The following rule expresses the fact that a sentence can consist of a noun phrase followed by a **verb phrase**:
\begin{align}
S \; \rightarrow \; \text{NP} \; \text{VP} \quad \text{I prefer a morning flight}
\end{align}
- A verb phrase in English consists of a verb followed by assorted other things.
- For example, one kind of verb phrase consists of a verb followed by a noun phrase:
\begin{align}
VP \; \rightarrow \; \text{Verb} \; \text{NP} \quad \text{prefer a morning flight}
\end{align}
- Or the verb may be followed by a noun phrase and a prepositional phrase:
\begin{align}
VP \; \rightarrow \; \text{Verb} \; \text{NP} \; \text{PP} \quad \text{leave Boston in the morning}
\end{align}
- Or the verb phrase may have a verb followed by a prepositional phrase alone:
\begin{align}
VP \; \rightarrow \; \text{Verb} \; \text{PP} \quad \text{leaving on Thursday}
\end{align}
- A prepositional phrase generally has a preposition followed by a noun phrase.
- For example, a common type of prepositional phrase in the ATIS corpus is used to indicate location or direction:
\begin{align}
PP \; \rightarrow \; \text{Preposition} \; \text{NP} \quad \text{from Los Angeles}
\end{align}
- The *NP* inside a *PP* need not be a location; *PPs* are often used with times and dates, and with other nouns as well; they can be arbitrarily complex:

- *Fig. 12.2* gives a sample lexicon, and *Fig. 12.3* summarizes the grammar rules we've seen so far, which we'll call $\mathscr{L}_0$.
- Note that we can use the or-symbol $|$ to indicate that a non-terminal has alternate possible expansions.

- We can use this grammar to generate sentences of this "ATIS-language".
- We start with *S*, expand it to *NP* *VP*, then choose a random expansion of *NP* (i.e, to *I*), and a random expansion of *VP* (i.e. *Verb NP*), and so on until we generate the string *I prefer a morning flight*.
- *Fig. 12.4* shows a parse tree that represents a complete derivation of *I prefer a morning flight*.

- We can also represent a parse tree in a more compact format called bracketed notation.
- Here's the bracketed representation of the parse tree of *Fig. 12.4*:

- A CFG like that of $\mathscr{L}_0$ defines a formal language. A formal language is a set of strings.
- Sentences (strings of words) that can be derived by a grammar are in the formal language defined by that grammar, and are called **grammatical** sentences.
- Sentences that cannot be derived by a given formal grammar are not in the language defined by that grammar and are referred to as **ungrammatical**.
- This hard line between "in" and "out" characterizes all formal languages but is only a very simplified model of how natural languages really work.
- This is because determining whether a given sentence is part of a given natural language often depends on the context.
- In linguistics, the use of formal languages to model natural languages is called **generative grammar** since the language is defined by the set of possible sentences "generated" by the grammar.
### Formal Definition of Context-Free Grammar
- A context-free grammar $G$ is defined by four parameters: $N$, $\Sigma$, $R$, $S$ (technically this is a "4-tuple").
- $N$: a set of **non-terminal symbols** (or **variables**)
- $\Sigma$: a set of **terminal symbols** (disjoint from $N$)
- $R$: a set of **rules** or productions, each of the form $A \rightarrow \beta$, where
- $A$ is a non-terminal,
- $\beta$ is a string of symbols from the infinite set of strings $(\Sigma \cup N)^*$
- $S$: a designated **start symbol** and a member of $N$
- For the remainder of the section we adhere to the following conventions when discussing the formal properties of context-free grammars:

- A language is defined through the concept of derivation. One string derives another one if it can be rewritten as the second one by some series of rule applications.
- More formally, if $A \rightarrow \beta$ is a production of $R$ and $\alpha$ and $\gamma$ are any strings in the set $(\Sigma \cup N)^*$, then we say that $\alpha A \gamma$ **directly derives** $\alpha \beta \gamma$, or $\alpha A \gamma \Rightarrow \alpha \beta \gamma$
- Derivation is then a generalization of direct derivation:
\begin{gather}
\text{Let } \; \alpha_1, \alpha_2,..., \alpha_m \text{ be strings in } (\Sigma \cup N)^* \text{, $m \geq 1$, such that} \\
\alpha_1 \Rightarrow \alpha_2, \alpha_2 \Rightarrow \alpha_3,..., \alpha_{m-1} \Rightarrow \alpha_m \\
\text{We say that } \alpha_1 \textbf{ derives } \alpha_m \text{, or } \alpha_1 \Rightarrow^* \alpha_m
\end{gather}
- We can the formally define the language $\mathscr{L}_G$ generated by a grammar $G$ as the set of strings composed of terminal symbols that can be derived from the designated start symbol $S$.
\begin{align}
\mathscr{L}_G = \{w|w \text{ is in } \Sigma * \text{ and } S \Rightarrow^* w\}
\end{align}
- The problem of mapping from a string of words to its parse tress is called **syntactic parsing**.
## Some Grammar Rules for English
Consulting a good reference grammar of English is strongly advised, such as [**The Cambridge Grammar of the English Language**](https://www.goodreads.com/book/show/934984.The_Cambridge_Grammar_of_the_English_Language).
### Sentence-Level Constructions
- Among the large number of constructions for English sentences, four are particularly common and important: **declaratives**, **imperatives**, **yes-no questions**, and **wh-questions**.
- Sentences with **declarative** structure have a subject noun phrase followed by a verb phrase.
- Sentences with this structure have a great number of different uses.
- Here are a number of example from the ATIS domain:
> I want a flight from Ontario to Chicago
> The flight should be eleven a.m. tomorrow
> The return flight should leave at around seven p.m.
- Sentences with **imperative** structure often begin with a verb phrase and have no subject.
- They are called imperative because they are almost always used for commands and suggestions.
- In the ATIS domain they are commands to the system:
> Show the lowest fare
> Give me Sunday's flights arriving in Las Vegas from New York City
> List all flights between five and seven p.m.
- We can model this sentence structure with another rule for the expansion of $S$:
\begin{align}
S \rightarrow \text{VP}
\end{align}
- Sentences with **yes-no question** structure begin with an auxiliary verb, followed by a subject $NP$, followed by a $VP$.
- They are often (though not always) used to ask questions.
- Here are some example; note that the third example is not at all a question but a request (**pragmatic** functions):
> Do any of these flights have stops?
> Does American's flight eighteen twenty five serve dinner?
> Can you give me the same information for United?
- Here's the rule:
\begin{align}
S \rightarrow \text{Aux } \text{ NP } \text{ VP}
\end{align}
- The most complex sentence-level structures are the various **wh-** structures.
- They are so named because one of their constituents is a **wh-phrase**, that is, one that includes a **wh-word**.
- *who, whose, when, where, what, which, how, why*
- These may be boardly grouped into two classes of sentence-level structures.
- The **wh-subject-question** structure is identical to the declarative structure, except that the first noun phrase contains some wh-word.
- Here are some examples:
> What airlines fly from Burbank to Denver?
> Which flights depart Burbank after noon and arrive in Denver by six p.m?
> Whose flights serve breakfast?
- Here is a rule:
\begin{align}
S \rightarrow \text{Wh-NP } \text{ VP}
\end{align}
- In the **wh-non-subject-question** structure, the wh-phrase is not the subject of the sentence, and so the sentence includes another subject.
- In these types of sentences the auxiliary appears before the subject $NP$, just as in the yes-no question structures.
- Here is an example followed by a sample rule:
> What flights do you have from Burbank to Tacoma Washington?
\begin{align}
S \; \rightarrow \; \text{Wh-NP } \text{ Aux } \text{ NP } \text{ VP}
\end{align}
- Cosntructions like the **wh-non-subject-question** contain what are called **long-distance dependencies**.
- Because the *Wh-NP what flights* is far away from the predicate that it is semantically related to, the main verb *have* in the *VP*.
### Clauses and Sentences
- There is more to $S$ than just standing alone as a unit of discourse.
- $S$ rules are intended to account for entire sentences that stand alone as fundamental units of discourse.
- $S$ can also occur on the right-hand side of grammar rules and hence can be embedded within larger sentences.
- What differentiates sentence constructions (i.e., the $S$ rules) from the rest of the grammar is the notion that they are in some sense *complete*.
- In this way they correspond to the notion of a **clause**.
- Which traditional grammars often describe as forming a complete thought.
- One way of making the ntion of "complete thought" more precise is to say $S$ is a node of the parse tree below which the main verb of the $S$ has all of its **arguments**.
### The Noun Phrase
- Our $\mathscr{L}_0$ grammar introduced three of the most frequent types of noun phrases that occur in English: pronouns, proper nouns and the $NP \rightarrow \text{Det } \text{ Nominal}$ construction.
- We focus on the $NP \rightarrow \text{Det } \text{ Nominal}$ construction since that is where the bulk of the syntactic complexity resides.
- These noun phrases consist of a head, the central noun in the noun phrase, along with various modifiers that can occur before or after the head noun.
#### The Determiner
- Noun phrases can begin with simple lexical determiners:
> a stop, the flights, this flight, those flights, any flights, some flights
- The role of the determiner can also be filled by more complex expressions:
> United's flight
> United's pilot's union
> Denver's mayor's mother's canceled flight
- In these examples, the role of the determiner is filled by a possessive expression consisting of a noun phrase followed by an *'s* as a possessive marker, as in the following rule:
\begin{align}
Det \rightarrow \text{NP } \text{'s}
\end{align}
- The fact that this rule is recursive (since an *NP* can start with a *Det* helps us model the last two examples above,
- In which a sequence of possessive expression serves as a determiner.
- Under some circumtances determiners are optional in English. For example, determiners may be omitted if the noun they modify is plural.
> Show me *flights* from San Fransisco to Denver on weekdays
- **Mass nouns** also don't require determination.
> Does this flight serve *dinner*?
#### The Nominal
- The nominal construction follows the determiner and contains any pre- and post- head noun modifiers.
- As indicated in grammar $\mathscr{L}_0$, in its simplest form a nominal can consist of a single noun.
\begin{align}
\text{Nominal } \rightarrow \text{ Noun}
\end{align}
- This rule also provides the basis for the bottom of various recursive rules used to capture more complex nominal constructions.
#### Before the Head Noun
- A number of different kinds of word classes can appear before the head noun but after the determiner (the "postdeterminers") in a nominal.
- These include **cardinal numbers**, **ordinal numbers**, **quantifiers**, and adjectives.
- Example of cardinal numbers:
> two friends, one stop
- Ordinal numbers include *first*, *second*, *third*, and so on, but also words like *next*, *last*, *past*, *other*, and *another*:
> the first one, the next day, the second leg
> the last flight, the other American flight
- Some quantifiers (*many, (a) few, several*) occur only with plural count nouns:
> many fares
- Adjectives occur after quantifiers but before nouns.
> a *first-class* fare, a *non-stop* flight
> the *longest* layover, the *earliest* lunch flight
- Adjectives can also be grouped into a phrase called an **adjective phrase** or AP.
- APs can have an adverb before the adjective:
> the *least expensive* fare
#### After the Head Noun
- A head noun can be followed by postmodifiers.
- Three kinds of nominal postmodifiers are common in English:
- Prepositional phrases
> all flights *from Cleveland*
- Non-finite clauses
> any flights *arriving after elevent a.m.*
- Relative clauses
> a flight *that serves breakfast*
- Here are some examples of prepositional phrase postmodifiers, with brackets inserted to show the boundaries of each PP; note that two or more PPs can be strung together within a single NP:
> all flights *[from Cleveland] [to Newark]*
> arrival *[in San Jose] [before seven p.m.]*
> a reservation *[on flight six oh six] [from Tampa] [to Montreal]*
- Here's a new nominal rule to account for postnominal *PPs*:
\begin{align}
\text{Nominal } \rightarrow \text{Nominal } \text{ PP}
\end{align}
- The three most common kinds of non-finite postmodifiers are the gerundive (*-ing*), *-ed*, and infinitive forms.
- **Gerundive** postmodifiers consist of a verb phrase that begins with the gerundive (*-ing*) form of the verbs.
> any of those *[leaving on Thursday]*
> any flights *[arriving after eleven a.m.]*
> flights *[arriving within thirty minutes of each other]*
- We can define the $Nominals$ with gerundive modifiers as follows, making use of a new non-terminal $GerundVP$:
\begin{align}
\text{Nominal } \rightarrow \text{ Nominal } \text{ GerundVP}
\end{align}
- We can make rules for $GerundVP$ constituents by duplicating all of our VP productions, substituting $GerundV$ for $V$.
\begin{align}
\text{GerundVP } &\rightarrow \text{ GerundV} \\
&| \text{ GerundV NP } | \text{ GerundV NP PP } | \text{ GerundV PP} \\
\end{align}
- $GerundV$ can then be defined as
\begin{align}
\text{GerundV } \rightarrow \text{ being } | \text{ arriving } | \text{ leaving } | \text{ ...}
\end{align}
- The phrases in italics below are examples of the two other common kinds of non-finite clauses, infinitives and *-ed* forms:
> the last flight *to arrive in Boston*
> I need to have dinner *served*
> Which is the aircraft *used by this flight*?
- A postnominal relative clause (more correctly a **restrictive relative clause**), is a clause that often begins with a **relative pronoun** (*that* and *who* are the most common).
- The relative pronoun functions as the subject of the embedded verb in the following examples:
> a flight *that serves breakfast*
> flights *that leave in the morning*
> the one *that leaves at ten thirty five*
- We might add rules like the following to deal with these:
\begin{align}
\text{Nominal } &\rightarrow \text{ Nominal RelClause} \\
\text{RelClause } &\rightarrow (\text{who } | \text{ that}) \text{ VP}
\end{align}
- The relative pronoun may also function as the object of the embedded verb, as in the following example.
> the earliest American Airlines flight that I can get
- Various postmonimal modifiers can be combined:
> a flight *[from Phoenix to Detroit] [leaving Monday evening]*
> evening flights *[from Nashville to Houston] [that serve dinner]*
> a friend *[living in Denver] [that would like to visit me in DC]*
#### Before the Noun Phrase
- Word classes that modify and appear before *NPs* are called **predeterminers**.
- Many of these have to do with number or amount; a common predeterminer is *all*:
> all the flights, all flights, all non-stop flights
- The example noun phrase given in *Fig. 12.5* illustrates some of the complexity that arises when these rules are combined.

### The Verb Phrase
- The verb phrase consists of the verb and a number of other constituents.
- In the simple rules so far, these other constituents include *NPs* and *PPs* and combinations of the two:
\begin{align}
\textit{VP } &\rightarrow \textit{ Verb} \quad \text{disappear} \\
\textit{VP } &\rightarrow \textit{ Verb NP} \quad \text{prefer a morning flight} \\
\textit{VP } &\rightarrow \textit{ Verb NP PP} \quad \text{leave Boston in the morning} \\
\textit{VP } &\rightarrow \textit{ Verb PP} \quad \text{leaving on Thursday}
\end{align}
- Verb phrases can be significantly more complicated. Many other kinds of constituents, such an an entire embedded sentence, can follow the verb.
- These are called **sentential complements**.

- Here's a rule for these:
\begin{align}
\textit{VP } \rightarrow \textit{ Verb S}
\end{align}
- Similarly, another potential constituent of the *VP* is another VP. This is often the case for verbs like *want, would like, try, intent, need*:
> I want $[_{VP}$ to fly from Milwaukee to Orlando$]$
> Hi, I want $[_{VP}$ to arrange three flights$]$
- While a verb phrase can have many possible kinds of constituents, not every verb is compatible with every verb phrase.
- For example, the verb *want* can be used either with an *NP* complement (*I want a flight...*) or with an infinitive *VP* complement (*I want to fly to...*).
- By contrast, a verb like *find* cannot take this sort of *VP* complement (**I found to fly to Dallas*).
- Traditional grammar distinguishes between
- **Transitive** verbs like *find*, which take a direct object *NP* (*I found a flight*), and
- **Intransitive** verbs like *disappear*, which do not (**I disappeared a flight*).
- We say traditional grammars **subcategorize** verbs into these two categories.
- Modern grammars distinguish as many as 100 subcategories.
- For example, a verb like *find* **subcategorizes for** an *NP*, and a verb like *want* subcategorizes for either an *NP* or a non-finite *VP*.
- We also call these constituents the **complements** of the verb (hence the use of the term **sentential complement**).
- We say that *want* can take a *VP* complement.
- These possible sets of complements are called the **subcategorization frame** of the verb.
- Subcategorization frames for a set of example verbs are given in *Fig. 12.6*.

- We can capture the association between verbs and their complements by making separate subtypes of the class Verb:
\begin{align}
\textit{Verb-with-NP-complement } &\rightarrow \text{ find | leave | repeat | ...} \\
\textit{Verb-with-S-complement } &\rightarrow \text{ think | believe | say | ...} \\
\textit{Verb-with-Inf-VP-complement } &\rightarrow \text{ want | try | need | ...}
\end{align}
- Each *VP* rule could then be modified to require the appropriate verb subtype:
\begin{align}
\textit{VP } &\rightarrow \textit{ Verb-with-no-complement } \text{ disappear} \\
\textit{VP } &\rightarrow \textit{ Verb-with-NP-comp NP } \text{ prefer a morning flight} \\
\textit{VP } &\rightarrow \textit{ Verb-with-S-comp S } \text{ said there were two flights}
\end{align}
- A problem with this approach is the significant increase in the number of rules and the associated loss of generality.
### Coordination
- The major phrase types discussed can be conjoined with conjunctions like and, or, and but to form larger constructions of the same type.
- For example, a coordinate noun phrase can consist of two other noun phrases separated by a conjunction:
> Please repeat $[_{NP}[_{NP}$ the flights$]$ *and* $[_{NP}$ the costs$]]$
> I need to know $[_{NP}[_{NP}$ the aircraft$]$ *and* $[_{NP}$ the flight number$]]$
- Here's a rule that allows these structures:
\begin{align}
\textit{NP } \rightarrow \textit{ NP and NP}
\end{align}
- Note that the ability to form coordinate phrases though conjunctions is often used as a test for constituency.
- Consider the following examples, which differ from the ones given above in that they lack the second determiner:
> Please repeat $[_{Nom}[_{Nom}$ flights$]$ *and* $[_{Nom}$ costs$]]$
> I need to know $[_{Nom}[_{Nom}$ aircraft$]$ *and* $[_{Nom}$ flight number$]]$
- The fact that these phrases can be conjoined is evidence for the presence of the underlying $\textit{Nominal}$ constituent we have been making use of. Here's a rule for this:
\begin{align}
\textit{Nominal } \rightarrow \textit{ Nominal and Nominal}
\end{align}
- The following examples illustrate conjunctions involving $\textit{VPs}$ and $\textit{Ss}$:
> What flights do you have $[_{VP}[_{VP}$ leaving Denver$]$ *and* $[_{VP}$ arriving in San Francisco$]]$
> $[_S[_S$ I'm interested in a flight from Dallas to Washington$]$ *and* $[_S$ I'm also interested in going to Baltimore$]]$
- The rules for $\textit{VP}$ and $\textit{S}$ conjunctions mirro the $\textit{NP}$ one given above:
\begin{align}
\textit{VP } &\rightarrow \textit{ VP and VP} \\
\textit{S } &\rightarrow \textit{ S and S}
\end{align}
- Since all the major phrase types can be conjoined in this fashion, it is also possible to represent this conjuntion fact more generally.
- A number of grammar formalisms such as GPSG do this using **metarules** like:
\begin{align}
\textit{X } \rightarrow \textit{ X and X}
\end{align}
- This metarule states that any non-terminal can be conjoined with the same non-terminal to yield a constituent of the same type.
- The variable $\textit{X}$ must be designated as a variable that stands for any non-terminal rather than a non-terminal itself.