Last class, we decided that the ATM will not parse the sentence "The teacher wanted to read a book." The reason is that there is no prepositional phrase with a noun in it, so the parse dies on "to read."
The subject of "wanted" is "the teacher," and the subject of "to read" is also "the teacher." But we would not say "the teacher wanted the teacher to read a book" because it is repetitive. Many sentences in English follow this pattern.
The teacher wanted the students to read a book.
The direct object of "wanted" is the entire phrase "the students to read a book." The direct object is what is wanted.
Additions to the ATM
What would you have to do to the ATM to handle a sentence of this nature? You would have to complicate the ATM! Dr. Gomez gave out the last handout with the updated ATM.
There are three new arcs on this new sheet, but we are only interested in 17 and 18 (13 will not be included on the final exam). Arcs 17 and 18 use z-slash notation, which means making a recursive call to z (that is, calling the entire ATM recursively). There are no conditions on 17; it is taken non-deterministically. There is, however, an initialization: some registers are initialized when the recursive call is made. The action is taken after the recursive call returns to the point where it was made.
Trace (The teacher wanted to read a book)
SUBJECT(the teacher)
MAIN-VERB(wanted)
Arcs 3, 4, 5, and 8 all fail.
Recurse-17 {
SUBJECT1(the teacher)
MAIN-VERB1(read)
DO1(a book)
JUMP
SEND
}
DO(SUBJECT1, MAIN-VERB1, DO1)
Trace (The teacher wanted the students to read a book)
SUBJECT(the teacher)
MAIN-VERB(wanted)
DO(the students)
Recurse-18 {
SUBJECT1(the students)
MAIN-VERB1(read)
DO1(a book)
}
IO(the students)
DO(SUBJECT1, MAIN-VERB1, DO1)
Note: This works only with arc 18, not with arc 17.
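The two traces above can be reproduced with a very small sketch. This is not the ATM from the handout: the input is assumed to be already chunked into phrases, the category labels (NP, V, TO) and the function names are made up for illustration, and the choice between arcs 17 and 18 is made deterministically here (no object seen means 17, object seen means 18), whereas the real arc 17 is taken non-deterministically. What it does try to show is the pattern from the notes: registers are initialized on the recursive call, and the action runs after the call returns.

```python
def category(chunks, pos):
    """Category of the chunk at pos, or None past the end of the input."""
    return chunks[pos][0] if pos < len(chunks) else None


def parse_clause(chunks, pos=0, init=None):
    """Return (registers, next_pos), or None if the clause does not parse."""
    regs = dict(init or {})            # registers initialized by the caller
    if "SUBJECT" not in regs:          # top level only: read the subject
        if category(chunks, pos) != "NP":
            return None
        regs["SUBJECT"] = chunks[pos][1]
        pos += 1
    if category(chunks, pos) != "V":
        return None
    regs["MAIN-VERB"] = chunks[pos][1]
    pos += 1
    if category(chunks, pos) == "NP":
        regs["DO"] = chunks[pos][1]
        pos += 1
    if category(chunks, pos) == "TO":
        # Arc 17 (no object seen): the embedded subject is the outer subject.
        # Arc 18 (object seen): the embedded subject is that object.
        embedded_subject = regs.get("DO", regs["SUBJECT"])
        result = parse_clause(chunks, pos + 1, {"SUBJECT": embedded_subject})
        if result is None:
            return None
        sub_regs, pos = result
        # Actions taken after we come back from the recursive call.
        if "DO" in regs:
            regs["IO"] = regs["DO"]
        regs["DO"] = sub_regs
    return regs, pos


def parse(chunks):
    """Parse a whole sentence; succeed only if every chunk is consumed."""
    result = parse_clause(chunks)
    if result is None or result[1] != len(chunks):
        return None
    return result[0]


s1 = [("NP", "the teacher"), ("V", "wanted"), ("TO", "to"),
      ("V", "read"), ("NP", "a book")]
s2 = [("NP", "the teacher"), ("V", "wanted"), ("NP", "the students"),
      ("TO", "to"), ("V", "read"), ("NP", "a book")]
print(parse(s1))
print(parse(s2))
```

In the second sentence "the students" is first stored in DO, then moved to IO once the recursive call returns, which mirrors the order of the steps in the second trace.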
A relative clause is a clause that describes a noun. For example, in "the person who went to the store," the relative clause is "who went to the store." Differentiating between "that" and "who" is complicated.
Stricter rules make the computer's life easier, but many people make the mistake of using "that" instead of "who" when referring to a human. Humans can use knowledge to work out whether the "that" qualifies a person or a thing, but it is much harder for the computer to do so.
"The horse raced past the barn fell" means the same as "The horse which was raced past the barn fell." The human parse breaks down at the participle "raced," which is first read as the main verb. This is an example of why you cannot build a deterministic parser for English; you need to be able to back out. The human parser is not parallel, i.e. "raced" can be used two ways, but you cannot parse both at the same time.
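A minimal sketch of "backing out," assuming a toy lexicon and grammar invented for this one sentence (none of it comes from the handout). Each function yields every way it can match, in a fixed order, so the main-verb reading of "raced" is tried first and abandoned when "fell" is left over; only then is the participle reading tried. The counter of dead ends is just an illustration of how much work is thrown away.

```python
LEXICON = {
    "the": {"Det"}, "horse": {"N"}, "barn": {"N"},
    "raced": {"V", "VPART"},      # main verb OR passive participle
    "past": {"P"}, "fell": {"V"},
}


def word(cat, words, i):
    """Match one word of the given category at position i."""
    if i < len(words) and cat in LEXICON.get(words[i], ()):
        yield (cat, words[i]), i + 1


def simple_np(words, i):
    for det, j in word("Det", words, i):
        for noun, k in word("N", words, j):
            yield ("NP", det, noun), k


def pp(words, i):
    for prep, j in word("P", words, i):
        for obj, k in simple_np(words, j):
            yield ("PP", prep, obj), k


def np(words, i):
    for base, j in simple_np(words, i):
        yield base, j                           # reading 1: plain noun phrase
        for part, k in word("VPART", words, j):
            for mod, m in pp(words, k):         # reading 2: reduced relative clause
                yield ("NP", base, ("RRC", part, mod)), m


def vp(words, i):
    for verb, j in word("V", words, i):
        for mod, k in pp(words, j):
            yield ("VP", verb, mod), k
        yield ("VP", verb), j


def parse(words):
    """Return (tree, dead_ends); tree is None if no reading uses every word."""
    dead_ends = 0
    for subj, j in np(words, 0):
        for pred, k in vp(words, j):
            if k == len(words):
                return ("S", subj, pred), dead_ends
            dead_ends += 1                      # back out of this partial parse
    return None, dead_ends


tree, dead_ends = parse("the horse raced past the barn fell".split())
print(dead_ends)   # 2 partial parses abandoned before the one that works
print(tree)
```

With "the horse raced past the barn" the first reading succeeds and nothing is abandoned, which is why the shorter sentence feels effortless.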
Relative clauses occur in every language (Chomsky referred to this as universal linguistics).
The teacher the students like left.
The direct object of "like" is "the teacher."
The teacher the students the children admire like left.
A computer will handle this sentence easily, but humans cannot. According to the formalism, the grammar is perfect. According to human users, the grammar is weird. The reason is that we can keep very few items in our "stack," so our stack "overflows" very easily. We see that it overflows when it holds three items! This only happens when you have self-embedded sentences (recursion).
We can parse a sentence like "The cat that ate the mouse that ate the cheese that came from Spain left." This is because each embedded sentence has its direct object attached.
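One rough way to see the difference between the two sentences is to count how many subjects are still waiting for their verb at any moment. The sketch below assumes the words have been tagged by hand (the tags SUBJ, VERB, and OTHER are invented here), and it treats "that" as the subject of its own clause. It is only a toy model of the "stack," not a claim about how the human parser actually works.

```python
def max_pending_subjects(tagged):
    """Largest number of subjects still waiting for a verb at any one time."""
    depth = max_depth = 0
    for _, tag in tagged:
        if tag == "SUBJ":
            depth += 1                      # push: this subject needs a verb
            max_depth = max(max_depth, depth)
        elif tag == "VERB":
            depth -= 1                      # pop: the most recent subject is done
    return max_depth


self_embedded = [
    ("the teacher", "SUBJ"), ("the students", "SUBJ"), ("the children", "SUBJ"),
    ("admire", "VERB"), ("like", "VERB"), ("left", "VERB"),
]
chained = [
    ("the cat", "SUBJ"), ("that", "SUBJ"), ("ate", "VERB"), ("the mouse", "OTHER"),
    ("that", "SUBJ"), ("ate", "VERB"), ("the cheese", "OTHER"),
    ("that", "SUBJ"), ("came", "VERB"), ("from Spain", "OTHER"), ("left", "VERB"),
]
print(max_pending_subjects(self_embedded))   # 3
print(max_pending_subjects(chained))         # 2
```

Under these assumptions the self-embedded sentence reaches three pending subjects, while the much longer chained sentence never goes past two, because each embedded clause is finished before the next one starts.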
You can use commas to separate main clauses from subordinate clauses ("While I was reading, I got sick"), to delimit a list ("apples, oranges, and monkeys"), and to set off phrases in apposition, as in "Giraffes, African mammals, like green valleys."
Comparative phrases add complexity. There is a big difference between "My children eat more apples than yours" and "My children eat more apples than oranges." We are using semantic knowledge to eliminate a tremendous combinatorial explosion. When you try to parse these using only syntax, you wind up with ambiguity: more than one syntactically correct parse.
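A small sketch of that idea, under heavy assumptions: the two sets of semantic features below are invented for these two sentences, and the "readings" are just paraphrases rather than real parse trees. Syntax alone allows the phrase after "than" to be compared either with the subject or with the object; the semantic check throws away the reading that makes no sense.

```python
# Hypothetical semantic features; a real system would need a far larger lexicon.
CAN_EAT = {"my children", "yours"}       # things that can do the eating
CAN_BE_EATEN = {"apples", "oranges"}     # things that can be eaten


def syntactic_readings(subject, obj, than):
    """Both readings syntax alone allows for 'SUBJECT eat more OBJ than THAN'."""
    return [
        ("compare eaters", f"{subject} eat more {obj} than {than} eat"),
        ("compare foods", f"{subject} eat more {obj} than {subject} eat {than}"),
    ]


def plausible_readings(subject, obj, than):
    """Keep only the readings whose roles make semantic sense."""
    kept = []
    for kind, paraphrase in syntactic_readings(subject, obj, than):
        if kind == "compare eaters" and than in CAN_EAT:
            kept.append((kind, paraphrase))
        elif kind == "compare foods" and than in CAN_BE_EATEN:
            kept.append((kind, paraphrase))
    return kept


print(plausible_readings("my children", "apples", "yours"))
print(plausible_readings("my children", "apples", "oranges"))
```

Each sentence starts with two syntactic readings and ends with one. With several ambiguous phrases in a sentence the number of purely syntactic readings multiplies, which is the explosion the semantic knowledge prevents.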