LING 581: Advanced Computational Linguistics Lecture Notes January 30th
Mar 31, 2015
Relative clause constructions
• Terminology
  – gap (__): indicates where the head of the construction is interpreted
  – Subject RC: the man (that|who) __ saw me
  – Object RC: the man (that|who) I saw __
  – Subject and object RCs can appear in subject and object positions freely:
    • The man that saw me left the room
    • The man that I saw left the room
    • I saw the man that saw me
    • I again saw the man that I saw
Note: the relative pronoun is that/who/which
Relative clause constructions
• Terminology contd.:
  – Infinitival/untensed vs. tensed
    • John saw Mary (tensed)
    • John sees Mary (tensed)
    • John to see Mary (untensed)
  – In RC constructions:
    • the man to see Mary
    • a person to see
    • a time to go see Mary
Note: subject is always missing… But it’s not always the RC gap
Relative clause constructions
• Terminology contd.:
  – Zero refers to a missing relative pronoun
  – Zero RCs:
    • the man I saw (tensed)
    • the man to see (untensed)
  – *Zero:
    • *the man saw me / the man who saw me
    • *the man was seen by me / the man who was seen by me
    • The horse raced past the barn fell
  – must be zero:
    • *a person that to see
    • *the man that to see Mary
Homework Exercise
                      Subject    Non-Subject
Tensed relatives
Untensed relatives

Frequency counts
                      that    which/who/what/when/where    zero
Tensed relatives
Homework Exercise Review
• Use tregex to search for relative clauses as defined in Parsing Guidelines section 4.2.2:
2. zero relative clauses
Homework Exercise Review
• Use tregex to search for relative clauses as defined in Parsing Guidelines section 4.2.2:
3. infinitival relative clauses
Homework Exercise Review
• From page 17:
Homework Exercise Review
• Use tregex to search for relative clauses as defined in Bracketing Guidelines (prsguid1.pdf) section 4.2.2:
1. wh- and that- relative clauses
Two subtypes:
  WHNP     NP-trace
  WHADVP   ADVP-trace
Note: the format in the guide doesn’t always match exactly with WSJ trees … -NONE-
Homework Exercise Review
• Use tregex to search for relative clauses as defined in Bracketing Guidelines (prsguid1.pdf) section 4.2.2:
1. wh- and that- relative clauses

    Matches  Pattern
1.  11598    @NP < NP < SBAR
2.  9028     @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)
3.  9028     @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i << (@NP < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))
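As an illustration of what the third pattern checks, here is a minimal Python sketch (plain Python, not tregex). It looks for NP < NP < (SBAR < WHNP-n) and then tests whether the SBAR contains an NP whose empty element is the co-indexed trace *T*-n. The tree is a hand-built, Treebank-style bracketing of the subject RC "the man who saw me", not an actual WSJ tree, and the names parse, subtrees, basic, has_np_trace and wh_relatives are made up for this example.

```python
import re

def parse(s):
    """Parse one bracketed tree into (label, children) tuples; leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                          # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                          # skip ")"
        return (label, children)

    return node()

def subtrees(t):
    """Yield t and every phrasal or preterminal node below it."""
    yield t
    for child in t[1]:
        if isinstance(child, tuple):
            yield from subtrees(child)

def basic(label):
    """Drop function tags and indices (NP-SBJ-1 -> NP), roughly what @ does in tregex."""
    return label if label.startswith("-") else re.split(r"[-=]", label)[0]

def has_np_trace(sbar, idx):
    """True if some NP inside sbar has a -NONE- child spelled *T*-idx."""
    for label, kids in subtrees(sbar):
        if basic(label) != "NP":
            continue
        for k in kids:
            if isinstance(k, tuple) and k[0] == "-NONE-" and k[1] == [f"*T*-{idx}"]:
                return True
    return False

def wh_relatives(tree):
    """Return (wh_index, trace_found) for each NP < NP < (SBAR < WHNP-n)."""
    found = []
    for label, kids in subtrees(tree):
        nodes = [k for k in kids if isinstance(k, tuple)]
        if basic(label) != "NP" or not any(k[0] == "NP" for k in nodes):
            continue
        for sbar in nodes:
            if sbar[0] != "SBAR":
                continue
            for wh in sbar[1]:
                m = re.match(r"^WHNP-([0-9]+)$", wh[0]) if isinstance(wh, tuple) else None
                if m:
                    found.append((m.group(1), has_np_trace(sbar, m.group(1))))
    return found

# Hypothetical example tree (assumption: hand-written, Treebank-style).
EXAMPLE = ("(NP (NP (DT the) (NN man)) (SBAR (WHNP-1 (WP who)) "
           "(S (NP-SBJ (-NONE- *T*-1)) (VP (VBD saw) (NP (PRP me))))))")

print(wh_relatives(parse(EXAMPLE)))
```

The second element of each pair separates trees that only satisfy the second pattern from those that also satisfy the third; on the WSJ data the two counts on the slide coincide.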
Homework Exercise Review
• Browsing through the matches and refining the search is always a good idea …
to see what we have inadvertently picked up or have not thought of
Homework Exercise Review
• Note: 2nd matching tree has an intervening PP:
Homework Exercise Review
• Note: 5th matching tree has an intervening PP:
Note: intervening punctuation is also common
The plant, which is owned by Hollingsworth & Vose Co., was under contract …
Homework Exercise Review
11598 @NP < NP < SBAR
9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)
Note: the SBAR from NP-SBJ was extraposed to the VP
Note: *ICH* non-subject relative clause
Homework Exercise Review
11598 @NP < NP < SBAR
9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)
This is NOT a relative clause construction!
Homework Exercise Review
11598 @NP < NP < SBAR
9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)
The relative clause gap here is ADVP
Infinitival/non-tensed clause
Homework Exercise Review
11598 @NP < NP < SBAR
9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)
*ICH* subject relative clause
Note: the SBAR from the NP object was right extraposed to the VP
Homework Exercise Review
11598 @NP < NP < SBAR
9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)
Coordination
SBAR SBAR CC SBAR
Homework Exercise Review
• 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)
• 10290 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i)
• Excludes *ICH* cases
• Excludes coordination …
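Note that the index moves from #1 to #2 once the regex becomes /^WH(NP|ADVP)-([0-9]+)$/: the alternation (NP|ADVP) is itself a capture group, so the digits are now the second group. The same numbering can be checked with Python's re module; this is only a sketch of the regex behaviour, and the helper name wh_index and the sample labels are assumptions for illustration.

```python
import re

# Same regex as in the tregex pattern: group 1 is NP or ADVP, group 2 is the index.
WH = re.compile(r"^WH(NP|ADVP)-([0-9]+)$")

def wh_index(label):
    """Return (category, index) for an indexed WHNP/WHADVP label, else None."""
    m = WH.match(label)
    return (m.group(1), m.group(2)) if m else None

for label in ["WHNP-1", "WHADVP-12", "WHNP", "WHPP-3"]:
    print(label, wh_index(label))
```

Labels without an index (WHNP) or of another category (WHPP-3) are rejected, which is one reason the count is still below the 11598 matches of the bare @NP < NP < SBAR pattern.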
Homework Exercise Review
• 10290 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i)
• 10326 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (/^(NP|ADVP)/ < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))
Homework Exercise Review
• 8575 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (NP-SBJ < /^-NONE-$/))
• 5975 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (NP-SBJ < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))
Homework Exercise Review
Let’s look at the *ICH* subcases:
Homework Exercise Review
159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))
Homework Exercise Review
159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))
This is NOT a relative clause construction!
Homework Exercise Review
159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))
155 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : /^SBAR-([0-9]+)$/#1%i
Only 1 out of the 4 is NOT a relative clause construction!
Homework Exercise Review
159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))
155 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : /^SBAR-([0-9]+)$/#1%i
Search string is too restrictive: SBAR-PRP, SBAR-NOM
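Why /^SBAR-([0-9]+)$/ is too restrictive can be seen by trying it on labels that carry a function tag before the index. The sketch below uses Python's re module to compare it with a looser regex, ^SBAR.*-([0-9]+)$; the sample labels and their index numbers are invented for illustration, and sbar_index is a hypothetical helper name.

```python
import re

STRICT = re.compile(r"^SBAR-([0-9]+)$")    # label must be exactly SBAR-n
LOOSE = re.compile(r"^SBAR.*-([0-9]+)$")   # allows function tags before the index

def sbar_index(pattern, label):
    """Return the trailing index of an SBAR label, or None if the pattern rejects it."""
    m = pattern.match(label)
    return m.group(1) if m else None

for label in ["SBAR-1", "SBAR-PRP-2", "SBAR-NOM-3"]:
    print(label, sbar_index(STRICT, label), sbar_index(LOOSE, label))
```

The strict regex finds the index only on the bare SBAR-1 label, so extraposed clauses tagged SBAR-PRP or SBAR-NOM are never linked back to their *ICH* placeholder.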
Homework Exercise Review
• 116 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : (/^SBAR.*-([0-9]+)$/#1%i < /^WH(NP|ADVP)-([0-9]+)$/)
• 115 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : (/^SBAR.*-([0-9]+)$/#1%i < /^WH(NP|ADVP)-([0-9]+)$/#2%j << /\*T\*-([0-9]+)/#1%j)
Not a trace?
BUG?
Relevance of Treebanks
• Statistical parsers typically construct syntactic phrase structure
  – they’re trained on Treebank corpora like the Penn Treebank
• Note: some use dependency graphs, not trees
Parsers trained on the Treebank
• Don’t recover fully-annotated trees
  – not trained using nodes with indices or empty (-NONE-) nodes
  – not trained using functional tags, e.g. -SBJ
• Therefore they don’t fully parse
• Example: no SBAR node in … a movie to see
Stanford parser
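To see how much annotation is lost when indices, empty nodes and functional tags are left out of training, the following Python sketch strips them from a fully annotated tree. It is only an assumption about what such preprocessing looks like, not the Stanford parser's actual code; the tree for "a movie to see" is hand-built in Treebank style, and parse, strip_annotations and show are hypothetical names.

```python
import re

def parse(s):
    """Parse one bracketed tree into (label, children) tuples; leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                          # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                          # skip ")"
        return (label, children)

    return node()

def strip_annotations(t):
    """Copy t without -NONE- nodes, function tags or indices; None if nothing is left."""
    label, kids = t
    if label == "-NONE-":
        return None
    new_kids = []
    for k in kids:
        if isinstance(k, tuple):
            k = strip_annotations(k)
            if k is None:                 # child contained only empty elements
                continue
        new_kids.append(k)
    if not new_kids:
        return None
    if not label.startswith("-"):         # keep labels such as -LRB- intact
        label = re.split(r"[-=]", label)[0]
    return (label, new_kids)

def show(t):
    """Render a tree back into bracketed form."""
    if isinstance(t, str):
        return t
    return "(" + " ".join([t[0]] + [show(k) for k in t[1]]) + ")"

# Hypothetical fully annotated tree (assumption: hand-written, Treebank-style).
FULL = ("(NP (NP (DT a) (NN movie)) (SBAR (WHNP-1 (-NONE- 0)) "
        "(S (NP-SBJ (-NONE- *)) (VP (TO to) (VP (VB see) (NP (-NONE- *T*-1)))))))")

print(show(strip_annotations(parse(FULL))))
```

After stripping, the zero relative pronoun, the empty subject and the object trace are all gone, so nothing in the remaining tree records that movie is interpreted as the object of see.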
Parsers trained on the Treebank
• SBAR can be forced by the presence of an overt relative pronoun, but note there is no subject gap:
Parsers trained on the Treebank
• Probabilities are estimated from frequency information of each node given surrounding context (e.g. parent node, or the word that heads the node)
• Still these systems have enormous problems with prepositional phrase (PP) attachment
• Example: (borrowed from Igor Malioutov)
  – A boy with a telescope kissed Mary on the lips
  – Mary was kissed by a boy with a telescope on the lips
• PP with a telescope should adjoin to the noun phrase (NP) a boy
• PP on the lips should adjoin to the verb phrase (VP) headed by kiss
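The simplest version of "probabilities estimated from frequency information" is a relative-frequency estimate of each rewrite rule given its parent label: P(A → β) = count(A → β) / count(A). The Python sketch below computes this for two toy trees. It is a deliberate simplification (real parsers condition on more context, such as head words, and smooth the counts); the trees are made up rather than taken from the Treebank, and rules and rule_probs are hypothetical names.

```python
from collections import Counter
from fractions import Fraction

# Toy trees as (label, children) tuples; leaves are strings. Hand-written assumptions.
TREES = [
    ("S", [
        ("NP", [
            ("NP", [("DT", ["a"]), ("NN", ["boy"])]),
            ("PP", [("IN", ["with"]),
                    ("NP", [("DT", ["a"]), ("NN", ["telescope"])])]),
        ]),
        ("VP", [("VBD", ["kissed"]), ("NP", [("NNP", ["Mary"])])]),
    ]),
    ("S", [
        ("NP", [("NNP", ["Mary"])]),
        ("VP", [("VBD", ["left"])]),
    ]),
]

def rules(t):
    """Yield (parent, child-label tuple) for every phrasal node; lexical rules are skipped."""
    label, kids = t
    if kids and all(isinstance(k, tuple) for k in kids):
        yield (label, tuple(k[0] for k in kids))
        for k in kids:
            yield from rules(k)

def rule_probs(trees):
    """Relative-frequency estimate: count(parent -> children) / count(parent)."""
    counts = Counter(r for t in trees for r in rules(t))
    totals = Counter()
    for (lhs, _), n in counts.items():
        totals[lhs] += n
    return {r: Fraction(n, totals[r[0]]) for r, n in counts.items()}

probs = rule_probs(TREES)
for (lhs, rhs), p in sorted(probs.items()):
    print(lhs, "->", " ".join(rhs), p)
```

With estimates like these, whether a PP attaches to an NP or a VP depends largely on how often each configuration was seen in training, which is one way to understand why attachment errors are so common.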
Active/passive sentences
• Examples using the Stanford Parser:
Both active and passive sentences are parsed incorrectly
Active/passive sentences
• Examples:
X on the lips modifies Mary
X on the lips modifies telescope
Homework Exercise
• Use tregex to find out how many passive sentences there are in the Treebank WSJ section.
• The passive construction (according to the Bracketing Guidelines)
  – Note: by-phrase containing logical subject (LGS) is optional