Linguistics 187 Week 4
Ambiguity and Robustness
Language has pervasive ambiguity
Levels: Tokenization, Morphology, Syntax, Semantics, Discourse, Entailment
– I like Jan. |Jan|.| or |Jan.|.| (sentence end or abbreviation)
– walk: Noun or Verb?
– untieable knot: (untie)able or un(tieable)?
– bank: river or financial?
– The duck is ready to eat. Cooked or hungry?
– Every man loves a woman. The same woman or each their own?
– John told Tom he had to go. Who had to go?
– John didn’t wait to go. Now or never?
– Bill fell. John kicked him. Because or after?
Ambiguity
Syntactically legitimate ambiguity (vs. spurious ambiguity: “boys and girls” & pushup)
Sources:
– Alternative c-structure rules
– Disjunctions in f-structure description
– Lexical categories
XLE’s display/computation of ambiguity
Dealing with ambiguity
– Recognize legitimate ambiguity
– OT marks for preferences (later in the course)
– Stochastic disambiguation
Syntactic Ambiguity
Lexical
– part of speech
– subcategorization frames
Syntactic
– attachments
– coordination
Implemented system highlights interactions
Lexical Ambiguity: POS
verb-noun
– I saw her duck.
– I saw [NP her duck].
– I saw [NP her] [VP duck].
noun-adjective
– the [N/A mean] rule
– that child is [A mean].
– he calculated the [N mean].
Morphology and POS ambiguity
English has impoverished morphology and hence extreme POS ambiguity
– leaves: leave +Verb +Pres +3sg
          leaf +Noun +Pl
          leave +Noun +Pl
– will: +Noun +Sg; +Aux; +Verb +base
Even languages with extensive morphology have ambiguities
Lexical ambiguity: Subcat frames
Words often have more than one subcategorization frame
– transitive/intransitive: I broke it./It broke.
– intransitive/oblique: He went./He went to London.
– transitive/transitive with infinitive: I want it./I want it to leave.
Subcat-Rule interactions
OBL vs. ADJUNCT with intransitive/oblique
– He went to London.
[ PRED ‘go<(^ SUBJ)(^ OBL)>’
SUBJ [PRED ‘he’]
OBL [PRED ‘to<(^ OBJ)>’
OBJ [ PRED ‘London’]]]
[ PRED ‘go<(^ SUBJ)>’
SUBJ [PRED ‘he’]
ADJUNCT { [PRED ‘to<(^ OBJ)>’
OBJ [ PRED ‘London’]]}]
OBL-ADJUNCT cont.
Passive by phrase
– It was eaten by the boys.
[ PRED ‘eat<(^ OBL-AG)(^ SUBJ)>’
  SUBJ [PRED ‘it’]
  OBL-AG [PRED ‘by<(^ OBJ)>’
          OBJ [PRED ‘boy’]]]
– It was eaten by the window.
[ PRED ‘eat<NULL(^ SUBJ)>’
  SUBJ [PRED ‘it’]
  ADJUNCT { [PRED ‘by<(^ OBJ)>’
             OBJ [PRED ‘window’]]}]
XCOMP-ADJUNCT
to infinitives can be arguments or adjuncts (purpose clauses)
– I want her to leave.
[ PRED ‘want<(^ SUBJ)(^ XCOMP)>(^ OBJ)’
SUBJ [ PRED ‘I’ ]
OBJ [ PRED ‘her’ ]1
XCOMP [ PRED ‘leave<(^ SUBJ)>’
SUBJ [ 1 ] ] ]
XCOMP-ADJUNCT cont.
– I want money to buy that.
[ PRED ‘want<(^ SUBJ)(^ OBJ)>’
SUBJ [ PRED ‘I’ ]
OBJ [ PRED ‘money’ ]
ADJUNCT { [ PRED ‘buy<(^ SUBJ)(^ OBJ)>’
SUBJ [ PRED ‘pro’ ]
OBJ [ PRED ‘that’ ] ] } ]
But both sentences get both analyses
– The syntax does not have world knowledge
OBJ-TH and Noun-Noun compounds
Many OBJ-TH verbs are also transitive
– I took the cake. I took Mary the cake.
The grammar needs a rule for noun-noun compounds
– the tractor trailer, a grammar rule
These can interact
– I took the grammar rules
– I took [NP the grammar rules]
– I took [NP the grammar] [NP rules]
Syntactic Ambiguities
Even without lexical ambiguity, there is legitimate syntactic ambiguity
– PP attachment
– Coordination
Want to:
– constrain these to legitimate cases
– make sure they are processed efficiently
PP Attachment
PP adjuncts can attach to VPs and NPs
Strings of PPs in the VP are ambiguous
– I see the girl with the telescope.
  I see [the girl with the telescope].
  I see [the girl] [with the telescope].
This ambiguity is reflected in:
– the c-structure (constituency)
– the f-structure (ADJUNCT attachment)
PP attachment cont.
This ambiguity multiplies with more PPs
– I saw the girl with the telescope
– I saw the girl with the telescope in the garden
– I saw the girl with the telescope in the garden on the lawn
The syntax has no way to determine the attachment, even if humans can.
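How fast the attachment ambiguity grows can be illustrated with a small counting sketch. It is a toy model, not XLE's algorithm: it assumes that each PP may attach to the VP or to any NP still open on the right edge, and that attachments may not cross. The function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_attachments(pps_left: int, open_sites: int) -> int:
    """Count non-crossing attachments for the remaining PPs.

    open_sites is the number of constituents on the right edge that the
    next PP may attach to (assumed to start at 2: the VP and the object NP).
    """
    if pps_left == 0:
        return 1
    total = 0
    # Attaching to the j-th open site (counted from the outermost) closes
    # every deeper site and opens one new site: the NP inside the new PP.
    for j in range(1, open_sites + 1):
        total += count_attachments(pps_left - 1, j + 1)
    return total


def parses_for(num_pps: int) -> int:
    """Attachment readings for 'V NP PP ... PP' under the toy model."""
    return count_attachments(num_pps, 2)


for n in range(1, 4):
    print(n, "PP(s):", parses_for(n), "attachment readings")
```

Under these assumptions the three example sentences get 2, 5 and 14 attachment readings, which is why packing and filtering matter for real text.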
Ambiguity in coordination
Vacuous ambiguity of non-branching trees
– this can be avoided (pushup)
Legitimate ambiguity
– old men and women
  old [N men and women]
  [NP old men ] and [NP women ]
– I turned and pushed the cart
  I [V turned and pushed ] the cart
  I [VP turned ] and [VP pushed the cart ]
Grammar Engineering and ambiguity
Large-scale grammars will have lexical and syntactic ambiguities
With real data they will interact, resulting in many parses
– these parses are (syntactically) legitimate
– they are not intuitive to humans (but more plausible words can make them better)
XLE provides tools to manage ambiguity
– grammar writer interfaces
– computation
XLE display
Four windows
– c-structure (top left)
– f-structure (bottom left)
– packed f-structure (top right)
– choice space (bottom right)
C-structure and f-structure “next” buttons
Other two windows are packed representations of all the parses
– clicking on a choice will display that choice in the left windows
Example
I see the girl in the garden
PP attachment ambiguity
– both ADJUNCTS
– difference in ADJUNCT-TYPE
Packed F-structure and Choice space
Sorting through the analyses
“Next” button on c-structure and then f-structure windows
– impractical with many choices
– independent vs. interacting ambiguities
– hard to detect spurious ambiguity
The packed representations show all the analyses at once
– (in)dependence more visible
– click on choice to view
– spurious ambiguities appear as blank choices
  » but legitimate ambiguities may also do so
Ambiguity Demo
– eng-week4-demo.lfg
– eng-week4-demo-test.lfg
Attachment
– the girl ate the banana with the monkey
Subcategorization
– the girl thought about the banana
Feature
– the sheep laughed
All three (2 c-structures; 8 analyses)
– the girl thought about the banana with the monkey
XLE Ambiguity Management
The sheep liked the fish.
– How many sheep?
– How many fish?
Options multiplied out
– The sheep-sg liked the fish-sg.
– The sheep-pl liked the fish-sg.
– The sheep-sg liked the fish-pl.
– The sheep-pl liked the fish-pl.
Options packed
– The sheep {sg|pl} liked the fish {sg|pl}
Packed representation is a “free choice” system
– Encodes all dependencies without loss of information
– Common items represented, computed once
– Key to practical efficiency
… but it’s wrong: it doesn’t encode all dependencies; choices are not free.
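The difference between multiplying options out and packing them can be sketched in a few lines. This is an illustration only: the dictionary layout and the helper names are assumptions and say nothing about how XLE stores packed structures.

```python
from itertools import product

# Packed form: each ambiguous word stores its alternatives once.
packed = {"sheep": ["sg", "pl"], "fish": ["sg", "pl"]}


def multiply_out(packed_choices):
    """Expand a packed, free-choice representation into full readings."""
    words = list(packed_choices)
    alternatives = [packed_choices[w] for w in words]
    return [dict(zip(words, combo)) for combo in product(*alternatives)]


def sizes(num_ambiguous_words: int):
    """(stored alternatives when packed, readings when multiplied out)
    for words that each have two independent options."""
    return 2 * num_ambiguous_words, 2 ** num_ambiguous_words


readings = multiply_out(packed)
print(len(readings), "readings for the sheep/fish example")
print(sizes(10))  # packed size grows linearly, expanded size exponentially
```

The expansion is only valid when the choices really are independent, which is the assumption the slide goes on to reject.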
Dependent choices
Das Mädchen {nom|acc} sah die Katze {nom|acc}
Again, packing avoids duplication
– Das Mädchen-nom sah die Katze-nom: bad
– Das Mädchen-nom sah die Katze-acc: The girl saw the cat
– Das Mädchen-acc sah die Katze-nom: The cat saw the girl
– Das Mädchen-acc sah die Katze-acc: bad
Who do you want to succeed?
– I want to succeed John: want intrans, succeed trans
– I want John to succeed: want trans, succeed intrans
Solution: Label dependent choices
– Das Mädchen-nom sah die Katze-nom: bad
– Das Mädchen-nom sah die Katze-acc: The girl saw the cat
– Das Mädchen-acc sah die Katze-nom: The cat saw the girl
– Das Mädchen-acc sah die Katze-acc: bad
• Label each choice with distinct Boolean variables p, q, etc.
• Record acceptable combinations as a Boolean expression φ
• Each analysis corresponds to a satisfying truth-value assignment (free choice from the true lines of φ’s truth table)
Das Mädchen {p:nom | ¬p:acc} sah die Katze {q:nom | ¬q:acc}
φ = (p ∧ ¬q) ∨ (¬p ∧ q)
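A minimal sketch of the labelling idea, assuming the acceptable combinations are exactly the two glossed readings (the girl nominative and the cat accusative, or the reverse). The variable and function names are illustrative, and the brute-force enumeration is only for exposition.

```python
from itertools import product


def phi(p: bool, q: bool) -> bool:
    """Acceptable combinations: exactly one of the two nouns is nominative.

    p is true when 'das Mädchen' is nominative, q when 'die Katze' is.
    """
    return (p and not q) or (not p and q)


def case(is_nominative: bool) -> str:
    return "nom" if is_nominative else "acc"


# Each analysis is a satisfying assignment: a true line of phi's truth table.
analyses = [
    (case(p), case(q))
    for p, q in product([True, False], repeat=2)
    if phi(p, q)
]
print(analyses)  # (case of Mädchen, case of Katze) for each surviving reading
```

The packed structure still stores each alternative once; the Boolean expression is what records which combinations may be chosen together.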
Ambiguity and Robustness
Large-scale grammars are massively ambiguous
Grammars parsing real text need to be robust
– "loosening" rules to allow robustness increases ambiguity even more
Need a way to control the ambiguity
– version of Optimality Theory (OT)
Theoretical OT
Grammar has a set of violable constraints
Constraints are ranked by each language
– This gives cross-linguistic variation
Candidates (analyses) compete
– John waited for Mary. vs. John waited for 3 hours.
Constraint ranking determines winning candidate
Issues for XLE
– Candidates can be very ungrammatical
  » we have a grammar to produce grammatical analyses
  » even with robust, ungrammatical analyses, these are controlled
– Generation, not parsing direction
  » we know what the string is already
  » for generation we have a very specified analysis
XLE OT
Incorporate idea of ranking and (dis)preference
Filter syntactic and lexical ambiguity
Reconcile robustness and accuracy
Allow parsing grammar to be used for generation
XLE OT Implementation
OT marks in
– grammar rules
– templates
– lexical entries
CONFIG states
– preference vs. dispreference
– ranking
– parsing vs. generation orders
The o:: projection
OT marks are not f-structure features
OT marks are in their own projection
– c-structure
– f-structure
– o-structure (set of OT marks)
The o:: projection
The o-structure is just a set of marks
{ PPadj GuessedN }
Instead of ^ and !, have o::* (NB: ! = f::*)
PP: (^ ADJUNCT)=!
    PPadj $ o::* ;
– the f-structure is exactly the same
– there is now an additional o-structure
Ranking analyses
Specify relative importance of OT marks in the CONFIG
OPTIMALITYORDER Mark3 Mark2 +Mark1.
Comparing analyses
– Find most important mark where the analyses differ
– Prefer the analysis with the
  » fewest dispreference marks (no +)
  » most preference marks (+)
Ranking analyses (continued)
– an analysis with Mark2 is preferred over an analysis with Mark3
– an analysis with no mark is preferred over an analysis with Mark2 or Mark3
– an analysis with one Mark2 is preferred over one with two Mark2
– an analysis with Mark1 is preferred over an analysis with no mark
– an analysis with two Mark1 is preferred over an analysis with one Mark1
OPTIMALITYORDER Mark3 Mark2 +Mark1.
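The comparison procedure can be sketched as a lexicographic comparison of mark counts. This is a simplified model, not XLE's implementation: it assumes the marks are listed from most to least important, represents an analysis as a plain list of mark names, and ignores marks that do not appear in the order. The function names are hypothetical.

```python
from collections import Counter


def parse_order(order: str):
    """Turn 'Mark3 Mark2 +Mark1.' into [(name, is_preference), ...]."""
    ranked = []
    for token in order.rstrip(".").split():
        ranked.append((token.lstrip("+"), token.startswith("+")))
    return ranked


def badness(marks, ranked):
    """Key for comparison: smaller is better, most important mark first."""
    counts = Counter(marks)
    # Dispreference marks count against an analysis; preference marks
    # count in its favour, so their count is negated.
    return tuple(-counts[name] if pref else counts[name] for name, pref in ranked)


def optimal(analyses, order):
    """Return the analyses (lists of marks) that no other analysis beats."""
    ranked = parse_order(order)
    best = min(badness(marks, ranked) for marks in analyses)
    return [marks for marks in analyses if badness(marks, ranked) == best]


ORDER = "Mark3 Mark2 +Mark1."
print(optimal([["Mark2"], ["Mark3"], []], ORDER))
```

Because Python compares tuples position by position, the first position where two keys differ is the most important mark on which the analyses differ, which matches the comparison described on the slide.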
Difference with Theoretical OT
Theoretical OT: only dispreference marks
XLE OT:
– dispreference marks: Mark1
– preference marks: +Mark1
– NOTE: + is only indicated in the CONFIG; only the name (Mark1) appears in the grammar
Deciding which to use can be difficult
Example: PP ambiguities
John waited for Mary.
John waited for 3 hours.
Rule with OT marks
Using template OT(_mark) = _mark $ o::*.
VP --> V
(NP: (^ OBJ)=!)
PP*: { (^ OBL)=!
@(OT PPobl)
|! $ (^ ADJUNCT)
@(OT PPadj)}.
Basic Structures
John waited for Mary
f-str: [ PRED 'wait<SUBJ OBL>'
         SUBJ [ PRED 'John' ]
         OBL [ PRED 'for<OBJ>'
               OBJ [ PRED 'Mary' ]]]
o-str: { PPobl }
John waited for Mary
f-str: [ PRED 'wait<SUBJ>'
         SUBJ [ PRED 'John' ]
         ADJ {[ PRED 'for<OBJ>'
                OBJ [ PRED 'Mary' ]]}]
o-str: { PPadj }
Ranking for Example
Disprefer ADJUNCTs
– OPTIMALITYORDER PPadj.
– Problem: will disprefer adjuncts even when no OBL analysis is possible
Prefer OBLs
– OPTIMALITYORDER +PPobl.
– Problem: will prefer OBL even when the other analysis was not an ADJUNCT
– Still probably better than dispreferring ADJUNCTs
– Solution: local OT marks (not discussed here)
Special OT marks in XLE
Special marks separate the other marks into fields
Marks preceding
– NOGOOD: remove parts of the grammar for debugging or specializing
– STOPPOINT: apply on a second pass for extending grammar on failure
– CSTRUCTURE: filter when the c-structure is built for speed
These marks are discussed at length in the XLE documentation; the reading on the web is a bit out of date for them
The NOGOOD Mark
OT marks can be used to remove parts of the grammar
– rules or rule parts
– templates or template parts
– lexical items or parts of them
Use for
– grammar adaptation/sharing
– grammar development
Example
– OPTIMALITYORDER FrontMatter NOGOOD.
NOGOOD Example
ROOT rule allows for front matter for special corpus
ROOT --> (FR-MAT: (^ ID)=!
                  @(OT FrontMatter))
         S.
FR-MAT --> NUMBER
           (PERIOD).
1. The light flashes.
FR-MAT
Grammars for corpora with front matter will not rank the OT mark FrontMatter (unranked marks are neutral)
Grammars for corpora without front matter will make the OT mark a NOGOOD
OPTIMALITYORDER FrontMatter NOGOOD.
Effective ROOT rule: ROOT --> S.
Allows rule sharing across grammars
Can also be used for debugging
Robustness
What to do if the grammar doesn't provide an analysis?
Graceful failure
– FRAGMENTs
– Specific relaxations
Ungrammatical analysis only if no grammatical one
Avoid ungrammatical analyses in generation
Robustness: STOPPOINT
On first pass, STOPPOINT is treated as NOGOOD
Small, fast grammar for standard constructions
If first pass fails, ignore STOPPOINT and extend grammar
– Relaxation possibilities precede STOPPOINT
– OPTIMALITYORDER BadDetNAgr STOPPOINT.
STOPPOINT Mark example
Example:
NP: this boy
NP: this boys
Template call with OT mark:
DEMON(_P _N) = (^ SPEC PRED)='_P'
               { (^ NUM)=c _N
               | (^ NUM)~= _N
                 @(OT BadDetNAgr)}.
Lexical entry:
this DET XLE @(DEMON %stem sg).
Ranking
OPTIMALITYORDER BadDetNAgr STOPPOINT.
Structures for STOPPOINT example
NP: this boy
f-str [ PRED 'boy'
        NUM sg
        SPEC [ PRED 'this' ]]
o-str (empty)
NP: this boys
f-str [ PRED 'boy'
        NUM pl
        SPEC [ PRED 'this' ]]
o-str { BadDetNAgr }
Parsing this boys will be slow: the grammar has to parse a second time
But the ungrammatical input gets a parse
Only put OT marks behind the STOPPOINT if they will be rarely triggered
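The two-pass behaviour can be sketched as follows. This is a toy model under stated assumptions: the candidate list stands in for what a parser would produce with every relaxation enabled, the order is assumed to contain only ordinary dispreference marks and STOPPOINT, and preference marks before STOPPOINT (which behave differently) are not modelled. The function names are hypothetical.

```python
def marks_before_stoppoint(order: str):
    """Dispreference marks listed before STOPPOINT in an OPTIMALITYORDER."""
    tokens = order.rstrip(".").split()
    if "STOPPOINT" not in tokens:
        return set()
    cut = tokens.index("STOPPOINT")
    # Preference marks (+) are skipped: this sketch does not model them.
    return {t.lstrip("*") for t in tokens[:cut] if not t.startswith("+")}


def two_pass_parse(candidates, order):
    """candidates: list of (analysis, marks) pairs from a toy parser.

    Returns the surviving analyses and the pass (1 or 2) that found them.
    """
    held_back = marks_before_stoppoint(order)
    # Pass 1: marks before STOPPOINT act like NOGOOD.
    first = [a for a, marks in candidates if not (set(marks) & held_back)]
    if first:
        return first, 1
    # Pass 2: only reached when pass 1 found nothing; relaxations allowed.
    return [a for a, _ in candidates], 2


ORDER = "BadDetNAgr STOPPOINT."
print(two_pass_parse([("this boy", [])], ORDER))
print(two_pass_parse([("this boys", ["BadDetNAgr"])], ORDER))
```

A real system reparses with the larger grammar on the second pass rather than filtering a precomputed list; that extra parse is the cost the slide warns about.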
Preference marks and STOPPOINT
Preference marks behind the STOPPOINT are tried first (counter to intuition)
– OPTIMALITYORDER +MWE STOPPOINT.
Use MWE readings if at all possible
If fail, do a second pass with the analytic (non-MWE) structure (inefficient if fail)
Example:
print` quality N * @(NOUN %STEM) @(OT MWE).
The [N print quality] is excellent.
I want to [V print] [NP quality documents].
CSTRUCTURE Marks
Apply marks before f-structure constraints are processed
– OPTIMALITYORDER NoCloseQuote Guessed CSTRUCTURE.
Improve performance by filtering early
May lose some analyses
– coverage/efficiency tradeoff
CSTRUCTURE example: Guessed
Only use guessed form if another form is not found in the morphology/lexicon
– OPTIMALITYORDER Guessed CSTRUCTURE.
Trade-off: lose some parses, but much faster
The foobar is good.
– no entry for foobar ==> parse with guessed N
The audio is good.
– audio: only A in morphology ==> no parse
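The trade-off can be made concrete with a toy filter. Everything here is an assumption made for illustration: the candidate dictionaries, the function names, and the simplification that the rest of the parse only needs a noun after "The". It is not how XLE represents lexical candidates.

```python
def cstructure_filter(candidates, early_mark="Guessed"):
    """Drop candidates carrying early_mark when an unmarked candidate exists."""
    unmarked = [c for c in candidates if early_mark not in c["marks"]]
    return unmarked if unmarked else candidates


def parse_subject_noun(candidates):
    """Filter early, then apply the later requirement that the word be a noun."""
    survivors = cstructure_filter(candidates)
    return [c for c in survivors if c["cat"] == "N"]


# 'foobar' is unknown, so the guessed noun is the only candidate.
foobar = [{"word": "foobar", "cat": "N", "marks": ["Guessed"]}]
# 'audio' is known only as an adjective; the guessed noun is filtered early.
audio = [
    {"word": "audio", "cat": "A", "marks": []},
    {"word": "audio", "cat": "N", "marks": ["Guessed"]},
]

print(len(parse_subject_noun(foobar)), len(parse_subject_noun(audio)))
```

The guessed noun for "audio" is discarded before anything checks whether the adjective reading can succeed, so the sentence ends up with no parse; that is the coverage lost in exchange for speed.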
CSTRUCTURE example: Quote
Only allow unbalanced quote marks if there is no other quote mark
Then I left." vs. He said, "they appeared."
METARULEMACRO: … _CAT QT: @(OT NoCloseQt); …
XLE only tries balanced version, not double unbalanced version
– failure when really needed two unbalanced quotes
Combining the OT marks
All the types of OT marks can be used in one grammar
– ordering of NOGOOD, CSTRUCTURE, STOPPOINT is important
Example
OPTIMALITYORDER
Verbmobil NOGOOD
Guessed CSTRUCTURE
+MWE Fragment STOPPOINT
RareForm StrandedP +Obl.
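One way to read such a combined order is as a sequence of fields, each closed by a special mark. The sketch below splits the example on that assumption; the function name, the field labels, and the choice to assign each mark only to the nearest following special mark are illustrative, not taken from the XLE documentation.

```python
SPECIAL = ("NOGOOD", "CSTRUCTURE", "STOPPOINT")


def split_fields(order: str):
    """Group ordinary marks by the special mark that follows them."""
    fields = {name: [] for name in SPECIAL}
    pending = []
    for token in order.rstrip(".").split():
        if token in SPECIAL:
            fields[token] = pending
            pending = []
        else:
            pending.append(token)
    # Marks after the last special mark are ordinary ranked marks.
    fields["ranked"] = pending
    return fields


order = """
  Verbmobil NOGOOD
  Guessed CSTRUCTURE
  +MWE Fragment STOPPOINT
  RareForm StrandedP +Obl."""

fields = split_fields(order)
for name, marks in fields.items():
    print(name, marks)
```

Read this way, Verbmobil is switched off, Guessed is filtered at c-structure time, +MWE and Fragment are governed by the STOPPOINT, and the remaining marks are ranked normally.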
Other Features
Grouping: have marks treated as being of equal importance
– OPTIMALITYORDER (Paren Appositive) Adjunct.
Ungrammatical markup: have XLE report analyses with this mark with a *
– these are treated like any dispreference mark for determining the optimal analyses
– OPTIMALITYORDER *NoDetAgr STOPPOINT.
Generation
XLE uses the same basic grammar to parse and generate
Do not always want to generate all the possibilities that can be parsed
Put in special OT marks for generation to block or prefer certain strings
– fix up bad subject-verb agreement
– only allow certain adverb placements
– control punctuation options
GENOPTIMALITYORDER
OT Marks: Main points
Ambiguity: broad coverage results in ambiguity
– OT marks allow preferences
Robustness: want fall back parses only when regular parses fail
– OT marks allow multipass grammar
XLE provides for complex orderings of OT marks
– NOGOOD, CSTRUCTURE, STOPPOINT
– preference, dispreference, ungrammatical
– see the XLE documentation for details
FRAGMENT grammar
What to do when the grammar does not get a parse
– always want some type of output
– want the output to be maximally useful
Why might it fail:
– construction not covered yet
– "bad" input
– took too long (XLE parsing parameters)
Grammar engineering approach
First try to get a complete parse
If fail, build up chunks that get complete parses (c-str and f-str)
Have a fall back for things without even chunk parses
Link these chunks and fall backs together in a single f-structure
Basic idea
XLE has a REPARSECAT which it tries if there is no complete parse
Grammar writer specifies what category the possible chunks are
OT marks are used to
– build the fewest chunks possible
– disprefer using the fall back over the chunks
Sample output
the the dog appears.
Split into:
– "token" the
– sentence "the dog appears"
– ignore the period
C-structure
F-structure
How to get this
FRAGMENTS --> { NP: (^ FIRST)=! @(OT-MARK Fragment)
              | S: (^ FIRST)=! @(OT-MARK Fragment)
              | TOKEN: (^ FIRST)=! @(OT-MARK Fragment) }
              (FRAGMENTS: (^ REST)=! ).
Lexicon:
-token TOKEN * (^ TOKEN)=%stem @(OT-MARK Token).
Why First-Rest?
FIRST-REST
[ FIRST [ PRED … ]
  REST [ FIRST [ PRED … ]
         REST … ] ]
– Efficient
– Encodes order
Possible alternative: set
{ [ PRED … ]
  [ PRED … ] }
– Not as efficient (copying)
– Even less efficient if mark scope facts
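The FIRST-REST encoding is the familiar linked-list idea, which is why it preserves order. A small sketch with plain Python dictionaries standing in for f-structures follows; the helper names and the chunk contents are made up for illustration.

```python
def first_rest(chunks):
    """Build a nested FIRST/REST structure from an ordered list of chunks."""
    structure = None
    # Work from the last chunk backwards so the first chunk ends up outermost.
    for chunk in reversed(chunks):
        node = {"FIRST": chunk}
        if structure is not None:
            node["REST"] = structure
        structure = node
    return structure


def flatten(structure):
    """Recover the chunks in their original order."""
    chunks = []
    while structure is not None:
        chunks.append(structure["FIRST"])
        structure = structure.get("REST")
    return chunks


chunks = [{"TOKEN": "the"}, {"PRED": "appear<SUBJ>"}]
nested = first_rest(chunks)
print(nested)
print(flatten(nested) == chunks)
```

An unordered set of the same chunks would lose the information that the stray token precedes the sentence, which the nested form keeps for free.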
Accuracy?
Evaluation against gold standard
– “PARC 700” f-structure bank for Wall Street Journal
Measure: F-score on dependency triples
– F-score: harmonic mean of precision and recall
– Dependency triples: separate f-structure features
  Subj(run, dog)
  Tense(run, past)
Results for best-matching f-structure:
– Full parses: F=88.5
– Fragment parses: F=76.7
(Riezler et al., 2002)
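How an F-score over dependency triples is computed can be sketched as below. The F-score is taken here as the harmonic mean of precision and recall, the usual definition. The triples are invented for illustration and are not from the PARC 700; the function name is hypothetical.

```python
def f_score(gold, predicted):
    """F-score (harmonic mean of precision and recall) over triples."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)  # share of predicted triples that are right
    recall = correct / len(gold)          # share of gold triples that were found
    return 2 * precision * recall / (precision + recall)


gold = {
    ("Subj", "run", "dog"),
    ("Tense", "run", "past"),
    ("Num", "dog", "sg"),
}
predicted = {
    ("Subj", "run", "dog"),
    ("Tense", "run", "past"),
}

print(round(f_score(gold, predicted), 3))
```

In this made-up case precision is 1.0 and recall is 2/3, so a parse that finds only part of the structure is penalised through recall even when everything it outputs is correct.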
Fragments summary
XLE has a chunking strategy for when the grammar does not provide a full analysis
Each chunk gets full c-str and f-str
The grammar writer defines the chunks based on what will be best for that grammar and application
Quality
– Fragments have reasonable but degraded f-scores
– Usefulness in applications is being tested