A Multi-Strategy Approach to Parsing of Grammatical Relations in Child Language Transcripts
Kenji Sagae
Language Technologies Institute, Carnegie Mellon University
Thesis Committee: Alon Lavie (co-chair), Brian MacWhinney (co-chair), Lori Levin, Jaime Carbonell, John Carroll (University of Sussex)
A Multi-Strategy Approach to Parsing of Grammatical Relations in Child Language Transcripts
Kenji Sagae
Language Technologies Institute
Carnegie Mellon University
Thesis Committee:
Alon Lavie, co-chair
Brian MacWhinney, co-chair
Lori Levin
Jaime Carbonell
John Carroll, University of Sussex
2
Natural Language Parsing: Sentence → Syntactic Structure
• One of the core problems in NLP
Input: The boy ate the cheese sandwich
Output:
(S (NP (Det The) (N boy))
(VP (V ate) (NP (Det the) (N cheese) (N sandwich))))
((1 2 The DET) (2 3 boy SUBJ) (3 0 ate ROOT) (4 6 the DET) (5 6 cheese MOD) (6 3 sandwich OBJ))
Grammatical Relations (GRs)
• Subject, object, adjunct, etc.
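The GR output shown above is just a set of labeled dependencies; a minimal sketch of that representation (the class and field names are illustrative, not from the thesis):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dependency:
    dependent: int   # 1-based index of the dependent word
    head: int        # index of the head word (0 = virtual root)
    word: str
    relation: str    # GR label: SUBJ, OBJ, DET, ...

# "The boy ate the cheese sandwich" as labeled dependencies
parse = [
    Dependency(1, 2, "The", "DET"),
    Dependency(2, 3, "boy", "SUBJ"),
    Dependency(3, 0, "ate", "ROOT"),
    Dependency(4, 6, "the", "DET"),
    Dependency(5, 6, "cheese", "MOD"),
    Dependency(6, 3, "sandwich", "OBJ"),
]

# e.g. recover the grammatical subject of the sentence
subject = next(d.word for d in parse if d.relation == "SUBJ")
```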
3
Using Natural Language Processing in Child Language Research
• CHILDES Database (MacWhinney, 2000)
– 200 megabytes of child-parent dialog transcripts
– Part-of-speech and morphology analysis
• Tools available
• Not enough for many research questions
– No syntactic analysis
• Can we use NLP to analyze CHILDES transcripts?
– Parsing
– Many decisions: representation, approach, etc.
4
Parsing CHILDES: Specific and General Motivation
• Specific task: automatic analysis of syntax in CHILDES corpora
– Theoretical importance (study of child language development)
– Practical importance (measurement of syntactic competence)
• In general: develop techniques for syntactic analysis, advance parsing technologies
– Can we develop new techniques that perform better than current approaches?
• Rule-based
• Data-driven
5
Research Objectives
• Identify a suitable syntactic representation for CHILDES transcripts
– Must address the needs of child language research
• Develop a high-accuracy approach for syntactic analysis of spoken language transcripts
– Parents and children at different stages of language acquisition
• The plan: a multi-strategy approach
– ML: ensemble methods
– Parsing: several approaches possible, but combination is an underdeveloped area
6
Research Objectives
• Develop methods for combining analyses from different parsers and obtain improved accuracy
– Combining rule-based and data-driven approaches
• Evaluate the accuracy of developed systems
• Validate the utility of the resulting systems to the child language community
– Task-based evaluation: automatic measurement of grammatical complexity in child language
7
Overview of the Multi-Strategy Approach for Syntactic Analysis
Transcripts
Parser A
Parser B
Parser C
Parser D
Parser E
Parser Combination
SYNTACTIC STRUCTURES
8
Thesis Statement
• The development of a novel multi-strategy approach for syntactic parsing allows for identification of Grammatical Relations in transcripts of parent-child dialogs at a higher level of accuracy than previously possible
• Through the combination of different NLP techniques (rule-based or data-driven), the multi-strategy approach can outperform each strategy in isolation, and produce significantly improved accuracy
• The resulting syntactic analyses are at a level of accuracy that makes them useful to child language research
9
Outline
• The CHILDES GR scheme
• GR Parsing of CHILDES transcripts
• Combining different strategies
• Automated measurement of syntactic development in child language
• Related work
• Conclusion
10
CHILDES GR Scheme (Sagae, MacWhinney and Lavie, 2004)
• Features: derived from parser configuration
– Crucially: two topmost items in S, first item in W
– Additionally: other features that describe the current configuration (look-ahead, etc.)
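A sketch of this kind of feature extraction from a shift-reduce parser configuration (stack S, input list W); the feature names, token fields, and look-ahead size here are illustrative, not the thesis's actual feature set:

```python
# Sketch of feature extraction for a classifier-based shift-reduce parser.
# The configuration is a stack S of partially built subtrees and a list W of
# remaining input tokens; the classifier (an SVM in the thesis) predicts the
# next parser action from these features.

def extract_features(stack, words, lookahead=2):
    feats = {}
    # Two topmost items on the stack (word and POS tag)
    for i, item in enumerate(reversed(stack[-2:])):
        feats[f"s{i}.word"] = item["word"]
        feats[f"s{i}.pos"] = item["pos"]
    # First item in the input list, plus look-ahead tokens
    for i, tok in enumerate(words[:1 + lookahead]):
        feats[f"w{i}.word"] = tok["word"]
        feats[f"w{i}.pos"] = tok["pos"]
    return feats

stack = [{"word": "boy", "pos": "n"}]
words = [{"word": "ate", "pos": "v"}, {"word": "the", "pos": "det"}]
feats = extract_features(stack, words)
```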
48
Parsing CHILDES with a Classifier-Based Parser
• Parser uses SVM
• Trained on Eve training set (5,000 words)
• Tested on Eve test set (2,000 words)
• Labeled dependency accuracy: 87.3%
– Uses only domain-specific data
– Same level of accuracy as GR system based on Charniak parser
49
Precision and Recall of Specific GRs
GR Precision Recall F-score
SUBJ 0.97 0.98 0.98
OBJ 0.89 0.94 0.92
COORD 0.71 0.76 0.74
JCT 0.78 0.88 0.83
MOD 0.94 0.87 0.91
PRED 0.80 0.83 0.82
ROOT 0.95 0.94 0.94
COMP 0.70 0.78 0.74
XCOMP 0.93 0.82 0.87
50
Precision and Recall of Specific GRs
GR Precision Recall F-score F-score (Charniak-based system)
SUBJ 0.97 0.98 0.98 0.93
OBJ 0.89 0.94 0.92 0.87
COORD 0.71 0.76 0.74 0.75
JCT 0.78 0.88 0.83 0.86
MOD 0.94 0.87 0.91 0.85
PRED 0.80 0.83 0.82 0.81
ROOT 0.95 0.94 0.94 0.91
COMP 0.70 0.78 0.74 0.54
XCOMP 0.93 0.82 0.87 0.61
51
Outline
• The CHILDES GR scheme
• GR Parsing of CHILDES transcripts
• Combining different strategies
• Automated measurement of syntactic development in child language
• Related Work
• Conclusion
• Weighted voting
• Combination as parsing
• Handling young child utterances
52
Combine Different Parsers to Get More Accurate Results
• Rule-based
• Statistical parsing + dependency labeling
• Classifier-based parsing
– Obtain even more variety
Precision and Recall of Specific GRs
GR Precision Recall F-score
SUBJ 0.98 0.98 0.98
OBJ 0.94 0.94 0.94
COORD 0.94 0.91 0.92
JCT 0.87 0.90 0.88
MOD 0.97 0.91 0.94
PRED 0.86 0.89 0.87
ROOT 0.97 0.96 0.96
COMP 0.75 0.67 0.71
XCOMP 0.90 0.88 0.89
63
Outline
• The CHILDES GR scheme
• GR Parsing of CHILDES transcripts
• Combining different strategies
• Automated measurement of syntactic development in child language
• Weighted voting
• Combination as parsing
• Handling young child utterances
64
Voting May Not Produce a Well-Formed Dependency Tree
• Voting on a word-by-word basis
• No guarantee of well-formedness
• Resulting set of dependencies may form a graph with cycles, or may not even be fully connected
– Technically not fully compliant with CHILDES GR annotation scheme
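A minimal sketch of word-by-word voting, using invented toy parses, showing how the per-word majority can produce a cycle and lose the root attachment:

```python
from collections import Counter

def vote_per_word(parses):
    """For each word, pick the (head, label) proposed by the most parsers.
    There is no well-formedness check, so the result may contain cycles."""
    n = len(parses[0])
    combined = {}
    for dep in range(1, n + 1):
        votes = Counter(p[dep] for p in parses)
        combined[dep] = votes.most_common(1)[0][0]
    return combined

# Three toy parses of a 3-word sentence: {dependent: (head, label)}
p1 = {1: (2, "SUBJ"), 2: (0, "ROOT"), 3: (2, "OBJ")}
p2 = {1: (2, "SUBJ"), 2: (1, "XCOMP"), 3: (2, "OBJ")}
p3 = {1: (2, "MOD"),  2: (1, "XCOMP"), 3: (1, "OBJ")}

result = vote_per_word([p1, p2, p3])
# Word 1's winning head is 2 and word 2's winning head is 1: a cycle,
# and no word is attached to the root at all.
```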
65
Parser Combination as Reparsing
• Once several parsers have analyzed a sentence, use their output to guide the process of reparsing the sentence
• Two reparsing approaches
– Maximum spanning tree
– CYK (dynamic programming)
66
Dependency Parsing as Search for Maximum Spanning Tree
• First, build a graph
– Each word in the input sentence is a node
– Each dependency proposed by any of the parsers is a weighted edge
– If multiple parsers propose the same dependency, add their weights into a single edge
• Then, simply find the MST
– Maximizes the votes
– Structure guaranteed to be a dependency tree
– May have crossing branches
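The reparsing step can be sketched as: accumulate votes into edge weights, then search for the highest-weight tree. The sketch below (labels omitted for brevity) finds the maximum spanning tree by brute-force enumeration over head assignments, which is only feasible for short sentences; a real system would use a proper MST algorithm such as Chu-Liu/Edmonds:

```python
from collections import Counter
from itertools import product

def combine_as_mst(parses):
    """parses: list of {dependent: head} dicts over words 1..n, head 0 = root.
    Edge weight = number of parsers proposing that dependency."""
    n = len(parses[0])
    weight = Counter((d, h) for p in parses for d, h in p.items())

    def is_tree(heads):
        # Every word must reach the root 0 without revisiting a node
        for d in range(1, n + 1):
            seen, cur = set(), d
            while cur != 0:
                if cur in seen:
                    return False
                seen.add(cur)
                cur = heads[cur]
        return True

    best, best_w = None, -1
    for combo in product(range(0, n + 1), repeat=n):
        heads = {d: combo[d - 1] for d in range(1, n + 1)}
        if not is_tree(heads):
            continue
        w = sum(weight[(d, h)] for d, h in heads.items())
        if w > best_w:
            best, best_w = heads, w
    return best

# Same toy parses as in the voting example, heads only: per-word voting
# yielded the cycle 1<->2, but the MST is forced to remain a tree.
p1 = {1: 2, 2: 0, 3: 2}
p2 = {1: 2, 2: 1, 3: 2}
p3 = {1: 2, 2: 1, 3: 1}
tree = combine_as_mst([p1, p2, p3])
```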
67
Parser Combination with the CYK Algorithm
• The CYK algorithm uses dynamic programming to find all parses for a sentence given a CFG
– Probabilistic version finds the most probable parse
• Build a graph, as with MST
• Parse the sentence using CYK
– Instead of a grammar, consult the graph to determine how to fill new cells in the CYK table
– Instead of probabilities, we use the weights from the graph
68
Precision and Recall of Specific GRs
GR Precision Recall F-score
SUBJ 0.98 0.98 0.98
OBJ 0.94 0.94 0.94
COORD 0.94 0.91 0.92
JCT 0.87 0.90 0.88
MOD 0.97 0.91 0.94
PRED 0.86 0.89 0.87
ROOT 0.97 0.97 0.97
COMP 0.73 0.89 0.80
XCOMP 0.88 0.88 0.88
69
Outline
• The CHILDES GR scheme
• GR Parsing of CHILDES transcripts
• Combining different strategies
• Automated measurement of syntactic development in child language
• Weighted voting
• Combination as parsing
• Handling young child utterances
70
Handling Young Child Utterances with Rule-Based and Data-Driven Parsing
• Eve-child test set:
I need tapioca in the bowl.
That’s a hat.
In a minute.
* Warm puppy happiness a blanket.
* There briefcase.
? I drinking milk.
? I want Fraser hat.
71
Three Types of Sentences in One Corpus
• No problem
– High accuracy
• No GRs
– But data-driven systems will output GRs
• Missing words, agreement errors, etc.
– GRs are fine, but a challenge for data-driven systems trained on fully grammatical utterances
72
To Analyze or Not To Analyze: Ask the Rule-Based Parser
• Utterances with no GRs are annotated in test corpus as such
• Rule-based parser set to high precision
– Same grammar as before
• If a sentence cannot be parsed with the rule-based system, output No GR.
– 88% Precision, 89% Recall
– Sentences are fairly simple
73
The Rule-Based Parser also Identifies Missing Words
• If the sentence can be analyzed with the rule-based system, check if any insertions were necessary
– If be or the possessive marker ’s was inserted, insert the appropriate lexical item in the sentence
• Parse the sentence with data-driven systems, run combination
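The two-pass scheme on these slides can be sketched as a pipeline; `rule_based_parse` and `data_driven_combination` are hypothetical stand-ins for the actual systems, and the output conventions are illustrative:

```python
# Sketch of the rule-based first pass described above: reject unparseable
# utterances, repair detected omissions, then hand off to the data-driven
# combination system.

def analyze_utterance(sentence, rule_based_parse, data_driven_combination):
    result = rule_based_parse(sentence)   # high-precision setting
    if result is None:
        return "NO-GR"                    # unparseable: emit no GRs
    # If the grammar had to insert "be" or possessive "'s", add the
    # corresponding lexical item to the sentence before the second pass.
    repaired = sentence
    for word, position in result.get("insertions", []):
        repaired = repaired[:position] + [word] + repaired[position:]
    return data_driven_combination(repaired)

# Toy stand-ins for the two systems
def toy_rule_based(sent):
    if sent == ["There", "briefcase"]:
        return {"insertions": [("'s", 1)]}   # repairs "* There briefcase."
    return {"insertions": []}

toy_combo = lambda sent: sent
out = analyze_utterance(["There", "briefcase"], toy_rule_based, toy_combo)
```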
74
High Accuracy Analysis of Challenging Utterances
• Eve-child test
– No rule-based first pass: 62.9% accuracy
• Many errors caused by GR analysis of words with no GRs
– With rule-based pass: 88.0% accuracy
• 700 words from Naomi corpus
– No rule-based: 67.4%
– Rule-based, then combo: 86.8%
75
Outline
• The CHILDES GR scheme
• GR Parsing of CHILDES transcripts
• Combining different strategies
• Automated measurement of syntactic development in child language
• Related work
• Conclusion
76
Index of Productive Syntax (IPSyn) (Scarborough, 1990)
• A measure of child language development
• Assigns a numerical score for grammatical complexity (from 0 to 112 points)
• Used in hundreds of studies
77
IPSyn Measures Syntactic Development
• IPSyn: designed for investigating differences in language acquisition
– Differences in groups (for example: bilingual children)
– Individual differences (for example: delayed language development)
– Focus on syntax
• Addresses weaknesses of Mean Length of Utterance (MLU)
– MLU surprisingly useful until age 3, then reaches a ceiling
Automating IPSyn with Grammatical Relation Analyses
• Search for language structures using patterns that involve POS tags and GRs (labeled dependencies)
• Examples
– Wh-embedded clauses: search for wh-words whose head (or transitive head) is a dependent in a GR of types [XC]SUBJ, [XC]PRED, [XC]JCT, [XC]MOD, COMP or XCOMP
– Relative clauses: search for a CMOD where the dependent is to the right of the head
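The relative-clause pattern, for instance, reduces to a one-line check over labeled dependencies; the tuple format and toy example below are illustrative:

```python
# Each dependency: (dependent_index, head_index, relation). The pattern for a
# relative clause is a CMOD dependency whose dependent is to the right of
# (i.e. has a higher index than) its head.
def has_relative_clause(deps):
    return any(rel == "CMOD" and dep > head for dep, head, rel in deps)

# "the boy who ran": toy dependencies with the clause verb "ran" (index 4)
# attached as CMOD to "boy" (index 2)
deps = [(1, 2, "DET"), (2, 0, "ROOT"), (3, 4, "SUBJ"), (4, 2, "CMOD")]
found = has_relative_clause(deps)
```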
82
Evaluation DataEvaluation Data
• Two sets of transcripts with IPSyn scoring from two different child language research groups
• Set A
– Scored fully manually
– 20 transcripts
– Ages: about 3 yrs.
• Set B
– Scored with CP first, then manually corrected
– 25 transcripts
– Ages: about 8 yrs.
(Two transcripts in each set were held out for development and debugging)
83
Evaluation Metrics: Point Difference
• Point difference
– The absolute point difference between the scores provided by our system and the scores computed manually
– Simple, and shows how close the automatic scores are to the manual scores
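Averaged over a set of transcripts, this metric is just the mean absolute difference between paired scores; a minimal sketch:

```python
def point_difference(automatic, manual):
    """Mean absolute difference between paired IPSyn scores."""
    assert len(automatic) == len(manual)
    return sum(abs(a - m) for a, m in zip(automatic, manual)) / len(automatic)

# Toy scores for three transcripts (invented for illustration)
diff = point_difference([80, 75, 90], [82, 75, 86])
```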
• Automatic measurement of grammatical complexity
– Long, Fey & Channell, 2004
90
Outline
• The CHILDES GR scheme
• GR Parsing of CHILDES transcripts
• Combining different strategies
• Automated measurement of syntactic development in child language
• Related work
• Conclusion
91
Major Contributions
• An annotation scheme based on GRs for syntactic structure in CHILDES transcripts
• A linear-time classifier-based parser for constituent structures
• The development of rule-based and data-driven approaches to GR analysis
– Precision/recall trade-off using insertions and skipping
– Data-driven GR analysis using existing resources
• Charniak parser, Penn Treebank
– Parser variety in classifier-based dependency parsing
92
Major Contributions (2)
• The use of different voting schemes for combining dependency analyses
– Surpasses state-of-the-art in WSJ dependency