Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Analysis and L2-L1 Parallel Data Zi Lin, Yuguang Duan, Yuanyuan Zhao, Weiwei Sun, Xiaojun Wan Peking University {zi.lin, ariaduan, zhao yy, ws, wanxiaojun}@pku.edu.cn October 25, 2018

Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Analysis and L2-L1 Parallel Data

Zi Lin, Yuguang Duan, Yuanyuan Zhao, Weiwei Sun, Xiaojun Wan

Peking University

{zi.lin, ariaduan, zhao yy, ws, wanxiaojun}@pku.edu.cn

October 25, 2018

Overview

Background

Data Set

Robustness of L1-annotation-trained SRL Systems

Analysis

Improving SRL Systems with L2-L1 Parallel Data

What is interlanguage?

A learner's second language (L2) that preserves some features of the learner's first language (L1).

[Figure: a learner says 你好~ ("hello") in Mandarin Chinese (L2); English (L1) exerts influence on it.]

Interlanguage is everywhere...

Social networks

And perhaps your paper...

L2-L1 Parallel Data

We collect a large dataset of L2-L1 parallel Mandarin texts by exploring a “language exchange” social networking service, lang-8 (http://lang-8.com/).

Data for SRL annotation

Initial collection: 1,108,907 pairs
Clean up: 717,241 pairs
Manual selection: 600 pairs
Segmentation and SRL annotation

4 typologically different mother tongues

Language    Family
Chinese     Sino-Tibetan
Russian     Slavic
Arabic      Semitic
Japanese    Unknown
English     Germanic

Two Questions

1. Can humans understand interlanguage robustly?

2. Can automatic systems produce high-quality semantic structures?

Can humans understand interlanguage robustly?

☹ It is difficult to define the syntactic formalism of learner language.

☺ But sometimes we can understand what they mean...

Why not semantics?

Semantic Role Labeling

Argument (AN): Who did what to whom?

Adjunct (AM): When, where, why and how?

I ate breakfast quickly in the car this morning because I was in a hurry.

A0: I
A1: breakfast
AM-MNR: quickly
AM-LOC: in the car
AM-TMP: this morning
AM-PRP: because I was in a hurry
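The labeled example can also be written down as data. The following is only a minimal sketch: the `RoleSpan` class is a hypothetical name introduced for illustration, the pairing of each label with a phrase follows the left-to-right order of the labels on the slide, and the `rel` label for the predicate is borrowed from the later slides of this talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleSpan:
    """One labeled span; a hypothetical container, not part of any SRL toolkit."""
    role: str  # "A0", "A1", ... for arguments; "AM-*" for adjuncts; "rel" for the predicate
    text: str


sentence = "I ate breakfast quickly in the car this morning because I was in a hurry."

# Assumed alignment of the slide's labels with phrases of the sentence.
annotation = [
    RoleSpan("A0", "I"),
    RoleSpan("rel", "ate"),
    RoleSpan("A1", "breakfast"),
    RoleSpan("AM-MNR", "quickly"),
    RoleSpan("AM-LOC", "in the car"),
    RoleSpan("AM-TMP", "this morning"),
    RoleSpan("AM-PRP", "because I was in a hurry"),
]

# Arguments (AN) answer "who did what to whom"; adjuncts (AM) answer
# "when, where, why and how".
arguments = [s for s in annotation if s.role.startswith("A") and s.role[1:].isdigit()]
adjuncts = [s for s in annotation if s.role.startswith("AM-")]

for span in annotation:
    print(f"{span.role:7s} {span.text}")
```

Splitting the spans by label prefix mirrors the argument/adjunct distinction on the slide.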

Inter-annotator agreement

- Annotators: two linguistics students

- The first 50-sentence trial set: adapting and refining the CPB specification

- The remaining 100-sentence set: reporting the inter-annotator agreement

Three SRL systems

- The Necessity of Parsing for Predicate Argument Recognition. (2002). Gildea and Palmer.

- Semantic Role Labeling Using Different Syntactic Views. (2005). Pradhan et al.

- Syntax for Semantic Role Labeling, To Be, Or Not To Be. (2018). He et al.

- Linguistically-Informed Self-Attention for Semantic Role Labeling. (2018). Strubell et al. (EMNLP 2018 Best Paper)

System                               Parser
PCFGLA-parser-based SRL system       Berkeley parser
Neural-parser-based SRL system       Minimal span-based parser
Neural syntax-agnostic SRL system    (no parser)

Parser performance: Berkeley parser < minimal span-based parser

Parsers: trained on the part of the Chinese TreeBank that has SRL annotations in CPB

Systems: trained on Chinese PropBank (CPB)

Results

Findings

The syntax-based systems are more robust when handling learner texts.

The better the parsing results, the better the performance we achieve on L2.

Why is syntactic analysis important?

用 汉语 也 说话 快 对我来说 很 难 啊。
Gloss: using Chinese also speaking quickly for-me very hard.
Translation: Using Chinese and also speaking quickly is very hard for me.

[Figure: semantic role labels (A0, AM, rel) for this sentence from the gold annotation, the syntax-based system, and the neural end-to-end system.]

[Figure: constituency parse of the sentence. Node labels in reading order: CP, IP, IP (用汉语 "using Chinese"), VP, ADVP (也 "also"), VP, VP (说话快 "speaking quickly"), VP, PP (对我来说 "for me"), ADVP (很 "very"), VP (难 "hard"), SP (啊 MOD), PU.]

- Though the whole structure is ill-formed, part of the sentence can be well-formed.

A New Question

1. Can humans understand interlanguage robustly?

2. Can automatic systems produce high-quality semantic structures?

3. Can we improve SRL performance on interlanguage?

Leveraging L2-L1 Parallel Data

☺ 我 喜欢 做 中国菜
   I like cooking Chinese food

☺ 我 喜欢 做饭
   I like cooking meal

☹ 我 喜欢 做饭 中国菜
   I like cook-meal Chinese food

〈predicate, argument, role〉 tuples

L1:

我 喜欢 做中国菜 (I like cooking Chinese food)
predicate 喜欢: ARG0 = 我, ARG1 = 做中国菜

我 喜欢 做 中国菜 (I like cooking Chinese food)
predicate 做: ARG0 = 我, ARG1 = 中国菜

L2:

我 喜欢 做饭中国菜 (I like cook-meal Chinese food)
predicate 喜欢: ARG0 = 我, ARG1 = 做饭中国菜

我 喜欢 做饭 中国菜 (I like cook-meal Chinese food)
predicate 做饭: ARG0 = 我, ARG1 = 中国菜

# of shared tuples = 1
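The count of shared tuples can be reproduced with a set intersection. This is a sketch under assumptions, not the authors' code: arguments are compared by their surface strings, and the tuples below are one reading of the slide in which 喜欢 and 做 (L1) or 做饭 (L2) are the predicates.

```python
# Each tuple is (predicate, argument, role).
l1_tuples = {
    ("喜欢", "我", "ARG0"),
    ("喜欢", "做中国菜", "ARG1"),
    ("做", "我", "ARG0"),
    ("做", "中国菜", "ARG1"),
}

l2_tuples = {
    ("喜欢", "我", "ARG0"),
    ("喜欢", "做饭中国菜", "ARG1"),
    ("做饭", "我", "ARG0"),
    ("做饭", "中国菜", "ARG1"),
}

# A tuple is shared only if predicate, argument, and role all match exactly.
shared = l1_tuples & l2_tuples

print(len(shared))  # number of shared tuples
print(shared)
```

Under this reading only 〈喜欢, 我, ARG0〉 survives the intersection, which agrees with the count on the slide.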

Metric for comparing SRL results

- L2-recall: (# of shared tuples) / (# of tuples of the result in L2)

- L1-recall: (# of shared tuples) / (# of tuples of the result in L1)

A sentence pair is considered well-formed if both recalls are greater than a threshold λ.
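The two recalls and the λ test translate directly into a few lines of code. The function names, the handling of empty results, and the threshold values used below are assumptions made for illustration; the slides do not state the value of λ.

```python
def srl_recalls(l1_tuples, l2_tuples):
    """Return (l2_recall, l1_recall) for two sets of (predicate, argument, role) tuples."""
    shared = len(l1_tuples & l2_tuples)
    # Assumption: an empty SRL result yields a recall of 0.0 instead of a division error.
    l2_recall = shared / len(l2_tuples) if l2_tuples else 0.0
    l1_recall = shared / len(l1_tuples) if l1_tuples else 0.0
    return l2_recall, l1_recall


def is_well_formed(l1_tuples, l2_tuples, lam):
    """A pair counts as well-formed if both recalls are strictly greater than lam."""
    l2_recall, l1_recall = srl_recalls(l1_tuples, l2_tuples)
    return l2_recall > lam and l1_recall > lam


# Toy data: three of four tuples are shared between the two results.
l1_result = {("p", "a", "ARG0"), ("p", "b", "ARG1"), ("q", "a", "ARG0"), ("q", "c", "ARG1")}
l2_result = {("p", "a", "ARG0"), ("p", "b", "ARG1"), ("q", "a", "ARG0"), ("q", "d", "ARG1")}

print(srl_recalls(l1_result, l2_result))
print(is_well_formed(l1_result, l2_result, lam=0.5))
```

Requiring both recalls to pass keeps a pair only when the L2 sentence and its L1 correction receive largely the same analysis in both directions.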

Retraining two essential modules

1. Retrain the parser, using the automatically generated syntactic trees of the well-formed sentence pairs.

Results

Thanks for your attention!

Zi Lin is planning to apply for PhD programs in CS or linguistics this fall. Email me at [email protected].cn if you are interested!