A Novel Rule Refinement Method for SMT through Simulated Post-Editing Sitong Yang 1,2 , Heng Yu ?1 , and Qun Liu 1,3 1. Key Laboratory of Intelligent Information Processing. Institute of Computing Technology, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences 3. CNGL, School of Computing, Dublin City University 2014/12/23
A Novel Rule Refinement Method for SMT through Simulated Post-Editing
Sitong Yang1,2, Heng Yu?1, and Qun Liu1,3
1. Key Laboratory of Intelligent Information Processing. Institute of Computing Technology, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences
3. CNGL, School of Computing, Dublin City University
2014/12/23
Post-Editing
Pros & Cons
Our method
Data set & Experiment
Conclusion & Future Work
Post Editing (PE)
Automatic post editing
Multiple stream: MT → Post Editing (SMT) → Result
Single stream: Post Editing feeds back into MT → Better MT
Pros & Cons
Pros:
• Better adaptation
• No additional burden for SMT
Cons:
• Expensive
• Hard to learn
Our method
We learn from PE results to enhance the original SMT model.
• Simulated Post Editing
• Error-Driven Framework
Error Detection → Rule Extraction → Rule Filtering
Simulated PE
Daniel [2010] formulated the task of simulated post-editing, wherein pregenerated reference translations are used as a stand-in for actual post-editing.
Machine Translation + Human Post Editing → PE (Expensive)
Machine Translation + Reference → SiPE (Cheap)
Error-Driven Rule Refinement
Src: This man lived a dog ’s life
Tgt: 这个 人 生活 潦倒
Alignment Error!
MT: 这个 人 生活 一只 狗 的 生活 (zhege ren shenghuo yizhi gou de shenghuo)
Ref: 这个 人 生活 潦倒 (zhege ren shenghuo liaodao)
Error Detection
Editing distance → TERplus
Edit types: Synonym match, Stem match, Phrase substitution, Shift, Deletion, Word substitution, Insertion
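To make the error-detection step concrete, here is a minimal sketch of word-level edit distance with a backtrace that lists the edits between an MT output and its reference. This is plain Levenshtein alignment, not TERplus: it only covers insertion, deletion and word substitution, and has no shift, stem, synonym or phrase-level matching. The function name `edit_ops` is hypothetical and does not come from the slides.

```python
def edit_ops(mt, ref):
    """Return (distance, edit script) turning token list `mt` into `ref`."""
    n, m = len(mt), len(ref)
    # dist[i][j] = minimum number of edits turning mt[:i] into ref[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if mt[i - 1] == ref[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # delete an MT word
                             dist[i][j - 1] + 1,         # insert a reference word
                             dist[i - 1][j - 1] + cost)  # match or substitute

    # Walk back from the bottom-right cell to recover one optimal edit script.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            same = mt[i - 1] == ref[j - 1]
            if dist[i][j] == dist[i - 1][j - 1] + (0 if same else 1):
                ops.append(("match" if same else "substitute", mt[i - 1], ref[j - 1]))
                i -= 1
                j -= 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", mt[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", None, ref[j - 1]))
            j -= 1
    ops.reverse()
    return dist[n][m], ops


mt_tokens = "这个 人 生活 一只 狗 的 生活".split()
ref_tokens = "这个 人 生活 潦倒".split()
distance, script = edit_ops(mt_tokens, ref_tokens)
```

On the example from the slides the distance is 4: three deletions and one substitution, with the first three words matched. Which MT word is treated as the substituted one depends on tie-breaking in the backtrace.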
[Figure: two bar charts over the edit types. Left: SiPE Distribution (count axis 0 to 9000). Right: SiPE Precision (axis 0 to 0.9). Annotation on the chart: small sample, hard to learn.]
Legend: Y: Synonym match, T: Stem match, Ps: Phrase substitution, S: Shift, D: Deletion, Ws: Word substitution, I: Insertion
Rule Extraction and Filtering
Filtering:
• C (number of context words)
• P (number of words in the source-side substitution part)
MT: 这个 人 生活 一只 狗 的 生活 (zhege ren shenghuo yizhi gou de shenghuo)
Ref: 这个 人 生活 潦倒 (zhege ren shenghuo liaodao)
C=1 P=4
Extracted monolingual rule:
生活 一只 狗 的 生活 ||| 生活 潦倒
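The extraction step above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a single contiguous differing span between MT and reference (found by stripping the common prefix and suffix) and takes the context words from the left only, as in the slide's example. The name `extract_rule` and its `context` parameter are hypothetical.

```python
def extract_rule(mt, ref, context=1):
    """Extract one monolingual rule (MT side ||| reference side) from two token lists.

    Returns (lhs, rhs, c, p), or None when the two sentences are identical.
    c = number of context words kept, p = number of MT words being substituted.
    """
    shortest = min(len(mt), len(ref))
    # Length of the common prefix.
    prefix = 0
    while prefix < shortest and mt[prefix] == ref[prefix]:
        prefix += 1
    if prefix == len(mt) == len(ref):
        return None  # nothing to correct
    # Length of the common suffix, not allowed to overlap the prefix.
    suffix = 0
    while (suffix < shortest - prefix
           and mt[len(mt) - 1 - suffix] == ref[len(ref) - 1 - suffix]):
        suffix += 1
    # Keep up to `context` matched words to the left of the differing span.
    start = max(0, prefix - context)
    lhs = mt[start:len(mt) - suffix]
    rhs = ref[start:len(ref) - suffix]
    c = prefix - start
    p = len(mt) - suffix - prefix
    return lhs, rhs, c, p


mt_tokens = "这个 人 生活 一只 狗 的 生活".split()
ref_tokens = "这个 人 生活 潦倒".split()
rule = extract_rule(mt_tokens, ref_tokens, context=1)
lhs, rhs, c, p = rule
rule_text = " ".join(lhs) + " ||| " + " ".join(rhs)
```

With `context=1` this reproduces the slide's values, C=1 and P=4, and the rule text `生活 一只 狗 的 生活 ||| 生活 潦倒`.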
Original bilingual rule:
lived a dog ’s life ||| 生活 一只 狗 的 生活 ||| 0.5 0.0149508 0.4 7.97148e-06 2.718
New rule:
lived a dog ’s life ||| 生活 潦倒 ||| 0.5 0.0149508 0.4 7.97148e-06 2.718
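A minimal sketch of how a monolingual rule could be turned into a new bilingual rule: find a phrase-table entry whose target side equals the left-hand side of the monolingual rule, then swap in the right-hand side. The scores are copied unchanged, as in the slide's example. The exact-match condition and the name `refine_entry` are assumptions; the slides do not say how entries are matched or whether scores are re-estimated.

```python
def refine_entry(entry, mono_lhs, mono_rhs):
    """Rewrite the target side of one '|||'-separated phrase-table line.

    Returns the new line, or None if the entry's target side does not
    match the left-hand side of the monolingual rule.
    """
    fields = [field.strip() for field in entry.split("|||")]
    source, target, rest = fields[0], fields[1], fields[2:]
    # Compare token by token so that spacing differences do not matter.
    if target.split() != mono_lhs.split():
        return None
    new_target = " ".join(mono_rhs.split())
    return " ||| ".join([source, new_target] + rest)


original = ("lived a dog 's life ||| 生活 一只 狗 的 生活 ||| "
            "0.5 0.0149508 0.4 7.97148e-06 2.718")
new_rule = refine_entry(original, "生活 一只 狗 的 生活", "生活 潦倒")
```

Here `new_rule` keeps the source side and the five scores of `original` and only replaces the target side with `生活 潦倒`.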
Filtering Criterion
calming the emotions of ||| calming the feelings of
the beginning of the new ||| the opening
between the faculty members and ||| between teachers and
to open up and ||| opening up and
C=2 P=2, C=1 P=3, C=2 P=1, C=2 P=3
• Should contain more context (C >= 2)
• More accurate substitution (2 <= P <= 5)
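The two criteria above amount to a simple predicate over the C and P values of a rule. The thresholds are the ones on the slide; the function and parameter names are hypothetical.

```python
def keep_rule(c, p, min_context=2, min_sub=2, max_sub=5):
    """Keep a rule only if it has enough context and a mid-sized substitution."""
    return c >= min_context and min_sub <= p <= max_sub


# (C, P) values that appear on the slides.
candidates = [(2, 2), (1, 3), (2, 1), (2, 3), (1, 4)]
kept = [cp for cp in candidates if keep_rule(*cp)]
```

Of these values only (2, 2) and (2, 3) pass. The earlier example with C=1 and P=4 would be dropped under this criterion because it has too little context.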
Experiments-Setup
• Baseline system: Moses, a state-of-the-art phrase-based SMT system