PKUTM Experiments in NTCIR-8 MOAT Task
Author: Chenfeng Wang*, Tengfei Ma*, Liqiang Guo, Xiaojun Wan and Jianwu Yang
Affiliation: Institute of Computer Science & Technology of Peking University
Speaker: Tengfei Ma

Source: research.nii.ac.jp/.../NTCIR/03-NTCIR8-MOAT-WangC_slides.pdf

Jun 20, 2018




Background of Opinion Analysis

• Aspects of Opinion Analysis
  – Is it opinionated?
  – Is the opinion positive or negative?
  – What is the opinion?
  – Who gives the opinion and who does the opinion point to?
  – How to summarize all the opinions?
  – …


Background of Opinion Analysis

• Aspects of Opinion Analysis
  – Is it opinionated?
  – Is the opinion positive or negative?
  – What is the opinion?
  – Who gives the opinion and who does the opinion point to?
  – How to summarize all the opinions?
  – …

NTCIR-8 MOAT


Background of Opinion Analysis

• The trend of opinion analysis
  – Coarse-grain to fine-grain
    • Holder/target extraction
  – General to domain-specific and domain-transfer
    • Opinion analysis in news, product reviews, movie reviews
    • Cross-lingual, transfer learning
  – Publisher-predominate to interactive


Background of Opinion Analysis

• The trend of opinion analysis
  – Coarse-grain to fine-grain
    • Holder/target extraction
  – General to domain-specific and domain-transfer
    • Opinion analysis in news, product reviews, movie reviews
    • Cross-lingual, transfer learning
  – Publisher-predominate to interactive

NTCIR-8 MOAT


Our tasks in NTCIR-8 MOAT

• Opinionated subtask.
• Opinion holder extraction.
• Opinion target extraction.


I. DETECTION OF SUBJECTIVE SENTENCES (Opinionated task)


Detection of subjective sentences

• Equivalent to a classification problem
  Data preprocessing → Feature selection → Classification using a classifier
• Our method:
  – Some combined datasets
  – Some special opinion features
  – A general classifier and an improvement


Detection of subjective sentences

• Data Preprocessing
  – Choosing the training datasets
    • NTCIR6/NTCIR7 corpora and NTCIR8's samples
    • Containing both simplified and traditional Chinese
  – Translate traditional Chinese to Simplified Chinese
  – POS, NER
  – Building Lexicons


Building Lexicons

– Source:
  • Expanded HowNet by using the Synonymy Thesaurus + MPQA (English → Chinese) + NTU + our in-house labeled corpora
– Types:
  • Opinion Operators, e.g. 声称 (claim)
  • Opinion Indicators, e.g. 但是 (but)
  • Degree Adverbs, e.g. 非常 (very), 缺乏 (lack of)
  • Opinion Words (28421 opinion words)
  • Strong Opinion Words (6471 words)


Detection of subjective sentences

• Feature Selection

Table 1. Features used in the opinionated subtask

Punctuation Features
  – Presence of quotation marks like “, 「, ’, 」 and ”
  – Presence of a colon followed by quotation marks
  – Percentage of punctuation in the sentence

Words and Entities Features
  – The percentage of numeral words
  – The presence of a pronoun
  – The presence of a named entity
  – The presence of a word which indicates a sequence

Lexical Subjective Clues
  – The presence of an opinion operator
  – The presence of an opinion indicator
  – The logarithm of the percentage of opinion words
  – The logarithm of the percentage of strong opinion words
  – The presence of a degree adverb

Collocation Features
  – The presence of collocations between named entities and opinion operators
  – The presence of collocations between pronouns or nouns and opinion operators
  – The presence of collocations between opinion operators and opinion words
  – The presence of collocations between pronouns and opinion words
  – The presence of collocations between nouns or pronouns and opinion words
  – The presence of collocations between degree adverbs and opinion operators
  – The presence of collocations between degree adverbs and opinion words
  – The presence of collocations between nouns or named entities and opinion words
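A few of the Table 1 features can be sketched in Python as follows. This is a minimal illustration under assumptions, not the authors' implementation: the sentence is assumed to be already segmented into tokens, the tiny lexicons and the punctuation set are placeholders (only 声称 and 非常 come from the slides), the smoothing constant inside the logarithm is an assumption, and a "collocation" is approximated as the second item appearing within a hypothetical window of tokens after the first.

```python
import math

# Placeholder lexicons for illustration; the real lexicons are far larger.
OPINION_OPERATORS = {"声称"}
OPINION_WORDS = {"优秀", "糟糕"}  # hypothetical entries ("excellent", "terrible")
DEGREE_ADVERBS = {"非常"}
QUOTES = {"“", "”", "「", "」", "‘", "’"}
PUNCTUATION = QUOTES | {"，", "。", "：", "？", "！", "、", "；"}


def log_percentage(count, total, eps=1e-6):
    """Logarithm of a percentage; eps (an assumption) avoids log(0)."""
    return math.log(count / total + eps)


def has_collocation(tokens, left, right, window=3):
    """True if a token from `left` is followed by one from `right` within `window` tokens."""
    for i, tok in enumerate(tokens):
        if tok in left and any(t in right for t in tokens[i + 1:i + 1 + window]):
            return True
    return False


def extract_features(tokens):
    """Compute a small subset of the Table 1 features for one segmented sentence."""
    n = max(len(tokens), 1)  # guard against empty input
    return {
        "has_quote": any(t in QUOTES for t in tokens),
        "colon_then_quote": any(
            a == "：" and b in QUOTES for a, b in zip(tokens, tokens[1:])
        ),
        "punct_ratio": sum(t in PUNCTUATION for t in tokens) / n,
        "has_operator": any(t in OPINION_OPERATORS for t in tokens),
        "log_opinion_ratio": log_percentage(
            sum(t in OPINION_WORDS for t in tokens), n
        ),
        "degree_opinion_colloc": has_collocation(
            tokens, DEGREE_ADVERBS, OPINION_WORDS
        ),
    }


# Hypothetical segmented sentence: He claimed: "This is very excellent."
tokens = ["他", "声称", "：", "“", "这", "非常", "优秀", "。", "”"]
features = extract_features(tokens)
```

The resulting dictionary can be turned into a feature vector for any of the basic classifiers; the remaining features in the table would follow the same presence or ratio pattern, given POS and NER output.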


Detection of subjective sentences

• Classifier
  – Basic classifiers
    • such as SVM, Naive Bayes, Max Entropy and Decision Tree
    • The comparison is shown in the following section
  – Improved classifier
    • Iterative classifier using former results of detecting subjective sentences


Detection of subjective sentences

• Results in NTCIR8

Table 2. The result for identifying opinionated sentences

Run   Configuration                                          Precision  Recall  F-measure
Run1  All datasets + iterative classifier                    0.3721     0.8370  0.5152
Run2  NTCIR7 + NTCIR8 simplified Chinese + basic classifier  0.4134     0.8335  0.5527
Run3  Run2 + NTCIR7 traditional dataset                      0.3405     0.9062  0.4950

• Additional Tests (Comparison of different classifiers)

[Table not preserved in the transcript; column headers: lenient, strict]
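The F-measure values reported in Table 2 are consistent with the balanced F-measure (F1), the harmonic mean of precision and recall. The short check below recomputes it from the reported precision and recall; the function name `f_measure` is introduced here only for illustration.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the three submitted runs.
runs = {
    "Run1": (0.3721, 0.8370),
    "Run2": (0.4134, 0.8335),
    "Run3": (0.3405, 0.9062),
}

# Round to four decimals, matching the precision used in the table.
scores = {name: round(f_measure(p, r), 4) for name, (p, r) in runs.items()}
for name, score in scores.items():
    print(name, score)
```

The recomputed scores agree with the reported F-measure column to four decimal places. They also show why Run3, which has the highest recall, still ranks last: its lower precision pulls the harmonic mean down.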


Detection of subjective sentences

• Discussion of the results
  – Training data
    • More ≠ Better
    • When and how to leverage translated datasets
  – Classifier
    • Iterative risk
  – Problem
    • Ambiguous definition
    • Ambiguous words


II. EXTRACTION OF OPINION HOLDERS AND TARGETS (Holder/target task)


Extracting opinion holders/targets

• Common methods
  – Parsing and direct training (Bethard)
  – Maximum Entropy ranking (Kim and Hovy)
  – Labeling
• Our method
  – Chunking and heuristic rules


Extracting opinion holders/targets

• Advantage of Chunking
  – Better than parsing in Chinese
  – Easier to control and modify than shallow parsing
• Process:
  – Training data: proposition bank
  – Modifying training data
  – Training and labeling by CRF
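For CRF training, chunked sentences are commonly encoded as one label per token. The sketch below converts labeled chunks into BIO tags, a standard encoding for sequence labeling. The slides do not state which label scheme or chunk types were used, so the BIO scheme, the chunk type names (`NP`, `VP`) and the function name `chunks_to_bio` are assumptions made for illustration.

```python
def chunks_to_bio(chunks):
    """Convert [(chunk_type_or_None, [tokens])] into a list of (token, BIO tag) pairs."""
    tagged = []
    for chunk_type, tokens in chunks:
        for i, token in enumerate(tokens):
            if chunk_type is None:
                tag = "O"  # token outside any chunk
            elif i == 0:
                tag = "B-" + chunk_type  # first token of a chunk
            else:
                tag = "I-" + chunk_type  # continuation of the same chunk
            tagged.append((token, tag))
    return tagged


# Hypothetical chunked sentence: "The spokesman claimed the plan is very good."
sentence = [
    ("NP", ["发言人"]),
    ("VP", ["声称"]),
    ("NP", ["该", "计划"]),
    ("VP", ["非常", "好"]),
    (None, ["。"]),
]
bio = chunks_to_bio(sentence)
```

Each `(token, tag)` pair, usually extended with POS and lexicon features, would form one row of the training file for a CRF toolkit.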


Extracting opinion holders/targets

• Heuristic rules for opinion holder extraction
  – Before an opinion operator (including a colon) or following a quotation
  – Not governed by a preposition
  – Sometimes in other sentences
  – Nouns or pronouns are used as candidates to cover the cases missed by the rules above
  – The author
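The first two rules, with the author as a last resort, can be sketched as a small function. This is an illustrative sketch under assumptions, not the authors' rule set: the input is assumed to be a list of chunks with hypothetical type labels (`NP`, `PP`, `VP`), the operator lexicon is a placeholder, "governed by a preposition" is approximated as "immediately preceded by a preposition chunk", and the quotation and cross-sentence rules are left out.

```python
# Placeholder lexicon; per the slides, a colon also counts as an opinion operator.
OPINION_OPERATORS = {"声称", "：", ":"}


def extract_holder(chunks, default="author"):
    """Pick an opinion holder from (chunk_type, text) pairs given in sentence order."""
    for i, (_, text) in enumerate(chunks):
        if text not in OPINION_OPERATORS:
            continue
        # Scan leftwards from the operator for the nearest usable noun chunk.
        for j in range(i - 1, -1, -1):
            cand_type, cand_text = chunks[j]
            if cand_type != "NP":
                continue
            governed = j > 0 and chunks[j - 1][0] == "PP"
            if not governed:
                return cand_text
    # No candidate found: fall back to the author of the document.
    return default


# Hypothetical sentence: "At the meeting, the spokesman claimed the plan is feasible."
with_holder = [
    ("PP", "在"), ("NP", "会议上"), ("NP", "发言人"),
    ("VP", "声称"), ("NP", "该计划"), ("VP", "可行"),
]
# Hypothetical sentence whose only noun before the operator follows a preposition.
governed_only = [("PP", "据"), ("NP", "报道"), ("VP", "声称"), ("NP", "该计划"), ("VP", "可行")]
# Hypothetical sentence without any opinion operator.
no_operator = [("NP", "天气"), ("VP", "很好")]

holders = [extract_holder(s) for s in (with_holder, governed_only, no_operator)]
```

Scanning leftwards prefers the noun chunk closest to the operator, which is one simple way to rank multiple candidates; other orderings are equally plausible.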


Extracting opinion holders/targets

• Heuristic rules for opinion target extraction
  – Similar to opinion holder extraction
  – Mainly existing in the opinion clause or as the object of an opinion operator
  – Coherent with neighbor sentences


Extracting opinion holders/targets

Table 3. Evaluation Results for Opinion Holders (Holder Extraction)

Scope                           Run   Precision  Recall  F-measure
Only for opinionated sentences  Run1  0.550      0.434   0.485
Only for opinionated sentences  Run2  0.554      0.431   0.485
Only for opinionated sentences  Run3  0.548      0.473   0.508
For all sentences               Run1  0.204      0.434   0.277
For all sentences               Run2  0.232      0.431   0.301
For all sentences               Run3  0.186      0.473   0.267

Table 4. Evaluation Results for Opinion Targets (Target Extraction)

Scope                           Run   Precision  Recall  F-measure
Only for opinionated sentences  Run1  0.892      0.736   0.806
Only for opinionated sentences  Run2  0.896      0.732   0.805
Only for opinionated sentences  Run3  0.877      0.792   0.832
For all sentences               Run1  0.339      0.736   0.464
For all sentences               Run2  0.385      0.732   0.504
For all sentences               Run3  0.307      0.792   0.442


Extracting opinion holders/targets

• Discussion
  – Limited by the parsing technique
  – Features are complex for machine learning
  – Future research (See (Ma, Coling10))
    • Adding semantic information
    • Adding syntactic rules to leverage relevant information (e.g. reviews--news)
