-
An Overview of Opinionated Tasks and Corpus Preparation
Hsin-Hsi Chen
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, Taiwan
http://research.nii.ac.jp/ntcir/ntcir-ws6/opinion/ntcir5-opinionws-en.html
-
What is an opinion?
An opinion is subjective information.
An opinion usually contains an opinion holder, an attitude, and a target, but these are not obligatory.
A sentential clause or a meaningful unit (in Chinese) is the smallest unit of an opinion.
-
Why is opinion processing important?
There is an explosive amount of information on the Internet, and it is hard for humans to extract opinions from it.
Public opinion is an important index for companies and the government.
Opinions change over time, so tracking opinions automatically is an important issue.
-
Fact-based vs. Opinion-based
Examples:
"Circular" vs. "happy"
"He is an engineer." vs. "He thinks that his boss is a kind person."
"Why is the sky blue?" vs. "Do people support the government?"
-
Previous Work (1)
English:
Sentiment words (Wiebe et al.; Kim and Hovy; Takamura et al.)
Opinion sentence extraction (Riloff and Wiebe; Kim and Hovy)
Opinion document extraction (Wiebe et al.; Pang et al.)
Opinion summarization: reviews and products (Hu and Liu; Dave et al.)
-
Previous Work (2)
Japanese:
Opinion extraction (Kobayashi et al.: reviews, at word/sentence level)
Opinion summarization (Morinaga et al.: product reputations; Seki, Eguchi, and Kando)
Chinese:
Opinion extraction (Ku, Wu, Li and Chen)
Opinion summarization (Ku, Li, Wu and Chen)
News and blog corpora (Ku, Liang and Chen)
Korean?
-
Corpus Preparation (1)
Quantity: How much material should we collect? Words/sentences/documents?
Source: What sources should we pick? Should we mine opinions from general documents or from obviously opinionated documents (e.g., discussion groups)?
News, reviews, blogs, ...
-
Corpus Preparation (2)
Different granularities:
Word level
Sentence level
Clause level
Document level
Multi-document (summarization)
Different sources
Different languages
-
Previous Work (Corpus Preparation 1/5)
Example: NRRC Summer Workshop on Multiple-Perspective QA
People involved: 1 researcher, 3 graduate students, 6 professors
Collected 270,000 documents over an 11-month period; retrieved documents relevant to 8 topics, with more than 200 documents per topic
Workshop: MPQA: Multi-Perspective Question Answering
Host: Northeast Regional Research Center (NRRC), 2002
Leader: Prof. Janyce Wiebe
Participants: Eric Breck, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson
Publications: https://rrc.mitre.org/pubs.shtml
-
Previous Work (Corpus Preparation 2/5)
Source: news documents (World News Connection, WNC)
In another work at the word level: 2,615 words
-
Previous Work (Corpus Preparation 3/5)
Example: Using the NTCIR (NII Test Collection for IR Systems) Corpus (Chinese)
Reusable
NTCIR-2, news documents
Retrieved documents relevant to 6 topics; on average, 34 documents per topic
At the word level: 838 words
Experiments using NTCIR-3 are ongoing
-
Previous Work (Corpus Preparation 4/5)
-
Previous Work (Corpus Preparation 5/5)
Example: Using reviews from the Web (Japanese)
Specific domains: cars and games
15,000 reviews (230,000 sentences) for cars; 9,700 reviews (90,000 sentences) for games
Using topic words (e.g., names of car and game companies)
Semi-automatic methods for collecting opinion terms (with patterns)
-
Corpus Annotation
Annotation types (1):
Support/Non-support
Sentiment/Non-sentiment
Positive/Neutral/Negative
Strong/Medium/Weak
Annotation types (2):
Opinion holder/Attitude/Target
Nested opinions
-
Previous Work (Corpus Annotation 1/4)
Example: NRRC Summer Workshop on Multiple-Perspective QA (English)
A total of 114 documents annotated: 57 with deep annotations, 57 with shallow annotations
7 annotators
-
Previous Work (Corpus Annotation 2/4)
Tags:
Opinion: on=implicit/formally declared
Fact: onlyfactive=yes/no
Subjectivity: strength=high/medium/low
Attitude: neg-attitude/pos-attitude
Writer: opinion holder information
-
Previous Work (Corpus Annotation 3/4)
Example: Using the NTCIR Corpus (Chinese)
A total of 204 documents annotated
3 annotators
Using XML-style tags
Types are defined, but no strength (considering the agreement issue)
-
Previous Work (Corpus Annotation 4/4)
-
Corpus Evaluation (1)
How to choose materials? Filter out candidates whose annotations are too diverse among annotators? (agreement?)
How many annotators are needed per candidate? (more annotators, lower agreement)
How to build the gold standard?
Voting
Use only instances with consistent annotations
-
Corpus Evaluation (2)
How to evaluate a corpus for a subjective task?
Agreement (is it enough?)
Kappa value (to what agreement level?), on the Landis and Koch scale:
Almost perfect agreement (0.81-1.00)
Substantial agreement (0.61-0.80)
Moderate agreement (0.41-0.60)
Fair agreement (0.21-0.40)
Slight agreement (0.01-0.20)
Less than chance agreement (below 0)
-
Kappa coefficient (wiki)
Cohen's kappa coefficient is a statistical measure of inter-rater agreement.
It is generally thought to be a more robust measure than a simple percent agreement calculation, since it takes into account the agreement occurring by chance.
Cohen's kappa measures the agreement between two raters who each classify N items into C mutually exclusive categories.
The first evidence of Cohen's kappa in print can be attributed to Galton (1892).
-
Kappa coefficient (wiki)
The equation for kappa is:
kappa = (Pr(a) - Pr(e)) / (1 - Pr(e))
Pr(a) is the relative observed agreement among raters.
Pr(e) is the hypothetical probability of chance agreement.
If the raters are in complete agreement, then kappa = 1.
If there is no agreement among the raters (other than what would be expected by chance), then kappa = 0.
-
Kappa coefficient
Two raters are asked to classify objects into categories 1 and 2. The table below contains the cell probabilities for a 2-by-2 table.

                      Rater 2: category 1   Rater 2: category 2
  Rater 1: category 1        P11                   P12
  Rater 1: category 2        P21                   P22

P0 = P11 + P22 is the observed level of agreement.
This value needs to be compared to the value you would expect if the two raters were totally independent:
Pe = P1. x P.1 + P2. x P.2, where Pi. and P.j denote the row and column marginal probabilities.
http://www.childrensmercy.org/stats/definitions/kappa.htm
-
Example
Hypothetical example: 29 patients are examined by two independent doctors (see the table below). 'Yes' denotes that a doctor diagnoses the patient with disease X; 'No' denotes that a doctor classifies the patient as having no disease X.

                   Doctor 2: Yes   Doctor 2: No
  Doctor 1: Yes         10               7
  Doctor 1: No           0              12

P0 = P11 + P22 = (10 + 12)/29 = 0.76
Pe = P1. x P.1 + P2. x P.2 = 0.586 x 0.345 + 0.414 x 0.655 = 0.474
Kappa = (0.76 - 0.474)/(1 - 0.474) = 0.54
http://www.dmi.columbia.edu/homepages/chuangj/kappa/
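
As a worked check of the numbers above, here is a small Python sketch of Cohen's kappa. The 2-by-2 counts are inferred from the slide's marginal probabilities (rows: doctor 1; columns: doctor 2), and the helper function is illustrative, not part of the original lecture.

    # Cohen's kappa from a square table of agreement counts.
    def cohens_kappa(table):
        n = sum(sum(row) for row in table)
        k = len(table)
        # Observed agreement: proportion of items on the diagonal.
        p_o = sum(table[i][i] for i in range(k)) / n
        # Chance agreement: product of the two raters' marginal proportions.
        rows = [sum(row) / n for row in table]
        cols = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
        p_e = sum(rows[i] * cols[i] for i in range(k))
        return (p_o - p_e) / (1 - p_e)

    table = [[10, 7],   # doctor 1 'yes': doctor 2 'yes' / doctor 2 'no'
             [0, 12]]   # doctor 1 'no':  doctor 2 'yes' / doctor 2 'no'
    print(round(cohens_kappa(table), 2))  # -> 0.54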
-
Online Kappa Calculator: http://justus.randolph.name/kappa
-
Previous Work (Corpus Evaluation)
Different languages/annotations may have different agreements.
Kappa: 0.32-0.65 (factivity only, English)
Kappa: 0.40-0.68 (word level, Chinese)
Different annotators with different backgrounds may have different agreements.
-
What is needed for this work?
What kind of documents? News? Others? All relevant documents?
Provide only the type of documents, or fully annotated documents for training?
Provide some sentiment words as clues?
At what granularity? Word, clause, sentence, document, or multi-document?
In which language? Monolingual, multilingual, or cross-lingual?
-
Natural Language Processing, Lecture 15: Opinionated Applications
Hsin-Hsi Chen
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, Taiwan
-
Opinionated Applications
Opinion extraction
Sentiment word mining
Opinionated sentence extraction
Opinionated document extraction
Opinion summarization
Opinion tracking
Opinionated question answering
Multi-lingual/cross-lingual opinionated issues
-
Opinion Mining
Opinion extraction identifies opinion holders, extracts the relevant opinion sentences, and decides their polarity.
Opinion summarization recognizes the major events embedded in documents and summarizes the supportive and the non-supportive evidence.
Opinion tracking captures subjective information from various genres and monitors the development of opinions along spatial and temporal dimensions.
-
Opinion Extraction
Extract opinion evidence from words, sentences, and documents, and then determine their polarities.
The composition of semantics and the composition of opinions in documents are very much alike: word -> sentence -> document.
The algorithm is designed based on this composition across different granularities.
-
Seeds
Sentiment words in the General Inquirer (GI) and the Chinese Network Sentiment Dictionary (CNSD) are collected as seeds.
GI is in English, while CNSD is in Chinese; GI is translated into Chinese.
A total of 10,542 qualified seeds are collected in NTUSD.
-
Statistics of Seeds
-
Thesaurus Expansion
The seed vocabulary is enlarged using the Academia Sinica Bilingual Ontological WordNet.
Words in the same cluster may not always have the same opinion tendency, e.g., 'forgive' vs. 'appease'.
How do we distinguish words with different polarities within the same cluster/synset?
Opinion tendency of a word and its strength
-
Sentiment Tendency of a Character (raw score)
-
Sentiment Tendency of a Character (normalization)
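
The scoring formulas on the two character-level slides above were lost in extraction. Below is a hedged LaTeX reconstruction of the idea, following Ku, Liang and Chen's character-frequency method; the exact form of the normalization is an assumption, not the slides' original equations.

    % Hedged reconstruction; f_{pc} and f_{nc} are the occurrence counts of
    % character c in positive and negative seed words, and n_p and n_n are
    % the numbers of positive and negative seed words.
    \documentclass{article}
    \begin{document}
    Raw score (biased when the seed sets are unbalanced):
    \[ S_{\mathrm{raw}}(c) = \frac{f_{pc} - f_{nc}}{f_{pc} + f_{nc}} \]
    Normalized score (frequencies scaled by seed-set sizes):
    \[ S_{\mathrm{norm}}(c) =
       \frac{f_{pc}/n_p - f_{nc}/n_n}{f_{pc}/n_p + f_{nc}/n_n} \]
    \end{document}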
-
Sentiment Tendency of a Word
The sentiment degree of a Chinese word w is the average of the sentiment scores of its composing characters c1, c2, ..., cp (a small sketch follows below).
A positive score denotes a positive word.
A negative score denotes a negative word.
A score of zero denotes a non-sentiment or neutral word.
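
A minimal Python sketch of the word-level rule above; the character scores in the example are hypothetical values, not entries from NTUSD.

    # Sentiment degree of a word = average of its characters' scores.
    def word_sentiment(word, char_scores):
        # Characters absent from the lexicon are treated as neutral (0).
        scores = [char_scores.get(c, 0.0) for c in word]
        return sum(scores) / len(scores) if scores else 0.0

    char_scores = {"快": 0.6, "樂": 0.8, "悲": -0.7}  # hypothetical scores
    print(word_sentiment("快樂", char_scores))  # ~0.7  -> positive word
    print(word_sentiment("悲哀", char_scores))  # -0.35 -> negative word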
-
Opinion Extraction at Sentence Level
-
Opinion Extraction at Document Level
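
The sentence- and document-level formulas on the two slides above did not survive extraction. The sketch below assumes a simple additive composition from word scores up to document polarity; the actual algorithm may weight opinion words, negation, and holders differently.

    # Toy additive composition: word scores -> sentence -> document.
    def sentence_polarity(word_scores):
        # The sign of the summed word scores gives the sentence polarity.
        return sum(word_scores)

    def document_polarity(sentence_scores):
        # A document whose sentences lean positive is opinionated-positive.
        return sum(sentence_scores)

    s = sentence_polarity([0.7, -0.2, 0.4])  # ~0.9 -> positive sentence
    print(round(s, 1), round(document_polarity([s, -0.1]), 1))  # 0.9 0.8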
-
Evaluation Corpus Preparation
Sources: TREC (English; news) / NTCIR (Chinese; news) / blogs (Chinese; casual writing)
The corpus is prepared for multi-genre and multi-lingual issues.
The corpus is prepared to evaluate opinion extraction, summarization, and tracking.
-
Opinion Summarization
Find the important topics of a document set.
Find the sentences relevant to the important topics.
Find the opinions embedded in those sentences.
Summarize the opinions on the important topics. (A toy sketch of this pipeline follows below.)
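
A toy, self-contained Python sketch of the four-step pipeline above. The topic-detection heuristic (most frequent content words) and the seed-word polarity test are stand-in assumptions, not the authors' actual methods.

    from collections import Counter

    POSITIVE = {"support", "good"}
    NEGATIVE = {"oppose", "bad"}
    STOPWORDS = {"the", "is", "for", "a", "of", "new"}

    def tokens(sentence):
        return [w.strip(".,").lower() for w in sentence.split()]

    def sentence_polarity(sentence):
        words = set(tokens(sentence))
        return len(words & POSITIVE) - len(words & NEGATIVE)

    def summarize_opinions(sentences, n_topics=1):
        # 1. Important topics: here, simply the most frequent content words.
        counts = Counter(w for s in sentences for w in tokens(s)
                         if w not in POSITIVE | NEGATIVE | STOPWORDS)
        topics = [w for w, _ in counts.most_common(n_topics)]
        summary = {}
        for topic in topics:
            # 2. Sentences relevant to the topic.
            relevant = [s for s in sentences if topic in tokens(s)]
            # 3. Opinions embedded in those sentences.
            opinions = [(s, sentence_polarity(s)) for s in relevant]
            # 4. Supportive vs. non-supportive evidence for the topic.
            summary[topic] = {
                "supportive": [s for s, p in opinions if p > 0],
                "non-supportive": [s for s, p in opinions if p < 0],
            }
        return summary

    sents = ["People support the new tax policy.",
             "The tax policy is bad for small firms.",
             "Critics oppose the tax policy."]
    print(summarize_opinions(sents))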
-
Opinion Tracking
Opinion tracking is a kind of graph-based opinion summarization.
We are concerned with how opinions change over time: an opinion tracking system tells how people change their opinions as time goes by.
To track opinions, opinion extraction and summarization are necessary: opinion extraction tells the changes of opinion polarities, while opinion summarization tells the correlated events.