Movie Review Mining and Summarization
Li Zhuang, Feng Jing, and Xiao-Yan Zhu
ACM CIKM 2006
Speaker: Yu-Jiun Liu
Date : 2007/01/10
Introduction
Reviews are useful to both information promulgators and readers.
However, many reviews are lengthy, with only a few sentences expressing the author’s opinions.
Goal: automatically generate a summary of reviews.
Product Review vs. Movie Review
Characteristics of movie review mining:
Promulgators often comment on many other movie-related elements as well.
Readers probably want more detail, so a movie review summary must be richer than a product review summary.
Proposed: a multi-knowledge based approach.
Definition 1 Movie Feature
A movie feature is a movie element or a movie-related person that has been commented on.
According to IMDB, feature classes are divided into two groups: ELEMENT and PEOPLE.
ELEMENT: OA, ST (screenplay), SE (special effects), etc.
PEOPLE: PPR, PDR, PAC, etc.
Example: “story”, “script”, and “screenplay” belong to the ST class; “actor”, “actress”, and “supporting cast” belong to the PAC class.
Definition 2 Relevant Opinion of a Feature
The relevant opinion of a feature is a set of words or phrases that expresses a positive (PRO) or negative (CON) opinion on the feature.
The polarity of the same opinion word may vary across domains.
Example: “predictable” is neutral in a product review but sounds negative in a movie review.
Definition 3 Feature-Opinion Pair
A feature-opinion pair consists of a feature and a relevant opinion.
An explicit F-O pair: both the feature and the opinion appear in the sentence. Example: “The movie is excellent.”
An implicit F-O pair: the feature or the opinion does not appear in the sentence. Example: “When I watched this film, I hoped it ended as soon as possible.” (no opinion word)
Keyword list generation
Build a keyword list to capture main feature/opinion words in movie reviews.
Divide the list into two classes: features and opinions.
Feature Keywords
Feature words converge: a relatively small set of words accounts for most feature mentions.
Special case: people's names, which appear in multiple forms (e.g., Liu Yu Jiun; Liu Y.J.; L. Y. Jiun … etc.)
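The multi-form name problem above could be handled by generating a few surface variants of each full name and matching any of them. The sketch below is only an illustration: the function names `name_variants` and `mentions_person` and the particular set of variants are assumptions, not the rules used in the paper.

```python
def name_variants(full_name):
    """Return a set of illustrative surface forms for a person's name.

    The variant patterns are assumptions modeled on the slide's example
    (Liu Yu Jiun; Liu Y.J.; L. Y. Jiun), not the paper's actual rules.
    """
    parts = full_name.split()
    if len(parts) < 2:
        return {full_name}
    first, rest = parts[0], parts[1:]
    # "Liu Y.J.": first part kept, remaining parts reduced to initials.
    first_plus_initials = first + " " + "".join(p[0] + "." for p in rest)
    # "L. Y. Jiun": every part but the last reduced to an initial.
    initials_plus_last = " ".join(p[0] + "." for p in parts[:-1]) + " " + parts[-1]
    return {full_name, first_plus_initials, initials_plus_last}


def mentions_person(sentence, full_name):
    """True if any variant of the name occurs in the sentence."""
    return any(v in sentence for v in name_variants(full_name))


variants = name_variants("Liu Yu Jiun")
```

A real system would also need to handle case differences and reordered name parts, which this sketch ignores.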
Opinion Keywords
Statistical results alone are not enough.
The first 100 positive/negative words in the statistical ranking are selected as seeds.
For each substantive in WordNet, look up the synsets of its first two meanings. If any seed word appears in those synsets, the substantive is added to the opinion word list.
Remaining high-frequency opinion words are added as domain-specific words.
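The seed-expansion step can be sketched as follows. To keep the example self-contained it does not query WordNet; instead it uses a small hand-made dictionary, `TOY_SYNSETS`, that maps each word to its synsets in sense order. That dictionary, its contents, and the function name `expand_opinion_words` are assumptions for illustration only.

```python
def expand_opinion_words(seeds, synsets_by_word, max_senses=2):
    """Add a word when any seed appears in the synsets of its first senses.

    synsets_by_word: word -> list of synsets (each a list of words),
    ordered by sense. This stands in for a real WordNet lookup.
    """
    seeds = set(seeds)
    added = set()
    for word, senses in synsets_by_word.items():
        if word in seeds:
            continue
        # Only the first `max_senses` meanings are inspected.
        for synset in senses[:max_senses]:
            if seeds & set(synset):
                added.add(word)
                break
    return added


# Hypothetical toy data, NOT real WordNet content.
TOY_SYNSETS = {
    "superb": [["superb", "excellent"], ["superb", "brilliant"]],
    "dull": [["dull", "boring"], ["dull", "blunt"]],
    "table": [["table", "tabular_array"], ["table", "furniture"]],
    "fine": [["fine", "thin"], ["fine", "penalty"], ["fine", "excellent"]],
}

expanded = expand_opinion_words({"excellent", "boring"}, TOY_SYNSETS)
```

In this toy data, "fine" is not added because the seed only appears in its third sense, which is outside the two-sense window described above.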
Mining Explicit F-O Pairs
In a sentence, use the keyword list to find all feature/opinion words.
Use the dependency grammar graph to detect the path between each feature word and each opinion word.
Parser: Stanford Parser (http://www-nlp.stanford.edu/software/lex-parser.shtml)
Mining Explicit F-O Pairs II
Example: “This movie is a masterpiece.”
Path: “movie (NN) – nsubj – is (VBZ) – dobj – masterpiece (NN)”
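Finding the path between a feature word and an opinion word can be sketched as a breadth-first search over the dependency graph, treated as undirected. The edge list below is written by hand to mirror the example sentence rather than produced by the Stanford Parser; the `det` edges and the function name `dependency_path` are assumptions, and each word is assumed to occur only once in the sentence.

```python
from collections import deque


def dependency_path(edges, source, target):
    """Shortest path between two words as [word, relation, word, ...].

    edges: iterable of (head, relation, dependent) triples.
    Returns None when the words are not connected.
    """
    graph = {}
    for head, rel, dep in edges:
        # Store both directions so the search can move up or down the tree.
        graph.setdefault(head, []).append((rel, dep))
        graph.setdefault(dep, []).append((rel, head))
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]  # paths alternate word, relation, word
        if node == target:
            return path
        for rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [rel, nxt])
    return None


# Hand-written edges for "This movie is a masterpiece." (illustrative only).
EDGES = [
    ("is", "nsubj", "movie"),
    ("is", "dobj", "masterpiece"),
    ("movie", "det", "This"),
    ("masterpiece", "det", "a"),
]

path = dependency_path(EDGES, "movie", "masterpiece")
```

Here `path` reproduces the word and relation sequence of the example path, without the part-of-speech tags.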
Mining Implicit F-O Pairs
This problem is difficult, so only two simple cases, both with opinion words present, are handled.
Very short sentences that appear at the beginning or end of a review and contain obvious opinion words. Ex: “Great!” yields “movie-great” or “film-great”.
A specific mapping from opinion word to feature word.
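The two cases above can be sketched as simple rules. Everything concrete here is an assumption for illustration: the function name `mine_implicit_pairs`, the three-token threshold for a "very short" sentence, the default feature word "movie", and the sample mapping entry "well-acted" to "acting".

```python
import re


def mine_implicit_pairs(sentences, opinion_words, opinion_to_feature,
                        max_tokens=3, default_feature="movie"):
    """Return (feature, opinion) pairs for the two simple implicit cases.

    sentences: the sentences of one review, in order.
    opinion_to_feature: hypothetical opinion-word -> feature-word mapping.
    """
    pairs = []
    last = len(sentences) - 1
    for i, sentence in enumerate(sentences):
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower())
        for tok in tokens:
            if tok in opinion_to_feature:
                # Case 2: the opinion word maps to a specific feature word.
                pairs.append((opinion_to_feature[tok], tok))
            elif (tok in opinion_words and i in (0, last)
                  and len(tokens) <= max_tokens):
                # Case 1: very short opening or closing sentence.
                pairs.append((default_feature, tok))
    return pairs


review = [
    "Great!",
    "The plot follows two brothers across the country.",
    "Overall it felt well-acted throughout.",
]
pairs = mine_implicit_pairs(review, {"great"}, {"well-acted": "acting"})
```

The threshold and the mapping table would have to be tuned or built from data in practice; the slides do not specify either.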
Summary Generation
1. Collect all the sentences that express opinions on a feature class.
2. The semantic orientation of the relevant opinion in each sentence is identified.
3. List the organized sentences as the summary.
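The three steps above can be sketched as a grouping operation, assuming each sentence has already been labeled with a feature class and a PRO/CON orientation. The function name `build_summary`, the input tuple layout, and the sample sentences are assumptions for illustration.

```python
from collections import defaultdict


def build_summary(labeled_sentences):
    """Group sentences by feature class, then by orientation (PRO/CON).

    labeled_sentences: iterable of (sentence, feature_class, polarity),
    where polarity is "PRO" or "CON".
    """
    summary = defaultdict(lambda: {"PRO": [], "CON": []})
    for sentence, feature_class, polarity in labeled_sentences:
        summary[feature_class][polarity].append(sentence)
    return dict(summary)


# Made-up sentences; the class labels ST and PAC come from Definition 1.
labeled = [
    ("The screenplay is excellent.", "ST", "PRO"),
    ("The story is predictable.", "ST", "CON"),
    ("The supporting cast is great.", "PAC", "PRO"),
]
summary = build_summary(labeled)
```

Printing `summary` class by class then gives one PRO list and one CON list per feature class.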
Data
Select 11 movies from the top 250 list of IMDB.
For each movie, the first 100 reviews are downloaded.
In total, more than 16,000 sentences and more than 260,000 words.
Four movie fans were asked to label F-O pairs and to give the classes of the feature word and the opinion word, respectively.