Great Food, Lousy Service
Topic Modeling for Sentiment Analysis in Sparse Reviews
Robin [email protected]
OpenTable.com
Short
[Figure panels: Characters, Words]
Sparse
“An unexpected combination of Left-Bank Paris and Lower Manhattan in Omaha.
Divine. Inspirational and a great value.”
• Food?
• Ambiance?
• Service?
• Noise?
Skewed
Correlations
SVM + Features, Features, Features!
• tokenize punctuation
• id, neutralize proper nouns
• strip numbers
• contraction splitting
• lower casing
• unigram (Bag of Words)
• bigram
• trigram
• mixed n-grams
• ignore stop words
• stemming
• negation processing
• expanded negation processing
• large training set size
• varying dictionary size
• SVM scaling

• "white list" (only use sentiment words)
• remove stop words
• POS tagging, ADJ only
• POS tagging, add ADV
• Brill tagger
• sentiment "white list" (Harvard lexicon)
• count of sentiment words (pos/neg)
• balanced training set
• binary accuracy
• sub-topic classifiers, hand list
• WordNet topic list expansion
• topic-filtered n-grams
• topic-word proximity filtering
• strict entropy modeling
• frequency-weighted entropy modeling

Summary
• 30+ preprocessing and SVM classification features
• ~50 configurations
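Several of the preprocessing steps listed above (lower casing, stripping numbers, ignoring stop words, mixed n-grams) can be combined into one bag-of-words feature extractor. The following is a minimal sketch only, not the pipeline used in the talk: the stop-word list, the function names, and the choice to stop at bigrams are all assumptions made for illustration, and the SVM itself is left out.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; not the list used in the talk.
STOP_WORDS = {"a", "an", "the", "and", "of", "in", "was", "our"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (numbers and punctuation drop out)."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(text, max_n=2, drop_stop_words=True):
    """Count mixed n-grams (unigrams up to max_n-grams) for one review."""
    tokens = tokenize(text)
    if drop_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    feats = Counter()
    for n in range(1, max_n + 1):
        feats.update(ngrams(tokens, n))
    return feats

features = extract_features("Divine. Inspirational and a great value.")
```

Each review becomes a sparse count vector keyed by n-gram; with reviews this short, most vectors hold only a handful of non-zero entries, which is the sparsity problem the talk is about.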
Key Features
• Stemming
  • Porter 1980 via NLTK
  • <fast>, <faster>, <fastest> → <fast>
• Negation processing
  • (enhanced approach from Pang et al. 2002)
  • “Not a great experience.” → NOT_great
  • “They never disappoint!” → NOT_disappoint
• Net sentiment count
  • pos/neg lexicon (Harvard General Inquirer)
  • running +/- count
  • “Incredible(+) food, but our server was rude(-).” → (0)
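Negation processing and the net sentiment count can be sketched in a few lines. This is an illustration under stated assumptions, not the enhanced approach from the talk: the negation marker follows the basic Pang et al. idea of prefixing NOT_ to every word between a negator and the next punctuation mark, the tiny lexicons stand in for the Harvard General Inquirer, and all names are hypothetical. How negated words should affect the running count is not stated in the slides, so here they are simply not counted.

```python
import re

# Hypothetical miniature lexicons; the talk used the Harvard General Inquirer.
POSITIVE = {"incredible", "great", "divine", "fantastic"}
NEGATIVE = {"rude", "lousy", "worst", "mediocre"}
NEGATORS = {"not", "no", "never"}
PUNCTUATION = set(".,!?;:")

def tokenize(text):
    """Lowercase and split into word tokens and single punctuation tokens."""
    return re.findall(r"[a-z']+|[.,!?;:]", text.lower())

def mark_negation(tokens):
    """Prefix NOT_ to each word between a negator and the next punctuation mark."""
    out, negating = [], False
    for tok in tokens:
        if tok in PUNCTUATION:
            negating = False          # punctuation closes the negation scope
            out.append(tok)
        elif tok in NEGATORS or tok.endswith("n't"):
            negating = True           # open a negation scope
            out.append(tok)
        else:
            out.append("NOT_" + tok if negating else tok)
    return out

def net_sentiment(tokens):
    """Running +/- count over lexicon hits; NOT_-prefixed tokens match neither list."""
    score = 0
    for tok in tokens:
        if tok in POSITIVE:
            score += 1
        elif tok in NEGATIVE:
            score -= 1
    return score

negated = mark_negation(tokenize("They never disappoint!"))
mixed = net_sentiment(tokenize("Incredible food, but our server was rude."))
```

With these definitions, "Not a great experience." also yields NOT_a and NOT_experience; the slide shows only NOT_great, presumably because other filters (stop words, the sentiment white list) are applied as well.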
Results (so far)
• Trained on 10,000 reviews
• Tested on ~80,000 reviews
• Accuracy
  • Baseline: 50.0%
  • Intermediate model: 56.6% (1.13x)
• abs( average scoring delta ): 0.56
Topic Modeling
Hand-seeded topic-word list expanded via WordNet SynSets
1. sub-topic classifiers
2. topic-filtered n-grams
  • <soup_FOOD was fantastic_ADJ>
  • <fantastic_ADJ soup_FOOD was>
3. topic-word proximity filtering
  • both above → <fantastic_ADJ/FOOD>
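The difference between topic-filtered n-grams and topic-word proximity filtering can be made concrete with a small sketch. It assumes the tokens are already POS-tagged and that a topic-word list is available; the three-entry list, the window size, and the function names below are hypothetical, and the WordNet expansion step is omitted.

```python
# Hypothetical hand-seeded topic-word list; the talk expanded its list via WordNet.
TOPIC_WORDS = {"soup": "FOOD", "server": "SERVICE", "music": "NOISE"}

def topic_filtered_ngrams(tagged, n=3):
    """Keep only n-grams that contain both a topic word and an adjective."""
    kept = []
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        has_topic = any(word in TOPIC_WORDS for word, _ in window)
        has_adj = any(pos == "ADJ" for _, pos in window)
        if has_topic and has_adj:
            kept.append(" ".join(word for word, _ in window))
    return kept

def proximity_features(tagged, window=2):
    """Emit adjective/TOPIC for each adjective within `window` tokens of a topic word."""
    feats = []
    for i, (word, pos) in enumerate(tagged):
        if pos != "ADJ":
            continue
        lo, hi = max(0, i - window), min(len(tagged), i + window + 1)
        for neighbor, _ in tagged[lo:hi]:
            topic = TOPIC_WORDS.get(neighbor)
            if topic:
                feats.append(f"{word}/{topic}")
    return feats

# The two word orders from the slide, tagged by hand.
first = [("soup", "NOUN"), ("was", "VERB"), ("fantastic", "ADJ")]
second = [("fantastic", "ADJ"), ("soup", "NOUN"), ("was", "VERB")]
```

The n-gram filter produces two different features for the two word orders ("soup was fantastic" and "fantastic soup was"), while the proximity filter maps both to the single feature fantastic/FOOD, which reduces sparsity.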
Results:
                                      Food     Ambiance   Service   Noise
1. sub-topic classifiers              39.15%   47.26%     53.70%    48.43%
3. topic-word proximity filtering     40.05%   47.88%     54.92%    50.35%
   improvement (3. over 1.)           1.02x    1.01x      1.02x     1.03x
Word-Rating Distributions
[Figure panels: “worst”, “mediocre”, “decent”, “solid”, “exceeded”]
Frequency-Weighted Entropy Model
• Accuracy
  • Baseline: 50.0%
  • Intermediate model: 56.6%
  • Best (entropy) model: 58.6% (1.17x)
• abs( average scoring delta ): 0.56 → 0.52
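The slides do not spell out how the frequency-weighted entropy model is computed, so the following is only one plausible reading, offered as a sketch. It assumes each word has a distribution over star ratings (as in the word-rating plots), scores a word by how far its rating entropy falls below the maximum, and weights that by log frequency. The weighting formula, the five-point rating scale, the toy reviews, and all function names are assumptions, not the author's method.

```python
import math
from collections import Counter, defaultdict

def word_rating_counts(reviews):
    """Map each word to a Counter of the ratings of the reviews it appears in."""
    counts = defaultdict(Counter)
    for text, rating in reviews:
        for word in set(text.lower().split()):
            counts[word][rating] += 1
    return counts

def entropy(rating_counts):
    """Shannon entropy (in bits) of a word's rating distribution."""
    total = sum(rating_counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in rating_counts.values() if c)

def weighted_score(rating_counts, n_ratings=5):
    """Assumed weighting: (1 - normalized entropy) * log(1 + frequency)."""
    total = sum(rating_counts.values())
    informativeness = 1.0 - entropy(rating_counts) / math.log2(n_ratings)
    return informativeness * math.log(1 + total)

# Hypothetical toy reviews as (text, star rating) pairs, for illustration only.
reviews = [
    ("worst meal ever", 1),
    ("worst service", 1),
    ("decent food", 3),
    ("decent value", 4),
    ("decent service", 2),
]
counts = word_rating_counts(reviews)
```

In this toy data "worst" appears only in 1-star reviews, so its entropy is zero and its score is high, while "decent" is spread over three ratings and scores lower. A word like this with a peaked rating distribution is the kind of feature such a model would favor.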