Wine Informatics Dr. Bernard Chen Ph.D. University of Central Arkansas
Jan 11, 2016
Data science
Data science is the study that combines techniques and theories from distinct fields, such as data mining, scientific methods, math and statistics, visualization, natural language processing, and domain knowledge, to discover useful information from domain-related data.
Domain Knowledge in Wine
The quality of a wine is usually assured by wine certification, which is generally assessed by physicochemical and sensory tests.
Existing data mining research focuses on the physicochemical laboratory tests much more than on the sensory tests.
Domain Knowledge in Wine
It is very interesting to mine useful information from those sensory testing notes to answer questions such as:
“What makes a wine a 90+ one?”
“What are the common characteristics shared by 90+ Napa Cabernet Sauvignons?”
“Which groups of wines share similarities?”
“What characteristics differentiate wines from France and Italy?”
Domain Knowledge in Wine
The success of wine sensory data science research relies on consistent reviews from prestigious experts.
Several popular wine magazines provide widely accepted sensory reviews of the wines produced every year, such as Wine Spectator [13], Wine Advocate [14], and Decanter [15].
Wine Spectator Review Example
Kosta Browne Pinot Noir Sonoma Coast 2009
Ripe and deeply flavored, concentrated and well-structured, this full-bodied red offers a complex mix of black cherry, wild berry and raspberry fruit that's pure and persistent, ending with a pebbly note and firm tannins. Drink now through 2018. 5,818 cases made.
Wine Spectator
Our first dataset is compiled from the list of “Top 100 Wines of 2011” [16] by Wine Spectator, a lifestyle magazine that focuses on wine and wine culture.
Their reviews are straight and to the point.
Ann C. Noble’s Wine Aroma Wheel
Our own wine wheel
Based on the “Top 100 Wines of 2011” list, we analyzed all one hundred wine reviews and added all necessary categories and subcategories, arriving at a total of 547 distinct attributes.
When looking at our finished list, we noticed many cases where groups of attributes were really just variations of the same thing.
An example would be the following three attributes: FRESHLY-CUT APPLE, RIPE APPLE, and APPLE.
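One way such variants might be collapsed into a single attribute is a lookup table from raw phrases to a normalized name. The sketch below is only an illustration: the three APPLE variants come from the slide, while the `NORMALIZED` table and the `normalize` helper are hypothetical names, not the authors' actual implementation.

```python
# Hypothetical lookup table: raw review phrase -> normalized attribute.
# Only the three APPLE variants are taken from the slide.
NORMALIZED = {
    "FRESHLY-CUT APPLE": "APPLE",
    "RIPE APPLE": "APPLE",
    "APPLE": "APPLE",
}


def normalize(attributes):
    """Return the set of normalized attributes; unknown phrases are kept as-is (upper-cased)."""
    return {NORMALIZED.get(a.upper(), a.upper()) for a in attributes}


merged = normalize(["Freshly-cut apple", "ripe apple", "black cherry"])
print(sorted(merged))
```

With a table like this, a review that mentions any of the three variants contributes the same attribute, so the variants no longer count as separate columns.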
Hierarchical Clustering
Figure panels: Dendrogram; Venn Diagram of Clustered Data
From http://www.stat.unc.edu/postscript/papers/marron/Stat321FDA/RimaIzempresentation.ppt
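As a rough illustration of the bottom-up merging that a dendrogram records, here is a minimal agglomerative clustering sketch. The slides do not say which linkage rule was used, so average linkage is an assumption, and the function name `agglomerate` and the toy numeric points are hypothetical.

```python
from itertools import combinations


def agglomerate(points, distance, n_clusters):
    """Repeatedly merge the two closest clusters (average linkage, an assumption)
    until n_clusters remain. Returns the clusters and the merge history."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            d = sum(distance(points[i], points[j]) for i, j in pairs) / len(pairs)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))  # one dendrogram join
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe: a < b, so index a is unaffected
    return clusters, merges


# Toy one-dimensional data, not wine data.
toy_points = [0, 1, 10, 12]
clusters, merges = agglomerate(toy_points, lambda x, y: abs(x - y), n_clusters=2)
print(clusters)
```

Each entry in `merges` corresponds to one join in a dendrogram, with the merge distance giving the height of the join.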
Distance Measure
Distance Measure Example
WINE    CHERRY   CHEWY TANNINS   BEAUTY
WINE1   1        1               1
WINE2   0        0               1
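The formula on the Distance Measure slide did not survive in this text, so the sketch below assumes the Jaccard distance, a common choice for 0/1 attribute vectors; it may not be the measure the authors used. The function name `jaccard_distance` is hypothetical, and the two vectors are the WINE1 and WINE2 rows from the example.

```python
def jaccard_distance(a, b):
    """1 - |A and B| / |A or B| for two 0/1 attribute vectors of equal length."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two wines with no attributes at all are treated as identical.
    return 0.0 if either == 0 else 1 - both / either


# Columns: CHERRY, CHEWY TANNINS, BEAUTY
wine1 = [1, 1, 1]
wine2 = [0, 0, 1]
print(jaccard_distance(wine1, wine2))
```

Under this assumption the two wines share one attribute (BEAUTY) out of three present in either review, giving a distance of 1 - 1/3, about 0.67.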
Clustering Results
Clustering Results
[Figure: clusters numbered 1 through 11]
Clustering Results
Ref#  Vintage  Type  Varietal
1     2008     RED   MERLOT (.53) - CABERNET FRANC (.29) - CABERNET SAUVIGNON (.13) - MALBEC (.04) - PETIT VERDOT (.01)
2     2008     RED   CABERNET SAUVIGNON
3     2009     RED   PINOT NOIR
4     2007     RED   CABERNET SAUVIGNON
5     2007     RED   SANGIOVESE (.90) - CANAIOLO/COLORINO (.10)
6     2004     RED   TEMPRANILLO
Ref#  World  Country        Region           Alcohol  Price  Drink Begin  Drink End
1     NEW    United States  Washington                $35    NOW          2020
2     NEW    United States  Washington       14.5%    $37    NOW          2018
3     NEW    United States  California       13.9%    $45    NOW          2019
4     NEW    United States  Washington       14.6%    $32    NOW          2019
5     OLD    Italy          Tuscany          14%      $22    NOW          2022
6     OLD    Spain          Castilla y Leon           $15    NOW          2015
Clustering Results
CLUSTER #3 – 6 Instances – Attribute Information
Attribute         Number of Wines  Attribute Weight
BLACKBERRY        6                3
LONG FINISH       5                2
SPICE             4                3
FRUIT             3                1
BLACK CHERRY      3                3
RED               3                2
FOCUSED           3                1
EXCELLENT FINISH  3                2
RIPE              3                1
TANNINS_LOW       3                2
TANNINS_HIGH      3                2
Suggestions
This cluster represents the fruity aspect of new-world wines, focusing on powerful notes of blackberry and black cherry, as well as a commanding finish.
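A "Number of Wines" column like the one in the cluster summary could be produced by counting, for each attribute, how many wines in the cluster mention it. The sketch below shows only that counting step; the helper name `attribute_counts` and the three toy wines are hypothetical, and the Attribute Weight column is not reproduced because the slides do not explain how it is derived.

```python
from collections import Counter


def attribute_counts(cluster):
    """Count how many wines in the cluster mention each attribute."""
    counts = Counter()
    for attributes in cluster:
        counts.update(set(attributes))  # count each attribute once per wine
    return counts.most_common()


# Hypothetical three-wine cluster, not the six wines from the slides.
toy_cluster = [
    {"BLACKBERRY", "SPICE"},
    {"BLACKBERRY", "LONG FINISH"},
    {"BLACKBERRY", "SPICE", "LONG FINISH"},
]
for attribute, n_wines in attribute_counts(toy_cluster):
    print(attribute, n_wines)
```

Sorting by count puts the attributes shared by the most wines first, which is the order used in the cluster table.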
Conclusion
In this paper, we discuss wine reviews and how their attributes can play an integral role in grouping different wines together.
We show that, using only the attributes of a wine review, we can group together wines that have similar world region, price, vintage, type, and varietal.
Thanks
Questions?