Article

Uniting the Tribes: Using Text for Marketing Insight

Jonah Berger, Ashlee Humphreys, Stephan Ludwig, Wendy W. Moe, Oded Netzer, and David A. Schweidel

Abstract
Words are part of almost every marketplace interaction. Online reviews, customer service calls, press releases, marketing communications, and other interactions create a wealth of textual data. But how can marketers best use such data? This article provides an overview of automated textual analysis and details how it can be used to generate marketing insights. The authors discuss how text reflects qualities of the text producer (and the context in which the text was produced) and impacts the audience or text recipient. Next, they discuss how text can be a powerful tool both for prediction and for understanding (i.e., insights). Then, the authors overview methodologies and metrics used in text analysis, providing a set of guidelines and procedures. Finally, they further highlight some common metrics and challenges and discuss how researchers can address issues of internal and external validity. They conclude with a discussion of potential areas for future work. Along the way, the authors note how textual analysis can unite the tribes of marketing. While most marketing problems are interdisciplinary, the field is often fragmented. By involving skills and ideas from each of the subareas of marketing, text analysis has the potential to help unite the field with a common set of tools and approaches.

Keywords
computational linguistics, machine learning, marketing insight, interdisciplinary, natural language processing, text analysis, text mining

Online supplement: https://doi.org/10.1177/0022242919873106

Journal of Marketing, 1-25. © American Marketing Association 2019. DOI: 10.1177/0022242919873106. Article reuse guidelines: sagepub.com/journals-permissions. journals.sagepub.com/home/jmx

Author note: Jonah Berger is Associate Professor of Marketing, Wharton School, University of Pennsylvania, USA (email: [email protected]). Ashlee Humphreys is Associate Professor, Medill School of Journalism, Media, and Integrated Marketing Communications, Northwestern University, USA (email: [email protected]). Stephan Ludwig is Associate Professor of Marketing, University of Melbourne, Australia (email: [email protected]). Wendy W. Moe is Associate Dean of Master's Programs, Dean's Professor of Marketing, and Co-Director of the Smith Analytics Consortium, University of Maryland, USA (email: [email protected]). Oded Netzer is Professor of Business, Columbia Business School, Columbia University, USA (email: [email protected]). David A. Schweidel is Professor of Marketing, Goizueta Business School, Emory University, USA (email: [email protected]).

[1] Computer-aided approaches to text analysis in marketing research are generally interchangeably referred to as computer-aided text analysis (Pollach 2012), text mining (Netzer et al. 2012), automated text analysis (Humphreys and Wang 2017), or computer-aided content analysis (Dowling and Kabanoff 1996).

The digitization of information has made a wealth of textual data readily available. Consumers write online reviews, answer open-ended survey questions, and call customer service representatives (the content of which can be transcribed). Firms write ads, email frequently, publish annual reports, and issue press releases. Newspapers contain articles, movies have scripts, and songs have lyrics. By some estimates, 80%–95% of all business data is unstructured, and most of that unstructured data is text (Gandomi and Haider 2015). Such data has the potential to shed light on consumer, firm, and market behavior, as well as society more generally.

But, by itself, all this data is just that—data. For data to be useful, researchers must be able to extract underlying insight—to measure, track, understand, and interpret the causes and consequences of marketplace behavior. This is where the value of automated textual analysis comes in. Automated textual analysis[1] is a computer-assisted methodology that allows researchers to rid themselves of measurement straitjackets, such as scales and scripted questions, and to quantify the information contained in textual data as it naturally occurs. Given these benefits, the question is no longer whether to use automated text analysis but how these tools can best be used to answer a range of interesting questions.

This article provides an overview of the use of automated text analysis for marketing insight. Methodologically, text analysis approaches can describe "what" is being said and "how" it is said, using both qualitative and quantitative inquiries with various degrees of human involvement. These
tics, economics, organizational behavior) and use different tools,
making it increasingly difficult to have a common conversation.
However, text analysis can unite the tribes. Not only does it
involve skills and ideas from each of these areas, but doing it
well requires such integration: it borrows ideas, concepts,
approaches, and methods from each tribe and combines them
to achieve insight. In so doing, the approach also adds value to
each of the tribes in ways that might not otherwise be possible.
We start by discussing two distinctions that are useful when
thinking about how text can be used: (1) whether text reflects or
impacts (i.e., says something about the producer or has a down-
stream impact on something else) and (2) whether text is used
for prediction or understanding (i.e., predicting something or
understanding what caused something). Next, we explain how
text may be used to unite the tribes of marketing. Then we
provide an overview of text analysis tools and methodology
and discuss key questions and measures of validity. Finally,
we close with a future research agenda.
The Universe of Text
Communication is an integral part of marketing. Not only do
firms communicate with customers, but customers communi-
cate with firms and one another. Moreover, firms communicate
with investors and society communicates ideas and values to
the public (through newspapers and movies). These communi-
cations generate text or can be transcribed into text.
A simple way to organize the world of textual data is to
think about producers and receivers—the person or organi-
zation that creates the text and the person or organization
who consumes the text (Table 1). While there are certainly
other parties that could be listed, some of the main
producers and receivers are consumers, firms, investors, and
society at large. Consumers write online reviews that are
read by other consumers, firms create annual reports that
are read by investors, and cultural producers represent soci-
etal meanings through the creation of books, movies, and
other digital or physical artifacts that are consumed by indi-
viduals or organizations.
Consistent with this distinction between text producer and
text receiver, researchers may choose to study how text reflects
or impacts. Specifically, text reflects information about, and
thus can be used to gain insight into, the text producer;
alternatively, researchers can study how text impacts the text receiver.
Text as a Reflection of the Producer
Text reflects and indicates something about the text producer
(i.e., the person, organization, or context that created it). Cus-
tomers, firms, and organizations use language to express them-
selves or achieve desired goals, and as a result, text signals
information about the actors, organization, or society that cre-
ated it and the contexts in which it was created. Like pottery
shards that allow an anthropologist to learn about a distant
civilization, text provides a window into its producers.
Take, for example, a social media post in which someone
talks about what they did that weekend. The text that person
produces provides insight into several facets. First, it provides
insight into the individual themselves. Are they introverted or
extraverted? Neurotic or conscientious? It sheds light on who
they are in general (i.e., stable traits or customer segments;
Moon and Kamakura 2017) as well as how they may be feeling
or what they may be thinking at the moment (i.e., states). In a
sense, language can be viewed as a fingerprint or signature
(Pennebaker 2011). Just like brush strokes or painting style can
be used to determine who painted a particular painting,
researchers use words and linguistic style to infer whether a
play was written by Shakespeare, or if a person is depressed
(Rude, Gortner, and Pennebaker 2004) or being deceitful
(Ludwig et al. 2016). The same is true for groups, organiza-
tions, or institutions. Language reflects something about who
they are and thus provides insight into what they might do in
the future.
Second, text can provide insight into a person’s attitudes
toward or relationships with other attitude objects—whether
that person liked a movie or hated a hotel stay, for example,
or whether they are friends or enemies with someone. Lan-
guage used in loan applications provides insight into whether
people will default (Netzer, Lemaire, and Herzenstein 2019),
language used in reviews can provide insight into whether they
are fake (Anderson and Simester 2014; Hancock et al. 2007;
Table 1. Text Producers and Receivers.
(Rows are text producers; columns are text receivers: consumers, firms, investors, and institutions/society.)

Producer: Consumers; Receiver: Consumers
- Online reviews (Anderson and Simester 2014; Chen and Lurie 2013; Fazio and Rockledge 2015a; Kronrod and Danziger 2013a; Lee and Bradlow 2011; Liu, Lee, and Srinivasan 2019a; Melumad, Inman, and Pham 2019; Moon and Kamakura 2017; Puranam, Narayan, and Kadiyali 2017)
- Social media (Hamilton, Schlosser, and Chen 2017a; Netzer et al. 2012; Villarroel Ordenes et al. 2017)
- Offline word of mouth (Berger and Schwartz 2011a; Mehl and Pennebaker 2003a)

Producer: Consumers; Receiver: Firms
- Forms and applications (Netzer, Lemaire, and Herzenstein 2019)
- Idea-generation contexts (Bayus 2013a; Toubia and Netzer 2017)
- Social media/brand
Data acquisition. Data acquisition can be well defined if the
researcher is provided with a set of documents (e.g., emails,
quarterly reports, a data set of product reviews) or more open-
ended if the researcher is using a web scraper (e.g., Beautiful
Soup) that searches the web for instances of a particular topic
or a specific product. When scraping text from public sources,
researchers should abide by the legal guidelines for using the
data for academic or commercial purposes.
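Scraping itself depends on the site, but once pages are downloaded, pulling the text out is mechanical. The following minimal sketch uses only Python's standard-library HTMLParser (a library such as Beautiful Soup offers a far more convenient API); the HTML snippet and the "review-body" class name are hypothetical.

```python
from html.parser import HTMLParser

# Extract visible review text from (already downloaded) HTML,
# skipping non-review content such as ads.
class ReviewExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_review = False
        self.reviews = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "div" and ("class", "review-body") in attrs:
            self.in_review = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.in_review = False

    def handle_data(self, data):
        if self.in_review and data.strip():
            self.reviews.append(data.strip())

page = """<html><body>
<div class="review-body">Great battery life, screen could be brighter.</div>
<div class="ad">Buy now!</div>
<div class="review-body">Stopped working after a week.</div>
</body></html>"""

parser = ReviewExtractor()
parser.feed(page)
print(parser.reviews)
```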
Tokenization. Tokenization is the process of breaking the text into
units (often words and sentences). When tokenizing, the
researcher needs to determine the delimiters that define a token
(space, period, semicolon, etc.). If, for example, a space or a period
is used to delimit words, the result may include nonsensical
tokens: "the U.S." may be broken into the tokens "the,"
"U," and "S." Most text-mining software has smart tokenization
procedures to alleviate such common problems, but the
researcher should pay close attention to instances that are spe-
cific to the textual corpora. For cases that include paragraphs or
threads, depending on the research objective, the researcher
may wish to tokenize these larger units of text as well.
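The abbreviation pitfall above can be illustrated with a short sketch using Python regular expressions; the patterns are illustrative, not a production tokenizer.

```python
import re

text = "Sales in the U.S. grew. Analysts weren't surprised."

# Naive approach: split on spaces and periods — shreds "U.S."
naive = re.split(r"[ .]+", text.strip(" ."))
print(naive)

# Smarter approach: keep dotted abbreviations and word-internal apostrophes
smarter = re.findall(r"[A-Za-z]+(?:\.[A-Za-z]+)+\.?|[A-Za-z]+(?:'[A-Za-z]+)?", text)
print(smarter)
```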
Cleaning. HTML tags and nontextual information, such as
images, are cleaned or removed from the data set. The cleaning
needs may depend on the format in which the data was provided/
extracted. Data extracted from the web often requires heavier
cleaning due to the presence of HTML tags. Depending on the
purpose of the analysis, images and other nontextual information
may be retained. Contractions such as “isn’t” and “can’t” need to
be expanded at this step. In this step, researchers should also be
mindful of and remove phrases automatically generated by com-
puters that may occur within the text (e.g., “html”).
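A minimal sketch of these cleaning steps, assuming a small, illustrative contraction list (real pipelines use far longer ones):

```python
import re

# Illustrative, not exhaustive
CONTRACTIONS = {"isn't": "is not", "can't": "cannot", "won't": "will not"}

def clean(raw):
    text = re.sub(r"<[^>]+>", " ", raw)        # strip HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # strip URLs
    for short, full in CONTRACTIONS.items():   # expand contractions
        text = re.sub(re.escape(short), full, text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace

raw = "<p>This blender <b>isn't</b> loud. Details: https://example.com/x</p>"
print(clean(raw))
```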
Removing stop words. Stop words are common words such as “a”
and “the” that appear in most documents but often provide no
significant meaning. Common text-mining tools (e.g., the tm,
quanteda, tidytext, and tokenizers package in R; the Natural
Language Toolkit package in Python; exclusion words in
WordStat) have a predefined list of such stop words that can
be amended by the researcher. It is advisable to add common
words that are specific to the domain (e.g., “Amazon” in a
corpora of Amazon reviews) to this list. Depending on the
research objective, stop words can sometimes be very mean-
ingful, and researchers may wish to retain them for their anal-
ysis. For example, if the researcher is interested in extracting
not only the content of the text but also writing style (e.g.,
Packard, Moore, and McFerran 2018), stop words can be very
informative (Pennebaker 2011).
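The logic above can be sketched as follows; the stop-word and domain lists here are tiny, illustrative stand-ins for the lists shipped with tm, quanteda, or NLTK, and the keep_style_words flag reflects the point that function words are signal rather than noise in writing-style analyses.

```python
STOP_WORDS = {"a", "an", "the", "is", "it", "of", "and", "to"}
DOMAIN_STOP_WORDS = {"amazon"}  # domain-specific addition, per the text above

def remove_stop_words(tokens, keep_style_words=False):
    # For writing-style analyses, retain function words
    if keep_style_words:
        return list(tokens)
    drop = STOP_WORDS | DOMAIN_STOP_WORDS
    return [t for t in tokens if t.lower() not in drop]

tokens = ["The", "battery", "of", "the", "Amazon", "tablet", "is", "weak"]
print(remove_stop_words(tokens))
```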
Spelling. Most text-mining packages have prepackaged spellers
that can help correct spelling mistakes (e.g., the Enchant spel-
ler). In using these spellers, the researcher should be aware of
language that is specific to the domain and may not appear in
the speller—or even worse, that the speller may incorrectly
“fix.” Moreover, for some analyses the researcher may want
to record the number of spelling mistakes as an additional textual measure.
Table 2. The Text Analysis Workflow.

Data preprocessing:
- Data acquisition: obtain or download (often in an HTML format) text.
- Tokenization: break text into units (often words and sentences) using delimiters (e.g., periods).
- Cleaning: remove nonmeaningful text (e.g., HTML tags) and nontextual information.
- Removing stop words: eliminate common words such as "a" or "the" that appear in most documents.
- Spelling: correct spelling mistakes using common spellers.
- Stemming and lemmatization: reduce words into their common stem or lemma.

Common tools:
- Entity extraction: tools used to extract the meaning of one word at a time or simple cooccurrence of words. These tools include dictionaries; part-of-speech classifiers; many sentiment analysis tools; and, for complex entities, machine learning tools.
- Topic modeling: topic modeling can identify the general topics (described as a combination of words) that are discussed in a body of text. Common tools include LDA and PF.
- Relation extraction: going beyond entity extraction, the researcher may be interested in identifying textual relationships among extracted entities. Relation extraction often requires the use of supervised machine learning approaches.

Measurement:
- Count measures: the set of measures used to represent the text as count measures. The tf-idf measure allows the researcher to control for the popularity of the word and the length of the document.
- Similarity measures: cosine similarity and the Jaccard index are often used to measure the similarity of the text between documents.
- Accuracy measures: often used relative to human-coded or externally validated documents. The measures of recall, precision, F1, and the area under the receiver operating characteristic curve are often used.
- Readability measures: measures such as the simple measure of gobbledygook (SMOG) are used to assess the readability level of the text.

Validity:
- Internal validity:
  - Construct: dictionary validation and sampling-and-saturation procedures ensure that constructs are correctly operationalized in text.
  - Concurrent: compare operationalizations with prior literature.
  - Convergent: multiple operationalizations of key constructs.
  - Causal: control for factors related to alternative hypotheses.
- External validity:
  - Predictive: Use conclusions to
tions). In more formal text, capitalization can be used to help
Table 3. Data Preprocessing Steps.

Data acquisition
- Issues to consider: Is the data readily available in textual format, or does the researcher need to use a web scraper to find the data? What are the legal guidelines for using the data (particularly relevant for web-scraped data)?
- Illustration: Tweets mentioning different brands from the same category during a particular time frame are downloaded from Twitter.

Tokenization
- Issues to consider: What is the unit of analysis (word, sentence, thread, paragraph)? Use smart tokenization for delimiters and adjust to specific unique delimiters found in the corpora.
- Illustration: The unit of analysis is the individual tweet. The words in the tweet are the tokens of the document.

Cleaning
- Issues to consider: Web-scraped data often requires cleaning of HTML tags and other symbols. Depending on the research objective, certain textual features (e.g., advertising on the page) may or may not be cleaned. Expansion of contractions such as "isn't" to "is not."
- Illustration: URLs are removed and emojis/emoticons are converted to words.

Removing stop words
- Issues to consider: Use a stop word list available in the text-mining software, but adapt it to the specific application by adding/removing relevant stop words. If the goal of the analysis is to extract writing style, it is advisable to keep all/some of the stop words.
- Illustration: Common words are removed. The remaining text contains brand names, nouns, verbs, adjectives, and adverbs.

Spelling
- Issues to consider: Can use commonly used spellers in text-mining packages (e.g., the Enchant speller). Language that is specific to the domain may be erroneously coded as a spelling mistake. May wish to record the number of spelling mistakes as an additional textual measure.
- Illustration: Spelling mistakes are removed, enabling analysis into consumer perceptions (manifest through word choice) of different brands.

Stemming and lemmatization
- Issues to consider: Can use commonly used stemmers in text-mining packages (e.g., Porter stemmer). If the goal of the analysis is to extract writing style, stemming can mask the tense used.
- Illustration: Verbs and nouns are "standardized" by reducing to their stem or lemma.
Table 4. Taxonomy of Text Analysis Tools.
(Columns: Approach; Common Tools; Research Questions; Benefits; Limitations and Complexities; Marketing Examples.)

Entity (word) extraction: extracting and identifying a single word/n-gram
- Common tools: named entity extraction (NER) tools (e.g., Stanford NER); dictionaries and lexicons (e.g., LIWC, EL 2.0, SentiStrength, VADER); rule-based classification; linguistic-based NLP tools; machine learning classification tools (conditional random fields, hidden Markov models, deep learning).
- Research questions: brand buzz monitoring; predictive models where text is an input; extracting psychological states and traits; sentiment analysis; consumer and market trends; product recommendations.
- Benefits: can extract a large number of entities; can uncover known entities (people, brands, locations); can be combined with dictionaries to extract sentiment or linguistic styles; relatively simple to use.
- Limitations and complexities: can be unwieldy due to the large number of entities extracted; some entities have multiple meanings that are difficult to extract (e.g., the laundry detergent brand "All"); slang and abbreviations make entity extraction more difficult in social media; machine learning tools may require large human-coded training data; can be limited for sentiment analysis.
- Marketing examples: Lee and Bradlow (2011); Berger and Milkman (2012); Ghose et al. (2012)a; Tirunillai and Tellis (2012); Humphreys and Thompson (2014)a; Berger, Moe, and Schweidel (2019); Packard, Moore, and McFerran (2018).

Topic extraction: extracting the topic discussed in the text
- Common tools: LSA; LDA; PF; LDA2vec word embedding.
- Research questions: summarizing the discussion; identifying consumer and market trends; identifying customer needs.
- Benefits: topics often provide useful summarization of the data; data reduction permits the use of traditional statistical methods in subsequent analysis; easy to assess dynamics.
- Limitations and complexities: the interpretation of the topics can be challenging; no clear guidance on the selection of the number of topics; can be difficult with short text (e.g., tweets).
- Marketing examples: Tirunillai and Tellis (2014); Buschken and Allenby (2016); Puranam, Narayan, and Kadiyali (2017); Berger and Packard (2018); Liu and Toubia (2018); Toubia et al. (2019); Zhong and Schweidel (2019); Ansari, Li, and Yang (2018)a; Timoshenko and Hauser (2019); Liu, Singh, and Srinivasan (2016)a; Liu, Lee, and Srinivasan (2019)a.

Relation extraction: extracting and identifying relationships among words
- Common tools: co-occurrence of entities; handwritten rules; supervised machine learning; deep learning; Word2vec word embedding; Stanford Sentence and Grammatical Dependency Parser.
- Research questions: market mapping; identifying problems mentioned with specific product features; identifying sentiment for a focal entity; identifying which product attributes are mentioned positively/negatively; identifying events and consequences (e.g., crisis) from consumer- or firm-generated text; managing service relationships.
- Benefits: relaxes the bag-of-words assumption of most text-mining methods; relates the text to a particular focal entity; advances in text-mining methods will offer new opportunities in marketing.
- Limitations and complexities: accuracy of current approaches is limited; complex relationships may be difficult to extract; it is advised to develop domain-specific sentiment tools, as sentiment signals can vary from one domain to another.
- Marketing examples: Netzer et al. (2012); Toubia and Netzer (2017); Boghrati and Berger (2019).

aReference appears in the Web Appendix.
extract known entities such as brands. However, in more casual
text, such as social media, such signals are less useful. Com-
mon dictionaries include LIWC (Pennebaker et al. 2015), EL
2.0 (Rocklage, Rucker, and Nordgren 2018), Diction 5.0, or
General Inquirer for psychological states and traits (for exam-
ple applications, see Berger and Milkman [2012]; Ludwig et al.
[2013]; Netzer, Lemaire, and Herzenstein [2019]).
Sentiment dictionaries such as Hedonometer (Dodds et al.
2011), VADER (Hutto and Gilbert 2014), and LIWC can be used
to extract the sentiment of the text. One of the major limitations of
the lexical approaches for sentiment analysis commonly used in
marketing is that they apply a “bag of words” approach—meaning
that word order does not matter—and rely solely on the cooccur-
rence of a word of interest (e.g., “brand”) with positive or negative
words (e.g., “great,” “bad”) in the same textual unit (e.g., a
review). While dictionary approaches offer an easy way to
measure constructs and to ensure comparability across data sets, machine
learning approaches trained on human-coded data (e.g., Borah and
Tellis 2016; Hartmann et al. 2018; Hennig-Thurau, Wiertz, and
Feldhaus 2015) tend to be the most accurate way of measuring
such constructs (Hartmann et al. 2019), particularly if the construct
is complex or the domain is uncommon. For this reason, research-
ers should carefully weigh the trade-off between empirical fit and
theoretical commensurability, taking care to validate any diction-
aries used in the analysis (discussed in the next section).
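A minimal dictionary-based, bag-of-words sentiment scorer illustrates both the approach and its limitation; the positive and negative word lists are tiny, hypothetical stand-ins for lexicons such as LIWC or VADER.

```python
# Bag-of-words scoring: word order is ignored, so negations such as
# "not great" are scored as positive — the limitation discussed above.
POSITIVE = {"great", "love", "excellent", "good"}
NEGATIVE = {"bad", "terrible", "hate", "poor"}

def sentiment_score(text):
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / max(len(tokens), 1)

print(sentiment_score("great camera terrible battery"))  # mixed review
print(sentiment_score("not great"))  # negation missed: scored positive
```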
A specific type of entity extraction includes linguistic-type
entities such as part-of-speech tagging, which assigns a linguis-
tic tag (e.g., verb, noun, adjective) to each entity. Most text
analysis tools (e.g., the tm package in R, the Natural Language
Toolkit package in Python) have a built-in part-of-speech tag-
ging tool. If no predefined dictionary exists, or the dictionary is
not sufficient for the extraction needed, one could add hand-
crafted rules to help define entities. However, the list of rules
can become long, and the task of identifying and writing the
rules can be tedious. If the entity extraction by dictionaries or
rules is difficult or if the entities are less defined, machine
word “laptop” is likely to appear in almost every review in
corpora that is composed of laptop reviews.
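The tf-idf weighting formalizes this point: a term that appears in every document receives an inverse document frequency of zero and thus carries no weight. A minimal sketch of one common tf-idf variant (others differ in smoothing and normalization):

```python
import math

corpus = [
    ["laptop", "fast", "screen"],
    ["laptop", "slow", "heavy"],
    ["laptop", "fast", "light"],
]

def tf_idf(term, doc, docs):
    tf = doc.count(term) / len(doc)          # term frequency in the document
    df = sum(term in d for d in docs)        # number of documents with the term
    idf = math.log(len(docs) / df)           # inverse document frequency
    return tf * idf

print(tf_idf("laptop", corpus[0], corpus))   # appears in every document
print(tf_idf("slow", corpus[1], corpus))     # distinctive term
```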
Accuracy measures. When evaluating the accuracy of text mea-
sures relative to human-coded or externally validated docu-
ments, measures of recall and precision are often used.
Recall is the proportion of entities in the original text that the
text-mining algorithm was able to successfully identify (it is
defined by the ratio of true positives to the sum of true positives
and false negatives). Precision is the proportion of correctly
identified entities from all entities identified (it is defined by
the ratio of true positives to the sum of true positives and false
positives). On their own, recall and precision measures are
difficult to assess because an improvement in one often comes
at the expense of the other. For example, if one defines that
every entity in the corpora is a brand, recall for brands will be
perfect (you will never miss a brand if it exists in the text), but
precision will be very low (there will be many false positive
identifications of a brand entity).
To balance recall and precision, one can use the F1 measure, the
harmonic mean of the two. If the researcher is more concerned with
false positives than with false negatives (or vice versa), recall and
precision can be weighted differently. Alternatively, for unbalanced
data with a high proportion of true or false cases in the population,
a receiver operating characteristic (ROC) curve can be used to reflect
the relationship between true positives and false positives, and the
area under the curve is often used as a measure of accuracy.
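These definitions can be made concrete in a few lines; the truth and prediction vectors below are hypothetical human codes versus algorithm output for the "every entity is a brand" example above.

```python
def precision_recall_f1(truth, predicted):
    tp = sum(t and p for t, p in zip(truth, predicted))          # true positives
    fp = sum((not t) and p for t, p in zip(truth, predicted))    # false positives
    fn = sum(t and (not p) for t, p in zip(truth, predicted))    # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

truth     = [1, 1, 0, 0, 1, 0]   # human codes: is the token a brand?
predicted = [1, 1, 1, 1, 1, 1]   # "everything is a brand": perfect recall,
p, r, f = precision_recall_f1(truth, predicted)  # poor precision
print(p, r, f)
```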
Similarity measures. In some cases, the researcher is interested in
measuring the similarity between documents (e.g., Ludwig
et al. 2013). How similar is the language used in two adver-
tisements? How different is a song from its genre? In such
cases, measures such as linguistic style matching, similarity
in topic use (Berger and Packard 2018), cosine similarity, and
the Jaccard index (e.g., Toubia and Netzer 2017) can be used to
assess the similarity between the text of two documents.
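Both measures are straightforward to compute from word counts; a minimal sketch on a toy two-document example:

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity over word-count vectors
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    return dot / (math.sqrt(sum(v * v for v in ca.values())) *
                  math.sqrt(sum(v * v for v in cb.values())))

def jaccard(a, b):
    # Jaccard index over word sets: shared words / all words used
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

doc1 = "the camera is great".split()
doc2 = "the camera is terrible".split()
print(cosine(doc1, doc2))
print(jaccard(doc1, doc2))
```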
Readability measures. In some cases, the researcher is interested
in evaluating the readability of the text. Readability can reflect
the sophistication of the writer and/or the ability of the reader to
comprehend the text (e.g., Ghose and Ipeirotis 2011). Common
readability measures include the Flesch–Kincaid reading ease
and the simple measure of gobbledygook (SMOG) measures.
These measures often use metrics such as average number of
syllables and average number of words per sentence to evaluate
the readability of the text. Readability measures often grade the
text on a 1–12 scale reflecting the U.S. school grade-level
needed to comprehend the text. Common text-mining packages
have built-in readability tools.
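As one concrete instance, the Flesch–Kincaid grade-level formula combines words per sentence and syllables per word. The syllable counter below is a crude vowel-group heuristic, so treat this as an illustrative sketch rather than a validated implementation; the built-in tools in text-mining packages use better rules.

```python
import re

def syllables(word):
    # Crude heuristic: count groups of consecutive vowels
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text):
    # Flesch-Kincaid grade level: 0.39*(words/sentence) + 11.8*(syllables/word) - 15.59
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syl = sum(syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syl / len(words) - 15.59

simple = "The cat sat. The dog ran."
dense = "Unquestionably, organizational heterogeneity complicates interpretation."
print(fk_grade(simple))
print(fk_grade(dense))
```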
The Validity of Text-Based Constructs
While the availability of text has opened up a range of research
questions, for textual data to provide value, one must be able to
establish its validity. Both internal validity (i.e., does text
accurately measure the constructs and the relationship between
them?) and external validity (i.e., do the text-based findings
apply to phenomena outside the study?) can be established in
various ways (Humphreys and Wang 2017). Table 5 describes
how the text analysis can be evaluated to improve different
types of validity (Cook and Campbell 1979).
Internal Validity
Internal validity is often a major threat in the context of text
analysis because the mapping between words and the underly-
ing dimension the research aims to measure (e.g., psychologi-
cal state and traits) is rarely straightforward and can vary across
contexts and textual outlets (e.g., formal news vs. social
media). In addition, given the relatively young field of auto-
mated text analysis, validation of many of the methods and
constructs is still ongoing.
Accordingly, it is important to confirm the internal validity
of the approach used. A range of methods can be adopted to
ensure construct, concurrent, convergent, discriminant, and
causal validity. In general, the approach for ensuring internal
validity is to ensure that the text studied accurately reflects the
theoretical concept or topic being studied, does so in a way that
is congruent with prior literature, is discriminant from other
related constructs, and provides ample and careful evidence for
the claims of the research.
Construct validity. Construct validity (i.e., does the text represent
the theoretical concept?) is perhaps the most important to
address when studying text. Threats to construct validity occur
when the text provides improper or misleading evidence of the
construct. For instance, researchers often rely on existing stan-
dardized dictionaries to extract constructs to ensure that their
work is comparable with other work. However, these diction-
aries may not always fit the particular context. For example,
extracting sentiment from financial reports using sentiment
tools developed for day-to-day language may not be appropri-
ate. Particularly when attempting to extract complex constructs
(e.g., psychological states and traits, relationships between con-
sumers and products, and even sentiment), researchers should
attempt to validate the constructs on the specific application to
ensure that what is being extracted from the text is indeed what
they intended to extract. Construct validity can also be chal-
lenged when homonyms or other words do not accurately
reflect what researchers think they do.
Strategies for addressing threats to construct validity require
that researchers examine how the instances counted in the data
connect to the theoretical concept(s) (Humphreys and Wang
2017). Dictionaries can also be validated using a saturation
approach, pulling a subsample of coded entries and verifying
with a hit rate of approximately 80% (Weber 2005). Another
method is to use input from human coders, as is done to support
machine learning applications (as previously discussed). For
example, one can use Amazon Mechanical Turk workers to
label phrases on a scale from “very negative” to “very positive”
for sentiment analysis and then use these words to create a
weighted dictionary. In many cases, multiple methods for dic-
tionary validation are advisable to ensure that one is achieving
both theoretical and empirical fit. For topic modeling, research-
ers infer topics from a list of cooccurring words. However,
these are theoretical inferences made by researchers. As such,
construct validity is equally important and can be ascertained
using some of the same methods of validation, through satura-
tion and calculating a hit rate through manual analysis of a
subset of the data. When using a classification approach, con-
fusion matrices can be produced to provide details on accuracy,
false positives, and false negatives (Das and Chen 2007).
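The hit-rate logic described above can be sketched in a few lines. Everything in the sketch below is a hypothetical illustration: the word list, the documents, and the human codes are invented, and a real validation would use a subsample of the actual corpus coded by trained judges.

```python
# Hypothetical sketch of dictionary validation against hand-coded labels.
# The word list, documents, and human codes are invented for illustration.

POSITIVE_WORDS = {"great", "love", "excellent", "happy"}

def dict_label(text):
    """Label a document 'positive' if it contains any dictionary word."""
    tokens = text.lower().split()
    return "positive" if any(t in POSITIVE_WORDS for t in tokens) else "other"

# (document, human code) pairs, standing in for a hand-coded subsample
sample = [
    ("i love this phone", "positive"),
    ("battery life is excellent", "positive"),
    ("arrived late and broken", "other"),
    ("great value for the price", "positive"),
    ("the manual is confusing", "other"),
    ("works fine no complaints", "positive"),  # missed by the dictionary
]

predictions = [dict_label(doc) for doc, _ in sample]
truth = [code for _, code in sample]

hit_rate = sum(p == t for p, t in zip(predictions, truth)) / len(sample)
tp = sum(p == t == "positive" for p, t in zip(predictions, truth))
precision = tp / predictions.count("positive")
recall = tp / truth.count("positive")

print(hit_rate, precision, recall)  # ~0.83, 1.0, 0.75
```

Because the hit rate here sits right at the approximately 80% saturation threshold, a researcher would expand the word list (e.g., to catch "no complaints") and recheck on a fresh subsample until the rate stabilizes.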
Concurrent validity. Concurrent validity concerns the way that
the researcher’s operationalization of the construct relates to
prior operationalizations. Threats to concurrent validity often
come when researchers create text-based measures inductively
from the text. For instance, if one develops a topic model from
the text, it will be based on the data set and may not therefore
produce topics that are comparable with previous research. To
address these threats, one should compare the operationaliza-
tion with other research and other data sources. For example,
Schweidel and Moe (2014) propose a measure of brand senti-
ment based on social media text data and validate it by
Table 5. Text Analysis Validation Techniques.

Internal Validity

Construct validity
- Dictionary validation: After a draft dictionary is created, pull 10% of the sample and calculate the hit rate. Measures such as hit rate, precision, and recall can be used to assess accuracy. (Weber 2005)
- Dictionary validation: Have survey participants rate words included in the dictionary. Based on these data, the dictionary can also be weighted to reflect the survey ratings. (Brysbaert, Warriner, and Kuperman 2014a)
- Dictionary validation: Have three coders evaluate the dictionary categories. If two of the three coders agree that a word is part of the category, include it; if not, exclude it. Calculate overall agreement.
- Saturation: Pull 10% of instances coded from the data and calculate the hit rate. Adjust the word list until saturation reaches an 80% hit rate. (Weber 2005)

Concurrent validity
- Multiple dictionaries: Calculate and compare multiple textual measures of the same construct (e.g., multiple sentiment measures). (Hartmann et al. 2018)
- Comparison of topics: Compare with other topic models of similar data sets in other research (e.g., hotel reviews). (Mankad et al. 2016a)

Convergent validity
- Triangulation: Look within text data for converging patterns (e.g., positive emotion correlates with known-positive attributes); apply principal components analysis to show convergent groupings of words. (Humphreys 2010; Kern et al. 2016)
- Multiple operationalizations: Operationalize constructs with textual and nontextual data (e.g., sentiment, star rating). (Ghose et al. 2012a; Mudambi, Schuff, and Zhang 2014a)

Causal validity
- Control variables: Include variables in the model that address rival hypotheses to control for these effects. (Ludwig et al. 2013)
- Laboratory study: Replicate the focal relationship between the independent and dependent variables in a laboratory setting. (Spiller and Belogolova 2016a; Van Laer et al. 2018)

External Validity

Generalizability
- Replication with different data sets: Compare the results from the text analysis with results obtained from other (possibly non-text-related) data sets. (Netzer et al. 2012)
- Predict key performance measure: Include results from text analysis in a regression or other model to predict a key outcome (e.g., sales, engagement). (Fossen and Schweidel 2019)

Predictive validity
- Holdout sample: Train the model on approximately 80%–90% of the data and validate it with the remaining data. Validation can be done using k-fold validation, which trains the model on k - 1 subsets of the data and predicts for the remaining test subset. (Jurafsky et al. 2014)

Robustness
- Different statistical measures, unitizations: Use different, but comparable, statistical measures or algorithms (e.g., lift, cosine similarity, Jaccard similarity); aggregate at different levels (e.g., day, month). (Netzer et al. 2012)

a Reference appears in the Web Appendix.
14 Journal of Marketing XX(X)
comparing it with brand measures obtained through a tradi-
tional marketing research survey. Similarly, Netzer et al.
(2012) compare the market structure maps derived from textual
information with those derived from product switching and
surveys, and Tirunillai and Tellis (2014) compare the topics
they identify with those found in Consumer Reports. When
studying linguistic style (Pennebaker and King 1999), for
example, it is beneficial to use robust measures from prior
literature where factor analysis and other methods have already
been employed to create the construct.
Convergent validity. Convergent validity ensures that multiple
measurements of the construct (i.e., words) all converge to the
same concept. Convergent validity can be threatened when
the measures of the construct do not align or have different
effects. Convergent validity can be enhanced by using several
substantively different measures (e.g., dictionaries) of the
same construct to look for converging patterns. For example,
when studying posts about the stock market, Das and Chen
(2007) compare five classifiers for measuring sentiment,
comparing them in a confusion matrix to examine false posi-
tives. Convergent evidence can also come from creating a
correlation or similarity matrix of words or concepts and
checking for patterns that have face validity. For instance,
Humphreys (2010) looks for patterns between the concept
of crime and negative sentiment to provide convergent evi-
dence that crime is negatively valenced in the data.
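In its simplest form, this kind of triangulation amounts to checking that two substantively different measures of the same construct correlate. The sketch below is a hypothetical illustration; the per-document scores from the two (invented) dictionaries are made up.

```python
# Hedged sketch of convergent validity: correlate two hypothetical
# sentiment measures of the same five documents. Scores are invented.

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Sentiment scores for the same documents from two different dictionaries
dict_a = [0.9, 0.2, -0.5, 0.7, -0.8]
dict_b = [0.8, 0.1, -0.4, 0.6, -0.9]

r = pearson(dict_a, dict_b)
print(round(r, 3))  # a high r suggests the measures converge
```

A low or negative correlation here would signal that at least one of the measures is not capturing the intended construct.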
Discriminant validity. Discriminant validity, the degree to which
the construct measures are sufficiently different from measures
of other constructs, can be threatened when the measurement of
the construct is very similar to that of another construct. For
instance, measurements of sentiment and emotion in many
cases may not seem different because they are measured using
similar word lists or, when using classification, return the same
group of words as predictors. Strategies for ensuring discrimi-
nant validity entail looking for discriminant rather than con-
vergent patterns and boundary conditions (i.e., when and how
is sentiment different from emotion?). Furthermore, theoretical
refinements can be helpful in drawing finer distinctions. For
example, anxiety, anger, and sadness are different kinds of
emotion (and can be measured via psychometrically different
scales), whereas sentiment is usually measured as positive,
negative, or neutral (Pennebaker et al. 2015).
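A simple first check on discriminant validity is the degree of overlap between the word lists themselves. The sketch below uses invented word lists standing in for a general negative-sentiment dictionary and a narrower anger dictionary.

```python
# Illustrative discriminant-validity check: how much do two word lists
# overlap? Both lists below are hypothetical, not from any published
# dictionary.

sentiment_words = {"bad", "terrible", "awful", "poor", "hate", "angry"}
anger_words = {"angry", "furious", "outraged", "hate", "irritated"}

overlap = sentiment_words & anger_words
jaccard = len(overlap) / len(sentiment_words | anger_words)

print(sorted(overlap), round(jaccard, 2))
# High overlap would suggest the two constructs are not being
# measured distinctly.
```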
Causal validity. Causal validity is the degree to which the con-
struct, as operationalized in the data set, is actually the cause of
another construct or outcome, and it is best ascertained through
random assignment in controlled lab conditions. Any number
of external factors can threaten causal validity. However, steps
can be taken to enhance causal validity in naturally occurring
textual data. In particular, rival hypotheses and other explana-
tory factors for the proposed causal relationship can be statis-
tically controlled for in the model. For example, Ludwig et al.
(2013) include price discount in the model when studying the
relationship between product reviews and conversion rate to
control for this factor.
External Validity
To achieve external validity, researchers should attempt to
ensure that the effects found in the text apply outside of the
research framework. Because text analysis typically uses large volumes of naturally occurring data, it tends to have a relatively high degree of external validity relative to, for example, lab experiments. However, establishing external
validity is still necessary due to threats to validity from sam-
pling bias, overfitting, and single-method bias. For example,
online reviews may be biased due to self-selection among
those who elected to review a product (Schoenmuller, Netzer,
and Stahl 2019).
Predictive validity. Predictive validity is threatened when the
construct, though perhaps properly measured, does not have
the expected effects on a meaningful second variable. For
example, if consumer sentiment falls but customer satisfac-
tion remains high, predictive validity could be called into
question. To ensure predictive validity, text-based constructs
can be linked to key performance measures such as sales (e.g.,
Fossen and Schweidel 2019) or consumer engagement (Ash-
ley and Tuten 2015). If a particular construct has been theo-
retically linked to a performance metric, then any text-based
measure of that construct should also be linked to that perfor-
mance metric. Tirunillai and Tellis (2012) show that the vol-
ume of Twitter activity affects stock price, but they find
mixed results for the predictive validity of sentiment, with
negative sentiment being predictive but positive sentiment
having no effect.
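In its simplest form, a predictive-validity check fits the performance metric on the text-based measure. The weekly sentiment and sales figures below are invented for illustration, and the one-variable least-squares fit stands in for the richer models used in the cited work.

```python
# Sketch of a predictive-validity check: does a text-based sentiment
# measure predict a key performance outcome? Data are hypothetical.

sentiment = [0.1, 0.4, 0.5, 0.7, 0.9]  # weekly average review sentiment
sales = [100, 130, 140, 165, 180]      # weekly unit sales (invented)

n = len(sentiment)
mx = sum(sentiment) / n
my = sum(sales) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(sentiment, sales))
         / sum((x - mx) ** 2 for x in sentiment))
intercept = my - slope * mx

print(round(slope, 1), round(intercept, 1))
```

A slope indistinguishable from zero, despite a well-validated construct, would call the measure's predictive validity into question.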
Generalizability. Generalizability can be threatened when researchers base
results on a single data set because it is unknown whether the
findings, model, or algorithm would apply in the same way to
other texts or outside of textual measurements. Generalizability
of the results can be established by viewing the results of text
analysis along with other measures of attitude and behavioral
outcomes. For example, Netzer et al. (2012) test their substan-
tive conclusions and methodology on message boards of both
automobile discussions and drug discussions from WebMD.
Evaluating the external validity and generalizability of the
findings is key, because the analysis of text drawn from a
particular source may not reflect consumers more broadly
(e.g., Schweidel and Moe 2014).
Robustness. Robustness can be limited when there is only one
metric or method used in the model. Researchers can ensure
robustness by using different measures for relationships (e.g.,
Pearson correlation, cosine similarity, lift) and probing results
by relaxing different assumptions. The use of holdout samples
and k-fold cross-validation methods can prevent researchers
from overfitting their models and ensure that relationships
found in the data set will hold with other data as well (Jurafsky
et al. 2014; see also Humphreys and Wang 2017). Probing on
different “cuts” of the data can also help. Berger and Packard
(2018), for example, compare lyrics from different genres, and
Ludwig et al. (2013) include reviews of both fiction and non-
fiction books.
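The holdout and k-fold logic can be sketched without any modeling machinery. The helper below (`k_fold_indices`, a hypothetical name, not from the cited work) only produces the index splits; in practice, the text model would be retrained on each training split and evaluated on the corresponding held-out split.

```python
# Minimal k-fold splitting sketch. k_fold_indices is a hypothetical
# helper; each yielded pair gives the indices for one train/test split.

def k_fold_indices(n, k):
    """Yield (train, test) index lists for k roughly equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        test_set = set(test)
        train = [i for i in range(n) if i not in test_set]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 5))
print(folds[0])  # ([2, 3, 4, 5, 6, 7, 8, 9], [0, 1])
```

Averaging performance over the k held-out folds gives a less optimistic, and less overfit, estimate than evaluating on the training data alone.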
Finally, researchers should bear in mind the limitations of
text itself. There are thoughts and feelings that consumers,
managers, or other stakeholders may not express in text. The
form of communication (e.g., tweets, annual reports) may also
shape the message; some constructs may not be explicit enough
to be measured with automated text analysis. Furthermore,
while textual information can often involve large samples,
these samples may not be representative. Twitter users, for
example, tend to be younger and more educated (Smith and
Anderson 2018). Those who contribute textual information,
particularly in social media, may represent polarized points
of view. When evaluating cultural products or social media,
one should consider the system in which they are generated.
Often viewpoints are themselves filtered through a cultural
system (Hirsch 1986; McCracken 1988) or elevated by an algo-
rithm, and the products that make it through this process may share
certain characteristics. For this reason, researchers and firms
should use caution when making attributions on the basis of a
cultural text. It is not necessarily a reflection of reality (Jame-
son 2005) but rather may represent ideals, extremes, or insti-
tutionalized perceptions, depending on the context.
Future Research Agenda
We hope this article encourages more researchers and practi-
tioners to think about how they can incorporate textual data into
their research. Communication and linguistics are at the core of
studying text in marketing. Automated text analysis opens the
black box of interactions, allowing researchers to directly
access what is being said and how it is said in marketplace
communication. The notion of text as indicative of meaning-
making processes creates fascinating and truly novel research
questions and challenges. There are many methods and
approaches available, and there is no space to do all of them
justice. While we have discussed several research streams,
given the novelty of text analysis, there are still ample oppor-
tunities for future research, which we discuss next.
Using Text to Reach Across the Marketing Discipline
Returning to how text analysis can unite the tribes of market-
ing, it is worth highlighting a few areas that have mostly been
examined by one research tradition in marketing where fruitful
cross-pollination between tribes is possible through text anal-
ysis. Brand communities were first identified and studied by
researchers coming from a sociology perspective (Muniz and
O’Guinn 2001). Later, qualitative and quantitative researchers
further refined the concepts, identifying a distinct set of roles
and status in the community (e.g., Mathwick, Wiertz, and De
Ruyter 2007). Automated text analysis allows researchers to
study how consumers in these communities interact at scale
and in a more quantifiable manner—for instance, examining
how people with different degrees of power use language and
predict group outcomes based on quantifiably different
dynamics (e.g., Manchanda, Packard, and Pattabhiramaiah
2015). Researchers can track influence, for example, by inves-
tigating which types of users initiate certain words or phrases
and which others pick up on them. Research could examine
whether people begin to enculturate to the language of the
community over time and predict which individuals may be
more likely to stay or leave on the basis of how well they adapt
to the group’s language (Danescu-Niculescu-Mizil et al. 2013;
Srivastava and Goldberg 2017). Quantitative or machine learn-
ing researchers might capture the most commonly discussed
topics and how these dynamically change over the evolution
of the community. Interpretive researchers might examine how
these terms link conceptually, to find underlying community
norms that lead members to stay. Marketing strategy research-
ers might then use or develop dictionaries to connect these
communities to firm performance and to offer directions for
firms regarding how to keep members participating across dif-
ferent brand communities (or contexts).
The progression can flow the other way as well. Outside of a
few early investigations (e.g., Dichter 1966), word of mouth
was originally studied by quantitative researchers interested in
whether interpersonal communication actually drove individ-
ual and market behavior (e.g., Chevalier and Mayzlin 2006;
Iyengar, Van den Bulte, and Valente 2011). More recently,
however, behavioral researchers have begun to study the under-
lying drivers of word of mouth, looking at why people talk
about and share some stories, news, and information rather than
others (Berger and Milkman 2012; De Angelis et al. 2012; for a
review, see Berger [2014]). Marketing strategy researchers
might track the text of word-of-mouth interactions to predict
the emergence of brand crises or social media firestorms (e.g.,
Zhong and Schweidel 2019) as well as when, if, and how to
respond (Herhausen et al. 2019).
Consumer–firm interaction is also a rich area to examine.
Behavioral researchers could use the data from call centers to
better understand interpersonal communication between con-
sumers and firms and record what drives customer satisfaction
(e.g., Packard and Berger 2019; Packard, Moore, and McFerran
2018). The back-and-forth between customers and agents could
be used to understand conversational dynamics. More quantitative researchers could use the textual features of call centers
to predict outcomes such as churn and even go beyond text to
examine vocal features such as tone, volume, and speed of
speech. Marketing strategy researchers could use calls to
understand how customer-centric a company is or assess the
quality, style, and impact of its sales personnel.
Finally, it is worth noting that different tribes not only have
different skill sets but also often study substantively different
types of textual communication. Consumer-to-consumer com-
munication is often studied by researchers in consumer beha-
vior, whereas marketing strategy researchers more often tend to
study firm-to-consumer and firm-to-firm communication. Col-
laboration among researchers from the different subfields may
allow them to combine these different sources of textual data.
There is ample opportunity to apply theory developed in one
domain to enhance another. Marketing strategy researchers, for
example, often use transaction economics to study business-to-
business relationships through agency theory, but these
approaches may be equally beneficial when studying
consumer-to-consumer communications.
Broadening the Scope of Text Research
As noted in Table 1, certain text flows have been studied more
than others. A large portion of existing work has focused on
consumers communicating to one another through social
media and online reviews. The relative availability of such
data has made it a rich area of study and an opportunity to
apply text analysis to marketing problems.3 However, for this area to grow, researchers need to branch out. This
includes expanding (1) data sources, (2) actors examined, and
(3) research topics.
Expand data sources used. Offline word of mouth, for example,
can be examined to study what people talk about and conversa-
tional dynamics. Doctor–patient interactions can be studied to
understand what drives medical adherence. Text items such as
yearbook entries, notes passed between students, or the text of
speed dating conversations can be used to examine relationship
formation, maintenance, and dissolution. Using offline data
requires carefully transcribing content, which increases the
amount of effort required but opens up a range of interesting
avenues of study. For example, we know very little about the
differences between online recommendations and face-to-face
recommendations, where the latter also include the interplay
between verbal and nonverbal information. Moreover, in the
new era of “perpetual contact,” our understanding of cross-message and cross-channel implications is limited. Research
by Batra and Keller (2016) and Villarroel Ordenes et al.
(2018) suggests that appropriate sequencing of messages mat-
ters; it might similarly matter across channels and modality.
Given the rise of technology-enabled realities (e.g., augmented
reality, virtual reality, mixed reality), assistive robotics, and
smart speakers, understanding the roles and potential differ-
ences between language and nonverbal cues could be achieved
using these novel data sources.
Expand dyads between text producers and text receivers. There are
numerous dyads relevant to marketing in which text plays a
crucial role. We discuss just a few of the areas that deserve
additional research.
Considering consumer–firm interactions, we expect to see
more research leveraging the rich information exchanged
between consumers and firms through call centers and chats
(e.g., Packard and Berger 2019; Packard, Moore, and McFerran
2018). These interactions often reflect inbound communication
between customers and the firm, which can have important
implications for the relationship between parties. In addition,
how might the language used on packaging or in brand mission
statements reflect the nature of organizations and their relation-
ship to their consumers? How might the language that is most
impactful in sales interactions differ from the language that is
most useful in customer service interactions? Research could
also probe how the impact of such language varies across
contexts. The characteristics of language used by consumer
packaged goods brands and pharmaceuticals brands in direct-
to-consumer advertising likely differ. Similarly, the way in
which consumers process the language used in disclosures in
advertisements for pharmaceuticals (e.g., Narayanan, Desiraju,
and Chintagunta 2004) and political candidates (e.g., Wang,
Lewis, and Schweidel 2018) may vary.
Turning to firm-to-firm interactions, most conceptual
frameworks on business-to-business (B2B) exchange relations
emphasize the critical role of communication (e.g., Palmatier,
Dant, and Grewal 2007). Communicational aspects have been
linked to important B2B relational measures such as commit-
ment, trust, dependence, relationship satisfaction, and relation-
ship quality. Yet research on actual, word-level B2B
communication is very limited. For example, very little
research has examined the types of information exchanged
between salespeople and customers in offline settings. The
ability to gather and transcribe data at scale points to important
opportunities to do so. As for within-firm communication,
researchers could study informal communications such as
marketing-related emails, memos, and agendas generated by
firms and consumed by their employees.
Similarly, while a great deal of work in accounting and
finance has begun to use annual reports as a data source (for
a review, see Loughran and McDonald [2016]), marketing
researchers have paid less attention to this area to study com-
munication with investors. Most research has used this data to
predict outcomes such as stock performance and other mea-
sures of firm valuation. Given recent interest in linking
marketing-related activities to firm valuation (e.g., McCarthy
and Fader 2018), this may be an area to pursue further. All firm
communication, including required documents such as annual
reports or discretionary forms of communication such as adver-
tising and sales interactions, can be used to measure variables
such as market orientation, marketing capabilities, marketing
leadership styles, and even a firm’s brand personality.
There are also ample research opportunities in the interac-
tions between consumers, firms, and society. Data about the
broader cultural and normative environment of firms, such as
news media and government reports, may be useful to shed
light on the forces that shape markets. To understand how a
company such as Uber navigates resistance to market change,
for example, one might study transcripts of town hall meetings
and other government documents in which citizen input is
heard and answered. Exogenous shocks in the forms of social
movements such as #metoo and #blacklivesmatter have
affected marketing communication and brand image. One
potential avenue for future research is to take a cultural
3 While readily available data facilitates research, there are downsides to be recognized, including the representativeness of such data and the terms of service that govern the use of this data.
branding approach (Holt 2016) to study how different publics
define, shape, and advocate for certain meanings in the market-
place. Firms and their brands do not exist in a vacuum, inde-
pendent of the society in which they operate. Yet limited
research in marketing has considered how text can be used to
derive firms’ intentions and actions at the societal level. For
example, scholars have shown how groups of consumers such
as locavores (i.e., people who eat locally grown food; Thomp-
son and Coskuner-Balli 2007), fashionistas (Scaraboto and
Fischer 2012), and bloggers (McQuarrie, Miller, and Phillips
2012) shape markets. Through text analysis, the effect of the
intentions of these social groups on the market can then be
measured and better understood.
Another opportunity for future research is the use of textual
data to study culture and cultural success. Topics such as cul-
tural propagation, artistic change, and the diffusion of innova-
tions have been examined across disciplines with the goal of
understanding why certain products succeed while others fail
(Bass 1969; Boyd and Richerson 1986; Cavalli-Sforza and
Feldman 1981; Rogers 1995; Salganik, Dodds, and Watts
2006; Simonton 1980). While success may be random (Bielby
and Bielby 1994; Hirsch 1972), another possibility is that cul-
tural items succeed or fail on the basis of their fit with con-
sumers (Berger and Heath 2005). By quantifying aspects of
books, movies, or other cultural items quickly and at scale,
researchers can measure whether concrete narratives are more
engaging, whether more emotionally volatile movies are more
successful, whether songs that use certain linguistic features are
more likely to top the Billboard charts, and whether books that
evoke particular emotions sell more copies. While not as
widely available as social media data, more and more data on
cultural items has recently become available. Data sets such as
the Google Books corpus (Akpinar and Berger 2015), song
lyric websites, or movie script databases provide a wealth of
information. Such data could enable analyses of narrative
structure to identify “basic plots” (e.g., Reagan et al. 2016; Van
Laer et al. 2019).
Key Marketing Constructs (That Could Be) Measured with Text
Beginning with previously developed ways of representing
marketing constructs can help some researchers address valid-
ity concerns. This section details a few of these constructs to
aid researchers who are beginning to use text analysis in their
work (see the Web Appendix). Using prior operationalization
of a construct can ensure concurrent validity—helping build
the literature in a particular domain—but researchers should
take steps to ensure that the prior operationalization has con-
struct validity with their data set.
At the individual level, sentiment and satisfaction are per-
haps some of the most common measurements (e.g., Buschken and Allenby 2016; Homburg, Ehm, and Artz 2015; Herhausen et al. 2019; Ma, Sun, and Kekre 2015; Schweidel and Moe
2014) and have been validated in numerous contexts. Other
aspects that may be extracted from text include the authenticity
and emotionality of language, which have also been explored
through robust surveys and scales or by combining multiple
existing measurements (e.g., Mogilner, Kamvar, and Aaker
2011; Van Laer et al. 2019). There are also psychological con-
structs, such as personality type and construal level (Kern et al.
2016; Snefjella and Kuperman 2015), that are potentially use-
ful for marketing researchers and could also be inferred from
the language used by consumers.
Future work in marketing studying individuals might con-
sider measurements of social identification and engagement.
That is, researchers currently have an idea of positive or neg-
ative consumer sentiment, but they are only beginning to
explore emphasis (e.g., Rocklage and Fazio 2015), trust, com-
mitment, and other modal properties. To this end, harnessing
linguistic theory of pragmatics and examining phatics over
semantics could be useful (see, e.g., Villarroel Ordenes et al. 2017).
Once such work is developed, we recommend that researchers
carefully validate approaches proposed to measure such con-
structs along the lines described previously.
At the firm level, constructs have been identified in firm-
produced text such as annual reports and press releases. Mar-
ket orientation, advertising goals, future orientation, deceitful
intentions, firm focus, and innovation orientation have all
been measured and validated using this material (see Web
Appendix Table 1). Work in organizational studies has a his-
tory of using text analysis in this area and might provide some
inspiration and validation in the study of the existence of
managerial frames for sensemaking and the effect of activists
on firm activities.
Future work in marketing at the firm level could further
refine and diversify measurements of strategic orientation
(e.g., innovation orientation, market-driving vs. market-
driven orientations). Difficult-to-measure factors deep in the
organizational culture, structure, or capabilities may be
revealed in the words the firm, its employees, and external
stakeholders use to describe it (see Molner, Prabhu, and Yadav
[2019]). Likewise, the mindsets and management style of mar-
keting leaders may be discerned from the text they use (see
Yadav, Prabhu, and Chandy [2007]). Firm attributes that are
important outcomes of firm action (e.g., brand value) could
also be explored using text (e.g., Herhausen et al. 2019). In
this case, there is an opportunity to use new kinds of data. For
instance, internal, employee-based brand value could be mea-
sured with text on LinkedIn or Glassdoor. Finally, more subtle
attributes of firm language, including conflict, ambiguity, or
openness, might provide some insight into the effects of man-
agerial language on firm success. For this, it may be useful to
examine less formal textual data of interactions such as
employee emails, salesperson calls, or customer service cen-
ter calls.
Less work in marketing has measured constructs on the
social or cultural level, but work in this vein tends to focus
on how firms fit into the cultural fabric of existing meanings
and norms. For instance, institutional logics and legitimacy
have been measured by analyzing media text, as has the rise
of brand publics that increase discussion of brands within a
culture (Arvidsson and Caliandro 2016).
At the cultural level, marketing research is likely to maintain
a focus on how firms fit into the cultural environment, but it
may also look to how the cultural environment affects consu-
mers. For instance, measurement of cultural uncertainty, risk,
hostility, and change could benefit researchers interested in the
effects of culture on both consumer and firm effects as well as
the effects of culture and society on government and investor
relationships. Measuring openness and diversity through text is also a timely topic to explore and might inspire innovations
in measurement, focusing on, for example, language diversity
rather than the specific content of language. Important cultural
discourses such as language around debt and credit could also
be better understood through text analysis. Measurement of
gender- and race-related language could be useful in exploring
diversity and inclusion in the way firms and consumers react to
text from a diverse set of writers.
Opportunities and Challenges Provided by Methodological Advances
Opportunities. As the development of text analysis tools
advances, we expect to see new and improved use of these
tools in marketing, which can enable scholars to answer ques-
tions we could not previously address or have addressed only in
a limited manner. Here are a few specific method-driven direc-
tions that seem promising.
First, the vast majority of the approaches used for text analysis in marketing (and elsewhere) rely on bag-of-words representations, and thus the ability to capture true linguistic relationships among words beyond their cooccurrence is limited.
However, in marketing we are often interested in capturing the
relationship among entities. For example, what problems or
benefits did the customer mention about a particular feature
of a particular product? Such approaches require capturing a
deeper textual relationship among entities than is commonly
used in marketing. We expect to see future development in
these areas as deep learning and NLP-based approaches enable
researchers to better capture semantic relationships.
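The limitation is easy to demonstrate with a toy example (invented here, not from the article): under a bag-of-words representation, two sentences that attach opposite opinions to different product features are indistinguishable.

```python
# Toy illustration of the bag-of-words limitation: opposite opinions
# about "battery" and "screen" yield identical word counts.
from collections import Counter

a = "the battery is great but the screen is terrible"
b = "the screen is great but the battery is terrible"

print(Counter(a.split()) == Counter(b.split()))  # True
```

Recovering which opinion attaches to which feature requires representations that preserve syntactic or semantic structure, which is exactly where the deep learning and NLP approaches mentioned above come in.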
Second, in marketing we are often interested in the latent
intention or latent states of writers when creating text, such as
their emotions, personality, and motivations. Most of the
research in this area has relied on a limited set of dictionaries
(primarily the LIWC dictionary) developed and validated to
capture such constructs. However, these dictionaries are often
limited in capturing nuanced latent states or latent states that
may manifest differently across contexts. Similar to advances
made in areas such as image recognition, with the availability
of a large number of human-coded training data (often in the
millions) combined with deep learning tools, we hope to see
similar approaches being taken in marketing to capture more
complex behavioral states from text. This would require an
effort to human-code a large and diverse set of textual corpora
for a wide range of behavioral states. Transfer learning meth-
ods commonly used in deep learning tools such as convolutional neural nets can then be used to apply the learning from the
more general training data to any specific application.
Third, there is also the possibility of using text analysis to
personalize customer–firm interactions. Using machine learn-
ing, text analysis can also help personalize the customer inter-
action by detecting consumer traits (e.g., personality) and states
(e.g., urgency, irritation) and perhaps eventually predicting
traits associated with value to the firm (e.g., customer lifetime
value). After analysis, firms can then tailor customer commu-
nication to match linguistic style and perhaps funnel consumers
to the appropriate firm representative. The stakes of making
such predictions may be high, mistakes costly, and there are
clearly contexts in which using artificial intelligence impedes