Computational Analysis of Agenda Setting Theory
Yeooul Kim and Alice Oh
June 18, 2013
May 12, 2015
Our Research Overview
• Topic Modeling (Machine Learning)
• CIKM 2011: Distance-dependent Chinese restaurant franchise (ddCRF)
• ICML 2012: Dirichlet process with random mixed measures (DP-MRM)
• CIKM 2012: Recursive Chinese restaurant process for modeling topic hierarchies (rCRP)
• NIPS Big Learning Workshop 2012: Distributed Online Learning for Latent Dirichlet Allocation (DoLDA)
• IJCAI 2013: Context Dependent Conceptualization (at MSRA)
• Computational Social Science
• WSDM 2011: Aspect sentiment unification model for online review analysis
• ICWSM 2012: Social aspects of emotions in Twitter conversations
• ACL 2012: Self-disclosure and relationship strength in Twitter conversations
• AAAI 2013: Hierarchical Aspect Sentiment Model for Online Reviews (at MSRA)
Agenda Setting Theory
How does media affect the thoughts of the audience?
Agenda Setting Theory (McCombs & Shaw, 1972)
• Media affects audiences by having an influence on
• What to think about
• How to think about it
• Examples of traditional media studies
• Media affects the outcome of presidential elections (Perloff and Krauss, 1985)
• Media coverage influences the control of infectious diseases (Cui et al., 2008)
• Tone of news articles affects the number of visitors to museums (Zyglidopoulos et al., 2012)
Limitations of Traditional Media Studies
1. Use of traditional off-line newspapers and TV as target media
• Analysis is limited to a small volume over a short duration
• Issues are arbitrarily chosen
2. Use of off-line MIP (Most Important Problems) surveys
• Self-reports are not reliable
• Only a small subset of the population can be surveyed
3. Use of manual coding for content analysis
• You need experts
• It is difficult to replicate and generalize to other domains
Computational Analysis of Agenda Setting Theory
1. Use of traditional off-line newspapers and TV as target media
• Crawl online news to get several years' data
• Use machine learning to automatically discover the important issues
2. Use of off-line MIP (Most Important Problems) surveys
• Look at counts of social media shares
• Look at counts of user comments
3. Use of manual coding for content analysis
• Use unsupervised machine learning to analyze content for tone (polarity) of articles and comments
• Try it on different issues to see whether the ML approach generalizes across many domains
[Figure: example issue "Gay marriage"; AUDIENCE'S BEHAVIOR measured as COMMENT and SHARE]
DATA STATISTICS
From http://www.npr.org/, 2011.01 – 2013.04

Section      #Articles  #Comments  #Commenters  #Shares
Politics     1,863      174,680    14,106       2,080,889
Business     2,043      130,921    17,791       3,657,544
Opinion      4,820      149,618    30,556       6,620,489
Sports       814        17,282     5,484        712,507
Technology   456        13,571     4,993        570,732
Science      945        50,113     11,114       4,709,041
World        3,673      134,572    14,882       3,534,637
Health       3,060      92,964     18,185       6,001,082
Total        17,674     763,721    117,111      27,886,921
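The raw counts in the table depend heavily on how many articles each section published, so a per-article view can make the sections easier to compare. The sketch below is an illustration added here, not part of the original analysis; the names `SECTION_STATS` and `per_article_rates` are hypothetical, and the numbers are copied from the table.

```python
# Section-level counts copied from the DATA STATISTICS table:
# (articles, comments, commenters, shares).
SECTION_STATS = {
    "Politics":   (1863, 174680, 14106, 2080889),
    "Business":   (2043, 130921, 17791, 3657544),
    "Opinion":    (4820, 149618, 30556, 6620489),
    "Sports":     (814, 17282, 5484, 712507),
    "Technology": (456, 13571, 4993, 570732),
    "Science":    (945, 50113, 11114, 4709041),
    "World":      (3673, 134572, 14882, 3534637),
    "Health":     (3060, 92964, 18185, 6001082),
}


def per_article_rates(stats):
    """Return {section: (comments per article, shares per article)}."""
    return {
        section: (comments / articles, shares / articles)
        for section, (articles, comments, _commenters, shares) in stats.items()
    }


rates = per_article_rates(SECTION_STATS)
# Print sections ordered by shares per article, highest first.
for section, (cpa, spa) in sorted(rates.items(), key=lambda kv: -kv[1][1]):
    print(f"{section:<11}{cpa:8.1f} comments/article {spa:10.1f} shares/article")
```

On these figures, Politics has the most comments per article and Science the most shares per article.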
Issue Detection using HDP
Detected issue list and the number of articles of each issue for three sections out of eight sections.

Section   Issue (labeled using MTurk)        #Articles
Politics  presidential election              575
Politics  infringement of human rights       195
Politics  race for Washington                167
Politics  government economics               274
Politics  presidential campaigns and money   163
Politics  candidate-marriage & immigration   261
Politics  political viewpoints               157
Business  economic decline under Obama       514
Business  employment and paid slavery        218
Business  agriculture                        131
Business  banks and loan                     198
Business  stock market and business          166
Business  housing market                     170
Business  tax and business                   180
Business  energy and finance                 222
Business  new business and running           138
Health    health care reform laws            349
Health    vaccination                        189
Health    HIV and treatment                  496
Health    medication                         197
Health    healthcare and costs               224
Health    food and obesity                   245
Health    sleep study and children           210
Health    food and safety                    223
Health    health tech and new treatment      125
Health    mental health in families          117
CORRELATION IN ISSUE
▶ Effects from media exposure
[Figure: panels (a) and (b), weekly volume of articles and following comments over time, time axis starting Jan 01 2011]
Correlation between the weekly volume of articles and the audience's interest (the comments that follow). The correlation for (a) is 0.786, which indicates a strong agenda setting effect; the correlation for (b) is 0.418, which indicates a weak agenda setting effect.
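As a minimal sketch of the measurement described in the caption, the code below correlates a weekly series of article counts with a weekly series of comment counts. It assumes the reported values are Pearson correlations, which the slides do not state; `pearson_r` is a hypothetical helper, and the weekly counts are invented for illustration and are not the NPR data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly counts for one issue (invented, NOT the NPR data).
weekly_articles = [3, 5, 2, 8, 6, 1, 4]
weekly_comments = [210, 380, 150, 640, 400, 90, 300]

r = pearson_r(weekly_articles, weekly_comments)
print(f"correlation between article volume and comments: {r:.3f}")
```

The same function could be applied to weekly share counts in place of comments to obtain the share correlation for an issue.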
CORRELATION & EFFECT SIZE

Section   Issue (labeled using MTurk)        Comment corr.  Effect size  Share corr.  Effect size
Politics  presidential election              0.873****      Large        0.161        none
Politics  infringement of human rights       0.845****      Large        0.562****    Large
Politics  race for Washington                0.855****      Large        0.511****    Large
Politics  government economics               0.903****      Large        0.347***     Medium
Politics  presidential campaigns and money   0.836****      Large        0.367**      Medium
Politics  candidate-marriage & immigration   0.895****      Large        0.417**      Medium
Politics  political viewpoints               0.878****      Large        0.372*       Medium
Business  economic decline under Obama       0.870****      Large        0.304***     Medium
Business  employment and paid slavery        0.732****      Large        0.346***     Medium
Business  agriculture                        0.634****      Large        0.230*       Small
Business  banks and loan                     0.786****      Large        0.268**      Small
Business  stock market and business          0.736****      Large        0.441**      Medium
Business  housing market                     0.670****      Large        0.360*       Medium
Business  tax and business                   0.767****      Large        0.278**      Small
Business  energy and finance                 0.702****      Large        0.423**      Medium
Business  new business and running           0.750****      Large        0.278**      Small
Health    health care reform laws            0.564****      Large        0.241**      Small
Health    vaccination                        0.640****      Large        0.341***     Medium
Health    HIV and treatment                  0.399*         Medium       0.279**      Small
Health    medication                         0.447**        Medium       0.149        none
Health    healthcare and costs               0.706****      Large        0.615****    Large
Health    food and obesity                   0.702****      Large        0.162*       Small
Health    sleep study and children           0.541****      Large        0.456**      Medium
Health    food and safety                    0.428**        Medium       0.330***     Medium
Health    health tech and new treatment      0.544****      Large        0.172        none
Health    mental health in families          0.418**        Medium       0.360*       Medium

Keywords (Politics): romney gingrich republican santorum president obama house people political state republican election party walker president obama tax house congress romney campaign obama money million court law state ice supreme marriage romney obama president republican voters
Keywords (Business): percent economy year jobs debt rate tax people job can work time jobs year years food farmers year beer corn prices new bank banks financial money new company news new company people can stock now people city new can home like now housing tax can people state new like year get new oil gas company american car industry like can people new company get year
Keywords (Health): health law care insurance people federal health people vaccine virus new flu cancer women people percent risk hiv drug drugs fda people can new patients care health patients hospital hospitals food people can health new like weight can people study sleep kids children food can people health like may new get can patients people cancer brain new people can like life health get know
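The effect size labels in the table are consistent with Cohen's conventional cut-offs for a correlation (0.1 small, 0.3 medium, 0.5 large), with unstarred correlations labeled "none". The slides do not state this rule, so the sketch below is an inference from the table rather than the authors' procedure; `effect_size_label` and `parse_cell` are hypothetical names.

```python
def effect_size_label(r, significant=True):
    """Label |r| using Cohen's conventional cut-offs (0.1 / 0.3 / 0.5).

    Assumption: a correlation without any significance star is reported
    as "none", which matches the pattern seen in the table.
    """
    if not significant:
        return "none"
    size = abs(r)
    if size >= 0.5:
        return "Large"
    if size >= 0.3:
        return "Medium"
    if size >= 0.1:
        return "Small"
    return "none"


def parse_cell(cell):
    """Split a table cell such as '0.347***' into (r, has_star)."""
    value = cell.rstrip("*")
    return float(value), len(cell) > len(value)


# Share correlations for the seven Politics issues, copied from the table.
politics_share = ["0.161", "0.562****", "0.511****", "0.347***",
                  "0.367**", "0.417**", "0.372*"]
labels = [effect_size_label(*parse_cell(cell)) for cell in politics_share]
print(labels)
# ['none', 'Large', 'Large', 'Medium', 'Medium', 'Medium', 'Medium']
```

The printed labels match the share effect sizes listed for Politics in the table.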
RELEVANCE & UNCERTAINTY
Q: Why do we see larger agenda setting effects for some issues?
H: Previous studies argue for a relationship between agenda setting and relevance and/or uncertainty.
Previous research in media studies argues that larger agenda setting effects can be seen for issues with
• Low relevance and High uncertainty (Schonbach & Weaver, 1985)
• High relevance and High uncertainty (McCombs, 2004)
Measuring Relevance & Uncertainty with MTurk
Relevance: The relevance of an issue to the audience
Uncertainty: The degree of uncertainty by the audience about the issue
GOAL
Measure the average relevance and uncertainty of each issue to analyze the correlation to the level of agenda setting effect
EXPERIMENT SETTINGS
2 sets of randomly distributed issues (each set contains 27 issues)
PARTICIPANTS
- 26 American MTurkers, 13 for each set
- Various ages from 20s to 50s
MTurk Questionnaires - 1
[Screenshot: questionnaire sections DEFINITION, ABOUT SECTION, PERSONAL INFO.]
MTurk Questionnaires - 2
[Screenshot: questionnaire section ABOUT ISSUE]
[Table: Issues, Relevance, Uncertainty]
Results of MTurk
[Figure: issues plotted by Relevance (x-axis, 1 to 5) and Uncertainty (y-axis, 1 to 5), one panel per age group: 20s, 30s-50s, 60s. Labeled issues: "Internet and privacy", "world and football".]
Correlations with Agenda Setting Effect
[Figure: Set A issues plotted by Relevance (x-axis, 1 to 5) and Uncertainty (y-axis, 1 to 5); share_corr = 0.480* (Relevance)]
[Figure: issues for the 30s-50s group plotted by Relevance (x-axis, 1 to 5); share_corr = 0.524* (Relevance)]
Content Polarity & Audience Behavior
INFLUENTIAL FACTOR
Tone (Polarity) of article
GOAL
Identify the effects of article tone, positive and negative, on the commenting and sharing behaviors of the audience
Positive and Negative Articles
[Figure] Proportion of sharing behavior to commenting behavior. The audience tends to leave more comments on the negative article set, whereas it shares more articles from the positive article set.
DETECTED POS./NEG. WORDS
The sets of positive and negative words obtained from model analysis of news articles. The words differ by section and reflect what reads as positive or negative in each section.

BUSINESS
Positive: joined viral smoothly better balance respect forward empower fair moderate
Negative: cutthroat axed lawsuit beating lose opposite battle unjust fuming sequester
HEALTH
Positive: care respect admit clarify essential healthy repair benign hope repaired
Negative: tough severe emergency affected risk dying war spitting tricks abnormal
OPINION
Positive: spectacular useful created prize confirm love sublime win confident mellow
Negative: weird fog distressing slam doubted fail wrong fears slippery peril
POLITICS
Positive: expert forward proud consent carol rights great worth integrity truth
Negative: ironic heinous arguing dick undo grinding outlaw meaningless theft lost
SCIENCE
Positive: fortunate cleanup essential credit safety comforting milestone learn gang dim
Negative: spill crude busted upset concern problems dark smash prize creating
SPORTS
Positive: victory won grace fun champion passion ace belief luck balance
Negative: chase shock busted beating defeat thwart lost alleged assault cockeyed
TECHNOLOGY
Positive: best fancy easy help intelligence strong improve fit trust fame
Negative: blocks shabby shy wicked rash shaky mortal grave pity unfinished
WORLD
Positive: free respected support moderate consistent prompt afford gratitude joined affluent
Negative: tension protest heavy raging slam war crime oppress poverty poor
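Once section-specific word lists like these exist, one simple way to use them is a lexicon count. The sketch below does not reproduce the model that produced the lists; it swaps in a plain word-count rule purely for illustration. The word sets are copied from the BUSINESS rows, while the function name `tone` and the example headlines are hypothetical.

```python
import re

# Word lists copied from the BUSINESS rows of the slide.
POSITIVE = {"joined", "viral", "smoothly", "better", "balance",
            "respect", "forward", "empower", "fair", "moderate"}
NEGATIVE = {"cutthroat", "axed", "lawsuit", "beating", "lose",
            "opposite", "battle", "unjust", "fuming", "sequester"}


def tone(text):
    """Return 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical headlines, invented for illustration only.
print(tone("Merger goes smoothly and brings a better balance sheet"))  # positive
print(tone("Staff axed as lawsuit looms"))                             # negative
print(tone("Quarterly report published on Tuesday"))                   # neutral
```

Articles tagged this way could then be grouped into positive and negative sets to compare how often each set is shared versus commented on.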
Contributions & Future Work
• Presented preliminary research on using computational methods for media studies
• Crawled a corpus of articles including user comments and social sharing counts from NPR over a period of three years
• Showed that sharing patterns and commenting patterns are quite different
• Showed the effects of agenda setting for 57 issues over 8 sections of NPR
• Looked at relevance and uncertainty as two dimensions to explain the varying degrees of agenda setting for different issues
• Looked at the tone of articles (pos, neg) to see whether people react differently
• Identified lots of loose ends
• Please contact me if you are interested in collaborating
DATA: AGREEMENT & SIGNIFICANCE
Calculate Fleiss' Kappa value for each data set. . p<0.1, * p<0.05, ** p<0.01, *** p<0.001, **** p<0.0001

              Set A                Set B
Group         #Person  Kappa       #Person  Kappa
Entire set    13       0.0436**    13       0.0192
20s           2        -0.225.     1        One person
30s           3        0.0228      6        0.00883
40s           2        0.0189      2        0.074
50s           4        -0.0382     3        -0.0588
60s           2        0.211.      1        One person
30s-50s       9        0.0448*     11       0.0162
Male          4        0.0158      7        0.0239
Female        9        0.0616**    6        -0.0048
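For reference, Fleiss' Kappa can be computed from a table that counts, for each item, how many raters chose each category. The sketch below is a minimal implementation of the standard formula; it treats the points of a rating scale as nominal categories and does not include the significance test behind the stars, and the slides do not say how the reported values were computed. The name `fleiss_kappa` and the example ratings are hypothetical.

```python
def fleiss_kappa(ratings):
    """Fleiss' Kappa for ratings[i][j] = number of raters who put item i in category j.

    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number (>= 2) of raters")
    n_categories = len(ratings[0])

    # Share of all assignments that fall into each category.
    p_cat = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical example: 3 issues, 13 raters each, scores 1-5 as categories.
example = [
    [1, 2, 5, 3, 2],
    [0, 1, 3, 6, 3],
    [4, 4, 3, 1, 1],
]
print(f"kappa = {fleiss_kappa(example):.4f}")
```

Values near 0, like most of those in the table, indicate agreement close to what chance alone would produce.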