
Vaccine 35 (2017) 3033–3040

Contents lists available at ScienceDirect

Vaccine

journal homepage: www.elsevier.com/locate/vaccine

Mapping information exposure on social media to explain differences in HPV vaccine coverage in the United States

http://dx.doi.org/10.1016/j.vaccine.2017.04.060
0264-410X/© 2017 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Abbreviations: HPV, human papillomavirus.
⇑ Corresponding author.

E-mail address: [email protected] (A.G. Dunn).

Adam G. Dunn a,⇑, Didi Surian a, Julie Leask b, Aditi Dey c, Kenneth D. Mandl d,e, Enrico Coiera a

a Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, NSW 2109, Australia
b School of Public Health and Sydney Nursing School, The University of Sydney, Sydney, NSW 2006, Australia
c National Centre for Immunisation Research and Surveillance, The Children’s Hospital at Westmead, Sydney 2145, NSW, Australia
d Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA 02115, United States
e Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States

Article info

Article history:
Received 8 March 2017
Received in revised form 19 April 2017
Accepted 20 April 2017
Available online 29 April 2017

Keywords:
Human papillomavirus vaccine
Social media
Content analysis
Acceptability
Vaccine refusal
Immunization coverage

Abstract

Background: Together with access, acceptance of vaccines affects human papillomavirus (HPV) vaccine coverage, yet little is known about media’s role. Our aim was to determine whether measures of information exposure derived from Twitter could be used to explain differences in coverage in the United States.
Methods: We conducted an analysis of exposure to information about HPV vaccines on Twitter, derived from 273.8 million exposures to 258,418 tweets posted between 1 October 2013 and 30 October 2015. Tweets were classified by topic using machine learning methods. Proportional exposure to each topic was used to construct multivariable models for predicting state-level HPV vaccine coverage, and compared to multivariable models constructed using socioeconomic factors: poverty, education, and insurance. Outcome measures included correlations between coverage and the individual topics and socioeconomic factors; and differences in the predictive performance of the multivariable models.
Results: Topics corresponding to media controversies were most closely correlated with coverage (both positively and negatively); education and insurance were highest among socioeconomic indicators. Measures of information exposure explained 68% of the variance in one dose 2015 HPV vaccine coverage in females (males: 63%). In comparison, models based on socioeconomic factors explained 42% of the variance in females (males: 40%).
Conclusions: Measures of information exposure derived from Twitter explained differences in coverage that were not explained by socioeconomic factors. Vaccine coverage was lower in states where safety concerns, misinformation, and conspiracies made up higher proportions of exposures, suggesting that negative representations of vaccines in the media may reflect or influence vaccine acceptance.

© 2017 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

There is a growing recognition that news and social media channels can be exploited to shape individual views through evidence and misinformation. While the value of news and social media as a population intervention to positively influence health behaviors has been examined [1,2], the impact that socially-shaped misinformation has on decision-making and health outcomes is an emerging concern. We explored a healthcare decision that appears susceptible to influence of news and social media: the choice of administering the human papillomavirus (HPV) vaccine to an adolescent.

For vaccines generally, rates of coverage are influenced by the policies and practices that make it easier or harder to access vaccines, as well as the representation of their safety and efficacy in public discourse that may affect acceptance. Socioeconomic indicators related to education, poverty, and insurance are associated with levels of immunization across countries [3,4], and these are likely to reflect differences in both access and acceptance. In countries where access to healthcare is a less important barrier, socioeconomic factors tend to display weaker correlations with immunization levels, while attitudes and beliefs play an increasingly important role [5]. This phenomenon is described as an emerging inverse relationship [6], and suggests the need to address vaccine hesitancy in public health practice [7,8]. Until now, our understanding of the relationship between the representation of vaccines in news media and acceptance has come from hypotheses generated by surveys and manual forms of media tracking [9,10].


HPV vaccines were first available in 2006, and are used to prevent the infections that can cause cervical cancers, as well as cancers of the anus, vagina, penis, and throat [11–13]. Surveys and registries have been used to estimate HPV vaccine coverage by location [14,15], but are resource intensive [16]. Around one third of females in developed countries aged 10–20 have received a full course of the vaccine [17], but there is substantial variation between and within countries, especially in the United States (US) [18,19]. Variation in both the intention to vaccinate and receipt of HPV vaccines has been linked to socioeconomic factors including income, education, race and ethnicity, and interactions with the health system [20–23].

The relationships between the representation of HPV vaccines in public discourse, vaccine acceptance, and observed vaccine coverage are complicated. Describing the reasons why HPV vaccination has fallen short of targets, the 2012 US President’s Cancer Panel report recognized that acceptance of the HPV vaccine is affected by the knowledge, attitudes, and beliefs of parents and healthcare providers [24]. Differences in exposure to information that is critical or noncritical of HPV vaccines may influence the attitudes and beliefs of the target populations, their parents, and health professionals [25]. The representation of HPV vaccine information in news and social media varies by source [26–29], and there is limited evidence to suggest that acceptance of HPV vaccines may be influenced by critical news media [30,31].

To better understand how the representation of HPV vaccines in media may influence or reflect vaccine behaviors, we asked whether populations that were disproportionately exposed to evidence, advocacy, misinformation, safety concerns, or conspiracies had higher or lower rates of HPV vaccine coverage. We chose to use Twitter to measure population-level differences in information exposure because it is the largest accessible source of information about the news and social media people read or see, and has been previously used to explain differences in health outcomes. Our aim was to determine whether state level differences in exposure to information on Twitter about HPV vaccines were associated with state level differences in HPV vaccine coverage in the US.

Fig. 1. The distribution of the 219.7 million potential exposures to HPV vaccine tweets by county. Colors are given by each county’s percentile among total exposure counts, ranging from 0 to 19.7 million (New York County, NY). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

2. Material and methods

2.1. Study data

State level coverages of HPV vaccines were extracted from the National Immunization Surveys undertaken by the Centers for Disease Control and Prevention, which included 10,084 females and 10,743 males aged 13–17 in 2014 [18], and 10,508 females and 11,367 males aged 13–17 in 2015 [19]. Measures of poverty status, racial and ethnic composition, insurance coverage, and education levels were used as proxies for state level differences in health and socioeconomic status. These data were extracted from 2014 census data produced by the US Census Bureau [32], for 50 states and the District of Columbia.

Twitter is a social media platform in which users post short messages (tweets) that may include images, videos, links to other tweets, or links to other webpages. A user's tweets are visible to the users who follow that user. We collected tweets in the period between 1 October 2013 and 30 October 2015, using the Twitter application programming interface (API) to repeatedly search for tweets that included keywords related to HPV vaccines ("Gardasil", "Cervarix", "hpv + vaccin*", "cervical + vaccin*"). Shortly after the first time a new user posted a tweet meeting the search criteria, lists of followers were collected, as well as the profile information for each of those followers. This information included self-described location or geographical coordinates. Further details are provided in the Supplementary Material (Section 1.1).
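As an illustration only, the keyword criteria above can be expressed as a local text filter. The sketch below assumes that "+" means both terms must appear and that "vaccin*" is a prefix wildcard; neither reading is confirmed in the text, and the function name is hypothetical. It is not the Twitter API query the authors ran.

```python
import re

# Assumed reading of the search terms (not confirmed by the source):
#   "Gardasil", "Cervarix"   -> either brand name alone is enough
#   "hpv + vaccin*"          -> "hpv" AND any word starting with "vaccin"
#   "cervical + vaccin*"     -> "cervical" AND any word starting with "vaccin"
BRAND = re.compile(r"\b(gardasil|cervarix)\b", re.IGNORECASE)
VACCIN = re.compile(r"\bvaccin\w*", re.IGNORECASE)
CONTEXT = re.compile(r"\b(hpv|cervical)\b", re.IGNORECASE)


def matches_hpv_vaccine_query(text: str) -> bool:
    """Return True if the text satisfies the assumed keyword criteria."""
    if BRAND.search(text):
        return True
    # Both a context word and a "vaccin*" word must be present.
    return bool(VACCIN.search(text) and CONTEXT.search(text))
```

A filter like this would keep "HPV vaccination rates are rising" and drop "flu vaccine clinic today", because the second text has no "hpv" or "cervical" term.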

2.2. Location inference

The locations of Twitter users were inferred from user profile information. On Twitter, accurate location information is available only for the small proportion of tweets (approximately one percent) that have coordinates, known as geo-tags, stored in their metadata [33]. As an alternative, around half of users describe a home location in a free text section of their user profile designed for that purpose. Nominatim, a gazetteer used to resolve locations from text, was used to extract the likely location of users from their user profiles (Supplementary Material, Section 1.3).

Fig. 2. The word distributions and correlations for the 5 most negatively correlated topics. Common words in each topic and the probability that they appear in a tweet in the topic (left), the strength of the correlation with HPV vaccine coverage (right), and an example tweet from each topic.
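The idea of resolving a free-text profile location to a state can be sketched with a toy lookup. Everything below is a hypothetical stand-in: the dictionary, its entries, and the function name are invented for illustration, and a real gazetteer such as Nominatim handles far more variation than exact normalized matches.

```python
from typing import Optional

# Hypothetical toy gazetteer for illustration; not Nominatim and not the study's data.
TOY_GAZETTEER = {
    "boston, ma": "Massachusetts",
    "new york, ny": "New York",
    "nyc": "New York",
    "little rock": "Arkansas",
}


def resolve_state(profile_location: Optional[str]) -> Optional[str]:
    """Map a self-described location string to a state, or None if unresolved."""
    if not profile_location:
        return None
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(profile_location.lower().split())
    return TOY_GAZETTEER.get(key)
```

Users whose profile text cannot be resolved return None, which mirrors the fact that only part of the exposed audience could be localized.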

2.3. Topic modeling

Topic modeling is a common method for grouping documents that are semantically similar based on word level differences [34], and the approach has been used for applications in social media and public health [35]. The approach is a clustering method that takes the number of topics as a parameter, and produces a model that can be used to assign each tweet to a single topic. More details of the method and previous implementations of topic modeling using social media data for public health applications are provided in the Supplementary Material (Section 1.2).

Each tweet was assigned to one of 31 topics. Topic exposures for each state were estimated by counting the proportion of potential exposures in that state that came from the tweets from that topic, producing a value between 0 and 1 for each of the 51 by 31 state-topic combinations.
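A minimal sketch of the proportional exposure calculation follows. It assumes the input is already a list of (state, topic, exposure count) records; that record layout and the function name are illustrative assumptions, not the authors' data format.

```python
from collections import defaultdict


def topic_exposure_proportions(exposures):
    """Compute each topic's share of a state's total potential exposures.

    exposures: iterable of (state, topic, count) tuples (assumed layout).
    Returns a dict mapping (state, topic) to a proportion between 0 and 1.
    """
    state_totals = defaultdict(float)
    state_topic_totals = defaultdict(float)
    for state, topic, count in exposures:
        state_totals[state] += count
        state_topic_totals[(state, topic)] += count
    # Divide each state-topic total by the total for that state.
    return {
        (state, topic): total / state_totals[state]
        for (state, topic), total in state_topic_totals.items()
        if state_totals[state] > 0
    }
```

With 51 states and 31 topics, the result would hold up to 51 × 31 values, and the proportions within each state sum to 1.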

2.4. Multivariable model construction and evaluation

Following a similar approach to that of Eichstaedt et al. [36], multivariable models of vaccine coverage were constructed using combinations of predictors from the set of topic exposures and the census-derived socioeconomic factors. The socioeconomic factors from the US Census Bureau data included the proportions of people in each state who were African American; Hispanic of any race; living below poverty levels; had high school or bachelor-degree education; or had health insurance.

Fig. 3. The state level proportional topic exposures for Topic 9 (the Katie Couric controversy). Topic 9 exposures made up between 3.79% (Rhode Island) and 11.1% (Arkansas) of exposures in each state and exhibited a strong negative correlation with HPV vaccine coverage.

Elastic net regression was used to select and weight variables in the models in an attempt to produce robust and parsimonious models [37]. Fivefold cross-validation was used to measure performance in out-of-sample data to avoid bias due to over-fitting and the final models were constructed by averaging the coefficients across the five rounds (Supplementary Material Section 1.4). The outcome measures for the performance of the models were the r-squared values and the correlation coefficients (Pearson’s r) produced by comparing the logit-transformed values for HPV vaccine coverage from the National Immunization Surveys with the values predicted by the models. In total there were 20 models constructed across 4 series (males and females; in 2014 and 2015) of 5 combinations of predictors (insurance/poverty predictors, education predictors, all socioeconomic and health predictors, only Twitter-based predictors, all predictors), each using 50 or 51 units of analysis (one observation was unavailable in one survey). Performance was reported by the r-squared values, the correlation coefficients and their 95% confidence intervals, and the numbers of predictors used in the models.
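The outcome measures named above can be sketched in a few lines. The logit transform and Pearson's r are standard; the confidence interval below uses the Fisher z-transformation, which is an assumption here because the text does not state how the intervals were computed. The elastic net fit itself is not reproduced, and all function names are illustrative.

```python
import math


def logit(p: float) -> float:
    """Logit transform of a coverage proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def pearson_r(xs, ys) -> float:
    """Pearson correlation between observed and predicted values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r: float, n: int, z_crit: float = 1.959964):
    """Approximate 95% CI for r via the Fisher z-transformation (assumed method)."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def r_squared(r: float) -> float:
    """Squared correlation, one common definition of r-squared."""
    return r ** 2
```

For r = 0.79 and n = 51 states, fisher_ci returns roughly (0.66, 0.88), which is in line with the interval reported for the female one-dose Twitter-only model in Table 1.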

3. Results

3.1. Distribution of topic exposures within the US

A total of 258,418 tweets about HPV vaccines were identified by repeated searches using the Twitter API. The total number of public Twitter users (globally) that may have been exposed to one or more of these tweets was 109.2 million. From this cohort, 31.4% (34.3 million) were identified within the US and 30.0% (32.8 million) could be localized at the state level (Supplementary Material, Section 2.1).

From a global total of 1.33 billion exposures to HPV vaccine tweets, 291.8 million potential exposures were associated with the 34.3 million users located in the US, of which 93.8% (273.8 million) could be localized to the state level, and 75.3% (219.7 million) were associated with users localized to the county level (Fig. 1). Total exposure counts ranged from 389,982 in Wyoming to 41.3 million in California. The ratio of exposures to population (also from the US Census Bureau) was highest for the District of Columbia, New York, Rhode Island, Massachusetts, Nevada, and Colorado (Supplementary Material Section 2.2).
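The reported percentages follow directly from the counts in the text, as the short check below shows. The helper name is illustrative; the figures are those stated above, in millions.

```python
def share(part: float, whole: float) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Users (millions): 109.2 exposed globally, 34.3 in the US, 32.8 localized to a state.
us_user_share = share(34.3, 109.2)
state_user_share = share(32.8, 109.2)

# Exposures (millions): 291.8 in the US, 273.8 at state level, 219.7 at county level.
state_exposure_share = share(273.8, 291.8)
county_exposure_share = share(219.7, 291.8)
```

These reproduce the 31.4%, 30.0%, 93.8%, and 75.3% figures quoted in this section.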

The topics varied between generally positive topics describing new evidence or advocating for the use of HPV vaccines; mixed topics debating mainstream news media stories; and negative topics describing safety concerns, conspiracies, and politics (Fig. 2). A mainstream news media story related to a television program had the strongest overall negative correlations with coverage in females and males, and its highest proportional exposures were found in states with lower coverage (Fig. 3). Generally positive topics (describing evidence or advocating for the use of the vaccine) tended to reach much larger audiences than topics related to safety concerns or conspiracies (Fig. 4). A newspaper story that was retracted and replaced with evidence-based responses had the strongest overall positive correlations with coverage in males, and its highest proportional exposures were found in states with higher coverage (Fig. 5).

3.2. Associations with HPV vaccine coverage

The percentages of the population without a bachelor-level education, a high school education, and health insurance were most closely correlated with the proportions of females and males aged 13–17 in each state who had received at least one dose of HPV vaccine. No other socioeconomic factors were found to be significantly correlated with coverage. Among the topic exposure factors, the topics that exhibited consistently high correlations with HPV vaccine coverage across both males and females were related to mainstream news media events and issues, safety concerns, and conspiracies. Positive topics (reporting evidence and advocacy) had weaker correlations with coverage.

Fig. 4. The word distributions and correlations for the 5 most positively correlated topics. Common words in each topic and the probability that they appear in a tweet in the topic (left), the strength of the correlation with HPV vaccine coverage (right), and an example tweet from each topic.

Models constructed using only socioeconomic factors explained 40% and 42% of the variance in initiation (at least one dose) for females and males, respectively, while models constructed using only topic exposure factors explained 63% and 68% of the variance in initiation (Table 1).

The differences between models that used socioeconomic factors and models that used topic exposure factors were more pronounced for initiation compared to completion (Fig. 6). When modeling initiation, the topic exposure models were able to explain an additional 25% of the variance in coverage compared to the models based on socioeconomic factors, for both females and males (Table 1). The increases for models of completion were lower: 17% and 12% in females and males, respectively.

When the socioeconomic factors and the topic exposure factors were combined, they marginally improved models in comparison to those that used topic exposure alone (Fig. 6). Models that combined socioeconomic factors and topic exposure factors explained ±1% of the variance in coverage compared to the models that only used topic exposure factors (Table 1).

4. Discussion

The results showed a strong correlation between HPV vaccine coverage and exposure to certain topics on Twitter, often negative topics that expressed concerns about safety or related to conspiracies. While socioeconomic and health factors are expected to capture differences in access to healthcare, proxies for information diets derived from social media data account for more of the variance in coverage, and thus appear to capture acceptance in a way that socioeconomic factors do not. The results also showed that HPV vaccine coverage is most closely correlated with topics that were covered in mainstream news media, suggesting a relationship between the quality of information in broader public discourse and vaccine acceptance.

Other studies that have measured differences in the rates of coverage across populations have found that coverage varies by race and ethnicity, income, education, and interactions with the health system [20–22]. Our results were consistent with these studies in relation to insurance, education, and poverty: correlations were identified between HPV vaccine coverage reported in the National Immunization Survey in 2014 and 2015, and education, income, and insurance data from the US Census in 2014. The results were not aligned with these studies in relation to race or ethnicity: we found no significant correlations between HPV vaccine coverage reported in the National Immunization Survey in 2014 and 2015 and the proportions of Black and African American people or the proportion of Hispanic people of any race from the US Census. However, because our models were state level aggregates and previous studies were conducted within individual states and considered patient-level receipts of the HPV vaccine, the results are not directly comparable.

Fig. 5. The state level proportional topic exposure for Topic 19 (the Toronto Star controversy). Topic 19 exposures made up between 1.74% (Rhode Island) and 5.12% (District of Columbia) of exposures in each state and exhibited a strong positive correlation with HPV vaccine coverage.

Table 1. Correlations between model predictions and HPV vaccine coverage (2015 National Immunization Survey).

Models | Number of predictors | Predictors used | Pearson’s R [95% CI] | p-value | R²

Female aged 13–17, one dose
Insurance and poverty | 2 | 1 | 0.48 [0.24–0.67] | 3.75 × 10⁻⁴ | 0.23
Education predictors only | 2 | 2 | 0.51 [0.27–0.69] | 1.43 × 10⁻⁴ | 0.26
All socioeconomic/health | 6 | 4 | 0.63 [0.43–0.77] | 7.40 × 10⁻⁷ | 0.40
Twitter only | 31 | 22 | 0.79 [0.66–0.88] | 4.07 × 10⁻¹² | 0.63
All predictors | 37 | 22 | 0.82 [0.71–0.90] | 1.07 × 10⁻¹³ | 0.68

Male aged 13–17, one dose
Insurance and poverty | 2 | 2 | 0.53 [0.30–0.70] | 7.01 × 10⁻⁵ | 0.28
Education predictors only | 2 | 2 | 0.58 [0.37–0.74] | 6.53 × 10⁻⁶ | 0.34
All socioeconomic/health | 6 | 5 | 0.64 [0.45–0.78] | 3.44 × 10⁻⁷ | 0.42
Twitter only | 31 | 10 | 0.82 [0.71–0.90] | 1.33 × 10⁻¹³ | 0.68
All predictors | 37 | 18 | 0.82 [0.70–0.89] | 2.07 × 10⁻¹³ | 0.67

Female aged 13–17, three doses
Insurance and poverty | 2 | 2 | 0.62 [0.42–0.77] | 1.15 × 10⁻⁶ | 0.39
Education predictors only | 2 | 2 | 0.63 [0.43–0.77] | 6.07 × 10⁻⁷ | 0.40
All socioeconomic/health | 6 | 4 | 0.75 [0.59–0.85] | 2.99 × 10⁻¹⁰ | 0.56
Twitter only | 31 | 11 | 0.78 [0.64–0.87] | 1.45 × 10⁻¹¹ | 0.61
All predictors | 37 | 20 | 0.82 [0.70–0.89] | 2.49 × 10⁻¹³ | 0.67

Male aged 13–17, three doses
Insurance and poverty | 2 | 2 | 0.44 [0.18–0.64] | 1.30 × 10⁻³ | 0.19
Education predictors only | 2 | 2 | 0.47 [0.22–0.66] | 5.51 × 10⁻⁴ | 0.22
All socioeconomic/health | 6 | 5 | 0.53 [0.30–0.70] | 6.88 × 10⁻⁵ | 0.28
Twitter only | 31 | 10 | 0.71 [0.54–0.82] | 5.07 × 10⁻⁹ | 0.51
All predictors | 37 | 14 | 0.70 [0.53–0.82] | 8.01 × 10⁻⁹ | 0.50

A previous study used Twitter data to predict county-level heart disease mortality by sampling and counting tweets without searching for terms related to the application domain [36]. Rather than counting the number of tweets in a topic, we examined exposure by collecting data about Twitter followers, and only used tweets that were directly relevant to HPV vaccines. While the results are not directly comparable, both studies identified correlations that matched expectations and prior evidence: topics related to negative emotions like anger and stress were positively associated with heart disease mortality; and here topics related to harms, side effects, and conspiracies were negatively associated with HPV vaccine coverage.

Fig. 6. Correlations between the HPV vaccine coverages for initiation (at least one dose) and completion (three doses). Coverage data are from the 2015 National Immunization Survey, and coverages estimated by each of five models including combinations of socioeconomic factors and topic exposures; with 95% confidence intervals.

Our methods may be used to generate further hypotheses about why people who choose not to vaccinate tend to be geographically clustered [10,38–40]. While opinions are known to cluster within communities through homophily and contagion, there has been little research previously to examine how the effect of external influences such as news and social media may be localized by community and location. The results also help reveal the limits and biases implicit in sampling information exposure via social media and suggest that even without accounting for sampling biases, proxies for information diets derived from Twitter data may be used as a signal of health behaviors that are otherwise difficult or costly to measure.

Routine systems that monitor exposure to relevant topics could be constructed to identify locations where misinformation or low-quality evidence is over-represented in the news media, and guide public health interventions to amplify high-quality evidence and guide social media interventions [2]. More precise identification of the safety concerns, misinformation, and conspiracies that are important to a segment of the public could provide a basis from which to construct targeted and cost-effective news and social media based interventions.

4.1. Limitations

First, Twitter users may not be representative of the general population and some Twitter users represent entities other than individuals, including organizations and non-human users. However, the relationship is measured between the representation of evidence and misinformation in the public domain and vaccination coverage, rather than an individual-level prediction of vaccine decision-making by population. Second, the method also relied on self-reported locations from Twitter users, which captures around half of the users that may have been exposed to evidence and misinformation. This is a standard approach but it could be improved through location and demographic inference methods (see Supplementary Material). Third, the study did not consider temporal patterns in exposure to topics corresponding with changes in coverage between 2014 and 2015, which limits the ability to conclude whether changes in information exposure were lagging or leading indicators of decision-making in the population. Fourth, recommendations for two dose models were introduced in 2016 and we considered survey data for one dose and three doses in 2014 and 2015, relating those to initiation and completion. Future epidemiological analyses would need to consider these changes. Finally, socioeconomic factors were selected based on a literature review of the evidence about observed differences in recipients of vaccines but there may be other measurable socioeconomic and health factors correlated with HPV vaccine coverage that were not included in the analysis.


5. Conclusions

Measures of exposure to HPV-related tweets explained more of the variance in state-level HPV vaccine coverage than was explained by socioeconomic factors. Our study suggests that in states where negative opinions about HPV vaccines are popularized by mainstream media, the coverage is often lower than would be expected by socioeconomic differences alone. These differences suggest that news media may reflect, amplify, or influence vaccine acceptance, and that measures of information exposure derived from Twitter may be surrogate indicators for localized differences in acceptance.

Conflicts of interest

No conflicts of interest.

Funding

This work was supported by the National Health and Medical Research Council (Grant No. APP1128968). The NHMRC played no role in the study design, data collection, analysis and interpretation, the writing of the report, or the decision to submit the article for publication.

Contributions

AGD, DS, JL, AD, KM, EC designed the study; AGD collected the data; AGD, DS undertook the analysis; AGD, JL, KM, AD interpreted the results; AGD drafted the article; AGD, DS, JL, AD, KM, EC critically revised the article for important intellectual content. All authors have approved the final article.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.vaccine.2017.04.060. These data include Google maps of the most important areas described in this article.

References

[1] Wakefield MA, Loken B, Hornik RC. Use of mass media campaigns to change health behaviour. The Lancet 2010;376(9748):1261–71.

[2] Coiera E. Social networks, social media, and social diseases. BMJ 2013;346:f3007.

[3] de Figueiredo A, Johnston IG, Smith DMD, et al. Forecasted trends in vaccination coverage and correlations with socioeconomic factors: a global time-series analysis over 30 years. The Lancet Global Health 2016;4(10):e726–35.

[4] Tauil MdC, Sato APS, Waldman EA. Factors associated with incomplete or delayed vaccination across countries: a systematic review. Vaccine 2016;34(24):2635–43.

[5] Leask J. Target the fence-sitters. Nature 2011;473(7348):443–5.

[6] Larson HJ, de Figueiredo A, Xiahong Z, et al. The state of vaccine confidence 2016: global insights through a 67-country survey. EBioMedicine 2016;12:295–301.

[7] van Boven M, van Lier A. Global trends in vaccination coverage. The Lancet Global Health 2016;4(10):e670–1.

[8] Leask J, Willaby HW, Kaufman J. The big picture in addressing vaccine hesitancy. Human Vaccines & Immunotherapeutics 2014;10(9):2600–2.

[9] Larson H, Schulz W, Tucker J, et al. Measuring vaccine confidence: introducing a global vaccine confidence index. PLoS Currents 2015;1. currents.outbreaks.ce0f6177bc97332602a8e3fe7d7f7cc4.

[10] Larson HJ, Jarrett C, Eckersberger E, et al. Understanding vaccine hesitancy around vaccines and vaccination from a global perspective: a systematic review of published literature, 2007–2012. Vaccine 2014;32(19):2150–9.

[11] Forman D, de Martel C, Lacey CJ, et al. Global burden of human papillomavirus and related diseases. Vaccine 2012;30(Supplement 5):F12–23.

[12] Gertig D, Brotherton J, Budd A, et al. Impact of a population-based HPV vaccination program on cervical abnormalities: a data linkage study. BMC Med 2013;11(1):227.

[13] Chow EPF, Danielewski JA, Fehler G, et al. Human papillomavirus in young women with Chlamydia trachomatis infection 7 years after the Australian human papillomavirus vaccination programme: a cross-sectional study. Lancet Infect Dis 2015;15(11):1314–23.

[14] Laemmle-Ruff I, Barbaro B, Brotherton J. Human papillomavirus vaccine national catch-up program: insights into under-notification. Aust Fam Physician 2013;42:880–4.

[15] Brotherton JML, Liu B, Donovan B, et al. Human papillomavirus (HPV) vaccination coverage in young Australian women is higher than previously estimated: independent estimates from a nationally representative mobile phone survey. Vaccine 2014;32(5):592–7.

[16] Cutts FT, Claquin P, Danovaro-Holliday MC, et al. Monitoring vaccination coverage: defining the role of surveys. Vaccine 2016;34(35):4103–9.

[17] Bruni L, Diaz M, Barrionuevo-Rosas L, et al. Global estimates of human papillomavirus vaccination coverage by region and income level: a pooled analysis. The Lancet Global Health 2016;4(7):e453–63.

[18] Reagan-Steiner S, Yankey D, Jeyarajah J, et al. National, regional, state, and selected local area vaccination coverage among adolescents aged 13–17 years—United States, 2014. Morb Mortal Wkly Rep 2015;64(29):784–92.

[19] Reagan-Steiner S, Yankey D, Jeyarajah J, et al. National, regional, state, and selected local area vaccination coverage among adolescents aged 13–17 years—United States, 2015. Morb Mortal Wkly Rep 2016;65:850–8.

[20] Moss JL, Reiter PL, Brewer NT. Correlates of human papillomavirus (HPV) vaccine coverage: a state-level analysis. Sex Transm Dis 2015;42(2):71–5.

[21] Tsui J, Rodriguez HP, Gee GC, et al. Are HPV vaccination services accessible to high-risk communities? A spatial analysis of HPV-associated cancer and Chlamydia rates and safety-net clinics. Cancer Causes Control 2013;24(12):2089–98.

[22] Fisher H, Trotter CL, Audrey S, et al. Inequalities in the uptake of human papillomavirus vaccination: a systematic review and meta-analysis. Int J Epidemiol 2013;42(3):896–908.

[23] Polonijo AN, Carpiano RM, Reiter PL, et al. Socioeconomic and racial-ethnic disparities in prosocial health attitudes: the case of human papillomavirus (HPV) vaccination for adolescent males. J Health Soc Behav 2016;57(3):390–406.

[24] National Cancer Institute. Accelerating HPV vaccine uptake: urgency for action to prevent cancer. A report to the President of the United States from the President’s Cancer Panel. Bethesda, MD; 2014.

[25] Leask J. Media researchers must understand the audience too. Eur J Public Health 2016;26(4):534–5.

[26] Hilton S, Hunt K, Langan M, et al. Newsprint media representations of the introduction of the HPV vaccination programme for cervical cancer prevention in the UK (2005–2008). Soc Sci Med 2010;70(6):942–50.

[27] Madden K, Nan X, Briones R, et al. Sorting through search results: a content analysis of HPV vaccine information online. Vaccine 2012;30(25):3741–6.

[28] Krieger JL, Katz ML, Eisenberg D, et al. Media coverage of cervical cancer and the HPV vaccine: implications for geographic health inequities. Health Expect 2013;16(3):e1–e12.

[29] Fu LY, Zook K, Spoehr-Labutta Z, et al. Search engine ranking, quality, and content of web pages that are critical versus noncritical of human papillomavirus vaccine. J Adolesc Health 2016;58(1):33–9.

[30] Sotiriadis A, Dagklis T, Siamanta V, et al. Increasing fear of adverse effects drops intention to vaccinate after the introduction of prophylactic HPV vaccine. Arch Gynecol Obstet 2012;285(6):1719–24.

[31] Larson H. The world must accept that the HPV vaccine is safe. Nature 2015;528(7580):9.

[32] US Census Bureau. American Community Survey 1-Year Estimates. <http://factfinder2.census.gov>.

[33] Sloan L, Morgan J. Who tweets with their location? Understanding the relationship between demographic characteristics and the use of geoservices and geotagging on Twitter. PLoS ONE 2015;10(11):e0142209.

[34] Blei DM. Probabilistic topic models. Commun ACM 2012;55(4):77–84.

[35] Surian D, Nguyen DQ, Kennedy G, et al. Characterizing Twitter discussions about HPV vaccines using topic modeling and community detection. J Med Internet Res 2016;18(8):e232.

[36] Eichstaedt JC, Schwartz HA, Kern ML, et al. Psychological language on Twitter predicts county-level heart disease mortality. Psychol Sci 2015;26:159–69.

[37] Zou H, Hastie T. Regularization and variable selection via the elastic net. J Roy Statis Soc: Ser B (Stat Method) 2005;67(2):301–20.

[38] Atwell JE, Van Otterloo J, Zipprich J, et al. Nonmedical vaccine exemptions and pertussis in California, 2010. Pediatrics 2013;132(4):624–30.

[39] Lieu TA, Ray GT, Klein NP, et al. Geographic clusters in underimmunization and vaccine refusal. Pediatrics 2015;135(2):280–9.

[40] Omer SB, Richards JL, Ward M, et al. Vaccination policies and rates of exemption from immunization, 2005–2011. N Engl J Med 2012;367(12):1170–1.