Page 1: Research Using Twitter Social Media Datasouthampton.ac.uk/~lac/WebDataResearchAssistant/HowTo.pdf · 2018-08-03 · use with Twitter. 1) Install the Web Data RA browser extension


Research Using Twitter Social Media Data

Twitter is an extensive source of social media data, and unusual in that it has both an extensive data sharing infrastructure and an explicitly permissive Terms of Service. This tutorial document shows how social scientists can use Twitter as a source of research data, without any technical background or programming ability. It shows how to use commonly available data capture and analysis tools, and the kinds of research methods and investigations that they can support.

Note on Software Required for this Tutorial

To follow this tutorial, you will need to use the Chrome browser to access Twitter’s website. The data capture tool (Web Data RA) is a Chrome browser extension that interprets the Twitter Web pages and captures them as spreadsheet data that you can paste into Microsoft Excel (Windows or Mac). You will also use Gephi, an open source network analysis and visualisation application which you can download and install from gephi.org. Gephi expects a recent version of the Java Runtime Environment (JRE) to be installed on your machine.

Installing and Using Web Data RA

The Web Data RA will capture Twitter, Facebook and Google data from a browser and allow you to paste a table of information directly into a spreadsheet. This document focuses on its use with Twitter.

1) Install the Web Data RA browser extension into Chrome by visiting bit.ly/WebDataRAinChrome, and clicking on the blue “+ Add to Chrome” button. The small green icon will appear in the top right of the browser window, next to the URL bar.
2) Go to Twitter.com and create a Twitter search or display a timeline.
3) Click on the Web Data RA icon to start collecting tweets. Every five seconds the browser will automatically scroll to the bottom of the page to make Twitter load the next batch of results and copy the data automatically to the clipboard.
4) When you have collected enough results, paste the data into an Excel spreadsheet.
5) Use Excel to analyse data, or export to other programs such as Gephi or Voyant for other kinds of analysis.


Overview

When you paste data from Web Data RA you will create four tables in your spreadsheet, appearing below each other.

• The tweet data, with author, mentions, hashtags, text and counts of retweets, replies and likes broken out in separate columns.
• Account occurrence summary, a count of the number of times that each Twitter account appears in the dataset as author or a mention (including the number of retweets).
• Counts of the appearances of each hashtag.
• A table of edges of the conversational network, i.e. the number of times each pair of accounts communicate with each other.

You can use this data in various ways:

(a) The tweet data (gray) contains the basic data about each tweet: what was said, when, by whom and to whom. You can use this data to form a general overview of the communication over time and identify the most significant tweets. You can also examine specific tweets and their context by referring back to the Twitter site using each tweet’s URL.

(b) The account table (green) shows you the most active tweeters, the most frequent repliers, and the most retweeted users. This will help you see the key actors in a conversation, and the main roles that they take. You can follow up by clicking on the account names to look at the account bios and the relevant timelines of these actors to understand whether they are corporate accounts, private individuals, bots or trolls.

(c) The hashtag table (blue) shows you the most frequently used hashtags. This can help you extend your data gathering to look for more tweets relevant to your research question.

(d) The edge table (yellow) will help you to see the interactions between actors, and to understand groupings of actors and the pattern of their interaction. Is a key account dominating a conversation and talking to many others? Are the others responding, or just being passive recipients of marketing messages? Is there a group of equals having a balanced conversation with equal participation?
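For readers who do want to see how the three summary tables relate to the tweet table, the following Python sketch derives them from a few made-up rows. The field names (`author`, `mentions`, `hashtags`), the directed edges and the counting rules are assumptions for illustration only; this is not the actual Web Data RA implementation, and it does not add retweet counts to the account totals.

```python
from collections import Counter

# Hypothetical tweet rows; the field names are assumptions, not Web Data RA's column headings.
tweets = [
    {"author": "alice", "mentions": ["bob"], "hashtags": ["digitaldetox"]},
    {"author": "bob", "mentions": ["alice", "carol"], "hashtags": ["DigitalDetox", "unplug"]},
    {"author": "alice", "mentions": [], "hashtags": []},
]

def summarise(rows):
    """Return (account counts, hashtag counts, edge counts) for a list of tweet rows."""
    accounts, hashtags, edges = Counter(), Counter(), Counter()
    for row in rows:
        accounts[row["author"]] += 1                 # appearance as author
        for mentioned in row["mentions"]:
            accounts[mentioned] += 1                 # appearance as a mention
            edges[(row["author"], mentioned)] += 1   # assumed direction: author -> mentioned
        for tag in row["hashtags"]:
            hashtags[tag.lower()] += 1               # fold case so variants count together
    return accounts, hashtags, edges

accounts, hashtags, edges = summarise(tweets)
```

Calling `accounts.most_common()` lists the accounts in descending order of appearances, which corresponds to sorting the green table by its count column.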


Using Excel for Simple Quantitative Analyses

The following set of tweets (gray table) comes from a Twitter search for the phrase “Digital Detox”.

The easiest way to see an overview of a Twitter timeline is to create a PivotTable. Click on any gray cell, and choose “PivotTable” from the Insert ribbon. In the PivotTable builder, drag “Author” from the Field Name panel into the “Rows” panel, drag “Timestamp” into the “Columns” panel, and drag “Author” (again) into the “Values” panel (it will automatically turn into “Count of Author”).

The resulting PivotTable shows the dates included in the Twitter sample as green column headings, and the accounts that authored tweets as row labels along the left hand side, ordered by most prolific tweeters. The value in each cell is the number of tweets authored by a specific account on a specific day. You can adjust the formatting for convenience (I narrowed the columns and slanted the column headings to 60 degrees to fit), use the “Row Labels” control to sort by the author count (i.e. the number of tweets an author created) and show only the rows where the total author count is greater than a chosen threshold. You can also use conditional formatting to colour the cells to highlight the most extreme values.

All kinds of summaries and analyses are possible using Excel on this data, including:

• Showing the distribution of the tweet sample through time
• Identifying the most prolific and/or popular actors, and showing their activity through time
• Showing the use of individual hashtags (this might be useful in a big conversation, or one that evolves over a longer period)
• Comparing the relative proportion of contributions from different actors/hashtags
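The PivotTable described above is a count of tweets per author per day. As a rough illustration of the same aggregation outside Excel, here is a minimal Python sketch. The account names, the timestamp format and the `threshold` parameter are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical (author, timestamp) pairs; the timestamp format is an assumption.
rows = [
    ("account_a", "2018-07-01 09:15"),
    ("account_a", "2018-07-01 17:40"),
    ("account_b", "2018-07-01 12:00"),
    ("account_a", "2018-07-02 08:05"),
]

def pivot_counts(pairs, threshold=0):
    """Count tweets per author per day, keeping authors with more than `threshold` tweets."""
    table = defaultdict(Counter)
    for author, stamp in pairs:
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M").date().isoformat()
        table[author][day] += 1
    kept = {a: dict(d) for a, d in table.items() if sum(d.values()) > threshold}
    # Most prolific authors first, mirroring the sort applied in the PivotTable.
    return dict(sorted(kept.items(), key=lambda item: -sum(item[1].values())))

pivot = pivot_counts(rows)
```

Passing `threshold=1` drops accounts that tweeted only once, which plays the same role as filtering the PivotTable rows by total author count.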

All of these analyses will lead on to other questions that can be asked by going back to Twitter. The account names in the account “author and mentions” (green) table are clickable, and open the page of the account profile in your default web browser. Alternatively, to make a set of account names into clickable hyperlinks giving browser access to the user’s Twitter timeline and bio, use the following function:

    =HYPERLINK(CONCATENATE("http://twitter.com/",A9),CONCATENATE("@",A9))

where A9 is an example of a cell address that contains a Twitter account name, e.g. lescarr.

Following the account hyperlinks for the most prolific authors in the green table, we see that they are all commercial actors to one extent or another.

Account            #    Bio
ItsTimeToLogOff    30   Time To Log Off is the home of digital detox. We’re spearheading the movement to disconnect regularly from digital devices and reconnect with the world offline. We do this through collecting facts on the need for digital detox, running campaigns to get everyone off their screens and hosting retreats, events and workshops.
DinnerTableMBA     9    A commercial organisation working together to help families become more confident, successful, and self-empowered
SpareFoot          8    A storage company. We make it easy to move and store your stuff. Reserve storage for free and get your mind out of the clutter.
CultureEffect      5    Author of Digitox: How to Find a Healthy Balance for your Family’s Digital Diet
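The HYPERLINK formula given earlier in this section simply joins a fixed URL prefix to the account name and uses "@" plus the name as the visible label. A minimal Python sketch of the same string construction follows; the function name `profile_link` and the stripping of a leading "@" are assumptions added for illustration.

```python
def profile_link(account):
    """Return (url, label) for a Twitter account name, like the HYPERLINK formula."""
    name = account.strip().lstrip("@")   # tolerate names pasted with a leading "@"
    return "http://twitter.com/" + name, "@" + name

url, label = profile_link("lescarr")
```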

[Screenshot annotations for the HYPERLINK example: 1. Function uses account name; 2. Function turns account name into link; 3. One click to open Twitter account in browser.]

Using Gephi for Network Analyses

To see how the various accounts interact with each other as a network, copy and paste the yellow table into a separate spreadsheet and save it as a CSV file (call it edgetable.csv or similar). Load up the network visualisation program “Gephi”, and start a new project. In the “Data Laboratory”, choose “Import Spreadsheet” and load up the CSV data as an edge table. You can then apply a variety of network layout algorithms in the “Overview” pane.
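If you are preparing the edge table outside Excel, the CSV file can be produced with a short script. The sketch below assumes a Source, Target, Weight column layout, which Gephi's spreadsheet importer generally recognises for edge tables; check the headings your Gephi version expects. The edge data itself is made up.

```python
import csv
import io

# Hypothetical edge counts: (source account, target account) -> number of interactions.
edges = {("alice", "bob"): 3, ("bob", "carol"): 1}

def edges_to_csv(edge_counts):
    """Serialise edge counts as CSV text with Source, Target and Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edge_counts.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()

csv_text = edges_to_csv(edges)
# To save the file for import, write csv_text to edgetable.csv, e.g.:
# with open("edgetable.csv", "w", newline="") as f: f.write(csv_text)
```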

The network visualisation shows many isolated nodes (accounts) in an outer ring and a central core made up of different groups of accounts. Many of these are loosely connected “chains” of 2-6 accounts where one account has mentioned another, which has mentioned another and so on. There are more complicated subnetwork components that demonstrate more activity, as seen below.

The green component is dominated by a single corporate account (the most prolific account in this sample) whose role is to promote the idea of a digital detox and that “tweets at” many other accounts, initiating communication with them. By contrast, the red network consists of a larger group of teachers and education professionals who already participate in a larger professional network within Twitter, and who are discussing the topic of digital detox within that context.

Many summaries and analyses are possible using Gephi’s network visualisation tool:

• Showing the interaction of the network actors
• Identifying the communities and active participant subgroups within the larger sample
• Identifying the roles of different actors in the communications network
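The isolated nodes, short chains and larger groups described above are what graph theory calls connected components. As a rough sketch of how such groups can be identified from an edge list (Gephi has its own statistics tools for this; the code below is only an illustration with made-up accounts), one can walk the network from each unvisited account:

```python
from collections import defaultdict

def components(nodes, edges):
    """Group accounts into connected components, treating edges as undirected."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:                       # depth-first walk from the start account
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return sorted(groups, key=len, reverse=True)   # largest groups first

# Hypothetical accounts: one chain of three, one pair, one isolated node.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
groups = components(nodes, edges)
isolated = [g for g in groups if len(g) == 1]
```

Components of size one correspond to the outer ring of isolated accounts; the larger components are the candidates for closer reading.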


Please note that many network analyses in the literature focus on the retweet network, but that is not possible with Web Data RA, because the identity of the retweeting accounts is not available to the Web browser (the Web user interface or Web app). You have data on the number of retweets (i.e. the popularity of the tweet), but not the accounts that retweeted the original message. For more information on a supplementary service being developed to fill this gap, see the end of this document.

Using Voyant for Textual Analyses

In the gray table, copy the “Sanitised Text” column. This contains the text of all the tweets, but with all the Twitter features (@names, #hashtags, URLs) removed to leave only the English text. Go to the Voyant-Tools.org website, paste the text into the text box and press the “Reveal” button. You will see a screen with several panels that help you explore the text of the tweets in different ways.

Voyant Tools is a textual corpus analyser. It considers the data that you have entered as a single document where the individual tweets are like individual sentences or paragraphs. The Wordcloud, Reader, Trend and Concordance all analyse the text from the collection of tweets. Click on a word in the Wordcloud, and all its occurrences will be highlighted in the Reader panel, its frequency throughout the whole document (set of tweets) will be displayed in the Trend graph, and its context will be displayed in the Concordance. This helps you to investigate the use of language in the collection of tweets, and quickly understand what is being talked about and how. It also helps you to see how the language changes over time.

The first thing that this display shows is that the most common terms are digital, detox, and digital detox because they were the search terms! To ignore them, add them to the stopwords list by clicking on the “Options” icon in the Wordcloud panel’s grey icon bar. When the mouse enters the bar, a row of icons will appear. The “Options” icon is the sliding button, next to the “Help” question mark.

To edit the Stopwords list, click on the “Edit List” button and add the three extra terms to be ignored, one per line.


You will see that the panels have re-analysed the text, ignoring the terms that you have added to the stopword list.
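To make the effect of a stopword list concrete, here is a minimal Python sketch of term counting with extra stopwords. The tiny base list and the sample texts are assumptions for illustration; Voyant's built-in list is much longer and its tokenisation may differ. Because this sketch counts single words, adding "digital" and "detox" also covers the phrase "digital detox".

```python
import re
from collections import Counter

# A tiny illustrative stopword list; not Voyant's actual list.
BASE_STOPWORDS = {"a", "the", "to", "for", "is", "it", "of", "and", "i"}

def term_counts(texts, extra_stopwords=()):
    """Count lower-cased words across texts, skipping stopwords."""
    stop = BASE_STOPWORDS | {w.lower() for w in extra_stopwords}
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stop:
                counts[word] += 1
    return counts

# Hypothetical sanitised tweet texts.
texts = ["Time for a digital detox", "Is it time to unplug for a day"]
counts = term_counts(texts, extra_stopwords=["digital", "detox"])
```

Without the extra stopwords the search terms are counted like any other word and would dominate in a sample where every tweet contains them.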

Other more complex visualisations and analyses are available, including advanced Machine Learning algorithms to cluster keywords and simple graph visualisations to show keyword co-occurrence.

[Figures: StreamGraph; Dimensional Reductions of Keyword Appearance; Network of Keyword Co-occurrences]


Analysis of Twitter Language

The word cloud shows in visual form the most commonly used terms in the Twitter sample. This is great for an impression of the topics, but a more useful summary is the “Terms” panel. Using 15 of the top 20 words in this sample, we might hypothesise that a digital detox is about spending less time using a phone and the need to unplug yourself from smartphone social media technology apps; it’s a health challenge you try for a day, a week or a year.

An easy way to investigate the use of each of these words is to select them in the term panel, and scroll through the in-context use in the “Context” panel. The most commonly used word is “time”, but its main use is not in the sense of “spending too much time” or “saving time” but in the sense of an opportunity (it’s time to… or it’s time for…) and frequently as a rhetorical question (Isn’t it time for a digital detox?).

Finding tweets that speak from personal experience

Many of the tweets seem to be promotional lifestyle tweets in headline form (e.g. “Why You Should Do a Digital Detox” or “Digital Detox Benefits: Are you addicted to technology?”). To identify personal tweets from people who have tried or are thinking of trying a digital detox, search for tweets containing the personal pronoun “I”. Unfortunately, “I” is one of the Voyant Tools stoplist words and is ignored (a stoplist consists of the common but low-information words in a text including pronouns, prepositions and conjunctions). Although it is possible to edit the stoplist in Voyant, I will instead search in Excel for the string I<space> or I<apostrophe> in the text of the tweet (to match I, I am, I have, I will, I’m, I’ve, I’ll). Use the following formula

    =OR(NOT(ISERROR(FIND("I ",A1))),NOT(ISERROR(FIND("I'",A1))))

to define an extra column in the gray table, and sort the set by the results (TRUE or FALSE). Only 17% of the tweets in this collection were identified as personal by this simple rule. (Note that not every tweet with the word I is talking from a first person perspective, and many tweeters would omit “I have” or “I am” from the beginning of a tweet or text.) Of those 17%, just over half were actively planning a digital detox, were doing a digital detox or had done one. (The irony of tweeting about doing a digital detox is not lost on some commentators!)
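The Excel formula above is a case-sensitive substring test (FIND is case-sensitive). The same rule can be sketched in Python as follows; the function name and the sample tweets are made up for illustration, apart from the headline quoted earlier.

```python
def looks_personal(text):
    """True if the text contains "I " or "I'" (case-sensitive), like the Excel formula."""
    return "I " in text or "I'" in text

# Hypothetical tweets, plus one headline-style example quoted in the text above.
samples = [
    "I'm starting a digital detox tomorrow",
    "Why You Should Do a Digital Detox",
    "Digital detox? I tried it last week",
]
flags = [looks_personal(s) for s in samples]
share = sum(flags) / len(samples)   # proportion of tweets flagged as personal
```

The rule is deliberately crude. It also fires on any capitalised word ending in I followed by a space (for example "WIFI "), and it misses tweets that use a curly apostrophe (I’m) rather than a straight one, so the flagged tweets still need to be read by hand.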


It is then possible to follow up with each of the accounts that has tweeted about an actual period of digital detox that they have undertaken, to examine their Twitter activity before, during and after the detox and to identify any quantitative difference in their use of the platform.

As an example, I chose a single account that had announced that they were starting a “Digital Detox” to deprive themselves of social media for a day. I accessed all the tweets of that user over the period (using Web Data RA), and created a 1-author pivot table (as above) to aggregate the tweets to a count per day, and then plotted that data as a scatter graph with a 7-day moving average. The graph shows that although the account did not tweet on the chosen date (marked with a red cross) there is no noticeable change in tweeting behaviour after the announced detox day.
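The 7-day moving average used for the graph can be sketched in a few lines of Python. The daily counts below are invented, and the choice of a trailing window (each point averages that day and the six before it, or fewer at the start of the series) is an assumption about how the smoothing is done.

```python
def moving_average(values, window=7):
    """Trailing moving average; early points average over the days available so far."""
    averages = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages

# Hypothetical tweets-per-day counts; the 0 stands for an announced detox day.
daily_counts = [4, 6, 5, 0, 7, 5, 8, 6]
smoothed = moving_average(daily_counts)
```

A single zero-tweet day barely moves the smoothed line, which is why a lasting change in behaviour, rather than one quiet day, is what the graph would need to show.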

Additional Methods: Sentiment Analysis

Sentiment analysis can help you identify positive or negative comments in your sample. This is a popular method in industry, especially with brand management companies. However, it is academically contested, and does not have a high degree of transparency in the lexical processing. Nevertheless, to try it out, paste the “Sanitised Text” column into an online service such as sentigem.com. Consider to what extent the results seem accurate to you: how well does it identify positive and negative ‘sentiment’ in a sentence? What kinds of inaccuracies can you see? And, most importantly, despite any shortcomings, does it help you to identify any points of interest in your data for more thorough investigation?

[Pie chart: Personal Tweets about Digital Detox. Legend: Want To Do, Plan To Do, Doing, Did, Suggestions, Opinions. Slice values as extracted: 7%, 11%, 22%, 23%, 25%, 12%.]


Additional Methods: Retweet Network

A supporting service is being developed for Web Data RA to allow you to recover the retweet network around the tweets that you have collected. As previously explained, the Twitter web app does not show the retweets of each tweet, only the count of retweets. However, it is possible to request that data from the Twitter API, although it is limited to 100 retweets of each tweet. (If a tweet has been retweeted more than 100 times, those extra retweets will not be accessible; as such, the result should be considered an approximation of the real retweet network.)

To obtain the retweet data for your tweets, select the Twitter IDs for which you wish to find the retweets, and paste these into the form at http://pretend.webdataRA/retweets/. Please note that the Twitter API limits requests such that each Twitter ID that you provide will take 12 seconds to process, and the finished result may take an hour or more. You can paste the resulting data into an Excel spreadsheet, save it as CSV and import it into Gephi as previously explained.

In the Gephi network visualisation below, nodes are larger and coloured more intensely according to how much other nodes retweet them (their role as authorities in the network), but the node labels are sized according to whether they retweet or are retweeted. So you can see that some accounts are very active but not as originators of information, whereas other accounts may be less active but are more influential.
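Two small calculations follow from the description above: how long a batch of IDs is likely to take at 12 seconds each, and how often each account is retweeted versus how often it retweets. The Python sketch below illustrates both. The function names and the (retweeter, original author) pair format are assumptions for this example and do not describe the output of the supporting service.

```python
from collections import Counter

SECONDS_PER_ID = 12  # per-ID processing time stated in the text above

def estimated_minutes(tweet_id_count):
    """Rough processing time in minutes for a batch of tweet IDs."""
    return tweet_id_count * SECONDS_PER_ID / 60

def retweet_roles(retweets):
    """Count, per account, how often it is retweeted and how often it retweets."""
    retweeted, retweeting = Counter(), Counter()
    for retweeter, original_author in retweets:
        retweeting[retweeter] += 1
        retweeted[original_author] += 1
    return retweeted, retweeting

# Hypothetical (retweeter, original author) pairs.
pairs = [("bob", "alice"), ("carol", "alice"), ("bob", "dave")]
retweeted, retweeting = retweet_roles(pairs)
```

At this rate 300 IDs take about an hour. In the sample pairs, the account that is retweeted most is the more influential one, while the account that retweets most is active without originating anything.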
