SNAP: Social Media and Network Analytics for Public Health
• Henry Kautz
  • Goergen Institute for Data Science, University of Rochester
• Idea
  • Social media users = distributed sensor network
• Challenge
  • Discovering signal in noise
• Method
  • Machine learning
• Applications
  • Tracking influenza
  • Measuring alcohol use
  • Improving food safety
Figure: time-lapse heat map of tweets from NYC
Analyzing Tweets
• Goal: find rare tweets about specific disease symptoms (1 in 50,000)
• Previous approach: keywords
  • Problems: “sick of homework”, “under the weather”
• Our approach: machine learning
  • Use “Mechanical Turk” workers to create training data
  • 98% accuracy
Diagram labels: Training Data; Machine Learning System; Sick Tweets; example features: contains “sneeze”? “sick”? “tired”?
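The contrast between keyword matching and a learned classifier can be sketched in a few lines. This is only an illustration: the slides do not name the model used, so a small Naive Bayes classifier is swapped in here, and the training sentences, labels, and function names are all hypothetical stand-ins for the crowd-labeled data.

```python
import math
import re
from collections import Counter

# Hypothetical hand-labeled examples (1 = about being ill, 0 = not),
# standing in for crowd-sourced training data.
TRAIN = [
    ("i feel sick and tired with a fever", 1),
    ("cannot stop sneezing and coughing today", 1),
    ("feeling under the weather with a sore throat", 1),
    ("stuck in bed with the flu and a headache", 1),
    ("so sick of homework tonight", 0),
    ("sick of this traffic every morning", 0),
    ("that concert was sick", 0),
    ("tired of homework and exams", 0),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_match(text, keywords=("sick", "sneeze", "tired")):
    """Keyword baseline: flag a tweet if any token is a keyword."""
    return int(any(tok in keywords for tok in tokenize(text)))


def train_naive_bayes(examples):
    """Collect per-class word counts, document counts, and the vocabulary."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return 1 if the 'ill' class has the higher log-probability, else 0."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return int(scores[1] > scores[0])


model = train_naive_bayes(TRAIN)
for tweet in ("sick of homework", "under the weather"):
    print(tweet, "| keyword:", keyword_match(tweet), "| learned:", predict(model, tweet))
```

On this toy data the keyword baseline flags “sick of homework” and misses “under the weather”, while the learned model gets both right because it weighs the surrounding words.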
Twitterhealth
Using a “flu symptom” classifier, we can:
• Measure flu levels accurately and quickly
• Predict risk of particular users catching the flu
• Discover correlations between flu and factors such as air pollution
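One simple way to turn classifier output into a flu level and a correlation is sketched below. It assumes each tweet has already been labeled sick or not sick, uses the daily fraction of sick tweets as the flu level, and uses the Pearson coefficient as the correlation measure. The function names and the numbers are made up for illustration and are not results from the project.

```python
import math
from collections import defaultdict


def daily_sick_rate(labeled_tweets):
    """Map each day to the fraction of its tweets flagged as sick.

    labeled_tweets: iterable of (day, is_sick) pairs.
    """
    totals = defaultdict(int)
    sick = defaultdict(int)
    for day, is_sick in labeled_tweets:
        totals[day] += 1
        sick[day] += int(is_sick)
    return {day: sick[day] / totals[day] for day in totals}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up classifier output: (day, 1 if flagged sick else 0).
tweets = [
    ("mon", 1), ("mon", 0), ("mon", 0), ("mon", 0),
    ("tue", 1), ("tue", 1), ("tue", 0), ("tue", 0),
    ("wed", 1), ("wed", 1), ("wed", 1), ("wed", 0),
]
# Made-up air pollution index for the same days.
pollution = {"mon": 10, "tue": 20, "wed": 30}

rates = daily_sick_rate(tweets)
days = sorted(rates)
r = pearson([rates[d] for d in days], [pollution[d] for d in days])
print(rates, r)
```

A real analysis would also need to control for tweet volume, seasonality, and location before reading anything into such a coefficient.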
nEmesis
• Listen to tweets to find possible cases of food poisoning
• Use results to prioritize restaurant inspections
• 3-month trial in Las Vegas: proved effective in double-blind trial
• CDC funding 5-year expansion
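A minimal sketch of the prioritization step, assuming tweets have already been linked to a restaurant and flagged by a classifier. The slides do not describe the actual scoring rule, so ranking by the number of distinct users with a flagged tweet is an assumption made here, as are the function name and the sample records.

```python
from collections import defaultdict


def rank_restaurants(reports):
    """Rank restaurants by the number of distinct users with a flagged tweet.

    reports: iterable of (restaurant_id, user_id, flagged) tuples.
    Returns a list of (restaurant_id, user_count), highest count first.
    """
    users = defaultdict(set)
    for restaurant_id, user_id, flagged in reports:
        if flagged:
            # A set counts each user once, however many tweets they post.
            users[restaurant_id].add(user_id)
    counts = [(rid, len(uids)) for rid, uids in users.items()]
    # Sort by count descending, then by id for a stable order on ties.
    return sorted(counts, key=lambda item: (-item[1], item[0]))


# Hypothetical records: (restaurant, user, flagged as possible food poisoning).
reports = [
    ("A", "u1", True),
    ("A", "u1", True),
    ("A", "u2", True),
    ("B", "u3", True),
    ("C", "u4", False),
]
ranking = rank_restaurants(reports)
print(ranking)
```

Counting distinct users rather than tweets keeps one prolific poster from pushing a venue to the top of the inspection list.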
GeoDrink
• Understanding patterns of alcohol use in communities
• Infer locations of users’ homes and the exact time and place of drinking
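Home inference can be illustrated with a common heuristic: take the place a user tweets from most often at night. The slides do not say which method GeoDrink uses, so this is a stand-in. The night window, the grid size, and the sample points are all assumptions.

```python
from collections import Counter


def infer_home(points, night_start=22, night_end=6, cell=0.01):
    """Guess a home location as the most-visited grid cell at night.

    points: iterable of (hour, lat, lon), with hour in 0..23 local time.
    Returns the (lat, lon) of the cell center, or None if there are no
    night-time points.
    """
    counts = Counter()
    for hour, lat, lon in points:
        if hour >= night_start or hour < night_end:
            # Snap coordinates to a grid so nearby points share a cell.
            counts[(round(lat / cell), round(lon / cell))] += 1
    if not counts:
        return None
    (i, j), _ = counts.most_common(1)[0]
    return (round(i * cell, 6), round(j * cell, 6))


# Hypothetical geotagged tweets for one user: (hour, lat, lon).
points = [
    (23, 43.1561, -77.6109),  # night
    (2, 43.1563, -77.6111),   # night, same cell as above
    (14, 43.1300, -77.6300),  # daytime, ignored
    (1, 43.2000, -77.5000),   # a single night elsewhere
]
home = infer_home(points)
print(home)
```

With a home estimate in hand, a tweet classified as drinking-related can be compared against it to distinguish drinking at home from drinking elsewhere.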