Sreejata Chatterjee ([email protected])
Faculty of Computer Science, Dalhousie University, Halifax, Canada

Footnotes
[1] Mashable Social Media: http://mashable.com/2011/09/08/twitter-has-100-million-active
[2] Social Media Lab: http://socialmedialab.ca/?p=1952
[3] Wired.com: http://www.wired.com/wiredscience/2010/10/twitter-crystal-ball
[4] Radian6: Social Media Monitoring and Engagement, Social CRM

Introduction
Huge amounts of real-time social media data are created every moment. For example, ~230 million tweets are posted daily by Twitter’s 200 million users [1]. If harnessed, these data can provide a wealth of insight into what people are thinking about and what they like or dislike. Twitter data have already proven useful in a range of contexts, from monitoring elections [2] to predicting stock market trends [3] to conducting brand monitoring and PR campaigns [4]. However, social media data tend to be noisy and ephemeral. Furthermore, social media companies often limit the amount of data that can be accessed automatically at any one time, making this rich source of transient data difficult to collect.
Research Objectives
This work focuses on designing and developing automated methods and a web-based infrastructure that help other researchers and developers collect and process raw social media data by:
(1) Creating a Data Collector and Repository Tool that collects and stores public Twitter data for a specified group of online users effectively and efficiently,
(2) Connecting open APIs via Web Services that process Twitter data to add value and richness to the data in our database, such as geo-coding or assigning “influence” scores to Tweeters,
(3) Creating an NLP (Natural Language Processing) Module that can conduct sentiment analysis on social media data,
(4) Providing a robust API that other developers can use to create and test innovative web applications with the collected data.

Acknowledgements
I would like to thank Dr. Anatoliy Gruzd, Director of the Social Media Lab, for supervising this research. Additionally, I would like to thank Philip Mai, Research Manager at the Social Media Lab, for his valuable feedback.

Figure: System Architecture for Handling Social Media Data

The API provides the following calls:
getAllTweet - Returns all the tweets by all the users
getUserTweets - Returns tweets posted by a specified user
getTimedUserTweets - Returns tweets within a time interval
getUserProfilePicUrl - Returns user’s profile picture
getUserDetails - Returns detailed user information
getUserTimeLineInfo - Returns basic user information
API calls are made via HTTP requests (see below). The output is formatted in JSON (JavaScript Object Notation).
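A minimal sketch of how a client might issue such a call from Python and decode the JSON output. The parameter names (call, user, seedUserList, startTime, endTime) are the ones that appear in the sample calls on this poster, and URL_BASE is the poster's own placeholder for the host. The helper names build_api_url and fetch_json are hypothetical, and because the JSON response schema is not documented here, the sketch simply returns the decoded object.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder host copied from the sample calls; the real host is not given.
URL_BASE = "http://URL_BASE/tweetApiCalls.php"


def build_api_url(call, base=URL_BASE, **params):
    """Build the request URL for one API call (hypothetical helper)."""
    query = {"call": call}
    query.update(params)
    # Keep commas literal so seedUserList=a,b matches the sample call format.
    return base + "?" + urlencode(query, safe=",")


def fetch_json(url, timeout=30):
    """Send the HTTP GET request and decode the JSON body (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the URLs; no request is sent to the placeholder host.
    print(build_api_url("getUserDetails", user="dalprof"))
    print(build_api_url(
        "getAllTweets",
        seedUserList="asist2011,asist_org",
        startTime="2012-02-14",
        endTime="2012-04-14",
    ))
```

Separating URL construction from the network request keeps the query-building logic easy to check without a live server; fetch_json would only work once URL_BASE is replaced with a real host.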
Sample API Calls
1) Gets all tweets posted between Feb 14 and April 14, 2012, by all of the users who follow “asist2011” and “asist_org”:
http://URL_BASE/tweetApiCalls.php?call=getAllTweets&seedUserList=asist2011,asist_org&startTime=2012-02-14&endTime=2012-04-14
2) Returns details about dalprof’s profile, such as profile info, followers, friends, Klout score (influence score), and geocoded location (for easy and universal location identification):
http://URL_BASE/tweetApiCalls.php?call=getUserDetails&user=dalprof

GRAND Projects:
• DINS - Digital Infrastructures: Access and Use in the Network Society
• NAVEL - Network Assessment and Validation for Effective Leadership
Netlytic – a system for automated discovery, analysis and visualization of information about online communities, being developed by Dr. Gruzd at the Dalhousie University Social Media Lab.

Case Studies #1: AcademiaMap.com
The API developed as part of this project is currently being used in several applications for a system called AcademiaMap, an Online Influence Assessment App designed for scholars.

AcademiaMap-Dashboard App
AcademiaMap helps scholars filter the “noise” from their Twitter streams using various “influence” metrics and gives them an easy way to identify trending topics and interesting voices to follow on Twitter. (Lead developer: Melissa Anez)

AcademiaMap-GeoVisualizer App
A geo-based visualization system that displays communication connections between scholarly users of Twitter from across the globe. (Lead developer: Jamiur Rahman)

AcademiaMap - Twitter App
A Twitter app that automatically posts tweets about trending topics and re-posts tweets that are popular within a group of scholarly Twitter users. (Lead developer: Sreejata Chatterjee)

Case Studies #2: Netlytic.org
As a proof of concept, the new NLP Module, based on the Natural Language ToolKit (NLTK), has been added to an existing web tool called Netlytic, giving it the ability to provide sentiment analysis.

Sentiment Analysis of >70K Tweets about #OccupyWallStreet
Example 1: A Visual Representation of the Sentiment Analysis made possible by the new NLP Module now available in Netlytic
Example 2: Tag Cloud of Top 30 Topics derived from Positive (left) and Negative (right) Tweets about #OccupyWallStreet
Conclusion: Overall, tweets about the Occupy Wall Street movement were more positive than negative.
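The sentiment analysis in the Netlytic case study relies on an NLTK-based NLP Module whose internals are not described on this poster. As a rough, self-contained illustration of the general idea only, the sketch below swaps in a much simpler lexicon-counting approach in plain Python: each tweet is labeled by comparing how many positive and negative words it contains, and the labels are then tallied. The word lists, function names, and example tweets are all invented for illustration and are not the module's actual method or data.

```python
import re
from collections import Counter

# Tiny hypothetical word lists, for illustration only.
POSITIVE = {"great", "love", "support", "hope", "peaceful"}
NEGATIVE = {"bad", "hate", "angry", "violent", "fail"}


def classify(tweet):
    """Label a tweet "positive", "negative", or "neutral" by counting lexicon hits."""
    words = re.findall(r"[a-z']+", tweet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(tweets):
    """Count how many tweets fall under each sentiment label."""
    return Counter(classify(t) for t in tweets)


if __name__ == "__main__":
    # Invented example tweets, not taken from the collected data set.
    sample = [
        "Love the peaceful crowd at #OccupyWallStreet",
        "Angry about how this was handled #OccupyWallStreet",
        "Heading downtown now #OccupyWallStreet",
    ]
    print(summarize(sample))
```

Comparing the positive and negative counts returned by summarize is the kind of aggregate behind a statement such as "more positive than negative"; a trained classifier would replace classify in any serious use.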