Named Entity Recognition in Tweets: TwitterNLP
Ludymila Lobo

Dec 30, 2015

Page 1

Named Entity Recognition in Tweets: TwitterNLP

Ludymila Lobo

Page 2

Reading material

Ritter, Alan; Clark, Sam; Mausam; Etzioni, Oren. Named Entity Recognition in Tweets: An Experimental Study. Available from the Association for Computational Linguistics website at https://aclweb.org/anthology/D/D11/D11-1141.pdf

http://www.academia.edu/1128304/Shallow_parsing_as_part-of-speech_tagging

Twitter NLP Tool

https://github.com/aritter/twitter_nlp

Application using Twitter NLP

statuscalendar.com

Collecting Tweets

https://dev.twitter.com

http://www.webdevdoor.com/jquery/twitter-feed-authentication-search

https://github.com/abraham/twitteroauth

http://sourceforge.net/projects/xampp/

Resources

http://www.webdevdoor.com/jquery/twitter-feed-authentication-search/

Page 3

Why Twitter?

A large amount of data (even more than the Library of Congress in Washington, D.C.)*, with 151 million items

Real-time information, sometimes more up to date than articles.

http://pt.wikipedia.org/wiki/Library_of_Congress

*Hachman (2011)

Page 4

Challenges

Noisy and informal nature

Diversity of entity types (companies, products, bands, teams, movies, etc.), most of which are relatively infrequent, so a sample of tweets contains only a few examples of each

Lack of context

http://twitter.com

Page 5

Tool

• https://github.com/aritter/twitter_nlp
• Unzip the file and, in a Linux terminal, type:
– sh build.sh

Page 6

Tool

• statuscalendar.com

Page 7

How it works

POS (Part of Speech) -> NLP, clustering

Chunking (shallow parsing)

Token      Chunk tag
@paulwalk  o
It         b-np
's         b-vp
the        b-np
view       i-np
from       b-pp
where      b-advp
I          b-np
'm         b-vp
living     i-vp
for        b-pp
two        b-np
weeks      i-np

Word    Possible tags
best    ADJ ADV NP V
better  ADJ ADV V DET
close   ADV ADJ V N
cut     V N VN VD
even    ADV DET ADJ V
grant   NP N V
hit     V VD VN N DET
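The chunk tags on this slide follow a begin/inside/outside scheme: b-X opens a chunk of type X, i-X continues it, and o marks a token outside any chunk. A minimal sketch of how such tags could be grouped back into phrases is shown here; the function name group_chunks and the choice to drop an i- tag that has no matching b- tag are assumptions made for illustration, not part of the Twitter NLP tool.

```python
from typing import List, Tuple


def group_chunks(tagged: List[Tuple[str, str]]) -> List[Tuple[str, List[str]]]:
    """Group (token, tag) pairs into (chunk_type, tokens) using b-/i-/o tags."""
    chunks: List[Tuple[str, List[str]]] = []
    current = None  # the chunk currently being extended, if any
    for token, tag in tagged:
        tag = tag.lower()
        if tag.startswith("b-"):
            # A b- tag always opens a new chunk.
            current = (tag[2:], [token])
            chunks.append(current)
        elif tag.startswith("i-") and current is not None and current[0] == tag[2:]:
            # An i- tag of the same type extends the open chunk.
            current[1].append(token)
        else:
            # "o" (or an i- tag with no matching open chunk) closes the chunk.
            current = None
    return chunks


# Token/tag pairs copied from the slide's example tweet.
tagged = [
    ("@paulwalk", "o"), ("It", "b-np"), ("'s", "b-vp"), ("the", "b-np"),
    ("view", "i-np"), ("from", "b-pp"), ("where", "b-advp"), ("I", "b-np"),
    ("'m", "b-vp"), ("living", "i-vp"), ("for", "b-pp"), ("two", "b-np"),
    ("weeks", "i-np"),
]
chunks = group_chunks(tagged)
for kind, tokens in chunks:
    print(kind, " ".join(tokens))
```

For the example tweet this yields nine chunks, including the noun phrases "the view" and "two weeks" and the verb phrase "'m living".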

Page 8

How it works

Capitalization classifier: predicts whether or not a tweet is informatively capitalized (using SVM learning)

NER (Named Entity Recognition)

POS (Part of Speech) -> NLP, clustering

Chunking (shallow parsing)

Tom Hanks was awesome in Forrest Gump

Tom Hanks -> actor, Forrest Gump -> movie
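The capitalization classifier on this slide needs numeric features describing how a tweet uses capital letters. The sketch below shows the kind of per-tweet features such a classifier could consume. The feature names and the exact set chosen here are hypothetical and are not taken from the Twitter NLP source; in practice an SVM would be trained on vectors like these together with labeled tweets.

```python
from typing import Dict


def capitalization_features(tweet: str) -> Dict[str, float]:
    """Compute simple, hypothetical capitalization features for one tweet."""
    # Keep only word-like tokens: skip @mentions, #hashtags, numbers and URLs.
    words = [
        w for w in tweet.split()
        if w[0].isalpha() and not w.lower().startswith("http")
    ]
    if not words:
        return {
            "frac_capitalized": 0.0,
            "frac_all_caps": 0.0,
            "first_capitalized": 0.0,
            "lowercase_i": 0.0,
        }
    n = len(words)
    return {
        # Share of words that start with a capital letter.
        "frac_capitalized": sum(w[0].isupper() for w in words) / n,
        # Share of multi-letter words written entirely in capitals.
        "frac_all_caps": sum(w.isupper() and len(w) > 1 for w in words) / n,
        # Whether the first word of the tweet is capitalized.
        "first_capitalized": float(words[0][0].isupper()),
        # How often the pronoun "I" appears in lowercase.
        "lowercase_i": float(sum(w == "i" for w in words)),
    }


features = capitalization_features("Tom Hanks was awesome in Forrest Gump")
shouting = capitalization_features("TOM HANKS WAS AWESOME")
print(features)
print(shouting)
```

A tweet written entirely in capitals scores 1.0 on both fraction features, which is the kind of signal that would suggest its capitalization carries no information about entities.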

Page 9

Tool

@cityofcalgary: Free swimming and golf tomorrow for @cbc Sports Day in Canada #yyc #sportsday http://ow.ly/2G4sf

@cityofcalgary/O :/O Free/O swimming/O and/O golf/O tomorrow/O for/O @cbc/O Sports/B-other Day/I-other in/O Canada/B-geo-loc #yyc/O #sportsday/O http://ow.ly/2G4sf/O

Adam Beyer: Swedish Techno Pioneer: When it comes to his own DJing and sound, he's slightly more diverse and likes...

Adam/B-person Beyer/I-person :/O Swedish/O Techno/O Pioneer/O :/O When/O it/O comes/O to/O his/O own/O DJing/O and/O sound/O ,/O he/O 's/O slightly/O more/O diverse/O and/O likes/O
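The tagged output above attaches a label to each token after a final slash: B-type begins an entity, I-type continues it, and O marks a token outside any entity. A minimal sketch of reading that output back into a list of entities follows. The function name extract_entities is an assumption for illustration, and the sketch assumes every whitespace-separated item contains at least one slash.

```python
from typing import List, Optional, Tuple


def extract_entities(tagged_line: str) -> List[Tuple[str, str]]:
    """Return (entity_text, entity_type) pairs from a 'token/TAG token/TAG' line."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    etype: Optional[str] = None
    for item in tagged_line.split():
        # Split on the LAST slash only, because URLs contain slashes themselves.
        token, tag = item.rsplit("/", 1)
        if tag.startswith("B-"):
            if tokens:  # close the entity that was open
                entities.append((" ".join(tokens), etype))
            tokens, etype = [token], tag[2:]
        elif tag.startswith("I-") and tokens and tag[2:] == etype:
            tokens.append(token)
        else:
            if tokens:  # an O tag closes any open entity
                entities.append((" ".join(tokens), etype))
            tokens, etype = [], None
    if tokens:  # entity that runs to the end of the line
        entities.append((" ".join(tokens), etype))
    return entities


line = (
    "@cityofcalgary/O :/O Free/O swimming/O and/O golf/O tomorrow/O for/O "
    "@cbc/O Sports/B-other Day/I-other in/O Canada/B-geo-loc #yyc/O "
    "#sportsday/O http://ow.ly/2G4sf/O"
)
entities = extract_entities(line)
print(entities)
```

Applied to the first tagged tweet on this slide, the sketch returns "Sports Day" with type other and "Canada" with type geo-loc.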

Page 10

How to retrieve data from Twitter?

https://dev.twitter.com

Page 11

<?php
session_start();
require_once("twitteroauth/twitteroauth/twitteroauth.php"); // Path to the twitteroauth library

$search = "wpi OR #WPI"; // Search query
$notweets = 50;          // Number of tweets to request

// Replace these placeholders with your own application credentials
$consumerkey = "123456";
$consumersecret = "123456";
$accesstoken = "123456";
$accesstokensecret = "123456";

// Create an authenticated connection to the Twitter API
function getConnectionWithAccessToken($cons_key, $cons_secret, $oauth_token, $oauth_token_secret) {
    $connection = new TwitterOAuth($cons_key, $cons_secret, $oauth_token, $oauth_token_secret);
    return $connection;
}

$connection = getConnectionWithAccessToken($consumerkey, $consumersecret, $accesstoken, $accesstokensecret);

// Percent-encode "#" so it is not treated as a URL fragment
$search = str_replace("#", "%23", $search);
$tweets = $connection->get("https://api.twitter.com/1.1/search/tweets.json?q=".$search."&count=".$notweets);

// Output the search results as JSON
echo json_encode($tweets);
?>

http://www.webdevdoor.com/jquery/twitter-feed-authentication-search/
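The PHP script builds the search URL by hand and encodes only the "#" character. As a cross-check, the sketch below builds the same search URL with Python's standard library, which percent-encodes the whole query. It only constructs the URL and sends nothing: a real request would still need OAuth signing, which the twitteroauth library handles in the PHP version. The function name build_search_url is an assumption for illustration.

```python
from urllib.parse import urlencode

# Search endpoint used by the PHP script on this slide.
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query: str, count: int) -> str:
    """Return the search URL with the query string percent-encoded."""
    # urlencode escapes "#" as %23 and encodes spaces as "+".
    return SEARCH_URL + "?" + urlencode({"q": query, "count": count})


url = build_search_url("wpi OR #WPI", 50)
print(url)
# https://api.twitter.com/1.1/search/tweets.json?q=wpi+OR+%23WPI&count=50
```

Encoding the full query, rather than replacing single characters, also covers spaces and any other reserved characters a search term might contain.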

Page 12

How to retrieve data from Twitter?

• Authentication library: https://github.com/abraham/twitteroauth

Download it and place it in the same folder as the code

Page 13

How to retrieve data from Twitter?

http://sourceforge.net/projects/xampp/

Page 14

How to retrieve data from Twitter?

Copy the project folder to C:\xampp\htdocs

Page 15

How to retrieve data from Twitter?

Open http://localhost/TwitterStreams/tweet.php in a browser