REAL-TIME ROAD TRAFFIC EVENTS DETECTION AND GEO-PARSING
A Thesis
Submitted to the Faculty
of
Purdue University
by
Saurabh Kumar
In Partial Fulfillment of the
Requirements for the Degree
of
Master of Science in Electrical and Computer Engineering
August 2018
Purdue University
Indianapolis, Indiana
THE PURDUE UNIVERSITY GRADUATE SCHOOL
STATEMENT OF COMMITTEE APPROVAL
Dr. Sarah Koskie, Chair
Department of Electrical and Computer Engineering
Dr. Brian King
Department of Electrical and Computer Engineering
Dr. Xiao Luo
Department of Computer Information Technology
Approved by:
Dr. Brian King
Head of the Graduate Program
I dedicate this thesis to my dear mother Mrs. Manju Gupta, who always gave
priority to education above everything else.
ACKNOWLEDGMENTS
I would like to thank Dr. Sarah Koskie for her guidance and Dr. Brian King for
his valuable suggestions. Also, I would like to thank my brother Rahul for helping
me to improve upon the presentation of this thesis.
I Internal State Vector for RNN
→I[t] Internal State Vector for Forward RNN Layer at Layer t
←I[t] Internal State Vector for Backward RNN Layer at Layer t
J(w) Cost Function
Mi[j] Word Vector Mi, jth Component
Mi Vector Representation of Word Wi
S State Vector for NER
Wi Word Representation
w Weight Vector
wi[j] Weight Vector ith Component at jth Layer
w[j] Weight Vector jth Layer
→w Weight Vector for Forward RNN Layer
α Learning Rate
β Decay Factor
Θ(x) Sigmoid Activation Function
ABBREVIATIONS
API Application Programming Interface
FFNN Feed Forward Neural Network
JSON JavaScript Object Notation
LIDAR Light Detection and Ranging
LSTM Long Short-Term Memory
MIMO Multiple Input Multiple Output
MISO Multiple Input Single Output
NER Named Entity Recognition
PASM Publish and Subscribe Model
POS Part of Speech
RDBMS Relational Database Management System
RNN Recurrent Neural Network
SVM Support Vector Machine
GLOSSARY
Corpus Collection of documents
Google Maps Online Mapping Service Developed by Google
Tweet Twitter user’s post
Twitter Online News and Social Networking Platform
Waze Navigation Application for Smartphone
ABSTRACT
Kumar, Saurabh. M.S.E.C.E., Purdue University, August 2018. Real-Time Road Traffic Events Detection and Geo-Parsing. Major Professor: Sarah Koskie.
In the 21st century, the number of vehicles on the road keeps increasing while road
infrastructure remains limited. These conditions create daily challenges for the
average commuter due to congestion and slow-moving traffic. In the United States
alone, congestion costs the average driver $1200 every year in the form of fuel and
time [1]. Some positive steps, including (a) the introduction of push notification
systems and (b) the deployment of more law enforcement troops, have been taken for
better traffic management. However, these methods have limitations and require
extensive planning [2]. Another way to deal with traffic problems is to track the
congested areas in a city using social media; law enforcement resources can then be
re-routed to these areas on a real-time basis.
Given the ever-increasing number of smartphone devices, social media can be used
as a source of information to track traffic-related incidents.
Social media sites allow users to share their opinions and information. Platforms
like Twitter, Facebook, and Instagram are very popular among users. These platforms
enable users to share whatever they want in the form of text and images. Facebook
users generate millions of posts in a minute. On these platforms, abundant data,
including news, trends, events, opinions, and product reviews, are generated on a
daily basis.
Worldwide, organizations are using social media for marketing purposes. These
data can also be used to analyze traffic-related events like congestion, construction
work, and slow-moving traffic. Thus, the motivation behind this research is to use
social media posts to extract information relevant to traffic, with effective and proactive
traffic administration as the primary focus. I propose an intuitive two-step process
to utilize Twitter users' posts to retrieve traffic-related information on
a real-time basis. It uses a text classifier to select only the data that contains
traffic information. This is followed by a Part-Of-Speech (POS) tagger to find the
geolocation information. A prototype of the proposed system is implemented using
distributed microservices architecture.
1. INTRODUCTION
Traffic congestion is one of the biggest problems in our modern cities. Delays, road
rage, environmental effects, and increased fuel consumption are some of its
by-products. To avoid these problems, governments and local law enforcement use inductive
loops, cameras, and radar to monitor traffic [3]. These tools are effective but have
drawbacks in terms of installation and maintenance, along with high operational costs.
Large capital investments and a large workforce are required to build such
infrastructure from the ground up, so leveraging the existing infrastructure for
gathering traffic-related information would be more viable and cost-effective. Social
media platforms can serve that purpose. Every day, millions of users on these
platforms communicate with each other and share their opinions. With proper content
filtering techniques, traffic-related incidents can be separated from all other events.
Twitter is a popular social media platform with millions of active users. It provides
a channel between friends and co-workers to communicate using desktop or mobile
applications. It offers a platform for market researchers, activists, and decision makers
to access information on a real-time basis. Organizations are using it to learn about
customer satisfaction levels [4]. Some researchers have even used it for tracking
seismic activity [5].
In the same way, this open-source information can be mined to track
traffic incidents on a real-time basis. Analyzing the tweets can give us the location
information without the use of hardware like LIDAR, cameras, etc.
1.1 Goal
The primary objective of this thesis is to develop an ecosystem to track traffic
incidents in real time using a non-traditional source of information like social media
data. It is believed that local law enforcement agencies can utilize this information
for better traffic management and emergency response.
Currently, getting real-time traffic information requires an array of sensors [6],
but with the rise of social media, a massive amount of real-time traffic data is flowing
through Twitter1, Facebook, and other social media platforms that can be utilized
as a substitute. These platforms are acting as a new medium where every user is a
source of information.
There are applications like Google Maps2 and Waze3 that provide real-time traffic
updates by leveraging crowd-sourced data, but social media channels are left out.
For example, Waze provides an interface to report and geo-mark traffic-related in-
formation and Google collects data through Android phones, where every Android
user acts as a data source. Google's proprietary algorithms predict traffic
congestion [7] by analyzing the number of Android users and their speed. Although
these platforms perform well, Google Maps and Waze are not utilizing other channels
like social media. Therefore the primary objective of this thesis is to utilize social
media platforms as a data source to monitor traffic incidents.
This thesis is divided into four major parts: data collection, text classification,
location detection, and system architecture. For data collection, the Twitter platform
is used as a data source. Twitter provides multiple ways to access the data using the
REST API [8]. Text classification is used to select Tweets related to traffic
incidents, and this is done by using an RNN model. Location detection means
determining the location from the text. For example, consider “Stefan is going to
1 Twitter is an online news and social networking platform.
2 Online mapping service developed by Google.
3 Navigation application for smartphone.
West Pacific Street”. Here “West Pacific Street” is the location. In the last chapter,
all the components are tied together to build a scalable system for real-time data
processing.
2. DATA COLLECTION
At the beginning of this research, some public Twitter IDs were manually collected
using Twitter's search interface. Keywords like ‘traffic’, ‘rain’, etc. (shown in Table
A.2) were manually entered into the search interface to get the Twitter accounts that
posted tweets containing these keywords. This activity was repeated multiple times for
many cities. The main idea behind this exercise was to gather information about:
• Twitter accounts that frequently post traffic-related tweets.
• The number of Tweets that are geo-tagged.
• Frequently occurring words in tweets.
The results of this task are:
• Identification of Twitter accounts in different cities that tweet traffic-related
information.
• A vocabulary of frequently used words in the Tweets.
• Less than one percent of all the tweets contain geo-tagged data.
Based on these results, a list of user accounts and keywords was compiled for the
data collection task.
Twitter provides multiple ways to get the tweeted data [8] via their REST API1. Out
of these methods, only the geo-location and the user-time-line methods are used to
retrieve the data. For this thesis, only tweets that are available in the public domain
have been used.
1 A method to communicate between different components.
2.1 Geo Location
In this API, the radius, longitude, and latitude of the target city are the input
parameters [9]. The API returns all Tweets that contain the input keywords and were
posted within the given radius of the center specified by the input geo-coordinates.
2.2 User Time Line
In the user time-line method, the user's Twitter ID is passed to the Twitter API,
which returns a collection of the most recent tweets and re-tweets posted by that
user [10].
2.3 System Design
An application is created on the Linux server to download the publicly available
Tweets. It requires authorization and token keys, which are assigned by Twitter when
the application is created [11]. There are multiple frameworks for accessing the
Twitter API. One such example is Tweepy [12], which extracts the data as JSON
and converts it into a Python dictionary.
To get the tweets, the system, shown in Figure 2.1, implements both the
user-time-line and geo-location methods using the Tweepy framework. The system has the
following two components: the data extraction server and the MySQL server.
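The two retrieval methods can be sketched with Tweepy as follows. This is a minimal illustration, not the thesis's actual code: the credential strings, the example coordinates (Indianapolis), the keyword, and the account name are all placeholder assumptions.

```python
def build_geocode(lat, lon, radius_km):
    """Format the geo-location search parameter as 'lat,lon,<radius>km'."""
    return "{:.4f},{:.4f},{}km".format(lat, lon, radius_km)

def fetch_tweets(consumer_key, consumer_secret, access_token, access_secret):
    import tweepy  # imported here so the sketch loads without the package installed

    # Authorization and token keys assigned by Twitter at application creation [11].
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_secret)
    api = tweepy.API(auth)

    # Geo-location method: Tweets containing a keyword inside the radius.
    geo_tweets = api.search(q="traffic",
                            geocode=build_geocode(39.7684, -86.1581, 25))

    # User-time-line method: most recent Tweets of one collected account.
    timeline = api.user_timeline(screen_name="example_account", count=200)
    return geo_tweets, timeline
```

Tweepy returns status objects whose `_json` attribute holds the raw JSON as a Python dictionary, which is what the data extraction server stores.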
2.4 Data Extraction Server
The data extraction server acts as middleware between Twitter and the MySQL
server. It runs a Python program periodically to collect the data from Twitter. First,
the Tweets’ URLs are removed from the collected data. Next, Tweet Id, text and
date are stored on the MySQL server.
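The pre-processing step described above can be sketched as a small pure function; the example Tweet dictionary and its field names follow the standard Twitter JSON layout, and the sample text is invented for illustration.

```python
import re

# URLs are removed from the Tweet text; Tweet Id, text, and date are kept
# as the row to be stored on the MySQL server.
URL_PATTERN = re.compile(r"https?://\S+")

def prepare_row(tweet):
    """Map one Tweet (as a JSON-style dict) to an (id, text, date) row."""
    text = URL_PATTERN.sub("", tweet["text"]).strip()
    return (tweet["id"], text, tweet["created_at"])

row = prepare_row({"id": 1,
                   "text": "Heavy traffic on I-65 https://t.co/abc",
                   "created_at": "2016-10-01"})
# row -> (1, "Heavy traffic on I-65", "2016-10-01")
```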
Fig. 2.1. Tweet Extraction Architecture (the data extraction server retrieves Tweets from the Twitter server through the user-time-line and geo-location REST APIs and stores them on the MySQL server)
Approximately one hundred thousand tweets were collected in the three months
between October 2016 and December 2016.
2.5 MySQL Server
MySQL is an open-source RDBMS owned by Oracle [13]. It is used to
store the Tweet Id, date, and Tweet text, which are later used to train machine
learning models.
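A schema along these lines would hold the three stored fields. The table and column names below are assumptions, and Python's built-in sqlite3 stands in for the MySQL server so the sketch is self-contained.

```python
import sqlite3

# Illustrative Tweet table: one row per Tweet with Id, date, and text.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tweets (
                    tweet_id   INTEGER PRIMARY KEY,
                    tweet_date TEXT,
                    tweet_text TEXT)""")
conn.execute("INSERT INTO tweets VALUES (?, ?, ?)",
             (1, "2016-10-01", "Heavy traffic on I-65"))
count = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
# count -> 1
```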
3. TEXT CLASSIFICATION
3.1 Introduction
Classification is a supervised machine-learning methodology that involves
assigning a label to a set of input features. In machine learning, a feature is an
individual measurable property of an observed phenomenon [14]. Generally, a feature
is a numerical value such as the age of a person or a temperature. In the case of
text, a sequence of letters and symbols can't be fed directly to the machine-learning
algorithms [15]. Text feature-extraction algorithms convert a string into a vector.
Classification can be of a binary or a multi-class type.
• Binary classification is often used to determine whether an item is or is not in the
class, but it can also be used if the data consists of two classes. Some common
examples are spam detection, fraudulent credit-card transaction detection, and
gender identification.
• Examples of multi-class classification include country-of-origin detection and
language detection.
3.2 Text Classification
A text classification algorithm assigns to an input document, according to its
content, one of a given set of classes, such as document type, song genre, or book
type. For example, consider a text classification problem to find out
whether an email is spam or not. In this case, the classifier is a binary text classifier
and the output is either “spam” or “not spam”.
Text classification can be applied to solve a variety of problems such as:
• Understanding and identifying an opinion in a piece of text.
• Determining whether a movie review falls in the good, bad, or worse category.
• Spam identification.
A supervised text-classification task assigns one of a predefined set of classes.
It starts by building a hypothesis function to do the classification. A hypothesis
function is a mapping of input features to output classes, and the classifier is trained
using ground-truth data. For the learning process, the whole dataset is divided into
two parts: (a) training data, which is used as ground truth to fit the classifier, and
(b) test data, which is used to validate the accuracy of the classifier.
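The dataset split described above can be sketched as follows; the 80/20 ratio, the fixed seed, and the toy labeled Tweets are illustrative assumptions, not values taken from the thesis.

```python
import random

def split_dataset(labeled_tweets, train_fraction=0.8, seed=42):
    """Shuffle the labeled data and divide it into training and test parts."""
    data = list(labeled_tweets)
    random.Random(seed).shuffle(data)   # fixed seed for a reproducible split
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]       # (training data, test data)

# Toy ground truth: label 1 = traffic-related, label 0 = not traffic-related.
examples = [("slow traffic on I-70", 1), ("great movie last night", 0)] * 50
train, test = split_dataset(examples)
# len(train) -> 80, len(test) -> 20
```

The classifier is then fit on `train` only, and its accuracy is measured on the held-out `test` portion.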
3.2.1 Feature Extraction
The first step in feature extraction is to convert the text into a vector. Some
popular text feature extraction methods are:
• Bag of words [16].
• Word2vec [17].
Bag of Words
The bag-of-words model is a vector space model of a text document. It is a
frequency-based vector representation where each word is represented by its number
of appearances. It first builds the vocabulary using a document or set of documents
and then converts a text document into a vector using word frequencies.
“Bag of words” refers to the fact that this model ignores grammar and word order.
Converting text into a vector involves the following steps (shown in Figure 3.1):
• Step One: Collect all documents.
• Step Two: Build the vocabulary - a collection of all the unique words in all
documents.
Fig. 3.1. Bag of Words Representation (the vocabulary “mouse in the hat cat green egg and ham” is built from the documents “Mouse in the Hat”, “cat In The Hat”, and “Green egg and ham”; the input document “Cat and small cat in the house” is converted into a frequency vector by comparison with the vocabulary)
• Step Three: Create the document vector by comparing the words with the
vocabulary. Each word is assigned its frequency count in the vocabulary and
zero otherwise.
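The three steps above can be sketched directly in code; the example documents and input sentence follow Figure 3.1, and the two helper function names are invented for illustration.

```python
def build_vocabulary(documents):
    """Step Two: collect all unique words across the documents, in order."""
    vocab = []
    for doc in documents:
        for word in doc.lower().split():
            if word not in vocab:
                vocab.append(word)
    return vocab

def to_vector(text, vocab):
    """Step Three: frequency count for each vocabulary word, zero otherwise."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

docs = ["Mouse in the Hat", "cat In The Hat", "Green egg and ham"]  # Step One
vocab = build_vocabulary(docs)
# vocab -> ['mouse', 'in', 'the', 'hat', 'cat', 'green', 'egg', 'and', 'ham']
vec = to_vector("Cat and small cat in the house", vocab)
# vec -> [0, 1, 1, 0, 2, 0, 0, 1, 0]  ('small' and 'house' are not in the vocabulary)
```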
Word2Vec
Word2vec [17] is a word-embedding model created by a team of researchers led
by Tomas Mikolov at Google. Word embedding converts text into a vector. All the
vectors that have the same context are placed nearby.
This model takes a large corpus as input and produces high-dimensional vectors.
All the vectors have the same size. It has many advantages compared to earlier
algorithms [17]. For example, the word order does not influence the resultant word
vectors generated by the model, and the algorithm is computationally more efficient.
The Word2vec model generates the word vector M based on a word's probability of
occurrence given its surrounding words, rather than its frequency. There are multiple
ways by which Word2vec can generate the word vector. The skip-gram model,
described in the next section, is used because of its computational efficiency [17].
Skip Gram Model
In this model, the objective is to predict the surrounding words within a given
window size based on the target word. The output vector relates the likelihood of the
vocabulary words to the target word.
Consider a document having words W1, W2, ... Wt−1, Wt, Wt+1, ..., Wn. If the
window size is one, the target word Wt will predict the surrounding words Wt−1 and
Wt+1, as shown in Figure 3.2.
Fig. 3.2. Skip-Gram Model (a mapping function maps the target word wt to its surrounding words wt−1 and wt+1)
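Generating the (target, surrounding-word) training pairs described above can be sketched as follows; the function name and the four-word example document are illustrative assumptions.

```python
def skipgram_pairs(words, k=1):
    """For each target word Wt, pair it with Wt+i for -k <= i <= k, i != 0."""
    pairs = []
    for t, target in enumerate(words):
        for i in range(-k, k + 1):
            if i != 0 and 0 <= t + i < len(words):
                pairs.append((target, words[t + i]))
    return pairs

pairs = skipgram_pairs(["green", "egg", "and", "ham"], k=1)
# pairs -> [('green', 'egg'), ('egg', 'green'), ('egg', 'and'),
#           ('and', 'egg'), ('and', 'ham'), ('ham', 'and')]
```

With window size k = 1, each target word predicts only its immediate neighbors, exactly as in Figure 3.2.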
Word2vec Training
In the Word2vec model, the word vectors are generated randomly and stored in
the embedding matrix; then the skip-gram model tries to learn the probability vector.
The training goal is to predict the surrounding words given a target word. At first,
Fig. 3.3. Word2vec Training (the vocabulary “mouse in the hat cat green egg and ham” is built from the input documents “Mouse in the Hat”, “cat In The Hat”, and “Green egg and ham”; an embedding matrix of size vocabulary-size × word-vector-size is generated randomly, and the word vectors v1, ..., v9 are learned by training with a softmax and a log cost function)
the vocabulary is built from a set of input documents. Then the embedding matrix
is generated randomly. In the training step, the embedding weights are learned using
the softmax function in (3.2). The process is described in Figure 3.3.
Consider a document having words W1, W2, ..., Wn. The skip-gram algorithm uses
a gradient-descent algorithm [18] to maximize the average log probability of the
surrounding words p(Wt+i|Wt) given the target word [17]:
(1/n) ∑_{t=1}^{n} ∑_{−k ≤ i ≤ k, i ≠ 0} log(p(Wt+i|Wt))    (3.1)
where n is the number of words in the document and k is the window size. In the
skip-gram model the conditional probability p(Wn+i|Wn) is defined as:
p(Wn+i|Wn) = exp(M′_{Wn+i} · M_{Wn}) / ∑_{i=1}^{n} exp(M′_{Wi} · M_{Wn})    (3.2)

where n is the vocabulary size, M′_W is the output vector representation of word W,
and M_W is the input vector representation of word W.
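The softmax in (3.2) can be checked numerically with toy embeddings. The 3-word vocabulary, the 2-dimensional vectors, and their values below are all assumed purely for illustration; in training they would be initialized randomly and then learned.

```python
import math

def skipgram_softmax(target_idx, M_in, M_out):
    """Return p(w | target) for every word w in the vocabulary.

    M_in holds the input vector representations M_W, and M_out holds the
    output vector representations M'_W, one row per vocabulary word.
    """
    scores = [math.exp(sum(a * b for a, b in zip(M_out[j], M_in[target_idx])))
              for j in range(len(M_out))]
    total = sum(scores)                      # normalizing denominator of (3.2)
    return [s / total for s in scores]

# Toy input and output embeddings for a 3-word vocabulary.
M_in = [[0.1, 0.2], [0.0, 0.3], [0.4, 0.1]]
M_out = [[0.2, 0.1], [0.3, 0.0], [0.1, 0.4]]
probs = skipgram_softmax(0, M_in, M_out)
# The resulting probabilities are positive and sum to 1 over the vocabulary.
```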
3.2.2 Classification of Short Text
With the advancement of smartphones, the information content of online
communication like Tweets and chat messages has increased over time. This information
can be mined from the aforementioned conversations. However, in terms of
information extraction and data processing, microblogging websites like Twitter pose the
hardest challenges [19]. This is due to its limited tweet length and its unstructured
[2] Federal Highway Administration, Vehicle Detection and Surveillance, 2018 (accessed July 10, 2018). [Online]. Available: https://www.fhwa.dot.gov/policyinformation/pubs/vdstits2007/03.cfm
[3] A. Roy, N. Gale, and L. Hong, “Automated traffic surveillance using fusion of doppler radar and video information,” Mathematical and Computer Modelling, vol. 54, no. 1-2, pp. 531–543, 2011.
[4] B. J. Jansen, M. Zhang, K. Sobel, and A. Chowdury, “Twitter power: Tweets as electronic word of mouth,” Journal of the Association for Information Science and Technology, vol. 60, no. 11, pp. 2169–2188, 2009.
[5] N. Ambraseys, “The seismic activity of the Marmara sea region over the last 2000 years,” Bulletin of the Seismological Society of America, vol. 92, no. 1, pp. 1–18, 2002.
[6] S. Coleri, S. Y. Cheung, and P. Varaiya, “Sensor networks for monitoring traffic,” in Allerton Conference on Communication, Control and Computing, 2004, pp. 32–40.
[7] D. Barth, The bright side of sitting in traffic: Crowdsourcing road congestion data, 2018 (accessed May 20, 2018). [Online]. Available: https://googleblog.blogspot.ca/2009/08/bright-side-of-sitting-in-traffic.html
[8] Twitter, Get Tweets, 2018 (accessed June 18, 2018). [Online]. Available: https://developer.twitter.com/en/docs/api-reference-index.html
[9] Tweepy, Filtering Tweets by location, 2018 (accessed June 19, 2018). [Online]. Available: https://developer.twitter.com/en/docs/tutorials/filtering-tweets-by-location.html
[10] Twitter, Get Tweet timelines, 2018 (accessed May 20, 2018). [Online]. Available: https://googleblog.blogspot.ca/2009/08/bright-side-of-sitting-in-traffic
[11] D. Twitter, Authentication, 2018 (accessed June 18, 2018). [Online]. Available: https://developer.twitter.com/en/docs/basics/authentication/guides/access-tokens.htm
[12] Tweepy, Twitter API access via Python, 2018 (accessed May 21, 2018). [Online]. Available: http://tweepy.readthedocs.io/en/v3.5.0/index.html
[13] Oracle, Java, 2018 (accessed June 7, 2018). [Online]. Available: https://java.com/en/
[14] J. P. Mueller and L. Massaron, Machine Learning: Creating Your Own Features In Data, 2018 (accessed June 20, 2018). [Online]. Available: https://www.dummies.com/programming/big-data/data-science/machine-learning-creating-features-data/
[15] I. Guyon and A. Elisseeff, “An introduction to feature extraction,” in Feature Extraction. Springer, 2006, pp. 1–25.
[16] Y. Zhang, R. Jin, and Z.-H. Zhou, “Understanding bag-of-words model: a statistical framework,” International Journal of Machine Learning and Cybernetics, vol. 1, no. 1-4, pp. 43–52, 2010.
[17] Y. Goldberg and O. Levy, “word2vec explained: Deriving Mikolov et al.'s negative-sampling word-embedding method,” arXiv preprint arXiv:1402.3722, 2014.
[18] M. Vorontsov, G. Carhart, and J. Ricklin, “Adaptive phase-distortion correction based on parallel gradient-descent optimization,” Optics Letters, vol. 22, no. 12, pp. 907–909, 1997.
[19] K. Bontcheva, L. Derczynski, A. Funk, M. Greenwood, D. Maynard, and N. Aswani, “TwitIE: An open-source information extraction pipeline for microblog text,” in Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013, 2013, pp. 83–90.
[20] N. Wanichayapong, W. Pruthipunyaskul, W. Pattara-Atikom, and P. Chaovalit, “Social-based traffic information extraction and classification,” in ITS Telecommunications (ITST), 2011 11th International Conference on. IEEE, 2011, pp. 107–112.
[21] B. Sriram, D. Fuhry, E. Demir, H. Ferhatosmanoglu, and M. Demirbas, “Short text classification in Twitter to improve information filtering,” in Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2010, pp. 841–842.
[22] T. Sakaki, M. Okazaki, and Y. Matsuo, “Earthquake shakes Twitter users: real-time event detection by social sensors,” in Proceedings of the 19th International Conference on World Wide Web. ACM, 2010, pp. 851–860.
[23] C. N. Divij Gupta, Detecting Real-Time Messages of Public Interest in Tweets, 2018 (accessed May 21, 2018). [Online]. Available: snap.stanford.edu/class/cs224w-readings/mathioudakis10twitter.pdf
[24] W. Wolny, “Sentiment analysis of Twitter data using emoticons and emoji ideograms,” Studia Ekonomiczne, vol. 296, pp. 163–171, 2016.
[25] D. R. Cox, “The regression analysis of binary sequences,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 215–242, 1958.
[26] I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun, “Support vector machine learning for interdependent and structured output spaces,” in Proceedings of the 21st International Conference on Machine Learning. ACM, 2004, p. 104.
[27] E. T. Rolls and A. Treves, Neural Networks and Brain Function. Oxford University Press, Oxford, 1998, vol. 572.
[28] R. Hecht-Nielsen, “Theory of the backpropagation neural network,” in Neural Networks for Perception. Elsevier, 1992, pp. 65–93.
[29] F. A. Gers, J. Schmidhuber, and F. Cummins, “Learning to forget: Continual prediction with LSTM,” 1999.
[30] R. Jozefowicz, W. Zaremba, and I. Sutskever, “An empirical exploration of recurrent network architectures,” in International Conference on Machine Learning, 2015, pp. 2342–2350.
[31] M. Schuster and K. K. Paliwal, “Bidirectional recurrent neural networks,” IEEE Transactions on Signal Processing, vol. 45, no. 11, pp. 2673–2681, 1997.
[32] W. Zaremba, I. Sutskever, and O. Vinyals, “Recurrent neural network regularization,” arXiv preprint arXiv:1409.2329, 2014.
[33] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
[34] S. Merity, N. S. Keskar, and R. Socher, “Regularizing and optimizing LSTM language models,” arXiv preprint arXiv:1708.02182, 2017.
[35] L. Wan, M. Zeiler, S. Zhang, Y. Le Cun, and R. Fergus, “Regularization of neural networks using DropConnect,” in International Conference on Machine Learning, 2013, pp. 1058–1066.
[36] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
[37] T. Joachims, “Optimizing search engines using clickthrough data,” in Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2002, pp. 133–142.
[38] R. Burbidge, M. Trotter, B. Buxton, and S. Holden, “Drug design by machine learning: support vector machines for pharmaceutical data analysis,” Computers & Chemistry, vol. 26, no. 1, pp. 5–14, 2001.
[39] W. Jin, H. H. Ho, and R. K. Srihari, “OpinionMiner: a novel machine learning system for web opinion mining and extraction,” in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2009, pp. 1195–1204.
[40] T. Eftimov, B. K. Seljak, and P. Korosec, “A rule-based named-entity recognition method for knowledge extraction of evidence-based dietary recommendations,” PLoS ONE, vol. 12, no. 6, 2017.
[41] C. Bornet and F. Kaplan, “A simple set of rules for characters and place recognition in French novels,” Frontiers in Digital Humanities, vol. 4, p. 6, 2017.
[42] J. R. Finkel, T. Grenager, and C. Manning, “Incorporating non-local information into information extraction systems by Gibbs sampling,” in Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 2005, pp. 363–370.
[43] F. A. Elsafoury, Monitoring urban traffic status using Twitter messages. Thesis submitted to the Faculty of Geo-Information Science and Earth Observation of the University of Twente, the Netherlands, February 2013.
[44] Z. Cheng, J. Caverlee, and K. Lee, “You are where you tweet: a content-based approach to geo-locating twitter users,” in Proceedings of the 19th ACM International Conference on Information and Knowledge Management. ACM, 2010, pp. 759–768.
[45] A. Ritter, S. Clark, O. Etzioni et al., “Named entity recognition in tweets: an experimental study,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2011, pp. 1524–1534.
[46] Twitter, Get Tweet timelines, 2018 (accessed May 20, 2018). [Online]. Available: https://cs.nyu.edu/grishman/jet/guide/PennPOS.html
[47] J. Gelernter and S. Balaji, “An algorithm for local geoparsing of microtext,” GeoInformatica, vol. 17, no. 4, pp. 635–667, 2013.
[48] M. Collins, Log-Linear Models, MEMMs, and CRFs, 2018 (accessed June 7, 2018). [Online]. Available: http://www.cs.columbia.edu/~mcollins/crf.pdf
[49] Z. Huang, W. Xu, and K. Yu, “Bidirectional LSTM-CRF models for sequence tagging,” arXiv preprint arXiv:1508.01991, 2015.
[50] A. Ritter, W-NUT data, 2018 (accessed February 17, 2018). [Online]. Available: https://bit.ly/2LeuxK9
[51] M. Fowler and J. Lewis, Microservices: a definition of this new architectural term, 2018 (accessed June 7, 2018). [Online]. Available: https://www.martinfowler.com/articles/microservices.html
[52] P. Community, Python, 2018 (accessed June 7, 2018). [Online]. Available: https://www.python.org
[53] C. Community, C, 2018 (accessed June 7, 2018). [Online]. Available: http://www.cplusplus.com/reference/