Aim: Collect historical timeseries data for analysis
– Continuously collect data from up to 3000 Madrid council traffic sensors via a web service
  - Data includes traffic speeds and intensities, updated every 5 mins
– Push the messages to Kafka
– Use Secor to aggregate multiple messages into a single Swift object
  - According to policy, e.g., every 60 mins
  - Possibly partition the data, e.g., according to date
  - Convert to Parquet format
  - Annotate with metadata, e.g., min/max speed, start/end time
– Index Swift objects according to their metadata using Elasticsearch
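The aggregation step above can be illustrated with a minimal Python sketch. It is not Secor's implementation: the reading fields (sensor_id, ts, speed, intensity) and the function names are assumptions made for illustration. It only shows the idea of partitioning a batch of messages by date and deriving the min/max speed and start/end time annotations.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sensor readings; the field names are assumptions,
# not the schema of the real Madrid traffic feed.
readings = [
    {"sensor_id": "s1", "ts": "2015-06-01T10:00:00+00:00", "speed": 42.0, "intensity": 310},
    {"sensor_id": "s2", "ts": "2015-06-01T10:05:00+00:00", "speed": 38.5, "intensity": 275},
    {"sensor_id": "s1", "ts": "2015-06-01T10:55:00+00:00", "speed": 51.0, "intensity": 190},
    {"sensor_id": "s1", "ts": "2015-06-02T00:05:00+00:00", "speed": 60.0, "intensity": 80},
]


def partition_by_date(messages):
    """Group messages by the calendar date of their timestamp."""
    parts = defaultdict(list)
    for msg in messages:
        day = datetime.fromisoformat(msg["ts"]).date().isoformat()
        parts[day].append(msg)
    return dict(parts)


def summarize(messages):
    """Derive the metadata used to annotate one aggregated object."""
    speeds = [m["speed"] for m in messages]
    times = [datetime.fromisoformat(m["ts"]) for m in messages]
    return {
        "min_speed": min(speeds),
        "max_speed": max(speeds),
        "start_time": min(times).isoformat(),
        "end_time": max(times).isoformat(),
        "count": len(messages),
    }


if __name__ == "__main__":
    for day, batch in sorted(partition_by_date(readings).items()):
        print(day, summarize(batch))
```

In the real flow each batch would then be converted to Parquet and uploaded as one Swift object; that step is left out here because it depends on libraries and configuration not described in the slides.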
IoT Architecture – Madrid Traffic – Ingestion Flow
[Figure: ingestion flow diagram; surviving labels: Secor, Swift]
COSMOS
– Funding: EU FP7 at level of 2PY x 3 years
– Started: Sept 2013
– Coordinator: ATOS
– Technical partners: IBM, NTUA, Univ Surrey, Siemens, ATOS
– Use case partners: Hildebrand/Camden, EMT Madrid Bus Transport/Madrid Council, III Taiwan – Smart Cities use cases
– Project vision: Enable ‘things’ to interact with each other based on shared
What is it?
– Apache Kafka is a high-throughput distributed publish/subscribe messaging system.
– Secor is an open source tool developed by Pinterest, which aggregates Kafka messages and saves them as S3 objects.
What extensions were needed?
– Support for OpenStack Swift as a Secor target. We also added support for Parquet format and for annotating objects with metadata to support indexing and search.
What is the value of integration with Swift?
– Enables bringing new data and applications to Swift, which is an open source solution.
– Parquet and metadata search enable improved performance for batch analytics.
Status
– We contributed OpenStack Swift support to the Secor community and it is now part of Secor.
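As a rough illustration of annotating a Swift object with metadata, the sketch below maps a metadata dictionary to HTTP headers. Swift stores custom object metadata in headers prefixed with X-Object-Meta-; the metadata keys and the helper name to_swift_headers are assumptions for illustration and do not describe how the Secor extension names its fields.

```python
def to_swift_headers(metadata):
    """Map a metadata dict to Swift custom-metadata headers.

    Swift keeps user metadata in headers named X-Object-Meta-<name>.
    The key normalisation used here (underscores to hyphens, title case)
    is an illustrative choice, not something Swift requires.
    """
    headers = {}
    for key, value in metadata.items():
        name = "X-Object-Meta-" + key.replace("_", "-").title()
        headers[name] = str(value)
    return headers


if __name__ == "__main__":
    # Hypothetical annotations for one aggregated object.
    example = {
        "min_speed": 38.5,
        "max_speed": 51.0,
        "start_time": "2015-06-01T10:00:00+00:00",
        "end_time": "2015-06-01T10:55:00+00:00",
    }
    for name, value in to_swift_headers(example).items():
        print(f"{name}: {value}")
```

Header values are always strings, so an indexing layer that wants numeric or date range searches needs to know the intended type of each field.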
What is it?
– Elasticsearch is a distributed, scalable, real-time search and analytics engine, built on Apache Lucene.
What integration is needed?
– Index object metadata, allowing search for objects by attributes.
What is the value of integration with Swift?
– Use search to select objects for further processing, e.g., relevant objects for analytics.
  - Note that S3 does not yet have native search according to metadata.
Status
– The IBM SoftLayer object service includes a basic implementation of metadata search.
– At IBM Research, we added extensions such as data type support and range searches.
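To make the idea of range searches over object metadata concrete, here is a hedged Python sketch that builds a request body in the standard Elasticsearch query DSL (a bool query with range filters). The field names max_speed, start_time and end_time and the helper build_object_query are assumptions; this is not the SoftLayer metadata search API, whose request format is not described in these slides.

```python
import json


def build_object_query(min_max_speed, start, end):
    """Build an Elasticsearch query body selecting objects whose metadata
    shows a maximum speed of at least min_max_speed and whose time span
    lies within [start, end]. Field names are hypothetical."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {"max_speed": {"gte": min_max_speed}}},
                    {"range": {"start_time": {"gte": start}}},
                    {"range": {"end_time": {"lte": end}}},
                ]
            }
        }
    }


if __name__ == "__main__":
    body = build_object_query(50, "2015-06-01T00:00:00Z", "2015-06-02T00:00:00Z")
    print(json.dumps(body, indent=2))
```

The matching documents would identify the Swift objects worth loading, so a batch analytics job can skip objects that cannot contain relevant data.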