Real Time Analysis of Sensor Data for the Internet of Things
by means of Clustering and Event Processing
Work of: NUI Galway and University of Zagreb
Presented by: Rakan Alseghayer
What is IoT?
§ A global infrastructure for the information society, enabling advanced services by interconnecting (physical and virtual) things based on existing and evolving interoperable information and communication technologies. [www.itu.int]
§ Scale?
  – IDC expects the installed base of the IoT to grow to 212 billion "things" globally by the end of 2020.
§ Things produce Data!!
Rakan Alseghayer, University of Pittsburgh
IoT’s Data
§ Sensor technology is the driver of IoT
  – Focus on more sensors deployed and bigger sensor networks.
§ Data produced is huge!!
  – Big Data again :(
IoT’s Position
Source: Gartner 2014
Problem?
IoT Challenges
1. Current focus on sensor deployment and producing more data.
2. Separation between ecosystems of services and IoT data.
Goal and Approach
§ Goal:
  – Providing analytics and information from IoT big data
  – Focus more on “unknown unknowns”
§ Approach:
  – The integration and usage of existing analytics solutions along with sensor-produced big data.
Outline
§ Motivation
§ Contributions
§ RT interpretation based on OpenIoT
§ OpenIoT use case of Zagreb city
§ Experimental Evaluation
§ Conclusions
OpenIoT
§ An open source platform, offered to the open source community, for connecting physical and virtual sensors to the Cloud.
§ OpenIoT-VDK is a ready-to-use version for academic and training purposes.
OpenIoT Architecture
Source: open-platforms.eu
OpenIoT components
§ Extended Global Sensor Network (X-GSN):
  – Middleware for deployment and programming of sensor networks
  – Allows for virtual sensors (abstraction from device specifics).
§ Linked Sensor Middleware (LSM-Light):
  – Transforms data from the virtual sensor format to Linked Data stored in RDF (Resource Description Framework) format.
§ Intelligent Server Component:
  – Glues X-GSN and LSM-Light together in a service-based cloud environment, and allows users to query the data.
X-GSN
§ Wrappers encapsulate data from the device format into the X-GSN format.
§ Feeds data to LSM-Light in a push-based or pull-based mode.
LSM-Light
§ The semantic level
§ Allows data querying through:
  – SPARQL if the data is already stored (i.e., static data)
  – CQELS if the data is streamed, through query registration ahead of time.
Outline
§ Motivation
§ Contributions
§ RT interpretation based on OpenIoT
§ OpenIoT use case of Zagreb city
§ Experimental Evaluation
§ Conclusions
Use Case Description
§ What?
  – Aims at understanding air pollution dynamics in Smart Cities by integrating data from multiple sensors
§ How?
  – Correlating various air quality variables at different times and locations.
§ Uses the OpenIoT platform.
Sensors
§ Custom-designed, off-the-shelf components that communicate via Bluetooth.
§ Electrochemical sensors to measure CO and either NO2 or SO2.
§ Can be mounted on bikes, backpacks, or clothes.
§ Sensors send data to a mobile app on a smartphone, which transmits the data to OpenIoT in the cloud.
Dataset
§ Real sensor data measured in the city of Zagreb as part of the SenseZGAir campaign.
§ 20 volunteers (sending nodes to the OpenIoT)
§ 144 km² area covered, total distance of 758.6 km.
§ 16,835 data points of 7 dimensions (NO2 or SO2).
Data Partitioning
§ Time:
  – Data is divided into windows of 2 hours.
  – Non-overlapping windows.
  – Total of 28 windows worth of data.
§ Location:
  – Divided into 4 zones.
  – Data is spatially grouped using K-means.
  – Centroids are points of interest.
  – Voronoi-like partitions.
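The two partitioning steps above can be sketched in a few lines of Python. This is only an illustration, not the code used in the presented work: the function names, the `(timestamp, lat, lon, value)` reading layout, the random initialization, and the treatment of coordinates as planar points are all assumptions.

```python
import math
import random
from collections import defaultdict

WINDOW_SECONDS = 2 * 60 * 60  # 2-hour non-overlapping (tumbling) windows


def window_index(timestamp, start, size=WINDOW_SECONDS):
    """Index of the tumbling window that contains `timestamp` (in seconds)."""
    return int((timestamp - start) // size)


def nearest(point, centroids):
    """Index of the closest centroid; this induces Voronoi-like zones."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on 2-D points (hypothetical helper)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centroids)].append(p)
        new_centroids = []
        for i in range(k):
            members = groups.get(i)
            if members:
                # New centroid is the coordinate-wise mean of its members.
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        centroids = new_centroids
    return centroids


def partition(readings, k=4):
    """Group readings (timestamp, lat, lon, value) by (window, zone)."""
    start = min(r[0] for r in readings)
    centroids = kmeans([(r[1], r[2]) for r in readings], k)
    cells = defaultdict(list)
    for ts, lat, lon, value in readings:
        cells[(window_index(ts, start), nearest((lat, lon), centroids))].append(value)
    return cells, centroids
```

Each `(window, zone)` cell then holds the readings that one correlation graph would be computed from.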
Location Partitioning
Z1 -> 4431 points
Z2 -> 307 points
Z3 -> 2148 points
Z4 -> 7019 points
Correlation Measures
§ Pearson Correlation Coefficient:
  – r = sum((x_i - mean(x)) * (y_i - mean(y))) / sqrt(sum((x_i - mean(x))^2) * sum((y_i - mean(y))^2))
§ Understanding r:
  – r == 1: perfectly positively correlated (green in graph)
  – r == -1: perfectly negatively correlated (red in graph)
  – r == 0: no linear correlation (grey in graph)
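As a minimal sketch, the coefficient can be computed directly from its standard definition. The function name `pearson_r` and the choice to return NaN for a constant series are assumptions made for this example, not details from the presented work.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # r is undefined when one series is constant
    return cov / math.sqrt(var_x * var_y)
```

For instance, `pearson_r([1, 2, 3], [2, 4, 6])` is 1 up to floating-point rounding, and reversing the second series gives -1.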
Correlation Analysis
§ Undirected correlation graph G = (V, E), where V represents the n variables and E represents the correlations between them.
§ Visually represented, where A to F are variables
§ Edges:
  – Green -> positive corr.
  – Red -> negative corr.
  – None
§ Total of 94 graphs.
§ Size of a node shows the reading level
§ Thickness of an edge indicates the value |r|
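The edge encoding described above could be derived from a table of pairwise r values roughly as follows. This is a hedged sketch: the `correlation_graph` name, the dictionary layout, and the 0.3 cutoff below which no edge is drawn are assumptions for illustration, since the slides do not state a threshold.

```python
def correlation_graph(r_values, threshold=0.3):
    """Turn {(var_a, var_b): r} into drawable edges.

    `threshold` is a hypothetical cutoff: weaker correlations get no edge.
    """
    edges = []
    for (a, b), r in sorted(r_values.items()):
        if abs(r) < threshold:
            continue  # too weak: draw no edge
        edges.append({
            "nodes": (a, b),
            "color": "green" if r > 0 else "red",  # sign of the correlation
            "width": abs(r),                        # thickness encodes |r|
        })
    return edges
```

Calling it with `{("CO", "SO2"): 0.9, ("Humidity", "Temperature"): -0.8, ("CO", "Pressure"): 0.1}` yields a green CO/SO2 edge, a red Humidity/Temperature edge, and no edge for CO/Pressure.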
Hot Day Results
§ Strong (-) correlation between Humidity and Temperature -> known trend.
§ Strong (+) correlation between CO and SO2 -> both gases are produced by the same source; it weakens in the last window -> (traffic?).
§ CO and Humidity correlation weakens -> possible effect of humidity on the contaminant?
Night Results
§ Temperature, CO, and pressure show varying effects, and all switch signs during this reading.
§ The same is true for humidity and pressure -> in low-population areas, pollutant gases are easily affected by environmental conditions.
§ Note: Zone 4 was split into 2 sub-zones that showed new patterns -> mobile crowd-sensed data is geographically valuable.
Window Size
§ Experimented with different sizes (15 min to 12 hours).
§ Trade-off:
  – Too small: fast analysis, but unreliable correlations.
  – Too large: very good correlations, but high latency.
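One way to see the trade-off is to compare, for each candidate size, how long one must wait for a window to close against how few samples the sparsest window contains. The sketch below is illustrative only; `window_stats` and its output fields are hypothetical names, and the sample count is used as a rough proxy for how trustworthy a correlation would be.

```python
from collections import Counter


def window_stats(timestamps, window_seconds):
    """Summarize tumbling windows of a given size over timestamps (seconds)."""
    start = min(timestamps)
    counts = Counter(int((t - start) // window_seconds) for t in timestamps)
    return {
        "latency_s": window_seconds,          # wait before a window can be analyzed
        "windows": len(counts),               # number of non-empty windows
        "min_samples": min(counts.values()),  # sparsest window (fewest readings)
    }
```

With one reading per minute over a day, 15-minute windows leave only 15 samples per window, while 12-hour windows hold 720 samples each but delay results by 12 hours.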
Window Size Effect
No capturing of SO2 + poor corr.
Fine-grained corr., but very long latency
Conclusions
§ Introduced an OpenIoT approach for real time sensor data processing in the cloud.
§ OpenIoT was presented in a way that showed IoT adaptation towards supporting analytics work on collected IoT data.
§ SenseZGAir collected data that described air quality through analytics by correlating different parameters.
§ Graphical representation enables scientists and researchers to easily draw conclusions about the air quality.
Questions?
Thank you :)