Opportunities and Challenges of Large-scale IoT Data Analytics
Payam Barnaghi
Institute for Communication Systems (ICS)/5G Innovation Centre, University of Surrey, Guildford, United Kingdom
ASEAN IoT Innovation Forum, Kuala Lumpur, Malaysia, August 2015
Cyber-Physical-Social Data
P. Barnaghi et al., "Digital Technology Adoption in the Smart Built Environment", IET Sector Technical Briefing, The Institution of Engineering and Technology (IET), I. Borthwick (editor), March 2015.
Internet of Things: The story so far
− RFID based solutions, Wireless Sensor and Actuator networks, solutions for communication technologies, energy efficiency, routing, …
− Smart Devices/Web-enabled Apps/Services, initial products, vertical applications, early concepts and demos, …
− Physical-Cyber-Social Systems, Linked-data, semantics, more products, more heterogeneity, solutions for control and monitoring, …
− Future: Cloud, Big (IoT) Data Analytics, Interoperability, Enhanced Cellular/Wireless Com. for IoT, Real-world operational use-cases and Industry and B2B services/applications, more Standards…
(Figure labels: motion sensors, ECG sensor)
P. Barnaghi, A. Sheth, "Internet of Things: the story so far", IEEE IoT Newsletter, September 2014.
“Each single data item is important.”
“Relying merely on data from sources that are unevenly distributed, without considering background information or social context, can lead to imbalanced interpretations and decisions.”?
Data: Challenges
− Multi-modal and heterogeneous
− Noisy and incomplete
− Time and location dependent
− Dynamic and varies in quality
− Crowd-sourced data can be unreliable
− Requires (near-) real-time analysis
− Privacy and security are important issues
− Data can be biased: we need to know our data!
Data Lifecycle
Source: The IET Technical Report, Digital Technology Adoption in the Smart Built Environment: Challenges and opportunities of data driven systems for building, community and city-scale applications, http://www.theiet.org/sectors/built-environment/resources/digital-technology.cfm
“The ultimate goal is transforming the raw data to insights and actionable knowledge and/or creating effective representation forms for machines and also human users and creating automation.”
This usually requires data from multiple sources, (near-) real time analytics and visualisation and/or semantic representations.
“Data will come from various sources and from different platforms and various systems.”
This requires an ecosystem of IoT systems with several backend support components (e.g. pub/sub, storage, discovery, and access services). Semantic interoperability is also a key requirement.
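As a rough illustration of the pub/sub component mentioned above, the following is a minimal in-memory sketch. The `Broker` class, the topic names and the payload fields are all hypothetical and are not taken from the talk; a real deployment would more likely use an existing message broker.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

# A handler receives the topic name and the published payload.
Handler = Callable[[str, Any], None]


class Broker:
    """Toy in-memory topic broker (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler for one exact topic name."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> int:
        """Deliver the payload to every handler on the topic; return the count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


# Example usage with made-up topics and sensor fields.
received = []
broker = Broker()
broker.subscribe("city/traffic", lambda topic, payload: received.append((topic, payload)))
delivered = broker.publish("city/traffic", {"sensor": "t-17", "vehicles_per_min": 42})
ignored = broker.publish("city/parking", {"free_spaces": 3})
```

The point of the sketch is the decoupling: publishers and consumers only share topic names, which is one reason why agreeing on the meaning of the data (semantic interoperability) matters.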
Device/Data interoperability
Slide adapted from the IoT talk given by Jan Holler of Ericsson at IoT Week 2015 in Lisbon.
Search on the Internet/Web in the early days
Accessing IoT data
“The internet/web norm (for now) is often to use an interface to search for the data; the search engines are usually information locators – return the link to the information; IoT data access is more opportunistic and context aware.”
The IoT requires context-aware and opportunistic push mechanisms, dynamic device/resource associations and (software-defined) data routing networks.
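One way to picture a context-aware push is a router that forwards an observation only to consumers whose current context matches it. The sketch below uses location as the only context dimension; the `Interest` and `ContextRouter` names, the coordinates and the 2 km radius are assumptions made for illustration, not part of the talk.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Interest:
    """A consumer's current context: where it is and how far it cares."""
    lat: float
    lon: float
    radius_km: float
    deliver: Callable[[dict], None]


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres (mean Earth radius 6371 km)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class ContextRouter:
    """Hypothetical router that pushes observations to nearby consumers only."""

    def __init__(self) -> None:
        self._interests: Dict[str, Interest] = {}

    def update_context(self, consumer_id: str, interest: Interest) -> None:
        """Consumers move, so associations are replaced rather than fixed."""
        self._interests[consumer_id] = interest

    def push(self, observation: dict) -> List[str]:
        """Deliver to every consumer within range; return the ids reached."""
        reached = []
        for consumer_id, interest in self._interests.items():
            distance = haversine_km(interest.lat, interest.lon,
                                    observation["lat"], observation["lon"])
            if distance <= interest.radius_km:
                interest.deliver(observation)
                reached.append(consumer_id)
        return reached


# Example: approximate, illustrative coordinates for two consumers.
inbox_a, inbox_b = [], []
router = ContextRouter()
router.update_context("a", Interest(51.2362, -0.5704, 2.0, inbox_a.append))
router.update_context("b", Interest(51.5074, -0.1278, 2.0, inbox_b.append))
observation = {"type": "traffic", "lat": 51.2400, "lon": -0.5700, "value": "congestion"}
first = router.push(observation)
# Consumer "b" moves close to the observation, so the association changes.
router.update_context("b", Interest(51.2380, -0.5710, 2.0, inbox_b.append))
second = router.push(observation)
```

Unlike a search interface, the consumer never issues a query here: data reaches it because its context changed.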
IoT environments are usually dynamic and (near-) real-time
Off-line Data analytics
Data analytics in dynamic environments
Image sources: ABC Australia and 2dolphins.com
What types of problems do we expect to solve using IoT and data analytics solutions?
Source: LA Times, http://documents.latimes.com/la-2013/
A smart City example
Future cities: A view from 1998
− Analysis of thousands of traffic, pollution, weather, congestion, public transport, waste and event sensory data to provide better transport and city management.
− Converting smart meter readings to information that can help predict and balance power consumption in a city.
− Monitoring elderly homes, personal and public healthcare applications.
− Event and incident analysis and prediction using (near) real-time data collected by citizen and device sensors.
− Turning social media data (e.g. Tweets) related to city issues into event and sentiment analysis.
− And many more…
EU FP7 CityPulse Project
CityPulse Consortium
Partners:
− Industrial: SIE (Austria, Romania), ERIC
− SME: AI
− Higher Education: UNIS, NUIG, UASO, WSU
− City: BR, AA
Duration: 36 months (2014-2017)
Components shown in the figure:
− Analytics Toolbox
− Context-aware Decision Support, Visualisation
− Knowledge-based Stream Processing
− Real-Time Monitoring & Testing
− Accuracy & Trust Modelling
− Semantic Integration
− On Demand Data Federation
− Open Reference Data Sets
− Real-Time IoT Information Extraction
− IoT Stream Processing
− Federation of Heterogeneous Data Streams
− Exposure APIs
Phases: Design-Time, Run-Time, Testing
Designing for real world problems
101 Smart City scenarios
http://www.ict-citypulse.eu/scenarios/
Dr Mirko Presser, Alexandra Institute, Denmark
Data Visualisation
Event Visualisation
CityPulse demo
Data abstraction
F. Ganz, P. Barnaghi, F. Carrez, "Information Abstraction for Heterogeneous Real World Internet Data", IEEE Sensors Journal, 2013.
Adaptable and dynamic learning methods
http://kat.ee.surrey.ac.uk/
Correlation analysis
Analysing social streams
City event extraction from social streams
Processing steps shown in the figure:
− Tweets from a city
− POS Tagging
− Hybrid NER + Event term extraction
− Geohashing
− Temporal Estimation
− Impact Assessment
− Event Aggregation
Stage labels: City Event Extraction, City Event Annotation
Background resources: OSM Locations, SCRIBE ontology, 511.org hierarchy
P. Anantharam, P. Barnaghi, K. Thirunarayan, A.P. Sheth, "Extracting City Traffic Events from Social Streams", ACM Trans. on Intelligent Systems and Technology, 2015.
Collaboration with Kno.e.sis, Wright State University
Geohashing
Example geohash cell (0.6 miles by 0.38 miles):
− Max-lat, Max-long: 37.7545166015625, -122.40966796875
− Min-lat, Max-long: 37.7490234375, -122.40966796875
− Max-lat, Min-long: 37.7545166015625, -122.420654296875
− Min-lat, Min-long: 37.7490234375, -122.420654296875
− Point inside the cell: 37.74933, -122.4106711
Hierarchical spatial structure of geohash for representing locations with variable precision.
Here the location string is 5H34
(Figure: nested grids of labelled cells, one grid per level of the hierarchy.)
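The hierarchical structure can be sketched with a small encoder. This version assumes the widely used base-32 geohash alphabet and bit interleaving (longitude first), which differs from the illustrative cell labels on the slide; the function names are hypothetical. Each extra character refines the previous cell, so a shorter string is always a prefix of a longer one for the same point.

```python
from typing import Tuple

# Commonly used geohash alphabet (assumed here; the slide uses other labels).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a point by repeatedly halving the lon/lat ranges."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, value = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value, lon_lo = (value << 1) | 1, mid
            else:
                value, lon_hi = value << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value, lat_lo = (value << 1) | 1, mid
            else:
                value, lat_hi = value << 1, mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # five bits make one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def geohash_bounds(code: str) -> Tuple[float, float, float, float]:
    """Return (min_lat, max_lat, min_lon, max_lon) of the cell for a code."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    use_lon = True
    for ch in code:
        index = BASE32.index(ch)
        for shift in range(4, -1, -1):
            bit = (index >> shift) & 1
            if use_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            use_lon = not use_lon
    return lat_lo, lat_hi, lon_lo, lon_hi


# The point from the slide, encoded with six characters.
code = geohash_encode(37.74933, -122.4106711, precision=6)
min_lat, max_lat, min_lon, max_lon = geohash_bounds(code)
```

Under these assumptions, a six-character code gives 15 bits per axis, and the cell bounds returned for the slide's point can be compared with the corner coordinates listed above.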
Social media analysis
City Infrastructure
Tweets from a city
P. Anantharam, P. Barnaghi, K. Thirunarayan, A. Sheth, "Extracting city events from social streams", ACM Transactions on TICS, 2014.
Social media analysis (deep learning – under construction)
http://iot.ee.surrey.ac.uk/citypulse-social/
Accumulated and connected knowledge?
Image courtesy: IEEE Spectrum
Reference Datasets
http://iot.ee.surrey.ac.uk:8080/datasets.html
Importance of Complementary Data
Users in control or losing control?
Image source: Julian Walker, Flickr
Data Analytics solutions for IoT data
− Great opportunities and many applications;
− Enhanced and (near-) real-time insights;
− Supporting more automated decision making and in-depth analysis of events and occurrences by combining various sources of data;
− Providing more and better information to citizens;
− …
However…
− We need to know our data and its context (density, quality, reliability, …)
− Open Data (there needs to be more real-time data)
− Complementary data
− Citizens in control
− Transparency and data management issues (privacy, security, trust, …)
− Reliability and dependability of the systems
In conclusion
− IoT data analytics is different from common big data analytics.
− Data collection in the IoT comes at the cost of bandwidth, network, energy and other resources.
− Data collection, delivery and processing also depend on multiple layers of the network.
− We need more resource-aware data analytics methods and cross-layer optimisations.
− The solutions should work across different systems and multiple platforms (Ecosystem of systems).
− Data sources are more than physical (sensory) observation.
− The IoT requires integration and processing of physical-cyber-social data.
− The extracted insights and information should be converted to a feedback and/or actionable information.
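To make the resource-awareness point concrete, here is a small sketch of send-on-delta reporting, a simple technique chosen for illustration and not a method named in the talk. A node transmits a reading only when it differs from the last transmitted value by at least `delta`, trading some accuracy for bandwidth and energy. The function name, the readings and the threshold are all made up.

```python
from typing import Iterable, List, Tuple

Reading = Tuple[float, float]  # (timestamp, value)


def send_on_delta(readings: Iterable[Reading], delta: float) -> List[Reading]:
    """Keep only readings that moved at least `delta` from the last one sent."""
    sent: List[Reading] = []
    last_value = None
    for timestamp, value in readings:
        # The first reading is always sent so the receiver has a baseline.
        if last_value is None or abs(value - last_value) >= delta:
            sent.append((timestamp, value))
            last_value = value
    return sent


# Hypothetical temperature readings sampled once per time unit.
readings = [(0, 20.0), (1, 20.1), (2, 20.2), (3, 21.0), (4, 21.1), (5, 19.9)]
transmitted = send_on_delta(readings, delta=0.5)
saved = len(readings) - len(transmitted)
```

A larger `delta` saves more transmissions but hides small changes, so the threshold would have to be chosen with the analytics task in mind.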
IET sector briefing report
Available at: http://www.theiet.org/sectors/built-environment/resources/digital-technology.cfm