BIG DATA EUROPE H2020 CSA (2015-17) BDE PILOT INSTANTIATION Ronald Siebes VU Amsterdam Integrating Big Data, Software & Communities for Addressing Europe’s Societal Challenges 02.03.2016

BDE Technical Webinar 1 : Pilot Instantiation

Apr 15, 2017


BigData_Europe
Transcript
Page 1: BDE Technical Webinar 1 :  Pilot Instantiation

BIG DATA EUROPE H2020 CSA (2015-17)
BDE PILOT INSTANTIATION
Ronald Siebes, VU Amsterdam

Integrating Big Data, Software & Communities for Addressing Europe’s Societal Challenges

02.03.2016

Page 2: BDE Technical Webinar 1 :  Pilot Instantiation

BigDataEurope

8-Mar-16 www.big-data-europe.eu

The 7 Societal Challenges and their first pilots

Page 3: BDE Technical Webinar 1 :  Pilot Instantiation

SC1: Life Sciences & Health

SC1: Life Sciences & Health

Page 4: BDE Technical Webinar 1 :  Pilot Instantiation

SC1: Life Sciences & Health

Partners:

A not-for-profit membership organization, which supports and continues the development of the information infrastructure created during the Open PHACTS project of the Innovative Medicines Initiative (IMI).

The VU Amsterdam was a key participant in the Open PHACTS project responsible for developing the Linked-Data infrastructure.

Big Data Focus area: Large-scale heterogeneous pharma-research data linking & integration
Selected Key Data assets: ACD Labs / ChemSpider, ChEBI, ChEMBL, ConceptWiki, DrugBank, ENZYME, Gene Ontology, GO Annotation, SwissProt, WikiPathways

Page 5: BDE Technical Webinar 1 :  Pilot Instantiation

SC1: Life Sciences & Health

Page 6: BDE Technical Webinar 1 :  Pilot Instantiation

SC1: Life Sciences & Health

Page 7: BDE Technical Webinar 1 :  Pilot Instantiation

SC1: Life Sciences & Health

Page 8: BDE Technical Webinar 1 :  Pilot Instantiation

SC1: Life Sciences & Health

Pilot 1: Duplicate Open PHACTS functionality on the BDE infrastructure using Open Source solutions

Reasons:
•  Deployment possible in-house
•  Vary domains (e.g. Agriculture)
•  Using extra BDE functionalities (e.g. logging, analysis)

Page 9: BDE Technical Webinar 1 :  Pilot Instantiation

SC1: Life Sciences & Health

BDE infrastructure
-  Large scale RDF reasoning over 3 billion+ triples
-  RESTful API
-  Various front ends
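To make "RDF reasoning" concrete, here is a minimal sketch of rule-based inference over a handful of in-memory triples. It is not how the BDE platform or Open PHACTS performs reasoning at the 3 billion+ triple scale; the `ex:` identifiers and the `infer` function are hypothetical, and the naive nested loop is only suitable for toy data.

```python
# Toy RDFS-style inference over an in-memory set of (subject, predicate, object) triples.
# All ex:... identifiers are hypothetical examples, not Open PHACTS data.

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer(triples):
    """Return the closure of `triples` under two rules:
    1. subClassOf is transitive;
    2. an instance of a class is also an instance of its superclasses."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in closed:
            for (s2, p2, o2) in closed:
                # Only chain through "o subClassOf o2" statements.
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:
                    new.add((s, SUBCLASS, o2))
                elif p == TYPE:
                    new.add((s, TYPE, o2))
        if not new <= closed:
            closed |= new
            changed = True
    return closed


facts = {
    ("ex:Aspirin", TYPE, "ex:Drug"),
    ("ex:Drug", SUBCLASS, "ex:Chemical"),
    ("ex:Chemical", SUBCLASS, "ex:Entity"),
}
inferred = infer(facts)
print(("ex:Aspirin", TYPE, "ex:Entity") in inferred)
```

The loop repeats until no new triple is produced, so facts that need two chained steps (Aspirin is a Drug, a Drug is a Chemical, a Chemical is an Entity) are still derived.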

Page 10: BDE Technical Webinar 1 :  Pilot Instantiation

SC2: Food & Agriculture

SC2: Food & Agriculture

Page 11: BDE Technical Webinar 1 :  Pilot Instantiation

SC2: Food & Agriculture

Partners: FAO, the largest autonomous agency within the United Nations system and one of the main players in the agricultural information community.

Big Data Focus area: Large-scale distributed agricultural data integration
Selected Key Data assets: INFOODS, AQUASTAT, Green Learning Network (GLN), Agricultural Bibliography Network (ABN), AgroVoc, AquaMaps, Fishbase

Semantic Web Company (SWC) is a technology provider headquartered in Vienna (Austria). SWC supports organizations from all industrial sectors worldwide to improve their information management. Their core product is to extract meaning from big data by making use of linked data technologies.

Agroknow is a company that captures, organizes and adds value to the rich information available in agricultural and food sciences, in order to make it universally accessible, useful and meaningful.

Page 12: BDE Technical Webinar 1 :  Pilot Instantiation

SC2: Food & Agriculture

AGINFRA

Page 13: BDE Technical Webinar 1 :  Pilot Instantiation

SC2: Food & Agriculture

Pilot focus area: Viticulture

Viticulture (from the Latin word for vine) is the science, production, and study of grapes. It deals with the series of events that occur in the vineyard.

Page 14: BDE Technical Webinar 1 :  Pilot Instantiation

SC2: Food & Agriculture

Pilot 2: Support advanced crop data discovery, processing, combining and visualization from distributed and heterogeneous data repositories

Reasons:
•  Vine and Wine sector: emerging market in EU
•  Sustainability and biodiversity challenges: local varieties are being lost
•  Exploitation of new grapevine varieties and clones in terms of climate change adaptation
•  Quality and health status of viticultural products
•  Contribution to human health (antioxidants, prevention of heart diseases etc.)
•  Wide variety of heterogeneous (and big) data from various information sources

Page 15: BDE Technical Webinar 1 :  Pilot Instantiation

SC2: Food & Agriculture

BDE infrastructure tasks
-  Large scale data extraction and integration processing from external data sources (tables, figures, texts)
-  Analysis batch jobs for generating statistical data
-  Rich query support combining various parameters (e.g. location, geno/phenotypes, publications, soil data)
-  Various front ends similar to PubMed
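As an illustration of a query that combines several parameters, the sketch below filters integrated records on any mix of attributes. The `VineRecord` fields, the `query` helper and the sample values are all hypothetical; the actual pilot schema and query interface are not described on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VineRecord:
    # Hypothetical fields standing in for integrated viticulture data.
    variety: str
    location: str
    soil: str
    publications: int


def query(records, **criteria):
    """Keep the records whose attributes equal every given criterion."""
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in criteria.items())
    ]


records = [
    VineRecord("Variety A", "Region 1", "clay", 12),
    VineRecord("Variety B", "Region 1", "sand", 3),
    VineRecord("Variety A", "Region 2", "clay", 7),
]

# Combine two parameters: location and soil type.
hits = query(records, location="Region 1", soil="clay")
print([r.variety for r in hits])
```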

Page 16: BDE Technical Webinar 1 :  Pilot Instantiation

SC3: Energy

SC3: Energy

Page 17: BDE Technical Webinar 1 :  Pilot Instantiation

SC3: Energy

Partners: A public entity supervised by the Ministry of Environment, Energy and Climate Change in Greece, founded in September 1987, active in the fields of Renewable Energy Sources (RES), Rational Use of Energy (RUE) and Energy Saving (ES).

Big Data Focus area: Real-time turbine monitoring stream processing and analytics
Selected Key Data assets: European Energy Exchange Data, smart meter sensor data, gas/fuels market/price data, consumption statistics, stratigraphic model data (geology, geophysics)

NCSR "Demokritos", the largest multidisciplinary research centre of Greece, hosts significant scientific research, technological development and educational activities, coordinated by eight Institutes.

Page 18: BDE Technical Webinar 1 :  Pilot Instantiation

SC3: Energy

Pilot focus area: System monitoring in energy production units.

Page 19: BDE Technical Webinar 1 :  Pilot Instantiation

SC3: Energy

Pilot 3: Operation, maintenance and production forecasting for wind turbines on real-time sensor data.

Reasons:
•  Current technology is not able to deal with the full amount of available valuable data
•  Economic benefit of predicting output and preventing damage (if one can predict that a part is about to fail, damage to other parts can be prevented)
•  Large continuous stream of sensor data, perfect to test our platform

Page 20: BDE Technical Webinar 1 :  Pilot Instantiation

SC3: Energy

Data:
-  Raw sensor and SCADA data from a given wind farm
-  Third-party raw or synthetic data
-  Analysis results from built-in analysis modules

Processing:
•  Near-real-time execution of parameterized models to return operational statistics, including correlation analysis of data across units
•  Weekly execution of operational statistics
•  Weekly execution of model parametrization
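The "correlation analysis of data across units" mentioned above can be sketched as a pairwise Pearson correlation between per-turbine time series. This is only an assumption about what such an analysis might look like: the slide does not say which statistic the pilot uses, and the turbine names and power readings below are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlation_pairs(series):
    """Correlate every pair of units once; keys are (unit_a, unit_b)."""
    names = sorted(series)
    return {
        (a, b): pearson(series[a], series[b])
        for a in names for b in names if a < b
    }


# Hypothetical power readings, one list per turbine, same timestamps.
power = {
    "turbine_1": [1.0, 2.0, 3.0, 4.0],
    "turbine_2": [2.0, 4.0, 6.0, 8.0],
    "turbine_3": [4.0, 3.0, 2.0, 1.0],
}
pairs = correlation_pairs(power)
for units, r in pairs.items():
    print(units, round(r, 3))
```

A unit whose readings stop correlating with its neighbours is one possible signal worth inspecting, which ties in with the damage-prevention motivation of the pilot.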

Page 21: BDE Technical Webinar 1 :  Pilot Instantiation

SC4: Transport

SC4: Transport

Page 22: BDE Technical Webinar 1 :  Pilot Instantiation

SC4: Transport

Partners: The Fraunhofer Society is a German research organization with 67 institutes spread throughout Germany, each focusing on different fields of applied science.

Big Data Focus area: Streaming sensor network & geo-spatial data integration
Selected Key Data assets: GTFS data, OSM/LinkedGeoData, MobilityMaps, Transport sensor data, ROSATTE Road safety attributes, European Road Data Infrastructure - EuroRoadS

The Centre for Research and Technology-Hellas (CERTH), founded in 2000, is one of the leading research centres in Greece. CERTH includes the Hellenic Institute of Transport (HIT), which covers land, sea and air transportation as well as sustainable mobility services.

ERTICO - ITS Europe is a partnership of around 100 companies and institutions involved in the production of Intelligent Transport Systems (ITS).

Page 23: BDE Technical Webinar 1 :  Pilot Instantiation

SC4: Transport

Pilot focus area: Info mobility and traffic planning

Page 24: BDE Technical Webinar 1 :  Pilot Instantiation

SC4: Transport

Pilot 4: Multisource data collection for the provision of accurate info-mobility and advanced transport planning service in Thessaloniki, Greece

Reasons:
•  Congestion is a major problem in Europe, especially in urban areas.
•  Utilizing real-time probe data for the provision of accurate info-mobility services and advanced transport planning leads to better decisions.
•  The use of mobility data coming from multiple sources presents significant challenges, both because the datasets differ in content and in spatio-temporal terms and because the data must be collected and processed in real time.

Page 25: BDE Technical Webinar 1 :  Pilot Instantiation

SC4: Transport

Data:
•  Traffic counts and speed (330 locations, a data set every 1.5 – 5 minutes, 300k records, 15 MB)
•  Travel times from Bluetooth detectors (43 locations, a data set every 15 minutes, 250k-300k records, 50 MB)
•  Floating Car Data position and speed (1200 vehicles, a data set every 2 minutes, 2M records, 200 MB)
•  Check-in events from social networks
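Because these sources report at different intervals (every 1.5 to 5 minutes, every 2 minutes, every 15 minutes), one common first step is to resample them onto a shared time grid before combining them. The sketch below averages speeds into 15-minute bins; the function name, the tuple layout and the sample readings are assumptions for illustration, not the pilot's actual pipeline.

```python
from collections import defaultdict


def bin_average(readings, bin_seconds):
    """Average (timestamp_seconds, speed_kmh) readings per time bin.

    Returns a dict mapping each bin's start time to the mean speed in it."""
    sums = defaultdict(lambda: [0.0, 0])
    for timestamp, speed in readings:
        start = timestamp - timestamp % bin_seconds
        acc = sums[start]
        acc[0] += speed
        acc[1] += 1
    return {start: total / count for start, (total, count) in sums.items()}


# Hypothetical readings from two sources with different reporting rates.
loop_detectors = [(0, 40.0), (120, 50.0), (900, 30.0)]
floating_cars = [(60, 44.0), (180, 46.0), (960, 28.0), (1080, 32.0)]

BIN = 900  # 15 minutes, the coarsest interval listed above
detector_bins = bin_average(loop_detectors, BIN)
car_bins = bin_average(floating_cars, BIN)

# Keep only the bins that both sources cover, side by side.
merged = {
    t: (detector_bins[t], car_bins[t])
    for t in sorted(detector_bins.keys() & car_bins.keys())
}
print(merged)
```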

Page 26: BDE Technical Webinar 1 :  Pilot Instantiation

SC5: Climate

SC5: Climate

Page 27: BDE Technical Webinar 1 :  Pilot Instantiation

SC5: Climate

Partners: A public entity supervised by the Ministry of Environment, Energy and Climate Change in Greece, founded in September 1987, active in the fields of Renewable Energy Sources (RES), Rational Use of Energy (RUE) and Energy Saving (ES).

Big Data Focus area: Enormous simulation time. Extremely complicated computing model.
Selected Key Data assets: European Grid Infrastructure (EGI). Access to several data centres hosted at CNRS-Lyon, NCSR-D Athens, INFN-Milan, NIKhEF-Amsterdam.

NCSR "Demokritos", the largest multidisciplinary research centre of Greece, hosts significant scientific research, technological development and educational activities, coordinated by eight Institutes.

Page 28: BDE Technical Webinar 1 :  Pilot Instantiation

SC5: Climate

Pilot focus area: Supporting data-intensive climate research

Page 29: BDE Technical Webinar 1 :  Pilot Instantiation

SC5: Climate

Pilot 5: Downscaling and retrieval of (raw) climate data via user-defined parameters (e.g. geographical areas, time period, physical variables, computational grids, time steps)

Reasons:
•  The provision of climate model data satisfies an important objective: assessing the potential impacts of climate change on well-being, in support of adaptation, prevention and mitigation measures and other policy-making decisions.
•  This awareness has led to the availability of huge datasets
•  Downscaling is a computationally intensive process
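The retrieval half of Pilot 5, selecting data by user-defined parameters, can be sketched as a filter over geographical area, time period and physical variable. This covers only the selection step, not downscaling itself, and the record layout, the `subset` function and the sample values are hypothetical; real CMIP5/CORDEX data would be read from NetCDF files rather than Python dicts.

```python
def subset(records, lat_range, lon_range, time_range, variable):
    """Select records for one variable inside a bounding box and a time period.

    Dates are ISO strings (YYYY-MM-DD), which compare correctly as text."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    t_start, t_end = time_range
    return [
        r for r in records
        if r["variable"] == variable
        and lat_min <= r["lat"] <= lat_max
        and lon_min <= r["lon"] <= lon_max
        and t_start <= r["time"] <= t_end
    ]


# Hypothetical grid-point values; "tas" and "pr" stand for temperature
# and precipitation.
records = [
    {"variable": "tas", "lat": 38.0, "lon": 23.7, "time": "2016-01-01", "value": 10.5},
    {"variable": "tas", "lat": 52.4, "lon": 4.9, "time": "2016-01-01", "value": 4.2},
    {"variable": "pr", "lat": 38.0, "lon": 23.7, "time": "2016-01-01", "value": 1.1},
    {"variable": "tas", "lat": 38.0, "lon": 23.7, "time": "2016-02-01", "value": 11.0},
]

selection = subset(
    records,
    lat_range=(35.0, 42.0),
    lon_range=(19.0, 29.0),
    time_range=("2016-01-01", "2016-01-31"),
    variable="tas",
)
print(selection)
```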

Page 30: BDE Technical Webinar 1 :  Pilot Instantiation

SC5: Climate

Data:
•  Earth System Grid Federation (ESGF) data:
   •  CMIP5 data (global climate model simulations)
   •  CORDEX data (regional climate model simulations)
   •  NetCDF data
•  European Centre for Medium-Range Weather Forecasts (ECMWF) data

Page 31: BDE Technical Webinar 1 :  Pilot Instantiation

SC5: Climate

Page 32: BDE Technical Webinar 1 :  Pilot Instantiation

SC6: Social Sciences

SC6: Social Sciences

Page 33: BDE Technical Webinar 1 :  Pilot Instantiation

SC6: Social Sciences

Partners: CESSDA provides large scale, integrated and sustainable data services to the social sciences. CESSDA is organised as a limited company under Norwegian law owned and financed by the individual EU member states’ ministry of research or a delegated institution.

Big Data Focus area: Statistical and research data linking & integration
Selected Key Data assets: Federated social sciences data catalogs, statistical data from public data portals and statistical offices (e.g. Eurostat, UNESCO, World Bank)

NCSR "Demokritos", the largest multidisciplinary research centre of Greece, hosts significant scientific research, technological development and educational activities, coordinated by eight Institutes.

Page 34: BDE Technical Webinar 1 :  Pilot Instantiation

SC6: Social Sciences

Pilot focus area: Citizens' budget spending at municipal level

Page 35: BDE Technical Webinar 1 :  Pilot Instantiation

SC6: Social Sciences

Pilot 6: Citizens' budget at municipal level

Reasons:
•  Budget: the most important document of public policy
•  Budget execution affects everyday lives
•  Citizens are more involved at city level
•  Having a platform that integrates heterogeneous budget data (many municipalities have their own data formats) and calculates infographics would benefit the citizens, the research community and policy makers

Page 36: BDE Technical Webinar 1 :  Pilot Instantiation

SC6: Social Sciences

Data:
•  Data stream from Greek municipalities, with codes that are unique identifiers based on the national accounting system for municipalities
•  Data from 3 cities in Greece (highest detail)
•  Updated several times within the day (streams with no memory) -> converted into daily observations
•  Available through API or CSV/XLS
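Converting a stream that is updated several times a day into daily observations can be sketched as follows. The sketch assumes that each update carries the current cumulative amount, so the latest update of a day is the one to keep; the slide does not state the aggregation rule, and the budget codes and amounts are invented.

```python
from datetime import datetime


def to_daily(updates):
    """Reduce (iso_timestamp, budget_code, amount) updates to one value
    per (date, budget_code), keeping the latest update of each day."""
    latest = {}
    for timestamp, code, amount in updates:
        when = datetime.fromisoformat(timestamp)
        key = (when.date().isoformat(), code)
        if key not in latest or when >= latest[key][0]:
            latest[key] = (when, amount)
    return {key: amount for key, (when, amount) in latest.items()}


# Hypothetical intraday updates for two budget codes.
updates = [
    ("2016-03-01T09:00:00", "0611", 100.0),
    ("2016-03-01T15:30:00", "0611", 140.0),
    ("2016-03-01T11:00:00", "0712", 20.0),
    ("2016-03-02T10:00:00", "0611", 150.0),
]
daily = to_daily(updates)
for key in sorted(daily):
    print(key, daily[key])
```

Because the stream has no memory, the reduction has to run before earlier values are overwritten; if the amounts were increments rather than running totals, summing per day would be the appropriate rule instead.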

Page 37: BDE Technical Webinar 1 :  Pilot Instantiation

SC7: Security

SC7: Security

Page 38: BDE Technical Webinar 1 :  Pilot Instantiation

SC7: Security

Partners: The Centre supports the decision making of the European Union in the field of the Common Foreign and Security Policy (CFSP), by providing products and services resulting from the exploitation of relevant space assets and collateral data, including satellite imagery and aerial imagery, and related services.

NCSR "Demokritos", the largest multidisciplinary research centre of Greece, hosts significant scientific research, technological development and educational activities, coordinated by eight Institutes.

Page 39: BDE Technical Webinar 1 :  Pilot Instantiation

SC7: Security

Big Data Focus area: Image data analysis
Selected Key Data assets: Earth Observation data (e.g. Very High Resolution Satellite Imagery acquired from commercial providers and governmental systems) and collateral data for supporting CFSP/CSDP missions and operations

Page 40: BDE Technical Webinar 1 :  Pilot Instantiation

SC7: Security

Pilot focus area: Gaining insight into man-made surface changes, triggered by automatic detection, news, or social media information

Page 41: BDE Technical Webinar 1 :  Pilot Instantiation

SC7: Security

Pilot 7: Ingestion of remote sensing images and social sensing data to detect and verify man-made changes on the Earth's surface for security applications

Reasons:
•  Evacuation route planning
•  Monitoring of critical infrastructures
•  Border security
•  Satellite image data is huge and computationally intensive to compare
•  Smart ‘focus’ algorithms are needed to prioritize the analysis jobs
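One simple way to prioritize analysis jobs is a priority queue keyed on a relevance score, so that the areas most likely to contain a change are compared first. The sketch below uses Python's standard `heapq` module; the area names and scores are hypothetical (a score might, for example, come from news or social media mentions), and the pilot's actual "focus" algorithm is not described on the slide.

```python
import heapq


def prioritise(jobs):
    """Yield area ids from (area_id, score) pairs, highest score first.

    heapq is a min-heap, so scores are negated; the index breaks ties
    in submission order."""
    heap = [(-score, index, area) for index, (area, score) in enumerate(jobs)]
    heapq.heapify(heap)
    while heap:
        _, _, area = heapq.heappop(heap)
        yield area


# Hypothetical change-detection jobs with relevance scores.
jobs = [("area_A", 0.2), ("area_B", 0.9), ("area_C", 0.5)]
order = list(prioritise(jobs))
print(order)
```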

Page 42: BDE Technical Webinar 1 :  Pilot Instantiation

SC7: Security

Data:
•  All data products are distributed in the SENTINEL Standard Archive Format for Europe (SAFE) format
•  The SENTINEL-SAFE format wraps a folder containing image data in a binary data format and product metadata in XML
•  Social Media, which are demonstrated via consuming Twitter streams
•  News agencies, which are demonstrated via consuming Reuters RSS feeds

Page 43: BDE Technical Webinar 1 :  Pilot Instantiation

SC7: Security
