From Big Linked Data to Linked Big Data: DBpedia as a framework for data integration Giuseppe Futia 1 , Antonio Vetrò 1 , Giuseppe Rizzo 2 1- Nexa Center for Internet and Society, DAUIN, Politecnico di Torino 2- Istituto Superiore Mario Boella (ISMB) 7th DBpedia Community Meeting in Leipzig 15 September 2016
From Big Linked Data to Linked Big Data: DBpedia as a framework for data integration
Giuseppe Futia1, Antonio Vetrò1, Giuseppe Rizzo2
1- Nexa Center for Internet and Society, DAUIN, Politecnico di Torino 2- Istituto Superiore Mario Boella (ISMB)
7th DBpedia Community Meeting in Leipzig, 15 September 2016
Antonio Vetro'
What do you want to say in this slide? What do the bubbles represent? It is not clear at the moment.
Giuseppe Futia
I agree, I will replace it with 2 slides.
Giuseppe Futia
This could be a slide where I insert a diagram of what I would like to build.
Giuseppe Futia
It is worth rereading the Fujitsu papers and trying to extract concepts that are useful for this purpose.
Giuseppe Futia
DONE
Giuseppe Futia
It may be worth going back to the DBpedia Spotlight papers, and perhaps another one on DBpedia NLP, so as to avoid saying nonsense during the presentation. This would also fit with the ones that turn out to be blah blah.
Giuseppe Futia
Can I start to mention something about Deep Learning for text? Perhaps I can mention the Facebook paper on this topic, or the studies I had started to find. Maybe using TensorFlow in some way? That would be great.
Giuseppe Futia
I need to read the RML paper.
Giuseppe Futia
It is also worth reading the following chapters on Map/Reduce, so as to be able to outline a first experiment that can then be carried out.
Giuseppe Futia
I did not understand the third point I wrote.
Giuseppe Futia
I need to read a paper on this, partly to understand what can be said about the ontology, and partly to see whether there are interesting use cases that support my ideas.
PhD candidate on semantics at Nexa Center for Internet & Society, DAUIN, Politecnico di Torino
Experiences with LOD and DBpedia
• TellMeFirst, a tool for classifying and enriching textual documents built on DBpedia Spotlight (http://tellmefirst.polito.it)
• Contratti Pubblici, a tool for processing, exploring, and visualizing Italian Public Procurements (http://public-contracts.nexacenter.org/)
• Big Linked Data
  – Already implemented, as shown by the exponential growth of Linked Data in recent years
• Linked Big Data
  – RDF data model for Big Data Variety
  – Meta information to enable powerful analytics
  – Simplify Big Data access, integration, and interlinking
From Big Linked Data to Linked Big Data
Big Data notion of Variety
• Variety of data and representation formats
• Variety of conceptualizations and data models
• Variety related to temporal and spatial dependencies
• Variety as a “generalization of the semantic heterogeneity as studied in the field of Linked Data”
(Pascal Hitzler & Krzysztof Janowicz)
PhD research questions (i)
• RQ1: How can the technological foundations of Linked Data and Big Data be further improved and combined to create an open software architecture for a multi-thematic, multi-perspective, and multi-medial knowledge graph from heterogeneous sources?
PhD research questions (ii)
• RQ2: What are the features of a research method that can meet and evaluate the security, scalability, performance, openness, and interoperability of the software architecture mentioned earlier? And how can we measure the quality of the knowledge graph produced with this software architecture?
Key ideas for my PhD
• Get concepts and ontologies from the DBpedia knowledge base to support semantic alignment during the integration stage
• Use frameworks for data integration of structured information with Big Data technologies: RDF Mapping Language (RML) + Hadoop or Spark
• Exploit Machine Learning techniques (e.g., Deep Learning) to enrich datasets with information extracted from unstructured data
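The idea of mapping structured records to RDF can be illustrated with a minimal sketch. This is not RML itself, and it does not use Hadoop or Spark: it is a plain-Python stand-in for a declarative mapping, in which a dictionary associates the columns of a tabular record with predicate URIs and each row becomes a set of triples serialized as N-Triples. The `MAPPING` dictionary, the `http://example.org/contract/` namespace, the helper names, and the sample row are all assumptions made for illustration; only the DBpedia and RDFS namespaces are real.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

DBO = "http://dbpedia.org/ontology/"
DBR = "http://dbpedia.org/resource/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/contract/"  # hypothetical namespace for subjects

# Hypothetical mapping: column name -> (predicate URI, kind of object).
MAPPING: Dict[str, Tuple[str, str]] = {
    "title": (RDFS_LABEL, "literal"),
    "country": (DBO + "country", "resource"),
}


def row_to_triples(row: Dict[str, str], id_column: str = "id") -> List[Triple]:
    """Turn one tabular record into RDF triples according to MAPPING."""
    subject = f"<{BASE}{row[id_column]}>"
    triples: List[Triple] = []
    for column, (predicate, kind) in MAPPING.items():
        value = row.get(column)
        if not value:
            continue  # skip columns that are missing or empty
        if kind == "resource":
            # Assumption: the cell value matches a DBpedia resource name.
            obj = f"<{DBR}{value.replace(' ', '_')}>"
        else:
            # Escape backslashes and quotes for an N-Triples literal.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        triples.append((subject, f"<{predicate}>", obj))
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples, one N-Triples statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    sample = {"id": "42", "title": "Road maintenance", "country": "Italy"}
    print(to_ntriples(row_to_triples(sample)))
```

Because each row is mapped independently of the others, a function of this shape could in principle be applied record by record in a distributed job, which is one reason the combination with Hadoop or Spark looks plausible.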
DBpedia as knowledge base for:
• Entity linking and annotations in documents
• Assertion of additional categories for data
• Improvement of multilingual information
• Estimation of data quality of integrated information according to different features (e.g., provenance)
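The first use listed above, entity linking, can be sketched as a dictionary lookup. This is a deliberately naive illustration and not the method used by DBpedia Spotlight: it matches surface forms from a small hand-written gazetteer against the text and attaches a DBpedia resource URI to each match, with no disambiguation and no resolution of overlapping matches. The `GAZETTEER` entries, the `Annotation` type, and the `annotate` function are assumptions made for this example.

```python
import re
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    surface: str  # text span as it appears in the document
    start: int    # character offset of the span
    uri: str      # linked DBpedia resource


# Hypothetical gazetteer: lowercase surface form -> DBpedia resource URI.
GAZETTEER: Dict[str, str] = {
    "turin": "http://dbpedia.org/resource/Turin",
    "leipzig": "http://dbpedia.org/resource/Leipzig",
    "linked data": "http://dbpedia.org/resource/Linked_data",
}


def annotate(text: str, gazetteer: Dict[str, str] = GAZETTEER) -> List[Annotation]:
    """Return every gazetteer match in the text, ordered by position."""
    annotations: List[Annotation] = []
    for surface, uri in gazetteer.items():
        # Whole-word, case-insensitive match of the surface form.
        pattern = re.compile(r"\b" + re.escape(surface) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            annotations.append(Annotation(match.group(0), match.start(), uri))
    return sorted(annotations, key=lambda a: a.start)


if __name__ == "__main__":
    for ann in annotate("The meeting in Leipzig discussed Linked Data from Turin."):
        print(ann.start, ann.surface, ann.uri)
```

A real pipeline would replace the gazetteer with candidates drawn from the knowledge base and add a disambiguation step that uses the surrounding context.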
Challenges
• Greater accuracy (integrating different datasets)
• Immediacy (near-real time data, from new data sources)
• Flexibility (not constrained by database structure)
• Better analytics (the ability to change the rules)
• Data quality (reliability and effectiveness of data)