Center for Open Middleware: RDF Validation in a Linked Data world. A vision beyond structural and value range validation. Miguel Esteban Gutiérrez, Raúl García Castro, Nandana Mihindukulasooriya. RDF Validation Workshop, September 10th-11th, 2013
RDF Validation in a Linked Data World - A vision beyond structural and value range validation
Data validation is a vital step in ensuring data quality, and expressive validation languages and their related tools are essential for a data model to be adopted by industry. Many data representation and storage technologies, such as relational databases and XML, use expressive schema languages to define the structure of and the constraints on data, which helps keep the data consistent and its quality intact. Semantic and Linked Data technologies are built upon the Open World Assumption and the Non-unique Name Assumption, so data validation becomes a challenge: the languages currently used to describe these constraints (i.e., RDF Schema and OWL) are better suited to inferring data than to validating it. There is a clear need for more expressive languages to define rules for validating RDF data.
However, taking a wider view of the use cases in which RDF data is used, and considering the applications that consume RDF data as Linked Data, reveals requirements and concerns that go beyond structural validation and value range validation. In this paper, we identify the requirements and factors that need to be taken into account and discussed in the context of data validation in applications that publish and consume Linked Data. These factors are grouped into three main categories: data source factors, procedure factors, and context factors. We believe that this broader view will help to identify the concrete requirements for data validation, especially in the context of Linked Data.
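The remark that RDF Schema is better suited to inference than to validation can be made concrete with a small sketch. It is an illustration only: triples are modelled as plain Python tuples, and the `ex:` names, the `domains` table, and both functions are assumptions introduced here, not part of the original material. Under an RDFS-style reading, an `rdfs:domain` statement adds a type to the subject; a validation-style reading of the same statement would instead report the missing type.

```python
# Triples are plain (subject, predicate, object) tuples.
# All ex: names are hypothetical and exist only for this illustration.
RDF_TYPE = "rdf:type"

data = {
    ("ex:doc1", "ex:author", "ex:alice"),
}

# Stands for the schema statement: ex:author rdfs:domain ex:Book
domains = {"ex:author": "ex:Book"}


def infer_domain_types(triples, domains):
    """RDFS-style reading: using a property entails the subject's type."""
    inferred = set(triples)
    for s, p, _ in triples:
        if p in domains:
            inferred.add((s, RDF_TYPE, domains[p]))
    return inferred


def check_domain_types(triples, domains):
    """Validation-style reading: report subjects whose type is not stated."""
    violations = []
    for s, p, _ in sorted(triples):
        if p in domains and (s, RDF_TYPE, domains[p]) not in triples:
            violations.append((s, p, domains[p]))
    return violations


entailed = infer_domain_types(data, domains)
violations = check_domain_types(data, domains)
print("new triples under inference:", entailed - data)
print("violations under validation:", violations)
```

The same input is accepted and enriched by the first function and rejected by the second, which is the gap a dedicated validation language would have to address.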
Transcript
RDF Validation in a Linked Data world A vision beyond structural and value range validation
Miguel Esteban Gutiérrez,
Raúl García Castro,
Nandana Mihindukulasooriya
RDF Validation Workshop, September 10th-11th, 2013
Linked Data & the ALM iStack Project
• Objective:
To foster the adoption of Linked Data technologies as the means for facilitating application integration in enterprise-grade environments in the ALM domain
• Challenge:
Provide the means for ensuring that the data exchanged between the applications of the enterprise portfolio is consistent and valid whilst keeping the integrity of the data in each of these applications
• Linked Data Enabled Application
  • Exposes all or part of its data following the Linked Data principles
  • The data exposed is “sound and complete” from the application perspective
• Linked Data Capable Application
  • Consumes data published following the Linked Data principles
• Linked Data Aware Application
  • Linked Data Enabled and Linked Data Capable application
  • Capable of integrating its own data with other Linked Data
Require RDF validation process
Designing the RDF validation process
• Data source factors
  • Behavioral aspects
  • Structural aspects
• Procedure factors
  • Data aspects
  • Temporal aspects
versions:ver1244 a ai:Version ;
    oslc_asset:version "1.0"^^xsd:string ;
    ai:isVersionOf products:prod3231 .
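As an illustration of the structural and value range validation that the talk takes as its starting point, the resource description above can be checked against a simple shape. This is a minimal sketch under stated assumptions: the triples are copied by hand into Python tuples, and `VERSION_SHAPE`, `kind_of`, and `validate` are hypothetical names; the cardinalities are invented for the example and are not taken from the slides.

```python
XSD_STRING = "xsd:string"

# The description above as (subject, predicate, object) tuples.
# Literals are modelled as (lexical form, datatype) pairs.
triples = [
    ("versions:ver1244", "rdf:type", "ai:Version"),
    ("versions:ver1244", "oslc_asset:version", ("1.0", XSD_STRING)),
    ("versions:ver1244", "ai:isVersionOf", "products:prod3231"),
]

# Hypothetical shape: property -> (min count, max count, expected kind).
VERSION_SHAPE = {
    "oslc_asset:version": (1, 1, "string-literal"),
    "ai:isVersionOf": (1, 1, "resource"),
}


def kind_of(obj):
    """Classify an object as a string literal, another literal, or a resource."""
    if isinstance(obj, tuple):
        return "string-literal" if obj[1] == XSD_STRING else "other-literal"
    return "resource"


def validate(subject, triples, shape):
    """Return a list of structural and value range errors for one subject."""
    errors = []
    for prop, (lo, hi, kind) in shape.items():
        values = [o for s, p, o in triples if s == subject and p == prop]
        if not lo <= len(values) <= hi:
            errors.append(f"{prop}: expected {lo}..{hi} values, found {len(values)}")
        for value in values:
            if kind_of(value) != kind:
                errors.append(f"{prop}: expected {kind}, found {kind_of(value)}")
    return errors


print(validate("versions:ver1244", triples, VERSION_SHAPE))
```

A check of this kind covers only the shape of the data; the data source, procedure, and context factors discussed in the talk sit on top of it.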
Procedure factors (IV)
• Temporal aspects
  • Estimated duration
    • Short-lived validation process
      • Validation process is simple enough to be carried out in a short period of time
    • Long-lived validation process
      • Validation process requires complex and/or lengthy operations which span a long period of time (e.g., statistical calculations for reports)
  • Immediateness
    • On-the-fly / up-front
      • Validation happens as soon as the data is available (e.g., user input validation)
    • Just-in-time / deferred
      • Validation happens when the data is to be consumed (e.g., batch and/or async operations)
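The difference between up-front and deferred validation can be sketched as two stores that apply the same rule at different moments. The sketch is an assumption-laden illustration: the `is_valid` rule, the record layout, and both class names are hypothetical and do not come from the slides.

```python
def is_valid(record):
    # Hypothetical rule: every record needs a non-empty "version" field.
    return bool(record.get("version"))


class UpFrontStore:
    """On-the-fly validation: invalid data is rejected as soon as it arrives."""

    def __init__(self):
        self.records = []

    def add(self, record):
        if not is_valid(record):
            raise ValueError(f"rejected on arrival: {record!r}")
        self.records.append(record)

    def consume(self):
        return list(self.records)


class DeferredStore:
    """Just-in-time validation: everything is accepted, checks run on consumption."""

    def __init__(self):
        self.records = []

    def add(self, record):
        self.records.append(record)

    def consume(self):
        return [r for r in self.records if is_valid(r)]
```

The up-front variant suits interactive input, where the producer can correct the data immediately; the deferred variant suits batch or asynchronous pipelines, at the cost of holding unvalidated data until it is consumed.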
Context factors
• Validation as part of a write operation
  • Data provenance
  • Application managed vs user provided properties
  • Write once-read many vs read-write properties
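How the property distinctions above might affect validation of a write operation can be sketched as follows. This is a hedged illustration, not the authors' design: the two policy sets, the choice of Dublin Core property names, and the `validate_write` function are assumptions made for the example, and resources are reduced to simple property-to-value dictionaries.

```python
# Hypothetical policies: which properties only the application may set,
# and which may be written once and never changed afterwards.
APPLICATION_MANAGED = {"dcterms:created"}
WRITE_ONCE = {"dcterms:identifier"}


def validate_write(current, proposed, by_application=False):
    """Return the violations raised by replacing `current` with `proposed`."""
    errors = []
    for prop in sorted(set(current) | set(proposed)):
        old, new = current.get(prop), proposed.get(prop)
        if old == new:
            continue  # unchanged properties never violate a write policy
        if prop in APPLICATION_MANAGED and not by_application:
            errors.append(f"{prop} is application managed")
        elif prop in WRITE_ONCE and old is not None:
            errors.append(f"{prop} is write once and already set")
    return errors


current = {"dcterms:identifier": "A-1", "dcterms:created": "2013-09-10", "dcterms:title": "Draft"}
proposed = dict(current, **{"dcterms:title": "Final", "dcterms:created": "2013-09-11"})
print(validate_write(current, proposed))
```

The same proposed state can be valid or invalid depending on who performs the write and on what the resource looked like before, which is why these checks depend on context rather than on the data alone.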
• RDF validation in a Linked Data scenario has other concerns beyond traditional structural and data range validation issues
• Procedures for validating Linked Data need to be customized to accommodate the particularities of the scenario in terms of
  • the data sources to be consumed,
  • the processes to be carried out, and
  • the context in which they are to be applied