Project No: FP7-318338
Project Acronym: Optique
Project Title: Scalable End-user Access to Big Data
Instrument: Integrated Project
Scheme: Information & Communication Technologies
Deliverable D2.3: First Prototype of the Core Platform
Due date of deliverable: (T0+12)
Actual submission date: October 31, 2013
Start date of the project: 1st November 2012    Duration: 48 months
Lead contractor for this deliverable: FOP
Dissemination level: PU – Public
Final version
Executive Summary: First Prototype of the Core Platform
This document summarises deliverable D2.3 of project FP7-318338 (Optique), an Integrated Project supported by the 7th Framework Programme of the EC. Full information on this project, including the contents of this deliverable, is available online at http://www.optique-project.eu/.
The deliverable presents a first software prototype of the Optique system, which integrates an initial set of software components from the technical work packages into a central platform. The deliverable provides detailed documentation of the shared platform interfaces, which have been designed and implemented during the first project phase in order to enable a tight integration of the Optique components. In particular, it describes the integration of modules for ontology and mapping management, query transformation and visual query formulation into the initial prototype. Finally, the document provides instructions for setting up the platform and explains the basic functionality of the visual query building process for both end-users and IT-experts.
List of Authors: Johannes Trame (FOP), Peter Haase (FOP), Ernesto Jiménez-Ruiz (UOXF), Evgeny Kharlamov (UOXF), Christoph Pinkel (FOP), Martín Rezk (FUB), Ahmet Soylu (UiO)

Contributors: Timea Bagosi (FUB), Marco Console (UNIROMA1), Martin Giese (UiO), Mariano Rodríguez-Muro (FUB), Marco Ruzzi (UNIROMA1), Domenico Fabio Savo (UNIROMA1), Michael Schmidt (FOP), Martin G. Skjæveland (UiO), Marius Strandhaug (UiO), Evgenij Thorstensen (UOXF)
Contents

1 Introduction 4

2 Optique Platform Architecture & Implementation 6
  2.1 Architectural Overview 6
  2.2 The Information Workbench as a Platform for Integration 7
  2.3 Shared Platform Interfaces 7
    2.3.1 Ontology Management API 8
    2.3.2 Relational Schema API 9
    2.3.3 R2RML Mapping Management API 11
    2.3.4 RDF Data Management API 16
  2.4 Integration of Components 17
    2.4.1 Ontology & Mapping Management 17
    2.4.2 Query Transformation 18
    2.4.3 Query Formulation 19

3 Documentation and Prototype 21
  3.1 Installation Instructions 21
    3.1.1 Obtaining the platform bundle 21
    3.1.2 Installation Requirements 21
    3.1.3 Installation 21
    3.1.4 Opening the Optique platform 23
    3.1.5 Shutting down the Optique platform 23
  3.2 Administration Guide 23
    3.2.1 Setup and Configuration Steps 24
    3.2.2 Full Administrator Guide 25
  3.3 End User Documentation 25
    3.3.1 General Platform Features for Exploration, Search, Authoring and Visualisation 25
    3.3.2 Visual Query Formulation 26

4 Conclusion and Outlook 29

Bibliography 29

Glossary 31
Chapter 1
Introduction
A typical problem that end-users face when dealing with Big Data is that of data access. Due to the increasing volume of data, the velocity of data growth, and the variety of data formats, accessing the relevant information becomes a difficult task. In situations where an end-user needs data that predefined queries do not provide, the help of IT-experts (e.g., database managers) is required to translate the information need of the end-user into specialised queries and to optimise them for efficient execution. Since this process may require several iterations and may take several days, IT-experts spend 30–70% of their time gathering and assessing the quality of data [2]. Especially in domains such as the oil and gas industry, this process is the main bottleneck for data access. The Optique project aims at solutions that dramatically reduce this cost in time and money.
[Figure 1.1 shows end-users, IT-experts and analytics applications posing queries and receiving results via the Query Formulation, Ontology & Mapping Management, and Query Transformation / Distributed Query Optimisation and Processing components, which mediate through an ontology and mappings over streaming and heterogeneous data sources.]
Figure 1.1: The Optique OBDA system
The key idea of the semantic approach known as “Ontology-Based Data Access” (OBDA) [4, 1] is to use an ontology, which presents to users a semantically rich conceptual model of the problem domain. The users formulate their information requirements (that is, queries) in terms of the ontology, and then receive the answers in the same intelligible form. These requests should be executed over the data automatically, without an IT-expert’s intervention. To this end, a set of mappings is maintained which describes the relationship between the terms in the ontology and the corresponding terminology in the data source specifications, e.g., table and column names in relational database schemas.
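As a toy illustration of this idea (not the Optique implementation; the class, method and term names below are invented for this sketch), a mapping can be seen as associating an ontology term with the query that retrieves its instances from the data source, so that a request phrased in ontology terms can be answered without an IT-expert writing the SQL by hand:

```java
import java.util.HashMap;
import java.util.Map;

// Toy OBDA sketch: mappings relate ontology class URIs to the SQL
// that retrieves their instances from the underlying database.
public class ObdaSketch {
    // hypothetical mapping store: ontology class URI -> SQL query
    private final Map<String, String> mappings = new HashMap<>();

    public void addMapping(String classUri, String sql) {
        mappings.put(classUri, sql);
    }

    // translate an ontology-level request ("all instances of classUri")
    // into a query over the source schema
    public String translate(String classUri) {
        String sql = mappings.get(classUri);
        if (sql == null)
            throw new IllegalArgumentException("No mapping for " + classUri);
        return sql;
    }
}
```

A real OBDA system additionally rewrites complex queries with respect to the ontology and optimises the resulting SQL; this sketch only shows the role a mapping plays.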
Beyond scalability issues and the ability to handle (temporal) data streams, state-of-the-art OBDA systems in particular put constraints on usability, both for end-users (e.g., syntactic and conceptual mismatches during query formulation) and for IT-experts (e.g., suitable ontologies and mappings are expensive to obtain). The Optique project aims at developing the next-generation OBDA system (cf. Figure 1.1) that
Optique Deliverable D2.3 First Prototype of the Core Platform
overcomes these limitations. A broader goal of the project is to provide a platform with a generic architecture that can be adapted to any domain that requires scalable data access and efficient query execution.
In this deliverable we present an initial prototype of the Optique platform, describing the individual system components and their integration and interaction through shared interfaces, system-wide conventions and standards that have been implemented during the first project phase. The first prototype is strongly aligned with the initial Optique architecture and requirements as specified by all technical and scientific partners in Optique Deliverable 2.1.
The remainder of the deliverable is organized as follows: In Chapter 2 we recall the architecture agreed in Deliverable 2.1, before we describe in detail the design and implementation of the shared platform interfaces and their role in integrating the different components. Chapter 3 provides documentation for the installation and configuration of the platform as well as end-user instructions. Chapter 4 summarizes the main achievements and platform features, and provides an overview of ongoing development and engineering activities for the second year.
Chapter 2
Optique Platform Architecture & Implementation
2.1 Architectural Overview
The Optique architecture is designed as a tiered architecture with three layers (cf. Figure 2.1). While the presentation layer specifically addresses end-user needs such as query formulation and query answer visualisation, it also provides functionality for IT-experts to set up and (re)configure the platform.
The core components, such as those for query transformation, query answering, and ontology and mapping management, are integrated on the application layer and interact through specialized application programming interfaces. The data and resource layer is responsible for managing access to different kinds of (re)sources, for example relational databases, data streams, or even computational resources in the cloud.
[Figure 2.1 depicts the three-layer architecture, integrated via the Information Workbench: a mainly Web-based presentation layer (query formulation interface, answer visualisation, ontology editing via Protégé, and Optique's configuration interface, exposed through the Information Workbench frontend API, e.g. widget development, Java, REST); an application layer integrating the query formulation processing components, the ontology and mapping manager's processing components (bootstrappers, matchers, analysers, evolution engine, transformation and approximation/simplification modules, based on the OWL API and ontology reasoners), the query transformation component (query rewriting, semantic and syntactic query optimisation, semantic indexing, query execution) and the distributed query execution component (query planner, optimisation, materialization and federation modules), built around a shared Sesame triple store (holding ontology, mappings, configuration, queries, answers, history, lexical information, etc.) and a shared database; and a data and resource layer connecting to relational databases, triple stores, temporal databases and data streams via JDBC, Teiid and stream connectors, as well as to a cloud virtual resource pool via a cloud API.]

Figure 2.1: The general architecture of the Optique OBDA system
2.2 The Information Workbench as a Platform for Integration
The Optique platform is built on the Information Workbench, an industrial-strength and mature software platform developed and maintained by fluid Operations.1 The Information Workbench is an open, data-centric development platform that has been specifically designed to support the whole lifecycle of interacting with semantic data – from integration to access, visualization, exploration, and data interaction. Exploiting the features that are already present in the Information Workbench thus provides us with a stable foundation for the implementation of the Optique platform and enables us to make a first version of the platform available already during this project phase.
Based on the architecture and language standards agreed in Deliverable 2.1, the Optique platform is supposed to provide interfaces for data access and querying, plugin mechanisms for query language extensions, as well as frontend components. While the Information Workbench already provides a number of extension points for automatically plugging in new modules, special programming interfaces have been designed and implemented according to the requirements specified in Deliverable 2.1, in order to enable a tight integration and seamless interaction of the Optique software components.
2.3 Shared Platform Interfaces
We distinguish between (i) the shared interfaces among the Optique components and (ii) the interfaces provided by each component itself (e.g., APIs for query formulation). The concrete interfaces of the components and their interplay with the platform are described in the deliverables of the components themselves. In the initial architecture specification, we agreed on a number of shared interfaces:
(1) API for Ontology Management
(2) API for Mapping Management
(3) API for Relational Data and Metadata Management
(4) API for RDF Data Management
(5) API for Streaming Data Management
(6) API for Cloud Automation
Within the first year we designed and implemented interfaces for (1), (2), (3) and (4), as those have been prioritized by the partners during the requirement specification in Deliverable 2.1. Table 2.1 provides an overview of which API is used by the Ontology and Mapping Management (O&M), the Query Transformation (QT) and the Query Formulation (QF) component within the initial prototype of the platform.
                                             O&M   QT   QF
Ontology Management API                       X     X    X
Mapping Management API                        X     X
Relational Data and Metadata Management API   X     X
RDF Data Management API                       X

Table 2.1: Usage of shared APIs by the different components.
The following subsections describe in detail the signatures and functionality of these four interfaces. The implementation of the interfaces is tested by a set of JUnit tests, which are provided in the respective packages.
1 http://www.fluidops.com/information-workbench/
2.3.1 Ontology Management API
The Ontology Management API exposes functionality for loading, storing, manipulating and reasoning over OWL ontologies. Ontologies are stored natively in the Optique repository as RDF triples, each ontology in its own context (named graph). Through the ontology management API, ontologies can be accessed through the object model of the OWLAPI.2 In this way, much of the basic ontology management functionality can be directly reused from the OWLAPI and seamlessly interacts with reasoners such as HermiT or Pellet.3
Ontologies in the Optique repository are uniquely identified via their ontology URI.4
Note on ontology identifiers: The OWL 2 specification uses IRIs instead of URIs. We use URIs for consistency with the rest of the Optique APIs; the API has convenience methods to convert between URI and IRI. The OWL 2 specification states that an ontology *may* have an ontology IRI; we require that it *must* have one in order to identify it.
At this point ontology identifiers are supposed to be unique. In the future, we may want to extend the API to support multiple versions of an ontology with the same ontology IRI/URI but different version identifiers.
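As a rough illustration of this uniqueness requirement, the following sketch shows a registry keyed by ontology URI in which a second registration under the same URI is rejected. All class and method names here are invented for illustration; the actual API is described below.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the uniqueness rule: one ontology per ontology URI.
public class OntologyRegistry {
    private final Map<String, String> ontologiesByUri = new HashMap<>();

    // returns false (and stores nothing) if the URI is already taken
    public boolean register(String ontologyUri, String serializedOntology) {
        if (ontologiesByUri.containsKey(ontologyUri))
            return false;
        ontologiesByUri.put(ontologyUri, serializedOntology);
        return true;
    }

    public boolean isRegistered(String ontologyUri) {
        return ontologiesByUri.containsKey(ontologyUri);
    }
}
```

Supporting versioned ontologies, as mentioned above, would amount to keying this registry by a (URI, version identifier) pair instead of the URI alone.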
Location
• Interface and implementation class in package
com.fluidops.iwb.api
• Class OntologyManager: the ontology manager interface
• Class OntologyManagerImpl: the corresponding implementation class
class
Using the API
The API comes with three basic methods, as documented below. In order to obtain a runtime object of the ontology manager, it is required to call:

> OntologyManager om = EndpointImpl.api().getOntologyManager();
On top of this object, the API itself provides the following
methods:
Load an ontology from the Optique repository

OWLOntology loadOntology(URI ontologyURI) throws OWLOntologyCreationException;
The method loads an ontology with the URI ontologyURI. It returns an object of type OWLOntology, which is defined in the OWLAPI; one can therefore use all the existing functionality of the OWLAPI. For the implementation of the load method, we have implemented a new OWLOntologyDocumentSource that allows reading ontologies directly from the Optique triple store in an efficient manner.
Store an ontology in the Optique repository

boolean storeOntology(OWLOntology ontology, URI ontologyURI, boolean overwrite)
    throws OWLOntologyStorageException, OWLOntologyCreationException;
This method takes an ontology and stores it under the given ontologyURI in the Optique repository. With overwrite one can specify whether an existing ontology with the ontologyURI should be overwritten. The method returns true if the ontology has been successfully stored. Of course, it is also possible to store an ontology that has been created via the OWLAPI from some other source.
Remove an ontology from the repository

boolean removeOntology(URI ontologyURI) throws OWLOntologyStorageException;
2 http://owlapi.sourceforge.net
3 See http://clarkparsia.com/pellet and http://hermit-reasoner.com
4 http://www.w3.org/TR/owl2-syntax/#Ontology_IRI_and_Version_IRI
This method takes a given ontologyURI and removes the corresponding ontology from the Optique repository. The method returns true if the ontology has been successfully removed.
Code Example
Listing 2.1: Example of programmatic access to the OntologyAPI

// create the OWLAPI manager and obtain the platform ontology manager
OWLOntologyManager m = OWLManager.createOWLOntologyManager();
OntologyManager om = EndpointImpl.api().getOntologyManager();

// Load an ontology from the web via the OWLAPI and store it in the Optique repository
// create the URI
URI testOntologyURI = ValueFactoryImpl.getInstance()
        .createURI("http://www.w3.org/TR/2003/PR-owl-guide-20031209/wine");
// first use the OWLAPI to load the wine ontology from the web
OWLOntology o = m.loadOntologyFromOntologyDocument(IRI.create(testOntologyURI.toString()));
// store the ontology using the platform ontology manager
om.storeOntology(o, testOntologyURI, false);

// Load the ontology from the platform repository
OWLOntology loaded = om.loadOntology(testOntologyURI);
// use the OWLAPI to count the axioms
int axiomCount = loaded.getAxiomCount();
// check that the ontology is in OWL 2 DL
boolean isInDL = (new OWL2DLProfile()).checkOntology(loaded).isInProfile();

// Remove the ontology from the platform repository
om.removeOntology(testOntologyURI);
2.3.2 Relational Schema API
The Relational Data & Metadata API provides a standardized way of accessing relational metadata such as table and column information, datatypes, constraints, and indices. In order to use the API, a so-called Relational Metadata Provider, RDBMetadataProvider for short, needs to be created within the platform (compare Section 3.2.1); the provider is responsible for extracting the required information from a relational database. While the API primarily serves as a common abstraction layer over different relational endpoints, it also acts as a caching layer, which allows much faster access to the metadata than fetching it individually from within the different components every time. Once the information has been collected, the API can be used to obtain an in-memory representation (PoJo, Plain Old Java Object) of the database and its schema information.
Location
• The Relational Data & Metadata API is implemented as part
of the core platform.
• The RDBMetadataProvider is implemented in package
com.fluidops.iwb.provider
• The API itself is implemented in the package
com.fluidops.iwb.api
• The package com.fluidops.iwb.api.datacatalog contains the
interfaces
• The class DataCatalogService in package
com.fluidops.iwb.api.catalog is the main entry point
9
-
Optique Deliverable D2.3 First Prototype of the Core
Platform
• The class RelationalDataEndpoint in package com.fluidops.iwb.api.DataCatalog represents the PoJo serving as an entry point for accessing meta information about relational schemas
• The package com.fluidops.iwb.api.DataCatalog.impl contains the
implementation
Access Extracted Metadata
The API comes with three methods, as documented below. In order to obtain a runtime object from the API, one needs to call:

> DataCatalogService dcs = EndpointImpl.api().getDataCatalog();

At this point, it is assumed that an RDBMetadataProvider has already been created (see Section 3.2.1). On top of this object, the API itself provides the following methods:
Check Data Endpoint Existence

public boolean dataEndpointExists(URI dataEndpointId)

The method checks whether the data endpoint with the given ID exists. The endpoint ID is composed as <defaultNamespace><providerId>/<providerId>, where <defaultNamespace> is the default namespace of the platform and <providerId> is the ID of the RDBMetaInformationProvider.
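The composition rule can be sketched as a small helper; the class and method names below are ours, invented for illustration (the platform itself composes the URI via ProviderUtils, as Listing 2.2 shows), and they match the later example where a provider with ID "npd" yields <defaultNamespace>npd/npd.

```java
// Sketch of the endpoint ID composition rule:
// <defaultNamespace><providerId>/<providerId>
public class EndpointIds {
    public static String compose(String defaultNamespace, String providerId) {
        return defaultNamespace + providerId + "/" + providerId;
    }
}
```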
Load Data Endpoint by ID

public DataEndpoint loadDataEndpoint(URI dataEndpointId) throws IllegalArgumentException

Returns the DataEndpoint PoJo object associated with the specified data catalog entry, which allows browsing through the database's meta information and schema. If no catalog entry for the specified ID exists, an IllegalArgumentException is thrown. For now, the returned object will always be of the more specific type RelationalDatabaseEndpoint. Hence, to access the relational schema, it might be required to cast the endpoint to the latter.
Load All Data Endpoints

public List loadDataEndpoints()

Returns a list of the DataEndpoint PoJos for all data endpoints that have been extracted. As for the method loadDataEndpoint(URI dataEndpointId), it might be required to cast the DataEndpoints in the list to type RelationalDatabaseEndpoint in order to access the relational schema contained within.
Working With Extracted Data
RelationalDatabaseEndpoint objects can be retrieved through the API as described above. An endpoint corresponds to a database in RDBMS terms, i.e., each RelationalDatabaseEndpoint represents one schema within a specific instance of a database server. RelationalDatabaseEndpoint contains two methods:

• Schema getSchema(): returns an object representing metadata on the schema/database.
• DatabaseInfo getDatabaseInfo(): returns an object representing information about the database server instance that hosts the schema/database.
From a Schema object, the schema's name and its list of Tables can be accessed. Tables, in turn, provide the following methods to access relevant meta information:

• String getName(): returns the table's local name, e.g., “mytable”
• String getFullName(): returns the fully qualified name of a table, e.g., “mydb.mytable”
• List getColumns(): returns the columns (i.e., attributes) in the table
• PrimaryKey getPrimaryKey(): returns the primary key of the table
• public List getForeignKeys(): returns a list of foreign keys, if any
• TableType getTableType(): returns the type of the table (TABLE for normal tables, but it could also be VIEW or another type supported by the SQL standard)
• Schema getSchema(): returns the schema that contains the table
• List getIndices(): lists the indices of the table

RDF Representation of Relational Schemas
Internally, the RelationalSchemaOntology is used to represent relational schemas in RDF. The ontology defines a set of classes and properties that interlink all information belonging to a schema: schemas are related to tables; tables consist of columns, primary keys, keys, foreign keys and indices; a column has an associated datatype and a position at which it occurs in the table. The ontology is bundled with the Optique platform and is loaded automatically during platform start-up.
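The interlinking just described can be sketched by emitting Turtle-like triples for a table and its columns. The rso: prefix and the property names below are invented for this sketch and are not the actual terms of the RelationalSchemaOntology:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: serialize a schema/table/column structure as interlinked triples,
// recording each column's position in its table.
public class SchemaToRdfSketch {
    public static List<String> triples(String schema, String table, List<String> columns) {
        List<String> t = new ArrayList<>();
        t.add(":" + schema + " rso:hasTable :" + table + " .");
        for (int i = 0; i < columns.size(); i++) {
            String col = columns.get(i);
            t.add(":" + table + " rso:hasColumn :" + col + " .");
            // 1-based position of the column within the table
            t.add(":" + col + " rso:position \"" + (i + 1) + "\" .");
        }
        return t;
    }
}
```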
Code Examples
The following code snippet exemplifies how to output the table names and columns of such an endpoint (assuming there is an RDBMetaInformationProvider with ID “npd”, as illustrated above).
Listing 2.2: Example of programmatic access to the RelationalDatabaseEndpoint

// obtain the data catalog service
DataCatalogService dcs = EndpointImpl.api().getDataCatalog();

// compose the URI of the data catalog through the providerId of the RDBMetaInformationProvider
URI dataEndpointId = ProviderUtils.objectAsUri(
    EndpointImpl.api().getNamespaceService().defaultNamespace() + "npd/npd");

if (!dcs.dataEndpointExists(dataEndpointId))
    System.out.println("Data endpoint with ID " + dataEndpointId + " does not exist");
else
{
    DataEndpoint de = dcs.loadDataEndpoint(dataEndpointId);
    if (de instanceof RelationalDatabaseEndpoint)
    {
        RelationalDatabaseEndpoint rde = (RelationalDatabaseEndpoint) de;
        Schema s = rde.getSchema();
        System.out.println("Schema = " + s.getFullName());
        for (Table t : s.getTables())
        {
            System.out.println("-> Table: " + t.getName());
            for (Column c : t.getColumns())
            {
                System.out.println("---> Column: " + c.getName()
                    + " with datatype " + c.getColumnDataType().getName());
            }
        }
    }
}
2.3.3 R2RML Mapping Management API
The R2RML Mapping Management API has been designed for managing and manipulating collections of mappings according to the R2RML standard.5
Location
• Interface and factory class in package
eu.optique.api.r2rml
• Class R2RMLMappingManager: the mapping manager interface

5 http://www.w3.org/TR/r2rml/
• Class R2RMLMappingManagerFactory: the associated factory class
• Implementation in package eu.optique.api.r2rml
• Class R2RMLMappingManagerImpl: the implementation class
• The package eu.optique.api.r2rml also contains various test cases
Access Mappings from the API

The API comes with methods for mapping collection management, supporting the deserialization of mappings from an input stream, serialization to file, and maintenance (add, delete, replace) of mappings loaded into the central metadata repository. At its core, the API provides in-memory classes for mapping exploration and manipulation. In order to obtain the API object, the factory class needs to be used as follows:

> R2RMLMappingManager mm = R2RMLMappingManagerFactory.getApi();
The mapping manager relies on the concept of mapping collections, where a mapping collection contains a set of mappings, thus serving as a logical grouping of R2RML mappings (e.g., containing all mappings obtained from importing an R2RML file, or all mappings related to a given relational database). At API level, every method requires the specification of a so-called mapping collection ID; mapping collections with different IDs are treated as isolated sets of mappings. On top of the R2RMLMappingManager object, the API itself provides the following functionality:
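The collection semantics can be sketched with a toy stand-in (all names invented for this illustration): each collection ID denotes an isolated set of mappings, and importing under an already-used ID is rejected, mirroring the behaviour of importMappings described below.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: mapping collections as isolated sets keyed by collection ID.
public class MappingCollectionsSketch {
    private final Map<String, Set<String>> collections = new HashMap<>();

    // importing under an existing ID is an error, as for importMappings
    public void importCollection(String collectionId, Set<String> mappings) {
        if (collections.containsKey(collectionId))
            throw new IllegalArgumentException("Collection exists: " + collectionId);
        collections.put(collectionId, new HashSet<>(mappings));
    }

    public boolean collectionExists(String collectionId) {
        return collections.containsKey(collectionId);
    }
}
```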
Import R2RML Mappings From an Input Stream

public void importMappings(String mappingCollectionId, InputStream is, RDFFormat rdfFormat)
    throws IllegalArgumentException;

The method imports an R2RML mapping from an input stream, passed in parameter is. The mappingCollectionId provides an internal ID for the collection of mappings specified in the input stream. The rdfFormat specifies the RDF format in which the mapping is serialized (if not provided, Turtle format is assumed). If the mappingCollectionId already exists, an IllegalArgumentException is thrown.
Export R2RML Mappings as String (in Turtle)

public String exportMappings(String mappingCollectionId)
    throws IllegalArgumentException;

This method exports an existing mapping collection with the specified mappingCollectionId as a string. The output format is Turtle. An IllegalArgumentException is thrown if the specified mapping collection does not exist.
Export R2RML Mappings as String

public String exportMappings(String mappingCollectionId, RDFFormat rdfFormat)
    throws IllegalArgumentException;

This method exports an existing mapping collection with the specified mappingCollectionId as a string. The rdfFormat specifies the RDF format in which the mapping is to be serialized (if not provided, Turtle format is assumed). An IllegalArgumentException is thrown if the specified mapping collection does not exist.
Export R2RML Mappings to File

public void exportMappings(String mappingCollectionId, String fileName, RDFFormat rdfFormat)
    throws IllegalArgumentException;

Exports an existing mapping collection with the specified mappingCollectionId to the file specified in fileName (to be stored on the server). The rdfFormat specifies the RDF format in which the mapping is to be serialized (if not provided, Turtle format is assumed). An IllegalArgumentException is thrown if the specified mapping collection does not exist.
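The three export variants share the convention that Turtle is the default format. This overload pattern, where the shorter signature delegates to the longer one with a default, can be sketched as follows (illustrative names, not the platform code):

```java
// Sketch of the export overload pattern with Turtle as the default format.
public class ExportSketch {
    public static String export(String collectionId) {
        // single-argument variant delegates with the Turtle default
        return export(collectionId, "TURTLE");
    }

    public static String export(String collectionId, String rdfFormat) {
        // a real implementation would serialize the stored mappings here
        return collectionId + " serialized as " + rdfFormat;
    }
}
```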
In-memory Mapping Retrieval & Access
The API provides methods to retrieve the identifiers of existing mapping collections and to check whether a collection with a specific ID is already known to the platform:

• public List listMappingCollections();: Lists the IDs of all mapping collections that are available in the system (i.e., have been previously imported).
• public boolean mappingCollectionExists(String mappingCollectionId);: Checks whether the mapping collection with the specified mappingCollectionId exists.

Once a mapping collection exists (i.e., mappings have been imported), it is possible to access, add, remove and replace the available collections and mappings through a set of methods, as documented in the following:
Check Existence of a Mapping in a Collection

public boolean mappingExists(TriplesMap mapping, String mappingCollectionId)
    throws IllegalArgumentException, InvalidR2RMLMappingException;

Checks whether the specified mapping (passed as an in-memory object) exists in a given mapping collection. Throws an IllegalArgumentException if the collection does not exist. The existence check does not match the mappings semantically or syntactically; instead, it just checks whether there is a mapping in the specified mapping collection whose blank node or URI is identical to the one stored in the in-memory mapping object. In the very unlikely case that the specified mapping collection cannot be processed, an InvalidR2RMLMappingException is thrown.
Load a Collection of Mappings as Main-memory Java Objects

public R2RMLMappingCollection loadMappings(String mappingCollectionId)
    throws IllegalArgumentException;

The method loads a collection of (previously imported) mappings into main memory and returns an object that can be used for further exploration and manipulation. The mappingCollectionId identifies which mappings are to be loaded. If the specified mappingCollectionId does not exist in the central store, an IllegalArgumentException is thrown. In the very unlikely case that the specified mapping collection cannot be processed, an InvalidR2RMLMappingException is thrown.
Deleting a Mapping Collection

public void deleteMappings(String mappingCollectionId)
    throws IllegalArgumentException;

The method deletes a collection of (previously imported) mappings. The mappingCollectionId identifies which mappings are to be deleted. If the specified mappingCollectionId does not exist in the central store, an IllegalArgumentException is thrown.
Delete a Mapping From a Collection

public boolean deleteMapping(TriplesMap mapping, String mappingCollectionId)
    throws IllegalArgumentException, InvalidR2RMLMappingException;

Deletes the given mapping from the collection. Throws an IllegalArgumentException if the mapping collection ID does not exist. Deletion is performed by considering the mapping's resource ID. If no mapping with the specified resource ID is found, the method returns false; if deletion succeeds, the method returns true. In the very unlikely case that the specified mapping collection cannot be processed, an InvalidR2RMLMappingException is thrown.
Add a Mapping to a Collection

public void addMapping(TriplesMap mapping, String mappingCollectionId)
    throws IllegalArgumentException, InvalidR2RMLMappingException;
Adds the given mapping to the collection; if the collection does not exist, a new collection is created. The method throws an IllegalArgumentException if a mapping with the mapping's resource ID already exists in the repository.
Replace a Mapping in a Collection

public boolean replaceMapping(TriplesMap toRemove, TriplesMap toAdd, String mappingCollectionId)
    throws IllegalArgumentException, InvalidR2RMLMappingException;
Replaces a mapping in the collection with the specified mappingCollectionId. Throws an IllegalArgumentException if the mapping collection ID does not exist. Deletion is performed by matching the mapping's resource ID. The returned flag indicates whether the specified mapping has been successfully deleted (adding will take place independently, i.e. if the mapping to be removed does not exist, this method is equivalent to calling addMappingToCollection). In the very unlikely case that the specified mapping collection cannot be processed, an InvalidR2RMLMappingException is thrown.
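The contract of these collection-level methods can be summarised in a small, self-contained sketch. The MappingStore class below is purely illustrative (it is not part of the platform API, and mapping resource IDs stand in for TriplesMap objects); it only mimics the documented exists/add/delete/replace semantics with plain Java collections.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for the platform's mapping manager; not the real implementation.
class MappingStore {
    // mapping collection id -> resource IDs of the mappings it contains
    private final Map<String, List<String>> collections = new HashMap<>();

    public boolean mappingCollectionExists(String collectionId) {
        return collections.containsKey(collectionId);
    }

    // addMapping: creates the collection if it does not exist yet,
    // rejects duplicate resource IDs
    public void addMapping(String resourceId, String collectionId) {
        List<String> c = collections.computeIfAbsent(collectionId, k -> new ArrayList<>());
        if (c.contains(resourceId))
            throw new IllegalArgumentException("mapping already exists: " + resourceId);
        c.add(resourceId);
    }

    // deleteMapping: returns false if no mapping with the given resource ID is found
    public boolean deleteMapping(String resourceId, String collectionId) {
        List<String> c = collections.get(collectionId);
        if (c == null)
            throw new IllegalArgumentException("no such collection: " + collectionId);
        return c.remove(resourceId);
    }

    // replaceMapping: the flag reports the deletion; adding happens independently
    public boolean replaceMapping(String toRemove, String toAdd, String collectionId) {
        boolean removed = deleteMapping(toRemove, collectionId);
        addMapping(toAdd, collectionId);
        return removed;
    }

    // deleteMappings: drops a whole collection
    public void deleteMappings(String collectionId) {
        if (collections.remove(collectionId) == null)
            throw new IllegalArgumentException("no such collection: " + collectionId);
    }
}
```

Note in particular that replaceMapping reports only the deletion in its return value, mirroring the behaviour documented above.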
In-memory Mapping Manipulation
Creating a TriplesMap

To create a new TriplesMap, the MappingFactory class provides three methods:
• public static TriplesMap createTriplesMap(LogicalTable lt, SubjectMap sm);
• public static TriplesMap createTriplesMap(LogicalTable lt, SubjectMap sm, PredicateObjectMap pom);
• public static TriplesMap createTriplesMap(LogicalTable lt, SubjectMap sm, List<PredicateObjectMap> listOfPom);
A TriplesMap must have a LogicalTable and a SubjectMap. It can also have any number of PredicateObjectMaps.6 These can also be created with methods in the MappingFactory class. A LogicalTable must either be a SQLBaseTableOrView or an R2RMLView. These can be created with the methods:
• public static SQLTable createSQLBaseTableOrView(String
tableName);
• public static R2RMLView createR2RMLView(String query);
A SubjectMap is a subclass of TermMap. TermMaps must have a TermMapType, which must be set to either TermMapType.COLUMN_VALUED, TermMapType.CONSTANT_VALUED or TermMapType.TEMPLATE_VALUED. A SubjectMap can be created with the methods:
• public static SubjectMap createSubjectMap(Template
template);
• public static SubjectMap createSubjectMap(TermMapType type,
String columnOrConst);
When using the first method, the TermMapType will be set to TermMapType.TEMPLATE_VALUED. A Template can be created with the methods:
• public static Template createTemplate();
• public static Template createTemplate(String template);
A PredicateObjectMap can be created with one of the following
methods:
• public static PredicateObjectMap createPredicateObjectMap(PredicateMap pm, ObjectMap om);
• public static PredicateObjectMap createPredicateObjectMap(PredicateMap pm, RefObjectMap rom);

6 http://www.w3.org/TR/r2rml/#property-index
• public static PredicateObjectMap createPredicateObjectMap(List<PredicateMap> pms, List<ObjectMap> oms, List<RefObjectMap> roms);

The PredicateObjectMap must have at least one PredicateMap and at least one ObjectMap or RefObjectMap. When using the third method, the lists must contain elements that meet this requirement. PredicateMap and ObjectMap are subclasses of TermMap (as is SubjectMap). These can be created with the methods createPredicateMap and createObjectMap. A RefObjectMap can be created with the method:

• public static RefObjectMap createRefObjectMap(Resource parentMap);
The resource node points to the parent triples map of the
RefObjectMap.
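The creation methods above correspond directly to R2RML constructs. For orientation, a TriplesMap assembled from a LogicalTable, a SubjectMap and one PredicateObjectMap corresponds to an R2RML fragment like the following (the table, template and predicate IRIs are invented for illustration):

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#> .

<#PersonMap>
    rr:logicalTable [ rr:tableName "PERSON" ] ;                        # LogicalTable (SQLBaseTableOrView)
    rr:subjectMap [ rr:template "http://example.org/person/{ID}" ] ;   # SubjectMap, TEMPLATE_VALUED
    rr:predicateObjectMap [
        rr:predicateMap [ rr:constant <http://example.org/name> ] ;    # PredicateMap, CONSTANT_VALUED
        rr:objectMap    [ rr:column "NAME" ]                           # ObjectMap, COLUMN_VALUED
    ] .
```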
Modifying TriplesMaps

A TriplesMap can be modified using the following methods:
• public void setLogicalTable(LogicalTable lt);
• public void setSubjectMap(SubjectMap sm);
• public void addPredicateObjectMap(PredicateObjectMap pom);
• public void removePredicateObjectMap(PredicateObjectMap
pom);
The methods setLogicalTable and setSubjectMap will overwrite the previously set LogicalTable or SubjectMap in the TriplesMap. The method addPredicateObjectMap will add the PredicateObjectMap to the end of the list of PredicateObjectMaps, and removePredicateObjectMap will remove the specified PredicateObjectMap from the list. In general, set-methods will overwrite the previously set object and add-methods will add the object to the end of a list. The remove methods can throw an IllegalStateException if removing the object would make the TriplesMap invalid (for example removing the last PredicateMap from a PredicateObjectMap).
Using the API via REST/JSON
Some of the methods are exposed publicly via the platform’s REST endpoint. In particular, the following methods can be used via REST/JSON:
• public String exportMappings(String mappingCollectionId) throws IllegalArgumentException;
• public void deleteMappings(String mappingCollectionId) throws IllegalArgumentException;
• public boolean mappingCollectionExists(String mappingCollectionId);
• public List<String> listMappingCollections();
As an example, we illustrate how to read a mapping collection from the REST/JSON API using the exportMappings functionality. Assume we know there is a mapping collection with id test. The REST call to obtain a serialized version of the mappings in this collection (as Turtle) is as follows:

http://<server>:<port>/REST/JSON/getMappingManager/?method=exportMappings&params=[%22test%22]&id=<id>

The <server> and <port> need to be replaced by the respective server name (or IP) and port. The <id> is just an ID that needs to be passed with the request, an integer such as 1, which will be reported in the answer. The answer may look like this:

{"id":"1","result":"
@prefix dc: <...> .\n
@prefix Settings: <...> .\n
@prefix rso: <...> .\n
...
","jsonrpc":"2.0"}
Thus, the result is returned as a JSON object, which can be further processed with a JSON library; the actual result is stored in the variable result. In order to automate the request, the username and password (HTTP basic authentication) need to be passed as POST parameters. In a standard setting, these will be admin/iwb (users can be created and modified on the User Management page).
2.3.4 RDF Data Management API

A variety of assets in Optique (ontology, mappings, relational database metadata, . . . ) will be stored in the central store as RDF. Using the Sesame RDF API, it will be possible to access these entities directly, e.g. by means of SPARQL queries. In addition, the RDF data management API allows one to store, query, and manipulate other kinds of RDF data in the central store directly.
Location
• Interface and factory class in package eu.optique.api.rdf
• Class RDFDataManager: the data manager interface
• Class RDFDataManagerFactory: associated factory class
• Implementation in package eu.optique.api.rdf.impl
• Class RDFDataManagerImpl: implementation
General Documentation
The API comes with five methods, as documented below. In order to obtain a runtime object of the API, one needs to call

> RDFDataManager rdfDm = RDFDataManagerFactory.getApi();
On top of this object, the API itself provides the following
methods:
Load RDF data from a file

public void load(String filename, String format, String context, String source, Boolean userEditable)

The method loads data from the file filename. The other parameters are optional: The parameter format may be used to specify the file format (e.g. N3); if not specified, the file type is guessed from the file suffix. The parameter context determines the context into which the data is written (either as a prefixed URI or as a full URI enclosed in <>-brackets). The parameter source is a user-defined string representing the source from which the data is loaded. The parameter userEditable defines whether the loaded data can be edited (i.e., modified & deleted) by the user. If not set, the system default configuration is used.
Execute a SPARQL 1.0/1.1 SELECT query

public TupleQueryResult sparqlSelect(String query) throws Exception

The query result is returned in the form of Sesame’s TupleQueryResult.7
Execute a SPARQL 1.0/1.1 CONSTRUCT query

public GraphQueryResult sparqlConstruct(String query) throws MalformedQueryException, QueryEvaluationException

The query result is returned in the form of Sesame’s GraphQueryResult.8
Execute a SPARQL 1.0/1.1 ASK query

public boolean sparqlAsk(String query) throws MalformedQueryException, QueryEvaluationException

The query result is returned in the form of a boolean.

7 http://www.openrdf.org/doc/sesame2/api/org/openrdf/query/TupleQueryResult.html
8 http://www.openrdf.org/doc/sesame2/api/org/openrdf/query/GraphQueryResult.html
Execute a SPARQL 1.0/1.1 INSERT/DELETE query

public void sparqlUpdate(String query) throws MalformedQueryException, UpdateExecutionException

Executes the SPARQL UPDATE query.
Code Example
Listing 2.3: Example of programmatic access to the RDF Data Management API
// get an instance of the RDFDataManager
RDFDataManager rdfDM = RDFDataManagerFactory.getApi();

/*
 * load an example RDF file
 */
rdfDM.load("example.rdf", null, "", "npd-v2-db-vocab", false);

/*
 * SPARQL ASK
 */
if (rdfDM.sparqlAsk("ASK { ?s rdfs:label ?o }"))
    System.out.println("Triple exists!");

/*
 * SPARQL SELECT
 */
TupleQueryResult res = rdfDM.sparqlSelect("SELECT ?s ?o WHERE { ?s rdfs:label ?o }");
// iterate over the results and extract the variable bindings
int i = 0;
while (res.hasNext())
{
    BindingSet bs = res.next();
    Binding s = bs.getBinding("s");
    Binding o = bs.getBinding("o");
    System.out.print("Result tuple #" + i++ + ": ");
    if (s != null && s.getValue() != null)
        System.out.print("?s -> " + s.getValue());
    if (o != null && o.getValue() != null)
        System.out.println(", ?o -> " + o.getValue());
}

/*
 * SPARQL CONSTRUCT
 */
GraphQueryResult gres = rdfDM.sparqlConstruct(
    "CONSTRUCT { ?s rdfs:label ?o } WHERE { " +
    "?s rdfs:label ?o }");
while (gres.hasNext())
    System.out.println(gres.next().getSubject());

/*
 * SPARQL UPDATE
 */
rdfDM.sparqlUpdate("INSERT DATA { <http://www.optique-project.eu> rdfs:label \"Optique Project\" }");
2.4 Integration of Components
2.4.1 Ontology & Mapping Management
OBDA systems crucially depend on the existence of suitable ontologies and mappings. Developing them from scratch is likely to be expensive, and a practical OBDA system should support a (semi-)automatic bootstrapping of an initial ontology and set of mappings.
The Ontology and Mapping (O&M) management component is in charge of creating and evolving the ontology and mappings, as well as providing an interface to feed the Query Formulation component with the ontology vocabulary in order to guide the formulation of queries.
The current implementation of the Optique O&M component is equipped with an O&M bootstrapper, a routine that takes database schemata and possibly instances over these schemata as an input, and returns an ontology and a set of mappings connecting the ontology entities to the elements of the input schemata. For this purpose the O&M component retrieves the required metadata from the platform’s shared metadata repository using the Relational Schema API (see Section 2.3.2).
The Optique O&M component also integrates an ontology matching system to align the bootstrapped ontology with state-of-the-art domain ontologies, and an ontology approximation module to transform the resulting ontology if it is outside the desired OWL 2 profile.
Currently, the O&M bootstrapper includes two installation wizards (basic and advanced) for the ontology and the mappings. Figure 2.2 depicts the workflows of these wizards. Please refer to Deliverable D4.1 for details about the O&M bootstrapper and its installation wizards. The O&M bootstrapper is compliant with the Ontology API (see Section 2.3.1) and the Mapping Management API (see Section 2.3.2) provided by the Optique platform, and is able to store the resulting assets back to the platform.

Figure 2.2: O&M installation wizards
2.4.2 Query Transformation

The Query Transformation (QT) component Ontop allows querying virtual RDF graphs defined by a relational database, an ontology, and a set of R2RML mappings. The general architecture is illustrated in Figure 2.3. In a first step, the set of mappings, the ontology and the SPARQL query are translated into a set of Datalog rules that represent these objects. Second, the program is optimized using query-containment-based techniques and Semantic Query Optimization. In particular we:
• use SLD-resolution to compute a partial evaluation of the
program; and
• optimize the query(ies) with respect to Primary/Foreign Keys
to avoid redundant self-joins.
The optimized program is translated into an equivalent relational algebra expression, from which the SQL query is generated and executed by the DBMS.
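To illustrate the second optimization, assume a hypothetical table person(id, name, age) with primary key id, and a SPARQL query asking for the name and age of each person. With one mapping per attribute, a naive translation yields a self-join; because id is a primary key, both occurrences of person must refer to the same row, so the join is redundant and can be eliminated:

```sql
-- Naive translation: one occurrence of the table per triple pattern
SELECT p1.name, p2.age
FROM person p1 JOIN person p2 ON p1.id = p2.id;

-- After semantic query optimization using the primary key
SELECT p1.name, p1.age
FROM person p1;
```

The table and column names here are invented for illustration; Ontop derives the applicable keys from the database metadata.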
In order to integrate Ontop into the Optique platform, we agreed on the languages and libraries we were going to support. In particular, we agreed on using R2RML as the mapping language, on the OWL API for handling ontologies, as well as on the Sesame API to generate the SPARQL algebra and to handle RDF graphs and the respective result sets.

Figure 2.3: Query answering with mappings in Ontop

The database metadata, together with the R2RML mappings and the OWL ontology, are fetched from the platform’s shared metadata repository using the respective APIs and passed to the Ontop (Quest) repository during initialization (cf. Figure 2.4). As Ontop internally makes use of specialized data structures for handling metadata, we created a transformer between the Optique database metadata object and the Ontop-specific metadata object.
Similarly, Ontop supports R2RML mappings by transforming them internally into its own type of mappings. The Ontop component is registered within the platform as a Sesame repository. Consequently, native platform functionalities and widgets, such as those for search and visualization, are able to operate seamlessly on this repository.

Figure 2.4: OBDA assets are passed from the platform’s APIs to the Ontop component
2.4.3 Query Formulation
We complete the cycle with an intuitive visual query formulation system (OptiqueVQS) for the ontology-based data access framework. The details of OptiqueVQS are described in deliverable D3.1; here we only provide a brief overview and focus on the aspects relevant for the integration. OptiqueVQS allows end-users to formulate queries by directly manipulating visual representations of domain elements. The design mantra of OptiqueVQS relies on high usability rather than high expressiveness, that is, OptiqueVQS accommodates reasonably complex and frequently occurring queries (i.e., from an end-user perspective), and leaves out highly complex and less frequently used queries, which might hinder the overall usability without much gain. However, we also plan to provide support for complex queries; this is possible either by hiding such functionality behind layers so that advanced users can access them, or by introducing a textual editor where domain experts and IT experts collaborate on queries from different perspectives. Concerning the actual design of the interface, we employ a multi-paradigm approach where different representation and interaction paradigms are combined to benefit from their individual strengths.
OptiqueVQS is designed as a user-interface (UI) mashup built on widgets. A UI mashup aggregates different applications into a common graphical space and orchestrates them for common goals. Widgets9 are the building blocks of our VQS and refer to portable, self-contained, full-fledged, and mostly client-side

9 http://www.w3.org/TR/widgets/
applications with limited functionality and complexity. Widgets in our system communicate with each other by delivering events, generated by user actions, through a client-side communication channel. Each widget reacts to events either in a preprogrammed way or by considering the semantic and syntactic signatures of events.

OptiqueVQS consists of a client-side component, which is the presentation, and a server-side component, which uses platform APIs to serve the ontology and data to the client-side component. The client-side component is purely HTML and JavaScript and uses existing JavaScript libraries for realising its functionality. JQuery mobile10 is used to generate widgets, which can also run on mobile devices, InfoVis11 is used to visualise query graphs, and JQuery12 is used for cross-browser compliance. The communication channel is built on HTML 5’s13 message passing support. The server-side component is built in IWB and integrated with other APIs. The communication between the client-side and the server-side components is handled via a set of REST calls, which at every step return a fragment of the ontology in JSON format.
Currently OptiqueVQS uses the following REST methods from the OntologyAccess4QueryFormulation class in the eu.optique.api.component.qf package:

• getAvailableOntologies(): gets the list of identifiers of the available ontologies in the triple store.
• loadOntology(String ontologyURI): loads an ontology given its URI.
• getCoreConcepts(): gets the core concepts of the active (i.e. loaded) ontology to be listed in OptiqueVQS.
• getConceptFacets(String conceptURI): given a concept URI/Id, retrieves its associated facets.
• getNeighbourConcepts(String conceptURI): given a concept URI/Id, retrieves the associated concept neighbours.
Each REST call returns the ontology-related information serialised as JSON objects, which will populate the OptiqueVQS interface. For example, the REST call

http://<server>:<port>/REST/JSON/getQFOntologyAccess/?method=getCoreConcepts&id=1

will return a set of JSON objects corresponding to the core concepts in the ontology.

10 http://jquerymobile.com/
11 http://philogb.github.io/jit/
12 http://jquery.com/
13 http://www.w3.org/TR/html5/
Chapter 3
Documentation and Prototype
This chapter provides detailed instructions for system administrators (IT-Experts) in order to deploy and configure the Optique platform. Besides guidance for the technical system installation of the platform on different operating systems, it explains the basic steps that are required to initialize the platform. Particularly, it shows how to connect the system to a relational datasource (also called a “data endpoint”) and how to bootstrap the system with the initial configurations that are required for the initialization of the query transformation component. Finally, basic functionality of the novel query formulation interface is explained to both the IT-Expert and the End-User.
3.1 Installation Instructions
3.1.1 Obtaining the platform bundle
The Optique platform prototype is available in the restricted download area of the project website. Please contact the project coordinator if you would like to obtain a copy for review or evaluation purposes.
3.1.2 Installation Requirements
Server - Operating System
Windows (64-bit only): Windows 7, Windows Server 2008
Linux (64-bit only): openSUSE 12.1
Java Runtime Environment (JRE >= 1.7.0_25, 64-bit)
32-bit systems, other Linux distributions, different versions of Windows or OS X systems may also work, but are not officially supported.

Client - Browsers
Firefox >= 17.x (ESR)
Internet Explorer >= 8
Safari >= 5.1.7
Other browsers may also work, but are not officially supported.
3.1.3 Installation
The Optique platform supports both Windows and Linux based
operating systems.
Windows
Installation from the zip-distribution
It is recommended to use a 64-bit Windows operating system with a 64-bit Java SE Runtime Environment in version 1.7 (taken from the JDK). The reference version shipped with the installer is JRE SE 1.7.0_25 64-bit. This is also the version used in steps a) and b) below. Unpack the distribution into a directory of any choice (e.g. C:\OPTIQUE). In the following, we will refer to the absolute pathname of this directory by <INSTALL_DIR>.
Running the Optique platform as executable
Execute <INSTALL_DIR>/start.cmd
Running the Optique platform on a 32-bit Windows operating system
The Optique platform can be run on a 32-bit Windows operating system by following the steps below:

1) Download and install a Java SE 32-bit JDK version 1.7.1
2) Set the path of java.exe in the file <INSTALL_DIR>/fiwb/backend.conf, examples:
wrapper.java.command=C:\Program Files\Java\jdk1.7.0_25\bin\java (absolute path)
wrapper.java.command=java (if the java command is in the Path environment)
3) Execute the Optique platform as described above.
Linux
To run the Optique platform under Linux, a Java SE Runtime Environment version 1.7 (taken from the JDK) must be installed. Note that the Optique platform does not ship a reference version bundled with the release. Unpack the distribution into a directory of any choice (e.g. /opt/optique). In the following, we will refer to the absolute pathname of this directory by <INSTALL_DIR>. Download and install a Java SE 64-bit JDK version 1.7. Make sure the java command is added to the command path of the user root.
a) Running as service
Create a user under which the Optique platform shall run, e.g. “fluid” (in the following we will refer to this user as <USERNAME>). If “fluid” has not been chosen as user, the script <INSTALL_DIR>/fiwb/iwb.sh has to be adapted accordingly: Search for RUN_AS_USER=fluid and replace fluid by <USERNAME>. Execute the script linux-install.sh in <INSTALL_DIR> as user root, as follows:

bash -eu linux-install.sh

This installs an init-script as /etc/init.d/iwb and starts the application. To make sure this script is executed on reboot, create corresponding links in the run-level specific directories. Depending on the Unix distribution, this can be done with chkconfig -a iwb or with insserv iwb
b) Running the Optique platform as executable
Make sure all scripts are executable by executing in <INSTALL_DIR>:

chmod +x *.sh fiwb/*.sh fiwb/wrapper-linux*

If the Optique platform needs to be executed as a user different from “fluid”, the script <INSTALL_DIR>/fiwb/iwb.sh has to be adapted accordingly: Search for RUN_AS_USER=fluid and replace “fluid” by any preferred user (this user must exist on the system). Execute start.sh in <INSTALL_DIR>.
Mac OS X
Please note that, while we have successfully installed and run the Optique platform on Mac OS X, this platform is not officially supported. To run the Optique platform on Mac OS X, a compatible version of the Java runtime (ideally, version 1.7) is required. OS X may ask whether to install a Java runtime automatically if it detects that one is needed but missing. To get started with the Optique platform distribution (.zip file), proceed as follows:
• Unpack the Optique platform zip distribution.

1 http://www.oracle.com/technetwork/java/javase/downloads/index.html
• Open the terminal application and cd into the unpacked distribution (e.g. cd Desktop/IWB [ENTER])
• Make scripts executable by typing chmod +x *.sh fiwb/*.sh fiwb/wrapper* [ENTER]
• In the file fiwb/iwb.sh, modify the value of RUN_AS_USER=fluid to the user name that is intended to run the Optique platform. (Alternatively, create a user account with the name fluid.)
• To start the Optique platform, now type ./start.sh [ENTER]
• After a few seconds, the Optique platform should be accessible locally from any browser under the address http://localhost:8888
3.1.4 Opening the Optique platform
Once the Optique platform is running (startup may take up to a few minutes), it can be accessed at http://localhost:8888. The start page (cf. Figure 3.1) provides some links to important pages as well as to the user documentation. By default, an administrator account with credentials “admin/iwb” is created. It is highly recommended to change the password in the Admin area after the first login.

Figure 3.1: Start page of the Optique Platform, describing the basic steps for initial configuration.
3.1.5 Shutting down the Optique platform
Windows Service: Go to the Services configuration tool and stop the service
Linux Daemon: Invoke /etc/init.d/iwb stop
Command line (all OS): Exit the Optique platform by clicking in the command line window and pressing Ctrl + C.

IMPORTANT: Never exit the Optique platform by closing the command line window without proper shutdown. This can result in corrupting the data store and loss of data.
3.2 Administration Guide
The initial set-up of the platform for a specific domain requires going through several pre-configuration steps (Section 3.2.1) before the system can be fully customized (Section 3.2.2). We assume that this is to be done by an IT-Expert rather than the End-User (cf. Figure 1.1).
3.2.1 Setup and Configuration Steps
Registration of a Relational Database Endpoint and Extraction of
Relational Metadata
In order to extract data from a relational database, a so-called RDBMetainformationProvider needs to be set up. The provider securely stores the relevant connection parameters and connects to the relational database in order to extract information from it. To do so, start your Optique platform, go to Admin -> Data Provider Setup, and choose “Add Provider”. Then choose the RDBMetainformationProvider from the popup and insert the connection information. You need to provide the following information:
• The Poll Interval can be set to refresh database information periodically (a value of 0 indicates that the automatic refresh is disabled, i.e. only manually triggered runs are possible)
• The username is the name of the database user used to authenticate with the relational database
• The password is the password for this database user
• The schemaName is the name of the schema that needs to be extracted from the relational database (in order to extract multiple schemata from the same database, one needs to create multiple providers)
• The connectionString is the JDBC connection string used to connect to the database, e.g. jdbc:mysql://helloworld:3306/npd (see the standard JDBC documentation for connection strings for different database types)
• The driverClass specifies the Java class of the JDBC driver, e.g. com.mysql.jdbc.Driver. Note that the platform does NOT ship JDBC drivers (typically for licensing reasons), so the .jar file of the respective driver needs to be downloaded and added to the classpath.2
Figure 3.2 illustrates how to set up an RDBMetadataProvider for the relational NPD dataset,3 which is hosted within Optique’s development infrastructure (in order for this example to work, a VPN connection to the Optique network is required).
Bootstrapping of Initial System Configurations: Ontologies and
Mappings
Once the metadata has been extracted successfully, the O&M
component widget can be utilized in order to:
• bootstrap an initial ontology directly from the relational metadata
• bootstrap a set of direct mappings
• automatically align the bootstrapped ontology to an existing
domain ontology
• if needed, approximate the ontology language profile to the
OWL 2 QL profile
• store the assets in the platform’s shared metadata
repository
The O&M user interface is implemented as a wizard, which guides you stepwise through this process. While it is linked from the system’s start page, you can also open it directly, for example, by going to the URL http://localhost:8888/resource/Bootstrapping on your local instance.
2 For adding the JDBC driver to the classpath, it is required to download the JDBC driver for MySQL from http://dev.mysql.com/downloads/connector/j/5.0.html, copy the respective .jar file into the folder <INSTALL_DIR>/fiwb/libs/extensions and restart the platform.
3 http://sws.ifi.uio.no/project/npd-v2
Figure 3.2: RDBMetadataProvider - Example Configuration of an
RDBMetadataProvider
Configuration and Initialization of the Query Transformation
Component
After bootstrapping, alignment and approximation of the ontology and the respective mappings, the query transformation component needs to be initialized. For this purpose, the platform provides a specific configuration widget (cf. Figure 3.3), which makes it easy to select and combine different relational endpoints with different collections of mappings and ontologies. It automatically requests all registered relational data endpoints, mappings and ontologies from the shared metadata repository. By selecting the respective identifiers from the drop-down elements, you can easily define the input parameters in order to configure the Ontop component, which is responsible for query transformation. Once the required configuration parameters have been set and stored, the query transformation component will be initialized and registered as a Sesame repository within the platform.4,5 After initialization, the platform will automatically offer to operate on the virtualised repository, for example, from within widgets or the SPARQL interface.
3.2.2 Full Administrator Guide
A full documentation of features for administration of the platform core, as well as for developing customized domain extensions, ships with the platform bundle and is also available online.6 The documentation covers in particular topics such as system settings, wiki management, access to the platform’s APIs via REST and CLI, and the management of user rights. The developers guide explains basic plugin and extension mechanisms for developing customized solutions, for example, domain-specific query answer visualisation widgets.
3.3 End User Documentation
3.3.1 General Platform Features for Exploration, Search,
Authoring and Visualisation
Platform features that specifically target End-User needs, such as different kinds of widgets for query result visualization and browsing through the data, are documented. The help is included in the platform bundle as well as available through the online reference (see above). It documents the underlying key concepts of the template mechanism for browsing, searching and semantic authoring of resources using a semantic wiki approach.

4 http://openrdf.callimachus.net/sesame/2.7/apidocs/org/openrdf/repository/Repository.html
5 Depending on the size of the database, ontology and mappings this may take some time.
6 http://optique.fluidops.net/platform/help.pdf

Figure 3.3: Widget for configuration and initialization of the Query Transformation component.
3.3.2 Visual Query Formulation
OptiqueVQS initially includes three widgets, as depicted in Figure 3.4, with examples from [3]:

• The first widget (W1 - see the bottom-left part of Figure 3.4) is a menu-based query-by-navigation widget and allows users to navigate concepts by pursuing relationships between them, hence joining relations in a database.
• The second widget (W2 - see the bottom-right part of Figure 3.4) is a form-based widget, which presents the attributes of a selected concept for selection and projection operations.
• The third widget (W3 - see the top part of Figure 3.4) is a diagram-based widget and provides an overview of the constructed query and affordances for manipulation.
These three widgets are orchestrated by the system, through harvesting event notifications generated by each widget as a user interacts, to jointly extract and represent the information need of a user.

In a typical query construction scenario, a user first selects a kernel concept, i.e., the starting concept, from W1, which initially lists all domain concepts accompanied with icons, descriptions, and the potential/approximate number of results. The selected concept becomes the focus/pivot concept (i.e., the node coloured in orange or highlighted), appears on the graph (i.e., W3) as a variable-node, W2 displays its attributes, and W1 displays all concept-relationship pairs pertaining to this concept.
The user can select attributes to be included in the result list
(i.e., using the “eye" button) and/or imposeconstraints on them
through form elements (i.e., W2). Currently, the attributes
selected for output appearon the corresponding variable-node in
black with a letter “o", while constrained attributes appear in
bluewith letter “c". Note that W1 does not purely present
relationships, but combine relationship and conceptpairs (i.e.,
relationship and range) into one selection; this helps us to reduce
the number of navigational
26
-
Optique Deliverable D2.3 First Prototype of the Core
Platform
Figure 3.4: OptiqueVQS – an example query is depicted for the
Statoil use case.
levels that a user has to pass through. The user can select any
available option from the list, which resultsin a join between two
variable-nodes over the specified relationship and moves focus to
the selected concept(i.e., pivot). The user has to follow the same
steps to involve new concepts in the query and can alwaysjump to a
specific part of the query by clicking on the corresponding
variable-node. The arcs that connectvariable-nodes do not have any
direction, since for each active node only outgoing relationships,
includinginverse relationships, are presented for selection in W1;
this allows queries to be always read from left toright.
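To make the join step concrete, the following sketch shows how a single W1 selection extends the query under construction. The class and property names used here (Field, operatedBy, Company) are illustrative assumptions, not terms from the actual use-case ontologies:

```sparql
# Kernel concept chosen in W1: a single variable-node on the graph (W3).
SELECT ?field WHERE {
  ?field rdf:type :Field .
}

# After selecting the pair (operatedBy, Company) in W1: a join over the
# specified relationship is added between two variable-nodes, and the
# focus/pivot moves to the new ?company node.
SELECT ?field ?company WHERE {
  ?field rdf:type :Field .
  ?field :operatedBy ?company .
  ?company rdf:type :Company .
}
```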
In W3, we employ a node duplication approach for cyclic queries for the sake of having tree-shaped representations of queries, hence avoiding a graph representation, which might be complex for end-users to comprehend.
An example query for the Statoil use case is depicted in Figure 3.4. The query asks for all fields that contain an oil producing facility and are operated by the Statoil company. In the output, we would like to see the name of the field and the name of the facility. The user can delete nodes by switching to delete mode, or assert that two variable-nodes indeed refer to the same variable (i.e., a cyclic query). Affordances for these are provided by the buttons at the bottom-left part of W3.
The user can also switch to SPARQL mode and see the textual form of the query by clicking on the “SPARQL Query” button at the bottom-right part of W3, as depicted in Figure 3.5. The user can keep interacting with the system in textual form and continue the formulation process by interacting with the widgets. For this purpose, the pivot/focus node is highlighted and every variable-node is made clickable to allow users to change focus. Currently, the textual SPARQL query is non-editable and is for didactical purposes, so that advanced end-users, who are eager to learn the textual query language, can switch between the two modes and see the new query fragments added after each interaction.
Figure 3.5: OptiqueVQS – the example query is depicted in SPARQL form.
Chapter 4
Conclusion and Outlook
In this deliverable we presented a first prototype of the Optique platform, with a specific focus on the integration of existing components from the technical workpackages. At its core, the Optique platform rests on the Information Workbench, which serves as a semantic platform for integration. During the first project phase, the platform has been extended with specialized programming interfaces according to the requirements specification and overall architecture defined in Deliverable 2.1. The shared interfaces enable a seamless integration and interaction of components with the platform and among each other. More specifically, the platform allows components such as those for Ontology and Mapping Management, Query Transformation and Visual Query Formulation to access the required OBDA assets (ontologies, mappings, metadata and configurations) from the platform’s shared metadata repository in a standardized and convenient way. On the user interface side, a number of platform widgets and wizards have been created that support IT-Experts in setting up and deploying the Optique system for specific use-case domains. So far the platform has been successfully deployed for the Siemens and Statoil use-cases.1 Furthermore, relevant user interface components, such as those for visual query formulation and browsing, have been presented to a group of End-Users during the respective workshops.

Ongoing work with respect to the second year focuses on the design and implementation of interfaces for cloud automation. In particular, native support for managing computational cloud resources will enable a tight integration of the Distributed Query Answering and Processing component (ADP). While the current platform bundle already includes a special JDBC driver for accessing ADP, the integration with the component for query transformation is an ongoing activity. For details, please refer to Deliverable 7.1. Moreover, some refinements to the Relational Schema API are needed in order to enable access to metadata across multiple schemas and user rights. This will be of particular importance to the Statoil use-case. We are also planning to provide a dedicated infrastructure for storing and managing queries as first-class objects in the platform. While this will primarily support the task of building and managing domain-specific query catalogs, it also contributes to the challenge of establishing a proper framework for continuous performance and integration testing. Finally, we are discussing improvements to usability (e.g., exception handling) as well as to performance (e.g., the initialization and reconfiguration of the query transformation component).
1 See http://fact-pages.fluidops.net and http://optique-siemens.fluidops.net
Bibliography
[1] Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Antonella Poggi, Mariano Rodriguez-Muro, Riccardo Rosati, Marco Ruzzi, and Domenico Fabio Savo. The MASTRO system for ontology-based data access. Semantic Web, 2(1):43–53, 2011.
[2] Jim Crompton. Keynote talk at the W3C Workshop on Semantic Web in Oil & Gas Industry: Houston, TX, USA, 9–10 December, 2008. Available from http://www.w3.org/2008/12/ogws-slides/Crompton.pdf.
[3] Evgeny Kharlamov, Martin Giese, Ernesto Jiménez-Ruiz, Martin G. Skjæveland, Ahmet Soylu, Dmitriy Zheleznyakov, Timea Bagosi, Marco Console, Peter Haase, Ian Horrocks, Sarunas Marciuska, Christoph Pinkel, Mariano Rodriguez-Muro, Marco Ruzzi, Valerio Santarelli, Domenico Fabio Savo, Kunal Sengupta, Michael Schmidt, Evgenij Thorstensen, Johannes Trame, and Arild Waaler. Optique 1.0: Semantic access to big data: The case of Norwegian Petroleum Directorate’s FactPages. In International Semantic Web Conference (Posters & Demos), pages 65–68, 2013.
[4] Mariano Rodriguez-Muro and Diego Calvanese. High Performance Query Answering over DL-Lite Ontologies. In KR, 2012.
Glossary
ADP Athena Distributed Processing
API Application Programming Interface
CLI Command Line Interface
DL Description Logic
IWB FOP Information Workbench
JDBC Java Database Connectivity
NPD Norwegian Petroleum Directorate
OBDA Ontology-based Data Access
OS Operating System
OWL Web Ontology Language
O&M Ontology and Mapping
QA Query Answering
PoJo Plain Old Java Object
QF Query Formulation
QT Query Transformation
RDB Relational Data Base
RDBMS Relational Data Base Management System
RDF Resource Description Framework
REST Representational State Transfer
R2RML RDB to RDF Mapping Language
SPARQL SPARQL Protocol and RDF Query Language
SQL Structured Query Language
URI Uniform Resource Identifier
VQS Visual Query Formulation System
W3C World Wide Web Consortium
1 Introduction
2 Optique Platform Architecture & Implementation
2.1 Architectural Overview
2.2 The Information Workbench as a Platform for Integration
2.3 Shared Platform Interfaces
2.3.1 Ontology Management API
2.3.2 Relational Schema API
2.3.3 R2RML Mapping Management API
2.3.4 RDF Data Management API
2.4 Integration of Components
2.4.1 Ontology & Mapping Management
2.4.2 Query Transformation
2.4.3 Query Formulation
3 Documentation and Prototype
3.1 Installation Instructions
3.1.1 Obtaining the platform bundle
3.1.2 Installation Requirements
3.1.3 Installation
3.1.4 Opening the Optique platform
3.1.5 Shutting down the Optique platform
3.2 Administration Guide
3.2.1 Setup and Configuration Steps
3.2.2 Full Administrator Guide
3.3 End User Documentation
3.3.1 General Platform Features for Exploration, Search, Authoring and Visualisation
3.3.2 Visual Query Formulation
4 Conclusion and Outlook
Bibliography
Glossary