Undefined 1 (2010) 1–20, IOS Press
Context-driven RDF Data Replication on Mobile Devices¹
Stefan Zander a and Bernhard Schandl a
a University of Vienna, Research Group Multimedia Information Systems, Liebiggasse 4/3-4, 1010 Wien, Austria. E-mail: {firstname.lastname}@univie.ac.at
Abstract. With the continuously growing amount of structured data available on the Semantic Web there is an increasing desire to replicate such data to mobile devices. This enables services and applications to operate independently of the network connection quality. Traditional replication strategies cannot be properly applied to mobile systems because they do not adapt to changing user information needs, and they do not consider the technical, environmental, and infrastructural restrictions of mobile devices. Therefore, it is reasonable to consider contextual information, gathered from physical and logical sensors, in the replication process, and replicate only data that are actually needed by the user. In this paper we present a framework that uses Semantic Web technologies to build comprehensive descriptions of the user’s information needs based on contextual information, and employs these descriptions to selectively replicate data from external sources. In consequence, the amount of replicated data is reduced, while a maximum share of relevant data is continuously available to be used by applications, even in situations with limited or no network connectivity.
Keywords: Mobile applications, data replication, context awareness
1. Introduction
Mobile devices have become central parts of our everyday lives for managing our digital assets and lifestyle. Due to the convergence of traditionally separated networks and information channels and the continuing technical progress of mobile devices, network and online services can now be accessed regardless of spatial or temporal constraints: anytime, anywhere, and whenever the network connection allows. In parallel, with the emergence of the Web of Data, the amount of structured data available on the Web and in particular on the Semantic Web has been continuously growing throughout the last years. An increasing number of applications that utilize and integrate such data from different, distributed sources can be observed, providing additional services on top to users or software agents. This trend is further accelerated by so-called Semantic Web 2.0 [23] applications.
¹ This paper is an extended version of [49].
A common strategy to maintain service availability and to guarantee a certain service quality is replication of remote data sets. However, traditional replication mechanisms do not apply properly to mobile scenarios for the following reasons:
– Technical limitations: mobile devices are restricted in terms of memory capacity, CPU performance, power supply, and heat generation, which may hinder the local replication of large data sets.
– Environmental, infrastructural, and security constraints: network connectivity may be limited technically (no cellular radio coverage), economically (the network connection may be expensive), or because of security restrictions (e.g., when the network does not permit establishing a VPN connection). Consequently, in case of unstable network connections only the most important and relevant data sets should be replicated.
– Different application and operation models: mobile devices employ different application models and operating system infrastructures originating from ad-hoc and situational usage patterns. For instance, since mobile devices use different modalities in accessing information resources and are operated in different contexts, it is more common that current tasks are interrupted abruptly or moved to the background.
Due to these significant differences, mobile data replication should consider the importance of replicated data in relation to user tasks and activities as well as their operating environments. We therefore adopt the concepts of context and context awareness and utilize them for replication of RDF data sets to mobile devices. This allows for a proactive, selective, and transparent replication, focusing on the user’s situation and information needs. Our proposed solution addresses these issues from two sides: first, it considers the current (and future) context of the user and, based on this information, selects subsets of remote data sources for replication. Hence, the amount of data to be replicated (i.e., to be transferred to, and stored on, the mobile device) is reduced. Second, these subsets are replicated to the mobile device proactively and transparently, whenever network connectivity allows. As a consequence, data are still available when no network connectivity is present, while access times are significantly reduced since data can be reused from the local replica. As a side effect, semantic technology infrastructure is brought to mobile devices, which can be utilized by any application.
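The two-sided approach described above (context-based selection, then replication whenever connectivity allows) can be illustrated with a minimal sketch. It is not the MobiSem implementation: the `DataSubset` structure, the relevance scores, and the greedy storage-budget heuristic are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSubset:
    """A candidate subset of a remote RDF source (hypothetical structure)."""
    name: str
    size_kb: int
    relevance: float  # 0.0 .. 1.0, assumed to be derived from the user's context


def select_for_replication(candidates, budget_kb, min_relevance=0.5):
    """Greedily pick the most relevant subsets that fit into the storage budget."""
    chosen, used = [], 0
    ranked = sorted(candidates, key=lambda s: s.relevance, reverse=True)
    for subset in ranked:
        if subset.relevance < min_relevance:
            break  # all remaining subsets are even less relevant
        if used + subset.size_kb <= budget_kb:
            chosen.append(subset)
            used += subset.size_kb
    return chosen


def replicate(candidates, budget_kb, network_available):
    """Return the names of subsets to transfer; nothing is transferred offline."""
    if not network_available:
        return []  # keep working on the existing local replica
    return [s.name for s in select_for_replication(candidates, budget_kb)]


candidates = [
    DataSubset("meeting-participants", 120, 0.9),
    DataSubset("city-points-of-interest", 400, 0.6),
    DataSubset("full-encyclopedia-dump", 50_000, 0.2),
    DataSubset("todo-lists", 30, 0.8),
]
print(replicate(candidates, budget_kb=500, network_available=True))
```

With a 500 KB budget the sketch selects the two small, highly relevant subsets and skips the larger one that no longer fits; a real system would also have to schedule updates and evict stale data.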
This paper presents the MobiSem Context Framework¹, which is designed as a situation-sensible infrastructure framework for Semantic Web applications running on mobile devices. It uses a loosely coupled combination of context- and data providers to populate the local triple store with data from remote sources. It considers context information acquired from the device itself or the surrounding environment, thus hiding the tasks of context acquisition and data provisioning from users and applications.
¹ The MobiSem Context Framework has been developed in the course of the MobiSem project (see http://www.mobisem.org) and is currently being transitioned into a commercial solution.
We want to motivate our approach through a typical use case of a knowledge worker and their everyday working data. In this scenario, we assume that the user will be traveling during the next three days, where a number of business meetings will take place. The user cannot rely on a stable network connection during this trip. Therefore, it is desirable to transfer relevant information about the meetings and, in particular, the persons that will participate in the meetings to the mobile phone. This information can originate from a variety of sources, including public knowledge bases like Wikipedia/DBpedia² and GeoNames³, or the user’s private data, which might be available through their Semantic Desktop environment (e.g., relevant documents, email messages, and to-do lists, which are represented in a machine-processable format).
The next section (Section 2) gives an introduction to the notion of context and context awareness, and it elaborates on how they can be augmented using Semantic Web concepts and technologies. It is followed by an overview of currently available mobile RDF frameworks and related context-aware Semantic Web projects (Section 3). The architecture of the MobiSem framework, which can be deployed to mobile devices, is then presented together with technical details (Section 4). The feasibility of this approach is shown through a prototypical implementation of an application scenario on the Android platform (Section 5), followed by a comparative performance evaluation of the underlying RDF processing infrastructure (Section 6).
2. Context and Context Awareness
For the intelligent provision of user-related data, it is essential to capture the context in which the user currently operates. We aim to utilize the notion of context in order to describe and represent the user’s information needs so that relevant data sets can be proactively retrieved from external data sources in a transparent and automated manner. In this section, we provide an introduction to the concepts of context and context awareness from a technical perspective and discuss their main problems when used in information systems. Subsequently, we present several areas where concepts and technologies from the Semantic Web can substantially enhance context-aware computing on mobile systems.
2.1. Context and Context Awareness in Information Systems
Many definitions have been proposed for the notions of context and context awareness. Context in its widest sense is defined as “everything that surrounds a user or device and gives meaning to something” [43] as well as “anything that can be used to characterize the situation of an entity” [14]. We define the term contextual information to refer to any information that is relevant for describing the situation a user or device operates in. Consequently, context can be acquired explicitly, where context-related information is manually specified by the user, or implicitly, where context information is captured using specific technologies such as sensors or network communication, or by monitoring user behavior. The main focus of our framework lies on the implicit acquisition of contextual information, especially from physical sensors embedded in the device or ubiquitous sensors located in the immediate vicinity, as well as logical or software sensors that extract context-relevant information from personal sources such as emails, calendars, or web services. In this respect, the challenge is to identify the set of relevant features used for capturing and describing a situation or parts of the environment sufficiently [4].
In general, two forms of context awareness can be found in information systems [21]: direct awareness shifts the process of context acquisition onto the device itself, usually by embodying sensors that autonomously obtain contextual information; e.g., location ascertainment using the device-internal GPS sensor. Indirect awareness, in contrast, captures contextual information by communicating with sensors or services via the surrounding environment or infrastructure. For instance, to capture the social context of a user, a mobile device may request data from social communities or portals; to track the user’s location, a remote geocoding service (based on the user’s IP address) may be employed.
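The distinction between direct and indirect awareness can be sketched as two interchangeable context providers that both emit RDF-style triples. This is only an illustration: the class names, the `http://example.org/context#` namespace, and the stubbed sensor and service callables are hypothetical and not part of the framework described in this paper.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]
EX = "http://example.org/context#"  # hypothetical namespace for this sketch


class DirectLocationProvider:
    """Direct awareness: reads a sensor that is embedded in the device."""

    def __init__(self, read_gps: Callable[[], Tuple[float, float]]):
        self._read_gps = read_gps

    def acquire(self) -> List[Triple]:
        lat, lon = self._read_gps()
        return [
            (EX + "user", EX + "latitude", str(lat)),
            (EX + "user", EX + "longitude", str(lon)),
        ]


class IndirectLocationProvider:
    """Indirect awareness: asks a service in the surrounding infrastructure."""

    def __init__(self, geocode_ip: Callable[[str], str], ip_address: str):
        self._geocode_ip = geocode_ip
        self._ip_address = ip_address

    def acquire(self) -> List[Triple]:
        city = self._geocode_ip(self._ip_address)
        return [(EX + "user", EX + "city", city)]


def collect(providers) -> List[Triple]:
    """Gather context statements from all providers, regardless of their kind."""
    triples: List[Triple] = []
    for provider in providers:
        triples.extend(provider.acquire())
    return triples


# Stubs stand in for a real GPS sensor and a real remote geocoding service.
providers = [
    DirectLocationProvider(lambda: (48.2139, 16.3601)),
    IndirectLocationProvider(lambda ip: "Vienna", "192.0.2.1"),
]
context = collect(providers)
print(context)
```

Because both providers expose the same `acquire` method, a consumer of the collected statements does not need to know whether a value was sensed on the device or obtained from the environment.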
A fundamental problem of context-sensitive systems is that there exists no general model of context and context awareness. Especially in mobile computing, the notion of context is used very ambiguously across communities and is usually defined according to specific application domains (cf. [8,37]). This problem is also reflected in the development of mobile context-aware applications since no widely accepted and well-defined programming model exists, resulting in a tight coupling and low-level interaction between application code and context acquisition components. Consequently, interpretation and exchange of sensed values is anchored within applications in a proprietary manner. Recent approaches propose a more flexible architectural and conceptual design for representing and processing context-relevant information by using formal models for context and context awareness (e.g., [4,6]), complemented with user and task analysis to enable a dynamic interaction between context-, task-, and user models (e.g., [38]). Others employ middleware infrastructures (e.g., [24,28]) that encapsulate sensor-specific APIs in dedicated components in order to facilitate communication and interoperability between context processing components⁴ and the underlying framework, while making use of knowledge representation frameworks such as RDF for describing context information [17,35].
The MobiSem framework extends this idea in that it has been designed specifically to operate on mobile systems, and to use Semantic Web technologies to acquire, interpret, aggregate, store, and reason on contextual information, independent of any application or infrastructure. Semantic Web technologies and practices, which are designed as an information processing infrastructure for heterogeneous environments, can help to solve some of the issues described before, and are therefore highly relevant for the design and development of ubiquitous and mobile context-aware systems.
⁴ That is, context producers such as context acquisition components, and context consumers such as services, applications, or the device itself.
2.2. Dynamically Evolving Context Descriptions
Especially in technical disciplines it is a predominant practice to concentrate on sensorial and static data such as location, time, identity, activity, etc. (cf. [34,41]). In such disciplines, context is predominantly considered as a representational issue, where the focus is put on its codification and representation [43]. According to that perception, context can be scoped in advance, is instance-independent, and separable from user activities [15]. The reasons for that predominant practice of utilizing context in information systems can be attributed to the adherence to existing software methodologies [43].
However, this static perception entirely neglects the dynamic aspects of context, which arise in the course of interaction and make it impossible to determine the relevance of contextual facts a priori at design time. Context should be considered as an emergent phenomenon or feature of interaction that is centered around user activities [43] and continuously renegotiated between communicating partners [12,15,34]. Therefore, the determination of a relevant set of canonical context properties in advance is very difficult, if not nearly impossible [17].
To cope with the dynamic and emergent nature of context, a context processing and management framework must facilitate flexible, extensible, and open context descriptions that are not restricted to a single static vocabulary or predefined schema. Static context descriptions are not able to deal with unknown context information at run time, but require links between different context vocabularies to be specified at design time [17]. Therefore, a fundamental requirement of our proposed context framework is the ability to handle new types of context information dynamically using well-accepted standard vocabularies to guarantee their accuracy and evolution. In this respect, one can observe an analogy to the “real” Semantic Web, which deals with providing infrastructure to process information in a distributed and heterogeneous manner.
A general problem of context management and processing is that of context ambiguity [14]. Most context-aware computing approaches are based on the implicit assumption that the acquired context is a 1-to-1 representation of the surrounding real-world context. Obviously, this assumption is wrong due to the inherent differences between the way context is sensed and represented electronically, and the way it is perceived by individuals [14,15]. Therefore, a context framework can only work on a more or less accurate representation of the surrounding real-world context, where the degree of accuracy depends on a multitude of different factors (e.g., the user’s task at hand, their information needs, personal goals, etc.). The dynamic nature of context makes it difficult to specify all relevant context parameters at a system’s design time, since in general context is always defined relative to the situation in which it is used. Modeling context in information systems is therefore never universal in the sense that a context model encompasses all information characterizing a certain situation, but rather represents a relevant subset of the constituting characteristics [14,15]. This leads to cases of having multiple representations of the same situation differing in accuracy and in the contextual aspects they include.
Detecting all artifacts that constitute a specific context is nearly impossible and cannot be fulfilled by any context framework. Applying reasoning and machine learning techniques can increase the accuracy of context acquisition and context recognition processes, but can never identify all the possible artifacts constituting a specific context or situation. Context-aware computing is therefore always an approximation to a real-world situation rather than a 1-to-1 reflection of it.
To deal with that issue, several techniques and methodologies from different fields such as activity theory [36,43], aspect-oriented context modeling and modularization [11], or situational reasoning [6,33] have been applied to process context on higher, more abstract levels by aligning contextual aspects to abstract concepts (e.g., “business meeting”) adhering to upper-level ontologies. The idea is to aggregate and transform quantitatively acquired context artifacts into qualitative statements in order to express complex conceptual relationships and dependencies [6], and for applying classification-based reasoning techniques [22]. Additionally, high-level representations of contextual information unify access and utilization among applications. Context consumers do not need to be familiar with low-level data processing and interpretation, thus context sharing and exchange are simplified. Such transformations are considered as a means to make contextual information domain-independent.
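As a hedged illustration of turning quantitative context artifacts into a qualitative statement, the following sketch combines a sensed position, the current time, and calendar entries into a single high-level statement such as "business meeting". The rule, the 200 m radius, the dictionary layout of the appointments, and the `http://example.org/context#` namespace are assumptions chosen for this example and are not taken from the cited approaches.

```python
import math
from datetime import datetime

EX = "http://example.org/context#"  # hypothetical namespace for this sketch


def distance_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs (haversine)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000 * math.asin(math.sqrt(h))


def derive_situation(position, now, appointments, radius_m=200):
    """Turn sensed values into one qualitative statement about the user."""
    for appt in appointments:
        in_time = appt["start"] <= now <= appt["end"]
        nearby = distance_m(position, appt["place"]) <= radius_m
        if in_time and nearby:
            return (EX + "user", EX + "inSituation", EX + appt["kind"])
    return (EX + "user", EX + "inSituation", EX + "Unknown")


appointments = [{
    "kind": "BusinessMeeting",
    "start": datetime(2010, 5, 3, 9, 0),
    "end": datetime(2010, 5, 3, 10, 30),
    "place": (48.2140, 16.3602),
}]

statement = derive_situation((48.2139, 16.3601), datetime(2010, 5, 3, 9, 15), appointments)
print(statement)
```

A consumer only needs to understand the resulting `inSituation` statement; the coordinates, timestamps, and distance threshold stay hidden inside the transformation.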
The MobiSem Context Framework follows this idea in that it provides the technical infrastructure on which additional, more sophisticated layers (e.g., for situation awareness) can be deployed. Such layers allow for aggregating the contextual artifacts acquired by the underlying framework and applying different methodologies (e.g., Bayesian networks, case-based reasoning, stochastic methods, etc.) for their interpretation, consolidation, and augmentation. Section 5 describes a use case in which new contextual information can be derived by intelligently combining independently acquired contextual artifacts to provide a more sophisticated representation of a contextual aspect (in that case, location).
A general approach to systematically manage context information is to use ontologies, which provide a common structure for representing and describing information. The Resource Description Framework (RDF) enables communication and sharing of context descriptions between collaboratively communicating partners; i.e., services or devices. Its open architecture allows for the integration of different vocabularies so that context descriptions can dynamically grow and become more elaborate. Different works in the fields of pervasive and ubiquitous computing (e.g., [8,17]) have shown that both RDF(S) and OWL are appropriate languages for representing dynamic and evolving context descriptions [34]. Since these are grounded on the open world assumption, the possibility of adding new and more detailed information to existing descriptions makes them applicable in dynamic and unpredictable environments.
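How a context description can grow by mixing vocabularies may be sketched with plain Python sets of triples. The FOAF and WGS84 namespaces used here are existing vocabularies; the subject IRI, the literal values, and the set-based representation are assumptions for illustration only and do not reflect how any particular RDF library stores data.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
USER = "http://example.org/context#user"  # hypothetical subject IRI

# Initial description: only the user's name is known.
description = {
    (USER, FOAF + "name", "Alice"),
}

# Later, a location provider contributes statements from another vocabulary.
location_statements = {
    (USER, GEO + "lat", "48.2139"),
    (USER, GEO + "long", "16.3601"),
}

# Under the open world assumption, adding statements never invalidates
# what was already known, so merging is a plain set union.
extended = description | location_statements

vocabularies = {p.rsplit("#", 1)[0] if "#" in p else p.rsplit("/", 1)[0]
                for _, p, _ in extended}
print(len(extended), sorted(vocabularies))
```

No schema change is needed when the second vocabulary appears, which is the property that makes such descriptions usable in environments that cannot be anticipated at design time.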
Ontologies further help in matching expressed context information to application or service needs in that only relevant statements are extracted. A context consumer, i.e., a device or application, only needs to query for the information it is interested in, instead of processing the entire context description. If parties expose context descriptions that cannot be understood by others, ontology matching algorithms can be applied in order to reconcile differences in the description semantics. Ontology alignment services [16] can be used to account for the compatibility between different context models by identifying correspondences between context descriptions and performing query transformations to better reflect domain and information space evolutions [17].
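The two ideas in this paragraph, selective querying and reconciliation through correspondences, can be sketched as follows. The wildcard matcher is a strong simplification of a real query language such as SPARQL, and the two vocabularies (`vocabA`, `vocabB`), the property names, and the correspondence table are hypothetical.

```python
A = "http://example.org/vocabA#"  # vocabulary used by the context producer
B = "http://example.org/vocabB#"  # vocabulary understood by the consumer
USER = "http://example.org/context#user"

description = {
    (USER, A + "locatedIn", "Vienna"),
    (USER, A + "batteryLevel", "0.42"),
    (USER, A + "nextMeeting", "2010-05-03T09:00"),
}


def match(triples, pattern):
    """Return all triples matching (s, p, o); None acts as a wildcard."""
    return sorted(
        t for t in triples
        if all(want is None or want == have for want, have in zip(pattern, t))
    )


def align(triples, correspondences):
    """Rewrite predicates according to a table of equivalent properties."""
    return {(s, correspondences.get(p, p), o) for s, p, o in triples}


# The consumer only knows vocabulary B, so its query finds nothing at first.
before = match(description, (USER, B + "location", None))

# An alignment step (here a hand-written table) reconciles the vocabularies.
correspondences = {A + "locatedIn": B + "location"}
after = match(align(description, correspondences), (USER, B + "location", None))
print(before, after)
```

The consumer retrieves a single statement instead of processing the whole description, and the producer did not have to anticipate the consumer's vocabulary.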
Semantic technologies facilitate both direct and indirect context awareness, since context-related information can be acquired from external services or repositories in a structured and well-defined way based on explicitly represented semantics using open standards. Sensorial context data can be mapped to vocabularies so that sensed values are embedded in a controlled context description based on ontological semantics, where new facts can be discovered via aggregation and reasoning. In this respect, RDF simplifies the aggregation of heterogeneous context information, both on the semantic and syntactical layer.
Since technologies and concepts from the Semantic Web have been designed for heterogeneous environments, they offer languages and technologies that serve as standards for expressing contextual information, which can therefore be shared and exchanged among systems and applications. RDF further allows one to represent contextual information in multiple ways by using different vocabularies and transformation rules so that it can be used and understood by different components or context consumers. RDF thus facilitates transformations or mappings between heterogeneous context representations as well as the reconciliation of context heterogeneity.
If context-relevant data are represented using Semantic Web languages, they can be integrated and processed even if they were not known at design time of a mobile system (see [17] p.30 for an example). This also applies to divergent sensor or service feature descriptions where the identification of correspondences between heterogeneous descriptions serves as a basis for utilizing services and integrating acquired information that were not anticipated at design time of a mobile system.
Additionally, ontologies facilitate the interpretation of sensed or derived values to allow for their aggregation and transformation into symbolic values, i.e., transforming collected data into statements adhering to a prescribed vocabulary. Hence, context acquisition components do not need to anticipate possible queries beforehand, but provide the data they have and let the requesting components decide which information is of relevance to them.
The Semantic Web community has already developed a wide range of vocabularies that can be used to describe contextual information (including physical parameters like time⁵ and location⁶, technical parameters⁷, or social aspects⁸). The terms defined in these vocabularies are known across communities and adhere to well-defined and commonly understood semantics. Such vocabularies facilitate data interchange between heterogeneous systems, and are often maintained by a large number of people to guarantee their accuracy and relevance. Not being bound to a single vocabulary also adheres to the idea of dynamic and flexible context descriptions evolving in the course of user-relevant activities that cannot be determined a priori, especially not at design time of a mobile system or a mobile application.
In this section, we have outlined some of the areas in context-aware computing where Semantic Web technologies can make substantial contributions in representing, processing, and sharing contextual information as well as in the reconciliation of heterogeneous context semantics. The potential benefits of semantic technologies for related areas such as pervasive computing have been discussed in previous works [17]. In the following, we discuss relevant work in terms of related mobile replication approaches, mobile RDF frameworks, and existing context-aware mobile Semantic Web applications, and provide an overview of the MobiSem Context Framework that implements the ideas and concepts presented in this section. We denote this form of context-aware computing as Semantic Web-based context-aware computing.
3. Related Work
For realizing our idea of making Semantic Web technologies available on mobile systems for the intelligent, context-dependent provision of user-related data, we analyzed existing Semantic Web frameworks according to their appropriateness and deployability on mobile platforms and discuss existing projects that aim to synthesize semantic technologies, mobile systems, and context-aware computing.
3.1. Mobile Data Replication
The problem of replicating data to mobile devices is not new. Standard replication strategies (as known, e.g., from relational databases) cannot be directly applied to mobile scenarios because of the special restrictions imposed by changing context parameters, as outlined in Section 2. Therefore, several algorithms were proposed that estimate the costs of data usage based on various context parameters, and adapt the used replication strategies accordingly (e.g., costs of data transmission [27], access frequency [47], location [48], or device and environment characteristics [3]). These approaches are highly optimized towards single specific context parameters but do not consider the entire user context; in particular, they do not focus on the semantics of replicated data. However, they can be considered complementary to our approach since they can be used to determine the frequency of replication updates.
Several approaches follow a more generic strategy and provide architectures that are extensible w.r.t. the considered context parameters and replicated data (e.g., [25,32]). However, all these approaches depend on a server infrastructure, on which context processing and inference tasks are performed. To the best of our knowledge, no approach exists that solely relies on processing executed on the mobile device itself, without depending on external components and services.
3.2. Mobile Semantic Web Frameworks
Typical Semantic Web frameworks like Sesame⁹, Virtuoso¹⁰, and Jena¹¹ hide the details of RDF data processing, serialization, and query execution from higher-level applications. However, these heavy-weight systems cannot be deployed on typical mobile devices because of the devices’ limited memory and processing capacities, latencies, and incompatible application models and operating system infrastructures [18,30]. Those frameworks are usually developed for powerful server or desktop computing infrastructures incorporating many-core architectures, whereas mobile devices in general contain dedicated single-core RISC-based processors whose architecture was not designed for processing large amounts of data.
Although they have proven to be powerful means to process, store, and reason over RDF data, they cannot be efficiently deployed on mobile systems due to the previously mentioned reasons and are therefore not considered in our related work analysis. Instead, we exclusively concentrate on RDF frameworks that have been specifically designed for deployment on mobile platforms and are available as Java libraries, as well as on mobile query and storage frameworks that are built on top of existing RDF frameworks and provide additional functions for local RDF data query and persistence.
3.2.1. Mobile XML Parsers
kXML¹² is a lightweight XML pull parser that was specifically designed for constrained environments such as Applets or Java ME-based mobile devices. It is based on the Common XML Pull API¹³ and combines advantages of XML DOM and SAX parsers, such as aligning XML processing routines to the structure of an XML document and, at the same time, providing instant access to parsed document elements. It was specifically designed to be used in CLDC¹⁴ applications. However, development stalled in 2005.
NanoXML for J2ME (+RDF/OWL)¹⁵ is a J2ME port¹⁶ of the original non-validating XML parser NanoXML¹⁷ for Java, and has been extended with RDF and OWL support. It is dedicated to mobile environments and offers convenience methods for navigating and retrieving data from RDF and OWL documents such as resource or property values, but neither supports inferencing nor elaborates on RDFS/OWL semantics.
3.2.2. Mobile RDF Frameworks
Mobile RDF¹⁸ is a Java-based open source implementation [...] providing a simple and easy-to-use API for accessing and serializing RDF graphs. It is specifically designed for Java ME Personal Profile¹⁹ and Connected Device Configuration (CDC)²⁰ compliant devices, which is one of the main drawbacks of this framework since these application environments are only supported by a comparatively small number of devices, namely those that employ a CDC-specific Java Virtual Machine (JVM). Most current and older J2ME-compliant devices deploy the more widely used CLDC. Mobile RDF provides specific packages for creating, parsing, and serializing RDF/S and OWL ontologies, and supports RDF Schema type and property propagation rules as well as rule-based inferencing. However, RDF graph modifications like deleting or editing RDF triples are not supported.
µJena21 is a port of the popular Jena Semantic Web framework, targeted at low-capacity mobile and embedded devices. Although its API is currently in a prototypical state and only allows for processing RDF data serialized in N-Triples format, it covers the entire set of RDF modeling primitives and provides ontology and limited inference support as well as convenience classes for handling OWL ontologies. As in Jena, RDF data are represented on two levels: on the lower, more generic level, µJena stores triple nodes; on top of it, a model API offers convenience methods for accessing and manipulating RDF models.
Androjena22 is a more recent Jena port specifically created for the Android platform. It adopts Jena version 2.6.2 and offers all the functions and libraries Jena includes, such as full RDF and ontology support, inferencing, as well as reading and writing RDF data in different serialization formats. The Androjena core libraries, like the original Jena libraries, do not include specific APIs for querying RDF data, persistence, Named Graphs [10], or support for external reasoners. However, to provide at least a minimum of query functionality, the Androjena project page also hosts the ARQoid project23, which is a reduced port of Jena’s SPARQL query engine ARQ. Currently, ARQoid is in prototypical status and lacks some of ARQ’s original features such as full-text query support.
19 http://java.sun.com/products/personalprofile/
20 CDC is a framework specification for deploying and sharing mobile Java applications on hardware-constrained devices such as mobile devices or set-top boxes. It defines a basic set of libraries and virtual machine features that the underlying runtime environment must exhibit.
In summary, none of the existing mobile RDF frameworks fully supports queries on RDF data via SPARQL or other query languages, although Androjena provides a prototypical implementation of the Jena ARQ libraries. We also could not find a storage mechanism that translates RDF data into internal storage formats used by mobile devices (e.g., the SQLite database provided natively by the Android platform) and vice versa.
3.2.3. Query and Persistence Frameworks
RDF On the Go24 is a full-fledged RDF storage and query framework specifically designed and implemented for mobile devices that run the Android operating system. It follows an approach similar to Androjena, as the Jena core APIs including ARQ have been adapted to the Android platform to allow developers to directly operate on and manipulate RDF data models. The primary storage infrastructure consists of B-Trees as provided by a lightweight version of Berkeley DB25 adapted for mobile usage and deployment. The internal query processor supports both standard and spatial SPARQL queries, where an R-Tree based indexing mechanism is used for storing URIs with spatial properties [31]. The current version as of March 2011 supports a large set of standard SPARQL query operations, whereas aggregation, sorting, and some spatial operations are subject to future implementations [31].
SWIP: Semantic web in the pocket26 was developed in order to support RDF data storage and exchange in a uniform, schema-less, and system-wide way based on the Linked Data principles [5]. SWIP represents an Android-specific implementation of an RDF storage infrastructure that is based on the Android-internal concept of ContentProviders27 for application-wide data storage and exchange across applications and processes. It maps URIs to data stored in the local SQLite database deployed on Android systems and returns data in the form of triple sets or tuple tables. It employs a simple subject-predicate-object table layout for RDF data storage and is currently in prototypical status [13]. For demonstration purposes, data stored in device-internal data sources such as calendar entries or contacts have been exposed as RDF-based Linked Data and visualized through a generic browser interface.
However, these RDF storage and query infrastructures are only available as experimental prototypes or concept studies and lack specific storage and query optimizations for mobile platforms. Nevertheless, they demonstrate that typical RDF processing and storage tasks can be executed on mobile devices, although the efficient execution of complex processing operations (e.g., reasoning) or indexing mechanisms is still subject to further research.
3.3. Mobile Semantic Web Applications
DBpedia Mobile28 [2], a location-aware mobile application, allows users to access information from the DBpedia project29 about the physical environment surrounding them. Users are able to receive additional information by exploring links to other resources located in the Semantic Web.
mSpace Mobile30 [46] takes a similar approach, where access to related location-based information with respect to the user’s current situation is provided via a spatial browser. The contexts considered are time, space, and subject.
IYOUIT31 [6] collects contextual information about certain aspects of the user’s lifestyle, such as visited places or people met, and displays it on the Web. People are able to share their personal contexts within a community portal.
Although these projects make use of Semantic Web technologies such as RDF, the processing of contextual data is done on external servers or applications rather than on the device itself. This means, however, that in case of missing network connectivity the applications become practically useless. While a system that is deployed on the mobile device also cannot proactively update data from remote sources without connectivity, it provides at least a local buffer of the data that have been replicated so far, and hence allows the user to continue using the applications, although in a restricted manner. Another distinct aspect of our approach is that context acquisition and context representation are not limited to a predefined set of contextual aspects, i.e., the context descriptions created by the framework are dynamic and include as many aspects as could be acquired. Applications can process the data they are interested in, leading to greater flexibility in elaborating on contextual constellations.
In summary, our analysis revealed that context-driven replication of RDF data to mobile devices has not been addressed by current or related research yet. The RDF frameworks currently available for mobile systems provide the necessary functions for such a replication infrastructure, although much space is left for optimization. In Section 6 we therefore analyze the performance of mobile RDF frameworks in creating, parsing, and storing RDF triples directly on a device. A few mobile storage and query frameworks exist, but to date they are mostly in prototypical status, although recent developments indicate an increasing awareness of deploying Semantic Web technology on mobile devices (cf. exploiting linked data for mobile Augmented Reality [39], SWIP [13], i-MoCo [45]).
4. System Design and Architecture
The MobiSem framework has been specifically designed for direct deployment on mobile platforms. This allows it to acquire, process, store, and manage contextual information independently of any application or client-server infrastructure. The main goals of the MobiSem Context Framework can be summarized as follows:
– To provide a storage repository for semantic data on a mobile device. With the increasing proliferation of services based on Semantic Web technologies, the need for mechanisms to store, manipulate, and retrieve RDF data on mobile devices becomes apparent. The local storage of RDF data on a mobile device not only reduces the dependency on a permanent network connection, but also enables the implementation of more efficient search and reasoning algorithms, and extends the user’s local information space.
– To make efficient use of available context information. Modern mobile devices provide a multitude of options to capture the user’s context, which can be used to infer future information needs and adapt application and device behavior. A semantically appropriate interpretation of these context data helps to build more user-oriented applications and services and enhances the overall mobile user experience.
– To proactively provide context-relevant data on the device. As stated before, we cannot rely on a permanent network connection in mobile scenarios. On the other hand, we can infer future information needs from the user’s current context information and thus proactively retrieve data that might become relevant in the future from remote data sources to the mobile device, and buffer them using the local storage repository.
– To provide the technical infrastructure for high-level context processing. The dynamic and flexible characteristic of our context framework enables the deployment of additional high-level context recognition and utilization services on mobile devices to enable situation-awareness (cf. [1,19,33,42,44]). The framework facilitates almost all aspects of a mobile context processing and management architecture and serves as a foundation for the systematic management and exchange of context descriptions using open semantic standards.
To realize these goals it is necessary to combine the processing of context information with the local replication of remote data sources. However, it is also necessary to keep the framework design as flexible as possible: which context information can be tracked depends on the capabilities of the mobile device. Further, the user’s information needs might evolve over time; hence the approach cannot be restricted to a fixed set of remote data sources and should be flexible enough to enable the dynamic integration of new potential context sources on the fly.
[Figure 1 (diagram). Components on the mobile device: physical sensors and logical/software sensors; active and passive context providers (low-level context acquisition, context provider orchestration); context dispatcher (aggregation, merging, reasoning, notification); global RDF-based context model; RDF-based context descriptions; data providers; triple store holding replicated RDF data; MobiSem Data Access API; mobile applications. External sources: Linked Data, Web 2.0 applications, and Semantic Web services, accessed via a query language (e.g., SPARQL), HTTP request/response, or API/remote procedure call.]
Fig. 1. Architecture of the MobiSem Context-Processing Framework
We have decided to decouple the tasks of context acquisition and data replication (cf. Figure 1). Context-relevant data are retrieved by dedicated components (called context providers) and are converted into RDF-based context descriptions. These descriptions are aggregated into an RDF-based global context model that is used by data providers to replicate RDF data to the device. Replicated data are stored in a local triple store and made available through a data access API. A loose, data-based coupling between context providers and data providers is realized through a context dispatcher, which is notified every time a context provider detects a change in a context source it observes. The context dispatcher aggregates, consolidates, and reasons on context information, and forwards it to the appropriate data provider components.
This architecture exhibits two significant advantages in comparison to server-based approaches, as it does not require context information to be transferred outside the mobile device. First, the system does not depend on the availability of an external system. Second, all contextual data (which may include highly private information, like the current position, contacts, appointments, and so on) are processed only on the mobile device, which reduces security and privacy issues.
In the following, we describe the individual system components in more detail.
Context Providers We employ two types of context providers: primary (i.e., active) and complementary (i.e., passive) context providers. Primary context providers encapsulate a hardware or software sensor and become active whenever a change in a context source is detected. Complementary (that is, passive or reactive) context providers react to changes in primary context providers and become active when a corresponding primary context provider delivers an updated context model. They complement the contextual data retrieved from primary context providers by taking these context descriptions as input for initiating their acquisition tasks (context augmentation).
To provide the necessary flexibility in acquiring context-relevant data, context providers implement their own logic and heuristics for transforming any kind of input data (either sensorial or web-based content) into an RDF-based context description by using well-defined and well-accepted semantic vocabularies. As previously outlined, the acquisition of contextual data should not be restricted to capturing sensorial data exclusively, since the Internet and Web 2.0 applications in particular provide excellent sources for gathering context-relevant data. Context providers are therefore able to request data from four different types of sources:
(i) Hardware sensors that are integrated into the mobile system, such as a GPS module, luminosity sensor, camera, etc. Most modern mobile platforms provide specific APIs for accessing and utilizing locally deployed hardware sensors.
(ii) Ubiquitous sensors or devices that are located in the physical environment [20]. Such sensors must provide openly accessible interfaces based on open network and access protocols.
(iii) Web applications such as Facebook32, LinkedIn33, etc., which often contain useful information w.r.t. the users’ social relationships. Online and Linked Data repositories in particular provide large amounts of freely available context-relevant data that can be exploited for complementing sensorially captured data.
(iv) Software or logical sensors that allow for monitoring user or application behavior to deduce the type of data that is relevant to the user in a specific situation.
By employing logical sensors, the acquisition of user-related contexts is emphasized. Such sensors can be adjusted towards a particular system infrastructure to gather context-relevant information by monitoring system processes to deduce information about the currently running applications as well as the data they operate on34. Context providers can make use of context descriptions from other context providers as well as external data sources; e.g., a component may use the GPS coordinates provided by another context provider to look up names of the current location using an external service35.
Orchestration Framework To facilitate this kind of cooperation between decoupled context providers, an orchestration framework dynamically routes data between context providers based on the type of context information they provide. It orchestrates context providers in the form of a directed acyclic graph. Within this graph, primary context providers represent starting nodes, while complementary context providers represent adjacent nodes. Edges represent data flow between context providers; i.e., they indicate compatibility in terms of contextual data so that the data delivered by one context provider can be further processed by another context provider.
32 http://developers.facebook.com/
33 http://developer.linkedin.com/index.jspa
34 We implemented software sensors that track user queries issued to various mobile applications such as browsers or the internal ‘quicksearch’ function on an Android device.
35 See Figure 4 in Section 5 for an example.
The orchestration framework analyzes the data description of each context provider. Such a data description consists of sets of mandatory and optional namespaces and terms that the respective context provider can process as input data, as well as the namespaces and terms that the context provider uses in its output data.
Figure 2 depicts an excerpt of an exemplary data description. This complementary context provider extracts contact data from acquired calendar data. A data description consists of an input description (indicated by the ddesc:input property) and an output description (indicated by the ddesc:output property). The former specifies the data a context provider needs for performing its acquisition tasks. It may contain multiple ddesc:vocabulary properties, covering the case that a context provider is capable of processing data described with different vocabularies. Multiple vocabulary properties are interpreted by the orchestration framework as alternatives, that is, they are interpreted as being connected with a logical or.
A vocabulary specification consists of three parts: the ddesc:namespace property, which holds the vocabulary’s namespace that is used for an upper-level orchestration, and the ddesc:concepts and ddesc:properties statements, which specify mandatory and optional concepts and properties that the context provider processes. The latter specifications allow for a detailed, element-level orchestration of context providers.
Additionally, a data description specifies the namespaces and terms that the context provider emits as output data (indicated by the ddesc:output property). This property is mandatory for all context providers. The output description follows the schema of the input description, consisting of parts for vocabulary, concepts, and properties. In contrast to the input specification, the output specification may consist only of mandatory elements36.
Fig. 2. Exemplary data description for a complementary context provider for extracting contact data from calendar entries
The orchestration framework can be configured to perform either a loose orchestration on the namespace level or a detailed one that considers the concepts and properties given by the context providers’ data descriptions. When a new context provider is found in the system, the orchestration manager analyzes its data description and, based on its configuration, integrates the context provider into the orchestration graph. Since the orchestration manager runs completely decoupled from the context framework, rebalancing the orchestration graph does not affect context acquisition tasks as such.
The orchestration graph is represented as an adjacency matrix whose values are decimal numbers between 0 and 1, indicating the degree of compatibility between two context providers. The matching value for each pair of context providers is computed by a matching algorithm based on configurable scores for correspondences on the namespace, concept, and property levels. The matching algorithm performs an arithmetic matching based on data similarities and is additionally capable of including RDFS semantics such as rdfs:subClassOf relationships. For instance, if one context provider emits foaf:Person instances and another context provider requires foaf:Agent instances as input data, the matching algorithm detects the compatibility between these differing concepts since foaf:Person is a subclass of foaf:Agent according to the FOAF ontology [9].
36 According to the RDF semantics it is possible to specify optional data, although they will not be considered by the orchestration framework in its current version.
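The matching step can be illustrated with a minimal sketch. It is not the MobiSem implementation: the class names (DataDescription, ProviderMatcher), the three weights, and the restriction to a single superclass per concept are assumptions made for this example only. The sketch computes a score between 0 and 1 from namespace, concept, and property correspondences, and follows rdfs:subClassOf links when comparing concepts.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified view of one side (input or output) of a data description. */
class DataDescription {
    final String namespace;
    final Set<String> concepts;
    final Set<String> properties;

    DataDescription(String namespace, Set<String> concepts, Set<String> properties) {
        this.namespace = namespace;
        this.concepts = concepts;
        this.properties = properties;
    }
}

class ProviderMatcher {
    // Assumed weights; they sum to 1 so that every score stays within [0, 1].
    static final double NAMESPACE_SCORE = 0.2;
    static final double CONCEPT_SCORE = 0.5;
    static final double PROPERTY_SCORE = 0.3;

    // Direct rdfs:subClassOf statements (subclass -> superclass); one parent per class here.
    private final Map<String, String> superClassOf = new HashMap<>();

    void addSubClassOf(String sub, String sup) {
        superClassOf.put(sub, sup);
    }

    /** True if 'emitted' equals 'required' or is a direct or transitive subclass of it. */
    boolean satisfies(String emitted, String required) {
        Set<String> visited = new HashSet<>(); // guards against cyclic declarations
        for (String c = emitted; c != null && visited.add(c); c = superClassOf.get(c)) {
            if (c.equals(required)) return true;
        }
        return false;
    }

    /** Rates how well the output of one provider covers the input of another. */
    double match(DataDescription output, DataDescription input) {
        double score = output.namespace.equals(input.namespace) ? NAMESPACE_SCORE : 0.0;
        score += CONCEPT_SCORE * coveredShare(input.concepts, output.concepts, true);
        score += PROPERTY_SCORE * coveredShare(input.properties, output.properties, false);
        return score;
    }

    private double coveredShare(Set<String> required, Set<String> emitted, boolean subclasses) {
        if (required.isEmpty()) return 1.0; // nothing is required, so nothing is missing
        int covered = 0;
        for (String r : required) {
            for (String e : emitted) {
                if (subclasses ? satisfies(e, r) : e.equals(r)) { covered++; break; }
            }
        }
        return (double) covered / required.size();
    }

    /** Adjacency matrix: cell [i][j] rates the output of provider i against the input of j. */
    double[][] buildMatrix(DataDescription[] outputs, DataDescription[] inputs) {
        int n = outputs.length; // inputs is expected to have the same length
        double[][] matrix = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i != j) matrix[i][j] = match(outputs[i], inputs[j]);
            }
        }
        return matrix;
    }
}
```

With foaf:Person declared as a subclass of foaf:Agent, a provider emitting foaf:Person fully matches a provider requiring foaf:Agent, whereas the reverse direction only receives the namespace and property scores.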
Context Dispatcher The context dispatcher is notified by context providers whenever a context description has changed. Before propagating updated context descriptions to data provider components, the dispatcher performs additional processing on the data, like inference and consolidation. Currently, the reasoning component uses (i) a generic lightweight rule-based reasoner, which allows conditions to be specified under which new triples are added to the knowledge base, and (ii) hard-coded rules, which are expressed by implementing a Java interface. The combination of these two mechanisms can, for instance, be used to specify that if one resource has multiple values for a functional property, the values denote the same resource (the corresponding rule (A :fp X) ∧ (A :fp Y) ⇒ (X owl:sameAs Y) can be interpreted by the rule-based reasoner), and that multiple resources that are related via an owl:sameAs property can be merged into a single resource in order to simplify further processing (a corresponding algorithm can be implemented as a Java class and be integrated into the reasoning process).
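The two reasoning steps described above can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not the reasoner used in the framework: triples are plain strings, and the class names Triple and FunctionalPropertyRule are hypothetical. The first method derives owl:sameAs statements for multiple values of a functional property; the second merges resources connected by owl:sameAs into one representative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Minimal triple for this sketch; all terms are plain strings. */
class Triple {
    final String s, p, o;

    Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }

    @Override public String toString() { return s + " " + p + " " + o; }
}

class FunctionalPropertyRule {
    static final String SAME_AS = "owl:sameAs";

    /** Derives (X owl:sameAs Y) whenever (A fp X) and (A fp Y) hold with X != Y. */
    static List<Triple> infer(List<Triple> triples, String functionalProperty) {
        Map<String, String> firstValue = new HashMap<>();
        List<Triple> inferred = new ArrayList<>();
        for (Triple t : triples) {
            if (!t.p.equals(functionalProperty)) continue;
            String known = firstValue.putIfAbsent(t.s, t.o);
            if (known != null && !known.equals(t.o)) {
                inferred.add(new Triple(known, SAME_AS, t.o));
            }
        }
        return inferred;
    }

    /** Replaces every alias by its representative and drops duplicate triples. */
    static List<Triple> merge(List<Triple> triples, List<Triple> sameAs) {
        Map<String, String> canonical = new HashMap<>();
        for (Triple t : sameAs) {
            String representative = resolve(canonical, t.s);
            String alias = resolve(canonical, t.o);
            if (!representative.equals(alias)) canonical.put(alias, representative);
        }
        List<Triple> merged = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (Triple t : triples) {
            Triple m = new Triple(resolve(canonical, t.s), t.p, resolve(canonical, t.o));
            if (seen.add(m.toString())) merged.add(m);
        }
        return merged;
    }

    // Follows alias links until a term without a representative is reached.
    private static String resolve(Map<String, String> canonical, String term) {
        while (canonical.containsKey(term)) term = canonical.get(term);
        return term;
    }
}
```

Applied to two anonymous location resources that are both values of context:currentLocation for the same context resource, infer yields one owl:sameAs statement, and merge rewrites all triples of the second resource onto the first.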
Context descriptions are forwarded not only to data providers, but also back to context providers, so that they are enabled to mutually reuse and augment their context descriptions37.
Communication between the context providers and the context dispatcher is realized via a context description queue that not only buffers the most recent context updates, but also stores previous context updates for compensation strategies in case a context source is temporarily unavailable or malfunctioning. In such cases, the context dispatcher can revert to a previously committed context description to continue the context acquisition process. In doing so, the context dispatcher employs some logic to maintain consistency among aggregated context descriptions.
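The fallback behavior of such a queue might look like the following sketch. The class ContextDescriptionQueue, its method names, and the fixed history size are assumptions for illustration; context descriptions are represented as strings in place of RDF graphs, and the consistency logic mentioned above is not modeled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical bounded history of context descriptions per context provider. */
class ContextDescriptionQueue {
    private final int historySize;
    // Per provider, the most recent description is kept at the head of the deque.
    private final Map<String, Deque<String>> history = new HashMap<>();

    ContextDescriptionQueue(int historySize) {
        this.historySize = historySize;
    }

    /** Stores a new description; the oldest one is dropped once the history is full. */
    void commit(String providerId, String description) {
        Deque<String> queue = history.computeIfAbsent(providerId, k -> new ArrayDeque<>());
        queue.addFirst(description);
        if (queue.size() > historySize) queue.removeLast();
    }

    /** Most recently committed description, or empty if the provider never delivered one. */
    Optional<String> latest(String providerId) {
        Deque<String> queue = history.get(providerId);
        if (queue == null || queue.isEmpty()) return Optional.empty();
        return Optional.of(queue.peekFirst());
    }

    /**
     * Compensation strategy: a fresh update is committed and returned; if the source
     * delivered nothing (null), the last committed description is used instead.
     */
    Optional<String> resolve(String providerId, String freshUpdateOrNull) {
        if (freshUpdateOrNull != null) commit(providerId, freshUpdateOrNull);
        return latest(providerId);
    }

    /** Number of descriptions currently buffered for a provider. */
    int buffered(String providerId) {
        Deque<String> queue = history.get(providerId);
        return queue == null ? 0 : queue.size();
    }
}
```

A dispatcher built on this sketch would call resolve for each provider and thereby continue with the last known description whenever a source is temporarily silent.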
Global Context Model The global context model represents an aggregated version of all context providers’ context descriptions received by the context dispatcher. It is created whenever a primary context provider has detected a change in the context source it observes and has delivered an updated context description. This context update is first propagated to all complementary context providers to enrich it with additional data. When all context acquisition tasks are completed, the context dispatcher collects the updated context descriptions, aggregates them, applies reasoning rules as described before, and creates the global context model while maintaining context completeness, consistency, and accuracy.
Data Providers Data providers are responsible for handling RDF data replication tasks. They receive aggregated context description models from the context dispatcher and subsequently replicate data of any kind to the triple store. These data are usually retrieved from external data sources or may be generated by the data provider itself. For instance, a data provider may act upon changes of the current location and retrieve information about nearby points of interest. Each data provider is assigned a named graph under which it stores its data replicas in the triple store.
In addition to the default data providers that merely retrieve data from remote sources and store them in the triple store, we have implemented a selective checkout data provider that makes use of a partial versioning mechanism for RDF triples based on triple bitmaps [40], as well as a write-back data provider that synchronizes the partially replicated data back to the repository, if the latter supports write operations.
37 Figure 4 depicts an example of augmenting GPS coordinates with data from the GeoNames.org web service.
Triple Store Modern mobile platforms provide transparent access to persistent storage devices (e.g., flash memory cards) through a file system API. Therefore, the most straightforward way to store RDF data on a mobile device is to serialize them into a file on such a device using a standard RDF serialization format, like RDF/XML or N3. While this storage mechanism is extremely fast compared to DB-backed mobile storage solutions (cf. Section 6), it also has the significant disadvantage that RDF graphs must be loaded completely into the mobile device’s working memory (RAM) before they can be further processed (e.g., before a SPARQL query can be issued). Alternatively, triples can be stored in a relational database, which increases read and write times but provides the possibility for structured queries over the data.
Regardless of which actual storage solution is used, it can be wrapped by a Java class that maps all read and write access methods to corresponding operations on the underlying physical representation (either flat files or a relational model). Currently, our triple store implementation does not perform in-memory buffering or caching. However, it can be wrapped by an additional in-memory Graph instance (which provides faster access) that regularly synchronizes itself with the database-backed instance.
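The wrapping idea can be sketched as follows. All names (TripleStorage, SimpleStorage, BufferedStorage) are hypothetical and chosen for this example; SimpleStorage merely stands in for a file-backed or SQLite-backed class, and synchronization is triggered explicitly rather than on a schedule.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical storage abstraction that hides the physical representation. */
interface TripleStorage {
    void add(String s, String p, String o);
    boolean contains(String s, String p, String o);
}

/** Stand-in for the persistent representation (flat file or relational tables). */
class SimpleStorage implements TripleStorage {
    private final Set<String> rows = new LinkedHashSet<>();

    public void add(String s, String p, String o) { rows.add(key(s, p, o)); }

    public boolean contains(String s, String p, String o) { return rows.contains(key(s, p, o)); }

    int size() { return rows.size(); }

    // Tab-separated key; sufficient for this sketch, which assumes terms without tabs.
    static String key(String s, String p, String o) { return s + "\t" + p + "\t" + o; }
}

/** In-memory buffer in front of a slower store; sync() flushes pending writes. */
class BufferedStorage implements TripleStorage {
    private final TripleStorage backend;
    private final Set<String> cache = new LinkedHashSet<>();
    private final List<String[]> pending = new ArrayList<>();

    BufferedStorage(TripleStorage backend) { this.backend = backend; }

    public void add(String s, String p, String o) {
        // Only triples that are new to the buffer are queued for the next sync.
        if (cache.add(SimpleStorage.key(s, p, o))) pending.add(new String[] {s, p, o});
    }

    public boolean contains(String s, String p, String o) {
        return cache.contains(SimpleStorage.key(s, p, o)) || backend.contains(s, p, o);
    }

    int pendingCount() { return pending.size(); }

    /** Writes all buffered triples to the wrapped store. */
    void sync() {
        for (String[] t : pending) backend.add(t[0], t[1], t[2]);
        pending.clear();
    }
}
```

Reads are answered from memory where possible, while writes reach the persistent store only when sync() is called, which trades durability between synchronizations for faster access.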
Data Access API Applications can use the MobiSem Data Access API to access data stored in the device’s local triple store. The API assigns to each replicated graph a unique URI, which can be used to access and retrieve the data contained in the graph. It exposes insert, update, delete, and query methods and offers multi-grained access to data replicas, i.e., applications can access all replicas cached in the database, a specific replica, or a specific resource including all adhering triples of a specific replica.38 In the background, this API hides the details of context processing and data replication from applications; from the outside the MobiSem framework looks like a common triple store whose data are regularly updated.
38 This functionality is implemented through an Android Content Provider that allows for defining explicit URI schemes for data replicas through which operating system-wide data access and data utilization is offered. By exposing distinct URIs (e.g. content://org.mobisem.rdfprovider/graph#<graphid>) triples can be retrieved, added, deleted, and updated.
5. Implementation and Case Study
To demonstrate the feasibility of our architecture, we have implemented a prototypical framework plus an initial set of context and data providers. The selection of these components is based on the assumption that the information needs of mobile users depend on their current context (e.g., their location) as well as their future context. However, we want to emphasize that this framework is to be considered an infrastructure upon which end-user applications that provide specific functionality, based on specific context information and replicated data, can be built.
Our implementation is based on the Android platform39 and uses the µJena Framework (cf. Section 3.2.2) to process RDF graphs40. In the following we demonstrate how the MobiSem framework can be used to proactively provide RDF data on the mobile device. Our objective is to permanently equip the user with data about the locations they are going to visit, about people they are likely to meet in the upcoming days, as well as people that are based near the user’s current position. To accomplish this, different kinds of contextual information are utilized, including the device’s current position and the user’s calendar data.
Context Acquisition We have implemented three context providers. First, a location context sensor uses the device’s built-in GPS unit to track geographical coordinates and returns context descriptions that contain a context:currentLocation property to describe the coordinates of the current location (cf. Figure 3).
A second context provider uses the GeoNames service41 to resolve GPS coordinates to geographical entities. This component receives context updates from the context dispatcher, extracts properties that represent geographical coordinates, and returns information retrieved from the web service; in our example, a reference to a geographical entity as well as its name (cf. Figure 4).
39 http://developer.android.com
40 As shown in Section 6, µJena exhibits very weak performance compared to other RDF frameworks; however, more efficient implementations have been made available only recently. We plan to port our implementation to a more efficient RDF framework in the near future.
41 http://www.geonames.org
In parallel, a third context provider regularly scans the user’s calendar and extracts all appointments within the next 72 hours. From these appointments the e-mail addresses of all participants are extracted and returned, as depicted in Figure 5 (in this case, two e-mail addresses are returned). Further, the locations of appointments are extracted and returned as GeoNames features. This context provider uses terms from the NEPOMUK ontologies42 and from FOAF to describe the extracted resources.
The context dispatcher, which receives notifications from the context providers every time a context value changes, buffers, combines, and enriches the context description graphs with additional information. It merges all resources typed as context:Context into a single one, assigns it a URI (enabling it to be referenced by other context descriptions), and adds a timestamp as well as a link to the preceding context descriptor. Moreover, it applies simple inference rules to the context model: for example, the context:currentLocation property has been defined as a functional property (since we assume that the user can be at only one location at a time), from which the reasoner can deduce that the two anonymous location resources returned by the different context providers are actually the same and can likewise be merged, as shown in Figure 6.
The context dispatcher distributes this aggregated context description model to all data providers in the system whenever a contextual change is detected. It is then up to each data provider to decide whether to initiate a new replication task, and which information from the context description it uses for this purpose.
Data Provisioning We have implemented a number of data providers that address different information needs and replicate data from different sources to the mobile device. One data provider uses the Sindice Semantic Web index43 to retrieve information from FOAF descriptions (which are distributed across the Web) based on the e-mail addresses found in the context description. This includes names, contact and location information, and personal interests of the user’s prospective business partners. It also includes the social network of the meeting participants and is therefore valuable information for business negotiations as well as small talk.
    ncal:location [ a geonames:Feature ; rdfs:label "Munich" ] .
] .
<http://sws.geonames.org/2761369/>
    a geonames:Feature ;
    rdfs:label "Vienna"@en .
Fig. 6. Aggregated context description model
A second data provider retrieves triples about people that are based near the user’s current location by looking up resources that are foaf:based_near the current and future locations. This information allows users to increase the effectiveness of their trip by scheduling additional meetings with these persons without additional travel costs.
A third data provider returns additional data from DBpedia about the user’s current and future locations by reusing the GeoNames URI provided by the location context provider (a code excerpt from this data provider is depicted in Figure 7). By doing so, the user is automatically equipped with information about the locations they will visit, and about points of interest in their vicinity.
public class DBpediaLocationDataProvider extends AbstractDataProvider
{
    // called when the context description is updated
    @Override
    protected void updateContextImpl() {
        this.currentResourceLabels = new ArrayList<String>();
        // iterate over all geonames features in the context model
        try {
            // read model into targetModel (for further processing by the abstract superclass)
            this.targetModel.read(url, "N-TRIPLE");
        } catch (Exception e) {
            // error handling
        }
    }
}
Fig. 7. Code snippet of DBpediaLocationDataProvider, querying DBpedia for data about location resources. updateContextImpl() is called by the context dispatcher every time the global context model is updated, while updateDataImpl() is called whenever the data provider is requested to actually replicate data from the remote data source.
Fig. 8. Example SPARQL query produced by DBpediaLocationDataProvider
From an initial analysis, we can expect a significant effect on the amount of potentially interesting data that is to be replicated to a mobile device. For instance, the public DBpedia data set contains information about around 462,000 places. While no detailed information is available, from the overall size of the data set we can estimate that these places are described by around 88 million triples⁴⁴. By analyzing the user’s calendar and querying DBpedia for corresponding resources, this amount of data can be significantly reduced. For instance, if the MobiSem Context Framework detects three locations in the user’s calendar, it can convert them into a SPARQL query (cf. Figure 7) and query DBpedia. If the user’s upcoming events within the next 72 hours take place in Vienna, Salzburg, and Munich, the corresponding query (cf. Figure 8) yields around 8,500 triples, i.e., roughly 0.01% of the estimated 88 million, which can be handled by common state-of-the-art smartphones (cf. Section 6).
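The conversion of detected locations into a single SPARQL query can be sketched as follows. This is only a minimal illustration and not the query produced by DBpediaLocationDataProvider: the class LocationQueryBuilder, the matching of places by their English rdfs:label, and the choice of a CONSTRUCT query are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the MobiSem framework. */
final class LocationQueryBuilder {

    /** Builds a CONSTRUCT query for all triples about resources that carry one of the given English labels. */
    static String buildQuery(List<String> labels) {
        StringBuilder values = new StringBuilder();
        for (String label : labels) {
            if (values.length() > 0) {
                values.append(", ");
            }
            values.append('"').append(escape(label)).append("\"@en");
        }
        return "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
             + "CONSTRUCT { ?place ?p ?o }\n"
             + "WHERE {\n"
             + "  ?place rdfs:label ?label ;\n"
             + "         ?p ?o .\n"
             + "  FILTER (?label IN (" + values + "))\n"
             + "}";
    }

    /** Escapes backslashes and double quotes (line breaks in labels are not handled in this sketch). */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // The three example locations from the scenario above.
        System.out.println(buildQuery(Arrays.asList("Vienna", "Salzburg", "Munich")));
    }
}
```

Bundling all locations into one query keeps the number of round trips to the remote endpoint low, which matters on connections that are slow or expensive.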
All replicated data is persisted by a storage component that is compatible with the MobiSem Context Framework (cf. Section 4). In the case of Android, RDF graphs are either serialized into flat files (which is fast but cannot be directly queried) or stored in a custom triple store that is backed by a SQLite database. Its table layout applies the normalized triple store approach; i.e., it stores triples in a Triple table that holds references to separate tables for resources and literals. Moreover, it provides lightweight support for named graphs; therefore, the relational schema contains a separate Graph table.
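The idea behind the normalized layout can be illustrated with a small in-memory analogue. The sketch below is not the MobiSem schema: the class and field names are hypothetical, and plain Java collections stand in for the SQLite tables. It only shows that each resource, literal, and graph name is stored once, while the triple table holds integer references.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory analogue of a normalized triple store; not the MobiSem schema. */
final class NormalizedTripleStore {

    /** Dictionary table: stores each distinct string once and hands out its row id. */
    static final class Table {
        private final Map<String, Integer> ids = new HashMap<>();
        private final List<String> values = new ArrayList<>();

        int idOf(String value) {
            Integer id = ids.get(value);
            if (id == null) {
                id = values.size();
                ids.put(value, id);
                values.add(value);
            }
            return id;
        }

        String valueOf(int id) { return values.get(id); }

        int size() { return values.size(); }
    }

    final Table graphs = new Table();
    final Table resources = new Table();
    final Table literals = new Table();

    // Triple table rows: {graphId, subjectId, predicateId, objectId, objectIsLiteral (0 or 1)}
    private final List<int[]> triples = new ArrayList<>();

    void add(String graph, String subject, String predicate, String object, boolean objectIsLiteral) {
        int objectId = objectIsLiteral ? literals.idOf(object) : resources.idOf(object);
        triples.add(new int[] {
            graphs.idOf(graph), resources.idOf(subject), resources.idOf(predicate),
            objectId, objectIsLiteral ? 1 : 0
        });
    }

    int tripleCount() { return triples.size(); }

    /** Returns the objects of all triples with the given subject and predicate, in insertion order. */
    List<String> objects(String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (int[] row : triples) {
            if (resources.valueOf(row[1]).equals(subject) && resources.valueOf(row[2]).equals(predicate)) {
                result.add(row[4] == 1 ? literals.valueOf(row[3]) : resources.valueOf(row[3]));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        NormalizedTripleStore store = new NormalizedTripleStore();
        String name = "http://xmlns.com/foaf/0.1/name";
        store.add("urn:example:graph", "http://example.org/alice", name, "Alice", true);
        store.add("urn:example:graph", "http://example.org/bob", name, "Bob", true);
        // The predicate IRI is stored once although it is used by two triples.
        System.out.println(store.tripleCount() + " triples, " + store.resources.size() + " resources");
    }
}
```

Because long IRIs are stored only once, the triple rows stay small, which is the usual motivation for this layout on devices with limited storage.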
Any application built on top of this framework can now directly access these data via the MobiSem Data Access API. It could, for instance, iterate over all resources that are typed as foaf:Person and provide a list of names and phone numbers, relieving users of the need to search for these data manually in case they miss an appointment and need to notify the participants. The MobiSem framework entirely hides all context processing steps: an application is presented with a simple view on the triple store, which is always populated with context-relevant information.
⁴⁴ http://blog.dbpedia.org/2011/01/17/dbpedia-36-released/
6. Performance Evaluation of Mobile Semantic Web Platforms
In the resource-limited context of mobile devices, efficient processing of RDF data is crucial. To obtain insights into the processing capabilities of modern mobile platforms, we carried out a performance evaluation of the three mobile RDF frameworks Androjena, µJena, and Mobile RDF (cf. Section 3.2) on three different mobile devices (cf. Table 1). A very important factor for efficient processing is the time needed to create and store an RDF model in memory, as this is usually the basis for further computation, analysis, inference, or transmission of data over a network. We did not include RDF on the Go and SWIP in our evaluation since they either exist only as an implementation of a specific platform-dependent technology (SWIP) or were released after our evaluation had been conducted (RDF on the Go).
6.1. Test Environment
The Android HTC G1⁴⁵, released in 2008, was one of the first Android devices available on the market and represents the entry-level device class. It contains a 32-bit Qualcomm MSM7201A RISC CPU that runs at a clock speed of 350 MHz. Tests on this device were performed with the standard memory capacity of 192 MB under the Android operating system version 1.6 update 4.
The Motorola Milestone⁴⁶ was released in December 2009 and represents the mid-range class of Android-capable devices. It runs on a 32-bit TI OMAP3430 superscalar ARM Cortex-A8 RISC CPU with a nominal clock speed of 600 MHz. On this device, the tests were performed with the standard memory capacity of 256 MB under the operating system version 2.1 update 1.
Finally, we tested a Samsung Galaxy S I9000⁴⁷ smartphone, which was released in summer 2010. It uses a Samsung S5PC111 ARMv7-compatible CPU named “Hummingbird” with a maximum nominal clock speed of 1 GHz paired with a
PowerVR SGX540 GPU chip. This device uses 512 MB of main memory and runs the Android system version 2.2.
We analyzed the creation, parsing, and storage time for RDF models of various sizes, ranging from 10 to 50,000 triples. These models represent the different model sizes that are involved in the context processing and data replication tasks performed by our framework, as described in Section 4. Typically, a single context provider emits very small models in the range of 10 to 100 triples, while a complete context model that has been aggregated from the single context providers may have several hundred to several thousand triples in total. Data that are replicated from external sources may in principle be of arbitrary size; therefore, we scaled our tests up to 50,000 triples in a single RDF model.
The distribution of distinct subject, predicate, and object nodes has been estimated based on an analysis of the 2009 Billion Triple Challenge data set [40]. In these data we can observe that RDF data sets typically have a very high number of distinct object values and a low number of distinct predicates, while the number of distinct subjects lies between these boundaries. All benchmarks were performed on the mobile devices during regular usage, i.e., with the usual system processes running in parallel to our tests.
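A test model with this qualitative shape (few predicates, more subjects, mostly distinct objects) could be generated as in the following sketch. The generator, its parameters, and the example.org IRIs are hypothetical; the concrete ratios used in our tests are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical generator for synthetic test triples; names and ratios are illustrative only. */
final class SyntheticTriples {

    /**
     * Generates n triples as {subject, predicate, object} arrays. Subjects and predicates are
     * drawn from pools of limited size, while every object value is distinct.
     */
    static List<String[]> generate(int n, int distinctSubjects, int distinctPredicates, long seed) {
        Random random = new Random(seed); // fixed seed keeps test runs reproducible
        List<String[]> triples = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            String subject = "http://example.org/s" + random.nextInt(distinctSubjects);
            String predicate = "http://example.org/p" + random.nextInt(distinctPredicates);
            String object = "value " + i;
            triples.add(new String[] { subject, predicate, object });
        }
        return triples;
    }

    public static void main(String[] args) {
        List<String[]> model = generate(1000, 100, 10, 42L);
        System.out.println(model.size() + " triples generated");
    }
}
```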
For each framework, device, and operation, we measured the total amount of time needed in milliseconds. From these measurements we can calculate the standard deviation between different test runs for each size as well as the number of triples that a particular combination of device and framework is able to process within one second.
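The two derived figures can be computed as in the following sketch. The class BenchmarkStats is hypothetical, the use of the sample standard deviation (denominator n - 1) is an assumption, and the timings in main are made-up values for illustration, not measurements from our tests.

```java
/** Hypothetical helper that derives statistics from per-run timings in milliseconds. */
final class BenchmarkStats {

    static double mean(long[] millis) {
        double sum = 0;
        for (long m : millis) {
            sum += m;
        }
        return sum / millis.length;
    }

    /** Sample standard deviation (denominator n - 1); returns 0 for fewer than two runs. */
    static double stdDev(long[] millis) {
        if (millis.length < 2) {
            return 0.0;
        }
        double mean = mean(millis);
        double sumOfSquares = 0;
        for (long m : millis) {
            double deviation = m - mean;
            sumOfSquares += deviation * deviation;
        }
        return Math.sqrt(sumOfSquares / (millis.length - 1));
    }

    /** Throughput in triples per second for a model of the given size. */
    static double triplesPerSecond(int tripleCount, double meanMillis) {
        return tripleCount * 1000.0 / meanMillis;
    }

    public static void main(String[] args) {
        long[] runs = { 200, 210, 190 }; // made-up timings for a model of 100 triples
        double mean = mean(runs);
        System.out.println("mean (ms): " + mean);
        System.out.println("std dev (ms): " + stdDev(runs));
        System.out.println("triples/s: " + triplesPerSecond(100, mean));
    }
}
```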
In order to eliminate technological differences between SD cards in terms of access times and read/write performance, we first copied the data replicas from the SD card to the internal non-volatile memory (ROM) of a device, from where they were then parsed and transformed into a working in-memory model.
Before each benchmark was initiated, the device was restarted to ensure identical run-time conditions. At the end of each benchmark, all files and data that had been created during a test run were deleted and the test environment was reset to avoid any influence on subsequent benchmarks.
Fig. 9. Construction of RDF graphs (Android HTC G1, Motorola Milestone, Samsung Galaxy S I9000)
6.2. Results
Figures 9, 10, and 11 depict the results of our measurements (detailed numbers can be found in the appendix) for each analyzed device⁴⁸.
⁴⁸ ‘RDF/XML’, ‘N3’, and ‘N-TRIPLE’ refer to the different serialization formats supported by the Androjena framework. For readability, we omitted the framework’s name ‘Androjena’ and refer only to the respective format in all parsing and storage figures.
Table 1
Overview of the Android devices’ specifications

                        HTC G1                Motorola Milestone          Samsung Galaxy S I9000
Processor               Qualcomm MSM7201A™    TI OMAP3430 ARM Cortex A8   Samsung S5PC111 (ARMv7-comp.)
Clock speed             350 MHz               600 MHz                     1 GHz
Memory capacity (RAM)   192 MB                256 MB                      512 MB
OS version              Android 1.6-update4   Android 2.1-update1         Android 2.2
Release                 09/2008               12/2009                     06/2010
Constructing In-memory RDF Graphs. When creating in-memory RDF graphs of certain sizes, we can observe similar behavior on all three tested platforms. Androjena and Mobile RDF exhibit very similar results, namely a nearly constant processing time per triple, even with increasing model size. Although processing times of mobile RDF frameworks vary considerably for small context descriptions with fewer than 500 triples (up to a factor of 10 on the Samsung Galaxy S I9000 using Mobile RDF for processing a model containing 100 triples), processing times normalize for models of 1,000 or more triples on the two frameworks. In general, we can observe that Androjena and Mobile RDF are able to handle RDF graphs containing 20,000 or more triples, although the limiting factor is the device’s memory capacity.
Additionally, the total execution time (in ms) for Androjena and Mobile RDF scales almost linearly with the size of context descriptions. The performance of µJena, in contrast, decreases significantly with increasing model sizes, leading to very low processing speeds for models larger than 100 triples. µJena tests with more than 2,000 triples failed on all devices, making it essentially unsuitable for the processing of voluminous RDF data.
Processing speed of Androjena ranges between 480 and 680 triples per second on an Android HTC G1, and between 1,000 and 2,000 triples per second on a Motorola Milestone. Interestingly, on the Samsung Galaxy we can observe that the performance increases when models with more than 200 triples are processed. The performance of µJena constantly decreases with increasing model size on all three devices. Mobile RDF exhibits performance behavior similar to Androjena, with a significant increase in triples-per-second values on a Samsung Galaxy for models with more than 500 triples. In general, Mobile RDF proved to be the fastest framework w.r.t. the number of triples processed per second on all tested devices.
When comparing the different devices, we can observe the expected behavior that the Android HTC G1 shows the weakest results due to its slow CPU and small main memory, leading to memory problems when creating models with 20,000 or more triples. The other devices perform better, making them more suitable for processing larger volumes of RDF data. Only the Samsung Galaxy S I9000 was able to handle a model of 50,000 triples; on the other devices, tests with this model size failed with “out of memory” errors.
Parsing RDF Graphs. Androjena scales reasonably well with available processing power and yields its best parsing results in terms of triples per second with RDF graphs containing more than 200 to 500 triples. However, on newer mobile devices we could not observe a notable difference between the serialization formats for graphs smaller than 100 triples; significant differences in benchmark results among serialization formats only emerge for graphs with more than 100 triples.
µJena yields its best results with very small RDF graphs containing fewer than 20 triples. However, we observed a dramatic decrease in parsing performance with models containing more than 20 to 50 triples, which renders µJena inappropriate for processing larger data replicas.
MobileRDF also scales reasonably well with available processing power and turns out to be the fastest RDF framework in terms of parsing performance, especially for larger RDF graphs with more than 100 to 200 triples. This behavior was more pronounced on less powerful devices such as the HTC G1 or the Motorola Milestone but diminished on recent, more powerful devices such as the Samsung Galaxy⁴⁹. The best performance results were measured with RDF graphs containing around 5,000 to 10,000 triples on the Samsung Galaxy S I9000.
Fig. 10. Parsing of RDF graphs (Android HTC G1, Motorola Milestone, Samsung Galaxy S I9000)
⁴⁹ We verified this assertion using a Dell Streak smartphone that also runs an ARM Cortex A8 CPU clocked at 1 GHz, where we could ascertain a similar behavior.
In summary, the parsing benchmark exhibits similar behavior on all three devices, revealing that MobileRDF yields the fastest parsing performance, followed by Androjena and then µJena, whose parsing performance constantly drops with increasing graph sizes. Additionally, Androjena and MobileRDF scale reasonably well with available processing power. Considering the different serialization formats supported by Androjena, the best parsing results were measured with N-Triple serialized graphs, followed by N3 and RDF/XML.
Serializing RDF Graphs. Storage times of all frameworks are roughly linear in the number of triples to be stored; we observed this linear scaling for all three frameworks and on each device. However, no significant difference in file sizes between the different frameworks and serialization formats could be found, which indicates that storage algorithms do not make use of, e.g., QNames. File sizes of the serialized data replicas are thus rather similar among all frameworks and devices.
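To illustrate why QNames would affect file sizes, the following sketch abbreviates an IRI against a prefix table. It is a hypothetical helper, not code from any of the tested frameworks, and it does not check whether the remaining local name is syntactically valid in a given serialization format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not taken from Androjena, µJena, or Mobile RDF. */
final class QNameAbbreviation {

    /**
     * Returns prefix:localName if the IRI starts with a known namespace, and the full IRI in
     * angle brackets otherwise. The validity of the local name is not checked in this sketch.
     */
    static String abbreviate(String iri, Map<String, String> prefixes) {
        for (Map.Entry<String, String> entry : prefixes.entrySet()) {
            String namespace = entry.getValue();
            if (iri.startsWith(namespace)) {
                return entry.getKey() + ":" + iri.substring(namespace.length());
            }
        }
        return "<" + iri + ">";
    }

    public static void main(String[] args) {
        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("foaf", "http://xmlns.com/foaf/0.1/");
        String iri = "http://xmlns.com/foaf/0.1/name";
        String full = "<" + iri + ">";
        String abbreviated = abbreviate(iri, prefixes);
        // Each occurrence of the IRI is shorter in its abbreviated form.
        System.out.println(full.length() + " vs. " + abbreviated.length() + " characters");
    }
}
```

A serializer that abbreviates frequently used namespaces in this way would produce noticeably smaller files than one that writes every IRI in full, which is why similar file sizes across formats suggest that no such abbreviation takes place.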
Androjena’s saving performance scales reasonably well w.r.t. available processing power; the best results were achieved on the Samsung Galaxy, where total storage times were seven times shorter than those measured on the HTC G1 for all serialization formats. Serializing RDF graphs in the N3 format yields the best triples-per-second ratio, followed by RDF/XML and N-Triple. The best storage performance results were measured with graphs of between 100 and 2,000 triples, irrespective of the serialization format and device.
Although by far the least competitive framework in terms of creation and parsing performance, µJena yields the best storage performance on the HTC G1 and the Motorola Milestone. However, this behavior disappeared on the Samsung Galaxy and similar devices such as the Dell Streak⁵⁰, where MobileRDF and N3-serialized graphs using the Androjena framework showed the best results⁵¹. Interestingly, the best storage performance results were measured on the Motorola Milestone, which exceeds the results of the other two devices considerably.
Fig. 11. Serialization of RDF graphs (Android HTC G1, Motorola Milestone, Samsung Galaxy S I9000)
⁵⁰ http://www.dell.com/us/p/mobile-streak/pd
⁵¹ We also tested the storage performance on a Dell Streak smartphone, which exhibits similar processing power and clock speed.
Storage performance of MobileRDF scales with available processing power for RDF graphs with more than 200 to 500 triples. The best results were measured for graph sizes between 500 and 5,000 triples, where the triples-per-second ratio differs by a factor of 8 between the Samsung Galaxy and the HTC G1.
In summary, we can see that modern mobile devices, in combination with recent RDF frameworks that are optimized for mobile devices, can readily be used as the basis for Semantic Web applications. In future work, we aim to analyze the behavior of these devices w.r.t. modification and deletion operations, as well as querying and inference over RDF data, depending on the availability of such implementations.
7. Conclusions and Future Work
The notions of context and context awareness are key factors in providing a selective RDF-based data replication infrastructure for mobile devices. We have outlined that traditional replication strategies do not hold in mobile scenarios for several reasons. They should be improved by considering users’ current and future information needs as well as the different contexts they are operating in, thus replicating only selected subsets of the base data. We therefore adopted the notion of context and context awareness and combined it with semantic technologies, since they provide the necessary flexibility and expressivity for context-dependent RDF-based data replication on mobile devices. Our framework employs a loose coupling between context acquisition and data provisioning components, gained by applying semantic technologies (data models, vocabularies, inference) to interpret and process context information. We implemented an example scenario in which personal information from Linked Data sources is replicated based on the user’s current location and upcoming appointments. Our performance evaluation has shown that the performance of current RDF processing frameworks, deployed on state-of-the-art mobile devices, is acceptable for the processing of RDF models of several thousand triples.
Although we have demonstrated that semantic technologies can provide substantial contributions in realizing a mobile context-aware infrastructure for RDF(-based) data replication, there are still some open issues that need to be addressed in future research: the integration of dynamically discovered context sources is a challenge most context-management frameworks face, especially in ubiquitous environments. We therefore plan to investigate additional methods for dynamic context source discovery and integration as well as heuristics for transforming sensor data into qualitative context descriptions. We further plan to consider re-using functionality already built into the framework (namely, the acquisition and combination of contextual information from varying sources) to decide upon the optimal time for initiating replication tasks. Currently, our framework does not include feedback loops that would allow for adjusting context acquisition and aggregation tasks according to data provisioning needs, and it lacks advanced reasoning capabilities, which we plan to implement in the near future.
An approach as proposed by [29] to integrate formal rule languages like SWRL [26] into context processing tasks would allow for the user- and application-driven specification of aggregation, reasoning, and consolidation rules for collected and augmenting contextual data. Additionally, context processing could be complemented with machine learning techniques for detecting usage patterns, as proposed by [6,7]. Moreover, a context framework can itself be made context-aware to adapt its processing rules and policies to specific circumstances, for instance to reduce replication cycles when the battery is low. We plan to address these issues in future work.
Acknowledgements. This work has been funded by the FIT-IT grant 815133 from the Austrian Federal Ministry of Transport, Innovation, and Technology. We would also like to thank Martin Raubal and Jerome Euzenat for their valuable feedback, which greatly helped to improve this work.
References
[1] C. B. Anagnostopoulos, Y. Ntarladimas, and S. Hadjiefthymiades. Situational computing: An innovative architecture with imprecise reasoning. J. Syst. Softw., 80(12):1993–2014, 2007.
[2] C. Becker and C. Bizer. DBpedia Mobile: A Location-Enabled Linked Data Browser. In Workshop on Linked Data on the Web (LDOW2008), 2008.
[3] A. Beloued, J.-M. Gilliot, M.-T. Segarra, and F. Andre. Dynamic Data Replication and Consistency in Mobile Environments. In Proc. of the 2nd international doctoral symposium on Middleware, pages 1–5, New York, NY, USA, 2005. ACM.
[4] G. Biegel and V. Cahill. A Framework for Developing