Reusing XML Schemas' Information as a Foundation for Designing Domain Ontologies
Thomas Bosch (M.Sc.) | http://boschthomas.blogspot.com

Problem
• Traditionally, ontology engineers work in close collaboration with domain experts to design domain ontologies (DOs), which requires a lot of time and effort
• DOs as well as XML Schemas (XSDs) describe domain data models
• In many cases, XSDs are already defined and can therefore be reused to design DOs

Hypothesis
The effort and time required to deliver high-quality DOs using the proposed approach are much less than for creating DOs completely manually.

Main Research Question
How can the time-consuming process of designing DOs be accelerated based on already available XSDs?

XSD and OWL follow different modeling goals, the mapping transports only the XSDs' information, and generated ontologies (GOs) do not conform to the highest quality requirements of DOs. GOs are therefore not immediately useful: domain experts and ontology engineers enrich GOs with additional domain-specific semantic information in the form of DOs.

Benefits
• The process of designing DOs from scratch is sped up significantly
• All of the XSDs' information (terminology, syntactic structure of XML documents) is reused in GOs
• The GOs' RDF representations can be published in the LOD cloud and linked to other RDF datasets
• All XML data conforming to the XSDs can be imported automatically as instances of the DOs
• GOs and DOs can be maintained quickly
• Technical and content-related weaknesses of data models can be detected

Novelty of Approach
• Based on the XSD meta-model
• Does not extract semantics out of XSDs
• Transformation on the terminological and assertional knowledge level
• Automatic transformation of XSDs and XML documents
• More expressive power of OWL instead of RDFS GOs

Limitations
• Prerequisite: XSDs
• Unsuitable use cases (e.g. when XSDs do not represent the domain knowledge correctly or when XSDs are technically not well designed)

Map XSDs to GOs
• <xs:element name="VariableName" ... />
  VariableName ⊑ Element
• <xs:element name="VariableName" ... />
  VariableName ⊑ name_Element_String.{'VariableName'}
• <xs:attribute ref="lang"/>
  Lang-Reference ⊑ ref_Attribute_Attribute.Lang
• <xs:element name="VariableName" type="NameType"/>
  VariableName ⊑ type_Element_Type.NameType
• <xs:extension><xs:attribute name="translated"/><xs:attribute name="translatable"/></xs:extension>
  Extension1 ⊑ contains_Extension_Attribute.(Translated ⊔ Translatable)

Use Cases
• To prove the approach's generality: any XSDs and corresponding XML documents can be converted to GOs and their RDF representations, as all components of the XSD meta-model are covered
• Generic test cases: derived from the XSD meta-model
• Domain-specific use cases: Data Documentation Initiative (DDI) ontology; projects: MISSY, da|ra, LOD pilot project, SOFISwiki

Evaluation
• To verify the hypothesis
• User study to compare the traditional manual approach with the proposed approach (define measurement methods)
• Derive DOs of multiple and differing domains

Proposed Approach
Derive DOs using SWRL rules
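
The rules listed under Map XSDs to GOs can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names `cap` and `xsd_to_axioms`, the `SAMPLE` input, and the numbering scheme for unnamed components are assumptions made for illustration, and the sketch covers only the five patterns shown on the poster rather than the full XSD meta-model. The axioms are emitted as plain strings in the poster's own notation. The sample wraps the poster's fragments in an `xs:schema` root only so that they parse; a real schema would not place `xs:extension` directly under the root.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def cap(text):
    """Upper-case the first character: 'translated' -> 'Translated'."""
    return text[:1].upper() + text[1:]


def xsd_to_axioms(xsd_text):
    """Return poster-style axioms for the components below the schema root."""
    axioms = []
    unnamed = {}  # per-kind counters for components without name or ref (assumed scheme)

    def visit(node):
        kind = cap(node.tag[len(XS):])  # 'element' -> 'Element'
        name, ref, type_ = node.get("name"), node.get("ref"), node.get("type")
        if name:
            subject = cap(name)
            axioms.append(f"{subject} ⊑ {kind}")
            axioms.append(f"{subject} ⊑ name_{kind}_String.{{'{name}'}}")
        elif ref:
            subject = f"{cap(ref)}-Reference"
            axioms.append(f"{subject} ⊑ ref_{kind}_{kind}.{cap(ref)}")
        else:
            unnamed[kind] = unnamed.get(kind, 0) + 1
            subject = f"{kind}{unnamed[kind]}"
        if type_:
            axioms.append(f"{subject} ⊑ type_{kind}_Type.{type_}")
        # Group child components by kind and emit one 'contains' axiom per kind.
        children = {}
        for child in node:
            if child.tag.startswith(XS):
                child_kind = cap(child.tag[len(XS):])
                children.setdefault(child_kind, []).append(visit(child))
        for child_kind, names in children.items():
            filler = names[0] if len(names) == 1 else "(" + " ⊔ ".join(names) + ")"
            axioms.append(f"{subject} ⊑ contains_{kind}_{child_kind}.{filler}")
        return subject

    for component in ET.fromstring(xsd_text):
        if component.tag.startswith(XS):
            visit(component)
    return axioms


# Hypothetical input built from the poster's fragments.
SAMPLE = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="VariableName" type="NameType"/>
  <xs:attribute ref="lang"/>
  <xs:extension>
    <xs:attribute name="translated"/>
    <xs:attribute name="translatable"/>
  </xs:extension>
</xs:schema>
"""

if __name__ == "__main__":
    for axiom in xsd_to_axioms(SAMPLE):
        print(axiom)
```

For the sample input, the returned list contains the five axioms shown on the poster, plus the analogous name axioms for the two attributes inside the extension. Keeping the axioms as strings avoids committing to a particular OWL library; serializing them to RDF would be a separate step.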