A Taxonomy of Semantic Web data Retrieval Techniques. Anila Sahar Butt, Armin Haller, Lexing Xie. The Australian National University, Australia.

Apr 15, 2017

Transcript
Page 1: A Taxonomy of Semantic Web data Retrieval Techniques

A Taxonomy of Semantic Web data Retrieval Techniques

Anila Sahar Butt, Armin Haller, Lexing Xie

The Australian National University, Australia.

Page 2: A Taxonomy of Semantic Web data Retrieval Techniques

Introduction

The Semantic Web
• Provides access to an increasing amount of structured information in a wide variety of domains
• Information overload due to the large amount of structured data is as much a problem as on the traditional Web
• Many Semantic Web data retrieval (SWR) techniques have been proposed

Page 3: A Taxonomy of Semantic Web data Retrieval Techniques

Introduction

The questions:
• Is the field of Semantic Web data retrieval making progress?
• What are the directions that have been taken?
• What are some promising directions for future research?

Page 4: A Taxonomy of Semantic Web data Retrieval Techniques

Semantic Web Data Retrieval Approaches

o Ontology Retrieval Techniques: Swoogle, BioPortal, AKTiveRank, LOV, OntoSearch2, OBO, OntoSelect, WATSON, OntoKhoj

o Linked/RDF data Retrieval Techniques: Sindice, Sig.ma, SWSE, SemRank, LTR

o Graph/Structured data Retrieval Techniques: SLQ, OSQ, Lindex, NeMa, BLINKS, SSSGD

Page 5: A Taxonomy of Semantic Web data Retrieval Techniques

Overview of Semantic Web retrieval process

Components: Data Acquisition, Data Warehousing, Reasoning, Indexing, Ranking, Query Evaluation, User Interface

Phases: Pre-processing, Online-processing

Page 6: A Taxonomy of Semantic Web data Retrieval Techniques

Semantic Web data Retrieval Techniques
• Retrieval Aspects: Scope, Query Model, Results Type
• Storage & Search: Data Acquisition, Data Storage, Indexing, Query Match
• Ranking: Ranking Scope, Ranking Factor, Ranking Domain
• Evaluation: Efficiency, Effectiveness, Scalability
• Practical Aspects: Implementation, Datasets, User Interface

Page 7: A Taxonomy of Semantic Web data Retrieval Techniques

Semantic Web data Retrieval Techniques: Retrieval Aspects

Scope: Type(s) of the data that can be explored with the approach
• Ontologies
• Linked/RDF Data
• Graph/Structured Data

Query Model: Way(s) a user can initiate the retrieval process
• Keyword search
• Structured query search
• Faceted browsing
• Hyperlink-based navigation

Results Type: Type(s) of the output as a result of a user's query
• Relation centric
• Entity centric
• Document centric
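Of the query models listed above, faceted browsing is perhaps the least self-explanatory. The following is a minimal sketch of the idea, not any surveyed system's implementation: the entity URIs, property names and the helper functions `facet_counts` and `apply_facet` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical entity descriptions (assumption): each entity maps
# property names to a single value.
ENTITIES = {
    "ex:alice": {"type": "Person", "country": "Australia"},
    "ex:bob": {"type": "Person", "country": "Ireland"},
    "ex:anu": {"type": "Organization", "country": "Australia"},
}


def facet_counts(entities, prop):
    """Count how many entities carry each value of one property (a facet)."""
    return Counter(desc[prop] for desc in entities.values() if prop in desc)


def apply_facet(entities, prop, value):
    """Keep only the entities whose property equals the chosen facet value."""
    return {uri: desc for uri, desc in entities.items() if desc.get(prop) == value}


if __name__ == "__main__":
    # A user first sees the available facet values, then narrows the result set.
    print(facet_counts(ENTITIES, "country"))
    print(sorted(apply_facet(ENTITIES, "type", "Person")))
```

The user never types a query here: the interface offers the counted values and each click applies one more filter, which is what distinguishes this model from keyword or structured query search.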


Page 9: A Taxonomy of Semantic Web data Retrieval Techniques

Semantic Web data Retrieval Techniques: Storage & Search

Data Acquisition: Way(s) and rule(s) of the data collection
• Manual collection
• HTML agnostic crawlers
• HTML aware crawlers
• Focussed crawlers

Data Storage: Type(s) of storage structures
• Relational databases
• Native storage
• NoSQL databases

Indexing: Type(s) of index structures
• No index
• Full text index
• Structural index
• Graph index
• Multi-level indexing

Query Match: Way(s) to find query matches in the data collection
• Exact Match
• Partial Match
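To make the "Full text index" and "Exact Match / Partial Match" categories concrete, here is a small sketch under stated assumptions. The triples are invented, and treating "exact" as "all query tokens present" and "partial" as "at least one token present" is only one possible reading of the two categories for keyword queries; graph-oriented techniques would define them over structure instead.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, literal) triples standing in for a
# crawled data collection (assumption, for illustration only).
TRIPLES = [
    ("ex:doc1", "rdfs:label", "Semantic Web search engine"),
    ("ex:doc2", "rdfs:label", "Graph search with label similarity"),
    ("ex:doc3", "rdfs:label", "Ontology library"),
]


def build_index(triples):
    """Full-text index: lower-cased token -> subjects whose literal contains it."""
    index = defaultdict(set)
    for subject, _predicate, literal in triples:
        for token in literal.lower().split():
            index[token].add(subject)
    return index


def exact_match(index, query):
    """Subjects whose literals contain every query token."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    return set.intersection(*[index.get(t, set()) for t in tokens])


def partial_match(index, query):
    """Subjects whose literals contain at least one query token."""
    tokens = query.lower().split()
    return set().union(*[index.get(t, set()) for t in tokens])


if __name__ == "__main__":
    idx = build_index(TRIPLES)
    print(exact_match(idx, "graph search"))
    print(partial_match(idx, "graph search"))
```

Partial matching returns a superset of the exact matches, which is why it is usually paired with a ranking step.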


Page 11: A Taxonomy of Semantic Web data Retrieval Techniques

Semantic Web data Retrieval Techniques: Ranking

Ranking Scope: Query dependence of a ranking model
• Global
• Focus

Ranking Factor: Factor(s) based on which the ranks are calculated
• No Ranking
• Popularity
• Authority
• Informativeness
• Relatedness
• Coverage
• Centrality
• Learned model
• Feedback

Ranking Domain: Original domain of a ranking model
• Semantic web
• Graph databases
• Document retrieval
• Machine learning
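The difference between a global and a focused ranking scope can be sketched with a link-based popularity score. The example below uses a plain power-iteration PageRank, a model borrowed from document retrieval; it is an illustrative stand-in, not the ranking model of any technique in the survey. The link graph and the function names `pagerank` and `focused_rank` are assumptions, and reading "Focus" as "rank computed only over the query's matches" is an interpretation.

```python
# Hypothetical link graph (assumption): node -> list of distinct link targets.
LINKS = {"ex:a": ["ex:c"], "ex:b": ["ex:c"], "ex:c": ["ex:a"]}


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> distinct link targets."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread over all nodes.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        rank = {
            node: (1 - damping) / n
            + damping * (dangling / n + sum(
                rank[src] / len(targets)
                for src, targets in links.items() if node in targets))
            for node in nodes
        }
    return rank


def focused_rank(links, matches):
    """Query-dependent variant: rank only the subgraph induced by the matches."""
    sub = {node: [t for t in links.get(node, []) if t in matches]
           for node in matches}
    return pagerank(sub)


if __name__ == "__main__":
    print(pagerank(LINKS))                        # global: computed once, query-independent
    print(focused_rank(LINKS, {"ex:a", "ex:c"}))  # focus: computed per result set
```

A global score can be precomputed during pre-processing, whereas a focused score has to be computed at query time, which is one source of the effectiveness versus efficiency tension discussed later in the deck.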


Page 13: A Taxonomy of Semantic Web data Retrieval Techniques

Semantic Web data Retrieval Techniques: Evaluation

Efficiency: Time taken to retrieve the relevant results
• Query execution time
• Index construct time
• Index update time

Effectiveness: Correctness of the retrieved or relevant results
• Recall
• Precision
• F-Measure
• MAP
• NDCG

Scalability: Flexibility of the approach
• Data size
• Data complexity
• Query size
• Query complexity
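The effectiveness measures named above have standard textbook definitions, sketched below for reference. The function names and the toy result list are assumptions; the F-Measure shown is the balanced F1 variant, and NDCG uses the common log2 discount, which may differ from the variant a given paper reports.

```python
import math


def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and balanced F-measure (F1)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def average_precision(ranked, relevant):
    """Mean of precision@k over the positions k that hold a relevant item."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """MAP: average of average_precision over (ranked, relevant) query pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs) if runs else 0.0


def ndcg(ranked, gains):
    """NDCG with graded gains (item -> gain) and a log2 position discount."""
    def dcg(values):
        return sum(g / math.log2(i + 1) for i, g in enumerate(values, start=1))
    actual = dcg([gains.get(item, 0) for item in ranked])
    ideal = dcg(sorted(gains.values(), reverse=True)[:len(ranked)])
    return actual / ideal if ideal else 0.0


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d"]   # hypothetical system output for one query
    relevant = {"a", "c"}           # hypothetical ground truth
    print(precision_recall_f1(ranked, relevant))
    print(average_precision(ranked, relevant))
    print(ndcg(ranked, {"a": 1, "c": 1}))
```

Precision, recall and F-Measure ignore result order, while MAP and NDCG reward placing relevant results early, so the latter two are the more informative choice for ranked retrieval.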


Page 15: A Taxonomy of Semantic Web data Retrieval Techniques

Semantic Web data Retrieval Techniques: Practical Aspects

Implementation: Programming language(s) adopted
• Java
• Python
• C#
• C

Datasets: Type of dataset used for implementation or evaluation
• Synthetic
• Real

User Interface: Way(s) of user interaction
• GUI
• API

Page 16: A Taxonomy of Semantic Web data Retrieval Techniques

Survey of Existing SWR Techniques

o Ontology Retrieval Techniques: Swoogle, BioPortal, AKTiveRank, LOV, OntoSearch2, OBO, OntoSelect, WATSON, OntoKhoj

o Linked/RDF data Retrieval Techniques: Sindice, Sig.ma, SWSE, SemRank, LTR

o Graph/Structured data Retrieval Techniques: SLQ, OSQ, Lindex, NeMa, BLINKS, SSSGD

Page 17: A Taxonomy of Semantic Web data Retrieval Techniques

Categorization of Prominent Semantic Web Retrieval Techniques 1/2

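Columns, grouped by aspect:
Search Aspect: 1 Search Scope, 2 Query Model, 3 Result Type
Storage and Search: 4 Data Acquisition, 5 Data Storage, 6 Indexing, 7 Query Match
Ranking: 8 Ranking Scope, 9 Ranking Factor, 10 Ranking Domain
Evaluation: 11 Efficiency, 12 Effectiveness, 13 Scalability
Practical Aspect: 14 Implementation, 15 Datasets, 16 User Interface

Technique        | 1   | 2   | 3 | 4    | 5   | 6   | 7 | 8 | 9      | 10  | 11    | 12    | 13    | 14 | 15  | 16
LOV [16]         | O   | K,S | D | M    | N   | F   | E | G | P      | D   | -     | -     | -     | J  | R   | G,A
BioPortal [10]   | O   | K,F | D | M    | -   | -   | E | G | F      | D   | -     | -     | -     | J  | R   | G,A
OntoSearch2 [14] | O   | K,F | D | M    | N   | S   | P | G | Co     | -   | Q     | -     | DS    | -  | R   | G
OBO [13]         | O   | H   | D | M    | -   | N   | - | - | -      | -   | -     | -     | -     | -  | R   | G
OntoSelect [3]   | O   | K,F | D | M,AG | -   | -   | - | - | -      | D   | -     | -     | -     | -  | R   | G,A
Swoogle [6]      | O,L | K   | D | AG   | R   | F   | P | G | P      | D   | Q     | -     | -     | J  | R   | G
WATSON [5]       | O,L | K,S | D | AG   | N   | S,F | E | - | N      | D   | -     | -     | -     | J  | R   | G,A
OntoKhoj [12]    | O   | K   | D | AG   | R   | F   | - | G | P      | D   | -     | F     | -     | -  | R   | G,A
AKTiveRank [1]   | O   | K   | D | -    | -   | -   | P | F | P,Co,C | G,D | -     | P     | -     | -  | -   | -
Sindice [11]     | L   | K   | D | F    | NoS | F   | P | F | Co     | D   | Q,C,U | -     | DS    | J  | R   | G,A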

Page 18: A Taxonomy of Semantic Web data Retrieval Techniques

Categorization of Prominent Semantic Web Retrieval Techniques 2/2

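Columns, grouped by aspect:
Search Aspect: 1 Search Scope, 2 Query Model, 3 Result Type
Storage and Search: 4 Data Acquisition, 5 Data Storage, 6 Indexing, 7 Query Match
Ranking: 8 Ranking Scope, 9 Ranking Factor, 10 Ranking Domain
Evaluation: 11 Efficiency, 12 Effectiveness, 13 Scalability
Practical Aspect: 14 Implementation, 15 Datasets, 16 User Interface

Technique        | 1   | 2   | 3 | 4    | 5   | 6   | 7 | 8 | 9      | 10  | 11    | 12    | 13    | 14 | 15  | 16
Sig.ma [15]      | L   | K   | E | -    | NoS | F   | P | F | A,Co   | D   | Q     | -     | -     | J  | R   | G,A
SWSE [8]         | L   | K   | E | AG   | N   | S   | P | F | A,P    | D   | Q,C,U | -     | DS    | J  | R   | G
SemRank [2]      | L   | -   | R | M    | -   | -   | E | F | -      | M   | -     | -     | -     | -  | S   | -
LTR [4]          | L   | K   | E | M    | -   | -   | - | G | L      | M   | -     | N     | -     | -  | R   | -
SLQ [19]         | G   | K   | D | M    | N   | S,F | P | F | C      | M   | Q     | P,M,N | DS    | J  | R,S | -
OSQ [17]         | G   | S   | D | M    | N   | G   | P | - | -      | -   | Q,C   | P,M,N | DS    | -  | R,S | -
Lindex [20]      | G   | S   | D | M    | N,M | G   | P | - | -      | -   | Q,C,U | -     | DS    | -  | R,S | -
NeMa [9]         | G   | S   | D | M    | N   | G   | P | - | -      | -   | Q,C,U | P,R,F | DS    | -  | R   | -
BLINKS [7]       | G   | K   | D | M    | N   | M   | P | F | C      | G   | Q,C   | -     | DS    | -  | R   | -
SSSGD [18]       | G   | S   | D | M    | N   | G   | P | - | -      | -   | Q     | -     | QS,DS | -  | R,S | -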

Page 19: A Taxonomy of Semantic Web data Retrieval Techniques

Discussion and Research Directions 1/3

o Dynamic Faceted Browsing
Challenge: Syntactic diversity in describing the same property. A title of a resource can be described as a name, a title, a label, etc.
Solution: Clustering similar types of properties into a single group

o Ontology Retrieval
Challenge: Discovering the most related vocabularies for a query string
Solution: Find relevant ontologies that cover most of the query terms or related concepts to these terms
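The ontology retrieval direction above amounts to scoring each ontology by how much of the query it covers. The sketch below shows only the simplest form of that idea, exact label overlap; it does not handle the "related concepts" part, and the ontology names, labels and the functions `coverage` and `rank_by_coverage` are hypothetical.

```python
# Hypothetical ontologies (assumption): name -> labels of its classes and properties.
ONTOLOGIES = {
    "ex:people": ["Person", "name", "Organization"],
    "ex:pubs": ["Document", "title", "author", "Person"],
    "ex:geo": ["Place"],
}


def coverage(query_terms, labels):
    """Fraction of the query terms that appear among an ontology's labels."""
    terms = {t.lower() for t in query_terms}
    known = {label.lower() for label in labels}
    return len(terms & known) / len(terms) if terms else 0.0


def rank_by_coverage(query_terms, ontologies):
    """Ontologies covering at least one term, best coverage first, ties by name."""
    scored = [(coverage(query_terms, labels), name)
              for name, labels in ontologies.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ordered if score > 0]


if __name__ == "__main__":
    print(rank_by_coverage(["person", "author", "title"], ONTOLOGIES))
```

Extending this toward the stated solution would mean expanding each query term with related concepts before computing the overlap.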

Page 20: A Taxonomy of Semantic Web data Retrieval Techniques

Discussion and Research Directions 2/3

o Ontology Ranking Models
Challenge: Match of a search term with a more expressive class, property or ontology description
Solution: Ranking models that consider design perspective, level of detail, and extension in ontologies

o Linked data retrieval: Effectiveness vs. Efficiency
Challenge: Effectiveness-focussed techniques vs. efficiency-focussed techniques
Solution: A reasonable trade-off between effectiveness and efficiency

Page 21: A Taxonomy of Semantic Web data Retrieval Techniques

Discussion and Research Directions 3/3

o Ranking of triples for entity retrieval approaches
Challenge: The rank of a property depends on the entity it belongs to, and the rank of the object values of a multivalued property likewise depends on the entity to which the property belongs
Solution: Ranking of triples for the entity to prioritize relevant attributes and object values of that entity

o An evaluation framework for Semantic Web data retrieval techniques
Challenge: Comparative evaluation of different SWR techniques with regard to their effectiveness, efficiency and scalability
Solution: Conducting comprehensive comparative experimental studies

Page 22: A Taxonomy of Semantic Web data Retrieval Techniques

Conclusion

o An overview of Semantic Web data retrieval
o A taxonomy for Semantic Web data retrieval techniques
o Categorization of Prominent Semantic Web Retrieval techniques
o Future research directions

Page 23: A Taxonomy of Semantic Web data Retrieval Techniques

Thanks!

o For more details

Anila Sahar Butt, Armin Haller, Lexing Xie, “A Taxonomy of Semantic Web Data Retrieval Techniques” In the Proceedings of the 8th International Conference on Knowledge Capture, Palisades, NY, USA, 7th – 10th October 2015

o Contact [email protected]@anu.edu.au

Page 24: A Taxonomy of Semantic Web data Retrieval Techniques

References

[1] H. Alani, C. Brewster, and N. Shadbolt. Ranking ontologies with aktiverank. In The Semantic Web-ISWC 2006, pages 1–15. Springer, 2006.

[2] K. Anyanwu, A. Maduko, and A. Sheth. Semrank: ranking complex relationship search results on the semantic web. In Proceedings of the 14th international conference on World Wide Web, pages 117–127. ACM, 2005.

[3] P. Buitelaar, T. Eigner, and T. Declerck. Ontoselect: A dynamic ontology library with support for ontology selection. In Proceedings of the Demo Session at the International Semantic Web Conference. Citeseer, 2004.

[4] L. Dali, B. Fortuna, T. T. Duc, and D. Mladenić. Query-independent learning to rank for rdf entity search. In The Semantic Web: Research and Applications, pages 484–498. Springer, 2012.

[5] M. d’Aquin and E. Motta. Watson, more than a semantic web search engine. Semantic Web, 2(1):55–63, 2011.

[6] L. Ding, T. Finin, A. Joshi, R. Pan, R. S. Cost, Y. Peng, P. Reddivari, V. Doshi, and J. Sachs. Swoogle: a search and metadata engine for the semantic web. In Proceedings of the thirteenth ACM international conference on Information and knowledge management, pages 652–659. ACM, 2004.

[7] H. He, H. Wang, J. Yang, and P. S. Yu. Blinks: Ranked keyword searches on graphs. In Proceedings of the 2007 ACM SIGMOD international conference on Management of data, pages 305–316. ACM, 2007.

[8] A. Hogan, A. Harth, J. Umbrich, S. Kinsella, A. Polleres, and S. Decker. Searching and browsing linked data with swse: The semantic web search engine. Web semantics: science, services and agents on the world wide web, 9(4):365–401, 2011.

[9] A. Khan, Y. Wu, C. C. Aggarwal, and X. Yan. Nema: Fast graph search with label similarity. Proceedings of the VLDB Endowment, 6(3):181–192, 2013.

[10] N. F. Noy, N. H. Shah, P. L. Whetzel, B. Dai, M. Dorf, N. Griffith, C. Jonquet, D. L. Rubin, M.-A. Storey, C. G. Chute, et al. Bioportal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research, 37(suppl 2):W170–W173, 2009.

Page 25: A Taxonomy of Semantic Web data Retrieval Techniques

References

[11] E. Oren, R. Delbru, M. Catasta, R. Cyganiak, H. Stenzhorn, and G. Tummarello. Sindice.com: A document-oriented lookup index for open linked data. International Journal of Metadata, Semantics and Ontologies, 3(1):37–52, 2008.

[12] C. Patel, K. Supekar, Y. Lee, and E. Park. Ontokhoj: a semantic web portal for ontology searching, ranking and classification. In Proceedings of the 5th ACM international workshop on Web information and data management, pages 58–61. ACM, 2003.

[13] B. Smith, M. Ashburner, C. Rosse, J. Bard, W. Bug, W. Ceusters, L. J. Goldberg, K. Eilbeck, A. Ireland, C. J. Mungall, et al. The obo foundry: coordinated evolution of ontologies to support biomedical data integration. Nature biotechnology, 25(11):1251–1255, 2007.

[14] E. Thomas, J. Z. Pan, and D. Sleeman. Ontosearch2: Searching ontologies semantically. In Proceedings of the OWLED 2007 Workshop on OWL: Experiences and Directions, volume 258 of CEUR Workshop Proceedings, 2007.

[15] G. Tummarello, R. Cyganiak, M. Catasta, S. Danielczyk, R. Delbru, and S. Decker. Sig.ma: Live views on the web of data. Web Semantics: Science, Services and Agents on the World Wide Web, 8(4):355–364, 2010.

[16] P.-Y. Vandenbussche and B. Vatant. Linked Open Vocabularies. ERCIM news, 96:21–22, 2014.

[17] Y. Wu, S. Yang, and X. Yan. Ontology-based subgraph querying. In Data Engineering (ICDE), 2013 IEEE 29th International Conference on, pages 697–708. IEEE, 2013.

[18] X. Yan, P. S. Yu, and J. Han. Substructure similarity search in graph databases. In Proceedings of the 2005 ACM SIGMOD international conference on Management of data, pages 766–777. ACM, 2005.

[19] S. Yang, Y. Wu, H. Sun, and X. Yan. Schemaless and structureless graph querying. Proceedings of the VLDB Endowment, 7(7), 2014.

[20] D. Yuan and P. Mitra. Lindex: a lattice-based index for graph databases. The VLDB Journal, 22(2):229–252, 2013.