Publishing Technology Online Forum - Engineering the semantic web
Priya Parvatikar from Publishing Technology demonstrates how the company is implementing semantic web technologies in the online publishing website of new publisher GSE Research.
Transcript
innovation. quality. service
“Enabling clients to realize the full potential of their content and increase efficiency throughout their enterprise.”
Engineering technology to deliver the revolution
Presentation to Online Publishers’ forum
November 29, 2011
Priya Parvatikar, Technical Architect
About this talk
• Features of the GSE Research website
• Overview of how the features have been achieved
• ‘Under the hood’ look at the technology
Improved search - Enhancing auto-suggest
Using taxonomy information for “did you mean”
Boosting relevant results
Guiding the user through facets
Guiding the user through suggestions
Concept homepages
Showing concepts on item homepages
Suggest related items
GSE Research – How?
• Built using the pub2web platform
• MetaStore used for metadata storage
• Apache Solr used for search indexing
• Semantic enrichment of content
• Apache UIMA used for entity extraction
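As a rough illustration of what entity extraction against a taxonomy involves, here is a minimal dictionary-lookup sketch in Python. It is not UIMA and not the pipeline used on the site; the `CONCEPTS` table and the `extract_concepts` function are hypothetical names introduced only for this example, with labels borrowed from the talk's pollution example.

```python
import re

# Hypothetical mini-taxonomy: preferred label -> surface forms to look for.
# The labels come from the talk's example; the data structure is an assumption.
CONCEPTS = {
    "Water pollution": ["water pollution", "water contamination"],
    "Air pollution": ["air pollution"],
}

def extract_concepts(text):
    """Return (concept, start, end) for every taxonomy term found in text."""
    hits = []
    for concept, forms in CONCEPTS.items():
        for form in forms:
            # Whole-word, case-insensitive match of the surface form.
            pattern = r"\b" + re.escape(form) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                hits.append((concept, m.start(), m.end()))
    # Order annotations by their position in the text.
    return sorted(hits, key=lambda h: h[1])

sample = "Air pollution and water contamination are closely linked."
annotations = extract_concepts(sample)
```

Each annotation carries character offsets, so the matched span can be linked back to the concept it was tagged with.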
MetaStore
• RDF triplestore for storing metadata
• Agnostic to the type of data being stored
• Able to store rich and very granular data
• Flexible to cater for future data enhancements
For the GSE Research site:
• Content
• Authors
• Taxonomy concepts and relations
• Federation of data from external datasets
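To make the triplestore idea concrete, the sketch below models metadata as subject-predicate-object triples in plain Python. It is an illustration only, not the MetaStore API: `TinyTripleStore`, the `ex:` identifiers, and the choice of the `dcterms:` and `skos:` vocabularies are all assumptions made for this example.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")

class TinyTripleStore:
    """Hypothetical in-memory store; a real triplestore would persist and index."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add(Triple(s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (s is None or t.subject == s)
            and (p is None or t.predicate == p)
            and (o is None or t.object == o)
        )

store = TinyTripleStore()
# Content, authors and taxonomy relations all share the same triple shape.
store.add("ex:article-1", "rdf:type", "ex:Article")
store.add("ex:article-1", "dcterms:creator", "ex:author-1")
store.add("ex:article-1", "dcterms:subject", "ex:water-pollution")
store.add("ex:water-pollution", "skos:broader", "ex:climate-change-pollution-impacts")

# Everything tagged with the water pollution concept.
about_water = store.match(p="dcterms:subject", o="ex:water-pollution")
```

Because every fact has the same shape, a new kind of data can be added as new predicates without changing the store, which is one way to read the "agnostic" and "flexible" points above.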
Search
• Uses enterprise-grade Apache Solr
• Inbuilt support for rich features:
  • Faceted searching
  • Synonyms
  • Stemming
  • Boosting
  • ‘More like this’
  • ‘Did you mean’
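Several of these features are switched on per request through standard Solr query parameters. The sketch below builds such a query string in Python. The field names `concept` and `author`, the boost value, and the `build_solr_query` helper are assumptions for illustration and are not taken from the GSE Research configuration. Synonyms and stemming do not appear here because Solr applies them through the analysis chain defined in the schema, and spellcheck only returns suggestions if the request handler has a spellcheck component configured.

```python
from urllib.parse import urlencode, parse_qs

def build_solr_query(user_query, facet_fields, boost_query=None):
    """Build a Solr query string with faceting, spellcheck and an optional boost."""
    params = [
        ("q", user_query),
        ("defType", "edismax"),          # query parser that supports bq
        ("facet", "true"),               # faceted searching
        ("spellcheck", "true"),          # 'did you mean'
        ("spellcheck.collate", "true"),  # ask for a corrected whole query
    ]
    # One facet.field parameter per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    if boost_query:
        params.append(("bq", boost_query))  # boosting relevant results
    return urlencode(params)

# Hypothetical request: a misspelt query, faceted by concept and author,
# with documents tagged "Water pollution" boosted.
query_string = build_solr_query(
    "water polution",
    ["concept", "author"],
    boost_query='concept:"Water pollution"^5',
)
decoded = parse_qs(query_string)
```

The resulting string would be appended to a core's select handler URL; `decoded` simply shows the parameters round-trip intact.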
Content for GSE Research website
Provided by GSE
• Content XML
• Taxonomy prepared by GSE
Taxonomy enhancement
• Concepts mapped to Library of Congress classifications
• Taxonomy automatically enhanced with terms from this classification
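The enhancement step can be pictured as a merge: each concept that has a mapping picks up the terms of the classification entry it maps to. The Python sketch below is a guess at the general shape, not the process actually used. The `enhance_taxonomy` function and the `CLASS-A` identifier are hypothetical; the variant terms are the ones the talk gives for water pollution.

```python
def enhance_taxonomy(taxonomy, mapping, classification_terms):
    """Return a copy of taxonomy with variant terms merged in from a classification."""
    enhanced = {}
    for concept, variants in taxonomy.items():
        merged = list(variants)
        class_id = mapping.get(concept)  # None when the concept is unmapped
        for term in classification_terms.get(class_id, []):
            # Skip the concept's own label and avoid duplicate variants.
            if term.lower() != concept.lower() and term not in merged:
                merged.append(term)
        enhanced[concept] = merged
    return enhanced

# Concepts as supplied by the publisher, with no variants yet.
taxonomy = {"Water pollution": [], "Air pollution": []}
# Hypothetical mapping to a classification entry (placeholder identifier).
mapping = {"Water pollution": "CLASS-A"}
# Terms attached to that entry; values taken from the talk's example.
classification_terms = {"CLASS-A": ["aquatic pollution", "water contamination"]}

enhanced = enhance_taxonomy(taxonomy, mapping, classification_terms)
```

Unmapped concepts pass through unchanged, so the enhancement can be rerun as more mappings are added.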
GSE Research taxonomy - example
For example, the GSE taxonomy contains:
Climate change, pollution & environmental impacts
  Water pollution
  Air pollution
After enhancing with Library of Congress classification:
Climate change, pollution & environmental impacts
  Water pollution – variants: aquatic pollution, water contamination