Strategies for subject navigation of linked Web sites using RDF topic maps

Page 1: Strategies for subject navigation of linked Web sites using RDF topic maps

Strategies for subject navigation of linked Web sites using RDF topic maps

Carol Jean Godby

Devon Smith

OCLC Online Computer Library Center

Knowledge Technologies 2002 – Seattle, WA

Page 2: Strategies for subject navigation of linked Web sites using RDF topic maps

Complex Web sites

Many institutions are struggling to solve problems with their official Web sites.

But:
    The contents constantly change.
    The editors can’t exercise sufficient control.

One result: an institution’s major presence on the Web is difficult to navigate.

Page 3: Strategies for subject navigation of linked Web sites using RDF topic maps

The Semantic Web

Tim Berners-Lee’s vision:

“The current Web has documents for people, not computers. By augmenting Web pages with data designed for automated processing, users will transform the Web into the Semantic Web.”

“Computers will find the meaning of semantic data by following hyperlinks to definitions of key terms and rules for reasoning about them logically.”

Page 4: Strategies for subject navigation of linked Web sites using RDF topic maps

The Semantic Web: An Architecture

[Figure: layered architecture diagram. Layers, bottom to top: Unicode and URI; XML + XML namespaces + XML schema; RDF + RDF schema; Ontology vocabulary; Logic; Proof; Trust. Additional labels: Digital signature, Self-describing documents, Data, Rules.]

Source: Tim Berners-Lee

Page 5: Strategies for subject navigation of linked Web sites using RDF topic maps

The promise of the Semantic Web

A common data model

Conceptual links

Limited inferences

Page 6: Strategies for subject navigation of linked Web sites using RDF topic maps

Our demo: goals

Represent subject/topic information obtained from different sources.

Demonstrate the value of hypothetical metadata-based navigation for a collection of related Web sites:
    oclc.org
    Portions of w3c.org
    dublincore.org

Develop and evaluate the utility of Open Source prototyping tools based on RDF.

Page 7: Strategies for subject navigation of linked Web sites using RDF topic maps

Some common topics

[Figure: diagram of topics found across oclc.org, w3c.org, and dublincore.org. Topics shown: digital library, xml, dublin core, xml namespace, xml schema, metadata, xml fragment, xml stylesheet, element node, dc element syntax, library automation, classification, traditional library, library users, library network, xml profile, schema processor, uri syntax.]

Page 8: Strategies for subject navigation of linked Web sites using RDF topic maps

Sources of subject/topic metadata

HTML keywords

Subject lines in email messages

An index of library/information science terms

Terms extracted automatically from text using natural-language-processing algorithms

Page 9: Strategies for subject navigation of linked Web sites using RDF topic maps

Some term relationships

Singular/Plural: library, libraries

Acronyms:
    Standard Generalized Markup Language -- SGML
    Library of Congress Subject Headings -- LCSH

Coordination:
    library and information science -- library science, information science
    information storage and retrieval -- information storage, information retrieval

Broad/Narrow:
    computational linguistics -- linguistics
    classification scheme -- classification

Type-of: library -- digital library, traditional library

Related: library -- library classification scheme, library automation
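As a rough illustration of how one of these relationships could be detected automatically, the sketch below derives an acronym from a capitalized phrase. The function names `acronym` and `is_acronym_of` are hypothetical and are not taken from the authors' system; the rule (keep the first letter of each capitalized word) is an assumption that happens to fit the two examples on the slide.

```python
def acronym(phrase):
    """Build an acronym from the initial letters of capitalized words.

    Lowercase function words such as "of" are skipped, which is an
    assumed rule, not one stated in the presentation.
    """
    return "".join(word[0] for word in phrase.split() if word[0].isupper())


def is_acronym_of(short_form, phrase):
    """Return True when short_form matches the derived acronym."""
    return short_form.upper() == acronym(phrase)


pairs = [
    ("SGML", "Standard Generalized Markup Language"),
    ("LCSH", "Library of Congress Subject Headings"),
]
matches = [is_acronym_of(short, full) for short, full in pairs]
```

A rule this simple would miss acronyms that use interior letters or lowercase words, so a real system would likely need a lookup table as well.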

Page 10: Strategies for subject navigation of linked Web sites using RDF topic maps

An RDF encoding

<Topic rdf:about="http://purl.org/rdf/topics/classification">
  <name>classification</name>
  <related_concepts rdf:resource="http://purl.org/rdf/topics/classification_codes"/>
  <related_concepts rdf:resource="http://purl.org/rdf/topics/classification_number"/>
  <types_of rdf:resource="http://purl.org/rdf/topics/automatic_classification"/>
  <types_of rdf:resource="http://purl.org/rdf/topics/library_classification"/>
  <coordinate rdf:resource="http://purl.org/rdf/topics/resource_discovery_and_classification"/>
  <coordinate rdf:resource="http://purl.org/rdf/topics/classification_and_knowledge"/>
</Topic>
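A Topic record of this shape could be generated from a term and its relationships. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper names `topic_uri` and `build_topic` are hypothetical, and the rule of replacing spaces with underscores in the URI is an assumption inferred from identifiers such as `classification_codes`.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BASE = "http://purl.org/rdf/topics/"  # base URI as shown on the slide


def topic_uri(term):
    """Map a term to a topic URI (assumed rule: spaces become underscores)."""
    return BASE + term.strip().lower().replace(" ", "_")


def build_topic(name, relations):
    """Create a <Topic> element; relations maps property name -> list of terms."""
    topic = ET.Element("Topic", {f"{{{RDF_NS}}}about": topic_uri(name)})
    ET.SubElement(topic, "name").text = name
    for prop, terms in relations.items():
        for term in terms:
            ET.SubElement(topic, prop, {f"{{{RDF_NS}}}resource": topic_uri(term)})
    return topic


ET.register_namespace("rdf", RDF_NS)
topic = build_topic(
    "classification",
    {
        "related_concepts": ["classification codes"],
        "types_of": ["automatic classification", "library classification"],
    },
)
xml_text = ET.tostring(topic, encoding="unicode")
print(xml_text)
```

The property names (`related_concepts`, `types_of`) are copied from the slide; a namespace for the Topic vocabulary itself is omitted because the presentation does not give one.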

Page 11: Strategies for subject navigation of linked Web sites using RDF topic maps

Connected RDF encodings

<Topic rdf:about="http://purl.org/rdf/topics/resource_discovery">
  <name>resource discovery</name>
  <broad_concepts rdf:resource="http://purl.org/rdf/topics/resource"/>
</Topic>

<Topic rdf:about="http://purl.org/rdf/topics/resource">
  <name>resource</name>
  <related_concepts rdf:resource="http://purl.org/rdf/topics/resource_discovery"/>
  <types_of rdf:resource="http://purl.org/rdf/topics/resource_description_framework"/>
  <related rdf:resource="http://purl.org/rdf/topics/web_resource"/>
</Topic>
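Because each `rdf:resource` points at another topic's `rdf:about` URI, the records form a graph that navigation code can walk. The sketch below parses two such records and follows a link from one topic to the next. It assumes the records are wrapped in an `rdf:RDF` element, and the default namespace `http://example.org/topic-schema#` is a placeholder invented here, since the slides do not show the namespace declarations.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TOPIC_NS = "http://example.org/topic-schema#"  # hypothetical namespace

DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns="{TOPIC_NS}">
  <Topic rdf:about="http://purl.org/rdf/topics/resource_discovery">
    <name>resource discovery</name>
    <broad_concepts rdf:resource="http://purl.org/rdf/topics/resource"/>
  </Topic>
  <Topic rdf:about="http://purl.org/rdf/topics/resource">
    <name>resource</name>
    <related_concepts rdf:resource="http://purl.org/rdf/topics/resource_discovery"/>
    <types_of rdf:resource="http://purl.org/rdf/topics/resource_description_framework"/>
  </Topic>
</rdf:RDF>"""


def load_topics(xml_text):
    """Return {topic URI: {"name": str, "links": [(property, target URI)]}}."""
    root = ET.fromstring(xml_text)
    topics = {}
    for topic in root.findall(f"{{{TOPIC_NS}}}Topic"):
        uri = topic.get(f"{{{RDF_NS}}}about")
        entry = {"name": None, "links": []}
        for child in topic:
            prop = child.tag.split("}", 1)[1]  # strip the namespace part
            target = child.get(f"{{{RDF_NS}}}resource")
            if prop == "name":
                entry["name"] = child.text
            elif target is not None:
                entry["links"].append((prop, target))
        topics[uri] = entry
    return topics


topics = load_topics(DOC)
start = "http://purl.org/rdf/topics/resource_discovery"
# Follow the first link from "resource discovery" to its broader concept.
prop, target = topics[start]["links"][0]
broader_name = topics[target]["name"]
```

A link whose target has no record of its own (here, `resource_description_framework`) simply has no entry in the dictionary, so a navigation interface would need to decide how to display such dangling references.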

Page 12: Strategies for subject navigation of linked Web sites using RDF topic maps

A graphical representation of relationships

[Figure: graph of the topics classification, classification codes, automatic classification, resource discovery and classification, resource discovery, resource, resource description framework, and rdf. Edge labels: Coordination, Broad/Narrow, Type_of, Related, Acronym.]

Page 13: Strategies for subject navigation of linked Web sites using RDF topic maps

The philosophy of our system

Modular

Open Source

Project Web site accessible at: topicmap.oclc.org:5000

Page 14: Strategies for subject navigation of linked Web sites using RDF topic maps

System architecture: 1

[Figure: processing pipeline. Stages: Extract terms, Filter terms, Structure terms. Other labels: Normalized HTML data, RDF graph.]

Page 15: Strategies for subject navigation of linked Web sites using RDF topic maps

Term filters: using knowledge encoded in the text

Positive contexts for terms: study of, information about, professor of, department of

information science, metadata applications, data processing, automatic classification, computational linguistics, internet resources

Negative contexts for terms: very different things, few messages, good point, interesting example, appealing idea, small extension, terse document, simple kind
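One possible reading of this slide is that a candidate term is kept when it follows a positive cue phrase and dropped when it begins with an evaluative or vague modifier. The sketch below implements that reading only; the cue list and modifier list are taken from the examples on the slide, while the function names and the rule of requiring a positive context are assumptions, not the authors' algorithm.

```python
import re

# Cue phrases listed on the slide as positive contexts.
POSITIVE_CUES = ("study of", "information about", "professor of", "department of")

# First words of the negative examples on the slide (assumed to be the signal).
NEGATIVE_MODIFIERS = {
    "very", "few", "good", "interesting",
    "appealing", "small", "terse", "simple",
}


def has_positive_context(term, text):
    """True if the term appears directly after any positive cue phrase."""
    for cue in POSITIVE_CUES:
        pattern = rf"\b{re.escape(cue)}\s+{re.escape(term)}\b"
        if re.search(pattern, text, re.IGNORECASE):
            return True
    return False


def has_negative_modifier(term):
    """True if the term starts with a vague or evaluative modifier."""
    words = term.lower().split()
    return bool(words) and words[0] in NEGATIVE_MODIFIERS


def filter_terms(candidates, text):
    """Keep candidates with a positive context and no negative modifier."""
    return [
        term for term in candidates
        if not has_negative_modifier(term) and has_positive_context(term, text)
    ]


text = (
    "She is a professor of computational linguistics. "
    "The course is a study of automatic classification and makes a good point."
)
candidates = [
    "computational linguistics", "automatic classification",
    "good point", "few messages", "data processing",
]
kept = filter_terms(candidates, text)
```

Requiring a positive context is deliberately strict: "data processing" is dropped here only because the sample text never introduces it with a cue phrase. A scoring scheme that accumulates evidence over many documents would be a natural alternative.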

Page 16: Strategies for subject navigation of linked Web sites using RDF topic maps

System architecture: 2

Harvester (Perl)
File System (HTML)
Metadata Scraper (Perl)
File System (Normalized HTML)
Term manipulator (Java)
File System (XML/RDF)
XML/RDF Loader
Database

Page 17: Strategies for subject navigation of linked Web sites using RDF topic maps
Page 18: Strategies for subject navigation of linked Web sites using RDF topic maps
Page 19: Strategies for subject navigation of linked Web sites using RDF topic maps
Page 20: Strategies for subject navigation of linked Web sites using RDF topic maps
Page 21: Strategies for subject navigation of linked Web sites using RDF topic maps
Page 22: Strategies for subject navigation of linked Web sites using RDF topic maps
Page 23: Strategies for subject navigation of linked Web sites using RDF topic maps

Open issues

RDF knowledge in the user interface.

Encoding in RDF or XML?

The construction of knowledge ontologies.

Page 24: Strategies for subject navigation of linked Web sites using RDF topic maps
Page 25: Strategies for subject navigation of linked Web sites using RDF topic maps

Conclusions

The enterprise succeeds or fails on the strength of the knowledge ontology.

RDF and the XTM standard are descriptively equivalent for our work.

Sophisticated user interface design is required to exploit all of the encoded information.

Page 26: Strategies for subject navigation of linked Web sites using RDF topic maps

For more information

Sharon Caraballo. Automatic Construction of a Hypernym-Labeled Noun Hierarchy. PhD dissertation. Brown University, 2001.

Carol Jean Godby. A Computational Study of Lexicalized Noun Phrases in English. PhD dissertation. The Ohio State University, 2002.