1 ICS-FORTH Describing Resources on the Web: The Resource Description Framework Vassilis Christophides Dimitris Plexousakis Computer Science Department, University of Crete Institute for Computer Science - FORTH Heraklion, Crete http://www.ics.forth.gr/proj/isst/RDF
1
ICS-FORTH
Describing Resources on the Web: The Resource Description Framework
Vassilis Christophides
Dimitris Plexousakis
Computer Science Department, University of Crete
Institute for Computer Science - FORTH
3.6 million Web sites
Five hundred million or more addressable pages on the Web
High consumer expectations conflicting with primitive tools and mechanisms
Uncertain quality, integrity, trust
4
The Information Landscape in the Web-era
The Web changes relationships among
  authors
  publishers
  information intermediaries and distributors
  users
Lower barriers to “publication”
  rapid dissemination of information and ideas
  less advantage to size or centralization
  greatly expanded access
Manageability is reduced
  resource discovery is chaotic
  organization is haphazard
  preservation is almost non-existent
5
The Web Information System vs. Traditional Libraries
Search systems are motivated by advertising
Index coverage is unpredictable and limited (1/3)
Too much recall, too little precision
Index spam abounds
Resources (and their names) are volatile
What about versions, editions, back issues?
Archiving is presently unsolved
Authority and quality of service are spotty
Managing access rights is hard
6
Metadata: Higher Quality Web Information Services
Traditionally: metadata has been understood as “Data about Data”
  helps to impose order on chaos
Example(s):
  a library catalogue contains information (metadata) about publications (data)
  a file system maintains permissions (metadata) about files (data)
Metadata describes other data
  One application’s metadata is another application’s data
  Metadata can itself be described by metadata (but that doesn’t make it meta-metadata)
Example:
  Price lists (metadata) have expiration dates: metadata about metadata (it is still just metadata!)
7
Metadata takes Many Forms
resource discovery
document administration
rights management
content rating
security and authentication
archival status
products and services
database schemas
process control or description
8
Metadata exists for Almost Anything
People
Places
Objects
Concepts
Documents
Archives
Databases
9
Application: Item and Collection Cataloguing
Describing individual resources
  documents, pages, images, audio files, etc.
Describing the content of collections
  Web sites, databases, directories, etc.
Relationships among resources
  tables of contents, chapters, images, ...
  site maps
10
Application: Resource Discovery
Search engines can better “understand” the contents of a particular page
More accurate searches
  Additional information aids precision
Makes it possible to automate searches because less manual “weeding” is needed to process the search results
11
Application: Electronic Commerce
Metadata can be used to encode information needed in all stages of electronic commerce
  agreeing on terms of sale
    prices, terms of payment, contractual information
  transactions
  delivery mechanisms, dates, terms
[Figure labels: Broker, Market place, Providers/Clients]
12
Application: Intelligent Agents
Representation and sharing of knowledge
  knowledge exchange
  modeling
Communication
  user-to-agent, agent-to-agent, agent-to-service
Resource discovery
  gives web-roaming agents the ability to “understand” their environment
[Figure labels: place, service]
13
Application: Content Rating
Empowering users to select which kinds of web content they wish to see
Child protection
W3C PICS (Platform for Internet Content Selection) working group
  US Communications Decency Act of 1996
  simple metadata architecture
  precursor to RDF
14
Application: Digital Signatures
These are key to building the “Web of Trust”
Required by
  agents
  electronic commerce
  collaboration
RDF will become the preferred way to encode digital signatures on documents and on statements about documents
15
Other Applications
Privacy Preferences and Policies
  describing a user’s willingness/reluctance to disclose information about him/herself
  describing a site administrator’s desire to gather information about visiting users
Intellectual Property Rights
  contractual terms related to usage and distribution rights to a document
16
(Meta)Data Transmission Methods
Embedded (e.g. META)
Associated with (in HTTP header)
Trusted third party (explicit HTTP GET)
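As an illustration of the embedded method, the following minimal sketch reads metadata carried in META tags of an HTML page. The sample page, the `DC.creator` and `DC.subject` names, and the helper names are assumptions made for this example; they do not come from the slides.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects name/content pairs from <meta> tags embedded in a page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name is not None and content is not None:
            self.metadata[name] = content


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """<html><head>
<title>RDF Tutorial</title>
<meta name="DC.creator" content="Vassilis Christophides">
<meta name="DC.subject" content="RDF">
</head><body>...</body></html>"""


def extract_embedded_metadata(html_text):
    """Return the embedded metadata of a page as a dictionary."""
    collector = MetaCollector()
    collector.feed(html_text)
    return collector.metadata
```

The other two methods would fetch the description from an HTTP header or from a separate server instead of parsing the page itself.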
17
Metadata Assertions
The Web is “machine-readable” but not “machine-understandable”
Metadata is useful
  A lot could be gained from structured description of pages, servers, search services, and other resources
Accommodate multiple varieties of metadata
Metadata requirements will evolve
18
A Plethora of Metadata Standards
Many metadata standards have evolved at different levels, and to meet different requirements...
MICI
19
Interoperability Issues
Semantic Interoperability
  Standardisation of content
  “Let’s talk English”
  “cat milk sat drank mat”
Structural Interoperability
  Standardisation of form
  “Here’s how to make a sentence”
  “Cat sat on mat. Drank milk.”
Syntactic Interoperability
  Standardisation of expression
  “These are the rules of grammar”
  “The cat sat on the mat. It drank some milk.”
20
Metadata Challenges
Many flavours of metadata
  which one do I use?
Managing change
  new varieties, and evolution of existing forms
Tension between functionality and simplicity, extensibility and interoperability
[Figure labels: Functions, features, and cool stuff; Simplicity and interoperability]
21
Towards Metadata for Community Webs
Group of people sharing a domain of discourse and a set of resources (e.g., data, documents, services) and having some common interests
  Commerce, Education, Health
Provide community-specific metadata functionality in order to create, administrate, and access resources
  common semantic, structural, and syntactic conventions for exchange of resource description information
[Figure labels: Community Webs; Education, Health, Commerce, Workplace]
22
Metadata Interoperability in Community Webs
[Figure labels: Community Webs; Scientific Data, Home Pages, Geo, Library, Museums, Commerce, Whatever...]
Communities of expertise (not software vendors) are responsible for:
  Semantics
  Registration
  Administration
  Access management
  Authority of data
  Sharing and distribution
23
Metadata Implementation Approaches
Harvesting metadata into a repository (database)
Distributed Database Search
24
Harvesting Metadata into a Repository (database)
[Figure labels: HTML, XML, Other types; Harvester; Repository; Query; Dynamic document creation from database; retrieve resource]
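The harvesting approach can be pictured with a small sketch: metadata records collected from documents are loaded into a database, and queries run against that repository rather than against the documents. Everything below (the record layout, the table name, the example URLs and the function names) is an assumption for illustration, using an in-memory SQLite database as a stand-in repository.

```python
import sqlite3

# Hypothetical harvested records: (resource URL, property, value).
HARVESTED_RECORDS = [
    ("http://example.org/a.html", "creator", "Alice"),
    ("http://example.org/b.xml", "creator", "Bob"),
    ("http://example.org/c.html", "creator", "Alice"),
]


def build_repository(records):
    """Load harvested metadata records into an in-memory repository."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE metadata (resource TEXT, property TEXT, value TEXT)"
    )
    connection.executemany("INSERT INTO metadata VALUES (?, ?, ?)", records)
    connection.commit()
    return connection


def find_resources(connection, prop, value):
    """Query the repository; the caller then retrieves the resources found."""
    cursor = connection.execute(
        "SELECT resource FROM metadata WHERE property = ? AND value = ? "
        "ORDER BY resource",
        (prop, value),
    )
    return [row[0] for row in cursor]
```

A real harvester would also need to revisit sources, since the repository is only as fresh as its last crawl.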
25
Distributed Database Search
[Figure labels: Query; Z39.50 Gateway; Z39.50 Server (three instances); retrieve resource]
26
Understanding RDF
27
RDF origins
W3C Metadata Activity 1997-2000
PICS (Internet content selection)
Warwick Framework / Dublin Core
XML (XML Data, Channels etc)
MCF (Apple, Netscape)
URI specification for Web identifiers
28
RDF Objectives
Enables resource description communities to define their own semantics
  We can disagree about semantics, but share infrastructure (syntax, query, editors)
Imposes structural constraints on the expression of various application metadata
  for consistent encoding, exchange and processing of metadata on the Web
Metadata vocabularies can be developed without central coordination
Fine-grained mixing of diverse metadata
Signed RDF is the basis for trust
XML used for ‘serialisation syntax’
29
Describing Community Resources using RDF
[Figure labels: Advanced Knowledge Schemas (ontologies, thesauri); Heterogeneous resource descriptions (<tag1> <tag2> <tag3> </tag1>); Complexity and diversity of information resources]
30
The Basic RDF Data Model
RDF: Resource Descriptions
Data Model: Directed Labeled Graphs
  Nodes: Resources (URIs) or Literals
  Edges: Properties (Attributes or Relationships)
  Statement: assertion of the form resource, property, value
  Description: set of statements concerning a resource
XML syntax
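The data model above can be sketched in a few lines: a graph is a set of (resource, property, value) statements, and a description is the subset of statements about one resource. The `example.org` URIs, the `Name` property and the helper names are assumptions made for this sketch.

```python
from collections import namedtuple

# One edge of the directed labeled graph.
Statement = namedtuple("Statement", ["resource", "property", "value"])


class Literal(str):
    """A literal value node, as opposed to a resource identified by a URI."""


# Hypothetical URIs used only for illustration.
TUTORIAL = "http://example.org/Tutorial"
VASSILIS = "http://example.org/Vassilis"

GRAPH = {
    Statement(TUTORIAL, "Author", VASSILIS),           # value is a resource
    Statement(VASSILIS, "Name", Literal("Vassilis")),  # value is a literal
}


def description_of(graph, resource):
    """A description: the set of statements concerning one resource."""
    return {s for s in graph if s.resource == resource}


def literal_values(graph):
    """All literal nodes appearing as values in the graph."""
    return sorted(s.value for s in graph if isinstance(s.value, Literal))
```

Because the graph is just a set, merging two descriptions of the same resource is a set union.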
31
The Basic RDF Data Model: Primitives
[Figure labels: Resource, Property, Value, Statement, Resource]
32
Simple Example
[Figure labels: URI:Tutorial; Author; “Vassilis”; URI:Vassilis]
33
The notion of Resource
A resource is identified by a URI: [absoluteURI | relativeURI] [“#” fragment-id]
The resource identified by a URI may be abstract, i.e. not network retrievable
Resource is distinct from entity resolved at any particular time
  http://www.ics.forth.gr/RDF/
From RFC 2396:
  Resource: A resource can be anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources. The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time. Thus, a resource can remain constant even when its content---the entities to which it currently corresponds---changes over time, provided that the conceptual mapping is not changed in the process.
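The URI form given above (absolute or relative URI, optionally followed by a fragment identifier) can be explored with the Python standard library. In this sketch the relative reference `tutorial.html#intro` and the function name are assumptions; only the base URI comes from the slide.

```python
from urllib.parse import urldefrag, urljoin


def split_reference(base_uri, reference):
    """Resolve a (possibly relative) reference and split off its fragment.

    Returns a (uri, fragment_id) pair; fragment_id is "" when absent.
    """
    absolute = urljoin(base_uri, reference)
    uri, fragment = urldefrag(absolute)
    return uri, fragment
```

For example, `split_reference("http://www.ics.forth.gr/RDF/", "tutorial.html#intro")` resolves the relative part against the base before separating the fragment. Note that resolving the reference says nothing about whether the resource is retrievable: an abstract resource still has a well-formed URI.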
34
RDF Syntax
RDF Model defines formal relationships among resources, properties and values
Syntax is required to...
  Store instances of the model
RDF relies on an (edge-labeled) directed graph model that can easily
  be extended by just adding more edges
  combine multiple vocabularies, distinguished by their URIs
RDF provides a standard syntax to represent these graphs in XML
  RDF Model can be thought of as a simplified XML Infoset
But RDF goes beyond XML syntactic issues
  It allows semantic networks to be defined on the Web
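To make the XML representation concrete, here is a minimal sketch that writes one statement as RDF/XML using the standard `rdf:RDF` and `rdf:Description` elements. The `http://example.org/terms#` vocabulary, its `ex` prefix and the function name are assumptions for illustration; only literal values are handled.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/terms#"  # hypothetical vocabulary namespace


def serialize_statement(resource_uri, property_name, literal_value):
    """Serialize one (resource, property, literal) statement as RDF/XML."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ex", EX_NS)
    root = ET.Element("{%s}RDF" % RDF_NS)
    # The described resource is named by the rdf:about attribute.
    description = ET.SubElement(
        root,
        "{%s}Description" % RDF_NS,
        {"{%s}about" % RDF_NS: resource_uri},
    )
    # The property becomes a child element; its text is the literal value.
    prop = ET.SubElement(description, "{%s}%s" % (EX_NS, property_name))
    prop.text = literal_value
    return ET.tostring(root, encoding="unicode")
```

Qualifying the property with a namespace URI is what lets several vocabularies coexist in one description.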
48
Semantic Networks
“A Person has a name and lives_in somewhere. Artists are persons; painters and sculptors are artists. An artist creates artifacts (paintings or sculptures); a painter paints paintings and a sculptor sculpts sculptures.”
[Figure labels: classes Person, Artist, Painter, Sculptor, Artifact, Painting, Sculpture, String; properties name, lives in, creates, paints, sculpts; isa links]
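The network described in the quoted sentence can be written down as two small tables: the isa links and the property signatures. This is only a sketch; the dictionary layout and function names are assumptions, and `lives_in` is left out because the slide does not name its range.

```python
# isa links: each class points to its direct superclass.
ISA = {
    "Artist": "Person",
    "Painter": "Artist",
    "Sculptor": "Artist",
    "Painting": "Artifact",
    "Sculpture": "Artifact",
}

# Property signatures: name -> (domain class, range class).
PROPERTIES = {
    "name": ("Person", "String"),
    "creates": ("Artist", "Artifact"),
    "paints": ("Painter", "Painting"),
    "sculpts": ("Sculptor", "Sculpture"),
}


def superclasses(cls):
    """Follow isa links upward, nearest superclass first."""
    chain = []
    while cls in ISA:
        cls = ISA[cls]
        chain.append(cls)
    return chain


def applicable_properties(cls):
    """Properties a class has directly or inherits through isa links."""
    classes = [cls] + superclasses(cls)
    return sorted(
        name for name, (domain, _range) in PROPERTIES.items() if domain in classes
    )
```

A painter therefore inherits `name` from Person and `creates` from Artist in addition to its own `paints`.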
49
RDF Schema Definition: RDFS
Declaration of label vocabularies for description graph nodes & edges
  Enables communities to share machine readable tokens and define human readable labels
Node labels (types) are defined as classes
  Literal data types as defined by XML Schemas WG
  Resource may have a specific ‘type’ property
Edge labels (predicates) are defined as properties of these classes
  A resource of given type may have a given property (domain constraint)
  A resource of given type may be the value of a given predicate (range constraint)
RDFS vocabularies expressible in the basic RDF model and syntax
  RDFS vocabularies are also Web resources (and have URIs) and therefore can be described using RDF
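The domain and range constraints can be illustrated with a small checker. This is a sketch under stated assumptions: the schema fragment, the resource names (`picasso`, `guernica`, `reina_sofia`) and their types are made up for the example, and real RDFS processors may treat such constraints differently.

```python
# Hypothetical schema fragment: subclass links and one property signature.
SUBCLASS_OF = {"Painter": "Artist", "Painting": "Artifact"}
PROPERTY_SCHEMA = {"paints": ("Painter", "Painting")}  # (domain, range)

# Hypothetical 'type' property values of some resources.
RESOURCE_TYPES = {
    "picasso": "Painter",
    "guernica": "Painting",
    "reina_sofia": "Museum",
}


def is_instance_of(resource, cls):
    """True if the resource's type is cls or one of its subclasses."""
    current = RESOURCE_TYPES.get(resource)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def constraint_violations(subject, predicate, value):
    """Return which of the domain/range constraints a statement breaks."""
    domain, range_ = PROPERTY_SCHEMA[predicate]
    problems = []
    if not is_instance_of(subject, domain):
        problems.append("domain")
    if not is_instance_of(value, range_):
        problems.append("range")
    return problems
```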
50
Constructing and Using RDF schemas
RDFS schema vocabularies allow for
  Specialization of both classes & properties (simple & multiple)
  Multiple classification of resources under several classes
Relational or Object Database Models (ODMG, SQL)
  Classes don’t define table or object types
  Instances may have quite different associated properties
  Collections with heterogeneous members
Semistructured or XML Data Models (OEM, UnQL, YAT, XML Schema)
  Schema labels on both nodes and edges
  Class and property subsumption is not captured
  Heterogeneous structures reminiscent of SGML exceptions
Knowledge Representation Languages (Telos, DL, F-Logic)
  Absence of complex values and n-ary relationships (bags, sequences)
58
Some RDF Applications
Web Browsers:
  Netscape 6 from Netscape/AOL uses RDF to integrate various data-oriented applications such as bookmarks, mail/news, channels, etc. as well as for smart browsing and related links (RDF annotation services)
  Amaya Editor/Browser from W3C uses RDF to support user annotations on Web pages as metadata
Brokers/Portals:
  RSS (RDF Site Summary) XML/RDF Specification 1.0 2000
  Web Service Description Language (WSDL) XML/RDF Specification 2000
  PICS Rating Vocabularies in XML/RDF W3C NOTE 27 March 2000
  Platform for Privacy Preferences and RDF/RDF W3C Draft 10 May 2000
Content Management:
  OCLC Dublin Core Elements in RDF
  ICOM-CIDOC Conceptual Reference Model in RDF
  The Wordnet Lexical Ontology in RDF
  European Treasury Browser in RDF
59
Example: Annotation & Recommendation Services
60
Practical notes on RDF
Authoring/Visualization
  by hand (experts only, perhaps copy & paste)
  support by other tools (editors like Stanford Protégé)
  conversion from existing data stores (using XSLT)
  visualize RDF graphs (using Rudolf RDFViz)
Declarative query language for RDF description bases
  relies on a typed data model (literal & container types + union types)
  follows a functional approach (basic queries and filters)
  adapts the functionality of semistructured or XML query languages to RDF, but also:
    treats properties as self-existent individuals
    exploits taxonomies of node and edge labels
    allows querying of schemas as semistructured data
Find the resources of type painter and sculptor
  ExtResource intersect Sculpture
  {{ www.rodin.fr/thinker.gif }}
Schema constructs used as query terms & support for automatic query expansion (similar to thesauri-based IRS)
Useful to query resources with minimal schema knowledge
Includes paints & sculpts
Multiply classified resources
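The idea of intersecting class extents to find multiply classified resources can be sketched with plain sets. This does not reproduce RQL's evaluation; the class hierarchy, the `www.example.org` resources and the function names are assumptions for illustration.

```python
# Hypothetical schema: Painter and Sculptor both specialize Artist.
SUBCLASS_OF = {"Painter": "Artist", "Sculptor": "Artist"}

# Hypothetical description base: a resource may be classified more than once.
CLASSIFICATION = {
    "www.example.org/artist-a": {"Painter", "Sculptor"},
    "www.example.org/artist-b": {"Painter"},
    "www.example.org/artist-c": {"Sculptor"},
}


def is_subclass_or_same(cls, target):
    """True if cls equals target or reaches it through subclass links."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def extent(target):
    """Resources classified under target or any of its subclasses."""
    return {
        resource
        for resource, classes in CLASSIFICATION.items()
        if any(is_subclass_or_same(c, target) for c in classes)
    }


def classified_under_both(first, second):
    """Set intersection of two class extents."""
    return extent(first) & extent(second)
```

Querying the superclass `Artist` expands automatically to both subclasses, which is the query-expansion behaviour the slide compares to thesaurus-based retrieval.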
71
Personalizing Portal Catalogs with RQL
Navigational queries on semistructured resource descriptions
  Find the Museum resources that have been modified in year 2000.
    select x from Museum{x}.last_modified{y} where y >= 2000/01/01
    {{museoreinasofia.mcu.es}}
Similar functionality to semistructured or XML query languages (Lorel, UnQL, XQL, XML-QL, XML-GL)
Useful in the absence of schema information or when multiple schemas are used to describe resources
Data paths not foreseen in the schema
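What the navigational query computes can be approximated in plain Python: start from the Museum class, follow `last_modified`, and filter on the date. The data below is hypothetical (including the date attached to museoreinasofia.mcu.es and the second museum), and the function is a sketch, not an RQL implementation.

```python
from datetime import date

# Hypothetical description base.
TYPES = {
    "museoreinasofia.mcu.es": "Museum",
    "www.example-museum.org": "Museum",
    "www.rodin.fr/thinker.gif": "Sculpture",
}
LAST_MODIFIED = {
    "museoreinasofia.mcu.es": date(2000, 6, 9),   # assumed date
    "www.example-museum.org": date(1999, 5, 1),   # assumed date
}


def museums_modified_since(cutoff):
    """Museum resources whose last_modified value is on or after cutoff."""
    return sorted(
        resource
        for resource, cls in TYPES.items()
        if cls == "Museum"
        and resource in LAST_MODIFIED
        and LAST_MODIFIED[resource] >= cutoff
    )
```

Resources without a `last_modified` value are simply skipped, which mirrors how path expressions tolerate missing properties in semistructured data.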
72
Querying Portal Catalogs with Large Schemas
Filtering both resource descriptions and schemas
  Find the paintings having as technique “oil on canvas” that have been created by a neo-impressionist painter
    select y from {:$X}creates{y:Painting}.technique{z} where $X <= neo-impressionist and z = “oil on canvas”
Data filtering with schema information
Schema filtering on class hierarchies
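The combination of a schema filter (`$X <= neo-impressionist`) and a data filter (`z = “oil on canvas”`) can be mimicked as follows. All class names below the neo-impressionist level, all resource identifiers and all statements are hypothetical, and the code only approximates the query's intent.

```python
# Hypothetical class hierarchy below Painter.
SUBCLASS_OF = {
    "NeoImpressionist": "Painter",
    "Pointillist": "NeoImpressionist",
    "Cubist": "Painter",
}

# Hypothetical resource descriptions.
TYPES = {
    "painter-a": "Pointillist",
    "painter-b": "Cubist",
    "painting-1": "Painting",
    "painting-2": "Painting",
    "painting-3": "Painting",
}
CREATES = [
    ("painter-a", "painting-1"),
    ("painter-a", "painting-2"),
    ("painter-b", "painting-3"),
]
TECHNIQUE = {
    "painting-1": "oil on canvas",
    "painting-2": "watercolour",
    "painting-3": "oil on canvas",
}


def subsumed_by(cls, ancestor):
    """True if cls is ancestor or one of its (transitive) subclasses."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def paintings_by(creator_class, technique):
    """Paintings with the given technique created by an instance of creator_class."""
    return sorted(
        painting
        for creator, painting in CREATES
        if subsumed_by(TYPES[creator], creator_class)
        and TYPES.get(painting) == "Painting"
        and TECHNIQUE.get(painting) == technique
    )
```

The creator is typed `Pointillist`, not `NeoImpressionist`; it still matches because the filter walks the class hierarchy.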
73
Querying Portal Schemas with RQL
Pure schema queries
  Find the properties which specialize the property creates and may have as domain the class Painter along with their corresponding range classes
    select @P, $Y from {:Painter}@P{:$Y} where @P <= creates
Similar functionality to DBMS schema QLs (SchemaSQL, XSQL)
Useful for large schemas (integrating ontologies and thesauri)
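A pure schema query touches no resource descriptions at all. The sketch below answers the same question over a small in-memory schema; it assumes `<=` is inclusive (so `creates` itself qualifies) and, as a simplification, returns only each property's declared range rather than all of that range's subclasses. The schema tables and function names are assumptions.

```python
# Schema assumed for illustration, following the artist example.
SUBCLASS_OF = {"Artist": "Person", "Painter": "Artist", "Sculptor": "Artist"}
SUBPROPERTY_OF = {"paints": "creates", "sculpts": "creates"}
SIGNATURES = {  # property -> (domain, range)
    "name": ("Person", "String"),
    "creates": ("Artist", "Artifact"),
    "paints": ("Painter", "Painting"),
    "sculpts": ("Sculptor", "Sculpture"),
}


def subsumed_by(name, ancestor, hierarchy):
    """Inclusive subsumption test over a child -> parent mapping."""
    while name is not None:
        if name == ancestor:
            return True
        name = hierarchy.get(name)
    return False


def specializations_for(class_name, property_name):
    """(property, range) pairs for properties at or below property_name
    whose domain is class_name or one of its superclasses."""
    return sorted(
        (prop, range_)
        for prop, (domain, range_) in SIGNATURES.items()
        if subsumed_by(prop, property_name, SUBPROPERTY_OF)
        and subsumed_by(class_name, domain, SUBCLASS_OF)
    )
```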
75
Putting it all Together
Nested schema and data queries
  Find the resources modified after 2000/01/01 which can be reached by a property applied to the class Painting and its subclasses
    select R, y from (select @P from {:$X}@P where $X <= Painting){R}.{y}last_modified{z} where z >= 2000/01/01
    {{ [exhibited, museoreinasofia.mcu.es] }}
Subcommunities may use different schemas while sharing the same description base
R ranges over the labels of type property
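The nesting can be read as two steps: an inner schema query selects properties whose domain is Painting or a subclass, and an outer data query follows those properties and filters on `last_modified`. The sketch below uses hypothetical identifiers (`painting-1`, `OilPainting`, `www.example-museum.org`) and assumed dates; it illustrates the shape of the evaluation only.

```python
from datetime import date

# Hypothetical schema.
SUBCLASS_OF = {"OilPainting": "Painting"}
PROPERTY_DOMAINS = {
    "exhibited": "Painting",
    "technique": "Painting",
    "name": "Person",
}

# Hypothetical description base.
STATEMENTS = [
    ("painting-1", "exhibited", "museoreinasofia.mcu.es"),
    ("painting-1", "technique", "oil on canvas"),
    ("painting-2", "exhibited", "www.example-museum.org"),
]
LAST_MODIFIED = {
    "museoreinasofia.mcu.es": date(2000, 6, 9),   # assumed date
    "www.example-museum.org": date(1999, 5, 1),   # assumed date
}


def subsumed_by(cls, ancestor):
    """Inclusive subclass test."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def properties_applied_to(class_name):
    """Inner schema query: properties whose domain is at or below class_name."""
    return {
        prop
        for prop, domain in PROPERTY_DOMAINS.items()
        if subsumed_by(domain, class_name)
    }


def reachable_and_modified_since(class_name, cutoff):
    """Outer data query: (property, resource) pairs passing the date filter."""
    properties = properties_applied_to(class_name)
    return sorted(
        {
            (prop, value)
            for _subject, prop, value in STATEMENTS
            if prop in properties
            and value in LAST_MODIFIED
            and LAST_MODIFIED[value] >= cutoff
        }
    )
```

Literal values such as the technique string carry no `last_modified` property, so they drop out of the result.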
76
RQL: Examples
[Figure labels: Portal Schema: Painting, Museum, String, exhibited, technique. Portal Resource Descriptions: &r2, &r3, &r4, exhibited, technique “oil on canvas”, last_modified 2000/06/09, last_modified 2000/01/02]
77
Putting it all Together
Schema and data queries
  Find all metadata about the resources of the site museoreinasofia.mcu.es
    select x,$$Y,$P,z,$$W from {x:$$Y}$P{z:$$W} where x like “*museoreinasofia.mcu.es*” or y like “*museoreinasofia.mcu.es*”
    {{[www.portal.gr/picasso132, Painter, paints, museoreinasofia.mcu.es/guernica.gif,