Web of Belief: Modeling and using Trust and Provenance in the Semantic Web Department of Computer Science and Electronic Engineering University of Maryland.


Department of Computer Science and Electronic Engineering, University of Maryland Baltimore County
Li Ding
Last updated: 04/19/23

Outline

• Introduction
• Thesis Statement
• Research description
• Research plan
• Preliminary Work
  • The Web Of Belief Framework
  • Evaluation
• Contributions to computer science
• Thesis Schedule

Motivation

The growing body of the Semantic Web
Observations
• More data is encoded in Semantic Web languages, from many sources and with various dialects and ontologies
• Information is managed by a two-layer mechanism in terms of "document, ontology, namespace, term"
  • Physical layer: the web of Semantic Web documents
  • Logical layer: the RDF graph
• More Semantic Web tools
Driving forces
• Industrial: weblogs, RSS, social network websites
• Academic: research projects

Motivation (cont’d)

The Semantic Web has not yet become a real-world "KB"
• Credibility & consistency
  • Facts are provided by many sources without guarantee
• Scalability
  • Data is vast in amount
  • Data is stored in an open and distributed context
• Utility
  • Data is fragmented
  • Bad URI references to resources and namespaces in the web of documents
  • Lack of associations in the RDF graph

Motivation (cont’d)

Why provenance and trust?
• Important concepts borrowed from the human world
  • Multi-disciplinary origins: social science, epistemology, psychology
  • The foundation of knowledge management and inference
• Keys to credibility assessment and justification
  • Empirical heuristics, and a complementary method, when no domain knowledge is available to reason directly over credibility
  • Explicit representation of the justification trace
  • Good heuristics for resolving inconsistency
• Keys to effectiveness and efficiency
  • Knowledge can be managed by provenance in addition to topic
  • Trust reduces search complexity

Thesis Statement

This dissertation shows that our Web Of Belief framework, a provenance and trust aware inference framework, is critical and effective in deriving answers with credibility assessment and justification across the open, distributed, and large scale online knowledge base provided by the Semantic Web.

Research Description

General Description
Goal: model and use provenance and trust in the SW
• to enable a credible "world KB"
• to enable a trust layer in the Semantic Web
Representation
• Encode provenance and trust
• Represent the SW as a KB
Management
• Acquisition & digest
• Data access interface
• Inference space expansion
Inference
• Hypothesis test
  • Trust network computation
  • Statement credibility
  • Justification
• Ontology dictionary
  • Term definition
  • Class tree

The Infrastructure of the Semantic Web

[Figure: infrastructure diagram. Layers: Applications, Computing Services, Data Service, Directory/Digest Service. Components: SW data service, SW service finder, SW data finder, reputation service, web entity directory, database, (Web) document, RDF document. Edge labels: digests, searches, uses.]

Assumptions

• Propositional knowledge (facts)
• Uncertain knowledge with provenance
• Open and distributed knowledge storage

Relationship to Other Work

Representation
• Logical formalisms of agent models (AI)
• Truth theory (epistemology)
• Provenance
Data access
• Collaborative KB in an open distributed context (DB)
Learning
• Learning agent models: knowledge and behavior (social learning & psychology)
Inference
• Reasoning over uncertain knowledge (reasoning)

Logical Formalisms

Modal logic: logically formalize agents
• Agent & action (McCarthy, 1969; Kanger-Porn-Lindahl)
• Agent & belief and intention (Cohen, Levesque, 1990)
• Agent & knowledge (epistemic logic)
• Agent & belief (doxastic logic)
• Agent & obligation (deontic logic)
Other logical formalisms for trust and belief
• Regan's formal framework for belief and trust
• Josang's subjective logic
• Abdul-Rahman's social trust model
• Jones and Firozabadi's integrated logic model of trust

Epistemology

Learning Agent models

Objects to be learned
• Domain trust
• Referral trust
Methods
• Histogram
• Feedback based

Reasoning over uncertain knowledge

Quantitative approach
• Certainty factors: MYCIN (Shortliffe, 1976), an obsolete heuristic similar to the fuzzy approach
• Possibility theory: fuzzy logic (Zadeh, 1965; 1976), Dempster-Shafer theory (Dempster, 1968; Shafer, 1976), subjective logic
• Probabilistic theory: Bayesian networks (Pearl, 1982)
Qualitative approach
• Non-monotonic logic

Two level data access

Logical level
• Datalog
• RDF data access languages (with provenance): quads, TriQL, SPARQL
Storage level
• Centralized: triplestore, Kowari
• Decentralized: search engine?
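A minimal sketch of provenance-aware data access at the logical level, assuming a triple is extended to a quad whose fourth element names the asserting document. The `Quad` type, the `match` helper, the `ex:` terms, and the example.org URLs are hypothetical illustrations, not TriQL or SPARQL syntax.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # provenance: the document that asserts the triple


def match(quads: Iterable[Quad],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None,
          source: Optional[str] = None) -> Iterator[Quad]:
    """Yield quads matching the pattern; None acts as a wildcard (a variable)."""
    pattern = (subject, predicate, obj, source)
    for quad in quads:
        if all(p is None or p == v for p, v in zip(pattern, quad)):
            yield quad


# Hypothetical data: two documents give different values for the same property.
STORE = [
    Quad("ex:Washington", "ex:population", "100", "http://example.org/census.rdf"),
    Quad("ex:Washington", "ex:population", "120", "http://example.org/blog.rdf"),
    Quad("ex:Washington", "rdf:type", "ex:State", "http://example.org/census.rdf"),
]

# "Who asserts a population for ex:Washington?"
sources = sorted(q.source for q in match(STORE, "ex:Washington", "ex:population"))
print(sources)
```

Keeping the source in every answer row lets a later step weigh conflicting values by how much each document is trusted.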

Example walkthrough

Given a hypothesis/query in the form of a collection of RDF statements, with or without variables
Provenance
• Where can I find them?
• Where are the definitions for each term?
• Belief(agent, fact): who said or asserted so?
• Justify(fact, fact):
Trust
• Can I believe them and thus use them in decision making?
• How do I trust the other agents?

Relationship to Other Work

Representation
• Agent, knowledge
• Provenance
• Trust
Data access
• Metadata
• RDF query language
  • Pattern extraction
  • Transitive closure
• RDF storage
Inference
• Trust network inference
• Credibility
  • Probabilistic inference
• Scalability
  • Domain filter
  • Social filter

Semantic Web

Research Plan

Approach: the WOB framework
Representation
• WOB ontology
  • Model provenance and trust in the Semantic Web
  • Explicitly represent the Semantic Web
  • Represent the SW as a KB in terms of "agent, statement, association"
Management
• Provenance-aware data access language
• Social network extraction and integration
• Provenance- and trust-based knowledge base expansion
Inference
• Hypothesis credibility assessment
  • Trust network inference
  • Provenance- and trust-based belief evaluation
  • Explicit justification
• Ontology dictionary

Research Methodology

• Identify real-world problems with examples
• Approach problems
  • Formalize the problem
  • Position the problem in the literature and find related work
  • Find issues to be resolved
  • Design and implement solutions
• Evaluation methods
  • Statistics
  • Project application
  • Survey

Artifacts to be produced

• [Data] Web Of Belief Ontology
• [System] Swoogle metadata and search service
• [System] Ontology dictionary
• [Data] Swoogle statistics
• [System] SemDis trust layer
• [Algorithm] Trust-based belief evaluation
• [Algorithm] Trust-based knowledge expansion

Limitations

Limited to online Semantic Web documents

Preliminary Work

WebOfBelief Ontology

Ontology
• Entity: Document, Statement, Reference, Agent, Association
  • Sub-classes: trust, belief, justification, dependency
• Facets
  • Confidence (conditional probability)
  • Connective (semantics)
Provenance
• (Agent-Document) ownership/authorship
• (Agent-Reference) belief
• (Reference-Reference) justification
• (doc-doc) dependency
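The association structure above can be sketched as a small data type. This is only an illustration under stated assumptions: the `Association` class, its field names, and the `ex:` identifiers are hypothetical, and the confidence facet is assumed to be a number in [0, 1].

```python
from dataclasses import dataclass

# The four association sub-classes named on the slide.
KINDS = {"trust", "belief", "justification", "dependency"}


@dataclass(frozen=True)
class Association:
    kind: str                 # one of KINDS
    source: str               # agent, reference, or document identifier
    target: str               # agent, reference, or document identifier
    connective: str           # semantics of the link, e.g. "wob:believe"
    confidence: float = 1.0   # assumed to lie in [0, 1]

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown association kind: {self.kind}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


# Hypothetical instance: an agent believes a referenced statement with confidence 0.8.
example = Association("belief", "ex:alice", "ex:statement1", "wob:believe", 0.8)
print(example)
```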

Logical Formalisms

[Figure: Web Of Belief (WOB) Conceptual Framework (v0.92). Classes: Reference, foaf:Agent, rdf:Statement, foaf:Document, rdf:Resource, Association (Justification, Trust, Belief, Dependency), AssociationConnective. Properties: selects, contains, source, foaf:page, dc:creator, confidence (xsd:real in [0,1]), connective. Connectives: wob:believe, wob:disbelieve, wob:nonbelieve; wob:support, wob:weaken, wob:cause, wob:imply; wob:truthful, wob:wise, wob:knowledgeable, wob:cooperative; wob:imports, wob:priorVersion.]

Data digest service

Support data access language

Credibility Assessment

Trust network inference: given a trust network, how can trust be propagated so as to evaluate the trust between any two agents?

Trust and provenance based statement evaluation

Explicit Justification
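One simple way to read "propagate trust" is sketched below, assuming direct trust values in [0, 1] that multiply along a path, with the best path taken as the derived trust. This is a generic heuristic chosen for illustration, not the propagation method of this work; the function name and the agent names are hypothetical.

```python
import heapq
from typing import Dict


def propagate_trust(network: Dict[str, Dict[str, float]],
                    source: str, target: str) -> float:
    """Return the highest product of edge trust over any path from source
    to target, or 0.0 if target is unreachable.

    Because every edge value is at most 1, products never increase along a
    path, so a Dijkstra-style best-first search is valid.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated trust
    while heap:
        negated, agent = heapq.heappop(heap)
        trust = -negated
        if agent == target:
            return trust
        if trust < best.get(agent, 0.0):
            continue  # stale heap entry
        for neighbor, weight in network.get(agent, {}).items():
            candidate = trust * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return 0.0


# Hypothetical trust network: network[a][b] is a's direct trust in b.
network = {
    "alice": {"bob": 0.9, "carol": 0.5},
    "bob": {"dave": 0.8},
    "carol": {"dave": 0.9},
}
derived = propagate_trust(network, "alice", "dave")
print(round(derived, 2))
```

Here the path through bob (0.9 × 0.8) beats the path through carol (0.5 × 0.9), so alice's derived trust in dave follows the bob path.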

Ontology dictionary?

Social network extraction and mapping

Application

• Trust-based belief evaluation
• Trust- and provenance-aware inference
• Hypothesis testing and justification
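Trust-based belief evaluation can be illustrated with a trust-weighted vote, assuming each source either believes or disbelieves a statement and the evaluator holds a trust value in [0, 1] for each source. The scoring rule, the function name, and the source names are assumptions made for this sketch, not the evaluation algorithm of this work.

```python
from typing import Dict, Optional


def credibility(opinions: Dict[str, bool],
                trust: Dict[str, float]) -> Optional[float]:
    """Return the trust-weighted share of sources that believe the statement.

    opinions maps a source to True (believes) or False (disbelieves).
    Sources with no known trust value count as 0. Returns None when no
    trusted source has an opinion.
    """
    total = sum(trust.get(source, 0.0) for source in opinions)
    if total == 0:
        return None
    support = sum(trust.get(source, 0.0)
                  for source, believes in opinions.items() if believes)
    return support / total


# Hypothetical sources and trust values.
opinions = {"gov-site": True, "blog": False, "wiki": True}
trust = {"gov-site": 0.9, "blog": 0.3, "wiki": 0.6}
score = credibility(opinions, trust)
print(round(score, 3))
```

Running the same evaluation with all trust values set equal gives a plain majority vote, which is one way to compare results with and without trust.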

Evaluation

• Validate derived trust relations: survey users
• Validate performance of WOB inference
  • Compare results with and without trust & provenance
• Validate application utility: customer report

Contributions

• A practical framework that makes the Semantic Web a KB
  • The Web of Belief ontology
  • Semantic Web data digest service
    • Search and browse mechanisms for the SW
    • Support for an RDF data access language?
  • Inference
    • Judge information trustworthiness
• The first work in characterizing the Semantic Web
• Trust- and provenance-aware distributed inference

Dissertation schedule

Measures
• Size of data that can be handled
• Size of trust network
Milestones
• Half-way finished

[Figure: An outline of the Semantic Web. Components: the Semantic Web, SW services, SW file, SW data service (information protection), SW composer (compose: rich information, text), SW user (heuristic search, flexible query), SW intelligent user, SW digest, SW service finder, SW data finder, reputation service, Digest/Search Service, Inference Service. Inference: derive trust, belief fusion, justification. Representation: belief, trust; policy, rule. Related areas: P2P, possibility theory, trust, belief theory, Semantic Web.]

Example query: "Find Washington population"
Possible responses
• Which "Washington" do you mean?
• Sure! The following SWDs/agents know that.
• Here is the certainty/trustworthiness of each unique answer.
• Oh yeah! Answer X is credible because it comes from a government website.
• Sorry, I don't have it. Do you want the US population?
Components involved: disambiguation, SW digest, inference

Associations
• Belief: who knows what?
• Trusting provenance
  • Credential-based trust
  • Reputation-based trust
  • Context/role-based trust
• Trusting content
  • Consensus
  • Context axioms
RDF reference: how to refer to part of an RDF graph
Trust network
Justification
• A rule represents a hypothesis
• A justification instantiates a rule
Fill an RDF template: show me the complete definition of class X
Trust network discovery: uncertainty and precision

An example

Expected Contributions

Framework
• Features for characterizing the Semantic Web
• A Web of Belief ontology to connect the Semantic Web
  • Association/annotation
  • Query language or data access language?
Mechanisms
• Search/browse Semantic Web documents
• Judge information trustworthiness
Applications
• Swoogle
• SemDis

1. Web of Belief: represent the SW
• Build an abstract view of the Semantic Web
• Select features to characterize it
  • Overall features: timeline, category
  • Different levels: term, document, network
  • Different classes: Entity, Association
  • Different semantics: meta-ontology, domain ontology
• Build the Web of Belief ontology for explicit representation

Ontology, Document, Namespace, and Term

[Figure: relations among Term, Document (SWDB, Ontology), Namespace, and Local name. Edge labels: contains (m:n), uses (n:1), defines (m:n), defines (m:n), hasName (n:1), sameLocalName.]

Swoogle Search & Browse (1/3)
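The term, namespace, and local-name relations can be illustrated with a small sketch. It assumes the common convention that a term URI splits after its last "#", or after its last "/" when there is no "#"; the function names and the example.org URI are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_term(uri: str) -> Tuple[str, str]:
    """Split a term URI into (namespace, local name).

    Heuristic: cut after the last '#', else after the last '/'.
    A URI with neither separator is returned as a bare local name.
    """
    cut = uri.rfind("#")
    if cut == -1:
        cut = uri.rfind("/")
    if cut == -1:
        return "", uri
    return uri[:cut + 1], uri[cut + 1:]


def group_by_local_name(uris: Iterable[str]) -> Dict[str, List[str]]:
    """Group term URIs that share a local name across namespaces."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for uri in uris:
        _, local = split_term(uri)
        groups[local].append(uri)
    return dict(groups)


terms = [
    "http://xmlns.com/foaf/0.1/Person",
    "http://example.org/onto#Person",       # hypothetical ontology
    "http://www.w3.org/2000/01/rdf-schema#Class",
]
groups = group_by_local_name(terms)
print(split_term(terms[0]))
print(groups["Person"])
```

Terms that land in the same group are candidates for a "same local name" link, which is weaker than any claim that they mean the same thing.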

An abstract view of the Semantic Web

[Figure: an abstract view of the Semantic Web at three levels. Network level: Semantic Web, RDF database. Document level: Document (SWD, NSWD; SW ontology, SW database; Java source), doc-doc association. RDF node level: RDF Node (Resource, Literal; class, property; ID, non-ID), node-node association, node-doc association.]

2. Swoogle: index service for the SW
• Even when knowledge is online, a portal data digest service is needed to facilitate data access
• RDF digest
  • Meta level (uses RDF/OWL semantics)
  • Domain level (uses domain semantics)
• RDF query
  • Document
  • Term
  • Literal (name, identifier)
• Dictionaries
  • Term/ontology dictionary
  • Web entity dictionary

Association features
• node-node
  • Term definition
  • Class-property (c-p): ontological c-p definition, empirical c-p definition
  • Ontological annotation
  • Meta association, e.g. rdfs:subClassOf, rdfs:domain
• node-doc
  • resource, doc: #subject, #property, #object, #subject-type-X, #X-type-object
  • literal, doc, predicate
• doc-doc
  • Meta association, e.g. owl:imports
  • Namespace co-occurrence

[Figure: example RDF graph with nodes C, MetaC, o1, I, P1, P2, P3 and edges labeled rdf:type, rdfs:domain, rdfs:range.]

Story 1: Big RDF file & P2P

Facts
• WordNet has published its ontology in a 60M DAML file, which Jena fails to load into memory.
• Most people use ontologies to annotate exported data (as Stefan Decker argued at the WWW2004 Developer Day).
• Querying RDF should be tractable (Ian Horrocks, Andy Seaborne), i.e. we need to balance the tractability and the expressiveness of a query.
• The query result for a graph pattern (with variables) can be of three types: a subgraph, the variable bindings, or a maximal subgraph.
• Provenance information mainly ranges over agents (person, organization, website), i.e. an agent's belief.
Questions
• Is it appropriate to say that an RDF model is an RDF file? If not, how do we describe a distributed RDF model?
• Will there be any very big RDF files? Why? Can we store RDF in small files distributed throughout the world?

3. SemDis: how to judge information trustworthiness?
Granularity
• rdf:Statement
• SWD
• Information source (agent, website)
• Topic
Association
• Social network (FOAF)
• Belief, authorship (foaf:maker)
• Justification
Trust computation
• Ranking
• Network
• Consensus

Practice of Trust

Fields
• Weblog: FOAF, RSS
• Online social network: DBLP, FOAF, Google
Applications
• Manipulate precision
  • Disambiguation: specialize knowledge
  • Privacy protection: generalize knowledge
• Manipulate completeness: fuse knowledge
Algorithms
• Trust propagation algorithms: surfer model, flow model
• Belief merging algorithm

Given a new statement
• Reasoning: what is its trustworthiness, given opinions on it from some information sources? (subjective logic, fuzzy cognitive map)
• Justification: how can evidence that supports or weakens it be found? (Web of Belief ontology for annotation)
Given a question
• Search: effective and efficient in an open environment (RDF digest, bounded search with a trust heuristic)
Given an online multi-network
• Social relations among information sources (FOAF)
• Ontological relations among topics (sub-topic)
• Web entity identification and mapping
Emergence model
• How can these really affect Semantic Web research?
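For the reasoning question above, subjective logic offers one concrete option: each source's view of a statement is an opinion (belief, disbelief, uncertainty) summing to 1, and two independent opinions are combined with the consensus operator. The sketch below follows the standard form of that operator as commonly stated; the `Opinion` type and the sample values are hypothetical, and the case where both opinions are fully certain is simply rejected here.

```python
from typing import NamedTuple


class Opinion(NamedTuple):
    belief: float
    disbelief: float
    uncertainty: float  # belief + disbelief + uncertainty == 1


def consensus(a: Opinion, b: Opinion) -> Opinion:
    """Combine two independent opinions about the same statement."""
    k = a.uncertainty + b.uncertainty - a.uncertainty * b.uncertainty
    if k == 0:
        # Both opinions are dogmatic (uncertainty 0); not handled in this sketch.
        raise ValueError("consensus undefined when both uncertainties are 0")
    return Opinion(
        belief=(a.belief * b.uncertainty + b.belief * a.uncertainty) / k,
        disbelief=(a.disbelief * b.uncertainty + b.disbelief * a.uncertainty) / k,
        uncertainty=(a.uncertainty * b.uncertainty) / k,
    )


# Hypothetical opinions from two information sources.
first = Opinion(0.8, 0.1, 0.1)
second = Opinion(0.6, 0.2, 0.2)
combined = consensus(first, second)
print(tuple(round(x, 3) for x in combined))
```

The combined uncertainty is lower than either input, which matches the intuition that agreeing evidence from independent sources should increase confidence.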

Story 2: Identity

Facts
• Many social networks are online, e.g. coauthor (DBLP), knows (FOAF), colleague.
• Different networks adopt different identities.
• Each network may be poorly connected or quite small, but what if we connected them?
• One identity may be shared by multiple persons, by mistake or by nature.
• Identity mapping is m:n.
Questions
• Can we determine the certainty of an identity?
• How do we map identities?
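A deliberately simplified sketch of identity mapping follows. It assumes each identity record carries some identifying key (for instance a hashed mailbox in a FOAF profile) and merges identities that share a key using union-find. It treats every shared key as certain evidence, so it does not address the m:n case or the certainty question raised above; all identifiers and keys are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def merge_identities(records: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Cluster identities that share at least one key, directly or transitively.

    records is an iterable of (identity, key) pairs.
    Returns sorted clusters of identities.
    """
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_with_key: Dict[str, str] = {}
    for identity, key in records:
        find(identity)  # register the identity even if it never merges
        if key in first_with_key:
            parent[find(identity)] = find(first_with_key[key])
        else:
            first_with_key[key] = identity

    clusters: Dict[str, List[str]] = {}
    for identity in list(parent):
        clusters.setdefault(find(identity), []).append(identity)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical identities from three networks, with hypothetical keys.
records = [
    ("dblp:a", "key1"),
    ("foaf:a", "key1"),
    ("foaf:a", "key2"),
    ("blog:a", "key2"),
    ("dblp:b", "key3"),
]
clusters = merge_identities(records)
print(clusters)
```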

Story3: Knowledge Fusion

Facts
• We can fuse information about a person from multiple FOAF files.
• Some statements are confirmed by many people.
• We can build a model that has multiple provenance.
Questions
• How can provenance information be used to assure the receiver?
• What if Dr. Joshi wants to determine his trust in the ontology created by Dr. Amit Sheth?
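Fusion with multiple provenance can be sketched as follows, assuming each FOAF file has already been parsed into (subject, predicate, object) triples. The `fuse` helper, the example.org file URLs, and the `ex:` identifiers are hypothetical; only the bookkeeping is shown, not RDF parsing.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def fuse(documents: Dict[str, Iterable[Triple]]) -> Dict[Triple, Set[str]]:
    """Merge triples from several documents, recording which documents assert each one."""
    provenance: Dict[Triple, Set[str]] = defaultdict(set)
    for source, triples in documents.items():
        for triple in triples:
            provenance[triple].add(source)
    return dict(provenance)


# Hypothetical FOAF files describing the same person.
documents = {
    "http://example.org/a/foaf.rdf": [
        ("ex:person1", "foaf:name", "Pat Example"),
        ("ex:person1", "foaf:mbox", "mailto:pat@example.org"),
    ],
    "http://example.org/b/foaf.rdf": [
        ("ex:person1", "foaf:name", "Pat Example"),
    ],
    "http://example.org/c/foaf.rdf": [
        ("ex:person1", "foaf:name", "Pat Example"),
        ("ex:person1", "foaf:knows", "ex:person2"),
    ],
}
fused = fuse(documents)
confirmations = {triple: len(sources) for triple, sources in fused.items()}
print(confirmations[("ex:person1", "foaf:name", "Pat Example")])
```

The per-statement source sets are what a receiver could inspect, or weight by trust, when deciding whether to accept a fused statement.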

Story 4: Justification Markup Language

Facts about distributed justification on the web (Semantic Web)
• Justification on the web may not always be formalized.
• Knowledge on the web can be objective (like a database) or subjective (like a joke or an estimate).
• Knowledge on the Semantic Web is inherently inconsistent.
• Determining what counts as adequate reasons is an obstacle to providing justification. This process of reason giving can be viewed as argumentation in four major forms: inductive, deductive, conclusive, and prima facie.
  • Inductive and deductive justification involve evidence and logical evaluation.
  • In a conclusive argument, reasons are analyzed by asking whether another rational human would hold the same belief given the same reasons.
  • Prima facie argumentation is a process of giving several reasons for believing something and choosing the most important one.
Questions
• How to represent the mixture of human inference, statistical information, and logical inference?
• Distributed justification: trust-based, case-based, logical-inference
Example: I will buy a new Honda Accord because
(1) [inductive] it is a good car, since 90% of related online comments are positive;
(2) [deductive] it has better gas mileage;
(3) [conclusive/mimic] my friend, who has similar taste to mine, would like to buy it;
(4) [prima facie] among all the factors that make me happy, buying a new car is the most important.
Solution
• A formal language to express logic-programming proof traces, e.g. PML
• We also need an informative language to express human justification
  • Express relations between statements: support, causal, critique
  • Log the decision process as a case for future sharing/recall/query
  • Cite a case or a previously used reason as proof of a new justification
