Web of Belief: Modeling and using Trust and Provenance in the Semantic Web Department of Computer Science and Electronic Engineering University of Maryland Baltimore County Li Ding Last updated:03/22/22
Dec 24, 2015
Department of Computer Science and Electronic Engineering, University of Maryland Baltimore County
Li Ding
Last updated: 04/19/23
Outline
Introduction
Thesis Statement
Research description
Research plan
Preliminary Work
The Web Of Belief Framework
Evaluation
Contributions to computer science
Thesis Schedule
Motivation
The growing body of the Semantic Web
Observations
  Information
    More data encoded in Semantic Web languages from many sources
    Ontologies in various dialects
  Information is managed by a two-layer mechanism in terms of “document, ontology, namespace, term”
    Physical layer: the web of Semantic Web documents
    Logical layer: the RDF graph
  More Semantic Web tools
Driving forces
  Industrial: weblogs, RSS, social network websites
  Academic: research projects
Motivation (cont’d)
The Semantic Web has not yet become a real-world “KB”
Credibility & consistency
  Facts are provided by many sources without guarantee
Scalability
  Data exists in vast amounts
  Data is stored in an open and distributed context
Utility
  Data is fragmented
  Bad URI references to resources and namespaces in the web of documents
  Lack of associations in the RDF graph
Motivation (cont’d)
Why provenance and trust?
Important concepts borrowed from the human world
  Multi-disciplinary origins: sociology, epistemology, psychology
  The foundation of knowledge management and inference
Keys to credibility assessment and justification
  Empirical heuristics: a complementary method for reasoning over credibility directly when domain knowledge is absent
  Explicit representation of the justification trace
  Good heuristics for resolving inconsistency
Keys to effectiveness and efficiency
  Knowledge can be managed by provenance as well as by topic
  Trust reduces search complexity
Thesis Statement
This dissertation shows that our Web Of Belief framework, a provenance and trust aware inference framework, is critical and effective in deriving answers with credibility assessment and justification across the open, distributed, and large scale online knowledge base provided by the Semantic Web.
General Description
Goal: model and use provenance and trust in the SW
• to enable a credible “world KB”
• to enable a trust layer in the Semantic Web
Representation
  Encode provenance and trust
  Represent the SW as a KB
Management
  Acquisition & digest
  Data access interface
  Inference space expansion
Inference
  Hypothesis test
  Trust network computation
  Statement credibility
  Justification
Ontology dictionary
  Term definition
  Class tree
The Infrastructure of the Semantic Web
[Figure: layered diagram. Layers: Applications; Computing Services; Data Service; Directory/Digest Service. Components: SW data service, SW service finder, SW data finder, reputation service, web entity directory, database, (Web) document, RDF document. Edge labels: digests, searches, uses.]
Assumptions
Propositional knowledge (facts)
Uncertain knowledge with provenance
Open and distributed knowledge storage
Relationship to Other Work
Representation
  Logical formalisms of agent model (AI)
  Truth theory (epistemology)
  Provenance
Data access
  Collaborative KB in an open, distributed context (DB)
Learning
  Learning agent models: knowledge and behavior (social learning & psychology)
Inference
  Reasoning over uncertain knowledge (reasoning)
Logical Formalisms
Modal logic: logically formalize the agent
  Agent & action (McCarthy, 1969; Kanger-Porn-Lindahl)
  Agent & belief and intention (Cohen and Levesque, 1990)
  Agent & knowledge (epistemic logic)
  Agent & belief (doxastic logic)
  Agent & obligation (deontic logic)
Other logical formalisms for trust and belief
  Regan’s formal framework for belief and trust
  Josang’s subjective logic
  Abdul-Rahman’s social trust model
  Jones and Firozabadi’s integrated logic model of trust
Learning Agent models
Objects to be learned
  Domain trust
  Referral trust
Methods
  Histogram
  Feedback based
Reason over uncertain knowledge
Quantitative approach
  Certainty factors: MYCIN (Shortliffe, 1976), an obsolete heuristic similar to the fuzzy approach
  Possibility theory: fuzzy logic (Zadeh, 1965; 1976)
  Dempster-Shafer theory (Dempster, 1968; Shafer, 1976)
  Subjective logic
  Probabilistic theory: Bayesian network (Pearl, 1982)
Qualitative approach
  Non-monotonic logic
Two level data access
Datalog
Logical level
  RDF data access language (with provenance)
    Quads
    TriQL
    SPARQL
Storage level
  Centralized
    Triplestore
    Kowari
  Decentralized
    Search engine?
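The "Quads" item can be illustrated with a small sketch: a quad is a triple plus the source that asserts it, so a query can filter on, or return, provenance. This is only an illustration in plain Python, not TriQL or SPARQL; the names `Quad`, `match`, and `sources_of` and the example URIs are assumptions made for this sketch.

```python
from collections import namedtuple

# A quad is an RDF triple plus the source (document or agent) asserting it.
# Quad, match and sources_of are hypothetical names used only in this sketch.
Quad = namedtuple("Quad", ["subject", "predicate", "object", "source"])


def match(quads, subject=None, predicate=None, obj=None, source=None):
    """Return the quads that fit the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj, source)
    return [
        q for q in quads
        if all(p is None or p == v for p, v in zip(pattern, q))
    ]


def sources_of(quads, subject, predicate, obj):
    """Answer 'who said so?' for a single triple."""
    return sorted({q.source for q in match(quads, subject, predicate, obj)})


# Made-up example data.
QUADS = [
    Quad("ex:alice", "foaf:knows", "ex:bob", "http://example.org/alice.rdf"),
    Quad("ex:alice", "foaf:knows", "ex:bob", "http://example.org/bob.rdf"),
    Quad("ex:alice", "foaf:name", "Alice", "http://example.org/alice.rdf"),
]
```

Calling `sources_of(QUADS, "ex:alice", "foaf:knows", "ex:bob")` lists both documents that assert the triple, which is the provenance question a plain triple query cannot answer.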
Example walkthrough
Given a hypothesis/query in the form of a collection of RDF statements, with or without variables
Provenance
  Where can I find them?
  Where are the definitions for each term?
  Belief(agent, fact): who said or asserted so?
  Justify(fact, fact):
Trust
  Can I believe them and thus use them in decision making?
  How do I trust the other agents?
Relationship to Other Work
Representation
  Agent, knowledge
  Provenance
  Trust
Data access
  Metadata
  RDF query language
    Pattern extraction
    Transitive closure
  RDF storage
Inference
  Trust network inference
  Credibility
    Probabilistic inference
  Scalability
    Domain filter
    Social filter
Semantic Web
Approach: the WOB framework
Representation
  WOB ontology
    Model provenance and trust in the Semantic Web
    Explicitly represent the Semantic Web
    Represent the SW as a KB in terms of “agent, statement, association”
Management
  Provenance-aware data access language
  Social network extraction and integration
  Provenance and trust based knowledge base expansion
Inference
  Hypothesis credibility assessment
    Trust network inference
    Provenance and trust based belief evaluation
    Explicit justification
Ontology dictionary
Research Methodology
Identify real-world problems with examples
Approach problems
  Formalize the problem
  Position the problem in the literature, and find related work
  Find issues to be resolved
  Design and implement solutions
Evaluation methods
  Statistics
  Project application
  Survey
Artifacts to be produced
[Data] Web Of Belief Ontology
[System] Swoogle metadata and search service
[System] Ontology dictionary
[Data] Swoogle statistics
[System] SemDis trust layer
[Algorithm] Trust based belief evaluation
[Algorithm] Trust based knowledge expansion
WebOfBelief Ontology
Ontology
  Entity: Document, Statement, Reference, Agent, Association
  Sub-classes: trust, belief, justification, dependency
Facets
  Confidence (conditional probability)
  Connective (semantics)
Provenance
  (Agent-Document) ownership/authorship
  (Agent-Reference) belief
  (Reference-Reference) justification
  (doc-doc) dependency
Logical Formalisms
[Figure: Web Of Belief (WOB) Conceptual Framework (v0.92). Classes: Reference, foaf:Agent, rdf:Statement, foaf:Document, rdf:Resource, Association (Justification, Trust, Belief, Dependency), AssociationConnective. Properties and edge labels: selects, contains, foaf:page, source, dc:creator, confidence (xsd:real [0,1]), connective. Connective values: wob:believe, wob:disbelieve, wob:nonbelieve; wob:support, wob:weaken, wob:cause, wob:imply; wob:truthful, wob:wise, wob:knowledgeable, wob:cooperative; wob:imports, wob:priorVersion.]
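The association part of the framework (a kind, a connective, and a confidence in [0, 1]) can be sketched as a small record type. This is a hypothetical Python rendering, not the WOB ontology itself: the class name `Association`, the field names, and in particular the grouping of connectives under each association kind are assumptions read off the diagram labels.

```python
from dataclasses import dataclass

# Assumed grouping of the wob: connectives by association kind.
# The slides list the connective values but do not state this mapping.
ALLOWED_CONNECTIVES = {
    "belief": {"wob:believe", "wob:disbelieve", "wob:nonbelieve"},
    "justification": {"wob:support", "wob:weaken", "wob:cause", "wob:imply"},
    "trust": {"wob:truthful", "wob:wise", "wob:knowledgeable",
              "wob:cooperative"},
    "dependency": {"wob:imports", "wob:priorVersion"},
}


@dataclass(frozen=True)
class Association:
    """One association: source --connective(confidence)--> target."""
    kind: str          # belief, justification, trust or dependency
    source: str        # e.g. an agent, a reference or a document URI
    target: str
    connective: str
    confidence: float  # interpreted as a value in [0, 1]

    def __post_init__(self):
        if self.kind not in ALLOWED_CONNECTIVES:
            raise ValueError(f"unknown association kind: {self.kind}")
        if self.connective not in ALLOWED_CONNECTIVES[self.kind]:
            raise ValueError(f"{self.connective} is not a {self.kind} connective")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


# Made-up example: an agent believes a referenced statement with confidence 0.9.
example = Association("belief", "ex:alice", "ex:ref1", "wob:believe", 0.9)
```

Validating at construction time keeps ill-formed associations (for example a trust connective on a belief) out of any later inference step.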
Credibility Assessment
Trust network inference
  Given a trust network, how to propagate trust so as to evaluate trust between any two agents
Trust and provenance based statement evaluation
Explicit justification
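To make the trust network inference question concrete, here is a minimal sketch of one possible propagation rule: the derived trust between two agents is the best product of direct trust values along any path. The slides do not commit to this rule; it is an assumption chosen for illustration, as are the function name `propagate_trust` and the example network.

```python
import heapq


def propagate_trust(network, source, target):
    """Best-path trust from source to target.

    network maps an agent to {neighbor: direct trust in [0, 1]}.
    Path trust is the product of its edge weights; because weights never
    exceed 1, a Dijkstra-style search on the largest product is valid.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated trust
    while heap:
        negated, agent = heapq.heappop(heap)
        trust = -negated
        if agent == target:
            return trust
        if trust < best.get(agent, 0.0):
            continue  # stale heap entry
        for neighbor, weight in network.get(agent, {}).items():
            candidate = trust * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return 0.0  # no trust path exists


# Made-up trust network.
NETWORK = {
    "alice": {"bob": 0.9, "carol": 0.5},
    "bob": {"dave": 0.8},
    "carol": {"dave": 0.9},
}
```

With this network, the path through bob (0.9 × 0.8) beats the path through carol (0.5 × 0.9), so alice's derived trust in dave is about 0.72. Other rules, such as averaging over all paths, would give different values.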
Application
Trust based belief evaluation
Trust and provenance aware inference
Hypothesis testing and justification
Evaluation
Validate derived trust relations: survey users
Validate performance of WOB inference
  Compare results with and without trust & provenance
Validate application utility: customer report
Contributions
A practical framework that makes the Semantic Web a KB
  The Web of Belief ontology
Semantic Web data digest service
  Search and browse mechanisms for the SW
  Support of RDF data access language?
Inference
  Judge information trustworthiness
The first work in characterizing trust and provenance aware distributed inference on the Semantic Web
Dissertation schedule
Measures
  Size of data that can be handled
  Size of trust network
Milestones
  Half-way finished
An outline of the Semantic Web
[Figure: diagram of the Semantic Web and its services. Actors: SW user (heuristic search, flexible query), SW intelligent user, SW composer (compose; rich information, text). Data: SW file, SW digest. Services: SW services, SW data service (information protection), SW service finder, SW data finder, reputation service, digest/search service, inference service. Inference: derive trust, belief fusion, justification. Representation: belief, trust; policy, rule. Related areas: P2P, possibility theory, trust, belief theory, Semantic Web.]
Find Washington population
Which “Washington” do you mean?
Sure! The following SWDs/agents know that
Here are the certainty/trustworthiness values for each unique answer
Oh yeah! Answer X is credible because it comes from a government website
Sorry, I don’t have it. Do you want the US population?
[Figure labels: disambiguation, SW digest, inference]
Associations
  Belief: who knows what?
Trusting provenance
• Credential based trust
• Reputation based trust
• Context/role based trust
Trusting content
• Consensus
• Context axioms
RDF reference
  How to refer to part of an RDF graph
Trust network
Justification
  Rule represents hypothesis
  Justification instantiates rule
Fill an RDF template
  Show me the complete definition of class X
Trust network discovery
Uncertainty and precision
An example
Expected Contributions
Framework
  Features for characterizing the Semantic Web
  A Web of Belief ontology to connect the Semantic Web
    Association/annotation
    Query language or data access language?
Mechanisms
  Search/browse Semantic Web documents
  Judge information trustworthiness
Applications
  Swoogle
  SemDis
1. Web of Belief: represent the SW
Build an abstract view of the Semantic Web
Select features to characterize it
  Overall features: timeline, category
  Different levels: term, document, network
  Different classes: Entity, Association
  Different semantics: meta-ontology, domain ontology
Build the Web of Belief ontology for explicit representation
Ontology, Document, Namespace, and Term
[Figure: entity-relationship diagram. Entities: Term, Document (SWDB, Ontology), Namespace, Local name. Relations: contains (m:n), uses (n:1), defines (m:n), defines (m:n), hasName (n:1).]
Swoogle Search & Browse (1/3)
An abstract view of the Semantic Web
[Figure: three-level view of the Semantic Web. Network level: Semantic Web, RDF database. Document level: Document (NSWD, SWD; SW ontology, SW database; Java source), with doc-doc association. RDF node level: RDF Node (Resource: class, property, ID, non-ID; Literal), with node-node and node-doc associations.]
2. Swoogle: index service for the SW
Even when knowledge is available online, a portal data digest service is needed to facilitate data access
RDF digest
  Meta level (uses RDF/OWL semantics)
  Domain level (uses domain semantics)
RDF query
  Document
  Term
  Literal (name, identifier)
Dictionaries
  Term/ontology dictionary
  Web entity dictionary
Association / Feature
  node-node: term definition; class-property (ontological, empirical); meta association, e.g. rdfs:subClassOf, rdfs:domain
  node-doc: (resource, doc, #subject, #property, #object, #subject-type-X, #X-type-object); (literal, doc, predicate)
  doc-doc: meta association, e.g. owl:imports; namespace co-occurrence
[Figure: ontological vs. empirical class-property (c-p) definition and ontological annotation. Nodes: C, MetaC, o1, I, P1, P2, P3. Edge labels: rdf:type, rdf:domain, rdf:range.]
Story 1: Big RDF file & P2P
Facts
  WordNet has published its ontology as a 60M DAML file, which Jena fails to load into memory.
  Most people use ontologies as annotation for exported data (Stefan Decker argued this at the WWW2004 Dev Day).
  Querying RDF should be tractable (Ian Horrocks, Andy Seaborne), i.e. we need to balance the tractability and the expressiveness of a query.
  The query result for a graph pattern (with variables) can be of three types: a subgraph, the variable bindings, or a maximal subgraph.
  Provenance information mainly ranges over agents (person, organization, website), i.e. an agent’s belief.
Questions
  Is it appropriate to say an RDF model is an RDF file? If not, how do we describe a distributed RDF model?
  Will there be any very big RDF files? Why? Can we store RDF in small files distributed throughout the world?
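The "variable binding" result type for a graph pattern can be illustrated with a naive matcher over an in-memory list of triples. This is a sketch under simple assumptions (variables are strings starting with `?`, no inference, no indexing); the helper names `is_var`, `match_triple`, and `query` are hypothetical and the data is made up.

```python
def is_var(term):
    """Variables are written SPARQL-style, e.g. '?x' (an assumed convention)."""
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, triple, binding):
    """Extend binding so pattern matches triple, or return None on conflict."""
    extended = dict(binding)
    for p, value in zip(pattern, triple):
        if is_var(p):
            if extended.setdefault(p, value) != value:
                return None  # variable already bound to something else
        elif p != value:
            return None      # constant does not match
    return extended


def query(graph, patterns):
    """Return every variable binding that satisfies all triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in graph:
                extended = match_triple(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


GRAPH = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:alice", "foaf:name", "Alice"),
]

# Friend-of-a-friend pattern: ?x knows ?y, and ?y knows ?z.
RESULT = query(GRAPH, [("?x", "foaf:knows", "?y"), ("?y", "foaf:knows", "?z")])
```

The cost of this matcher grows with the product of graph size and the number of partial bindings, which gives a feel for why the slide stresses balancing tractability against expressiveness. The "subgraph" result type would be recovered by substituting each binding back into the patterns.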
3. SemDis: How to judge information trustworthiness?
Granularity
  rdf:Statement
  SWD
  Information source (agent, website)
  Topic
Association
  Social network (FOAF)
  Belief, authorship (foaf:maker)
  Justification
Trust computation
  Ranking
  Network
  Consensus
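One way to read the "Ranking" item is a random-surfer computation over the network of information sources, in the style of PageRank. The sketch below is an assumption about what such a ranking could look like, not the SemDis algorithm; `surfer_rank`, the damping value, and the example links are all illustrative.

```python
def surfer_rank(links, damping=0.85, iterations=50):
    """Rank nodes by a random surfer who follows links or jumps at random.

    links maps a node to the list of nodes it points to (e.g. whom it
    cites or trusts). Returns node -> score; the scores sum to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Random jump: every node receives the same base share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its score evenly over all nodes.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank


# Made-up network: a and b both point to c, and c points back to a.
RANKS = surfer_rank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy network c ranks highest because two sources point to it, a comes next because the highly ranked c points to it, and b, which nobody points to, ranks lowest.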
Practice of Trust
Fields
  Weblog: FOAF, RSS
  Online social network: DBLP, FOAF, Google
Applications
  Manipulate precision
    Disambiguation: specialize knowledge
    Privacy protection: generalize knowledge
  Manipulate completeness: fuse knowledge
Algorithms
  Trust propagation algorithm: surfer model, flow model
  Belief merging algorithm
Given a new statement
  Reasoning: what is its trustworthiness, given opinions on it from some information sources? (subjective logic, fuzzy cognitive map)
  Justification: how to find evidence to support or weaken it? (Web of Belief ontology for annotation)
Given a question
  Search: effective and efficient in an open environment (RDF digest, bounded search with trust heuristic)
Given online multi-networks
  Social relations among information sources (FOAF)
  Ontological relations among topics (sub-topic)
  Web entity identification and mapping
Emergence model
  How can these really affect Semantic Web research?
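For the reasoning question (trustworthiness of a statement given opinions from several sources), subjective logic offers a consensus operator that merges two opinions. The sketch below assumes the usual (belief, disbelief, uncertainty) triple with components summing to 1 and the standard form of the operator; the class and function names are hypothetical, and the slides do not say which operator would be used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """A subjective-logic opinion; the three components should sum to 1."""
    belief: float
    disbelief: float
    uncertainty: float


def consensus(a, b):
    """Merge two independent opinions about the same statement.

    Each opinion is weighted by the other's uncertainty, so the less
    uncertain source has more influence on the result.
    """
    k = a.uncertainty + b.uncertainty - a.uncertainty * b.uncertainty
    if k == 0:
        # Both opinions are dogmatic (zero uncertainty); this simple form
        # of the operator is undefined in that case.
        raise ValueError("cannot merge two opinions with zero uncertainty")
    return Opinion(
        belief=(a.belief * b.uncertainty + b.belief * a.uncertainty) / k,
        disbelief=(a.disbelief * b.uncertainty + b.disbelief * a.uncertainty) / k,
        uncertainty=(a.uncertainty * b.uncertainty) / k,
    )


# Made-up opinions from two information sources about one statement.
MERGED = consensus(Opinion(0.6, 0.2, 0.2), Opinion(0.4, 0.2, 0.4))
```

The merged opinion is less uncertain than either input, which matches the intuition that agreement among independent sources should increase confidence.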
Story 2: Identity
Facts
  There are many social networks online, e.g. coauthor (DBLP), knows (FOAF), colleague.
  Different networks adopt different identities.
  Each network might be poorly connected or quite small, but what if we connected them?
  One identity may be shared by multiple persons, by mistake or by nature.
  Identity mapping is m:n.
Questions
  Can we determine the certainty of an identity?
  How do we map identities?
Story3: Knowledge Fusion
Facts
  We can fuse person information from multiple FOAF files.
  Some statements are confirmed by many people.
  We can build a model that has multiple provenance.
Questions
  How can provenance information be used to assure the receiver?
  What if Dr. Joshi wants to determine his trust in the ontology created by Dr. Amit Sheth?
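The fusion idea can be sketched as follows: merge the triples from several FOAF files while remembering which sources assert each one, then turn the receiver's trust in those sources into a credibility score. The noisy-OR combination used here is an assumption for illustration (it treats sources as independent) and is not the method proposed in the slides; all names and data are made up.

```python
from collections import defaultdict


def fuse(documents):
    """Merge triples from many sources, keeping multiple provenance.

    documents maps a source (e.g. a FOAF file URL) to its triples.
    Returns a dict: triple -> set of sources asserting it.
    """
    fused = defaultdict(set)
    for source, triples in documents.items():
        for triple in triples:
            fused[triple].add(source)
    return dict(fused)


def credibility(sources, trust):
    """Noisy-OR: the statement is doubted only if every source is wrong.

    trust maps a source to the receiver's trust in it, in [0, 1];
    unknown sources count as 0. Assumes the sources are independent.
    """
    doubt = 1.0
    for source in sources:
        doubt *= 1.0 - trust.get(source, 0.0)
    return 1.0 - doubt


DOCS = {
    "http://example.org/a.rdf": [("ex:p1", "foaf:name", "Pat")],
    "http://example.org/b.rdf": [("ex:p1", "foaf:name", "Pat"),
                                 ("ex:p1", "foaf:age", "30")],
}
FUSED = fuse(DOCS)
TRUST = {"http://example.org/a.rdf": 0.5, "http://example.org/b.rdf": 0.5}
NAME_SCORE = credibility(FUSED[("ex:p1", "foaf:name", "Pat")], TRUST)
```

Here the name statement, confirmed by two half-trusted sources, scores 0.75, while the age statement, asserted by only one of them, scores 0.5.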
Story 4: Justification Markup Language
Facts about distributed justification on the web (Semantic Web)
  Justification on the web may not always be formalized.
  Knowledge on the web can be objective (like a database) or subjective (like a joke or an estimate).
  Knowledge on the Semantic Web is inherently inconsistent.
  Determining what counts as adequate reasons is an obstacle to providing justification. This process of reason giving can be viewed as argumentation in four major forms: inductive, deductive, conclusive, and prima facie.
    Inductive and deductive justification involve evidence and logical evaluation.
    In a conclusive argument, reasons are analyzed by asking whether another rational human would hold the same belief given the same reasons.
    Prima facie argumentation is a process of giving several reasons for believing something and choosing the most important one.
Questions
  How to represent the mixture of human inference, statistical information, and logical inference?
  Distributed justification: trust-based, case-based, logical-inference
Example: I will buy a new Honda Accord because (1) [inductive] it is a good car, since 90% of related online comments are positive; (2) [deductive] it has better gas mileage; (3) [conclusive/mimic] my friend, who has similar taste to mine, would like to buy it; (4) [prima facie] among all factors that make me happy, buying a new car is the most important.
Solution
  A formal language to express logic programming proof traces, e.g. PML
  We also need an informative language to express human justification
    Express relations between statements: support, causal, critique
    Log the decision process as a case for future sharing/recall/query
    Cite a case or a previously used reason as proof of a new justification