Trust Analysis on Heterogeneous Networks Manish Gupta 17 Mar 2011

Feb 24, 2016

Page 1: Trust  Analysis on Heterogeneous  Networks

Trust Analysis on Heterogeneous Networks

Manish Gupta, 17 Mar 2011

Page 2: Trust  Analysis on Heterogeneous  Networks
Page 3: Trust  Analysis on Heterogeneous  Networks

Survey Roadmap

• Basic Iterative Fact Finder Models
• Extensions to Basic Fact Finder Models
• Source Copying Detection
• Trust Analysis for Homogeneous Networks
• Trust Metrics
• Trust Analysis using Logic
• Applications of Trust Analysis Models
• Conclusion

Page 4: Trust  Analysis on Heterogeneous  Networks

Basic Iterative Fact Finder Models

Three components of the model:
– Trustworthiness of provider t(p)
– Confidence of fact s(f)
– Implications between facts imp(f, f’)

Yin et al. TKDE 2008
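
The three components above can be tied together in a short iteration. The following is a minimal sketch, not the authors' implementation: the update rules follow the general shape usually attributed to TruthFinder (log-transformed trustworthiness, implication-adjusted confidence, logistic squashing), and the function name `truthfinder`, the defaults `t0`, `rho`, `gamma`, and the toy data are illustrative assumptions.

```python
import math

def truthfinder(claims, implications=None, t0=0.9, rho=0.5, gamma=0.3, iters=20):
    """Simplified TruthFinder-style iteration (illustrative sketch).

    claims:       dict provider -> non-empty set of facts it asserts
    implications: dict (f_other, f) -> imp(f_other, f) in [-1, 1]
    Returns (t, s): provider trustworthiness t(p) and fact confidence s(f).
    """
    implications = implications or {}
    facts = {f for fs in claims.values() for f in fs}
    t = {p: t0 for p in claims}
    s = {f: 0.0 for f in facts}
    for _ in range(iters):
        # Trustworthiness score tau(p) = -ln(1 - t(p)); clamp to avoid log(0).
        tau = {p: -math.log(max(1.0 - t[p], 1e-12)) for p in claims}
        # Raw confidence score: sum of tau over the providers of each fact.
        sigma = {f: sum(tau[p] for p, fs in claims.items() if f in fs) for f in facts}
        # Adjust by implications from other facts, damped by rho.
        adjusted = {
            f: sigma[f] + rho * sum(w * sigma.get(g, 0.0)
                                    for (g, h), w in implications.items() if h == f)
            for f in facts
        }
        # Squash to a confidence s(f) in (0, 1) with a logistic function.
        s = {f: 1.0 / (1.0 + math.exp(-gamma * adjusted[f])) for f in facts}
        # Provider trustworthiness is the mean confidence of its facts.
        t = {p: sum(s[f] for f in fs) / len(fs) for p, fs in claims.items()}
    return t, s

# Toy data (hypothetical): a1 and a2 are conflicting values for the same object.
claims = {"p1": {"a1", "b1"}, "p2": {"a1", "b1"}, "p3": {"a2", "b1"}}
conflict = {("a1", "a2"): -1.0, ("a2", "a1"): -1.0}
t, s = truthfinder(claims, conflict)
```

In this toy run the fact backed by two providers ends up with higher confidence than its conflicting alternative, and the providers that assert it end up more trustworthy.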

Page 5: Trust  Analysis on Heterogeneous  Networks

Basic Iterative Fact Finder Models

• Sums (Hubs and Authorities)

• Average.Log

• Investment

• Pooled Investment

Pasternack et al. COLING 2010
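
As a rough illustration of the first two models listed above, the sketch below alternates between source trust and claim belief. It assumes the commonly stated update rules: for Sums, T(s) is the sum of B(c) over the claims of s and B(c) is the sum of T(s) over the sources of c; for Average.Log, T(s) is log|C_s| times the mean belief of the claims of s. The function name `fact_finder`, the max-normalization step, and the toy data are assumptions made for this example.

```python
import math

def fact_finder(claims, average_log=False, iters=20):
    """Sums (Hubs and Authorities) or Average.Log over a source-claim graph.

    claims: dict source -> non-empty set of claims it asserts
    Returns (trust, belief), each normalized by its maximum value.
    """
    all_claims = {c for cs in claims.values() for c in cs}
    belief = {c: 1.0 for c in all_claims}
    trust = {s: 1.0 for s in claims}
    for _ in range(iters):
        # Trust update: sum of beliefs (Sums) or log-scaled mean (Average.Log).
        trust = {}
        for s, cs in claims.items():
            total = sum(belief[c] for c in cs)
            trust[s] = math.log(len(cs)) * total / len(cs) if average_log else total
        m = max(trust.values()) or 1.0  # guard against an all-zero vector
        trust = {s: v / m for s, v in trust.items()}
        # Belief update: sum of the trust of every source asserting the claim.
        belief = {c: sum(trust[s] for s, cs in claims.items() if c in cs)
                  for c in all_claims}
        m = max(belief.values()) or 1.0
        belief = {c: v / m for c, v in belief.items()}
    return trust, belief

# Toy data (hypothetical): two sources agree on c1 and c2, one source asserts c3.
claims = {"s1": {"c1", "c2"}, "s2": {"c1", "c2"}, "s3": {"c3"}}
trust, belief = fact_finder(claims)
trust_al, belief_al = fact_finder(claims, average_log=True)
```

Note that Average.Log gives zero trust to a source with a single claim, since log(1) = 0; this is a property of the rule as sketched here, not an error in the code.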

Page 6: Trust  Analysis on Heterogeneous  Networks

Extensions to Basic Fact Finder Models (Incorporating Hardness of Facts)

• A correct answer to an easy question should earn a source less trust than a correct answer to a hard one.

• Three iterative approaches: Cosine, 2-Estimates, 3-Estimates

• 3-Estimates iteratively computes three quantities: the hardness of each fact (the propensity of sources to be wrong on it), the confidence that the fact is true, and the error (1 - trustworthiness) of each source.

Galland et al. WSDM 2010

Page 7: Trust  Analysis on Heterogeneous  Networks

Extensions to Basic Fact Finder Models (Probabilistic Assertions)

• Sources may provide facts with some uncertainty.
• Rather than an unweighted provider-fact network, one can consider a k-partite weighted graph.
• The weight on a source-claim edge depends on the probability that the source asserted the claim according to the information extractor, the certainty the source expressed in the claim, and the similarity among claims.
• Sums, Average.Log, Investment and TruthFinder are rewritten to incorporate edge weights.
• Introduces the notion of a layered model, which consists of multiple layers rather than just two.

Pasternack et al. WWW 2011
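
To make the edge-weight idea concrete, here is a minimal sketch of a weighted variant of Sums in which every source-claim edge carries a weight w(s, c) in [0, 1] and each update multiplies by that weight. The function name `weighted_sums`, the max-normalization, and the example weights are assumptions for illustration; how the weights are derived from extractor probability, source certainty, and claim similarity is not modeled here.

```python
def weighted_sums(weights, iters=20):
    """Sums fact finder over a weighted source-claim graph (illustrative sketch).

    weights: dict (source, claim) -> edge weight in [0, 1]
    Returns (trust, belief), each normalized by its maximum value.
    """
    sources = {s for s, _ in weights}
    claims = {c for _, c in weights}
    trust = {s: 1.0 for s in sources}
    belief = {c: 1.0 for c in claims}
    for _ in range(iters):
        # T(s) = sum over claims of w(s, c) * B(c)
        trust = {s: 0.0 for s in sources}
        for (s, c), w in weights.items():
            trust[s] += w * belief[c]
        m = max(trust.values()) or 1.0  # guard against an all-zero vector
        trust = {s: v / m for s, v in trust.items()}
        # B(c) = sum over sources of w(s, c) * T(s)
        belief = {c: 0.0 for c in claims}
        for (s, c), w in weights.items():
            belief[c] += w * trust[s]
        m = max(belief.values()) or 1.0
        belief = {c: v / m for c, v in belief.items()}
    return trust, belief

# Toy weights (hypothetical): s2 asserts c1 confidently and c2 only weakly.
weights = {("s1", "c1"): 1.0, ("s2", "c1"): 0.9,
           ("s2", "c2"): 0.1, ("s3", "c2"): 0.4}
trust, belief = weighted_sums(weights)
```

Setting every weight to 1 recovers the unweighted update, which is one way to sanity-check the sketch.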

Page 8: Trust  Analysis on Heterogeneous  Networks

Extensions to Basic Fact Finder Models (Cluster-based Fact Finding)

• Providers perform better in their areas of focus.
• Basic Cluster-based Fact Finder (BCFF)
  – Computes object-conditional provider trust.
  – Clusters objects using K-means over object-conditional trust vectors.
• Advanced Cluster-based Fact Finder (ACFF)
  – Starts with an initial clustering using BCFF.
  – Iteratively:
    • Performs cluster-conditional trust analysis using the current clustering.
    • Refines the clusters using the analysis obtained on the previous set of clusters.
  – Smooths the cluster-based fact confidence scores using the global computations.

Gupta et al. WWW 2011

Page 9: Trust  Analysis on Heterogeneous  Networks

Extensions to Basic Fact Finder Models (Using Common-Sense Reasoning)

• Fact finders should be able to incorporate common sense knowledge.

• Common sense knowledge is expressed using first order logic rules and then transformed into a tractable linear program.

• This linear program constrains the claim beliefs produced by a fact finder, ensuring that the belief state is consistent with both common-sense and known facts.

• Iterative framework:
  – Compute trustworthiness values using confidence of facts.
  – Update confidence of facts based on trustworthiness of providers.
  – Correct confidence of facts using the linear program.

Pasternack et al. COLING 2010

Page 10: Trust  Analysis on Heterogeneous  Networks

Source Copying Detection (Overview)

• Sources copy from each other.
• The higher the similarity between two data sources, the more likely it is that one depends on the other.
• Direction of dependence: a data source whose different subsets of data show different properties (e.g., accuracy, average rating) is more likely to be the dependent one.
• Two different settings: static and dynamic.
• Complex copying relationships: co-copying, transitive copying, copying from multiple sources, lazy copying.

Berti-Equille et al. CIDR 2009

Page 11: Trust  Analysis on Heterogeneous  Networks

Source Copying Detection (For Static Snapshots)

• Data sources that share common false values are much more likely to be dependent than data sources that share common true values.

• Bayesian analysis model (iterative solution)
  – Determine true values
  – Compute accuracy of sources
  – Discover dependence between every pair of sources
• Further extensions
  – Handle similarity between values when computing the confidence of a value
  – Remove the assumption that the false values of an object are uniformly distributed
  – Incorporate the accuracy of a source when computing the dependency between a pair of sources

Dong et al. VLDB 2009
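
The intuition that shared false values are stronger evidence of copying than shared true values can be checked with a small Bayesian calculation. The sketch below is a simplified model under stated assumptions, not the paper's full algorithm: both sources have the same error rate, each object has `n_false` uniformly distributed false values, a copier copies each value with probability `copy_rate`, and `prior` is the prior probability of dependence. All parameter names and default values are illustrative.

```python
def dependence_probability(k_true, k_false, k_diff,
                           error=0.2, n_false=100, copy_rate=0.8, prior=0.2):
    """Posterior probability that two sources are dependent (illustrative sketch).

    k_true:  number of objects on which both sources give the same true value
    k_false: number of objects on which both give the same false value
    k_diff:  number of objects on which they give different values
    """
    # Per-object probabilities if the sources are independent.
    pt_ind = (1.0 - error) ** 2          # both happen to be right
    pf_ind = error ** 2 / n_false        # both pick the same false value
    pd_ind = 1.0 - pt_ind - pf_ind       # they disagree
    # Per-object probabilities if one source copies from the other.
    pt_dep = (1.0 - error) * copy_rate + pt_ind * (1.0 - copy_rate)
    pf_dep = error * copy_rate + pf_ind * (1.0 - copy_rate)
    pd_dep = pd_ind * (1.0 - copy_rate)
    # Likelihood of the observed counts under each hypothesis.
    like_ind = pt_ind ** k_true * pf_ind ** k_false * pd_ind ** k_diff
    like_dep = pt_dep ** k_true * pf_dep ** k_false * pd_dep ** k_diff
    # Bayes' rule.
    return prior * like_dep / (prior * like_dep + (1.0 - prior) * like_ind)

shared_false = dependence_probability(k_true=0, k_false=5, k_diff=0)
shared_true = dependence_probability(k_true=5, k_false=0, k_diff=0)
```

With these assumed parameters, five shared false values push the posterior close to 1, while five shared true values leave it below one half, because accurate independent sources are expected to agree on true values anyway.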

Page 12: Trust  Analysis on Heterogeneous  Networks

Source Copying Detection (Dynamic Scenarios)

• Both true values and copying relationships can evolve over time.

• Data sources that share common false values are much more likely to be dependent than data sources that share common recent or outdated true values.

• Data sources that perform the same updates in close enough time frame are more likely to be dependent, especially if the same update trace is rarely observed from other sources.

• HMM to detect copying dependence over time.
• Bayesian analysis model (iterative solution)
  – Compute CEF (coverage, exactness and freshness) for each source
  – Compute probability of copying between sources
  – Decide the life span of each object

Dong et al. VLDB 2009

Page 13: Trust  Analysis on Heterogeneous  Networks

Source Copying Detection (Complex Copying Relationships)

• Complex copying relationships exist, such as co-copying, transitive copying, copying from multiple sources and correlated copying.

• The first step locally decides the possibility of copying and the copying direction between each pair of sources, using the completeness, accuracy and formatting of data and correlated copying.

• The second step (a greedy algorithm) globally identifies co-copying, transitive copying, copying from multiple sources and correlated copying.

Dong et al. VLDB 2010

Page 14: Trust  Analysis on Heterogeneous  Networks

Source Copying Detection (Using Multiple Attributes)

• Data sources usually provide complex data, i.e. collections of tuples with many attributes. Different attributes may exhibit different evidence of dependence.

• They compute (i) probability that the observed properties of an object assume certain values (ii) accuracy of a provider with respect to each observed property.

• When performing copy detection, they combine information from multiple properties.

• They consider stock data with multiple attributes and show that some 3-attribute configurations perform better than 1-attribute configurations. But considering all 5 attributes results in lower accuracy.

Blanco et al. CAiSE 2010

Page 15: Trust  Analysis on Heterogeneous  Networks

Trust Analysis for Homogeneous Networks (SourceRank)

• Trust analysis for deep web (non-cooperative) sources.
• They build an agreement graph with sources as nodes and edge weights equal to the agreement between sources, computed from sample query results using partial queries. Source importance is computed using random walks.
• Agreement = similarity at the attribute value, tuple and answer set levels.
• Combat source collusion, detected from top-k answers to large-answer queries, by adjusting agreement by a factor of (1 - collusion).

Balakrishnan et al. WWW 2011
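
The random-walk step on the agreement graph can be sketched with plain power iteration. This is an illustrative approximation, not the SourceRank implementation: the function name `source_rank`, the uniform `smoothing` term, and the toy agreement and collusion values are assumptions, and the computation of agreement from sampled query results is not shown.

```python
def source_rank(agreement, collusion=None, smoothing=0.1, iters=100):
    """Stationary visit probabilities of a random walk on an agreement graph.

    agreement: dict (a, b) -> agreement of b with a, in [0, 1]
    collusion: optional dict (a, b) -> estimated collusion in [0, 1]
    """
    collusion = collusion or {}
    nodes = sorted({n for edge in agreement for n in edge})
    n = len(nodes)
    # Collusion-adjusted edge weights: agreement * (1 - collusion).
    w = {e: a * (1.0 - collusion.get(e, 0.0)) for e, a in agreement.items()}
    out = {a: sum(v for (x, _), v in w.items() if x == a) for a in nodes}
    rank = {a: 1.0 / n for a in nodes}
    for _ in range(iters):
        new = {a: 0.0 for a in nodes}
        for a in nodes:
            for b in nodes:
                if out[a] > 0:
                    # Follow an agreement edge, or jump uniformly (smoothing).
                    p = (1.0 - smoothing) * w.get((a, b), 0.0) / out[a] + smoothing / n
                else:
                    p = 1.0 / n  # no outgoing agreement: jump uniformly
                new[b] += rank[a] * p
        rank = new
    return rank

# Toy data (hypothetical): A, B and C agree with each other; D agrees with no one.
pairs = {("A", "B"): 0.9, ("A", "C"): 0.85, ("B", "C"): 0.8,
         ("A", "D"): 0.1, ("B", "D"): 0.1, ("C", "D"): 0.1}
agreement = {}
for (a, b), v in pairs.items():
    agreement[(a, b)] = agreement[(b, a)] = v
rank = source_rank(agreement)
# Treat the A-B agreement as fully explained by collusion.
rank_adj = source_rank(agreement, collusion={("A", "B"): 1.0, ("B", "A"): 1.0})
```

In the toy graph the isolated source D receives the lowest rank, and discounting the A-B agreement as collusion lowers the rank of A.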

Page 16: Trust  Analysis on Heterogeneous  Networks

Trust Analysis for Homogeneous Networks (Semi-supervised Truth Finder)

• Including some level of supervision can help guide the iterative fact finding algorithms in the right direction.

• The approach is based on three principles: (i) facts provided by the same data source should have similar confidence scores, (ii) similar (and therefore mutually supportive) facts should have similar confidence scores and (iii) if two facts are conflicting, they cannot be both true.

• These principles are encoded into a facts graph using appropriate edge weights.

• Truth discovery is then equivalent to solving an optimization problem that aims to assign scores to graph nodes that are consistent with the relationships indicated by the graph edges. This involves minimizing a convex function using an iterative algorithm which converges to an optimal solution.

Yin et al. WWW 2011
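
One way to picture score assignment on a facts graph is a label-propagation style iteration. The sketch below is a stand-in technique, not the paper's objective or algorithm: it assumes scores in [-1, 1], positive edge weights for supportive facts, negative weights for conflicting facts, and a quadratic objective whose fixed point sets each unlabeled score to the weighted mean of its neighbours (a Jacobi-style update). The function name `propagate_scores` and the toy graph are hypothetical.

```python
def propagate_scores(edges, labeled, iters=200):
    """Propagate confidence scores from labeled facts over a signed facts graph.

    edges:   dict (f, g) -> weight, one entry per undirected pair;
             positive = supportive or same source, negative = conflicting
    labeled: dict fact -> fixed score (e.g. 1.0 for a known true fact)
    """
    facts = set(labeled)
    for f, g in edges:
        facts.update((f, g))
    neighbors = {f: [] for f in facts}
    for (f, g), w in edges.items():
        neighbors[f].append((g, w))
        neighbors[g].append((f, w))
    score = {f: labeled.get(f, 0.0) for f in facts}
    for _ in range(iters):
        new = {}
        for f in facts:
            if f in labeled:
                new[f] = labeled[f]  # supervision: labeled scores stay fixed
                continue
            total = sum(abs(w) for _, w in neighbors[f])
            # Signed weighted mean of neighbour scores; conflicts flip the sign.
            new[f] = sum(w * score[g] for g, w in neighbors[f]) / total if total else 0.0
        score = new
    return score

# Toy graph (hypothetical): f2 supports f1, f3 conflicts with f1, f4 supports f3.
edges = {("f1", "f2"): 0.8, ("f1", "f3"): -1.0, ("f3", "f4"): 0.6}
score = propagate_scores(edges, labeled={"f1": 1.0})
```

With a single labeled true fact, the supportive neighbour is pulled toward a positive score while the conflicting fact and its supporter are pulled toward negative scores.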

Page 17: Trust  Analysis on Heterogeneous  Networks

Trust Analysis for Homogeneous Networks (Trust in presence of non-cooperative sources)

• All facts may not be available before performing trust analysis.
• Delay Tolerant Networks (DTNs): use of group information requires that nodes throughout the network be aware of the membership lists for all groups.
• MembersOnly collects group membership information from each node it meets and consolidates it on the fly.
• Nodes propagate group membership lists, only for groups of which they are members, to every contact they make.
• MembersOnly calculates the difference between the strength of the (sigmoid-transformed) positive evidence and the strength of the (sigmoid-transformed) negative evidence.

Nelson et al. CHANTS 2010

Page 18: Trust  Analysis on Heterogeneous  Networks

Trust Metrics

Multiple factors can determine trust:

• Topic/Cluster
• Context and criticality
• Popularity
• Perceived authority
• Direct experience
• Recommendation
• Related resources
• Provenance
• User expertise
• Bias
• Incentive
• Limited resources
• Agreement
• Specificity
• Likelihood
• Age
• Appearance
• Deception
• Recency

Castelfranchi et al. ICMAS 1998, Gil et al. Web Semantics 2007, Pasternack et al. ARL 2010

Page 19: Trust  Analysis on Heterogeneous  Networks

Trust Analysis using Logic

Augmenting the providers network with proof networks of premises, logic inference rules and conclusions.

[Figure: a trust network over Mary, John, Jane and Dave (edge weights 0.8, 0.8, 0.9, 0.7, 0.7) linked to two arguments that rebut each other.]

First argument (Dave)
– Rule: δ_Dave = IndieFilm(x) ∧ DirectedBy(x, ABC) ⇒ Watch(x) : 0.8
– Premises: IndieFilm(hce) : 1, DirectedBy(hce, ABC) : 1
– Conclusion: Watch(hce) : 0.8

Second argument (Jane)
– Rule: δ_Jane = IndieFilm(x) ∧ SpanishFilm(x) ⇒ ¬Watch(x) : 0.7
– Premises: SpanishFilm(hce) : 1, IndieFilm(hce) : 1
– Conclusion: ¬Watch(hce) : 0.7

Tang et al. EUMAS 2010

Page 20: Trust  Analysis on Heterogeneous  Networks
Page 21: Trust  Analysis on Heterogeneous  Networks
Page 22: Trust  Analysis on Heterogeneous  Networks
Page 23: Trust  Analysis on Heterogeneous  Networks

Applications of Trust Analysis Models

Page 24: Trust  Analysis on Heterogeneous  Networks

Applications of Trust Analysis Models

• Provenance-based Access Control Systems
  – Credibility of a data item based on data similarity, data conflict, path similarity and data deduction
• Ranking web results (graph of answers and websites)
• Website ranking (graph of webpages and websites)
• Sensor networks

Dai et al. BNCOD 2008, Wu et al. WebDB 2007, Gao et al. ACAI 2009, Le et al. ISPN 2011

Page 25: Trust  Analysis on Heterogeneous  Networks

Applications of Trust Analysis Models

• Quality Inference on User Generated Content (Annotator-Article graph)

• Data fusion (data and sources)
• News Finding (news articles-websites-topics)

Liao et al. ICDE 2010, Dong et al. VLDB 2009, Miao et al. PRICAI 2010

Page 26: Trust  Analysis on Heterogeneous  Networks

Conclusion

• We reviewed the work in the data mining community on heterogeneous network-based trust analysis, which uses the data provided by multiple information sources for different objects.
• We presented a classification of the approaches based on the network design used and the sub-problems solved.
• We discussed various aspects of trust, including the basic fact finder models, their extensions, source dependency models, logic-based models, homogeneous trust network models, and semi-supervised learning models.

Page 27: Trust  Analysis on Heterogeneous  Networks

References


Page 30: Trust  Analysis on Heterogeneous  Networks

Thanks!