Enhanced Social Learning via Trust and Reputation Mechanisms in Multi-agent Systems PhD Completion Seminar Golriz Rezaei Supervisors: Dr. Michael Kirley Dr. Shanika Karunasekera Dept. Computer Science and Software Engineering The University of Melbourne, Australia 20 April 2011
Enhanced Social Learning via Trust and Reputation Mechanisms in
Multi-agent Systems
PhD Completion Seminar
Golriz Rezaei
Supervisors: Dr. Michael Kirley
Dr. Shanika Karunasekera
Dept. Computer Science and Software Engineering
The University of Melbourne, Australia
Problem?
• Appropriate partners → Successful performance → Maximise utility
• Open dynamic MAS → Uncertainty + Partial knowledge
Establishing strategic connections is difficult!
Enhanced Social Learning
• Social Learning (biological background)?
  • Learning through observation of / interaction with others
  • Knowledge transmission without genetic materials
  • Acquire knowledge from others without incurring the cost / time
• Major mechanism → Imitation (perceive and reproduce behaviour)
• Why good?
  • Keep track of beneficial interaction partners
  • Save time / energy / cost
  • Improve long-term performance (individual / system)
Problem? Error-prone / outdated / inappropriate information
• Solution? Be selective
• When → High individual trial-and-error cost, intermediate environment change rate
• How → Mixed with personal innovation
• From whom
  • Agents are heterogeneous
  • Appropriate role models → Important for performance
  • Partner selection
Enhanced Social Learning cont.
1) Top-down
• Plan at design time
• Relies on the designer's ability to predict optimal connections in advance
• Fixed structure of relations (random / particular topology)
• Autonomy condition + Environmental condition → not realistic
2) Automatic learning
• Relations are built and sustained adaptively at run time
• Trust & Reputation → Formal definition?
• Evaluate before interaction → Partner selection / Decision making
• Relations evolve → Partner's reliability / trustworthiness
Survey in Ch2
Enhanced Social Learning cont.
Evolutionary game theory Concrete App MAS
Coevolutionary Endogenous Social Networks
Dynamic relation formation
Topology (social ties) ↔ Behaviour (agents' strategies)
Proposed framework
1. Life-experiences
2. Endogenous Evolving Social Networks
Evaluation
Trust & Reputation
Social Learning
Enhanced Social Learning
1) Social Dilemma Evolutionary Games
2) Advice-seeking in Distributed Service Provision Applications
Research goals and questions
Central hypothesis:
“Does incorporating concepts of trust and reputation within a social learning framework help to enhance the agents’ interactions in a MAS? And consequently, does it help to improve their long-term performance?”
1. (Life-experiences / Aging) + (Coevolutionary endogenous social networks) Trust / Reputation? Effective social learning approaches?
2. Encourage cooperation in social dilemmas? Broader perspective of general MAS applications (Advice-Seeking for Resource Discovery in Distributed Service Provision)
3. Impacts of agents’ heterogeneity (behaviour/attributes/preferences)
4. Structural characteristics of the underlying evolved relationship networks?
5. Interaction patterns ↔ system's behaviour?
Publications
Life Experiences in Spatial 2-player Prisoners’ Dilemma Game
1. G. Rezaei and M. Kirley (2008). Heterogeneous payoffs and social diversity in the spatial prisoner's dilemma game. In X. Li, M. Kirley, and M. Zhang, editors, Proceedings of 7th International Conference on Simulated Evolution and Learning (SEAL), volume 5361 of Lecture Notes in Computer Science, pages 585--594, Springer.
2. G. Rezaei and M. Kirley (2009). The effects of time varying rewards on the evolution of cooperation. Evolutionary Intelligence, 2(4):207-218.
First Model
Publications cont.
N-player Prisoners' Dilemma Game on an Evolving Social Network
1. G. Rezaei, M. Kirley and J. Pfau (2009). Evolving cooperation in the N-player prisoner's dilemma: A social network model. In K. B. Korb, M. Randall, and T. Hendtlass, editors, Artificial Life: Borrowing from Biology (ACAL), volume 5865 of Lecture Notes in Computer Science, pages 32-42, Springer Verlag, Berlin.
2. An extended version is under preparation (2011).
Distributed Advice-Seeking on an Evolving Social Network
3. G. Rezaei, J. Pfau and M. Kirley (2010). Distributed Advice-Seeking on an Evolving Social Network. In 2010 IEEE/WIC/ACM International Conference on Intelligent Agent Technology.
Outline
Background
• Trust and Reputation in Multi-agent Systems
• Trust and Reputation in Evolutionary Game Theory
• Evolutionary Games on Graphs
The Research work
• First Model
• Second Model
• Third Model
Concluding Discussion
Acknowledgment and Questions?
First Model
Life Experiences in Spatial 2-PD Game
Only Decision making, No Partner selection → Cooperative behaviour
Diagram labels: Trust & Reputation, Social Learning, Enhanced Social Learning, Life-experiences & Age
[Figure: 3×3 grid with the eight cells around the focal agent numbered 1 to 8]
Fixed Network (grid)
• Local neighbourhood interaction → Moore
• Accumulates received payoffs → Fitness
• End of each round → Imitate the most successful neighbour (MSN)
• Clusters of cooperators outweigh losses against defectors
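The imitation step on the fixed grid can be sketched as follows. This is a minimal illustration rather than the implementation used in the thesis: the wrap-around (toroidal) boundary, the inclusion of the agent itself among the candidates, and the names `moore_neighbours` and `imitate_most_successful` are all assumptions.

```python
def moore_neighbours(row, col, size):
    """Return the 8 Moore neighbours of (row, col) on a size x size grid.

    Assumption: the grid wraps around at the edges (torus)."""
    return [((row + dr) % size, (col + dc) % size)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]


def imitate_most_successful(strategies, fitness):
    """One synchronous round of 'imitate the most successful neighbour'.

    strategies and fitness are size x size lists of lists. Each cell copies
    the strategy of the fittest cell among its Moore neighbours and itself
    (keeping itself as a candidate is an assumption of this sketch)."""
    size = len(strategies)
    updated = [row[:] for row in strategies]
    for r in range(size):
        for c in range(size):
            candidates = moore_neighbours(r, c, size) + [(r, c)]
            best_r, best_c = max(candidates,
                                 key=lambda rc: fitness[rc[0]][rc[1]])
            updated[r][c] = strategies[best_r][best_c]
    return updated
```

With accumulated payoffs as `fitness`, calling `imitate_most_successful` once per round reproduces the "end of each round, imitate" step listed above.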
?
First Model cont. The challenge
• Typically → “Universal fixed payoff matrix”
• Hypothesis → Introducing “social diversity” alters the trajectory of the population
• Adaptive rewards (Individual agent strategies + Life-experiences), given a limited agent life span
• MSN (Highest accumulated normalized utility + Older) → Role model trustworthiness!
• Age: αi(t+1) = αi(t) + 1
• Life-span λi is drawn randomly from a uniform distribution [min, max]
• αi(t) == λi → the agent dies and is replaced by a new random agent
• Each agent holds a personal version of the payoff matrix, updated at each time step based on its experience level
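The ageing and replacement rule can be sketched in a few lines. This is a hedged illustration only: the `Agent` class, the integer-valued life span, and the 50/50 random strategy for a newborn agent are assumptions, not details confirmed by the slides.

```python
import random


class Agent:
    """Hypothetical agent record: strategy, age (alpha_i), life span (lambda_i)."""

    def __init__(self, rng, min_life, max_life):
        self.strategy = rng.choice(["C", "D"])             # assumed random start
        self.age = 0                                       # alpha_i(0) = 0
        self.life_span = rng.randint(min_life, max_life)   # lambda_i ~ U[min, max]


def age_population(agents, rng, min_life, max_life):
    """Advance every agent by one time step, in place.

    alpha_i(t+1) = alpha_i(t) + 1; an agent whose age reaches its life span
    dies and is replaced by a new random agent."""
    for index, agent in enumerate(agents):
        agent.age += 1
        if agent.age == agent.life_span:
            agents[index] = Agent(rng, min_life, max_life)
```

Passing a seeded `random.Random` instance as `rng` keeps a simulation run reproducible.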
Each agent
Update rule
Contributions ?
First Model cont. Adaptive rewards
Update rule, defined in terms of:
• the payoff values for agent i at time t
• the default payoff matrix values T, R, P, S
• the magnitude of the rescaled values
• αi(t), the age of agent i at time t
• λi, the expected lifetime of agent i
• a limiting factor that characterises the uncertainty related to the environment
First Model cont. Scenarios
1. Standard PD → Universal fixed Payoffs + Age
2. Homogeneous model → Universal fixed Payoffs + Age
3. Heterogeneous model → Individual Adaptive Payoffs + Age (3 versions: update 4 elements / update 1 element / update 1 element capped)
contribution to social welfare is beneficial for the group
Conventional EG (D,D, … all D)
0 < c < b, b/N < c
Second Model cont. Evolving Relations
• Agents play cooperatively → form social links (reinforced)
• One agent defects → breaks his links with the opponents
• Slow positive / fast negative
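A minimal sketch of this "slow positive / fast negative" link adaptation is shown below. The function name `adapt_links`, the dictionary-of-pairs representation, and the fixed reinforcement `step` are assumptions made for illustration; only the qualitative rule (cooperators reinforce links, a defector loses his links to the opponents) comes from the slide.

```python
from itertools import combinations


def adapt_links(weights, actions, step=1.0):
    """Update link weights after one game, in place.

    weights: dict mapping frozenset({a, b}) -> link weight
    actions: dict mapping each player in the game -> "C" or "D"

    Slow positive: every pair of cooperators gains `step` (assumed increment).
    Fast negative: every link between a defector and an opponent is removed."""
    cooperators = [p for p, action in actions.items() if action == "C"]
    defectors = [p for p, action in actions.items() if action == "D"]
    for a, b in combinations(cooperators, 2):
        key = frozenset((a, b))
        weights[key] = weights.get(key, 0.0) + step
    for defector in defectors:
        for other in actions:
            if other != defector:
                weights.pop(frozenset((defector, other)), None)
    return weights
```

Using `frozenset` keys makes the links undirected, so `{a, b}` and `{b, a}` refer to the same tie.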
Second Model cont. Contribution - Hypothesis
• Incorporating “social network” into N-player PD
• Network evolves by cooperative behaviour
• Introducing “cognitive” agents → Decision making based on some function of the opponents
• Encourage high levels of cooperation → Persist for longer
• Analyse the state of the underlying network
Second Model cont. Schematic Algorithm
Algorithm: Social network based N-PD model
Require: Population of agents P, iterations imax, players N ≥ 2
for i = 0 to imax do
    G = ∅
    while g = NextGame(P, G, N) do
        G = G ∪ {g}
        PlayGame(g)
        AdaptLinks(g)
    end while
    a, b = Random Sample(P)
    CompareUtilityAndSelect(a, b)
end for
Second Model cont. Game Formation
Partner selection
• First agent → Randomly from remaining population
• (N-1) partners → Two scenarios
  • Randomly from remaining population
  • From the first agent's remaining social contacts, probabilistically
Second Model cont. Game Execution
Decision making → Two scenarios (cognitive abilities)
• Pure strategy (always cooperate / defect)
• Mixed strategy (play probabilistically)
• Discriminators → function of the average link weight (generosity, gradient)
Agents receive the corresponding payoff based on outcomes (Boyd and Richerson function)
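The payoff step can be illustrated with the usual form of the N-player Prisoner's Dilemma payoff associated with Boyd and Richerson, where each cooperator contributes a benefit b shared equally by all N players at a personal cost c. This is a sketch under that assumption; the function name `npd_payoffs` is hypothetical and the defaults b = 5, c = 3 are the values from the experimental setup.

```python
def npd_payoffs(actions, b=5.0, c=3.0):
    """Return one payoff per player for a single N-player PD game.

    actions: list of "C" / "D", one entry per player.
    Every player receives b * (number of cooperators) / N;
    cooperators additionally pay the cost c."""
    n = len(actions)
    share = b * actions.count("C") / n
    return [share - c if action == "C" else share for action in actions]
```

With b = 5 and c = 3, a lone defector among cooperators always earns more than they do, while all-cooperate (payoff 2 each) beats all-defect (payoff 0 each), which is the social dilemma the model studies.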
Second Model cont. Snapshots
|P| = 25, N = 3; Defector, Cooperator, Discriminator
Self-organize social ties based on their self-interest
Strategy update → cultural evolution
Second Model cont. Scenarios
Partner selection | Decision making
Random matching | Pure strategy
Social Network game formation | Pure strategy
Social Network game formation | Pure strategy + Discriminators
Step 1
Step 2
Step 3
Step 4
Second Model cont. Experimental Setup
• Population size = 1000
• Group sizes = (2, 4, 5, 10, 15, 20)
• ε = 0.9 (game formation probability)
• b = 5 and c = 3 (payoff values: benefit & cost)
• Pure strategy scenario (50% pure C, 50% pure D)
• Mixed strategy scenario (33.3% each)
• α = 1.5 and β = 0.1 (decision function)
• Averaged over 20 independent trials, up to 40000 iterations
What is the equilibrium state and network topology?
Second Model cont. Group size vs. Strategy
[Figure panels: Step 1, Step 2, Step 3, Step 4]
Second Model cont. Emergent Social Networks
[Figure: clustering coefficient for Step 2, Step 3, Step 4]
Second Model cont. Final Degree Distribution
[Figure panels: Step 4 with N = 2, Step 4 with N = 5]
Cooperation higher → degree distribution higher; size & shape depend on N
Third Model
Distributed Advice-Seeking on an Evolving Social Network
Decision making + Partner selection → Coevolution (Interaction network + System’s behaviour)
Diagram labels: Trust & Reputation, Social Learning, Enhanced Social Learning, Endogenous Evolving Social Networks, Life-experiences
Games → Advice-Seeking in Distributed Service Provision
Relations evolve over time (Link weights → Trust & Reputation)
Third Model cont. Distributed Infrastructure Technology
Characteristics
1) Unknown large environment
2) Varieties of selection options
3) Users are heterogeneous
4) Exact characteristics not available until accessed, if they are made explicit at all
Ex./ Specialized protein search engines, Netflix
Approaches
1) Individual trial & error
2) Central registration directory (Brokers, Web Service [Facciorusso et al. 2003])
3) Advice seeking → Direct exchange of “selection advice” is beneficial! Ex./ Learning [Nunes and Oliveira 2003], Distributed Recommender Systems
Question? → Social Networks!
Third Model cont. Advice-Seeking
Question: Heterogeneous individual requirements → Whom?
Challenge: Identifying other suitable users is difficult!
- Large number of them
- Preferences not publicly available
- Not in a position to make their own preferences explicit
Social contacts serve as valuable resources → Manage them to improve long-term payoff gains
Third Model cont. Abstract Framework
Agent-based simulation (resources + agents)
Repeatedly → Subjective Utility
Goal = Maximize long-term utility, limited selections
Challenge = Identify appropriate resources
Evolving Social Network
- Connect with similar-minded agents → Autonomously, based on local information only
- Receive advice → improve resource selection
- Learn their own subjective utility → advice accuracy → decide to retain / drop the contact
- Form new connections → Seek referrals
[Figure: agents matched against resources whose characteristics are unknown; label "Match?"]
Third Model cont. What we study?
This capability
Connection network, Advice exchange
Agents’ interactions, Social relationships
The evolving social network, Utility gain
Affect the match?
How co-evolve?
Improve?
Change?
Third Model cont. Schematic Algorithm
Algorithm: Evolving Social Network Advice seeking
Require: population of agents, set of resources, rounds, evolutionary rate, maximum out-degree, recommendation threshold t, default edge weight
Weighted Graph = InitializeGraph( , , )
for r = 1 to do
    for each a ∈ in random order do
        if Random() > then
            AccessResource(a, )
        else
            Query(a, , , t)
        end if
        if Random() < then
            AdaptLinks(a, , RANDOM() < , )
        end if
Steps:
1-Initialization
2-Exploitation/Exploration
3-Advice selection
4-Assessment *
5-Network Adaptation *
Third Model cont. 1-Initialization
• Heterogeneous pool of resources → n-dimensional binary feature vector fr, initialized randomly
• Heterogeneous agent population → n-dimensional binary preference vector pa, initialized randomly
• Initialize Graph( , , ), 2 scenarios:
  • random agents → no structural restriction
  • social agents → outgoing edges, default weight ( = 0.5)
Third Model cont. 2-Exploitation/Exploration
Selection based on personal knowledge / Query others!
• Probabilistic → Quality of the agent’s acquired knowledge
• Exploit → Access the largest-utility resource it knows so far
• Explore → Seek advice (resource, utility)
  • Random agents → other random agents
  • Social agents → outgoing edges, social contacts
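The exploit/explore decision can be sketched as below. The slides only say the choice is probabilistic and depends on the quality of the agent's acquired knowledge; the specific rule used here (exploit with probability equal to the best known utility rescaled from [-1, 1] to [0, 1]) and the name `choose_action` are assumptions for illustration, not the thesis's formula.

```python
import random


def choose_action(known_utilities, rng):
    """Decide whether an agent exploits its own knowledge or seeks advice.

    known_utilities: dict mapping resource -> subjective utility in [-1, 1]
    Returns ("exploit", resource) or ("explore", None).

    Assumed rule: the better the best known resource, the more likely the
    agent is to exploit it; with no knowledge it always explores."""
    if not known_utilities:
        return ("explore", None)
    best = max(known_utilities, key=known_utilities.get)
    exploit_probability = (known_utilities[best] + 1.0) / 2.0
    if rng.random() < exploit_probability:
        return ("exploit", best)
    return ("explore", None)
```

In the explore branch, a random agent would then query other random agents, while a social agent would query the contacts on its outgoing edges.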
Third Model cont. 3-Advice selection
A suggestion is chosen probabilistically:
1. Advisor → Link’s weight
2. One of his suggestions → Reported utility
Subjective utility of accessed resource
• Similarity between pa & fr
• Normalized Hamming distance mapped to [-1,1]
• Positive values → better than average random selection
• Negative values → random selection would have done better
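The utility measure and the two-stage advice selection can be sketched together. The linear mapping `1 - 2 * distance` is one natural way to send a normalized Hamming distance to [-1, 1] and is an assumption, as are the function names and the shift of reported utilities to non-negative sampling weights.

```python
import random


def subjective_utility(preferences, features):
    """Similarity of a binary preference vector p_a and feature vector f_r.

    The normalized Hamming distance d in [0, 1] is mapped to 1 - 2d, so
    identical vectors score 1, complementary vectors -1, and a random
    match scores 0 on average."""
    if len(preferences) != len(features):
        raise ValueError("vectors must have the same length")
    mismatches = sum(p != f for p, f in zip(preferences, features))
    return 1.0 - 2.0 * mismatches / len(preferences)


def pick_suggestion(advice, link_weights, rng):
    """Pick one (advisor, resource, reported_utility) triple.

    advice: dict advisor -> non-empty list of (resource, reported_utility)
    link_weights: dict advisor -> link weight (at least one must be > 0)

    Stage 1 samples an advisor in proportion to link weight; stage 2 samples
    one of his suggestions in proportion to reported utility, shifted from
    [-1, 1] to (0, 1] (the shift is an assumption of this sketch)."""
    advisors = list(advice)
    advisor = rng.choices(advisors,
                          weights=[link_weights[a] for a in advisors])[0]
    suggestions = advice[advisor]
    scores = [(utility + 1.0) / 2.0 + 1e-9 for _, utility in suggestions]
    resource, utility = rng.choices(suggestions, weights=scores)[0]
    return advisor, resource, utility
```

A utility of 0 is the break-even point: anything above it beats an average random selection, which matches the interpretation of positive and negative values given above.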
Third Model cont. 4-Assessment *
Social agents learn from their interactions → adjust the weight of links
Following a particular suggestion
- Positive: |ua(r) - urep(r)| < thrdis
- Negative: |ua(r) - urep(r)| ≥ thrdis
Adjust the link weight with multiple advisors
- the link weight
- w(a,b) < thrtolerance → remove the edge, free slot!
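The assessment step can be sketched as follows. Only the two thresholds and the "remove the edge below tolerance" rule come from the slide; the additive update, the `step` size, the cap at 1.0, and the default threshold values are assumptions made for illustration.

```python
def assess_advice(weights, agent, advisor, own_utility, reported_utility,
                  thr_dis=0.2, thr_tolerance=0.1, step=0.1):
    """Adjust the directed link agent -> advisor after following his advice.

    weights: dict mapping (agent, advisor) -> link weight
    The experience is positive when the advisor's reported utility is within
    thr_dis of the utility the agent actually observed. The weight moves up
    or down by `step` (assumed rule); a link that falls below thr_tolerance
    is removed, freeing a slot.
    Returns the new weight, or None if the link was dropped."""
    key = (agent, advisor)
    positive = abs(own_utility - reported_utility) < thr_dis
    new_weight = min(1.0, weights[key] + (step if positive else -step))
    if new_weight < thr_tolerance:
        del weights[key]
        return None
    weights[key] = new_weight
    return new_weight
```

A dropped link leaves the agent with a free outgoing slot, which the network adaptation step can later fill.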
Third Model cont. 5-Network Adaptation *
Social agents → opportunity to change their links probabilistically!
- Link to a random agent with default weight
- Ask for referrals → Trust propagation [Massa and Avesani 2007, Vidal 2005]
Third Model cont. Snapshots
Steps 4 & 5 → agents eventually link with others who have similar preferences
Similar-minded community → spots beneficial resources faster
Third Model cont. Experimental Setup
• Monte-Carlo simulations, various parameter settings
• Scenarios (Social agents only and Random agents only)
• Population sizes (small = 100, large = 300 agents)
Summary: Thesis contributions
Efficacy of Enhanced Social Learning approaches → Agents' interactions → Individuals’ and System’s long-term (utility) performance
Life-experiences + Endogenous Evolving Social Networks → Trust and Reputation → ESL
First Model (2-PD on Fixed Grid Structure): Adaptive rewards → Life-experiences / Age
• Innovative notion of role model trustworthiness / Heterogeneous social diversity → Cooperation
Second Model (N-PD on an Evolving Social Network): Endogenous network formation → Partner selection + Decision making (Cooperation)
• Emergent Social Networks → High average clustering + Broad-scale heterogeneity
Third Model (Distributed Advice-Seeking for Resource Discovery): Life-experiences + Endogenous network formation → Similar-minded (appropriate role models)
• Strongly connected communities with similar preferences → Higher utility
Limitations
Generality of Adaptive rewards on Fixed interaction networks
• 2-PD on simple Grid → Other classes of games (Hawk-Dove / Stag-Hunt / …)
• Age attribute → Heterogeneity → Other concepts? How to encourage Cooperation?
• Simple Grid → Other fixed topologies? Effect of different neighbourhood structures
Generality of Adaptive rewards on Evolving Social Networks
• Dynamic Payoffs → N-PD framework → Not satisfying! (limited parameter settings)
• Extensive analysis → Determine why it was not helpful / If it is helpful at all / How? (Ex./ Bigger ranges of life-span / different time scales for update rules + evolution of the interaction network)
Realistic approaches for Advice-Seeking framework
• Generic model → Inspired by several distributed service provision systems
• Synthetic data → Set up a specific, controlled platform → Represent semi-realistic MAS → Evaluate performance of the ESL → Not a solution for a particular application!
• Exploit such techniques → real technological systems → real data sets → real users' preference profiles → binary preferences → Not realistic!
Future work
N-PD → fixed group sizes + similar for all agents
• Dynamic group formation + heterogeneous sizes → different communities in real-world
Advice-Seeking model → similarities with Recommender Systems
• Different purpose here, BUT interesting to modify and apply in such a context → Comparison with other models
Enhanced Social Learning → Imitation (basic cultural learning)
• Extend to other methods of MAS learning, ex./ Reinforcement Learning
Evolutionary Game Theory + Advice-Seeking → Investigation domains
• Potential domains (MAS) → P2P / Mobile Ad-hoc Networks / Grid Computing
Robustness of the proposed mechanisms → Different scales of dynamicity in real-world environments
Acknowledgment
1. Michael, Shanika, Adrian
2. Jens
3. Les, Ed, Leon, Liz, …
4. Agent lab members, Rebecca, …
5. Dept. Computer Sci / Uni Melb
6. Rahil, Leila, Parvin, Toktam, …
7. Lab colleagues (Saeed/Raymond/…)
8. …
Questions?
Thank you
References
1) D. Gambetta. Can We Trust Trust? In D. Gambetta, editor, Trust: Making and Breaking Cooperative Relations, pages 213--237. Basil Blackwell, 1988.
2) A. Jøsang, R. Ismail, and C. Boyd. A survey of trust and reputation systems for online service provision. Decision Support Systems, 43:618--644, 2007.
3) M. A. Nowak. Five rules for the evolution of cooperation. Science, 314:1560--1563, 2006.
4) R. Boyd and P. Richerson. The evolution of reciprocity in sizeable groups. Journal of Theoretical Biology, 132:337--356, 1988.
5) C. Facciorusso, S. Field, R. Hauser, Y. Hoffner, R. Humbel, R. Pawlitzek, W. Rjaibi, and C. Siminitz. A Web Services Matchmaking Engine for Web Services. In E-Commerce and Web Technologies, Lecture Notes in Computer Science, pages 37--49, 2003.
6) L. Nunes and E. Oliveira. Advice-exchange in heterogeneous groups of learning agents. In Proceedings of the second international joint conference on Autonomous agents and multiagent systems, pages 1084--1085, 2003.
7) P. Massa and P. Avesani. Trust-aware recommender systems. In Proceedings of the 2007 ACM conference on Recommender systems, pages 17--24, 2007.
8) J. M. Vidal. A Protocol for a Distributed Recommender System. In J. Sabater, R. Falcone, S. Barber, and M. Singh, editors, Trusting Agents for Trusting Electronic Societies. Springer, 2005.
9) G. Hardin. The Tragedy of the Commons. Science, 162:1243--1248, 1968.
10) U. Wilensky. Modelling Nature's Emergent Patterns with Multi-agent Languages. In Proceedings of EuroLogo, 2002. NetLogo is a cross-platform multi-agent programmable modelling environment. See http://ccl.northwestern.edu/netlogo/.
Backup Slides
First Model cont. Sensitivity to the magnitude of K