NEW TIES WP2 Agent and learning mechanisms

Jan 26, 2016


jalia

Transcript
Page 1: NEW TIES WP2  Agent and learning mechanisms

NEW TIES WP2 Agent and learning mechanisms

Page 2: NEW TIES WP2  Agent and learning mechanisms

Decision making and learning

Agents have a controller (decision tree, DQT)
Input: situation (as perceived = seen/heard/interpreted)
Output: action

Decision making = using DQT
Learning = modifying DQT
Decisions also depend on inheritable “attitude genes” (learned through evolution)

Page 3: NEW TIES WP2  Agent and learning mechanisms

Example of a DQT

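[Figure: example DQT. A root bias node (B) splits with biases 0.5 / 0.5 into two test nodes (T), VISUAL:FRONTFOODREACHABLE and BAG:FOOD, each with a YES and a NO branch. Below the first test are the action PICKUP (1.0) and a three-way choice among the actions TURNLEFT, MOVE, TURNRIGHT (0.6, 0.2, 0.2). Below the second test are the action EAT (1.0) and the same three-way choice (0.6, 0.2, 0.2).]

Legend: B = bias node, T = test node, A = action node; numbers on edges (e.g. 0.2) = genetic bias; YES/NO on edges = Boolean choice.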
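The example tree can be sketched as a small data structure with a traversal routine. This is only an illustration, not the NEW TIES implementation: the class names (`Action`, `Test`, `Bias`), the `decide` function, the YES/NO branch assignment, and the assignment of 0.6 to MOVE and 0.2 to each turn are assumptions, since the slide text does not say which weight belongs to which action. Edge numbers are used directly as sampling weights here.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Action:
    name: str  # leaf: the action the agent performs


@dataclass
class Test:
    concept: str  # key looked up in the perceived situation
    yes: "Node"
    no: "Node"


@dataclass
class Bias:
    children: List[Tuple[float, "Node"]]  # (edge weight, child) pairs


Node = Union[Action, Test, Bias]


def decide(node: Node, situation: Dict[str, bool], rng: random.Random) -> str:
    """Walk from the root to a leaf and return the chosen action name."""
    while not isinstance(node, Action):
        if isinstance(node, Test):
            # Boolean choice: follow YES if the concept holds, else NO.
            node = node.yes if situation.get(node.concept, False) else node.no
        else:
            # Weighted random choice among the children of a bias node.
            weights = [w for w, _ in node.children]
            kids = [c for _, c in node.children]
            node = rng.choices(kids, weights=weights, k=1)[0]
    return node.name


def wander() -> Bias:
    # Assumed mapping of the weights 0.6 / 0.2 / 0.2 to the three actions.
    return Bias([(0.6, Action("MOVE")),
                 (0.2, Action("TURNLEFT")),
                 (0.2, Action("TURNRIGHT"))])


example_dqt = Bias([
    (0.5, Test("VISUAL:FRONTFOODREACHABLE", yes=Action("PICKUP"), no=wander())),
    (0.5, Test("BAG:FOOD", yes=Action("EAT"), no=wander())),
])
```

Calling `decide(example_dqt, {"BAG:FOOD": True}, random.Random(1))` returns either EAT or one of the movement actions, depending on which test node the root bias selects.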

Page 4: NEW TIES WP2  Agent and learning mechanisms

Interaction evolution & individual learning

Bias node with n children, each with bias b_i
Bias ≠ probability
Bias b_i is learned, changing (name: learned bias)
Genetic bias g_i is inherited, part of genome, constant
Actual probability of choosing child x: p(b,g) = b + (1 - b) ∙ g
Learned and inherited behaviour are linked through formula
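A minimal sketch of the formula p(b,g) = b + (1 - b) ∙ g follows. The function names are hypothetical. The slide does not say how the per-child values are turned into a distribution over n children, so the normalisation step in `child_probabilities` is an assumption, as is the uniform fallback when all values are zero.

```python
from typing import List


def choice_value(b: float, g: float) -> float:
    """p(b, g) = b + (1 - b) * g for one child (learned bias b, genetic bias g)."""
    return b + (1.0 - b) * g


def child_probabilities(learned: List[float], genetic: List[float]) -> List[float]:
    """Normalise the per-child values so they sum to 1 (assumed, not from the slides)."""
    raw = [choice_value(b, g) for b, g in zip(learned, genetic)]
    total = sum(raw)
    if total == 0.0:
        # Assumed fallback: no preference at all means a uniform choice.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]
```

Two limiting cases show how the formula links the two kinds of bias: with b = 0 the value equals the genetic bias g, and with b = 1 the value is 1 regardless of g.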

Page 5: NEW TIES WP2  Agent and learning mechanisms

DQT nodes & parameters cont’d

Test node language: native concepts + emerging concepts

Native: see_agent, see_mother, see_food, have_food, see_mate, …

New concepts can emerge by categorisation (discrimination game)

Page 6: NEW TIES WP2  Agent and learning mechanisms

Learning: the heart of the emergence engine

Evolutionary learning: not within an agent (not during lifetime), over generations by variation + selection

Individual learning: within one agent, during lifetime by reinforcement learning

Social learning: during lifetime, in interacting agents by sending/receiving + adopting knowledge pieces

Page 7: NEW TIES WP2  Agent and learning mechanisms

Types of learning: properties

Evolutionary learning:
Agent does not create new knowledge during lifetime
Basic DQTree + genetic biases are inheritable
“knowledge creator” = crossover and mutation

Individual learning:
Agent does create new knowledge during lifetime
DQTree + learned biases are modified
“knowledge creator” = reinforcement learning (driven by rewards)
Individually learnt knowledge dies with its host agent

Social learning:
Agent imports knowledge already created elsewhere (new? not new?)
Adoption of imported knowledge ≈ crossover
Importing knowledge pieces
  can save effort for recipient
  can create novel combinations
Exporting knowledge helps its preservation after death of host

Page 8: NEW TIES WP2  Agent and learning mechanisms

Present status of types of learning

Evolutionary learning:
Demonstrated in 2 NT scenarios
Autonomous selection/reproduction causes problems with population stability (implosion/explosion)

Individual learning:
Coded, but never demonstrated in NT scenarios

Social learning:
Under construction/design, based on the “telepathy” approach
Communication protocols + adoption mechanisms needed

Page 9: NEW TIES WP2  Agent and learning mechanisms

Evolution: variation operators

Operators for DQT:
Crossover = subtree swap
Mutation =
  Substitute subtree with random subtree
  Change concepts in test nodes
  Change bias on an edge

Operators for attitude genes:
Crossover = full arithmetic crossover
Mutation =
  Add Gaussian noise
  Replace with random value
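The attitude-gene operators can be illustrated as follows. This is a sketch under assumptions, not the project's code: genes are taken to be floats in [0, 1], and the parameters `alpha`, `sigma`, and `reset_rate`, as well as the clamping to the gene range, are hypothetical choices that the slides do not specify.

```python
import random
from typing import List


def arithmetic_crossover(p1: List[float], p2: List[float], alpha: float = 0.5) -> List[float]:
    """Full arithmetic crossover: every child gene is a weighted mean of the parents."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p1, p2)]


def mutate(genes: List[float], rng: random.Random, sigma: float = 0.1,
           reset_rate: float = 0.05, lo: float = 0.0, hi: float = 1.0) -> List[float]:
    """Per gene: either replace with a random value or add Gaussian noise."""
    out = []
    for g in genes:
        if rng.random() < reset_rate:
            g = rng.uniform(lo, hi)       # replace with random value
        else:
            g = g + rng.gauss(0.0, sigma)  # add Gaussian noise
        out.append(min(hi, max(lo, g)))    # keep the gene in range (assumed)
    return out
```

Applying noise to every gene that is not reset is one possible reading; the two mutation variants could equally be applied at separate rates.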

Page 10: NEW TIES WP2  Agent and learning mechanisms

Evolution: selection operators

Mate selection:
Mate action chosen by DQT
Propose – accept proposal
Adulthood OK

Survivor selection:
Dead if too old (≥ 80 years)
Dead if zero energy
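The two selection rules can be written as simple predicates. The names `AgentState`, `survives`, and `mating_occurs` are hypothetical, and the adulthood threshold is passed in as a parameter because the slides do not state it. Treating negative energy the same as zero energy is also an assumption.

```python
from dataclasses import dataclass

MAX_AGE_YEARS = 80  # from the slide: dead if age >= 80 years


@dataclass
class AgentState:
    age_years: float
    energy: float


def survives(agent: AgentState) -> bool:
    """Survivor selection: alive only if young enough and energy is left."""
    return agent.age_years < MAX_AGE_YEARS and agent.energy > 0


def mating_occurs(proposer: AgentState, partner: AgentState,
                  proposes: bool, accepts: bool, adult_age_years: float) -> bool:
    """Mate selection: one agent proposes, the other accepts, both are adults."""
    both_adult = (proposer.age_years >= adult_age_years
                  and partner.age_years >= adult_age_years)
    return both_adult and proposes and accepts
```

In the simulation the `proposes` and `accepts` flags would come from each agent's DQT choosing the mate action; here they are plain inputs.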

Page 11: NEW TIES WP2  Agent and learning mechanisms

Experiment: Simple world

Setup: Environment

World size: 200 x 200 grid cells
Agents and food (no tokens, roads, etc.). Both are variable in number.
Initial distribution of agents (500): in upper left corner
Initial distribution of food (10000): 5000 each in the upper left and lower right corners

Page 12: NEW TIES WP2  Agent and learning mechanisms

Experiment: Simple world

Setup: Agents

Native knowledge (concepts and DQT sub trees)

Navigating (random walk)
Eating (identify, pick up and eat plants)
Mating (identify mates, propose/agree)

Random DQT-tree branches
Differs per agent
Based on the “pool” of native concepts

Page 13: NEW TIES WP2  Agent and learning mechanisms

Experiment: Simple world

Simulation continued for 3 months real time to test stability

Page 14: NEW TIES WP2  Agent and learning mechanisms

Experiment: Poisonous Food

Setup: Environment

Two types of food: poisonous (decreases energy) and edible (increases energy)
World size: 200 x 200 grid cells
Agents and food (no tokens, roads, etc.). Both are variable in number.
Initial distribution of agents (500): uniform random over the grid space
Initial distribution of food (10000): 5000 of each type of food, uniform random over the same grid space as the agents
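The initial placement described above could be generated along these lines. The constants come from the slide; the function names and the dictionary layout are hypothetical, and allowing several objects to share one grid cell is an assumption.

```python
import random
from typing import Dict, List, Tuple

GRID_SIZE = 200          # world is 200 x 200 grid cells
N_AGENTS = 500           # initial number of agents
N_FOOD_PER_TYPE = 5000   # 5000 edible + 5000 poisonous = 10000 food items

Cell = Tuple[int, int]


def random_cells(n: int, rng: random.Random, size: int = GRID_SIZE) -> List[Cell]:
    """Draw n cells uniformly at random; cells may repeat (assumed)."""
    return [(rng.randrange(size), rng.randrange(size)) for _ in range(n)]


def init_poisonous_food_world(seed: int = 0) -> Dict[str, List[Cell]]:
    """Place agents and both food types uniformly over the same grid."""
    rng = random.Random(seed)
    return {
        "agents": random_cells(N_AGENTS, rng),
        "edible": random_cells(N_FOOD_PER_TYPE, rng),
        "poisonous": random_cells(N_FOOD_PER_TYPE, rng),
    }
```

Fixing the seed makes a run reproducible, which helps when comparing population curves across runs.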

Page 15: NEW TIES WP2  Agent and learning mechanisms

Experiment: Poisonous Food

Setup: Agent

Native knowledge
Identical to simple world experiment

Additional native knowledge
Can distinguish poisonous from edible plants
Relation with eating/picking up is not present

No random DQT-tree branches

Page 16: NEW TIES WP2  Agent and learning mechanisms

Experiment: Poisonous Food

Measures

Population size
Welfare (energy)
Number of poisonous and edible plants
Complexity of controller (nr. of nodes)
Age
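One way to collect these measures at a given timestep is sketched below. The record type `AgentRecord` and the function `measures` are hypothetical. Reporting welfare, controller complexity, and age as population means is an assumption; the slides only name the quantities.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class AgentRecord:
    energy: float
    age: int
    controller_nodes: int  # number of nodes in the agent's DQT


def measures(agents: List[AgentRecord], n_edible: int, n_poisonous: int) -> Dict[str, float]:
    """Summarise one timestep; means default to 0 for an empty population."""
    has_agents = len(agents) > 0
    return {
        "population_size": len(agents),
        "mean_energy": mean(a.energy for a in agents) if has_agents else 0.0,
        "mean_controller_nodes": mean(a.controller_nodes for a in agents) if has_agents else 0.0,
        "mean_age": mean(a.age for a in agents) if has_agents else 0.0,
        "edible_plants": n_edible,
        "poisonous_plants": n_poisonous,
    }
```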

Page 17: NEW TIES WP2  Agent and learning mechanisms

Experiment: Poisonous Food

Demo

Page 18: NEW TIES WP2  Agent and learning mechanisms

Experiment: Poisonous Food

Results

[Figure: time series over timesteps 0 to 15000 (vertical axis 0 to 2500) showing population size, healthy plants (x10), poisonous plants (x10), and average agent energy (x100).]