INTELLIGENT AGENTS Chapter 2 ١٤٤٣/٠٦/١٧ 1
Dec 27, 2015
INTELLIGENT AGENTS
Chapter 2/ /١٤٤٤ ٠٩ ٢٩
1
Outline
• Agents and environments
• Rationality
• PEAS (Performance measure, Environment, Actuators, Sensors)
• Environment types
• Agent types
Agents
An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators
Human agent: eyes, ears, and other organs for sensors; hands, legs, mouth, and other body parts for actuators
Robotic agent: cameras and infrared range finders for sensors; various motors for actuators
Agents and environments
The agent function maps from percept histories to actions:
$f : P^{*} \rightarrow A$
The agent program runs on the physical architecture to produce f.
agent = architecture + program
Vacuum-cleaner world
Percepts: location and contents, e.g., [A,Dirty]
Actions: Left, Right, Suck, NoOp
A vacuum-cleaner agent
What is the right function?
Can it be implemented as a small agent program?
Rational agents
An agent should strive to "do the right thing", based on what it can perceive and the actions it can perform.
The right action is the one that will cause the agent to be most successful.
Performance measure: An objective criterion for the success of an agent's behavior. E.g., the performance measure of a vacuum-cleaner agent could be the amount of dirt cleaned up, the amount of time taken, the amount of electricity consumed, the amount of noise generated, etc.
Rational agents
Rational Agent: For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.
Rationality is distinct from omniscience (all-knowing with infinite knowledge)
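The definition of a rational agent can be written compactly as a formula. This is only a sketch: the symbols $p_{1:t}$ (the percept sequence so far), $K$ (the built-in knowledge), $A$ (the set of available actions) and $U$ (the performance measure) are notation chosen here for illustration, not notation from the slides.

```latex
a^{*}(p_{1:t}) \;=\; \arg\max_{a \in A} \; \mathbb{E}\left[\, U \mid p_{1:t},\, K,\, a \,\right]
```

The expectation is what separates rationality from omniscience: the agent maximizes the expected performance given what it has perceived and what it knows, not the actual outcome.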
Rational agents
Agents can perform actions in order to modify future percepts so as to obtain useful information (information gathering, exploration)
An agent is autonomous if its behavior is determined by its own experience (with ability to learn and adapt)
PEAS
PEAS: Performance measure, Environment, Actuators, Sensors
Consider, e.g., the task of designing an automated taxi driver. Its description has four parts: performance measure, environment, actuators, sensors.
PEAS
Consider, e.g., the task of designing an automated taxi driver:
• Performance measure: Safe, fast, legal, comfortable trip, maximize profits
• Environment: Roads, other traffic, pedestrians, customers
• Actuators: Steering wheel, accelerator, brake, signal, horn
• Sensors: Cameras, sonar, speedometer, GPS, odometer, engine sensors, keyboard
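A PEAS description is just a four-part record, so it can be written down as a small data structure. The sketch below is one possible way to do that in Python; the class name `PEAS` and its field names are assumptions made for illustration, and the entries are the taxi-driver items listed above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PEAS:
    """Illustrative container for a PEAS task-environment description."""
    performance_measure: tuple[str, ...]
    environment: tuple[str, ...]
    actuators: tuple[str, ...]
    sensors: tuple[str, ...]


# The automated taxi driver, filled in from the list above.
taxi = PEAS(
    performance_measure=("safe", "fast", "legal", "comfortable trip",
                         "maximize profits"),
    environment=("roads", "other traffic", "pedestrians", "customers"),
    actuators=("steering wheel", "accelerator", "brake", "signal", "horn"),
    sensors=("cameras", "sonar", "speedometer", "GPS", "odometer",
             "engine sensors", "keyboard"),
)
```

Writing the description this way makes it easy to compare task environments field by field, for example the taxi against the part-picking robot.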
PEAS
Agent: Medical diagnosis system
• Performance measure: Healthy patient, minimize costs, lawsuits
• Environment: Patient, hospital, staff
• Actuators: Screen display (questions, tests, diagnoses, treatments, referrals)
• Sensors: Keyboard (entry of symptoms, findings, patient's answers)
PEAS
Agent: Part-picking robot
• Performance measure: Percentage of parts in correct bins
• Environment: Conveyor belt with parts, bins
• Actuators: Jointed arm and hand
• Sensors: Camera, joint angle sensors
PEAS
Agent: Interactive English tutor
• Performance measure: Maximize student's score on test
• Environment: Set of students
• Actuators: Screen display (exercises, suggestions, corrections)
• Sensors: Keyboard
Environment types
Fully observable (vs. partially observable): An agent's sensors give it access to the complete state of the environment at each point in time.
Deterministic (vs. stochastic): (actions are predictable) The next state of the environment is completely determined by the current state and the action executed by the agent.
Episodic (vs. sequential): The agent's experience is divided into atomic "episodes" (each episode consists of the agent perceiving and then performing a single action), and the choice of action in each episode depends only on the episode itself.
Environment types
Static (vs. dynamic): The environment is unchanged while an agent is deliberating. (The environment is semidynamic if the environment itself does not change with the passage of time but the agent's performance score does)
Discrete (vs. continuous): A limited number of distinct, clearly defined percepts and actions.
Environment types
Single agent (vs. multiagent): An agent operating by itself in an environment.
Multiagent environments:
• Competitive: chess
• Cooperative: taxi
Summary
• Fully observable vs. not (fully) observable: Does the agent see the complete state of the environment?
• Deterministic vs. nondeterministic: Is there a unique mapping from one state to another state for a given action?
• Episodic vs. sequential: Does the next "episode" depend on the actions taken in previous episodes?
Summary
• Static vs. dynamic: Can the world change while the agent is thinking?
• Discrete vs. continuous: Is the number of distinct percepts and actions limited or unlimited?
Environment types
                 | Chess with a clock | Chess without a clock | Taxi driving
Fully observable | Yes                | Yes                   | No
Deterministic    | Strategic          | Strategic             | No
Episodic         | No                 | No                    | No
Static           | Semi               | Yes                   | No
Discrete         | Yes                | Yes                   | No
Single agent     | No                 | No                    | No
The environment type largely determines the agent design
The real world is (of course) partially observable, stochastic, sequential, dynamic, continuous, multi-agent
Agent functions and programs
An agent is completely specified by the agent function mapping percept sequences to actions.
The agent program implements the agent function mapping percept sequences to actions.
Agent = architecture + program.
Architecture = some sort of computing device with physical sensors and actuators.
The aim of AI is to design the agent program: find a way to implement the rational agent function concisely.
Table-lookup agent
function Table-Driven-Agent(percept) returns an action
  static: percepts, a sequence, initially empty
          table, a table of actions, indexed by percept sequences, initially fully specified
  append percept to the end of percepts
  action <- Lookup(percepts, table)
  return action
The table agent program is invoked for each new percept and returns an action each time. It keeps track of percept sequences using its own private data structure.
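The table-driven program above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the factory name `make_table_driven_agent`, the tiny partial `vacuum_table`, and the `"NoOp"` fallback for histories missing from the table are all assumptions of this sketch (the pseudocode assumes a fully specified table).

```python
def make_table_driven_agent(table):
    """Return an agent program that looks up actions by full percept history.

    `table` maps tuples of percepts (the history so far) to actions.
    """
    percepts = []  # private percept history, initially empty

    def agent_program(percept):
        percepts.append(percept)
        # Histories not listed in the table fall back to "NoOp"
        # (an assumption of this sketch; the real table is fully specified).
        return table.get(tuple(percepts), "NoOp")

    return agent_program


# A tiny, partial table for the two-square vacuum world (illustrative only).
vacuum_table = {
    (("A", "Clean"),): "Right",
    (("A", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    (("B", "Dirty"),): "Suck",
    (("A", "Dirty"), ("A", "Clean")): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}

agent = make_table_driven_agent(vacuum_table)
first = agent(("A", "Dirty"))   # history: [(A, Dirty)]
second = agent(("A", "Clean"))  # history: [(A, Dirty), (A, Clean)]
```

Note that the key is the whole history, not the latest percept, which is why the table needs one entry for every possible percept sequence.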
Table-lookup agent
Drawbacks:
• Huge table
• Takes a long time to build the table
• No autonomy
• Even with learning, it takes a long time to learn the table entries
Example: let P be the set of possible percepts and T be the lifetime of the agent (the total number of percepts it will receive). Then the lookup table will contain $\sum_{t=1}^{T} |P|^{t}$ entries.
The table of the vacuum agent (VA) will contain more than $4^{T}$ entries (VA has 4 possible percepts).
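To get a feel for how fast the table grows, the count can be computed directly. The sketch below assumes the table has one entry per percept sequence of length 1 up to T; the function name `lookup_table_size` is chosen here for illustration.

```python
def lookup_table_size(num_percepts: int, lifetime: int) -> int:
    """Number of percept sequences of length 1..lifetime."""
    return sum(num_percepts ** t for t in range(1, lifetime + 1))


# Vacuum agent: 4 possible percepts, for several lifetimes T.
sizes = {T: lookup_table_size(4, T) for T in (1, 2, 3, 10)}
# sizes == {1: 4, 2: 20, 3: 84, 10: 1398100}
```

Already at T = 10 the table has more than a million entries, and it always exceeds 4 to the power T for T > 1.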
Agent program for a vacuum-cleaner agent
The vacuum agent program is very small compared to the corresponding table: it cuts down the number of possibilities from $4^{T}$ to 3. This reduction comes from ignoring the percept history.
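The program itself is not reproduced in the text, so here is a minimal sketch of what such a program could look like, assuming the two-square world with locations A and B and percepts of the form (location, status). The function name is an assumption of this sketch.

```python
def reflex_vacuum_agent(percept):
    """Choose an action from the current percept only (no history)."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    if location == "A":
        return "Right"
    return "Left"  # the only remaining case: clean square B
```

The three branches correspond to the three cases the agent has to distinguish, regardless of how long it has been running.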
Agent types
Four basic types in order of increasing generality:
• Simple reflex agents
• Model-based reflex agents
• Goal-based agents
• Utility-based agents
Simple reflex agents Program
Single current percept: the agent selects an action on the basis of the current percept, ignoring the rest of the percept history.
Example: the vacuum agent (VA) is a simple reflex agent, because its decision is based only on the current location and on whether that location contains dirt.
Rules relate a "state" (based on the percept) to an "action" for the agent to perform. "Condition-action" rule: if a then b, e.g.
• vacuum agent (VA): if in(A) and dirty(A), then vacuum
• taxi driving agent (TA): if car-in-front-is-braking then initiate-braking.
Schematic diagram of a simple reflex agent
Simple reflex agents Program
function Simple-Reflex-Agent(percept) returns an action
  static: rules, a set of condition-action rules
  state <- Interpret-Input(percept)
  rule <- Rule-Match(state, rules)
  action <- Rule-Action[rule]
  return action
A simple reflex agent. It acts according to a rule whose condition matches the current state, as defined by the percept.
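The rule-matching structure of this program can be sketched in Python as below. The representation is an assumption of this sketch: rules are (condition, action) pairs checked in order, the state is a small dictionary, and `rule_match` returns the action directly, which folds Rule-Match and Rule-Action into one step.

```python
def interpret_input(percept):
    """Turn a raw percept into a state description (here: a dict)."""
    location, status = percept
    return {"location": location, "dirty": status == "Dirty"}


# Condition-action rules as (condition, action) pairs; the first match wins.
RULES = [
    (lambda s: s["dirty"], "Suck"),
    (lambda s: s["location"] == "A", "Right"),
    (lambda s: s["location"] == "B", "Left"),
]


def rule_match(state, rules):
    """Return the action of the first rule whose condition holds."""
    for condition, action in rules:
        if condition(state):
            return action
    return "NoOp"  # no rule applies (fallback assumed by this sketch)


def simple_reflex_agent(percept, rules=RULES):
    state = interpret_input(percept)
    return rule_match(state, rules)
```

Keeping the rules in a separate list means the same agent program can be reused with a different rule set, for example one containing the taxi rule for a braking car in front.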
Simple reflex agents Program
Simple, but VERY limited:
• The environment must be fully observable for the agent to be accurate
• Limited intelligence (the correct decision can be made only if the environment is fully observable)
Example: a vacuum agent deprived of its location sensor, with only a dirt sensor (2 possible percepts: [dirty] and [clean]):
• The action for [dirty] is suck.
• What is the action for [clean]? Moving left fails forever if it happens to start in location A.
Model-based reflex agents
Solution to partial observability problems:
• Maintain state
  – Keep track of the parts of the world the agent can't see now
  – Maintain internal state that depends on the percept history
• Update the previous state based on
  – Knowledge of how the world changes, e.g. TA: an overtaking car generally will be closer behind than it was a moment ago.
  – Knowledge of the effects of the agent's own actions, e.g. TA: when the agent turns the steering wheel clockwise, the car turns to the right.
=> The knowledge about how the world works is implemented in a model, called the "model of the world".
Schematic diagram of a model-based reflex agent
Model-based reflex agents
function Model-Based-Reflex-Agent(percept) returns an action
  static: state, a description of the current world state
          rules, a set of condition-action rules
          action, the most recent action, initially none
  state <- Update-State(state, action, percept)
  rule <- Rule-Match(state, rules)
  action <- Rule-Action[rule]
  return action
A model-based reflex agent. It keeps track of the current state of the world using an internal model. It then chooses an action in the same way as the reflex agent.
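As a concrete illustration of keeping internal state, here is a minimal sketch of a model-based vacuum agent. The class name, the dictionary used as the world model, and the decision to stop with "NoOp" once both squares are believed clean are assumptions of this sketch, not part of the pseudocode above.

```python
class ModelBasedVacuumAgent:
    """Vacuum agent that remembers what it believes about each square."""

    def __init__(self):
        # Internal state: believed status of each square (None = not yet seen).
        self.model = {"A": None, "B": None}

    def update_state(self, percept):
        location, status = percept
        self.model[location] = status  # what the sensors report right now

    def __call__(self, percept):
        self.update_state(percept)
        location, _ = percept
        if self.model[location] == "Dirty":
            # Knowledge of the effect of the agent's own action:
            # after Suck, the current square is clean.
            self.model[location] = "Clean"
            return "Suck"
        if self.model["A"] == self.model["B"] == "Clean":
            return "NoOp"  # nothing left to do, as far as the agent knows
        return "Right" if location == "A" else "Left"


agent = ModelBasedVacuumAgent()
actions = [agent(p) for p in [("A", "Dirty"), ("A", "Clean"), ("B", "Clean")]]
# actions == ["Suck", "Right", "NoOp"]
```

A simple reflex agent would answer the last percept with "Left" and shuttle back and forth forever; the internal model is what lets this agent recognize that the unseen square A is already clean.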
Goal-based agents
• The current state of the environment is not always enough to decide what to do.
• The taxi can turn left, right or go straight on. The decision depends on where the taxi is trying to get to (for example : being at the passenger’s destination).
A model-based, goal-based agent
function Model-Based-Goal-Based-Agent(percept) returns an action
  static: state, a description of the current world state
          rules, a set of condition-action rules
          action, the most recent action, initially none
          goal, the goal(s) the agent is trying to achieve
  state <- Update-State(state, action, percept)
  rule <- Rule-Match(state, rules, goal)
  action <- Rule-Action[rule]
  return action
A goal-based agent. It keeps track of the current state of the world using an internal model, as well as a set of goals it is trying to achieve. It then chooses an action that leads to the achievement of the goal.
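The role of the goal can be illustrated with a very small sketch. It assumes a taxi moving between intersections on a grid, a one-step world model `predict`, and a greedy choice of the action whose predicted result is closest to the destination. The grid, the action names, and the distance measure are all assumptions made for illustration; a real goal-based agent would typically search or plan over longer action sequences.

```python
# Actions and their effect on an (x, y) grid position (hypothetical world).
MOVES = {"North": (0, 1), "South": (0, -1), "East": (1, 0), "West": (-1, 0)}


def predict(position, action):
    """World model: where the taxi ends up after taking `action`."""
    dx, dy = MOVES[action]
    return (position[0] + dx, position[1] + dy)


def distance(a, b):
    """Manhattan distance between two grid positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def goal_based_agent(position, goal):
    """Pick the action whose predicted outcome is closest to the goal."""
    if position == goal:
        return "Stop"
    return min(MOVES, key=lambda action: distance(predict(position, action), goal))
```

The same position leads to different actions for different goals: from (0, 0) the agent heads "North" toward (2, 1) but "West" toward (-3, 0), which is exactly what the current state alone cannot decide.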
Utility-based agents
• Goal:
  – Issue: only binary: achieved/not achieved
  – Want something more nuanced:
    • Not just achieve the state, but faster, cheaper, smoother, ...
• Solution: Utility
  – Utility function: state (sequence of percepts) -> value
  – Select among multiple or conflicting goals
Utility-based agents
Goals just provide a crude distinction between "happy" and "unhappy" states, whereas a more general performance measure should allow a comparison of different world sequences.
Utility is a function that maps a state onto a real number, which describes the associated degree of happiness. It helps to make a decision when there are conflicting goals (like speed and safety).
The right decision = Function (percept, goal) + Quicker + Safer + Reliable + Less cost
A model-based, utility-based agent
function Model-Based-Utility-Based-Agent(percept) returns an action
  static: state, a description of the current world state
          rules, a set of condition-action rules
          action, the most recent action, initially none
  state <- Update-State(state, action, percept)
  allRules <- Rule-Match(state, rules, goal)
  bestRule <- Utility(allRules)
  action <- Rule-Action[bestRule]
  return action
A utility-based agent. It uses a model of the world along with a utility function that measures its preferences among states of the world.
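To show how a utility function resolves conflicting goals such as speed and safety, here is a minimal sketch. The candidate routes, their predicted trip times and risks, and the weighting in `utility` are invented for illustration only.

```python
# Predicted outcome of each candidate action:
# (trip time in minutes, accident risk). Numbers are made up for illustration.
OUTCOMES = {
    "highway": (20, 0.03),
    "main road": (30, 0.01),
    "back streets": (45, 0.005),
}


def utility(outcome, risk_weight=1000.0):
    """Map an outcome to a real number; higher is better."""
    minutes, risk = outcome
    return -(minutes + risk_weight * risk)


def utility_based_agent(outcomes, utility_fn=utility):
    """Choose the action whose predicted outcome has the highest utility."""
    return max(outcomes, key=lambda action: utility_fn(outcomes[action]))


careful_choice = utility_based_agent(OUTCOMES)
hasty_choice = utility_based_agent(
    OUTCOMES, lambda outcome: utility(outcome, risk_weight=100.0)
)
```

With the default weight the agent prefers the main road; when risk counts for less, it prefers the highway. All three routes reach the destination, so a goal alone could not distinguish between them.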
Learning agents
• All agents can improve their performance through learning.
• Learning: allows the agent to match new states/actions
• A learning agent can be divided into four conceptual components:
  – Learning element: makes improvements
  – Performance element: selects external actions based on percepts (the entire agent in the previous cases).
  – Critic: gives feedback to the learning element about success (it tells the learning element how well the agent is doing with respect to a fixed performance standard).
  – Problem generator: suggests actions to find new states.
Example : Learning agents : Taxi driving
•The performance element consists of whatever collection of knowledge and procedures the TA has for selecting its driving actions.
•The critic observes the world and passes information along to the learning element. For example after the taxi makes a quick left turn across three lanes the critic observes the shocking language used by other drivers. From this experience the learning element is able to formulate a rule saying this was a bad action, and the performance element is modified by installing this new rule.
•The problem generator may identify certain areas of behavior in need of improvement and suggest experiments : such as testing the brakes on different road surfaces under different conditions.
•The learning element can change any of the knowledge components of the previous agent types: from observations of two successive states it can learn how the world evolves, and from observations of the results of its actions it can learn what its actions do.
Summary : Exercises
Define in your own words the following terms : agent, agent function, agent program, rationality, autonomy, reflex agent, model-based agent, goal-based agent, utility-based agent, learning agent.
Solution
• Agent: an entity that perceives and acts (for example, a program that operates on behalf of a human).
• Agent function: a function that specifies the agent's action in response to every possible percept sequence (input: percept sequence; output: action).
• Agent program: the program which, combined with a machine architecture, implements an agent function (input: one percept; output: an action).
• Rationality: a property of agents that choose actions that maximize their expected utility, given the percepts to date.
• Autonomy: a property of agents whose behavior is determined by their own experience rather than solely by their initial programming.
• Reflex agent: an agent whose action depends only on the current percept.
• Model-based agent: an agent whose action is derived directly from an internal model of the current world state that is updated over time.
• Goal-based agent: an agent that selects actions that it believes will achieve explicitly represented goals.
• Utility-based agent: an agent that selects actions that it believes will maximize the expected utility of the outcome state.
• Learning agent: an agent whose behavior improves over time based on its experience.
Exercise
Agent type                   | Performance measure | Environment | Actuators | Sensors
Robot soccer player          |                     |             |           |
Internet book-shopping agent |                     |             |           |
Autonomous Mars rover        |                     |             |           |

Task environment             | Observable | Deterministic | Episodic | Static | Discrete | Agents
Robot soccer                 |            |               |          |        |          |
Internet book-shopping agent |            |               |          |        |          |
Autonomous Mars rover        |            |               |          |        |          |