Operant Conditioning

Operant Conditioning – Module 19

Cognitive (Latent) Learning – Module 19

Intro Psychology, Oct 21-23, 2009

Classes #23-24

Instrumental Conditioning

E. L. Thorndike (1905)

Described the learning governed by his "law of effect" as instrumental conditioning, because responses are strengthened when they are instrumental in producing rewards

Law of Effect

Responses that are rewarded are more likely to be repeated, and responses that produce discomfort are less likely to be repeated

Thorndike's Puzzle Box

In his classic experiment, a cat was locked in the box and enticed to escape by food placed outside the box, out of reach

The box included ropes, levers, and latches that the cat could use to escape

Trial-and-error behavior would lead to ultimate success (usually within three minutes)

Thorndike felt we learned things through trial and error – awareness

Gestalt Viewpoint

Wolfgang Köhler

A Gestalt psychologist who held an opposing view: that we learn things implicitly – unawareness – natural insight

Example: chimpanzee in a cage – food out of reach – but stick is not…

Operant Conditioning

Operant Conditioning

A type of learning in which voluntary (controllable and non-reflexive) behavior is strengthened if it is reinforced and weakened if it is punished (or not reinforced)

Skinner (1938)

The organism learns a response by operating on the environment…

Note: The terms instrumental conditioning and operant conditioning describe essentially the same learning process and are often used interchangeably

Basically, Skinner extended and formalized many of Thorndike's ideas


Response comes first and is voluntary, unlike classical conditioning, where the stimulus comes first and the response is involuntary

Classical: S → R

Operant: S → R → S, which becomes R → S

The Skinner Box

Soundproof chamber with a bar or key that could be manipulated to release a food or water reward

Shaping: Reinforcing successive approximations

Responses that come successively closer to the desired response are reinforced…

Skinner referred to this as his “Behavioral Technology”

Taught pigeons “unpigeon-like” behaviors

Walking in a figure 8, playing ping-pong, and keeping a “guided missile” on course by pecking at a moving target displayed on a screen… but he was most proud of getting them to hoist an American flag and then to salute it

B.F. Skinner (1904-1990)

In the Lab…


Important terms

Primary Reinforcers

Conditioned (Secondary) Reinforcers

Positive Reinforcement

Punishment

Negative Reinforcement

Reinforcers

Primary Reinforcers

Innately rewarding; no learning necessary

Stimulus that naturally strengthens any response that precedes it without the need for any learning on the part of the organism

Food, water, etc.

Secondary Reinforcers

A consequence that becomes reinforcing by being paired with a primary reinforcer

For people, money, good grades, words of praise, etc. are often linked to basic rewards

We need money to buy food, etc.

Positive Reinforcement

Behavior is strengthened when something pleasant or desirable occurs following the behavior

With the use of positive reinforcement, the chance that the behavior will occur in the future is increased

Punishment

Any stimulus presented immediately after a behavior that decreases the future probability of that behavior

For example:

If your kid runs into the middle of the street and you flip out and “express to him how bad he is,” this (at least in psychological terms) is only considered to be punishment if it does in fact lead to a decrease in that child’s behavior of running into the street

Negative Reinforcement

One of the most misunderstood terms in psychology…

Definitely a problem with semantics here

The word reinforcement means that a response is strengthened

The word negative seems to imply that the response is somehow weakened

This is not the case here!

So how, literally, can a response be negatively reinforced?

Often, this term is misapplied to mean punishment

So let’s try to proceed slowly in our attempts to figure this out…


Positive Reinforcement is a reward

That’s easy enough

Punishment is something that weakens a response

Again, this is pretty basic

Negative reinforcement: to increase the likelihood of a behavior occurring in the future, an operant response is followed by the removal of an aversive stimulus

Example: When a child says "please" and "thank you" to his/her mother, the child may not have to do his/her dreaded chore of setting the table


So we are learning to do something to turn off a bad stimulus

Example: We put on boots to avoid sitting in class with wet socks on

Increasing a behavior to stop a bad thing from occurring

Doing something to remove the aversive stimulus (the negative reinforcer)
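The three terms defined so far can be told apart by two questions: was a stimulus presented or removed, and did the behavior increase or decrease? The lookup below is only a study-aid sketch. The function name and the string labels are hypothetical, and the one combination these notes do not define is deliberately left unlabeled.

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Name the consequence type from what happened to the stimulus and the behavior.

    stimulus_change: "presented" or "removed" (hypothetical labels)
    behavior_change: "increases" or "decreases" (hypothetical labels)
    """
    table = {
        # Something pleasant follows the behavior; behavior is strengthened.
        ("presented", "increases"): "positive reinforcement",
        # Something aversive is taken away; behavior is strengthened.
        ("removed", "increases"): "negative reinforcement",
        # A stimulus follows the behavior; behavior becomes less likely.
        ("presented", "decreases"): "punishment",
    }
    # Any other combination is not defined in these notes.
    return table.get((stimulus_change, behavior_change), "not covered in these notes")


# Saying "please" removes the dreaded chore, and saying "please" becomes more likely.
print(classify_consequence("removed", "increases"))  # negative reinforcement
```

Note that the word "negative" only describes the removal of the stimulus; the behavior itself still increases.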

Types of Negative Reinforcement

Escape Conditioning

This occurs when the behavior has led to a reduction of the aversiveness of the environment

Example: Rats moving away from the shock area after feeling the pain

This does involve an observable change in the environment

Avoidance Conditioning

When a behavior has prevented the onset of an impending increase in the aversiveness of the environment

Example: Rats moving away from the shock area after hearing a signal that the shock is about to be administered

Example: A child apologizes upon seeing their parent frowning, thus avoiding being yelled at

Involves no observable change in the environment

Schedules of Reinforcement

Continuous Reinforcement

Reinforcement delivered every time a particular response occurs

Intermittent Reinforcement

Reinforcement is administered only some of the time

Intermittent Schedules of Reinforcement

Fixed-Ratio

Reinforcement provided after a fixed number of responses

Example: Food every tenth bar press

Variable-Ratio

Reinforcement after a variable number of responses (the required number varies around an average)

An unpredictable number of responses is required (slot machines)
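The two ratio schedules can be sketched as a small simulation that reports which responses earn a reinforcer. This is an illustration only: the function names are hypothetical, and the variable-ratio rule (each requirement drawn uniformly from 1 to 2 × mean − 1, which averages to the mean) is an assumption chosen for simplicity, not a standard laboratory procedure.

```python
import random


def fixed_ratio(n_responses: int, ratio: int) -> list[int]:
    """Return the (1-based) response numbers reinforced on a fixed-ratio schedule."""
    return [r for r in range(1, n_responses + 1) if r % ratio == 0]


def variable_ratio(n_responses: int, mean_ratio: int, rng: random.Random) -> list[int]:
    """Return the response numbers reinforced on a variable-ratio schedule.

    Assumption: each requirement is drawn uniformly from 1 .. 2*mean_ratio - 1,
    so the requirements average out to mean_ratio.
    """
    reinforced = []
    next_at = rng.randint(1, 2 * mean_ratio - 1)
    for r in range(1, n_responses + 1):
        if r == next_at:
            reinforced.append(r)
            # Draw a fresh, unpredictable requirement for the next reinforcer.
            next_at = r + rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


# Food every tenth bar press.
print(fixed_ratio(30, 10))  # [10, 20, 30]

# Slot-machine style: the payoff points differ from run to run.
print(variable_ratio(30, 10, random.Random()))
```

Calling `fixed_ratio(n, 1)` reinforces every response, which is continuous reinforcement.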

Intermittent Schedules of Reinforcement

Fixed-Interval Schedule

Provides reinforcement for the first response that occurs after some fixed time has passed since the last reward

Number of responses doesn’t matter, only time

Example: Food is given to rats for the first response after each 20 min. interval

Variable-Interval Schedule

Reinforce the first response after a certain amount of time has passed

Again, number of responses doesn’t matter

But this time the amount of time changes

Might be the first response after ten minutes, then the next time it is the first response after 20 minutes, and then the next time it is the first response after 30 min…
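The interval rule (reinforce only the first response after the required time has passed since the last reward) can be sketched as follows. The function name, the made-up bar-press times, and the choice to start the clock at time 0 are all assumptions for illustration.

```python
def interval_schedule(response_times, intervals):
    """Return the response times (in minutes) that earn a reinforcer.

    response_times: sorted times at which the animal responds.
    intervals: required waits since the last reinforcer, used in order;
               the last value repeats if the list runs out.
    Assumption: the clock starts at time 0, as if a reward had just been given.
    """
    reinforced = []
    last_reward = 0
    i = 0
    for t in response_times:
        wait = intervals[min(i, len(intervals) - 1)]
        # Only the first response after the wait has elapsed is reinforced;
        # earlier responses earn nothing, however many there are.
        if t - last_reward >= wait:
            reinforced.append(t)
            last_reward = t
            i += 1
    return reinforced


presses = [5, 12, 19, 21, 30, 41, 44, 60, 75, 90, 121]  # hypothetical bar presses

# Fixed interval: always 20 min.
print(interval_schedule(presses, [20]))          # [21, 41, 75, 121]

# Variable interval: 10 min, then 20 min, then 30 min.
print(interval_schedule(presses, [10, 20, 30]))  # [12, 41, 75, 121]
```

In both runs, extra presses during the waiting period make no difference to when the next reinforcer arrives.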

Applications of Operant Conditioning: In the Classroom

Skinner thought that our education system was ineffective

He suggested that one teacher in a classroom could not teach many students adequately when each child learns at a different rate

He proposed using teaching machines (what we now call computers) that would allow each student to move at their own pace

The teaching machine would provide self-paced learning that gave immediate feedback, immediate reinforcement, identification of problem areas, etc., that a teacher could not possibly provide

Applications of Operant Conditioning: In the Workplace

Pedalino & Gamboa (1974)

To help reduce the frequency of employee tardiness, these researchers implemented a game-like system for all employees who arrived on time

When an employee arrived on time, they were allowed to draw a card

Over the course of a 5-day workweek, an employee who was on time every day would have a full hand for poker

At the end of the week, the best hand won $20

This simple method reduced employee tardiness significantly and demonstrated the effectiveness of operant conditioning on humans

Criticisms Of The Use Of Reinforcement

Criticism #1:

Behavior should not have to rely on persuasion… It is manipulative and controlling

Appropriate behavior should be the norm

Skinner says we are always controlled by rewards but are often unaware of them…

Parents, peers, schools, employers, etc. all use rewards to control our behavior

Skinner:

If it’s manipulative, then everyone is to blame?

Criticisms Of The Use Of Reinforcement

Criticism #2:

Reinforcement undermines intrinsic motivation…

Messes up our inner desire to do something

Now we need to do it for a tangible reward

Example: Child cleaning his/her room… Why do they do it?

Be careful of overjustification…

Cognitive Learning

Focus on the role of thinking processes in learning

Theory based on unseen internal factors rather than on external factors

Skinner was very much against these theories, but let’s look at one… latent learning…

Latent Learning

Tolman and Honzik (1930)

Took three groups of rats and had them run a maze

Group 1

Reinforced every time they found their way out of the maze (food box) for ten days

Group 2

Never reinforced (no food at the end)

Group 3

Reinforced only after day 10 of the experiment

Latent Learning

On day 11, they timed the three groups to see which group would make it through the maze the quickest…

Which group do you think was the fastest?
