Chapter 9 Generalization, Discrimination, and the Representation of Similarity

Mar 15, 2016

Carly Booker

Transcript
Page 1: Chapter 9

Chapter 9

Generalization, Discrimination, and the Representation of Similarity

Page 2: Chapter 9

9.1

Behavioral Processes

Page 3: Chapter 9

3

9.1 Behavioral Processes

• When Similar Stimuli Predict Similar Consequences

• When Similar Stimuli Predict Different Consequences

• Unsolved Mysteries—Why Are Some Feature Pairs Easier to Discriminate Between Than Others?

• When Dissimilar Stimuli Predict the Same Consequence

• Learning and Memory in Everyday Life: Discrimination and Stereotypes in Generalizing about Other People

Page 4: Chapter 9

4

Generalization and Discrimination

• Generalization: transfer of past learning to new situations or problems. Responding to one stimulus (S) as a result of training with another; influenced by similarity to the training stimulus.

Specificity: deciding how narrowly a rule applies. Generality: deciding how broadly a rule applies.

• Discrimination: recognition of differences between stimuli.

Page 5: Chapter 9

5

When Similar Stimuli Predict Similar Consequences

• Generalization gradient: graph showing how physical changes in stimuli correspond to changes in behavioral responses.

• In the Guttman and Kalish study, pigeons learned to peck a yellow light (training S) for food. The gradient shows how often they subsequently pecked lights of different colors (Fig 9.1). Gradient width illustrates the level of stimulus generalization.

Page 6: Chapter 9

6

(Fig 9.1) Stimulus Generalization Gradients in Pigeons

Adapted from Guttman & Kalish, 1956, pp. 79–88.

Page 7: Chapter 9

7

What Causes Generalization Gradients?

• Is it discrimination error?

• Logical inference about shared consequences?

• Shepard (1987): Identify regions of shared consequence. Assume all possible regions, small and large. Average probabilistically over all. Result: standard exponentially declining gradient.

• Argues: “View exp-declining gradients as representing attempt to predict, based on past experience, how likely it is that what is true about the consequences of one stimulus will also be true of other similar stimuli.”
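Shepard's averaging argument can be checked numerically in a deliberately simplified one-dimensional case. The sketch below assumes that a consequential region is an interval of width s placed at random around the training stimulus, and that region widths follow a gamma density with shape 2. Both are assumptions chosen for illustration (the slide does not specify them), and the names `overlap_probability`, `size_prior`, and `generalization` are hypothetical. Under these assumptions the averaged gradient should come out as exp(-rate × distance).

```python
import math

def overlap_probability(d, s):
    """Chance that an interval of width s covering the training stimulus
    also covers a point at distance d (1-D, random placement)."""
    return max(0.0, 1.0 - d / s)

def size_prior(s, rate):
    """Assumed prior over region widths: gamma density with shape 2."""
    return rate * rate * s * math.exp(-rate * s)

def generalization(d, rate=1.0, s_max=60.0, steps=60000):
    """Average the overlap probability over all region widths (midpoint rule)."""
    h = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h  # midpoint of the i-th slice, never zero
        total += overlap_probability(d, s) * size_prior(s, rate) * h
    return total

# Generalization strength at several distances from the training stimulus.
gradient = [(d, generalization(d)) for d in (0.0, 0.5, 1.0, 2.0, 4.0)]
```

With `rate=1.0`, each value in `gradient` should agree with `math.exp(-d)` up to numerical integration error, which is the exponentially declining shape the slide describes.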

Page 8: Chapter 9

8

Generalization as a Search for Similar Consequences

• Consequential region—all stimuli with the same results as the training stimulus, as mapped on a generalization gradient.

• For example, the pigeon has a moderate expectation of getting food for pecking a yellow-range light (given Fig 9.1).

Page 9: Chapter 9

9

The Challenge of Incorporating Similarity into Learning Models

• Discrete-component representation: representation in which each individual stimulus (or stimulus feature) corresponds to its own node or “component.” Simplest possible scheme to represent stimuli.

• Fig 9.2 uses discrete-component representations. Shows an unrealistic generalization gradient.

Page 10: Chapter 9

10

(Fig 9.2) Stimulus Generalization Model Using Discrete-Component Representations

Page 11: Chapter 9

11

Limitations of Discrete-Component Representations

• Representations are applicable to situations in which stimuli are dissimilar and little generalization would occur. Fail when stimuli have high degree of physical similarity.

• Note: Different representations in different contexts provide different patterns of similarity. Representations are context-specific.

Page 12: Chapter 9

12

(Fig 9.3) Generalization Gradient Produced by Discrete-Component Network of Fig 9.2

*Shows no response to yellow-orange light (despite similarity to previously trained yellow light). Responds only to the trained “yellow” stimulus; fails to show a smooth generalization gradient like that shown in Fig 9.1.
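A minimal sketch of why a discrete-component network shows no generalization, assuming a single layer of weights trained with a delta rule. The learning rule, learning rate, trial count, and the five color labels are illustrative assumptions, not details taken from Fig 9.2.

```python
STIMULI = ["green", "yellow-green", "yellow", "yellow-orange", "orange"]

def one_hot(stimulus):
    """Discrete-component code: exactly one input node is active per stimulus."""
    return [1.0 if name == stimulus else 0.0 for name in STIMULI]

def respond(weights, stimulus):
    """Response strength is the weighted sum of the active input nodes."""
    return sum(w * x for w, x in zip(weights, one_hot(stimulus)))

def train(weights, stimulus, target, rate=0.2, trials=50):
    """Delta-rule updates (assumed): only weights of active nodes change."""
    for _ in range(trials):
        inputs = one_hot(stimulus)
        error = target - respond(weights, stimulus)
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights

# Train "yellow" to predict food, then test every color.
weights = train([0.0] * len(STIMULI), "yellow", target=1.0)
discrete_gradient = {name: round(respond(weights, name), 3) for name in STIMULI}
```

Because no two stimuli share a node, training on yellow changes only the yellow weight, so every other color keeps a response of zero.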

Page 13: Chapter 9

13

Shared Elements and Distributed Representations

• Thorndike (law of effect), Estes (stimulus sampling theory), and Rumelhart (connectionist models) contributed to a contemporary associative-learning model. Conceptualized with distributed representations (overlapping pools of stimulus nodes). Similar stimuli activate common elements; something learned about one stimulus transfers to other stimuli that activate the same nodes.

Page 14: Chapter 9

14

Yellow Orange

Thorndike and Estes Shared Elements

Network model follows…

Page 15: Chapter 9

15

Shared Elements and Distributed Representations

• Fig 9.4a–d shows a network model using distributed representations. Nodes are laid out in a topographic representation (nodes responding to physically similar stimuli are placed beside each other in the model).

• 9.4a shows the model (which is only slightly more complicated than Fig 9.2).

• 9.4b shows the outcome in distributed weights after many acquisition trials.

Page 16: Chapter 9

16

(Fig 9.4a) Distributed Representation Network

Page 17: Chapter 9

17

(Fig 9.4b) Train “Yellow”

Page 18: Chapter 9

18

Shared Elements and Distributed Representations

• 9.4c shows response strength from a test with a similar stimulus (yellow-orange).

• 9.4d shows the weaker response to a test with a less similar stimulus (orange).

• Such a distributed representation model better matches real-life gradients, much like Fig 9.1 (see Fig 9.5).
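The graded decline from yellow to yellow-orange to orange can be reproduced with a small sketch. It assumes that each color activates three adjacent nodes in a topographic row and that weights are trained with a delta rule; the node count, overlap width, and learning parameters are illustrative assumptions rather than values read from Fig 9.4.

```python
STIMULI = ["green", "yellow-green", "yellow", "yellow-orange", "orange"]
N_NODES = len(STIMULI) + 2  # room for every stimulus to span three neighbors

def encode(stimulus):
    """Distributed code: each stimulus activates three adjacent nodes."""
    start = STIMULI.index(stimulus)
    return [1.0 if start <= j < start + 3 else 0.0 for j in range(N_NODES)]

def respond(weights, stimulus):
    """Response strength is the weighted sum of the active nodes."""
    return sum(w * x for w, x in zip(weights, encode(stimulus)))

def train(weights, stimulus, target, rate=0.2, trials=50):
    """Delta-rule updates (assumed) shared across all active nodes."""
    for _ in range(trials):
        inputs = encode(stimulus)
        error = target - respond(weights, stimulus)
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights

# Train "yellow" only, then test every color.
weights = train([0.0] * N_NODES, "yellow", target=1.0)
distributed_gradient = {name: round(respond(weights, name), 2) for name in STIMULI}
```

In this sketch the response falls off with the number of nodes a test color shares with yellow: two shared nodes for the immediate neighbors, one for the colors two steps away.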

Page 19: Chapter 9

19

(Fig 9.4c) Test “Yellow-Orange” Some Decline in Response

Page 20: Chapter 9

20

(Fig 9.4d) Test “Orange” More Decline in Response

Page 21: Chapter 9

21

(Fig 9.5) Stimulus Generalization Gradient Produced by Distributed Representation Model of Fig 9.4

Page 22: Chapter 9

22

When Similar Stimuli Predict Different Consequences

• Two substances that initially appear similar may become distinguishable over time.

• Example: Gooseberries look like green grapes. If you are allergic to gooseberries, you learn to distinguish them from green grapes (discrimination).

Page 23: Chapter 9

23

Discrimination Training and Learned Specificity

• The weaker the generalization, the stronger the discrimination. Discrimination = differential responding to two stimuli. Discrimination can be trained; in discrimination training, two different (but similar) stimuli are presented on each trial.

• The steeper (and narrower) the gradient, the higher the discrimination.
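The link between discrimination training and a steeper gradient can be sketched with a small shared-elements network. The sketch assumes five stimuli coded by three adjacent nodes each, a delta rule, trials that present one stimulus at a time, and non-reinforced stimuli (S-) on both sides of the reinforced one (S+). All of these are illustrative assumptions, and the names are hypothetical.

```python
N_STIMULI, N_NODES = 5, 7

def encode(i):
    """Stimulus i activates three adjacent nodes (assumed distributed code)."""
    return [1.0 if i <= j < i + 3 else 0.0 for j in range(N_NODES)]

def respond(weights, i):
    """Response strength is the weighted sum of the active nodes."""
    return sum(w * x for w, x in zip(weights, encode(i)))

def train(trial_list, rate=0.1, epochs=500):
    """Delta-rule training over repeated passes through the trial list."""
    weights = [0.0] * N_NODES
    for _ in range(epochs):
        for i, target in trial_list:
            error = target - respond(weights, i)
            weights = [w + rate * error * x for w, x in zip(weights, encode(i))]
    return weights

S_PLUS = 2
# Group 1: only S+ is trained. Group 2: S+ is reinforced, its neighbors are not.
plain = train([(S_PLUS, 1.0)])
discrim = train([(S_PLUS, 1.0), (S_PLUS - 1, 0.0), (S_PLUS + 1, 0.0)])

plain_gradient = [respond(plain, i) for i in range(N_STIMULI)]
discrim_gradient = [respond(discrim, i) for i in range(N_STIMULI)]
```

In this sketch `plain_gradient` declines gradually on either side of S+, while `discrim_gradient` drops to roughly zero for the neighboring stimuli, which is the steeper and narrower pattern associated with stronger discrimination.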

Page 24: Chapter 9

24

Discrimination Training and Learned Specificity

• Fig 9.6 shows the adapted results of a classic 1962 experiment (Jenkins and Harrison studied tone discrimination in pigeons). One gradient represents the test pattern for pigeons that heard a 1000 Hz tone before they pecked and received food.

The other gradient represents the generalization for pigeons that were also intermittently exposed to a similar 950 Hz tone without food.

Which is the control group? Experimental group?

Page 25: Chapter 9

25

(Fig 9.6) Generalization Gradients for Tones of Different Frequencies

Adapted from Jenkins and Harrison, 1962.

Page 26: Chapter 9

26

Unsolved Mysteries: Why Are Some Feature Pairs Easier to Discriminate between Than Others?

• Some pairs of stimulus features are separable, such as brightness and hue.

• Other feature pairs are perceived holistically, such as brightness and saturation.

• Understanding the nature of feature pairs relates to stimulus generalization.

Page 27: Chapter 9

27

The Two-Dimensional Filtering Task

Page 28: Chapter 9

28

Negative Patterning: Differentiating Configurations from Their Individual Components

• Negative patterning occurs when we respond positively to two stimuli presented separately, but we respond negatively to the compound (i.e., the combination).

• Example: Mom at home? Eat dinner in the kitchen. Dad at home? Eat dinner in the kitchen. Both Mom and Dad at home? Don’t eat dinner in the kitchen (eat in the dining room).

Page 29: Chapter 9

29

Negative Patterning

• Rats, monkeys, and humans learn negative patterning tasks.

• Rabbits can learn to blink to either a tone or a light, and to not blink to a simultaneous tone and light.

Page 30: Chapter 9

30

Negative Patterning in Rabbit Eyeblink Conditioning

Adapted from Kehoe, 1988, Figure 9.

Page 31: Chapter 9

31

Negative Patterning

• Single-layer network models using discrete-component representations cannot learn negative patterning.

Page 32: Chapter 9

32

Negative Patterning

• Fig 9.11 shows a multi-layer network model for negative patterning. It includes extra nodes that fire only when two or more specific features are present.

• In Fig 9.11, a configural node for “tone + light” will fire only if both inputs are active.
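The contrast between the two network types can be sketched as follows. The code assumes a delta rule and treats the configural node as a simple AND of the tone and light inputs feeding the same output; the learning rate, epoch count, and function names are illustrative assumptions, not details from Fig 9.11.

```python
# Negative patterning trials: (tone, light) -> should the rabbit blink?
TRIALS = [((1, 0), 1.0), ((0, 1), 1.0), ((1, 1), 0.0)]

def elemental(tone, light):
    """Single-layer code: one node per stimulus component."""
    return [float(tone), float(light)]

def configural(tone, light):
    """Adds a node that is active only when tone AND light are both present."""
    return [float(tone), float(light), float(tone and light)]

def respond(weights, encode, tone, light):
    """Response strength is the weighted sum of the active nodes."""
    return sum(w * x for w, x in zip(weights, encode(tone, light)))

def train(encode, rate=0.1, epochs=2000):
    """Delta-rule training (assumed) over repeated passes through TRIALS."""
    weights = [0.0] * len(encode(0, 0))
    for _ in range(epochs):
        for (tone, light), target in TRIALS:
            x = encode(tone, light)
            error = target - respond(weights, encode, tone, light)
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return weights

single = train(elemental)   # cannot solve the task
multi = train(configural)   # can solve the task
```

In the single-layer version the compound inherits both component weights, so it always draws a stronger response than either cue alone. With the configural node, a negative weight on "tone + light" can cancel the two positive component weights.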

Page 33: Chapter 9

33

(Fig 9.11) Solving Negative Patterning with a Network Model

Page 34: Chapter 9

34

Configural Learning in Categorization

• Configural tasks require sensitivity to combinations of stimulus cues, above and beyond what is known about stimulus components.

• Configural nodes can be applied to categorization learning, where humans learn to classify stimuli into categories. e.g., diagnosis from symptoms.

Page 35: Chapter 9

35

Configural Learning in Categorization

• Fig 9.12a–c shows a configural-node model of category learning.

• 9.12a shows the model.

• In 9.12b, fever and soreness together (without ache) predict the disease. Dilemma = combinatorial explosion.

• 9.12c is a simpler, more flexible (alternative) model.
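The combinatorial explosion is easy to quantify under one simple assumption: the network needs one configural node for every combination of two or more features. The function name and the feature counts of 10 and 20 are hypothetical choices for illustration.

```python
from itertools import combinations

def configural_nodes(features):
    """List every combination of two or more features (one node each)."""
    return [combo
            for size in range(2, len(features) + 1)
            for combo in combinations(features, size)]

symptoms = ["fever", "ache", "soreness"]
nodes = configural_nodes(symptoms)

# Closed form for n features: all subsets, minus the empty set and the n singletons.
counts = {n: 2 ** n - n - 1 for n in (3, 10, 20)}
```

Three symptoms need only four configural nodes, but the closed form in `counts` shows the number roughly doubling with each added feature, which is why a model that pre-builds every configuration does not scale.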

Page 36: Chapter 9

36

(Fig 9.12a)

Page 37: Chapter 9

37

(Fig 9.12b)

Page 38: Chapter 9

38

(Fig 9.12c)

Page 39: Chapter 9

39

When Dissimilar Stimuli Predict the Same Consequence

• Co-occurrence of stimuli may increase generalization.

• Example: If you like the cookies at a new bakery, you may like their brownies.

Page 40: Chapter 9

40

Sensory Preconditioning: Similar Predictions for Co-occurring Stimuli

• Sensory preconditioning: conditioning without an explicit US. Prior presentation of compound stimuli results in a later tendency for learning about one stimulus to generalize to the other.

Page 41: Chapter 9

41

Sensory Preconditioning *Example*

• Step 1: (tone, light)

• Step 2: (light, puff) CR eyeblink should develop over acquisition trials.

• Step 3: (tone alone) If CR eyeblink occurs, we call this phenomenon “sensory preconditioning.”

• Illustrates the generalizability of a stimulus’s power! The tone was never presented as a cue for the puff!

Page 42: Chapter 9

42

Sensory Preconditioning

Page 43: Chapter 9

43

Acquired Equivalence: Novel Similar Predictions Based on Prior Similar Consequences

• Acquired equivalence: prior training in stimulus equivalence increases the amount of generalization between two stimuli, even if the stimuli are superficially dissimilar.

• In the Hall study, pigeons learned that dissimilar colors, each paired separately with the same color, had the same result. They then demonstrated this generalization in a new situation.

Page 44: Chapter 9

44

Acquired Equivalence

Page 45: Chapter 9

45

Learning and Memory in Everyday Life: Discrimination and Stereotypes in Generalizing about Other People

• Category formation is a basic cognitive process.

• Rational generalizations let us tentatively generalize individual outcomes from previous experiences.

• Stereotyping is denying exceptions for individuals from a group for which we may hold oversimplified beliefs. Attempts to justify unfair treatment.

Page 46: Chapter 9

46

9.1 Interim Summary

• Generalization = transfer of past learning to new situations and problems. Requires finding a balance between specificity (knowing how narrowly a rule applies) and generality (knowing how broadly the rule applies).

• Discrimination = recognizing differences between stimuli; knowing which to prefer.

• Understanding similarity is essential to understanding generalization and discrimination.

Page 47: Chapter 9

47

9.1 Interim Summary

• Discrete-component representations: assign each stimulus (or feature) to its own node. Applicable to situations in which similarity among features is small enough that there is negligible transfer of response from one to another.

• Distributed representations: incorporate idea of shared elements. Allow creation of psychological models with concepts represented as patterns of activity over many nodes; provide ability to model stimulus similarity and generalization.

Page 48: Chapter 9

48

9.1 Interim Summary

• We tend to assume that patterns formed from compound cues will have consequences that parallel (or even combine) what we know about the individual cues.

• However, some discriminations require sensitivity to the configurations of stimulus cues above and beyond what is known about the individual stimulus cues.

Page 49: Chapter 9

49

9.1 Interim Summary

• Animals and people can learn to generalize between stimuli that have no physical similarity but that do have a history of co-occurrence or of predicting the same outcome.

Page 50: Chapter 9

9.2

Brain Substrates

Page 51: Chapter 9

51

9.2 Brain Substrates

• Cortical Representations and Generalization

• Generalization and the Hippocampal Region

Page 52: Chapter 9

52

Cortical Representations of Sensory Stimuli

• Initial cortical processing of sensory information (vision, sound, touch, etc.) occurs in areas dedicated to each sense.

• Areas in the mammalian cortex can be organized into topographical maps (e.g., homunculi for primary sensory and motor cortices).

Page 53: Chapter 9

53

Topographic Map of the Primary Sensory Cortex

Adapted from Penfield & Rasmussen, 1950.

Page 54: Chapter 9

54

Shared-Elements Models of Receptive Fields

• Does receptive field function match generalization theories?

• If brain is organized in distributed representations, similar stimuli should activate common nodes (or neurons).

Page 55: Chapter 9

55

Shared-Elements Models

• Fig 9.17a–c shows a shared-elements network model of generalization.

• 9.17a shows how a 550-Hz tone might activate nodes 2, 3, and 4.

• 9.17b shows how a similar 560-Hz tone might activate nodes 3, 4, and 5.

• 9.17c illustrates how the node overlap (3 and 4) produces generalization between a 550-Hz tone and a 560-Hz tone.
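The overlap described for Fig 9.17 amounts to a set intersection. The sketch below uses the node numbers given above; the function names and the idea of expressing overlap as a fraction are illustrative assumptions.

```python
# Nodes activated by each tone, as described for Fig 9.17a and 9.17b.
ACTIVE_NODES = {550: {2, 3, 4}, 560: {3, 4, 5}}

def shared_nodes(tone_a, tone_b):
    """Nodes activated by both tones (the shared elements)."""
    return ACTIVE_NODES[tone_a] & ACTIVE_NODES[tone_b]

def overlap_fraction(tone_a, tone_b):
    """Share of tone_a's active nodes that tone_b also activates."""
    return len(shared_nodes(tone_a, tone_b)) / len(ACTIVE_NODES[tone_a])

common = shared_nodes(550, 560)        # the overlapping nodes
fraction = overlap_fraction(550, 560)  # proportion of shared activity
```

Anything learned about the 550-Hz tone is stored on nodes 2, 3, and 4, so the 560-Hz tone inherits whatever is attached to the two nodes it shares.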

Page 56: Chapter 9

56

(Fig 9.17a)

Page 57: Chapter 9

57

(Fig 9.17b)

Page 58: Chapter 9

58

(Fig 9.17c)

Page 59: Chapter 9

59

Shared-Elements Models of Receptive Fields

• Auditory neurons respond to varying frequencies. Each neuron responds best to one frequency (see Fig 9.18). The wider the neuron’s receptive field, the broader the range of physical stimuli (in this case, auditory frequencies) processed by that neuron.

Page 60: Chapter 9

60

(Fig 9.18) Activity of node/neuron #3 in Fig 9.17 is recorded for each of the tones between 520 Hz and 580 Hz; the best frequency is 550 Hz.
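A receptive field like this one can be sketched as a tuning curve. The Gaussian shape, the width values, and the function name `firing_rate` are assumptions made for illustration; only the 550-Hz best frequency and the 520–580 Hz test range come from the figure description.

```python
import math

def firing_rate(freq_hz, best_hz=550.0, width_hz=10.0):
    """Assumed Gaussian tuning curve, peaking at the neuron's best frequency."""
    return math.exp(-((freq_hz - best_hz) ** 2) / (2 * width_hz ** 2))

tones = range(520, 581, 10)  # 520, 530, ..., 580 Hz

# Relative activity of a narrowly tuned and a broadly tuned neuron.
narrow = {f: round(firing_rate(f, width_hz=10.0), 2) for f in tones}
broad = {f: round(firing_rate(f, width_hz=20.0), 2) for f in tones}
```

Both hypothetical neurons respond most strongly at 550 Hz, but the broadly tuned one still responds substantially at 570 Hz, illustrating how a wider receptive field covers a wider range of frequencies.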

Page 61: Chapter 9

61

Topographic Organization and Generalization

• Richard Thompson’s 1960s animal studies found that an intact auditory cortex is necessary for auditory generalization from a specific tone. Such sensory receptive fields can change with learning.

Page 62: Chapter 9

62

Plasticity of Cortical Representations

• Even in adult animals, cortical areas temporarily shrink from disuse and spread from use.

• Weinberger studies indicate that cortical plasticity is due to stimulus pairing. Stimulus presentation alone doesn’t drive plasticity; the stimulus needs to be meaningfully related to a consequence.

Page 63: Chapter 9

63

Plasticity of Representation in the Primary Auditory Cortex

Adapted from Weinberger, 1977, figure 2.

Page 64: Chapter 9

64

Plasticity of Cortical Representations

• The nucleus basalis in the basal forebrain releases acetylcholine (ACh) throughout the cortex. ACh facilitates cortical plasticity.

Page 65: Chapter 9

65

Generalization and the Hippocampal Region

• Generalization shown in sensory preconditioning is disrupted by hippocampal-region lesions. Lesioned rabbits display no sensory preconditioning.

Page 66: Chapter 9

66

Hippocampal Region and Sensory Preconditioning

Drawn from data presented in Port & Patterson, 1984.

Page 67: Chapter 9

67

Generalization and the Hippocampal Region

• Similarly, rats with hippocampal region damage (lesions in the entorhinal cortex) showed poor acquired equivalence. Latent inhibition in rabbit eyeblink conditioning was eliminated with entorhinal cortical lesions.

Page 68: Chapter 9

68

Latent Inhibition in Rabbit Eyeblink Conditioning

Adapted from Shohamy, Allen, & Gluck, 2000.

Page 69: Chapter 9

69

Modeling the Role of the Hippocampus in Adaptive Representations

• Gluck and Myers (1993, 2001) propose that compression and differentiation of stimulus representations are computed in the hippocampal region. Region acts as an “information highway.” Cerebral cortex and cerebellum process associations for behavioral response and storage.

• Research supports this model.

Page 70: Chapter 9

70

9.2 Interim Summary

• While it is possible for an animal without an auditory cortex to learn to respond to auditory stimuli, an intact auditory cortex is essential for normal auditory generalization. Without their auditory cortex, animals can learn to respond to the presence of a tone, but cannot respond precisely to a specific tone.

Page 71: Chapter 9

71

9.2 Interim Summary

• Cortical plasticity is driven by the correlation between stimulus and salient event. Plasticity is not driven by presentation alone; stimulus has to be meaningfully related to ensuing consequences. But primary sensory cortices do not receive information about which consequence occurred, only that some sort of salient event has occurred. Thus, primary sensory cortices only determine which stimuli deserve expanded representation and which do not.

Page 72: Chapter 9

72

9.2 Interim Summary

• When stimulus is paired with salient event (such as food or shock), nucleus basalis becomes active. Delivers acetylcholine to cortex. Enables cortical remapping to enlarge the representation of stimulus in the appropriate primary sensory cortex.

Page 73: Chapter 9

73

9.2 Interim Summary

• Hippocampal region plays key role in learning behaviors that depend on stimulus generalization (e.g., the classical conditioning paradigms of sensory preconditioning and latent inhibition).

• Computational modeling suggests that this role is related to the hippocampal region’s compression and differentiation of stimulus representations.

Page 74: Chapter 9

9.3

Clinical Perspectives

Page 75: Chapter 9

75

9.3 Clinical Perspectives

• Generalization Transfer and Hippocampal Atrophy in the Elderly

• Rehabilitation of Language-Learning-Impaired Children

Page 76: Chapter 9

76

Generalization Transfer and Hippocampal Atrophy in the Elderly

• Hippocampal or entorhinal cortical atrophy may be an early sign of Alzheimer’s disease.

Adapted from de Leon et al., 1993. Images courtesy of Dr. Mony de Leon NYU School of Medicine

Page 77: Chapter 9

77

Generalization Transfer and Hippocampal Atrophy in the Elderly

• Myers and associates developed a method to study generalization transfer in the elderly.

Human acquired equivalence study (adapted from Myers et al., 2003):

Phase 1: equivalence training

Phase 2: train new outcome

Phase 3: transfer

Page 78: Chapter 9

78

Human Acquired Equivalence Study

• Phase 1: Learned to associate the blue fish with the brunette and the blue fish with the blonde (equivalent preference).

• Phase 2: Learned to associate the red fish with the brunette.

• Phase 3: Can they generalize this red fish preference to the blonde?

Page 79: Chapter 9

79

Human Acquired Equivalence Study

• Results: Healthy participants completed all three phases. Participants with hippocampal atrophy completed phases 1 and 2, but could not transfer learning in phase 3.

• Test might be a quick and easy screening tool for potential cognitive impairment.

Page 80: Chapter 9

80

Rehabilitation of Language-Learning-Impaired Children

• Language learning impairment (LLI): language-learning problems not attributable to known factors. Children with normal intelligence but very low scores on oral language tests. Tallal found that the problem was not language-specific; rather, it was a problem in rapid sensory processing.

Page 81: Chapter 9

81

Rehabilitation of Language-Learning-Impaired Children

• In the study (Temple et al., 2003): Participants = 20 dyslexic children (8–12 years old) and 12 children matched for age, gender, handedness, and non-verbal IQ.

All received fMRI during a rhyming task before and after dyslexic children’s training.

Study includes behavioral remediation program to improve auditory and language processing.

Uses non-linguistic and acoustically modified speech.

Conducted 5 days per week, 100 min. per day, for 27.9 (average) training days.

Page 82: Chapter 9

82

Rehabilitation of Language-Learning-Impaired Children

• Results: Children’s language and reading scores increased. fMRI increases in language-processing areas (left temporo-parietal cortex).

• Illustrates cortical plasticity in children from intense behavioral treatment.

Page 83: Chapter 9

83

Brain Plasticity in Children with Dyslexia

Data from Temple et al., 2003; Images courtesy of Elise Temple.

Page 84: Chapter 9

84

9.3 Interim Summary

• Some forms of generalization depend on medial temporal lobe mediation.

• Elderly individuals with hippocampal region atrophy (a risk factor for subsequent development of Alzheimer’s disease) can learn initial discriminations but fail to appropriately transfer learning in later tests.

Page 85: Chapter 9

85

9.3 Interim Summary

• Studies of dyslexia and other language impairments provide examples of how insights from animal research on cortical function can have clinical implications for humans with learning impairments.