Memory and Consolidation Prof.dr. Jaap Murre University of Amsterdam University of Maastricht [email protected] .

Memory and Consolidation

Prof.dr. Jaap Murre

University of Amsterdam

University of Maastricht

[email protected]

http://www.neuromod.org/courses/cb2004

Overview

• Brief review of neuroanatomy of memory

• Outline of the TraceLink model

• Some simulation results of neural network model, focussing on retrograde amnesia

• Memory Chain Model
  – Forgetting
  – Amnesia

The Amnesia Paradox

• Recent items are remembered best

• But they are the first to be lost with (retrograde) amnesia

The Daily News Memory Test at memory.uva.nl

[Figure: proportion correct versus retention interval in days (0 to 300), for 4 AFC and open questions; 1810 Dutch respondents.]

Théodule Ribot (1886)

• Ribot’s Law: With memory loss, recent memories suffer more

[Figure: Ribot gradient. Time axis from present to past with the lesion marked (x), spanning anterograde and retrograde amnesia; curves for an amnesia patient and for normal forgetting, on a scale from 0 to 100.]

Example: Patient data

[Figure: Kopelman (1989), news events test. Scores (0 to 0.8) by decade ('75-'84, '65-'74, '55-'64, '45-'54, '35-'44) for controls (n=16), Korsakoff's (n=6), and Alzheimer's (n=8) patients.]

Neuroanatomy of amnesia

• Hippocampus

• Adjacent areas such as entorhinal cortex and parahippocampal cortex

• Basal forebrain nuclei

• Diencephalon

The position of the hippocampus in the brain

Hippocampal connections

[Figure, panels (a) and (b): connections between the hippocampus, entorhinal cortex, perirhinal cortex, parahippocampal cortex, and unimodal and polymodal association areas (frontal, temporal, and parietal lobes); visual areas and somatosensory and motor areas, with area labels including 7a, 7b, 36, 46, TF, TH, 3a, and 3b; connections to and from sensory organs run via subcortical pathways.]

Hippocampus has an excellent overview of the entire cortex

The TraceLink Model

A model of memory consolidation and amnesia

Trace-Link model: structure

System 1: Trace system

• Function: Substrate for bulk storage of memories, ‘association machine’

• Corresponds roughly to neocortex

System 2: Link system

• Function: Initial ‘scaffold’ for episodes

• Corresponds roughly to hippocampus and certain temporal and perhaps frontal areas

System 3: Modulatory system

• Function: Control of plasticity

• Involves at least parts of the hippocampus, amygdala, fornix, and certain nuclei in the basal forebrain and in the brain stem

Stages in episodic learning

Dreaming and consolidation of memory

• Theory by Francis Crick and Graeme Mitchison (1983)

• Main problem: Overloading of memory

• Solution: Reverse learning leads to removal of ‘obsessions’

“We dream in order to forget”

Dreaming and memory consolidation

• When should this reverse learning take place?

• During REM sleep
  – Normal input is deactivated
  – Semi-random activations from the brain stem
  – REM sleep may have lively hallucinations

Consolidation may also strengthen memory

• This may occur during deep sleep (as opposed to REM sleep)

• Both hypothetical processes may work together to achieve an increase in the clarity of representations in the cortex

Experiment by Robert Stickgold

• Difficult visual discrimination problem

• Several hours of practice

• One group goes home

• Other group stays in the lab and skips a night of sleep

Improvement without further training due to sleep

[Figure: improvement (ms, 0 to 25) versus days after training (0 to 10), for normal sleep and for a skipped first night of sleep.]

Relevant animal data by Matt Wilson and Bruce McNaughton (1994)

• 120 neurons in rat hippocampus

• PRE: Slow-wave sleep before being in the experimental environment (cage)

• RUN: While in the experimental environment

• POST: Slow-wave sleep after having been in the experimental environment

Wilson and McNaughton data


Some important characteristics of amnesia

• Anterograde amnesia (AA)
  – Implicit memory preserved

• Retrograde amnesia (RA)
  – Ribot gradients

• Pattern of correlations between AA and RA
  – No perfect correlation between AA and RA

[Figure: recall probability (0 to 1) versus time (years, 0 to 50). Time axis from present to past with the lesion marked (x), spanning anterograde and retrograde amnesia; curves for an amnesia patient and for normal forgetting.]

Example of patient data

[Figure: scores (0 to 0.8) by decade ('75-'84, '65-'74, '55-'64, '45-'54, '35-'44) for controls (n=16), Korsakoff's (n=6), and Alzheimer's (n=8) patients.]


Retrograde amnesia

• Primary cause: loss of links

• Ribot gradients

• Shrinkage

Anterograde amnesia

• Primary cause: loss of modulatory system

• Secondary cause: loss of links

• Preserved implicit memory

Semantic dementia

• The term was adopted recently to describe a new form of dementia, notably by Julie Snowden et al. (1989, 1994) and by John Hodges et al. (1992, 1994)

• Semantic dementia is almost a mirror-image of amnesia

Neuropsychology of semantic dementia

• Progressive loss of semantic knowledge

• Word-finding problems

• Comprehension difficulties

• No problems with new learning

• Lesions mainly located in the infero-lateral temporal cortex but (early in the disease) with sparing of the hippocampus

Severe loss of trace connections

Stage-2 learning proceeds as normal

Stage-3 learning strongly impaired

Non-rehearsed memories will be lost

No consolidation in semantic dementia

Semantic dementia in TraceLink

• Primary cause: loss of trace-trace connections

• Stage-3 (and 4) memories cannot be formed: no consolidation

• The preservation of new memories will be dependent on constant rehearsal

Connectionist implementation of the TraceLink model

With Martijn Meeter

Some details of the model

• 42 link nodes, 200 trace nodes

• For each pattern
  – 7 nodes are active in the link system
  – 10 nodes are active in the trace system

• Trace system has a lower learning rate than the link system

How the simulations work: One simulated ‘day’

• A new pattern is activated

• The pattern is learned

• Because of low learning rate, the pattern is not well encoded at first in the trace system

• A period of ‘simulated dreaming’ follows
  – Nodes are activated randomly by the model
  – This random activity causes recall of a pattern
  – A recalled pattern is then learned extra
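The simulated day described above can be sketched roughly as follows. This is only a toy illustration and not the implementation used for the reported simulations: the Hebbian update, the k-winners-take-all recall, the learning-rate values, the number of dreams, and all function names are assumptions. Only the layer sizes, the number of active nodes per pattern, and the fact that the trace system learns more slowly than the link system come from the slides.

```python
import numpy as np

N_LINK, N_TRACE = 42, 200        # layer sizes (from the slides)
K_LINK, K_TRACE = 7, 10          # active nodes per pattern (from the slides)
LR_LINK, LR_TRACE = 0.5, 0.05    # assumed values; the slides only say trace < link
N = N_LINK + N_TRACE

# Assumed rate matrix: trace-trace connections learn slowly, all others fast.
RATE = np.full((N, N), LR_LINK)
RATE[N_LINK:, N_LINK:] = LR_TRACE


def random_pattern(rng):
    """Binary pattern with K_LINK active link nodes and K_TRACE active trace nodes."""
    p = np.zeros(N)
    p[rng.choice(N_LINK, K_LINK, replace=False)] = 1.0
    p[N_LINK + rng.choice(N_TRACE, K_TRACE, replace=False)] = 1.0
    return p


def learn(W, p, scale=1.0):
    """Hebbian step (assumed rule): strengthen connections between co-active nodes."""
    W += scale * RATE * np.outer(p, p)
    np.fill_diagonal(W, 0.0)


def recall(W, state, steps=5):
    """Assumed recall dynamics: keep the k most strongly driven nodes per layer."""
    for _ in range(steps):
        h = W @ state
        state = np.zeros(N)
        state[np.argsort(h[:N_LINK])[-K_LINK:]] = 1.0
        state[N_LINK + np.argsort(h[N_LINK:])[-K_TRACE:]] = 1.0
    return state


def simulated_day(W, rng, n_dreams=3):
    """One 'day': learn a new pattern, then consolidate during simulated dreaming."""
    pattern = random_pattern(rng)
    learn(W, pattern)
    for _ in range(n_dreams):
        noise = (rng.random(N) < 0.1).astype(float)   # random activation
        recalled = recall(W, noise)                   # network settles on some state
        learn(W, recalled, scale=0.5)                 # recalled state is learned extra
    return pattern
```

Calling `simulated_day(W, rng)` repeatedly on the same weight matrix `W` (initialized as `np.zeros((N, N))`) would give one new pattern per day, with older patterns receiving extra learning whenever random activity happens to recall them.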

(Patient data)

[Figure: scores (0 to 0.8) by decade ('75-'84, '65-'74, '55-'64, '45-'54, '35-'44) for controls (n=16), Korsakoff's (n=6), and Alzheimer's (n=8) patients.]


A simulation with TraceLink

[Figure: simulated curves for the control and lesion conditions (x-axis 0 to 15, y-axis 0 to 1); fit values shown on the slide: R2 = 0.932 and R2 = 0.922.]

Frequency of consolidation of patterns over time

[Figure: x-axis 0 to 14, y-axis 0 to 1.]

Strongly and weakly encoded patterns

• Mixture of weak, middle and strong patterns

• Strong patterns had a higher learning parameter (cf. longer learning time)

[Figure: four panels (x-axis 0 to 15, y-axis 0 to 1). Two panels compare no variance with strong variance; two panels show weak, middle, and strong patterns separately.]

Transient Global Amnesia (TGA)

• Witnessed onset of severe anterograde and retrograde amnesia

• Resolves within 24 hours

• Retrograde amnesia may have Ribot gradients

• Hippocampal area is most probably implicated

[Figure: simulated Transient Global Amnesia (TGA) in four panels (y-axis 0 to 1): maximal TGA with all link nodes inactive; 1/4 of the link nodes functioning again; 1/2 of the link nodes functioning again; after the TGA attack has resolved.]

Other simulations

• Focal retrograde amnesia

• Implicit memory

• More subtle lesions (e.g., only within-link connections, cf. CA1 lesions)

• Semantic dementia

• Memory effects in schizophrenia, with an extended model (added parahippocampal layer)

Alternative Explanations

• ‘Memory Bump’ appears as reverse gradient

• Nadel and Moscovitch (1997): Trace Replication Theory (will be discussed next time by the students)

Sir Francis Galton (1879)

• Inspected a cue word (e.g., ‘coffee’) until an event came to mind

• Later, he dated the events

Lifetime distributions

• The Galton-Crovitz method aims for a quasi-random sample of autobiographical memories

• Stratified through the use of keywords

• Technically speaking: The method measures a probability density function of memory age
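As a rough illustration of what measuring a probability density function of memory age means in practice, the sketch below bins a handful of made-up memory ages and normalizes the histogram so that it integrates to one. The data values, the bin width, and the function name are hypothetical and are not taken from the study.

```python
import numpy as np


def memory_age_pdf(memory_ages, bin_width=5.0, max_age=70.0):
    """Estimate the pdf of memory age (in years) with a normalized histogram."""
    edges = np.arange(0.0, max_age + bin_width, bin_width)
    density, edges = np.histogram(memory_ages, bins=edges, density=True)
    return density, edges


# Hypothetical ages (in years) of events that came to mind for a set of cue words.
ages = np.array([0.1, 0.5, 1, 2, 2, 3, 8, 15, 22, 25, 27, 30, 31, 40])
density, edges = memory_age_pdf(ages)

# The density integrates to one over the binned range.
total_probability = np.sum(density * np.diff(edges))
print(total_probability)
```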

Rubin, Wetzler & Nebes (1986)

• Found a reminiscence bump for memories from between 10 and 30 years of age, in subjects older than 40 years.

Reminiscence bump (Rubin et al., 1986)

Large-scale replication using the internet

• Website: http://memory.uva.nl

Steve Janssen Antonio Chessa

[Figure: for each subject age group, two panels plotted against subject age (0 to 70): memory age pdf and encoding. Groups: 11-20 years old (N = 1189), 21-30 years old (N = 3633), 31-40 years old (N = 3983), 41-50 years old (N = 3791), 51-60 years old (N = 3190), 61-70 years old (N = 1169).]



Figure 5

[Figure: for each subject age group, two panels plotted against subject age (0 to 70): memory age probabilities and encoding function. Groups: 11-20 years old (N = 286), 21-30 years old (N = 790), 31-40 years old (N = 2200), 41-50 years old (N = 2142), 51-60 years old (N = 1618), 61-70 years old (N = 874).]

The encoding function retains a constant shape over all age groups, as was expected

Encoding combined over subject age classes (N = 16955)

[Figure: initial encoding (0 to 0.06) versus subject age (0 to 70).]

Memory Chain Model

Model of learning, forgetting, retrograde amnesia

Three stages of learning and memory

• Encoding: formation of the memory, after a certain amount of learning time

• Storage: transformation of the memory, under the influence of rehearsal and consolidation

• Retrieval: search for the memory, based on a retrieval cue

• We assume that the contribution of the three stages is independent and multiplicative

Chain of memory ‘stores’

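[Diagram: External Information → Sensory Stores → Working Memory → Link System → Trace System, with brackets labeled Short-term Memory and Long-term Memory. Each store has its own loss: loss from sensory store, loss from working memory, decay and interference (link system), decay and interference (trace system).]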

Chain of memory ‘stores’

[Diagram: External Information → Store $s_1$ → Store $s_2$ → … → Store $s_{S-1}$ → Store $s_S$, with brackets labeled Sensory Memory, Short-term Memory, and Long-term Memory. Each store shows a loss of intensity.]

General principles of the multi-store model

• Part of the information is passed to the next store before it decays completely

• Subsequent stores hold information for longer time periods: slower decay rates in ‘higher’ stores

Item representation

• Items are represented as ‘copies’ or ‘critical features’, each of which suffices for recall

• Finding these ‘copies’ during recall is an inherently stochastic process

Neural network interpretation

Jo Brand

Learning and forgetting as a stochastic process: 1-store example

• A recall cue (e.g., a face) may access different aspects of a stored memory

• If a point is found in the neural cue area, the correct response (e.g., the name) can be given

[Figure panels: Learning, Forgetting, Successful Recall, Unsuccessful Recall]

Performance determined by a single parameter: intensity

• Intensity is the expected number of copies found within the searched region

• Cf. the expected number of trees in a wood within any 5x5 m region

• We use the mathematical framework of point processes
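To illustrate the idea, the sketch below assumes that the number of copies found in the searched region follows a Poisson distribution whose mean is the intensity r. This is a standard point-process assumption that the slides do not spell out. Recall succeeds when at least one copy is found, which gives a recall probability of 1 - exp(-r). The function names and the number of trials are hypothetical.

```python
import numpy as np


def recall_probability(intensity):
    """P(at least one copy found) for a Poisson count with the given mean."""
    return 1.0 - np.exp(-intensity)


def simulate_recall(intensity, n_trials=200_000, seed=0):
    """Monte Carlo estimate: draw the number of copies found on each trial."""
    rng = np.random.default_rng(seed)
    copies_found = rng.poisson(intensity, size=n_trials)
    return np.mean(copies_found >= 1)


# Compare the analytic value with the simulated fraction of successful recalls.
for r in (0.5, 1.0, 2.0):
    print(r, recall_probability(r), simulate_recall(r))
```

The two printed values per intensity should agree closely, differing only by sampling noise.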

General framework: encoding, storage, and retrieval

$$p(t) = 1 - e^{-r(t)}$$

$$r(t) = \underbrace{\mu_1}_{\text{encoding}} \cdot \underbrace{\tilde{r}(t)}_{\text{storage}} \cdot \underbrace{q}_{\text{retrieval}}$$

(q = 1 in the remainder of this talk)

The contributions of individual stores can simply be added

$$r(t) = r_1(t) + r_2(t) + \dots + r_S(t)$$

$$p(t) = 1 - e^{-r(t)}$$

Forgetting

One-store case: forgetting

$$r_1(t) = \mu_1 e^{-a_1 t}$$

Assumption: In all stores, we have an exponential decline of intensity with time t

$\mu_1$ is the intensity immediately after learning

$a_1$ is the decline parameter

Formation and decline

• Longer learning times will lead to higher intensity

• Decline is caused by – interference from other items (not yet modeled)– displacement in some ‘buffer’– loss of effectiveness of the search cue– neural ‘noise’ and competition

The shape of forgetting

Forgetting in the one-store case:

$$p_1(t) = 1 - e^{-\mu_1 e^{-a_1 t}}$$

Some properties of the forgetting curve

• Probability of recall always stays between 0 and 1

• Forgetting is not necessarily greatest immediately after learning: we predict an inflection point when the initial recall is at least $p(0) = 1 - e^{-1} \approx 0.63$
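The inflection point follows from the one-store forgetting curve p(t) = 1 - exp(-mu1 * exp(-a1 * t)). Its second derivative changes sign where mu1 * exp(-a1 * t) = 1, that is at t* = ln(mu1) / a1, which lies at t >= 0 only if mu1 >= 1, so only if initial recall is at least 1 - exp(-1). A small numerical check, with hypothetical parameter values and function names:

```python
import numpy as np


def recall_one_store(t, mu1, a1):
    """One-store forgetting curve: p(t) = 1 - exp(-mu1 * exp(-a1 * t))."""
    return 1.0 - np.exp(-mu1 * np.exp(-a1 * t))


def inflection_time(mu1, a1):
    """Time of steepest forgetting, or None if forgetting is steepest at t = 0."""
    return np.log(mu1) / a1 if mu1 >= 1.0 else None


mu1, a1 = 3.0, 0.2                      # hypothetical values
t = np.linspace(0.0, 30.0, 30001)
slope = np.gradient(recall_one_store(t, mu1, a1), t)
t_steepest = t[np.argmin(slope)]        # numerically steepest decline
t_star = inflection_time(mu1, a1)       # analytic inflection point

# Recall probability at the inflection point equals 1 - exp(-1).
print(t_steepest, t_star, recall_one_store(t_star, mu1, a1))
```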

Probe-digit experiment (Waugh & Norman 1965)

Example: Single-store model fitted to short-term forgetting data

[Figure: retention (%) versus time (s, 0 to 12); fit values shown on the slide: R2 = 0.7295 and R2 = 0.985.]

Amnesia

Retrograde amnesia

Assumption:

• Hippocampus (link system) = store 1

• Neocortex (trace system) = store 2

[Figure: recall probability (0 to 1) versus time (years, 0 to 50). Time axis from present to past with the lesion marked (x), spanning anterograde and retrograde amnesia; curves for an amnesia patient and for normal forgetting.]

Amnesia in the two-store model

With normal recall we have:

$$r_{12}(t) = r_1(t) + r_2(t)$$

With a lesioned hippocampus we have:

$$r_{[1]2}(t) = r_2(t)$$

With a non-functional neocortex we have:

$$r_{1[2]}(t) = r_1(t)$$
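A minimal sketch of how the three cases (normal recall, lesioned hippocampus, non-functional neocortex) produce different recall curves. The slides do not give the formula for the second store. The code assumes that store 2 is filled from store 1 at a rate mu2 and declines at a rate a2, so that dr2/dt = mu2 * r1(t) - a2 * r2(t) with r2(0) = 0. That transfer rule, the parameter values, and the function names are assumptions made for illustration only.

```python
import numpy as np


def r1(t, mu1, a1):
    """Store 1 (link system): exponential decline from initial intensity mu1."""
    return mu1 * np.exp(-a1 * t)


def r2(t, mu1, a1, mu2, a2):
    """Store 2 (trace system), assuming dr2/dt = mu2*r1 - a2*r2, r2(0) = 0, a1 != a2."""
    return mu1 * mu2 / (a1 - a2) * (np.exp(-a2 * t) - np.exp(-a1 * t))


def recall(intensity):
    """Recall probability from intensity: p = 1 - exp(-r)."""
    return 1.0 - np.exp(-intensity)


params = dict(mu1=2.0, a1=0.5, mu2=0.2, a2=0.01)   # hypothetical values
t = np.linspace(0.0, 50.0, 501)                    # age of the memory

r1_values = r1(t, params["mu1"], params["a1"])
r2_values = r2(t, **params)

p_normal = recall(r1_values + r2_values)           # both stores intact
p_hippocampal_lesion = recall(r2_values)           # store 1 lost, only store 2 left
p_cortical_lesion = recall(r1_values)              # store 2 lost, only store 1 left
```

With these values the hippocampal-lesion curve starts at zero for the most recent memories and rises with memory age before slowly declining, which is the shape of a Ribot gradient. The cortical-lesion curve declines monotonically.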

Amnesia: animal data

Retrograde amnesia

Cho & Kesner (1996). (mice) R2=0.96

[Figure, panel b: recall probability (0 to 1) versus time (days, 0 to 50).]

Summary of animal data

[Figure: panels a to f, each showing recall probability (0 to 1) versus time (days) for one animal data set; time ranges vary from 0 to 10 days up to 0 to 150 days.]

Cortical amnesia

Frankland, O’Brien, Ohno, Kirkwood, & Silva (Nature, 2001).

Data provided by Paul Frankland

Frankland et al. (2001) study

• α-CaMKII-dependent plasticity (in neocortex) switched off in knock-out mice

• No LTP measurable in neocortex but LTP in hippocampus was largely normal

• Forgetting curves with different levels of initial learning were measured

• A learning curve was measured

• Assumption: use $r_{1[2]}(t)$ for knock-out mice

Forgetting after 3 shocks, using three parameters

[Figure: freezing after 3 shocks; freezing (fraction, 0 to 1) versus retention delay (days, 0 to 50).]

Using the same three parameters and a massed-learning correction.


[Figure: freezing (fraction, 0 to 1); x-axis 0 to 15.]

Controls receive 1 shock, experimental animals 3 shocks (no new free parameters).

[Figure: freezing after weak learning; freezing (fraction, 0 to 1); x-axis 0 to 15.]

Repeated learning for experimental animals (no new free parameters)

[Figure: freezing after repeated learning; freezing (fraction, 0 to 0.75) versus training day (0 to 3).]

Summary of ‘cortical amnesia’. Using only 4 parameters for all curves (R2 = 0.976).


[Figure: four panels, (a) to (d), of freezing (fraction): two forgetting curves (x-axes 0 to 50 and 0 to 15), freezing after weak learning (x-axis 0 to 15), and freezing after repeated learning versus training day (0 to 3).]

Amnesia: human data

Retrograde amnesia

Remarks on the human data

• Fitting procedure identical to animal data

• Data are very noisy

• Basic fits: a2 = 0, full lesion assumed. This leaves three parameters.

Kopelman (1989). News events test. Korsakoff (left), Alzheimer (right, fitted with lower $\mu_2$). R2=0.951.

[Figure: two panels of recall probability (0 to 1) versus time (years, 0 to 50).]

Wiig, Cooper & Bear (1996). (rats) R2=0.28

[Figure, panel d: recall probability (0 to 1) versus time (days, 0 to 60).]

Problem with nearly all human data

• Straight fits of forgetting curves and Ribot gradients are nonsense

• Test items for remote periods are easier

• Typically: curves are flat around 80-85% for control subjects

• Reason: maximizes chances of detecting Ribot effects

• Disadvantage: shape of the curves is useless for quantitative analysis

Relative retrograde gradient: a relative measure of memory loss

$$rr_{[1]2}(t) = \frac{r_{[1]2}(t)}{r_{12}(t)} = \frac{\log(1 - p_{\text{patient}})}{\log(1 - p_{\text{control}})}$$
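Because p = 1 - exp(-r), an intensity can be recovered from an observed proportion correct as r = -log(1 - p), and the relative retrograde gradient is the ratio of the patient and control intensities for each time period. The sketch below uses made-up proportions; the numbers and the function name are hypothetical.

```python
import numpy as np


def rr_gradient(p_patient, p_control):
    """Relative retrograde gradient: log(1 - p_patient) / log(1 - p_control).

    Proportions must lie strictly between 0 and 1 for the control group.
    """
    p_patient = np.asarray(p_patient, dtype=float)
    p_control = np.asarray(p_control, dtype=float)
    return np.log1p(-p_patient) / np.log1p(-p_control)


# Hypothetical proportions correct, ordered from recent to remote periods.
p_control = np.array([0.80, 0.82, 0.85, 0.83])
p_patient = np.array([0.20, 0.45, 0.70, 0.80])

rr = rr_gradient(p_patient, p_control)
print(rr)
```

In this made-up example the control curve is nearly flat while the rr-gradient rises toward remote periods, so the relative measure exposes a Ribot gradient that is independent of the differences in item difficulty across periods.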

Wiig, Cooper & Bear (1996). (rats) with rr-gradient: R2=0.84

[Figure, panel d: recall probability (0 to 1) versus time (days, 0 to 60).]

Albert et al. (1979). Naming of famous faces. Korsakoff patients. $R^2 = 0.977$ and $R^2_{rr} = 0.978$.

[Figure: two panels versus time (years, 0 to 50): recall probability (0 to 1) and relative intensity (0 to 1).]

In progress with Memory Chain Model

• Fits to patient data: Huntington, Alzheimer, Korsakoff, TGA, focal lesions, ECT, etc.

• Data collection with four new tests of retrograde amnesia (in Dutch) developed in my group

• Memory Chain Model helps with item selection and interpretation of results (clinical application: diagnosis)

Summary of the human data

• This work is still in progress

• About 25 human data sets have been fitted

• rr-gradient allows initial quantitative analysis

• Human and animal data give the same overall picture

Concluding remarks

• Consolidation is still a hotly debated issue

• Modeling can help to elucidate the various viewpoints

• Models at various levels of detail can be developed:
  – Connectionist models (e.g., TraceLink)
  – Mathematical models (e.g., Memory Chain Model)

More information at

http://www.neuroMod.org/courses/cb2004

E-mail:

[email protected]
