Memory and Consolidation
Prof.dr. Jaap Murre
University of Amsterdam
University of Maastricht
http://www.neuromod.org/courses/cb2004
Overview
• Brief review of neuroanatomy of memory
• Outline of the TraceLink model
• Some simulation results of neural network model, focussing on retrograde amnesia
• Memory Chain Model
– Forgetting
– Amnesia
The Amnesia Paradox
• Recent items are remembered best
• But they are the first to be lost with (retrograde) amnesia
The Daily News Memory Test at memory.uva.nl
[Figure: proportion correct (0 to 1) versus retention interval in days (0 to 300), for 4 AFC and Open questions]
1810 Dutch respondents
Théodule Ribot (1886)
• Ribot’s Law: With memory loss, recent memories suffer more
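[Figure: Ribot gradient. Time axis from past to present with the lesion marked (x); retrograde amnesia covers the period before the lesion, anterograde amnesia the period after it. Curves (scale 0 to 100): amnesia patient and normal forgetting]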
Example: Patient data
[Figure: Kopelman (1989), news events test. Scores (0 to 0.8) per decade ('75-'84, '65-'74, '55-'64, '45-'54, '35-'44) for controls (n=16), Korsakoff's (n=6), and Alzheimer's (n=8)]
Neuroanatomy of amnesia
• Hippocampus
• Adjacent areas such as entorhinal cortex and parahippocampal cortex
• Basal forebrain nuclei
• Diencephalon
The position of the hippocampus in the brain
Hippocampal connections
[Figure, panels (a) and (b): the hippocampus and entorhinal cortex connect, via the parahippocampal and perirhinal cortex, to unimodal and polymodal association areas (frontal, temporal, and parietal lobes), and through these to the visual, somatosensory, and motor areas, which link to and from the sensory organs via subcortical pathways]
Hippocampus has an excellent overview of the entire cortex
The TraceLink Model
A model of memory consolidation and amnesia
Trace-Link model: structure
System 1: Trace system
• Function: Substrate for bulk storage of memories, ‘association machine’
• Corresponds roughly to neocortex
System 2: Link system
• Function: Initial ‘scaffold’ for episodes
• Corresponds roughly to hippocampus and certain temporal and perhaps frontal areas
System 3: Modulatory system
• Function: Control of plasticity
• Involves at least parts of the hippocampus, amygdala, fornix, and certain nuclei in the basal forebrain and in the brain stem
Stages in episodic learning
Dreaming and consolidation of memory
• Theory by Francis Crick and Graeme Mitchison (1983)
• Main problem: Overloading of memory
• Solution: Reverse learning leads to removal of ‘obsessions’
“We dream in order to forget”
Dreaming and memory consolidation
• When should this reverse learning take place?
• During REM sleep
– Normal input is deactivated
– Semi-random activations from the brain stem
– REM sleep may have lively hallucinations
Consolidation may also strengthen memory
• This may occur during deep sleep (as opposed to REM sleep)
• Both hypothetical processes may work together to achieve an increase in the clarity of representations in the cortex
Experiment by Robert Stickgold
• Difficult visual discrimination problem
• Several hours of practice
• One group goes home
• Other group stays in the lab and skips a night of sleep
Improvement without further training due to sleep
[Figure: improvement (ms, 0 to 25) versus days after training (0 to 10), for normal sleep and for skipped first night of sleep]
Relevant animal data by Matt Wilson and Bruce McNaughton (1994)
• 120 neurons in rat hippocampus
• PRE: Slow-wave sleep before being in the experimental environment (cage)
• RUN: During experimental environment
• POST: Slow-wave sleep after having been in the experimental environment
Wilson and McNaughton data
Some important characteristics of amnesia
• Anterograde amnesia (AA)
– Implicit memory preserved
• Retrograde amnesia (RA)
– Ribot gradients
• Pattern of correlations between AA and RA
– No perfect correlation between AA and RA
[Figure: recall probability (0 to 1) versus time (0 to 50 years), from present to past with the lesion marked (x); retrograde and anterograde amnesia indicated. Curves: amnesia patient and normal forgetting]
Example of patient data
[Figure: Kopelman (1989), news events test. Scores (0 to 0.8) per decade ('75-'84 to '35-'44) for controls (n=16), Korsakoff's (n=6), and Alzheimer's (n=8)]
Retrograde amnesia
• Primary cause: loss of links
• Ribot gradients
• Shrinkage
Anterograde amnesia
• Primary cause: loss of modulatory system
• Secondary cause: loss of links
• Preserved implicit memory
Semantic dementia
• The term was adopted recently to describe a new form of dementia, notably by Julie Snowden et al. (1989, 1994) and by John Hodges et al. (1992, 1994)
• Semantic dementia is almost a mirror-image of amnesia
Neuropsychology of semantic dementia
• Progressive loss of semantic knowledge
• Word-finding problems
• Comprehension difficulties
• No problems with new learning
• Lesions mainly located in the infero-lateral temporal cortex but (early in the disease) with sparing of the hippocampus
Severe loss of trace connections
Stage-2 learning proceeds as normal
Stage-3 learning strongly impaired
Non-rehearsed memories will be lost
No consolidation in semantic dementia
Semantic dementia in TraceLink
• Primary cause: loss of trace-trace connections
• Stage-3 (and 4) memories cannot be formed: no consolidation
• The preservation of new memories will be dependent on constant rehearsal
Connectionist implementation of the TraceLink model
With Martijn Meeter
Some details of the model
• 42 link nodes, 200 trace nodes
• For each pattern
– 7 nodes are active in the link system
– 10 nodes are active in the trace system
• Trace system has a lower learning rate than the link system
How the simulations work: One simulated ‘day’
• A new pattern is activated
• The pattern is learned
• Because of low learning rate, the pattern is not well encoded at first in the trace system
• A period of ‘simulated dreaming’ follows
– Nodes are activated randomly by the model
– This random activity causes recall of a pattern
– A recalled pattern is then learned extra
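The daily cycle above can be sketched in a few lines. This is a minimal illustration, not the TraceLink implementation: it assumes binary nodes, plain Hebbian learning, and k-winners-take-all recall. The learning rates, the number of dream episodes, and all function names are hypothetical; only the layer sizes (42 and 200) and the pattern sizes (7 and 10) come from the slides.

```python
import numpy as np

N_LINK, N_TRACE = 42, 200      # layer sizes from the slides
K_LINK, K_TRACE = 7, 10        # active nodes per pattern, from the slides
N = N_LINK + N_TRACE
LR_LINK, LR_TRACE = 0.5, 0.05  # assumed values; trace-trace learning is slower

def make_rates():
    """Learning rate per connection: low for trace-trace, higher otherwise."""
    rates = np.full((N, N), LR_LINK)
    rates[N_LINK:, N_LINK:] = LR_TRACE
    np.fill_diagonal(rates, 0.0)   # no self-connections
    return rates

def random_pattern(rng):
    """Binary pattern with 7 active link nodes and 10 active trace nodes."""
    x = np.zeros(N)
    x[rng.choice(N_LINK, K_LINK, replace=False)] = 1.0
    x[N_LINK + rng.choice(N_TRACE, K_TRACE, replace=False)] = 1.0
    return x

def learn(w, x, rates):
    """Hebbian update: strengthen connections between co-active nodes."""
    return w + rates * np.outer(x, x)

def k_winners(inputs, k):
    """Activate the k nodes that receive the largest input."""
    out = np.zeros_like(inputs)
    out[np.argsort(inputs)[-k:]] = 1.0
    return out

def recall(w, cue, steps=5):
    """Settle from a cue; each layer keeps its k most strongly driven nodes."""
    x = cue.copy()
    for _ in range(steps):
        h = w @ x
        x = np.concatenate([k_winners(h[:N_LINK], K_LINK),
                            k_winners(h[N_LINK:], K_TRACE)])
    return x

def simulated_day(w, rates, rng, dreams=3):
    """One 'day': learn a new pattern, then dream and relearn what is recalled."""
    pattern = random_pattern(rng)
    w = learn(w, pattern, rates)
    for _ in range(dreams):
        recalled = recall(w, random_pattern(rng))  # random activation as cue
        w = learn(w, recalled, rates)              # recalled pattern is learned extra
    return w, pattern
```

In this sketch a pattern is at first carried mostly by the fast link connections; repeated dreaming adds small increments to the slow trace-trace connections, which is the consolidation idea in its simplest form.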
(Patient data)
[Figure: Kopelman (1989), news events test. Scores (0 to 0.8) per decade ('75-'84 to '35-'44) for controls (n=16), Korsakoff's (n=6), and Alzheimer's (n=8)]
A simulation with TraceLink
[Figure: simulated curves for Control and Lesion (vertical axis 0 to 1, horizontal axis 0 to 15); reported fits R² = 0.932 and R² = 0.922]
Frequency of consolidation of patterns over time
Strongly and weakly encoded patterns
• Mixture of weak, middle and strong patterns
• Strong patterns had a higher learning parameter (cf. longer learning time)
[Figure: four panels (vertical axis 0 to 1, horizontal axis 0 to 15). Two panels compare no variance with strong variance; two panels show weak, middle, and strong patterns separately]
Transient Global Amnesia (TGA)
• (Witnessed) onset of severe anterograde and retrograde amnesia
• Resolves within 24 hours
• Retrograde amnesia may have Ribot gradients
• Hippocampal area is most probably implicated
[Figure: four panels of simulated TGA (vertical axis 0 to 1)
– Maximal TGA: all link nodes inactive
– 1/4 of the link nodes function again
– 1/2 of the link nodes function again
– After the TGA attack has resolved]
Transient Global Amnesia (TGA)
Other simulations
• Focal retrograde amnesia
• Implicit memory
• More subtle lesions (e.g., only within-link connections, cf. CA1 lesions)
• Semantic dementia
• Memory effects in schizophrenia, with an extended model (added parahippocampal layer)
Alternative Explanations
• ‘Memory Bump’ appears as reverse gradient
• Nadel and Moscovitch (1997): Trace Replication Theory (will be discussed next time by the students)
Sir Francis Galton (1879)
• Inspected a cue word (e.g., ‘coffee’) until an event came to mind
• Later, he dated the events
Lifetime distributions
• The Galton-Crovitz method aims for a quasi-random sample of autobiographical memories
• Stratified through the use of keywords
• Technically speaking: The method measures a probability density function of memory age
Rubin, Wetzler & Nebes (1986)
• Found a reminiscence bump for memories from ages 10 to 30 in participants older than 40
Reminiscence bump (Rubin et al., 1986)
Large-scale replication using the internet
• Website: http://memory.uva.nl
With Steve Janssen and Antonio Chessa
[Figure: for each subject age group, two panels plotted against subject age (0 to 70): the memory age pdf and the encoding function. Age groups: 11-20 (N = 1189), 21-30 (N = 3633), 31-40 (N = 3983), 41-50 (N = 3791), 51-60 (N = 3190), 61-70 (N = 1169)]
[Figure 5: memory age probabilities and encoding function against subject age (0 to 70), per age group: 11-20 (N = 286), 21-30 (N = 790), 31-40 (N = 2200), 41-50 (N = 2142), 51-60 (N = 1618), 61-70 (N = 874)]
The encoding function retains a constant shape over all age groups, as was expected
[Figure: initial encoding (0 to 0.06) versus subject age (0 to 70)]
Encoding combined over subject age classes
N = 16955
Memory Chain Model
Model of learning, forgetting, retrograde amnesia
Three stages of learning and memory
• Encoding: formation of the memory, after a certain amount of learning time
• Storage: transformation of the memory, under the influence of rehearsal and consolidation
• Retrieval: search for the memory, based on a retrieval cue
• We assume that the contribution of the three stages is independent and multiplicative
Chain of memory ‘stores’
[Diagram: External information → Sensory stores → Working memory → Link system → Trace system, with brackets labelled Short-term Memory and Long-term Memory. Each store loses information: loss from sensory store, loss from working memory, and decay and interference in the later stores]
Chain of memory ‘stores’
[Diagram: External information → Store s_1 → Store s_2 → ... → Store s_{S-1} → Store s_S, spanning sensory, short-term, and long-term memory; each store shows a loss of intensity]
General principles of the multi-store model
• Part of the information is passed to the next store before it decays completely
• Subsequent stores hold information for longer time periods: slower decay rates in ‘higher’ stores
Item representation
• Items are represented as ‘copies’ or ‘critical features’, each of which suffices for recall
• Finding these ‘copies’ during recall is an inherently stochastic process
Neural network interpretation
Jo Brand
Learning and forgetting as a stochastic process: 1-store example
• A recall cue (e.g., a face) may access different aspects of a stored memory
• If a point is found in the neural cue area, the correct response (e.g., the name) can be given
[Figure panels: Learning, Forgetting, Successful recall, Unsuccessful recall]
Performance determined by a single parameter: intensity
• Intensity is the expected number of copies found within the searched region
• Cf. the expected number of trees in a wood within any 5x5 m region
• We use the mathematical framework of point processes
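The link between intensity and recall can be checked numerically. The sketch below assumes that the number of copies found in the searched region is Poisson distributed with mean equal to the intensity r, so that recall succeeds whenever at least one copy is found, with probability 1 - e^{-r}. The function names and the sampling method are illustrative choices, not part of the model description.

```python
import math
import random

def recall_probability(intensity):
    """P(at least one copy found) for a Poisson count with this mean."""
    return 1.0 - math.exp(-intensity)

def sample_poisson(mean, rng):
    """Draw a Poisson count (Knuth's multiplication method, fine for small means)."""
    limit, count, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulated_recall(intensity, trials, rng):
    """Fraction of simulated searches that find at least one copy."""
    hits = sum(sample_poisson(intensity, rng) > 0 for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    rng = random.Random(1)
    for r in (0.5, 1.0, 2.0):
        print(r, recall_probability(r), simulated_recall(r, 20000, rng))
```

The simulated fractions should lie close to the analytic values, which is all the single intensity parameter is meant to capture.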
General framework: encoding, storage, and retrieval
$p(t) = 1 - e^{-r(t)}$
$r(t) = \mu_1 \, \hat{r}(t) \, q$, with $\mu_1$ for encoding, $\hat{r}(t)$ for storage, and $q$ for retrieval
(q = 1 in the remainder of this talk)
The contributions of individual stores can simply be added
$r(t) = r_1(t) + r_2(t) + \dots + r_S(t)$
$p(t) = 1 - e^{-r(t)}$
Forgetting
One-store case: forgetting
$r_1(t) = \mu_1 e^{-a_1 t}$
Assumption: In all stores, we have an exponential decline of intensity with time t
$\mu_1$ is the intensity immediately after learning
$a_1$ is the decline parameter
Formation and decline
• Longer learning times will lead to higher intensity
• Decline is caused by
– interference from other items (not yet modeled)
– displacement in some ‘buffer’
– loss of effectiveness of the search cue
– neural ‘noise’ and competition
The shape of forgetting
Forgetting in the one-store case:
$p(t) = 1 - e^{-\mu_1 e^{-a_1 t}}$
Some properties of the forgetting curve
• Probability of recall always stays between 0 and 1
• Forgetting is not necessarily greatest immediately after learning: we predict an inflection point when the initial recall is at least $p(0) = 1 - e^{-1} \approx 0.63$
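The inflection point follows from the one-store curve p(t) = 1 - exp(-mu1 * exp(-a1 * t)): its second derivative changes sign where the intensity equals 1, that is at t* = ln(mu1) / a1, which is non-negative only if mu1 >= 1. A short numerical check, with function names and parameter values chosen purely for illustration:

```python
import math

def recall(t, mu1, a1):
    """One-store recall probability at time t after learning."""
    return 1.0 - math.exp(-mu1 * math.exp(-a1 * t))

def inflection_time(mu1, a1):
    """Time at which intensity mu1*exp(-a1*t) equals 1, or None if mu1 < 1."""
    if mu1 < 1.0:
        return None
    return math.log(mu1) / a1

def forgetting_rate(t, mu1, a1, h=1e-5):
    """Numerical slope of the forgetting curve (negative while forgetting)."""
    return (recall(t + h, mu1, a1) - recall(t - h, mu1, a1)) / (2 * h)

if __name__ == "__main__":
    mu1, a1 = 3.0, 0.5          # illustrative values only
    t_star = inflection_time(mu1, a1)
    for t in (0.0, t_star, 2 * t_star):
        print(t, forgetting_rate(t, mu1, a1))
```

With mu1 = 3 the curve is steepest at t*, not at t = 0; with mu1 below 1 the steepest decline is right after learning.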
Probe-digit experiment (Waugh & Norman 1965)
[Figure: retention (%) versus time (0 to 12 s). Example: single-store model fitted to short-term forgetting data. R² values shown on the slide: 0.7295 and 0.985]
Amnesia
Retrograde amnesia
Assumption:
Hippocampus (link system) = store 1
Neocortex (trace system) = store 2
[Figure: recall probability (0 to 1) versus time (0 to 50 years), from present to past with the lesion marked (x); retrograde and anterograde amnesia indicated. Curves: amnesia patient and normal forgetting]
Amnesia in the two-store model
With normal recall we have:
$r_{12}(t) = r_1(t) + r_2(t)$
With a lesioned hippocampus we have:
$r_{[1]2}(t) = r_2(t)$
With a non-functional neocortex we have:
$r_{1[2]}(t) = r_1(t)$
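The three cases can be compared numerically. The slides do not give the form of r_2(t); the sketch below assumes that store 2 is fed from store 1 at rate mu2 and declines at rate a2, that is dr2/dt = mu2 * r1(t) - a2 * r2(t) with r2(0) = 0. That assumption, the function names, and the parameter values are illustrative only.

```python
import math

def r1(t, mu1, a1):
    """Store 1 (link system / hippocampus): exponential decline."""
    return mu1 * math.exp(-a1 * t)

def r2(t, mu1, a1, mu2, a2):
    """Store 2 (trace system / neocortex), assuming dr2/dt = mu2*r1 - a2*r2."""
    if a1 == a2:
        return mu1 * mu2 * t * math.exp(-a1 * t)
    return mu1 * mu2 / (a1 - a2) * (math.exp(-a2 * t) - math.exp(-a1 * t))

def recall(intensity):
    """Recall probability for a given total intensity."""
    return 1.0 - math.exp(-intensity)

def recall_curves(t, mu1, a1, mu2, a2):
    """Recall of a memory of age t under the three conditions."""
    first, second = r1(t, mu1, a1), r2(t, mu1, a1, mu2, a2)
    return {
        "normal": recall(first + second),      # r_12
        "hippocampal_lesion": recall(second),  # r_[1]2
        "cortical_lesion": recall(first),      # r_1[2]
    }

if __name__ == "__main__":
    for age in (1, 5, 10, 50):
        print(age, recall_curves(age, mu1=2.0, a1=0.5, mu2=0.1, a2=0.01))
```

Under these assumptions the hippocampal-lesion curve first rises with memory age, because older memories have had more time to build up in store 2, which reproduces a Ribot gradient; the normal curve declines monotonically.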
Amnesia: animal data
Retrograde amnesia
Cho & Kesner (1996). (mice) R2=0.96
[Figure, panel b: recall probability (0 to 1) versus time (0 to 50 days)]
Summary of animal data
[Figure: six panels (a to f) of recall probability (0 to 1) versus time in days; the time axes run to 60 days (a, c, d), 50 days (b), 10 days (e), and 150 days (f)]
Cortical amnesia
Frankland, O’Brien, Ohno, Kirkwood, & Silva, (Nature, 2001).
Data provided by Paul Frankland
Frankland et al. (2001) study
• α-CaMKII-dependent plasticity (in neocortex) switched off in knock-out mice
• No LTP measurable in neocortex but LTP in hippocampus was largely normal
• Forgetting curves with different levels of initial learning were measured
• A learning curve was measured
• Assumption: use $r_{1[2]}(t)$ for knock-out mice
Forgetting after 3 shocks, using three parameters
[Figure: Freezing after 3 shocks. Freezing (fraction, 0 to 1) versus retention delay (0 to 50 days)]
Using the same three parameters and a massed-learning correction.
[Figure: Freezing after 8 shocks. Freezing (fraction, 0 to 1) versus retention delay (0 to 15 days)]
Controls receive 1 shock, experimental animals 3 shocks (no new free parameters).
[Figure: Freezing after weak learning. Freezing (fraction, 0 to 1) versus retention delay (0 to 15 days)]
Repeated learning for experimental animals (no new free parameters)
[Figure: Freezing after repeated learning. Freezing (fraction, 0 to 0.75) versus training day (0 to 3)]
Summary of ‘cortical amnesia’. Using only 4 parameters for all curves (R2 = 0.976).
[Figure: four panels, (a) to (d). Freezing after 3 shocks, after 8 shocks, and after weak learning (freezing fraction versus retention delay in days), and freezing after repeated learning (freezing fraction versus training day)]
Amnesia: human data
Retrograde amnesia
Remarks on the human data
• Fitting procedure identical to animal data
• Data are very noisy
• Basic fits: a2 = 0, full lesion assumed. This leaves three parameters.
Kopelman (1989). News events test. Korsakoff (left), Alzheimer (right, fitted with lower $\mu_2$). R² = 0.951.
[Figure: two panels of recall probability (0 to 1) versus time (0 to 50 years)]
Wiig, Cooper & Bear (1996). (rats) R2=0.28
[Figure, panel d: recall probability (0 to 1) versus time (0 to 60 days)]
Problem with nearly all human data
• Straight fits of forgetting curves and Ribot gradients are nonsense
• Test items for remote periods are easier
• Typically: curves are flat around 80-85% for control subjects
• Reason: maximizes chances of detecting Ribot effects
• Disadvantage: shape of the curves is useless for quantitative analysis
Relative retrograde gradient: a relative measure of memory loss
$rr_{[1]2}(t) = \dfrac{r_{[1]2}(t)}{r_{12}(t)} = \dfrac{\log(1 - p_{patient})}{\log(1 - p_{control})}$
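Because p = 1 - e^{-r}, an intensity can be recovered from an observed recall probability as r = -log(1 - p), and the relative retrograde gradient is the ratio of the patient's intensity to the controls' intensity. A small sketch; the function name and the example scores are made up for illustration and are not taken from any data set.

```python
import math

def rr_gradient(p_patient, p_control):
    """Ratio of patient to control intensity for each time period.

    Scores must lie strictly between 0 and 1 for controls and
    below 1 for patients, otherwise the logarithms are undefined.
    """
    if len(p_patient) != len(p_control):
        raise ValueError("need one patient and one control score per period")
    return [math.log(1.0 - pp) / math.log(1.0 - pc)
            for pp, pc in zip(p_patient, p_control)]

if __name__ == "__main__":
    # Hypothetical scores, most recent period first
    patient = [0.10, 0.30, 0.55]
    control = [0.80, 0.82, 0.85]
    print(rr_gradient(patient, control))
```

A rising sequence of ratios from recent to remote periods indicates a Ribot gradient even when the control curve is flat by design.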
Wiig, Cooper & Bear (1996). (rats) with rr-gradient: R2=0.84
[Figure, panel d: recall probability (0 to 1) versus time (0 to 60 days)]
Albert et al. (1979). Naming of famous faces. Korsakoff patients. R² = 0.977 and R²_rr = 0.978.
[Figure: two panels versus time (0 to 50 years), one showing recall probability and one showing relative intensity (both 0 to 1)]
In progress with Memory Chain Model
• Fits to patient data: Huntington, Alzheimer, Korsakoff, TGA, focal lesions, ECT, etc.
• Data collection with four new tests of retrograde amnesia (in Dutch) developed in my group
• Memory Chain Model helps with item selection and interpretation of results (clinical application: diagnosis)
Summary of the human data
• This work is still in progress
• About 25 human data sets have been fitted
• rr-gradient allows initial quantitative analysis
• Human and animal data give the same overall picture
Concluding remarks
• Consolidation is still a hotly debated issue
• Modeling can help to elucidate the various viewpoints
• Models at various levels of detail can be developed:
– Connectionist models (e.g., TraceLink)
– Mathematical models (e.g., Memory Chain Model)