7252019 What Learning Systems Do Intelligent Agents Need Complementary Learning Systems Theory Updated
What Learning Systems do Intelligent Agents Need? Complementary Learning Systems Theory Updated
Dharshan Kumaran,1,2 Demis Hassabis,1,3 and James L. McClelland4
We update complementary learning systems (CLS) theory, which holds that intelligent agents must possess two learning systems, instantiated in mammalians in neocortex and hippocampus. The first gradually acquires structured knowledge representations while the second quickly learns the specifics of individual experiences. We broaden the role of replay of hippocampal memories in the theory, noting that replay allows goal-dependent weighting of experience statistics. We also address recent challenges to the theory and extend it by showing that recurrent activation of hippocampal traces can support some forms of generalization and that neocortical learning can be rapid for information that is consistent with known structure. Finally, we note the relevance of the theory to the design of artificial intelligent agents, highlighting connections between neuroscience and machine learning.
Complementary Learning Systems
Twenty years have passed since the introduction of the CLS theory of human learning and memory [1], a theory that itself had roots in earlier ideas of Marr and others. According to the theory, effective learning requires two complementary systems: one, located in the neocortex, serves as the basis for the gradual acquisition of structured knowledge about the environment, while the other, centered on the hippocampus, allows rapid learning of the specifics of individual items and experiences. We begin with a review of the core tenets of this theory. We then provide three types of updates. First, we extend the role of replay of memories stored in the hippocampus. This mechanism, initially proposed to support the integration of new information into the neocortex, may support a diverse set of functions [2,3], including goal-related manipulation of experience statistics such that the neocortex is not a slave to the statistics of its environment. Second, we describe recent updates to the theory in response to two key empirical challenges: (i) evidence suggesting that the hippocampus supports some forms of generalization that go beyond those originally envisaged [4–6], and (ii) evidence suggesting that when new information is consistent with existing knowledge, the time required for its integration into the neocortex may be much shorter than originally suggested [7,8]. In a final section we highlight links between the core principles of CLS theory and recent themes in machine learning, including neural network architectures that incorporate memory modules that have parallels with the hippocampus. While there remain several issues not yet fully addressed (see Outstanding Questions), the extensions, responses to challenges, and integration with machine learning bring the theory into agreement with many important recent developments and provide a take-off point for future investigation.
Trends
Discovery of structure in ensembles of experiences depends on an interleaved learning process, both in biological neural networks in neocortex and in contemporary artificial neural networks.
Recent work shows that once structured knowledge has been acquired in such networks, new consistent information can be integrated rapidly.
Both natural and artificial learning systems benefit from a second system that stores specific experiences, centred on the hippocampus in mammalians.
Replay of experiences from this system supports interleaved learning and can be modulated by reward or novelty, which acts to rebalance the general statistics of the environment towards the goals of the agent.
Recurrent activation of multiple memories within an instance-based system can be used to discover links between experiences, supporting generalization and memory-based reasoning.
1Google DeepMind, 5 New Street Square, London EC4A 3TW, UK
2Institute of Cognitive Neuroscience
A central tenet of the theory is that the neocortex houses a structured knowledge representation, stored in the connections among the neurons in the neocortex. This tenet arose from the observation that multi-layered neural networks (Figure 2) gradually learn to extract structure when trained by adjusting connection weights to minimize error in the network outputs [10].
Early
Key Figure
Complementary Learning Systems (CLS) and their Interactions

[Figure 1 panel labels]
Bidirectional connections (blue) link neocortical representations to the hippocampus/MTL for storage, retrieval, and replay.
Rapid learning in connections within hippocampus (red) supports initial learning of arbitrary new information.
Connections within and among neocortical areas (green) support gradual acquisition of structured knowledge through interleaved learning.

Figure 1. Lateral view of one hemisphere of the brain, where broken lines indicate regions deep inside the brain or on the medial surface. Primary sensory and motor cortices are shown in darker yellow. Medial temporal lobe (MTL) surrounded by broken lines, with hippocampus in dark grey and surrounding MTL cortices in light grey (size and location are approximate). Green arrows represent bidirectional connections within and between integrative neocortical association areas, and between these areas and modality-specific areas (the integrative areas and their connections are more dispersed than the figure suggests). Blue arrows denote bidirectional connections between neocortical areas and the MTL. Both blue and green connections are part of the structure-sensitive neocortical learning system in the CLS theory. Red arrows within the MTL denote connections within the hippocampus, and lighter-red arrows indicate connections between the hippocampus and surrounding MTL cortices: these connections exhibit rapid synaptic plasticity (red greater than light-red arrows) crucial for the rapid binding of the elements of an event into an integrated hippocampal representation. Systems-level consolidation involves hippocampal activity during replay spreading to neocortical association areas via pathways indicated with blue arrows, thereby supporting learning within intra-neocortical connections (green arrows). Systems-level consolidation is considered complete when memory retrieval (reactivation of the relevant set of neocortical representations) can occur without the hippocampus.
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
Glossary
Attractor network: networks with recurrent connectivity that have stable states, which persist in the absence of external inputs and afford noise tolerance. Discrete point attractor networks can be used to store multiple memories as individual stable states. Continuous attractor networks have a continuous manifold of stable points, which allow them to represent continuous variables (e.g., position in space).
Auto-associative storage: the storage within an attractor network of an input pattern constituting an experience, such that elements of the input pattern are linked together through plasticity within the recurrent connections of the network. The operation of recurrent connections supports functions such as pattern completion, whereby the entire input pattern (e.g., memory of a birthday party) can be retrieved from a partial cue (e.g., the face of a friend).
Exemplar models: exemplar models in cognitive science, related to instance-based models in machine learning, operate by computing the similarity of a new input pattern (i.e., presented as external sensory input) to stored experiences. This results in the output of the model (for example, a predicted category label for the new input pattern), at which point the process terminates.
Non-parametric: we use this term to refer to algorithms where each experience or datapoint has its own set of coordinates, where capacity can be increased as required, and the number of parameters may grow with the amount of data. K-nearest neighbor constitutes one common example of such a non-parametric instance-based method.
Parametric: we use this term to refer to algorithms that do not store each datapoint but instead directly learn a function that (for example) predicts the output value for a given input. The number of parameters is typically fixed.
Paired associative inference (PAI) task: a paradigm in which items are organized into (e.g., a hundred) sets of triplets (e.g., ABC) or larger sets (e.g., sextets ABCDEF). Participants view item pairs (e.g., AB, BC) during the study phase and are tested on their ability to appreciate the indirect relationships between items that
Box 1. Empirical Evidence Supporting Core Principles of CLS Theory

The Role of the Hippocampus in Memory
Bilateral damage to the hippocampus profoundly affects memory for new information, leaving language, reading, general knowledge, and acquired cognitive skills intact [29,34], consistent with the idea that many types of new learning are initially hippocampus-dependent. Memory for recent pre-morbid information is profoundly affected by hippocampal damage, with older memories being less dependent on the hippocampus and therefore less sensitive to hippocampal lesions [13451128], supporting gradual integration of learned information into cortical knowledge structures. However, some evidence suggests that memory for specific details of an event can remain MTL-dependent [52,129] as long as the details are retained (e.g., [130]).

Hippocampus Supports Core Computations and Representations of a Fast-Learning Episodic Memory System
Episodic memory is widely accepted to depend on the hippocampus, mediated by a capacity to bind together (i.e., 'auto-associate') diverse inputs from different brain areas that represent the constituents of an event. Indeed, information about the spatial (e.g., place) and non-spatial (e.g., what happened) aspects of an event are thought to be processed primarily by parallel streams before converging in the hippocampus at the level of the DG/CA3 subregions [37]. Two complementary computations, pattern separation and pattern completion, are viewed to be central to the function of the hippocampus for storing details of specific experiences. Evidence suggests that the dentate gyrus (DG) subregion of the hippocampus performs pattern separation, orthogonalizing incoming inputs before auto-associative storage in the CA3 region [131–137]. Further, the CA3 subregion is crucial for pattern completion, allowing the output of an entire stored pattern (e.g., corresponding to an entire episodic memory) from a partial input, consistent with its function as an attractor network [138,139] (Boxes 2–4).

Hippocampal Replay
A wealth of evidence demonstrates that replay of recent experiences occurs during offline periods (e.g., during sleep, rest) [2,3]. Further, the hippocampus and neocortex interact during replay, as predicted by CLS theory [65], putatively to support interleaved learning. A causal role for replay in systems-level consolidation is supported by the finding that optogenetic blockage of CA3 output in transgenic mice after learning in a contextual fear paradigm specifically reduces sharp-wave ripple (SWR) complexes in CA1 and impairs consolidation [69].

The Hippocampus and Neocortex Support Qualitatively Different Forms of Representation
A recent experiment [140] found initial evidence in favor: the behavior of rats in the Morris water maze early on appeared to reflect individual episodic traces (i.e., an instance-based, non-parametric representation), but at a later time-point (28 days after learning) was consistent with the use of a parametric representation, putatively housed in the neocortex.
were never presented together (e.g., A and C).
Paired associative recall task: a paradigm where item pairs are experienced during study (e.g., word pairs such as 'dog–table' in a human experiment, or flavor–location pairs in a rodent experiment) and at test the individual must recall the other item (e.g., specific location) from a cue (the specific flavor, e.g., banana).
Recurrent similarity computation: recurrent similarity computation allows the procedure performed by exemplar models to iterate: that is, the retrieved products from the first step of similarity computation are combined with the external sensory input, and a subsequent round of similarity computation is performed. This process continues until a stable state (i.e., basin of attraction in a neural network) is reached. This allows the model to capture higher-order similarities present in a set of related experiences, where pairwise similarities alone are not informative.
Sharp-wave ripple (SWR): spontaneous neural activity occurring within the hippocampus during periods of rest and slow wave sleep, evident as negative potentials (i.e., sharp waves). Transient high-frequency (150 Hz) oscillations (i.e., ripples) occur within these sharp waves, which can reflect the replay (i.e., reactivation) of activity patterns that occurred during actual experience, sped up by an order of magnitude.
Sparsity: the proportion of neurons in a given brain region that are active in response to a given stimulus ('population sparseness'). Sparse coding, where a small (e.g., 1%) proportion of neurons is active, is contrasted with densely distributed coding, where a relatively large proportion of neurons are active (e.g., 20%).
modeling the neural computations supporting visual processing of objects in primates [17,18]. The considerable advantages of depth in allowing the learning of increasingly complex and abstract mappings [16] are balanced here by the strong interdependencies among connection weights in deep networks [19,20], such that the weights are learned gradually, through extensive, repeated, and interleaved exposure to an ensemble of training examples that embody the domain statistics.
Although there are real advantages of a system using structured parametric representations, on its own such a system would suffer from two drastic limitations [1]. First, it is important to be able to base behavior on the content of an individual experience. For example, after experiencing a life-threatening situation (for example, an encounter with a lion at a watering-hole), it would clearly be beneficial to learn to avoid that particular location without the need for further encounters with the lion. The second problem is that the rapid adjustment of connection weights in a multilayer network to accommodate new information can severely disrupt the representation of existing knowledge in it, a phenomenon termed catastrophic interference [1,21–23] that is related to the stability–plasticity dilemma [24]. If the new information about the dangerous lion is forced into a multi-layer network by making large connection weight adjustments just to accommodate this item, this can interfere with knowledge of other less-threatening animals one may already be familiar with.
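The contrast between focused and interleaved learning can be illustrated with a deliberately tiny sketch. It is not the multilayer network discussed in the text: it assumes a single linear unit trained with the delta rule, and the two overlapping input patterns (`OLD_ITEM`, `NEW_ITEM`), their targets, the learning rate, and all function names are hypothetical choices made only for illustration.

```python
# Toy illustration of interference vs. interleaved learning.
# Assumption: a single linear unit trained with the delta rule stands in for
# the multilayer networks discussed in the text; all values are illustrative.

OLD_ITEM = ([1.0, 1.0, 0.0], 1.0)   # (input pattern, target) learned first
NEW_ITEM = ([0.0, 1.0, 1.0], 0.0)   # overlapping pattern introduced later


def output(weights, pattern):
    """Linear unit: weighted sum of the input activations."""
    return sum(w * x for w, x in zip(weights, pattern))


def train(weights, items, epochs=500, lr=0.1):
    """Delta rule: nudge each weight in proportion to error times input."""
    weights = list(weights)
    for _ in range(epochs):
        for pattern, target in items:
            error = target - output(weights, pattern)
            weights = [w + lr * error * x for w, x in zip(weights, pattern)]
    return weights


def old_item_error(weights):
    """Absolute error on the item that was learned first."""
    pattern, target = OLD_ITEM
    return abs(target - output(weights, pattern))


# Step 1: acquire the old knowledge.
base = train([0.0, 0.0, 0.0], [OLD_ITEM])

# Step 2a: focused learning, training on the new item alone.
focused = train(base, [NEW_ITEM])

# Step 2b: interleaved learning, new item mixed with the old one.
interleaved = train(base, [OLD_ITEM, NEW_ITEM])

print("old-item error after focused learning:    ", round(old_item_error(focused), 3))
print("old-item error after interleaved learning:", round(old_item_error(interleaved), 3))
```

In this toy setting, training on the new item alone shifts the shared weight and leaves a residual error on the old item, whereas interleaving the two items lets the weights settle on values that satisfy both. The effect in deep networks can be far more severe; the sketch only shows the direction of the effect.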
[Figure 2 schematic: a network with Layer 1 (Input), Layer 2, Layer 3, and Layer 4 (Output), plus a Target pattern. Inset: each unit computes a net input $n_i^{l} = \sum_j a_j^{l-1} w_{ij}^{l,l-1}$ and an activation $a_i^{l}$ that is a non-linear function of $n_i^{l}$.]

Figure 2. A Neocortex-Like Artificial Neural Network. In the complementary learning systems (CLS) theory, neocortical processing is seen as occurring through the propagation of activation among neurons via weighted connections, as simulated using artificial networks of neuron-like units (small circles). Each unit has an input line and an output line (with arrowhead). There is a separate real-valued weight where each output line crosses an input line. The weights are the knowledge that governs processing in the network. During processing (inset), each unit computes a net input (n) from the activations of its inputs and the weights (plus a bias term, omitted here), producing an activation (a) that is a non-linear function of n (one such function shown). The units in a layer may project back onto their own inputs (illustrated for layer 3), simulating recurrent intra-cortical computations, and higher layers may project back to lower layers (Figure 1). In the situation shown, the input (lower left) is a pattern in which units are either active (a = 1, black) or inactive (a = 0, white), and examples of possible activations produced in units of other layers are shown (darker for greater activation). Learning occurs through adjusting the weights to reduce the difference between the output of the network and a target output (upper right) [10,16]. In the case shown, the output activations are similar to the target but there is some error to drive learning. There are no targets for internal or hidden layers (i.e., layers 2 and 3). These patterns depend on the connection weights, which in turn are shaped by the error-driven learning process.
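The unit computation and error-driven weight adjustment described in the Figure 2 caption can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the network in the figure: it uses a single layer of weights, a logistic function as the non-linearity (the caption does not name the function shown), squared error, and hypothetical input, target, and weight values.

```python
import math

# Sketch of the unit computation in the Figure 2 caption, under assumptions:
# logistic activation, squared error, and a single layer of weights.


def logistic(n):
    """One possible non-linear activation function (assumed here)."""
    return 1.0 / (1.0 + math.exp(-n))


def forward(inputs, weights):
    """Net input n_i = sum_j a_j * w_ij, then activation a_i = f(n_i)."""
    net = [sum(a_j * w_ij for a_j, w_ij in zip(inputs, row)) for row in weights]
    return [logistic(n) for n in net]


def squared_error(outputs, targets):
    """Sum of squared differences between outputs and targets."""
    return sum((t - o) ** 2 for o, t in zip(outputs, targets))


def learning_step(inputs, weights, targets, lr=0.5):
    """One small step along the negative error gradient for this pattern."""
    outputs = forward(inputs, weights)
    new_weights = []
    for row, o, t in zip(weights, outputs, targets):
        delta = (t - o) * o * (1.0 - o)   # error times slope of the logistic
        new_weights.append([w + lr * delta * a for w, a in zip(row, inputs)])
    return new_weights


inputs = [1.0, 0.0, 1.0]                        # units active (1) or inactive (0)
targets = [1.0, 0.0]                            # hypothetical target pattern
weights = [[0.1, -0.2, 0.3], [0.2, 0.1, -0.1]]  # weights[i][j]: input j to unit i

before = squared_error(forward(inputs, weights), targets)
weights = learning_step(inputs, weights, targets)
after = squared_error(forward(inputs, weights), targets)
print("error before:", before, "error after:", after)
```

Only weights attached to active inputs change, and each change is small, which is one way to see why many repeated exposures are needed before the weights reflect the structure of a whole ensemble of examples.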
Box 2. Functional Roles of Subregions of the Medial Temporal Lobes

Work within the CLS framework [27,116,141] relies on the anatomical and physiological properties of MTL subregions and the computational insights of others [9,25,26] to characterize the computations performed within these structures.

Entorhinal Cortex (ERC) Input to the Hippocampal System
During an experience, inputs from neocortex produce a pattern of activation in the ERC that may be thought of as a compressed description of the patterns in the contributing cortical areas (Figure I; illustrative active neurons in the ERC are shown in blue). ERC neurons give rise to projections to three subregions of the hippocampus proper: the dentate gyrus (DG), CA1, and CA3 [28,84].

Pattern selection and pattern separation
Novel ERC patterns are thought to activate a small set of previously uncommitted DG neurons (shown in red; these neurons may be relatively young neurons created by neurogenesis). These neurons in turn select a random subset of neurons in CA3 via large 'detonator synapses' (shown as red dots on the projection from DG to CA3) to serve as the representation of the memory in CA3, ensuring that the new CA3 pattern is as distinct as possible from the CA3 patterns for other memories, including those for experiences similar to the new experience (Boxes 3 and 4).

Pattern completion
Recurrent connections from the active CA3 neurons onto other active CA3 neurons are strengthened during the experience, such that if a subset of the same neurons later becomes active, the rest of the pattern will be reactivated. Direct connections from ERC to CA3 are also strengthened, allowing the ERC input to directly activate the pattern in CA3 during retrieval without requiring DG involvement (Box 3).

Pattern reinstatement in ERC and neocortex [116,141]
The connections from ERC to CA1 and back are thought to change relatively slowly, to allow stable correspondence between patterns in CA1 and ERC. Strengthening of connections from the active CA3 neurons to the active CA1 neurons during memory encoding allows this CA1 pattern to be reactivated when the corresponding CA3 pattern is reactivated; the stable connections from CA1 to ERC then allow the appropriate pattern there to be reactivated, and stable connections between ERC and neocortical areas propagate the reactivated ERC pattern to the neocortex. Importantly, the bidirectional projections between CA1 and ERC and between ERC and neocortex support the formation and decoding of invertible CA1 representations of ERC and neocortical patterns, and allow recurrent computations. These connections should not change rapidly given the extended role of the hippocampus in memory; otherwise reinstatement in the neocortex of memories stored in the hippocampus would be difficult [61].
[Figure I schematic: Neocortex, ERC, DG, CA3, and CA1, with their interconnections.]

Figure I. Hippocampal Subregions, Connectivity, and Representation. Schematic depictions of neurons (with circular or triangular cell bodies) are shown, along with schematic depictions of projections from neurons in an area to neurons in the same or other areas (grey or colored lines; red coloring indicates projections with highly-plastic synapses, while grey coloring illustrates relatively less-plastic or stable projections). CA1 output to ERC then propagates out to neocortex. ERC and even resulting neocortical activity can be fed back into the hippocampus (broken line), as proposed in the REMERGE model (see below).
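Pattern completion through strengthened recurrent connections can be illustrated with a small Hopfield-style attractor network. This is a generic textbook stand-in, not the CLS model of CA3: bipolar (+1/−1) units, Hebbian outer-product learning, synchronous updates, and the two stored patterns are all assumptions made for the sketch.

```python
# Pattern completion in a small Hopfield-style attractor network.
# Assumption: bipolar (+1/-1) units and Hebbian outer-product learning are
# used as a generic stand-in; this is not the CLS model of CA3.

STORED = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]


def hebbian_weights(patterns):
    """Strengthen connections between units that are co-active in a pattern."""
    size = len(patterns[0])
    weights = [[0] * size for _ in range(size)]
    for p in patterns:
        for i in range(size):
            for j in range(size):
                if i != j:                      # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights


def complete(cue, weights, max_steps=10):
    """Update all units repeatedly until the state stops changing."""
    state = list(cue)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(weights):
            net = sum(w * s for w, s in zip(row, state))
            # Keep the current value if the net input is exactly zero.
            new_state.append(1 if net > 0 else -1 if net < 0 else state[i])
        if new_state == state:
            break
        state = new_state
    return state


weights = hebbian_weights(STORED)
partial_cue = [1, 1, 1, 1, -1, -1, 0, 0]   # last two features unknown (0)
print(complete(partial_cue, weights))
```

The cue supplies six of the eight features of the first stored pattern; the recurrent weights fill in the remaining two, and the completed pattern is then a stable state of the network.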
hippocampal representation formed in learning an event affords a way of allowing gradual integration of knowledge of the event into neocortical knowledge structures. This can occur if the hippocampal representation can reactivate or replay the contents of the new experience back to the neocortex, interleaved with replay and/or ongoing exposure to other experiences [1]. In this way, the new experience becomes part of the database of experiences that govern the values of the connections in the neocortical learning system [51–53]. Which other memories are selected for interleaving with the new experience remains an open question. Most simply, the hippocampus might replay recent novel experiences interleaved with all other recent experiences still stored in
Box 3. Pattern Separation and Completion in Different Subregions of the Hippocampus

Pattern separation and completion [25–27] are defined in terms of transformations that affect the overlap or similarity among patterns of neural activity [28,142]. Pattern separation makes similar patterns more distinct through conjunctive coding [9,25], in which each output neuron responds only to a specific combination of active input neurons. Figures IA and IB illustrate how this can occur. Pattern separation is thought to be implemented in DG (see Box 4) using higher-order conjunctions that reduce overlap even more than illustrated in the figure.

Pattern completion is a process that takes a fragment of a pattern and fills in the remaining features (as in recalling a lion upon seeing the scene where the lion previously appeared), or that takes a pattern similar to a familiar pattern and makes it even more similar to it. Computational simulations [27] have shown how the CA3 region might combine features of pattern separation and completion, such that moderate and high overlap results in pattern completion toward the stored memory, but less overlap results in the creation of a new memory [37,133,143] (Figure IC). In this account, when environmental input produces a pattern in ERC similar to a previous pattern, the CA3 outputs a pattern closer to the one it previously used for this ERC pattern [124,144]. However, when the environment produces an input on the ERC that has low overlap with patterns stored previously, the DG recruits a new, statistically independent cell population in CA3 (i.e., pattern separation [27]). Emerging evidence suggests that the amount of overlap required for pattern completion (as well as other characteristics of hippocampal processing) may differ across the proximal–distal [145,146] and dorso–ventral axes [98,147–150] of the hippocampus, and may be shaped by neuromodulatory factors (e.g., acetylcholine) [85,151]. Also, incomplete patterns require less overlap with a stored pattern than distorted ones for completion to occur, so that partial cues will tend to produce completion, as when one sees the watering hole and remembers seeing a lion there previously [27].

Several studies point to differences between the CA3 and CA1 regions in how their neural activity patterns respond to changes to the environment [37]: broadly, the CA1 region tends to mirror the degree of overlap in the inputs from the ERC, while CA3 shows more discontinuous responses, reflecting either pattern separation or completion [134,152].
[Figure I panels: (A, B) Pattern separation in DG; (C) Separation and completion in CA3. The plots in (B) and (C) show output overlap (0 to 1) against input overlap (0 to 1).]

Figure I. Conjunctive Coding, Pattern Separation, and Pattern Completion. (A) A set of 10 conjunctive units with connections from a layer of 5 input units is shown twice with different input patterns. Here each conjunctive unit detects activity in a distinct pair of input units (arrows). The output for each pattern is sparser than the input (i.e., 30% vs 60%, respectively) and the two outputs overlap less than the two corresponding inputs (i.e., 33% vs 67%, respectively; overlap is the number of active units shared by two patterns divided by the number of units active in each). DG may use higher-order conjunctions, magnifying these effects. (B) An illustration of the general form of a pattern separation function, showing the relationship between input and output overlap. Arrows indicate the overlap of the inputs and outputs shown in the left panel. (C) The separation-and-completion profile associated with CA3, where low levels of input overlap are reduced further while higher levels are increased [27,37].
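The arithmetic of panel (A) can be checked with a short sketch: 5 input units feed 10 conjunctive units, one per distinct pair of inputs. The two specific input patterns below are assumptions, chosen only so that each has 3 of 5 inputs active and the two share 2 active inputs; the figure's actual patterns are not recoverable from the text.

```python
from itertools import combinations

N_INPUTS = 5
# One conjunctive unit per distinct pair of input units: C(5, 2) = 10 units.
PAIRS = list(combinations(range(N_INPUTS), 2))


def conjunctive_code(active_inputs):
    """Return the pair-detecting units whose two inputs are both active."""
    active = set(active_inputs)
    return {pair for pair in PAIRS if pair[0] in active and pair[1] in active}


def overlap(a, b):
    """Shared active units divided by the number active in each pattern
    (both patterns are assumed to have the same number of active units)."""
    return len(a & b) / len(a)


input_a = {0, 1, 2}      # hypothetical pattern: 3 of 5 inputs active (60%)
input_b = {1, 2, 3}      # shares 2 of its 3 active inputs with input_a

output_a = conjunctive_code(input_a)
output_b = conjunctive_code(input_b)

print("input activity: ", len(input_a) / N_INPUTS)
print("output activity:", len(output_a) / len(PAIRS))
print("input overlap:  ", round(overlap(input_a, input_b), 2))
print("output overlap: ", round(overlap(output_a, output_b), 2))
```

With three active inputs, three of the ten pair detectors fire (30% vs 60% activity), and because the two inputs share only one complete pair, the outputs share one of three active units (33% vs 67% overlap), matching the figures quoted in the caption.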
complementary properties of each of the two component systems, allowing new information to be rapidly stored in the hippocampus and then slowly integrated into neocortical representations. This process, sometimes labeled 'systems level consolidation' [51], arises within the theory from gradual cortical learning driven by replay of the new information, interleaved with other activity to minimize disruption of existing knowledge during the integration of the new information.
Empirical Evidence of Replay
Because of its centrality in the theory, we highlight key empirical evidence that replay events really do occur. The data come primarily from rodents recorded during periods of inactivity (including sleep), in which hippocampal neurons exhibit large irregular activity (LIA) patterns that are distinct from the activity patterns observed during active states [2,3]. During LIA states, synchronous discharges thought to be initiated in hippocampal area CA3 produce sharp-wave ripples (SWRs), which are propagated to neocortex. SWRs reflect the reactivation of recent experiences, expressed as the sequential firing of so-called place cells, cells that fire when the animal is at a specific location [2,3,57–59]. These replay events appear to be time-compressed by a factor of about 20, bringing neuronal spikes that were well-separated in time during an actual experience into a time-window that enhances synaptic plasticity both
Box 4. Sparse Conjunctive Coding and Pattern Separation in the Dentate Gyrus

Neuronal codes range from the extreme of localist codes, where neurons respond highly selectively to single entities ('grandmother cells'), to dense distributed codes, where items are coded through the activity of many (e.g., 50%) neurons in an area [153,154]. While localist codes minimize interference and are easily decodable, they are inefficient in terms of representational capacity. By contrast, dense distributed codes are capacity-efficient; however, they are metabolically costly and relatively difficult to decode. These are endpoints on a continuum quantified by a measure called sparsity, where 'population' sparsity indexes the proportion of neurons that fire in response to a given stimulus/location and 'lifetime' sparsity indexes the proportion of stimuli to which a single neuron responds [26,153,155]. For example, a population sparsity of 1% means that only 1% of the neurons in a population are active in representing a given input. Two randomly selected sparse patterns tend to have low overlap (for two randomly selected patterns of equal sparsity over the same set of neurons, the average proportion of neurons in either pattern that is active in the other is equal to the sparsity), but neurons still participate in several different memories, making them more efficient than localist codes. Despite variability in estimates of the sparsity of a given brain region [27,153,156,157], the DG is widely believed to sustain among the sparsest neural codes in the brain (0.5–1% population sparseness) [25–27]. The CA3 region, to which the DG projects, is thought to be less sparse (2.5% [47]). Many studies find less-sparse patterns in CA1 than CA3 [134,152].

The unique functional and anatomical properties of the DG suggest the origins of its sparse pattern-separated code. The perforant path from the ERC (containing 200 000 neurons in the rodent) projects to a layer of 1 million DG granule cells. Combined with the high levels of inhibition in the DG, this supports the formation of highly sparse conjunctive representations, such that each neuron in DG responds only when several input neurons are simultaneously active, reducing overlap between similar input patterns [25–27,136]. Evidence also suggests that new DG neurons arise from stem cells throughout adult life: these new neurons may be preferentially recruited in the formation of memories [136], further reducing overlap with previously stored memories. The CA3 pattern for a memory is then selected by the active DG neurons, each of which has a 'detonator' synapse to 15 randomly selected CA3 neurons. This process helps minimize the overlap of CA3 patterns for different memories, increasing storage capacity and minimizing interference between them, even if the two memories represent similar events that have highly overlapping patterns in neocortex and ERC. Empirical evidence provides support for this, with one study [137] showing that the representation supported by DG was highly sensitive to small changes in the environment despite evidence that incoming inputs from the ERC were little affected (also see [133,145]). Furthermore, DG lesions impair an animal's ability to learn to respond differently in two very similar environments, while leaving intact the ability to learn to respond differently in two environments that are not similar [136].
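The statement in Box 4 that the average overlap between two random patterns of equal sparsity equals the sparsity can be verified exactly. The sketch below assumes each pattern is a uniformly random choice of a fixed number of active neurons, so the number of shared neurons follows a hypergeometric distribution; the population sizes and the function name `expected_overlap` are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def expected_overlap(n_neurons, n_active):
    """Exact mean fraction of one pattern's active units that are also active
    in a second, independently drawn pattern with the same number active."""
    total = comb(n_neurons, n_active)
    mean_shared = sum(
        Fraction(m * comb(n_active, m) * comb(n_neurons - n_active, n_active - m), total)
        for m in range(n_active + 1)
    )
    return mean_shared / n_active


n_neurons, n_active = 200, 20          # hypothetical population, 10% sparsity
sparsity = Fraction(n_active, n_neurons)
print("expected overlap:", expected_overlap(n_neurons, n_active), "sparsity:", sparsity)
```

Using exact fractions avoids sampling noise: the mean number of shared neurons is `n_active * n_active / n_neurons`, so dividing by `n_active` gives the sparsity itself.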
Box 5. Similarity-Based Coding in High-Level Visual Cortex

High-level visual regions of the neocortex are thought to support distributed representations that are inferred to be less sparse than those of the DG and the CA3/CA1 regions of the hippocampus (Box 4). Population sparseness in the ERC is estimated at 7–10% [158], with high-level sensory cortices exhibiting similar or higher levels of sparseness (e.g., variable estimates [44–46]). Although lifetime sparseness does not directly translate to population sparseness, recent evidence suggests that V4 and inferotemporal cortex (ITc) have a sparseness of 10% on this measure [159]. It is worth noting that learning rates may vary according to neuronal selectivity and lifetime sparseness, resulting in differences in learning rates across neocortical areas and hippocampal subregions. Neurons in early visual regions that encode frequently-occurring features (i.e., edges) may have a relatively slow learning rate, while neurons in higher visual regions and beyond (e.g., ITc and perirhinal cortex) may have a higher learning rate to support the encoding of less-frequently occurring, more-conjunctive features (e.g., individual objects) [12,160,161].

Evidence from electrophysiological recording studies in high-level visual cortical regions such as the ITc in primates provides support for the operation of a similarity-based coding scheme, whereby related categories (e.g., dogs and cats) are represented by overlapping neuronal codes [17,40–43] (Figure I). Representational similarity analysis (RSA) of the ITc population response during passive viewing of pictures reveals coding of fine-grained categorical structure (e.g., of a set of animate and inanimate objects) that is well fit by deep convolutional neural networks, which have algorithmic parallels with feedforward processing in the ventral visual stream [17,40]. While analogous similarity-based coding was observed using fMRI in the human homolog of ITc [41], there was no evidence for greater within-category (cf. between-category) representational similarity in any subregion of the hippocampus in a recent fMRI study [162], which found evidence consistent with the importance of pattern separation in episodic memory. Instead, similarity-based coding in this study was observed in the perirhinal and parahippocampal cortex, MTL regions that project to the ERC and that are typically considered to be intermediate zones (i.e., between the hippocampal and neocortical systems) in CLS theory.
[Figure I image: two representational dissimilarity matrices, one for monkey ITc and one for human ITc. Color scale: dissimilarity (percentile of 1 − r), from 0 to 100. Axis labels: Animate (Human, Not human; Body, Face) and Inanimate (Natural, Artificial).]
Figure I. Similarity-Based Coding in High-Level Visual Cortex. Representational dissimilarity matrices (RDMs) reflect the correlation distance (i.e., 1 − r, where r is the Pearson correlation coefficient) between the responses of voxel patterns (fMRI in humans [41], right panel) or neuronal populations (electrophysiological recording in monkey [43], left panel) to a set of 92 object images. RDMs are analogous in monkey and human ITc. The RDMs show that the representations of animate objects are similar, as are those of inanimate objects. In addition to this clear animate–inanimate distinction, object coding in ITc exhibits finer categorical structure (e.g., for faces, body parts) visible in these RDMs (also see [41]). Reproduced, with permission, from [41].
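The dissimilarity measure used in these matrices, 1 − r, can be illustrated with a small sketch. The response patterns below are invented toy values (not data from [41] or [43]), and the function names `pearson` and `rdm` are illustrative assumptions; the sketch only shows how an RDM is assembled from pairwise Pearson correlations.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length patterns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rdm(patterns):
    """Matrix of 1 - r for every pair of response patterns."""
    n = len(patterns)
    return [
        [0.0 if i == j else 1.0 - pearson(patterns[i], patterns[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical responses of four units (or voxels) to three stimuli.
responses = {
    "dog": [1.0, 0.9, 0.1, 0.0],
    "cat": [0.9, 1.0, 0.0, 0.1],
    "car": [0.0, 0.1, 1.0, 0.9],
}
names = list(responses)
matrix = rdm([responses[name] for name in names])
```

In this toy case the two animate items yield a small dissimilarity and the animate–inanimate pairs a large one, which is the kind of block structure the figure describes.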
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
rodents [72,73]. This generalized replay – simultaneous reactivation of multiple related traces during testing or offline periods – may facilitate the creation of new representations from the recombination of multiple related episodes (‘stored generalizations’) [5] and the discovery of novel relationships (e.g., shortcuts) [72,73]. Empirical evidence also supports a role for the hippocampus in category- and so-called ‘statistical’ learning [105–107]; the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6. Generalization Through Recurrence in the Hippocampal System
The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers: feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed, pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).
An essential principle of the model, mediated by the bidirectional excitatory connections between feature and conjunctive layers, is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed ‘big-loop’ recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time, thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism ([5] for details). Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I image: two-layer network diagram. Conjunctive layer units: AB, BC, CD, DE, EF. Feature layer units: A, B, C, D, E, F.]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE, showing its two-layer architecture with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
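The recurrent similarity computation described in this box can be sketched in a few lines. This is a simplified illustration, not the published REMERGE equations from [5]: a softmax over conjunctive units stands in for recurrent inhibition, feature activity is simply external input plus feedback, and the `temperature` and `steps` values are arbitrary assumptions.

```python
import math

FEATURES = ["A", "B", "C", "D", "E", "F"]
CONJUNCTS = ["AB", "BC", "CD", "DE", "EF"]  # stored pairs, as in Figure I


def softmax(values, temperature):
    """Normalized competition among conjunctive units (stand-in for inhibition)."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def settle(probe, steps=20, temperature=0.5):
    """Recirculate retrieved content as input for a fixed number of steps."""
    external = {f: (1.0 if f in probe else 0.0) for f in FEATURES}
    feature = dict(external)
    conjunct = {c: 0.0 for c in CONJUNCTS}
    for _ in range(steps):
        # Feature -> conjunctive: each pair sums the activity of its two members.
        net = [feature[c[0]] + feature[c[1]] for c in CONJUNCTS]
        conjunct = dict(zip(CONJUNCTS, softmax(net, temperature)))
        # Conjunctive -> feature: external input plus feedback from active pairs.
        feature = {
            f: external[f] + sum(a for c, a in conjunct.items() if f in c)
            for f in FEATURES
        }
    return feature, conjunct


# Probe with A and C only; B is never presented.
feature, conjunct = settle(probe={"A", "C"})
```

Under these assumptions, B becomes more active than D, E, or F even though it was not part of the probe, because the pairs AB and BC feed activity back to it. In this chain layout CD also retains some activity, since C belongs to both BC and CD.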
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of ‘concept’ cells: hippocampal neurons which come to respond to common features across many events, for example cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [34,51–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7. Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG and also in the subiculum tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant ‘concept cell-like’ responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).
It is also interesting to consider the finding of ‘splitter’ cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here, some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent ‘charts’ [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.
In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions (reward value and item) were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8. Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176] attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the ‘fast mapping’ paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects, one familiar and one unfamiliar, and are asked a question such as ‘is the numbat's tail pointing up?’, inferring that the unfamiliar name ‘numbat’ must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I image: two overhead arena diagrams with numbered locations. (A) Original paired associates. (B) Introduction of new paired associates.]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid, schema-dependent, one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory – reduction of interference between similar but distinct events – may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function and the ubiquity of continual learning is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
Box 9. Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance to at best 30% of that when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events, specifically experiences that are associated with high reward prediction errors, yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
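The two sampling schemes contrasted in this box can be sketched as follows. This is a minimal illustration, not the DQN implementation of [119] or the prioritization scheme of [183]: the class name `ReplayBuffer`, its methods, and the `priority` value (standing in for the magnitude of a reward prediction error) are assumptions made for the example.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently discards the oldest entry when full.
        self.transitions = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, priority=1.0):
        self.transitions.append((state, action, reward, next_state))
        self.priorities.append(priority)

    def sample_uniform(self, batch_size, rng=random):
        """Every stored transition is equally likely (sampling without replacement)."""
        return rng.sample(list(self.transitions), batch_size)

    def sample_prioritized(self, batch_size, rng=random):
        """Transitions are drawn in proportion to their priority (with replacement)."""
        return rng.choices(
            list(self.transitions), weights=list(self.priorities), k=batch_size
        )


rng = random.Random(0)
buffer = ReplayBuffer(capacity=50)
for t in range(60):
    # Hypothetical stream: one rare, highly surprising transition at t == 55.
    surprise = 50.0 if t == 55 else 1.0
    buffer.add(state=t, action=0, reward=0.0, next_state=t + 1, priority=surprise)

uniform_batch = buffer.sample_uniform(8, rng)
prioritized_batch = buffer.sample_prioritized(1000, rng)
rare_count = sum(1 for state, _, _, _ in prioritized_batch if state == 55)
```

Uniform sampling draws non-adjacent transitions and so breaks up the temporal correlations of consecutive experience, whereas the prioritized variant revisits the rare transition far more often than its share of the buffer would suggest.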
Box 10. Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
[Figure I image: NTM schematic. Input (Xt) and Output (Yt) connect to a Controller, which interacts with an External memory through Write heads and Read heads.]
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
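Content-based addressing, as described in the box text, can be sketched as a graded read over memory slots. The specific choices here (cosine similarity, a softmax with a sharpness parameter `beta`, and the toy memory contents) are assumptions made for illustration and are not taken from the box itself; see [125] for the actual formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (small constant avoids 0/0)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def content_weights(memory, key, beta):
    """Attention over memory slots: softmax of beta * similarity to the key."""
    scores = [beta * cosine(row, key) for row in memory]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Read vector: attention-weighted sum of the memory slots."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]


# Hypothetical memory matrix with three slots, and a noisy key resembling slot 0.
memory = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
key = [0.9, 0.1, 0.0, 0.8]
weights = content_weights(memory, key, beta=10.0)
retrieved = read(memory, weights)
```

Because every step is a smooth function of the key and the memory contents, gradients can flow through the read operation, which is the property the box highlights. A partial or noisy key still recovers the closest stored pattern, which is one way to see the resemblance to pattern completion in attractor networks.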
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus – recurrent similarity computation such that the output of the episodic system is recirculated as a new input – has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk?’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
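The iterative retrieval described above can be sketched with a toy example. Word-overlap sets stand in for the learned dense feature vectors of a real memory network [127], the story sentences beyond the two quoted in the text are invented, and the learned response module that would emit ‘the office’ is omitted; `retrieve_chain` and `features` are hypothetical names.

```python
STOPWORDS = {"the", "to", "is", "where", "a"}


def features(text):
    """Toy stand-in for a learned feature vector: the set of content words."""
    words = text.lower().replace("?", "").split()
    return {w for w in words if w not in STOPWORDS}


def retrieve_chain(story, question, hops=2):
    """Retrieve supporting sentences, folding each retrieval back into the query."""
    query = features(question)
    candidates = list(range(len(story)))
    chain = []
    for _ in range(hops):
        if not candidates:
            break
        # Pick the stored sentence most similar to the current query.
        best = max(candidates, key=lambda i: len(query & features(story[i])))
        chain.append(story[best])
        # Combine the query with what was retrieved, then look earlier in the story.
        query = query | features(story[best])
        candidates = [i for i in candidates if i < best]
    return chain


story = [
    "Mary went to the garden",
    "Joe traveled to the office",
    "Joe left the milk",
    "Mary picked up the football",
]
chain = retrieve_chain(story, "Where is the milk?")
```

The first hop matches on ‘milk’; the second hop succeeds only because ‘Joe’ entered the query from the first retrieval, which mirrors the recirculation of output as input in REMERGE.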
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O’Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the ‘index maintenance’ problem), and if so how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc Lond B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O’Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O’Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O’Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O’Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–214
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff’s syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
65. Ji, D. and Wilson, M.A. (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat. Neurosci. 10, 100–107
66. Lansink, C.S. et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol. 7, e1000173
67. Ego-Stengel, V. and Wilson, M.A. (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara, C.G. et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat. Neurosci. 17, 1658–1660
87. Sara, S.J. (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat. Rev. Neurosci. 10, 211–223
88. McGaugh, J.L. (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu. Rev. Neurosci. 27, 1–28
89. Redondo, R.L. and Morris, R.G. (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat. Rev. Neurosci. 12, 17–30
90. Kumaran, D. (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front. Hum. Neurosci. 6, 157
91. Bunsey, M. and Eichenbaum, H. (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova, D. and Preston, A.R. (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J. Neurosci. 30, 14676–14684
93. Preston, A.R. et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek, J.A. and Eichenbaum, H. (1997) The hippocampus and memory for orderly stimulus relations. Proc. Natl. Acad. Sci. U.S.A. 94, 7109–7114
95. Shohamy, D. and Wagner, A.D. (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova, D. et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic, B. et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr. Biol. 25, 821–830
98. Schlichting, M.L. et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat. Commun. 6, 8151
99. Eichenbaum, H. et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard, M.W. et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol. Rev. 112, 75–116
101. Kloosterman, F. et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum, H. and Cohen, N.J. (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess, N. (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen, P. et al., eds), pp. 715–750, Oxford University Press
104. Willshaw, D.J. et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) ‘Simple memory: a theory of archicortex’. Philos. Trans. R. Soc. B Biol. Sci. 370, 20140383
105. Schapiro, A.C. et al. (2014) The necessity of the medial temporal lobe for statistical learning. J. Cogn. Neurosci. 26, 1736–1747
106. Knowlton, B.J. and Squire, L.R. (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy, D. and Turk-Browne, N.B. (2013) Mechanisms for widespread hippocampal involvement in cognition. J. Exp. Psychol. Gen. 142, 1159–1170
109. Tamminen, J. et al. (2015) From specific examples to general knowledge in language learning. Cogn. Psychol. 79, 1–39
110. Walker, M.P. and Stickgold, R. (2010) Overnight alchemy: sleep-dependent memory evolution. Nat. Rev. Neurosci. 11, 218
111. Wood, E.R. et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum, H. (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat. Rev. Neurosci. 15, 732–744
113. McKenzie, S. et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga, R.Q. et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland, J.L. (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J. Exp. Psychol. Gen. 142, 1190–1210
116. McClelland, J.L. and Goddard, N.H. (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton, G.E. et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart, D.E. et al., eds), pp. 77–109, MIT Press
118. Krizhevsky, A. et al. (2012) Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1106–1114
119. Mnih, V. et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme, C.B. et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc. Natl. Acad. Sci. U.S.A. 111, 18428–18435
121. Samsonovich, A. and McNaughton, B.L. (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J. Neurosci. 17, 5900–5920
122. Buzsaki, G. and Moser, E.I. (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat. Neurosci. 16, 130–138
123. Renno-Costa, C. et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput. Biol. 10, e1003641
124. Wills, T.J. et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville, W.B. and Milner, B. (1957) Loss of recent memory after bilateral hippocampal lesions. J. Neurol. Neurosurg. Psychiatry 20, 11–12
129. Nadel, L. and Moscovitch, M. (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr. Opin. Neurobiol. 7, 217–227
130. Moscovitch, M. et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J. Anat. 207, 35–66
131. Yassa, M.A. and Stark, C.E. (2011) Pattern separation in the hippocampus. Trends Neurosci. 34, 515–525
132. Liu, X. et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb, J.K. et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb, S. et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh, T.J. et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel, J.P. and Knierim, J.J. (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa, K. et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek, K. et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards, B.A. et al. (2014) Patterns across multiple memories are identified over time. Nat. Neurosci. 17, 981–986
141. Ketz, N. et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput. Biol. 9, e1003067
142. Kumaran, D. and Maguire, E.A. (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn. Sci. 13, 47–54
143. Moser, E.I. and Moser, M.B. (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri, R. and Fiete, I. (2016) Computational principles of memory. Nat. Neurosci. 19, 394–403
145. Lee, H. et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu, L. et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin, S.H. et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat. Neurosci. 18, 1562–1564
148. Poppenk, J. et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn. Sci. 17, 230–240
149. Strange, B.A. et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat. Rev. Neurosci. 15, 655–669
150. Ranganath, C. and Ritchey, M. (2012) Two cortical systems for memory-guided behaviour. Nat. Rev. Neurosci. 13, 713–726
151. Hasselmo, M.E. and Schnell, E. (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J. Neurosci. 14, 3898–3914
152. Vazdarjanova, A. and Guzowski, J.F. (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J. Neurosci. 24, 6489–6496
161. Grossberg, S. (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn. Sci. 11, 23–63
162. LaRocque, K.F. et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J. Neurosci. 33, 5466–5474
163. McClelland, J.L. and Rumelhart, D.E. (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol. Rev. 88, 375–407
165. Hintzman, D.L. (1986) ‘Schema abstraction’ in a multiple-trace memory model. Psychol. Rev. 93, 411–428
166. Suthana, N.A. et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc. Natl. Acad. Sci. U.S.A. 112, 10503–10508
167. Wood, E.R. et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu, J. and Shapiro, M.L. (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower, M.R. et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J. Neurosci. 25, 1313–1323
170. MacDonald, C.J. et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J. Neurosci. 33, 14607–14616
171. Markus, E.J. et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J. Neurosci. 15, 7079–7094
172. Skaggs, W.E. and McNaughton, B.L. (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J. Neurosci. 18, 8455–8466
173. Kriegeskorte, N. et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2, 4
174. Komorowski, R.W. et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J. Neurosci. 29, 9918–9929
175. Ellenbogen, J.M. et al. (2007) Human relational memory requires time and sleep. Proc. Natl. Acad. Sci. U.S.A. 104, 7723–7728
176. Dumay, N. and Gaskell, M.G. (2007) Sleep-associated changes in the mental representation of spoken words. Psychol. Sci. 18, 35–39
177. Coutanche, M.N. and Thompson-Schill, S.L. (2014) Fast mapping rapidly integrates information into existing memory networks. J. Exp. Psychol. Gen. 143, 2296–2303
178. Sharon, T. et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc. Natl. Acad. Sci. U.S.A. 108, 1146–1151
179. Merhav, M. et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith, C.N. et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc. Natl. Acad. Sci. U.S.A. 111, 475–479
181. Warren, D.E. and Duff, M.C. (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve, A. et al. (2014) No evidence that ‘fast-mapping’ benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul, T. et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel, C.R. (1990) The Organization of Learning, MIT Press
185. Hochreiter, S. and Schmidhuber, J. (1997) Long short-term memory. Neural Comput. 9, 1735–1780
186. Santoro, A. et al. (2016) Meta-learning with memory augmented neural networks. In International Conference on Machine Learning
187. Treves, A. and Rolls, E.T. (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
A central tenet of the theory is that the neocortex houses a structured knowledge representation, stored in the connections among the neurons in the neocortex. This tenet arose from the observation that multi-layered neural networks (Figure 2) gradually learn to extract structure when trained by adjusting connection weights to minimize error in the network outputs [10]. Early
Key Figure
Complementary Learning Systems (CLS) and their Interactions
Bidirectional connections (blue) link neocortical representations to the hippocampus/MTL for storage, retrieval, and replay.
Rapid learning in connections within hippocampus (red) supports initial learning of arbitrary new information.
Connections within and among neocortical areas (green) support gradual acquisition of structured knowledge through interleaved learning.
Figure 1. Lateral view of one hemisphere of the brain, where broken lines indicate regions deep inside the brain or on the medial surface. Primary sensory and motor cortices are shown in darker yellow. Medial temporal lobe (MTL) surrounded by broken lines, with hippocampus in dark grey and surrounding MTL cortices in light grey (size and location are approximate). Green arrows represent bidirectional connections within and between integrative neocortical association areas, and between these areas and modality-specific areas (the integrative areas and their connections are more dispersed than the figure suggests). Blue arrows denote bidirectional connections between neocortical areas and the MTL. Both blue and green connections are part of the structure-sensitive neocortical learning system in the CLS theory. Red arrows within the MTL denote connections within the hippocampus, and lighter-red arrows indicate connections between the hippocampus and surrounding MTL cortices: these connections exhibit rapid synaptic plasticity (red greater than light-red arrows) crucial for the rapid binding of the elements of an event into an integrated hippocampal representation. Systems-level consolidation involves hippocampal activity during replay spreading to neocortical association areas via pathways indicated with blue arrows, thereby supporting learning within intra-neocortical connections (green arrows). Systems-level consolidation is considered complete when memory retrieval – reactivation of the relevant set of neocortical representations – can occur without the hippocampus.
Glossary
Attractor network: networks with recurrent connectivity that have stable states which persist in the absence of external inputs and afford noise tolerance. Discrete point attractor networks can be used to store multiple memories as individual stable states. Continuous attractor networks have a continuous manifold of stable points which allow them to represent continuous variables (e.g., position in space).
Auto-associative storage: the storage within an attractor network of an input pattern constituting an experience, such that elements of the input pattern are linked together through plasticity within the recurrent connections of the network. The operation of recurrent connections supports functions such as pattern completion, whereby the entire input pattern (e.g., memory of a birthday party) can be retrieved from a partial cue (e.g., the face of a friend).
Exemplar models: exemplar models in cognitive science, related to instance-based models in machine learning, operate by computing the similarity of a new input pattern (i.e., presented as external sensory input) to stored experiences. This results in the output of the model, for example a predicted category label for the new input pattern, at which point the process terminates.
Non-parametric: we use this term to refer to algorithms where each experience or datapoint has its own set of coordinates, where capacity can be increased as required – and the number of parameters may grow with the amount of data. K-nearest neighbor constitutes one common example of such a non-parametric, instance-based method.
Parametric: we use this term to refer to algorithms that do not store each datapoint but instead directly learn a function that (for example) predicts the output value for a given input. The number of parameters is typically fixed.
Paired associative inference (PAI) task: a paradigm in which items are organized into (e.g., a hundred) sets of triplets (e.g., ABC) or larger sets (e.g., sextets, ABCDEF). Participants view item pairs (e.g., AB, BC) during the study phase and are tested on their ability to appreciate the indirect relationships between items that
Box 1. Empirical Evidence Supporting Core Principles of CLS Theory
The Role of the Hippocampus in Memory
Bilateral damage to the hippocampus profoundly affects memory for new information, leaving language, reading, general knowledge, and acquired cognitive skills intact [29,34], consistent with the idea that many types of new learning are initially hippocampus-dependent. Memory for recent pre-morbid information is profoundly affected by hippocampal damage, with older memories being less dependent on the hippocampus and therefore less sensitive to hippocampal lesions [1,34,51,128], supporting gradual integration of learned information into cortical knowledge structures. However, some evidence suggests that memory for specific details of an event can remain MTL-dependent [52,129] as long as the details are retained (e.g., [130]).
Hippocampus Supports Core Computations and Representations of a Fast-Learning Episodic Memory System
Episodic memory is widely accepted to depend on the hippocampus, mediated by a capacity to bind together (i.e., ‘auto-associate’) diverse inputs from different brain areas that represent the constituents of an event. Indeed, information about the spatial (e.g., place) and non-spatial (e.g., what happened) aspects of an event is thought to be processed primarily by parallel streams before converging in the hippocampus at the level of the DG/CA3 subregions [37]. Two complementary computations – pattern separation and pattern completion – are viewed to be central to the function of the hippocampus for storing details of specific experiences. Evidence suggests that the dentate gyrus (DG) subregion of the hippocampus performs pattern separation, orthogonalizing incoming inputs before auto-associative storage in the CA3 region [131–137]. Further, the CA3 subregion is crucial for pattern completion – allowing the output of an entire stored pattern (e.g., corresponding to an entire episodic memory) from a partial input, consistent with its function as an attractor network [138,139] (Boxes 2–4).
Hippocampal Replay
A wealth of evidence demonstrates that replay of recent experiences occurs during offline periods (e.g., during sleep, rest) [23]. Further, the hippocampus and neocortex interact during replay, as predicted by CLS theory [65], putatively to support interleaved learning. A causal role for replay in systems-level consolidation is supported by the finding that optogenetic blockage of CA3 output in transgenic mice after learning in a contextual fear paradigm specifically reduces sharp-wave ripple (SWR) complexes in CA1 and impairs consolidation [69].
The Hippocampus and Neocortex Support Qualitatively Different Forms of Representation
A recent experiment [140] found initial evidence in favor: the behavior of rats in the Morris water maze early on appeared to reflect individual episodic traces (i.e., an instance-based, non-parametric representation), but at a later time-point (28 days after learning) was consistent with the use of a parametric representation putatively housed in the neocortex.
were never presented together (e.g., A and C).
Paired associative recall task: a paradigm where item pairs are experienced during study (e.g., word pairs such as ‘dog–table’ in a human experiment, or flavor–location pairs in a rodent experiment) and at test the individual must recall the other item (e.g., specific location) from a cue (the specific flavor, e.g., banana).
Recurrent similarity computation: recurrent similarity computation allows the procedure performed by exemplar models to iterate, that is, the retrieved products from the first step of similarity computation are combined with the external sensory input and a subsequent round of similarity computation is performed. This process continues until a stable state (i.e., basin of attraction in a neural network) is reached. This allows the model to capture higher-order similarities present in a set of related experiences, where pairwise similarities alone are not informative.
Sharp-wave ripple (SWR): spontaneous neural activity occurring within the hippocampus during periods of rest and slow wave sleep, evident as negative potentials (i.e., sharp waves). Transient high-frequency (150 Hz) oscillations (i.e., ripples) occur within these sharp waves, which can reflect the replay (i.e., reactivation) of activity patterns that occurred during actual experience, sped up by an order of magnitude.
Sparsity: the proportion of neurons in a given brain region that are active in response to a given stimulus (‘population sparseness’). Sparse coding, where a small (e.g., 1%) proportion of neurons is active, is contrasted with densely distributed coding, where a relatively large proportion of neurons are active (e.g., 20%).
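The glossary's description of exemplar models and non-parametric methods can be made concrete with a small sketch. It assumes a k-nearest-neighbor rule with negative squared Euclidean distance as the similarity measure; the stored experiences, their two-dimensional coordinates, and the labels "safe" and "danger" are hypothetical and chosen only for illustration. Every stored experience keeps its own coordinates, so capacity grows with the data rather than being fixed in advance.

```python
from collections import Counter

def similarity(a, b):
    """Negative squared Euclidean distance: larger means more similar."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))

def knn_label(stored, query, k=3):
    """Predict a category label from the k stored experiences most similar to the query."""
    ranked = sorted(stored, key=lambda item: similarity(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical stored experiences: (coordinates, category label).
stored = [
    ((0.0, 0.0), "safe"), ((0.1, 0.2), "safe"), ((0.2, 0.1), "safe"),
    ((1.0, 1.0), "danger"), ((0.9, 1.1), "danger"),
]

label_near_danger = knn_label(stored, (0.95, 1.0))
label_near_safe = knn_label(stored, (0.05, 0.05))
```

In this sketch the process terminates after a single round of similarity computation, as in the exemplar-model entry above; it does not implement the recurrent variant.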
modeling the neural computations supporting visual processing of objects in primates [17,18]. The considerable advantages of depth in allowing the learning of increasingly complex and abstract mappings [16] are balanced here by the strong interdependencies among connection weights in deep networks [19,20], such that the weights are learned gradually, through extensive, repeated, and interleaved exposure to an ensemble of training examples that embody the domain statistics.
Although there are real advantages of a system using structured parametric representations, on its own such a system would suffer from two drastic limitations [1]. First, it is important to be able to base behavior on the content of an individual experience. For example, after experiencing a life-threatening situation – for example, an encounter with a lion at a watering-hole – it would clearly be beneficial to learn to avoid that particular location without the need for further encounters with the lion. The second problem is that the rapid adjustment of connection weights in a multilayer network to accommodate new information can severely disrupt the representation of existing knowledge in it – a phenomenon termed catastrophic interference [1,21–23] that is related to the stability–plasticity dilemma [24]. If the new information about the dangerous lion is forced into a multi-layer network by making large connection weight adjustments just to accommodate this item, this can interfere with knowledge of other, less-threatening animals one may already be familiar with.
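The contrast between focused and interleaved learning can be illustrated with a deliberately minimal sketch. It assumes a single linear unit trained with the delta rule rather than a multilayer network, two hypothetical items whose input patterns share one active unit, and a learning rate of 0.5 chosen so that each update fits the presented item exactly. It shows interference in its mildest form and is not a simulation from the CLS literature.

```python
def predict(w, x):
    """Output of a single linear unit."""
    return sum(wi * xi for wi, xi in zip(w, x))

def delta_step(w, x, target, lr=0.5):
    """One delta-rule update toward the target for input x."""
    err = target - predict(w, x)
    return [wi + lr * err * xi for wi, xi in zip(w, x)]

# Hypothetical items: (input pattern, target). They overlap on the middle unit.
old_item = ([1.0, 1.0, 0.0], 1.0)   # existing knowledge
new_item = ([0.0, 1.0, 1.0], 0.0)   # new information

# Existing knowledge: the old item is learned first.
w_base = delta_step([0.0, 0.0, 0.0], *old_item)

# Focused learning: a large adjustment made just to accommodate the new item.
w_focused = delta_step(w_base, *new_item)

# Interleaved learning: the new item alternates with the old one.
w_inter = w_base
for _ in range(10):
    w_inter = delta_step(w_inter, *new_item)
    w_inter = delta_step(w_inter, *old_item)

focused_old_error = abs(old_item[1] - predict(w_focused, old_item[0]))
interleaved_old_error = abs(old_item[1] - predict(w_inter, old_item[0]))
interleaved_new_error = abs(new_item[1] - predict(w_inter, new_item[0]))
```

Under these assumptions, focused learning fits the new item at once but shifts the output for the old item from 1.0 to 0.75, whereas interleaving converges on weights that satisfy both items.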
[Figure 2 schematic: a four-layer network running from Layer 1 (Input) through Layer 2 and Layer 3 to Layer 4 (Output), with a Target pattern. Inset: $n_i^l = \sum_j w_{ij}^{l,l-1} a_j^{l-1}$, with the activation $a_i^l$ plotted as a non-linear function of $n_i^l$.]
Figure 2. A Neocortex-Like Artificial Neural Network. In the complementary learning systems (CLS) theory, neocortical processing is seen as occurring through the propagation of activation among neurons via weighted connections, as simulated using artificial networks of neuron-like units (small circles). Each unit has an input line and an output line (with arrowhead). There is a separate real-valued weight where each output line crosses an input line. The weights are the knowledge that governs processing in the network. During processing (inset), each unit computes a net input (n) from the activations of its inputs and the weights (plus a bias term, omitted here), producing an activation (a) that is a non-linear function of n (one such function shown). The units in a layer may project back onto their own inputs (illustrated for layer 3), simulating recurrent intra-cortical computations, and higher layers may project back to lower layers (Figure 1). In the situation shown, the input (lower left) is a pattern in which units are either active (a = 1, black) or inactive (a = 0, white), and examples of possible activations produced in units of other layers are shown (darker for greater activation). Learning occurs through adjusting the weights to reduce the difference between the output of the network and a target output (upper right) [10,16]. In the case shown, the output activations are similar to the target, but there is some error to drive learning. There are no targets for internal or hidden layers (i.e., layers 2 and 3). These patterns depend on the connection weights, which in turn are shaped by the error-driven learning process.
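The processing and learning steps described in the caption can be sketched in a few lines. The sketch assumes a logistic function as the non-linearity and squared error as the measure of mismatch with the target; neither is specified in the caption. Only the update of the weights into the output layer is shown, so the propagation of error to hidden layers is omitted, and the function names are hypothetical.

```python
import math

def logistic(n):
    """One possible non-linear activation function of the net input n."""
    return 1.0 / (1.0 + math.exp(-n))

def layer_forward(weights, inputs):
    """weights[i][j] connects input unit j to unit i; the bias term is omitted."""
    net = [sum(w * a for w, a in zip(row, inputs)) for row in weights]
    return [logistic(n) for n in net]

def output_delta_step(weights, inputs, target, lr=1.0):
    """One gradient-descent step on squared error for a logistic output layer."""
    outputs = layer_forward(weights, inputs)
    new_weights = []
    for row, a_i, t_i in zip(weights, outputs, target):
        delta = (t_i - a_i) * a_i * (1.0 - a_i)   # error times slope of the logistic
        new_weights.append([w + lr * delta * a_j for w, a_j in zip(row, inputs)])
    return new_weights

weights = [[0.0, 0.0]]      # one output unit, two input units
inputs = [1.0, 0.0]         # first input active, second inactive
target = [1.0]

before = layer_forward(weights, inputs)[0]
weights = output_delta_step(weights, inputs, target)
after = layer_forward(weights, inputs)[0]
```

With these values the output starts at 0.5 and moves toward the target after one step. Only the weight from the active input changes, because the update is proportional to the activation of the sending unit.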
Box 2. Functional Roles of Subregions of the Medial Temporal Lobes
Work within the CLS framework [27,116,141] relies on the anatomical and physiological properties of MTL subregions and the computational insights of others [9,25,26] to characterize the computations performed within these structures.
Entorhinal Cortex (ERC) Input to the Hippocampal System
During an experience, inputs from neocortex produce a pattern of activation in the ERC that may be thought of as a compressed description of the patterns in the contributing cortical areas (Figure I; illustrative active neurons in the ERC are shown in blue). ERC neurons give rise to projections to three subregions of the hippocampus proper: the dentate gyrus (DG), CA1, and CA3 [28,84].
Pattern selection and pattern separation: novel ERC patterns are thought to activate a small set of previously uncommitted DG neurons (shown in red – these neurons may be relatively young neurons created by neurogenesis). These neurons in turn select a random subset of neurons in CA3 via large ‘detonator synapses’ (shown as red dots on the projection from DG to CA3) to serve as the representation of the memory in CA3, ensuring that the new CA3 pattern is as distinct as possible from the CA3 patterns for other memories, including those for experiences similar to the new experience (Boxes 3 and 4).
Pattern completion: recurrent connections from the active CA3 neurons onto other active CA3 neurons are strengthened during the experience, such that if a subset of the same neurons later becomes active, the rest of the pattern will be reactivated. Direct connections from ERC to CA3 are also strengthened, allowing the ERC input to directly activate the pattern in CA3 during retrieval without requiring DG involvement (Box 3).
Pattern reinstatement in ERC and neocortex [116,141]: the connections from ERC to CA1 and back are thought to change relatively slowly, to allow stable correspondence between patterns in CA1 and ERC. Strengthening of connections from the active CA3 neurons to the active CA1 neurons during memory encoding allows this CA1 pattern to be reactivated when the corresponding CA3 pattern is reactivated; the stable connections from CA1 to ERC then allow the appropriate pattern there to be reactivated, and stable connections between ERC and neocortical areas propagate the reactivated ERC pattern to the neocortex. Importantly, the bidirectional projections between CA1 and ERC and between ERC and neocortex support the formation and decoding of invertible CA1 representations of ERC and neocortical patterns, and allow recurrent computations. These connections should not change rapidly, given the extended role of the hippocampus in memory – otherwise reinstatement in the neocortex of memories stored in the hippocampus would be difficult [61].
[Figure I schematic: labeled regions CA3, CA1, DG, ERC, and neocortex.]
Figure I. Hippocampal Subregions, Connectivity, and Representation. Schematic depictions of neurons (with circular or triangular cell bodies) are shown, along with schematic depictions of projections from neurons in an area to neurons in the same or other areas (grey or colored lines – red coloring indicates projections with highly-plastic synapses, while grey coloring illustrates relatively less-plastic or stable projections). CA1 output to ERC then propagates out to neocortex. ERC and even resulting neocortical activity can be fed back into the hippocampus (broken line), as proposed in the REMERGE model (see below).
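Pattern completion through strengthened recurrent connections, as described in this box for CA3, can be sketched with a small auto-associative network. The sketch assumes a Hopfield-style network with units taking values +1 or -1, a Hebbian outer-product storage rule, and synchronous updates; these are standard simplifications and are not taken from the CLS simulations. A value of 0 in the cue marks a unit whose state is unknown, and the two stored patterns are hypothetical.

```python
def train_hebbian(patterns):
    """Strengthen recurrent weights between units that are co-active in a stored pattern."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                      # no self-connections
                    w[i][j] += p[i] * p[j]
    return w

def complete(w, cue, max_steps=10):
    """Update all units synchronously until the state stops changing."""
    state = list(cue)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(w):
            h = sum(wij * s for wij, s in zip(row, state))
            # A unit with zero net input keeps its current value.
            new_state.append(1 if h > 0 else -1 if h < 0 else state[i])
        if new_state == state:
            break
        state = new_state
    return state

# Two hypothetical stored memories (orthogonal to each other).
memory_a = [1, 1, 1, 1, -1, -1, -1, -1]
memory_b = [1, -1, 1, -1, 1, -1, 1, -1]
w = train_hebbian([memory_a, memory_b])

# Partial cue: the first half of memory_a is given, the rest is unknown.
partial_cue = [1, 1, 1, 1, 0, 0, 0, 0]
recalled = complete(w, partial_cue)
```

Here the partial cue is enough for the recurrent weights to fill in the missing half of the first stored pattern, which then remains stable under further updates.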
hippocampal representation formed in learning an event affords a way of allowing gradual integration of knowledge of the event into neocortical knowledge structures. This can occur if the hippocampal representation can reactivate or replay the contents of the new experience back to the neocortex, interleaved with replay and/or ongoing exposure to other experiences [1]. In this way, the new experience becomes part of the database of experiences that govern the values of the connections in the neocortical learning system [51–53]. Which other memories are selected for interleaving with the new experience remains an open question. Most simply, the hippocampus might replay recent novel experiences interleaved with all other recent experiences still stored in
Box 3 Pattern Separation and Completion in Different Subregions of the Hippocampus
Pattern separation and completion [25–27] are defined in terms of transformations that affect the overlap or similarity among patterns of neural activity [28,142]. Pattern separation makes similar patterns more distinct through conjunctive coding [9,25], in which each output neuron responds only to a specific combination of active input neurons. Figures IA and IB illustrate how this can occur. Pattern separation is thought to be implemented in DG (see Box 4) using higher-order conjunctions that reduce overlap even more than illustrated in the figure.
Pattern completion is a process that takes a fragment of a pattern and fills in the remaining features (as in recalling a lion upon seeing the scene where the lion previously appeared), or that takes a pattern similar to a familiar pattern and makes it even more similar to it. Computational simulations [27] have shown how the CA3 region might combine features of pattern separation and completion, such that moderate and high overlap results in pattern completion toward the stored memory, but less overlap results in the creation of a new memory [37,133,143] (Figure IC). In this account, when environmental input produces a pattern in ERC similar to a previous pattern, the CA3 outputs a pattern closer to the one it previously used for this ERC pattern [124,144]. However, when the environment produces an input on the ERC that has low overlap with patterns stored previously, the DG recruits a new, statistically independent cell population in CA3 (i.e., pattern separation [27]). Emerging evidence suggests that the amount of overlap required for pattern completion (as well as other characteristics of hippocampal processing) may differ across the proximal–distal [145,146] and dorso–ventral axes [98,147–150] of the hippocampus, and may be shaped by neuromodulatory factors (e.g., acetylcholine) [85,151]. Also, incomplete patterns require less overlap with a stored pattern than distorted ones for completion to occur, so that partial cues will tend to produce completion, as when one sees the watering hole and remembers seeing a lion there previously [27].
Several studies point to differences between the CA3 and CA1 regions in how their neural activity patterns respond to changes to the environment [37]: broadly, the CA1 region tends to mirror the degree of overlap in the inputs from the ERC, while CA3 shows more discontinuous responses, reflecting either pattern separation or completion [134,152].
[Figure I panels: (A, B) Pattern separation in DG; (C) Separation and completion in CA3. Axes in (B) and (C): input overlap (horizontal, 0 to 1) against output overlap (vertical, 0 to 1).]
Figure I. Conjunctive Coding, Pattern Separation, and Pattern Completion. (A) A set of 10 conjunctive units with connections from a layer of 5 input units is shown twice with different input patterns. Here each conjunctive unit detects activity in a distinct pair of input units (arrows). The output for each pattern is sparser than the input (i.e., 30% vs 60%, respectively), and the two outputs overlap less than the two corresponding inputs (i.e., 33% vs 67%, respectively; overlap is the number of active units shared by two patterns divided by the number of units active in each). DG may use higher-order conjunctions, magnifying these effects. (B) An illustration of the general form of a pattern separation function, showing the relationship between input and output overlap. Arrows indicate the overlap of the inputs and outputs shown in the left panel. (C) The separation-and-completion profile associated with CA3, where low levels of input overlap are reduced further while higher levels are increased [27,37].
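The arithmetic in panel (A) can be checked with a short sketch. The two input patterns below (`input_a`, `input_b`) are assumed for illustration, since the figure itself is not reproduced here; they were chosen so that three of five inputs are active in each and two are shared. Each conjunctive unit stands for one pair of input units and is active only when both members of its pair are active.

```python
from itertools import combinations

N_INPUTS = 5
# One conjunctive unit per distinct pair of input units: C(5, 2) = 10 units.
PAIRS = list(combinations(range(N_INPUTS), 2))


def conjunctive_code(active_inputs):
    """Return the pair-detecting units that fire for a set of active inputs."""
    return {pair for pair in PAIRS if set(pair) <= active_inputs}


def overlap(a, b):
    """Shared active units divided by the number active in each pattern.

    Assumes both patterns have the same number of active units.
    """
    return len(a & b) / len(a)


# Hypothetical input patterns: 3 of 5 units active, 2 units shared.
input_a = {0, 1, 2}
input_b = {1, 2, 3}
output_a = conjunctive_code(input_a)
output_b = conjunctive_code(input_b)

input_sparsity = len(input_a) / N_INPUTS       # 3 of 5 input units
output_sparsity = len(output_a) / len(PAIRS)   # 3 of 10 conjunctive units
input_overlap = overlap(input_a, input_b)      # 2 of 3 shared
output_overlap = overlap(output_a, output_b)   # 1 of 3 shared
```

Under these assumptions the output code is both sparser than the input and less overlapping, which is the pattern-separation effect the caption describes. Detecting triples rather than pairs would reduce the overlap further.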
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
complementary properties of each of the two component systems, allowing new information to be rapidly stored in the hippocampus and then slowly integrated into neocortical representations. This process, sometimes labeled ‘systems-level consolidation’ [51], arises within the theory from gradual cortical learning driven by replay of the new information, interleaved with other activity to minimize disruption of existing knowledge during the integration of the new information.
Empirical Evidence of Replay
Because of its centrality in the theory, we highlight key empirical evidence that replay events really do occur. The data come primarily from rodents recorded during periods of inactivity (including sleep), in which hippocampal neurons exhibit large irregular activity (LIA) patterns that are distinct from the activity patterns observed during active states [2,3]. During LIA states, synchronous discharges, thought to be initiated in hippocampal area CA3, produce sharp-wave ripples (SWRs), which are propagated to neocortex. SWRs reflect the reactivation of recent experiences, expressed as the sequential firing of so-called place cells, cells that fire when the animal is at a specific location [2,3,57–59]. These replay events appear to be time-compressed by a factor of about 20, bringing neuronal spikes that were well-separated in time during an actual experience into a time-window that enhances synaptic plasticity both
Box 4 Sparse Conjunctive Coding and Pattern Separation in the Dentate Gyrus
Neuronal codes range from the extreme of localist codes, where neurons respond highly selectively to single entities (‘grandmother cells’), to dense distributed codes, where items are coded through the activity of many (e.g., 50%) neurons in an area [153,154]. While localist codes minimize interference and are easily decodable, they are inefficient in terms of representational capacity. By contrast, dense distributed codes are capacity-efficient; however, they are metabolically costly and relatively difficult to decode. These are endpoints on a continuum quantified by a measure called sparsity, where ‘population’ sparsity indexes the proportion of neurons that fire in response to a given stimulus/location, and ‘lifetime’ sparsity indexes the proportion of stimuli to which a single neuron responds [26,153,155]. For example, a population sparsity of 1% means that only 1% of the neurons in a population are active in representing a given input. Two randomly selected sparse patterns tend to have low overlap (for two randomly selected patterns of equal sparsity over the same set of neurons, the average proportion of neurons in either pattern that is active in the other is equal to the sparsity), but neurons still participate in several different memories, making them more efficient than localist codes. Despite variability in estimates of the sparsity of a given brain region [27,153,156,157], the DG is widely believed to sustain one of the sparsest neural codes in the brain (0.5–1% population sparseness) [25–27]. The CA3 region, to which the DG projects, is thought to be less sparse (2.5% [47]). Many studies find less-sparse patterns in CA1 than CA3 [134,152].
The unique functional and anatomical properties of the DG suggest the origins of its sparse, pattern-separated code. The perforant path from the ERC (containing 200,000 neurons in the rodent) projects to a layer of 1 million DG granule cells. Combined with the high levels of inhibition in the DG, this supports the formation of highly sparse conjunctive representations, such that each neuron in DG responds only when several input neurons are simultaneously active, reducing overlap between similar input patterns [25–27,136]. Evidence also suggests that new DG neurons arise from stem cells throughout adult life; these new neurons may be preferentially recruited in the formation of memories [136], further reducing overlap with previously stored memories. The CA3 pattern for a memory is then selected by the active DG neurons, each of which has a ‘detonator’ synapse to 15 randomly selected CA3 neurons. This process helps minimize the overlap of CA3 patterns for different memories, increasing storage capacity and minimizing interference between them, even if the two memories represent similar events that have highly overlapping patterns in neocortex and ERC. Empirical evidence provides support for this, with one study [137] showing that the representation supported by DG was highly sensitive to small changes in the environment despite evidence that incoming inputs from the ERC were little affected (also see [133,145]). Furthermore, DG lesions impair an animal’s ability to learn to respond differently in two very similar environments, while leaving intact the ability to learn to respond differently in two environments that are not similar [136].
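The statement in this box that two random patterns of equal sparsity overlap, on average, by a proportion equal to the sparsity can be illustrated numerically. The following is a minimal simulation sketch; the population size, the number of sampled pairs, and the function names are assumptions made for illustration and are not taken from the cited studies.

```python
import random


def random_pattern(n_units, n_active, rng):
    """Pick n_active distinct units at random to form a sparse pattern."""
    return set(rng.sample(range(n_units), n_active))


def mean_overlap(n_units, sparsity, n_pairs=2000, seed=0):
    """Average proportion of one pattern's active units also active in another."""
    rng = random.Random(seed)
    n_active = round(n_units * sparsity)
    total = 0.0
    for _ in range(n_pairs):
        a = random_pattern(n_units, n_active, rng)
        b = random_pattern(n_units, n_active, rng)
        total += len(a & b) / n_active
    return total / n_pairs


# A sparse code (5% active) and a dense code (50% active) over 1000 units.
sparse_overlap = mean_overlap(1000, 0.05)
dense_overlap = mean_overlap(1000, 0.50)
```

With these settings the estimated overlap should land close to 0.05 for the sparse code and close to 0.5 for the dense code, so lowering sparsity directly lowers the expected interference between unrelated memories.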
Box 5 Similarity-Based Coding in High-Level Visual Cortex
High-level visual regions of the neocortex are thought to support distributed representations that are inferred to be less sparse than those of the DG and the CA3/CA1 regions of the hippocampus (Box 4). Population sparseness in the ERC is estimated at 7–10% [158], with high-level sensory cortices exhibiting similar or higher levels of sparseness (e.g., variable estimates [44–46]). Although lifetime sparseness does not directly translate to population sparseness, recent evidence suggests that V4 and inferotemporal cortex (ITc) have a sparseness of 10% on this measure [159]. It is worth noting that learning rates may vary according to neuronal selectivity and lifetime sparseness, resulting in differences in learning rates across neocortical areas and hippocampal subregions. Neurons in early visual regions that encode frequently occurring features (i.e., edges) may have a relatively slow learning rate, while neurons in higher visual regions and beyond (e.g., ITc and perirhinal cortex) may have a higher learning rate to support the encoding of less frequently occurring, more conjunctive features (e.g., individual objects) [12,160,161].
Evidence from electrophysiological recording studies in high-level visual cortical regions, such as the ITc in primates, provides support for the operation of a similarity-based coding scheme, whereby related categories (e.g., dogs and cats) are represented by overlapping neuronal codes [17,40–43] (Figure I). Representational similarity analysis (RSA) of the ITc population response during passive viewing of pictures reveals coding of fine-grained categorical structure (e.g., of a set of animate and inanimate objects) that is well fit by deep convolutional neural networks, which have algorithmic parallels with feedforward processing in the ventral visual stream [17,40]. While analogous similarity-based coding was observed using fMRI in the human homolog of ITc [41], there was no evidence for greater within-category (cf. between-category) representational similarity in any subregion of the hippocampus in a recent fMRI study [162], which found evidence consistent with the importance of pattern separation in episodic memory. Instead, similarity-based coding in this study was observed in the perirhinal and parahippocampal cortex, MTL regions that project to the ERC and that are typically considered to be intermediate zones (i.e., between the hippocampal and neocortical systems) in CLS theory.
[Figure I panels: representational dissimilarity matrices for monkey ITc (left) and human ITc (right). Rows and columns are grouped into animate (human and not-human bodies and faces) and inanimate (natural and artificial) objects. Color scale: dissimilarity, expressed as percentile of 1 − r, from 0 to 100.]
Figure I. Similarity-Based Coding in High-Level Visual Cortex. Representational dissimilarity matrices (RDMs) reflect the correlation (i.e., 1 − r, where r is the Pearson correlation coefficient) between the responses of voxel patterns (fMRI in humans [41], right panel) or neuronal populations (electrophysiological recording in monkey [43], left panel) to a set of 92 object images. RDMs are analogous in monkey and human ITc. The RDMs show that the representations of animate objects are similar, as are those of inanimate objects. In addition to this clear animate–inanimate distinction, object coding in ITc exhibits finer categorical structure (e.g., for faces, body parts) visible in these RDMs (also see [41]). Reproduced, with permission, from [41].
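To make the 1 − r measure concrete, the sketch below builds a small RDM from made-up response patterns. The four stimuli and their response values are invented for illustration only and do not come from the recordings in [41] or [43]; each list stands for the response of a handful of units (or voxels) to one image.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rdm(patterns):
    """Dissimilarity (1 - r) between every pair of response patterns."""
    return [[1.0 - pearson(p, q) for q in patterns] for p in patterns]


# Hypothetical responses of four units to four images.
labels = ["dog", "cat", "chair", "table"]
responses = [
    [5, 4, 1, 0],  # dog
    [4, 5, 0, 1],  # cat
    [0, 1, 5, 4],  # chair
    [1, 0, 4, 5],  # table
]
matrix = rdm(responses)
dog_cat = matrix[0][1]      # within-category dissimilarity
dog_chair = matrix[0][2]    # between-category dissimilarity
```

In this toy case the within-category entry is much smaller than the between-category entry, which is the block structure (animate versus inanimate) that the caption describes for ITc.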
rodents [72,73]. This generalized replay, the simultaneous reactivation of multiple related traces during testing or offline periods, may facilitate the creation of new representations from the recombination of multiple related episodes (‘stored generalizations’) [5] and the discovery of novel relationships (e.g., shortcuts) [72,73]. Empirical evidence also supports a role for the hippocampus in category- and so-called ‘statistical’ learning [105–107]: the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6 Generalization Through Recurrence in the Hippocampal System
The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers: feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed, pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).
An essential principle of the model, mediated by the bidirectional excitatory connections between feature and conjunctive layers, is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed ‘big-loop’ recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time, thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism ([5] for details). Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I schematic: a conjunctive layer (units AB, BC, CD, DE, EF) above a feature layer (units A, B, C, D, E, F).]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE, showing its two-layer architecture with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
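The recirculation described in this box can be illustrated with a deliberately simplified sketch. It is not the published REMERGE model: the softmax competition, the temperature of 0.5, the clipping of feature activity at 1, and the function name `settle` are all assumptions chosen to keep the example short. What it does share with the description above is the loop in which feature activity drives conjunctive units, whose activity is then fed back to the feature layer together with the external input.

```python
import math

FEATURES = "ABCDEF"
CONJUNCTS = ["AB", "BC", "CD", "DE", "EF"]  # stored pairs, as in Figure I


def softmax(values, temperature):
    """Competitive normalization, standing in for recurrent inhibition."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def settle(probe, steps=10, temperature=0.5):
    """Recirculate activity between feature and conjunctive layers."""
    external = {f: (1.0 if f in probe else 0.0) for f in FEATURES}
    feature = dict(external)
    conj = {c: 0.0 for c in CONJUNCTS}
    for _ in range(steps):
        # Feature layer drives each conjunct through its two constituents.
        net = [sum(feature[f] for f in c) for c in CONJUNCTS]
        conj = dict(zip(CONJUNCTS, softmax(net, temperature)))
        # Conjuncts feed back to their constituents; activity saturates at 1.
        feature = {
            f: min(1.0, external[f] + sum(a for c, a in conj.items() if f in c))
            for f in FEATURES
        }
    return feature, conj


before, _ = settle("AC", steps=0)      # external probe only, no recurrence
after, conjuncts = settle("AC", steps=10)
```

Probing with A and C alone leaves B silent before any recirculation. After a few passes around the loop, AB and BC support each other through B and come to dominate CD, so B ends up more active than D even though neither was presented.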
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of ‘concept’ cells: hippocampal neurons which come to respond to common features across many events, for example cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [34,51–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7 Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG, and also in the subiculum, tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant ‘concept cell like’ responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).
It is also interesting to consider the finding of ‘splitter’ cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here, some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent ‘charts’ [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.
In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions (reward value and item) were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent, but not schema-inconsistent, information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8 Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore, under the CLS framework, we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the ‘fast mapping’ paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects, one familiar and one unfamiliar, and are asked a question such as ‘is the numbat’s tail pointing up?’, inferring that the unfamiliar name ‘numbat’ must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I panels: (A) Original paired associates; (B) Introduction of new paired associates. Numbers in the panels mark arena locations.]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of the 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid, schema-dependent, one-shot learning of these new paired associates (PAs) is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function, and the ubiquity of continual learning, is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
Box 9 Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer, similar to the hippocampus, is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance to, at best, 30% of the level achieved when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events, specifically experiences that are associated with high reward prediction errors, yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
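The two sampling schemes contrasted in this box can be sketched as follows. This is an illustrative toy rather than the DQN implementation in [119] or the prioritized scheme in [183]: the class name `ReplayBuffer`, its methods, and the use of a single scalar `priority` per transition (standing in for the magnitude of a reward prediction error) are assumptions, and the prioritized draw here simply samples with replacement in proportion to priority.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity, seed=0):
        # deque(maxlen=...) silently discards the oldest entry when full.
        self.transitions = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, priority=1.0):
        self.transitions.append((state, action, reward, next_state))
        self.priorities.append(priority)

    def sample_uniform(self, batch_size):
        """Every stored transition is equally likely (no repeats in a batch)."""
        return self.rng.sample(list(self.transitions), batch_size)

    def sample_prioritized(self, batch_size):
        """Transitions are drawn in proportion to their priority."""
        return self.rng.choices(
            list(self.transitions), weights=list(self.priorities), k=batch_size
        )


# Five consecutive transitions pushed into a buffer that holds three.
buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.add(state=t, action=0, reward=0.0, next_state=t + 1)
uniform_batch = buffer.sample_uniform(2)

# A rare, surprising transition given all of the sampling weight.
surprising = ReplayBuffer(capacity=3)
surprising.add(0, 0, 0.0, 1, priority=0.0)
surprising.add(1, 0, 0.0, 2, priority=0.0)
surprising.add(2, 1, 10.0, 3, priority=5.0)
prioritized_batch = surprising.sample_prioritized(4)
```

Drawing minibatches from the buffer rather than from the most recent step is what breaks the correlation between consecutive samples; weighting the draw by priority is one simple way to let rare but informative experiences be replayed more often.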
Box 10 Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network, but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
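A content-based read of the kind just described can be sketched in a few lines. The choice of cosine similarity, the softmax over slots, and the `sharpness` parameter are assumptions made for this illustration (the text above specifies only that attention is graded by similarity to the key), and the memory contents are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def content_read(memory, key, sharpness=10.0):
    """Attend to memory slots by similarity to the key; return a blended read."""
    scores = [sharpness * cosine(slot, key) for slot in memory]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(memory[0])
    read = [sum(w * slot[i] for w, slot in zip(weights, memory)) for i in range(width)]
    return weights, read


# Three stored slots and a noisy key that most resembles the first one.
memory = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
weights, read_vector = content_read(memory, key=[0.9, 0.1, 0.0])
```

Because the attention weights vary smoothly with the key, the read is differentiable with respect to the controller's output, which is what allows gradient-based training. The read also behaves like pattern completion: a partial or noisy key retrieves something close to the stored slot it most resembles.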
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
[Figure I schematic: input (Xt) and output (Yt) connect to a controller, which accesses an external memory through write heads and read heads.]
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk?’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
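The iterative retrieval in the milk example can be mimicked with a toy sketch. It is not the memory network of [127]: learned dense feature vectors are replaced here by plain word overlap, the stop-word list and the extra distractor sentence are invented, and no response module is included, so the sketch returns only the chain of supporting sentences.

```python
STOPWORDS = {"the", "to", "is", "a"}


def features(text):
    """Crude stand-in for a learned feature vector: the set of content words."""
    words = text.lower().replace("?", "").split()
    return {w for w in words if w not in STOPWORDS}


def most_similar(query_features, story, candidates):
    """Index of the candidate sentence sharing the most words with the query."""
    return max(candidates, key=lambda i: len(query_features & features(story[i])))


def retrieve_chain(story, query, hops=2):
    """Retrieve a sentence, fold it into the query, then search earlier sentences."""
    query_features = features(query)
    candidates = range(len(story))
    chain = []
    for _ in range(hops):
        if not candidates:
            break
        best = most_similar(query_features, story, candidates)
        chain.append(story[best])
        # Recirculate: the retrieved sentence becomes part of the next query.
        query_features = query_features | features(story[best])
        candidates = range(best)  # only sentences earlier in the story
    return chain


story = [
    "Mary went to the garden",
    "Joe traveled to the office",
    "Joe left the milk",
]
chain = retrieve_chain(story, "Where is the milk?")
```

The first hop finds the sentence that mentions the milk; only after that sentence has been folded into the query does the earlier sentence about Joe become the best match, which is the same output-as-new-input step that the text compares to REMERGE.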
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O'Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the 'index maintenance' problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc L B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O'Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O'Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O'Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–214
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66. Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67. Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104. Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) 'Simple memory: a theory of archicortex'. Philos Trans R Soc B Biol Sci 370, 20140383
105. Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106. Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109. Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110. Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111. Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113. McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116. McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118. Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119. Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121. Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122. Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123. Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124. Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129. Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165. Hintzman DL (1986) 'Schema abstraction' in a multiple-trace memory model. Psychol Rev 93, 411–428
166. Suthana NA et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167. Wood ER et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower MR et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170. MacDonald CJ et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171. Markus EJ et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172. Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173. Kriegeskorte N et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174. Komorowski RW et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175. Ellenbogen JM et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176. Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177. Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178. Sharon T et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179. Merhav M et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith CN et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181. Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve A et al. (2014) No evidence that 'fast-mapping' benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul T et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel CR (1990) The Organization of Learning, MIT Press
185. Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186. Santoro A et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
Glossary
Attractor network: networks with recurrent connectivity that have stable states which persist in the absence of external inputs and afford noise tolerance. Discrete point attractor networks can be used to store multiple memories as individual stable states. Continuous attractor networks have a continuous manifold of stable points, which allow them to represent continuous variables (e.g., position in space).
Auto-associative storage: the storage within an attractor network of an input pattern constituting an experience, such that elements of the input pattern are linked together through plasticity within the recurrent connections of the network. The operation of recurrent connections supports functions such as pattern completion, whereby the entire input pattern (e.g., memory of a birthday party) can be retrieved from a partial cue (e.g., the face of a friend).
Exemplar models: exemplar models in cognitive science, related to instance-based models in machine learning, operate by computing the similarity of a new input pattern (i.e., presented as external sensory input) to stored experiences. This results in the output of the model, for example a predicted category label for the new input pattern, at which point the process terminates.
Non-parametric: we use this term to refer to algorithms where each experience or datapoint has its own set of coordinates, where capacity can be increased as required, and the number of parameters may grow with the amount of data. K-nearest neighbor constitutes one common example of such a non-parametric, instance-based method.
Parametric: we use this term to refer to algorithms that do not store each datapoint but instead directly learn a function that (for example) predicts the output value for a given input. The number of parameters is typically fixed.
Paired associative inference (PAI) task: a paradigm in which items are organized into (e.g., a hundred) sets of triplets (e.g., ABC) or larger sets (e.g., sextets ABCDEF). Participants view item pairs (e.g., AB, BC) during the study phase and are tested on their ability to appreciate the indirect relationships between items that
Box 1. Empirical Evidence Supporting Core Principles of CLS Theory

The Role of the Hippocampus in Memory
Bilateral damage to the hippocampus profoundly affects memory for new information, leaving language, reading, general knowledge, and acquired cognitive skills intact [29,34], consistent with the idea that many types of new learning are initially hippocampus-dependent. Memory for recent pre-morbid information is profoundly affected by hippocampal damage, with older memories being less dependent on the hippocampus and therefore less sensitive to hippocampal lesions [1,34,51,128], supporting gradual integration of learned information into cortical knowledge structures. However, some evidence suggests that memory for specific details of an event can remain MTL-dependent [52,129], as long as the details are retained (e.g., [130]).

Hippocampus Supports Core Computations and Representations of a Fast-Learning Episodic Memory System
Episodic memory is widely accepted to depend on the hippocampus, mediated by a capacity to bind together (i.e., 'auto-associate') diverse inputs from different brain areas that represent the constituents of an event. Indeed, information about the spatial (e.g., place) and non-spatial (e.g., what happened) aspects of an event are thought to be processed primarily by parallel streams before converging in the hippocampus at the level of the DG/CA3 subregions [37]. Two complementary computations, pattern separation and pattern completion, are viewed to be central to the function of the hippocampus for storing details of specific experiences. Evidence suggests that the dentate gyrus (DG) subregion of the hippocampus performs pattern separation, orthogonalizing incoming inputs before auto-associative storage in the CA3 region [131–137]. Further, the CA3 subregion is crucial for pattern completion, allowing the output of an entire stored pattern (e.g., corresponding to an entire episodic memory) from a partial input, consistent with its function as an attractor network [138,139] (Boxes 2–4).
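Pattern completion in an auto-associative attractor network, as described above, can be illustrated with a toy Hopfield-style network. This is a minimal sketch under stated assumptions rather than a model of CA3: units take values of +1 or -1, storage uses a simple Hebbian outer-product rule, all units are updated synchronously (classical formulations often update one unit at a time), and the function names and the two example patterns are hypothetical.

```python
def store(patterns):
    """Hebbian outer-product weights on recurrent connections (no self-connections)."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def complete(w, cue, max_steps=10):
    """Update all units from their recurrent input until the state is stable."""
    state = list(cue)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(w):
            h = sum(w_ij * s_j for w_ij, s_j in zip(row, state))
            # Units with zero net input keep their current value.
            new_state.append(1 if h > 0 else -1 if h < 0 else state[i])
        if new_state == state:
            break
        state = new_state
    return state

# Two stored 'experiences' (orthogonal patterns chosen for illustration).
party = [1, 1, 1, 1, -1, -1, -1, -1]
commute = [1, -1, 1, -1, 1, -1, 1, -1]
weights = store([party, commute])

partial_cue = [1, 1, 1, 0, 0, -1, -1, 0]  # 0 marks elements missing from the cue
recalled = complete(weights, partial_cue)
print(recalled == party)
```

The stored patterns are stable states of the network, so a cue that overlaps mostly with one of them settles into that pattern.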

Hippocampal Replay
A wealth of evidence demonstrates that replay of recent experiences occurs during offline periods (e.g., during sleep, rest) [2,3]. Further, the hippocampus and neocortex interact during replay, as predicted by CLS theory [65], putatively to support interleaved learning. A causal role for replay in systems-level consolidation is supported by the finding that optogenetic blockage of CA3 output in transgenic mice after learning in a contextual fear paradigm specifically reduces sharp-wave ripple (SWR) complexes in CA1 and impairs consolidation [69].

The Hippocampus and Neocortex Support Qualitatively Different Forms of Representation
A recent experiment [140] found initial evidence in favor: the behavior of rats in the Morris water maze early on appeared to reflect individual episodic traces (i.e., an instance-based, non-parametric representation), but at a later time-point (28 days after learning) was consistent with the use of a parametric representation, putatively housed in the neocortex.
were never presented together (e.g., A and C).
Paired associative recall task: a paradigm where item pairs are experienced during study (e.g., word pairs such as 'dog–table' in a human experiment, or flavor–location pairs in a rodent experiment) and at test the individual must recall the other item (e.g., specific location) from a cue (the specific flavor, e.g., banana).
Recurrent similarity computation: recurrent similarity computation allows the procedure performed by exemplar models to iterate; that is, the retrieved products from the first step of similarity computation are combined with the external sensory input, and a subsequent round of similarity computation is performed. This process continues until a stable state (i.e., basin of attraction in a neural network) is reached. This allows the model to capture higher-order similarities present in a set of related experiences, where pairwise similarities alone are not informative.
Sharp-wave ripple (SWR): spontaneous neural activity occurring within the hippocampus during periods of rest and slow wave sleep, evident as negative potentials (i.e., sharp waves). Transient high-frequency (150 Hz) oscillations (i.e., ripples) occur within these sharp waves, which can reflect the replay (i.e., reactivation) of activity patterns that occurred during actual experience, sped up by an order of magnitude.
Sparsity: the proportion of neurons in a given brain region that are active in response to a given stimulus ('population sparseness'). Sparse coding, where a small (e.g., 1%) proportion of neurons is active, is contrasted with densely distributed coding, where a relatively large proportion of neurons are active (e.g., 20%).
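
The glossary entries on exemplar models and non-parametric methods can be made concrete with a short sketch. It is an illustration only: the exponential similarity function, the `sensitivity` parameter, the function name `classify`, and the toy data are assumptions introduced here, not part of the source. Each stored experience is kept as its own datapoint, so capacity grows simply by appending to the memory.

```python
import math
from collections import defaultdict

def classify(memory, probe, sensitivity=1.0):
    """Sum the similarity of the probe to every stored exemplar, per label,
    then output the label with the most evidence (the process ends there)."""
    evidence = defaultdict(float)
    for features, label in memory:
        distance = math.dist(features, probe)
        evidence[label] += math.exp(-sensitivity * distance)
    return max(evidence, key=evidence.get)

# Stored experiences: (feature vector, category label).
memory = [
    ((1.0, 1.0), "A"), ((1.2, 0.9), "A"),
    ((4.0, 4.2), "B"), ((3.8, 4.0), "B"),
]
print(classify(memory, (1.1, 1.0)))

# Non-parametric growth: a single new experience extends the memory directly.
memory.append(((8.0, 8.0), "C"))
print(classify(memory, (7.5, 8.2)))
```

A parametric learner would instead adjust a fixed set of weights and discard the individual datapoints.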
modeling the neural computations supporting visual processing of objects in primates [17,18]. The considerable advantages of depth in allowing the learning of increasingly complex and abstract mappings [16] are balanced here by the strong interdependencies among connection weights in deep networks [19,20], such that the weights are learned gradually through extensive, repeated, and interleaved exposure to an ensemble of training examples that embody the domain statistics.

Although there are real advantages of a system using structured parametric representations, on its own such a system would suffer from two drastic limitations [1]. First, it is important to be able to base behavior on the content of an individual experience. For example, after experiencing a life-threatening situation (for example, an encounter with a lion at a watering-hole), it would clearly be beneficial to learn to avoid that particular location without the need for further encounters with the lion. The second problem is that the rapid adjustment of connection weights in a multilayer network to accommodate new information can severely disrupt the representation of existing knowledge in it, a phenomenon termed catastrophic interference [1,21–23] that is related to the stability–plasticity dilemma [24]. If the new information about the dangerous lion is forced into a multi-layer network by making large connection weight adjustments just to accommodate this item, this can interfere with knowledge of other less-threatening animals one may already be familiar with.
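
The contrast between focused and interleaved learning can be shown in a deliberately tiny setting. The sketch below is an assumption-laden simplification: it uses a single linear unit trained with the delta rule rather than a multilayer network, and the two items, their shared feature, the learning rate, and the epoch count are all hypothetical choices made for illustration.

```python
def output(w, x):
    """Linear unit: weighted sum of the inputs."""
    return sum(w_k * x_k for w_k, x_k in zip(w, x))

def train(w, items, epochs=500, lr=0.1):
    """Delta rule: nudge the weights toward each item's target, item by item."""
    w = list(w)
    for _ in range(epochs):
        for x, target in items:
            err = target - output(w, x)
            w = [w_k + lr * err * x_k for w_k, x_k in zip(w, x)]
    return w

old_item = ([1.0, 1.0, 0.0], 1.0)    # existing knowledge
new_item = ([0.0, 1.0, 1.0], -1.0)   # new experience that shares one feature

w_old = train([0.0, 0.0, 0.0], [old_item])

# Focused learning: only the new item is trained.
w_focused = train(w_old, [new_item])
# Interleaved learning: the new item is mixed with the old one.
w_interleaved = train(w_old, [old_item, new_item])

def old_error(w):
    """Absolute error on the previously learned item."""
    return abs(old_item[1] - output(w, old_item[0]))

print(round(old_error(w_focused), 3))      # about 0.75: old knowledge disrupted
print(round(old_error(w_interleaved), 3))  # about 0.0: both items accommodated
```

Because the two items share a feature, weight changes made only for the new item shift the response to the old one, whereas interleaving lets the weights settle on values that satisfy both.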
Layer 4 (Output)
0
0
= j
a j lndash1w ij llndash1
w ij llndash1
sumnl i
nl i
al i
al
i
a j lndash1
Layer 3
Layer 2
Layer 1 (Input)
Figure 2. A Neocortex-Like Artificial Neural Network. In the complementary learning systems (CLS) theory, neocortical processing is seen as occurring through the propagation of activation among neurons via weighted connections, as simulated using artificial networks of neuron-like units (small circles). Each unit has an input line and an output line (with arrowhead). There is a separate real-valued weight where each output line crosses an input line. The weights are the knowledge that governs processing in the network. During processing (inset), each unit computes a net input (n) from the activations of its inputs and the weights (plus a bias term, omitted here), producing an activation (a) that is a non-linear function of n (one such function shown). The units in a layer may project back onto their own inputs (illustrated for layer 3), simulating recurrent intra-cortical computations, and higher layers may project back to lower layers (Figure 1). In the situation shown, the input (lower left) is a pattern in which units are either active (a = 1, black) or inactive (a = 0, white), and examples of possible activations produced in units of other layers are shown (darker for greater activation). Learning occurs through adjusting the weights to reduce the difference between the output of the network and a target output (upper right) [10,16]. In the case shown, the output activations are similar to the target, but there is some error to drive learning. There are no targets for internal or hidden layers (i.e., layers 2 and 3). These patterns depend on the connection weights, which in turn are shaped by the error-driven learning process.
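The processing and learning steps described in this caption can be sketched in a few lines. The following is a minimal illustration, not the authors' simulation code: it assumes a logistic activation function, a squared-error measure, and a single layer of weights for brevity (the caption describes a multi-layer network). The function names, starting weights, and learning rate are all hypothetical.

```python
import numpy as np

def logistic(n):
    """Non-linear activation: squashes the net input n into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-n))

def forward(weights, a_prev, bias=0.0):
    """Net input n_i = sum_j w_ij * a_j (+ bias), then activation a_i = f(n_i)."""
    n = weights @ a_prev + bias
    return logistic(n)

def delta_rule_step(weights, a_prev, target, lr=1.0):
    """One error-driven update for a single layer of weights (squared error assumed)."""
    a = forward(weights, a_prev)
    delta = (target - a) * a * (1.0 - a)   # error times the slope of the logistic
    return weights + lr * np.outer(delta, a_prev)

def squared_error(weights, a_prev, target):
    """Summed squared difference between the output and the target."""
    return float(np.sum((target - forward(weights, a_prev)) ** 2))

a_in = np.array([1.0, 0.0, 1.0])   # binary input pattern (active = 1, inactive = 0)
target = np.array([1.0, 0.0])      # desired output pattern
W = np.zeros((2, 3))               # hypothetical starting weights

error_before = squared_error(W, a_in, target)
W_new = delta_rule_step(W, a_in, target)
error_after = squared_error(W_new, a_in, target)
print(error_before, error_after)   # the error shrinks after the update
```

Only the weights from active inputs change in this update, because each weight change is proportional to the activation of the sending unit.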
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7, p. 515
Box 2. Functional Roles of Subregions of the Medial Temporal Lobes
Work within the CLS framework [27116141] relies on the anatomical and physiological properties of MTL subregions and the computational insights of others [9,25,26] to characterize the computations performed within these structures.
Entorhinal Cortex (ERC): Input to the Hippocampal System
During an experience, inputs from neocortex produce a pattern of activation in the ERC that may be thought of as a compressed description of the patterns in the contributing cortical areas (Figure I; illustrative active neurons in the ERC are shown in blue). ERC neurons give rise to projections to three subregions of the hippocampus proper: the dentate gyrus (DG), CA1, and CA3 [2884].
Pattern selection and pattern separation: novel ERC patterns are thought to activate a small set of previously uncommitted DG neurons (shown in red; these neurons may be relatively young neurons created by neurogenesis). These neurons in turn select a random subset of neurons in CA3 via large 'detonator synapses' (shown as red dots on the projection from DG to CA3) to serve as the representation of the memory in CA3, ensuring that the new CA3 pattern is as distinct as possible from the CA3 patterns for other memories, including those for experiences similar to the new experience (Boxes 3 and 4).
Pattern completion: recurrent connections from the active CA3 neurons onto other active CA3 neurons are strengthened during the experience, such that if a subset of the same neurons later becomes active, the rest of the pattern will be reactivated. Direct connections from ERC to CA3 are also strengthened, allowing the ERC input to directly activate the pattern in CA3 during retrieval without requiring DG involvement (Box 3).
Pattern reinstatement in ERC and neocortex [116141]: the connections from ERC to CA1 and back are thought to change relatively slowly to allow stable correspondence between patterns in CA1 and ERC. Strengthening of connections from the active CA3 neurons to the active CA1 neurons during memory encoding allows this CA1 pattern to be reactivated when the corresponding CA3 pattern is reactivated; the stable connections from CA1 to ERC then allow the appropriate pattern there to be reactivated, and stable connections between ERC and neocortical areas propagate the reactivated ERC pattern to the neocortex. Importantly, the bidirectional projections between CA1 and ERC, and between ERC and neocortex, support the formation and decoding of invertible CA1 representations of ERC and neocortical patterns and allow recurrent computations. These connections should not change rapidly given the extended role of the hippocampus in memory; otherwise, reinstatement in the neocortex of memories stored in the hippocampus would be difficult [61].
[Figure I graphic: schematic of the hippocampal system with labeled regions CA3, CA1, DG, ERC, and neocortex.]
Figure I. Hippocampal Subregions, Connectivity, and Representation. Schematic depictions of neurons (with circular or triangular cell bodies) are shown along with schematic depictions of projections from neurons in an area to neurons in the same or other areas (grey or colored lines; red coloring indicates projections with highly plastic synapses, while grey coloring illustrates relatively less plastic or stable projections). CA1 output to ERC then propagates out to neocortex. ERC and even resulting neocortical activity can be fed back into the hippocampus (broken line), as proposed in the REMERGE model (see below).
hippocampal representation formed in learning an event affords a way of allowing gradual integration of knowledge of the event into neocortical knowledge structures. This can occur if the hippocampal representation can reactivate or replay the contents of the new experience back to the neocortex, interleaved with replay and/or ongoing exposure to other experiences [1]. In this way, the new experience becomes part of the database of experiences that govern the values of the connections in the neocortical learning system [51–53]. Which other memories are selected for interleaving with the new experience remains an open question. Most simply, the hippocampus might replay recent novel experiences interleaved with all other recent experiences still stored in
Box 3. Pattern Separation and Completion in Different Subregions of the Hippocampus
Pattern separation and completion [25–27] are defined in terms of transformations that affect the overlap or similarity among patterns of neural activity [28142]. Pattern separation makes similar patterns more distinct through conjunctive coding [9,25], in which each output neuron responds only to a specific combination of active input neurons. Figures IA and IB illustrate how this can occur. Pattern separation is thought to be implemented in DG (see Box 4) using higher-order conjunctions that reduce overlap even more than illustrated in the figure.
Pattern completion is a process that takes a fragment of a pattern and fills in the remaining features (as in recalling a lion upon seeing the scene where the lion previously appeared), or that takes a pattern similar to a familiar pattern and makes it even more similar to it. Computational simulations [27] have shown how the CA3 region might combine features of pattern separation and completion, such that moderate and high overlap results in pattern completion toward the stored memory, but less overlap results in the creation of a new memory [37133143] (Figure IC). In this account, when environmental input produces a pattern in ERC similar to a previous pattern, the CA3 outputs a pattern closer to the one it previously used for this ERC pattern [124144]. However, when the environment produces an input on the ERC that has low overlap with patterns stored previously, the DG recruits a new, statistically independent cell population in CA3 (i.e., pattern separation [27]). Emerging evidence suggests that the amount of overlap required for pattern completion (as well as other characteristics of hippocampal processing) may differ across the proximal–distal [145146] and dorso–ventral axes [98147–150] of the hippocampus, and may be shaped by neuromodulatory factors (e.g., acetylcholine) [85151]. Also, incomplete patterns require less overlap with a stored pattern than distorted ones for completion to occur, so that partial cues will tend to produce completion, as when one sees the watering hole and remembers seeing a lion there previously [27].
Several studies point to differences between the CA3 and CA1 regions in how their neural activity patterns respond to changes to the environment [37]: broadly, the CA1 region tends to mirror the degree of overlap in the inputs from the ERC, while CA3 shows more discontinuous responses, reflecting either pattern separation or completion [134152].
[Figure I graphic: panels (A) and (B), pattern separation in DG; panel (C), separation and completion in CA3. Plots show output overlap (vertical axis, 0 to 1) against input overlap (horizontal axis, 0 to 1).]
Figure I. Conjunctive Coding, Pattern Separation, and Pattern Completion. (A) A set of 10 conjunctive units with connections from a layer of 5 input units is shown twice, with different input patterns. Here each conjunctive unit detects activity in a distinct pair of input units (arrows). The output for each pattern is sparser than the input (i.e., 30% vs 60%, respectively), and the two outputs overlap less than the two corresponding inputs (i.e., 33% vs 67%, respectively; overlap is the number of active units shared by two patterns divided by the number of units active in each). DG may use higher-order conjunctions, magnifying these effects. (B) An illustration of the general form of a pattern separation function, showing the relationship between input and output overlap. Arrows indicate the overlap of the inputs and outputs shown in the left panel. (C) The separation-and-completion profile associated with CA3, where low levels of input overlap are reduced further while higher levels are increased [27,37].
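The arithmetic in panel (A) can be checked with a short sketch. It assumes one conjunctive unit per distinct pair of the 5 input units (10 units in total) and two input patterns that each have 3 active units, 2 of which are shared; the specific unit indices below are hypothetical and chosen only to reproduce the percentages quoted in the caption.

```python
from itertools import combinations

N_INPUTS = 5
# One conjunctive unit per distinct pair of input units: C(5, 2) = 10 units.
PAIRS = list(combinations(range(N_INPUTS), 2))

def conjunctive_code(active_inputs):
    """Return the set of conjunctive units whose two inputs are both active."""
    active = set(active_inputs)
    return {pair for pair in PAIRS if pair[0] in active and pair[1] in active}

def overlap(p, q):
    """Shared active units divided by the number active in each pattern
    (both patterns are assumed to have the same number of active units)."""
    return len(p & q) / len(p)

# Hypothetical input patterns: 3 of 5 units active, 2 of them shared.
in_a, in_b = {0, 1, 2}, {1, 2, 3}
out_a, out_b = conjunctive_code(in_a), conjunctive_code(in_b)

input_sparsity = len(in_a) / N_INPUTS        # 0.6
output_sparsity = len(out_a) / len(PAIRS)    # 0.3
input_overlap = overlap(in_a, in_b)          # 2/3
output_overlap = overlap(out_a, out_b)       # 1/3
print(input_sparsity, output_sparsity, input_overlap, output_overlap)
```

Detecting triples rather than pairs would shrink the shared output further, which is the sense in which higher-order conjunctions magnify the effect.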
complementary properties of each of the two component systems, allowing new information to be rapidly stored in the hippocampus and then slowly integrated into neocortical representations. This process, sometimes labeled 'systems level consolidation' [51], arises within the theory from gradual cortical learning driven by replay of the new information, interleaved with other activity to minimize disruption of existing knowledge during the integration of the new information.
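The contrast between forcing a new item into a network on its own and interleaving it with existing knowledge can be illustrated with a deliberately tiny example. This is a sketch under strong simplifying assumptions, not a model from the theory: it uses a single linear unit trained with the delta rule rather than a multi-layer network, and the two items, their targets, and the learning rate are all made up for illustration.

```python
import numpy as np

def delta_update(w, x, t, lr=0.5):
    """Error-driven update for one linear unit: w += lr * (t - w.x) * x."""
    return w + lr * (t - w @ x) * x

old_x, old_t = np.array([1.0, 1.0, 0.0]), 1.0    # familiar item
new_x, new_t = np.array([0.0, 1.0, 1.0]), -1.0   # new item sharing one feature

# Prior learning of the old item alone.
w = np.zeros(3)
for _ in range(50):
    w = delta_update(w, old_x, old_t)

# Focused learning: the new item is trained on its own.
w_focused = w.copy()
for _ in range(50):
    w_focused = delta_update(w_focused, new_x, new_t)

# Interleaved learning: the old item is replayed alongside the new one.
w_interleaved = w.copy()
for _ in range(50):
    w_interleaved = delta_update(w_interleaved, new_x, new_t)
    w_interleaved = delta_update(w_interleaved, old_x, old_t)

old_error_focused = abs(old_t - w_focused @ old_x)          # old knowledge disrupted
old_error_interleaved = abs(old_t - w_interleaved @ old_x)  # old knowledge preserved
new_error_interleaved = abs(new_t - w_interleaved @ new_x)  # new item also learned
print(old_error_focused, old_error_interleaved, new_error_interleaved)
```

In this toy case the shared feature is what causes the interference: focused training changes its weight to suit the new item alone, whereas interleaving lets the weights settle on values that satisfy both items.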
Empirical Evidence of Replay
Because of its centrality in the theory, we highlight key empirical evidence that replay events really do occur. The data come primarily from rodents recorded during periods of inactivity (including sleep), in which hippocampal neurons exhibit large irregular activity (LIA) patterns that are distinct from the activity patterns observed during active states [23]. During LIA states, synchronous discharges thought to be initiated in hippocampal area CA3 produce sharp-wave ripples (SWRs), which are propagated to neocortex. SWRs reflect the reactivation of recent experiences, expressed as the sequential firing of so-called place cells, cells that fire when the animal is at a specific location [23,57–59]. These replay events appear to be time-compressed by a factor of about 20, bringing neuronal spikes that were well separated in time during an actual experience into a time-window that enhances synaptic plasticity both
Box 4. Sparse Conjunctive Coding and Pattern Separation in the Dentate Gyrus
Neuronal codes range from the extreme of localist codes, where neurons respond highly selectively to single entities ('grandmother cells'), to dense distributed codes, where items are coded through the activity of many (e.g., 50%) neurons in an area [153154]. While localist codes minimize interference and are easily decodable, they are inefficient in terms of representational capacity. By contrast, dense distributed codes are capacity-efficient; however, they are metabolically costly and relatively difficult to decode. These are endpoints on a continuum quantified by a measure called sparsity, where 'population' sparsity indexes the proportion of neurons that fire in response to a given stimulus or location, and 'lifetime' sparsity indexes the proportion of stimuli to which a single neuron responds [26153155]. For example, a population sparsity of 1% means that only 1% of the neurons in a population are active in representing a given input. Two randomly selected sparse patterns tend to have low overlap (for two randomly selected patterns of equal sparsity over the same set of neurons, the average proportion of neurons in either pattern that is active in the other is equal to the sparsity), but neurons still participate in several different memories, making them more efficient than localist codes. Despite variability in estimates of the sparsity of a given brain region [27153156157], the DG is widely believed to sustain one of the sparsest neural codes in the brain (0.5–1% population sparseness) [25–27]. The CA3 region, to which the DG projects, is thought to be less sparse (2.5% [47]). Many studies find less-sparse patterns in CA1 than CA3 [134152].
The unique functional and anatomical properties of the DG suggest the origins of its sparse, pattern-separated code. The perforant path from the ERC (containing 200 000 neurons in the rodent) projects to a layer of 1 million DG granule cells. Combined with the high levels of inhibition in the DG, this supports the formation of highly sparse conjunctive representations, such that each neuron in DG responds only when several input neurons are simultaneously active, reducing overlap between similar input patterns [25–27,136]. Evidence also suggests that new DG neurons arise from stem cells throughout adult life; these new neurons may be preferentially recruited in the formation of memories [136], further reducing overlap with previously stored memories. The CA3 pattern for a memory is then selected by the active DG neurons, each of which has a 'detonator' synapse to 15 randomly selected CA3 neurons. This process helps minimize the overlap of CA3 patterns for different memories, increasing storage capacity and minimizing interference between them, even if the two memories represent similar events that have highly overlapping patterns in neocortex and ERC. Empirical evidence provides support for this, with one study [137] showing that the representation supported by DG was highly sensitive to small changes in the environment, despite evidence that incoming inputs from the ERC were little affected (also see [133145]). Furthermore, DG lesions impair an animal's ability to learn to respond differently in two very similar environments, while leaving intact the ability to learn to respond differently in two environments that are not similar [136].
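The statement that the average overlap of two randomly selected patterns equals the sparsity can be verified exactly for a small case by enumerating every pair of patterns. The sketch below assumes that the two patterns are drawn independently and uniformly from all patterns with a fixed number of active neurons; the population size (10) and number of active neurons (2) are hypothetical and far smaller than any real population.

```python
from fractions import Fraction
from itertools import combinations

def mean_overlap(n_neurons, n_active):
    """Average, over all ordered pairs of patterns with n_active of n_neurons active,
    of the proportion of one pattern's active neurons that are active in the other."""
    patterns = [frozenset(c) for c in combinations(range(n_neurons), n_active)]
    total = Fraction(0)
    for p in patterns:
        for q in patterns:   # independent draws, so q may equal p
            total += Fraction(len(p & q), n_active)
    return total / (len(patterns) ** 2)

# Hypothetical toy population: 10 neurons, 2 active per pattern (sparsity 0.2).
sparsity = Fraction(2, 10)
average = mean_overlap(10, 2)
print(average == sparsity)   # True
```

Exact fractions are used instead of random sampling so that the equality holds without any sampling noise.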
Box 5. Similarity-Based Coding in High-Level Visual Cortex
High-level visual regions of the neocortex are thought to support distributed representations that are inferred to be less sparse than those of the DG and the CA3/CA1 regions of the hippocampus (Box 4). Population sparseness in the ERC is estimated at 7–10% [158], with high-level sensory cortices exhibiting similar or higher levels of sparseness (e.g., variable estimates [44–46]). Although lifetime sparseness does not directly translate to population sparseness, recent evidence suggests that V4 and inferotemporal cortex (ITc) have a sparseness of 10% on this measure [159]. It is worth noting that learning rates may vary according to neuronal selectivity and lifetime sparseness, resulting in differences in learning rates across neocortical areas and hippocampal subregions. Neurons in early visual regions that encode frequently occurring features (i.e., edges) may have a relatively slow learning rate, while neurons in higher visual regions and beyond (e.g., ITc and perirhinal cortex) may have a higher learning rate to support the encoding of less frequently occurring, more conjunctive features (e.g., individual objects) [12160161].
Evidence from electrophysiological recording studies in high-level visual cortical regions such as the ITc in primates provides support for the operation of a similarity-based coding scheme, whereby related categories (e.g., dogs and cats) are represented by overlapping neuronal codes [1740–43] (Figure I). Representational similarity analysis (RSA) of the ITc population response during passive viewing of pictures reveals coding of fine-grained categorical structure (e.g., of a set of animate and inanimate objects) that is well fit by deep convolutional neural networks, which have algorithmic parallels with feedforward processing in the ventral visual stream [1740]. While analogous similarity-based coding was observed using fMRI in the human homolog of ITc [41], there was no evidence for greater within-category (cf. between-category) representational similarity in any subregion of the hippocampus in a recent fMRI study [162], which found evidence consistent with the importance of pattern separation in episodic memory. Instead, similarity-based coding in this study was observed in the perirhinal and parahippocampal cortex, MTL regions that project to the ERC and that are typically considered to be intermediate zones (i.e., between the hippocampal and neocortical systems) in CLS theory.
[Figure I graphic: two representational dissimilarity matrices, monkey ITc (left) and human ITc (right). Rows and columns are grouped into animate objects (human and non-human bodies and faces) and inanimate objects (natural and artificial). Color scale: dissimilarity, as percentile of 1 − r, from 0 to 100.]
Figure I. Similarity-Based Coding in High-Level Visual Cortex. Representational dissimilarity matrices (RDMs) reflect the correlation (i.e., 1 − r, where r is the Pearson correlation coefficient) between the responses of voxel patterns (fMRI in humans [41], right panel) or neuronal populations (electrophysiological recording in monkey [43], left panel) to a set of 92 object images. RDMs are analogous in monkey and human ITc. The RDMs show that the representations of animate objects are similar, as are those of inanimate objects. In addition to this clear animate–inanimate distinction, object coding in ITc exhibits finer categorical structure (e.g., for faces, body parts) visible in these RDMs (also see [41]). Reproduced with permission from [41].
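The dissimilarity measure named in this caption, 1 − r, can be computed directly from a matrix of response patterns. The following is a minimal sketch, assuming one row per stimulus and one column per neuron or voxel; the response values are made up, and the conversion of dissimilarities to percentiles used for the figure's color scale is not included.

```python
import numpy as np

def rdm(responses):
    """Representational dissimilarity matrix: 1 - Pearson r between the response
    patterns (rows = stimuli, columns = neurons or voxels)."""
    return 1.0 - np.corrcoef(responses)

# Hypothetical responses of 4 units to 3 stimuli (made-up numbers).
responses = np.array([
    [1.0, 2.0, 3.0, 4.0],   # stimulus 0
    [2.0, 4.0, 6.0, 8.0],   # stimulus 1: same profile as stimulus 0, scaled
    [4.0, 3.0, 2.0, 1.0],   # stimulus 2: reversed profile
])
d = rdm(responses)
print(np.round(d, 3))
```

Stimuli with the same response profile have a dissimilarity near 0 regardless of overall response strength, while stimuli with opposite profiles approach the maximum value of 2.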
rodents [72,73]. This generalized replay (simultaneous reactivation of multiple related traces during testing or offline periods) may facilitate the creation of new representations from the recombination of multiple related episodes ('stored generalizations') [5] and the discovery of novel relationships (e.g., shortcuts) [72,73]. Empirical evidence also supports a role for the hippocampus in category- and so-called 'statistical' learning [105–107]: the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6. Generalization Through Recurrence in the Hippocampal System
The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers, feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed, pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).
An essential principle of the model, mediated by the bidirectional excitatory connections between feature and conjunctive layers, is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed 'big-loop' recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time, thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism ([5] for details). Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I graphic: a conjunctive layer with units AB, BC, CD, DE, and EF above a feature layer with units A, B, C, D, E, and F.]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE, showing its two-layer architecture with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
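The idea of recirculating output as input across the two layers can be sketched with a toy network that uses the units named in this figure. This is not the REMERGE model itself: the update equations, the use of a softmax to stand in for recurrent inhibition among conjunctive units, the temperature, and the number of steps are all assumptions made for illustration.

```python
import numpy as np

FEATURES = ["A", "B", "C", "D", "E", "F"]
CONJUNCTS = ["AB", "BC", "CD", "DE", "EF"]

# Bidirectional connections: W[c, f] = 1 when feature f is a constituent of conjunct c.
W = np.array([[1.0 if f in c else 0.0 for f in FEATURES] for c in CONJUNCTS])

def softmax(x):
    """Normalized exponential; used here as a stand-in for competition."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def settle(external, steps=20, temperature=0.5):
    """Recirculate activity between the feature and conjunctive layers."""
    feature = external.copy()
    for _ in range(steps):
        # Conjunctive units compete for activation given the current feature pattern.
        conjunct = softmax(W @ feature / temperature)
        # Feature layer = external probe plus the content of the retrieved conjuncts.
        feature = external + W.T @ conjunct
    return feature, conjunct

probe = np.array([1.0, 0.0, 1.0, 0.0, 0.0, 0.0])   # present A and C together
feature, conjunct = settle(probe)
for name, value in zip(FEATURES, feature):
    print(name, round(float(value), 3))
```

In this toy version, B is never presented, yet it ends up more active than D, E, or F, because both AB and BC are retrieved by the probe and both feed activation back to B.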
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of 'concept' cells, hippocampal neurons which come to respond to common features across many events: for example, cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [3451–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7. Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG, and also in the subiculum, tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant 'concept cell like' responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).
It is also interesting to consider the finding of 'splitter' cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here, some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent 'charts' [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.
In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions (reward value and item) were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8. Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore, under the CLS framework, we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the 'fast mapping' paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects, one familiar and one unfamiliar, and are asked a question such as 'is the numbat's tail pointing up?', inferring that the unfamiliar name 'numbat' must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I graphic: (A) Original paired associates; (B) Introduction of new paired associates. Each panel shows an overhead view of the arena with numbered locations.]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of the 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid schema-dependent one-shot learning of these new paired associates (PAs) is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function and the ubiquity of continual learning is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
Box 9 Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.

Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance to, at best, 30% of the level reached when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events (specifically, experiences that are associated with high reward prediction errors) yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
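The two sampling schemes described in this box can be illustrated with a minimal sketch. This is not the DQN implementation of [119] or the prioritized replay algorithm of [183]; the class name `ReplayBuffer`, its methods, and the use of the absolute prediction error as a sampling weight are assumptions made for illustration only (published prioritized replay additionally uses a priority exponent and importance-sampling corrections, which are omitted here).

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity, seed=None):
        # Oldest transitions are dropped automatically once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, transition, prediction_error=1.0):
        """Store a transition with a priority based on its reward prediction error."""
        self.buffer.append(transition)
        # Small constant keeps every transition sampleable (assumption of this sketch).
        self.priorities.append(abs(prediction_error) + 1e-6)

    def sample_uniform(self, batch_size):
        """Uniform sampling: every stored transition is treated as equal."""
        return self.rng.sample(list(self.buffer), batch_size)

    def sample_prioritized(self, batch_size):
        """Biased sampling: transitions with larger prediction errors are drawn more often."""
        return self.rng.choices(
            list(self.buffer), weights=list(self.priorities), k=batch_size
        )


# Usage: store five transitions in a buffer that holds only the three most recent.
buf = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buf.add((t, 0, 0.0, t + 1), prediction_error=float(t))
uniform_batch = buf.sample_uniform(2)
prioritized_batch = buf.sample_prioritized(1000)
```

Drawing minibatches from the buffer, rather than from the most recent step, is what breaks up the correlations between consecutive samples; the prioritized variant simply changes how often each stored experience is revisited.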
Box 10 Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory, and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].

The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., 'key') emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network, but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).

While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187].

There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section 'Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment').
Figure I. NTM and the Paired Associative Recall Task. [Schematic labels: Input (Xt), Output (Yt), Controller, Write heads, External memory, Read heads.] The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted: one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
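Content-based addressing, as described in this box, can be sketched as a soft read over a memory matrix. The function name `content_read`, the `sharpness` parameter, and the choice of cosine similarity followed by a softmax are assumptions of this sketch rather than a reproduction of the NTM code of [125]; the point is only that the read weights are graded and therefore differentiable.

```python
import numpy as np


def content_read(memory, key, sharpness=10.0):
    """Soft, similarity-based read from a memory matrix (one slot per row).

    Returns the weighted read vector and the attention weights over slots.
    """
    # Cosine similarity between the controller's key and every memory slot.
    slot_norms = np.linalg.norm(memory, axis=1) + 1e-8
    key_norm = np.linalg.norm(key) + 1e-8
    similarity = (memory @ key) / (slot_norms * key_norm)

    # Softmax turns similarities into graded attention weights that sum to 1.
    scores = sharpness * similarity
    scores = scores - scores.max()  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()

    # The read vector is a weighted blend of slots, dominated by the best match.
    return weights @ memory, weights


# Usage: three stored items; the key is a noisy version of the second one.
memory = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
key = np.array([0.1, 0.9, 0.0])
read_vector, weights = content_read(memory, key)
```

Because a partial or noisy key retrieves the closest stored pattern, this kind of read behaves much like pattern completion in an attractor network, which is the parallel drawn above with CA3.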
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a 'memory network' [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., 'where is the milk?') is used to retrieve the sentence with the most similar feature vector in the database (e.g., 'Joe left the milk'); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story ('Joe traveled to the office'); this process iterates until a response is emitted by the network ('the office'). The joint dependence of this system on input-output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
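The iterative retrieval just described can be sketched in a few lines. This is a toy illustration, not the memory network of [127] or the REMERGE model: simple word-count vectors stand in for learned dense feature vectors, and the helper names (`bag_of_words`, `cosine`, `multi_hop_retrieve`) and the extra distractor sentence are hypothetical.

```python
import numpy as np


def bag_of_words(text, vocab):
    """Word-count vector; a stand-in for a learned dense feature vector."""
    vec = np.zeros(len(vocab))
    for word in text.split():
        vec[vocab[word]] += 1.0
    return vec


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


def multi_hop_retrieve(story, query, hops=2):
    """Retrieve sentences over several hops, recirculating each result into the probe."""
    words = sorted({w for s in story + [query] for w in s.split()})
    vocab = {w: i for i, w in enumerate(words)}
    memory = [bag_of_words(s, vocab) for s in story]
    probe = bag_of_words(query, vocab)

    retrieved = []
    for _ in range(hops):
        # Score every stored sentence not yet retrieved against the current probe.
        scores = [
            cosine(probe, m) if i not in retrieved else -np.inf
            for i, m in enumerate(memory)
        ]
        best = int(np.argmax(scores))
        retrieved.append(best)
        # Recirculation: the retrieved item is combined with the probe for the next hop.
        probe = probe + memory[best]
    return [story[i] for i in retrieved]


story = [
    "mary went to the garden",
    "joe traveled to the office",
    "joe left the milk",
]
chain = multi_hop_retrieve(story, "where is the milk", hops=2)
```

In this example the first hop retrieves the sentence about the milk, and only the combined probe (query plus retrieved sentence) makes the sentence about Joe's location the best match on the second hop, which is the sense in which the output is recirculated as a new input.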
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O'Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the 'index maintenance' problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc L B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O'Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O'Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O'Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–2014
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66. Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67. Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104. Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) 'Simple memory: a theory of archicortex'. Philos Trans R Soc B Biol Sci 370, 20140383
105. Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106. Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109. Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110. Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111. Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113. McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116. McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118. Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119. Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121. Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122. Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123. Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124. Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129. Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165. Hintzman DL (1986) 'Schema abstraction' in a multiple-trace memory model. Psychol Rev 93, 411–428
166. Suthana NA et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167. Wood ER et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower MR et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170. MacDonald CJ et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171. Markus EJ et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172. Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173. Kriegeskorte N et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174. Komorowski RW et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175. Ellenbogen JM et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176. Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177. Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178. Sharon T et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179. Merhav M et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith CN et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181. Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve A et al. (2014) No evidence that 'fast-mapping' benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul T et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel CR (1990) The Organization of Learning, MIT Press
185. Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186. Santoro A et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
were never presented together (e.g., A and C).
Paired associative recall task: a paradigm where item pairs are experienced during study (e.g., word pairs such as 'dog–table' in a human experiment, or flavor–location pairs in a rodent experiment), and at test the individual must recall the other item (e.g., specific location) from a cue (the specific flavor, e.g., banana).
Recurrent similarity computation: recurrent similarity computation allows the procedure performed by exemplar models to iterate; that is, the retrieved products from the first step of similarity computation are combined with the external sensory input, and a subsequent round of similarity computation is performed. This process continues until a stable state (i.e., basin of attraction in a neural network) is reached. This allows the model to capture higher-order similarities present in a set of related experiences, where pairwise similarities alone are not informative.
Sharp-wave ripple (SWR): spontaneous neural activity occurring within the hippocampus during periods of rest and slow wave sleep, evident as negative potentials (i.e., sharp waves). Transient high-frequency (150 Hz) oscillations (i.e., ripples) occur within these sharp waves, which can reflect the replay (i.e., reactivation) of activity patterns that occurred during actual experience, sped up by an order of magnitude.
Sparsity: the proportion of neurons in a given brain region that are active in response to a given stimulus ('population sparseness'). Sparse coding, where a small (e.g., 1%) proportion of neurons is active, is contrasted with densely distributed coding, where a relatively large proportion of neurons are active (e.g., 20%).
modeling the neural computations supporting visual processing of objects in primates [17,18]. The considerable advantages of depth in allowing the learning of increasingly complex and abstract mappings [16] are balanced here by the strong interdependencies among connection weights in deep networks [19,20], such that the weights are learned gradually, through extensive, repeated, and interleaved exposure to an ensemble of training examples that embody the domain statistics.

Although there are real advantages of a system using structured parametric representations, on its own such a system would suffer from two drastic limitations [1]. First, it is important to be able to base behavior on the content of an individual experience. For example, after experiencing a life-threatening situation (for example, an encounter with a lion at a watering-hole), it would clearly be beneficial to learn to avoid that particular location without the need for further encounters with the lion. The second problem is that the rapid adjustment of connection weights in a multilayer network to accommodate new information can severely disrupt the representation of existing knowledge in it, a phenomenon termed catastrophic interference [1,21–23] that is related to the stability–plasticity dilemma [24]. If the new information about the dangerous lion is forced into a multi-layer network by making large connection weight adjustments just to accommodate this item, this can interfere with knowledge of other, less-threatening animals one may already be familiar with.
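The contrast between focused and interleaved learning described above can be illustrated with a deliberately small sketch. It is not the multilayer network the theory has in mind: it assumes a single linear unit trained with the delta rule, and the feature vectors, targets, and learning rates are hypothetical values chosen only so the effect is easy to see.

```python
def predict(w, x):
    """Output of a single linear unit: the weighted sum of its inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train(w, items, lr, epochs):
    """Error-driven (delta-rule) learning; returns a new weight list."""
    w = list(w)
    for _ in range(epochs):
        for x, target in items:
            err = target - predict(w, x)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w

def mse(w, items):
    """Mean squared error of the unit over a set of (input, target) items."""
    return sum((t - predict(w, x)) ** 2 for x, t in items) / len(items)

# Hypothetical items: two familiar animals, then one new item (the "lion")
# whose features overlap with both of them.
old_items = [([1.0, 0.0, 1.0], 1.0), ([0.0, 1.0, 1.0], 0.0)]
new_item = [([1.0, 1.0, 0.0], 1.0)]

# Existing knowledge, acquired gradually with a small learning rate.
w_old = train([0.0, 0.0, 0.0], old_items, lr=0.1, epochs=500)

# Focused learning: one large weight change that fits only the new item.
w_focused = train(w_old, new_item, lr=0.5, epochs=1)

# Interleaved learning: the new item is mixed with the old ones, small steps.
w_interleaved = train(w_old, old_items + new_item, lr=0.1, epochs=500)

print("old-item error after focused learning:    ", round(mse(w_focused, old_items), 3))
print("old-item error after interleaved learning:", round(mse(w_interleaved, old_items), 3))
```

In this toy setting the single large update learns the new item immediately but raises the error on the familiar items, whereas interleaving fits all three items. Interference in a real multilayer network is generally more severe than this linear case suggests.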
[Figure 2 graphic: a four-layer network with Layer 1 (Input), Layer 2, Layer 3, Layer 4 (Output), and a Target pattern. Inset: the net input to unit i in layer l is $n_i^l = \sum_j a_j^{l-1} w_{ij}^{l,l-1}$, and its activation $a_i^l$ is a non-linear function of $n_i^l$.]

Figure 2. A Neocortex-Like Artificial Neural Network. In the complementary learning systems (CLS) theory, neocortical processing is seen as occurring through the propagation of activation among neurons via weighted connections, as simulated using artificial networks of neuron-like units (small circles). Each unit has an input line and an output line (with arrowhead). There is a separate real-valued weight where each output line crosses an input line. The weights are the knowledge that governs processing in the network. During processing (inset), each unit computes a net input (n) from the activations of its inputs and the weights (plus a bias term, omitted here), producing an activation (a) that is a non-linear function of n (one such function shown). The units in a layer may project back onto their own inputs (illustrated for layer 3), simulating recurrent intra-cortical computations, and higher layers may project back to lower layers (Figure 1). In the situation shown, the input (lower left) is a pattern in which units are either active (a = 1, black) or inactive (a = 0, white), and examples of possible activations produced in units of other layers are shown (darker for greater activation). Learning occurs through adjusting the weights to reduce the difference between the output of the network and a target output (upper right) [10,16]. In the case shown, the output activations are similar to the target, but there is some error to drive learning. There are no targets for internal or hidden layers (i.e., layers 2 and 3). These patterns depend on the connection weights, which in turn are shaped by the error-driven learning process.
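The processing step described in the caption (net input from weighted activations plus a bias, followed by a non-linear activation) can be sketched as follows. The logistic function, the 3-2-1 layer sizes, and the hand-picked weights are assumptions made for illustration; the caption says only that "one such function" is shown. The sketch computes the output error that would drive learning but does not implement the weight update itself.

```python
import math

def logistic(n):
    """One common choice of non-linear activation function (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-n))

def layer_forward(prev_activations, weights, biases):
    """weights[i][j] connects unit j of the previous layer to unit i of this layer."""
    activations = []
    for w_row, b in zip(weights, biases):
        net = sum(a * w for a, w in zip(prev_activations, w_row)) + b  # net input n
        activations.append(logistic(net))                              # activation a
    return activations

def forward(input_pattern, layers):
    """Propagate activation from the input layer through each successive layer."""
    a = input_pattern
    for weights, biases in layers:
        a = layer_forward(a, weights, biases)
    return a

# Hypothetical 3-2-1 network with hand-picked weights and zero biases.
layers = [
    ([[0.5, -0.5, 1.0], [1.0, 1.0, -1.0]], [0.0, 0.0]),  # input -> hidden
    ([[2.0, -2.0]], [0.0]),                               # hidden -> output
]

output = forward([1.0, 0.0, 1.0], layers)   # binary input pattern (active = 1)
target = [1.0]
error = [t - o for t, o in zip(target, output)]  # the signal that drives learning
print("output:", output, "error:", error)
```

Only the output layer has a target. The hidden activations are whatever the current weights produce, which is why the caption notes that hidden patterns are shaped indirectly by error-driven learning.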
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
Box 2. Functional Roles of Subregions of the Medial Temporal Lobes

Work within the CLS framework [27,116,141] relies on the anatomical and physiological properties of MTL subregions, and the computational insights of others [9,25,26], to characterize the computations performed within these structures.

Entorhinal Cortex (ERC): Input to the Hippocampal System
During an experience, inputs from neocortex produce a pattern of activation in the ERC that may be thought of as a compressed description of the patterns in the contributing cortical areas (Figure I; illustrative active neurons in the ERC are shown in blue). ERC neurons give rise to projections to three subregions of the hippocampus proper: the dentate gyrus (DG), CA1, and CA3 [28,84].

Pattern selection and pattern separation: novel ERC patterns are thought to activate a small set of previously uncommitted DG neurons (shown in red; these neurons may be relatively young neurons created by neurogenesis). These neurons in turn select a random subset of neurons in CA3 via large 'detonator synapses' (shown as red dots on the projection from DG to CA3) to serve as the representation of the memory in CA3, ensuring that the new CA3 pattern is as distinct as possible from the CA3 patterns for other memories, including those for experiences similar to the new experience (Boxes 3 and 4).

Pattern completion: recurrent connections from the active CA3 neurons onto other active CA3 neurons are strengthened during the experience, such that if a subset of the same neurons later becomes active, the rest of the pattern will be reactivated. Direct connections from ERC to CA3 are also strengthened, allowing the ERC input to directly activate the pattern in CA3 during retrieval without requiring DG involvement (Box 3).

Pattern reinstatement in ERC and neocortex [116,141]: the connections from ERC to CA1 and back are thought to change relatively slowly, to allow stable correspondence between patterns in CA1 and ERC. Strengthening of connections from the active CA3 neurons to the active CA1 neurons during memory encoding allows this CA1 pattern to be reactivated when the corresponding CA3 pattern is reactivated; the stable connections from CA1 to ERC then allow the appropriate pattern there to be reactivated, and stable connections between ERC and neocortical areas propagate the reactivated ERC pattern to the neocortex. Importantly, the bidirectional projections between CA1 and ERC, and between ERC and neocortex, support the formation and decoding of invertible CA1 representations of ERC and neocortical patterns, and allow recurrent computations. These connections should not change rapidly given the extended role of the hippocampus in memory; otherwise reinstatement in the neocortex of memories stored in the hippocampus would be difficult [61].
[Figure I graphic: schematic of the hippocampal circuit with labeled regions Neocortex, ERC, DG, CA3, and CA1.]

Figure I. Hippocampal Subregions, Connectivity, and Representation. Schematic depictions of neurons (with circular or triangular cell bodies) are shown, along with schematic depictions of projections from neurons in an area to neurons in the same or other areas (grey or colored lines; red coloring indicates projections with highly plastic synapses, while grey coloring illustrates relatively less-plastic or stable projections). CA1 output to ERC then propagates out to neocortex. ERC and even resulting neocortical activity can be fed back into the hippocampus (broken line), as proposed in the REMERGE model (see below).
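The pattern completion idea in Box 2 (recurrent connections among co-active CA3 neurons are strengthened, so that a later partial cue reactivates the rest of the pattern) can be sketched with a toy autoassociative network. The binary units, the count-based Hebbian rule, the fixed threshold, and the two stored patterns are all illustrative assumptions, not a model taken from the article.

```python
def store(patterns, n_units):
    """Hebbian strengthening of recurrent connections between co-active units."""
    w = [[0] * n_units for _ in range(n_units)]
    for p in patterns:
        active = [i for i, v in enumerate(p) if v]
        for i in active:
            for j in active:
                if i != j:
                    w[i][j] += 1
    return w

def complete(cue, w, threshold, steps=3):
    """Units in the cue stay on; other units turn on if recurrent drive reaches threshold."""
    state = list(cue)
    n = len(state)
    for _ in range(steps):
        drive = [sum(w[i][j] * state[j] for j in range(n)) for i in range(n)]
        state = [1 if (d >= threshold or c) else 0 for d, c in zip(drive, cue)]
    return state

# Two hypothetical, non-overlapping sparse memories over eight units.
memory_a = [1, 1, 1, 0, 0, 0, 0, 0]
memory_b = [0, 0, 0, 0, 1, 1, 1, 0]
weights = store([memory_a, memory_b], n_units=8)

partial_cue = [1, 1, 0, 0, 0, 0, 0, 0]            # a fragment of memory_a
print(complete(partial_cue, weights, threshold=2))  # fills in the missing unit
```

Because the two stored patterns share no units, a fragment of one never drives units of the other. Once stored patterns overlap, as similar experiences would, completion becomes error-prone in a network like this, which is the problem that separation of the CA3 codes is meant to reduce.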
hippocampal representation formed in learning an event affords a way of allowing gradual integration of knowledge of the event into neocortical knowledge structures. This can occur if the hippocampal representation can reactivate or replay the contents of the new experience back to the neocortex, interleaved with replay and/or ongoing exposure to other experiences [1]. In this way, the new experience becomes part of the database of experiences that govern the values of the connections in the neocortical learning system [51–53]. Which other memories are selected for interleaving with the new experience remains an open question. Most simply, the hippocampus might replay recent novel experiences interleaved with all other recent experiences still stored in
Box 3. Pattern Separation and Completion in Different Subregions of the Hippocampus

Pattern separation and completion [25–27] are defined in terms of transformations that affect the overlap or similarity among patterns of neural activity [28,142]. Pattern separation makes similar patterns more distinct through conjunctive coding [9,25], in which each output neuron responds only to a specific combination of active input neurons. Figures IA and IB illustrate how this can occur. Pattern separation is thought to be implemented in DG (see Box 4) using higher-order conjunctions that reduce overlap even more than illustrated in the figure.

Pattern completion is a process that takes a fragment of a pattern and fills in the remaining features (as in recalling a lion upon seeing the scene where the lion previously appeared), or that takes a pattern similar to a familiar pattern and makes it even more similar to it. Computational simulations [27] have shown how the CA3 region might combine features of pattern separation and completion, such that moderate and high overlap results in pattern completion toward the stored memory, but less overlap results in the creation of a new memory [37,133,143] (Figure IC). In this account, when environmental input produces a pattern in ERC similar to a previous pattern, the CA3 outputs a pattern closer to the one it previously used for this ERC pattern [124,144]. However, when the environment produces an input on the ERC that has low overlap with patterns stored previously, the DG recruits a new, statistically independent cell population in CA3 (i.e., pattern separation [27]). Emerging evidence suggests that the amount of overlap required for pattern completion (as well as other characteristics of hippocampal processing) may differ across the proximal–distal [145,146] and dorso–ventral axes [98,147–150] of the hippocampus, and may be shaped by neuromodulatory factors (e.g., acetylcholine) [85,151]. Also, incomplete patterns require less overlap with a stored pattern than distorted ones for completion to occur, so that partial cues will tend to produce completion, as when one sees the watering hole and remembers seeing a lion there previously [27].

Several studies point to differences between the CA3 and CA1 regions in how their neural activity patterns respond to changes to the environment [37]: broadly, the CA1 region tends to mirror the degree of overlap in the inputs from the ERC, while CA3 shows more discontinuous responses, reflecting either pattern separation or completion [134,152].
[Figure I graphic: (A) a conjunctive-coding network shown twice with different input patterns; (B) pattern separation in DG and (C) separation and completion in CA3, each plotting output overlap against input overlap on axes running from 0 to 1.]

Figure I. Conjunctive Coding, Pattern Separation, and Pattern Completion. (A) A set of 10 conjunctive units with connections from a layer of 5 input units is shown twice with different input patterns. Here each conjunctive unit detects activity in a distinct pair of input units (arrows). The output for each pattern is sparser than the input (i.e., 30% vs 60%, respectively), and the two outputs overlap less than the two corresponding inputs (i.e., 33% vs 67%, respectively; overlap is the number of active units shared by two patterns divided by the number of units active in each). DG may use higher-order conjunctions, magnifying these effects. (B) An illustration of the general form of a pattern separation function, showing the relationship between input and output overlap. Arrows indicate the overlap of the inputs and outputs shown in the left panel. (C) The separation-and-completion profile associated with CA3, where low levels of input overlap are reduced further while higher levels are increased [27,37].
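The arithmetic in panel (A) can be checked with a short sketch. The two input patterns below are hypothetical: the figure's actual patterns are not recoverable from the text, so they were chosen only to have three of five units active and to share two active units, matching the percentages quoted in the caption.

```python
from itertools import combinations

def conjunctive_code(input_pattern):
    """One output unit per pair of input units; it fires only if both inputs are active."""
    n = len(input_pattern)
    return [1 if input_pattern[i] and input_pattern[j] else 0
            for i, j in combinations(range(n), 2)]

def sparsity(pattern):
    """Proportion of units that are active."""
    return sum(pattern) / len(pattern)

def overlap(p, q):
    """Shared active units divided by the number active in each pattern
    (assumes p and q have the same number of active units)."""
    shared = sum(1 for a, b in zip(p, q) if a and b)
    return shared / sum(p)

# Hypothetical input patterns over 5 units, 3 active each, 2 shared.
in_a = [1, 1, 1, 0, 0]
in_b = [0, 1, 1, 1, 0]
out_a, out_b = conjunctive_code(in_a), conjunctive_code(in_b)

print("units:", len(in_a), "->", len(out_a))
print("sparsity:", sparsity(in_a), "->", sparsity(out_a))
print("overlap: ", round(overlap(in_a, in_b), 2), "->", round(overlap(out_a, out_b), 2))
```

With 5 inputs there are 10 distinct pairs, hence 10 conjunctive units. Three active inputs activate 3 of those pairs (30%), and two shared inputs leave only one shared pair, so overlap falls from 2/3 to 1/3.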
complementary properties of each of the two component systems, allowing new information to be rapidly stored in the hippocampus and then slowly integrated into neocortical representations. This process, sometimes labeled 'systems level consolidation' [51], arises within the theory from gradual cortical learning driven by replay of the new information, interleaved with other activity to minimize disruption of existing knowledge during the integration of the new information.

Empirical Evidence of Replay
Because of its centrality in the theory, we highlight key empirical evidence that replay events really do occur. The data come primarily from rodents recorded during periods of inactivity (including sleep), in which hippocampal neurons exhibit large irregular activity (LIA) patterns that are distinct from the activity patterns observed during active states [23]. During LIA states, synchronous discharges, thought to be initiated in hippocampal area CA3, produce sharp-wave ripples (SWRs), which are propagated to neocortex. SWRs reflect the reactivation of recent experiences, expressed as the sequential firing of so-called place cells, cells that fire when the animal is at a specific location [23,57–59]. These replay events appear to be time-compressed by a factor of about 20, bringing neuronal spikes that were well-separated in time during an actual experience into a time-window that enhances synaptic plasticity both
Box 4. Sparse Conjunctive Coding and Pattern Separation in the Dentate Gyrus

Neuronal codes range from the extreme of localist codes, where neurons respond highly selectively to single entities ('grandmother cells'), to dense distributed codes, where items are coded through the activity of many (e.g., 50%) neurons in an area [153,154]. While localist codes minimize interference and are easily decodable, they are inefficient in terms of representational capacity. By contrast, dense distributed codes are capacity-efficient; however, they are metabolically costly and relatively difficult to decode. These are endpoints on a continuum quantified by a measure called sparsity, where 'population' sparsity indexes the proportion of neurons that fire in response to a given stimulus/location and 'lifetime' sparsity indexes the proportion of stimuli to which a single neuron responds [26,153,155]. For example, a population sparsity of 1% means that only 1% of the neurons in a population are active in representing a given input. Two randomly selected sparse patterns tend to have low overlap (for two randomly selected patterns of equal sparsity over the same set of neurons, the average proportion of neurons in either pattern that is active in the other is equal to the sparsity), but neurons still participate in several different memories, making them more efficient than localist codes. Despite variability in estimates of the sparsity of a given brain region [27,153,156,157], the DG is widely believed to sustain one of the sparsest neural codes in the brain (0.5–1% population sparseness) [25–27]. The CA3 region, to which the DG projects, is thought to be less sparse (2.5% [47]). Many studies find less-sparse patterns in CA1 than CA3 [134,152].

The unique functional and anatomical properties of the DG suggest the origins of its sparse, pattern-separated code. The perforant path from the ERC (containing 200 000 neurons in the rodent) projects to a layer of 1 million DG granule cells. Combined with the high levels of inhibition in the DG, this supports the formation of highly sparse conjunctive representations, such that each neuron in DG responds only when several input neurons are simultaneously active, reducing overlap between similar input patterns [25–27,136]. Evidence also suggests that new DG neurons arise from stem cells throughout adult life; these new neurons may be preferentially recruited in the formation of memories [136], further reducing overlap with previously stored memories. The CA3 pattern for a memory is then selected by the active DG neurons, each of which has a 'detonator' synapse to 15 randomly selected CA3 neurons. This process helps minimize the overlap of CA3 patterns for different memories, increasing storage capacity and minimizing interference between them, even if the two memories represent similar events that have highly overlapping patterns in neocortex and ERC. Empirical evidence provides support for this, with one study [137] showing that the representation supported by DG was highly sensitive to small changes in the environment, despite evidence that incoming inputs from the ERC were little affected (also see [133,145]). Furthermore, DG lesions impair an animal's ability to learn to respond differently in two very similar environments, while leaving intact the ability to learn to respond differently in two environments that are not similar [136].
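The statement in Box 4 that two random patterns of equal sparsity overlap, on average, by a proportion equal to the sparsity can be checked exactly for small populations. The sketch below assumes every pattern has exactly k of N units active and that the two patterns are drawn independently. The population sizes are toy values chosen so that full enumeration is feasible.

```python
from fractions import Fraction
from itertools import combinations

def mean_overlap(n_units, n_active):
    """Average, over all ordered pairs of patterns with n_active of n_units on,
    of the proportion of one pattern's active units that are active in the other."""
    patterns = [set(c) for c in combinations(range(n_units), n_active)]
    total = Fraction(0)
    for p in patterns:
        for q in patterns:          # independent draws, so p == q is allowed
            total += Fraction(len(p & q), n_active)
    return total / (len(patterns) ** 2)

for n_units, n_active in [(6, 2), (10, 1), (10, 3)]:
    sparsity = Fraction(n_active, n_units)
    print(n_units, n_active, "mean overlap:", mean_overlap(n_units, n_active),
          "sparsity:", sparsity)
```

The result follows from a one-line argument: each of the k active units in one pattern is active in the other with probability k/N, so the expected shared count is k²/N and the expected proportion is k/N. At a sparsity of 1%, two random patterns therefore share on average about 1% of their active units.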
Box 5. Similarity-Based Coding in High-Level Visual Cortex

High-level visual regions of the neocortex are thought to support distributed representations that are inferred to be less sparse than those of the DG and the CA3/CA1 regions of the hippocampus (Box 4). Population sparseness in the ERC is estimated at 7–10% [158], with high-level sensory cortices exhibiting similar or higher levels of sparseness (e.g., variable estimates [44–46]). Although lifetime sparseness does not directly translate to population sparseness, recent evidence suggests that V4 and inferotemporal cortex (ITc) have a sparseness of 10% on this measure [159]. It is worth noting that learning rates may vary according to neuronal selectivity and lifetime sparseness, resulting in differences in learning rates across neocortical areas and hippocampal subregions. Neurons in early visual regions that encode frequently occurring features (i.e., edges) may have a relatively slow learning rate, while neurons in higher visual regions and beyond (e.g., ITc and perirhinal cortex) may have a higher learning rate to support the encoding of less-frequently occurring, more-conjunctive features (e.g., individual objects) [12,160,161].

Evidence from electrophysiological recording studies in high-level visual cortical regions such as the ITc in primates provides support for the operation of a similarity-based coding scheme, whereby related categories (e.g., dogs and cats) are represented by overlapping neuronal codes [17,40–43] (Figure I). Representational similarity analysis (RSA) of the ITc population response during passive viewing of pictures reveals coding of fine-grained categorical structure (e.g., of a set of animate and inanimate objects) that is well fit by deep convolutional neural networks, which have algorithmic parallels with feedforward processing in the ventral visual stream [17,40]. While analogous similarity-based coding was observed using fMRI in the human homolog of ITc [41], there was no evidence for greater within-category (cf. between-category) representational similarity in any subregion of the hippocampus in a recent fMRI study [162], which found evidence consistent with the importance of pattern separation in episodic memory. Instead, similarity-based coding in this study was observed in the perirhinal and parahippocampal cortex, MTL regions that project to the ERC and that are typically considered to be intermediate zones (i.e., between the hippocampal and neocortical systems) in CLS theory.
[Figure I graphic: two representational dissimilarity matrices, Monkey ITc and Human ITc, whose rows and columns carry the category labels Animate, Inanimate, Human, Not human, Body, Face, Natural, and Artificial; color scale: dissimilarity (percentile of 1 − r), 0 to 100.]

Figure I. Similarity-Based Coding in High-Level Visual Cortex. Representational dissimilarity matrices (RDM) reflect the correlation (i.e., 1 − r, where r is the Pearson correlation coefficient) between the response of voxel patterns (fMRI in humans [41], right panel) or neuronal populations (electrophysiological recording in monkey [43], left panel) to a set of 92 object images. RDMs are analogous in monkey and human ITc. The RDMs show that the representations of animate objects are similar, as are those of inanimate objects. In addition to this clear animate–inanimate distinction, object coding in ITc exhibits finer categorical structure (e.g., for faces, body parts) visible in these RDMs (also see [41]). Reproduced, with permission, from [41].
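The dissimilarity measure named in the caption, 1 − r with r the Pearson correlation between two response patterns, can be computed as in the following sketch. The three response vectors are made-up numbers standing in for population or voxel responses to three stimuli. They are not data from the cited studies.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length response patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rdm(responses):
    """responses[i] is the pattern (across neurons or voxels) evoked by stimulus i.
    Entry [i][j] of the result is the dissimilarity 1 - r between stimuli i and j."""
    n = len(responses)
    return [[1.0 - pearson_r(responses[i], responses[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical responses of four units to three stimuli.
stimuli = ["dog", "cat", "chair"]
responses = [
    [1.0, 0.8, 0.1, 0.0],   # dog
    [0.9, 1.0, 0.0, 0.2],   # cat
    [0.0, 0.1, 1.0, 0.9],   # chair
]

matrix = rdm(responses)
for name, row in zip(stimuli, matrix):
    print(name.ljust(6), [round(d, 2) for d in row])
```

In this toy matrix the two animate stimuli have low mutual dissimilarity and both are highly dissimilar to the inanimate one, the block structure that the figure shows across 92 images. The published figure additionally rank-transforms the values to percentiles, which this sketch omits.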
rodents [72,73]. This generalized replay (simultaneous reactivation of multiple related traces during testing or offline periods) may facilitate the creation of new representations from the recombination of multiple related episodes ('stored generalizations') [5] and the discovery of novel relationships (e.g., shortcuts) [72,73]. Empirical evidence also supports a role for the hippocampus in category- and so-called 'statistical' learning [105–107]: the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6. Generalization Through Recurrence in the Hippocampal System

The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers: feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed, pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).

An essential principle of the model, mediated by the bidirectional excitatory connections between feature and conjunctive layers, is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed 'big-loop' recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time, thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism (see [5] for details). Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I graphic: a conjunctive layer with units AB, BC, CD, DE, and EF above a feature layer with units A, B, C, D, E, and F.]

Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE, showing its two-layer architecture with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
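The recurrent loop described in Box 6 can be illustrated with a loose sketch. It is not the published REMERGE model: the softmax competition (standing in for recurrent inhibition among conjunctive units), the temperature, the additive feature update, and the set of stored pairs are all assumptions made here. The stored pairs form two separate triads (AB and BC; DE and EF) rather than the single A to F chain of the figure, so that probing with A and C has an unambiguous answer.

```python
import math

FEATURES = "ABCDEF"
CONJUNCTS = ["AB", "BC", "DE", "EF"]   # hypothetical stored pair episodes

def remerge_like(external, steps=10, temperature=0.5):
    """Recirculate activity between a feature layer and a conjunctive layer."""
    feat = {f: external.get(f, 0.0) for f in FEATURES}
    conj = {c: 0.0 for c in CONJUNCTS}
    for _ in range(steps):
        # Each conjunctive unit is driven by the features it contains.
        net = {c: sum(feat[f] for f in c) for c in CONJUNCTS}
        # Competition among conjunctive units (softmax is an assumption).
        z = sum(math.exp(v / temperature) for v in net.values())
        conj = {c: math.exp(net[c] / temperature) / z for c in CONJUNCTS}
        # Features receive external input plus feedback from active conjuncts.
        feat = {f: external.get(f, 0.0) + sum(a for c, a in conj.items() if f in c)
                for f in FEATURES}
    return feat, conj

features, conjuncts = remerge_like({"A": 1.0, "C": 1.0})
print({f: round(a, 2) for f, a in features.items()})
print({c: round(a, 2) for c, a in conjuncts.items()})
```

Although only A and C are presented, the feedback loop drives B towards high activation via AB and BC, while D, E, and F stay near zero. This mirrors the inferred A to C link described in the box.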
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus.

The distinction between encoding- and retrieval-based models can be related more broadly to the finding of 'concept' cells: hippocampal neurons which come to respond to common features across many events, for example cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.

Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [34,51–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7. Concept Cells and Nodal Codings

Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG and also in the subiculum tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant 'concept cell like' responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).

It is also interesting to consider the finding of 'splitter' cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent 'charts' [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.

In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions (reward value and item) were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8. Rapid Integration of New Learning in the Neocortex: When Does it Occur?

In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.

A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.

Work with the 'fast mapping' paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects, one familiar and one unfamiliar, and are asked a question such as 'is the numbat's tail pointing up?', inferring that the unfamiliar name 'numbat' must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I graphic: (A) original paired associates, with numbered locations in the arena; (B) introduction of new paired associates, with added locations 7 and 8.]

Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor-place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid schema-dependent one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function and the ubiquity of continual learning is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
Box 9. Experience Replay in Deep Q-Networks

Instead of employing a standard online learning method in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer (see [119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.

Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best ~30% of that when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events – specifically, experiences that are associated with high reward prediction errors – yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
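The two sampling schemes contrasted in this box can be illustrated with a minimal Python sketch. It is not the DQN implementation of [119] or the prioritized method of [183]: the class name `ReplayBuffer`, its methods, and the parameters `alpha` and `eps` are hypothetical, and error-proportional sampling is assumed here as one simple way of biasing replay towards surprising transitions. Weight updates and any corrections used in the published methods are omitted.

```python
import random
from collections import deque, namedtuple

# One unit of experience, as described in the box text.
Transition = namedtuple("Transition", ["state", "action", "next_state", "reward"])


class ReplayBuffer:
    """Fixed-capacity store of recent transitions (hypothetical helper)."""

    def __init__(self, capacity, seed=None):
        self.memory = deque(maxlen=capacity)  # oldest items are dropped first
        self.rng = random.Random(seed)

    def add(self, state, action, next_state, reward):
        self.memory.append(Transition(state, action, next_state, reward))

    def sample_uniform(self, batch_size):
        """Uniform sampling: every stored transition is equally likely."""
        return self.rng.sample(list(self.memory), batch_size)

    def sample_prioritized(self, batch_size, errors, alpha=1.0, eps=1e-3):
        """Sample with probability proportional to (|error| + eps) ** alpha.

        `errors` holds one prediction-error estimate per stored transition
        (an assumed input). Sampling is with replacement, so rare
        high-error transitions can be drawn repeatedly.
        """
        weights = [(abs(e) + eps) ** alpha for e in errors]
        return self.rng.choices(list(self.memory), weights=weights, k=batch_size)
```

In use, an agent would call `add` after every step of game-play and draw a batch before each learning update, so that updates are interleaved over non-adjacent experiences rather than applied to consecutive, strongly correlated samples.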
Box 10. Neural Networks with External Memory and the Hippocampus

The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it and then to be read out in a content- or location-based addressable fashion [184].

The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).

While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
Figure I. NTM and the Paired Associative Recall Task. Schematic components: input (Xt), output (Yt), controller, write heads, read heads, and external memory. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted with permission from [125].
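The content-based addressing described in this box can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the NTM of [125]: cosine similarity and a softmax are assumed as the similarity measure and normalization, and the function names and the sharpening parameter `beta` are hypothetical. A read head forms graded weights over memory slots from their similarity to the key, then returns a weighted sum of the slots, so every step stays continuous and could in principle be trained with gradient signals.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def content_weights(memory, key, beta=5.0):
    """Softmax over scaled cosine similarity between `key` and each memory row."""
    scores = [beta * cosine(row, key) for row in memory]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Weighted sum of memory rows: a soft, graded read-out."""
    width = len(memory[0])
    return [sum(w * row[i] for w, row in zip(weights, memory)) for i in range(width)]
```

Larger values of `beta` concentrate the weights on the best-matching slot, approaching a hard look-up; smaller values blend several similar slots.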
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus – recurrent similarity computation, such that the output of the episodic system is recirculated as a new input – has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk?’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
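The iterated retrieval just described can be illustrated with a toy Python sketch. It is not the memory network of [127] or the REMERGE model: word-count vectors stand in for the learned dense feature vectors (an assumption made so that the example runs without training), the function names are hypothetical, and the final step of emitting an answer is omitted. The sketch only shows how the output of one retrieval is recirculated as part of the next probe.

```python
import math
from collections import Counter


def embed(sentence):
    """Bag-of-words counts; a stand-in for learned dense feature vectors."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def retrieve_chain(story, query, hops=2):
    """Return indices of supporting sentences found by iterated retrieval."""
    probe = embed(query)
    candidates = list(range(len(story)))
    chain = []
    for _ in range(hops):
        if not candidates:
            break
        best = max(candidates, key=lambda i: cosine(probe, embed(story[i])))
        chain.append(best)
        probe = probe + embed(story[best])  # recirculate the output as new input
        candidates = [i for i in candidates if i < best]  # look earlier in the story
    return chain
```

With the story `["joe traveled to the office", "mary went to the garden", "joe left the milk"]` and the query `"where is the milk"`, the first hop selects the sentence about the milk, and the combined probe then favors the earlier sentence that shares the word `joe`.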
Concluding Remarks

We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se – instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments

We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References

1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O’Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions

Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the ‘index maintenance’ problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc Lond B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O’Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O’Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O’Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O’Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–2014
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff’s syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66. Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67. Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104. Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) ‘Simple memory: a theory of archicortex’. Philos Trans R Soc B Biol Sci 370, 20140383
105. Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106. Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109. Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110. Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111. Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113. McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116. McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118. Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119. Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121. Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122. Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123. Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124. Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129. Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165. Hintzman DL (1986) ‘Schema abstraction’ in a multiple-trace memory model. Psychol Rev 93, 411–428
166. Suthana NA et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167. Wood ER et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower MR et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170. MacDonald CJ et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171. Markus EJ et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172. Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173. Kriegeskorte N et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174. Komorowski RW et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175. Ellenbogen JM et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176. Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177. Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178. Sharon T et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179. Merhav M et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith CN et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181. Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve A et al. (2014) No evidence that ‘fast-mapping’ benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul T et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel CR (1990) The Organization of Learning, MIT Press
185. Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186. Santoro A et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
Box 2. Functional Roles of Subregions of the Medial Temporal Lobes

Work within the CLS framework [27,116,141] relies on the anatomical and physiological properties of MTL subregions and the computational insights of others [9,25,26] to characterize the computations performed within these structures.

Entorhinal Cortex (ERC): Input to the Hippocampal System. During an experience, inputs from neocortex produce a pattern of activation in the ERC that may be thought of as a compressed description of the patterns in the contributing cortical areas (Figure I; illustrative active neurons in the ERC are shown in blue). ERC neurons give rise to projections to three subregions of the hippocampus proper: the dentate gyrus (DG), CA1, and CA3 [28,84].

Pattern selection and pattern separation: novel ERC patterns are thought to activate a small set of previously uncommitted DG neurons (shown in red – these neurons may be relatively young neurons created by neurogenesis). These neurons in turn select a random subset of neurons in CA3 via large ‘detonator synapses’ (shown as red dots on the projection from DG to CA3) to serve as the representation of the memory in CA3, ensuring that the new CA3 pattern is as distinct as possible from the CA3 patterns for other memories, including those for experiences similar to the new experience (Boxes 3 and 4).

Pattern completion: recurrent connections from the active CA3 neurons onto other active CA3 neurons are strengthened during the experience, such that if a subset of the same neurons later becomes active, the rest of the pattern will be reactivated. Direct connections from ERC to CA3 are also strengthened, allowing the ERC input to directly activate the pattern in CA3 during retrieval without requiring DG involvement (Box 3).

Pattern reinstatement in ERC and neocortex [116,141]: the connections from ERC to CA1 and back are thought to change relatively slowly, to allow stable correspondence between patterns in CA1 and ERC. Strengthening of connections from the active CA3 neurons to the active CA1 neurons during memory encoding allows this CA1 pattern to be reactivated when the corresponding CA3 pattern is reactivated; the stable connections from CA1 to ERC then allow the appropriate pattern there to be reactivated, and stable connections between ERC and neocortical areas propagate the reactivated ERC pattern to the neocortex. Importantly, the bidirectional projections between CA1 and ERC, and between ERC and neocortex, support the formation and decoding of invertible CA1 representations of ERC and neocortical patterns and allow recurrent computations. These connections should not change rapidly given the extended role of the hippocampus in memory – otherwise, reinstatement in the neocortex of memories stored in the hippocampus would be difficult [61].
[Figure I: schematic with areas labeled CA3, CA1, DG, ERC, and Neocortex]
Figure I. Hippocampal Subregions, Connectivity, and Representation. Schematic depictions of neurons (with circular or triangular cell bodies) are shown along with schematic depictions of projections from neurons in an area to neurons in the same or other areas (grey or colored lines; red coloring indicates projections with highly-plastic synapses, while grey coloring illustrates relatively less-plastic or stable projections). CA1 output to ERC then propagates out to neocortex. ERC and even resulting neocortical activity can be fed back into the hippocampus (broken line), as proposed in the REMERGE model (see below).
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7 517
hippocampal representation formed in learning an event affords a way of allowing gradual integration of knowledge of the event into neocortical knowledge structures. This can occur if the hippocampal representation can reactivate or replay the contents of the new experience back to the neocortex, interleaved with replay and/or ongoing exposure to other experiences [1]. In this way, the new experience becomes part of the database of experiences that govern the values of the connections in the neocortical learning system [51–53]. Which other memories are selected for interleaving with the new experience remains an open question. Most simply, the hippocampus might replay recent novel experiences interleaved with all other recent experiences still stored in
Box 3. Pattern Separation and Completion in Different Subregions of the Hippocampus
Pattern separation and completion [25–27] are defined in terms of transformations that affect the overlap or similarity among patterns of neural activity [28,142]. Pattern separation makes similar patterns more distinct through conjunctive coding [9,25], in which each output neuron responds only to a specific combination of active input neurons. Figures IA and IB illustrate how this can occur. Pattern separation is thought to be implemented in DG (see Box 4) using higher-order conjunctions that reduce overlap even more than illustrated in the figure.
Pattern completion is a process that takes a fragment of a pattern and fills in the remaining features (as in recalling a lion upon seeing the scene where the lion previously appeared), or that takes a pattern similar to a familiar pattern and makes it even more similar to it. Computational simulations [27] have shown how the CA3 region might combine features of pattern separation and completion, such that moderate and high overlap results in pattern completion toward the stored memory but less overlap results in the creation of a new memory [37,133,143] (Figure IC). In this account, when environmental input produces a pattern in ERC similar to a previous pattern, the CA3 outputs a pattern closer to the one it previously used for this ERC pattern [124,144]. However, when the environment produces an input on the ERC that has low overlap with patterns stored previously, the DG recruits a new, statistically independent cell population in CA3 (i.e., pattern separation [27]). Emerging evidence suggests that the amount of overlap required for pattern completion (as well as other characteristics of hippocampal processing) may differ across the proximal–distal [145,146] and dorso–ventral axes [98,147–150] of the hippocampus, and may be shaped by neuromodulatory factors (e.g., acetylcholine) [85,151]. Also, incomplete patterns require less overlap with a stored pattern than distorted ones for completion to occur, so that partial cues will tend to produce completion, as when one sees the watering hole and remembers seeing a lion there previously [27].
Several studies point to differences between the CA3 and CA1 regions in how their neural activity patterns respond to changes to the environment [37]: broadly, the CA1 region tends to mirror the degree of overlap in the inputs from the ERC, while CA3 shows more discontinuous responses reflecting either pattern separation or completion [134,152].
[Figure I: panels (A) and (B), pattern separation in DG; panel (C), separation and completion in CA3. The plots in (B) and (C) show output overlap (0 to 1) against input overlap (0 to 1).]
Figure I. Conjunctive Coding, Pattern Separation, and Pattern Completion. (A) A set of 10 conjunctive units with connections from a layer of 5 input units is shown twice with different input patterns. Here each conjunctive unit detects activity in a distinct pair of input units (arrows). The output for each pattern is sparser than the input (i.e., 30% vs 60%, respectively) and the two outputs overlap less than the two corresponding inputs (i.e., 33% vs 67%, respectively; overlap is the number of active units shared by two patterns divided by the number of units active in each). DG may use higher-order conjunctions, magnifying these effects. (B) An illustration of the general form of a pattern separation function, showing the relationship between input and output overlap. Arrows indicate the overlap of the inputs and outputs shown in the left panel. (C) The separation-and-completion profile associated with CA3, where low levels of input overlap are reduced further while higher levels are increased [27,37].
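The arithmetic in panel (A) can be checked with a minimal sketch. The caption does not say which input units are active, so the two input patterns below are assumptions chosen to give 3 of 5 active inputs with 2 shared; the function names are illustrative only.

```python
from itertools import combinations

N_INPUTS = 5
# One conjunctive unit per distinct pair of input units: C(5, 2) = 10 units.
PAIRS = list(combinations(range(N_INPUTS), 2))


def conjunctive_code(active_inputs):
    """Return the conjunctive units whose two input units are both active."""
    active = set(active_inputs)
    return {pair for pair in PAIRS if pair[0] in active and pair[1] in active}


def sparsity(active_units, n_units):
    """Proportion of units in a layer that are active."""
    return len(active_units) / n_units


def overlap(a, b):
    """Shared active units divided by the number active in each pattern.

    Assumes both patterns have the same number of active units.
    """
    return len(set(a) & set(b)) / len(a)


# Hypothetical input patterns: 3 of 5 units active, 2 units shared.
pattern_1 = {0, 1, 2}
pattern_2 = {1, 2, 3}
out_1 = conjunctive_code(pattern_1)
out_2 = conjunctive_code(pattern_2)

print("input sparsity :", sparsity(pattern_1, N_INPUTS))
print("output sparsity:", sparsity(out_1, len(PAIRS)))
print("input overlap  :", round(overlap(pattern_1, pattern_2), 2))
print("output overlap :", round(overlap(out_1, out_2), 2))
```

With these assumed patterns, the pairwise conjunctions give the 30% vs 60% sparsity and 33% vs 67% overlap quoted in the caption. Replacing pairs with triples in `combinations` would illustrate the higher-order conjunctions attributed to DG.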
complementary properties of each of the two component systems, allowing new information to be rapidly stored in the hippocampus and then slowly integrated into neocortical representations. This process, sometimes labeled 'systems-level consolidation' [51], arises within the theory from gradual cortical learning driven by replay of the new information, interleaved with other activity to minimize disruption of existing knowledge during the integration of the new information.
Empirical Evidence of Replay
Because of its centrality in the theory, we highlight key empirical evidence that replay events really do occur. The data come primarily from rodents recorded during periods of inactivity (including sleep), in which hippocampal neurons exhibit large irregular activity (LIA) patterns that are distinct from the activity patterns observed during active states [23]. During LIA states, synchronous discharges, thought to be initiated in hippocampal area CA3, produce sharp-wave ripples (SWRs), which are propagated to neocortex. SWRs reflect the reactivation of recent experiences, expressed as the sequential firing of so-called place cells, cells that fire when the animal is at a specific location [23,57–59]. These replay events appear to be time-compressed by a factor of about 20, bringing neuronal spikes that were well-separated in time during an actual experience into a time-window that enhances synaptic plasticity both
Box 4. Sparse Conjunctive Coding and Pattern Separation in the Dentate Gyrus
Neuronal codes range from the extreme of localist codes, where neurons respond highly selectively to single entities ('grandmother cells'), to dense distributed codes, where items are coded through the activity of many (e.g., 50%) neurons in an area [153,154]. While localist codes minimize interference and are easily decodable, they are inefficient in terms of representational capacity. By contrast, dense distributed codes are capacity-efficient; however, they are metabolically costly and relatively difficult to decode. These are endpoints on a continuum quantified by a measure called sparsity, where 'population' sparsity indexes the proportion of neurons that fire in response to a given stimulus/location and 'lifetime' sparsity indexes the proportion of stimuli to which a single neuron responds [26,153,155]. For example, a population sparsity of 1% means that only 1% of the neurons in a population are active in representing a given input. Two randomly selected sparse patterns tend to have low overlap (for two randomly selected patterns of equal sparsity over the same set of neurons, the average proportion of neurons in either pattern that is active in the other is equal to the sparsity), but neurons still participate in several different memories, making them more efficient than localist codes. Despite variability in estimates of the sparsity of a given brain region [27,153,156,157], the DG is widely believed to sustain among the sparsest neural codes in the brain (0.5–1% population sparseness) [25–27]. The CA3 region, to which the DG projects, is thought to be less sparse (2.5% [47]). Many studies find less-sparse patterns in CA1 than CA3 [134,152].
The unique functional and anatomical properties of the DG suggest the origins of its sparse, pattern-separated code. The perforant path from the ERC (containing 200 000 neurons in the rodent) projects to a layer of 1 million DG granule cells. Combined with the high levels of inhibition in the DG, this supports the formation of highly sparse conjunctive representations, such that each neuron in DG responds only when several input neurons are simultaneously active, reducing overlap between similar input patterns [25–27,136]. Evidence also suggests that new DG neurons arise from stem cells throughout adult life; these new neurons may be preferentially recruited in the formation of memories [136], further reducing overlap with previously stored memories. The CA3 pattern for a memory is then selected by the active DG neurons, each of which has a 'detonator' synapse to 15 randomly selected CA3 neurons. This process helps minimize the overlap of CA3 patterns for different memories, increasing storage capacity and minimizing interference between them, even if the two memories represent similar events that have highly overlapping patterns in neocortex and ERC. Empirical evidence provides support for this, with one study [137] showing that the representation supported by DG was highly sensitive to small changes in the environment despite evidence that incoming inputs from the ERC were little affected (also see [133,145]). Furthermore, DG lesions impair an animal's ability to learn to respond differently in two very similar environments while leaving intact the ability to learn to respond differently in two environments that are not similar [136].
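The claim that two random patterns of equal sparsity overlap, on average, by a proportion equal to the sparsity can be checked numerically. The following is a minimal sketch; the population size, number of sampled pairs, and function names are assumptions made for illustration.

```python
import random


def random_sparse_pattern(n_neurons, sparsity, rng):
    """Pick a random set of active neurons at the given population sparsity."""
    n_active = round(n_neurons * sparsity)
    return set(rng.sample(range(n_neurons), n_active))


def mean_overlap(n_neurons, sparsity, n_pairs, seed=0):
    """Average proportion of one pattern's active neurons also active in another."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        a = random_sparse_pattern(n_neurons, sparsity, rng)
        b = random_sparse_pattern(n_neurons, sparsity, rng)
        total += len(a & b) / len(a)
    return total / n_pairs


# Sparsity levels loosely inspired by the text: a DG-like, a CA3-like,
# and a dense distributed code.
for s in (0.01, 0.025, 0.5):
    print(f"sparsity {s:.3f} -> mean overlap {mean_overlap(10_000, s, 200):.4f}")
```

Each simulated mean should land close to the sparsity itself, which is why very sparse DG-like codes yield little overlap between unrelated memories.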
Box 5. Similarity-Based Coding in High-Level Visual Cortex
High-level visual regions of the neocortex are thought to support distributed representations that are inferred to be less sparse than those of the DG and the CA3/CA1 regions of the hippocampus (Box 4). Population sparseness in the ERC is estimated at 7–10% [158], with high-level sensory cortices exhibiting similar or higher levels of sparseness (e.g., variable estimates [44–46]). Although lifetime sparseness does not directly translate to population sparseness, recent evidence suggests that V4 and inferotemporal cortex (ITc) have a sparseness of 10% on this measure [159]. It is worth noting that learning rates may vary according to neuronal selectivity and lifetime sparseness, resulting in differences in learning rates across neocortical areas and hippocampal subregions. Neurons in early visual regions that encode frequently-occurring features (i.e., edges) may have a relatively slow learning rate, while neurons in higher visual regions and beyond (e.g., ITc and perirhinal cortex) may have a higher learning rate to support the encoding of less-frequently occurring, more-conjunctive features (e.g., individual objects) [12,160,161].
Evidence from electrophysiological recording studies in high-level visual cortical regions such as the ITc in primates provides support for the operation of a similarity-based coding scheme, whereby related categories (e.g., dogs and cats) are represented by overlapping neuronal codes [17,40–43] (Figure I). Representational similarity analysis (RSA) of the ITc population response during passive viewing of pictures reveals coding of fine-grained categorical structure (e.g., of a set of animate and inanimate objects) that is well fit by deep convolutional neural networks, which have algorithmic parallels with feedforward processing in the ventral visual stream [17,40]. While analogous similarity-based coding was observed using fMRI in the human homolog of ITc [41], there was no evidence for greater within-category (cf. between-category) representational similarity in any subregion of the hippocampus in a recent fMRI study [162], which found evidence consistent with the importance of pattern separation in episodic memory. Instead, similarity-based coding in this study was observed in the perirhinal and parahippocampal cortex: MTL regions that project to the ERC and that are typically considered to be intermediate zones (i.e., between the hippocampal and neocortical systems) in CLS theory.
[Figure I: two representational dissimilarity matrices, Monkey ITc and Human ITc. Row and column labels: Animate (Human, Not human; Body, Face) and Inanimate (Natural, Artificial). Color scale: dissimilarity, percentile of 1 − r, from 0 to 100.]
Figure I. Similarity-Based Coding in High-Level Visual Cortex. Representational dissimilarity matrices (RDMs) reflect the correlation (i.e., 1 − r, where r is the Pearson correlation coefficient) between the responses of voxel patterns (fMRI in humans [41], right panel) or neuronal populations (electrophysiological recording in monkey [43], left panel) to a set of 92 object images. RDMs are analogous in monkey and human ITc. The RDMs show that the representations of animate objects are similar, as are those of inanimate objects. In addition to this clear animate–inanimate distinction, object coding in ITc exhibits finer categorical structure (e.g., for faces, body parts) visible in these RDMs (also see [41]). Reproduced with permission from [41].
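The 1 − r measure behind an RDM is simple to compute. Below is a minimal sketch on made-up response vectors; the stimulus names and values are invented for illustration and are not data from [41] or [43].

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def dissimilarity_matrix(patterns):
    """Matrix of 1 - r between every pair of response patterns."""
    n = len(patterns)
    return [
        [0.0 if i == j else 1.0 - pearson_r(patterns[i], patterns[j])
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical population responses (4 units) to 4 stimuli.
responses = {
    "dog":    [0.9, 0.8, 0.1, 0.2],
    "cat":    [0.8, 0.9, 0.2, 0.1],
    "chair":  [0.1, 0.2, 0.9, 0.8],
    "hammer": [0.2, 0.1, 0.8, 0.9],
}
names = list(responses)
rdm = dissimilarity_matrix([responses[name] for name in names])

for name, row in zip(names, rdm):
    print(f"{name:>6}", [round(value, 2) for value in row])
```

In this toy example the two animate items are far less dissimilar to each other than to either inanimate item, which is the block structure the RDMs in the figure display at larger scale.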
rodents [72,73]. This generalized replay (simultaneous reactivation of multiple related traces during testing or offline periods) may facilitate the creation of new representations from the recombination of multiple related episodes ('stored generalizations') [5] and the discovery of novel relationships (e.g., shortcuts) [72,73]. Empirical evidence also supports a role for the hippocampus in category- and so-called 'statistical' learning [105–107]; the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6. Generalization Through Recurrence in the Hippocampal System
The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers: feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed, pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).
An essential principle of the model, mediated by the bidirectional excitatory connections between feature and conjunctive layers, is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed 'big-loop' recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time, thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism (see [5] for details). Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I: two-layer schematic. Conjunctive layer units: AB, BC, CD, DE, EF. Feature layer units: A, B, C, D, E, F.]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE showing its two-layer architecture, with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
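The big-loop recurrence described in Box 6 can be illustrated with a toy two-layer network. This is a simplified sketch and not the published REMERGE equations: the stored pairs, the softmax competition standing in for recurrent inhibition, the temperature, and the number of iterations are all assumptions made for illustration.

```python
from math import exp

FEATURES = ["A", "B", "C", "D", "E", "F"]
# Hypothetical stored episodes: two separate chains, A-B-C and D-E-F.
EPISODES = [("A", "B"), ("B", "C"), ("D", "E"), ("E", "F")]


def softmax(values, temperature):
    """Competitive normalization, used here in place of recurrent inhibition."""
    peak = max(values)
    exps = [exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def recurrent_retrieval(probe, n_steps=20, temperature=0.4):
    """Recirculate activity between feature and conjunctive layers."""
    external = {f: (1.0 if f in probe else 0.0) for f in FEATURES}
    feature_act = dict(external)
    conj_act = [0.0] * len(EPISODES)
    for _ in range(n_steps):
        # Each conjunctive unit sums the activity of its constituent features.
        net = [sum(feature_act[f] for f in episode) for episode in EPISODES]
        conj_act = softmax(net, temperature)
        # Feature activity = external probe + feedback from retrieved episodes.
        feature_act = {
            f: external[f]
            + sum(a for a, episode in zip(conj_act, EPISODES) if f in episode)
            for f in FEATURES
        }
    return feature_act, dict(zip(EPISODES, conj_act))


features, conjuncts = recurrent_retrieval(probe={"A", "C"})
for name in FEATURES:
    print(name, round(features[name], 3))
```

Probing with A and C, which never occurred together, activates the AB and BC units; their feedback activates B, which in turn reinforces AB and BC. B therefore ends up far more active than any feature from the unrelated D-E-F chain, a toy analogue of inferring the link between A and C.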
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of 'concept' cells: hippocampal neurons which come to respond to common features across many events, for example cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [34,51–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7. Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG, and also in the subiculum, tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant, 'concept cell like' responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).
It is also interesting to consider the finding of 'splitter' cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent 'charts' [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.
In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions (reward value and item) were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8. Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the 'fast mapping' paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects, one familiar and one unfamiliar, and are asked a question such as 'is the numbat's tail pointing up?', inferring that the unfamiliar name 'numbat' must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I: panel (A), original paired associates; panel (B), introduction of new paired associates. Numbers in the panels mark arena locations.]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of the 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid schema-dependent one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function, and the ubiquity of continual learning, is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
Box 9. Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer (see [119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best 30% of the level achieved when experience replay was present [119]. Note that the uniform sampling mechanism, as implemented, treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events, specifically experiences that are associated with high reward prediction errors, yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
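The two sampling schemes described in Box 9 can be sketched as a small buffer class. This is a minimal illustration and not the implementation used in [119] or [183]: the class and method names are hypothetical, priorities are passed in directly as absolute prediction errors, and refinements such as importance-sampling corrections are omitted.

```python
import random
from collections import deque, namedtuple

# One unit of experience, as described in the text.
Transition = namedtuple("Transition", "state action reward next_state")


class ReplayBuffer:
    """Fixed-capacity store of recent transitions (hypothetical helper class)."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest items are evicted first
        self.rng = random.Random(seed)

    def add(self, transition):
        self.buffer.append(transition)

    def sample_uniform(self, batch_size):
        """Uniform sampling: every stored transition is equally likely."""
        return self.rng.sample(list(self.buffer), batch_size)

    def sample_prioritized(self, batch_size, priorities, eps=1e-3):
        """Sample in proportion to |prediction error| (with replacement).

        `priorities` is assumed to hold one error value per stored transition.
        """
        weights = [abs(p) + eps for p in priorities]
        return self.rng.choices(list(self.buffer), weights=weights, k=batch_size)


buffer = ReplayBuffer(capacity=100)
for t in range(150):
    buffer.add(Transition(state=t, action=0, reward=0.0, next_state=t + 1))

batch = buffer.sample_uniform(8)
print(sorted(tr.state for tr in batch))  # mostly non-adjacent time-steps
```

Drawing mini-batches at random from the buffer, rather than using consecutive transitions, is what breaks up the temporal correlations discussed above; the prioritized variant biases draws towards transitions with large prediction errors.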
Box 10. Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., 'key') emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead it is selected based on an estimate of potential future relevance (see section 'Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment').
Figure I. NTM and the Paired Associative Recall Task. [Diagram labels: Input (Xt), Output (Yt), Controller, Write heads, External memory, Read heads.] The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
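Content-based addressing, as described in this box, can be sketched in a few lines. The sketch assumes cosine similarity between the controller's key and each memory slot, followed by a softmax that turns similarities into graded attention weights; the function names and the `sharpness` parameter are hypothetical, and no learning is shown.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def content_weights(memory, key, sharpness=10.0):
    """Graded attention over memory slots based on similarity to the key."""
    scores = [sharpness * cosine_similarity(row, key) for row in memory]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Read vector: attention-weighted sum of the memory slots."""
    width = len(memory[0])
    return [
        sum(w * row[i] for w, row in zip(weights, memory))
        for i in range(width)
    ]


# Three stored slots and a noisy key that resembles slot 0.
memory = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
key = [0.9, 0.1, 0.0, 0.8]
weights = content_weights(memory, key)
recalled = read(memory, weights)
print(weights)
print(recalled)
```

Because the weights vary smoothly with the key, a read of this kind is differentiable, which is what allows such an architecture to be trained with continuous learning signals. The read-out also behaves like pattern completion: a degraded key retrieves something close to the stored slot.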
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a 'memory network' [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., 'where is the milk?') is used to retrieve the sentence with the most similar feature vector in the database (e.g., 'Joe left the milk'); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story ('Joe traveled to the office'); this process iterates until a response is emitted by the network ('the office'). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
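The iterative retrieval described above can be illustrated with a deliberately simplified sketch. It substitutes plain word overlap for the learned dense feature vectors of a memory network, so it shows only the recirculation of retrieved content as a new cue; the function names are hypothetical, and the final step that maps the retrieved sentences to a response ('the office') is omitted.

```python
def words(sentence):
    return set(sentence.lower().split())


def retrieve_chain(story, query, hops=2):
    """Retrieve supporting sentences, feeding each result back into the cue."""
    cue = words(query)
    chain = []
    for _ in range(hops):
        candidates = [s for s in story if s not in chain]
        if not candidates:
            break
        # Similarity here is shared-word count (a stand-in for learned features).
        best = max(candidates, key=lambda s: len(cue & words(s)))
        chain.append(best)
        # Combine the query with the retrieved sentence to form the next cue.
        cue = cue | words(best)
    return chain


story = [
    "mary went to the garden",
    "joe traveled to the office",
    "joe left the milk",
]
chain = retrieve_chain(story, "where is the milk")
print(chain)
```

The first hop retrieves the sentence that shares 'milk' with the query; only after that sentence is merged into the cue does 'joe' become available to select the earlier sentence about the office.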
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland, J.L. et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102, 419–457
2. O'Neill, J. et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci. 33, 220–229
3. Wikenheiser, A.M. and Redish, A.D. (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr. Opin. Neurobiol. 32, 8–15
4. Zeithamova, D. et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front. Hum. Neurosci. 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the 'index maintenance' problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran, D. and McClelland, J.L. (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol. Rev. 119, 573–616
6. Eichenbaum, H. (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse, D. et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse, D. et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr, D. (1971) Simple memory: a theory for archicortex. Philos. Trans. R. Soc. L. B Biol. Sci. 262, 23–81
10. Rumelhart, D.E. et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski, T.J. and Rosenberg, C.R. (1987) Parallel networks that learn to pronounce English text. Complex Syst. 1, 145–168
12. Guyonneau, R. et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J. Physiol. Paris 98, 487–497
13. Plaut, D.C. et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol. Rev. 103, 56–115
15. Rumelhart, D.E. (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer, S.F. et al., eds), pp. 405–420, Academic Press
16. LeCun, Y. et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins, D.L. et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc. Natl. Acad. Sci. U. S. A. 111, 8619–8624
18. Yamins, D.L. and DiCarlo, J.J. (2016) Using goal-driven deep learning models to understand sensory cortex. Nat. Neurosci. 19, 356–365
19. Saxe, A.M. et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe, A.M. et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey, M. and Cohen, N.J. (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower, G.H., ed.), pp. 109–165, Academic Press
22. Ratcliff, R. (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol. Rev. 97, 285–308
23. French, R.M. (1999) Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3, 128–135
24. Carpenter, G.A. and Grossberg, S. (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput. Vision Graph. Image Process. 37, 54–115
25. McNaughton, B.L. and Morris, R.G. (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci. 10, 408–415
26. Treves, A. and Rolls, E.T. (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly, R.C. and McClelland, J.L. (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim, J.J. et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen, N.J. and Eichenbaum, H.B. (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O'Reilly, R.C. and Rudy, J.W. (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol. Rev. 108, 311–345
31. Norman, K.A. and O'Reilly, R.C. (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes, A. et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn. Sci. 11, 126–135
33. Davachi, L. (2006) Item, context and relational episodic encoding in humans. Curr. Opin. Neurobiol. 16, 693–700
34. Squire, L.R. et al. (2004) The medial temporal lobe. Annu. Rev. Neurosci. 27, 279–306
35. Schiller, D. et al. (2015) Memory and space: towards an understanding of the cognitive map. J. Neurosci. 35, 13904–13911
36. O'Reilly, R.C. et al. (2014) Complementary learning systems. Cogn. Sci. 38, 1229–1248
37. Knierim, J.J. and Neunuebel, J.P. (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol. Learn. Mem. 129, 38–49
38. Johnston, S.T. et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol. Learn. Mem. 129, 60–68
39. Bengio, Y. et al. (2013) Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828
40. Khaligh-Razavi, S.M. and Kriegeskorte, N. (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput. Biol. 10, e1003915
41. Kriegeskorte, N. et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke, A. and Tyler, L.K. (2014) Object-specific semantic coding in human perirhinal cortex. J. Neurosci. 34, 4766–4775
43. Kiani, R. et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J. Neurophysiol. 97, 4296–4309
44. McNaughton, B.L. (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell. 174, 205–2014
45. Leibold, C. and Kempter, R. (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb. Cortex 18, 67–77
46. Rolls, E.T. et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp. Brain Res. 114, 149–162
47. Barnes, C.A. et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog. Brain Res. 83, 287–300
48. McKenzie, S. et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol. Learn. Mem. Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting, J. (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland, J.L. (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian, P. et al., eds), pp. 99–129, MIT Press
51. Frankland, P.W. and Bontempi, B. (2005) The organization of recent and remote memories. Nat. Rev. Neurosci. 6, 119–130
52. Winocur, G. et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire, L.R. et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner, H. and Parker, E.S., eds), pp. 185–210, Psychology Press
54. Robins, A. (1996) Consolidation in neural networks and in the sleeping brain. Conn. Sci. 8, 259–276
55. Tononi, G. and Cirelli, C. (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji, D. and Wilson, M.A. (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat. Neurosci. 10, 100–107
66. Lansink, C.S. et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol. 7, e1000173
67. Ego-Stengel, V. and Wilson, M.A. (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara, C.G. et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat. Neurosci. 17, 1658–1660
87. Sara, S.J. (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat. Rev. Neurosci. 10, 211–223
88. McGaugh, J.L. (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu. Rev. Neurosci. 27, 1–28
89. Redondo, R.L. and Morris, R.G. (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat. Rev. Neurosci. 12, 17–30
90. Kumaran, D. (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front. Hum. Neurosci. 6, 157
91. Bunsey, M. and Eichenbaum, H. (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova, D. and Preston, A.R. (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J. Neurosci. 30, 14676–14684
93. Preston, A.R. et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek, J.A. and Eichenbaum, H. (1997) The hippocampus and memory for orderly stimulus relations. Proc. Natl. Acad. Sci. U. S. A. 94, 7109–7114
95. Shohamy, D. and Wagner, A.D. (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova, D. et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic, B. et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr. Biol. 25, 821–830
98. Schlichting, M.L. et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat. Commun. 6, 8151
99. Eichenbaum, H. et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard, M.W. et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol. Rev. 112, 75–116
101. Kloosterman, F. et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum, H. and Cohen, N.J. (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess, N. (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen, P. et al., eds), pp. 715–750, Oxford University Press
104. Willshaw, D.J. et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) 'Simple memory: a theory of archicortex'. Philos. Trans. R. Soc. B Biol. Sci. 370, 20140383
105. Schapiro, A.C. et al. (2014) The necessity of the medial temporal lobe for statistical learning. J. Cogn. Neurosci. 26, 1736–1747
106. Knowlton, B.J. and Squire, L.R. (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy, D. and Turk-Browne, N.B. (2013) Mechanisms for widespread hippocampal involvement in cognition. J. Exp. Psychol. Gen. 142, 1159–1170
109. Tamminen, J. et al. (2015) From specific examples to general knowledge in language learning. Cogn. Psychol. 79, 1–39
110. Walker, M.P. and Stickgold, R. (2010) Overnight alchemy: sleep-dependent memory evolution. Nat. Rev. Neurosci. 11, 218
111. Wood, E.R. et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum, H. (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat. Rev. Neurosci. 15, 732–744
113. McKenzie, S. et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga, R.Q. et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland, J.L. (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J. Exp. Psychol. Gen. 142, 1190–1210
116. McClelland, J.L. and Goddard, N.H. (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton, G.E. et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition. Vol. 1: Foundations (Rumelhart, D.E. et al., eds), pp. 77–109, MIT Press
118. Krizhevsky, A. et al. (2012) Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1106–1114
119. Mnih, V. et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme, C.B. et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc. Natl. Acad. Sci. U. S. A. 111, 18428–18435
121. Samsonovich, A. and McNaughton, B.L. (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J. Neurosci. 17, 5900–5920
122. Buzsaki, G. and Moser, E.I. (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat. Neurosci. 16, 130–138
123. Renno-Costa, C. et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput. Biol. 10, e1003641
124. Wills, T.J. et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville, W.B. and Milner, B. (1957) Loss of recent memory after bilateral hippocampal lesions. J. Neurol. Neurosurg. Psychiatry 20, 11–12
129. Nadel, L. and Moscovitch, M. (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr. Opin. Neurobiol. 7, 217–227
130. Moscovitch, M. et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J. Anat. 207, 35–66
131. Yassa, M.A. and Stark, C.E. (2011) Pattern separation in the hippocampus. Trends Neurosci. 34, 515–525
132. Liu, X. et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb, J.K. et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb, S. et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh, T.J. et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel, J.P. and Knierim, J.J. (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa, K. et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek, K. et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards, B.A. et al. (2014) Patterns across multiple memories are identified over time. Nat. Neurosci. 17, 981–986
141. Ketz, N. et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput. Biol. 9, e1003067
142. Kumaran, D. and Maguire, E.A. (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn. Sci. 13, 47–54
143. Moser, E.I. and Moser, M.B. (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri, R. and Fiete, I. (2016) Computational principles of memory. Nat. Neurosci. 19, 394–403
145. Lee, H. et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu, L. et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin, S.H. et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat. Neurosci. 18, 1562–1564
148. Poppenk, J. et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn. Sci. 17, 230–240
149. Strange, B.A. et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat. Rev. Neurosci. 15, 655–669
150. Ranganath, C. and Ritchey, M. (2012) Two cortical systems for memory-guided behaviour. Nat. Rev. Neurosci. 13, 713–726
151. Hasselmo, M.E. and Schnell, E. (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J. Neurosci. 14, 3898–3914
152. Vazdarjanova, A. and Guzowski, J.F. (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J. Neurosci. 24, 6489–6496
161. Grossberg, S. (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn. Sci. 11, 23–63
162. LaRocque, K.F. et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J. Neurosci. 33, 5466–5474
163. McClelland, J.L. and Rumelhart, D.E. (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol. Rev. 88, 375–407
165. Hintzman, D.L. (1986) 'Schema abstraction' in a multiple-trace memory model. Psychol. Rev. 93, 411–428
166. Suthana, N.A. et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc. Natl. Acad. Sci. U. S. A. 112, 10503–10508
167. Wood, E.R. et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu, J. and Shapiro, M.L. (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower, M.R. et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J. Neurosci. 25, 1313–1323
170. MacDonald, C.J. et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J. Neurosci. 33, 14607–14616
171. Markus, E.J. et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J. Neurosci. 15, 7079–7094
172. Skaggs, W.E. and McNaughton, B.L. (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J. Neurosci. 18, 8455–8466
173. Kriegeskorte, N. et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2, 4
174. Komorowski, R.W. et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J. Neurosci. 29, 9918–9929
175. Ellenbogen, J.M. et al. (2007) Human relational memory requires time and sleep. Proc. Natl. Acad. Sci. U. S. A. 104, 7723–7728
176. Dumay, N. and Gaskell, M.G. (2007) Sleep-associated changes in the mental representation of spoken words. Psychol. Sci. 18, 35–39
177. Coutanche, M.N. and Thompson-Schill, S.L. (2014) Fast mapping rapidly integrates information into existing memory networks. J. Exp. Psychol. Gen. 143, 2296–2303
178. Sharon, T. et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc. Natl. Acad. Sci. U. S. A. 108, 1146–1151
179. Merhav, M. et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith, C.N. et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc. Natl. Acad. Sci. U. S. A. 111, 475–479
181. Warren, D.E. and Duff, M.C. (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve, A. et al. (2014) No evidence that 'fast-mapping' benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul, T. et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel, C.R. (1990) The Organization of Learning, MIT Press
185. Hochreiter, S. and Schmidhuber, J. (1997) Long short-term memory. Neural Comput. 9, 1735–1780
186. Santoro, A. et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves, A. and Rolls, E.T. (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
Box 2 Functional Roles of Subregions of the Medial Temporal Lobes
Work within the CLS framework [27,116,141] relies on the anatomical and physiological properties of MTL subregions, and the computational insights of others [9,25,26], to characterize the computations performed within these structures.
Entorhinal Cortex (ERC) Input to the Hippocampal System
During an experience, inputs from neocortex produce a pattern of activation in the ERC that may be thought of as a compressed description of the patterns in the contributing cortical areas (Figure I; illustrative active neurons in the ERC are shown in blue). ERC neurons give rise to projections to three subregions of the hippocampus proper: the dentate gyrus (DG), CA1, and CA3 [28,84].
Pattern selection and pattern separation: novel ERC patterns are thought to activate a small set of previously uncommitted DG neurons (shown in red; these neurons may be relatively young neurons created by neurogenesis). These neurons in turn select a random subset of neurons in CA3 via large 'detonator synapses' (shown as red dots on the projection from DG to CA3) to serve as the representation of the memory in CA3, ensuring that the new CA3 pattern is as distinct as possible from the CA3 patterns for other memories, including those for experiences similar to the new experience (Boxes 3 and 4).
Pattern completion: recurrent connections from the active CA3 neurons onto other active CA3 neurons are strengthened during the experience, such that, if a subset of the same neurons later becomes active, the rest of the pattern will be reactivated. Direct connections from ERC to CA3 are also strengthened, allowing the ERC input to directly activate the pattern in CA3 during retrieval without requiring DG involvement (Box 3).
Pattern reinstatement in ERC and neocortex [116,141]: the connections from ERC to CA1 and back are thought to change relatively slowly to allow stable correspondence between patterns in CA1 and ERC. Strengthening of connections from the active CA3 neurons to the active CA1 neurons during memory encoding allows this CA1 pattern to be reactivated when the corresponding CA3 pattern is reactivated; the stable connections from CA1 to ERC then allow the appropriate pattern there to be reactivated, and stable connections between ERC and neocortical areas propagate the reactivated ERC pattern to the neocortex. Importantly, the bidirectional projections between CA1 and ERC, and between ERC and neocortex, support the formation and decoding of invertible CA1 representations of ERC and neocortical patterns and allow recurrent computations. These connections should not change rapidly given the extended role of the hippocampus in memory; otherwise, reinstatement in the neocortex of memories stored in the hippocampus would be difficult [61].
Figure I. Hippocampal Subregions, Connectivity, and Representation. [Diagram labels: CA3, CA1, DG, ERC, Neocortex.] Schematic depictions of neurons (with circular or triangular cell bodies) are shown, along with schematic depictions of projections from neurons in an area to neurons in the same or other areas (grey or colored lines; red coloring indicates projections with highly plastic synapses, while grey coloring illustrates relatively less-plastic or stable projections). CA1 output to ERC then propagates out to neocortex. ERC and even resulting neocortical activity can be fed back into the hippocampus (broken line), as proposed in the REMERGE model (see below).
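The pattern-completion step described in this box, in which connections among co-active CA3 neurons are strengthened so that a subset of the pattern later reactivates the rest, can be sketched as a simple binary autoassociator. This is an illustrative stand-in rather than a model from the CLS literature: the Hebbian counting rule, the fixed threshold, and the function names are all assumptions made for the example.

```python
def make_weights(n_units):
    return [[0] * n_units for _ in range(n_units)]


def store(weights, pattern):
    """Strengthen recurrent connections between every pair of co-active units."""
    active = [i for i, value in enumerate(pattern) if value]
    for i in active:
        for j in active:
            if i != j:
                weights[i][j] += 1


def complete(weights, cue, threshold=2, steps=3):
    """Reactivate units that receive enough input from currently active units."""
    state = list(cue)
    for _ in range(steps):
        drive = [sum(w * s for w, s in zip(row, state)) for row in weights]
        state = [1 if (s or d >= threshold) else 0 for s, d in zip(state, drive)]
    return state


n_units = 8
weights = make_weights(n_units)
pattern_a = [1, 1, 1, 1, 0, 0, 0, 0]
pattern_b = [0, 0, 0, 1, 1, 1, 1, 0]  # shares a single unit with pattern_a
store(weights, pattern_a)
store(weights, pattern_b)

partial_cue = [1, 1, 0, 0, 0, 0, 0, 0]  # half of pattern_a
completed = complete(weights, partial_cue)
print(completed)
```

With a threshold of two active inputs, the partial cue recovers the whole of the first pattern, while the single unit shared with the second pattern is not enough to drag that pattern in as well. Larger overlap between stored patterns would defeat this simple scheme, which is one way to see why separation of the stored CA3 patterns matters.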
hippocampal representation formed in learning an event affords a way of allowing gradual integration of knowledge of the event into neocortical knowledge structures. This can occur if the hippocampal representation can reactivate or replay the contents of the new experience back to the neocortex, interleaved with replay and/or ongoing exposure to other experiences [1]. In this way, the new experience becomes part of the database of experiences that govern the values of the connections in the neocortical learning system [51–53]. Which other memories are selected for interleaving with the new experience remains an open question. Most simply, the hippocampus might replay recent novel experiences interleaved with all other recent experiences still stored in
Box 3 Pattern Separation and Completion in Different Subregions of the Hippocampus
Pattern separationand completion [25ndash27] are de1047297nedin terms oftransformationsthat affectthe overlap or similarity amongpatterns of neuralactivity [28142] Patternseparationmakes similarpatternsmoredistinct through conjunctivecoding [925] in which each outputneuron respondsonly to a speci1047297c combinationof activeinputneurons Figures IA and IB illustrate how this can occur Pattern separation is thought to be implemented in DG (see Box 4) using higher-order conjunctions that
reduce overlap even more than illustrated in the 1047297gure
Pattern completion is a process that takesa fragmentof a pattern and1047297llsin theremaining features (asin recallinga lion upon seeingthe scenewhere thelionpreviouslyappeared)or that takesa pattern similarto a familiar patternandmakes it evenmore similarto itComputational simulations [27] have shownhowtheCA3region mightcombine featuresof patternseparationand completion such that moderate andhighoverlap results in pattern completion towardthe storedmemory butless overlapresults in thecreationof a newmemory [37133143] (FigureIC)In this account when environmentalinput produces a pattern in ERCsimilar to a previous pattern theCA3outputs a pattern closerto theone it previously used for this ERCpattern [124144] However when theenvironmentproduces an input on theERC that haslowoverlap with patterns stored previously the DG recruits a new statistically independent cell population in CA3 (ie pattern separation [27]) Emerging evidencesuggests that the amountof overlap required forpattern completion (aswell as other characteristics of hippocampal processing) maydifferacross theproximal-distal[145146] anddorsondashventral axes [98147ndash150] of thehippocampus andmay be shapedby neuromodulatory factors(eg Acetylcholine) [85151] Also incompletepatterns require less overlap with a storedpattern than distorted ones for completion to occur so that partial cues will tend to produce completion aswhen oneseesthe watering hole and remembers seeing a lion there previously [27]
Several studies point to differences between the CA3 and CA1 regions in how their neural activity patterns respond to changes to the environment [37]: broadly, the CA1 region tends to mirror the degree of overlap in the inputs from the ERC, while CA3 shows more discontinuous responses, reflecting either pattern separation or completion [134152].
[Figure I panels: ‘Pattern separation in DG’ (A, B) and ‘Separation and completion in CA3’ (C). Plot axes: input overlap versus output overlap, each from 0 to 1.]
Figure I. Conjunctive Coding, Pattern Separation, and Pattern Completion. (A) A set of 10 conjunctive units with connections from a layer of 5 input units is shown twice with different input patterns. Here each conjunctive unit detects activity in a distinct pair of input units (arrows). The output for each pattern is sparser than the input (i.e., 30% vs 60%, respectively), and the two outputs overlap less than the two corresponding inputs (i.e., 33% vs 67%, respectively; overlap is the number of active units shared by two patterns divided by the number of units active in each). DG may use higher-order conjunctions, magnifying these effects. (B) An illustration of the general form of a pattern separation function, showing the relationship between input and output overlap. Arrows indicate the overlap of the inputs and outputs shown in the left panel. (C) The separation-and-completion profile associated with CA3, where low levels of input overlap are reduced further while higher levels are increased [2737].
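The arithmetic in panel A can be checked with a minimal sketch. It assumes, as the caption describes, 5 input units and one conjunctive unit per distinct pair of inputs (10 units in total); the two input patterns chosen here are hypothetical, since the caption does not say which units are active.

```python
from itertools import combinations

N_INPUTS = 5
# One conjunctive unit per distinct pair of input units: C(5, 2) = 10 units.
PAIRS = list(combinations(range(N_INPUTS), 2))


def conjunctive_code(active_inputs):
    """Return the pair units whose two inputs are both active."""
    active = set(active_inputs)
    return {pair for pair in PAIRS if pair[0] in active and pair[1] in active}


def overlap(a, b):
    """Shared active units divided by the number active in each pattern.

    Assumes both patterns have the same number of active units.
    """
    assert len(a) == len(b)
    return len(set(a) & set(b)) / len(a)


# Hypothetical input patterns: 3 of 5 units active, 2 of them shared.
input_1 = {0, 1, 2}
input_2 = {1, 2, 3}
output_1 = conjunctive_code(input_1)
output_2 = conjunctive_code(input_2)

input_sparsity = len(input_1) / N_INPUTS       # 3 of 5 input units
output_sparsity = len(output_1) / len(PAIRS)   # 3 of 10 conjunctive units
input_overlap = overlap(input_1, input_2)      # 2 of 3 shared
output_overlap = overlap(output_1, output_2)   # 1 of 3 shared
```

Under these assumptions the outputs are both sparser (30% vs 60%) and less overlapping (33% vs 67%) than the inputs, matching the values given in the caption.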
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
complementary properties of each of the two component systems, allowing new information to be rapidly stored in the hippocampus and then slowly integrated into neocortical representations. This process, sometimes labeled ‘systems-level consolidation’ [51], arises within the theory from gradual cortical learning driven by replay of the new information, interleaved with other activity to minimize disruption of existing knowledge during the integration of the new information.
Empirical Evidence of Replay
Because of its centrality in the theory, we highlight key empirical evidence that replay events really do occur. The data come primarily from rodents recorded during periods of inactivity (including sleep), in which hippocampal neurons exhibit large irregular activity (LIA) patterns that are distinct from the activity patterns observed during active states [23]. During LIA states, synchronous discharges, thought to be initiated in hippocampal area CA3, produce sharp-wave ripples (SWRs), which are propagated to neocortex. SWRs reflect the reactivation of recent experiences, expressed as the sequential firing of so-called place cells, cells that fire when the animal is at a specific location [2357–59]. These replay events appear to be time-compressed by a factor of about 20, bringing neuronal spikes that were well separated in time during an actual experience into a time-window that enhances synaptic plasticity, both
Box 4 Sparse Conjunctive Coding and Pattern Separation in the Dentate Gyrus
Neuronal codes range from the extreme of localist codes – where neurons respond highly selectively to single entities (‘grandmother cells’) – to dense distributed codes, where items are coded through the activity of many (e.g., 50%) neurons in an area [153154]. While localist codes minimize interference and are easily decodable, they are inefficient in terms of representational capacity. By contrast, dense distributed codes are capacity-efficient; however, they are costly in terms of metabolic cost and relatively difficult to decode. These are endpoints on a continuum quantified by a measure called sparsity, where ‘population’ sparsity indexes the proportion of neurons that fire in response to a given stimulus/location, and ‘lifetime’ sparsity indexes the proportion of stimuli to which a single neuron responds [26153155]. For example, a population sparsity of 1% means that only 1% of the neurons in a population are active in representing a given input. Two randomly selected sparse patterns tend to have low overlap (for two randomly selected patterns of equal sparsity over the same set of neurons, the average proportion of neurons in either pattern that is active in the other is equal to the sparsity), but neurons still participate in several different memories, making them more efficient than localist codes. Despite variability in estimates of the sparsity of a given brain region [27153156157], the DG is widely believed to sustain among the sparsest neural codes in the brain (0.5–1% population sparseness) [25–27]. The CA3 region, to which the DG projects, is thought to be less sparse (2.5% [47]). Many studies find less-sparse patterns in CA1 than CA3 [134152].

The unique functional and anatomical properties of the DG suggest the origins of its sparse, pattern-separated code. The perforant path from the ERC (containing 200 000 neurons in the rodent) projects to a layer of 1 million DG granule cells. Combined with the high levels of inhibition in the DG, this supports the formation of highly sparse conjunctive representations, such that each neuron in DG responds only when several input neurons are simultaneously active, reducing overlap between similar input patterns [25–27136]. Evidence also suggests that new DG neurons arise from stem cells throughout adult life; these new neurons may be preferentially recruited in the formation of memories [136], further reducing overlap with previously stored memories. The CA3 pattern for a memory is then selected by the active DG neurons, each of which has a ‘detonator’ synapse to 15 randomly selected CA3 neurons. This process helps minimize the overlap of CA3 patterns for different memories, increasing storage capacity and minimizing interference between them, even if the two memories represent similar events that have highly overlapping patterns in neocortex and ERC. Empirical evidence provides support for this, with one study [137] showing that the representation supported by DG was highly sensitive to small changes in the environment despite evidence that incoming inputs from the ERC were little affected (also see [133145]). Furthermore, DG lesions impair an animal’s ability to learn to respond differently in two very similar environments, while leaving intact the ability to learn to respond differently in two environments that are not similar [136].
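The statement in this box that two random patterns of equal sparsity overlap, on average, by a proportion equal to the sparsity can be verified by exhaustive enumeration. The sketch below assumes patterns with a fixed number of active neurons drawn independently and uniformly; the population size (10 neurons, 3 active) is a hypothetical choice made only so that every pair of patterns can be enumerated exactly.

```python
from fractions import Fraction
from itertools import combinations


def mean_overlap(n_neurons, n_active):
    """Exact mean fraction of one pattern's active neurons that are also
    active in a second, independently chosen pattern of the same size."""
    patterns = [frozenset(c) for c in combinations(range(n_neurons), n_active)]
    total = Fraction(0)
    for a in patterns:
        for b in patterns:  # independent draws, so a == b is allowed
            total += Fraction(len(a & b), n_active)
    return total / (len(patterns) ** 2)


sparsity = Fraction(3, 10)       # 3 of 10 neurons active (hypothetical sizes)
result = mean_overlap(10, 3)     # exact enumeration, no sampling
```

Here `result` equals `sparsity`: each active neuron of one pattern has probability k/N of being active in the other, which is the sparsity itself.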
Box 5 Similarity-Based Coding in High-Level Visual Cortex
High-level visual regions of the neocortex are thought to support distributed representations that are inferred to be less sparse than those of the DG and the CA3/CA1 regions of the hippocampus (Box 4). Population sparseness in the ERC is estimated at 7–10% [158], with high-level sensory cortices exhibiting similar or higher levels of sparseness (e.g., variable estimates [44–46]). Although lifetime sparseness does not directly translate to population sparseness, recent evidence suggests that V4 and inferotemporal cortex (ITc) have a sparseness of 10% on this measure [159]. It is worth noting that learning rates may vary according to neuronal selectivity and lifetime sparseness, resulting in differences in learning rates across neocortical areas and hippocampal subregions. Neurons in early visual regions that encode frequently occurring features (i.e., edges) may have a relatively slow learning rate, while neurons in higher visual regions and beyond (e.g., ITc and perirhinal cortex) may have a higher learning rate to support the encoding of less frequently occurring, more-conjunctive features (e.g., individual objects) [12160161].

Evidence from electrophysiological recording studies in high-level visual cortical regions such as the ITc in primates provides support for the operation of a similarity-based coding scheme, whereby related categories (e.g., dogs and cats) are represented by overlapping neuronal codes [1740–43] (Figure I). Representational similarity analysis (RSA) of the ITc population response during passive viewing of pictures reveals coding of fine-grained categorical structure (e.g., of a set of animate and inanimate objects) that is well fit by deep convolutional neural networks, which have algorithmic parallels with feedforward processing in the ventral visual stream [1740]. While analogous similarity-based coding was observed using fMRI in the human homolog of ITc [41], there was no evidence for greater within-category (cf. between-category) representational similarity in any subregion of the hippocampus in a recent fMRI study [162], which found evidence consistent with the importance of pattern separation in episodic memory. Instead, similarity-based coding in this study was observed in the perirhinal and parahippocampal cortex – MTL regions that project to the ERC and that are typically considered to be intermediate zones (i.e., between the hippocampal and neocortical systems) in CLS theory.
[Figure I panels: representational dissimilarity matrices for monkey ITc and human ITc. Row and column labels: animate (human, not human; body, face) and inanimate (natural, artificial). Color scale: dissimilarity (percentile of 1 − r), from 0 to 100.]
Figure I. Similarity-Based Coding in High-Level Visual Cortex. Representational dissimilarity matrices (RDMs) reflect the correlation distance (i.e., 1 − r, where r is the Pearson correlation coefficient) between the responses of voxel patterns (fMRI in humans [41], right panel) or neuronal populations (electrophysiological recording in monkey [43], left panel) to a set of 92 object images. RDMs are analogous in monkey and human ITc. The RDMs show that the representations of animate objects are similar, as are those of inanimate objects. In addition to this clear animate–inanimate distinction, object coding in ITc exhibits finer categorical structure (e.g., for faces, body parts) visible in these RDMs (also see [41]). Reproduced, with permission, from [41].
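As an illustration of how an RDM of this kind is computed, the following minimal sketch builds a matrix of 1 − r values from response patterns. The three stimuli and their four-unit response vectors are invented for illustration and are not data from [41] or [43].

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rdm(patterns):
    """Dissimilarity (1 - r) between every pair of response patterns."""
    return [[1.0 - pearson_r(p, q) for q in patterns] for p in patterns]


# Hypothetical response patterns: one entry per stimulus, one value per unit.
patterns = {
    "dog":   [1.0, 0.8, 0.1, 0.0],
    "cat":   [0.9, 1.0, 0.0, 0.2],
    "chair": [0.0, 0.1, 1.0, 0.9],
}
labels = list(patterns)
matrix = rdm([patterns[name] for name in labels])
```

With these made-up patterns, the dog–cat entry is much smaller than the dog–chair entry, which is the signature of similarity-based coding: related categories evoke overlapping patterns.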
rodents [7273]. This generalized replay – simultaneous reactivation of multiple related traces during testing or offline periods – may facilitate the creation of new representations from the recombination of multiple related episodes (‘stored generalizations’) [5] and the discovery of novel relationships (e.g., shortcuts) [7273]. Empirical evidence also supports a role for the hippocampus in category- and so-called ‘statistical’ learning [105–107]: the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5108].
Box 6 Generalization Through Recurrence in the Hippocampal System
The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108164165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers, feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed, pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).

An essential principle of the model, mediated by the bidirectional excitatory connections between feature and conjunctive layers, is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed ‘big-loop’ recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time, thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism ([5] for details). Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108164165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I diagram: a conjunctive layer with units AB, BC, CD, DE, and EF, and a feature layer with units A, B, C, D, E, and F.]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE, showing its two-layer architecture with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
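The big-loop recurrence described in this box can be illustrated with a toy sketch. It is not the REMERGE implementation from [5]: the softmax competition (standing in for recurrent inhibition among conjunctive units), the temperature, the number of cycles, and the additive feature update are all assumptions chosen for brevity.

```python
import math

FEATURES = ["A", "B", "C", "D", "E", "F"]
CONJUNCTS = ["AB", "BC", "CD", "DE", "EF"]


def softmax(values, temperature):
    """Competitive normalization; assumed stand-in for recurrent inhibition."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def recurrent_retrieval(probe, n_cycles=20, temperature=0.5):
    """Recirculate conjunctive-layer output back onto the feature layer."""
    external = {f: (1.0 if f in probe else 0.0) for f in FEATURES}
    features = dict(external)
    conjuncts = {c: 0.0 for c in CONJUNCTS}
    for _ in range(n_cycles):
        # Feature layer -> conjunctive layer: each conjunct sums its constituents.
        net = [sum(features[f] for f in c) for c in CONJUNCTS]
        conjuncts = dict(zip(CONJUNCTS, softmax(net, temperature)))
        # Conjunctive layer -> feature layer: external input plus retrieved content.
        features = {
            f: external[f] + sum(a for c, a in conjuncts.items() if f in c)
            for f in FEATURES
        }
    return features, conjuncts


# Probe with A and C, which never occurred together in a stored pair.
features, conjuncts = recurrent_retrieval(probe={"A", "C"})
```

In this simplified version, probing with A and C leaves BC as the most active conjunct and makes the unpresented item B more active than D, E, or F, so B links A and C. Unlike the account above, CD is not fully suppressed here, because C is also one of its constituents and the sketch omits the model's other nonlinearities.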
inference paradigm [590110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of ‘concept’ cells, hippocampal neurons which come to respond to common features across many events: for example, cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [3451–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7 Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG, and also in the subiculum, tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant ‘concept cell-like’ responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).

It is also interesting to consider the finding of ‘splitter’ cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here, some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [2737]. As such, we should expect partly overlapping representations (i.e., rather than fully independent ‘charts’ [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.

In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions (reward value and item) were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large-amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8 Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [78] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.

A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore, under the CLS framework, we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.

Work with the ‘fast mapping’ paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects, one familiar and one unfamiliar, and are asked a question such as ‘Is the numbat’s tail pointing up?’, inferring that the unfamiliar name ‘numbat’ must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I panels: (A) Original paired associates; (B) Introduction of new paired associates. Numbers in the panels mark arena locations.]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of the 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid, schema-dependent, one-shot learning of these new paired associates (PAs) is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function, and the ubiquity of continual learning, is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
Box 9 Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.

Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated, time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance to, at best, 30% of that when experience replay was present [119]. Note that the uniform sampling mechanism, as implemented, treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events, specifically experiences that are associated with high reward prediction errors, yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
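The two sampling schemes contrasted in this box can be sketched as follows. This is an illustrative toy, not the implementation of [119] or [183]: the class and method names, the buffer capacity, and the rule of sampling in proportion to the absolute prediction error are all assumptions.

```python
import random
from collections import deque, namedtuple

# One unit of experience: state, action, next state, and resulting reward.
Transition = namedtuple("Transition", "state action next_state reward")


class ReplayBuffer:
    """Fixed-capacity store of recent transitions (hypothetical helper)."""

    def __init__(self, capacity, seed=0):
        self.memory = deque(maxlen=capacity)  # oldest items drop out when full
        self.rng = random.Random(seed)

    def store(self, transition):
        self.memory.append(transition)

    def sample_uniform(self, batch_size):
        """Treat all stored transitions as equal (sampling without replacement)."""
        return self.rng.sample(list(self.memory), batch_size)

    def sample_prioritized(self, batch_size, errors):
        """Bias replay toward transitions with large prediction errors.

        `errors` holds one value per stored transition; sampling is with
        replacement and proportional to |error| (an assumed weighting rule).
        """
        weights = [abs(e) + 1e-6 for e in errors]
        return self.rng.choices(list(self.memory), weights=weights, k=batch_size)


buffer = ReplayBuffer(capacity=1000)
for t in range(50):
    buffer.store(Transition(state=t, action=t % 4, next_state=t + 1, reward=0.0))

batch = buffer.sample_uniform(8)                      # non-adjacent, decorrelated
errors = [0.0] * 49 + [10.0]                          # one rare, surprising event
surprising = buffer.sample_prioritized(8, errors)     # dominated by that event
```

Uniform sampling draws non-adjacent transitions and so breaks up the correlations between consecutive samples, while the prioritized variant lets a single rare but informative experience dominate replay.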
Box 10 Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory, and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].

The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network, but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).

While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
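[Figure I diagram: the input (Xt) enters a controller, which produces the output (Yt); the controller is connected to an external memory through write heads and read heads.]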
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
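Content-based addressing of the kind described in this box can be sketched in a few lines. The sketch assumes cosine similarity between the key and each memory slot followed by a softmax, which is one common choice; the `sharpness` parameter, the memory contents, and the key are hypothetical values for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (small epsilon avoids 0/0)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def content_weights(memory, key, sharpness=5.0):
    """Graded attention over memory slots, based on similarity to the key."""
    scores = [sharpness * cosine_similarity(row, key) for row in memory]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Read vector: attention-weighted sum of the memory slots."""
    width = len(memory[0])
    return [sum(w * row[i] for w, row in zip(weights, memory)) for i in range(width)]


# Hypothetical memory matrix (one slot per row) and a noisy key for slot 0.
memory = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
key = [0.9, 0.1, 0.0, 0.8]
weights = content_weights(memory, key)
readout = read(memory, weights)
```

Because the weights are a smooth function of the key, the read operation is differentiable, which is what allows the addressing mechanism to be trained with gradient-based learning signals. The retrieval of a stored slot from a noisy key also resembles pattern completion.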
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk?’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input–output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
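The iterative retrieval just described can be illustrated with a deliberately simplified sketch. A real memory network [127] uses learned dense feature vectors and a learned response step; here plain word overlap stands in for feature similarity, the response step is omitted, and the distractor sentence is invented for illustration.

```python
def words(text):
    """Bag-of-words stand-in for a learned feature vector (an assumption)."""
    return set(text.lower().replace("?", "").split())


def most_similar(query_words, sentences):
    """Return the stored sentence sharing the most words with the query."""
    return max(sentences, key=lambda s: len(query_words & words(s)))


def retrieve_chain(story, query, n_hops=2):
    """Recirculate each retrieved sentence into the query for the next hop."""
    query_words = words(query)
    candidates = list(story)
    chain = []
    for _ in range(n_hops):
        if not candidates:
            break
        best = most_similar(query_words, candidates)
        chain.append(best)
        # The combined representation of query and retrieved sentence is the new cue.
        query_words = query_words | words(best)
        # Subsequent hops look only at sentences earlier in the story.
        candidates = candidates[: candidates.index(best)]
    return chain


# Sentences in story order; the first one is a hypothetical distractor.
story = [
    "Mary went to the garden",
    "Joe traveled to the office",
    "Joe left the milk",
]
chain = retrieve_chain(story, "where is the milk?")
```

The first hop retrieves ‘Joe left the milk’; adding its words to the cue lets the second hop find ‘Joe traveled to the office’, which the original query alone could not distinguish from the distractor. This recirculation of retrieved content as a new cue is the parallel with REMERGE noted above.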
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and to Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O’Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the ‘index maintenance’ problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
530 Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc L B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O’Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O’Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O’Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O’Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–214
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem. Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff’s syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66. Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67. Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104. Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) ‘Simple memory: a theory of archicortex’. Philos Trans R Soc B Biol Sci 370, 20140383
105. Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106. Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109. Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110. Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111. Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113. McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116. McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition. Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118. Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119. Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121. Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122. Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123. Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124. Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129. Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165. Hintzman DL (1986) ‘Schema abstraction’ in a multiple-trace memory model. Psychol Rev 93, 411–428
166. Suthana NA et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167. Wood ER et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower MR et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170. MacDonald CJ et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171. Markus EJ et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172. Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173. Kriegeskorte N et al. (2008) Representational similarity analysis - connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174. Komorowski RW et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175. Ellenbogen JM et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176. Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177. Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178. Sharon T et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179. Merhav M et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith CN et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181. Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve A et al. (2014) No evidence that ‘fast-mapping’ benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul T et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel CR (1990) The Organization of Learning, MIT Press
185. Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186. Santoro A et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
hippocampal representation formed in learning an event affords a way of allowing gradual integration of knowledge of the event into neocortical knowledge structures. This can occur if the hippocampal representation can reactivate or replay the contents of the new experience back to the neocortex, interleaved with replay and/or ongoing exposure to other experiences [1]. In this way, the new experience becomes part of the database of experiences that govern the values of the connections in the neocortical learning system [51–53]. Which other memories are selected for interleaving with the new experience remains an open question. Most simply, the hippocampus might replay recent novel experiences interleaved with all other recent experiences still stored in
Box 3. Pattern Separation and Completion in Different Subregions of the Hippocampus
Pattern separation and completion [25–27] are defined in terms of transformations that affect the overlap or similarity among patterns of neural activity [28,142]. Pattern separation makes similar patterns more distinct through conjunctive coding [9,25], in which each output neuron responds only to a specific combination of active input neurons. Figures IA and IB illustrate how this can occur. Pattern separation is thought to be implemented in DG (see Box 4) using higher-order conjunctions that reduce overlap even more than illustrated in the figure.
Pattern completion is a process that takes a fragment of a pattern and fills in the remaining features (as in recalling a lion upon seeing the scene where the lion previously appeared), or that takes a pattern similar to a familiar pattern and makes it even more similar to it. Computational simulations [27] have shown how the CA3 region might combine features of pattern separation and completion, such that moderate and high overlap results in pattern completion toward the stored memory but less overlap results in the creation of a new memory [37,133,143] (Figure IC). In this account, when environmental input produces a pattern in ERC similar to a previous pattern, the CA3 outputs a pattern closer to the one it previously used for this ERC pattern [124,144]. However, when the environment produces an input on the ERC that has low overlap with patterns stored previously, the DG recruits a new, statistically independent cell population in CA3 (i.e., pattern separation [27]). Emerging evidence suggests that the amount of overlap required for pattern completion (as well as other characteristics of hippocampal processing) may differ across the proximal–distal [145,146] and dorso–ventral axes [98,147–150] of the hippocampus, and may be shaped by neuromodulatory factors (e.g., acetylcholine) [85,151]. Also, incomplete patterns require less overlap with a stored pattern than distorted ones for completion to occur, so that partial cues will tend to produce completion, as when one sees the watering hole and remembers seeing a lion there previously [27].
Several studies point to differences between the CA3 and CA1 regions in how their neural activity patterns respond to changes to the environment [37]: broadly, the CA1 region tends to mirror the degree of overlap in the inputs from the ERC, while CA3 shows more discontinuous responses, reflecting either pattern separation or completion [134,152].
[Figure I panels: (A), (B) Pattern separation in DG; (C) Separation and completion in CA3. Plots in (B) and (C) show input overlap (horizontal axis, 0 to 1) against output overlap (vertical axis, 0 to 1).]
Figure I. Conjunctive Coding, Pattern Separation, and Pattern Completion. (A) A set of 10 conjunctive units with connections from a layer of 5 input units is shown twice with different input patterns. Here, each conjunctive unit detects activity in a distinct pair of input units (arrows). The output for each pattern is sparser than the input (i.e., 30% vs 60%, respectively), and the two outputs overlap less than the two corresponding inputs (i.e., 33% vs 67%, respectively; overlap is the number of active units shared by two patterns divided by the number of units active in each). DG may use higher-order conjunctions, magnifying these effects. (B) An illustration of the general form of a pattern separation function, showing the relationship between input and output overlap. Arrows indicate the overlap of the inputs and outputs shown in the left panel. (C) The separation-and-completion profile associated with CA3, where low levels of input overlap are reduced further while higher levels are increased [27,37].
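The arithmetic in panel (A) can be reproduced with a short sketch. The layout follows the caption (5 input units, one conjunctive unit per distinct pair of inputs), but the two specific input patterns below are an assumption chosen to share two of their three active units; the figure's actual patterns are not recoverable from the text, and the function names are hypothetical.

```python
from itertools import combinations

N_INPUTS = 5
# One conjunctive unit per distinct pair of input units: C(5, 2) = 10 units.
PAIRS = list(combinations(range(N_INPUTS), 2))

def conjunctive_code(active_inputs):
    """A pair unit is active only if both of its input units are active."""
    active = set(active_inputs)
    return {pair for pair in PAIRS if set(pair) <= active}

def overlap(p, q):
    """Shared active units divided by the number active in each pattern
    (the two patterns are assumed to have the same number of active units)."""
    return len(p & q) / len(p)

# Assumed example patterns: 3 of 5 inputs active, 2 of them shared.
input_a, input_b = {0, 1, 2}, {1, 2, 3}
out_a, out_b = conjunctive_code(input_a), conjunctive_code(input_b)

input_activity = len(input_a) / N_INPUTS       # 3/5 of input units active
output_activity = len(out_a) / len(PAIRS)      # 3/10 of conjunctive units active
input_overlap = overlap(input_a, input_b)      # 2 shared of 3
output_overlap = overlap(out_a, out_b)         # 1 shared of 3
```

With these assumed patterns the conjunctive layer is both sparser than the input layer and less overlapping, matching the percentages quoted in the caption. Higher-order conjunctions (triples rather than pairs) would reduce both quantities further.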
complementary properties of each of the two component systems, allowing new information to be rapidly stored in the hippocampus and then slowly integrated into neocortical representations. This process, sometimes labeled ‘systems-level consolidation’ [51], arises within the theory from gradual cortical learning driven by replay of the new information, interleaved with other activity to minimize disruption of existing knowledge during the integration of the new information.
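Why interleaving matters can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: a single linear unit trained with the delta rule stands in for the slow-learning system, and the two training items, the learning rate, and the function names are hypothetical. It is not one of the simulations reported for the theory.

```python
def predict(w, x):
    """Output of a single linear unit."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train(w, samples, epochs, lr=0.1):
    """Delta-rule learning: cycle through the samples in the given order."""
    w = list(w)
    for _ in range(epochs):
        for x, y in samples:
            err = y - predict(w, x)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w

def error(w, sample):
    x, y = sample
    return abs(y - predict(w, x))

# The old and new items share their first input feature, so they interfere.
old_item = ((1.0, 0.0), 1.0)
new_item = ((1.0, 1.0), 0.0)

w_old = train([0.0, 0.0], [old_item], epochs=2000)        # existing knowledge
w_focused = train(w_old, [new_item], epochs=2000)          # new item alone
w_interleaved = train(w_old, [new_item, old_item], epochs=2000)  # new item with replay of old

focused_old_error = error(w_focused, old_item)
interleaved_old_error = error(w_interleaved, old_item)
interleaved_new_error = error(w_interleaved, new_item)
```

Training on the new item alone drags the shared weight away from the value the old item needs, so the old item's error grows to about 0.5. Interleaving the two lets the unit settle on weights that satisfy both items. The sketch leaves open the question of which memories should be interleaved.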
Empirical Evidence of Replay
Because of its centrality in the theory, we highlight key empirical evidence that replay events really do occur. The data come primarily from rodents recorded during periods of inactivity (including sleep), in which hippocampal neurons exhibit large irregular activity (LIA) patterns that are distinct from the activity patterns observed during active states [2,3]. During LIA states, synchronous discharges thought to be initiated in hippocampal area CA3 produce sharp-wave ripples (SWRs), which are propagated to neocortex. SWRs reflect the reactivation of recent experiences, expressed as the sequential firing of so-called place cells, cells that fire when the animal is at a specific location [2,3,57–59]. These replay events appear to be time-compressed by a factor of about 20, bringing neuronal spikes that were well-separated in time during an actual experience into a time-window that enhances synaptic plasticity both
Box 4. Sparse Conjunctive Coding and Pattern Separation in the Dentate Gyrus
Neuronal codes range from the extreme of localist codes, where neurons respond highly selectively to single entities (‘grandmother cells’), to dense distributed codes, where items are coded through the activity of many (e.g., 50%) neurons in an area [153,154]. While localist codes minimize interference and are easily decodable, they are inefficient in terms of representational capacity. By contrast, dense distributed codes are capacity-efficient; however, they are metabolically costly and relatively difficult to decode. These are endpoints on a continuum quantified by a measure called sparsity, where ‘population’ sparsity indexes the proportion of neurons that fire in response to a given stimulus/location and ‘lifetime’ sparsity indexes the proportion of stimuli to which a single neuron responds [26,153,155]. For example, a population sparsity of 1% means that only 1% of the neurons in a population are active in representing a given input. Two randomly selected sparse patterns tend to have low overlap (for two randomly selected patterns of equal sparsity over the same set of neurons, the average proportion of neurons in either pattern that is active in the other is equal to the sparsity), but neurons still participate in several different memories, making them more efficient than localist codes. Despite variability in estimates of the sparsity of a given brain region [27,153,156,157], the DG is widely believed to sustain among the sparsest neural codes in the brain (0.5–1% population sparseness) [25–27]. The CA3 region, to which the DG projects, is thought to be less sparse (2.5% [47]). Many studies find less-sparse patterns in CA1 than CA3 [134,152].
The unique functional and anatomical properties of the DG suggest the origins of its sparse, pattern-separated code. The perforant path from the ERC (containing 200 000 neurons in the rodent) projects to a layer of 1 million DG granule cells. Combined with the high levels of inhibition in the DG, this supports the formation of highly sparse conjunctive representations, such that each neuron in DG responds only when several input neurons are simultaneously active, reducing overlap between similar input patterns [25–27,136]. Evidence also suggests that new DG neurons arise from stem cells throughout adult life; these new neurons may be preferentially recruited in the formation of memories [136], further reducing overlap with previously stored memories. The CA3 pattern for a memory is then selected by the active DG neurons, each of which has a ‘detonator’ synapse to 15 randomly selected CA3 neurons. This process helps minimize the overlap of CA3 patterns for different memories, increasing storage capacity and minimizing interference between them, even if the two memories represent similar events that have highly overlapping patterns in neocortex and ERC. Empirical evidence provides support for this, with one study [137] showing that the representation supported by DG was highly sensitive to small changes in the environment despite evidence that incoming inputs from the ERC were little affected (also see [133,145]). Furthermore, DG lesions impair an animal’s ability to learn to respond differently in two very similar environments while leaving intact the ability to learn to respond differently in two environments that are not similar [136].
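The parenthetical claim in Box 4 about random sparse patterns can be checked exactly on a small population. The sketch below is an illustration under assumptions: a population of 20 units with 4 active per pattern (20% sparsity) is chosen only to keep the enumeration small, and `mean_overlap` is a hypothetical helper, not a measure taken from the cited studies.

```python
from fractions import Fraction
from itertools import combinations

def mean_overlap(n_neurons, n_active):
    """Average, over every possible pattern with n_active active units, of the
    proportion of a fixed reference pattern's active units that are also
    active in that pattern."""
    reference = set(range(n_active))  # any fixed pattern works, by symmetry
    total = Fraction(0)
    count = 0
    for other in combinations(range(n_neurons), n_active):
        total += Fraction(len(reference & set(other)), n_active)
        count += 1
    return total / count

sparsity = Fraction(4, 20)
result = mean_overlap(n_neurons=20, n_active=4)
```

Here `result` equals the sparsity exactly, as the box states. The same relationship implies that codes as sparse as those attributed to the DG would produce very little overlap between unrelated patterns.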
Box 5 Similarity-Based Coding in High-Level Visual Cortex
High-level visual regions of the neocortex are thought to support distributed representations that are inferred to be less sparse than those of the DG and the CA3/CA1 regions of the hippocampus (Box 4). Population sparseness in the ERC is estimated at 7–10% [158], with high-level sensory cortices exhibiting similar or higher levels of sparseness (e.g., variable estimates [44–46]). Although lifetime sparseness does not directly translate to population sparseness, recent evidence suggests that V4 and inferotemporal cortex (ITc) have a sparseness of 10% on this measure [159]. It is worth noting that learning rates may vary according to neuronal selectivity and lifetime sparseness, resulting in differences in learning rates across neocortical areas and hippocampal subregions. Neurons in early visual regions that encode frequently-occurring features (i.e., edges) may have a relatively slow learning rate, while neurons in higher visual regions and beyond (e.g., ITc and perirhinal cortex) may have a higher learning rate to support the encoding of less-frequently occurring, more-conjunctive features (e.g., individual objects) [12,160,161].
Evidence from electrophysiological recording studies in high-level visual cortical regions such as the ITc in primates provides support for the operation of a similarity-based coding scheme, whereby related categories (e.g., dogs and cats) are represented by overlapping neuronal codes [17,40–43] (Figure I). Representational similarity analysis (RSA) of the ITc population response during passive viewing of pictures reveals coding of fine-grained categorical structure (e.g., of a set of animate and inanimate objects) that is well fit by deep convolutional neural networks, which have algorithmic parallels with feedforward processing in the ventral visual stream [17,40]. While analogous similarity-based coding was observed using fMRI in the human homolog of ITc [41], there was no evidence for greater within-category (cf. between-category) representational similarity in any subregion of the hippocampus in a recent fMRI study [162], which found evidence consistent with the importance of pattern separation in episodic memory. Instead, similarity-based coding in this study was observed in the perirhinal and parahippocampal cortex, MTL regions that project to the ERC and that are typically considered to be intermediate zones (i.e., between the hippocampal and neocortical systems) in CLS theory.
[Figure I panels: Monkey ITc (left) and Human ITc (right). Rows and columns are grouped as Animate (Human: Body, Face; Not human: Body, Face) and Inanimate (Natural, Artificial). Color scale: Dissimilarity, percentile of 1 − r, from 0 to 100.]
Figure I. Similarity-Based Coding in High-Level Visual Cortex. Representational dissimilarity matrices (RDM) reflect the correlation (i.e., 1 − r, where r is the Pearson correlation coefficient) between the response of voxel patterns (fMRI in humans [41], right panel) or neuronal populations (electrophysiological recording in monkey [43], left panel) to a set of 92 object images. RDMs are analogous in monkey and human ITc. The RDMs show that the representations of animate objects are similar, as are those of inanimate objects. In addition to this clear animate–inanimate distinction, object coding in ITc exhibits finer categorical structure (e.g., for faces, body parts) visible in these RDMs (also see [41]). Reproduced with permission from [41].
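As a worked illustration of the 1 − r measure used in these RDMs, the following minimal sketch computes a dissimilarity matrix from response patterns. The four stimuli and their six-unit response vectors are invented toy values, not data from [41] or [43], and the function names are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length response patterns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def rdm(patterns):
    """Representational dissimilarity matrix: entry [i][j] is 1 - r for stimuli i and j."""
    return [[1.0 - pearson(p, q) for q in patterns] for p in patterns]

# Hypothetical responses of six units (or voxels) to four stimuli.
stimuli = ["dog", "cat", "chair", "table"]
responses = [
    [5, 4, 5, 1, 0, 1],  # dog
    [4, 5, 4, 0, 1, 1],  # cat
    [1, 0, 1, 5, 4, 5],  # chair
    [0, 1, 1, 4, 5, 4],  # table
]

if __name__ == "__main__":
    for name, row in zip(stimuli, rdm(responses)):
        print(name, [round(d, 2) for d in row])
```

In this toy set the two animate items produce correlated patterns and therefore a small dissimilarity, whereas animate–inanimate pairs are anticorrelated and land near the top of the 1 − r scale, which is the block structure visible in the figure.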
rodents [72,73]. This generalized replay (simultaneous reactivation of multiple related traces during testing or offline periods) may facilitate the creation of new representations from the recombination of multiple related episodes ('stored generalizations') [5] and the discovery of novel relationships (e.g., shortcuts) [72,73]. Empirical evidence also supports a role for the hippocampus in category- and so-called 'statistical' learning [105–107]; the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6 Generalization Through Recurrence in the Hippocampal System
The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers: feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).
An essential principle of the model, mediated by the bidirectional excitatory connections between feature and conjunctive layers, is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed 'big-loop' recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time, thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism ([5] for details). Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I schematic: a conjunctive layer with units AB, BC, CD, DE, EF above a feature layer with units A, B, C, D, E, F.]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE, showing its two-layer architecture with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
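The recirculation of output as input described in this box can be sketched in a few lines. This is a deliberately simplified toy, not the published REMERGE equations from [5]: competition among conjunctive units is approximated here with a softmax (an assumption standing in for recurrent inhibition), the temperature and step count are arbitrary, and the function names are hypothetical.

```python
import math

FEATURES = ["A", "B", "C", "D", "E", "F"]
CONJUNCTS = ["AB", "BC", "CD", "DE", "EF"]  # stored pairwise episodes

def softmax(values, temperature):
    """Competitive normalization, used here in place of explicit inhibition."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def settle(probe, steps=10, temperature=0.5):
    """Recirculate activity between the feature and conjunctive layers."""
    external = {f: (1.0 if f in probe else 0.0) for f in FEATURES}
    feature = dict(external)
    conj = {}
    for _ in range(steps):
        # Each conjunctive unit sums the activity of its two constituents.
        net = [sum(feature[f] for f in c) for c in CONJUNCTS]
        conj = dict(zip(CONJUNCTS, softmax(net, temperature)))
        # Feature units receive the external probe plus feedback from conjuncts.
        feature = {
            f: external[f] + sum(a for c, a in conj.items() if f in c)
            for f in FEATURES
        }
    return feature, conj

if __name__ == "__main__":
    feature, conj = settle({"A", "C"})
    print({f: round(a, 3) for f, a in feature.items()})
    print({c: round(a, 3) for c, a in conj.items()})
```

Probing with A and C, which were never stored together, leaves B as the most strongly supported unprobed item because both AB and BC feed it, which is the kind of linking inference the box describes.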
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of 'concept' cells: hippocampal neurons which come to respond to common features across many events, for example cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [34,51–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7 Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG and also in the subiculum tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant 'concept cell like' responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).
It is also interesting to consider the finding of 'splitter' cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here, some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent 'charts' [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.
In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions (reward value and item) were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
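The dependence of learning rate on existing knowledge can be seen in the smallest possible multilayer case. The sketch below is an illustrative assumption, not the analysis or simulations of [20]: it uses a two-layer linear network with one scalar weight per layer, a single input fixed at 1, and arbitrary values for the learning rate and tolerance. Because each layer's gradient is multiplied by the other layer's current weight, a network whose weights are still near zero changes slowly, while one that already carries structure learns in far fewer steps.

```python
def steps_to_learn(w1, w2, target=1.0, lr=0.05, tol=0.01, max_steps=10000):
    """Count gradient steps until the two-layer linear map w2 * w1 reaches the target."""
    for step in range(max_steps):
        error = target - w2 * w1
        if abs(error) < tol:
            return step
        # Gradient descent on squared error: each layer's update is scaled
        # by the other layer's current weight.
        w1, w2 = w1 + lr * error * w2, w2 + lr * error * w1
    return max_steps

# Hypothetical starting points: near-zero weights versus weights that
# already encode part of the required mapping.
naive_steps = steps_to_learn(0.01, 0.01)
experienced_steps = steps_to_learn(0.7, 0.7)

if __name__ == "__main__":
    print("steps from near-zero weights:", naive_steps)
    print("steps from established weights:", experienced_steps)
```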
Box 8 Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the 'fast mapping' paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects (one familiar and one unfamiliar) and are asked a question such as 'is the numbat's tail pointing up?', inferring that the unfamiliar name 'numbat' must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I panels: (A) Original paired associates; (B) Introduction of new paired associates. Numbers mark the flavor locations in the arena.]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid schema-dependent one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function and the ubiquity of continual learning is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
Box 9 Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best 30% of that obtained when experience replay was present [119]. Note that the uniform sampling mechanism, as implemented, treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events, specifically experiences that are associated with high reward prediction errors, yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
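The two sampling schemes contrasted in this box can be sketched as a small buffer class. This is a minimal illustration under stated assumptions, not the implementation used in [119] or [183]: the class and method names are hypothetical, the priority is simply the absolute prediction error plus a small constant, and the importance-sampling corrections used in practical prioritized replay are omitted.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=0):
        self.transitions = deque(maxlen=capacity)  # oldest entries drop out first
        self.priorities = deque(maxlen=capacity)   # kept aligned with transitions
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, td_error=1.0):
        self.transitions.append((state, action, reward, next_state))
        # Small constant so that zero-error transitions can still be replayed.
        self.priorities.append(abs(td_error) + 1e-3)

    def sample_uniform(self, batch_size):
        """Every stored transition is equally likely; draws are non-adjacent in time."""
        return self.rng.sample(list(self.transitions), batch_size)

    def sample_prioritized(self, batch_size):
        """Transitions with larger prediction errors are replayed more often."""
        return self.rng.choices(
            list(self.transitions), weights=list(self.priorities), k=batch_size
        )

if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=100)
    for t in range(150):
        buffer.add(state=t, action=0, reward=0.0, next_state=t + 1, td_error=0.0)
    buffer.add(state="rare", action=1, reward=10.0, next_state="end", td_error=100.0)
    print(buffer.sample_uniform(4))
    print(buffer.sample_prioritized(4))
```

Uniform sampling breaks up the temporal correlations between consecutive transitions, while the prioritized variant lets a single surprising transition dominate replay, loosely analogous to the reweighting role proposed for the hippocampus.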
Box 10 Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., 'key') emitted by the controller. The graded similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section 'Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment').
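Content-based addressing as described above can be sketched as a similarity computation followed by a soft read. The sketch assumes cosine similarity sharpened by a scalar `beta` and normalized with a softmax, which is one common formulation; the memory contents, the key, and the function names are invented for illustration and no controller or write head is modeled.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (small constant avoids division by zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def content_weights(memory, key, beta):
    """Graded attention over memory slots: softmax of beta-scaled similarity to the key."""
    scores = [beta * cosine(row, key) for row in memory]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read(memory, weights):
    """Read vector: attention-weighted sum of the memory rows."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]

if __name__ == "__main__":
    memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    key = [0.9, 0.1, 0.0]  # noisy cue resembling the first slot
    weights = content_weights(memory, key, beta=10.0)
    print([round(w, 4) for w in weights])
    print([round(x, 4) for x in read(memory, weights)])
```

Because the weights vary smoothly with the key, the read operation is differentiable, which is what allows gradient-based training; a partial or noisy cue still retrieves the closest stored item, the property shared with attractor-network models of CA3.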
[Figure I schematic labels: Input (Xt), Output (Yt), Controller, Write heads, Read heads, External memory.]
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted with permission from [125].
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a 'memory network' [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., 'where is the milk?') is used to retrieve the sentence with the most similar feature vector in the database (e.g., 'Joe left the milk'); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story ('Joe traveled to the office'); this process iterates until a response is emitted by the network ('the office'). The joint dependence of this system on input–output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
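The iterative retrieval loop described above can be sketched as follows. This toy is an assumption-laden stand-in rather than the memory network of [127]: the learned dense feature vectors are replaced by sets of content words, similarity is a simple word-overlap count, the response step just returns the last word of the final supporting sentence, and the story, stop-word list, and function names are all invented for illustration.

```python
STOP_WORDS = {"the", "to", "is", "where"}

def features(text):
    """Crude stand-in for a learned feature vector: the set of content words."""
    words = text.lower().replace("?", "").split()
    return {w for w in words if w not in STOP_WORDS}

def best_match(query_features, story, candidates):
    """Index of the candidate sentence sharing the most content words with the query."""
    return max(candidates, key=lambda i: len(query_features & features(story[i])))

def answer(story, question, hops=2):
    """Retrieve a sentence, fold it into the query, then search earlier sentences."""
    query = features(question)
    candidates = list(range(len(story)))
    support = []
    for _ in range(hops):
        idx = best_match(query, story, candidates)
        support.append(idx)
        query = query | features(story[idx])            # recirculate output as input
        candidates = [i for i in candidates if i < idx]  # look earlier in the story
        if not candidates:
            break
    return story[support[-1]].split()[-1], support

story = [
    "Fred went to the kitchen",
    "Joe traveled to the office",
    "Joe left the milk",
    "Joe went to the bathroom",
]

if __name__ == "__main__":
    print(answer(story, "where is the milk?"))
```

The first hop finds the sentence about the milk; the second hop, now carrying "Joe" in the query, finds where Joe was beforehand. The retrieved output shaping the next retrieval is the step that parallels recurrence in REMERGE.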
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the 'index maintenance' problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
References
1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O'Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc L B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O'Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O'Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O'Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–2014
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53 Squire LRetal (1984) Themedial temporal region andmemoryconsolidation a new hypothesis InMemory Consolidation Psy-
chobiologyof Cognition (Weingartner H andParker ES eds)pp 185ndash210 Psychology Press
54 Robins A (1996) Consolidation in neural networks and in thesleeping brain Conn Sci 8 259ndash276
55 Tononi G and Cirelli C (2014) Sleep and the price of plasticityfrom synaptic and cellular homeostasisto memory consolidationand integration Neuron 81 12ndash34
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
65 Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66 Lansink CS et al (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67 Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86 McNamara CG et al (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87 Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88 McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89 Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90 Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91 Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92 Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93 Preston AR et al (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94 Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95 Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96 Zeithamova D et al (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97 Milivojevic B et al (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98 Schlichting ML et al (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99 Eichenbaum H et al (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100 Howard MW et al (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101 Kloosterman F et al (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102 Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103 Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al, eds), pp 715–750, Oxford University Press
104 Willshaw DJ et al (2015) Memory modelling and Marr: a commentary on Marr (1971) 'Simple memory: a theory of archicortex'. Philos Trans R Soc B Biol Sci 370, 20140383
105 Schapiro AC et al (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106 Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107 Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109 Tamminen J et al (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110 Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111 Wood ER et al (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112 Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113 McKenzie S et al (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114 Quiroga RQ et al (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115 McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116 McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117 Hinton GE et al (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol 1: Foundations (Rumelhart DE et al, eds), pp 77–109, MIT Press
118 Krizhevsky A et al (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119 Mnih V et al (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120 Alme CB et al (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121 Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122 Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123 Renno-Costa C et al (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124 Wills TJ et al (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128 Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129 Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130 Moscovitch M et al (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131 Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132 Liu X et al (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133 Leutgeb JK et al (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134 Leutgeb S et al (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136 McHugh TJ et al (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137 Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138 Nakazawa K et al (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139 Jezek K et al (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140 Richards BA et al (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141 Ketz N et al (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142 Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143 Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144 Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145 Lee H et al (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146 Lu L et al (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147 Collin SH et al (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148 Poppenk J et al (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149 Strange BA et al (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150 Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151 Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152 Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161 Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162 LaRocque KF et al (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163 McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165 Hintzman DL (1986) 'Schema abstraction' in a multiple-trace memory model. Psychol Rev 93, 411–428
166 Suthana NA et al (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167 Wood ER et al (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168 Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169 Bower MR et al (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170 MacDonald CJ et al (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171 Markus EJ et al (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172 Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173 Kriegeskorte N et al (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174 Komorowski RW et al (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175 Ellenbogen JM et al (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176 Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177 Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178 Sharon T et al (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179 Merhav M et al (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180 Smith CN et al (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181 Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182 Greve A et al (2014) No evidence that 'fast-mapping' benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183 Schaul T et al (2016) Prioritized experience replay. In International Conference on Learning Representations
184 Gallistel CR (1990) The Organization of Learning, MIT Press
185 Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186 Santoro A et al (2016) Meta-learning with memory augmented neural networks. In International Conference on Machine Learning
187 Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
complementary properties of each of the two component systems, allowing new information to be rapidly stored in the hippocampus and then slowly integrated into neocortical representations. This process, sometimes labeled 'systems-level consolidation' [51], arises within the theory from gradual cortical learning driven by replay of the new information, interleaved with other activity to minimize disruption of existing knowledge during the integration of the new information.
Empirical Evidence of Replay
Because of its centrality in the theory, we highlight key empirical evidence that replay events really do occur. The data come primarily from rodents recorded during periods of inactivity (including sleep), in which hippocampal neurons exhibit large irregular activity (LIA) patterns that are distinct from the activity patterns observed during active states [23]. During LIA states, synchronous discharges, thought to be initiated in hippocampal area CA3, produce sharp-wave ripples (SWRs), which are propagated to neocortex. SWRs reflect the reactivation of recent experiences, expressed as the sequential firing of so-called place cells: cells that fire when the animal is at a specific location [23,57–59]. These replay events appear to be time-compressed by a factor of about 20, bringing neuronal spikes that were well separated in time during an actual experience into a time window that enhances synaptic plasticity both
Box 4. Sparse Conjunctive Coding and Pattern Separation in the Dentate Gyrus
Neuronal codes range from the extreme of localist codes – where neurons respond highly selectively to single entities ('grandmother cells') – to dense distributed codes, where items are coded through the activity of many (e.g., 50%) neurons in an area [153,154]. While localist codes minimize interference and are easily decodable, they are inefficient in terms of representational capacity. By contrast, dense distributed codes are capacity-efficient; however, they are metabolically costly and relatively difficult to decode. These are endpoints on a continuum quantified by a measure called sparsity, where 'population' sparsity indexes the proportion of neurons that fire in response to a given stimulus/location and 'lifetime' sparsity indexes the proportion of stimuli to which a single neuron responds [26,153,155]. For example, a population sparsity of 1% means that only 1% of the neurons in a population are active in representing a given input. Two randomly selected sparse patterns tend to have low overlap (for two randomly selected patterns of equal sparsity over the same set of neurons, the average proportion of neurons in either pattern that is active in the other is equal to the sparsity), but neurons still participate in several different memories, making them more efficient than localist codes. Despite variability in estimates of the sparsity of a given brain region [27,153,156,157], the DG is widely believed to sustain among the sparsest neural codes in the brain (0.5–1% population sparseness) [25–27]. The CA3 region, to which the DG projects, is thought to be less sparse (2.5% [47]). Many studies find less-sparse patterns in CA1 than CA3 [134,152].
The unique functional and anatomical properties of the DG suggest the origins of its sparse, pattern-separated code. The perforant path from the ERC (containing 200,000 neurons in the rodent) projects to a layer of 1 million DG granule cells. Combined with the high levels of inhibition in the DG, this supports the formation of highly sparse conjunctive representations, such that each neuron in DG responds only when several input neurons are simultaneously active, reducing overlap between similar input patterns [25–27,136]. Evidence also suggests that new DG neurons arise from stem cells throughout adult life; these new neurons may be preferentially recruited in the formation of memories [136], further reducing overlap with previously stored memories. The CA3 pattern for a memory is then selected by the active DG neurons, each of which has a 'detonator' synapse to 15 randomly selected CA3 neurons. This process helps minimize the overlap of CA3 patterns for different memories, increasing storage capacity and minimizing interference between them, even if the two memories represent similar events that have highly overlapping patterns in neocortex and ERC. Empirical evidence provides support for this, with one study [137] showing that the representation supported by DG was highly sensitive to small changes in the environment despite evidence that incoming inputs from the ERC were little affected (also see [133,145]). Furthermore, DG lesions impair an animal's ability to learn to respond differently in two very similar environments, while leaving intact the ability to learn to respond differently in two environments that are not similar [136].
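The overlap property stated in Box 4 (for two random patterns of equal sparsity, the expected fraction of one pattern's active neurons that are also active in the other equals the sparsity) can be checked numerically. The following is a minimal sketch, not taken from the article: the function names, the population size of 1000 neurons, and the sampling scheme (a fixed number of active neurons drawn uniformly at random) are all illustrative assumptions.

```python
import random

def random_sparse_pattern(n_neurons, sparsity, rng):
    """Return the set of active neuron indices for one random pattern.

    Assumption: exactly round(n_neurons * sparsity) neurons are active,
    chosen uniformly at random (at least one neuron is always active).
    """
    n_active = max(1, round(n_neurons * sparsity))
    return set(rng.sample(range(n_neurons), n_active))

def mean_overlap(n_neurons, sparsity, n_pairs=2000, seed=0):
    """Average fraction of one pattern's active neurons also active in another."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        a = random_sparse_pattern(n_neurons, sparsity, rng)
        b = random_sparse_pattern(n_neurons, sparsity, rng)
        total += len(a & b) / len(a)
    return total / n_pairs

if __name__ == "__main__":
    # Hypothetical population of 1000 neurons at a sparse and a dense setting.
    for s in (0.05, 0.5):
        print(f"sparsity={s:.2f}  mean overlap={mean_overlap(1000, s):.3f}")
```

Under these assumptions the measured overlap should track the sparsity closely, which is why sparser codes (such as those attributed to the DG) interfere less with one another than dense ones.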
Box 5. Similarity-Based Coding in High-Level Visual Cortex
High-level visual regions of the neocortex are thought to support distributed representations that are inferred to be less sparse than those of the DG and the CA3/CA1 regions of the hippocampus (Box 4). Population sparseness in the ERC is estimated at 7–10% [158], with high-level sensory cortices exhibiting similar or higher levels of sparseness (e.g., variable estimates [44–46]). Although lifetime sparseness does not directly translate to population sparseness, recent evidence suggests that V4 and inferotemporal cortex (ITc) have a sparseness of 10% on this measure [159]. It is worth noting that learning rates may vary according to neuronal selectivity and lifetime sparseness, resulting in differences in learning rates across neocortical areas and hippocampal subregions. Neurons in early visual regions that encode frequently occurring features (i.e., edges) may have a relatively slow learning rate, while neurons in higher visual regions and beyond (e.g., ITc and perirhinal cortex) may have a higher learning rate to support the encoding of less frequently occurring, more conjunctive features (e.g., individual objects) [12,160,161].
Evidence from electrophysiological recording studies in high-level visual cortical regions such as the ITc in primates provides support for the operation of a similarity-based coding scheme, whereby related categories (e.g., dogs and cats) are represented by overlapping neuronal codes [17,40–43] (Figure I). Representational similarity analysis (RSA) of the ITc population response during passive viewing of pictures reveals coding of fine-grained categorical structure (e.g., of a set of animate and inanimate objects) that is well fit by deep convolutional neural networks, which have algorithmic parallels with feedforward processing in the ventral visual stream [17,40]. While analogous similarity-based coding was observed using fMRI in the human homolog of ITc [41], there was no evidence for greater within-category (cf. between-category) representational similarity in any subregion of the hippocampus in a recent fMRI study [162], which found evidence consistent with the importance of pattern separation in episodic memory. Instead, similarity-based coding in this study was observed in the perirhinal and parahippocampal cortex – MTL regions that project to the ERC and that are typically considered to be intermediate zones (i.e., between the hippocampal and neocortical systems) in CLS theory.
[Figure I panels: representational dissimilarity matrices for monkey ITc (left) and human ITc (right). Rows and columns are grouped as animate (human and not human, each with body and face) and inanimate (natural and artificial). Color scale: dissimilarity, percentile of 1 − r, from 0 to 100.]
Figure I. Similarity-Based Coding in High-Level Visual Cortex. Representational dissimilarity matrices (RDMs) reflect the correlation (i.e., 1 − r, where r is the Pearson correlation coefficient) between the responses of voxel patterns (fMRI in humans [41], right panel) or neuronal populations (electrophysiological recording in monkey [43], left panel) to a set of 92 object images. RDMs are analogous in monkey and human ITc. The RDMs show that the representations of animate objects are similar, as are those of inanimate objects. In addition to this clear animate–inanimate distinction, object coding in ITc exhibits finer categorical structure (e.g., for faces, body parts) visible in these RDMs (also see [41]). Reproduced, with permission, from [41].
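To make the dissimilarity measure in the caption concrete, the sketch below builds an RDM whose entries are 1 − r for every pair of response patterns. It is an illustration only: the four toy "response patterns" and their labels are invented for the example (they are not data from [41] or [43]), and the percentile transform used for the color scale in the figure is not applied.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def dissimilarity_matrix(patterns):
    """RDM with entries 1 - r for each pair of response patterns."""
    n = len(patterns)
    return [[1.0 - pearson_r(patterns[i], patterns[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical responses of four recording channels to four stimuli.
toy_patterns = {
    "dog":   [1.0, 0.9, 0.1, 0.0],
    "cat":   [0.9, 1.0, 0.0, 0.1],
    "chair": [0.1, 0.0, 1.0, 0.8],
    "table": [0.0, 0.1, 0.9, 1.0],
}

if __name__ == "__main__":
    names = list(toy_patterns)
    rdm = dissimilarity_matrix([toy_patterns[k] for k in names])
    for name, row in zip(names, rdm):
        print(name.ljust(6), " ".join(f"{d:5.2f}" for d in row))
```

In this toy case the within-category entries (dog vs cat, chair vs table) come out small and the between-category entries large, which is the block structure that similarity-based coding predicts.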
rodents [72,73]. This generalized replay – simultaneous reactivation of multiple related traces during testing or offline periods – may facilitate the creation of new representations from the recombination of multiple related episodes ('stored generalizations') [5] and the discovery of novel relationships (e.g., shortcuts) [72,73]. Empirical evidence also supports a role for the hippocampus in category- and so-called 'statistical' learning [105–107]: the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6. Generalization Through Recurrence in the Hippocampal System
The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers: feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).
An essential principle of the model – mediated by the bidirectional excitatory connections between feature and conjunctive layers – is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed 'big-loop' recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time – thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism (see [5] for details). Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I schematic: a conjunctive layer with units AB, BC, CD, DE, and EF above a feature layer with units A, B, C, D, E, and F.]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE, showing its two-layer architecture, with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
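The big-loop recurrence described in Box 6 can be illustrated with a deliberately simplified two-layer loop. This is a toy sketch and not the published REMERGE equations [5]: the softmax competition among conjunctive units, the temperature of 0.5, the additive feature update, and the function name `remerge_like` are all assumptions made for illustration. The stored pairs follow the layout of Figure I.

```python
import math

FEATURES = ["A", "B", "C", "D", "E", "F"]
CONJUNCTS = ["AB", "BC", "CD", "DE", "EF"]  # stored pair-episodes, as in Figure I

def softmax(values, temperature):
    """Competitive normalization standing in for recurrent inhibition (assumption)."""
    exps = [math.exp(v / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def remerge_like(probe, n_cycles=10, temperature=0.5):
    """Recirculate activity between feature and conjunctive layers.

    probe: set of feature names presented externally.
    Returns (feature activations, conjunctive activations) after n_cycles.
    """
    external = {f: (1.0 if f in probe else 0.0) for f in FEATURES}
    feature = dict(external)
    conj = {c: 0.0 for c in CONJUNCTS}
    for _ in range(n_cycles):
        # Each conjunctive unit sums the activity of its two constituents.
        net = [sum(feature[f] for f in c) for c in CONJUNCTS]
        conj = dict(zip(CONJUNCTS, softmax(net, temperature)))
        # Feature units combine external input with feedback from conjuncts.
        feature = {f: external[f] + sum(a for c, a in conj.items() if f in c)
                   for f in FEATURES}
    return feature, conj

if __name__ == "__main__":
    feature, conj = remerge_like({"A", "C"})
    print({k: round(v, 2) for k, v in feature.items()})
    print({k: round(v, 2) for k, v in conj.items()})
```

With A and C as the probe, the never-presented item B receives feedback from both AB and BC and ends up more active than any other unprobed feature, and more active after several cycles than after a single pass, which is the sense in which recurrence supports the A–C inference. Because C also belongs to CD in this chain, the toy version leaves CD partly active rather than fully suppressed; the full model in [5] should be consulted for the actual dynamics.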
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of 'concept' cells: hippocampal neurons which come to respond to common features across many events, for example cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Schema-Dependent
ConsolidationIt
is
useful
to
distinguish
systems-level
consolidation
from
what
we
refer
to
as
within-systemconsolidation
The
former
refers
to
the
gradual
integration
of
knowledge
into
neocortical
circuitswhile
the
latter
denotes
stabilization
of
recently
formed
memories
within
the
hippocampusperhaps
through
stabilization
of
synapses
among
hippocampal
neurons
[89] In
the
initialformulation
of
CLS
systems-level
consolidation
was
viewed
as
temporally
extended
(egspanning
years
or
even
decades
in
humans
[3451ndash53])
Although
it
was
noted
in
[1]
thatthe
timeframe
could
be
highly
variable
(depending
perhaps
on
the
rate
of
replay
of
memory
Box 7. Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG and also in the subiculum tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant 'concept cell like' responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).
It is also interesting to consider the finding of 'splitter' cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here, some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent 'charts' [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG, and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.
In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions – reward value and item – were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8. Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176] attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the ‘fast mapping’ paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects, one familiar and one unfamiliar, and are asked a question such as ‘is the numbat’s tail pointing up?’, inferring that the unfamiliar name ‘numbat’ must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
Figure I panels: (A) Original paired associates. (B) Introduction of new paired associates.
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid schema-dependent one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function and the ubiquity of continual learning is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space.
Box 9. Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best 30% of the level achieved when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events (specifically, experiences that are associated with high reward prediction errors) yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
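The replay mechanism described in this box can be illustrated with a minimal sketch. It is not the DQN implementation from [119] or the prioritization scheme of [183]; the class name `ReplayBuffer`, the `td_error` argument, and the small priority offset are assumptions made purely for illustration. The sketch shows a fixed-capacity store of recent transitions with two sampling modes: uniform sampling, which breaks up correlations between consecutive samples, and error-weighted sampling, which biases replay towards transitions with large prediction errors.

```python
import random
from collections import deque, namedtuple

# One unit of experience: state, action, resulting reward, next state.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


class ReplayBuffer:
    """Fixed-capacity store of recent transitions (hypothetical helper)."""

    def __init__(self, capacity, seed=0):
        # A bounded deque discards the oldest transition once capacity is reached.
        self.memory = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, transition, td_error=1.0):
        self.memory.append(transition)
        # Assumed priority: magnitude of the prediction error plus a small
        # offset so that every stored transition remains sampleable.
        self.priorities.append(abs(td_error) + 1e-3)

    def sample_uniform(self, batch_size):
        # Uniform sampling without replacement treats all transitions as equal.
        return self.rng.sample(list(self.memory), batch_size)

    def sample_prioritized(self, batch_size):
        # Sampling with replacement, weighted by priority, so transitions
        # with larger prediction errors are replayed more often.
        return self.rng.choices(
            list(self.memory), weights=list(self.priorities), k=batch_size
        )


# Usage: store five transitions in a buffer that holds only the three newest.
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(Transition(step, 0, 0.0, step + 1), td_error=float(step))
minibatch = buffer.sample_uniform(2)
```

In a full agent, each minibatch drawn this way would drive one weight update, interleaved with ongoing interaction with the environment.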
Box 10. Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
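Content-based addressing, as described in this box, can be sketched in a few lines. The following is an illustrative simplification rather than the NTM implementation of [125]: the function names, the `sharpness` parameter, and the choice of cosine similarity followed by a softmax are assumptions for the purpose of the example. It shows how a key emitted by a controller yields graded attention weights over memory slots, and how a read is a weighted blend of slot contents.

```python
import math


def cosine_similarity(u, v):
    # Similarity between a memory slot and the key; epsilon avoids division by zero.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)


def content_address(memory, key, sharpness=5.0):
    """Return graded attention weights over memory slots (they sum to 1)."""
    scores = [sharpness * cosine_similarity(slot, key) for slot in memory]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read_memory(memory, weights):
    """Read vector: attention-weighted sum of the memory slots."""
    width = len(memory[0])
    return [sum(w * slot[j] for w, slot in zip(weights, memory)) for j in range(width)]


# Usage: three slots, and a key that most resembles the first slot.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = content_address(memory, key=[0.9, 0.1, 0.0])
retrieved = read_memory(memory, weights)
```

Because the weights vary smoothly with the key, the whole read operation is differentiable, which is what allows this kind of addressing to be trained with gradient-based learning signals.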
Figure I schematic labels: Input (Xt), Output (Yt), Controller, Write heads, External memory, Read heads.
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk?’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
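The iterative retrieval just described can be made concrete with a toy sketch. It is not the memory network of [127]: bag-of-words counts stand in for the learned dense feature vectors, the function names are hypothetical, and the final step that emits an answer word is omitted. The sketch only shows the recirculation idea, in which the retrieved sentence is combined with the query and the combined representation is used to search earlier sentences.

```python
import math
from collections import Counter


def embed(sentence):
    # Assumption: bag-of-words counts replace the learned dense features.
    return Counter(sentence.lower().split())


def cosine(a, b):
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_chain(story, query, hops=2):
    """Return the supporting sentences found by repeated similarity search."""
    state = embed(query)
    limit = len(story)  # only sentences before this index are candidates
    chain = []
    for _ in range(hops):
        if limit == 0:
            break
        best = max(range(limit), key=lambda i: cosine(state, embed(story[i])))
        chain.append(story[best])
        # Recirculate: combine the retrieved sentence with the current state.
        state = state + embed(story[best])
        limit = best  # the next hop searches earlier in the story
    return chain


# Usage with the example from the text.
story = [
    "Joe traveled to the office",
    "Mary went to the garden",
    "Joe left the milk",
]
supporting = retrieve_chain(story, "where is the milk")
# supporting == ["Joe left the milk", "Joe traveled to the office"]
```

The first hop matches the query on ‘milk’; the second hop succeeds only because ‘Joe’ has been added to the search state by the first retrieval, which is the sense in which the output of the episodic store is fed back as a new input.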
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the ‘index maintenance’ problem), and if so how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?

References
1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O’Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc L B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O’Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia and the Hippocampal System, MIT Press
30. O’Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O’Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O’Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–214
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem. Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff’s syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66. Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67. Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104. Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) ‘Simple memory: a theory of archicortex’. Philos Trans R Soc B Biol Sci 370, 20140383
105. Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106. Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109. Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110. Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111. Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113. McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116. McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118. Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119. Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121. Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122. Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123. Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124. Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–21
129. Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
165 Hintzman DL (1986) ‘Schema abstraction’ in a multiple-trace memory model. Psychol Rev 93, 411–428
166 Suthana NA et al (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167 Wood ER et al (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168 Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169 Bower MR et al (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170 MacDonald CJ et al (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171 Markus EJ et al (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172 Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173 Kriegeskorte N et al (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174 Komorowski RW et al (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175 Ellenbogen JM et al (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176 Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177 Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178 Sharon T et al (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179 Merhav M et al (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180 Smith CN et al (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181 Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182 Greve A et al (2014) No evidence that ‘fast-mapping’ benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183 Schaul T et al (2016) Prioritized experience replay. In International Conference on Learning Representations
184 Gallistel CR (1990) The Organization of Learning, MIT Press
185 Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186 Santoro A et al (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187 Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
Box 5 Similarity-Based Coding in High-Level Visual Cortex
High-level visual regions of the neocortex are thought to support distributed representations that are inferred to be less sparse than those of the DG and the CA3/CA1 regions of the hippocampus (Box 4). Population sparseness in the ERC is estimated at 7–10% [158], with high-level sensory cortices exhibiting similar or higher levels of sparseness (e.g., variable estimates [44–46]). Although lifetime sparseness does not directly translate to population sparseness, recent evidence suggests that V4 and inferotemporal cortex (ITc) have a sparseness of 10% on this measure [159]. It is worth noting that learning rates may vary according to neuronal selectivity and lifetime sparseness, resulting in differences in learning rates across neocortical areas and hippocampal subregions. Neurons in early visual regions that encode frequently occurring features (i.e., edges) may have a relatively slow learning rate, while neurons in higher visual regions and beyond (e.g., ITc and perirhinal cortex) may have a higher learning rate to support the encoding of less-frequently occurring, more-conjunctive features (e.g., individual objects) [12,160,161].

Evidence from electrophysiological recording studies in high-level visual cortical regions such as the ITc in primates provides support for the operation of a similarity-based coding scheme – whereby related categories (e.g., dogs and cats) are represented by overlapping neuronal codes [17,40–43] (Figure I). Representational similarity analysis (RSA) of the ITc population response during passive viewing of pictures reveals coding of fine-grained categorical structure (e.g., of a set of animate and inanimate objects) that is well fit by deep convolutional neural networks, which have algorithmic parallels with feedforward processing in the ventral visual stream [17,40]. While analogous similarity-based coding was observed using fMRI in the human homolog of ITc [41], there was no evidence for greater within-category (cf. between-category) representational similarity in any subregion of the hippocampus in a recent fMRI study [162], which found evidence consistent with the importance of pattern separation in episodic memory. Instead, similarity-based coding in this study was observed in the perirhinal and parahippocampal cortex – MTL regions that project to the ERC and that are typically considered to be intermediate zones (i.e., between the hippocampal and neocortical systems) in CLS theory.
[Figure I panels: representational dissimilarity matrices for monkey ITc and human ITc. Rows and columns are grouped as animate (human and not human; body and face) and inanimate (natural and artificial). Color scale: dissimilarity (percentile of 1 − r), from 0 to 100.]
Figure I. Similarity-Based Coding in High-Level Visual Cortex. Representational dissimilarity matrices (RDM) reflect the correlation (i.e., 1 − r, where r is the Pearson correlation coefficient) between the response of voxel patterns (fMRI in humans [41], right panel) or neuronal populations (electrophysiological recording in monkey [43], left panel) to a set of 92 object images. RDMs are analogous in monkey and human ITc. The RDMs show that the representations of animate objects are similar, as are those of inanimate objects. In addition to this clear animate–inanimate distinction, object coding in ITc exhibits finer categorical structure (e.g., for faces, body parts) visible in these RDMs (also see [41]). Reproduced with permission from [41].
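The dissimilarity measure named in the caption (1 − r between response patterns) can be illustrated with a minimal sketch. The data here are synthetic: the two category prototypes, the noise level, and the names `dissimilarity_matrix`, `animate_proto`, and `inanimate_proto` are assumptions made for illustration, not values or code from the cited studies. The sketch returns raw 1 − r values rather than the percentile scaling used in the figure.

```python
import numpy as np

def dissimilarity_matrix(responses):
    """Return the matrix of 1 - Pearson r between rows.

    responses: array of shape (n_stimuli, n_channels), one response
    pattern (voxels or neurons) per stimulus.
    """
    return 1.0 - np.corrcoef(responses)

# Synthetic example (assumption): two categories, each built from a
# shared prototype pattern plus independent noise per stimulus.
rng = np.random.default_rng(0)
n_channels = 200
animate_proto = rng.normal(size=n_channels)
inanimate_proto = rng.normal(size=n_channels)
responses = np.vstack(
    [animate_proto + 0.5 * rng.normal(size=n_channels) for _ in range(4)]
    + [inanimate_proto + 0.5 * rng.normal(size=n_channels) for _ in range(4)]
)

rdm = dissimilarity_matrix(responses)

# Mean dissimilarity within categories (24 off-diagonal entries in the
# two 4x4 diagonal blocks) versus between categories.
within = (rdm[:4, :4].sum() + rdm[4:, 4:].sum()) / 24
between = rdm[:4, 4:].mean()
print(f"within-category: {within:.2f}, between-category: {between:.2f}")
```

In this toy setting, similarity-based coding shows up as lower within-category than between-category dissimilarity, which is the block structure described for the ITc matrices.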
rodents [72,73]. This generalized replay – simultaneous reactivation of multiple related traces during testing or offline periods – may facilitate the creation of new representations from the recombination of multiple related episodes (‘stored generalizations’) [5] and the discovery of novel relationships (e.g., shortcuts) [72,73]. Empirical evidence also supports a role for the hippocampus in category- and so-called ‘statistical’ learning [105–107]: the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6 Generalization Through Recurrence in the Hippocampal System
The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers: feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).

An essential principle of the model – mediated by the bidirectional excitatory connections between feature and conjunctive layers – is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed ‘big-loop’ recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time – thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism ([5] for details).

Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I diagram: a conjunctive layer with units AB, BC, CD, DE, and EF above a feature layer with units A, B, C, D, E, and F.]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE, showing its two-layer architecture with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
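The recurrent similarity computation described in this box can be sketched in a few lines. This is a deliberately simplified illustration and not the published REMERGE equations: the softmax competition (standing in for recurrent inhibition among conjunctive units), the clipping of feature activity to [0, 1], the `temperature` value, and the function name `settle` are all assumptions chosen to keep the example short.

```python
import numpy as np

FEATURES = ["A", "B", "C", "D", "E", "F"]
PAIRS = ["AB", "BC", "CD", "DE", "EF"]

# W[j, i] = 1 when feature i is a constituent of stored pair j
# (the bidirectional feature <-> conjunctive connections).
W = np.array([[1.0 if f in p else 0.0 for f in FEATURES] for p in PAIRS])

def settle(external, steps=20, temperature=0.5):
    """Recirculate conjunctive-layer output back onto the feature layer."""
    feat = external.copy()
    history = []
    for _ in range(steps):
        net = W @ feat                        # similarity of each stored pair to the current features
        z = np.exp((net - net.max()) / temperature)
        conj = z / z.sum()                    # softmax competition among conjunctive units
        feat = np.clip(external + W.T @ conj, 0.0, 1.0)  # external input plus retrieved content
        history.append(conj)
    return feat, conj, history

# Probe with A and C, which were never stored together as a pair.
probe = np.array([1.0, 0.0, 1.0, 0.0, 0.0, 0.0])
feat, conj, history = settle(probe)

for name, value in zip(PAIRS, conj):
    print(name, round(float(value), 3))
```

In this toy run the units AB and BC end up most active, B becomes active on the feature layer even though it was not presented, and CD (which also contains C) loses activation over iterations, mirroring the qualitative account in the text.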
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of ‘concept’ cells: hippocampal neurons which come to respond to common features across many events, for example cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [34,51–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7 Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context-specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG and also in the subiculum tended to discriminate between different images of a famous person – with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant ‘concept cell like’ responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).

It is also interesting to consider the finding of ‘splitter’ cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here, some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent ‘charts’ [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.

In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions – reward value and item – were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information – emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8 Rapid Integration of New Learning in the Neocortex: When Does it Occur?

In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.

A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.

Work with the ‘fast mapping’ paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects – one familiar and one unfamiliar – and are asked a question such as ‘is the numbat's tail pointing up?’, inferring that the unfamiliar name ‘numbat’ must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I panels: (A) Original paired associates; (B) Introduction of new paired associates. Numbers mark the flavor locations in the arena.]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid schema-dependent one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory – reduction of interference between similar but distinct events – may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function and the ubiquity of continual learning is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
Box 9 Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.

Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best 30% of the level achieved when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events – specifically, experiences that are associated with high reward prediction errors – yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
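The two sampling schemes contrasted in this box can be illustrated with a minimal replay buffer. This is a sketch under stated assumptions, not the DQN or prioritized-replay implementation: the class name `ReplayBuffer`, the `td_error` argument, and the small constant added to each priority are hypothetical, and the priority exponent and importance-sampling correction used in the prioritized method of [183] are omitted.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        # Oldest transitions are dropped automatically once capacity is reached.
        self.transitions = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, td_error=1.0):
        self.transitions.append((state, action, reward, next_state))
        # Priority grows with the size of the prediction error (assumed form).
        self.priorities.append(abs(td_error) + 1e-6)

    def sample_uniform(self, batch_size, rng=random):
        # Every stored transition is equally likely; breaks up temporal correlations.
        return rng.sample(list(self.transitions), batch_size)

    def sample_prioritized(self, batch_size, rng=random):
        # Sampling with replacement, weighted towards surprising transitions.
        return rng.choices(list(self.transitions),
                           weights=list(self.priorities), k=batch_size)

buffer = ReplayBuffer(capacity=1000)
for t in range(100):
    buffer.add(state=t, action=t % 4, reward=0.0, next_state=t + 1, td_error=0.01)
# One rare, highly surprising event.
buffer.add(state=100, action=0, reward=1.0, next_state=101, td_error=5.0)

rng = random.Random(0)
uniform_batch = buffer.sample_uniform(32, rng)
prioritized_batch = buffer.sample_prioritized(32, rng)
rare_in_prioritized = sum(1 for tr in prioritized_batch if tr[0] == 100)
print("rare event draws in prioritized batch:", rare_in_prioritized)
```

Under uniform sampling the rare transition appears at most once in a batch, whereas the prioritized sampler draws it repeatedly, which is the sense in which replay can reweight experience statistics.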
Box 10 Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory, and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].

The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network, but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).

While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
[Figure I diagram: the input (Xt) feeds the controller, which produces the output (Yt) and interacts with the external memory through write heads and read heads.]
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted with permission from [125].
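Content-based addressing, as described in this box, can be sketched as a softmax over similarities between a key and each memory slot. The example below is illustrative only: the function names `content_address` and `read`, the `sharpness` value, and the toy memory contents are assumptions, and the controller, write heads, and location-based addressing are left out.

```python
import numpy as np

def content_address(memory, key, sharpness=10.0):
    """Attention weights over memory slots from cosine similarity to a key.

    memory: (n_slots, width) matrix; key: (width,) vector emitted by the controller.
    sharpness: scales similarities before the softmax (higher = more focused).
    """
    eps = 1e-8
    cosine = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps)
    scores = sharpness * cosine
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    return weights / weights.sum()

def read(memory, weights):
    """Read vector: attention-weighted blend of memory slots."""
    return weights @ memory

# Toy memory (assumption): three stored patterns, one per slot.
memory = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
])
key = np.array([0.1, 0.9, 0.0, 0.0])   # noisy cue resembling slot 1

weights = content_address(memory, key)
retrieved = read(memory, weights)
print("attention:", np.round(weights, 3))
print("read vector:", np.round(retrieved, 3))
```

Because the weighting is graded rather than a hard lookup, the read operation is differentiable, which is what allows gradient-based training; the noisy cue retrieving the closest stored pattern also gives a loose analogy to pattern completion in attractor models of CA3.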
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus – recurrent similarity computation such that the output of the episodic system is recirculated as a new input – has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk?’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
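The iterative retrieval loop described above can be illustrated with a toy example built on the paragraph's own sentences. It is a sketch only: simple word overlap stands in for the learned dense feature vectors of a memory network, the response-generation step is omitted, and the function names (`bag_of_words`, `retrieve`, `multi_hop`) and the distractor sentence are assumptions introduced for illustration.

```python
def bag_of_words(text):
    """Crude stand-in for a learned feature representation: the set of words."""
    return set(text.lower().replace("?", "").split())

def retrieve(query_features, sentences, exclude):
    """Index of the not-yet-used sentence sharing the most words with the query."""
    candidates = [i for i in range(len(sentences)) if i not in exclude]
    return max(candidates,
               key=lambda i: len(query_features & bag_of_words(sentences[i])))

def multi_hop(query, story, hops=2):
    """Retrieve a sentence, fold it into the query, and retrieve again."""
    features = bag_of_words(query)
    used = []
    for _ in range(hops):
        idx = retrieve(features, story, used)
        used.append(idx)
        # Output of the memory is recirculated as part of the next input.
        features = features | bag_of_words(story[idx])
    return [story[i] for i in used]

story = [
    "Joe traveled to the office",
    "Mary went to the garden",     # distractor (assumption)
    "Joe left the milk",
]
chain = multi_hop("Where is the milk?", story)
print(chain)
```

The first hop retrieves the sentence about the milk; only after its content is combined with the query does the second hop favor the sentence about Joe over the distractor, which is the recirculation step that parallels REMERGE.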
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se – instead we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1 McClelland JL et al (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2 O’Neill J et al (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3 Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4 Zeithamova D et al (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the ‘index maintenance’ problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran, D. and McClelland, J.L. (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol. Rev. 119, 573–616
6. Eichenbaum, H. (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse, D. et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse, D. et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr, D. (1971) Simple memory: a theory for archicortex. Philos. Trans. R. Soc. Lond. B Biol. Sci. 262, 23–81
10. Rumelhart, D.E. et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski, T.J. and Rosenberg, C.R. (1987) Parallel networks that learn to pronounce English text. Complex Syst. 1, 145–168
12. Guyonneau, R. et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J. Physiol. Paris 98, 487–497
13. Plaut, D.C. et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol. Rev. 103, 56–115
15. Rumelhart, D.E. (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer, S.F. et al., eds), pp. 405–420, Academic Press
16. LeCun, Y. et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins, D.L. et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc. Natl. Acad. Sci. U. S. A. 111, 8619–8624
18. Yamins, D.L. and DiCarlo, J.J. (2016) Using goal-driven deep learning models to understand sensory cortex. Nat. Neurosci. 19, 356–365
19. Saxe, A.M. et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe, A.M. et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey, M. and Cohen, N.J. (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower, G.H., ed.), pp. 109–165, Academic Press
22. Ratcliff, R. (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol. Rev. 97, 285–308
23. French, R.M. (1999) Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3, 128–135
24. Carpenter, G.A. and Grossberg, S. (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput. Vision Graph. Image Process. 37, 54–115
25. McNaughton, B.L. and Morris, R.G. (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci. 10, 408–415
26. Treves, A. and Rolls, E.T. (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly, R.C. and McClelland, J.L. (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim, J.J. et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen, N.J. and Eichenbaum, H.B. (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O'Reilly, R.C. and Rudy, J.W. (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol. Rev. 108, 311–345
31. Norman, K.A. and O'Reilly, R.C. (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes, A. et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn. Sci. 11, 126–135
33. Davachi, L. (2006) Item, context and relational episodic encoding in humans. Curr. Opin. Neurobiol. 16, 693–700
34. Squire, L.R. et al. (2004) The medial temporal lobe. Annu. Rev. Neurosci. 27, 279–306
35. Schiller, D. et al. (2015) Memory and space: towards an understanding of the cognitive map. J. Neurosci. 35, 13904–13911
36. O'Reilly, R.C. et al. (2014) Complementary learning systems. Cogn. Sci. 38, 1229–1248
37. Knierim, J.J. and Neunuebel, J.P. (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol. Learn. Mem. 129, 38–49
38. Johnston, S.T. et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol. Learn. Mem. 129, 60–68
39. Bengio, Y. et al. (2013) Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828
40. Khaligh-Razavi, S.M. and Kriegeskorte, N. (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput. Biol. 10, e1003915
41. Kriegeskorte, N. et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke, A. and Tyler, L.K. (2014) Object-specific semantic coding in human perirhinal cortex. J. Neurosci. 34, 4766–4775
43. Kiani, R. et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J. Neurophysiol. 97, 4296–4309
44. McNaughton, B.L. (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell. 174, 205–214
45. Leibold, C. and Kempter, R. (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb. Cortex 18, 67–77
46. Rolls, E.T. et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp. Brain Res. 114, 149–162
47. Barnes, C.A. et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog. Brain Res. 83, 287–300
48. McKenzie, S. et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol. Learn. Mem. Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting, J. (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland, J.L. (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian, P. et al., eds), pp. 99–129, MIT Press
51. Frankland, P.W. and Bontempi, B. (2005) The organization of recent and remote memories. Nat. Rev. Neurosci. 6, 119–130
52. Winocur, G. et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire, L.R. et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner, H. and Parker, E.S., eds), pp. 185–210, Psychology Press
54. Robins, A. (1996) Consolidation in neural networks and in the sleeping brain. Conn. Sci. 8, 259–276
55. Tononi, G. and Cirelli, C. (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji, D. and Wilson, M.A. (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat. Neurosci. 10, 100–107
66. Lansink, C.S. et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol. 7, e1000173
67. Ego-Stengel, V. and Wilson, M.A. (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara, C.G. et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat. Neurosci. 17, 1658–1660
87. Sara, S.J. (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat. Rev. Neurosci. 10, 211–223
88. McGaugh, J.L. (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu. Rev. Neurosci. 27, 1–28
89. Redondo, R.L. and Morris, R.G. (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat. Rev. Neurosci. 12, 17–30
90. Kumaran, D. (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front. Hum. Neurosci. 6, 157
91. Bunsey, M. and Eichenbaum, H. (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova, D. and Preston, A.R. (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J. Neurosci. 30, 14676–14684
93. Preston, A.R. et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek, J.A. and Eichenbaum, H. (1997) The hippocampus and memory for orderly stimulus relations. Proc. Natl. Acad. Sci. U. S. A. 94, 7109–7114
95. Shohamy, D. and Wagner, A.D. (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova, D. et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic, B. et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr. Biol. 25, 821–830
98. Schlichting, M.L. et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat. Commun. 6, 8151
99. Eichenbaum, H. et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard, M.W. et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol. Rev. 112, 75–116
101. Kloosterman, F. et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum, H. and Cohen, N.J. (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess, N. (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen, P. et al., eds), pp. 715–750, Oxford University Press
104. Willshaw, D.J. et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) 'Simple memory: a theory of archicortex'. Philos. Trans. R. Soc. B Biol. Sci. 370, 20140383
105. Schapiro, A.C. et al. (2014) The necessity of the medial temporal lobe for statistical learning. J. Cogn. Neurosci. 26, 1736–1747
106. Knowlton, B.J. and Squire, L.R. (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy, D. and Turk-Browne, N.B. (2013) Mechanisms for widespread hippocampal involvement in cognition. J. Exp. Psychol. Gen. 142, 1159–1170
109. Tamminen, J. et al. (2015) From specific examples to general knowledge in language learning. Cogn. Psychol. 79, 1–39
110. Walker, M.P. and Stickgold, R. (2010) Overnight alchemy: sleep-dependent memory evolution. Nat. Rev. Neurosci. 11, 218
111. Wood, E.R. et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum, H. (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat. Rev. Neurosci. 15, 732–744
113. McKenzie, S. et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga, R.Q. et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland, J.L. (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J. Exp. Psychol. Gen. 142, 1190–1210
116. McClelland, J.L. and Goddard, N.H. (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton, G.E. et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart, D.E. et al., eds), pp. 77–109, MIT Press
118. Krizhevsky, A. et al. (2012) Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1106–1114
119. Mnih, V. et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme, C.B. et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc. Natl. Acad. Sci. U. S. A. 111, 18428–18435
121. Samsonovich, A. and McNaughton, B.L. (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J. Neurosci. 17, 5900–5920
122. Buzsaki, G. and Moser, E.I. (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat. Neurosci. 16, 130–138
123. Renno-Costa, C. et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput. Biol. 10, e1003641
124. Wills, T.J. et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville, W.B. and Milner, B. (1957) Loss of recent memory after bilateral hippocampal lesions. J. Neurol. Neurosurg. Psychiatry 20, 11–12
129. Nadel, L. and Moscovitch, M. (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr. Opin. Neurobiol. 7, 217–227
130. Moscovitch, M. et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J. Anat. 207, 35–66
131. Yassa, M.A. and Stark, C.E. (2011) Pattern separation in the hippocampus. Trends Neurosci. 34, 515–525
132. Liu, X. et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb, J.K. et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb, S. et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh, T.J. et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel, J.P. and Knierim, J.J. (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa, K. et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek, K. et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards, B.A. et al. (2014) Patterns across multiple memories are identified over time. Nat. Neurosci. 17, 981–986
141. Ketz, N. et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput. Biol. 9, e1003067
142. Kumaran, D. and Maguire, E.A. (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn. Sci. 13, 47–54
143. Moser, E.I. and Moser, M.B. (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri, R. and Fiete, I. (2016) Computational principles of memory. Nat. Neurosci. 19, 394–403
145. Lee, H. et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu, L. et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin, S.H. et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat. Neurosci. 18, 1562–1564
148. Poppenk, J. et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn. Sci. 17, 230–240
149. Strange, B.A. et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat. Rev. Neurosci. 15, 655–669
150. Ranganath, C. and Ritchey, M. (2012) Two cortical systems for memory-guided behaviour. Nat. Rev. Neurosci. 13, 713–726
151. Hasselmo, M.E. and Schnell, E. (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J. Neurosci. 14, 3898–3914
152. Vazdarjanova, A. and Guzowski, J.F. (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J. Neurosci. 24, 6489–6496
161. Grossberg, S. (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn. Sci. 11, 23–63
162. LaRocque, K.F. et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J. Neurosci. 33, 5466–5474
163. McClelland, J.L. and Rumelhart, D.E. (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol. Rev. 88, 375–407
165. Hintzman, D.L. (1986) 'Schema abstraction' in a multiple-trace memory model. Psychol. Rev. 93, 411–428
166. Suthana, N.A. et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc. Natl. Acad. Sci. U. S. A. 112, 10503–10508
167. Wood, E.R. et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu, J. and Shapiro, M.L. (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower, M.R. et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J. Neurosci. 25, 1313–1323
170. MacDonald, C.J. et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J. Neurosci. 33, 14607–14616
171. Markus, E.J. et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J. Neurosci. 15, 7079–7094
172. Skaggs, W.E. and McNaughton, B.L. (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J. Neurosci. 18, 8455–8466
173. Kriegeskorte, N. et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2, 4
174. Komorowski, R.W. et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J. Neurosci. 29, 9918–9929
175. Ellenbogen, J.M. et al. (2007) Human relational memory requires time and sleep. Proc. Natl. Acad. Sci. U. S. A. 104, 7723–7728
176. Dumay, N. and Gaskell, M.G. (2007) Sleep-associated changes in the mental representation of spoken words. Psychol. Sci. 18, 35–39
177. Coutanche, M.N. and Thompson-Schill, S.L. (2014) Fast mapping rapidly integrates information into existing memory networks. J. Exp. Psychol. Gen. 143, 2296–2303
178. Sharon, T. et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc. Natl. Acad. Sci. U. S. A. 108, 1146–1151
179. Merhav, M. et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith, C.N. et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc. Natl. Acad. Sci. U. S. A. 111, 475–479
181. Warren, D.E. and Duff, M.C. (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve, A. et al. (2014) No evidence that 'fast-mapping' benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul, T. et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel, C.R. (1990) The Organization of Learning, MIT Press
185. Hochreiter, S. and Schmidhuber, J. (1997) Long short-term memory. Neural Comput. 9, 1735–1780
186. Santoro, A. et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves, A. and Rolls, E.T. (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
rodents [72,73]. This generalized replay (simultaneous reactivation of multiple related traces during testing or offline periods) may facilitate the creation of new representations from the recombination of multiple related episodes ('stored generalizations') [5] and the discovery of novel relationships (e.g., shortcuts) [72,73].
Empirical evidence also supports a role for the hippocampus in category- and so-called 'statistical' learning [105–107]; the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6. Generalization Through Recurrence in the Hippocampal System
The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers, feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed, pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).
An essential principle of the model, mediated by the bidirectional excitatory connections between feature and conjunctive layers, is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed 'big-loop' recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time, thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism ([5] for details).
Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165] in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I: two-layer schematic. Conjunctive layer units: AB, BC, CD, DE, EF. Feature layer units: A, B, C, D, E, F.]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE, showing its two-layer architecture with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
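The settling process described in this box can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published REMERGE implementation: the softmax competition standing in for recurrent inhibition, the tanh squashing on the feature layer, the parameter values (temperature, rate, n_steps) and all function names are hypothetical choices made for simplicity.

```python
import numpy as np

FEATURES = ["A", "B", "C", "D", "E", "F"]   # feature-layer units
PAIRS = ["AB", "BC", "CD", "DE", "EF"]      # conjunctive units (stored episodes)

def build_weights():
    """Bidirectional weights: 1 where a feature is a constituent of a pair."""
    w = np.zeros((len(PAIRS), len(FEATURES)))
    for i, pair in enumerate(PAIRS):
        for item in pair:
            w[i, FEATURES.index(item)] = 1.0
    return w

def settle(probe, n_steps=50, temperature=0.5, rate=0.2):
    """Let external input recirculate between the two layers (assumed dynamics)."""
    w = build_weights()
    ext = np.array([1.0 if f in probe else 0.0 for f in FEATURES])
    feat = ext.copy()
    conj = np.zeros(len(PAIRS))
    for _ in range(n_steps):
        # Feature -> conjunctive: similarity of each stored pair to the current
        # feature pattern, with softmax competition among conjunctive units.
        conj_net = w @ feat
        e = np.exp((conj_net - conj_net.max()) / temperature)
        conj += rate * (e / e.sum() - conj)
        # Conjunctive -> feature: external input plus retrieved content.
        feat_net = ext + w.T @ conj
        feat += rate * (np.tanh(feat_net) - feat)
    return dict(zip(FEATURES, feat)), dict(zip(PAIRS, conj))

if __name__ == "__main__":
    feat, conj = settle({"A", "C"})
    print({k: round(float(v), 3) for k, v in feat.items()})
    print({k: round(float(v), 3) for k, v in conj.items()})
```

When probed with A and C, the two conjunctive units that share B (AB and BC) should end up most active, and B should receive more recurrent support than the unprobed items D to F. That is the sense in which recurrence links A and C in this toy version.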
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of 'concept' cells: hippocampal neurons which come to respond to common features across many events, for example cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [34,51–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7. Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG, and also in the subiculum, tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant 'concept cell like' responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).
It is also interesting to consider the finding of 'splitter' cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–169]: here some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent 'charts' [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.
In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions (reward value and item) were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8. Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]).
However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the 'fast mapping' paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects, one familiar and one unfamiliar, and are asked a question such as 'is the numbat's tail pointing up?', inferring that the unfamiliar name 'numbat' must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I: (A) Original paired associates; (B) Introduction of new paired associates. Arena locations are numbered.]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid schema-dependent one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory – reduction of interference between similar but distinct events – may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function, and the ubiquity of continual learning, is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space.
Box 9 Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer, similar to the hippocampus, is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best 30% of that obtained when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events – specifically, experiences that are associated with high reward prediction errors – yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
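The two sampling schemes described in this box can be illustrated with a minimal Python sketch. It is not the DQN implementation of [119] or the prioritized scheme of [183]: the class name `ReplayBuffer`, its methods, and the small priority offset are assumptions made here for illustration, and the prioritized variant simply samples in proportion to the absolute error supplied for each transition.

```python
import random
from collections import deque


class ReplayBuffer:
    """Illustrative fixed-capacity store of transitions (hypothetical API)."""

    def __init__(self, capacity, seed=0):
        # deque(maxlen=...) silently drops the oldest transition when full,
        # so the buffer always holds the most recent experiences.
        self.transitions = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, transition, td_error=1.0):
        """Store one (state, action, reward, next_state) tuple."""
        self.transitions.append(transition)
        # Assumed offset: keeps zero-error transitions sampleable.
        self.priorities.append(abs(td_error) + 1e-3)

    def sample_uniform(self, batch_size):
        """Uniform sampling without replacement: all transitions treated equally."""
        return self.rng.sample(list(self.transitions), batch_size)

    def sample_prioritized(self, batch_size):
        """Sampling with replacement, in proportion to the stored |error|."""
        return self.rng.choices(
            list(self.transitions), weights=list(self.priorities), k=batch_size
        )
```

Drawing each training batch from the buffer rather than from the latest transitions is what interleaves older experiences with ongoing play; the prioritized variant additionally over-represents the rare, high-error transitions.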
Box 10 Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory, and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., 'key') emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section 'Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment').
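Content-based addressing as described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the NTM of [125]: cosine similarity followed by a softmax is one common way to obtain graded attention weights, and the function names and the sharpness parameter `beta` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def content_address(memory, key, beta=5.0):
    """Graded attention over memory rows: softmax of beta * similarity to the key."""
    scores = [beta * cosine(row, key) for row in memory]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Read vector: attention-weighted sum of the memory rows."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]
```

Because every slot receives a nonzero, smoothly varying weight, small changes in the key produce small changes in the read vector, which is the property that lets such a memory be trained with gradient-based learning signals.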
[Figure I diagram labels: Input (Xt), Output (Yt), Controller, Write heads, External memory, Read heads.]
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted: one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus – recurrent similarity computation, such that the output of the episodic system is recirculated as a new input – has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a 'memory network' [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., 'where is the milk?') is used to retrieve the sentence with the most similar feature vector in the database (e.g., 'Joe left the milk'); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story ('Joe traveled to the office'); this process iterates until a response is emitted by the network ('the office'). The joint dependence of this system on input and output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
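The iterative retrieval just described can be made concrete with a toy Python sketch. It is not the memory network of [127]: word overlap between bags of words stands in for the learned dense feature vectors, temporal order within the story is ignored, and the function names are hypothetical.

```python
def bag_of_words(text):
    """Lower-cased set of words; a crude stand-in for a learned feature vector."""
    return set(text.lower().replace("?", "").split())


def most_similar(query_words, sentences, exclude):
    """Index of the not-yet-retrieved sentence sharing the most words with the query."""
    best, best_score = None, -1
    for idx, sentence in enumerate(sentences):
        if idx in exclude:
            continue
        score = len(query_words & bag_of_words(sentence))
        if score > best_score:  # ties resolve to the earliest sentence
            best, best_score = idx, score
    return best


def multi_hop(query, story, hops=2):
    """Retrieve a sentence, fold it into the query, and retrieve again."""
    words = bag_of_words(query)
    used = []
    for _ in range(hops):
        idx = most_similar(words, story, exclude=used)
        if idx is None:
            break
        used.append(idx)
        # Recirculation: the retrieved sentence becomes part of the next query.
        words = words | bag_of_words(story[idx])
    return [story[i] for i in used]
```

With the story `["Joe traveled to the office", "Joe left the milk", "Mary went to the garden"]` and the query `"where is the milk?"`, the first hop retrieves the milk sentence and the second hop, now carrying the word 'Joe', retrieves the office sentence.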
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se – instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O'Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the 'index maintenance' problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc L B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O'Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O'Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O'Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–2014
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66. Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67. Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104. Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) 'Simple memory: a theory of archicortex'. Philos Trans R Soc B Biol Sci 370, 20140383
105. Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106. Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109. Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110. Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111. Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113. McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116. McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118. Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119. Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121. Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122. Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123. Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124. Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129. Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165. Hintzman DL (1986) 'Schema abstraction' in a multiple-trace memory model. Psychol Rev 93, 411–428
166. Suthana NA et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167. Wood ER et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower MR et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170. MacDonald CJ et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171. Markus EJ et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172. Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173. Kriegeskorte N et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174. Komorowski RW et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175. Ellenbogen JM et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176. Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177. Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178. Sharon T et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179. Merhav M et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith CN et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181. Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve A et al. (2014) No evidence that 'fast-mapping' benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul T et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel CR (1990) The Organization of Learning, MIT Press
185. Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186. Santoro A et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
rodents [72,73]. This generalized replay – simultaneous reactivation of multiple related traces during testing or offline periods – may facilitate the creation of new representations from the recombination of multiple related episodes ('stored generalizations') [5] and the discovery of novel relationships (e.g., shortcuts) [72,73]. Empirical evidence also supports a role for the hippocampus in category- and so-called 'statistical' learning [105–107]: the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6 Generalization Through Recurrence in the Hippocampal System
The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers, feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).
An essential principle of the model – mediated by the bidirectional excitatory connections between feature and conjunctive layers – is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed 'big-loop' recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time – thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism ([5] for details). Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I schematic. Conjunctive layer units: AB, BC, CD, DE, EF. Feature layer units: A, B, C, D, E, F.]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE, showing its two-layer architecture with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
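The big-loop recurrence described in this box can be illustrated with a toy computation. The sketch below is only loosely inspired by REMERGE and is not the published model: the softmax competition, the unsquashed feature update, and the parameter names `n_cycles` and `temperature` are assumptions chosen for brevity. It uses the six features and five conjuncts shown in Figure I.

```python
import math

FEATURES = ["A", "B", "C", "D", "E", "F"]
CONJUNCTS = ["AB", "BC", "CD", "DE", "EF"]


def softmax(values, temperature):
    """Competitive normalisation, standing in for inhibition among conjuncts."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def recurrent_probe(probe, n_cycles=20, temperature=0.5):
    """Run repeated cycles: features -> conjuncts -> features.

    `probe` is the set of externally presented features, e.g. {"A", "C"}.
    Returns (feature_activity, conjunct_activity) as dicts.
    """
    external = {f: (1.0 if f in probe else 0.0) for f in FEATURES}
    feature = dict(external)
    conjunct = {c: 0.0 for c in CONJUNCTS}
    for _ in range(n_cycles):
        # Each conjunct sums the activity of its two constituent features.
        net = [sum(feature[f] for f in c) for c in CONJUNCTS]
        conjunct = dict(zip(CONJUNCTS, softmax(net, temperature)))
        # Features receive the external probe plus feedback from conjuncts,
        # so the next cycle operates on inputs shaped by the network's output.
        feature = {
            f: external[f] + sum(a for c, a in conjunct.items() if f in c)
            for f in FEATURES
        }
    return feature, conjunct


if __name__ == "__main__":
    feature, conjunct = recurrent_probe({"A", "C"})
    for name, value in feature.items():
        print(f"{name}: {value:.3f}")
```

Probing with A and C leaves the unpresented feature B more active than D, E, or F, which is the toy analogue of inferring a link between A and C through the shared element B.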
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7, p. 523
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of 'concept' cells: hippocampal neurons which come to respond to common features across many events, for example cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [34,51–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7 Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG and also in the subiculum tended to discriminate between different images of a famous person – with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant 'concept cell like' responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).
It is also interesting to consider the finding of 'splitter' cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here, some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent 'charts' [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.
In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions – reward value and item – were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information – emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8 Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the 'fast mapping' paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects – one familiar and one unfamiliar – and are asked a question such as 'is the numbat's tail pointing up?', inferring that the unfamiliar name 'numbat' must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I panels. (A) Original paired associates. (B) Introduction of new paired associates. Both panels show numbered arena locations.]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid, schema-dependent, one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory – reduction of interference between similar but distinct events – may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function, and the ubiquity of continual learning, is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
Box 9 Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated, time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance to, at best, 30% of the level when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events – specifically, experiences that are associated with high reward prediction errors – yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
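The two sampling schemes contrasted in this box can be sketched as a small buffer class. This is a minimal illustration under stated assumptions, not the DQN implementation of [119] or the prioritized method of [183]: the class name `ReplayBuffer`, the `Transition` fields, and the use of a raw priority value as a sampling weight are all hypothetical simplifications, and the refinements of the published methods are omitted.

```python
import random
from collections import deque
from typing import NamedTuple


class Transition(NamedTuple):
    """One unit of experience: state, action, resulting reward, next state."""
    state: int
    action: int
    reward: float
    next_state: int


class ReplayBuffer:
    """Fixed-capacity store of recent transitions (oldest are overwritten)."""

    def __init__(self, capacity, seed=0):
        self.memory = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, transition, priority=1.0):
        # `priority` stands in for the size of the reward prediction error.
        self.memory.append(transition)
        self.priorities.append(priority)

    def __len__(self):
        return len(self.memory)

    def sample_uniform(self, batch_size):
        """Treat all stored transitions as equal (no repeats within a batch)."""
        return self.rng.sample(list(self.memory), batch_size)

    def sample_prioritized(self, batch_size):
        """Bias replay towards high-priority transitions (with replacement)."""
        return self.rng.choices(
            list(self.memory), weights=list(self.priorities), k=batch_size
        )


if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=5)
    for step in range(8):
        buffer.add(Transition(step, 0, 0.0, step + 1))
    print(len(buffer), "transitions retained")
    print(buffer.sample_uniform(3))
```

Because batches are drawn from across the buffer rather than from the most recent steps, consecutive (and therefore correlated) samples are unlikely to dominate any single update.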
Box 10 Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., 'key') emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network, but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section 'Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment').
[Figure I schematic labels: Input (Xt), Output (Yt), Controller, Write heads, Read heads, External memory.]
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted: one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
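Content-based addressing, as described in this box, can be sketched in a few lines. The example assumes cosine similarity between the key and each memory slot followed by a softmax, which is one common way to obtain graded attention weights. The function names and the `sharpness` parameter are illustrative, not taken from [125], and the write heads and location-based addressing are left out.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def content_address(memory, key, sharpness=5.0):
    """Graded attention weights over memory slots, given a controller key.

    Larger `sharpness` concentrates the weights on the best-matching slot.
    """
    scores = [sharpness * cosine_similarity(row, key) for row in memory]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Read vector: attention-weighted sum of the memory slots."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]


if __name__ == "__main__":
    memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    key = [0.9, 0.1, 0.0]
    weights = content_address(memory, key)
    print("weights:", [round(w, 3) for w in weights])
    print("read:", [round(x, 3) for x in read(memory, weights)])
```

Because every step is a smooth function of the key and the memory contents, a read of this kind can be trained with gradient-based learning signals, which is the property the box highlights.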
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus – recurrent similarity computation such that the output of the episodic system is recirculated as a new input – has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a 'memory network' [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., 'where is the milk?') is used to retrieve the sentence with the most similar feature vector in the database (e.g., 'Joe left the milk'); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story ('Joe traveled to the office'); this process iterates until a response is emitted by the network ('the office'). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
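The iterative retrieval just described can be sketched as follows. This is a toy illustration, not the memory network of [127]: shared-word counts stand in for the learned dense feature vectors, the function names and the distractor sentence about Mary are hypothetical, and the final step that emits an answer word is omitted.

```python
def words(sentence):
    """Lower-cased word set; a crude stand-in for a learned feature vector."""
    return {w.strip("?.,!") for w in sentence.lower().split()}


def best_match(cue, candidates):
    """Index of the candidate sentence sharing the most words with the cue."""
    return max(range(len(candidates)), key=lambda i: len(cue & words(candidates[i])))


def supporting_chain(story, question, hops=2):
    """Retrieve a chain of sentences, recirculating each retrieval into the cue."""
    cue = words(question)
    limit = len(story)  # only sentences before this index are searched
    chain = []
    for _ in range(hops):
        if limit == 0:
            break
        idx = best_match(cue, story[:limit])
        chain.append(story[idx])
        # Combine the query with what was retrieved, then look earlier in the story.
        cue = cue | words(story[idx])
        limit = idx
    return chain


if __name__ == "__main__":
    story = [
        "Mary went to the garden",
        "Joe traveled to the office",
        "Joe left the milk",
    ]
    for sentence in supporting_chain(story, "where is the milk?"):
        print(sentence)
```

The second hop succeeds only because the first retrieval adds 'Joe' to the cue, which mirrors the way REMERGE recirculates the output of the episodic system as a new input.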
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se – instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O'Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the 'index maintenance' problem), and if so how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc L B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System. MIT Press
30. O'Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O'Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O'Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–2014
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65 JiD andWilson MA (2007)Coordinatedmemory replayin thevisual cortex and hippocampus during sleepNat Neurosci 10100ndash107
66 Lansink CS etal (2009) Hippocampus leadsventral striatum inreplay of placendashreward information PLoS Biol 7 e1000173
67 Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associatedhippocampal activity during rest impairs spatial learn-ing in the rat Hippocampus 201ndash10
86. McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104. Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) ‘Simple memory: a theory of archicortex’. Philos Trans R Soc B Biol Sci 370, 20140383
105. Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106. Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109. Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110. Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111. Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113. McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116. McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118. Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119. Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121. Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122. Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123. Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124. Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129. Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165. Hintzman DL (1986) ‘Schema abstraction’ in a multiple-trace memory model. Psychol Rev 93, 411–428
166. Suthana NA et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167. Wood ER et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower MR et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170. MacDonald CJ et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171. Markus EJ et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172. Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173. Kriegeskorte N et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174. Komorowski RW et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175. Ellenbogen JM et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176. Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177. Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178. Sharon T et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179. Merhav M et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith CN et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181. Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve A et al. (2014) No evidence that ‘fast-mapping’ benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul T et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel CR (1990) The Organization of Learning, MIT Press
185. Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186. Santoro A et al. (2016) Meta-learning with memory augmented neural networks. In International Conference on Machine Learning
187. Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
rodents [72,73]. This generalized replay – simultaneous reactivation of multiple related traces during testing or offline periods – may facilitate the creation of new representations from the recombination of multiple related episodes (‘stored generalizations’) [5] and the discovery of novel relationships (e.g., shortcuts) [72,73]. Empirical evidence also supports a role for the hippocampus in category- and so-called ‘statistical’ learning [105–107]; the mechanisms in REMERGE and other related models that rely on separate memory traces for individual items allow weak hippocampal traces that support only relatively poor item recognition to mediate near-normal generalization [5,108].
Box 6. Generalization Through Recurrence in the Hippocampal System

The REMERGE model (Figure I) [5], which reflects a synthesis of interactive activation and competition (IAC) models [163] and exemplar models of memory [108,164,165], constitutes an abstraction and simplification of the multi-stage circuitry of the hippocampal system into two principal layers, feature and conjunctive layers, broadly corresponding to the ERC and hippocampus proper, respectively. The localist coding (e.g., unit AB) in the conjunctive layer reflects an idealization of the sparsely distributed, pattern-separated codes in the DG/CA3 subregions of the hippocampus (Boxes 2–4) that support episodic memory (e.g., for trials involving presentation of A and B objects together).

An essential principle of the model, mediated by the bidirectional excitatory connections between feature and conjunctive layers, is the principle of recurrence between the hippocampus proper and neocortical regions such as the ERC (termed ‘big-loop’ recurrence to distinguish it from the internal recurrence known to exist within the CA3 region). This allows recirculation of network output as a subsequent input to the system. Intuitively, this functionality is crucial to allowing the model to discover the higher-order structure present within a set of related episodes: an initial probe on the feature layer (e.g., denoting stimuli present on screen during a test trial) prompts the activation of experiences containing these elements on the conjunctive layer, which in turn drives a new pattern of feature layer activity that reflects not only the external input but also the content of retrieved experiences. This in turn leads to the activation of conjunctive units denoting experiences related to the new feature layer pattern, and so on. This can bring about a situation where, for example, the presentation of A and C can result in the activation of AB and BC, which jointly activate B, in turn further activating AB and BC, which then suppress other conjuncts involving A and C. This produces a stable state in which AB, BC, and A, B, and C are all activated at the same time, thereby effectively inferring a link between A and C. Longer-range inferences (e.g., B–E) can also be supported by the recurrent mechanism (see [5] for details). Formally, the function of the network can be viewed as carrying out recurrent similarity computation. Unlike other exemplar models [108,164,165], in which similarity computation is performed only on external inputs, REMERGE performs such computations on inputs affected by its own outputs.
[Figure I: two-layer schematic. Conjunctive layer units: AB, BC, CD, DE, EF. Feature layer units: A, B, C, D, E, F.]
Figure I. A Schematic of the Architecture of REMERGE. Recurrent architecture of REMERGE showing its two-layer architecture, with input/output units for possible constituents of experiences (A–F), conjunctive units representing pairs of constituents that have occurred together (AB, BC, etc.), bidirectional connections (broken arrows) between conjuncts and their constituents, and recurrent inhibition (broad arrow) among conjunctive units. Adapted from [5].
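The recurrent similarity computation described in Box 6 can be illustrated with a small sketch. This is not the published REMERGE implementation: the softmax competition (standing in for recurrent inhibition), the saturation of feature activity at 1, and the parameter names `temperature` and `feedback` are all assumptions chosen for illustration.

```python
import math

FEATURES = ["A", "B", "C", "D", "E", "F"]
CONJUNCTS = ["AB", "BC", "CD", "DE", "EF"]  # stored pairs, as in Figure I


def softmax(values, temperature):
    """Competition among conjunctive units (assumed stand-in for inhibition)."""
    top = max(values)
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def remerge_settle(probe, steps=20, temperature=0.25, feedback=1.0):
    """Recirculate conjunctive output back to the feature layer for `steps` cycles."""
    external = {f: (1.0 if f in probe else 0.0) for f in FEATURES}
    feature = dict(external)
    conj = {c: 0.0 for c in CONJUNCTS}
    for _ in range(steps):
        # Each conjunctive unit sums the activity of its two constituents.
        net = [sum(feature[f] for f in c) for c in CONJUNCTS]
        conj = dict(zip(CONJUNCTS, softmax(net, temperature)))
        # Feature activity = external input plus feedback from active conjuncts,
        # saturating at 1 (a hypothetical simplification).
        feature = {
            f: min(1.0, external[f]
                   + feedback * sum(a for c, a in conj.items() if f in c))
            for f in FEATURES
        }
    return feature, conj


feature, conj = remerge_settle({"A", "C"})
print({k: round(v, 3) for k, v in feature.items()})
print({k: round(v, 3) for k, v in conj.items()})
```

Probing with A and C drives AB, BC and CD equally at first; feedback through the shared element B then favors AB and BC, and the competition suppresses CD, which mirrors the qualitative account given in the Box.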
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of ‘concept’ cells: hippocampal neurons which come to respond to common features across many events, for example cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation

It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [3451–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7. Concept Cells and Nodal Codings

Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG, and also in the subiculum, tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant ‘concept cell like’ responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).

It is also interesting to consider the finding of ‘splitter’ cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent ‘charts’ [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.

In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions (reward value and item) were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8. Rapid Integration of New Learning in the Neocortex: When Does it Occur?

In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.

A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175] and others have reported increased lexical competition and related phenomena [109,176] attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.

Work with the ‘fast mapping’ paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects, one familiar and one unfamiliar, and are asked a question such as ‘is the numbat's tail pointing up?’, inferring that the unfamiliar name ‘numbat’ must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I: two panels, (A) ‘Original paired associates’ and (B) ‘Introduction of new paired associates’, each showing numbered locations in the arena.]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid, schema-dependent, one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function, and the ubiquity of continual learning, is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
Box 9. Experience Replay in Deep Q-Networks

Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer (see [119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.

Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best 30% of that obtained when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events (specifically, experiences that are associated with high reward prediction errors) yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
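The two sampling schemes contrasted in Box 9 can be sketched as follows. This is a minimal illustration rather than the code used in [119] or [183]: the class and method names are hypothetical, priorities are taken to be the absolute prediction error plus a small constant raised to an assumed exponent `alpha`, and refinements such as importance-sampling corrections are omitted.

```python
import random
from collections import deque, namedtuple

# One unit of experience: state, action, resulting reward, next state.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


class ReplayBuffer:
    """Fixed-capacity store of recent experience (hypothetical sketch)."""

    def __init__(self, capacity, seed=0):
        self.memory = deque(maxlen=capacity)      # oldest items are discarded
        self.priorities = deque(maxlen=capacity)  # kept aligned with memory
        self.rng = random.Random(seed)

    def store(self, transition, td_error=1.0, eps=1e-3):
        self.memory.append(transition)
        self.priorities.append(abs(td_error) + eps)

    def sample_uniform(self, batch_size):
        """Uniform replay: every stored transition is equally likely."""
        return self.rng.sample(list(self.memory), batch_size)

    def sample_prioritized(self, batch_size, alpha=0.6):
        """Biased replay: sampling probability grows with prediction error."""
        weights = [p ** alpha for p in self.priorities]
        return self.rng.choices(list(self.memory), weights=weights, k=batch_size)


buf = ReplayBuffer(capacity=50)
for t in range(100):
    buf.store(Transition(t, 0, 0.0, t + 1), td_error=0.0)
buf.store(Transition("rare", 1, 10.0, "end"), td_error=100.0)

uniform_batch = buf.sample_uniform(8)        # non-adjacent, decorrelated samples
biased_batch = buf.sample_prioritized(8)     # dominated by the surprising event
print([tr.state for tr in uniform_batch])
print([tr.state for tr in biased_batch])
```

Drawing minibatches from the buffer, rather than learning from each transition as it arrives, is what breaks up the correlations between consecutive samples; the prioritized variant additionally lets a single rare but surprising transition be revisited many times.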
Box 10. Neural Networks with External Memory and the Hippocampus

The neural Turing machine (NTM) [125] consists of two basic components: an external memory, and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].

The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).

While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
[Figure I: schematic in which the Input (Xt) feeds a Controller that produces the Output (Yt); write heads and read heads connect the controller to the external memory.]
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
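The content-based addressing described in Box 10 can be sketched as a graded, similarity-weighted read. The sketch assumes cosine similarity between the controller's key and each memory slot, sharpened by a hypothetical key-strength parameter `beta` and normalized with a softmax; function names are illustrative, and the write heads and location-based addressing are left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def content_address(memory, key, beta=5.0):
    """Attention weights over memory slots: softmax of scaled similarities."""
    scores = [beta * cosine(slot, key) for slot in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Read vector: attention-weighted blend of the memory slots."""
    width = len(memory[0])
    return [sum(w * slot[j] for w, slot in zip(weights, memory))
            for j in range(width)]


memory = [[1.0, 0.0, 0.0],   # three stored items, one per slot
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]
noisy_key = [0.9, 0.1, 0.0]  # degraded cue resembling the first item

weights = content_address(memory, noisy_key)
print([round(w, 3) for w in weights])
print([round(x, 3) for x in read(memory, weights)])
```

Because the weights vary smoothly with similarity, the read operation is differentiable, which is what permits training by gradient-based learning signals; retrieving a stored item from a degraded key also resembles pattern completion in attractor models of CA3.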
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a 'memory network' [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., 'where is the milk?') is used to retrieve the sentence with the most similar feature vector in the database (e.g., 'Joe left the milk'); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story ('Joe traveled to the office'); this process iterates until a response is emitted by the network ('the office'). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
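The iterative retrieval just described can be sketched in a few lines of Python. This is only an illustrative toy, not the memory network of [127] or the REMERGE model: the bag-of-words `embed` function stands in for a learned dense feature vector, and the names `embed`, `cosine`, and `multi_hop_retrieve` are hypothetical. It shows the recirculation step, in which each retrieved sentence is combined with the query before searching earlier parts of the story.

```python
from collections import Counter
import math


def embed(text):
    """Bag-of-words counts; an assumed stand-in for a learned feature vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def multi_hop_retrieve(query, story, hops=2):
    """Retrieve a chain of sentences, recirculating each result as new input."""
    state = embed(query)
    candidates = list(enumerate(story))
    chain = []
    for _ in range(hops):
        if not candidates:
            break
        # Pick the stored sentence most similar to the current state.
        idx, best = max(candidates, key=lambda pair: cosine(state, embed(pair[1])))
        chain.append(best)
        # Combine the query state with the retrieved sentence (recirculation).
        state = state + embed(best)
        # The next hop only considers sentences earlier in the story.
        candidates = [(i, s) for i, s in candidates if i < idx]
    return chain


story = [
    "joe traveled to the office",
    "mary went to the garden",
    "joe left the milk",
]
chain = multi_hop_retrieve("where is the milk", story)
print(chain)
```

In this toy the second hop succeeds only because the retrieved sentence contributes the word 'joe' to the combined representation; a trained system would rely on learned features rather than word overlap, and would add an output stage that emits the answer itself.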
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O'Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the 'index maintenance' problem) and, if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc L B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O'Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O'Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O'Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–2014
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66. Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67. Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104. Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) 'Simple memory: a theory of archicortex'. Philos Trans R Soc B Biol Sci 370, 20140383
105. Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106. Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109. Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110. Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111. Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113. McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116. McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition. Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118. Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119. Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121. Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122. Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123. Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124. Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129. Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165. Hintzman DL (1986) 'Schema abstraction' in a multiple-trace memory model. Psychol Rev 93, 411–428
166. Suthana NA et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167. Wood ER et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower MR et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170. MacDonald CJ et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171. Markus EJ et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172. Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173. Kriegeskorte N et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174. Komorowski RW et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175. Ellenbogen JM et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176. Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177. Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178. Sharon T et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179. Merhav M et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith CN et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181. Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve A et al. (2014) No evidence that 'fast-mapping' benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul T et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel CR (1990) The Organization of Learning, MIT Press
185. Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186. Santoro A et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
inference paradigm [5,90,110]). Such representations then become the contents of episodic memory, subject to storage in the hippocampus. The distinction between encoding- and retrieval-based models can be related more broadly to the finding of 'concept' cells: hippocampal neurons which come to respond to common features across many events, for example, cells for specific odors [111], time-points within an episode [112], attributes of a task [113], and even cells that fire to any picture or the name of a famous person [114]. In Box 7 we review empirical findings concerning concept cells and pattern overlap sometimes observed in parts of hippocampus, and consider how well these findings fit within the perspective that the hippocampus supports pattern separation.
Rapid Schema-Dependent Consolidation
It is useful to distinguish systems-level consolidation from what we refer to as within-system consolidation. The former refers to the gradual integration of knowledge into neocortical circuits, while the latter denotes stabilization of recently formed memories within the hippocampus, perhaps through stabilization of synapses among hippocampal neurons [89]. In the initial formulation of CLS, systems-level consolidation was viewed as temporally extended (e.g., spanning years or even decades in humans [34,51–53]). Although it was noted in [1] that the timeframe could be highly variable (depending perhaps on the rate of replay of memory
Box 7. Concept Cells and Nodal Codings
Reports of concept cells in the hippocampus have been taken as contradicting a tenet of CLS theory, but the existence of such neurons is not necessarily inconsistent with it, given that the theory expects different hippocampal regions to vary in terms of context specificity and also permits variation within hippocampal regions (Box 3). Evidence supporting the CLS prediction of context-specificity in the CA3 and DG comes from a recent intracranial recording study in humans [166]. In this study, neurons in CA3/DG, and also in the subiculum, tended to discriminate between different images of a famous person, with responses correlating with successful performance in a recognition memory task that required discriminating previously experienced targets from similar lures. Neurons in other MTL areas (i.e., entorhinal and parahippocampal cortices) exhibited more invariant 'concept cell like' responses that were not linked to memory performance (the CA1 subregion was sparsely sampled in this study).
It is also interesting to consider the finding of 'splitter' cells in a task where animals must alternate between turning left and right on successive trials in a T maze [167–179]: here, some CA1 and CA3 place cells for locations on the central stem of the T maze are modulated by the trajectory of the rat (e.g., whether it will subsequently turn left or right), whereas others are trajectory-independent. This phenomenon, known as partial remapping [48,170–172], is consistent with the idea that pattern separation is a matter of degree in our theory [27,37]. As such, we should expect partly overlapping representations (i.e., rather than fully independent 'charts' [121]) when environmental changes are sufficiently small (Box 3). We also expect the greatest differentiation in DG and at an early point in learning. To our knowledge, no studies have yet recorded from DG in this paradigm.
In a recent study, representational similarity analysis techniques [173] were applied to ensemble recording data collected while rats performed a context-guided reward discrimination task [113]. As expected, the population codes in CA3 and CA1 were dominated by context and place coding, although other task dimensions (reward value and item) were also represented [113] (also see [174]). Although there was some representational overlap across locations based on value and item, CA3/CA1 codes were consistent with incomplete but still strong pattern separation, especially in the dorsal hippocampus. Overall, these findings appear consistent with the CLS, with the provision that pattern separation is a matter of degree and may vary by task and region. Why CA3 shows greater specificity than CA1 in some studies but not others requires further exploration.
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8. Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]).
However, the materials used are not obviously highly consistent with prior knowledge in most cases and therefore, under the CLS framework, we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the 'fast mapping' paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects (one familiar and one unfamiliar) and are asked a question such as 'is the numbat's tail pointing up?', inferring that the unfamiliar name 'numbat' must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I panels: (A) Original paired associates; (B) Introduction of new paired associates; arena locations are numbered 1–8]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid schema-dependent one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
526 Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function, and the ubiquity of continual learning, is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space.
Box 9. Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated, time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best 30% of when experience replay was present [119]. Note that the uniform sampling mechanism, as implemented, treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events (specifically, experiences that are associated with high reward prediction errors) yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
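The replay mechanism described in Box 9 can be illustrated with a minimal sketch. This is not the implementation used in [119] or [183]: the class name `ReplayBuffer`, the `Transition` fields, the fixed capacity, and the simple error-proportional weighting are all assumptions made for illustration. It shows only the two sampling ideas mentioned in the text: uniform sampling from recent experience, and sampling biased towards transitions with large prediction errors.

```python
import random
from collections import deque, namedtuple

# One unit of experience: state, action, next state, and resulting reward.
Transition = namedtuple("Transition", "state action next_state reward")


class ReplayBuffer:
    """Fixed-capacity store of recent transitions (hypothetical helper)."""

    def __init__(self, capacity, seed=0):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, transition):
        self.memory.append(transition)

    def sample_uniform(self, batch_size):
        # Uniform sampling without replacement: every stored transition is
        # equally likely, so a batch is rarely a run of consecutive steps.
        return self.rng.sample(list(self.memory), batch_size)

    def sample_prioritized(self, batch_size, errors):
        # Sampling with replacement, weighted by prediction-error magnitude.
        # `errors` holds one value per stored transition (assumed to be given).
        weights = [abs(e) + 1e-6 for e in errors]
        return self.rng.choices(list(self.memory), weights=weights, k=batch_size)


# Usage: store 50 transitions as they occur, then learn from a shuffled batch.
buffer = ReplayBuffer(capacity=1000)
for t in range(50):
    buffer.add(Transition(state=t, action=t % 4, next_state=t + 1, reward=0.0))
batch = buffer.sample_uniform(8)
```

In a full agent, each sampled batch would drive one weight update, interleaved with continued game-play; that training step is omitted here.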
Box 10. Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory, and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network, but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
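Content-based addressing as described above can be sketched in a few lines. The text does not specify the similarity measure or how similarities become attention weights, so this sketch assumes cosine similarity followed by a softmax with a `sharpness` parameter; the function names and the toy memory matrix are hypothetical.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small constant avoids division by zero


def content_weights(memory, key, sharpness=5.0):
    """Graded attention over memory slots, based on similarity to the key."""
    scores = [sharpness * cosine_similarity(row, key) for row in memory]
    peak = max(scores)  # subtracting the peak keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Soft read: a weighted sum of all memory slots."""
    width = len(memory[0])
    return [sum(w * row[i] for w, row in zip(weights, memory)) for i in range(width)]


# Toy memory with three slots; the key most resembles the first slot.
memory = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
key = [0.9, 0.1, 0.0]
weights = content_weights(memory, key)
retrieved = read(memory, weights)
```

Because every slot receives a non-zero weight that varies smoothly with the key, the read operation is differentiable, which is the property that lets such an architecture be trained with continuous learning signals.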
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
Figure I. NTM and the Paired Associative Recall Task. Schematic components: Input (Xt), Controller, Write heads, External memory, Read heads, Output (Yt). The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk?’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
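The iterative retrieval loop described above can be sketched as follows. This is a deliberately simplified illustration, not the memory network of [127]: plain word overlap stands in for the learned dense feature vectors, the function names are hypothetical, and the extra story sentence about Fred is invented as a distractor. The sketch shows only the recirculation step, in which each retrieved sentence is merged into the cue for the next search over earlier sentences.

```python
def words(sentence):
    """Bag-of-words stand-in for a learned feature vector (an assumption)."""
    return set(sentence.lower().split())


def retrieve_chain(query, story, hops=2):
    """Return the sentences retrieved on successive hops."""
    cue = words(query)
    limit = len(story)  # only sentences before this index are candidates
    chain = []
    for _ in range(hops):
        if limit == 0:
            break  # nothing earlier in the story is left to search
        # Pick the candidate sharing the most words with the current cue.
        best = max(range(limit), key=lambda i: len(cue & words(story[i])))
        chain.append(story[best])
        # Recirculate: the retrieved sentence becomes part of the next cue,
        # and the next search is restricted to earlier sentences.
        cue = cue | words(story[best])
        limit = best
    return chain


story = [
    "fred went to the kitchen",
    "joe traveled to the office",
    "joe left the milk",
]
chain = retrieve_chain("where is the milk", story)
```

On this toy story the first hop retrieves the sentence mentioning the milk, and the second hop, now cued by ‘joe’ as well, retrieves the earlier sentence about where Joe traveled. Producing the final answer from the chain, which the real system learns to do, is left out.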
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland, J.L. et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102, 419–457
2. O’Neill, J. et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci. 33, 220–229
3. Wikenheiser, A.M. and Redish, A.D. (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr. Opin. Neurobiol. 32, 8–15
4. Zeithamova, D. et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front. Hum. Neurosci. 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the ‘index maintenance’ problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran, D. and McClelland, J.L. (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol. Rev. 119, 573–616
6. Eichenbaum, H. (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse, D. et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse, D. et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr, D. (1971) Simple memory: a theory for archicortex. Philos. Trans. R. Soc. Lond. B Biol. Sci. 262, 23–81
10. Rumelhart, D.E. et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski, T.J. and Rosenberg, C.R. (1987) Parallel networks that learn to pronounce English text. Complex Syst. 1, 145–168
12. Guyonneau, R. et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J. Physiol. Paris 98, 487–497
13. Plaut, D.C. et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol. Rev. 103, 56–115
15. Rumelhart, D.E. (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer, S.F. et al., eds), pp. 405–420, Academic Press
16. LeCun, Y. et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins, D.L. et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc. Natl. Acad. Sci. U.S.A. 111, 8619–8624
18. Yamins, D.L. and DiCarlo, J.J. (2016) Using goal-driven deep learning models to understand sensory cortex. Nat. Neurosci. 19, 356–365
19. Saxe, A.M. et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe, A.M. et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey, M. and Cohen, N.J. (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower, G.H., ed.), pp. 109–165, Academic Press
22. Ratcliff, R. (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol. Rev. 97, 285–308
23. French, R.M. (1999) Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3, 128–135
24. Carpenter, G.A. and Grossberg, S. (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput. Vision Graph. Image Process. 37, 54–115
25. McNaughton, B.L. and Morris, R.G. (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci. 10, 408–415
26. Treves, A. and Rolls, E.T. (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O’Reilly, R.C. and McClelland, J.L. (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim, J.J. et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen, N.J. and Eichenbaum, H.B. (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O’Reilly, R.C. and Rudy, J.W. (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol. Rev. 108, 311–345
31. Norman, K.A. and O’Reilly, R.C. (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes, A. et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn. Sci. 11, 126–135
33. Davachi, L. (2006) Item, context and relational episodic encoding in humans. Curr. Opin. Neurobiol. 16, 693–700
34. Squire, L.R. et al. (2004) The medial temporal lobe. Annu. Rev. Neurosci. 27, 279–306
35. Schiller, D. et al. (2015) Memory and space: towards an understanding of the cognitive map. J. Neurosci. 35, 13904–13911
36. O’Reilly, R.C. et al. (2014) Complementary learning systems. Cogn. Sci. 38, 1229–1248
37. Knierim, J.J. and Neunuebel, J.P. (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol. Learn. Mem. 129, 38–49
38. Johnston, S.T. et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol. Learn. Mem. 129, 60–68
39. Bengio, Y. et al. (2013) Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828
40. Khaligh-Razavi, S.M. and Kriegeskorte, N. (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput. Biol. 10, e1003915
41. Kriegeskorte, N. et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke, A. and Tyler, L.K. (2014) Object-specific semantic coding in human perirhinal cortex. J. Neurosci. 34, 4766–4775
43. Kiani, R. et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J. Neurophysiol. 97, 4296–4309
44. McNaughton, B.L. (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell. 174, 205–2014
45. Leibold, C. and Kempter, R. (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb. Cortex 18, 67–77
46. Rolls, E.T. et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp. Brain Res. 114, 149–162
47. Barnes, C.A. et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog. Brain Res. 83, 287–300
48. McKenzie, S. et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol. Learn. Mem. Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting, J. (1978) A cognitive approach to Korsakoff’s syndrome. Cortex 14, 485–495
50. McClelland, J.L. (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian, P. et al., eds), pp. 99–129, MIT Press
51. Frankland, P.W. and Bontempi, B. (2005) The organization of recent and remote memories. Nat. Rev. Neurosci. 6, 119–130
52. Winocur, G. et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire, L.R. et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner, H. and Parker, E.S., eds), pp. 185–210, Psychology Press
54. Robins, A. (1996) Consolidation in neural networks and in the sleeping brain. Conn. Sci. 8, 259–276
55. Tononi, G. and Cirelli, C. (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji, D. and Wilson, M.A. (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat. Neurosci. 10, 100–107
66. Lansink, C.S. et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol. 7, e1000173
67. Ego-Stengel, V. and Wilson, M.A. (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara, C.G. et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat. Neurosci. 17, 1658–1660
87. Sara, S.J. (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat. Rev. Neurosci. 10, 211–223
88. McGaugh, J.L. (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu. Rev. Neurosci. 27, 1–28
89. Redondo, R.L. and Morris, R.G. (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat. Rev. Neurosci. 12, 17–30
90. Kumaran, D. (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front. Hum. Neurosci. 6, 157
91. Bunsey, M. and Eichenbaum, H. (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova, D. and Preston, A.R. (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J. Neurosci. 30, 14676–14684
93. Preston, A.R. et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek, J.A. and Eichenbaum, H. (1997) The hippocampus and memory for orderly stimulus relations. Proc. Natl. Acad. Sci. U.S.A. 94, 7109–7114
95. Shohamy, D. and Wagner, A.D. (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova, D. et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic, B. et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr. Biol. 25, 821–830
98. Schlichting, M.L. et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat. Commun. 6, 8151
99. Eichenbaum, H. et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard, M.W. et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol. Rev. 112, 75–116
101. Kloosterman, F. et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum, H. and Cohen, N.J. (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess, N. (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen, P. et al., eds), pp. 715–750, Oxford University Press
104. Willshaw, D.J. et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) ‘Simple memory: a theory of archicortex’. Philos. Trans. R. Soc. B Biol. Sci. 370, 20140383
105. Schapiro, A.C. et al. (2014) The necessity of the medial temporal lobe for statistical learning. J. Cogn. Neurosci. 26, 1736–1747
106. Knowlton, B.J. and Squire, L.R. (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy, D. and Turk-Browne, N.B. (2013) Mechanisms for widespread hippocampal involvement in cognition. J. Exp. Psychol. Gen. 142, 1159–1170
109. Tamminen, J. et al. (2015) From specific examples to general knowledge in language learning. Cogn. Psychol. 79, 1–39
110. Walker, M.P. and Stickgold, R. (2010) Overnight alchemy: sleep-dependent memory evolution. Nat. Rev. Neurosci. 11, 218
111. Wood, E.R. et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum, H. (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat. Rev. Neurosci. 15, 732–744
113. McKenzie, S. et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga, R.Q. et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland, J.L. (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J. Exp. Psychol. Gen. 142, 1190–1210
116. McClelland, J.L. and Goddard, N.H. (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton, G.E. et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart, D.E. et al., eds), pp. 77–109, MIT Press
118. Krizhevsky, A. et al. (2012) Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1106–1114
119. Mnih, V. et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme, C.B. et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc. Natl. Acad. Sci. U.S.A. 111, 18428–18435
121. Samsonovich, A. and McNaughton, B.L. (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J. Neurosci. 17, 5900–5920
122. Buzsaki, G. and Moser, E.I. (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat. Neurosci. 16, 130–138
123. Renno-Costa, C. et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput. Biol. 10, e1003641
124. Wills, T.J. et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville, W.B. and Milner, B. (1957) Loss of recent memory after bilateral hippocampal lesions. J. Neurol. Neurosurg. Psychiatry 20, 11–12
129. Nadel, L. and Moscovitch, M. (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr. Opin. Neurobiol. 7, 217–227
130. Moscovitch, M. et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J. Anat. 207, 35–66
131. Yassa, M.A. and Stark, C.E. (2011) Pattern separation in the hippocampus. Trends Neurosci. 34, 515–525
132. Liu, X. et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb, J.K. et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb, S. et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh, T.J. et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel, J.P. and Knierim, J.J. (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa, K. et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek, K. et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards, B.A. et al. (2014) Patterns across multiple memories are identified over time. Nat. Neurosci. 17, 981–986
141. Ketz, N. et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput. Biol. 9, e1003067
142. Kumaran, D. and Maguire, E.A. (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn. Sci. 13, 47–54
143. Moser, E.I. and Moser, M.B. (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri, R. and Fiete, I. (2016) Computational principles of memory. Nat. Neurosci. 19, 394–403
145. Lee, H. et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu, L. et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin, S.H. et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat. Neurosci. 18, 1562–1564
148. Poppenk, J. et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn. Sci. 17, 230–240
149. Strange, B.A. et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat. Rev. Neurosci. 15, 655–669
150. Ranganath, C. and Ritchey, M. (2012) Two cortical systems for memory-guided behaviour. Nat. Rev. Neurosci. 13, 713–726
151. Hasselmo, M.E. and Schnell, E. (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J. Neurosci. 14, 3898–3914
152. Vazdarjanova, A. and Guzowski, J.F. (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J. Neurosci. 24, 6489–6496
161. Grossberg, S. (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn. Sci. 11, 23–63
162. LaRocque, K.F. et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J. Neurosci. 33, 5466–5474
163. McClelland, J.L. and Rumelhart, D.E. (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol. Rev. 88, 375–407
165. Hintzman, D.L. (1986) ‘Schema abstraction’ in a multiple-trace memory model. Psychol. Rev. 93, 411–428
166. Suthana, N.A. et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc. Natl. Acad. Sci. U.S.A. 112, 10503–10508
167. Wood, E.R. et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu, J. and Shapiro, M.L. (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower, M.R. et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J. Neurosci. 25, 1313–1323
170. MacDonald, C.J. et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J. Neurosci. 33, 14607–14616
171. Markus, E.J. et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J. Neurosci. 15, 7079–7094
172. Skaggs, W.E. and McNaughton, B.L. (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J. Neurosci. 18, 8455–8466
173. Kriegeskorte, N. et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2, 4
174. Komorowski, R.W. et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J. Neurosci. 29, 9918–9929
175. Ellenbogen, J.M. et al. (2007) Human relational memory requires time and sleep. Proc. Natl. Acad. Sci. U.S.A. 104, 7723–7728
176. Dumay, N. and Gaskell, M.G. (2007) Sleep-associated changes in the mental representation of spoken words. Psychol. Sci. 18, 35–39
177. Coutanche, M.N. and Thompson-Schill, S.L. (2014) Fast mapping rapidly integrates information into existing memory networks. J. Exp. Psychol. Gen. 143, 2296–2303
178. Sharon, T. et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc. Natl. Acad. Sci. U.S.A. 108, 1146–1151
179. Merhav, M. et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith, C.N. et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc. Natl. Acad. Sci. U.S.A. 111, 475–479
181. Warren, D.E. and Duff, M.C. (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve, A. et al. (2014) No evidence that ‘fast-mapping’ benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul, T. et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel, C.R. (1990) The Organization of Learning, MIT Press
185. Hochreiter, S. and Schmidhuber, J. (1997) Long short-term memory. Neural Comput. 9, 1735–1780
186. Santoro, A. et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves, A. and Rolls, E.T. (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
large amplitude weight changes occurred during the learning of schema-consistent but not schema-inconsistent information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
Box 8. Rapid Integration of New Learning in the Neocortex: When Does It Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]).
However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the ‘fast mapping’ paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects, one familiar and one unfamiliar, and are asked a question such as ‘is the numbat’s tail pointing up?’, inferring that the unfamiliar name ‘numbat’ must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I image: panels (A) and (B), with numbered arena locations and the labels ‘Original paired associates’ and ‘Introduction of new paired associates’]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of the 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid schema-dependent one-shot learning of these new PAs is observed (see Box text). Figure based on experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function, and the ubiquity of continual learning, is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space.
Box 9. Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer ([119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated, time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best 30% of the level achieved when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events, specifically experiences that are associated with high reward prediction errors, yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
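The two sampling schemes contrasted in Box 9 can be illustrated with a minimal sketch. It is not the implementation used in [119] or [183]: the class name `ReplayBuffer`, the priority rule (absolute prediction error plus a small constant), and the toy transitions are all assumptions made for illustration, and no network update is performed.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=0):
        self.transitions = deque(maxlen=capacity)  # oldest entries drop out first
        self.priorities = deque(maxlen=capacity)   # one priority per transition
        self.rng = random.Random(seed)

    def add(self, transition, td_error=1.0, eps=1e-3):
        # Hypothetical priority rule: magnitude of the prediction error plus a
        # small constant, so that no transition has zero chance of replay.
        self.transitions.append(transition)
        self.priorities.append(abs(td_error) + eps)

    def sample_uniform(self, batch_size):
        # Uniform replay: every stored transition is equally likely, and the
        # draw is not tied to the order in which experiences arrived.
        return self.rng.sample(list(self.transitions), batch_size)

    def sample_prioritized(self, batch_size):
        # Prioritized replay: sampling probability proportional to priority
        # (with replacement), so surprising transitions are replayed more often.
        return self.rng.choices(list(self.transitions),
                                weights=list(self.priorities), k=batch_size)


buffer = ReplayBuffer(capacity=5)
for t in range(8):
    # Toy transitions; only step 6 carries a large prediction error.
    buffer.add((t, "noop", 0.0, t + 1), td_error=10.0 if t == 6 else 0.0)

uniform_batch = buffer.sample_uniform(3)
prioritized_batch = buffer.sample_prioritized(200)
```

In this toy run the buffer retains only the five most recent transitions, the uniform batch mixes non-adjacent steps, and the prioritized batch is dominated by the single high-error transition, which is the kind of reweighting towards rare but informative experiences that the Box describes.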
Box 10. Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
[Figure I image: schematic with the labels Input (Xt), Output (Yt), Controller, Write heads, Read heads, and External memory]
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted with permission from [125].
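The content-based addressing described in Box 10 can be sketched as a soft, similarity-weighted read over memory slots. The sketch below is an illustration under stated assumptions rather than the NTM of [125]: the function names, the use of cosine similarity followed by a softmax, the `sharpness` parameter, and the toy memory matrix are all hypothetical choices.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small constant avoids division by zero


def content_read(memory, key, sharpness=5.0):
    """Return (weights, read_vector) for a soft, similarity-based read."""
    scores = [sharpness * cosine_similarity(row, key) for row in memory]
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(memory[0])
    # The read vector is the attention-weighted blend of all memory slots.
    read_vector = [sum(w * row[j] for w, row in zip(weights, memory))
                   for j in range(width)]
    return weights, read_vector


# Toy memory with three slots; the key most resembles the first slot.
memory = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
key = [0.9, 0.1, 0.0]
weights, read_vector = content_read(memory, key)
```

Because every slot receives a graded, nonzero weight, the read is a smooth function of the key, which is the property that lets such an architecture be trained with continuous learning signals.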
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk?’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
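The iterative retrieval described above, in which each retrieved sentence is recirculated as part of the next probe, can be sketched with a toy example. This is not the memory network of [127]: word-overlap counts stand in for learned dense feature vectors, the story includes one invented distractor sentence, and the final response rule (last word of the last retrieved sentence) is a hypothetical placeholder for a learned response stage.

```python
def words(sentence):
    return set(sentence.lower().split())


def overlap(a, b):
    # Toy similarity: number of shared words. A memory network would compare
    # learned dense feature vectors instead.
    return len(a & b)


def multi_hop_retrieve(story, query, hops=2):
    """Retrieve a chain of supporting sentences, feeding each result back in."""
    probe = words(query)
    limit = len(story)          # only sentences before this index are eligible
    chain = []
    for _ in range(hops):
        if limit == 0:
            break
        best = max(range(limit), key=lambda i: overlap(probe, words(story[i])))
        chain.append(story[best])
        probe = probe | words(story[best])   # recirculate output as new input
        limit = best                         # next hop looks earlier in the story
    return chain


story = [
    "Mary went to the garden",
    "Joe traveled to the office",
    "Joe left the milk",
]
chain = multi_hop_retrieve(story, "where is the milk")
answer = chain[-1].split()[-1]   # toy response rule: last word of final sentence
```

The first hop matches the query to ‘Joe left the milk’; adding that sentence to the probe introduces ‘Joe’, which lets the second hop select ‘Joe traveled to the office’ over the distractor, mirroring the chain in the example from the text.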
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the ‘index maintenance’ problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
References
1 McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2 O’Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3 Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4 Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
5 Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6 Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7 Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8 Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9 Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc L B Biol Sci 262, 23–81
10 Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11 Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12 Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13 Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15 Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al. eds), pp. 405–420, Academic Press
16 LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17 Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18 Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19 Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20 Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21 McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH ed), pp. 109–165, Academic Press
22 Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23 French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24 Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25 McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26 Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27 O’Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28 Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29 Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30 O’Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31 Norman KA and O’Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory a
32 Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33 Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34 Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35 Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36 O’Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37 Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38 Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39 Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40 Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41 Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42 Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43 Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44 McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–214
45 Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46 Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47 Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48 McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49 Cutting J (1978) A cognitive approach to Korsakoff’s syndrome. Cortex 14, 485–495
50 McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al. eds), pp. 99–129, MIT Press
51 Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52 Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53 Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES eds), pp. 185–210, Psychology Press
54 Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55 Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65 Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66 Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67 Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86 McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87 Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88 McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89 Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90 Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91 Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92 Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93 Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94 Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95 Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96 Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97 Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98 Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99 Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100 Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101 Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102 Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103 Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al. eds), pp. 715–750, Oxford University Press
104 Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) ‘Simple memory: a theory of archicortex’. Philos Trans R Soc B Biol Sci 370, 20140383
105 Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106 Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107 Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109 Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110 Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111 Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112 Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113 McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114 Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115 McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116 McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117 Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart DE et al. eds), pp. 77–109, MIT Press
118 Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119 Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120 Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121 Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122 Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123 Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124 Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128 Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129 Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130 Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131 Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132 Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133 Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134 Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136 McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137 Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138 Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139 Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140 Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141 Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142 Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143 Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144 Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145 Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146 Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147 Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148 Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149 Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150 Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151 Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152 Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161 Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162 LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163 McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
165. Hintzman, D.L. (1986) 'Schema abstraction' in a multiple-trace memory model. Psychol. Rev. 93, 411–428
166. Suthana, N.A. et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc. Natl. Acad. Sci. U.S.A. 112, 10503–10508
167. Wood, E.R. et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu, J. and Shapiro, M.L. (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower, M.R. et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J. Neurosci. 25, 1313–1323
170. MacDonald, C.J. et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J. Neurosci. 33, 14607–14616
171. Markus, E.J. et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J. Neurosci. 15, 7079–7094
172. Skaggs, W.E. and McNaughton, B.L. (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J. Neurosci. 18, 8455–8466
173. Kriegeskorte, N. et al. (2008) Representational similarity analysis: connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2, 4
174. Komorowski, R.W. et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J. Neurosci. 29, 9918–9929
175. Ellenbogen, J.M. et al. (2007) Human relational memory requires time and sleep. Proc. Natl. Acad. Sci. U.S.A. 104, 7723–7728
176. Dumay, N. and Gaskell, M.G. (2007) Sleep-associated changes in the mental representation of spoken words. Psychol. Sci. 18, 35–39
177. Coutanche, M.N. and Thompson-Schill, S.L. (2014) Fast mapping rapidly integrates information into existing memory networks. J. Exp. Psychol. Gen. 143, 2296–2303
178. Sharon, T. et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc. Natl. Acad. Sci. U.S.A. 108, 1146–1151
179. Merhav, M. et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith, C.N. et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc. Natl. Acad. Sci. U.S.A. 111, 475–479
181. Warren, D.E. and Duff, M.C. (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve, A. et al. (2014) No evidence that 'fast-mapping' benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul, T. et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel, C.R. (1990) The Organization of Learning, MIT Press
185. Hochreiter, S. and Schmidhuber, J. (1997) Long short-term memory. Neural Comput. 9, 1735–1780
186. Santoro, A. et al. (2016) Meta-learning with memory augmented neural networks. In International Conference on Machine Learning
187. Treves, A. and Rolls, E.T. (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
large-amplitude weight changes occurred during the learning of schema-consistent, but not schema-inconsistent, information, emulating the schema-dependent pattern of neocortical plasticity-related gene expression reported in [8]. A theoretical analysis of multilayer neural networks makes clear why the model exhibits these effects [20]: the analysis shows that the rate of learning within a multilayered neural network of the type that CLS attributes to the neocortex [20] will always depend on the state of knowledge
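The dependence of learning rate on prior knowledge can be illustrated with a deliberately small example. The sketch below is not the analysis of [20]; it assumes a two-layer linear network reduced to one scalar weight per layer, and all names (`steps_to_learn`, `w0`, `target`, `lr`, `tol`) are hypothetical. It counts how many gradient-descent steps are needed to acquire a single association from a weak versus a strong starting state.

```python
def steps_to_learn(w0, target=1.0, lr=0.1, tol=1e-3, max_steps=10_000):
    """Count gradient steps until a scalar two-layer linear net y = a*b*x
    (with x = 1) matches `target` to within `tol`.

    Illustrative assumption: both layers start at the same weight w0, which
    stands in for how much relevant structure the network already holds.
    """
    a = b = w0
    for step in range(max_steps):
        err = target - a * b          # prediction error on the single item
        if abs(err) < tol:
            return step
        # Gradient of 0.5 * err**2: each layer's update is scaled by the
        # other layer's current weight, so weak weights mean slow learning.
        a, b = a + lr * b * err, b + lr * a * err
    return max_steps


naive = steps_to_learn(w0=0.01)    # little prior structure
primed = steps_to_learn(w0=0.5)    # substantial prior structure
print(naive, primed)
```

Because each layer's update is multiplied by the other layer's weight, the network that starts near zero spends many steps on a plateau before the error falls, whereas the network with established weights learns the same item in fewer steps.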
Box 8. Rapid Integration of New Learning in the Neocortex: When Does it Occur?
In the event arena paradigm [7,8] (Figure I), hippocampal lesions prevent acquisition of new schema-consistent associations. By contrast, hippocampal lesions performed as little as 48 h after learning leave memory intact. One explanation for the crucial but temporary nature of the hippocampal contribution is replay: even a few minutes with the hippocampus intact could allow multiple replays, each one incrementing the strength of intra-neocortical connections. In an investigation of induction of plasticity-related genes in neocortex [8], the hippocampus was intact for 80 minutes after initial exposure to the new associations. These findings raise the broader question of when rapid integration of new learning into the neocortex occurs, and whether it can occur even without a hippocampus.
A substantial body of work from several laboratories now supports the view that a single period of sleep can produce changes in how experiences from a single learning session impact on subsequent responding. As key examples, some studies have reported increased levels of linking inferences [175], and others have reported increased lexical competition and related phenomena [109,176], attributed to a single sleep session. These findings are often interpreted as evidence of rapid systems-level consolidation (e.g., [176]). However, the materials used are not obviously highly consistent with prior knowledge in most cases, and therefore under the CLS framework we would not expect full integration into neocortical networks in such a short time-period. An alternative interpretation (illustrated in [5]) is that replays during sleep increase the strength, robustness, and rate of activation of new hippocampus-dependent traces, and that such strengthening may be sufficient to account for the observed effects. Thus, the findings are consistent with the view that integration of these new memories into neocortical structures proceeds over a considerably longer time-period.
Work with the 'fast mapping' paradigm in humans with hippocampal lesions [177] provides another potential source of evidence about rapid neocortical learning of arbitrary new information. In this paradigm, human participants see pairs of pictures of objects (one familiar and one unfamiliar) and are asked a question such as 'is the numbat's tail pointing up?', inferring that the unfamiliar name 'numbat' must refer to the unfamiliar object [177]. Some studies find that patients with extensive hippocampus damage show retention of the new object–name association at a delayed test [178,179], suggesting very rapid neocortical learning even without a hippocampus. However, the finding has proven difficult to replicate [180–182]; future studies should continue to investigate this issue.
[Figure I panels: (A) Original paired associates, locations 1–6; (B) Introduction of new paired associates, locations 7 and 8]
Figure I. Schematic Illustration of the Event Arena Paradigm. (A) Overhead view of the 1.6 m × 1.6 m event arena: rats are cued with one of six food flavors (e.g., banana), each associated with a location in the arena (e.g., location 3), and are required to go from any of the four start-boxes to a specific location to retrieve food. (B) Following gradual learning of the original set, two new flavor–place pairs are introduced (e.g., cinnamon–location 7, nutmeg–location 8). Rapid, schema-dependent, one-shot learning of these new paired associates (PAs) is observed (see Box text). Figure based on the experimental design described in [7].
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function and the ubiquity of continual learning are well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space
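Why orthogonal codes protect sequentially learned tasks from interference can be shown with a toy associative memory. The sketch below is an illustration under stated assumptions, not a model from the CLS literature: it uses a simple Hebbian linear associator, and the function names (`store`, `recall`) and the example vectors are hypothetical. Two context-to-target associations are stored one after the other, once with non-overlapping context codes and once with overlapping ones.

```python
def store(pairs):
    """Build a Hebbian weight matrix W = sum of outer(target, key)."""
    n_out, n_in = len(pairs[0][1]), len(pairs[0][0])
    W = [[0.0] * n_in for _ in range(n_out)]
    for key, target in pairs:
        for i in range(n_out):
            for j in range(n_in):
                W[i][j] += target[i] * key[j]
    return W


def recall(W, key):
    """Read out the stored target for `key` (matrix-vector product)."""
    return [sum(w * k for w, k in zip(row, key)) for row in W]


target_a, target_b = [1.0, 0.0], [0.0, 1.0]

# Pattern-separated (orthogonal) keys: no units shared between contexts.
sep_a, sep_b = [1, 0, 0, 0], [0, 0, 1, 0]
# Overlapping keys: the two contexts share their first unit.
ovl_a, ovl_b = [1, 1, 0, 0], [1, 0, 1, 0]

clean = recall(store([(sep_a, target_a), (sep_b, target_b)]), sep_a)
mixed = recall(store([(ovl_a, target_a), (ovl_b, target_b)]), ovl_a)
print(clean)  # recall of context A with separated codes
print(mixed)  # recall of context A with overlapping codes
```

With separated keys, cueing context A returns its own target with no contribution from context B. With overlapping keys, the shared unit lets the second association leak into the readout, which is the crosstalk that pattern separation is thought to prevent.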
Box 9. Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer (see [119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring the benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best 30% of the level reached when experience replay was present [119]. Note that the uniform sampling mechanism, as implemented, treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events (specifically, experiences that are associated with high reward prediction errors) yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
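The two sampling regimes contrasted in this box can be sketched in a few lines. The code below is a minimal illustration, not the implementation of [119] or [183]: the class and method names are hypothetical, the transitions are placeholder tuples, and the prioritized variant uses plain proportional sampling on the magnitude of a supplied prediction error, omitting the further refinements of the published method.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=0):
        self.transitions = deque(maxlen=capacity)  # oldest items drop out
        self.priorities = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, transition, td_error=1.0):
        self.transitions.append(transition)
        # Small constant keeps zero-error transitions sampleable.
        self.priorities.append(abs(td_error) + 1e-6)

    def sample_uniform(self, batch_size):
        """Every stored transition is equally likely (without replacement)."""
        return self.rng.sample(list(self.transitions), batch_size)

    def sample_prioritized(self, batch_size):
        """Probability proportional to |prediction error| (with replacement)."""
        return self.rng.choices(list(self.transitions),
                                weights=list(self.priorities), k=batch_size)


buffer = ReplayBuffer(capacity=1000)
for t in range(200):
    # Hypothetical stream: one rare, surprising transition at t == 50.
    buffer.add(("s%d" % t, "a", 0.0, "s%d" % (t + 1)),
               td_error=10.0 if t == 50 else 0.01)

uniform_batch = buffer.sample_uniform(32)
prioritized_batch = buffer.sample_prioritized(32)
rare_hits = sum(tr[0] == "s50" for tr in prioritized_batch)
print(rare_hits, "of 32 prioritized samples are the rare transition")
```

Under uniform sampling the rare transition is drawn about as often as any other, whereas error-weighted sampling revisits it repeatedly, which is the sense in which replay can reweight experience away from the raw statistics of the environment.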
Box 10. Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., the 'key') emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network, but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section 'Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment').
[Figure I schematic labels: Input (Xt), Output (Yt), Controller, Write heads, External memory, Read heads]
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted: one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
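The content-based read described in this box can be made concrete with a short sketch. It is an illustration only, assuming cosine similarity followed by a softmax over memory rows; the names `cosine`, `content_read`, and `sharpness`, and the example memory contents, are hypothetical and not taken from [125].

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def content_read(memory, key, sharpness=5.0):
    """Soft read: weight every memory row by its similarity to `key`.

    `sharpness` (an assumed parameter) scales similarities before the
    softmax; larger values focus the read on the best-matching row.
    """
    scores = [sharpness * cosine(row, key) for row in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]   # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(memory[0])
    read = [sum(w * row[j] for w, row in zip(weights, memory))
            for j in range(width)]
    return weights, read


memory = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
noisy_key = [0.9, 0.1, 0.0, 0.8]   # degraded version of row 0
weights, read = content_read(memory, noisy_key)
print([round(w, 3) for w in weights])
```

Because the weights vary smoothly with the key, the read vector is a graded blend dominated by the best-matching row. This is the property that allows gradient-based training, and it also resembles pattern completion from a degraded cue.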
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a 'memory network' [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., 'where is the milk?') is used to retrieve the sentence with the most similar feature vector in the database (e.g., 'Joe left the milk'); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story ('Joe traveled to the office'); this process iterates until a response is emitted by the network ('the office'). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
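The iterative retrieval just described can be sketched with a toy example. This is a hedged illustration rather than the memory network of [127]: learned dense feature vectors are replaced here by simple bag-of-words sets, similarity is word overlap, no response module is included, and the function names and the third story sentence are hypothetical.

```python
def features(text):
    """Bag-of-words stand-in for a learned feature vector."""
    return set(text.lower().replace("?", "").split())


def best_match(query_features, candidates):
    """Index of the candidate sharing the most words with the query."""
    overlaps = [len(query_features & features(c)) for c in candidates]
    return overlaps.index(max(overlaps))


def two_hop_retrieve(story, question):
    """Retrieve a supporting sentence, then recirculate it with the query."""
    q = features(question)
    first = best_match(q, story)                  # hop 1: query alone
    combined = q | features(story[first])         # query + retrieved sentence
    earlier = story[:first]                       # only sentences before it
    second = best_match(combined, earlier) if earlier else None
    return first, second


story = [
    "Joe traveled to the office",
    "Joe left the milk",
    "Joe went to the garden",   # hypothetical later sentence
]
first, second = two_hop_retrieve(story, "Where is the milk?")
print(story[first])
print(story[second])
```

The second hop succeeds only because the first retrieved sentence contributes the word 'Joe' to the combined query. Feeding the output of retrieval back in as part of the next input is the feature shared with the recurrent similarity computation in REMERGE.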
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge, and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and to Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland, J.L. et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102, 419–457
2. O'Neill, J. et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci. 33, 220–229
3. Wikenheiser, A.M. and Redish, A.D. (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr. Opin. Neurobiol. 32, 8–15
4. Zeithamova, D. et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front. Hum. Neurosci. 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the 'index maintenance' problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran, D. and McClelland, J.L. (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol. Rev. 119, 573–616
6. Eichenbaum, H. (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse, D. et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse, D. et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr, D. (1971) Simple memory: a theory for archicortex. Philos. Trans. R. Soc. Lond. B Biol. Sci. 262, 23–81
10. Rumelhart, D.E. et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski, T.J. and Rosenberg, C.R. (1987) Parallel networks that learn to pronounce English text. Complex Syst. 1, 145–168
12. Guyonneau, R. et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J. Physiol. Paris 98, 487–497
13. Plaut, D.C. et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol. Rev. 103, 56–115
15. Rumelhart, D.E. (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer, S.F. et al., eds), pp. 405–420, Academic Press
16. LeCun, Y. et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins, D.L. et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc. Natl. Acad. Sci. U.S.A. 111, 8619–8624
18. Yamins, D.L. and DiCarlo, J.J. (2016) Using goal-driven deep learning models to understand sensory cortex. Nat. Neurosci. 19, 356–365
19. Saxe, A.M. et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe, A.M. et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey, M. and Cohen, N.J. (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower, G.H., ed.), pp. 109–165, Academic Press
22. Ratcliff, R. (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol. Rev. 97, 285–308
23. French, R.M. (1999) Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 3, 128–135
24. Carpenter, G.A. and Grossberg, S. (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput. Vision Graph. Image Process. 37, 54–115
25. McNaughton, B.L. and Morris, R.G. (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci. 10, 408–415
26. Treves, A. and Rolls, E.T. (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly, R.C. and McClelland, J.L. (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim, J.J. et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen, N.J. and Eichenbaum, H.B. (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O'Reilly, R.C. and Rudy, J.W. (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol. Rev. 108, 311–345
31. Norman, K.A. and O'Reilly, R.C. (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes, A. et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn. Sci. 11, 126–135
33. Davachi, L. (2006) Item, context and relational episodic encoding in humans. Curr. Opin. Neurobiol. 16, 693–700
34. Squire, L.R. et al. (2004) The medial temporal lobe. Annu. Rev. Neurosci. 27, 279–306
35. Schiller, D. et al. (2015) Memory and space: towards an understanding of the cognitive map. J. Neurosci. 35, 13904–13911
36. O'Reilly, R.C. et al. (2014) Complementary learning systems. Cogn. Sci. 38, 1229–1248
37. Knierim, J.J. and Neunuebel, J.P. (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol. Learn. Mem. 129, 38–49
38. Johnston, S.T. et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol. Learn. Mem. 129, 60–68
39. Bengio, Y. et al. (2013) Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828
40. Khaligh-Razavi, S.M. and Kriegeskorte, N. (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput. Biol. 10, e1003915
41. Kriegeskorte, N. et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke, A. and Tyler, L.K. (2014) Object-specific semantic coding in human perirhinal cortex. J. Neurosci. 34, 4766–4775
43. Kiani, R. et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J. Neurophysiol. 97, 4296–4309
44. McNaughton, B.L. (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell. 174, 205–2014
45. Leibold, C. and Kempter, R. (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb. Cortex 18, 67–77
46. Rolls, E.T. et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp. Brain Res. 114, 149–162
47. Barnes, C.A. et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog. Brain Res. 83, 287–300
48. McKenzie, S. et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol. Learn. Mem. Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting, J. (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland, J.L. (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian, P. et al., eds), pp. 99–129, MIT Press
51. Frankland, P.W. and Bontempi, B. (2005) The organization of recent and remote memories. Nat. Rev. Neurosci. 6, 119–130
52. Winocur, G. et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire, L.R. et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner, H. and Parker, E.S., eds), pp. 185–210, Psychology Press
54. Robins, A. (1996) Consolidation in neural networks and in the sleeping brain. Conn. Sci. 8, 259–276
55. Tononi, G. and Cirelli, C. (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji, D. and Wilson, M.A. (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat. Neurosci. 10, 100–107
66. Lansink, C.S. et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol. 7, e1000173
67. Ego-Stengel, V. and Wilson, M.A. (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara, C.G. et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat. Neurosci. 17, 1658–1660
87. Sara, S.J. (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat. Rev. Neurosci. 10, 211–223
88. McGaugh, J.L. (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu. Rev. Neurosci. 27, 1–28
89. Redondo, R.L. and Morris, R.G. (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat. Rev. Neurosci. 12, 17–30
90. Kumaran, D. (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front. Hum. Neurosci. 6, 157
91. Bunsey, M. and Eichenbaum, H. (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova, D. and Preston, A.R. (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J. Neurosci. 30, 14676–14684
93. Preston, A.R. et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek, J.A. and Eichenbaum, H. (1997) The hippocampus and memory for orderly stimulus relations. Proc. Natl. Acad. Sci. U.S.A. 94, 7109–7114
95. Shohamy, D. and Wagner, A.D. (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova, D. et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic, B. et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr. Biol. 25, 821–830
98. Schlichting, M.L. et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat. Commun. 6, 8151
99. Eichenbaum, H. et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard, M.W. et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol. Rev. 112, 75–116
101. Kloosterman, F. et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum, H. and Cohen, N.J. (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess, N. (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen, P. et al., eds), pp. 715–750, Oxford University Press
104. Willshaw, D.J. et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) 'Simple memory: a theory of archicortex'. Philos. Trans. R. Soc. B Biol. Sci. 370, 20140383
105. Schapiro, A.C. et al. (2014) The necessity of the medial temporal lobe for statistical learning. J. Cogn. Neurosci. 26, 1736–1747
106. Knowlton, B.J. and Squire, L.R. (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy, D. and Turk-Browne, N.B. (2013) Mechanisms for widespread hippocampal involvement in cognition. J. Exp. Psychol. Gen. 142, 1159–1170
109. Tamminen, J. et al. (2015) From specific examples to general knowledge in language learning. Cogn. Psychol. 79, 1–39
110. Walker, M.P. and Stickgold, R. (2010) Overnight alchemy: sleep-dependent memory evolution. Nat. Rev. Neurosci. 11, 218
111. Wood, E.R. et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum, H. (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat. Rev. Neurosci. 15, 732–744
113. McKenzie, S. et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga, R.Q. et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland, J.L. (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J. Exp. Psychol. Gen. 142, 1190–1210
116. McClelland, J.L. and Goddard, N.H. (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton, G.E. et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition. Vol. 1: Foundations (Rumelhart, D.E. et al., eds), pp. 77–109, MIT Press
118 Krizhevsky A et a l (2012) Imagenet classi1047297cation with deepconvolutional neural networks Adv Neural Inf Process Syst25 1106ndash1114
119 Mnih V et a l (2015) Human-level control through deep rein-forcement learning Nature 518 529ndash533
120 Alme CB et al (2014) Place cells in the hippocampus elevenmaps for eleven rooms Proc Nat l Acad Sci USA 11118428ndash18435
121 Samsonovich A and McNaughton BL (1997) Path integrationand cognitive mapping in a continuous attractor neural network model J Neurosci 17 5900ndash5920
122 Buzsaki G andMoser EI (2013)Memorynavigationand thetarhythmin thehippocampalndashentorhinalsystemNatNeurosci16130ndash138
123 Renno-Costa C etal (2014) A signatureof attractordynamicsinthe CA3 region of the hippocampus PLoS Comput Biol 10e1003641
124 Wills TJ et al (2005) Attractor dynamics in the hippocampalrepresentation of the local environment Science 308 873ndash876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville, W.B. and Milner, B. (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129. Nadel, L. and Moscovitch, M. (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch, M. et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa, M.A. and Stark, C.E. (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu, X. et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb, J.K. et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb, S. et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh, T.J. et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel, J.P. and Knierim, J.J. (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa, K. et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek, K. et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards, B.A. et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz, N. et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran, D. and Maguire, E.A. (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser, E.I. and Moser, M.B. (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri, R. and Fiete, I. (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee, H. et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu, L. et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin, S.H. et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk, J. et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange, B.A. et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath, C. and Ritchey, M. (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo, M.E. and Schnell, E. (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova, A. and Guzowski, J.F. (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg, S. (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque, K.F. et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland, J.L. and Rumelhart, D.E. (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165. Hintzman, D.L. (1986) ‘Schema abstraction’ in a multiple-trace memory model. Psychol Rev 93, 411–428
166. Suthana, N.A. et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167. Wood, E.R. et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu, J. and Shapiro, M.L. (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower, M.R. et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170. MacDonald, C.J. et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171. Markus, E.J. et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172. Skaggs, W.E. and McNaughton, B.L. (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173. Kriegeskorte, N. et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174. Komorowski, R.W. et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175. Ellenbogen, J.M. et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176. Dumay, N. and Gaskell, M.G. (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177. Coutanche, M.N. and Thompson-Schill, S.L. (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178. Sharon, T. et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179. Merhav, M. et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith, C.N. et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181. Warren, D.E. and Duff, M.C. (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve, A. et al. (2014) No evidence that ‘fast-mapping’ benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul, T. et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel, C.R. (1990) The Organization of Learning, MIT Press
185. Hochreiter, S. and Schmidhuber, J. (1997) Long short-term memory. Neural Comput 9, 1735–1780
186. Santoro, A. et al. (2016) Meta-learning with memory augmented neural networks. In International Conference on Machine Learning
187. Treves, A. and Rolls, E.T. (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
allocated neuronal codes that are non-overlapping or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory (reduction of interference between similar but distinct events) may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function, and the ubiquity of continual learning, is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space.
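As a purely illustrative aside, the low interference afforded by sparse, near-orthogonal codes can be seen in a toy calculation. The sketch below assigns each of 11 hypothetical contexts a random sparse binary code and measures the largest overlap between any two codes. The population size, sparsity level, and the use of random assignment are assumptions chosen for illustration; this is not a model of hippocampal pattern separation.

```python
import random
from itertools import combinations


def sparse_code(n_units, n_active, rng):
    """Return the set of active unit indices coding one context."""
    return frozenset(rng.sample(range(n_units), n_active))


def max_pairwise_overlap(codes):
    """Largest number of active units shared by any two codes."""
    return max(len(a & b) for a, b in combinations(codes, 2))


# Assumed toy sizes: 1000 units, 20 active per context, 11 contexts.
rng = random.Random(0)
codes = [sparse_code(1000, 20, rng) for _ in range(11)]
print("max shared units between any two contexts:", max_pairwise_overlap(codes))
```

With sparse codes of this kind, any two contexts are expected to share only a small fraction of their active units, so learning about one context changes few of the connections that support another.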
Box 9 Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer (see [119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.

Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance to, at best, 30% of the level achieved when experience replay was present [119]. Note that the uniform sampling mechanism as implemented treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events (specifically, experiences that are associated with high reward prediction errors) yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
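The two sampling schemes described in this box can be sketched in a few lines. The following is a minimal illustration, not the implementation used in [119] or [183]: the class name `ReplayBuffer`, its method names, and the `alpha` and `eps` parameters are assumptions introduced here, and the prioritized variant simply samples in proportion to a stored priority (standing in for the magnitude of the reward prediction error) without the corrections a full implementation would need.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


class ReplayBuffer:
    """Fixed-capacity store of recent transitions; the oldest are discarded."""

    def __init__(self, capacity, seed=0):
        self.transitions = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.transitions)

    def add(self, state, action, reward, next_state, priority=1.0):
        # priority is an assumed stand-in for |reward prediction error|
        self.transitions.append(Transition(state, action, reward, next_state))
        self.priorities.append(abs(priority))

    def sample_uniform(self, batch_size):
        """Uniform sampling without replacement: all transitions treated equally."""
        return self.rng.sample(list(self.transitions), batch_size)

    def sample_prioritized(self, batch_size, alpha=1.0, eps=1e-6):
        """Sample with replacement, with probability proportional to priority**alpha."""
        weights = [(p + eps) ** alpha for p in self.priorities]
        return self.rng.choices(list(self.transitions), weights=weights, k=batch_size)
```

In use, an agent would call `add` after every step of game-play and draw a batch from one of the two sampling methods for each weight update, so that updates are interleaved across non-adjacent experiences rather than applied to consecutive, correlated samples.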
Box 10 Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network, but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
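Content-based addressing of the kind described above can be illustrated with a small sketch. It assumes, for illustration only, that similarity is measured by cosine similarity and that attention weights are obtained with a softmax sharpened by a hypothetical parameter `beta`; the function names are introduced here and are not taken from [125].

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def content_address(memory, key, beta=5.0):
    """Graded attention weights over memory slots, based on similarity to the key."""
    scores = [beta * cosine_similarity(row, key) for row in memory]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Read vector: attention-weighted sum of the memory slots."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]


memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
key = [0.9, 0.1, 0.0]          # noisy version of the first slot
weights = content_address(memory, key)
retrieved = read(memory, weights)
```

Because the weights vary smoothly with the key, the read operation is differentiable, which is what allows gradient-based training of the controller; a partial or noisy key still retrieves mostly the best-matching slot.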
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
[Figure I schematic; labels: Input (Xt), Output (Yt), Controller, Write heads, External memory, Read heads]
Figure I. NTM and the Paired Associative Recall Task. The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
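The iterative retrieval loop described above can be sketched as follows. This is a deliberately simplified illustration rather than the algorithm of [127]: word-count vectors stand in for the learned dense feature vectors, the function names are introduced here, the number of hops is fixed in advance, and the final step that emits a response (‘the office’) is omitted.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Word-count vector; an assumed stand-in for a learned feature vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def multi_hop_retrieve(story, query, hops=2):
    """Return the chain of supporting sentences found by recirculating retrievals."""
    probe = bag_of_words(query)
    candidates = list(range(len(story)))
    chain = []
    for _ in range(hops):
        if not candidates:
            break
        best = max(candidates, key=lambda i: cosine(probe, bag_of_words(story[i])))
        chain.append(story[best])
        # Recirculate: combine the current probe with the retrieved sentence,
        # then restrict the next hop to sentences earlier in the story.
        probe = probe + bag_of_words(story[best])
        candidates = [i for i in candidates if i < best]
    return chain


story = ["Mary went to the garden", "Joe traveled to the office", "Joe left the milk"]
chain = multi_hop_retrieve(story, "where is the milk")
```

Here the first hop retrieves ‘Joe left the milk’, and the combined probe then retrieves ‘Joe traveled to the office’, mirroring the recirculation of episodic output as new input in REMERGE.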
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland, J.L. et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O’Neill, J. et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser, A.M. and Redish, A.D. (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova, D. et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the ‘index maintenance’ problem), and if so how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5. Kumaran, D. and McClelland, J.L. (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum, H. (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse, D. et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse, D. et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr, D. (1971) Simple memory: a theory for archicortex. Philos Trans R Soc Lond B Biol Sci 262, 23–81
10. Rumelhart, D.E. et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski, T.J. and Rosenberg, C.R. (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau, R. et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut, D.C. et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart, D.E. (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer, S.F. et al., eds), pp. 405–420, Academic Press
16. LeCun, Y. et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins, D.L. et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins, D.L. and DiCarlo, J.J. (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe, A.M. et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe, A.M. et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey, M. and Cohen, N.J. (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower, G.H., ed.), pp. 109–165, Academic Press
22. Ratcliff, R. (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French, R.M. (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter, G.A. and Grossberg, S. (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton, B.L. and Morris, R.G. (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves, A. and Rolls, E.T. (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O’Reilly, R.C. and McClelland, J.L. (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim, J.J. et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen, N.J. and Eichenbaum, H.B. (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O’Reilly, R.C. and Rudy, J.W. (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman, K.A. and O’Reilly, R.C. (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes, A. et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi, L. (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire, L.R. et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller, D. et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O’Reilly, R.C. et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim, J.J. and Neunuebel, J.P. (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston, S.T. et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio, Y. et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi, S.M. and Kriegeskorte, N. (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte, N. et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke, A. and Tyler, L.K. (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani, R. et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton, B.L. (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–214
45. Leibold, C. and Kempter, R. (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls, E.T. et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes, C.A. et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie, S. et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting, J. (1978) A cognitive approach to Korsakoff’s syndrome. Cortex 14, 485–495
50. McClelland, J.L. (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian, P. et al., eds), pp. 99–129, MIT Press
51. Frankland, P.W. and Bontempi, B. (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur, G. et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire, L.R. et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner, H. and Parker, E.S., eds), pp. 185–210, Psychology Press
54. Robins, A. (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi, G. and Cirelli, C. (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji, D. and Wilson, M.A. (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66. Lansink, C.S. et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67. Ego-Stengel, V. and Wilson, M.A. (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara, C.G. et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara, S.J. (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh, J.L. (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo, R.L. and Morris, R.G. (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran, D. (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey, M. and Eichenbaum, H. (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova, D. and Preston, A.R. (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston, A.R. et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek, J.A. and Eichenbaum, H. (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy, D. and Wagner, A.D. (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova, D. et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic, B. et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting, M.L. et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum, H. et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard, M.W. et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman, F. et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102 Eichenbaum H and Cohen NJ (2014) Can we reconcile thedeclarativememoryand spatial navigationviews on hippocampalfunction Neuron 83 764ndash770
103 Burgess N (2006) Computational models of the spatial andmnemonic functions of the hippocampus In The Hippocampus
(Andersen P et al eds) pp 715ndash750 Oxford University Press
104 Willshaw DJ et al (2015) Memory model ling and Marr acommentary on Marr (1971) lsquoSimple memory a theory of archi-cortexrsquo
Philos Trans R Soc B Biol Sci 370 20140383
105 Schapiro AC etal (2014)The necessity of themedial temporallobe for statistical learning J Cogn Neurosci 26 1736ndash1747
106 Knowlton BJ and Squire LR (1993) The learning of catego-ries parallel brain systemsfor item memoryand category knowl-edge Science 262 1747ndash1749
107 Shohamy D and Turk-Browne NB (2013) Mechanisms forwidespread hippocampal involvement in cognition J Exp Psy-chol Gen 142 1159ndash1170
532 Trends in CognitiveSciences July 2016 Vol 20No 7
7252019 What Learning Systems Do Intelligent Agents Need Complementary Learning Systems Theory Updated
109 Tamminen J et al (2015) From specific examples to general knowledge in language learning Cogn Psychol 79 1–39
110 Walker MP and Stickgold R (2010) Overnight alchemy sleep-dependent memory evolution Nat Rev Neurosci 11 218
111 Wood ER et al (1999) The global record of memory in hippocampal neuronal activity Nature 397 613–616
112 Eichenbaum H (2014) Time cells in the hippocampus a new dimension for mapping memories Nat Rev Neurosci 15 732–744
113 McKenzie S et al (2014) Hippocampal representation of related and opposing memories develop within distinct hierarchically organized neural schemas Neuron 83 202–215
114 Quiroga RQ et al (2005) Invariant visual representation by single neurons in the human brain Nature 435 1102–1107
115 McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory J Exp Psychol Gen 142 1190–1210
116 McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex Hippocampus 6 654–665
117 Hinton GE et al (1986) Distributed representations In Explorations in the Microstructure of Cognition Vol 1 Foundations (Rumelhart DE et al eds) pp 77–109 MIT Press
118 Krizhevsky A et al (2012) Imagenet classification with deep convolutional neural networks Adv Neural Inf Process Syst 25 1106–1114
119 Mnih V et al (2015) Human-level control through deep reinforcement learning Nature 518 529–533
120 Alme CB et al (2014) Place cells in the hippocampus eleven maps for eleven rooms Proc Natl Acad Sci USA 111 18428–18435
121 Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model J Neurosci 17 5900–5920
122 Buzsaki G and Moser EI (2013) Memory navigation and theta rhythm in the hippocampal–entorhinal system Nat Neurosci 16 130–138
123 Renno-Costa C et al (2014) A signature of attractor dynamics in the CA3 region of the hippocampus PLoS Comput Biol 10 e1003641
124 Wills TJ et al (2005) Attractor dynamics in the hippocampal representation of the local environment Science 308 873–876
Published online October 15 2014 http://arxiv.org/abs/1410.3916
128 Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions J Neurol Neurosurg Psychiatry 20 11–12
129 Nadel L and Moscovitch M (1997) Memory consolidation retrograde amnesia and the hippocampal complex Curr Opin Neurobiol 7 217–227
130 Moscovitch M et al (2005) Functional neuroanatomy of remote episodic semantic and spatial memory a unified account based on multiple trace theory J Anat 207 35–66
131 Yassa MA and Stark CE (2011) Pattern separation in the hippocampus Trends Neurosci 34 515–525
132 Liu X et al (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall Nature 484 381–385
133 Leutgeb JK et al (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus Science 315 961–966
134 Leutgeb S et al (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1 Science 305 1295–1298
136 McHugh TJ et al (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network Science 317 94–99
137 Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input direct evidence for CA3 pattern completion and dentate gyrus pattern separation Neuron 81 416–427
138 Nakazawa K et al (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall Science 297 211–218
139 Jezek K et al (2011) Theta-paced flickering between place-cell maps in the hippocampus Nature 478 246–249
140 Richards BA et al (2014) Patterns across multiple memories are identified over time Nat Neurosci 17 981–986
141 Ketz N et al (2013) Theta coordinated error-driven learning in the hippocampus PLoS Comput Biol 9 e1003067
142 Kumaran D and Maguire EA (2009) Novelty signals a window into hippocampal information processing Trends Cogn Sci 13 47–54
143 Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks Neuron 38 147–148
144 Chaudhuri R and Fiete I (2016) Computational principles of memory Nat Neurosci 19 394–403
145 Lee H et al (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis pattern completion versus pattern separation Neuron 87 1093–1105
146 Lu L et al (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus Neuron 87 1078–1092
147 Collin SH et al (2015) Memory hierarchies map onto the hippocampal long axis in humans Nat Neurosci 18 1562–1564
148 Poppenk J et al (2013) Long-axis specialization of the human hippocampus Trends Cogn Sci 17 230–240
149 Strange BA et al (2014) Functional organization of the hippocampal longitudinal axis Nat Rev Neurosci 15 655–669
150 Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour Nat Rev Neurosci 13 713–726
151 Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1 computational modeling and brain slice physiology J Neurosci 14 3898–3914
152 Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context evidence for distinct yet complementary functions of CA3 and CA1 ensembles J Neurosci 24 6489–6496
161 Grossberg S (1987) Competitive learning from interactive activation to adaptive resonance Cogn Sci 11 23–63
162 LaRocque KF et al (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory J Neurosci 33 5466–5474
163 McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception Part 1 An account of the basic findings Psychol Rev 88 375–407
165 Hintzman DL (1986) ‘Schema abstraction’ in a multiple-trace memory model Psychol Rev 93 411–428
166 Suthana NA et al (2015) Specific responses of human hippocampal neurons are associated with better memory Proc Natl Acad Sci USA 112 10503–10508
167 Wood ER et al (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location Neuron 27 623–633
168 Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus Neuron 40 1227–1239
169 Bower MR et al (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements J Neurosci 25 1313–1323
170 MacDonald CJ et al (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats J Neurosci 33 14607–14616
171 Markus EJ et al (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons J Neurosci 15 7079–7094
172 Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions J Neurosci 18 8455–8466
173 Kriegeskorte N et al (2008) Representational similarity analysis – connecting the branches of systems neuroscience Front Syst Neurosci 2 4
174 Komorowski RW et al (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where J Neurosci 29 9918–9929
175 Ellenbogen JM et al (2007) Human relational memory requires time and sleep Proc Natl Acad Sci USA 104 7723–7728
176 Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words Psychol Sci 18 35–39
177 Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks J Exp Psychol Gen 143 2296–2303
178 Sharon T et al (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus Proc Natl Acad Sci USA 108 1146–1151
179 Merhav M et al (2014) Neocortical catastrophic interference in healthy and amnesic adults a paradoxical matter of time Hippocampus 24 1653–1662
180 Smith CN et al (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients Proc Natl Acad Sci USA 111 475–479
181 Warren DE and Duff MC (2014) Not so fast hippocampal amnesia slows word learning despite successful fast mapping Hippocampus 24 920–933
182 Greve A et al (2014) No evidence that ‘fast-mapping’ benefits novel learning in healthy older adults Neuropsychologia 60 52–59
183 Schaul T et al (2016) Prioritized experience replay In International Conference on Learning Representations
184 Gallistel CR (1990) The Organization of Learning MIT Press
185 Hochreiter S and Schmidhuber J (1997) Long short-term memory Neural Comput 9 1735–1780
186 Santoro A et al (2016) Meta-learning with memory augmented neural networks In International Conference in Machine Learning
187 Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory Hippocampus 4 374–391
allocated neuronal codes that are non-overlapping, or orthogonal (e.g., [26]). Notably, the advantages of this coding scheme for episodic memory – reduction of interference between similar but distinct events – may also have significant benefits for continual learning. Specifically, this mechanism allows the rapid creation of distinct, non-interfering representations for multiple tasks to which an agent has been exposed in sequential fashion. The utility of this function, and the ubiquity of continual learning, is well established in the domain of spatial navigation, where the notion of a task can be related to that of an environmental context: rodents are able to learn and sustain robust representations of many different environments (e.g., >10 environments in [120]), with each environment being represented by a pattern-separated representational space.
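As a rough illustration of why non-overlapping codes reduce interference, the sketch below stores two associations in a simple outer-product (Hebbian) memory and compares recall when the two keys share an active unit with recall when they are orthogonal. The outer-product memory, the function names `store` and `recall`, and the toy vectors are all assumptions chosen for illustration; they are not the hippocampal model described in the text.

```python
def store(pairs, dim_out, dim_in):
    """Sum of outer products value * key^T (a simple Hebbian associative memory)."""
    weights = [[0.0] * dim_in for _ in range(dim_out)]
    for key, value in pairs:
        for i in range(dim_out):
            for j in range(dim_in):
                weights[i][j] += value[i] * key[j]
    return weights


def recall(weights, key):
    """Read out a stored value by multiplying the weight matrix by the key."""
    return [sum(w * k for w, k in zip(row, key)) for row in weights]


value_a, value_b = [1.0, 0.0], [0.0, 1.0]

# Overlapping codes: the two keys share one active unit.
overlapping = store([([1, 1, 0, 0], value_a), ([0, 1, 1, 0], value_b)], 2, 4)
# Orthogonal (pattern-separated) codes: no shared active units.
separated = store([([1, 1, 0, 0], value_a), ([0, 0, 1, 1], value_b)], 2, 4)

print(recall(overlapping, [1, 1, 0, 0]))  # [2.0, 1.0]: value_b leaks into the recall
print(recall(separated, [1, 1, 0, 0]))    # [2.0, 0.0]: no interference
```

In this toy memory the crosstalk term is proportional to the dot product of the two keys, so it vanishes exactly when the codes are orthogonal.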
Box 9 Experience Replay in Deep Q-Networks
Instead of employing a standard online learning method, in which each unit of play experience (consisting of a state, action, next state, and resulting reward) is used immediately to adjust connection weights and then discarded, an experience replay buffer similar to the hippocampus is used. This allows learning based on randomly chosen subsets of recent experiences stored in the replay buffer (see [119] for details) to be interleaved with ongoing game-play. The approach is in line with findings cited above [66] that hippocampal replay reactivates reward-related neurons in striatum, in accord with the hypothesis that hippocampus-dependent RL facilitates learning during off-line periods.
Experience replay in the DQN architecture was crucial in (i) maximizing data efficiency, allowing each unit of experience to be reused in many updates (e.g., mirroring benefits of repeated time-compressed hippocampal replay), and (ii) smoothing out learning and avoiding unstable response policies that can result from the tendency of the current policy to bias the experienced samples. The approach minimizes learning from consecutive samples, which is undesirable owing to their strongly correlated nature and inconsistent with the implicit assumptions built into neural-network learning algorithms. Instead, experience replay allows updates within the deep Q-network to be performed on non-adjacent samples from a set of recent experiences, in a fashion that breaks up these correlations while still relying on relevant statistics. The dramatic advantage of a network implementing interleaved learning through experience replay was illustrated by the effects of disabling replay on network performance: this caused a severe drop in performance, to at best 30% of the level achieved when experience replay was present [119]. Note that the uniform sampling mechanism, as implemented, treats all transitions in the replay memory as if they were equal. Recent work [183] shows that biasing replay towards significant events – specifically, experiences that are associated with high reward prediction errors – yields further gains. This mechanism, which resonates with the role of the hippocampus in reweighting experiences, as discussed above, allows information to be harvested from rare experiences that may be particularly informative.
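The two sampling schemes described in Box 9 can be sketched as a small replay buffer. This is a minimal illustration, not the implementation used in [119] or [183]: the class and method names are hypothetical, the Q-network and weight updates are left out, and `priority` simply stands in for the magnitude of a reward prediction error. The published prioritized scheme applies further corrections that are omitted here.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=0):
        # With maxlen set, the oldest entries are dropped once capacity is reached.
        self.transitions = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, transition, priority=1.0):
        # `priority` is a stand-in for the size of the reward prediction error.
        self.transitions.append(transition)
        self.priorities.append(priority)

    def sample_uniform(self, batch_size):
        # Uniform sampling without replacement: every stored transition is
        # equally likely, which breaks up correlations between consecutive steps.
        return self.rng.sample(list(self.transitions), batch_size)

    def sample_prioritized(self, batch_size):
        # Sampling with replacement, with probability proportional to priority,
        # so rare but surprising transitions are replayed more often.
        return self.rng.choices(
            list(self.transitions), weights=list(self.priorities), k=batch_size
        )


buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.add((t, 0, 0.0, t + 1))          # only the 3 most recent are retained
minibatch = buffer.sample_uniform(2)         # non-adjacent, randomly chosen samples
```

A learner would call `add` after every step of game-play and draw a minibatch from the buffer for each weight update, so that each experience can contribute to many updates.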
Box 10 Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory, and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered), or the results of intermediate computations, to be written to it and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network, but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187]. There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
[Figure I: schematic of the NTM with the labels Input (Xt), Output (Yt), Controller, Write heads, External memory, and Read heads]
Figure I NTM and the Paired Associative Recall Task
The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
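The content-based addressing described in Box 10 can be sketched in a few lines. This is a minimal illustration under stated assumptions: similarity is taken to be cosine similarity, the attention weights are a softmax over slots, and the names `content_weights`, `read`, and `sharpness` are hypothetical. Write heads, location-based addressing, and the controller network are left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def content_weights(memory, key, sharpness=1.0):
    """Graded attention over memory slots: softmax of scaled similarity to the key."""
    scores = [sharpness * cosine(slot, key) for slot in memory]
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Read vector: attention-weighted sum of the memory slots."""
    width = len(memory[0])
    return [sum(w * slot[j] for w, slot in zip(weights, memory)) for j in range(width)]


memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = content_weights(memory, key=[0.0, 1.0, 0.0], sharpness=10.0)
retrieved = read(memory, weights)            # dominated by the second slot
```

Because the weights vary smoothly with the key, a read of this kind is differentiable, which is the property that lets the architecture be trained with gradient-based learning signals.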
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus – recurrent similarity computation, such that the output of the episodic system is recirculated as a new input – has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk?’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
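The iterative retrieval loop described above can be sketched with a toy example. The sketch makes several simplifying assumptions: bag-of-words counts stand in for the learned dense feature vectors, cosine similarity is used for matching, the number of hops is fixed rather than decided by the network, and no answer is generated. The function names are hypothetical and this is not the memory network of [127].

```python
import math
from collections import Counter


def embed(text):
    # Stand-in for a learned feature vector: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(u, v):
    dot = sum(u[word] * v[word] for word in u)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def multi_hop_retrieve(query, story, hops=2):
    """Retrieve a chain of sentences, feeding each retrieval back into the query."""
    state = embed(query)
    chain = []
    for _ in range(hops):
        candidates = [s for s in story if s not in chain]
        if not candidates:
            break
        best = max(candidates, key=lambda s: cosine(state, embed(s)))
        chain.append(best)
        state = state + embed(best)   # combine the query with the retrieved sentence
    return chain


story = ["Joe traveled to the office", "Joe left the milk", "Mary went to the garden"]
chain = multi_hop_retrieve("Where is the milk", story)
print(chain)  # ['Joe left the milk', 'Joe traveled to the office']
```

The second hop succeeds only because the first retrieval adds ‘Joe’ to the query representation, which is the recirculation of output as new input that the text compares with REMERGE.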
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se – instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1 McClelland JL et al (1995) Why there are complementary learning systems in the hippocampus and neocortex insights from the successes and failures of connectionist models of learning and memory Psychol Rev 102 419–457
2 O’Neill J et al (2010) Play it again reactivation of waking experience and memory Trends Neurosci 33 220–229
3 Wikenheiser AM and Redish AD (2015) Decoding the cognitive map ensemble hippocampal sequences and decision making Curr Opin Neurobiol 32 8–15
4 Zeithamova D et al (2012) The hippocampus and inferential reasoning building memories to navigate future decisions Front Hum Neurosci 6 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the ‘index maintenance’ problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
5 Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories A model of the hippocampal system Psychol Rev 119 573–616
6 Eichenbaum H (2004) Hippocampus cognitive processes and neural representations that underlie declarative memory Neuron 44 109–120
7 Tse D et al (2007) Schemas and memory consolidation Science 316 76–82
8 Tse D et al (2011) Schema-dependent gene activation and memory encoding in neocortex Science 333 891–895
9 Marr D (1971) Simple memory a theory for archicortex Philos Trans R Soc L B Biol Sci 262 23–81
10 Rumelhart DE et al (1986) Learning representations by back-propagating errors Nature 323 533–536
11 Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text Complex Syst 1 145–168
12 Guyonneau R et al (2004) Temporal codes and sparse representations a key to understanding rapid processing in the visual system J Physiol Paris 98 487–497
13 Plaut DC et al (1996) Understanding normal and impaired word reading computational principles in quasi-regular domains Psychol Rev 103 56–115
15 Rumelhart DE (1990) Brain style computation learning and generalization In An Introduction to Electronic and Neural Networks (Zornetzer SF et al eds) pp 405–420 Academic Press
16 LeCun Y et al (2015) Deep learning Nature 521 436–444
17 Yamins DL et al (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex Proc Natl Acad Sci USA 111 8619–8624
18 Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex Nat Neurosci 19 356–365
19 Saxe AM et al (2015) Learning hierarchical categories in deep neural networks In Proceedings of the 35th Annual Conference of the Cognitive Science Society pp 1271–1276 Cognitive Science Society
20 Saxe AM et al (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21 McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks the problem of sequential learning In The Psychology of Learning and Motivation (Vol 20) (Bower GH ed) pp 109–165 Academic Press
22 Ratcliff R (1990) Connectionist models of recognition memory constraints imposed by learning and forgetting functions Psychol Rev 97 285–308
23 French RM (1999) Catastrophic forgetting in connectionist networks Trends Cogn Sci 3 128–135
24 Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture Comput Vision Graph Image Process 37 54–115
25 McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system Trends Neurosci 10 408–415
26 Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network Hippocampus 2 189–199
27 O’Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding storage and recall avoiding a trade-off Hippocampus 4 661–682
28 Knierim JJ et al (2006) Hippocampal place cells parallel input streams subregional processing and implications for episodic memory Hippocampus 16 755–764
29 Cohen NJ and Eichenbaum HB (1994) Memory Amnesia and the Hippocampal System MIT Press
30 O’Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory principles of cortical and hippocampal function Psychol Rev 108 311–345
31 Norman KA and O’Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory a
32 Mayes A et al (2007) Associative memory and the medial temporal lobes Trends Cogn Sci 11 126–135
33 Davachi L (2006) Item context and relational episodic encoding in humans Curr Opin Neurobiol 16 693–700
34 Squire LR et al (2004) The medial temporal lobe Annu Rev Neurosci 27 279–306
35 Schiller D et al (2015) Memory and space towards an understanding of the cognitive map J Neurosci 35 13904–13911
36 O’Reilly RC et al (2014) Complementary learning systems Cogn Sci 38 1229–1248
37 Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation pattern separation pattern completion and attractor dynamics Neurobiol Learn Mem 129 38–49
38 Johnston ST et al (2016) Paradox of pattern separation and adult neurogenesis a dual role for new neurons balancing memory resolution and robustness Neurobiol Learn Mem 129 60–68
39 Bengio Y et al (2013) Representation learning a review and new perspectives IEEE Trans Pattern Anal Mach Intell 35 1798–1828
40 Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised but not unsupervised models may explain IT cortical representation PLoS Comput Biol 10 e1003915
41 Kriegeskorte N et al (2008) Matching categorical object representations in inferior temporal cortex of man and monkey Neuron 60 1126–1141
42 Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex J Neurosci 34 4766–4775
43 Kiani R et al (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex J Neurophysiol 97 4296–4309
44 McNaughton BL (2010) Cortical hierarchies sleep and the extraction of knowledge from memory Artificial Intell 174 205–2014
45 Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity Cereb Cortex 18 67–77
46 Rolls ET et al (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex Exp Brain Res 114 149–162
47 Barnes CA et al (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing Prog Brain Res 83 287–300
48 McKenzie S et al (2015) Representation of memories in the cortical–hippocampal system results from the application of population similarity analyses Neurobiol Learn Mem Published online December 31 2015 http://dx.doi.org/10.1016/j.nlm.2015.12.008
49 Cutting J (1978) A cognitive approach to Korsakoff’s syndrome Cortex 14 485–495
50 McClelland JL (2011) Memory as a constructive process the parallel-distributed processing approach In The Memory Process Neuroscientific and Humanist Perspectives (Nalbantian P et al eds) pp 99–129 MIT Press
51 Frankland PW and Bontempi B (2005) The organization of recent and remote memories Nat Rev Neurosci 6 119–130
52 Winocur G et al (2010) Memory formation and long-term retention in humans and animals convergence towards a transformation account of hippocampal–neocortical interactions Neuropsychologia 48 2339–2356
53 Squire LR et al (1984) The medial temporal region and memory consolidation a new hypothesis In Memory Consolidation Psychobiology of Cognition (Weingartner H and Parker ES eds) pp 185–210 Psychology Press
54 Robins A (1996) Consolidation in neural networks and in the sleeping brain Conn Sci 8 259–276
55 Tononi G and Cirelli C (2014) Sleep and the price of plasticity from synaptic and cellular homeostasis to memory consolidation and integration Neuron 81 12–34
65 Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep Nat Neurosci 10 100–107
66 Lansink CS et al (2009) Hippocampus leads ventral striatum in replay of place–reward information PLoS Biol 7 e1000173
67 Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat Hippocampus 20 1–10
86 McNamara CG et al (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence Nat Neurosci 17 1658–1660
87 Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition Nat Rev Neurosci 10 211–223
88 McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences Annu Rev Neurosci 27 1–28
89 Redondo RL and Morris RG (2011) Making memories last the synaptic tagging and capture hypothesis Nat Rev Neurosci 12 17–30
90 Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference Front Hum Neurosci 6 157
91 Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans Nature 379 255–257
92 Zeithamova D and Preston AR (2010) Flexible memories differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding J Neurosci 30 14676–14684
93 Preston AR et al (2004) Hippocampal contribution to the novel use of relational information in declarative memory Hippocampus 14 148–152
94 Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations Proc Natl Acad Sci USA 94 7109–7114
95 Shohamy D and Wagner AD (2008) Integrating memories in the human brain hippocampal-midbrain encoding of overlapping events Neuron 60 378–389
96 Zeithamova D et al (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference Neuron 75 168–179
97 Milivojevic B et al (2015) Insight reconfigures hippocampal-prefrontal memories Curr Biol 25 821–830
98 Schlichting ML et al (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex Nat Commun 6 8151
99 Eichenbaum H et al (1999) The hippocampus memory and place cells is it spatial memory or a memory space Neuron 23 209–226
100 Howard MW et al (2005) The temporal context model in spatial navigation and relational learning toward a common explanation of medial temporal lobe function across domains Psychol Rev 112 75–116
101 Kloosterman F et al (2004) Two reentrant pathways in the hippocampal–entorhinal system Hippocampus 14 1026–1039
102 Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function Neuron 83 764–770
103 Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus In The Hippocampus (Andersen P et al eds) pp 715–750 Oxford University Press
104 Willshaw DJ et al (2015) Memory model ling and Marr acommentary on Marr (1971) lsquoSimple memory a theory of archi-cortexrsquo
Philos Trans R Soc B Biol Sci 370 20140383
105 Schapiro AC etal (2014)The necessity of themedial temporallobe for statistical learning J Cogn Neurosci 26 1736ndash1747
106 Knowlton BJ and Squire LR (1993) The learning of catego-ries parallel brain systemsfor item memoryand category knowl-edge Science 262 1747ndash1749
107 Shohamy D and Turk-Browne NB (2013) Mechanisms forwidespread hippocampal involvement in cognition J Exp Psy-chol Gen 142 1159ndash1170
532 Trends in CognitiveSciences July 2016 Vol 20No 7
7252019 What Learning Systems Do Intelligent Agents Need Complementary Learning Systems Theory Updated
109 Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110 Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111 Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112 Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113 McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114 Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115 McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116 McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117 Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118 Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119 Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120 Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121 Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122 Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123 Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124 Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128 Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129 Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130 Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131 Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132 Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133 Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134 Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136 McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137 Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138 Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139 Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140 Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141 Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142 Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143 Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144 Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145 Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146 Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147 Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148 Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149 Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150 Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151 Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152 Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161 Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162 LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163 McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165 Hintzman DL (1986) ‘Schema abstraction’ in a multiple-trace memory model. Psychol Rev 93, 411–428
166 Suthana NA et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167 Wood ER et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168 Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169 Bower MR et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170 MacDonald CJ et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171 Markus EJ et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172 Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173 Kriegeskorte N et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174 Komorowski RW et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175 Ellenbogen JM et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176 Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177 Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178 Sharon T et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179 Merhav M et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180 Smith CN et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181 Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182 Greve A et al. (2014) No evidence that ‘fast-mapping’ benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183 Schaul T et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184 Gallistel CR (1990) The Organization of Learning, MIT Press
185 Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186 Santoro A et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187 Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
Box 10 Neural Networks with External Memory and the Hippocampus
The neural Turing machine (NTM) [125] consists of two basic components: an external memory and a neural network controller that is distinguished by its ability to interact with the external memory (Figure I). An external memory allows specific inputs (such as items to be remembered) or the results of intermediate computations to be written to it, and then to be read out in a content- or location-based addressable fashion [184].
The controller interacts with the external memory through write and read heads that focus on particular parts of the memory matrix through attentional addressing mechanisms. Content-based addressing focuses attention on memory slots based on their similarity to the current values (i.e., ‘key’) emitted by the controller. The graded, similarity-based nature of these addressing mechanisms allows the architecture to be trained using the continuous learning signals that drive learning in other deep neural networks [10]. The controller may be a feedforward network, but is more typically a recurrent network exploiting specialized long short-term memory (LSTM) modules [185] that can learn to retain information over very extended numbers of time-steps. In contrast to standard neural networks, the architecture of the NTM allows a separation of computation from memory, as in conventional computers [125]. This allows the NTM to learn to perform algorithms independently of the variables concerned (also see [186]).
While parallels have been drawn between the external memory of the NTM and working memory [125], the characteristics of its external memory can easily be related to long-term memory systems as well. Indeed, content-based addressable external memories of this kind share functionalities with attractor networks [145], an architecture often used to model the computational functions performed by the CA3 subregion of the hippocampus (e.g., storage and retrieval of episodic memories) [187].
There are further points of connection between the operation of the NTM and the hippocampus: information is not stored and retained indiscriminately; instead, it is selected based on an estimate of potential future relevance (see section ‘Proposed Role for the Hippocampus in Circumventing the Statistics of the Environment’).
Figure I. NTM and the Paired Associative Recall Task. (Schematic labels: Input (Xt), Output (Yt), Controller, Write heads, External memory, Read heads.) The input to the controller is a sequence of column vectors. The network receives one column per time-step, and the figure shows the columns presented over 29 consecutive time-steps, indexed by t. The input here consists of a sequence of items, where each item is three binary random vectors presented in adjacent time-steps. Two items are highlighted, one in a green box and one in a red box. A delimiter symbol (in row 4) appears in the time-step preceding each item. After three items have been presented, a different delimiter symbol (row 5) occurs, followed by a query (single item in green box). The network responds correctly with the appropriate target (red box). Schematic representation of external memory matrix shown. Adapted, with permission, from [125].
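The content-based addressing described in Box 10 can be illustrated with a minimal sketch. It assumes that similarity between the controller's key and each memory slot is measured by cosine similarity and converted into attention weights by a softmax. The function names, the sharpening parameter `beta`, and the toy memory matrix are illustrative choices, not taken from the article or from any particular NTM implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small constant avoids division by zero

def content_weights(memory, key, beta=1.0):
    """Graded attention over memory slots: softmax of beta * similarity.

    beta is an assumed sharpening parameter; larger values focus the
    weights more strongly on the best-matching slot.
    """
    scores = [beta * cosine(row, key) for row in memory]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read(memory, weights):
    """Read vector: attention-weighted sum of the memory rows."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory))
            for j in range(width)]

# Toy memory with three slots (rows); purely illustrative values.
memory = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]
key = [0.9, 0.1, 0.0]  # key emitted by a hypothetical controller

weights = content_weights(memory, key, beta=10.0)
read_vector = read(memory, weights)
```

Because the weights vary smoothly with the key, every step here is differentiable, which is the property that lets such an architecture be trained with gradient-based learning signals.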
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a ‘memory network’ [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., ‘where is the milk’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
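The iterative retrieval just described can be sketched in a few lines. This is a toy illustration under loud assumptions: bag-of-words counts stand in for the learned dense feature vectors, word overlap stands in for the learned similarity function, and the final step that emits an answer is omitted. The names `featurize`, `similarity`, and `retrieve_chain` are hypothetical and do not correspond to the implementation in [127].

```python
from collections import Counter

def featurize(text):
    """Assumed stand-in for a learned feature vector: bag-of-words counts."""
    return Counter(text.lower().split())

def similarity(a, b):
    """Dot product of two bag-of-words vectors (missing words count as 0)."""
    return sum(a[word] * b[word] for word in a)

def retrieve_chain(story, query, hops=2):
    """Retrieve sentences one hop at a time.

    After each hop the retrieved sentence is folded into the query state,
    so the output of one retrieval is recirculated as the next input.
    """
    state = featurize(query)
    remaining = list(story)
    chain = []
    for _ in range(hops):
        if not remaining:
            break
        best = max(remaining, key=lambda s: similarity(state, featurize(s)))
        chain.append(best)
        remaining.remove(best)
        state = state + featurize(best)  # combined query + retrieved sentence
    return chain

story = ["Joe traveled to the office", "Joe left the milk"]
chain = retrieve_chain(story, "where is the milk")
# First hop matches on 'milk'; second hop matches on 'Joe' via the combined state.
```

The second hop succeeds only because the state has been enriched with the first retrieved sentence, which mirrors the recirculation of episodic output as new input in REMERGE.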
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures, and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the ‘index maintenance’ problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
References
1 McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2 O’Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3 Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4 Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
5 Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6 Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7 Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8 Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9 Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc L B Biol Sci 262, 23–81
10 Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11 Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12 Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13 Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15 Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16 LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17 Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18 Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19 Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20 Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21 McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22 Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23 French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24 Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25 McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26 Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27 O’Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28 Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29 Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30 O’Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31 Norman KA and O’Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32 Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33 Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34 Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35 Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36 O’Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37 Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38 Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39 Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40 Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41 Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42 Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43 Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44 McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–2014
45 Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46 Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47 Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48 McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem. Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49 Cutting J (1978) A cognitive approach to Korsakoff’s syndrome. Cortex 14, 485–495
50 McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51 Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52 Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53 Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54 Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55 Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65 Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66 Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67 Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86 McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87 Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88 McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89 Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90 Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91 Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92 Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93 Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94 Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95 Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96 Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97 Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98 Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99 Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100 Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101 Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102 Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103 Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104 Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) ‘Simple memory: a theory of archicortex’. Philos Trans R Soc B Biol Sci 370, 20140383
173 Kriegeskorte N et al (2008) Representational similarity analysisndash connectingthe branchesof systemsneuroscienceFront SystNeurosci 2 4
174 Komorowski RW et al (2009) Robust conjunctive item-placecoding by hippocampal neurons parallels learning whathappenswhere J Neurosci 29 9918ndash9929
175 EllenbogenJM etal (2007) Human relationalmemory requirestime and sleep Proc Natl Acad Sci USA 104 7723ndash7728
176 Dumay N andGaskell MG(2007)Sleep-associated changes inthementalrepresentationofspokenwords Psychol
Sci1835ndash39
177 Coutanche MN and Thompson-Schill SL (2014) Fast map-
ping rapidly integrates information into existing memory net-works J Exp Psychol Gen 143 2296ndash2303
178 Sharon T etal (2011) Rapidneocorticalacquisition of long-termarbitrary associations independent of the hippocampus ProcNatl Acad Sci USA 108 1146ndash1151
179 Merhav M et al (2014) Neocortical catastrophic interference inhealthy and amnesic adults a paradoxical matter of time Hip- pocampus 24 1653ndash1662
180 Smith CN et al (2014) Comparison of explicit and incidentallearning strategies in memory-impaired patients Proc Natl
Acad Sci USA 111 475ndash479
181 Warren DE and Duff MC (2014) Not so fast hippocampalamnesia slows word learning despite successful fast mappingHippocampus 24 920ndash933
182 Greve A et al (2014) No evidence that lsquofast-mappingrsquo bene1047297tsnovel learningin healthyolderadultsNeuropsychologia 6052ndash59
183 Schaul T et al (2016) Prioritized experience replay In Interna-
tional Conference on Learning Representations184 Gallistel CR (1990) The Organization of LearningMIT Press
185 Hochreiter S and Schmidhuber J (1997) Long short-termmemory Neural Comput 9 1735ndash1780
186 Santoro A etal (2016) Meta-Learning withmemory augmentedneural networks In International Conference in Machine
Learning
187 Treves A and Rolls ET (1994) Computational analysis of therole of the hippocampus in memory Hippocampus 4 374ndash391
7252019 What Learning Systems Do Intelligent Agents Need Complementary Learning Systems Theory Updated
It is also worth noting that the neuropsychological testing of story recall can be considered to be a version of the Q&A task used in machine learning (e.g., [126]). When the amount of story content to be retained exceeds a few sentences, this task is crucially dependent on the memory storage properties of the hippocampus. Indeed, the specific working of the REMERGE model of the hippocampus (recurrent similarity computation, such that the output of the episodic system is recirculated as a new input) has parallels in a recent machine-learning algorithm developed for the purpose of Q&A, termed a 'memory network' [127]. Specifically, a learned dense feature-vector representation of an input query (e.g., 'where is the milk?') is used to retrieve the sentence with the most similar feature vector in the database (e.g., 'Joe left the milk'); a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story ('Joe traveled to the office'); this process iterates until a response is emitted by the network ('the office'). The joint dependence of this system on input/output feature representations that are developed gradually through training with a large corpus of text, and on individual stored sentences, nicely parallels the complementary roles of neocortical and hippocampal representations in CLS theory and REMERGE.
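The iterative retrieval described in this paragraph can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the memory network of [127] or the REMERGE model: the names embed, cosine, and multi_hop_retrieve are hypothetical, the feature vectors are hand-built bag-of-words counts rather than learned dense representations, sentences already retrieved are excluded on later hops as a simplification, and the module that emits the final response is omitted.

```python
import math
from collections import Counter


def embed(sentence):
    """Toy stand-in for a learned feature vector: lower-cased word counts."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def multi_hop_retrieve(query, memory, hops=2):
    """Retrieve a chain of stored sentences by recirculating the output.

    On each hop the current state (query plus everything retrieved so far)
    is compared with every stored sentence not yet retrieved; the best
    match is appended to the chain and folded back into the state.
    """
    state = embed(query)
    used = set()
    chain = []
    for _ in range(hops):
        candidates = [i for i in range(len(memory)) if i not in used]
        if not candidates:
            break
        best = max(candidates, key=lambda i: cosine(state, embed(memory[i])))
        used.add(best)
        chain.append(memory[best])
        # Combined representation of the query and the retrieved sentence(s).
        state = state + embed(memory[best])
    return chain


# Stored "episodes": individual sentences from a short story.
story = [
    "Mary went to the garden",
    "Joe traveled to the office",
    "Joe left the milk",
]
chain = multi_hop_retrieve("where is the milk", story)
print(chain)
```

In this toy run the first hop matches the query to 'Joe left the milk', and the combined query-plus-sentence state then matches 'Joe traveled to the office' through the shared word 'Joe'. In a trained system the similarity would be computed over learned embeddings, which is the slowly acquired, neocortex-like component in the analogy drawn above.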
Concluding Remarks
We have argued that the core features of the memory architecture proposed by CLS theory continue to provide a useful framework for understanding the organization of learning systems in the brain. We have, however, refined and extended the theory in several ways. First, we now encompass a broader and more-significant role for the hippocampus in generalization than previously thought. Second, we have amended the statement that neocortical learning is constrained to be slow per se; instead, we now clarify that the rate of neocortical learning is dependent on prior knowledge and can be relatively fast under some conditions. Together, these revisions to the theory imply a softening of the originally strict dichotomy between the characteristics of neocortical (slow-learning, parametric, and therefore generalizing) and hippocampal (fast-learning, item-based) systems. In addition, we have extended the proposed functions for the fast-learning hippocampal system, suggesting that this system can circumvent the general statistics of the environment by reweighting experiences that are of significance. Finally, we have highlighted the broad applicability of the principles of CLS theory to developing agents with artificial intelligence, an area which we hope will continue to rise in interest and become a significant direction for future research (see Outstanding Questions).
Acknowledgments
We are very grateful to Adam Cain for help with creating the figures and Greg Wayne and Nikolaus Kriegeskorte for comments on an earlier version of the paper.
References
1. McClelland JL et al. (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102, 419–457
2. O'Neill J et al. (2010) Play it again: reactivation of waking experience and memory. Trends Neurosci 33, 220–229
3. Wikenheiser AM and Redish AD (2015) Decoding the cognitive map: ensemble hippocampal sequences and decision making. Curr Opin Neurobiol 32, 8–15
4. Zeithamova D et al. (2012) The hippocampus and inferential reasoning: building memories to navigate future decisions. Front Hum Neurosci 6, 1–14
Outstanding Questions
Under what conditions does the proposed hippocampal reweighting of experiences result in a biased neocortical model of environmental structure?
Are hippocampal representations updated to incorporate changes in neocortical representations (the 'index maintenance' problem), and if so, how?
What is the fate of hippocampal memory traces after systems-level consolidation is complete?
What are the precise conditions under which rapid systems-level consolidation can occur?
Are hippocampal memory traces susceptible to reconsolidation in a way that mirrors amygdala-dependent memories (e.g., in fear-conditioning paradigms)?
What neocortical mechanisms complement hippocampal replay in facilitating continual learning?
What algorithmic functionalities and implementational schemes are desirable for an external memory module, both for human learners and for artificial agents?
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7
5. Kumaran D and McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119, 573–616
6. Eichenbaum H (2004) Hippocampus: cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120
7. Tse D et al. (2007) Schemas and memory consolidation. Science 316, 76–82
8. Tse D et al. (2011) Schema-dependent gene activation and memory encoding in neocortex. Science 333, 891–895
9. Marr D (1971) Simple memory: a theory for archicortex. Philos Trans R Soc Lond B Biol Sci 262, 23–81
10. Rumelhart DE et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
11. Sejnowski TJ and Rosenberg CR (1987) Parallel networks that learn to pronounce English text. Complex Syst 1, 145–168
12. Guyonneau R et al. (2004) Temporal codes and sparse representations: a key to understanding rapid processing in the visual system. J Physiol Paris 98, 487–497
13. Plaut DC et al. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 103, 56–115
15. Rumelhart DE (1990) Brain style computation: learning and generalization. In An Introduction to Electronic and Neural Networks (Zornetzer SF et al., eds), pp. 405–420, Academic Press
16. LeCun Y et al. (2015) Deep learning. Nature 521, 436–444
17. Yamins DL et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111, 8619–8624
18. Yamins DL and DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19, 356–365
19. Saxe AM et al. (2015) Learning hierarchical categories in deep neural networks. In Proceedings of the 35th Annual Conference of the Cognitive Science Society, pp. 1271–1276, Cognitive Science Society
20. Saxe AM et al. (2014) Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
21. McCloskey M and Cohen NJ (1989) Catastrophic forgetting in connectionist networks: the problem of sequential learning. In The Psychology of Learning and Motivation (Vol. 20) (Bower GH, ed.), pp. 109–165, Academic Press
22. Ratcliff R (1990) Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychol Rev 97, 285–308
23. French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cogn Sci 3, 128–135
24. Carpenter GA and Grossberg S (1987) A massively parallel architecture for a self-organizing neural pattern recognition architecture. Comput Vision Graph Image Process 37, 54–115
25. McNaughton BL and Morris RG (1987) Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci 10, 408–415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O'Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O'Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O'Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–214
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
65. Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66. Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67. Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104. Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) 'Simple memory: a theory of archicortex'. Philos Trans R Soc B Biol Sci 370, 20140383
105. Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106. Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109. Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110. Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111. Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113. McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116. McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118. Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119. Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121. Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122. Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123. Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124. Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129. Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165. Hintzman DL (1986) 'Schema abstraction' in a multiple-trace memory model. Psychol Rev 93, 411–428
166. Suthana NA et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167. Wood ER et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower MR et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170. MacDonald CJ et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171. Markus EJ et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172. Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173. Kriegeskorte N et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174. Komorowski RW et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175. Ellenbogen JM et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176. Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177. Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178. Sharon T et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179. Merhav M et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith CN et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181. Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve A et al. (2014) No evidence that 'fast-mapping' benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul T et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel CR (1990) The Organization of Learning, MIT Press
185. Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186. Santoro A et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
7252019 What Learning Systems Do Intelligent Agents Need Complementary Learning Systems Theory Updated
5 Kumaran D andMcClellandJL (2012) Generalization throughthe recurrent interaction of episodic memories A model of thehippocampal system Psychol Rev 119 573ndash616
6 Eichenbaum H (2004) Hippocampus cognitive processes andneural representations that underlie declarativememoryNeuron
44 109ndash120
7 Tse D et al (2007) Schemas and memory consolidation Sci-ence 316 76ndash82
8 Tse D et a l (2011) Schema-dependent gene activation andmemory encoding in neocortex Science 333 891ndash895
9 Marr D (1971)Simple memory a theory forarchicortexPhilosTrans R Soc L B Biol Sci 262 23ndash81
10 Rumelhart DE et al (1986) Learning representations by back-propagating errors Nature 323 533ndash536
11 Sejnowski TJ and Rosenberg CR (1987) Parallel networksthat learn to pronounceEnglish text Complex Syst1 145ndash168
12 Guyonneau R et al (2004) Temporal codes and sparse repre-sentations a key to understanding rapid processing in thevisualsystem J Physiol Paris 98 487ndash497
13 Plaut DC et a l (1996) Understanding normal and impairedwordreadingcomputational principlesin quasi-regular domainsPsychol Rev 103 56ndash115
15 Rumelhart DE (1990) Brain style computation learning andgeneralization In An Introduction to Electronic and Neural Net-
works (ZornetzerSF etal eds) pp 405ndash420Academic Press
16 LeCun Y et al (2015) Deep learning Nature 521 436ndash444
17 Yamins DL et a l (2014) Performance-optimized hierarchicalmodels predict neural responses in higher visual cortex ProcNatl Acad Sci USA 111 8619ndash8624
18 Yamins DL and DiCarlo JJ (2016) Using goal-driven deeplearning models to understand sensory cortex Nat Neurosci19 356ndash365
19 Saxe AM et al (2015) Learning hierarchical categories in deepneural networks In Proceedings of the 35th Annual Conferenceof the Cognitive Science Society pp 1271ndash1276 CognitiveScience Society
20 SaxeAM etal (2014)Exactsolutions to the nonlineardynamics
of learning in deep linear neural networks21 McCloskeyM andCohen NJ (1989) Catastrophic forgettingin
connectionist networks the problem of sequential learning InThe Psychology of Learning andMotivation (Vol 20) (Bower GH ed) pp 109ndash165 Academic Press
22 Ratcliff R (1990) Connectionist models of recognition memoryconstraints imposed by learning and forgetting functions Psy-chol Rev 97 285ndash308
23 French RM (1999) Catastrophic forgetting in connectionistnetworks Trends Cogn Sci 3 128ndash135
24 Carpenter GA and Grossberg S (1987) A massively parallelarchitecture for a self-organizing neural pattern recognition archi-tecture Comput Vision Graph Image Process 37 54ndash115
25 McNaughton BL andMorris RG (1987) Hippocampal synap-tic enhancement and information storage within a distributedmemory system Trends Neurosci 10 408ndash415
26. Treves A and Rolls ET (1992) Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–199
27. O'Reilly RC and McClelland JL (1994) Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682
28. Knierim JJ et al. (2006) Hippocampal place cells: parallel input streams, subregional processing, and implications for episodic memory. Hippocampus 16, 755–764
29. Cohen NJ and Eichenbaum HB (1994) Memory, Amnesia, and the Hippocampal System, MIT Press
30. O'Reilly RC and Rudy JW (2001) Conjunctive representations in learning and memory: principles of cortical and hippocampal function. Psychol Rev 108, 311–345
31. Norman KA and O'Reilly RC (2003) Modeling hippocampal and neocortical contributions to recognition memory: a
32. Mayes A et al. (2007) Associative memory and the medial temporal lobes. Trends Cogn Sci 11, 126–135
33. Davachi L (2006) Item, context and relational episodic encoding in humans. Curr Opin Neurobiol 16, 693–700
34. Squire LR et al. (2004) The medial temporal lobe. Annu Rev Neurosci 27, 279–306
35. Schiller D et al. (2015) Memory and space: towards an understanding of the cognitive map. J Neurosci 35, 13904–13911
36. O'Reilly RC et al. (2014) Complementary learning systems. Cogn Sci 38, 1229–1248
37. Knierim JJ and Neunuebel JP (2016) Tracking the flow of hippocampal computation: pattern separation, pattern completion, and attractor dynamics. Neurobiol Learn Mem 129, 38–49
38. Johnston ST et al. (2016) Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol Learn Mem 129, 60–68
39. Bengio Y et al. (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35, 1798–1828
40. Khaligh-Razavi SM and Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10, e1003915
41. Kriegeskorte N et al. (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141
42. Clarke A and Tyler LK (2014) Object-specific semantic coding in human perirhinal cortex. J Neurosci 34, 4766–4775
43. Kiani R et al. (2007) Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J Neurophysiol 97, 4296–4309
44. McNaughton BL (2010) Cortical hierarchies, sleep, and the extraction of knowledge from memory. Artificial Intell 174, 205–2014
45. Leibold C and Kempter R (2008) Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 18, 67–77
46. Rolls ET et al. (1997) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp Brain Res 114, 149–162
47. Barnes CA et al. (1990) Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res 83, 287–300
48. McKenzie S et al. (2015) Representation of memories in the cortical–hippocampal system: results from the application of population similarity analyses. Neurobiol Learn Mem Published online December 31, 2015. http://dx.doi.org/10.1016/j.nlm.2015.12.008
49. Cutting J (1978) A cognitive approach to Korsakoff's syndrome. Cortex 14, 485–495
50. McClelland JL (2011) Memory as a constructive process: the parallel-distributed processing approach. In The Memory Process: Neuroscientific and Humanist Perspectives (Nalbantian P et al., eds), pp. 99–129, MIT Press
51. Frankland PW and Bontempi B (2005) The organization of recent and remote memories. Nat Rev Neurosci 6, 119–130
52. Winocur G et al. (2010) Memory formation and long-term retention in humans and animals: convergence towards a transformation account of hippocampal–neocortical interactions. Neuropsychologia 48, 2339–2356
53. Squire LR et al. (1984) The medial temporal region and memory consolidation: a new hypothesis. In Memory Consolidation: Psychobiology of Cognition (Weingartner H and Parker ES, eds), pp. 185–210, Psychology Press
54. Robins A (1996) Consolidation in neural networks and in the sleeping brain. Conn Sci 8, 259–276
55. Tononi G and Cirelli C (2014) Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81, 12–34
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7, p. 531
65. Ji D and Wilson MA (2007) Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107
66. Lansink CS et al. (2009) Hippocampus leads ventral striatum in replay of place–reward information. PLoS Biol 7, e1000173
67. Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associated hippocampal activity during rest impairs spatial learning in the rat. Hippocampus 20, 1–10
86. McNamara CG et al. (2014) Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17, 1658–1660
87. Sara SJ (2009) The locus coeruleus and noradrenergic modulation of cognition. Nat Rev Neurosci 10, 211–223
88. McGaugh JL (2004) The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu Rev Neurosci 27, 1–28
89. Redondo RL and Morris RG (2011) Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30
90. Kumaran D (2012) What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front Hum Neurosci 6, 157
91. Bunsey M and Eichenbaum H (1996) Conservation of hippocampal memory function in rats and humans. Nature 379, 255–257
92. Zeithamova D and Preston AR (2010) Flexible memories: differential roles for medial temporal lobe and prefrontal cortex in cross-episode binding. J Neurosci 30, 14676–14684
93. Preston AR et al. (2004) Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus 14, 148–152
94. Dusek JA and Eichenbaum H (1997) The hippocampus and memory for orderly stimulus relations. Proc Natl Acad Sci USA 94, 7109–7114
95. Shohamy D and Wagner AD (2008) Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60, 378–389
96. Zeithamova D et al. (2012) Hippocampal and ventral medial prefrontal activation during retrieval-mediated learning supports novel inference. Neuron 75, 168–179
97. Milivojevic B et al. (2015) Insight reconfigures hippocampal-prefrontal memories. Curr Biol 25, 821–830
98. Schlichting ML et al. (2015) Learning-related representational changes reveal dissociable integration and separation signatures in the hippocampus and prefrontal cortex. Nat Commun 6, 8151
99. Eichenbaum H et al. (1999) The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226
100. Howard MW et al. (2005) The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains. Psychol Rev 112, 75–116
101. Kloosterman F et al. (2004) Two reentrant pathways in the hippocampal–entorhinal system. Hippocampus 14, 1026–1039
102. Eichenbaum H and Cohen NJ (2014) Can we reconcile the declarative memory and spatial navigation views on hippocampal function? Neuron 83, 764–770
103. Burgess N (2006) Computational models of the spatial and mnemonic functions of the hippocampus. In The Hippocampus (Andersen P et al., eds), pp. 715–750, Oxford University Press
104. Willshaw DJ et al. (2015) Memory modelling and Marr: a commentary on Marr (1971) 'Simple memory: a theory of archicortex'. Philos Trans R Soc B Biol Sci 370, 20140383
105. Schapiro AC et al. (2014) The necessity of the medial temporal lobe for statistical learning. J Cogn Neurosci 26, 1736–1747
106. Knowlton BJ and Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262, 1747–1749
107. Shohamy D and Turk-Browne NB (2013) Mechanisms for widespread hippocampal involvement in cognition. J Exp Psychol Gen 142, 1159–1170
109. Tamminen J et al. (2015) From specific examples to general knowledge in language learning. Cogn Psychol 79, 1–39
110. Walker MP and Stickgold R (2010) Overnight alchemy: sleep-dependent memory evolution. Nat Rev Neurosci 11, 218
111. Wood ER et al. (1999) The global record of memory in hippocampal neuronal activity. Nature 397, 613–616
112. Eichenbaum H (2014) Time cells in the hippocampus: a new dimension for mapping memories. Nat Rev Neurosci 15, 732–744
113. McKenzie S et al. (2014) Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215
114. Quiroga RQ et al. (2005) Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107
115. McClelland JL (2013) Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. J Exp Psychol Gen 142, 1190–1210
116. McClelland JL and Goddard NH (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton GE et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart DE et al., eds), pp. 77–109, MIT Press
118. Krizhevsky A et al. (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25, 1106–1114
119. Mnih V et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme CB et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc Natl Acad Sci USA 111, 18428–18435
121. Samsonovich A and McNaughton BL (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J Neurosci 17, 5900–5920
122. Buzsaki G and Moser EI (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat Neurosci 16, 130–138
123. Renno-Costa C et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput Biol 10, e1003641
124. Wills TJ et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville WB and Milner B (1957) Loss of recent memory after bilateral hippocampal lesions. J Neurol Neurosurg Psychiatry 20, 11–12
129. Nadel L and Moscovitch M (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr Opin Neurobiol 7, 217–227
130. Moscovitch M et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J Anat 207, 35–66
131. Yassa MA and Stark CE (2011) Pattern separation in the hippocampus. Trends Neurosci 34, 515–525
132. Liu X et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb JK et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb S et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh TJ et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel JP and Knierim JJ (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa K et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek K et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards BA et al. (2014) Patterns across multiple memories are identified over time. Nat Neurosci 17, 981–986
141. Ketz N et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput Biol 9, e1003067
142. Kumaran D and Maguire EA (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 13, 47–54
143. Moser EI and Moser MB (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri R and Fiete I (2016) Computational principles of memory. Nat Neurosci 19, 394–403
145. Lee H et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu L et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin SH et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat Neurosci 18, 1562–1564
148. Poppenk J et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn Sci 17, 230–240
149. Strange BA et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669
150. Ranganath C and Ritchey M (2012) Two cortical systems for memory-guided behaviour. Nat Rev Neurosci 13, 713–726
151. Hasselmo ME and Schnell E (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J Neurosci 14, 3898–3914
152. Vazdarjanova A and Guzowski JF (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci 24, 6489–6496
161. Grossberg S (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn Sci 11, 23–63
162. LaRocque KF et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33, 5466–5474
163. McClelland JL and Rumelhart DE (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol Rev 88, 375–407
165. Hintzman DL (1986) 'Schema abstraction' in a multiple-trace memory model. Psychol Rev 93, 411–428
166. Suthana NA et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc Natl Acad Sci USA 112, 10503–10508
167. Wood ER et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu J and Shapiro ML (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower MR et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J Neurosci 25, 1313–1323
170. MacDonald CJ et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J Neurosci 33, 14607–14616
171. Markus EJ et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci 15, 7079–7094
172. Skaggs WE and McNaughton BL (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J Neurosci 18, 8455–8466
173. Kriegeskorte N et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front Syst Neurosci 2, 4
174. Komorowski RW et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J Neurosci 29, 9918–9929
175. Ellenbogen JM et al. (2007) Human relational memory requires time and sleep. Proc Natl Acad Sci USA 104, 7723–7728
176. Dumay N and Gaskell MG (2007) Sleep-associated changes in the mental representation of spoken words. Psychol Sci 18, 35–39
177. Coutanche MN and Thompson-Schill SL (2014) Fast mapping rapidly integrates information into existing memory networks. J Exp Psychol Gen 143, 2296–2303
178. Sharon T et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc Natl Acad Sci USA 108, 1146–1151
179. Merhav M et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith CN et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc Natl Acad Sci USA 111, 475–479
181. Warren DE and Duff MC (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve A et al. (2014) No evidence that 'fast-mapping' benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul T et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel CR (1990) The Organization of Learning, MIT Press
185. Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Comput 9, 1735–1780
186. Santoro A et al. (2016) Meta-learning with memory augmented neural networks. In International Conference in Machine Learning
187. Treves A and Rolls ET (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391
7252019 What Learning Systems Do Intelligent Agents Need Complementary Learning Systems Theory Updated
65 JiD andWilson MA (2007)Coordinatedmemory replayin thevisual cortex and hippocampus during sleepNat Neurosci 10100ndash107
66 Lansink CS etal (2009) Hippocampus leadsventral striatum inreplay of placendashreward information PLoS Biol 7 e1000173
67 Ego-Stengel V and Wilson MA (2010) Disruption of ripple-associatedhippocampal activity during rest impairs spatial learn-ing in the rat Hippocampus 201ndash10
86 McNamara CG et al (2014) Dopaminergic neurons promotehippocampal reactivation and spatial memory persistence NatNeurosci 17 1658ndash1660
87 Sara SJ (2009)The locus coeruleus andnoradrenergic modu-lation of cognition Nat Rev Neurosci 10 211ndash223
88 McGaugh JL (2004) The amybdala modulates the consolida-tionof memoriesof emotionally arousing experiences AnnuRevNeurosci 27 1ndash28
89 Redondo RL and Morris RG (2011) Making memories lastthe synaptic tagging andcapturehypothesisNatRev Neurosci12 17ndash30
90 Kumaran D (2012) What representations and computationsunderpin the contribution of the hippocampus to generalizationand inference Front Hum Neurosci 6 157
91 Bunsey M and Eichenbaum H (1996) Conservation of hippo-campal memory funct ion in rats and humans Nature 379255ndash257
92 Zeithamova D and Preston AR (2010) Flexible memoriesdifferential roles for medial temporal lobe and prefrontal cortexin cross-episode binding J Neurosci 30 14676ndash14684
93 Preston AR etal (2004) Hippocampal contribution to the noveluse of relational information in declarative memory Hippocam- pus 14 148ndash152
94 Dusek JA and Eichenbaum H (1997) The hippocampus andmemory for orderly stimulus relationsProc Natl AcadSci US A 94 7109ndash7114
95 Shohamy D and Wagner AD (2008) Integrating memories inthehuman brain hippocampal-midbrainencodingof overlappingevents Neuron 60 378ndash389
96 Zeithamova D et a l (2012) Hippocampal and ventral medialprefrontal activation during retrieval-mediated learning supportsnovel inference Neuron 75 168ndash179
97 Milivojevic B et al (2015) Insight recon1047297gures hippocampal-prefrontal memories Curr Biol 25 821ndash830
98 Schlichting ML et a l (2015) Learning-related
representationalchanges reveal dissociable integration and separation signaturesin the hippocampusand prefrontal cortexNatCommun6 8151
99 Eichenbaum H et al (1999) The hippocampus memory andplace cells is it spatial memoryor a memoryspaceNeuron 23209ndash226
100 Howard MWetal (2005) Thetemporalcontextmodelin spatialnavigationand relationallearningtoward a common explanationof medial temporal lobe function across domains Psychol Rev112 75ndash116
101 Kloosterman F et a l (2004) Two reentrant pathways in thehippocampalndashentorhinal systemHippocampus 14 1026ndash1039
102 Eichenbaum H and Cohen NJ (2014) Can we reconcile thedeclarativememoryand spatial navigationviews on hippocampalfunction Neuron 83 764ndash770
103 Burgess N (2006) Computational models of the spatial andmnemonic functions of the hippocampus In The Hippocampus
(Andersen P et al eds) pp 715ndash750 Oxford University Press
104 Willshaw DJ et al (2015) Memory model ling and Marr acommentary on Marr (1971) lsquoSimple memory a theory of archi-cortexrsquo
Philos Trans R Soc B Biol Sci 370 20140383
105 Schapiro AC etal (2014)The necessity of themedial temporallobe for statistical learning J Cogn Neurosci 26 1736ndash1747
106 Knowlton BJ and Squire LR (1993) The learning of catego-ries parallel brain systemsfor item memoryand category knowl-edge Science 262 1747ndash1749
107 Shohamy D and Turk-Browne NB (2013) Mechanisms forwidespread hippocampal involvement in cognition J Exp Psy-chol Gen 142 1159ndash1170
532 Trends in CognitiveSciences July 2016 Vol 20No 7
7252019 What Learning Systems Do Intelligent Agents Need Complementary Learning Systems Theory Updated
109 Tamminen J et a l (2015) From speci1047297c examples to generalknowledge in language learning Cogn Psychol 79 1ndash39
110 Walker MPand Stickgold R (2010) Overnight alchemy sleep-
dependent memory evolution Nat Rev Neurosci 11 218111 Wood ER et al (1999) The global record of memory in hippo-
campal neuronal activity Nature 397 613ndash616
112 Eichenbaum H (2014) Time cells in the hippocampus a newdimension for mapping memoriesNat RevNeurosci 15732ndash744
113 McKenzie S etal (2014) Hippocampal representationof relatedand opposing memories develop within distinct hierarchicallyorganized neural schemas Neuron 83 202ndash215
114 Quiroga RQ et a l (2005) Invariant visual representation bysingle neurons in the human brain Nature 435 1102ndash1107
115 McClelland JL (2013) Incorporating rapid neocortical learningof new schema-consistent information into complementarylearningsystemstheory
J
ExpPsychol Gen
142
1190ndash1210
116 McClelland JL and Goddard NH (1996) Considerations aris-ing from a complementary learn ing systems perspective onhippocampus and neocortex Hippocampus 6 654ndash665
117 Hinton GE et al (1986) Distributed representations In Explo- rations in the Microstructure of Cognition Vol 1 Foundations
(Rumelhart DE et al eds) pp 77ndash109 MIT Press
118 Krizhevsky A et a l (2012) Imagenet classi1047297cation with deepconvolutional neural networks Adv Neural Inf Process Syst25 1106ndash1114
119 Mnih V et a l (2015) Human-level control through deep rein-forcement learning Nature 518 529ndash533
120 Alme CB et al (2014) Place cells in the hippocampus elevenmaps for eleven rooms Proc Nat l Acad Sci USA 11118428ndash18435
121 Samsonovich A and McNaughton BL (1997) Path integrationand cognitive mapping in a continuous attractor neural network model J Neurosci 17 5900ndash5920
122 Buzsaki G andMoser EI (2013)Memorynavigationand thetarhythmin thehippocampalndashentorhinalsystemNatNeurosci16130ndash138
123 Renno-Costa C etal (2014) A signatureof attractordynamicsinthe CA3 region of the hippocampus PLoS Comput Biol 10e1003641
124 Wills TJ et al (2005) Attractor dynamics in the hippocampalrepresentation of the local environment Science 308 873ndash876
Published online October15 2014 httparxivorgabs14103916
128 ScovilleWBand Milner B (1957)Loss of recentmemory afterbilateral hippocampal lesions J Neurol Neurosurg Psychiatry 20 11ndash12
129 Nadel L and Moscovitch M (1997) Memory consolidationretrograde amnesia and the hippocampal complex Curr OpinNeurobiol 7 217ndash227
130 MoscovitchM et al (2005) Functionalneuroanatomy of remoteepisodicsemanticand spatial memory a uni1047297ed account basedon multiple trace theory J Anat 207 35ndash66
131 Yassa MA and Stark CE (2011) Pattern separation in thehippocampus Trends Neurosci 34 515ndash525
132 Liu X et al (2012) Optogenetic stimulation of a hippocampalengram activates fear memory recall Nature 484 381ndash385
133 LeutgebJK etal (2007) Pattern separationin thedentate gyrusand CA3 of the hippocampus Science 315 961ndash966
134 LeutgebS etal (2004) Distinct ensemblecodes in hippocampalareas CA3 and CA1 Science 305 1295ndash1298
136 McHugh TJ etal (2007) Dentate gyrusNMDA receptorsmedi-ate rapid pattern separation in the hippocampal network Sci-ence 317 94ndash99
137 Neunuebel JP andKnierimJJ (2014)CA3 retrieves coherentrepresentations from degraded input direct evidence for CA3pattern completion and dentate gyrus pattern separation Neu- ron 81 416ndash427
138 Nakazawa K et al (2002) Requirement for hippocampal CA3
NMDA receptors in associative memory recall Science 297211ndash218
139 Jezek K etal (2011) Theta-paced 1047298ickering between place-cellmaps in the hippocampus Nature 478 246ndash249
140 Richards BA et al (2014) Patterns across multiple memoriesare identi1047297ed over time Nat Neurosci 17 981ndash986
141 Ketz N et al (2013) Theta coordinated error-driven learning inthe hippocampus PLoS Comput Biol 9 e1003067
142 Kumaran D andMaguire EA (2009)Novelty signals a windowinto hippocampal informationprocessing TrendsCognSci 1347ndash54
143 Moser EI andMoserMB (2003)One-shot memory in hippo-campal CA3 networks Neuron 38 147ndash148
144 Chaudhuri R and Fiete I (2016) Computational principles of memory Nat Neurosci 19 394ndash403
145 Lee H et a l (2015) Neural population evidence of functionalheterogeneity alongthe CA3 transverse axis pattern completion
versus pattern separation Neuron 87 1093ndash1105
146 Lu L etal (2015)Topographyof placemaps along theCA3-to-CA2 axis of the hippocampus Neuron 87 1078ndash1092
147 Collin SH et al (2015) Memory hierarchies map onto thehippocampal longaxis inhumansNatNeurosci181562ndash1564
148 Poppenk J et al (2013) Long-axis specialization of the humanhippocampus Trends Cogn Sci 17 230ndash240
149 Strange BA et al (2014) Functional organization of the hippo-campal longitudinal axis Nat Rev Neurosci 15 655ndash669
150 Ranganath C and Ritchey M (2012) Two cortical systems formemory-guided behaviour Nat Rev Neurosci 13 713ndash726
151 Hasselmo ME andSchnell E (1994)Laminar selectivity of thecholinergic suppression of synaptic transmission in rat hippo-campal region CA1 computational modeling and brain slicephysiology J Neurosci 14 3898ndash3914
152 Vazdarjanova A and Guzowski JF (2004) Differences in hip-pocampal neuronal population responses to modi1047297cations of an
environmental context evidence for distinct yet complementaryfunctions of CA3 and CA1 ensembles J Neurosci 24 6489ndash6496
161 Grossberg S (1987) Competitive learning from interactive acti-vation to adaptive resonance Cogn Sci 11 23ndash63
162 LaRocque KF et al (2013) Global similarity and pattern sepa-ration in the human medial temporal lobe predict subsequentmemory J Neurosci 33 5466ndash5474
163 McClelland JL and Rumelhart DE (1981) An interactiveactivation
model of contex t
e ffec ts in let te r percept ionPart 1 An account of the bas ic 1047297ndings Psychol Rev 88375ndash407
Trendsin CognitiveSciences July 2016 Vol 20 No 7 533
7252019 What Learning Systems Do Intelligent Agents Need Complementary Learning Systems Theory Updated
165 Hintzman DL (1986) lsquoSchema abstractionrsquo in a multiple-tracememory model Psychol Rev 93 411ndash428
166 Suthana NA et al (2015) Speci1047297c responses of human hippo-
campal neurons are associated with better memory Proc Natl Acad Sci USA 112 10503ndash10508
167 Wood ER et al (2000) Hippocampal neurons encode informa-tion about different types of memory episodes occurring in thesame location Neuron 27 623ndash633
168 Ferbinteanu
J and Shapiro
ML
(2003) Prospective andretrospective memory coding in the hippocampus Neuron 401227ndash1239
169 Bower MR et al (2005) Sequential-context-dependent hippo-campa l ac ti vi ty i s no t necessary to lea rn sequences withrepeated elements J Neurosci 25 1313ndash1323
170 MacDonald CJ et a l (2013) Distinct hippocampal time cellsequences represent odor memories in immobil ized rats JNeurosci 33 14607ndash14616
171 Markus EJ etal (1995) Interactions between location and task affectthe spatial anddirectional 1047297ringof hippocampal neurons JNeurosci 15 7079ndash7094
172 Skaggs WE and McNaughton BL (1998) Spatial 1047297ringproperties of hippocampal CA1 populations in an environmentcontaining two visually identical regions J Neurosci 18 8455ndash8466
173 Kriegeskorte N et al (2008) Representational similarity analysisndash connectingthe branchesof systemsneuroscienceFront SystNeurosci 2 4
174 Komorowski RW et al (2009) Robust conjunctive item-placecoding by hippocampal neurons parallels learning whathappenswhere J Neurosci 29 9918ndash9929
175 EllenbogenJM etal (2007) Human relationalmemory requirestime and sleep Proc Natl Acad Sci USA 104 7723ndash7728
176 Dumay N andGaskell MG(2007)Sleep-associated changes inthementalrepresentationofspokenwords Psychol
Sci1835ndash39
177 Coutanche MN and Thompson-Schill SL (2014) Fast map-
ping rapidly integrates information into existing memory net-works J Exp Psychol Gen 143 2296ndash2303
178 Sharon T etal (2011) Rapidneocorticalacquisition of long-termarbitrary associations independent of the hippocampus ProcNatl Acad Sci USA 108 1146ndash1151
179 Merhav M et al (2014) Neocortical catastrophic interference inhealthy and amnesic adults a paradoxical matter of time Hip- pocampus 24 1653ndash1662
180 Smith CN et al (2014) Comparison of explicit and incidentallearning strategies in memory-impaired patients Proc Natl
Acad Sci USA 111 475ndash479
181 Warren DE and Duff MC (2014) Not so fast hippocampalamnesia slows word learning despite successful fast mappingHippocampus 24 920ndash933
182 Greve A et al (2014) No evidence that lsquofast-mappingrsquo bene1047297tsnovel learningin healthyolderadultsNeuropsychologia 6052ndash59
183 Schaul T et al (2016) Prioritized experience replay In Interna-
tional Conference on Learning Representations184 Gallistel CR (1990) The Organization of LearningMIT Press
185 Hochreiter S and Schmidhuber J (1997) Long short-termmemory Neural Comput 9 1735ndash1780
186 Santoro A etal (2016) Meta-Learning withmemory augmentedneural networks In International Conference in Machine
Learning
187 Treves A and Rolls ET (1994) Computational analysis of therole of the hippocampus in memory Hippocampus 4 374ndash391
7252019 What Learning Systems Do Intelligent Agents Need Complementary Learning Systems Theory Updated
109 Tamminen J et a l (2015) From speci1047297c examples to generalknowledge in language learning Cogn Psychol 79 1ndash39
110 Walker MPand Stickgold R (2010) Overnight alchemy sleep-
dependent memory evolution Nat Rev Neurosci 11 218111 Wood ER et al (1999) The global record of memory in hippo-
campal neuronal activity Nature 397 613ndash616
112 Eichenbaum H (2014) Time cells in the hippocampus a newdimension for mapping memoriesNat RevNeurosci 15732ndash744
113 McKenzie S etal (2014) Hippocampal representationof relatedand opposing memories develop within distinct hierarchicallyorganized neural schemas Neuron 83 202ndash215
114 Quiroga RQ et a l (2005) Invariant visual representation bysingle neurons in the human brain Nature 435 1102ndash1107
115 McClelland JL (2013) Incorporating rapid neocortical learningof new schema-consistent information into complementarylearningsystemstheory
J
ExpPsychol Gen
142
1190ndash1210
116. McClelland, J.L. and Goddard, N.H. (1996) Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665
117. Hinton, G.E. et al. (1986) Distributed representations. In Explorations in the Microstructure of Cognition, Vol. 1: Foundations (Rumelhart, D.E. et al., eds), pp. 77–109, MIT Press
118. Krizhevsky, A. et al. (2012) Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1106–1114
119. Mnih, V. et al. (2015) Human-level control through deep reinforcement learning. Nature 518, 529–533
120. Alme, C.B. et al. (2014) Place cells in the hippocampus: eleven maps for eleven rooms. Proc. Natl. Acad. Sci. U.S.A. 111, 18428–18435
121. Samsonovich, A. and McNaughton, B.L. (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J. Neurosci. 17, 5900–5920
122. Buzsaki, G. and Moser, E.I. (2013) Memory, navigation and theta rhythm in the hippocampal–entorhinal system. Nat. Neurosci. 16, 130–138
123. Renno-Costa, C. et al. (2014) A signature of attractor dynamics in the CA3 region of the hippocampus. PLoS Comput. Biol. 10, e1003641
124. Wills, T.J. et al. (2005) Attractor dynamics in the hippocampal representation of the local environment. Science 308, 873–876
Published online October 15, 2014. http://arxiv.org/abs/1410.3916
128. Scoville, W.B. and Milner, B. (1957) Loss of recent memory after bilateral hippocampal lesions. J. Neurol. Neurosurg. Psychiatry 20, 11–12
129. Nadel, L. and Moscovitch, M. (1997) Memory consolidation, retrograde amnesia and the hippocampal complex. Curr. Opin. Neurobiol. 7, 217–227
130. Moscovitch, M. et al. (2005) Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory. J. Anat. 207, 35–66
131. Yassa, M.A. and Stark, C.E. (2011) Pattern separation in the hippocampus. Trends Neurosci. 34, 515–525
132. Liu, X. et al. (2012) Optogenetic stimulation of a hippocampal engram activates fear memory recall. Nature 484, 381–385
133. Leutgeb, J.K. et al. (2007) Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science 315, 961–966
134. Leutgeb, S. et al. (2004) Distinct ensemble codes in hippocampal areas CA3 and CA1. Science 305, 1295–1298
136. McHugh, T.J. et al. (2007) Dentate gyrus NMDA receptors mediate rapid pattern separation in the hippocampal network. Science 317, 94–99
137. Neunuebel, J.P. and Knierim, J.J. (2014) CA3 retrieves coherent representations from degraded input: direct evidence for CA3 pattern completion and dentate gyrus pattern separation. Neuron 81, 416–427
138. Nakazawa, K. et al. (2002) Requirement for hippocampal CA3 NMDA receptors in associative memory recall. Science 297, 211–218
139. Jezek, K. et al. (2011) Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249
140. Richards, B.A. et al. (2014) Patterns across multiple memories are identified over time. Nat. Neurosci. 17, 981–986
141. Ketz, N. et al. (2013) Theta coordinated error-driven learning in the hippocampus. PLoS Comput. Biol. 9, e1003067
142. Kumaran, D. and Maguire, E.A. (2009) Novelty signals: a window into hippocampal information processing. Trends Cogn. Sci. 13, 47–54
143. Moser, E.I. and Moser, M.B. (2003) One-shot memory in hippocampal CA3 networks. Neuron 38, 147–148
144. Chaudhuri, R. and Fiete, I. (2016) Computational principles of memory. Nat. Neurosci. 19, 394–403
145. Lee, H. et al. (2015) Neural population evidence of functional heterogeneity along the CA3 transverse axis: pattern completion versus pattern separation. Neuron 87, 1093–1105
146. Lu, L. et al. (2015) Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron 87, 1078–1092
147. Collin, S.H. et al. (2015) Memory hierarchies map onto the hippocampal long axis in humans. Nat. Neurosci. 18, 1562–1564
148. Poppenk, J. et al. (2013) Long-axis specialization of the human hippocampus. Trends Cogn. Sci. 17, 230–240
149. Strange, B.A. et al. (2014) Functional organization of the hippocampal longitudinal axis. Nat. Rev. Neurosci. 15, 655–669
150. Ranganath, C. and Ritchey, M. (2012) Two cortical systems for memory-guided behaviour. Nat. Rev. Neurosci. 13, 713–726
151. Hasselmo, M.E. and Schnell, E. (1994) Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J. Neurosci. 14, 3898–3914
152. Vazdarjanova, A. and Guzowski, J.F. (2004) Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J. Neurosci. 24, 6489–6496
161. Grossberg, S. (1987) Competitive learning: from interactive activation to adaptive resonance. Cogn. Sci. 11, 23–63
162. LaRocque, K.F. et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J. Neurosci. 33, 5466–5474
163. McClelland, J.L. and Rumelhart, D.E. (1981) An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychol. Rev. 88, 375–407
Trends in Cognitive Sciences, July 2016, Vol. 20, No. 7, 533
165. Hintzman, D.L. (1986) ‘Schema abstraction’ in a multiple-trace memory model. Psychol. Rev. 93, 411–428
166. Suthana, N.A. et al. (2015) Specific responses of human hippocampal neurons are associated with better memory. Proc. Natl. Acad. Sci. U.S.A. 112, 10503–10508
167. Wood, E.R. et al. (2000) Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27, 623–633
168. Ferbinteanu, J. and Shapiro, M.L. (2003) Prospective and retrospective memory coding in the hippocampus. Neuron 40, 1227–1239
169. Bower, M.R. et al. (2005) Sequential-context-dependent hippocampal activity is not necessary to learn sequences with repeated elements. J. Neurosci. 25, 1313–1323
170. MacDonald, C.J. et al. (2013) Distinct hippocampal time cell sequences represent odor memories in immobilized rats. J. Neurosci. 33, 14607–14616
171. Markus, E.J. et al. (1995) Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J. Neurosci. 15, 7079–7094
172. Skaggs, W.E. and McNaughton, B.L. (1998) Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J. Neurosci. 18, 8455–8466
173. Kriegeskorte, N. et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2, 4
174. Komorowski, R.W. et al. (2009) Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. J. Neurosci. 29, 9918–9929
175. Ellenbogen, J.M. et al. (2007) Human relational memory requires time and sleep. Proc. Natl. Acad. Sci. U.S.A. 104, 7723–7728
176. Dumay, N. and Gaskell, M.G. (2007) Sleep-associated changes in the mental representation of spoken words. Psychol. Sci. 18, 35–39
177. Coutanche, M.N. and Thompson-Schill, S.L. (2014) Fast mapping rapidly integrates information into existing memory networks. J. Exp. Psychol. Gen. 143, 2296–2303
178. Sharon, T. et al. (2011) Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus. Proc. Natl. Acad. Sci. U.S.A. 108, 1146–1151
179. Merhav, M. et al. (2014) Neocortical catastrophic interference in healthy and amnesic adults: a paradoxical matter of time. Hippocampus 24, 1653–1662
180. Smith, C.N. et al. (2014) Comparison of explicit and incidental learning strategies in memory-impaired patients. Proc. Natl. Acad. Sci. U.S.A. 111, 475–479
181. Warren, D.E. and Duff, M.C. (2014) Not so fast: hippocampal amnesia slows word learning despite successful fast mapping. Hippocampus 24, 920–933
182. Greve, A. et al. (2014) No evidence that ‘fast-mapping’ benefits novel learning in healthy older adults. Neuropsychologia 60, 52–59
183. Schaul, T. et al. (2016) Prioritized experience replay. In International Conference on Learning Representations
184. Gallistel, C.R. (1990) The Organization of Learning, MIT Press
185. Hochreiter, S. and Schmidhuber, J. (1997) Long short-term memory. Neural Comput. 9, 1735–1780
186. Santoro, A. et al. (2016) Meta-Learning with memory augmented neural networks. In International Conference on Machine Learning
187. Treves, A. and Rolls, E.T. (1994) Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391