Associative memory
for online incremental learning
in a noisy environment
IJCNN2007
Tokyo Institute of Technology
Akihito Sudo, Akihiro Sato, and Osamu Hasegawa
2007/08/13
Copyright(C) 2007 Akihito Sudo All rights reserved. 1
Associative memory and intelligent robot
Associative memories have been used for intelligent robots [1].
The following properties are necessary for such associative
memories:
Incremental learning
Noise robustness
Dealing with real-valued data
Many-to-many association
[1] K. Itoh et al., “New memory model for humanoid robots – introduction of
co-associative memory using mutually coupled chaotic neural networks,” Proc. of the
2005 International Joint Conference on Neural Networks, pp. 2790–2795, 2005.
Related Works
There are two types of associative memories:
1. Distributed learning associative memory
E.g., Hopfield Network
E.g., Bidirectional Associative Memory (BAM)
2. Competitive learning associative memory
E.g., KFMAM (an associative memory model extended from SOM)
Problem of Distributed Associative Memory
Distributed associative memories forget previously learned
knowledge when learning new knowledge incrementally.
French pointed out the difficulty of avoiding this phenomenon
in distributed associative memories [1].
[1] R. French, “Using Semi-Distributed Representations to Overcome
Catastrophic Forgetting in Connectionist Networks,” Proc. of the 13th
Annual Cognitive Science Society Conference, pp. 173–178, 1991.
On the other hand…
Competitive learning associative memories are relatively well suited
to incremental learning.
KFMAM-FW [1], a competitive learning model, was
proposed for incremental learning.
[1] T. Yamada et al., “Sequential Learning for Associative Memory using
Kohonen Feature Map,” in Proc. of the 1999 International Joint
Conference on Neural Networks, pp. 1920–1923, 1999.
Problem of KFMAM-FW
A user must determine the number of nodes before starting to
train KFMAM-FW.
KFMAM-FW is therefore poorly suited to environments where the maximum
number of patterns to be learned is not known in advance:
If too few nodes are allocated, KFMAM-FW cannot learn all of the
knowledge.
If too many nodes are allocated, KFMAM-FW suffers from wasted memory
and unnecessary computational load.
Objectives
Propose an associative memory model, SOINN-AM, which
has the following properties:
Overcomes the above limitation on incremental learning
Noise robustness
Dealing with real-valued data
Many-to-many association
Features of Proposed Method 1
Nodes arise in a self-organizing manner. Therefore:
Users do not have to determine the number of nodes in advance.
Previously learned knowledge is not forgotten even when learning
incrementally.
Shortage or redundancy of nodes is avoided.
Features of Proposed Method 2
In addition to incremental learning, the proposed method
realizes the following features:
Robustness to noise
Dealing with real-valued data
Many-to-many association
Architecture of Proposed Method
The model consists of an input layer and a competitive layer.
A node in the input layer receives a real value and feeds it into the
competitive layer.
Nodes in the competitive layer hold associative pairs and are
categorized into clusters by edges.
In the competitive layer, nodes are produced autonomously during the
learning phase.
A prototype node is generated at the center of each cluster.
Two phases in the proposed method
1. Training Phase
Training data are input into the input layer.
Nodes holding associative pairs are generated and eliminated
autonomously.
2. Recalling Phase
A real-valued vector is input into the input layer.
A corresponding real-valued vector is recalled as a result of
association.
Training Phase 1
Generate Input into Competitive Layer
1. Input two vectors F = [f1, …, fn] and R = [r1, …, rm] into the input
layer.
2. Combine the two vectors into one vector
X = [f1, …, fn, r1, …, rm].
3. Perturb X with Gaussian noise as follows:
Ic = X + ns_i,
where ns_i ~ N(0, s_i²), a Gaussian distribution with mean 0 and variance s_i².
4. Feed Ic to the competitive layer.
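Steps 1–4 can be sketched in Python; the function name and the random-generator argument are illustrative, not from the paper:

```python
import numpy as np

def make_competitive_input(F, R, sigma, rng=None):
    """Concatenate the pattern pair (F, R) into X and perturb it with
    zero-mean Gaussian noise of variance sigma**2 (steps 1-4)."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.concatenate([np.asarray(F, dtype=float), np.asarray(R, dtype=float)])
    noise = rng.normal(loc=0.0, scale=sigma, size=X.shape)  # ns_i ~ N(0, sigma^2)
    return X + noise  # Ic, fed to the competitive layer
```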
Training Phase 2
Find Winners in Competitive Layer
5. Find the 1st and 2nd nearest nodes in the competitive layer to Ic
(the first and second winners).
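Step 5 amounts to a nearest-neighbor search over the competitive-layer weights; Euclidean distance is assumed in this sketch:

```python
import numpy as np

def find_winners(Ic, weights):
    """Return indices of the 1st and 2nd nearest competitive-layer nodes
    to Ic (step 5). `weights` is a (num_nodes, dim) array-like."""
    d = np.linalg.norm(np.asarray(weights, dtype=float) - np.asarray(Ic, dtype=float), axis=1)
    order = np.argsort(d)
    return int(order[0]), int(order[1])  # first winner, second winner
```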
Training Phase 3
Judge whether input is unknown knowledge
6. Calculate the “similarity threshold” d_i for both the 1st winner and the 2nd winner,
where N_i denotes the set of neighbors of the i-th node and W_i its weight:
d_i = max_{j ∈ N_i} ||W_i − W_j|| if node i has neighbors, and
d_i = min_{j ≠ i} ||W_i − W_j|| otherwise.
7. Verify that
||Ic − W_r|| ≤ d_r and ||Ic − W_q|| ≤ d_q, (1)
where r and q are the 1st and 2nd winners, respectively.
8. If (1) does not hold, the input training datum is unknown knowledge.
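Steps 6–8 can be sketched as follows, using the usual SOINN threshold rule (farthest neighbor if the node has neighbors, otherwise nearest other node); the function names are illustrative:

```python
import numpy as np

def similarity_threshold(i, weights, neighbors):
    """Similarity threshold d_i (step 6). `neighbors[i]` is the set of
    nodes joined to node i by an edge."""
    W = np.asarray(weights, dtype=float)
    if neighbors[i]:
        return max(np.linalg.norm(W[i] - W[j]) for j in neighbors[i])
    others = [j for j in range(len(W)) if j != i]
    return min(np.linalg.norm(W[i] - W[j]) for j in others)

def is_unknown(Ic, r, q, weights, neighbors):
    """Steps 7-8: the input is unknown unless it lies within the
    similarity thresholds of both winners r and q (condition (1))."""
    W = np.asarray(weights, dtype=float)
    Ic = np.asarray(Ic, dtype=float)
    return (np.linalg.norm(Ic - W[r]) > similarity_threshold(r, weights, neighbors)
            or np.linalg.norm(Ic - W[q]) > similarity_threshold(q, weights, neighbors))
```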
Training Phase 4
When the training pattern is unknown knowledge
9. When the input training datum is unknown, create a new node
whose weight is Ic.
Training Phase 4’
When the training pattern is NOT unknown knowledge
9’. If no edge exists between the 1st winner and the 2nd winner,
create a new edge between them.
10’. Set the age of the edge between the 1st winner and the 2nd winner to
zero.
11’. Add 1 to the age of all edges emanating from the 1st winner, and
remove the edges whose ages exceed the parameter L_edge.
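Steps 9’–11’ can be sketched with a dict that maps each edge to its age; the representation is illustrative:

```python
def update_edges(r, q, edges, L_edge):
    """Steps 9'-11': `edges` maps frozenset({a, b}) -> age.
    Create/reset the winner-pair edge, then age every edge emanating
    from the 1st winner r and prune those older than L_edge."""
    edges[frozenset((r, q))] = 0       # steps 9'-10': create if absent, age = 0
    for e in list(edges):
        if r in e:
            edges[e] += 1              # step 11': age each edge touching r
            if edges[e] > L_edge:
                del edges[e]           # remove edges older than L_edge
    return edges
```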
Training Phase 5
Move nodes
12. Add ΔW_r and ΔW_i to the weights of the 1st winner W_r and of each of its
neighbors W_i, where
ΔW_r = ε1 (Ic − W_r), ΔW_i = ε2 (Ic − W_i),
and ε1, ε2 are parameters.
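Step 12 pulls the 1st winner and its neighbors toward the input, with separate rates for the winner and its neighbors; the numeric defaults below are illustrative:

```python
import numpy as np

def move_nodes(Ic, r, weights, neighbors, eps1=0.1, eps2=0.01):
    """Step 12: update the 1st winner r by eps1*(Ic - W_r) and each of
    its neighbors i by eps2*(Ic - W_i)."""
    W = np.asarray(weights, dtype=float)
    Ic = np.asarray(Ic, dtype=float)
    W[r] += eps1 * (Ic - W[r])          # Delta_W_r
    for i in neighbors[r]:
        W[i] += eps2 * (Ic - W[i])      # Delta_W_i for each neighbor
    return W
```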
Training Phase 6
Remove unnecessary nodes
13. If the number of input patterns is an integer multiple of the
parameter l, remove nodes with no neighbors.
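Step 13 simply drops every node that no edge touches; the set-based representation here is illustrative:

```python
def remove_isolated_nodes(nodes, edges):
    """Step 13: drop nodes with no neighbors. `nodes` is a set of node
    ids, `edges` an iterable of 2-element frozensets."""
    connected = set().union(*edges) if edges else set()
    return {n for n in nodes if n in connected}
```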
Recalling Phase 1
1. Input a pattern K as an associative key.
2. Derive the mean distance d_{k↔i} between K and each node i, i.e., the
distance between K and the key part (the first n components) of the weight
of the i-th node, averaged over the components.
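A sketch of step 2; the per-component averaging is the point, while the absolute-difference form of the distance is an assumption of this sketch:

```python
import numpy as np

def mean_distance(K, w_i):
    """Mean per-component distance between the key K (length n) and the
    key part of node i's weight vector (its first n components)."""
    K = np.asarray(K, dtype=float)
    key_part = np.asarray(w_i, dtype=float)[:len(K)]
    return float(np.mean(np.abs(K - key_part)))
```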
Recalling Phase 2
3. If d_{k↔i} < d_r, generate the output O from the representative node r of the cluster to
which the i-th node belongs: O is the recalled part (the last m components)
of the weight of node r.
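Steps 3–4 can be sketched as follows; a uniform threshold stands in for the per-node d_r, and the cluster/representative mappings are illustrative:

```python
import numpy as np

def recall(K, weights, cluster_of, representative, threshold):
    """Steps 3-4: for every node i close enough to the key K, output the
    recalled part (the components after the first n) of the weight of the
    representative node of i's cluster; otherwise report unknown."""
    K = np.asarray(K, dtype=float)
    n = len(K)
    outputs = {}
    for i, w in enumerate(weights):
        w = np.asarray(w, dtype=float)
        if np.mean(np.abs(K - w[:n])) < threshold:
            r = representative[cluster_of[i]]
            outputs[cluster_of[i]] = np.asarray(weights[r], dtype=float)[n:]
    return list(outputs.values()) if outputs else "unknown pattern"
```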
Recalling Phase 3
4. Output all patterns generated in step 3. If no node satisfies d_{k↔i} < d_r,
reply “unknown pattern.”
Experiment ~Methods~
Distributed Learning Associative Memory
BAM with PRLAB[1]
Competitive Learning Associative Memory
SOINN-AM (Proposed Method)
KFMAM[2]
KFMAM-FW[3]
[1] H. Oh and S.C. Kothari, “Adaptation of the relaxation method for learning in bidirectional associative memory,” IEEE Trans.
Neural Networks, Vol. 5, No. 4, pp. 576–583, 1994.
[2]H. Ichiki et al., “Kohonen feature maps as a supervised learning machine,” in Proc. of the IEEE International Conference on
Neural Networks, pp. 1944–1948, 1993.
[3]T. Yamada et al., “Sequential Learning for Associative Memory using Kohonen Feature Map,” in Proc. of the 1999
International Joint Conference on Neural Networks, pp. 1920–1923, 1999.
Experiment ~Data~
We employed the following data for the experiments:
Binary images: 7×7-pixel alphabetical images
Grayscale images: 92×112-pixel facial images from the AT&T facial image database
Experiment 1 ~Incremental Learning~
Training Data
The systems obtained the training data sequentially,
NOT in a batch manner.
At the recalling phase, capital letters were fed as associative keys.
Results
Only the proposed method and KFMAM-FW (36 nodes and 64 nodes) recalled correctly for all
associative keys.
However, KFMAM-FW (25 nodes and 16 nodes) fell into an infinite loop during the training
phase.
Experiment 2 ~Many-to-Many Association~
Training Data
Associative keys were A, C, F, and J at the recalling phase.
Result
The proposed method recalled all patterns perfectly.
Experiment 3 ~Sensitivity to Noise~
Training Data
Patterns generated by adding binary noise to capital letters
were presented as associative keys at the recalling phase.
Results
The proposed method was more robust to noise at every noise level.
[Figure: perfect recall (%) versus noise level (0–26%) for SOINN-AM, KFMAM-FW,
KFMAM (batch and sequential learning), and BAM (batch and sequential learning).]
Experiment 4 ~Grayscale Images~
Training Data: 5-to-1 association for 10 people
Result
The mean error per pixel was between 1.0×10⁻⁵ and 2.0×10⁻⁵.
Conclusion
We proposed a novel associative memory model with the
following properties:
suited for incremental learning;
robust to noise;
able to deal with grayscale data;
able to deal with many-to-many association.