Unsupervised Learning Networks
•Associative Memory Networks
ELE571
Digital Neural Networks
Associative Memory Networks
• feedforward type (one-shot recovery)
• feedback type, e.g., the Hopfield network (iterative recovery)

Associative memory networks recall the original undistorted pattern from a distorted or partially missing pattern.
Associative Memory Model (feedforward type)

[Figure: input vector a → weight matrix W → nonlinear unit → output vector b]

W could be (1) symmetric or not, (2) square or not
nonlinear unit: e.g., threshold

Recall of the stored pair k (with W = Σₘ b(m)ᵀ a(m), assuming the b(m) are orthonormal):
b(k) W = b(k) Σₘ b(m)ᵀ a(m) = a(k)
Bidirectional Associative Memory
a1 = [1 1 1 1 -1 1 1 1 1]
a2 = [1 -1 1 -1 1 -1 1 -1 1]

X = [ 1  1  1  1 -1  1  1  1  1
      1 -1  1 -1  1 -1  1 -1  1 ]

The weight matrix:

W = XᵀX =
  2  0  2  0  0  0  2  0  2
  0  2  0  2 -2  2  0  2  0
  2  0  2  0  0  0  2  0  2
  0  2  0  2 -2  2  0  2  0
  0 -2  0 -2  2 -2  0 -2  0
  0  2  0  2 -2  2  0  2  0
  2  0  2  0  0  0  2  0  2
  0  2  0  2 -2  2  0  2  0
  2  0  2  0  0  0  2  0  2
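As a check, the outer-product construction above can be reproduced numerically. The following NumPy sketch (not part of the original slides) forms W = XᵀX from the two stored patterns and shows that thresholded recall recovers a1:

```python
import numpy as np

# The two stored bipolar patterns from the slide
a1 = np.array([1, 1, 1, 1, -1, 1, 1, 1, 1])
a2 = np.array([1, -1, 1, -1, 1, -1, 1, -1, 1])

# Stack the patterns as rows of X and form the correlation weight matrix
X = np.vstack([a1, a2])        # shape (2, 9)
W = X.T @ X                    # shape (9, 9), W = X^T X

print(W[0])                    # first row: [2 0 2 0 0 0 2 0 2]

# Recall: thresholding a1 W recovers a1 despite the crosstalk term
# (a1 W = 9 a1 - a2, and sign(9 a1 - a2) = a1)
recalled = np.sign(a1 @ W)
print(np.array_equal(recalled, a1))   # True
```

Note that a1 and a2 are not orthogonal (a1·a2 = -1), so a1 W is not exactly a multiple of a1; the threshold nonlinearity absorbs the crosstalk.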
Associative Memory Model (feedback type)

[Figure: aold → W → anew, fed back iteratively]

W must be
• normalized
• Hermitian: W = Wᴴ, e.g., W = X⁺X
Each iteration in AMM(W) comprises two substeps:
(a) Projection of aold onto the W-plane:
anet = W aold
(b) Remap the net vector to the closest symbol vector:
anew = T[anet] = arg minₛ || s − W aold ||,  s ranging over valid symbol vectors
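The two substeps can be sketched in code. A minimal NumPy sketch, assuming bipolar ±1 symbol vectors and W = X⁺X (the function name and toy patterns are illustrative):

```python
import numpy as np

def amm_iteration(W, a_old):
    """One feedback-AMM iteration: projection, then resymbolization."""
    a_net = W @ a_old              # (a) project a_old onto the W-plane
    a_new = np.sign(a_net)         # (b) nearest bipolar symbol vector
    a_new[a_new == 0] = 1          # break exact-zero ties (0 -> +1) by convention
    return a_new

# Toy example: W projects onto the row space of two stored patterns
X = np.array([[1, 1, 1, -1], [1, -1, 1, 1]], dtype=float)
W = np.linalg.pinv(X) @ X          # W = X+ X
a = np.array([1.0, 1, -1, -1])     # distorted copy of the first pattern
for _ in range(5):
    a = amm_iteration(W, a)
print(a)                           # converges to the stored pattern [1 1 1 -1]
```

For bipolar symbols, minimizing || s − W aold || entrywise reduces to taking the sign of each component, which is why step (b) is just a threshold.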
The two substeps in one iteration can be summarized as one procedure:
[Figure: 2 steps in one AMM iteration — starting from an initial vector, a linear projection onto the x-plane (the g-update in DML), followed by resymbolization, a nonlinear mapping to the nearest symbol vector (the s-update in DML); the iteration converges toward a perfect attractor on the x-plane]
anet is the (least-square-error) projection of aold
onto the (column) subspace of W.
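The least-square-error claim can be checked numerically: W = X⁺X is symmetric and idempotent, i.e., an orthogonal projector, so the residual a − W a is orthogonal to the subspace spanned by X. A NumPy sketch (random data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 8))       # 3 stored rows, length-8 patterns
W = np.linalg.pinv(X) @ X             # W = X+ X

# Orthogonal-projector properties: W = W^T (Hermitian) and W^2 = W
print(np.allclose(W, W.T), np.allclose(W @ W, W))   # True True

# Hence W a is the least-square-error approximation of a within the
# subspace spanned by X: the residual is orthogonal to every row of X
a = rng.standard_normal(8)
residual = a - W @ a
print(np.allclose(X @ residual, 0))   # True
```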
Common Assumptions on Signals for Associative Retrieval

Inherent properties of the signals (patterns) to be retrieved:
• Orthogonality
• Higher-order statistics
• Constant-modulus property
• FAE-property
• others
Blind Recovery of MIMO System

[Figure: source S → channel H → + noise ε → observation X; an equalizer g gives ŝ = g X = g H S = v S, where v = g H]
[Figure: flat MIMO channel with coefficients h11 … hqp mixing sources s1 … sp into q observations, followed by equalizers g1 … gq]

Goal: to find g such that v = g H and
v S = [ 0 … 0 1 0 … 0 ] S = sj,
i.e., the equalizer output equals a single source sj.
Signal Recoverability
H is PR (perfectly recoverable) if and only if H has full column rank, i.e., a left inverse H⁺ exists.
Examples for Flat MIMO

Assumptions on MIMO System: for deterministic H, ……

H = [ 1  2
      1  2
      1  2 ]   →  non-recoverable (columns are dependent)

H = [ 1  2
      1  3
      1  2 ]   →  recoverable (full column rank)

If H is perfectly recoverable, e.g., H = [1 2; 1 3; 1 2], a parallel equalizer recovers all sources:
ŝ = H⁺X = G X
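The two example channels can be tested directly: a rank check decides recoverability, and the pseudoinverse acts as the parallel equalizer. A NumPy sketch (the source matrix S is an arbitrary illustration):

```python
import numpy as np

H_bad  = np.array([[1, 2], [1, 2], [1, 2]], dtype=float)   # dependent columns
H_good = np.array([[1, 2], [1, 3], [1, 2]], dtype=float)   # full column rank

print(np.linalg.matrix_rank(H_bad), np.linalg.matrix_rank(H_good))   # 1 2

# Parallel equalizer for the recoverable channel: s_hat = H+ X = G X
S = np.array([[1, -1, 1], [-1, -1, 1]], dtype=float)   # 2 sources, 3 snapshots
X = H_good @ S
G = np.linalg.pinv(H_good)          # left inverse, since rank(H_good) = 2
S_hat = G @ X
print(np.allclose(S_hat, S))        # True: all sources recovered at once
```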
Example: Wireless MIMO System

[Figure: source S (symbols drawn from signal constellation Ci) → channel H → + noise ε → observation X; equalizer g gives ŝ = g X = g H S = v S]

Signal recovery via g:
Given v S = ŝ, the output ŝ is a valid symbol vector for every valid symbol vector S if and only if
v = [ 0 … 0 ±1 0 … 0 ]
Theorem: FAE-Property (Finite-Alphabet Exclusiveness)

Suppose that v S = b. For the output b to be a valid symbol sequence for every valid S, given whatever v, the necessary and sufficient condition is that
v = [ 0 … 0 ±1 0 … 0 ],
i.e., v = ±e(k), a signed unit vector. In other words, it is impossible to produce a valid but different output symbol vector.
FAE-Property: Finite-Alphabet Exclusiveness

S = [ -1 +1 +1 +1 +1 +1 …
      +1 +1 +1 -1 -1 +1 …
      -1 -1 +1 -1 +1 +1 … ]

If v = [ 0 … 0 ±1 0 … 0 ]: v S = [valid symbols].
If v ≠ [ 0 … 0 ±1 0 … 0 ]: ŝ ∉ [finite alphabet].
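The property can be illustrated numerically. A NumPy sketch, using the slide's matrix S with the continuation columns dropped (a single S only illustrates the theorem, which quantifies over all valid S; the combining vector `v_bad` is an arbitrary non-unit example):

```python
import numpy as np

S = np.array([[-1,  1,  1,  1,  1,  1],
              [ 1,  1,  1, -1, -1,  1],
              [-1, -1,  1, -1,  1,  1]], dtype=float)

def is_valid(b):
    """All entries of b lie in the finite alphabet {-1, +1}."""
    return bool(np.all(np.isin(b, (-1.0, 1.0))))

v_good = np.array([0.0, -1.0, 0.0])   # signed unit vector: -1 times row 2
v_bad  = np.array([0.5,  0.5, 0.0])   # not of the form [0 .. 0 +/-1 0 .. 0]

print(is_valid(v_good @ S))   # True: output is a valid symbol sequence
print(is_valid(v_bad @ S))    # False: some entries fall outside {-1, +1}
```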
“EM”: Blind-BLAST

ŝ = g X = g H S = v S

E-step: ŝ = T[ ĝ X ]
M-step: ĝ = ŝ X⁺

• The E-step determines the best guess of the membership function zj.
• The M-step determines the best parameters, θn, which maximize the likelihood function.

Combined EM (iterating ŝold → ŝnew):
ŝnew = T[ ĝ X ] = T[ ŝold X⁺ X ] = T[ ŝold W ]
Associative Memory Network

W = X⁺X
ŝnew = Sign[ ŝold W ]    (Sign: threshold nonlinearity)
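The update ŝnew = Sign[ ŝold W ] with W = X⁺X can be written directly. A NumPy sketch in the slide's row-vector convention (the mixing setup mirrors the MatLab exercise below; sizes and seed are illustrative):

```python
import numpy as np

def amn_update(s_old, W):
    """One Associative Memory Network step: s_new = Sign[s_old W]."""
    s_new = np.sign(s_old @ W)
    s_new[s_new == 0] = 1          # threshold convention for exact zeros
    return s_new

rng = np.random.default_rng(1)
S = np.sign(rng.standard_normal((3, 50)))    # 3 bipolar source rows
A = rng.standard_normal((3, 3)) + np.eye(3)  # mixing matrix
X = A @ S                                    # observations
W = np.linalg.pinv(X) @ X                    # W = X+ X

s = np.sign(rng.standard_normal(50))         # random initial symbol vector
for _ in range(20):
    s = amn_update(s, W)
print(np.max(np.abs(S @ s)))   # near 50 when s has converged to a +/- row of S
```

Convergence to a stored row is not guaranteed from every initial vector; that is exactly the behavior the MatLab exercise at the end of this section probes.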
[Figure: one AMM iteration on the gX-plane — a linear projection onto the gX-plane (g = ŝold X⁺), followed by the FA nonlinear mapping from the initial vector ŝ to the nearest symbol vector: ŝ′ = T[ ŝ W ]]
Definition: Perfect Attractor of AMM(W)

A symbol vector a* is a "perfect attractor" of AMM(W) if and only if
• a* is a symbol vector
• a* = W a*
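With W = X⁺X, every stored row of X that is a symbol vector satisfies this definition, because X X⁺ X = X (and W is symmetric, so a* W = W a*). A NumPy check with illustrative patterns:

```python
import numpy as np

# Stored bipolar patterns as rows of X
X = np.array([[1,  1, -1, 1, -1, 1],
              [1, -1,  1, 1,  1, -1]], dtype=float)
W = np.linalg.pinv(X) @ X          # W = X+ X

a_star = X[0]                      # a stored symbol vector
# a* = W a* holds because X W = X X+ X = X
print(np.allclose(a_star @ W, a_star))   # True
```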
[Proof sketch]
Let v = [ f(1) ≠ 0, f(2), …, f(p) ].
Thus f(p) = 0.
Let ûi = [ ui(1) ui(2) … ui(p) ]ᵀ.
Let v′ = [ f(1) ≠ 0, f(2), …, f(p−1), 0 ].
Let ǖi = v ûi and ǖ′i = v′ ûi.
MatLab Exercise

Compare the two programs and determine the differences in performance. Why such a difference?

Program 1:

p = zeros(1,100);
for j = 1:100
    S = sign(randn(5,200));
    A = randn(5,5) + eye(5);
    X = A*S + 0.01*randn(5,200);
    s = sign(randn(200,1));
    W = X'*inv(X*X')*X;
    for i = 1:20
        sold = s;
        s = tanh(100*W*s);
        s = sign(s);
    end
    while norm(s - W*s) > 5.0
        s = sign(randn(200,1));
        for i = 1:20
            sold = s;
            s = tanh(100*W*s);
            s = sign(s);
        end
    end
    p(j) = max(abs(S*s));
end
hist(p)

Program 2 (identical, but without the while-loop restart):

p = zeros(1,100);
for j = 1:100
    S = sign(randn(5,200));
    A = randn(5,5) + eye(5);
    X = A*S + 0.01*randn(5,200);
    s = sign(randn(200,1));
    W = X'*inv(X*X')*X;
    for i = 1:20
        sold = s;
        s = tanh(100*W*s);
        s = sign(s);
    end
    p(j) = max(abs(S*s));
end
hist(p)