2586 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 68, 2020
Graph-Adaptive Semi-Supervised Tracking of Dynamic Processes Over Switching Network Modes
Qin Lu, Member, IEEE, Vassilis N. Ioannidis, Student Member, IEEE, and Georgios B. Giannakis, Fellow, IEEE
Abstract—A plethora of network-science related applications call for inference of spatio-temporal graph processes. Such an inference task can be aided by the underlying graph topology that might jump over discrete modes. For example, the connectivity in dynamic brain networks switches among candidate topologies, each corresponding to a different emotional state, also known as the network mode. Taking advantage of limited nodal observations, the present contribution deals with semi-supervised tracking of dynamic processes over a given candidate set of graphs with unknown switches. Towards this end, a dynamical model is introduced to capture the per-slot spatial correlation using the active topology, as well as the temporal variation across slots through a state-space model. A scalable graph-adaptive Bayesian approach is developed, based on what is termed the interacting multi-graph model (IMGM), to track the dynamic nodal processes and the active graph topology on-the-fly. Besides switching topologies, the proposed IMGM algorithm can accommodate various generalizations, including multiple dynamic functions, multiple kernels, and adaptive observation noise covariances. IMGM learns the dynamical model that best fits the data from a pool of available models. Thus, the resultant adaptive algorithm does not require offline model training. Numerical tests with synthetic and real datasets demonstrate the superior tracking performance of the novel approach compared to existing mode-clairvoyant alternatives.
Index Terms—Dynamic graph processes, switching network modes, online scalable Bayesian inference, multi-kernel learning.
I. INTRODUCTION
GRAPHS capture relations among entities (nodes), and have found widespread application in various fields, including sociology, biology, neuroscience and economics [15], [30]. Attributes collected in interdependent feature vectors per node represent processes over the graph. Given such vectors from a subset of nodes, various applications call for semi-supervised learning (SSL) of processes across all network nodes. The scarcity of nodal observations can be due to e.g., cost, and computational or privacy constraints. For example, individuals in social networks may be reluctant to share personal information, while acquiring nodal samples in brain networks may require invasive procedures such as electrocorticography.
Manuscript received June 1, 2019; revised January 31, 2020 and March 16, 2020; accepted March 23, 2020. Date of publication April 3, 2020; date of current version May 1, 2020. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Pierre Borgnat. This work was supported by NSF under Grants 1508993, 1711471, and 1901134. (Corresponding author: Qin Lu.)
The authors are with the Department of ECE and Digital Technology Center, University of Minnesota, Minneapolis, MN 55414 USA (e-mail: [email protected]; [email protected]; [email protected]).
Digital Object Identifier 10.1109/TSP.2020.2984889
SSL tasks over networks can leverage the prior information of the underlying graph topology that captures nodal interdependencies [12]. Existing approaches to reconstructing time-invariant (TI) graph processes often rely on the smoothness of graph processes [16], [31], which asserts that connected vertices have similar features. In social networks where nodes and edges represent users and their friendships, one can infer the age of a specific user from her or his friends' ages. Other than smoothness, inference from limited nodal observations can rely on e.g., 'graph bandlimitedness' [10], [29], or sparsity and overcomplete dictionaries [11]. Most of these approaches can be unified under the framework of learning using graph kernels; see e.g., [26].
The aforementioned SSL task becomes more challenging when nodal processes are nonstationary, and the graph topology is also time-varying. In a brain network for instance, where nodes correspond to brain regions and edges capture dependencies among them, one may be interested in predicting the dynamic processes as well as the varying interconnections. An interesting time-varying topology model switches over a set of connectivity patterns, also known as "network modes" [5]. For example, the connectivity among human brain regions varies as the humans' emotional, mental or physical activities change [36]. Coupled with the topology, the dynamics of nodal processes can also switch among different modes. Switching dynamical models have been typically employed to characterize the multi-modal behavior of control systems [27], as well as the kinematics of maneuvering targets such as drones [6]. Nevertheless, graph-based switching dynamical models have not been considered so far.
Several attempts have been made to reconstruct dynamic graph processes in the presence of possibly time-varying topologies. Inference of slow-varying processes over graphs has been pursued using the so-termed graph bandlimited model in [10], [35]. On the other hand, graph kernel-based estimators have been leveraged in [25], [14] to reconstruct general dynamic processes. All these contemporary approaches rely on a known graph topology and fixed dynamic models. However, the dynamic graph can change or switch in an unknown fashion among a set of possibly known topologies, which may reflect sudden changes in the partially observed signals. Furthermore, even when no topology switches occur, the graph process can evolve over multiple dynamical models across time, and thus a fixed model may be inadequate.
The present paper puts forth an approach for semi-supervised tracking and extrapolation of dynamic nodal processes over switching graphs. Our contribution is threefold.
1053-587X © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: University of Minnesota.
Downloaded on May 08,2020 at 00:39:08 UTC from IEEE Xplore.
Restrictions apply.
C1. The evolution of dynamic processes over switching graphs is captured by a first-order vector autoregressive model, where the transition matrix and the process noise covariance matrix depend on the active mode-conditioned topology. The resulting graph-adaptive dynamical model accounts for both spatial correlation within one slot and temporal variations across slots.
C2. Given a candidate set of the aforementioned mode-conditioned dynamical models and measurements on a subset of nodes, we put forth a scalable graph-aware Bayesian tracker, termed interacting multi-graph model (IMGM), to jointly estimate the graph processes and active network modes on-the-fly.
C3. Further, the proposed IMGM framework accommodates various modeling extensions, including switching nonlinear dynamical functions, multiple kernels, and adaptive observation covariances. By accounting for these dynamical models, IMGM adapts to the observed data and selects the pertinent model per time slot without requiring offline training.
If observations were available at all nodes, it would have been possible to identify the active topology per slot without explicitly modeling the nodal process dynamics [5]. Relative to [5], this work leverages the dynamics to reconstruct unavailable nodal data, while at the same time identifying the active mode and tracking the nodal processes. Not necessarily graph related yet similar to that of [5] is the goal of subspace clustering [32], but different from the work here, mode dynamics are not exploited to reconstruct unavailable nodal processes.
The rest of the paper is organized as follows. Section II starts with preliminaries to formulate the problem that is solved in Section III. Section IV deals with modeling generalizations of the IMGM approach. Numerical results and conclusions are presented in Sections V and VI, respectively. Part of this paper is published in our conference precursors [20], [21].
Notation: Scalars are denoted by lowercase, column vectors by bold lowercase, and matrices by bold uppercase fonts. Superscripts $\top$, $-1$ and $\dagger$ denote transpose, inverse and pseudo-inverse, respectively; while $1_N$ stands for the $N \times 1$ all-ones vector; and $\mathcal{N}(x;\mu,K)$ for the probability density function (pdf) of a Gaussian random vector $x$ with mean $\mu$ and covariance matrix $K$. Finally, if $A$ is a matrix and $x$ a vector, then $\|x\|_A^2 := x^\top A^{-1} x$, $\|x\|_2^2 := x^\top x$, $\|A\|_1$ represents the $\ell_1$-norm of the vectorized matrix, and $\|A\|_F^2$ is the Frobenius norm of $A$.
II. PROBLEM FORMULATION
Consider a time-varying graph $\mathcal{G}_t$ with $N$ nodes indexed by the vertex set $\mathcal{V} := \{1,\dots,N\}$. Per slot $t$, the relationship between nodes is captured by an $N \times N$ adjacency matrix $A_t$, with $A_t(n,n')$ representing the weight of the edge connecting nodes $n$ and $n'$. The focus will be on graphs whose topology jumps among a known set of $S$ candidate adjacency matrices; that is, $A_t = A_t^{\sigma_t} \in \{A_t^1,\dots,A_t^S\}$, where the per-slot active topology index $\sigma_t \in \mathcal{S} := \{1,\dots,S\}$ describes the so-called "network mode." The active mode-conditioned Laplacian matrix is then
TABLE I: EXAMPLES OF LAPLACIAN KERNELS
given by $L_t^{\sigma_t} = D_t^{\sigma_t} - A_t^{\sigma_t}$, where $D_t^{\sigma_t} = \mathrm{diag}\{A_t^{\sigma_t} 1_N\}$ denotes the graph degree matrix. Switching topologies emerge in several networked systems. Besides brain networks [36], network topologies from information cascades exhibit switching patterns [5].
A dynamic graph process is defined as the mapping $x: \mathcal{V} \times \mathcal{T} \mapsto \mathbb{R}$, where $\mathcal{T} := \{1,2,\dots\}$ is the set of slot indices. Thus, $x_t(n)$ represents the attribute of node $n$ at slot $t$. For instance, it may represent the value of a stock $n$ at day $t$. The values over all the nodes at slot $t$ are collected in the vector $x_t := [x_t(1),\dots,x_t(N)]^\top$.

In several applications, processes over only a subset of $M < N$ vertices are observed, which yields the observation model

$z_t = H_t x_t + e_t$    (1)

where $H_t \in \{0,1\}^{M \times N}$ is the time-varying observation (or sampling) matrix, whose rows sum up to 1, and $e_t$ is the observation noise that accounts for unmodeled uncertainties, assumed to be white and Gaussian distributed with mean zero and covariance $R_t$.
A. Kernel-Based Inference of TI Graph Processes

Towards learning dynamic graph processes, it is instructive to first outline the kernel-based inference of TI graph processes. Consider a TI adjacency matrix $A$ and observation model $z = Hx + e$, which are given by dropping slot index $t$ in the time-varying scenario. To uniquely reconstruct $x$, one may rely on the regularized least-squares formulation

$\hat{x} = \arg\min_x \|z - Hx\|_2^2 + \mu\,\Omega(x)$    (2)

where $\Omega(\cdot)$ is a chosen monotonic regularizing function along with the scalar $\mu \geq 0$ that controls the importance of the regularization term vis-a-vis the fitting error.
For undirected graphs with symmetric adjacency matrix $A$, the so-called Laplacian regularizer is given by

$\Omega_{\mathrm{LR}}(x) := x^\top L x = \frac{1}{2}\sum_{n=1}^{N}\sum_{n'=1}^{N} A(n,n')\,(x(n) - x(n'))^2$    (3)

where $L$ is the TI Laplacian matrix. The regularizer (3) promotes smoothness of the estimated signal on the graph, as vertices connected by strong links (large $A(n,n')$) will have similar signal estimates to minimize (3). To facilitate other properties such as diffusion or graph bandlimitedness, the Laplacian matrix in (3) is replaced by $r(L)$, where the scalar energy mapping $r: \mathbb{R} \mapsto \mathbb{R}_+$ is applied on the eigenvalues of $L$ to promote desired properties; see e.g., Table I. The pseudo-inverse of $r(L)$
yields the graph Laplacian kernel [26]

$K := r^\dagger(L)$.    (4)

By considering $\Omega(x) := \|x\|_K^2$, we recover the family of kernel ridge regression (KRR) estimators, which enjoys well-documented reconstruction performance [14], [31].
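For concreteness, the kernel construction in (4) can be sketched numerically: build $L = D - A$, apply an energy mapping $r(\cdot)$ to its eigenvalues, and pseudo-invert. The particular mappings below ($r(\lambda) = e^\lambda$ and the identity) are illustrative choices only, not ones mandated by the text.

```python
import numpy as np

def laplacian_kernel(A, r):
    """Graph kernel K = r^†(L): apply the energy mapping r to the eigenvalues
    of the Laplacian L = D - A, then take the pseudo-inverse (cf. (4))."""
    L = np.diag(A.sum(axis=1)) - A          # combinatorial Laplacian
    lam, U = np.linalg.eigh(L)              # L = U diag(lam) U^T (A symmetric)
    r_lam = np.array([r(l) for l in lam])
    inv = np.zeros_like(r_lam)
    nz = np.abs(r_lam) > 1e-10              # invert only nonzero mapped eigenvalues
    inv[nz] = 1.0 / r_lam[nz]
    return U @ np.diag(inv) @ U.T

# Path graph on 3 nodes; diffusion-like mapping r(lambda) = exp(lambda) (illustrative)
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
K = laplacian_kernel(A, np.exp)
```

With the identity mapping $r(\lambda) = \lambda$, this construction reduces to the pseudo-inverse of $L$ itself.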
For directed graphs, one cannot directly apply the KRR framework since $A$ is not symmetric. Nevertheless, attempts have also been made towards KRR by redefining a positive semidefinite Laplacian matrix [3], [4], [8], [9]. In [9], such a valid matrix is constructed as $L := U - (U\bar{A} + \bar{A}^\top U)/2$, where $\bar{A} := D^{-1}A$ and $U := \mathrm{diag}(u)$ with $u$ denoting the left eigenvector of $\bar{A}$. Based on this definition, kernel matrices can be constructed, allowing the KRR framework to accommodate directed graphs as well. The methods in this paper apply to both directed and undirected graphs.
So far, we outlined SSL on graphs using a deterministic kernel-based framework. It is however instructive to present a Bayesian generative model for KRR estimation. First, consider that the prior pdf of $x$ is $p(x) = \mathcal{N}(x; 0, K)$ and the likelihood of $x$ based on observation $z$ is given by $p(z|x) = \mathcal{N}(z; Hx, R)$ with $R = \mu I_M$. Under these Gaussian densities, the maximum a posteriori (MAP) estimator of $x$ given $z$ is equivalent to the KRR estimator, which amounts to the linear minimum mean-square error (LMMSE) estimator

$\hat{x} = \arg\max_x p(x|z) = \arg\max_x p(z|x)\,p(x) = \arg\min_x \|z - Hx\|_R^2 + \|x\|_K^2$.    (5)
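In closed form, the estimator in (5) is $\hat{x} = KH^\top(HKH^\top + R)^{-1}z$, which coincides with the normal-equations form $(H^\top R^{-1}H + K^{-1})^{-1}H^\top R^{-1}z$ by the matrix inversion lemma. A sketch with hypothetical sizes:

```python
import numpy as np

def krr_map_estimate(z, H, K, R):
    """MAP/LMMSE estimate under prior N(0, K) and likelihood N(Hx, R),
    i.e. the KRR solution of (5): x_hat = K H^T (H K H^T + R)^{-1} z."""
    S = H @ K @ H.T + R                     # covariance of z
    return K @ H.T @ np.linalg.solve(S, z)

# Toy setup: N = 4 nodes, M = 2 observed (sampling matrix rows are canonical vectors)
rng = np.random.default_rng(0)
K = np.eye(4) + 0.5 * np.ones((4, 4))       # an illustrative PSD kernel
H = np.array([[1., 0., 0., 0.], [0., 0., 1., 0.]])
R = 0.1 * np.eye(2)
z = rng.standard_normal(2)
x_hat = krr_map_estimate(z, H, K, R)
```

The kernel form avoids inverting the $N \times N$ matrix $K$ when only $M \ll N$ nodes are observed.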
Graph processes with arbitrary dynamics render the inference task in (5) intractable, in general. Fortunately, structured dynamical models, such as the one dealt with in the ensuing section, can lead to tractable estimators.
B. Modeling Dynamic Processes Over Switching Graphs

One possible approach is to pursue an instantaneous per-slot KRR estimator based on $z_t$ in (1). This estimator however, does not account for the $x_{t-1}$ to $x_t$ transition that can benefit the estimator of $x_t$ from observations other than $z_t$, and thus improve estimation performance [14], [25].

Exploitation of graph process dynamics calls for modeling the evolution from $x_{t-1}$ to $x_t$, which arguably depends on the underlying topology [14], [25]. To capture the dynamics of processes over switching graphs, we model the evolution from $x_{t-1}$ to $x_t$ as the first-order Markov process

$x_t = F_t^{\sigma_t} x_{t-1} + \eta_t^{\sigma_t}$    (6)

where the state transition matrix is a known function $f$ of the active adjacency matrix given by

$F_t^{\sigma_t} := f(A_t^{\sigma_t})$.    (7)

The mode-conditioned process noise $\eta_t^{\sigma_t}$ is assumed uncorrelated with the state, white, and Gaussian distributed with zero mean and covariance $K_t^{\sigma_t}$, which is the Laplacian kernel (4).
The model in (6) accounts for the spatio-temporal dependence of graph processes in the following two aspects.
i) The temporal dynamics across two consecutive slots are captured by the state transition matrix of the so-termed "transition graph." With $F_t^{\sigma_t} = A_t^{\sigma_t}$, the transition model (6) amounts to a graph diffusion process [29].
ii) The spatial correlations across nodes within slot $t$ are captured by the Laplacian kernel $K_t^{\sigma_t}$ of the process noise covariance. By setting $F_t^{\sigma_t} x_{t-1} = 0$, the dynamical model (6) reduces to $x_t = \eta_t^{\sigma_t}$, which together with (1), constitutes the generative model for TI graph processes, leading to the MAP estimate given by (5). Incidentally, such a covariance model implies that $x_t$ is "graph stationary" [24]. A related noise model was also adopted in [14] to promote smoothness of the estimates.
The dynamical model in (6) describes what is also known as a switching linear dynamical system (SLDS) [23], and it is widely employed in the tracking community to capture the kinematic state evolution of maneuvering targets [6].

Problem statement: Given $T$ observations $Z_T := [z_1 \dots z_T]$ as in (1), and candidate models $\{\{F_t^s, K_t^s\}_{s=1}^S\}_{t=1}^T$ as in (6), the goal is to jointly track the dynamic graph processes $X_T := [x_1 \dots x_T]$, and the discrete modes $\{\sigma_t\}_{t=1}^T$.
III. SCALABLE GRAPH-AWARE BAYESIAN TRACKER

In this section, we develop a Bayesian approach to track dynamic graph processes over switching graphs. First, given the Markovian state transition model in (6), the prior joint pdf of the nodal processes in $X_T$ can be expressed as

$p(X_T) = p(x_T|x_{T-1};\sigma_T)\,p(X_{T-1}) = \dots = \prod_{t=1}^T p(x_t|x_{t-1};\sigma_t) = \prod_{t=1}^T \left(\sum_{s=1}^S w_t^s\, p(x_t|x_{t-1};\sigma_t = s)\right)$

where we explicitly incorporate $\sigma_t$ in $p(x_t|x_{t-1})$ to stress the active topology present, and $w_t^s$ encodes the existence of the mode $\sigma_t = s$ with $w_t^s \in \{0,1\}$ and $\sum_{s=1}^S w_t^s = 1$.
Furthermore, since $e_t$ in (1) is temporally white, the conditional data pdf also factorizes as

$p(Z_T|X_T) = \prod_{t=1}^T p(z_t|x_t)$.

Hence, Bayes' rule yields the posterior joint state pdf as

$p(X_T|Z_T) \propto p(Z_T|X_T)\,p(X_T) = \prod_{t=1}^T p(z_t|x_t)\left(\sum_{s=1}^S w_t^s\, p(x_t|x_{t-1};\sigma_t = s)\right)$.    (8)
Since $e_t$ and $\eta_t^{\sigma_t}$ are Gaussian, the conditional likelihood $p(z_t|x_t)$ and the transition pdf $p(x_t|x_{t-1};\sigma_t = s)$ are also Gaussian, that is

$p(z_t|x_t) = \mathcal{N}(z_t; H_t x_t, R_t)$
$p(x_t|x_{t-1};\sigma_t = s) = \mathcal{N}(x_t; F_t^s x_{t-1}, K_t^s)$.
Thus, the MAP state estimates in batch form are (cf. (8))

$\arg\min_{\{x_t\}_{t=1}^T,\,\{\{w_t^s\}_{s=1}^S\}_{t=1}^T}\ \frac{1}{2}\sum_{t=1}^T \left[\|z_t - H_t x_t\|_{R_t}^2 + \sum_{s=1}^S w_t^s\,\|x_t - F_t^s x_{t-1}\|_{K_t^s}^2\right]$
$\text{s.to } w_t^s \in \{0,1\},\ \sum_{s=1}^S w_t^s = 1$.    (9)

Unfortunately, (9) is a mixed integer program whose optimal solution is given by enumerating all the $S^T$ combinations of discrete network modes across $T$ slots, and then applying a Kalman smoother for each mode combination, thus incurring computational complexity $O(TS^TN^3)$.
Targeting a computationally efficient solver with $x_t$ and $\sigma_t$ estimates obtained on-the-fly, we will build on the interacting multi-model (IMM) algorithm [7] that has been applied to target tracking [22] and air traffic control [19], but without graph-related information. Taking into account dynamically switching graph topologies, we will naturally term the resultant algorithm interacting multi-graph model (IMGM). Given partially observed nodal samples $z_t$ and a candidate set of switching graphs, IMGM is a graph-adaptive Bayesian tracker that estimates the active network mode $\sigma_t$ together with the $N$ scalar nodal processes in $x_t$.
Our IMGM replaces the hard constraint $w_t^s \in \{0,1\}$ with the soft one $w_t^s \in [0,1]$. To further stress that the weight is based on observations up to $t$, $w_t^s$ is replaced with $w_{t|t}^s$. Thus, one can interpret $w_{t|t}^s$ as the posterior probability mass function (pmf) of mode $s$ being active at slot $t$, namely $w_{t|t}^s = \Pr(\sigma_t = s|Z_t)$. Different from (9) where $\sigma_t$ was viewed as deterministic, we will next model it as a first-order Markov chain parameterized by the $S \times S$ mode transition matrix $\Pi$, whose $(i,j)$th entry

$\pi_{ij} = \Pr(\sigma_t = i|\sigma_{t-1} = j)$    (10)

denotes the transition probability from mode $j$ at slot $t-1$ to mode $i$ at slot $t$. The parameters of $\Pi$ are pre-selected. A practical choice for $\Pi$ is to set its diagonal entries to $\pi_0 \in [0.9, 1)$, and the rest to $(1-\pi_0)/(S-1)$ [6].
IMGM leverages the current observation $z_t$ to propagate the posterior marginal state pdf $p(x_{t-1}|Z_{t-1})$ to $p(x_t|Z_t)$. Towards this end, we start by approximating the mode-conditional posterior of $x_t$ with a Gaussian pdf

$p(x_t|\sigma_t = s, Z_t) \approx \mathcal{N}(x_t; \hat{x}_{t|t}^s, P_{t|t}^s)$    (11)

where $\hat{x}_{t|t}^s$ and $P_{t|t}^s$ are the mean and the covariance matrix associated with mode $s$. Bayes' rule and the total probability theorem (TPT) yield the marginal posterior

$p(x_t|Z_t) = \sum_{s=1}^S \Pr(\sigma_t = s|Z_t)\, p(x_t|\sigma_t = s, Z_t) \approx \sum_{s=1}^S w_{t|t}^s\, \mathcal{N}(x_t; \hat{x}_{t|t}^s, P_{t|t}^s)$    (12)

approximated by a Gaussian mixture (GM) pdf, which is parameterized by the set

$\mathcal{P}_t := \{w_{t|t}^s,\ \hat{x}_{t|t}^s,\ P_{t|t}^s,\ s = 1,\dots,S\}$.    (13)

This GM model facilitates the propagation from $p(x_{t-1}|Z_{t-1})$ to $p(x_t|Z_t)$ through updates of the elements in $\mathcal{P}_{t-1}$ to those in $\mathcal{P}_t$. These updates will be implemented using the prediction and correction of the mode pmf and the mode-conditional state pdf as detailed next.
A. Prediction

At the end of slot $t-1$, the posterior marginal state pdf is characterized by $\mathcal{P}_{t-1}$. Before the arrival of a new observation $z_t$, IMGM leverages the mode and state evolution models (cf. (10) and (6)) to make predictions about the mode pmf and the mode-conditional state pdf, respectively.
1) Predicted Mode Pmf: Based on the Markov transition model (10), the predicted mode pmf is readily obtained via the TPT and Bayes' rule as

$w_{t|t-1}^s := \Pr(\sigma_t = s|Z_{t-1}) = \sum_{s'=1}^S \Pr(\sigma_t = s, \sigma_{t-1} = s'|Z_{t-1}) = \sum_{s'=1}^S \Pr(\sigma_t = s|\sigma_{t-1} = s', Z_{t-1})\,\Pr(\sigma_{t-1} = s'|Z_{t-1}) = \sum_{s'=1}^S \pi_{ss'}\, w_{t-1|t-1}^{s'}$.    (14)
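Since (14) sums $\pi_{ss'} w_{t-1|t-1}^{s'}$ over $s'$, the predicted pmf is a single matrix-vector product:

```python
import numpy as np

def predict_mode_pmf(Pi, w_prev):
    """Predicted mode pmf (14): w_{t|t-1} = Pi @ w_{t-1|t-1}."""
    return Pi @ w_prev

Pi = np.array([[0.9, 0.1],
               [0.1, 0.9]])
w_pred = predict_mode_pmf(Pi, np.array([1.0, 0.0]))   # mode 1 certain at t-1
```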
2) Predicted State Pdf: Since $p(x_{t-1}|\sigma_{t-1} = s', Z_{t-1})$ is Gaussian (cf. (11)), the linear-Gaussian state transition model (6) conditioned on $\sigma_t = s$ allows one to deduce that

$p(x_t|\sigma_t = s, \sigma_{t-1} = s', Z_{t-1}) = \mathcal{N}(x_t; \hat{x}_{t|t-1}^{s,s'}, P_{t|t-1}^{s,s'})$    (15)

where the superscript $(s,s')$ refers to the conditioning on modes $(s,s')$ at slots $t$ and $t-1$, respectively.

The first two moments of the pdf in (15) are given by

$\hat{x}_{t|t-1}^{s,s'} = F_t^s\, \hat{x}_{t-1|t-1}^{s'}$    (16a)
$P_{t|t-1}^{s,s'} = F_t^s\, P_{t-1|t-1}^{s'} (F_t^s)^\top + K_t^s$.    (16b)
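The mode-pair prediction (16) is the standard Kalman time update with the mode-$s$ model applied to the mode-$s'$ posterior moments:

```python
import numpy as np

def predict_pair_moments(F_s, K_s, x_prev, P_prev):
    """Moments (16) of p(x_t | sigma_t = s, sigma_{t-1} = s', Z_{t-1})."""
    x_pred = F_s @ x_prev                      # (16a)
    P_pred = F_s @ P_prev @ F_s.T + K_s        # (16b)
    return x_pred, P_pred

x_p, P_p = predict_pair_moments(2.0 * np.eye(2), np.eye(2), np.ones(2), np.eye(2))
```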
Next, using the TPT and Bayes' rule, we express the predicted mode-conditional state pdf at $t$ as

$p(x_t|\sigma_t = s, Z_{t-1}) = \sum_{s'=1}^S \Pr(\sigma_{t-1} = s'|\sigma_t = s, Z_{t-1})\, p(x_t|\sigma_t = s, \sigma_{t-1} = s', Z_{t-1})$    (17)

where $\Pr(\sigma_{t-1} = s'|\sigma_t = s, Z_{t-1}) := w_{t-1|t}^{s'|s}$ can be interpreted as the backward mode transition probability, which upon
Algorithm 1: One Recursion of the IMGM Algorithm.
1: Input: $\mathcal{P}_{t-1}$, $z_t$, $\{F_t^s, K_t^s\}_{s=1}^S$, $R_t$, $H_t$, $\Pi$
2: for $s = 1$ to $S$ do
3:   S1 Prediction
4:     S1.1 of mode pmf via (14)
5:     S1.2 of mode-conditional state pdf via (16), (18), and (20)
6:   S2 Correction
7:     S2.1 of mode-conditional state pdf via (23)
8:     S2.2 of mode pmf via (24)
9:   S3 Fusion of mode-conditional state pdfs via (27)
10: end for
11: Output: $\mathcal{P}_t$, $\hat{x}_{t|t}$, $P_{t|t}$
appealing to Bayes' rule and the TPT, boils down to

$w_{t-1|t}^{s'|s} = \frac{\Pr(\sigma_{t-1} = s'|Z_{t-1})\,\Pr(\sigma_t = s|\sigma_{t-1} = s', Z_{t-1})}{\sum_{s'=1}^S \Pr(\sigma_{t-1} = s'|Z_{t-1})\,\Pr(\sigma_t = s|\sigma_{t-1} = s', Z_{t-1})} = \frac{w_{t-1|t-1}^{s'}\,\pi_{ss'}}{\sum_{s'=1}^S w_{t-1|t-1}^{s'}\,\pi_{ss'}}$.    (18)
So far, the predicted mode-conditional state pdf (17) is a GM pdf. A GM prior however, does not evolve to a Gaussian posterior pdf under a Gaussian likelihood. To maintain Gaussianity of the posterior mode-conditional state pdf as in (11), we will approximate (17) by the following Gaussian pdf

$p(x_t|\sigma_t = s, Z_{t-1}) \approx \mathcal{N}(x_t; \hat{x}_{t|t-1}^s, P_{t|t-1}^s)$    (19)

where $\hat{x}_{t|t-1}^s$ and $P_{t|t-1}^s$ are chosen to match the first two moments of the GM pdf (17) as

$\hat{x}_{t|t-1}^s = \sum_{s'=1}^S w_{t-1|t}^{s'|s}\, \hat{x}_{t|t-1}^{s,s'}$    (20a)
$P_{t|t-1}^s = \sum_{s'=1}^S w_{t-1|t}^{s'|s}\left(P_{t|t-1}^{s,s'} + (\hat{x}_{t|t-1}^{s,s'} - \hat{x}_{t|t-1}^s)(\hat{x}_{t|t-1}^{s,s'} - \hat{x}_{t|t-1}^s)^\top\right)$.    (20b)

Approximating non-Gaussian pdfs with Gaussian ones is a well-documented approach to effect scalability in approximate (Bayesian) inference, including variational inference and expectation propagation; see [23] and the references therein. With the moments of the approximating Gaussian pdf matched to those of the non-Gaussian one (cf. (20)), the KL divergence between the two pdfs is minimized.
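The moment matching in (20) (and later in (27)) collapses a Gaussian mixture to a single Gaussian; a generic sketch:

```python
import numpy as np

def gm_moment_match(weights, means, covs):
    """Collapse a Gaussian mixture to one Gaussian by matching the first two
    moments, as in (20) and (27); this minimizes KL(GM || Gaussian)."""
    mu = sum(w * m for w, m in zip(weights, means))
    P = sum(w * (C + np.outer(m - mu, m - mu))
            for w, m, C in zip(weights, means, covs))
    return mu, P

mu, P = gm_moment_match([0.5, 0.5],
                        [np.array([1.0]), np.array([3.0])],
                        [np.eye(1), np.eye(1)])
```

Note that the matched covariance inflates beyond the component covariances by the spread of the component means.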
Up to now, we have obtained the predicted mode pmf and the mode-conditional state pdf, which will be propagated to their posterior counterparts after a new $z_t$ is observed.
B. Correction

1) Posterior Mode-Conditional State Pdf: Given the new observation $z_t$, the approximate predicted mode-conditional state pdf (19) is propagated to its posterior via Bayes' rule as

$p(x_t|\sigma_t = s, Z_t) = p(x_t|\sigma_t = s, z_t, Z_{t-1}) = \frac{p(x_t|\sigma_t = s, Z_{t-1})\, p(z_t|x_t, \sigma_t = s, Z_{t-1})}{p(z_t|\sigma_t = s, Z_{t-1})}$    (21)

where $p(z_t|x_t, \sigma_t = s, Z_{t-1}) = \mathcal{N}(z_t; H_t x_t, R_t)$, since, conditioned on $x_t$, $z_t$ is independent of $Z_{t-1}$ and $\sigma_t$. Hence, with the likelihood and the prior (cf. (19)) being Gaussian, it holds that

$p(x_t|\sigma_t = s, Z_t) = \mathcal{N}(x_t; \hat{x}_{t|t}^s, P_{t|t}^s)$    (22)

where the first two moments $\hat{x}_{t|t}^s$ and $P_{t|t}^s$ are obtained via the Kalman update as (see e.g., [6])

$\hat{z}_{t|t-1}^s = H_t\, \hat{x}_{t|t-1}^s$    (23a)
$\Phi_{t|t-1}^s = H_t\, P_{t|t-1}^s\, H_t^\top + R_t$    (23b)
$G_t^s = P_{t|t-1}^s\, H_t^\top\, (\Phi_{t|t-1}^s)^{-1}$    (23c)
$\hat{x}_{t|t}^s = \hat{x}_{t|t-1}^s + G_t^s\, (z_t - \hat{z}_{t|t-1}^s)$    (23d)
$P_{t|t}^s = P_{t|t-1}^s - G_t^s\, \Phi_{t|t-1}^s\, (G_t^s)^\top$.    (23e)
2) Posterior Mode Pmf: Upon applying Bayes' rule, the posterior mode pmf is

$w_{t|t}^s = \Pr(\sigma_t = s|z_t, Z_{t-1}) = \frac{p(z_t|\sigma_t = s, Z_{t-1})\,\Pr(\sigma_t = s|Z_{t-1})}{\sum_{s=1}^S p(z_t|\sigma_t = s, Z_{t-1})\,\Pr(\sigma_t = s|Z_{t-1})}$    (24)

where the second factor $\Pr(\sigma_t = s|Z_{t-1})$ is computable via (14), and the first factor is the normalizing pdf in (21)

$p(z_t|\sigma_t = s, Z_{t-1}) = \int p(z_t, x_t|\sigma_t = s, Z_{t-1})\,dx_t = \int p(z_t|x_t)\, p(x_t|\sigma_t = s, Z_{t-1})\,dx_t$    (25)

which can be shown to be the Gaussian $\mathcal{N}(z_t; \hat{z}_{t|t-1}^s, \Phi_{t|t-1}^s)$ with $\hat{z}_{t|t-1}^s$ and $\Phi_{t|t-1}^s$ given by (23a) and (23b), respectively.
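The weight update (24)-(25) scores each mode by the Gaussian evidence $\mathcal{N}(z_t; \hat{z}_{t|t-1}^s, \Phi_{t|t-1}^s)$ and renormalizes:

```python
import numpy as np

def correct_mode_pmf(w_pred, z, z_preds, Phis):
    """Posterior mode pmf (24): multiply the predicted pmf by the Gaussian
    likelihoods N(z; z_pred^s, Phi^s) of (25), then renormalize."""
    liks = np.empty(len(z_preds))
    for s, (z_p, Phi) in enumerate(zip(z_preds, Phis)):
        d = z - z_p
        liks[s] = np.exp(-0.5 * d @ np.linalg.solve(Phi, d)) \
                  / np.sqrt(np.linalg.det(2.0 * np.pi * Phi))
    w = w_pred * liks
    return w / w.sum()

# Mode 1 predicts the observation exactly; mode 2 is three sigma off
w = correct_mode_pmf(np.array([0.5, 0.5]), np.array([0.0]),
                     [np.array([0.0]), np.array([3.0])], [np.eye(1), np.eye(1)])
```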
C. Fusion

Finally, the marginal posterior state pdf is given by fusing the individual mode-conditional posteriors to obtain the GM

$p(x_t|Z_t) = \sum_{s=1}^S w_{t|t}^s\, \mathcal{N}(x_t; \hat{x}_{t|t}^s, P_{t|t}^s)$    (26)

whose first two moments are

$\hat{x}_{t|t} = \sum_{s=1}^S w_{t|t}^s\, \hat{x}_{t|t}^s$    (27a)
$P_{t|t} = \sum_{s=1}^S w_{t|t}^s\left(P_{t|t}^s + (\hat{x}_{t|t}^s - \hat{x}_{t|t})(\hat{x}_{t|t}^s - \hat{x}_{t|t})^\top\right)$.    (27b)
Fig. 1. Flowchart of IMGM with $S = 2$ modes for one recursion, where yellow is used for mode 1, and blue for mode 2. Each mode predicts the first two moments of the state pdf at slot $t$ assuming the active mode is 1, or 2, respectively. Then the predictive state pdf conditioned on mode $s \in \{1,2\}$ at slot $t$ is obtained by fusing the contributions from modes 1 and 2 at slot $t-1$ (denoted by the green lines in the figure). After receiving new observation $z_t$, each mode updates the first two moments of the state pdf. Aided by $l_t^s = \mathcal{N}(z_t; \hat{z}_{t|t-1}^s, \Phi_{t|t-1}^s)$, each mode obtains the posterior weight, based on which the fused state moments in the green box are acquired.
Thus, the posterior mean (27a) is the minimum mean-square error (MMSE) estimator of $x_t$, whose uncertainty is characterized by the covariance matrix (27b). On the other hand, upon approximating the GM in (26) with a single Gaussian pdf having matched moments, (27a) can also be interpreted as the MAP estimator of $x_t$.

The implementation steps of the IMGM algorithm for one recursion are summarized in Alg. 1, and the flowchart of IMGM for $S = 2$ modes is presented in Fig. 1. Note that at initialization, the mode probabilities and mode-conditional state pdfs are set to be identical across modes; that is,

$w_{0|0}^s = \frac{1}{S},\quad \hat{x}_{0|0}^s = \hat{x}_0,\quad P_{0|0}^s = P_0,\quad s = 1,\dots,S$    (28)

where $\hat{x}_0$ and $P_0$ encode our prior information about the initial state distribution.
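Putting the pieces together, one IMGM recursion (Alg. 1) for linear mode-conditioned models can be sketched end-to-end; the toy dimensions and models below are illustrative only.

```python
import numpy as np

def imgm_recursion(z, models, Pi, w_prev, x_prev, P_prev, H, R):
    """One IMGM recursion (Alg. 1). models: list of (F_s, K_s) pairs;
    w_prev: posterior mode pmf at t-1; x_prev, P_prev: per-mode moments at t-1."""
    S = len(models)
    w_pred = Pi @ w_prev                                           # S1.1, eq. (14)
    W_back = (Pi * w_prev) / np.maximum(w_pred, 1e-300)[:, None]   # eq. (18)
    x_post, P_post, liks = [], [], np.empty(S)
    for s, (F, K) in enumerate(models):
        # S1.2: pair-wise prediction (16), mixed via moment matching (20)
        xp = [F @ x_prev[sp] for sp in range(S)]
        Pp = [F @ P_prev[sp] @ F.T + K for sp in range(S)]
        xm = sum(W_back[s, sp] * xp[sp] for sp in range(S))
        Pm = sum(W_back[s, sp] * (Pp[sp] + np.outer(xp[sp] - xm, xp[sp] - xm))
                 for sp in range(S))
        # S2.1: Kalman correction (23)
        Phi = H @ Pm @ H.T + R
        G = Pm @ H.T @ np.linalg.inv(Phi)
        d = z - H @ xm
        x_post.append(xm + G @ d)
        P_post.append(Pm - G @ Phi @ G.T)
        # Gaussian evidence (25), feeding the pmf correction (24)
        liks[s] = np.exp(-0.5 * d @ np.linalg.solve(Phi, d)) \
                  / np.sqrt(np.linalg.det(2.0 * np.pi * Phi))
    w = w_pred * liks                                   # S2.2, eq. (24)
    w /= w.sum()
    x_fused = sum(w[s] * x_post[s] for s in range(S))   # S3, eq. (27a)
    return w, x_post, P_post, x_fused

# Toy run: S = 2 identical modes, N = 2 nodes, M = 1 observed node
F = 0.5 * np.eye(2)
models = [(F, np.eye(2)), (F, np.eye(2))]
Pi = np.array([[0.9, 0.1], [0.1, 0.9]])
w, x_post, P_post, x_fused = imgm_recursion(
    np.array([1.0]), models, Pi, np.array([0.5, 0.5]),
    [np.zeros(2), np.zeros(2)], [np.eye(2), np.eye(2)],
    np.array([[1.0, 0.0]]), np.array([[0.1]]))
```

With identical candidate models, the posterior mode pmf stays uniform, as expected, since the evidences coincide.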
IMGM incurs low computational complexity of order $O(STN^3)$ over $T$ slots, which is clearly more affordable than the exponential complexity of the optimal solution of (9). To further maintain scalability for large $N$, the graph can be divided into $N_g$ subgraphs, each with at most $\lceil N/N_g \rceil$ nodes. Upon leveraging distributed solvers along the lines of [28], the computational complexity per subgraph is $O(S\lceil N/N_g \rceil^3)$, yielding an overall complexity of order $O(SN_g\lceil N/N_g \rceil^3)$. Hence, scalability for large graphs can be effected by adjusting $N_g$. However, how to optimally choose $N_g$ and divide the graph based on the topology is an interesting future direction.
A few remarks are now in order.

Remark 1: IMGM is a memoryless online algorithm that requires no storage of past observations. All information about the past is summarized by the parameter set $\mathcal{P}_{t-1}$ that defines the GM pdf for the marginal state distribution.
Remark 2: Different from our IMGM, the classical IMM [6] first approximates a GM by a single Gaussian for each mode that corresponds to an updated mode-conditional state posterior, which is the input to one of $S$ parallel mode-dependent Kalman filters with prediction and correction. Adhering to both Gaussian predicted (19) and posterior (22) mode-conditional state pdfs, IMGM predicts a GM per mode (17) that is then approximated by a single mode-conditional Gaussian (19), before running $S$ parallel Kalman correction steps. The order of approximation and prediction makes no difference for linear state transition models.
IV. MULTI-KERNEL, TRANSITION, AND NOISE ADAPTIVITY

This section shows that IMGM can also be utilized to track dynamic processes and adapt the graph kernel(s), transition function, and noise variance per slot, even for a fixed graph.

To start with, the linear state transition model in (6) may not be able to fully capture the dynamics of the graph processes,
necessitating a general nonlinear transition model, which is given by a nonlinear function $f(A^\sigma, x_{t-1})$. Moreover, the state in several applications may adhere to a different dynamic model per slot. For example, the stocks in an economic network abide by different evolution patterns, e.g., in a period of economic recession. Hence, the transition function $f$ in (7) may jump among a candidate set of $L$ transition functions $\{f^1,\dots,f^L\}$ for a given topology at slot $t$.

Besides $f$, the noise model in (6) can be sensitive to the selection of the appropriate kernel (4). To deal with this, a dictionary of candidate 'basis kernels' (4) can be constructed with energy mappings in the set $\{r_1(\cdot),\dots,r_K(\cdot)\}$. Hence, the extended dynamical model of graph processes that further accounts for multiple kernels and switching nonlinear dynamic functions is given by

$x_t = f^{l_t}(A_t^{\sigma_t}, x_{t-1}) + \eta_t^{k_t,\sigma_t}$    (29)

where $l_t \in \mathcal{L} := \{1,\dots,L\}$ is the active dynamic function index, and $\sigma_t \in \mathcal{S}$ denotes the active topology index; while the zero-mean Gaussian process noise $\eta_t^{k_t,\sigma_t}$ has covariance matrix $K_t^{k_t,\sigma_t}$ with active kernel function index $k_t \in \mathcal{K} := \{1,\dots,K\}$.
On the other hand, the observation noise covariance in (1) is selected as $R_t = \mu I_M$. The scale $\mu$ is typically tuned via cross-validation offline among a candidate set of grid points $\{\mu^1,\dots,\mu^R\}$. However, $\mu$ remains fixed for all $t$, and does not adapt to the data across slots $t$. To avoid cross-validation and effect a data-driven choice of $\mu$, we can recast the observation model in (1) as

$z_t = H_t x_t + e_t^{r_t}$    (30)

where the covariance matrix of $e_t^{r_t}$ is $R_t^{r_t} = \mu^{r_t} I_M$ with $r_t \in \mathcal{R} := \{1,\dots,R\}$.
To incorporate switching topologies, dynamic functions, kernels, and observation noise covariances, we construct $\bar{S} = S \times L \times K \times R$ candidate dynamical models with the active model $\{f^{l_t}(A^{\sigma_t}, \cdot),\ K_t^{k_t,\sigma_t},\ R_t^{r_t}\}$ indicated by the extended network mode $\bar{\sigma}_t := (l_t, \sigma_t, k_t, r_t) \in \bar{\mathcal{S}}$, where the extended network mode set is $\bar{\mathcal{S}} := \mathcal{L} \times \mathcal{S} \times \mathcal{K} \times \mathcal{R}$. Before invoking the IMGM algorithm, one has to rescale the cost function in (9) by replacing $R_t^{r_t}$ with $I_M$, and subsequently absorbing $R_t^{r_t}$ into the process noise covariance as $\tilde{K}_t^{k_t,\sigma_t,r_t} = K_t^{k_t,\sigma_t}/\mu^{r_t}$, such that the fitting error in (9) will have no scaling factor. The expanded candidate list of dynamical models at slot $t$ is then constructed as $\{f^l(A^\sigma, \cdot),\ \tilde{K}_t^{k,\sigma,r},\ \bar{\sigma} \in \bar{\mathcal{S}}\}$. Subsequently, by changing $R_t$ to $I_M$ in (23b), the IMGM algorithm is readily applied with only one revision in (15) for nonlinear dynamical models. As alluded to in the previous discussion, IMGM strives to maintain a Gaussian mode-conditional state pdf. Thus, to approximate the nonlinear transformation of a Gaussian state pdf by another Gaussian, we can leverage the unscented transformation as in the unscented KF [34], or just linearize the nonlinear transition functions, as in the extended KF [6].
Three more remarks are in order.

Remark 3: For dynamical models with unknown parameters, candidate dynamical models can be constructed, and IMGM can learn the model parameters that best fit the data on-the-fly, thus circumventing extra offline model training. Such a joint system identification and state estimation problem has been considered in the KF literature along three prevailing lines. The first leverages the expectation-maximization algorithm to iterate between state estimation and system identification, but the online characteristic is compromised; see e.g., [33]. The second approach chooses the model parameters from a known dictionary, and applies the classical IMM approach to select the appropriate parameters online [18]. Recently, for models with unknown process and observation noise covariance matrices, a variational Bayesian approach has been employed to obtain pdf estimates [13].
Remark 4: With only one graph and F_t^{l_t,σ_t} = 0, IMGM offers an online probabilistic multi-kernel based alternative to reconstruct time-invariant (TI) graph processes, which complements rather nicely the deterministic multi-kernel KRR framework in [26]. For a fixed set of nodes, the complexity of IMGM for time-invariant graphs is the same as that for the time-varying case.
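IMGM's mode tracking builds on the interacting multiple model (IMM) recursion [7], [22]. As a generic sketch of the flavor of update involved (not the paper's exact equations; all names here are hypothetical), one step of a mode-probability update from per-mode Gaussian innovation likelihoods can be written as:

```python
import numpy as np

def mode_update(w_prev, P_trans, innovations, S_covs):
    """One IMM-style mode-probability update.
    w_prev: prior mode pmf (S,); P_trans: mode transition matrix (S, S);
    innovations: per-mode innovation vectors; S_covs: their covariances.
    Returns the posterior mode pmf w_{t|t}."""
    w_pred = P_trans.T @ w_prev                      # predicted mode pmf
    likes = np.empty_like(w_pred)
    for s, (nu, S) in enumerate(zip(innovations, S_covs)):
        M = nu.size                                  # Gaussian innovation likelihood
        likes[s] = np.exp(-0.5 * nu @ np.linalg.solve(S, nu)) / \
                   np.sqrt((2 * np.pi) ** M * np.linalg.det(S))
    w_post = w_pred * likes
    return w_post / w_post.sum()                     # normalize to a pmf

# Two modes: mode 0's innovation is near zero, so its posterior weight grows.
w = mode_update(np.array([0.5, 0.5]),
                np.array([[0.9, 0.1], [0.1, 0.9]]),
                [np.zeros(2), 3.0 * np.ones(2)],
                [np.eye(2), np.eye(2)])
```

The mode with the smaller (whitened) innovation accrues posterior probability, which is the mechanism by which an IMM-type filter tracks an unknown switching mode.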
V. NUMERICAL TESTS
In this section, we evaluate the performance of the proposed IMGM approach using synthetic and real data, and compare it with existing algorithms, including the kernel Kalman filter (KKF) [25]; the adaptive least mean-squares (LMS) algorithm [10] with bandwidth B_LMS ∈ {2, 4, 6, . . . , 20} and step size μ_LMS ∈ {0.5, 0.6, 0.7, . . . , 2}; as well as the distributed least-squares reconstruction (DLSR) [35] with bandwidth B_DLSR ∈ {2, 4, 6, . . . , 20} and step sizes μ_DLSR ∈ {0.2, 0.4, 0.6, . . . , 2} and β_DLSR ∈ {0.1, 0.2, . . . , 0.9}. Both LMS and DLSR can track slowly time-varying B-bandlimited graph processes. Unless stated otherwise, the reported performances of LMS and DLSR are the best-performing ones in terms of NMSE, with hyperparameters selected from the candidate sets. Also, we consider the oracle version of IMGM (abbreviated as "IMGM-O"), which relies on the dynamical model (6), but with known σ_t. To compare on equal footing with LMS and DLSR, which cannot deal with time-varying observation matrices, we set H_t = H for all t ∈ {1, . . . , T}. For experiments with switching graphs, the competing algorithms know the active graph topology per slot t, whereas our mode-agnostic IMGM estimates σ_t on-the-fly. The performance metric is the normalized mean-square error (NMSE) over the unobserved nodes, given by

NMSE(t) := ‖H_t^c (x̂_{t|t} − x_t)‖_2^2 / ‖H_t^c x_t‖_2^2 (31)

where H_t^c is the sampling matrix for the unobserved nodes. Due to the random sampling scheme, the performance is averaged over 100 random sampling realizations.
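Since applying the sampling matrix H_t^c amounts to indexing the unobserved entries, (31) reduces to a few NumPy operations; a minimal sketch:

```python
import numpy as np

def nmse(x_hat, x_true, unobserved_idx):
    """NMSE over the unobserved nodes, as in (31): applying H_t^c amounts
    to indexing the unobserved entries of the state vectors."""
    err = x_hat[unobserved_idx] - x_true[unobserved_idx]
    return np.sum(err ** 2) / np.sum(x_true[unobserved_idx] ** 2)

# Toy check: a perfect estimate gives NMSE = 0, and a doubled estimate
# gives NMSE = 1 (the error equals the true signal on the unobserved set).
x = np.array([1.0, 2.0, 3.0, 4.0])
```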
A. Synthetic Data
We consider a synthetic dynamic process over a network with N = 60 nodes and S = 2 modes. The graph topologies associated with the two modes at slot 1 are generated by two symmetric Erdős-Rényi random graphs with edge existence probabilities 0.1 and 0.2, respectively. In the following slots, we randomly choose two pairs of nodes, and the edge between each pair is flipped relative to the previous slot per mode. The network switches from mode 1 to mode 2 at slot 6, and back to mode 1 at slot 11, over a total of T = 15 slots. The dynamic graph process x_t is generated according to (6) with F_t^{σ_t} = 0.2(A_t^{σ_t} + I_N) and η_t^{σ_t} ∼ N(η_t^{σ_t}; 0, K_t^{σ_t}), where K_t^{σ_t} is a diffusion kernel with a = 0.1. The observations are generated based on (1) with M = 30 and R = 4 I_M. Only IMGM-O was compared with IMGM because the rest of the approaches have no information about the generative model.

Fig. 2. Posterior mode pmfs of IMGM for synthetic data.

Fig. 3. NMSE for synthetic data.

The average mode posterior probabilities produced by IMGM over 100 Monte-Carlo runs are shown in Fig. 2, which demonstrates that IMGM is capable of keeping track of the active network mode in the presence of unknown switches. Further, Fig. 3 plots the NMSE over time, and illustrates that IMGM achieves the same NMSE as IMGM-O, which relies on extra information. Fig. 4 depicts the estimated processes along with the corresponding true values over an unobserved node. The perfect tracking of the true signal further validates IMGM's nearly optimal reconstruction performance.
Fig. 4. True and estimated processes over an unobserved node for synthetic data.
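The synthetic generator above can be sketched as follows (a simplified stand-in with illustrative names: the mode-dependent diffusion-kernel process noise is replaced by white noise for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 60, 15

def er_adjacency(p):
    """Symmetric Erdos-Renyi adjacency with edge probability p."""
    A = np.triu(rng.random((N, N)) < p, k=1).astype(float)
    return A + A.T

def flip_random_edges(A, n_pairs=2):
    """Flip the edge indicator of n_pairs random node pairs (in place)."""
    for _ in range(n_pairs):
        i, j = rng.choice(N, size=2, replace=False)
        A[i, j] = A[j, i] = 1.0 - A[i, j]

A = [er_adjacency(0.1), er_adjacency(0.2)]       # slot-1 topologies per mode
x = rng.standard_normal(N)                       # initial graph process
sigma = [1] * 5 + [2] * 5 + [1] * 5              # mode switches at slots 6 and 11
traj = []
for t in range(T):
    s = sigma[t] - 1
    for As in A:                                 # per-mode random edge flips
        flip_random_edges(As)
    F = 0.2 * (A[s] + np.eye(N))                 # active transition matrix
    x = F @ x + 0.1 * rng.standard_normal(N)     # white-noise stand-in for eta_t
    traj.append(x)
```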
B. Brain ECoG Dataset
Next, we experiment with the brain ECoG data obtained from an epilepsy study [17]. The ECoG time series were obtained from N = 76 electrodes implanted in a patient's brain before and after a seizure, where the onset of the seizure was identified by a neurophysiologist. Therefore, there are S = 2 modes, the pre-ictal and ictal modes that correspond to before and after the seizure. We extract 250 samples from the dataset for each of the two modes, which are preprocessed by subtracting the sample mean and normalizing by the sample standard deviation. The preprocessed samples are then concatenated so that σ_t = 1 for t = 1, . . . , 250, and σ_t = 2 for t = 251, . . . , 500. We construct a time-invariant symmetric correlation graph for each of the two modes, which is a special case of the problem statement at the end of Section II. The ECoG signals are modeled to evolve based on (6), where the state transition matrix is F_t^{σ_t} = 0.15(A^{σ_t} + I_N), and the process noise covariance K_t^{σ_t} is a diffusion kernel with parameter a = 2. Here, the value 0.15 and a = 2 are selected to yield the lowest NMSE from the sets {0.1, 0.11, 0.12, . . . , 0.3} and {1, 1.2, 1.4, . . . , 3}, respectively. The observations are generated as in (1) with M = 53 and R = 10^{−2} I_M.
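A diffusion kernel can be computed from the graph Laplacian through a matrix exponential [16]; the sketch below assumes the common convention K = exp(−aL) (conventions differ by constant factors on a), and uses an eigendecomposition since L is symmetric:

```python
import numpy as np

def diffusion_kernel(A, a):
    """Diffusion kernel K = exp(-a * L), with L = D - A the combinatorial
    graph Laplacian; computed via eigendecomposition since L is symmetric."""
    L = np.diag(A.sum(axis=1)) - A
    evals, evecs = np.linalg.eigh(L)
    return (evecs * np.exp(-a * evals)) @ evecs.T

# Toy check on a 3-node path graph: K is symmetric positive definite,
# hence a valid process noise covariance.
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
K = diffusion_kernel(A, a=2.0)
```

Larger a diffuses correlation further along the graph, so a acts as a smoothness knob on the process noise.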
Fig. 5 shows the posterior mode probabilities {w_{t|t}^s}_{s=1}^2 produced by IMGM over 100 random sampling schemes. Here, IMGM plays the role of a "neurophysiologist" who detects the onset of an epileptic seizure. In addition, the NMSE of IMGM is comparable to that of the mode-clairvoyant IMGM-O, while markedly outperforming KKF, LMS and DLSR, as confirmed by Fig. 6. The NMSEs for all methods exhibit a peak at the onset of the ictal period, while for LMS and KKF the NMSEs are considerably larger during the ictal period. As shown in Fig. 7, the estimated brain signals from IMGM and IMGM-O over an unobserved node agree quite well with the corresponding true values, which is however not the case for the rest of the approaches. Further, Fig. 8 demonstrates that IMGM enjoys lower NMSE as the number M of sampled nodes grows.
Fig. 5. Posterior mode pmfs of IMGM for ECoG data.
Fig. 6. NMSE for ECoG data (μ_LMS = 0.6, B_LMS = 2, μ_DLSR = 1.2, B_DLSR = 6, β_DLSR = 0.5).
Fig. 7. True and estimated brain signals over an unobserved node for ECoG data (μ_LMS = 0.6, B_LMS = 2, μ_DLSR = 1.2, B_DLSR = 6, β_DLSR = 0.5).

Fig. 8. Overall NMSE of IMGM versus M for ECoG data.

C. Temperature Prediction

The next dataset comprises hourly temperature measurements at N = 109 measuring stations across the continental United States in 2010, collected by the National Climatic Data Center [1]. A time-invariant graph was constructed based on geographical distances as in [25]. Even though only one graph is available, IMGM can still be applied to track the dynamic processes and simultaneously learn the model parameters that best fit the data, as in Section IV, which would otherwise require an offline training process. The value x_t(n) represents the t-th temperature sample recorded at the n-th station. The sampling interval in our experiment is chosen to be one day. The number of observed nodes is M = 44, and the observation noise covariance is selected from the candidate set as R^{r_t} = μ_{r_t} I_M, where μ_{r_t} ∈ 10^{−4} × {1, 2, . . . , 5}. The transition matrix is taken as F_t^{l_t} = c_{l_t}(A + I_N), where c_{l_t} takes values from 0.05 to 0.15 on a uniform grid with spacing 0.02. The process noise covariance is given by a diffusion kernel with a = 2. Thus, IMGM is equipped with 30 candidate dynamical models, among which the best performing one is assigned to IMGM-O, with F_t = 0.05(A + I_N) and R = 10^{−4} I_M.
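The 30-model grid here is simply the Cartesian product of the six transition scalings c_l and the five noise scales μ_r; a sketch with illustrative names:

```python
from itertools import product

import numpy as np

c_grid = np.arange(0.05, 0.1501, 0.02)          # 6 transition scalings c_l
mu_grid = 1e-4 * np.arange(1, 6)                # 5 observation-noise scales mu_r

def candidate_models(A):
    """Enumerate (F, mu) pairs: F = c (A + I_N), observation noise mu * I_M."""
    N = A.shape[0]
    return [(c * (A + np.eye(N)), mu) for c, mu in product(c_grid, mu_grid)]

# Placeholder matrix just to check the grid size: 6 x 5 = 30 candidates.
models = candidate_models(np.eye(3))
```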
As shown in Fig. 9, the mode-agnostic IMGM demonstrates superior tracking performance compared to KKF, LMS and DLSR, while also showcasing performance comparable to IMGM-O. Hence, IMGM is capable of selecting the dynamical model that best fits the data on-the-fly. Fig. 10 further corroborates this assertion by displaying the true and estimated temperature values from the candidate approaches over an unobserved node. The probability of existence of each model is reported by the posterior mode pmf w_{t|t}^s as w_{t|t}^1 ≈ 1, and w_{t|t}^s ≈ 0 otherwise.
Fig. 9. NMSE for temperature data (μ_LMS = 1.5, B_LMS = 10, μ_DLSR = 1.2, B_DLSR = 4, β_DLSR = 0.5).

Fig. 10. True and estimated temperature values over an unobserved location (μ_LMS = 1.5, B_LMS = 10, μ_DLSR = 1.2, B_DLSR = 4, β_DLSR = 0.5).
D. Network Delay Prediction
The last dataset records measurements of path delays on the Internet2 backbone [2]. The network comprises 9 end-nodes and 26 directed links. There are N = 70 paths, each connecting two origin-destination nodes through a subset of the 26 links. The active links for each path are described by the path-link routing matrix B ∈ {0, 1}^{70×26}, whose (n, l)-th entry B_{n,l} is 1 if path n traverses link l, and 0 otherwise. With each vertex representing one of these paths, an undirected graph is constructed with the (n, n′)-th entry (n ≠ n′) of the adjacency matrix given by

A(n, n′) = Σ_{l=1}^{26} B_{n,l} B_{n′,l} / (Σ_{l=1}^{26} B_{n,l} + Σ_{l=1}^{26} B_{n′,l} − Σ_{l=1}^{26} B_{n,l} B_{n′,l})

which places large weights on vertices (paths) with a large number of common links. The graph process x_t(n) represents the delay of path n in minutes.

Fig. 11. NMSE for network delay data (μ_LMS = 1.5, B_LMS = 12).

Fig. 12. True and estimated network delays over an unobserved path (μ_LMS = 1.5, B_LMS = 12).
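The adjacency rule above is the Jaccard similarity between the link sets of two paths; a vectorized sketch over the routing matrix B:

```python
import numpy as np

def path_jaccard_adjacency(B):
    """A(n, n') = |links(n) ∩ links(n')| / |links(n) ∪ links(n')|, computed
    from the binary path-link routing matrix B; the diagonal is zeroed."""
    B = B.astype(float)
    inter = B @ B.T                               # common links per path pair
    deg = B.sum(axis=1)                           # number of links per path
    union = deg[:, None] + deg[None, :] - inter
    with np.errstate(divide="ignore", invalid="ignore"):
        A = np.where(union > 0, inter / union, 0.0)
    np.fill_diagonal(A, 0.0)
    return A

# Two paths over 3 links sharing one link: Jaccard weight 1 / (2 + 2 - 1) = 1/3.
B = np.array([[1, 1, 0], [0, 1, 1]])
A = path_jaccard_adjacency(B)
```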
The number of observed nodes is selected to be M = 20. The candidate dynamical models for IMGM are configured as follows. The state transition matrix is selected as F_t = 0.17(A + I_N). The process noise covariance K^{k_t} is chosen from a set of K = 8 diffusion kernels with parameter a_{k_t} taking values from 0.6 to 2 with uniform spacing 0.2. The observation noise covariance for the observation model (30) is R^{r_t} = μ_{r_t} I_M, with μ_{r_t} ∈ {10^{−4}, 10^{−3}, 10^{−2}, 10^{−1}}. The number of candidate models for IMGM is then S̄ = 8 × 4 = 32, among which IMGM-O is equipped with the best performing one: a = 0.6 and
R = 10−2IM . For this experiment, we did not employ DLSRbecause
it did not yield comparable performance to the rest ofthe
alternatives.
Adaptively choosing a model with kernel parameter akt andnoise
parameter μrt from the candidate set, IMGM exhibitssuperior
tracking performance compared to the single-modelalternatives, as
confirmed by Fig. 11. This also corroborates thatIMGM provides a
probabilistic multi-kernel learning approach.The estimated and true
network delays from an unobserved nodeover the entire observation
interval are plotted in Fig. 12.
VI. CONCLUSION
This paper dealt with tracking dynamic graph processes that evolve over a candidate set of graph topologies with unknown switches. To this end, a dynamical model was introduced to capture both spatial and temporal variations of the graph processes through the notion of an active mode-conditioned topology. Subsequently, given observations over a subset of nodes, a scalable Bayesian tracker, termed IMGM, was developed to carry out semi-supervised tracking of the dynamic graph processes jointly with the active network mode. The novel IMGM solver lends itself to several important generalizations, including dynamic function switches, multiple kernels, and adaptive observation noise covariances. Accounting for all these models, IMGM offers an online approach to adaptively select the one that best fits the data, while at the same time tracking the graph processes. Numerical tests on synthetic and real data corroborated the performance gains of the novel IMGM approach.
REFERENCES
[1] "1981-2010 U.S. climate normals," [Online]. Available: https://www.ncdc.noaa.gov/data-access/land-based-station-data/land-based-datasets/climate-normals/1981-2010-normals-data. Accessed: Apr. 29, 2019.
[2] "One-way ping Internet2," [Online]. Available: http://software.internet2.edu/owamp/. Accessed: Apr. 29, 2019.
[3] R. Agaev and P. Chebotarev, "The matrix of maximum out forests of a digraph and its applications," Autom. Remote Control, vol. 61, no. 9, pp. 1424–1450, 2000.
[4] R. P. Agaev and P. Chebotarev, "Spanning forests of a digraph and their applications," Autom. Remote Control, vol. 62, no. 3, pp. 443–466, Mar. 2001.
[5] B. Baingana and G. B. Giannakis, "Tracking switched dynamic network topologies from information cascades," IEEE Trans. Signal Process., vol. 65, no. 4, pp. 985–997, Feb. 2017.
[6] Y. Bar-Shalom, X.-R. Li, and T. Kirubarajan, Estimation With Applications to Tracking and Navigation: Theory, Algorithms and Software. Hoboken, NJ, USA: Wiley, 2004.
[7] H. A. Blom and Y. Bar-Shalom, "The interacting multiple model algorithm for systems with Markovian switching coefficients," IEEE Trans. Autom. Control, vol. 33, no. 8, pp. 780–783, Aug. 1988.
[8] D. Boley, G. Ranjan, and Z.-L. Zhang, "Commute times for a directed graph using an asymmetric Laplacian," Linear Algebra Appl., vol. 435, no. 2, pp. 224–242, Jul. 2011.
[9] F. Chung, "Laplacians and the Cheeger inequality for directed graphs," Ann. Combinatorics, vol. 9, no. 1, pp. 1–19, Apr. 2005.
[10] P. Di Lorenzo, S. Barbarossa, P. Banelli, and S. Sardellitti, "Adaptive least mean-squares estimation of graph signals," IEEE Trans. Signal Inf. Process. Netw., vol. 2, no. 4, pp. 555–568, Dec. 2016.
[11] P. A. Forero, K. Rajawat, and G. B. Giannakis, "Prediction of partially observed dynamical processes over networks via dictionary learning," IEEE Trans. Signal Process., vol. 62, no. 13, pp. 3305–3320, Jul. 2014.
[12] G. B. Giannakis, Y. Shen, and G. V. Karanikolas, "Topology identification and learning over graphs: Accounting for nonlinearities and dynamics," Proc. IEEE, vol. 106, no. 5, pp. 787–807, May 2018.
[13] Y. Huang, Y. Zhang, Z. Wu, N. Li, and J. Chambers, "A novel adaptive Kalman filter with inaccurate process and measurement noise covariance matrices," IEEE Trans. Autom. Control, vol. 63, no. 2, pp. 594–601, Feb. 2018.
[14] V. N. Ioannidis, D. Romero, and G. B. Giannakis, "Inference of spatio-temporal functions over graphs via multikernel kriged Kalman filtering," IEEE Trans. Signal Process., vol. 66, no. 12, pp. 3228–3239, Jun. 2018.
[15] E. D. Kolaczyk, Statistical Analysis of Network Data: Methods and Models. Berlin, Germany: Springer, 2009.
[16] R. I. Kondor and J. Lafferty, "Diffusion kernels on graphs and other discrete structures," in Proc. Int. Conf. Mach. Learn., Jul. 2002, pp. 315–322.
[17] M. A. Kramer, E. D. Kolaczyk, and H. E. Kirsch, "Emergent network topology at seizure onset in humans," Epilepsy Res., vol. 79, no. 2/3, pp. 173–186, May 2008.
[18] X. R. Li and Y. Bar-Shalom, "A recursive multiple model approach to noise identification," IEEE Trans. Aerosp. Electron. Syst., vol. 30, no. 3, pp. 671–684, Jul. 1994.
[19] X.-R. Li and Y. Bar-Shalom, "Design of an interacting multiple model algorithm for air traffic control tracking," IEEE Trans. Control Syst. Technol., vol. 1, no. 3, pp. 186–194, Sep. 1993.
[20] Q. Lu, V. Ioannidis, and G. B. Giannakis, "Semi-supervised tracking of dynamic processes over switching graphs," in Proc. IEEE Data Sci. Workshop, Minneapolis, MN, USA, Jun. 2019, pp. 64–68.
[21] Q. Lu, V. Ioannidis, G. B. Giannakis, and M. Coutino, "Learning graph processes with multiple dynamical models," in Proc. Asilomar Conf. Signals, Syst., Comput., Pacific Grove, CA, USA, Nov. 2019, pp. 1783–1787.
[22] E. Mazor, A. Averbuch, Y. Bar-Shalom, and J. Dayan, "Interacting multiple model methods in target tracking: A survey," IEEE Trans. Aerosp. Electron. Syst., vol. 34, no. 1, pp. 103–123, Jan. 1998.
[23] K. P. Murphy, Machine Learning: A Probabilistic Perspective. Cambridge, MA, USA: MIT Press, 2012.
[24] N. Perraudin and P. Vandergheynst, "Stationary signal processing on graphs," IEEE Trans. Signal Process., vol. 65, no. 13, pp. 3462–3477, Jul. 2017.
[25] D. Romero, V. N. Ioannidis, and G. B. Giannakis, "Kernel-based reconstruction of space-time functions on dynamic graphs," IEEE J. Sel. Topics Signal Process., vol. 11, no. 6, pp. 856–869, Sep. 2017.
[26] D. Romero, M. Ma, and G. B. Giannakis, "Kernel-based reconstruction of graph signals," IEEE Trans. Signal Process., vol. 65, no. 3, pp. 764–778, Feb. 2017.
[27] A. V. Savkin and R. J. Evans, Hybrid Dynamical Systems: Controller and Sensor Switching Problems. Berlin, Germany: Springer, 2002.
[28] I. D. Schizas, G. B. Giannakis, S. I. Roumeliotis, and A. Ribeiro, "Consensus in ad hoc WSNs with noisy links—Part II: Distributed estimation and smoothing of random signals," IEEE Trans. Signal Process., vol. 56, no. 4, pp. 1650–1666, Apr. 2008.
[29] S. Segarra, A. G. Marques, G. Leus, and A. Ribeiro, "Reconstruction of graph signals through percolation from seeding nodes," IEEE Trans. Signal Process., vol. 64, no. 16, pp. 4363–4378, Aug. 2016.
[30] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, "The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98, May 2013.
[31] A. J. Smola and R. Kondor, "Kernels and regularization on graphs," in Learning Theory and Kernel Machines. Berlin, Germany: Springer, 2003, pp. 144–158.
[32] P. A. Traganitis and G. B. Giannakis, "Sketched subspace clustering," IEEE Trans. Signal Process., vol. 66, no. 7, pp. 1663–1675, Apr. 2018.
[33] E. A. Wan and A. T. Nelson, "Dual Kalman filtering methods for nonlinear prediction, smoothing and estimation," in Proc. Advances Neural Inf. Process. Syst., 1997, pp. 793–799.
[34] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proc. IEEE Adaptive Syst. Signal Process., Commun., Control Symp., 2000, pp. 153–158.
[35] X. Wang, M. Wang, and Y. Gu, "A distributed tracking algorithm for reconstruction of graph signals," IEEE J. Sel. Topics Signal Process., vol. 9, no. 4, pp. 728–740, Jun. 2015.
[36] Q. Yu et al., "Application of graph theory to assess static and dynamic brain connectivity: Approaches for building brain graphs," Proc. IEEE, vol. 106, no. 5, pp. 886–906, May 2018.
Qin Lu (Member, IEEE) received the Ph.D. degree in electrical engineering from the University of Connecticut (UConn), Mansfield, CT, USA, in 2018. She is currently a Postdoctoral Research Associate with the University of Minnesota, Twin Cities, Minneapolis, MN, USA. Her current research interests include machine learning, data science, and network science. In the past, she has worked on statistical signal processing, multiple target tracking, and underwater acoustic communications. She was awarded a Summer Fellowship and a Doctoral Dissertation Fellowship at UConn. She was also the recipient of the Women of Innovation Award in Collegian Innovation and Leadership by the Connecticut Technology Council in March 2018.

Vassilis N. Ioannidis (Student Member, IEEE) received the diploma in electrical and computer engineering from the National Technical University of Athens, Athens, Greece, in 2015, and the M.Sc. degree in electrical engineering in 2017 from the University of Minnesota, Twin Cities, Minneapolis, MN, USA. He is currently working toward the Ph.D. degree with the Department of Electrical and Computer Engineering. He received the Doctoral Dissertation Fellowship in 2019 from the University of Minnesota. He also received student travel awards from the IEEE Signal Processing Society in 2017–2018 and from the IEEE (2018). From 2014 to 2015, he was a middleware consultant for Oracle in Athens, Greece, and received a Performance Excellence Award. His research interests include deep graph learning, machine learning, big data analytics, and network science.

Georgios B. Giannakis (Fellow, IEEE) received the diploma in electrical engineering from the National Technical University of Athens, Athens, Greece, in 1981, the M.Sc. degree in electrical engineering, the M.Sc. degree in mathematics, and the Ph.D. degree in electrical engineering from the University of Southern California (USC), Los Angeles, CA, USA, in 1983, 1986, and 1986, respectively. From 1982 to 1986, he was with USC. He was a Faculty Member with the University of Virginia from 1987 to 1998, and since 1999, he has been a Professor with the University of Minnesota, Twin Cities, Minneapolis, MN, USA, where he holds an ADC Endowed Chair, a University of Minnesota McKnight Presidential Chair in ECE, and serves as the Director of the Digital Technology Center. His current research interests focus on data science, and network science with applications to the Internet of Things, and power networks with renewables. His general interests span the areas of statistical learning, communications, and networking, subjects on which he has authored or coauthored more than 460 journal papers, 760 conference papers, 25 book chapters, two edited books and two research monographs. He is the coinventor of 33 issued patents, and the corecipient of 9 best journal paper awards from the IEEE Signal Processing (SP) and Communications Societies, including the G. Marconi Prize Paper Award in Wireless Communications. He also received the IEEE-SPS Norbert Wiener Society Award (2019); EURASIP's A. Papoulis Society Award (2020); Technical Achievement Awards from the IEEE-SPS (2000) and from EURASIP (2005); the IEEE ComSoc Education Award (2019); the G. W. Taylor Award for Distinguished Research from the University of Minnesota; and the IEEE Fourier Technical Field Award (2015). He is a Fellow of the National Academy of Inventors, the European Academy of Sciences, and EURASIP. He has served the IEEE in a number of posts, including that of a Distinguished Lecturer for the IEEE-SPS.