GRAPH SPECTRA IN COMPUTER SCIENCE
Dragos Cvetkovic
Faculty of Electrical Engineering, University of Belgrade,
and
Mathematical Institute SANU, Belgrade,
11000 Belgrade, Serbia
e-mail: [email protected]
The talk is based on the papers
Cvetkovic D., Simic S.K., Graph spectra in computer science,
Linear Algebra Appl., 434(2011), 1545-1562.
Arsic B., Cvetkovic D., Simic S.K., Skaric M., Graph spectral
techniques in computer sciences, to appear.
The first paper classifies the areas of computer science where graph
spectra are used, while the second paper classifies the graph spectral
techniques that are used.
I am not giving a survey
on applications of matrices in computer science, or
on applications of graphs in computer science.
The subject of the talk:
Applications of the theory of graph spectra (or of spectral graph
theory) in computer science.
Spectral graph theory is a mathematical theory where linear
algebra and graph theory meet.
A spectral graph theory is a theory in which graphs are studied
by means of the eigenvalues of a matrix M which is defined in a
prescribed way for any graph.
This theory is called M-theory.
Frequently used graph matrices:
A adjacency matrix
D diagonal matrix of vertex degrees
L = D − A Laplacian
Q = D + A signless Laplacian
The spectral graph theory is the union of all these particular
theories + interactions
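For example, the adjacency matrix of the graph G shown in Fig. 1
[Fig. 1: the path with vertices x1, x2, x3, x4]
is given by
\[
A = \begin{pmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0
\end{pmatrix}.
\]
For the graph G in Fig. 1 we have
\[
P_G(\lambda) =
\begin{vmatrix}
\lambda & -1 & 0 & 0 \\
-1 & \lambda & -1 & 0 \\
0 & -1 & \lambda & -1 \\
0 & 0 & -1 & \lambda
\end{vmatrix}
= \lambda^4 - 3\lambda^2 + 1 .
\]
The eigenvalues of G are 1.6180, 0.6180, −0.6180, −1.6180, or
\[
\frac{1+\sqrt{5}}{2},\quad \frac{-1+\sqrt{5}}{2},\quad \frac{1-\sqrt{5}}{2},\quad \frac{-1-\sqrt{5}}{2}.
\]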
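The characteristic polynomial and the eigenvalues of this small example can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names are illustrative and do not come from the talk.

```python
import numpy as np

# Adjacency matrix of the path x1 - x2 - x3 - x4 from Fig. 1.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Coefficients of det(lambda*I - A), highest power first.
char_poly = np.poly(A)

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending
# order; reverse them to list the largest eigenvalue first.
eigenvalues = np.linalg.eigvalsh(A)[::-1]

# Closed forms (1 + sqrt 5)/2, (-1 + sqrt 5)/2, (1 - sqrt 5)/2, (-1 - sqrt 5)/2.
golden = (1 + np.sqrt(5)) / 2
expected = np.array([golden, golden - 1, 1 - golden, -golden])

print(np.round(char_poly, 10))
print(eigenvalues)
```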
Adjacency matrix - characteristic features
A walk of length k in a graph (or digraph) is a sequence of (not
necessarily different) vertices x1, x2, . . . , xk, xk+1 such that for each
i = 1, 2, . . . , k there is an edge (or arc) from xi to xi+1. The walk
is closed if xk+1 = x1.
Counting walks in a graph (or digraph) is related to graph spectra
by the following well-known result.
Theorem. If A is the adjacency matrix of a graph, then the
(i, j)-entry $a^{(k)}_{ij}$ of the matrix $A^k$ is equal to the number of walks
of length k that originate at vertex i and terminate at vertex j.
Thus, for example, the number of closed walks of length k is
equal to the k-th spectral moment, since
\[
\sum_{i=1}^{n} a^{(k)}_{ii} = \mathrm{tr}(A^k) = \sum_{i=1}^{n} \lambda_i^k .
\]
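The walk-counting theorem and the identity between closed walks and spectral moments can be illustrated on the four-vertex path. This is a small sketch assuming NumPy; the choice k = 4 and the variable names are only for illustration.

```python
import numpy as np

# Path on four vertices (indices 0..3 stand for x1..x4).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

k = 4
Ak = np.linalg.matrix_power(A, k)   # entry (i, j) counts walks of length k from i to j

closed_walks = int(np.trace(Ak))    # total number of closed walks of length k

# k-th spectral moment: sum of the k-th powers of the eigenvalues.
eigenvalues = np.linalg.eigvalsh(A.astype(float))
spectral_moment = float(np.sum(eigenvalues ** k))

print(Ak)
print(closed_walks, spectral_moment)
```

For example, the three walks of length 4 from x1 to x3 are x1 x2 x1 x2 x3, x1 x2 x3 x2 x3 and x1 x2 x3 x4 x3.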
Laplacian matrix - characteristic features
Let G be a connected graph on n vertices. The eigenvalues in
non-decreasing order and the corresponding orthonormal eigenvectors of
the Laplacian L = D − A of G are denoted by $\nu_1 = 0, \nu_2, \dots, \nu_n$
and $u_1, u_2, \dots, u_n$, respectively.
Note that if $x^T = (x_1, x_2, \dots, x_n)$, then
\[
x^T L x = \sum_{i \sim j,\ i<j} (x_i - x_j)^2 .
\]
We also have
\[
\nu = \sum_{i \sim j,\ i<j} (x_i - x_j)^2
\]
if x is a normalized eigenvector belonging to the eigenvalue ν of L.
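Both identities for the Laplacian quadratic form can be verified on a small graph. The sketch below assumes NumPy and uses the four-vertex path with an arbitrary test vector; none of the names come from the talk.

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 3)]      # path on four vertices
n = 4

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
D = np.diag(A.sum(axis=1))
L = D - A                              # Laplacian

def edge_sum(x):
    """Sum of (x_i - x_j)^2 over all edges {i, j}."""
    return sum((x[i] - x[j]) ** 2 for i, j in edges)

# Quadratic form for an arbitrary vector.
x = np.array([1.0, -2.0, 0.5, 3.0])
quad = x @ L @ x

# For a normalized eigenvector the edge sum equals the eigenvalue.
nu, U = np.linalg.eigh(L)              # eigenvalues in ascending order
u2 = U[:, 1]                           # eigenvector of the second smallest eigenvalue

print(quad, edge_sum(x))
print(nu[1], edge_sum(u2))
```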
Biggs N.L., Algebraic Graph Theory, Cambridge University
Press, Cambridge, 1993.
Chung F., Spectral Graph Theory, American Mathematical So-
ciety, Providence, Rhode Island, 1997.
Cvetkovic D., Doob M., Sachs H., Spectra of Graphs, Theory
and Application, 3rd edition, Johann Ambrosius Barth Verlag,
Heidelberg–Leipzig, 1995.
Cvetkovic D., Rowlinson P., Simic S. K., An Introduction to
the Theory of Graph Spectra, Cambridge University Press, Cam-
bridge, 2009.
Typical research subjects in mathematical theory:
– characterizations of graphs by their spectra,
– inequalities for eigenvalues,
– extremal problems with eigenvalues,
– graph energy (the sum of the absolute values of eigenvalues).
Areas of Applications
1. Expanders and combinatorial optimization,
2. Complex networks and the Internet topology,
3. Data mining,
4. Computer vision and pattern recognition,
5. Internet search,
6. Load balancing and multiprocessor interconnection networks,
7. Anti-virus protection versus spread of knowledge,
8. Statistical databases and social networks,
9. Quantum computing,
10. Bioinformatics,
11. Coding theory,
12. Control theory.
Expanders and combinatorial optimization
One of the oldest applications (from the 1970s) of graph eigenvalues
in computer science is related to graphs called expanders.
A graph has good expanding properties if each subset of the
vertex set of small cardinality has a set of neighbors of large
cardinality.
Expanders and some related graphs (called enlargers, magnifiers,
concentrators and superconcentrators) appear in the treatment
of several problems in computer science (for example, communication
networks, error-correcting codes, optimizing memory space,
computing functions, sorting algorithms, etc.).
Expanders can be constructed from graphs with a small second
largest eigenvalue in modulus. This class of graphs includes the
so-called Ramanujan graphs.
Expanders are related to some problems of combinatorial
optimization. More generally, several algorithms of combinatorial
optimization are considered as part of computer science.
Numerous relations between eigenvalues of graphs and combinatorial
optimization have been known for the last twenty years. The
section titles of an excellent expository article
Mohar B., Poljak S., Eigenvalues in combinatorial optimiza-
tion, in: Combinatorial and Graph-Theoretical Problems in
Linear Algebra, (ed. R. Brualdi, S. Friedland, V. Klee), Springer-
Verlag, New York, 1993, 107–151.
show that many problems in combinatorial optimization can be
treated using eigenvalues:
1. Introduction: 1.1. Matrices and eigenvalues of graphs;
2. Partition problems: 2.1. Graph bisection, 2.2. Connectivity
and separation, 2.3. Isoperimetric numbers, 2.4. The maximum
cut problem, 2.5. Clustering, 2.6. Graph partition;
3. Ordering: 3.1. Bandwidth and min-p-sum problems, 3.2.
Cut-width, 3.3. Ranking, 3.4. Scaling, 3.5. The quadratic
assignment problem;
4. Stable sets and coloring: 4.1. Chromatic number, 4.2. Lower
bounds on stable sets, 4.3. Upper bounds on stable sets, 4.4.
k-colorable subgraphs;
5. Routing problems: 5.1. Diameter and the mean distance, 5.2.
Routing, 5.3. Random walks;
6. Embedding problems.
The second smallest eigenvalue of the graph Laplacian is called
the algebraic connectivity of the graph and was introduced by M.
Fiedler in the paper
Algebraic connectivity of graphs, Czech. J. Math., 23(98)(1973),
298-305.
The algebraic connectivity has been used in
D. Cvetkovic, M. Cangalovic and V. Kovacevic-Vujcic, Semidef-
inite programming methods for the symmetric traveling salesman
problem, Integer Programming and Combinatorial Optimiza-
tion, Proc. 7th Internat. IPCO Conf., Graz, Austria, June
1999, Lecture Notes Comp. Sci. 1610, Springer, Berlin, 1999,
126-136.
to formulate the following discrete semidefinite programming
model of the symmetric travelling salesman problem (STSP):
STSP:
\[
\text{minimize}\quad F(X) = \sum_{i=1}^{n}\sum_{j=1}^{n}\left(-\frac{1}{2}d_{ij}\right)x_{ij} + \frac{\alpha}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} d_{ij}
\]
subject to
\[
x_{ii} = 2 + \alpha - \beta \quad (i = 1, \dots, n), \qquad
\sum_{j=1}^{n} x_{ij} = n\alpha - \beta \quad (i = 1, \dots, n),
\]
\[
x_{ij} \in \{\alpha - 1, \alpha\} \quad (j = 1, \dots, n;\ i < j), \qquad X \geq 0 .
\]
Here $X \geq 0$ means that the matrix X is symmetric and positive
semidefinite, while α and β are chosen so that $\alpha > h_n/n$ and
$0 < \beta \leq h_n$, with $h_n = 2 - 2\cos(2\pi/n)$ being the algebraic
connectivity of the cycle $C_n$.
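The closed form for the algebraic connectivity of the cycle, 2 − 2cos(2π/n), can be checked directly. The sketch below assumes NumPy and covers only this constant, not the semidefinite model itself; the helper names are hypothetical.

```python
import numpy as np

def cycle_laplacian(n):
    """Laplacian L = D - A of the cycle C_n (n >= 3)."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = 1.0
        A[(i + 1) % n, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def algebraic_connectivity(L):
    """Second smallest Laplacian eigenvalue."""
    return np.linalg.eigvalsh(L)[1]    # eigvalsh sorts in ascending order

for n in range(3, 11):
    h_n = 2 - 2 * np.cos(2 * np.pi / n)
    print(n, algebraic_connectivity(cycle_laplacian(n)), h_n)
```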
Complex networks
Complex networks is a common name for various real networks
which are represented by graphs with an enormous number
of vertices. These include Internet graphs, phone graphs, e-mail
graphs, social networks and many others. In spite of their diversity,
such networks show some common properties.
Several models of random graphs have been used to describe
complex networks, including the classical Erdős-Rényi model where
each edge exists with a constant probability. There are also
models in which a given degree distribution is realized.
The main characteristics of complex networks are the degree and
eigenvalue distributions. Both distributions obey a power law of the
form $x^{-\beta}$ for a positive β.
In particular, if $n_k$ denotes the number of vertices of degree k,
then asymptotically $n_k = a k^{-\beta}$ for some constant a.
It was conjectured in
Faloutsos M., Faloutsos P., Faloutsos C., On power-law
relationships of the Internet topology, Proc. ACM SIGCOMM '99,
ACM Press, New York, 1999, 251-262.
that in networks with a degree power law the largest eigenvalues of
the adjacency matrix also have a power law distribution. That was
proved under some conditions in
Mihail M., Papadimitriou C.H., On the eigenvalue power law,
RANDOM 2002, LNCS 2483, Springer, Berlin, 2002, 254-262.
The power law for eigenvalues can be formulated in the following
way. Let $\lambda_1, \lambda_2, \dots$ be the non-increasing sequence of eigenvalues of
the adjacency matrix; then asymptotically $\lambda_i = a i^{-\gamma}$ for some
constant a and positive γ.
The following book is devoted to complex networks.
Chung F., Lu L., Complex Graphs and Networks, American
Mathematical Society, Providence, Rhode Island, 2006.
There are two chapters which describe spectral properties of such
networks.
Note that most of the papers on complex networks appear in
scientific journals in the area of Physics.
Empirical studies of the Internet topology have been conducted
in many papers using the normalized Laplacian matrix
\[
\mathcal{L} = D^{-1/2}(D - A)D^{-1/2} = D^{-1/2} L D^{-1/2} .
\]
The eigenvalues $\gamma_i$, i = 1, 2, . . . , n, of $\mathcal{L}$ in non-decreasing order
can be represented by the points $\left(\frac{i-1}{n-1}, \gamma_i\right)$ in the region [0, 1] × [0, 2]
and can be approximated by a continuous curve. It was noticed in
Vukadinovic D., Huang P., Erlebach T., A spectral analysis of
the Internet topology, 2001
that this curve remains practically the same over time for several
networks, in spite of the increasing number of vertices and edges of
the corresponding graph. Therefore the authors consider the
spectrum of $\mathcal{L}$ as a fingerprint of the corresponding network topology.
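The spectral fingerprint described above, i.e. the points ((i − 1)/(n − 1), γ_i) built from the eigenvalues of the normalized Laplacian $D^{-1/2}(D - A)D^{-1/2}$, can be sketched as follows. NumPy is assumed, the function names are hypothetical, and the graph is assumed to have no isolated vertices. A small star is used as a stand-in for a real network.

```python
import numpy as np

def normalized_laplacian(A):
    """D^{-1/2} (D - A) D^{-1/2}; assumes every vertex has positive degree."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.diag(d) - A
    return d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]

def spectral_fingerprint(A):
    """Points ((i-1)/(n-1), gamma_i) for the sorted eigenvalues gamma_i."""
    gamma = np.linalg.eigvalsh(normalized_laplacian(A))   # ascending order
    n = len(gamma)
    xs = np.arange(n) / (n - 1)
    return np.column_stack([xs, gamma])

# Star with one centre (vertex 0) and four leaves.
star = np.zeros((5, 5))
star[0, 1:] = 1.0
star[1:, 0] = 1.0

points = spectral_fingerprint(star)
print(points)
```

All points fall in the region [0, 1] × [0, 2], so fingerprints of graphs with different numbers of vertices can be drawn in the same picture.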
Data mining
Data mining discovers interesting and unknown relationships
and patterns in huge data sets. Such hidden information could
contribute greatly to many domains such as image processing,
web searching, computer security and many others, including those
outside computer science.
Among the many tools used in data mining, spectral techniques play
an important role:
Sawilla R., A survey of data mining of graphs using spectral
graph theory, Defence R&D Canada - Ottawa, Technical
Memorandum TM 2008-317, Ottawa, 2008.
These include, in particular, clustering and ranking the vertices of
a graph. Ranking will be treated later; here we consider
clustering.
A description of spectral clustering methods is given in the
tutorial
Luxburg U. von, A tutorial on spectral clustering, Stat. Com-
put. 17(2007), 395-416.
We shall present an algorithm for graph clustering which is based
on the Laplacian matrix of a graph.
Let G be a connected graph on n vertices. The eigenvalues in
non-decreasing order and the corresponding orthonormal eigenvectors of
the Laplacian L = D − A of G are denoted by $\nu_1 = 0, \nu_2, \dots, \nu_n$
and $u_1, u_2, \dots, u_n$, respectively.
In order to construct k clusters in a graph we form an n × k
matrix U containing the vectors $u_1, u_2, \dots, u_k$ as columns. We
obtain a geometric representation $\mathcal{G}$ of G in the k-dimensional space
$\mathbb{R}^k$: we just take the rows of U as point coordinates representing the
vertices of G. Edges are straight line segments between the
corresponding points. Now classical clustering methods (say, the k-means
algorithm) can be applied to this new graph representation.
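The clustering procedure just described can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: NumPy is assumed, the tiny k-means routine with farthest-point initialisation is a simplification chosen to keep the example deterministic, and all function names are hypothetical.

```python
import numpy as np

def laplacian(n, edges):
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def kmeans(points, k, iterations=50):
    """Plain Lloyd iteration with a deterministic farthest-point start."""
    centroids = [points[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(dist)])
    centroids = np.array(centroids)
    for _ in range(iterations):
        dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = points[labels == c].mean(axis=0)
    return labels

def spectral_clusters(n, edges, k):
    # Columns of U are orthonormal eigenvectors u_1, ..., u_n (ascending eigenvalues).
    _, U = np.linalg.eigh(laplacian(n, edges))
    return kmeans(U[:, :k], k)         # rows of the n x k matrix are the points

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
labels = spectral_clusters(6, edges, 2)
print(labels)
```

On this example the two triangles end up in different clusters, because the second eigenvector takes opposite signs on them.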
In regular graphs we can use the adjacency matrix A instead
of the Laplacian L. We have L = rI − A for a regular graph of
degree r and $\lambda_i = r - \nu_i$ for i = 1, 2, . . . , n.
Eigenvectors of L are also eigenvectors of A for the
corresponding eigenvalues.
Instead of the k smallest eigenvalues we now have to consider the
k largest eigenvalues.
Example. Consider the adjacency matrix A of a cycle $C_n$ of
length n. It is well known that the eigenvalues $\lambda_j$ of the matrix A
are given by $\lambda_j = 2\cos(2\pi j/n)$ (j = 1, . . . , n).
For j = 1 and j = n − 1 we get the second largest eigenvalue
$2\cos(2\pi/n)$, which corresponds to the second smallest eigenvalue
$2 - 2\cos(2\pi/n)$ of the Laplacian.
Two independent eigenvectors x, y are given by the coordinates
\[
x_l = \cos(2\pi l/n), \qquad y_l = \sin(2\pi l/n) \qquad (l = 1, \dots, n).
\]
If we represent the vertices by the points $(x_l, y_l)$, the picture of the graph
is a regular n-gon.
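The cycle example can be verified numerically: the two coordinate vectors are eigenvectors for 2cos(2π/n), and the resulting points form a regular n-gon. A short sketch assuming NumPy, with n = 8 chosen arbitrarily:

```python
import numpy as np

n = 8
A = np.zeros((n, n))
for i in range(n):                      # index i stands for vertex l = i + 1
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

l = np.arange(1, n + 1)
x = np.cos(2 * np.pi * l / n)
y = np.sin(2 * np.pi * l / n)
lam = 2 * np.cos(2 * np.pi / n)         # second largest adjacency eigenvalue

# Lengths of the drawn edges between consecutive vertices on the cycle.
edge_lengths = np.array([np.hypot(x[i] - x[(i + 1) % n], y[i] - y[(i + 1) % n])
                         for i in range(n)])

print(np.allclose(A @ x, lam * x), np.allclose(A @ y, lam * y))
print(edge_lengths)
```

All vertices lie on the unit circle and all edges have the same length 2 sin(π/n), which is exactly the regular n-gon.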
The graph representation obtained from the Laplacian matrix has been
used in graph drawing:
Koren Y., Drawing graphs by eigenvectors: theory and practice,
Comput. Math. Appl., 49(2005), 1867-1888.
Tutte W.T., How to draw a graph, Proc. London Math. Soc.,
13(1963),743-768.
Example. Fullerene graphs, which correspond to carbon compounds
called fullerenes, are also interesting in our context. Fullerene
graphs are planar regular graphs of degree 3 having as faces only
pentagons and hexagons. The Euler theorem for planar graphs
yields that the number of pentagons is exactly 12. Although they are
planar, fullerene graphs are represented (and this really
corresponds to the actual positions of carbon atoms in a fullerene) in 3-space
with their vertices embedded in a quasi-spherical surface.
A typical fullerene is C60. It can also be described as a truncated
icosahedron and has the shape of a football.
Figure 1: a) Planar and b) 3D visualization of the icosahedral fullerene C60
Fullerene graphs have a nice 3D representation in which the
coordinates of the vertex positions can be calculated from three
eigenvectors of the adjacency matrix (the so-called topological
coordinates, which were also used in producing the atlas
P. W. Fowler and D. E. Manolopoulos, An Atlas of Fullerenes,
Clarendon Press, Oxford, 1995).
Together with the Laplacian L and the normalized Laplacian
$\mathcal{L}$, the matrix $D^{-1}L$ has also been used in clustering algorithms.
According to
Luxburg U. von, A tutorial on spectral clustering, Stat. Com-
put. 17(2007), 395-416.
the last matrix performs best.
Computer vision and pattern recognition
Spectral graph theory has been widely applied in computer vi-
sion and pattern recognition. Examples include image segmen-
tation, routing, image classification, etc. These methods use the
spectrum, i.e. eigenvalues and eigenvectors, of the adjacency or
Laplacian matrix of a graph.
The basic idea is to represent an image by a weighted graph with
a vertex for each pixel and the edges between the neighbouring
pixels with weight depending on how similar the pixels are.
A more sophisticated idea is to represent an image’s content by
a graph with specially selected points as vertices. The interesting
points are points in an image which have a well-defined position
and can be robustly detected.
Several other graphs are used.
Image segmentation is an important procedure in computer
vision and pattern recognition. The problem is to divide the
image into regions according to some criteria. Very frequently the
segmentation is obtained using eigenvectors of some graph
matrices.
Internet search
Web search engines are based on eigenvectors of the adjacency
matrix and some related graph matrices:
PageRank (used in Google)
Brin S., Page L., The Anatomy of a Large-Scale Hypertextual
Web Search Engine, Proc. 7th International WWW Conference,
1998.
and Hyperlink-Induced Topic Search (HITS)
Kleinberg J., Authoritative sources in a hyperlinked
environment, J. ACM, 46 (1999), 604-632.
The Internet is represented by a digraph G, with web pages as
vertices and links as arcs.
HITS exploits eigenvectors belonging to the largest eigenvalues
of the matrices $AA^T$ and $A^TA$, where A is the adjacency matrix
of a subgraph of G induced by the set of web pages obtained from
the search keywords by some heuristics. The eigenvectors define an
ordering of the selected web pages.
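As an illustration of the HITS idea, the sketch below computes hub and authority scores as principal eigenvectors of $AA^T$ and $A^TA$ for a toy digraph. NumPy is assumed and the function names are hypothetical. Taking absolute values of the eigenvector is a shortcut that is valid here because the largest eigenvalue is simple; the selection of pages by search heuristics is not modelled.

```python
import numpy as np

def principal_eigenvector(M):
    """Principal eigenvector of a symmetric non-negative matrix, scaled to sum 1."""
    _, V = np.linalg.eigh(M)            # eigenvalues ascending, so take the last column
    v = np.abs(V[:, -1])                # fix the arbitrary sign (assumes a simple largest eigenvalue)
    return v / v.sum()

def hits_scores(A):
    hubs = principal_eigenvector(A @ A.T)
    authorities = principal_eigenvector(A.T @ A)
    return hubs, authorities

# Toy web: arcs 0->2, 0->3, 1->2 (entry A[i, j] = 1 means page i links to page j).
A = np.zeros((4, 4))
A[0, 2] = A[0, 3] = A[1, 2] = 1.0

hubs, authorities = hits_scores(A)
print(hubs)
print(authorities)
```

Page 2, which is linked from both hubs, receives the highest authority score, and page 0, which links to both authorities, is the best hub.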
PageRank uses random walks. The adjacency matrix of G is
normalized so that the sum of entries in each row is equal to 1. This matrix
P is the transition matrix of a Markov chain, and the normalized
eigenvector of the largest eigenvalue of $P^T$ defines the equilibrium
state of the chain. Pages are ranked by the coordinates of this
eigenvector.
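The random-walk ranking can be sketched with a simple power iteration on the transpose of the row-normalized adjacency matrix. This is a minimal illustration assuming NumPy, with hypothetical function names. It assumes that every page has at least one outgoing link and that the chain converges, and it omits the adjustments that practical PageRank implementations add.

```python
import numpy as np

def transition_matrix(A):
    """Divide each row by its out-degree (assumes no page without outgoing links)."""
    out_degree = A.sum(axis=1, keepdims=True)
    return A / out_degree

def stationary_distribution(P, iterations=500):
    """Power iteration for the eigenvector of P^T belonging to eigenvalue 1."""
    pi = np.full(len(P), 1.0 / len(P))
    for _ in range(iterations):
        pi = P.T @ pi
    return pi / pi.sum()

# Toy web with arcs 0->1, 0->2, 1->2, 2->0.
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)

P = transition_matrix(A)
rank = stationary_distribution(P)
print(rank)
```

Pages 0 and 2 share the top rank in this toy example, and page 1 receives half of their score.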
The expository paper
Langville A.N., Meyer C.D., A survey of eigenvector methods
for Web information retrieval, SIAM Rev., 47(2005), No. 1,
135-161.
contains a survey of both techniques.
Load balancing and multiprocessor interconnection
networks
The job which has to be executed by a multiprocessor system
is divided into parts (elementary jobs or items) that are given to
particular processors. The distribution of elementary jobs among
processors can be represented by a vector x whose coordinates are
non-negative integers associated with graph vertices; they indicate how
many elementary jobs are given to the corresponding processors.
The load balancing problem requires the creation of algorithms for
moving elementary jobs among processors in order to achieve the
uniform distribution, i.e., that the vector x is an integer multiple
of the vector j, all of whose coordinates are equal to 1.
There are load balancing algorithms based on graph eigenvalues
and eigenvectors.
In these algorithms the number of iterations is equal to the number m of
distinct non-zero Laplacian eigenvalues of the underlying graph. The
maximum vertex degree ∆ of G also affects the computation of the
balancing flow. The complexity of the balancing flow calculations
essentially depends on the product m∆, and that is why this
quantity was proposed in
R. Elsasser, R. Kralovic, B. Monien, Sparse topologies with
small spectrum size, Theor. Comput. Sci. 307:549–565, 2003.
as a parameter relevant for the choice and the design of
multiprocessor interconnection networks.
The following definitions of four kinds of graph tightness have
been introduced and used in Cvetkovic D., Davidovic D., 2008,
2009.
First type mixed tightness $t_1(G)$ of a graph G is defined as the
product of the number of distinct eigenvalues m and the maximum
vertex degree ∆ of G, i.e., $t_1(G) = m\Delta$.
Structural tightness stt(G) is the product (D + 1)∆, where D
is the diameter and ∆ is the maximum vertex degree of a graph G.
Spectral tightness spt(G) is the product of the number of
distinct eigenvalues m and the largest eigenvalue $\lambda_1$ of a graph G.
Second type mixed tightness $t_2(G)$ is defined as a function of
the diameter D of G and the largest eigenvalue $\lambda_1$, i.e.,
$t_2(G) = (D + 1)\lambda_1$.
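The four tightness values can be computed directly from the adjacency matrix. The sketch below assumes NumPy and a connected graph. The function name, the rounding used to count distinct eigenvalues, and the choice of the Petersen graph as an example are assumptions made for illustration.

```python
import numpy as np

def tightness_values(A, decimals=8):
    """Return t1, stt, spt and t2 for a connected graph with adjacency matrix A."""
    n = len(A)
    eigenvalues = np.linalg.eigvalsh(A)                      # ascending order
    m = len(np.unique(np.round(eigenvalues, decimals)))      # distinct eigenvalues (up to rounding)
    lambda1 = eigenvalues[-1]
    max_degree = int(A.sum(axis=1).max())

    # Diameter: smallest d such that (I + A)^d has no zero entry.
    step = (np.eye(n) + A > 0).astype(int)
    reach = np.eye(n, dtype=int)
    diameter = 0
    while not reach.all():
        if diameter >= n:
            raise ValueError("graph is not connected")
        reach = np.minimum(reach @ step, 1)
        diameter += 1

    return {"t1": m * max_degree,
            "stt": (diameter + 1) * max_degree,
            "spt": m * lambda1,
            "t2": (diameter + 1) * lambda1}

# Petersen graph: outer 5-cycle, spokes, inner pentagram.
petersen = np.zeros((10, 10))
for i in range(5):
    for u, v in [(i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)]:
        petersen[u, v] = petersen[v, u] = 1.0

values = tightness_values(petersen)
print(values)
```

The Petersen graph has three distinct eigenvalues (3, 1 and −2), maximum degree 3 and diameter 2, so all four products equal 9.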
Several arguments were given which support the claim that
graphs with small tightness t2 are well suited for multiprocessor
interconnection networks.
It was proved that the number of connected graphs with a
bounded tightness is finite and graphs with tightness values not
exceeding 9 are determined explicitly. There are 69 such graphs
and they contain up to 10 vertices. In addition, graphs with mini-
mal tightness values when the number of vertices is n = 2, . . . , 10
are identified.
Anti-virus protection versus spread of knowledge
The largest eigenvalue λ1 plays an important role in modelling
virus propagation in computer networks. The smaller the largest
eigenvalue, the larger the robustness of a network against the
spread of viruses. In fact, it was shown in
Wang Y., Chakrabarti D., Wang C., Faloutsos C., Epidemic
spreading in real networks: An eigenvalue viewpoint, 22nd Symp.
Reliable Distributed Computing, Florence, Italy, Oct. 6–8, 2003.
that the epidemic threshold in spreading viruses is proportional to
1/λ1. Another model of virus propagation in computer networks
has been developed in
Van Mieghem P., Omic J., Kooij R., Virus spread in networks,
with the same conclusion concerning 1/λ1.
Research and development networks (R&D networks) are stud-
ied using the largest eigenvalue of the adjacency matrix in
Konig M. D., Battiston S., Napoletano M., Schweitzer F., The
efficiency and evolution of R&D networks, Working Paper 08/95,
Economics Working Paper Series, Eidgenossische Technische Hochschule
Zurich, Zurich, 2008.
Konig M. D., Battiston S., Napoletano M., Schweitzer F., On
algebraic graph theory and the dynamics of innovation
networks, Networks and Heterogeneous Media, 3 (2008), No. 2,
201-219.
In such networks it is desirable that knowledge is spread
through the network as much as possible. Therefore the tendency is
to achieve high values of the largest eigenvalue, just the opposite of
the considerations of virus propagation.
Statistical databases and social networks
Statistical databases allow only statistical access to their records.
Individual values are confidential and are not to be disclosed, either
directly or indirectly. Thus, users of a statistical database are
restricted to statistical types of queries, such as looking for the sum
of values, minimum or maximum value of some parameters, etc.
Moreover, no sequence of answered queries should enable a user
to obtain any of the confidential individual values. However, if a
user is able to reveal a confidential individual value, the database
is said to be compromised. Statistical databases that cannot be
compromised are called secure.
One can consider a restricted case where the query collection
can be described as a graph. Surprisingly, the results from
Brankovic L., Usability of secure statistical data bases, PhD
Thesis, Newcastle, Australia, 1998.
Brankovic L., Miller M., Siran J., Graphs, (0,1)-matrices and
usability of statistical data bases, Congressus Numerantium, 120
(1996), 186–192.
show an amazing connection between compromise-free query col-
lections and graphs with least eigenvalue -2. This connection was
recognized in the paper
Brankovic Lj., Cvetkovic D., The eigenspace of the eigenvalue
-2 in generalized line graphs and a problem in security of sta-
tistical data bases, Univ. Beograd, Publ. Elektrotehn. Fak., Ser.
Mat., 14 (2003), 37–48.
The problem of protecting privacy also appears in social
networks on the Internet (for example, Facebook). To protect the
privacy of personal data, the network is randomized by deleting
some actual edges and adding some false edges in such a way
that global characteristics of the network remain unchanged. This is
achieved using eigenvalues of the adjacency matrix (in particular,
the largest one) and of the Laplacian (algebraic connectivity), as
described in the paper
Ying X., Wu X., Randomizing social networks: a spectrum pre-
serving approach, Proc. SIAM Internat. Conf. Data Mining,
SDM2008, April 24–26, 2008, Atlanta, Georgia, USA, SIAM, 2008,
739–750.
Quantum computing
Quantum computation is a model of computation based on the
principles of quantum mechanics although the corresponding com-
puters have not yet been realized. In spite of the non-existence of
actual machines, the theory of quantum computing is very much
developed.
For a general overview on Quantum Information Technology see,
for example, special issue of the journal
NEC Research & Developments, 44(2003), No. 3.
A graph is called integral if its spectrum consists entirely of
integers.
It has been discovered recently in
Christandl M., Datta N., Ekert A., Landahl A.J., Perfect state
transfer in quantum spin networks, Phys. Rev. Lett., 92(2004),187902.
that integral graphs can play a role in the so called perfect state
transfer in quantum spin networks.
According to the definition in physics, there is perfect state transfer
between two vertices of a graph if a single excitation can travel
with fidelity one between the corresponding sites of a spin system
modelled by the graph.
Mathematically, let G be a graph with adjacency matrix A and
consider the matrix $H(t) = e^{iAt}$, where t is a real variable and
$i^2 = -1$. According to
Godsil C., Periodic graphs, arXiv:0806.2704 [math.CO].
perfect state transfer occurs between vertices u and v of G if there
is a value of t such that $|H(t)_{u,v}| = 1$.
This can happen in integral graphs, but not always.
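The condition on the matrix exp(iAt) can be tested numerically. The sketch below assumes NumPy and evaluates the matrix through the spectral decomposition of A; the function name is hypothetical. The example is the 3-cube at time t = π/2, where the transfer takes place between antipodal vertices.

```python
import numpy as np

def transfer_matrix(A, t):
    """H(t) = exp(iAt) for a real symmetric A, via its eigendecomposition."""
    lam, V = np.linalg.eigh(A)
    return (V * np.exp(1j * lam * t)) @ V.T

# 3-cube: vertices 0..7, two vertices are adjacent if they differ in one bit.
A = np.zeros((8, 8))
for u in range(8):
    for bit in range(3):
        A[u, u ^ (1 << bit)] = 1.0

H = transfer_matrix(A, np.pi / 2)

# Modulus of the entry linking each vertex u with its antipode u XOR 7.
antipodal = np.array([abs(H[u, u ^ 7]) for u in range(8)])
print(antipodal)
```

Every modulus equals 1 up to rounding, so at t = π/2 an excitation placed at a vertex of the 3-cube is found at the antipodal vertex.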
There are exactly 13 connected, cubic, integral graphs
Bussemaker F. C., Cvetkovic D., There are exactly 13 con-
nected, cubic, integral graphs, Univ. Beograd, Publ. Elektrotehn.
Fak., Ser. Mat. Fiz., No. 544 - No. 576(1976), 43-48.
Among them are, for example, the 3-cube and the Petersen
graph.
The 3-cube is the only connected cubic integral graph with per-
fect state transfer
Severini, S., The 3-dimensional cube is the only connected cubic
graph with perfect state transfer, to appear.
Second classification: spectral tools used
Graph matrices
Spectra of several graph matrices appear in applications.
The adjacency matrix and Laplacian appear most frequently but
also the signless Laplacian as well as normalized versions of these
matrices.
Incidence, distance and other matrices can be found as well.
Sometimes the considerations move from graph matrices to gen-
eral ones; equivalently, weighted graphs appear instead of graphs.
In some cases we encounter digraphs and hyper-graphs and cor-
responding matrices as well.
In many papers the normalized Laplacian matrix
$\mathcal{L} = D^{-1/2}(D - A)D^{-1/2} = D^{-1/2} L D^{-1/2}$ appears. This matrix has 1's on the diagonal,
and at an off-diagonal position (i, j) the entry is equal to 0 for
non-adjacent and $-\frac{1}{\sqrt{d_i d_j}}$ for adjacent vertices i, j of degrees $d_i$, $d_j$. The
spectrum of $\mathcal{L}$ belongs to the interval [0, 2] independently of the
number of vertices.
For non-trivial connected graphs the matrices $D^{-1}A$ and
$(2D)^{-1}Q = (2D)^{-1}(D + A) = \frac{1}{2}(I + D^{-1}A)$ are transition matrices of Markov
chains for random and lazy random walks, respectively.
Let G be a graph with adjacency matrix A and consider the
matrix $H(t) = e^{iAt}$, where t is a real variable and $i^2 = -1$. This
matrix appears in quantum computing.
Very frequently we encounter affinity or similarity matrices.
For a set of objects the entries of such matrices indicate the measure
of affinity or similarity between the corresponding objects. For a
set of points in a Euclidean space the affinity between two points
at distance d is usually defined as $\exp(-d^2/2\sigma^2)$, where σ is a
parameter.
Affinity matrices can be understood as adjacency matrices of
weighted (complete) graphs. The row sums now play the role of
vertex degrees. Such matrices can be normalized or transformed
into a Laplacian-like form.
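A Gaussian affinity matrix and its Laplacian-like form can be built as follows. This is a sketch assuming NumPy; the function name is hypothetical, and setting the diagonal to zero (no self-loops) is a convention chosen for this example rather than something stated in the talk.

```python
import numpy as np

def affinity_matrix(points, sigma):
    """Entries exp(-d^2 / (2 sigma^2)) for pairwise Euclidean distances d."""
    diff = points[:, None, :] - points[None, :, :]
    squared_distance = np.sum(diff ** 2, axis=2)
    W = np.exp(-squared_distance / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)            # assumption: no self-loops
    return W

points = np.array([[0.0, 0.0],
                   [1.0, 0.0],
                   [0.0, 2.0]])

W = affinity_matrix(points, sigma=1.0)
degrees = W.sum(axis=1)                 # row sums play the role of vertex degrees
L = np.diag(degrees) - W                # Laplacian-like form of the weighted graph

print(W)
print(L)
```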
For a digraph G one can consider the symmetric matrices $AA^T$ and
$A^TA$ together with the adjacency matrix A of G. Note that $AA^T$
($A^TA$) contains out- (in-)degrees on the diagonal, while the
(i, j)-entry is equal to the number of common front (rear) neighbours
of the vertices i and j.
The adjacency matrix of a digraph G could be normalized so
that the sum of entries in each row is equal to 1. This is achieved
by dividing the entries in each row by the out-degree of the
corresponding vertex. Equivalently, we form a new matrix $P = D_+^{-1}A$,
where $D_+$ is the diagonal matrix of out-degrees. The matrix P is
a transition matrix of a Markov chain, and the normalized
eigenvector of the largest eigenvalue of its transpose $P^T$ defines the
steady state of the chain.
Spectral techniques used in computer sciences
1 Significant eigenvalues
1.1 Largest eigenvalue
1.2 Algebraic connectivity
1.3 The second largest eigenvalue
1.4 The least eigenvalue
1.5 Main eigenvalues
2 Eigenvector techniques
2.1 Principal eigenvector
2.2 The Fiedler eigenvector
2.3 Other eigenvectors
3 Spectral recognition problems
3.1 The spectral distance and similarity of graphs
3.2 Interlacing theorem and spectra of subgraphs
3.3 Structural and spectral perturbations of graphs
4 Spectra of random graphs
5 Miscellaneous topics
5.1 The Hoffman polynomial
5.2 Integral graphs
5.3 Graph divisors
A selection of applications
1. Largest eigenvalue : Anti-virus protection versus
spread of knowledge
2. Main eigenvalues - controllability of agents
3. Principal eigenvector - Internet search
4. The Fiedler eigenvector - image segmentation
5. Other Laplacian eigenvectors - spectral clustering
6. Random graphs - Complex networks
7. The Hoffman polynomial - load balancing in
multiprocessor systems
Largest eigenvalue : Anti-virus protection versus
spread of knowledge
The largest eigenvalue λ1 plays an important role in modelling
virus propagation in computer networks. The smaller the largest
eigenvalue, the larger the robustness of a network against the
spread of viruses.
Research and development networks (R&D networks) are stud-
ied using the largest eigenvalue of the adjacency matrix.
In such networks it is desirable that the knowledge is spread
through network as much as possible. Therefore the tendency is
to achieve high values of the largest eigenvalue, just opposite to
considerations of virus propagation.
An intuitive explanation of both phenomena, the advantage of a
minimal index for virus protection and of a maximal index for
knowledge spread, is provided by the fact that the number of walks
of length k in a connected graph behaves asymptotically as $c\lambda_1^k$
for a constant c > 0. The greater the number of walks, the more
intensive is the spread of the moving substance, no matter
whether this is a virus or knowledge.
Additional example:
A. Silva, P. Reyes, M. Debbah, Congestion in Randomly De-
ployed Wireless Ad-Hoc and Sensor Networks, Proc. Internat.
Conf. Ultra Modern Telecommunications St. Petersburg, Russia,
October 12-14, 2009.
The congestion number is the inverse of the spectral radius of
the graph. The intuitive explanation of this definition is that the
more paths of a fixed length are available for sending information,
the more we can split the information among these paths and coordinate them
to arrive at the receiver with the same number of hops.
The virus propagation model of Wang Y. et al. is a discrete-time
model. It uses the vector $P_t = (p_{1,t}, p_{2,t}, \dots, p_{n,t})^T$, where $p_{i,t}$ is
the probability that the vertex i is infected at time t. The basic
relation is
\[
p_{i,t} = (1 - \delta)\,p_{i,t-1} + \beta \sum_{j \sim i} p_{j,t-1},
\]
where β is the virus birth rate on an edge connected to an infected
vertex and δ is the virus curing rate on an infected vertex. The
corresponding matrix relation is
\[
P_t = ((1 - \delta)I + \beta A)\,P_{t-1},
\]
where A is the adjacency matrix of the graph representing the
network.
Further we have $P_t = S^t P_0$ for t = 0, 1, 2, . . ., where
$S = (1 - \delta)I + \beta A$. We see that the vector $P_t$ will tend to the zero
vector for $t \to \infty$ if and only if all eigenvalues of the matrix S are
smaller than 1 in modulus. This would mean that the virus epidemic
has died out, and it will happen if $1 - \delta + \beta\lambda_1 < 1$, i.e.
\[
\frac{\beta}{\delta} < \frac{1}{\lambda_1}.
\]
Given β and δ, we see that the smaller $\lambda_1$ is, the safer the
network. We can denote the quantity $\tau = \frac{1}{\lambda_1}$ as the epidemic threshold
in spreading viruses. Hence if $\frac{\beta}{\delta} < \tau$ the network is safe, and in the
opposite case the network will be conquered by viruses.
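The threshold behaviour of this linear model can be observed in a small simulation. The sketch below assumes NumPy; the function names, the complete graph on four vertices and the rate values are illustrative choices. Above the threshold the iterates of the linear model grow without bound, so they no longer represent probabilities and only indicate that the infection does not die out.

```python
import numpy as np

def simulate(A, beta, delta, p0, steps):
    """Iterate P_t = ((1 - delta) I + beta A) P_{t-1}."""
    S = (1 - delta) * np.eye(len(A)) + beta * A
    p = np.array(p0, dtype=float)
    for _ in range(steps):
        p = S @ p
    return p

def epidemic_threshold(A):
    """tau = 1 / lambda_1 for a symmetric adjacency matrix A."""
    return 1.0 / np.linalg.eigvalsh(A)[-1]

# Complete graph K4: lambda_1 = 3, so tau = 1/3.
K4 = np.ones((4, 4)) - np.eye(4)
p0 = [0.5, 0.0, 0.0, 0.0]

tau = epidemic_threshold(K4)
below = simulate(K4, beta=0.1, delta=0.4, p0=p0, steps=300)   # beta/delta = 0.25 < tau
above = simulate(K4, beta=0.2, delta=0.4, p0=p0, steps=50)    # beta/delta = 0.50 > tau

print(tau)
print(below.max(), above.sum())
```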
There are numerous mathematical investigations in both direc-
tions: to find graphs in particular classes of graphs which have
minimal or maximal largest eigenvalue. We mention a few results
and references.
We need the following definition.
Definition. A graph G with the edge set EG is called a nested
split graph if its vertices can be ordered so that jq ∈ EG implies
ip ∈ EG whenever i ≤ j and p ≤ q.
This definition is used in
Cvetkovic D., Rowlinson P., Simic S. K., Eigenspaces of Graphs,
Cambridge University Press, Cambridge, 1997,
where the graphs in question were called graphs with a stepwise
adjacency matrix. Some other definitions and terms are used in
the literature, e.g. degree maximal graphs, threshold graphs. Note
that graphs with a stepwise adjacency matrix are exactly the nested
split graphs. We also have
Proposition. A graph is a nested split graph if and only if
it does not contain as an induced subgraph any of the graphs
P4, 2K2, C4.
It is well known that among connected graphs with given numbers
of vertices and edges, the graph with maximal largest eigenvalue is
a nested split graph.
There are fewer results concerning minimal values of the largest
eigenvalue. The paper
Simic S., On the largest eigenvalue of bicyclic graphs, Publ. Inst.
Math.(Beograd), 46(60)(1989), 1-6.
solves the problem for bicyclic graphs.
Motivated by this fact, the authors of
van Dam E. R., Kooij R. E., The minimal spectral radius of
graphs with a given diameter, Linear Alg. Appl. 423 (2007),
408–419.
determine graphs with minimal λ1 among graphs with given num-
bers of vertices and edges, and having a given diameter.
Main eigenvalues - controllability of agents
An A-eigenvalue of a graph is called main if the corresponding
eigenspace contains a vector in which the sum of coordinates is
different from 0.
Graphs in which all eigenvalues are mutually distinct and main
have recently attracted some attention. There are no such graphs
on fewer than 6 vertices (except for the trivial graph K1) and there
are exactly 8 connected graphs with this property on 6 vertices.
In control theory one considers networked dynamic systems which
consist of independent "agents" (integrators) exchanging information
along edges of a graph. Such a system is "controllable" if
and only if the corresponding graph has all eigenvalues mutually
distinct and main.
Therefore connected graphs in which all eigenvalues are mutually
distinct and main are called "controllable".
Rahmani A., Ji M., Mesbahi M., Egerstedt M., Controllabil-
ity of multi-agent systems from a graph-theoretic perspective,
SIAM J. Control. Opt., 48(2009), No. 1, 162-186.
The following differential equation is a standard system model
for physical systems:
(1) $\frac{dx}{dt} = Ax + bu.$
Here x = x(t) is called the state vector, with given x(0), and the
scalar u = u(t) is the control input. The matrix A has size n × n,
while both x and b have size n × 1.
The system (1) is called controllable if the following is true:
given any vector x∗ and time t∗, there always exists a control func-
tion u(t), 0 < t < t∗, such that the solution of (1) gives x(t∗) = x∗
irrespective of x(0). That is, the state can be steered to any point
of n-dimensional vector space arbitrarily quickly.
It is well known in control theory that the system (1) is control-
lable if and only if the following controllability matrix
(2) $[\, b \;\; Ab \;\; A^2 b \;\; \ldots \;\; A^{n-1} b \,]$
has full rank n.
The matrix (2) is the walk-matrix in graph theory in the case
that b is the all-one vector and A is the adjacency matrix of a
graph.
The walk matrix (2) has full rank n if and only if the number
of main eigenvalues is n. This in turn implies that all eigenvalues
should be distinct.
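The rank criterion is easy to test exactly on small graphs. The following sketch (function names are ours, not from the cited papers) builds the walk matrix with b the all-one vector and computes its rank by Gaussian elimination over the rationals, so no rounding is involved. The 6-vertex example graph is an assumption chosen for illustration: a triangle with one pendant vertex and one pendant path of length 2.

```python
from fractions import Fraction

def walk_matrix(adj):
    """Rows are vertices; column k holds the numbers of walks of length k."""
    n = len(adj)
    column = [1] * n
    columns = []
    for _ in range(n):
        columns.append(column)
        column = [sum(column[w] for w in adj[v]) for v in range(n)]  # A * column
    return [[columns[k][v] for k in range(n)] for v in range(n)]

def rank(matrix):
    """Exact rank by Gaussian elimination with rational arithmetic."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    found = 0
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(found, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][c] / rows[found][c]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found

def is_controllable(adj):
    """All eigenvalues distinct and main <=> the walk matrix has full rank."""
    return rank(walk_matrix(adj)) == len(adj)

path4 = [[1], [0, 2], [1, 3], [2]]                       # 2 main eigenvalues only
asym6 = [[1], [0, 2, 5], [1, 3, 5], [2, 4], [3], [1, 2]]  # no non-trivial automorphism
```

The path on four vertices has four distinct eigenvalues but its walk matrix has rank 2, so it is not controllable; the asymmetric 6-vertex graph gives rank 6.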
Mathematical considerations
Proposition 1. A graph and its complement have the same
number of main eigenvalues.
Proposition 2. The disjoint union of two controllable graphs
with disjoint spectra is a (disconnected) graph in which all
eigenvalues are mutually distinct and main.
We combine these observations in the following proposition.
Proposition 3. If G1, G2 are controllable and their complements
$\overline{G_1}$, $\overline{G_2}$ have disjoint spectra then the join
G1 ∇ G2 is controllable.
Theorem. Controllable graphs have a trivial automorphism
group.
Proof. Any divisor of a graph contains in its spectrum all the
main eigenvalues of the graph. Hence, the only divisor of a
controllable graph is trivial (equal to the graph itself). On the other
hand, it is well-known that the orbits of the automorphism group
of a graph induce a divisor. This means that the orbits in a
controllable graph are singletons, and this further implies that the
automorphism group contains only the identity.
Remark. This theorem is a refinement of the following theorem:
If a multigraph has no repeated eigenvalues then all of its
non-trivial automorphisms are involutions.
Computer enumeration
We used the publicly available library of programs nauty to
generate all connected graphs on a given number of vertices. The
library nauty includes a program for computing the automor-
phism groups of graphs and digraphs; it is an open source program
written in a portable subset of C, and runs on a considerable num-
ber of different systems. The implementation of the algorithm for
generating graphs is very efficient.
We have calculated the numbers of controllable graphs with up
to 9 vertices. It turns out that the numbers are 8, 85, 2275, 83034
for 6, 7, 8, 9 vertices respectively.
The 85 controllable graphs on 7 vertices can be found in a table
of connected graphs on 7 vertices in the book
Cvetkovic D., Doob M., Gutman I., Torgasev A., Recent Re-
sults in the Theory of Graph Spectra, North-Holland, Amster-
dam, 1988.
In Table 1 we use the following notation:
n - the number of vertices (agents),
T(n) - the total number of connected graphs,
I(n) - the total number of connected graphs with trivial
automorphism group, and
C(n) - the total number of connected graphs whose eigenvalues are
all distinct and main.
The numbers T(n) and I(n) are known from the literature.
We have found the numbers C(n) (n = 7, 8, 9) and all of these
numbers are presented in Table 1.
n       1   2   3   4   5    6    7      8       9
T(n)    1   1   2   6  21  112  853  11117  261080
I(n)    1   0   0   0   0    8  144   3552  131452
C(n)    1   0   0   0   0    8   85   2275   83034
Table 1: The number of controllable graphs
Conjecture
Almost all connected graphs are controllable.
Cvetkovic D., Rowlinson P., Stanic Z., Yoon M.-G., Control-
lable graphs, Bull. Acad. Serbe Sci. Arts, Cl. Sci. Math. Natur.,
Sci. Math. 143(2011), No. 36, 81-88.
Cvetkovic D., Rowlinson P., Stanic Z., Yoon M.-G., Controllable
graphs with least eigenvalue at least -2, Applicable Analysis and
Discrete Mathematics, 5(2011), No. 2, 165-175.
Principal eigenvector - Internet search
Web search engines are based on eigenvectors of the adjacency
and some related graph matrices:
PageRank (used in Google)
Brin S., Page L., The Anatomy of a Large-Scale Hypertextual
Web Search Engine, Proc. 7th International WWW Conference,
1998.
and Hyperlink-Induced Topic Search (HITS)
Kleinberg J., Authoritative sources in a hyperlinked environment,
J. ACM, 46 (1999), 604-632.
The Internet is represented by a digraph G: web pages are vertices
and links are arcs.
HITS exploits eigenvectors belonging to the largest eigenvalues
of the matrices $AA^T$ and $A^TA$, where A is the adjacency matrix
of a subgraph of G induced by the set of web pages obtained from
the search key words by some heuristics. The eigenvectors define
an ordering of the selected web pages.
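The two eigenvectors can be approximated by alternating multiplications with A^T and A. The sketch below is only an illustration of that idea on a hypothetical arc list, not the implementation from Kleinberg's paper; the names hits, hub and auth are ours.

```python
from math import sqrt

def normalize(vector):
    """Scale to Euclidean length 1 (a zero vector is returned unchanged)."""
    norm = sqrt(sum(c * c for c in vector))
    return [c / norm for c in vector] if norm > 0 else vector

def hits(n, arcs, iterations=100):
    """Power iteration for the principal eigenvectors of A A^T and A^T A."""
    hub = [1.0] * n
    auth = [1.0] * n
    for _ in range(iterations):
        auth = [0.0] * n
        for u, v in arcs:
            auth[v] += hub[u]   # auth = A^T hub
        hub = [0.0] * n
        for u, v in arcs:
            hub[u] += auth[v]   # hub = A auth, i.e. A A^T applied to the old hub
        auth = normalize(auth)
        hub = normalize(hub)
    return hub, auth

# Hypothetical 4-page subgraph: page 0 links to 2 and 3, page 1 links to 2.
arcs = [(0, 2), (1, 2), (0, 3)]
hub, auth = hits(4, arcs)
```

Here page 2 receives the largest authority weight and page 0 the largest hub weight, which is the ordering one would expect from the link structure.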
PageRank uses random walks. The adjacency matrix of G is normalized
so that the sum of entries in each row is equal to 1. This matrix
P is the transition matrix of a Markov chain, and the normalized
eigenvector of the largest eigenvalue of $P^T$ defines the equilibrium
state of the chain. Pages are ranked by the coordinates of this
eigenvector.
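The equilibrium state can be approximated by repeatedly applying P^T to a probability vector. The following sketch covers only the plain random walk described above; it assumes every page has at least one outgoing link and that the chain is irreducible and aperiodic, and it omits the damping factor used in practical PageRank. The function name and the small digraph are hypothetical.

```python
def stationary_ranking(n, arcs, iterations=200):
    """Iterate rank <- P^T rank, where P is the row-normalized adjacency matrix."""
    out_degree = [0] * n
    for u, _ in arcs:
        out_degree[u] += 1
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = [0.0] * n
        for u, v in arcs:
            # Page u passes an equal share of its rank along each outgoing link.
            new_rank[v] += rank[u] / out_degree[u]
        rank = new_rank
    return rank

# Hypothetical digraph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
ranking = stationary_ranking(3, [(0, 1), (0, 2), (1, 2), (2, 0)])
```

For this digraph the equilibrium vector is (0.4, 0.2, 0.4), so pages 0 and 2 are ranked above page 1.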
The expository paper
Langville A.N., Meyer C.D., A survey of eigenvector methods
for Web information retrieval, SIAM Rev., 47(2005), No. 1,
135-161.
contains a survey of both techniques.
From the mathematical point of view, the subject of ranking
individuals or objects by eigenvectors of suitably chosen graph ma-
trices is very old. One of the basic references is the thesis
Wei T.H., The algebraic foundations of ranking theory, Thesis,
Cambridge, 1952.
In particular, the ranking of the participants of a round-robin
tournament can be carried out in that way (see, for example,
Spectra of Graphs, Theory and Application, p. 226).
These methods have been used in sociology for a long time
as well; see, for example,
Bonacich P., Power and centrality: A family of measures, Amer.
J. Soc., 92(1987), 1170-1182.
We reproduce here a relevant result. The following theorem of
T.H. Wei (1952) is noted in Eigenspaces of Graphs, p. 26:
Theorem. Let $N_k(i)$ be the number of walks of length k starting
at vertex i of a non-bipartite connected graph G with vertices
1, 2, . . . , n. Let
$s_k(i) = \frac{N_k(i)}{\sum_{j=1}^{n} N_k(j)}.$
Then, for k → ∞, the vector $(s_k(1), s_k(2), \ldots, s_k(n))^T$ tends
towards the eigenvector corresponding to the index of G.
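Wei's theorem can be observed numerically by counting walks with exact integers. This is a small sketch under the assumptions of the theorem (connected, non-bipartite); the function name and the example graph, a triangle with one pendant vertex, are chosen here for illustration.

```python
def walk_shares(adj, k):
    """Return the vector s_k: normalized numbers of walks of length k."""
    n = len(adj)
    walks = [1] * n  # N_0(i) = 1 for every vertex
    for _ in range(k):
        # N_{k+1}(i) is the sum of N_k over the neighbours of i.
        walks = [sum(walks[w] for w in adj[v]) for v in range(n)]
    total = sum(walks)
    return [count / total for count in walks]

# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 0.
paw = [[1, 2, 3], [0, 2], [0, 1], [0]]
shares = walk_shares(paw, 200)

# Residual check: A s should be (almost) a scalar multiple of s.
image = [sum(shares[w] for w in paw[v]) for v in range(4)]
index_estimate = sum(image)  # equals lambda_1 in the limit, since sum(s) = 1
```

The vertex of degree 3 obtains the largest share, and the residual of A s − λ s is negligible already for k = 200.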
The Fiedler eigenvector - image segmentation
Note that one of the early heuristics for graph bisection uses the
Fiedler vector, i.e. the eigenvector belonging to the second
smallest eigenvalue of the graph Laplacian. This eigenvalue is called
the algebraic connectivity of the graph and was introduced by M.
Fiedler in the paper
Algebraic connectivity of graphs, Czech. J. Math., 23(98)(1973),
298-305.
In the paper
Shi J., Malik J., Normalized cuts and image segmentation, Proc.
IEEE Conf. Computer Vision and Pattern Recognition, 1997, 731-737;
IEEE Trans. Pattern Analysis Machine Intell., 22(2000), 888-905,
it is shown how the Fiedler vector (i.e. the eigenvector associated
to the second smallest eigenvalue of the Laplacian matrix) can be
used to separate the foreground from the background structure in
images. The original procedure has been improved by using the
normalized Laplacian matrix (so as to minimize the normalized
graph cut).
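The bisection heuristic can be sketched without any linear algebra library. The code below uses the unnormalized Laplacian L = D − A and a shifted power iteration; this is a choice made for the illustration, not the method of the cited papers, and it assumes a connected graph whose second smallest Laplacian eigenvalue is simple. Vertices are then split by the sign of their Fiedler vector coordinate.

```python
from math import sqrt

def fiedler_vector(adj, iterations=1000):
    """Approximate the eigenvector of the second smallest eigenvalue of L = D - A."""
    n = len(adj)
    shift = 2.0 * max(len(neighbours) for neighbours in adj)  # >= largest eigenvalue of L
    x = [float(i) for i in range(n)]
    for _ in range(iterations):
        # Remove the component along the all-one vector (eigenvalue 0 of L).
        mean = sum(x) / n
        x = [c - mean for c in x]
        # y = (shift * I - L) x = (shift - deg) x + A x
        y = [(shift - len(adj[v])) * x[v] + sum(x[w] for w in adj[v]) for v in range(n)]
        norm = sqrt(sum(c * c for c in y))
        x = [c / norm for c in y]
    return x

def bisect(adj):
    """Split the vertex set by the signs of the Fiedler vector coordinates."""
    vector = fiedler_vector(adj)
    negative = frozenset(v for v, c in enumerate(vector) if c < 0)
    positive = frozenset(v for v, c in enumerate(vector) if c >= 0)
    return {negative, positive}

# Two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
two_triangles = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
```

For the two triangles joined by an edge, the sign pattern separates the triangles, cutting only the connecting edge.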
Other Laplacian eigenvectors - spectral clustering
We present an algorithm for graph clustering which is
based on the Laplacian matrix of a graph.
Let G be a connected graph on n vertices. The eigenvalues in
non-decreasing order and the corresponding orthonormal eigenvectors of
the Laplacian L = D − A of G are denoted by ν1 = 0, ν2, . . . , νn
and u1, u2, . . . , un, respectively.
In order to construct k clusters in a graph we form an n × k
matrix U containing the vectors u1, u2, . . . , uk as columns. We
obtain a geometric representation $\mathcal{G}$ of G in the
k-dimensional space $R^k$: we just take the rows of U as point
coordinates representing the vertices of G. Edges are straight line
segments between the corresponding points. Now classical clustering
methods (say, the k-means algorithm) should be applied to this new
graph representation.
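The whole procedure fits in a short script. The sketch below makes several assumptions that are not part of the text: the eigenvectors are approximated by subspace iteration on a shifted Laplacian (which needs a gap between the k-th and (k+1)-th eigenvalue), and k-means is started from a farthest-point seeding to keep the run deterministic. All function names are hypothetical.

```python
import random
from math import sqrt

def laplacian_embedding(adj, k, iterations=500, seed=0):
    """Rows of an n x k matrix whose columns approximately span u_1, ..., u_k."""
    n = len(adj)
    shift = 2.0 * max(len(neighbours) for neighbours in adj)  # >= largest eigenvalue of L
    rng = random.Random(seed)
    cols = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        # Multiply each column by shift*I - L; small eigenvalues of L dominate.
        cols = [[(shift - len(adj[v])) * x[v] + sum(x[w] for w in adj[v])
                 for v in range(n)] for x in cols]
        # Gram-Schmidt keeps the columns orthonormal.
        for i in range(k):
            for j in range(i):
                dot = sum(a * b for a, b in zip(cols[i], cols[j]))
                cols[i] = [a - dot * b for a, b in zip(cols[i], cols[j])]
            norm = sqrt(sum(a * a for a in cols[i]))
            cols[i] = [a / norm for a in cols[i]]
    return [[cols[j][v] for j in range(k)] for v in range(n)]

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, rounds=20):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = []
    for _ in range(rounds):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Three triangles in a chain, joined by the edges 2-3 and 5-6.
three_triangles = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5],
                   [3, 4, 6], [5, 7, 8], [6, 8], [6, 7]]
labels = k_means(laplacian_embedding(three_triangles, 3), 3)
```

On the chain of three triangles the points of each triangle lie close together in the embedding, and k-means with k = 3 recovers the triangles as clusters.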
Partial explanation of the efficiency of the algorithm
The following well-known inequality for the Rayleigh quotient
$\nu_1 \le \frac{x^T L x}{x^T x} \le \nu_n$
holds for any non-zero vector x of the corresponding dimension.
Equality holds for relevant eigenvectors.
More generally, the eigenvalue νi is the minimal value of the
Rayleigh quotient of L over the orthogonal complement of the sub-
space generated by eigenvectors u1, u2, . . . , ui−1.
Note that if $x^T = (x_1, x_2, \ldots, x_n)$, then
$x^T L x = \sum_{i \sim j,\ i<j} (x_i - x_j)^2.$
We also have
$\nu = \sum_{i \sim j,\ i<j} (x_i - x_j)^2$
if x is a normalized eigenvector belonging to eigenvalue ν of L.
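The identity for the quadratic form is easy to confirm on a small example. The following sketch evaluates both sides for a graph given by adjacency lists; the two function names are introduced only for this illustration.

```python
def laplacian_quadratic_form(adj, x):
    """x^T L x with L = D - A, computed row by row."""
    return sum(x[v] * (len(adj[v]) * x[v] - sum(x[w] for w in adj[v]))
               for v in range(len(adj)))

def squared_edge_differences(adj, x):
    """Sum of (x_i - x_j)^2 over all edges, each edge counted once (i < j)."""
    return sum((x[v] - x[w]) ** 2
               for v in range(len(adj)) for w in adj[v] if v < w)

path4 = [[1], [0, 2], [1, 3], [2]]
x = [3, -1, 4, 1]
```

For the path on four vertices and x = (3, −1, 4, 1) both expressions give 16 + 25 + 9 = 50.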
The sum of squares of lengths of all edges in the representation
$\mathcal{G}$ of G is equal to ν1 + ν2 + . . . + νk. This is the
minimal value over all representations obtained via a matrix U with
orthonormal columns.
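The equality follows from the quadratic form identity applied to each column of U. A short derivation, writing $r_i$ for the i-th row of U and $u_l(i)$ for the i-th coordinate of $u_l$ (notation introduced here for illustration):

```latex
\sum_{i \sim j,\ i<j} \lVert r_i - r_j \rVert^2
  = \sum_{l=1}^{k} \sum_{i \sim j,\ i<j} \bigl(u_l(i) - u_l(j)\bigr)^2
  = \sum_{l=1}^{k} u_l^T L\, u_l
  = \sum_{l=1}^{k} \nu_l .
```

For an arbitrary n × k matrix U with orthonormal columns the same computation gives the trace of $U^T L U$, and the minimality claim is then the standard trace minimization property of the k smallest eigenvalues of a symmetric matrix (Ky Fan's principle).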
Such an extremal graph representation must have remarkable
properties. It enhances the cluster-properties of the original data
and clusters can now be easily detected.
Random graphs - Complex networks
Complex networks is a common name for various real networks
which are represented by graphs with an enormous number
of vertices. Examples include Internet graphs, phone graphs, e-mail
graphs, social networks and many others. In spite of their diversity
such networks show some common properties.
Several models of random graphs have been used to describe
complex networks, including the classical Erdős-Rényi model, where
each edge exists with the same constant probability. There
are also models in which a given degree distribution is realized.
The main characteristics of complex networks are the degree and
eigenvalue distributions. Both distributions obey a power law of the
form $x^{-\beta}$ for a positive β.
In particular, if $n_k$ denotes the number of vertices of degree k,
then asymptotically $n_k = a k^{-\beta}$ for some constant a.
A network with power law distributions is called scale-free.
An asymptotic distribution of eigenvalues of certain random
symmetric matrices, known as Wigner's semi-circle law, has been
derived in
Wigner E.P., On the distribution of the roots of certain symmetric
matrices, Ann. Math., 67(1958), 325-328.
The Hoffman polynomial - load balancing in multiprocessor systems
Let j be the all-1 vector and J the square all-1 matrix.
Theorem 1. Let G be a connected graph on n vertices with
Laplacian L and distinct Laplacian eigenvalues µ1 = 0, µ2, . . . , µm.
Let h(x) = (x − µ2) · · · (x − µm). Then h(L) = aJ, where
$a = (-1)^{m-1} \mu_2 \cdots \mu_m / n$.
In A-theory a similar result holds only for regular graphs.
Let $H(x) = \frac{1}{a} h(x)$.
We have H(L)x = Jx = βj, where β is the sum of coordinates
of x. If x represents any job distribution, the matrix
$\frac{1}{n} H(L)$ transforms it into a uniform distribution. We can write
$\frac{1}{n} H(L) = \left(I - \frac{1}{\mu_2} L\right) \cdots \left(I - \frac{1}{\mu_m} L\right).$
Introducing vectors $x^{(1)} = x, x^{(2)}, \ldots, x^{(m)}$ by the relations
$x^{(k)} = \left(I - \frac{1}{\mu_k} L\right) x^{(k-1)}, \quad k = 2, \ldots, m$ (1)
we obtain $x^{(m)} = \frac{\beta}{n} j$.
Hence any vector x can be transformed to a scalar multiple of j
using this iteration process, which involves the Laplacian matrix
of the multiprocessor graph G.
This is exactly what is needed in load balancing in
multiprocessors.
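The iteration (1) can be traced on a small example. In the sketch below the distinct non-zero Laplacian eigenvalues are passed in by the caller (for the 4-cycle they are 2 and 4), and rational arithmetic is used so that the result is exact; the function name balance is ours.

```python
from fractions import Fraction

def balance(adj, load, nonzero_eigenvalues):
    """Apply x <- (I - L / mu) x once for every distinct non-zero eigenvalue mu of L."""
    x = [Fraction(value) for value in load]
    for mu in nonzero_eigenvalues:
        # (L x)_v = deg(v) * x_v - sum of x over the neighbours of v
        lx = [len(adj[v]) * x[v] - sum(x[w] for w in adj[v]) for v in range(len(adj))]
        x = [xv - lv / mu for xv, lv in zip(x, lx)]
    return x

# Four processors connected in a cycle; Laplacian eigenvalues 0, 2, 2, 4.
cycle4 = [[1, 3], [0, 2], [1, 3], [2, 0]]
balanced = balance(cycle4, [8, 0, 0, 0], [2, 4])
```

Starting from the load (8, 0, 0, 0), the first step gives (0, 4, 0, 4) and the second gives the uniform load (2, 2, 2, 2), i.e. β/n = 8/4 on every processor.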
Comments and suggestions
Computer networks resistant to the spread of viruses.
We have some results on graphs with a minimal value of the
largest eigenvalue. As we know, such graphs are models of
computer networks resistant to the spread of viruses.
Graphs with a minimal value of the largest eigenvalue in a set
of graphs will be called minimal graphs.
Let deg(v) be the degree of the vertex v. An internal path in
some graph is a path v0, v1, . . . , vk+1 for which deg(v0), deg(vk+1) ≥ 3
and deg(v1) = · · · = deg(vk) = 2 (here k ≥ 0, or k ≥ 2 whenever
vk+1 = v0).
Consider connected graphs with fixed numbers of vertices and
edges.
A minimal graph cannot contain vertices of degree 1 because
deleting a vertex of degree 1 and a simultaneous insertion of a
vertex of degree 2 in the middle of an edge on an internal path
would diminish the largest eigenvalue λ1 (by a result from
Hoffman A.J., Smith J.H., On the spectral radii of topologi-
cally equivalent graphs, Recent Advances in Graph Theory, ed.
M. Fiedler, Academia Praha, 1975, 273-281.)
As a consequence, all edges belong to internal paths. By subdividing
edges in such graphs we can further diminish λ1. However, we
have $\lambda_1 \ge \Delta/\sqrt{\Delta - 1}$, where ∆ is the maximum
vertex degree (as follows from a result in the same paper). Therefore
it is reasonable to choose ∆ as small as possible. Obviously we should
accept ∆ = 3, since graphs with ∆ < 3 are not of much interest in
this context.
Let H be a connected graph with q edges and with vertex degrees
at least 3. For each i = 1, 2, . . . , q we want to subdivide edge i by
inserting $l_i$ vertices of degree 2. Suppose that
$\sum_{i=1}^{q} l_i = e$ is fixed.
In this way we obtain a graph G. An explicit relation between the
largest eigenvalue of G and the quantities $l_1, l_2, \ldots, l_q$ can be found.
A standard procedure with Lagrange multipliers for finding the
minimum of an implicit function leads to the conclusion that under
some reasonable additional assumptions the quantities $l_i$ should be
almost equal (equal if possible). Such a subdivision of a graph is
called a balanced subdivision.
Details will appear in the paper
Belardo F., Li Marzi E.M., Simic S.K., Connected graphs of
fixed order and size with minimal index, to appear.
Indeed many of the examples of minimal graphs found by com-
puter have this property.
Earlier results on the subject are special cases of this result. It
started with a conjecture that a balanced subdivision minimizes
the largest eigenvalue in a cycle with an additional edge. The
problem was solved in a generalized form. There are results for
broken wheels and for trees.
The procedure presented above enables the explicit construction of
minimal tricyclic graphs. The previous result for bicyclic graphs has
been proved in a shorter way.
Our conclusion is that balanced subdivisions of cubic graphs
should be considered as good models of virus-resistant computer
networks.
Signless Laplacian could be useful
We suggest creating models for virus propagation and
the spread of knowledge in which the adjacency matrix would be
replaced by the signless Laplacian. In such models the desirable graphs
for anti-virus protection would be those with small q1, and for R&D
networks those with large q1.
We believe that there are situations in which viruses or knowledge
move along lazy random walks rather than along standard
random walks. This can be expected when vertices that receive
something from their neighbours are likely to respond with some
action.
Integral graphs in load balancing
As defined, a graph is called integral if its spectrum consists
entirely of integers. Each eigenvalue has integral eigenvectors and
each eigenspace has a basis consisting of such eigenvectors.
In integral graphs load balancing algorithms, which use
eigenvalues and eigenvectors, can be executed in integer arithmetic, as
noted in the paper
Cvetkovic D., Davidovic T., Multiprocessor interconnection
networks with small tightness, Internat. J. Foundations Com-
puter Sci., 20(2009), No. 5, 941-963.
The further study of integral graphs in connection to multi-
processor topologies seems to be a promising subject for future
research.
Recall that $3, 1^5, (-2)^4$ is the spectrum of the Petersen graph.
Figure: An eigenvector for eigenvalue 1 and a load balancing flow.
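The spectrum of the Petersen graph can be cross-checked in pure integer arithmetic. For a connected 3-regular graph on 10 vertices with distinct eigenvalues 3, 1, −2, the Hoffman polynomial in A-theory is (x − 1)(x + 2), so one expects A² + A − 2I = J. The sketch below verifies this identity entry by entry; the construction of the graph (outer 5-cycle, spokes, inner pentagram) and the function names are our own choices for the illustration.

```python
def petersen_adjacency():
    """Adjacency matrix of the Petersen graph on vertices 0..9."""
    a = [[0] * 10 for _ in range(10)]
    for i in range(5):
        for u, v in ((i, (i + 1) % 5),            # outer 5-cycle
                     (i, i + 5),                  # spokes
                     (i + 5, (i + 2) % 5 + 5)):   # inner pentagram
            a[u][v] = a[v][u] = 1
    return a

def hoffman_identity_holds(a):
    """Check A^2 + A - 2I = J using integers only."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            square = sum(a[i][k] * a[k][j] for k in range(n))
            if square + a[i][j] - (2 if i == j else 0) != 1:
                return False
    return True

petersen = petersen_adjacency()
```

The identity forces every eigenvalue other than 3 to be a root of x² + x − 2, i.e. 1 or −2, and the trace condition 3 + a − 2b = 0 with a + b = 9 gives the multiplicities 5 and 4.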
Conclusion
A great part of the theory of graph spectra is really used in
computer science.
The theory of graph spectra contains tools which can be applied
in various subtheories of graph theory, although with varying
strength, and one can think of it as a unifying theory for
the whole of graph theory. However, spectral techniques are weak for
some problems, and mathematicians may have doubts about this. In
applications to computer science spectral graph theory is considered
very strong, and perhaps one can say that its unifying mission
for graph theory has been realized through computer science.
A great variety of graph matrices are used depending on the
problem treated.
Due to the enormous number of papers and the variety of fields, it
is really difficult to produce a balanced and comprehensive survey.
Our general suggestion is that mathematicians should react
to the explosion of the number of papers in computer science
which use graph spectra by selecting for their own research
some subjects from, or inspired by, such applications.
Thank you for your attention