DISTRIBUTED RESOURCE ALLOCATION AND PERFORMANCE
OPTIMIZATION FOR VIDEO COMMUNICATION OVER
MESH NETWORKS BASED ON SWARM INTELLIGENCE
A Dissertation
presented to
the Faculty of the Graduate School
University of Missouri-Columbia
In Partial Fulfillment
Of the Requirements for the Degree
Doctor of Philosophy
by
BO WANG
Dr. Zhihai He, Dissertation Supervisor
DECEMBER 2007
The undersigned, appointed by the Dean of the Graduate School, have examined the
dissertation entitled
DISTRIBUTED RESOURCE ALLOCATION AND PERFORMANCE
OPTIMIZATION FOR VIDEO COMMUNICATION OVER
MESH NETWORKS BASED ON SWARM INTELLIGENCE
presented by Bo Wang,
a candidate for the degree of Doctor of Philosophy,
and hereby certify that in their opinion it is worthy of acceptance.
Dr. Zhihai He
Dr. Curt H. Davis
Dr. Guilherme DeSouza
Dr. Justin Legarsky
Dr. Wenjun Zeng
To my loving parents Pingxin Wang and Xianju Cheng,
my caring sister Li Wang,
and my precious wife Xiaoning Lu
ACKNOWLEDGMENTS
On completion of my dissertation, I wish to acknowledge the following people who
helped me along the way.
First, I would like to express my sincere gratitude to my dissertation advisor, Dr. Zhihai
He. His intellectual guidance, keen insight, motivation, and encouragement have been a
great support throughout my research. I would also like to express my gratitude to Dr.
Curt H. Davis, Dr. Guilherme DeSouza, Dr. Justin Legarsky, and Dr. Wenjun Zeng for
agreeing to serve as members of my guidance and doctoral committees, and for their time,
consideration, and suggestions to help improve the quality of my dissertation.
I would like to extend my appreciation to my current labmates at the Video Processing
and Networking Lab, Xi Chen, York Chung, Xiwen Zhao, and Jay Eggert, for their technical
assistance and wonderful friendship. I would also like to thank all my friends in Columbia,
Missouri, who have made my life at the University of Missouri-Columbia such an unforgettable
experience.
I am also thankful to the following former and current staff at the University of Missouri-
Columbia for their various forms of support during my graduate study: Jim Fischer, Shirley
Holdmeier, Betty Barfield, and Kelly Scott.
Last, I would like to acknowledge my family for all their love and support. I especially
would like to acknowledge my parents, Pingxin Wang and Xianju Cheng, and my sister,
Li Wang, for their love, support, encouragement, and understanding in dealing with all the
challenges I have faced in my life. I would especially like to acknowledge my wife, Xiaoning
Lu, for her support, encouragement, understanding, and unwavering love over the past ten
years of my life. Without their love and support, I could not have come this far. Nothing
in a simple paragraph can express the love I have for the four of you.
6.2 Convergence probability under different initial random solutions . . . . . 118
6.3 (a) Convergence of function value for problem H1; and (b) convergence of function value for problem H2 . . . . . 124
6.4 (a) Convergence of function value for problem H3 (n=4); and (b) convergence of function value for problem H3 (n=8) . . . . . 125
ABSTRACT
Mesh networking technologies allow a system of communication devices to communicate
with each other over a dynamic and self-organizing wired or wireless network from anywhere
at any time. Important examples of mesh networking include wireless sensor networks,
multimedia communication over community networks and the Internet, and peer-to-peer
video streaming. Large-scale mesh communication networks involve a large number of
heterogeneous devices, each with different on-board computation speeds, energy supplies,
and communication capabilities, communicating over dynamic and unreliable networks.
How to coordinate the resource utilization behaviors of these devices in a large-scale mesh
network such that each of them operates in a contributive fashion to maximize the overall
performance of the system as a whole remains a challenging task.
Network resource allocation and performance optimization can be formulated as a net-
work utility maximization (NUM) problem under resource constraints. In many mesh net-
working applications, especially in video communication over mesh networks, network utility
maximization is often a high-dimensional, nonlinear, constrained optimization problem. An
effective solution to this type of problem needs to meet three requirements: it must be
distributed, asynchronous, and able to handle non-convex objectives.
In this work, based on swarm intelligence principles, we develop a set of distributed
and asynchronous schemes for resource allocation and performance optimization for a wide
range of mesh networking-based applications. To successfully apply the swarm intelligence
principle in distributed resource allocation and performance optimization in large-scale mesh
networks, there are three important issues that need to be carefully investigated.
First, existing PSO schemes are not able to efficiently handle constraints, especially
constraints in a high-dimensional space. However, in large-scale video mesh networks, the
resource constraints are often represented by a convex region embedded in a very high-
dimensional space. To address this issue, we propose to transform the solution space defined
by the resource constraints into a convex region in a low-dimensional space. We then merge
the convexity condition with the swarm intelligence principle to guide the movement of each
particle to efficiently search for the optimum solution. Second, distributed optimization
requires decomposition of the centralized network utility function and resource constraints into
local ones. However, in video mesh networking, the resource utilization behaviors of
neighboring network nodes are highly coupled and intertwined. In this work, we propose
various methods for decomposing the network utility function and the intertwined resource
constraints. Third, one of the key challenges in resource allocation and performance
optimization is to handle critical or bottleneck links, which have very limited resources but
are shared by multiple video communication sessions. We observe that the resource allocation
results at these critical links have a direct impact on the overall system performance. To
address this issue, we propose various schemes that fuse the resource allocation information
of neighboring optimization modules, propagate and share the resource allocation results at
critical links, and use this external information to guide the movements of particles in each
local optimization module to efficiently search for the optimum solution.
Our extensive experimental results in distributed resource allocation and performance
optimization demonstrate that the proposed schemes work efficiently and robustly. Compared
to existing algorithms, including gradient search and Lagrange optimization, the proposed
approach has the advantage of faster convergence and the ability to handle generic
network utility functions. Compared to centralized performance optimization schemes, the
proposed approach significantly reduces communication overhead while achieving similar
performance. The distributed algorithms for resource allocation and performance optimiza-
tion provide analytical insights and important guidelines for practical design of large-scale
video mesh networks.
Chapter 1
Introduction
Mesh networking technologies allow a system of communication devices to communicate
with each other over a dynamic and self-organizing wired or wireless network from anywhere
at any time. Important examples of mesh networking include wireless sensor networks,
multimedia communication over community networks and the Internet, and peer-to-peer
video streaming. Large-scale mesh communication networks involve a large number of
heterogeneous devices, each with different on-board computation speeds, energy supplies,
and communication capabilities, communicating over dynamic and unreliable networks.
How to coordinate the resource utilization behaviors of these devices in a large-scale mesh
network such that each of them operates in a contributive fashion to maximize the overall
performance of the system as a whole remains a challenging task.
1.1 Overview
There is currently considerable research interest in mesh networks, as illustrated
in Fig. 1.1. A mesh network consists of nodes that communicate with each other and are
capable of hopping radio messages to a base station, where they are passed to a PC or other
client. Each network node also acts as a router, forwarding data packets to other nodes. All
wireless mesh networking systems share a set of common requirements, including low power
consumption, ease of use, scalability, responsiveness, and range. Because of their ad hoc
nature, mesh networks can respond to changing conditions much more quickly and reliably than
static networks. If a node in a mesh network fails, the other nodes in the mesh will notice the
failure and adjust their routing accordingly without human intervention. This resilience,
flexibility, and decentralized administration make mesh networks more attractive than
traditional networking systems.
Figure 1.1: Illustration of mesh networks.
Network resource allocation and performance optimization can be formulated as a net-
work utility maximization (NUM) problem under resource constraints. In many mesh net-
working applications, especially in video communication over mesh networks, network utility
maximization is often a high-dimensional, nonlinear, constrained optimization problem. An
effective solution to this type of problem needs to meet the following three requirements:
(1) Distributed. Lacking a centralized, powerful node for computation, mesh networks are
often unable to support centralized computation. In addition, centralized performance
optimization introduces significant communication overhead and becomes extremely costly or
even infeasible in large-scale mesh networks. The decomposability structure of network utility
maximization leads to the most appropriate distributed algorithm for a given network
resource allocation problem. Decomposition theory naturally provides the mathematical
foundation for the distributed control of networks, helping us obtain the most appropriate
distributed algorithm for a given network resource allocation problem, from distributed
routing and scheduling to power control and congestion control.
(2) Asynchronous. Network nodes communicate with each other in an asynchronous
and on-demand manner. This requires the distributed performance optimization to also
operate in an asynchronous fashion. (3) Non-convex. As the network utility functions in
many video mesh networking applications are highly nonlinear and non-convex, an effective
solution should be able to handle generic non-convex objective functions.
Since Kelly’s work [1] in 1998, the NUM framework has found many applications
in network resource allocation and congestion control algorithms [2, 3, 4, 5], such as
Internet rate allocation and TCP congestion control. These works interpret source rates as
primal variables and link congestion prices as dual variables to solve an implicit global
utility optimization problem or its Lagrange dual problem. Previous research uses price-based
strategies [1, 3, 6, 7, 8], where prices are computed to reflect the relation between resource
demand and supply, to coordinate resource allocation across multiple hops. Their results
show that pricing is an effective method to arbitrate resource allocation. Most papers
in the vast literature on network resource allocation use a standard Lagrange-dual-based
distributed algorithm [1, 3]. However, it is well known that dual-based distributed
algorithms require the utility function to be convex or concave. This is their major drawback
when the application is inelastic or the utility functions are nonconvex, which can lead to
divergence of congestion control. Several existing distributed optimization algorithms based
on incremental subgradient search [9, 10] assume that the objective function is additive and
convex. Moreover, gradient- and subgradient-based distributed algorithms are sensitive to
local optima and saddle points, and also suffer from slow convergence.
1.2 This Work
In this work, based on swarm intelligence principles, we develop a set of distributed
and asynchronous schemes for resource allocation and performance optimization for a wide
range of mesh networking-based applications.
Particle swarm optimization (PSO), developed by Eberhart and Kennedy in 1995 [11,
12], is a population-based stochastic optimization technique inspired by the social behavior
of bird flocking and fish schooling. PSO shares many similarities with evolutionary
computation techniques such as genetic algorithms (GAs). The main advantage of PSO
over other global optimization strategies is that its large population of random candidate
solutions helps the technique avoid becoming trapped in local minima, so PSO imposes few
requirements on the objective function of the optimization problem. In the past several
years, the PSO algorithm has been successfully applied in many research and application
areas, where it has been shown to obtain good results faster and at lower cost than many
other methods.
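The canonical PSO iteration described above can be sketched as follows. This is a minimal, illustrative implementation: the function name, parameter values, and box-constraint handling are our own choices, not the dissertation's.

```python
import random

def pso(f, dim, bounds, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5):
    """Minimize f over a box with a basic global-best particle swarm."""
    lo, hi = bounds
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]                      # per-particle best positions
    pbest_val = [f(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Inertia + cognitive (own best) + social (swarm best) terms.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] = min(hi, max(lo, x[i][d] + v[i][d]))
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val

# Example: minimize the sphere function; the optimum is at the origin.
best, val = pso(lambda p: sum(t * t for t in p), dim=4, bounds=(-5.0, 5.0))
```

Note that no gradient of f is ever evaluated, which is why PSO places so few requirements on the objective function.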
To successfully apply the swarm intelligence principle in distributed resource allocation
and performance optimization in large-scale mesh networks, there are three important issues
that need to be carefully investigated.
First, existing PSO schemes are not able to efficiently handle constraints. To address
this issue, we propose to transform the solution space defined by resource constraints into
a convex region in a low-dimensional space. We then merge the convexity condition with
the swarm intelligence principle to guide the movement of each particle to efficiently search
for the optimum solution. Second, distributed optimization requires decomposition of the
centralized network utility function and resource constraints into local ones. However, in
video mesh networking, the resource utilization behaviors of neighboring network nodes are
highly coupled and intertwined. In this work, we propose various methods for decomposing
the network utility function and the intertwined resource constraints. Third, one of the
key challenges in resource allocation and performance optimization is to handle critical or
bottleneck links, which have very limited resources but are shared by multiple video
communication sessions. To address this issue, we propose various schemes that fuse the
resource allocation information of neighboring optimization modules, propagate and share
the resource allocation results at critical links, and use this external information to guide
the movements of particles in each local optimization module to efficiently search for the
optimum solution.
Our extensive experimental results in distributed resource allocation and performance
optimization demonstrate that the proposed schemes work efficiently and robustly. Compared
to existing algorithms, including gradient search and Lagrange optimization, the proposed
approach has the advantage of faster convergence and the ability to handle generic
network utility functions. Compared to centralized performance optimization schemes, the
proposed approach significantly reduces communication overhead while achieving similar
performance. The distributed algorithms for resource allocation and performance optimiza-
tion provide analytical insights and important guidelines for practical design of large-scale
video mesh networks.
1.3 Major Contributions of the Research
This section presents several evolutionary schemes based on swarm intelligence that
solve different performance optimization problems over wireless sensor networks, including
nonlinear, nonconvex optimization problems that cannot be solved by the Lagrange dual
algorithm; moreover, the proposed algorithms have fast convergence speed.
1.3.1 Convex Mapping of High-dimensional Resource Constraints
There is a significant body of research on performance optimization of wireless sensor
networks, including energy minimization, rate allocation, and topology control [13, 14].
These results and algorithms are generic in nature and could be used to improve the
performance of wireless video sensor networks (WVSNs). However, they do not consider the
unique characteristics of WVSNs, such as the complex nonlinear resource utilization
behavior of each sensor node, which can render these analyses and algorithms inefficient or
even impractical.
In this dissertation, we first consider the unique characteristics of WVSNs and develop an
evolutionary optimization scheme using the swarm intelligence principle to solve the WVSN
performance optimization problem. We transform the solution space defined by the flow-balance
and energy constraints into a convex region in a low-dimensional space. Our analysis shows
that this transform reduces the computational complexity and removes the interdependence
between the control variables. We then merge the convexity property of the new solution
space with the original swarm intelligence principle to guide the movement of each particle
so that it automatically satisfies the constraints during the evolutionary optimization
process. Our experimental results demonstrate that the proposed performance optimization
scheme is very efficient.
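The actual low-dimensional mapping is derived from the WVSN flow-balance and energy constraints in the chapters that follow. As a generic illustration of keeping particles inside a convex resource region, the sketch below (our own construction, using a hypothetical single total-budget constraint, not the dissertation's exact transform) projects a candidate solution onto the set {y : y >= 0, sum(y) <= budget}:

```python
def project_capped_simplex(x, budget):
    """Euclidean projection of x onto {y : y >= 0, sum(y) <= budget}."""
    # Clip to the nonnegative orthant first; if the budget already holds,
    # the clip is the exact projection.
    clipped = [max(v, 0.0) for v in x]
    if sum(clipped) <= budget:
        return clipped
    # Otherwise project onto {y >= 0, sum(y) == budget} with the standard
    # sort-based method: find the shift theta that makes the sum equal budget.
    u = sorted(x, reverse=True)
    cumsum, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumsum += ui
        t = (cumsum - budget) / i
        if ui - t > 0.0:
            theta = t
    return [max(v - theta, 0.0) for v in x]
```

A PSO variant can apply such a projection after every position update, so that each particle automatically satisfies the constraints, in the spirit of the scheme described above.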
1.3.2 Distributed Optimization over Wireless Sensor Networks
Recently, several distributed optimization algorithms based on gradient search have been
proposed in the literature [9, 15, 10]. Most existing approaches assume that the objective
function is additive and convex. Otherwise, it is very difficult to assure convergence of the
distributed gradient search algorithm, which is also sensitive to local optima and saddle
points. In addition, existing algorithms suffer from slow convergence.
In this dissertation, we develop an evolutionary distributed optimization scheme using
the swarm intelligence principle [16], called decentralized particle swarm optimization (DPSO),
to solve the distributed WSN optimization problem. Based on the particle swarm intelligence
principle, sensor nodes share information with each other through local information
exchange and communication to solve a joint estimation or optimization problem. The
proposed DPSO scheme has low communication energy cost and assures fast convergence.
In addition, the objective function does not need to be convex. We use source localization
as an example to demonstrate the efficiency and evaluate the performance of the proposed
DPSO algorithm. The main advantages of the proposed distributed algorithm are that it is
not sensitive to local optima or saddle points and that it converges very quickly compared
with distributed gradient search. Our experimental results demonstrate that the proposed
optimization scheme is very efficient and outperforms existing distributed gradient-based
optimization schemes.
1.3.3 Distributed Rate Allocation for Video Mesh Networks
To successfully deploy video mesh networking technology, a number of issues need to
be carefully investigated, including packet routing, flow control, quality-of-service
guarantees, resource allocation, and performance optimization. Within the context of
large-scale mesh networks, a distributed and asynchronous solution to resource allocation
and performance optimization is highly desired.
This dissertation presents a distributed and asynchronous particle swarm optimization
(DAPSO) evolutionary technique for network utility maximization problems. Unlike many
network rate allocation and performance optimization algorithms that can only handle
convex network utility functions, the proposed scheme is able to handle generic nonlinear,
nonconvex network utility functions, and it converges very quickly compared with the
gradient-based Lagrange dual algorithm. The DAPSO algorithm involves two steps: (1)
decompose the global resource parameters and network utility function; and (2) handle
interdependent resource constraints, such as bottleneck links. We use a specific rate
allocation and performance optimization problem in wireless video sensor networks as an
example to demonstrate the efficiency and performance of the proposed scheme and compare
it with other algorithms, such as the gradient-based Lagrange dual algorithm. Our
simulation results demonstrate that the proposed performance optimization scheme is very
efficient and can significantly enhance the network utility.
1.3.4 Distributed Resource Allocation for Wireless Video Sensor
Networks
In WVSNs, the two major operations on each video sensor are video compression and
wireless transmission, and wireless transmission is further restricted by the available
transmission bandwidth. In this dissertation, we analyze the power-rate-distortion (P-R-D)
model of video encoding and its distortion performance, and we also analyze the
transmission power consumption of wireless video communication and its impact on video
quality. Based on these models, we develop a distributed and asynchronous algorithm for
energy-efficient resource allocation and performance optimization over wireless video
sensor networks.
In this dissertation, we present an energy-efficient distributed and asynchronous PSO
(EEDAPSO) algorithm to solve the resource allocation and performance optimization
problem over wireless video sensor networks. We first decompose the original optimization
problem into several sub-problems, and then design an algorithm to handle the
interdependent constraints and perform the optimization. Our simulation results demonstrate
that, compared with the centralized algorithm, the proposed distributed and asynchronous
resource allocation and performance optimization scheme is very efficient and requires only
very low communication cost.
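The P-R-D model itself is developed later in the dissertation; the sketch below only illustrates the kind of encoding/transmission trade-off involved, using an assumed exponential encoding-distortion term and a placeholder transmission term. All function names, parameter names, and values here are hypothetical, not the dissertation's model.

```python
def total_distortion(rate, p_enc, p_tx, sigma2=1.0, lam=0.8, gamma=2.0 / 3.0,
                     k_tx=0.5):
    """Illustrative power-rate-distortion trade-off (assumed form)."""
    # Encoding distortion falls exponentially in rate scaled by a power
    # of the encoding power; the transmission term shrinks as p_tx grows.
    d_enc = sigma2 * 2.0 ** (-lam * rate * p_enc ** gamma)
    d_tx = k_tx / (1.0 + p_tx)      # placeholder channel-induced distortion
    return d_enc + d_tx

def best_power_split(rate, p_budget, steps=100):
    """Grid-search how to split a power budget between encoding and sending.

    A stand-in for the optimization the chapter develops; it returns the
    (distortion, encoding power) pair with the lowest total distortion."""
    best = None
    for i in range(1, steps):
        p_enc = p_budget * i / steps
        d = total_distortion(rate, p_enc, p_budget - p_enc)
        if best is None or d < best[0]:
            best = (d, p_enc)
    return best
```

Under such a model, spending the entire budget on either encoding or transmission is rarely optimal; an interior optimum of this kind is what the distributed algorithm searches for across the network.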
1.3.5 Evaluation of PSO Algorithm
Random search techniques are convergent algorithms for constrained nonlinear problems.
Based on this, we provide a mathematical convergence analysis of the PSO algorithm.
We prove that PSO is a locally convergent algorithm, meaning that after a predefined
number of function iterations, all solutions converge to an optimum that is not guaranteed
to be the global optimum. However, experimental results on many difficult optimization
problems show that the large number of random solutions in the PSO algorithm makes the
technique converge to the global optimum with high probability. We also compare the PSO
algorithm with the genetic algorithm (GA) and the Broyden-Fletcher-Goldfarb-Shanno
(BFGS) quasi-Newton algorithm in terms of global search capability on a suite of difficult
analytical optimization problems; the experimental results show that the PSO algorithm
has a higher probability of converging to the global optimum.
1.4 Dissertation Organization
Following the Introduction, the dissertation is organized as follows.
Chapter 2 first presents a background introduction and related work for network
resource allocation optimization problems. It then introduces several optimization
algorithms commonly used for resource allocation problems, including convex optimization,
Lagrange duality, and gradient and subgradient methods, together with a brief discussion
of these algorithms. An algorithm based on swarm intelligence, called particle swarm
optimization (PSO), is also introduced; PSO has many attractive features, including ease of
implementation and the fact that it requires no gradient information, and its social
behavior and applications are also discussed.
Chapter 3 first presents an evolutionary optimization scheme using the swarm intelligence
principle to solve the WVSN performance optimization problem. The algorithm transforms
the original high-dimensional constrained optimization problem into an unconstrained
low-dimensional problem that can be solved by the PSO algorithm. It then proposes a
distributed algorithm, called decentralized PSO (DPSO), to solve generic parameter
estimation and performance optimization problems over WSNs. The proposed DPSO scheme
imposes no convexity requirement on the objective function and is not sensitive to local
optima or saddle points. A source localization application demonstrates the efficiency of
the DPSO algorithm.
Chapter 4 presents a distributed and asynchronous particle swarm optimization (DAPSO)
evolutionary technique for network utility maximization problems. The proposed DAPSO
algorithm solves the rate allocation and performance optimization problem for video
communication over mesh networks very efficiently. The DAPSO algorithm is simple yet
powerful, converges quickly, and can also efficiently solve generic nonconvex optimization
problems that cannot be solved by the traditional Lagrange dual algorithm.
Chapter 5 presents an energy-efficient distributed and asynchronous PSO (EEDAPSO)
algorithm to solve the resource allocation and performance optimization problem over
wireless video sensor networks, including rate allocation and power management. The
proposed EEDAPSO algorithm considers both encoding distortion and transmission distortion
in the overall video quality over the WVSN, and solves the original optimization problem
in a distributed and asynchronous way.
Chapter 6 presents a convergence analysis of the PSO algorithm based on random search
techniques. The PSO algorithm is proved to be locally convergent, but the large number of
random solutions in the PSO algorithm makes the technique converge to the global optimum
with high probability. The PSO algorithm is also compared with the genetic algorithm (GA)
and the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton algorithm in terms of
global search capability on a suite of difficult analytical optimization problems.
Finally, Chapter 7 summarizes the major contributions of the research and provides
some directions for possible future work.
Chapter 2
Background and Related Work
This chapter presents a background introduction to network resource allocation
optimization problems and discusses several optimization algorithms and their applications.
We first present a brief introduction to network utility maximization (NUM), which underlies
most network resource allocation problems, and review current applications of NUM as well
as resource allocation optimization problems in wireless mesh networks. Next, we introduce
several optimization algorithms commonly used for resource allocation problems, such as
convex optimization, Lagrange duality, and gradient and subgradient methods, and discuss
the flow control problem solved with these kinds of algorithms. An algorithm based on
swarm intelligence, called particle swarm optimization (PSO), is introduced afterwards;
PSO has many attractive features, including ease of implementation and the fact that it
requires no gradient information, and its social behavior and applications are also
discussed.
2.1 Network Resource Allocation
Most network resource allocation problems can be formulated as the constrained
optimization of some network utility function. Normally, a network resource allocation
optimization problem can be addressed efficiently on three levels: theoretical properties,
such as the global optimality guaranteed by convex optimization; computational properties,
such as centralized algorithms; and decomposability properties, such as dual optimization
[17, 18].
2.1.1 Overview
Network utility maximization (NUM) problems provide an important approach to
network resource allocation, such as rate allocation and power management. The
decomposability structure of network utility maximization leads to the most appropriate
distributed algorithm for a given network resource allocation problem. Distributed solutions
are particularly attractive in large-scale networks, where centralized solutions are
infeasible, too costly, too fragile, or inapplicable [17].
Decomposition theory naturally provides the mathematical foundation for the distributed
control of networks. It helps us obtain the most appropriate distributed algorithm for a
given network resource allocation problem, from distributed routing and scheduling to
power control and congestion control.
Since Kelly’s work [1] in 1998, the NUM framework has found many applications in
network resource allocation and congestion control algorithms [2, 3, 4, 5], such as Internet
rate allocation and TCP congestion control. Traffic from such applications is elastic, a
typical example being TCP traffic over the Internet. Other examples include the available
bit rate (ABR) service, which enables maximal link utilization over ATM networks
[19, 20, 21]. These works interpret source rates as primal variables and link congestion
prices as dual variables to solve an implicit global utility optimization problem or its
Lagrange dual problem. Network flow control has also been studied in the context of
congestion avoidance in multihop networks using max-flow min-cost theorems [22, 23].
Furthermore, the allocation of limited network resources, such as network bandwidth and
transmission power, can also be formulated as basic NUM problems.
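In its basic form, following Kelly [1], the NUM problem discussed above can be written as

```latex
\begin{aligned}
\max_{x_s \ge 0} \quad & \sum_{s \in S} U_s(x_s) \\
\text{subject to} \quad & \sum_{s:\, l \in L(s)} x_s \le c_l \quad \text{for every link } l,
\end{aligned}
```

where x_s is the transmission rate of source s, U_s its utility function, L(s) the set of links on its path, and c_l the capacity of link l.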
Meanwhile, the NUM framework has also been substantially extended from Internet
congestion control to a general approach for understanding interactions across layers. An
optimization framework for layered structures is introduced in [24]. Cross-layer
optimization can be applied to enable a clean-slate design of the protocol stack. The
algorithm relates the different layers of the protocol stack and couples them through a
limited amount of information passed back and forth. Different layers iterate on different
subsets of variables at different time scales, using local information to achieve individual
optimization; these local algorithms collectively achieve a global optimization objective.
Many network resource allocation problems can be formulated as the constrained
maximization of some utility function. The key problem here is how the available bandwidth
within the network should be shared among the controllable traffic flows. A distributed and
asynchronous optimization scheme is particularly attractive in large-scale broadband
networks, where a centralized optimization scheme is infeasible, too fragile, and too costly.
There is much research on network utility performance optimization in many applications
using distributed optimization schemes. Primal and Lagrangian dual algorithms
[1, 25, 3, 4, 26] based on distributed schemes have been proposed for many applications
to solve the utility function optimization problem. Standard textbooks [27, 28, 29]
summarize the mathematics of distributed computation and optimization techniques.
In the dual-based distributed algorithm, the Lagrange dual variables can be interpreted
as shadow prices for network resource allocation. Each link l calculates a "price" pl for
a unit of bandwidth at link l based on the local source rates, and source s is regulated
by the total price ps = Σ pl, where the sum is taken over all links that source s traverses.
Based on these prices, source s chooses a transmission rate xs that maximizes its own
benefit Us(xs) − ps · xs. These kinds of algorithms require the utility function to be strictly
convex or concave, a property that readily leads to a distributed algorithm converging to
a globally optimal resource allocation. Normally, however, the utility functions arising in
video data processing are nonconvex, and such optimization problems cannot be handled
properly by the previous distributed optimization algorithms. In addition, existing
algorithms based on gradient projection suffer from slow convergence.
Most papers in the vast literature on network resource allocation use a standard
Lagrangian dual-based distributed algorithm. It is well known, however, that dual-based
distributed algorithms require the utility function to be convex or concave. This is the major
drawback of dual-based distributed algorithms: when the application is inelastic or the
utility functions are nonconvex, the congestion control may diverge.
2.1.2 Resource Allocation Over Mesh Networks
Wireless mesh networks (WMNs) are formed by dynamically self-organized and self-
configured wireless nodes that use multi-hop wireless relaying. WMNs have emerged as a
key technology for next-generation wireless networking, owing to several advantages: (1)
inexpensive network infrastructure, (2) ease of network deployment, and (3) broadband
data support. WMNs have attracted significant research interest and inspired numerous
applications, including community meshes, vehicular platoons, and home entertainment
networks [30, 31]. However, there remain significant challenges in the deployment and
optimization of efficient video streaming over such wireless mesh networks, due to limited
network resources and network dynamics.
A wireless mesh network consists of a large number of wireless nodes spread across a
geographical area without the help of a fixed infrastructure. Each node in the network
forwards packets for its peer nodes, and each end-to-end flow traverses multiple hops of
wireless links from a source to a destination. Because of its ad hoc nature, a wireless mesh
network can respond to changing conditions much more quickly and reliably than a static
network. Ad hoc networks also inherit the traditional problems of communication networks,
such as bandwidth optimization, power control, and transmission quality enhancement.
In wired networks, flows contend only at the router that performs flow scheduling. In
contrast, in multihop wireless networks, flows also contend for the shared channel if they
are within each other's interference ranges. This raises the problem of optimizing resource
allocation with respect to both resource utilization and fairness across contending multihop
flows [32].
Previous research uses price-based strategies [1, 3, 6, 7, 8], where prices are computed to
reflect the relation between resource demands and supplies and to coordinate the resource
allocations at multiple hops; the objective functions there are required to be strictly convex.
These results show that pricing is an effective method to arbitrate resource allocation.
The basic idea is to set prices on mutually contending links based on their congestion,
with the goal of allocating transmission rates so that the total network utility is maximized.
A shadow price is associated with each link to reflect the relation between the traffic load
of the individual link and its bandwidth capacity, and a utility is associated with each
end-to-end flow to reflect its resource requirement. Transmission rates are chosen in
response to the aggregated price along every flow such that the total benefit of the flows
is maximized. A distributed algorithm is obtained by forming the Lagrangian dual of the
optimization problem and thereby decomposing it. Other research uses gradient and
subgradient search algorithms to handle distributed optimization problems in sensor
networks [9, 10].
Since deployment of such an ad hoc network is fast and flexible, supporting real-time
media streaming over ad hoc networks is very attractive, and recently there has been
considerable research interest in video streaming over ad hoc networks [33, 34, 35, 36, 37,
38, 39]. When there are multiple video streams in the network, these streams share and
compete for common resources, such as bandwidth and transmission power. Rate allocation
in this case serves the purpose of resource allocation among the streams (see Fig. 2.1).
A video stream utilizes more network resources and achieves better video quality when it
operates at a higher source rate. The rate allocation algorithm should therefore be fair
and efficient among the streams. A distributed algorithm is desirable and efficient because
the computational burden is shared by all participating nodes, and the solution can be
easily adjusted when network conditions fluctuate.
2.2 Previous Optimization Techniques
In this section, we review several optimization techniques commonly used in resource
allocation problems, such as convex optimization, Lagrange duality, and gradient and
subgradient methods, followed by a brief discussion.
Figure 2.1: Video Streaming over Ad hoc networks.
2.2.1 Optimization Algorithms
I. Convex Optimization
Since the 1940s, a large effort has been put into developing algorithms for solving various
classes of optimization problems, analyzing their properties, and developing implementations.
Mathematical optimization is used in many applications, for example as an aid to a human
decision maker, system designer, or system operator [40].
Convex optimization methods are widely used in the design and analysis of communication
networks and signal processing algorithms. A convex optimization problem can be
formulated as follows [40]:

min. f0(x) (2.1)
s.t. fi(x) ≤ 0, i = 1, ..., m

where the functions f0, ..., fm : R^n → R are convex, i.e., they satisfy

fi(αx + βy) ≤ αfi(x) + βfi(y) (2.2)

for all x, y ∈ R^n and all α, β ∈ R with α + β = 1, α ≥ 0, and β ≥ 0.
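The defining inequality (2.2) can be checked numerically for a candidate function. The following is a small illustrative sketch (not part of the dissertation's method), assuming NumPy is available; a sampled check of this kind can falsify convexity but never prove it:

```python
import numpy as np

# Numerically probe inequality (2.2) for f(x) = ||x||^2, a standard
# convex function. This is a sanity check on random samples, not a proof.
def is_convex_on_samples(f, n_dim=3, n_trials=1000, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(n_trials):
        x, y = rng.normal(size=n_dim), rng.normal(size=n_dim)
        a = rng.uniform()          # alpha in [0, 1], beta = 1 - alpha
        lhs = f(a * x + (1 - a) * y)
        rhs = a * f(x) + (1 - a) * f(y)
        if lhs > rhs + 1e-12:      # inequality (2.2) violated
            return False
    return True

f_quadratic = lambda x: float(np.dot(x, x))
print(is_convex_on_samples(f_quadratic))  # True for the convex quadratic
```

Replacing the quadratic by a concave function such as −||x||² makes the check fail almost immediately, which is exactly the behavior inequality (2.2) rules out.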
Convex optimization techniques are very important in engineering applications because,
for a convex problem, a local optimum is also the global optimum [40]. This is the theoretical
foundation for the gradient search and Lagrange dual algorithms used in distributed
optimization.

Convex optimization problems form the largest class of optimization problems that can be
solved efficiently, whereas nonconvex optimization problems are generally difficult. There
are now many software packages that can generate accurate and reliable solutions without
the headaches of initialization and step-size selection, or the risk of being trapped in a
local optimum [41]. Once an engineering problem can be formulated as a convex optimization
problem, it is reasonable to consider it solvable.
II. Lagrange Duality
Convex optimization problems have useful Lagrange duality properties. The original
optimization problem in Eq. (2.1) is associated with the Lagrangian function

L(x, λ) = f0(x) + Σ_{i=1}^{m} λi fi(x) (2.3)

where the λi are the dual variables.

The original optimization problem is referred to as the primal problem, and the so-called
dual function g(λ) is defined as the minimum of the Lagrangian over x:

g(λ) = min_{x∈R^n} L(x, λ) (2.4)

Notice that, even if the original optimization problem is not convex, the dual function
g(λ) is always concave because it is a pointwise minimum of a family of functions that are
affine in λ [41]. A dual variable λ is dual feasible if λ ≥ 0.

For any primal feasible x and dual feasible λ, it turns out that f0(x) ≥ g(λ); that is, the
dual function value g(λ) is always a lower bound on the primal function value f0(x). The
best such lower bound on the optimal value f* of the original problem is obtained by
solving the dual optimization problem:

max. g(λ) (2.5)
s.t. λ ≥ 0
Here the primal optimal value f* is achieved at an optimal solution x*, i.e., f* = f0(x*).
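As an illustrative toy instance (an assumption for this sketch, not taken from the text), consider f0(x) = x² with the single constraint f1(x) = 1 − x ≤ 0. Minimizing the Lagrangian L(x, λ) = x² + λ(1 − x) over x gives the dual function g(λ) = λ − λ²/4, and the weak duality bound f0(x) ≥ g(λ) can be checked numerically:

```python
import numpy as np

# Toy problem: minimize f0(x) = x^2 subject to f1(x) = 1 - x <= 0.
# The Lagrangian x^2 + lam*(1 - x) is minimized at x = lam/2, giving
# the dual function g(lam) = lam - lam^2/4, which is concave in lam.
f0 = lambda x: x ** 2
g = lambda lam: lam - lam ** 2 / 4.0

rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.uniform(1.0, 5.0)      # primal feasible: x >= 1
    lam = rng.uniform(0.0, 10.0)   # dual feasible: lam >= 0
    assert f0(x) >= g(lam)         # weak duality: g(lam) <= f0(x)

# This problem is convex, so strong duality holds: both optimal values
# equal 1, attained at x* = 1 and lam* = 2.
print(f0(1.0), g(2.0))  # 1.0 1.0
```

Because the toy problem is convex, the maximal lower bound coincides with f*, which is exactly the strong duality situation assumed later in this chapter.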
The basic idea behind the decomposability structures that lead to distributed algorithms
is to decompose the original large optimization problem into distributively solvable
subproblems, coordinated by a high-level master problem by means of some kind of signaling
[27, 28, 29, 42], as illustrated in Fig. 2.2. Most existing decomposition techniques can
be classified into primal decomposition and dual decomposition [17].
Figure 2.2: Decomposition of the original optimization problem.
In a primal decomposition, the master problem allocates the existing resources by directly
giving each subproblem the amount of resources it can use. In a dual decomposition, the
master problem sets prices for the resources, and each subproblem decides the amount of
resources to use based on those prices.
When the original problem has a coupling constraint, dual decomposition applies to
problems of the form

max. Σ_i fi(xi) (2.6)
s.t. Σ_i hi(xi) ≤ b, xi ∈ Xi

where Xi is the feasible set of xi. If the coupling constraint in (2.6) is absent, the problem
decouples directly and can be handled by primal decomposition. Applying the Lagrangian
duality properties, problem (2.6) is relaxed to

max. Σ_i fi(xi) − λ^T (Σ_i hi(xi) − b) (2.7)
s.t. xi ∈ Xi

Then the dual problem is solved by updating the dual variable λ:

min. g(λ) = Σ_i gi(λ) + λ^T b (2.8)
s.t. λ ≥ 0

where gi(λ) is the maximum value of the i-th Lagrangian subproblem in (2.7) for a given λ.
If the dual function g(λ) is differentiable, the dual problem in (2.8) can be solved using
the gradient methods described in the following section.
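The decomposition can be made concrete on an assumed toy instance of (2.6) with two log-utility subproblems, fi(xi) = log xi, hi(xi) = xi, and b = 1 (these choices are illustrative, not from the text); each subproblem max log xi − λxi then has the closed-form solution xi = 1/λ:

```python
# Dual decomposition sketch for a toy instance of (2.6):
# maximize log(x1) + log(x2) subject to x1 + x2 <= b, with b = 1.
# Each subproblem max log(xi) - lam*xi has the closed form xi = 1/lam,
# and the master problem (2.8) is solved by a projected price update.
b, lam, step = 1.0, 1.0, 0.1
for _ in range(500):
    x = [1.0 / lam] * 2                          # solve both subproblems
    lam = max(lam + step * (sum(x) - b), 1e-6)   # raise price if demand > b

print(round(x[0], 3), round(lam, 3))  # -> 0.5 2.0 (fair split, price 2)
```

The price λ acts exactly as the signaling between the master problem and the subproblems in Fig. 2.2: no subproblem needs to know the other's objective, only the current price.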
III. Gradient and Subgradient Methods

Lagrange duality properties can lead to decomposability structures [17]. After performing
a decomposition, the objective function of the optimization problem may or may not be
differentiable. Gradient and subgradient methods are popular techniques for iteratively
solving optimization problems with differentiable and nondifferentiable objectives,
respectively. These methods are widely used and very convenient because of their simplicity,
low memory requirements, and amenability to parallel implementation [43, 28].

Consider a general concave maximization problem over a convex constraint set:

max. f0(x) (2.9)
s.t. x ∈ S

Both the gradient and subgradient projection methods generate a sequence of feasible points
{x(t)} as follows:

x(t + 1) = [x(t) + α(t)p(t)]_S (2.10)

where p(t) is a gradient or subgradient of f0 at x(t), α(t) is a positive step size, and [·]_S
denotes the projection onto the feasible convex set S. However, there is a difference between
gradient and subgradient methods. An iteration of the subgradient method may not improve
the objective value, as happens with a gradient method; instead, for a sufficiently small
step size α(t), the distance from the current point x(t) to the optimal solution x* decreases,
which is what makes the subgradient method converge.
There are many results on the convergence of gradient and subgradient methods under
different step-size rules. For a constant step size α, which is more convenient for distributed
algorithms, the gradient method converges to the optimal value provided the step size is
sufficiently small, whereas the subgradient method is only guaranteed to converge to within
some range of the optimal value; in other words, the subgradient method finds an
ε-suboptimal point within a finite number of steps. The major drawbacks of gradient and
subgradient methods are their slow convergence and their sensitivity to local optima and
saddle points when the objective function is not strictly convex or concave.
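The projected update (2.10) with a constant step size can be illustrated on a one-dimensional nondifferentiable example (an assumed toy problem, chosen for illustration): maximize f0(x) = −|x − 3| over S = [0, 2], whose constrained maximizer is x* = 2.

```python
# Projected subgradient ascent (2.10) on the nondifferentiable concave
# objective f0(x) = -|x - 3| over the feasible interval S = [0, 2].
def project(x, lo=0.0, hi=2.0):
    return max(lo, min(hi, x))   # [.]_S : Euclidean projection onto S

x, alpha = 0.0, 0.01             # constant step size
for t in range(1000):
    p = 1.0 if x < 3 else -1.0   # a subgradient of -|x - 3| at x
    x = project(x + alpha * p)   # update rule (2.10)

print(x)  # -> 2.0, the point of S closest to the unconstrained optimum 3
```

Note that the iterate marches toward the unconstrained optimum and is held at the boundary by the projection; with a constant step size, the guarantee is only convergence to within a neighborhood proportional to α, which here collapses onto x* because the boundary is absorbing.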
2.2.2 Applications
I. Basic Model of Flow Control
Let us consider a generic flow control optimization problem over a large-scale communication
network with a set L = {1, ..., L} of logical links, where link l ∈ L has capacity cl. The
communication network can be wired or wireless. The whole network is shared by a set
S = {1, ..., S} of sources. Each source s emits one flow and transmits at a source rate xs
satisfying ms ≤ xs ≤ Ms (ms ≥ 0 and Ms < ∞ are the minimum and maximum transmission
rates), uses a set L(s) ⊆ L of links in its path, and has a utility function Us(xs). Each link
l is shared by the set S(l) = {s ∈ S | l ∈ L(s)} of sources.

The flow control optimization problem can be formulated as maximizing the network utility
Σ_s Us(xs) over the S source rates, subject to the linear network flow constraints
Σ_{s∈S(l)} xs ≤ cl. A centralized scheme for solving this optimization problem would require
knowledge of all utility functions, which is not realistic in many applications. The existing
distributed optimization algorithms based on the incremental subgradient method and the
traditional Lagrangian dual distributed algorithm require the utility function U(x) to be
strictly concave or convex, which cannot be satisfied in many practical applications; these
algorithms need such convexity properties to prove convergence to the global optimum.
The NUM optimization model can be formulated as

max. Σ_{s∈S} Us(xs) (2.11)
s.t. Σ_{s∈S(l)} xs ≤ cl, l = 1, ..., L

where Us(·) are the utility functions; in general, they are nonlinear and nonconvex. Existing
distributed optimization algorithms based on incremental subgradient search [9, 10] assume
that the objective function U(x) is additive and convex, and the algorithms based on
price-based Lagrangian dual flow control optimization [1, 25, 3] assume that the objective
function is increasing and strictly concave.
II. Dual Optimization
A dual optimization approach is appropriate when the problem has a coupling constraint
such that the optimization problem can be decomposed into several subproblems. Normally,
dual optimization uses Lagrange duality properties. The basic idea of Lagrange duality is
to relax the original optimization problem in (2.11) by transferring the constraints to the
objective in the form of a weighted sum. Define the Lagrangian as

L(x, λ) = Σ_s Us(xs) − Σ_l λl (Σ_{s∈S(l)} xs − cl)
        = Σ_s (Us(xs) − xs Σ_{l∈L(s)} λl) + Σ_l λl cl (2.12)
The maximization step of the Lagrangian dual problem is

max_x L(x, λ) = Σ_s (Us(xs) − xs λs) + Σ_l λl cl (2.13)
s.t. xs ∈ M, s = 1, ..., S

where M = [ms, Ms] denotes the range in which xs must lie, and λs = Σ_{l∈L(s)} λl.
The dual problem is solved by updating the dual variables λs:

min_λ D(λ) = Σ_s Ds(λs) + Σ_l λl cl (2.14)
s.t. λ ≥ 0

where Ds(λs) is the maximum value of the s-th term of the Lagrangian in (2.13) for a given
λ. This approach gives appropriate results only if strong duality holds, which means the
original optimization problem should be convex.
Here we can interpret λl as the price per unit bandwidth at link l; then λs is the total price
per unit bandwidth over all links in the path of source s, xs·λs represents the bandwidth
cost to source s, and Ds(λs) represents the maximum benefit source s can achieve at the
given price λs [3]. The total price λs summarizes all the congestion information that source
s needs to know. For a given λ, a unique maximizer exists when the utility function is
strictly concave. The important point is that, for a given λ, each individual source s can
solve its subproblem in (2.13) separately, without needing any information about the other
sources.
If the dual function is differentiable, the dual problem in (2.14) can be solved using the
gradient projection method, in which the link price λl is adjusted as follows:

λl(t + 1) = [λl(t) − α ∂D/∂λl (x(t), λ(t))]* (2.15)

where α is the step size, [x]* = max{x, β}, and β is the link price lower bound. D is
continuously differentiable, with

∂D/∂λl (x(t), λ(t)) = cl − Σ_{s∈S(l)} xs(t) = cl − xl(t) (2.16)

where xl is the aggregate source rate at link l. Substituting (2.16) into (2.15), we obtain
the price adjustment equation for link l ∈ L:

λl(t + 1) = [λl(t) + α(xl(t) − cl)]* (2.17)
In terms of demand and supply: if the aggregate source rate demand at link l is less than
the supply cl, the price λl is reduced; otherwise it is increased. The price adjustment is
completely distributed and can be implemented using only local information. In each
iteration, source s solves its subproblem individually and propagates the result xs(λ) to
all links on its path; meanwhile, link l updates its price λl based on the current source
rate demand and propagates the new price to the sources whose paths traverse this link.
The dual variables (prices) λ converge to the dual optimum, and the primal variables
(source rates) x also converge to the optimal solution.
The basic dual optimization algorithm can be summarized as follows:

• Step 1: Initialize λl(0) to some positive value for all links.

• Step 2: Each source s locally solves its subproblem when it receives the total price λs
along its path, and propagates the new source rate xs to the network.

• Step 3: Each link l updates its price λl when it receives all rates xs that traverse the
link, and propagates the new price to all sources whose paths include this link.

• Step 4: Repeat Steps 2 and 3 until the termination criterion is satisfied.
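The four steps above can be sketched for a small assumed topology (an illustration only: two log-utility sources, with source 0 on link 0 and source 1 on links 0 and 1, all capacities 1). With Us(x) = log x, the Step 2 subproblem max log x − λs·x has the closed form x = 1/λs:

```python
# Sketch of the basic dual (price-based) flow control algorithm.
# Topology and utilities are assumptions for this toy illustration.
paths = {0: [0], 1: [0, 1]}           # source -> links on its path
cap = [1.0, 1.0]                       # link capacities c_l
lam = [1.0, 1.0]                       # link prices lambda_l
m, M, alpha = 1e-3, 10.0, 0.05         # rate bounds and step size

for _ in range(2000):
    # Step 2: each source maximizes log(x) - lam_s * x  ->  x = 1/lam_s
    x = {s: min(M, max(m, 1.0 / sum(lam[l] for l in paths[s])))
         for s in paths}
    # Step 3: each link adjusts its price by the excess demand, Eq. (2.17)
    for l in range(len(cap)):
        y = sum(x[s] for s in paths if l in paths[s])      # aggregate rate
        lam[l] = max(lam[l] + alpha * (y - cap[l]), 1e-6)  # [.]* projection

# Both sources converge near the fair share 0.5 on the bottleneck link 0.
print({s: round(v, 2) for s, v in x.items()})
```

Only link 0 is a bottleneck here, so its price rises toward 2 while the price of the underloaded link 1 decays to its floor; each source reacts to its path price alone, exactly as in Steps 2 and 3.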
In Low's work, it is proved that when the following conditions are satisfied, the original
utility functions are strictly concave and twice continuously differentiable, their curvatures
are bounded away from zero, and the time between consecutive updates is bounded, then
starting from any initial rates in the range M = [ms, Ms] and any prices λ(0) ≥ 0, every
accumulation point (x*, λ*) of the sequence (x(t), λ(t)) generated by the basic dual
optimization algorithm above is primal optimal. Moreover, for all sources s, the errors in
price estimation and rate calculation both converge to zero, and the gradient estimation
error also converges to zero [3].
2.3 Particle Swarm Optimization
Particle swarm optimization (PSO), developed by Kennedy and Eberhart in 1995 [11,
12, 16], is a promising population-based optimization technique that models the set of
potential problem solutions as a swarm of particles moving about in a virtual search space.
Attractive features of PSO include its ease of implementation and the fact that no gradient
information is required. It can be used to solve a wide variety of optimization problems,
including most of the problems that can be solved using genetic algorithms (GAs).
Many popular optimization algorithms, such as gradient-based algorithms, are deterministic.
In contrast, PSO is a stochastic algorithm that does not need any gradient information.
This allows PSO to be used on functions for which the gradient is either unavailable or
computationally expensive to obtain. In the past several years, PSO has been successfully
applied in many research and application areas. PSO is also attractive because it has few
parameters to adjust. It has been demonstrated that PSO can obtain better results faster
and more cheaply than many other methods.
2.3.1 PSO Algorithm
The original PSO algorithm was based on the sociological behavior associated with bird
flocks. PSO is inspired by the movement of flocking birds and their interactions with their
neighbors in the group. Instead of using evolutionary operators to manipulate the
individuals, as in other evolutionary computation algorithms, each individual in PSO flies
through the search space with a velocity that is dynamically adjusted according to its own
flying experience and that of its companions [44].
The high-level idea of PSO can be summarized as follows. To find the minimum of an
objective function f(x) (x is a vector) within a solution space, the PSO algorithm starts
with a set of candidate solutions called particles, {xp | 1 ≤ p ≤ P}, distributed in the
solution space. A typical value of P is between 20 and 50 [16]. During the optimization
process, each particle xp moves within the solution space in search of the minimum of f(x),
and the corresponding movement path is denoted by xp(t), where t represents time. At each
time step, the movement of particle xp is given by

xp(t + 1) = xp(t) + v (2.18)

where

v = w · v + c1Θ1[x^s_p − xp(t)] + c2Θ2[x^g − xp(t)] (2.19)

Here, w, c1, and c2 are weighting factors, and Θ1 and Θ2 are two random numbers; these
parameters influence the maximum step size that a particle can take in a single iteration.
x^s_p is the best solution that the particle itself has found so far, and x^g is the best
solution that all particles have found so far. The value of v should be clamped to the range
[−vmax, vmax] to reduce the likelihood of a particle leaving the search space. Fig. 2.3
illustrates the PSO algorithm: each particle, when determining its next move in search of
the global optimum, always balances its own behavior and that of the group [16].
Figure 2.3: Illustration of particle swarm optimization.
The PSO algorithm consists of repeated applications of the particle update equations in
Eq. (2.18) and Eq. (2.19). Fig. 2.4 gives pseudo code for the basic PSO algorithm. Note
that its two if statements maintain the best solutions x^s_p and x^g used in Eq. (2.18)
and Eq. (2.19), respectively.
The initialization mentioned in the PSO algorithm includes the following: (1) Initialization
of each particle xm (an n-dimensional vector). Each coordinate of particle xm should be
initialized to a value drawn from the uniform random distribution over the search space,
which distributes the initial positions of all particles randomly throughout the search
space. (2) Initialization of the velocity vm (an n-dimensional vector) of each particle xm.
Each vm is initialized to a value drawn from the uniform random distribution on the interval
[−vmax, vmax]. Alternatively, the vm of each particle can be initialized to 0, since the
starting positions of the particles are already randomized. The stop condition mentioned
in the PSO algorithm depends on the type of problem being solved. Normally, the PSO
algorithm runs for a fixed number of iterations or until a specified error bound is reached.
Figure 2.4: Pseudo code for the basic PSO algorithm.
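The update loop described above can be sketched in a few lines. The parameter values below (w, c1, c2, P, vmax) are common illustrative choices, not values prescribed by the text, and the sphere function is an assumed test objective:

```python
import random

# Minimal PSO sketch for Eqs. (2.18)-(2.19): minimize the sphere
# function f(x) = sum(x_i^2) over [-5, 5]^dim.
def pso(f, dim=2, particles=30, iters=300, w=0.7, c1=1.5, c2=1.5,
        vmax=0.5, seed=0):
    rng = random.Random(seed)
    x = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(particles)]
    v = [[0.0] * dim for _ in range(particles)]   # velocities start at 0
    pbest = [xi[:] for xi in x]                   # personal bests x^s_p
    gbest = min(pbest, key=f)[:]                  # global best x^g
    for _ in range(iters):
        for p in range(particles):
            for d in range(dim):                  # velocity update (2.19)
                v[p][d] = (w * v[p][d]
                           + c1 * rng.random() * (pbest[p][d] - x[p][d])
                           + c2 * rng.random() * (gbest[d] - x[p][d]))
                v[p][d] = max(-vmax, min(vmax, v[p][d]))  # clamp velocity
                x[p][d] += v[p][d]                # position update (2.18)
            if f(x[p]) < f(pbest[p]):             # the two "if" updates
                pbest[p] = x[p][:]
                if f(pbest[p]) < f(gbest):
                    gbest = pbest[p][:]
    return gbest

sphere = lambda x: sum(xi * xi for xi in x)
best = pso(sphere)
print(sphere(best))  # close to 0 on this smooth unimodal test function
```

Note that no derivative of f is ever evaluated; the swarm needs only function values, which is exactly the property that makes PSO applicable to the nonconvex utility functions discussed earlier.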
2.3.2 Social Behavior
A number of scientists have created computer simulations of various interpretations of
the movement of bird flocks and fish schools. It does not seem a large leap of logic to
suppose that the same rules underlie animal social behavior, including herds, schools,
flocks, and that of humans [12]. Sociobiologist E. O. Wilson states that individual members
of a school can profit from the discoveries and previous experience of all other members
of the school during the search for food; this advantage can become decisive, outweighing
the disadvantages of competition for food items, whenever the resource is unpredictably
distributed in patches [45]. From this statement we can conclude that the social sharing
of information offers an evolutionary advantage, and this hypothesis is fundamental to
the development of particle swarm optimization [12].
The psychological assumptions behind PSO theory are general and noncontroversial. In
the search for consistent cognitions, individuals tend to retain their own best beliefs while
also considering the beliefs of their colleagues; adaptive change happens when individuals
perceive that others' beliefs are better than their own. These concepts are not new. What
is new is that, taken together, these simple concepts create an evolutionary information
processing technique powerful enough to manage the huge amount of information comprising
human knowledge [46].
2.3.3 Applications
Currently, PSO has been applied to many problems. One of the first applications of PSO
was neural network training, introduced by J. Kennedy and R. Eberhart in 1995; one of the
authors' first experiments involved training the weights of a three-layer neural network
on the XOR problem [12], and their results show that PSO performs very well on this
problem. Many PSO applications involve neural networks, and there are now many modified
PSO approaches applied to various aspects of neural networks [47, 48, 49], including fuzzy
neural networks, B-splines for nonlinear system identification, and feed-forward neural
networks with weight decay. Furthermore, paper [50] combines the PSO algorithm and a
fuzzy neural network to handle the financial risk early-warning problem, which is the
foundation of effective risk management.
Genovesi [51] uses PSO to handle electromagnetic optimization problems; in that work,
the synthesis of frequency selective surfaces is handled by simultaneously optimizing both
real and binary parameters. The PSO algorithm is used for design optimization of
electromagnetic devices in [52]. A modified PSO (quantum PSO) algorithm [53] is applied
to linear array antenna synthesis, one of the standard problems addressed by antenna
engineers.

The PSO algorithm has also recently been applied to the parameter extraction of an
equivalent circuit model [54], and PSO combined with adaptive simulated annealing with
tunneling (ASAT) is applied to RF circuit synthesis techniques in [55].
Another application area for the PSO algorithm is sensor networks [56, 57, 58, 59, 60].
A sink node placement optimization problem is discussed in [56], and a target location
estimation problem in [57]. Paper [58] uses the PSO algorithm for cluster formation in
wireless sensor networks, and paper [60] uses it for node clustering problems in ad hoc
sensor networks. An optimization problem of multicast routing in sensor networks is
discussed in [59]. Finally, in [61], the PSO algorithm is used to solve programming problems
in MIMO-based cross-layer design for sensor networks.
2.4 Summary
Network resource allocation optimization has been widely used in Internet, TCP traffic,
and ATM performance optimization. Most network resource allocation problems can be
formulated as the constrained optimization of some network utility function. Network
utility maximization (NUM) has been used for most network resource allocation problems,
with many current applications, such as resource allocation optimization over wireless
mesh networks. Meanwhile, the NUM framework has also been substantially extended from
Internet congestion control to a general approach for understanding interactions across
layers.
Several basic optimization techniques are commonly used in resource allocation problems:
convex optimization, Lagrange duality, and gradient and subgradient methods. The basic
network flow control problem has already been successfully solved by the Lagrangian dual
algorithm. The main drawback of this algorithm is that it requires the objective function
to be strictly convex or concave; otherwise, the optimization problem cannot be handled
properly. The gradient search algorithm, in turn, suffers from slow convergence.
The PSO algorithm has many attractive features, including ease of implementation and the
fact that no gradient information is required. The main advantage of PSO over other global
optimization strategies is that the large number of random candidate solutions making up
the swarm helps the technique avoid being trapped in local optima or saddle points of the
optimization problem. Meanwhile, unlike the genetic algorithm (GA), PSO has no evolution
operators such as crossover and mutation, and it also has a fast convergence speed.
Chapter 3
Particle Swarm Optimization over
Wireless Sensor Networks
A wireless sensor network (WSN) is a system of spatially distributed sensor nodes that
collect important information about the target environment. In this chapter, we first
consider the unique characteristics of wireless video sensor networks (WVSNs) and develop
an evolutionary optimization scheme using the swarm intelligence principle to solve the
WVSN performance optimization problem [62]. We transform the solution space defined by
the flow balance and energy constraints into a convex region in a low-dimensional space
without constraints. We then merge this convexity condition with the swarm intelligence
principle to guide the movement of each particle during the evolutionary optimization
process. Our experimental results demonstrate that the proposed performance optimization
scheme is very efficient. Next, we develop an evolutionary distributed optimization scheme
based on the swarm intelligence principle, called decentralized particle swarm optimization
(DPSO), to solve the generic distributed WSN parameter estimation or optimization problem
[63]. Based on the particle swarm intelligence principle, sensor nodes share information
with each other through local information exchange and communication to solve a joint
estimation or optimization problem, with the swarm intelligence principle guiding the
movement of each particle during the evolutionary optimization process. The proposed
distributed optimization scheme has low communication energy cost and assures fast
convergence; in addition, the objective function is not required to be convex.
We use source localization as an example to demonstrate the efficiency of the proposed
DPSO scheme. Our experimental results demonstrate that the proposed optimization scheme
is very efficient and outperforms existing distributed gradient-based optimization schemes.
3.1 Convex Mapping of High-dimensional Resource Constraints
WSNs have been envisioned for a wide range of applications, such as battlefield intelligence,
environmental tracking, and emergency response. Each sensor node has limited computational
capacity, battery supply, and communication capability [13]. A wireless video sensor
network (WVSN) is a system of spatially distributed video sensors that capture, process,
and transmit video information over a wireless ad hoc network. In a WVSN, each sensor,
equipped with a video camera, is able to capture, process, and transmit visual information
about the surrounding activities.
Compared to other conventional sensor networks, such as GPS and temperature sensor
networks, the WVSN has the following unique characteristics: (1) the video sensor data is
voluminous; (2) video data compression is computationally intensive and energy-consuming,
accounting for about 60-80% of the total energy [64]. Therefore, the WVSN operates under
severe bit and energy constraints, and performance optimization of a large-scale WVSN
under resource constraints is a nonlinear, high-dimensional, constrained optimization
problem.

There is a significant body of research on performance optimization of wireless sensor
networks, such as energy minimization, rate allocation, and topology control [13, 14].
These results and algorithms are generic in nature and could be used to improve the
performance of WVSNs. However, they do not consider the unique characteristics of a WVSN,
such as the complex nonlinear resource utilization behavior of each sensor node, which
can render these analyses and algorithms inefficient or even impractical.
3.1.1 WVSN Operation Models
We assume that the WVSN has K video sensor nodes {V_i} and L communication links.
Each node compresses its video sensor data; let R_i be the output bit rate and let P_i
be the energy consumption used in video compression. According to our previous work on
energy consumption modeling and power-rate-distortion (P-R-D) analysis for video
compression [64], the coding distortion of the compressed video data is given by
D(R_i, P_i). Our experimental studies [64] suggest that the P-R-D behavior is well
captured by the following model:

D(R_i, P_i) = σ_i² · 2^(−λ·R_i·g(P_i)).   (3.1)

Here, σ_i² represents the picture variance at node V_i, λ represents the resource
utilization efficiency of the video encoder, and g(P) is the power consumption model of
the microprocessor. The analysis in [64] suggests that g(P) = P^(1/γ), 1 ≤ γ ≤ 3. We
assume that the performance of the whole WVSN is measured by the overall video distortion

D = Σ_{i=1}^{K} D(R_i, P_i).   (3.2)
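As an illustration, the model of Eqs. (3.1)-(3.2) can be sketched in a few lines. Note the assumptions: g(P) = P^(1/γ) is the reconstruction used above, and all parameter values (λ, γ, the rates, powers, and variances) are hypothetical placeholders rather than values from [64]:

```python
import numpy as np

def prd_distortion(R, P, sigma2, lam=0.023, gamma=2.0):
    """P-R-D model of Eq. (3.1): D = sigma^2 * 2^(-lam * R * g(P)),
    assuming g(P) = P**(1/gamma).  All defaults are illustrative."""
    g = np.power(P, 1.0 / gamma)
    return sigma2 * np.power(2.0, -lam * R * g)

# Overall WVSN distortion, Eq. (3.2): sum over the K nodes.
R = np.array([200.0, 400.0, 800.0])     # hypothetical per-node coding rates
P = np.array([1.0, 2.0, 4.0])           # hypothetical per-node encoding powers
sigma2 = np.full(3, 10.0)               # picture variances
D_total = prd_distortion(R, P, sigma2).sum()
```

As expected from the model, distortion decreases monotonically in both the coding rate R and the encoding power P, and at R = 0 it equals the picture variance σ².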
The power consumption in wireless transmission between nodes can be modeled as
P_t(i, j) = c_{i,j} × r_{i,j}, where P_t(i, j) is the power dissipated at node i when it
is transmitting to node j, r_{i,j} is the bit rate transmitted from node i to node j, and
c_{i,j} is the power consumption cost of the radio link from node i to node j. The cost
can be computed by

c_{i,j} = α + β × d_{ij}^m,   (3.3)

where α is a distance-independent constant term, β is the coefficient of the
distance-dependent term, d_{ij} is the distance between nodes i and j, and m is the path
loss index. The power dissipation at a receiver can be modeled as [14]:

P_r(i) = ρ × Σ_{j≠i} r_{ji},   (3.4)

where Σ_{j≠i} r_{ji} is the total bit rate of the received video data at node i and ρ is
a constant.
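A direct transcription of the link-cost and power models (Eqs. (3.3)-(3.4)); the default parameter values mirror those listed later in the experiments of Section 3.1.5 (α = 50, β = 0.0013, m = 2, ρ = 5) and are otherwise illustrative:

```python
def link_cost(d, alpha=50.0, beta=0.0013, m=2):
    """Per-bit cost of a radio link over distance d, Eq. (3.3)."""
    return alpha + beta * d ** m

def tx_power(d, r_ij, **kw):
    """Power dissipated at node i transmitting rate r_ij to node j."""
    return link_cost(d, **kw) * r_ij

def rx_power(incoming_rates, rho=5.0):
    """Receiver power dissipation, Eq. (3.4): rho times total received rate."""
    return rho * sum(incoming_rates)
```

For example, with the defaults a link of distance 10 has cost 50 + 0.0013 × 10² = 50.13 per unit rate.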
3.1.2 WVSN Performance Optimization
The WVSN performance optimization has to satisfy the flow balance and energy con-
straints. Let Ei be the energy supply at node Vi, and T the operational lifetime of the
sensor network. Mathematically, the optimization problem can be formulated as

min_{{R_i, P_i}, {r_ij}}   D = Σ_{i=1}^{K} D(R_i, P_i)
s.t.   r_iB + Σ_{j≠i} r_ij − Σ_{j≠i} r_ji = R_i,
       P_i·T + Σ_{k≠i} ρ·r_ki·T + Σ_{j≠i} c_ij·r_ij·T + c_iB·r_iB·T ≤ E_i,
       0 ≤ r_iB, r_ij ≤ r_max,   0 ≤ R_i ≤ R_max,   0 ≤ P_i ≤ P_max,   (3.5)

where r_ij and r_iB are the data rates transmitted from node i to node j and from node i
to the base station B, respectively [14], and r_max, R_max, and P_max are the maximum
possible values of the link rates, node bit rates, and power, respectively. The first
constraint is the flow balance constraint, stating that the total data transmitted from
node i should equal the total data received from other nodes plus the data generated by
node i itself. The second constraint is the energy constraint, stating that the total
energy used for data processing and transmission should not exceed the energy supply.
In the following, we develop a matrix representation of the performance optimization
problem in (3.5). We assemble all the link rate variables {rij} into a link rate vector r, and
all the node bit rates {Ri} into vector R, all the power consumption in video compression
{Pi} at each node into vector P. Let X = [r,R,P], which represents all the control variables
that need to be determined by the performance optimization. Note that the flow balance
and energy constraints are both linear. Therefore, the performance optimization problem
in (3.5) can be written in the following matrix form
min_X   J(X)
s.t.    M·X = 0,
        N·X ≤ E_0,
        0 ≤ X ≤ X_max,   (3.6)

where M represents the topology of the network, N represents the energy cost for video
transmission over the network, and E_0 represents the initial power supply. We can see
that X is a (2K + L)-dimensional vector and that M and N are both K × (2K + L) matrices.
This is a nonlinear, very high-dimensional, constrained optimization problem for a
large-scale WVSN.
3.1.3 Transform of the Solution Space
Handling the constraints is one of the central challenges in population-based evolutionary
optimization, such as PSO. One of the existing approaches to handle constraints is as follows:
move the particle (or compute the next generation) according to the swarm intelligence
principle; check if the new position of the particle satisfies the constraints; if not, this
movement of the particle is canceled and the particle stays in its current position.
We observe that handling the constraints in the original space of vector X = [r,R,P] is
very inefficient. First, in the original space, variables are interdependent. For example, if we
change the coding bit rate of node i, then the transmission rates of its associated links, as
well as the bit rates of its neighboring nodes will be affected, because they have to satisfy the
flow balance constraints. Second, according to our preliminary studies, the probability of a
solution vector [r,R,P] generated by swarm intelligence principle satisfying the constraints
is extremely small. This will render the optimization algorithm very inefficient.
In this work, we use a linear representation of the solution space to define a linear
transform that maps the high-dimensional, constrained original solution space into a
convex region in a low-dimensional space without constraints. We then develop a
new evolutionary optimization scheme using the swarm intelligence principle and the convex
properties to solve the performance optimization in this low-dimensional convex space.
I. Linear Representation of the Solution Space
First, we consider the flow balance constraint M ·X = 0. This constraint implies that
the vector X must lie in the null space of the matrix M. Therefore, X can be represented
by

X = Σ_{i=1}^{n_1} y_i·g_i = G·Y,   (3.7)

where G = [g_1, g_2, ..., g_{n_1}] contains the orthonormal basis vectors of the null
space and Y = [y_1, y_2, ..., y_{n_1}]^t holds the coefficients. We can see that
n_1 = (2K + L) − rank(M).
Second, we consider the energy constraint. We assume that the optimum performance is
achieved when all energy is used, i.e., N·X = E_0. According to (3.7), we have

N·G·Y = E_0.   (3.8)

Suppose X_0 is one specific solution (not necessarily the minimum solution) that
satisfies the flow balance and energy constraints. Then X_0 must be in the null space of
M; let X_0 = G·Y_0. Since N·G·Y_0 = E_0, we have

N·G·(Y − Y_0) = 0.   (3.9)

This implies that Y − Y_0 is in the null space of the matrix N·G. Let
H = [h_1, h_2, ..., h_{n_2}] be the orthonormal basis vectors of this null space.
Therefore, we have

Y − Y_0 = Σ_{i=1}^{n_2} z_i·h_i = H·Z.   (3.10)
II. Transform of the Solution Space
Eqs. (3.7) and (3.10) tell us that if a vector X satisfies the flow balance and energy
constraints, then it can be represented in a low-dimensional space by a vector Z. Since
G and H are both orthonormal matrices, from Eqs. (3.7) and (3.10) we can see that

Z = H^t·G^t·(X − X_0).   (3.11)

This defines a transform from the original solution space
S = {X | X satisfies all the constraints in (3.6)} to a new space S′:

F : X ↦ Z = H^t·G^t·(X − X_0).   (3.12)

The new solution space is given by

S′ = {Z | 0 ≤ F^{−1}(Z) ≤ X_max}.   (3.13)
This transform approach is very important and has the following advantages.
Lemma 1  The dimension of the new space S′ is much lower than that of the original space
S. The transform reduces the number of dimensions by up to 2K, where K is the number of
sensor nodes.

Proof  The matrices G and H can be obtained from the singular value decompositions of
the matrices M and N·G, respectively. Therefore, n_1 = (2K + L) − rank(M) and
n_2 = (2K + L) − rank(M) − rank(N·G), so the number of reduced dimensions is
rank(M) + rank(N·G). If the sensor network is not ill-conditioned (e.g., there are no
orphan nodes), rank(M) + rank(N·G) should be close or equal to 2K. A detailed proof of
these arguments is omitted here.
Lemma 2 The new space S′ is convex.
Proof  Notice that the transform F is linear and that the original space S is convex.
Therefore, S′ is also convex.
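The two-stage null-space construction behind Eqs. (3.7)-(3.13) can be reproduced with `scipy.linalg.null_space`. In the sketch below, the small random matrices merely stand in for the actual flow-balance matrix M and energy-cost matrix N of a real network, and X0 is just a point in the null space of M (not an energy-feasible solution):

```python
import numpy as np
from scipy.linalg import null_space

# Hypothetical small network: K = 3 nodes, L = 4 links, so X has 2K + L = 10 dims.
rng = np.random.default_rng(0)
K, L = 3, 4
M = rng.standard_normal((K, 2 * K + L))   # stand-in for the flow-balance matrix
N = rng.standard_normal((K, 2 * K + L))   # stand-in for the energy-cost matrix

G = null_space(M)          # orthonormal basis of null(M):   X = G @ Y
H = null_space(N @ G)      # orthonormal basis of null(N G): Y - Y0 = H @ Z

X0 = G @ rng.standard_normal(G.shape[1])  # a particular point with M @ X0 = 0

def to_Z(X):
    """Forward transform F of Eq. (3.11)."""
    return H.T @ G.T @ (X - X0)

def from_Z(Z):
    """Inverse transform: recover X from the low-dimensional Z."""
    return X0 + G @ (H @ Z)

Z = rng.standard_normal(H.shape[1])
X = from_Z(Z)   # satisfies M @ X = 0 and N @ X = N @ X0 by construction
```

Any Z mapped back through F⁻¹ automatically satisfies the linear constraints, which is exactly why the swarm can move freely in the reduced space; the dimension drop here is rank(M) + rank(N·G), matching Lemma 1.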
Figure 3.1: Performance optimization with swarm intelligence and convex projection.
3.1.4 Performance Optimization Using PSO
Based on the swarm intelligence principle and the unique properties given by Lemmas 1
and 2, we are going to develop a new evolutionary optimization scheme for WVSN perfor-
mance optimization. The new swarm optimization will operate in the new solution space
S′. According to Lemma 1, this will significantly reduce the computational complexity.
In addition, in the new space, since the flow balance and energy constraints are implicitly
satisfied, the particles can move freely in S′. To make sure that each particle moves
within the convex region of S′, we can use the property that S′ is convex. This implies
that if two particles X1 and X2 are in S′, then
λX1 + (1− λ)X2 ∈ S′, 0 ≤ λ ≤ 1. (3.14)
In PSO, the movement of each particle is defined by (2.18) and (2.19). If we choose the
weights such that

w + c_1Θ_1 + c_2Θ_2 = 1,   (3.15)

then (2.19) can be written as

x_m(t + 1) = w·x′_m + c_1Θ_1·x*_m + c_2Θ_2·x*,   (3.16)

where

x⁻_m = x_m(t) + [x_m(t) − x_m(t − 1)]   (3.17)

is the predicted position of x_m(t) according to the inertial velocity. In (3.16), we
know that x*_m and x* are in S′ because they are the best positions found so far by the
particle itself and by the whole group, respectively. However, x⁻_m is not necessarily in
S′. If it is not, we find its convex projection x′_m as follows. Let {x_i | 1 ≤ i ≤ I}
be the set of all particles, including the initial particles and their movement
histories. We know that these particles are all in S′; therefore, by the convex
condition, any weighted (convex) sum of them is also in S′. Let

[p′_1, p′_2, ..., p′_I] = arg min_{p_i} ‖ Σ_{i=1}^{I} p_i·x_i − x⁻_m ‖²,   (3.18)

where 0 ≤ p_i ≤ 1 and Σ_{i=1}^{I} p_i = 1. Then x′_m = Σ_{i=1}^{I} p′_i·x_i ∈ S′ is the
convex projection of x⁻_m. Replacing x⁻_m with x′_m in (3.16), the new position
x_m(t + 1) of the m-th particle
must be in S′ according to the convex condition. Using this convex projection method, we
can make sure that each particle moves within the solution space.
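A sketch of the convex projection step of Eq. (3.18), solved here with `scipy.optimize.minimize` (SLSQP); the triangle of points below is a toy stand-in for the particle history {x_i}, not data from the dissertation's experiments:

```python
import numpy as np
from scipy.optimize import minimize

def convex_projection(points, target):
    """Solve Eq. (3.18): find simplex weights p minimizing
    || sum_i p_i x_i - target ||^2, and return the projected point."""
    X = np.asarray(points, dtype=float)          # shape (I, dim)
    I = len(X)
    obj = lambda p: float(np.sum((p @ X - target) ** 2))
    cons = ({'type': 'eq', 'fun': lambda p: np.sum(p) - 1.0},)
    res = minimize(obj, np.full(I, 1.0 / I), method='SLSQP',
                   bounds=[(0.0, 1.0)] * I, constraints=cons)
    return res.x @ X

# A predicted position x_m^- outside the hull is pulled back onto it:
hull = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
x_pred = np.array([2.0, 2.0])                    # outside S'
x_proj = convex_projection(hull, x_pred)
```

Because the weights are constrained to the simplex, the result is guaranteed to lie in the convex hull of the given points, so the subsequent convex-combination update (3.16) keeps the particle inside S′.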
3.1.5 Experimental Results
We test the proposed performance optimization scheme with different WVSN topologies. In
the following experiment, the network has K = 10 nodes and L = 15 links. The particle
population size is 20, and the maximum number of generations is 2000. The other
parameters are set as follows: λ = 0.729, γ = 0.5, α = 50, β = 0.0013, m = 2, ρ = 5,
λ_k = 0.023, and σ_k² = 10. Fig. 3.2 shows that the overall video distortion metric
quickly reaches its minimum after several generations. Fig. 3.3 shows the movement path
of each particle in the solution space (projected onto a 2-D plane for illustration
purposes). After all the particles converge, an optimum solution is found. We can see
that the proposed performance optimization works very efficiently. Our experiments with
other WVSN settings yield similar results.
Figure 3.2: The performance metric decreases as the particles update their positions in
PSO with convex mapping. (Axes: iteration number vs. minimum function value.)
Figure 3.3: The traces of all the particles moving in the solution space of PSO with
convex mapping. (Axes: dimension 1 vs. dimension 2.)
3.1.6 Summary
In this section, based on the unique properties of WVSN and the swarm intelligence
principle, we have developed an evolutionary optimization scheme to solve the nonlinear
constrained performance optimization problem in WVSN. This section focuses on the
analytic development of the performance optimization method. The main challenge for the
PSO algorithm is handling the constraints. Our analysis shows that we can transform the
solution space into a convex region in a low-dimensional space to reduce the
computational complexity and remove the interdependence between the control variables.
The convex property of
the new solution space can be incorporated in the original swarm intelligence principle to
guide the movement of particles such that each particle in the new generation automatically
satisfies the constraints.
3.2 Distributed Optimization over Wireless Sensor Networks
A major challenge in WSN system design is data collection and analysis. Since each
node has very limited resources for communication and computation, it is inefficient or
even impractical to transmit all sensor data to a central location for information processing,
such as estimation and optimization. Therefore, distributed in-network signal processing
and optimization are highly desired, especially in large-scale WSN systems [9]. Recently,
several distributed optimization algorithms based on gradient search have been proposed
in the literature [9, 10]. Most existing approaches assume that the objective function
is additive and convex; otherwise, it is very difficult to assure convergence of the
distributed gradient search algorithm. In addition, existing algorithms suffer from slow
convergence. In this section we present a new distributed scheme, called decentralized
PSO [63], which is not sensitive to local optima or saddle points and converges quickly.
3.2.1 Optimization Problems Using Decentralized PSO
Let us consider a generic parameter estimation or optimization problem over a WSN with N
sensors, each taking M measurements of the target phenomenon. The problem is to estimate
a p × 1 parameter vector θ ∈ R^p from the M independent measurements x collected by each
of n selected sensors. The model is

min f(x, θ)   s.t. θ ∈ R^p,   (3.19)

where f(·) is a cost or utility function; in general, it is a nonlinear function.
Existing distributed optimization algorithms based on incremental sub-gradient search
assume that the objective function f(x, θ) is additive and convex. In this work, there
is no such requirement, and f(x, θ) can be a generic nonlinear function.
Our major idea in DPSO can be summarized as follows. Suppose there are K sensor
nodes which are involved in the distributed optimization or estimation task. We partition
the swarm particles (typically, in the range of 20-50 particles) into K sub-groups with each
node managing a sub-group of particles. More specifically, it is tasked to evaluate the
objective function for each particle in its sub-group and manage their movements based on
the swarm intelligence principle described in Eq. (2.19). After each iteration, the sensor
node finds the minimum solution within its sub-group and shares this sub-group minimum
with its neighboring nodes. Upon receiving the external information from its neighbors
about their sub-group minima, each sensor node uses this information to guide the
particles inside its sub-group so as to facilitate the search and optimization process.
The decentralized PSO algorithm can be stated as follows:

For each particle cluster:
    For each particle:
        Re-initialize the particle until it satisfies all the constraints.
Do:
    Compare the cluster-best solutions received from the other clusters
    with the local cluster-best solution, and keep the best one.
    For each particle:
        Apply the normal PSO update to refresh the particle's own best
        solution, and find the new cluster-best solution among all the
        particles in the cluster.
    Transmit the new cluster-best solution to the other particle clusters.
Until the maximum number of iterations is reached or the convergence
criterion is met.
In each particle cluster, the movement of each particle is defined by (2.18) and (2.19).
Here we choose the weights such that

w + c_1Θ_1 + c_2Θ_2 = 1,   (3.20)

so that (2.19) can be written as

x_m(t + 1) = w·x*_m + c_1Θ_1·x_p + c_2Θ_2·x_g,   (3.21)

where

x*_m = x_m(t) + [x_m(t) − x_m(t − 1)]   (3.22)

is the predicted position of x_m(t) according to the inertial velocity.
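To make the algorithm concrete, here is a minimal, self-contained sketch of decentralized PSO on a toy 2-D objective. The cluster sizes, coefficient choices, and the per-particle renormalization used to enforce the convex-combination condition of Eq. (3.20) are illustrative assumptions, not the exact configuration used in the experiments:

```python
import numpy as np

def dpso(f, cluster_sizes, n_iters=200, w=0.5, c1=0.25, c2=0.25, seed=0):
    """Toy decentralized PSO: each cluster belongs to one sensor node, and only
    cluster-best solutions are exchanged between clusters (Eqs. 3.20-3.22)."""
    rng = np.random.default_rng(seed)
    pos = [rng.uniform(-5.0, 5.0, size=(n, 2)) for n in cluster_sizes]
    prev = [p.copy() for p in pos]
    pbest = [p.copy() for p in pos]           # per-particle best positions
    cbest = [min(p, key=f).copy() for p in pos]
    for _ in range(n_iters):
        gbest = min(cbest, key=f)             # best solution received so far
        for k in range(len(pos)):
            pred = pos[k] + (pos[k] - prev[k])    # inertial prediction, Eq. (3.22)
            prev[k] = pos[k].copy()
            th1, th2 = rng.random((2, len(pos[k]), 1))
            coef = np.stack([w + 0.0 * th1, c1 * th1, c2 * th2])
            coef /= coef.sum(axis=0)              # enforce a convex combination
            pos[k] = coef[0] * pred + coef[1] * pbest[k] + coef[2] * gbest
            for i, x in enumerate(pos[k]):        # update personal bests
                if f(x) < f(pbest[k][i]):
                    pbest[k][i] = x.copy()
            cbest[k] = min(pbest[k], key=f).copy()
    return min(cbest, key=f)
```

Because personal bests are only replaced by strictly better positions, the returned solution is never worse than the best initial particle; in the dissertation's setting, f would be the source-localization cost of Eq. (3.24).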
3.2.2 Source Localization Application
In this section, we use source localization as an example to evaluate the performance of
the proposed DPSO algorithm. Location estimation of an acoustic source is an important
problem in both environmental and military applications [65]. The source localization
problem has traditionally been solved through nonlinear least-squares estimation, which
is equivalent to maximum likelihood estimation when the observation noise is modeled as
white Gaussian noise [66]. In this scenario, sensors near the source or target are woken
up, and each sensor collects M measurements about the source location using the received
signal strength. In the sensor field, the acoustic source is located at an unknown
position θ. We assume that each sensor knows its own location r_i, i = 1, ..., n, and
that the j-th received-signal-strength measurement, j = 1, ..., M, at node i is given by
the following signal propagation model [65]:

x_{i,j} = A / ‖θ − r_i‖^β + ω_{i,j},   (3.23)

where A is a constant, β is the signal energy decay exponent, and the ω_{i,j} are
independent Gaussian noise terms at each sensor i for every measurement j.
The maximum likelihood estimate θ̂ is obtained as the global minimizer

θ̂ = arg min_θ  (1 / (n·M)) Σ_{i=1}^{n} Σ_{j=1}^{M} ( x_{i,j} − A / ‖θ − r_i‖^β )².   (3.24)
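The measurement model (3.23) and the objective (3.24) translate directly into code. The sensor layout and constants below are synthetic stand-ins (noise-free, so the cost is exactly zero at the true source):

```python
import numpy as np

def ml_cost(theta, sensors, x, A=100.0, beta=2.0):
    """Average squared residual of Eq. (3.24) for a candidate source position."""
    d = np.linalg.norm(sensors - np.asarray(theta), axis=1)  # ||theta - r_i||
    pred = A / d ** beta                                     # model of Eq. (3.23)
    return float(np.mean((x - pred[:, None]) ** 2))

# Synthetic noise-free data from a known (hypothetical) source position.
rng = np.random.default_rng(1)
sensors = rng.uniform(0.0, 100.0, size=(5, 2))   # known positions r_i
source = np.array([40.0, 60.0])                  # ground-truth theta
clean = 100.0 / np.linalg.norm(sensors - source, axis=1) ** 2
x = np.tile(clean[:, None], (1, 10))             # M = 10 identical measurements
```

With noisy measurements, `ml_cost` becomes the nonconvex landscape that the gradient-based methods can get trapped in and that DPSO searches globally.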
Sheng's work uses a maximum likelihood estimator to solve the acoustic source
localization problem in sensor networks [67], but this approach requires the sensors to
transmit all their data to a fusion center for processing, which is impractical for a
large network because it requires a massive amount of energy and bandwidth. In their
work [9, 15], Rabbat and Nowak propose a gradient-based distributed algorithm to solve
the problem. The advantage of this distributed algorithm over the one in [67] is that it
does not need all the data to be transmitted to a central point for processing. Its
drawback, however, is that it is sensitive to local optima and saddle points.
In the decentralized PSO algorithm, we assume each sensor knows all the other sensors'
measurement and position information after an initial communication round. Each particle
cluster finds its own cluster-best parameter estimate, and during the estimation process
each sensor node transmits only its local best estimate over the communication links.
After communicating with the other particle clusters and updating each cluster's best
estimate, the particles find the best solution when they all converge. Compared with the
gradient-based distributed algorithm, the decentralized PSO algorithm does not suffer
from local optima or saddle points.
3.2.3 Experimental Results
We test the proposed distributed optimization scheme using decentralized PSO with
different WSN topologies. In the following experiment, a sensor network was constructed
by placing 100 sensors, uniformly distributed, in a sensor field. In our experiments,
the sensor field is a 100 × 100 unit square, and each sensor location is denoted by
r_i = (x, y). We randomly select n sensors to wake up, collect measurements, and
estimate the source location. In total, there are K links between these sensors; these
links must allow all the selected sensors to join the communication. The other
parameters are set as follows: M = 10, A = 100, and β = 2. The particle cluster size is
2, i.e., we assign only 2 particles to each sensor. All sensors make measurements at a
signal-to-noise ratio (SNR) of 3 dB.
Fig. 3.4 depicts an example network topology using 20 sensors and 22 links. Fig. 3.5
shows that the overall minimum objective function quickly reaches its minimum after
several iterations with DPSO. Fig. 3.6 shows the movement path of each particle in the
sensor field; within each particle sub-group, we trace only one particle's moving path.
From these experimental results, we can see that the proposed scheme works very
efficiently. Experiments with other WSN settings, varying the numbers of awake sensors
and links, yield similar results and are omitted here.
Figure 3.4: A WSN topology with 20 sensors and 24 links. (Axes: X position vs. Y
position; legend: sleep sensor, wake-up sensor, target, particle.)
For a sensor network with a fixed number of awake sensors, increasing the number of
communication links yields faster convergence. Fig. 3.7 shows the experimental results
for one sensor network with 20 sensors awake. As the number of communication links
increases, the sensor nodes are able to share information more efficiently, which speeds
up the DPSO convergence process. As a
Figure 3.5: The performance function value decreases as the particles update their
positions in decentralized PSO. (Axes: iteration number vs. minimum function value.)
result, the number of iterations needed to reach the minimum decreases.
3.2.4 Comparison with Gradient Search Algorithms
In this section, we compare the proposed DPSO with the distributed gradient search
algorithms proposed in the literature [9, 15]. We first give a short review of the
distributed incremental gradient algorithm [9, 15], which performs a cyclic parameter
estimation. On the k-th cycle, sensor i (i = 1, ..., n) receives an estimate ψ^k_{i−1}
from its neighbor and updates it as follows:

ψ^k_i = ψ^k_{i−1} − α·∇f_i(ψ^k_{i−1}),   (3.25)

where α is a positive step size and ∇f_i(ψ^k_{i−1}) is the gradient at ψ^k_{i−1} of the
local cost

f_i(ψ) = (1/M) Σ_{j=1}^{M} ( x_{i,j} − A / ‖ψ − r_i‖^β )².   (3.26)

The algorithm starts from an arbitrary initial condition ψ^0_0 = θ_0, and we obtain
θ_k = ψ^k_n after a
Figure 3.6: The traces of the particles moving in the sensor field of decentralized PSO.
(Axes: X position vs. Y position.)
complete cycle through the network. Each subiteration only focuses on optimizing a single
function fi, which only depends on local data at sensor i.
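For reference, the incremental update (3.25)-(3.26) can be sketched as follows. The analytic gradient is derived from Eq. (3.26) with the model of Eq. (3.23); the step size and data are placeholders chosen for illustration:

```python
import numpy as np

def local_grad(psi, r_i, x_i, A=100.0, beta=2.0):
    """Gradient of the local cost f_i(psi) of Eq. (3.26) with respect to psi."""
    diff = np.asarray(psi, dtype=float) - r_i
    d2 = float(diff @ diff)                               # ||psi - r_i||^2
    pred = A / d2 ** (beta / 2.0)                         # model prediction
    dpred = -A * beta * diff / d2 ** (beta / 2.0 + 1.0)   # d(pred)/d(psi)
    return (-2.0 / len(x_i)) * float(np.sum(x_i - pred)) * dpred

def incremental_cycle(psi0, sensors, x, step=1e-3, cycles=5):
    """Pass the estimate around the ring of sensors, Eq. (3.25)."""
    psi = np.asarray(psi0, dtype=float)
    for _ in range(cycles):
        for r_i, x_i in zip(sensors, x):
            psi = psi - step * local_grad(psi, r_i, x_i)
    return psi
```

Each subiteration touches only one sensor's data, which is the algorithm's appeal; its weakness, as noted above, is that pure gradient steps on this nonconvex cost can stall at local optima or saddle points.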
The minimum objective function values found by the decentralized PSO and gradient search
algorithms were recorded as a function of the iteration number of the search process.
The comparisons are based on the same sensor network topology with the same number of
awake sensors and communication links, and the results are averaged over 100
experiments. Fig. 3.8 shows the comparison of DPSO and the gradient search algorithm in
terms of optimization convergence. From these results, we can see that the decentralized
PSO scheme is more efficient than the sub-gradient search algorithm.
3.2.5 Summary
In this section, we have developed an evolutionary optimization scheme, called decen-
tralized PSO (DPSO), to solve the WSN distributed generic parameter estimation or op-
timization problem based on swarm intelligence principles. The basic operation involves
Figure 3.7: Optimization convergence of a WSN with 20 sensors awake and different link
numbers in decentralized PSO. (Axes: communication link number vs. iteration number.)
parameter estimation at each sensor, which then shares its local results with its
neighbors. The proposed DPSO algorithm places no requirements on the objective function,
is not sensitive to local optima or saddle points, and converges very quickly compared
with gradient search methods. Simulation results show that our evolutionary optimization
scheme is very efficient for different network topologies.
Figure 3.8: Comparison of decentralized PSO and the subgradient search algorithm on
optimization convergence. (Axes: iteration number vs. minimum function value; curves:
decentralized PSO, subgradient search.)
Chapter 4
Distributed Rate Allocation for Video
Mesh Networks
Rate allocation and performance optimization are among the most important problems in
multi-hop video mesh networks. Video communication over mesh networks has found
many important applications, including webcam video communication over Internet, video
surveillance, and wireless vision sensor networks. We consider a large number of video ses-
sions sharing a mesh network. Since video streaming is bandwidth-demanding while the
network has a limited bandwidth resource, it is important for us to optimally allocate this
critical network resource among video sessions to maximize the overall system performance
or network utility. This optimization is often a high-dimensional nonlinear constrained opti-
mization problem. For large-scale networks, a centralized and synchronous solution to this
rate allocation and performance optimization problem is too costly, non-scalable, fragile,
and even infeasible in many cases. In this chapter, based on a swarm intelligence principle,
we develop a distributed and asynchronous particle swarm optimization (DAPSO) scheme
for distributed rate allocation and performance optimization [68]. Specifically, we study
the problem of decomposition of a network utility function with global resource parame-
ters and inter-dependent resource constraints into local optimization problems, each being
solved by particle swarm optimization. We develop in-network fusion and particle migra-
tion schemes to exchange information between neighboring PSO modules and coordinate
their search behaviors and optimization processes. We also develop a collaborative resource
control scheme to efficiently handle network bottleneck issues. Our extensive simulation re-
sults demonstrate that the proposed DAPSO algorithm is very effective for distributed rate
allocation and performance optimization. The proposed optimization framework is generic
and can be extended to other network resource allocation and performance optimization
problems with slight modification.
4.1 Introduction
A wireless mesh network (WMN) is a set of fixed and/or mobile nodes that self-assemble
into a dynamic multi-hop ad hoc network [69]. In this work, we study collaborative video
communication over a large-scale mesh network where a large number of sender devices
transmit compressed video data, either stored (pre-compressed) or live, to a large
number of receivers through multi-hop transmission with packet relay. The communication
link between two neighboring network nodes can be either wired or wireless. This type of
video mesh networking technology is found in many important applications, such as webcam
video communication over Internet / enterprise / community networks, video surveillance,
and image/video sensor networks [69, 70, 30].
To successfully deploy the video mesh networking technology, there are a number of
issues that need to be carefully investigated, including packet routing, flow control, Quality
of Service guarantee, resource allocation, and performance optimization [69]. In this work,
we focus on rate allocation and performance optimization. Because video streaming is
bandwidth-demanding while the network has limited bandwidth resources, it is important to
optimally allocate this critical network resource among different video sessions so as to
maximize the overall system performance or network utility. This optimization is often
a high-dimensional nonlinear constrained optimization problem. We propose to develop a
distributed asynchronous optimization scheme to solve the rate allocation and performance
optimization problem. The proposed optimization framework is generic and can be extended
to other network resource allocation and performance optimization problems with slight
modification.
4.1.1 Related Work
Rate allocation has been extensively studied in rate-distortion analysis and quality op-
timization for point-to-point video communication [71, 72]. It has also been considered in
the broad context of resource allocation [14]. Although a number of cross-layer resource al-
location and performance optimization schemes have been developed, they mostly focus on
centralized performance optimization for single-stream video communication over networks
with relatively simple topologies (e.g. a chain topology) [70, 73].
Within the context of large-scale mesh networks, especially video mesh networks, the
network resource allocation and performance optimization is often a high-dimensional con-
strained nonlinear optimization problem with a large number of resource parameters (op-
timization variables) [17]. In this case, a distributed and asynchronous solution to the
resource allocation and performance optimization is highly desired [17]. This is because,
first, in large-scale mesh networks, communicating the information that is required by each
step of the optimization algorithm from every network node to a central location often in-
volves a significant communication overhead and a large communication delay. However,
a distributed solution only requires local information exchanges. Second, because of the
large communication delay in gathering global information, a centralized rate allocation
algorithm is often not able to quickly respond to local changes in network conditions and
time-varying video data characteristics. However, a distributed approach has the advantage
of quick response to local changes. Third, a distributed solution is scalable. The resource
allocation and performance optimization procedure can be easily extended when new nodes
are added into the mesh network. Therefore, a distributed and asynchronous optimization
scheme is particularly attractive in large-scale networks [74, 17].
A number of distributed network utility optimization schemes based on primal and
Lagrangian dual formulations and gradient search have been developed in the literature
[1, 9, 10, 25, 3, 26,
38, 75]. For those dual-based distributed algorithms, the Lagrange dual variables can be
considered as prices for network resource allocation [1, 25, 3]. Most existing
approaches, especially those based on incremental sub-gradient search [9], assume that
the objective utility function is additive and convex. Convex functions have unique
properties: a local minimum is also a global minimum, and the duality gap between the
primal and dual optimization problems is zero [17]. For nonconvex objective functions,
it is very difficult to assure convergence of the distributed optimization process.
However, in many scenarios of resource allocation and performance optimization,
especially in video communication over mesh networks, the utility function is often
nonconvex. In some cases, the utility function has no explicit expression at all, due to
the inherent difficulty of its mathematical modeling, and we can only obtain its
function value for a given set of independent variables. How to develop a distributed
and asynchronous scheme to optimize a generic nonlinear objective function over
large-scale networks remains an open and challenging problem. In addition, within the
context of rate allocation and performance optimization for large-scale video mesh
networks, some important issues, such as network bottleneck links, their impact on
overall video streaming, and fast response to time-varying video characteristics, have
not been sufficiently addressed.
Moreover, within the context of video communication over networks, the relationship
between the video quality-of-service metric (or system performance metric) and the
resource utilization parameters is often nonlinear and complex. Therefore, there is a
need to develop a distributed asynchronous optimization algorithm that can handle
generic nonlinear network utility functions. In general, rate allocation and performance
optimization for video communication over networks is a nonlinear, nonconvex
optimization problem. Existing distributed optimization algorithms based on incremental
sub-gradient search [9] [10] assume that the objective function U(x) is additive and
convex, and the algorithms based on price-based Lagrangian dual flow control
optimization [1] [25] [3] assume that the objective function is increasing and strictly
concave. The distributed optimization scheme proposed in [38] for multiple video streams
in networks can still handle only convex optimization problems.
4.1.2 Major Contributions
In this chapter, based on a swarm intelligence principle [16], we develop a distributed
asynchronous particle swarm optimization (DAPSO) scheme to solve the rate allocation
and performance optimization problem for large-scale video mesh networks. The major
contributions of this work include: (1) we explore the idea of particle swarm optimization
which provides a natural and ideal platform for distributed resource allocation, collabora-
tive video communication, and network performance optimization. (2) We develop simple
yet efficient schemes for local DAPSO modules to exchange information which enable the
fast convergence of DAPSO. (3) We develop a simple yet efficient scheme to address the
network bottleneck issue in distributed rate allocation and performance optimization. (4)
Unlike many network resource allocation and performance optimization algorithms in the
literature, which can only handle convex network utility functions, the proposed optimization
framework is generic: it can handle nonconvex network utility functions, does not even
require their specific expressions, and can be extended to other network resource allocation
and performance optimization problems with slight modification.
4.2 Resource Allocation for Video Mesh Networks
In this section, we formulate the resource allocation problem for video mesh networks
and then discuss a generic distributed solution for this type of problems.
4.2.1 Formulation of Generic Resource Allocation Problems
In this work, we model the mesh network as a graph with V network nodes, V =
{1, 2, · · · , V }, and L logical links, L = {1, 2, · · · , L}. The mesh network is shared by a set of
video transmission sessions (or streams), denoted by S = {1, 2, · · · , S}, as illustrated in
Fig. 4.1. In a large-scale mesh network, for example, a community or enterprise network
supporting online video chatting or conference services, there could be a large number of
simultaneous video sessions crossing over the mesh network and sharing communication
links.
Figure 4.1: Illustration of video communication over mesh networks.
In performance optimization of video mesh networks, all network nodes need to
collaborate in video streaming and network resource utilization so as to maximize the
overall system performance under resource constraints. Let X =
{x1, x2, · · · , xN} be the set of resource parameters. Example resource parameters include
encoding bit rate of a video stream, link transmission rate, transmission power, and delay
bound¹ [70, 76]. There are two types of resource parameters: local and global resource
parameters. A local parameter is associated with a specific network node or link, or a local
neighborhood of nodes or links. For example, data processing and transmission power are
two local resource parameters. A global parameter involves a set of network nodes or links
that are spatially distributed over the network. For example, the encoding bit rate of a
video stream is a global parameter since a video stream involves a series of network nodes
and links along its transmission path.
The network operates under resource constraints. There are two types of constraints:
¹In video communication over multi-hop networks, delay is also an important resource since, with a large delay bound, the network has more flexibility in scheduling so as to improve its data communication efficiency.
independent constraints, such as the energy constraint, which are uniquely associated with
one specific node or link; and inter-dependent constraints, such as flow balance constraints,
which are often associated with a neighborhood of network nodes or links, where different
constraints depend on each other.
As in the literature, a network utility function, denoted by U, is used to describe the
overall system performance. Since the overall system performance depends on the
configuration of resource parameters, U is a function of {x_1, x_2, · · · , x_N}, denoted by
U(x_1, x_2, · · · , x_N). Now, the problem of performance optimization under resource constraints
can be formulated as
max U(x1, x2, · · · , xN) (4.1)
s.t. independent constraints
interdependent constraints.
In the following, we will use rate allocation as an example to show how a practical
resource allocation problem could be fitted into the formulation in (4.1).
4.2.2 Basic Framework for Distributed Resource Allocation
Distribution of computation implies decomposition. More specifically, we need to de-
compose the global optimization problem with a global network utility function (objective
function) and all resource constraints into a set of local optimization problems with a lo-
cal objective function and local resource constraints, as illustrated in Fig. 4.2. There are
two basic decomposition methods, primal and dual decomposition. The former is based
on decomposing the original primal problem, whereas the latter is based on decomposing
the Lagrangian dual problem. In dual decomposition, a pricing method is often used to
coordinate the resource utilization between different optimization modules [17].
After decomposition, neighboring optimization modules are allowed to exchange status
information through local communication. In optimization over networks, it is highly desir-
able that the optimization modules operate in an asynchronous manner. More specifically,
each local optimization module immediately moves to its next optimization step / itera-
tion using the available status information received so far from neighboring nodes instead
of waiting for synchronous status information exchange with other nodes. If a distributed
asynchronous optimization scheme is effective, after a number of rounds of local optimization
and information exchange, the overall system performance should approach the optimum
obtained by global optimization. If that is not possible, it should at least achieve a sub-optimum.
From (4.1), we can see that the major task in decomposing a global constrained op-
timization problem into a set of local optimization modules is to decompose the network
utility function with global resource parameters and handle those inter-dependent resource
constraints. To decompose the global utility function, our basic idea is to introduce a set of
local resource parameters to replace the global ones and decompose the network utility func-
tion into local ones which only have local resource parameters. In doing so, we need to make
sure theoretically and/or experimentally that the local network utility functions converge
to the global optimum during distributed optimization. To handle those inter-dependent
resource constraints, our basic idea is to use local communication between optimization
modules to exchange information and let them negotiate with each other to make sure that
the inter-dependent resource constraints are satisfied during distributed optimization.
In the following sections, based on this observation and a swarm intelligence principle
[16], we will develop a distributed asynchronous scheme called DAPSO for rate allocation
and performance optimization for video mesh networks.
4.3 Distributed Rate Allocation for Video Mesh Network
In this section, we first formulate the rate allocation problem within the basic framework
of (4.1). We discuss how this problem can be decomposed into local optimization problems
to be solved using DAPSO.

Figure 4.2: Illustration of distributed asynchronous optimization.
4.3.1 Problem Formulation
In rate allocation, the major resource constraint is in the form of link capacity and
the resource parameters are the bit rate of each video stream. While determining the bit
rates of video streams in order to maximize the overall video communication performance,
we need to make sure that the aggregated bit rate of all video streams on every link of the
network does not exceed its link capacity [2, 75]. More specifically, suppose we have N video
streams (sessions) that are sharing communication links and crossing over the network. Let
xn ∈ X = {x1, x2, · · · , xN} be the bit rate of video stream n. Note that xn is a global
resource parameter. Let
X_L[l] = {x_n ∈ X | video stream n uses link l},   (4.2)

which is the subset of video sessions that use link l.
Let Cl be the link capacity. Here, the link capacity can be considered as the average
number of information bits per second that can be successfully transmitted over the link.
A number of factors, including SIR (signal-interference ratio), transmission distance, com-
munication protocols, forward error correction (FEC) schemes, and packet re-transmission,
contribute to this link capacity [77]. According to the link capacity constraint, we have

∑_{x_n ∈ X_L[l]} x_n ≤ C_l,   1 ≤ l ≤ L.   (4.3)
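As a concrete illustration, checking the capacity constraint in (4.3) amounts to summing, for every link, the rates of the sessions routed over it. The sketch below is illustrative only; the routes, capacities, and the helper name `is_feasible` are hypothetical, not part of the dissertation's experiments.

```python
# Feasibility check for the link-capacity constraint (4.3): on every link,
# the aggregate rate of the sessions crossing it must not exceed C_l.
# The routes and capacities below are hypothetical example values.

def is_feasible(rates, routes, capacity):
    """rates: {session: kbps}; routes: {session: set of links};
    capacity: {link: C_l in kbps}."""
    for link, c in capacity.items():
        aggregate = sum(r for s, r in rates.items() if link in routes[s])
        if aggregate > c:
            return False
    return True

routes = {"s1": {"l1", "l2"}, "s2": {"l1"}, "s3": {"l2"}}
capacity = {"l1": 300, "l2": 180}
print(is_feasible({"s1": 100, "s2": 150, "s3": 60}, routes, capacity))  # True
print(is_feasible({"s1": 150, "s2": 100, "s3": 60}, routes, capacity))  # False (l2: 150 + 60 > 180)
```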
In the following, we define the network utility function for video communication over
mesh networks. The ultimate goal in video communication system design is to provide users
with the best-quality videos. Therefore, the system performance should be measured by the
overall video quality. A commonly used metric for measuring the quality of a single video
stream is the encoding distortion. As suggested by the literature on rate-distortion (R-D)
modeling for video coding [72], we use the following R-D model

D(x_n) = σ_n² · 2^{−λ x_n},   (4.4)

to describe the relationship between video coding distortion D and source rate x_n for video
stream n. Here, σ_n² represents the picture variance; more specifically, within the context of
motion-prediction-based video coding, it is the variance of the difference picture after motion
compensation. λ is an encoder-related parameter. It should be noted that in this work
we just use the R-D model in (4.4) as an example to demonstrate the proposed DAPSO
algorithm. Certainly, this model can be replaced by any other more accurate R-D model de-
veloped in the literature [72, 78]. As we can see from the following sections, the optimization
procedure of the proposed DAPSO algorithm does not depend on the specific expression of
the R-D model.
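For concreteness, the exponential R-D model in (4.4) can be evaluated directly. The σ² and λ values in this sketch are illustrative placeholders, not the parameters used in the experiments.

```python
# The exponential R-D model (4.4): D(x) = sigma^2 * 2^(-lambda * x).
# sigma2 and lam below are illustrative, not the experimental settings.

def distortion(rate, sigma2, lam):
    """Encoding distortion of one video stream at source rate `rate` (kbps)."""
    return sigma2 * 2.0 ** (-lam * rate)

# Distortion decreases monotonically as the bit rate grows:
print(distortion(20, sigma2=400, lam=0.023) > distortion(60, sigma2=400, lam=0.023))  # True
print(distortion(0, sigma2=400, lam=0.023))  # 400.0 -- at zero rate, D equals the variance
```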
Within the context of video communication over mesh networks with a large number of
simultaneous video streams, we need to establish a performance metric function, or a network
utility function, to characterize the overall system performance. One commonly used measure
to describe the overall video quality of multiple video streams is the aggregated video
distortion, i.e.,

U(x_1, x_2, · · · , x_N) = ∑_{n=1}^{N} D(x_n) = ∑_{n=1}^{N} σ_n² · 2^{−λ x_n}.   (4.5)
At this moment, we consider stationary video sources and assume that σ_n² is constant.
As noted by a number of researchers, when characterizing the overall quality over multiple
video streams, besides minimizing the aggregated video distortion, we also need to minimize
the quality variation between different video streams. From a user’s perspective, minimizing
the quality variation is also a very important part of maintaining the fairness among users
and different video services. By incorporating the quality variation into the network utility
function, we have
U(x_1, x_2, · · · , x_N) = ∑_{n=1}^{N} D(x_n) + ∑_{n=1}^{N} |D(x_n) − D̄|,   D̄ = (1/N) ∑_{n=1}^{N} D(x_n).   (4.6)
Now, the rate allocation and performance optimization problem can be formulated as

min U(x_1, x_2, · · · , x_N) = ∑_{n=1}^{N} [ D(x_n) + |D(x_n) − (1/N) ∑_{m=1}^{N} D(x_m)| ]   (4.7)

s.t. ∑_{x_n ∈ X_L[l]} x_n ≤ C_l,   1 ≤ l ≤ L,   (4.8)

x_n ≥ 0,   1 ≤ n ≤ N.   (4.9)
Here, the resource constraints in (4.8) are inter-dependent since they involve global resource
parameters in X, while those in (4.9) are independent.
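The objective in (4.7), aggregate distortion plus each stream's deviation from the mean distortion, can be sketched in a few lines. The R-D parameters and the helper name `utility` are hypothetical, chosen only to make the variation penalty visible.

```python
# Evaluating the network utility (4.7): aggregate distortion plus each
# stream's absolute deviation from the mean distortion. The R-D parameters
# below are illustrative, not the experimental settings.

def utility(rates, sigma2s, lam=0.023):
    d = [s2 * 2.0 ** (-lam * x) for x, s2 in zip(rates, sigma2s)]
    d_mean = sum(d) / len(d)
    return sum(d) + sum(abs(di - d_mean) for di in d)

# Two identical streams with equal rates pay no variation penalty; an
# unbalanced split of the same total rate scores worse (i.e., higher):
balanced = utility([30.0, 30.0], [200.0, 200.0])
skewed = utility([10.0, 50.0], [200.0, 200.0])
print(balanced < skewed)  # True
```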
From this rate allocation example we can see that the network utility function in video
communication over networks, U(x_1, x_2, · · · , x_N), is often a nonlinear, nonconvex (or
nonconcave) function. For large-scale networks, the performance optimization problems in
(4.1) and (4.7) are often high-dimensional, nonlinear, nonconvex constrained optimization
problems. Existing methods developed for convex optimization and existing algorithms for
distributed rate allocation, flow control, and resource allocation, such as the distributed
gradient-based Lagrangian dual algorithm, cannot be applied. In this work, based on a swarm
intelligence principle, we develop a distributed and asynchronous particle swarm optimization
(DAPSO) algorithm to solve the constrained nonlinear rate allocation problem in (4.7).
4.3.2 Decomposition
We propose to decompose the global optimization problem in (4.7) into a set of local
optimization modules, each of which is associated with a communication link. More
specifically, let L = {1, 2, · · · , L} be the set of links that are involved in the multiple-session
video communication, as illustrated in Fig. 4.1. In total, we have L local optimization modules.
For each link, which hosts a local optimization module, we define a set of local resource
parameters, X_l = {x_nl}, where x_nl represents the bit rate of video session n at local
optimization module l. We define the l-th local optimization module to be

min_{X_l} U_l(X_l) = ∑_{n ∈ X_L[l]} D(x_nl) + ∑_{n ∈ X_L[l]} |D(x_nl) − D̄|,   D̄ = (1/N) ∑_{n ∈ X_L[l]} D(x_nl),

s.t. ∑_{n ∈ X_L[l]} x_nl ≤ C_l,   (4.10)

x_nl ≥ 0,   n ∈ X_L[l].
Note that, for a video session, its bit rates at all links along the transmission path should
be the same. In other words,

x_nl = x_nk,   ∀ l, k ∈ L_S[n],   (4.11)

where L_S[n] is the set of links used by video session n. This is an inter-dependent flow
balance constraint [14]. We can see that the original global optimization problem has been
decomposed into L local optimization modules in (4.10) plus a set of inter-dependent
resource constraints in (4.11), which will be solved by the proposed DAPSO algorithm to
be presented in the following section. It should be noted that the specific decomposition
procedure depends on the actual problem formulation. However, we observe that the rate al-
location problem in (4.7) is quite representative and many other network resource allocation
problems share similar formulation, often having a network utility function as an optimiza-
tion objective plus inter-dependent resource constraints (e.g. flow balance constraints) and
several local resource constraints (e.g. link capacity and energy constraints) [17].
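As a minimal data-structure sketch of this decomposition, each link can host a module that keeps its own local copy x_nl of every session rate crossing it; the flow-balance constraint (4.11) then requires these copies to agree along each session's path. The function and field names below are hypothetical.

```python
# Data-structure sketch of the per-link decomposition: each link hosts a
# local module holding its own copy x_nl of every session rate that crosses
# it. Names (`build_local_modules`, dict layout) are illustrative only.

def build_local_modules(routes, capacity, init_rate=10.0):
    """routes: {session: list of links on its path}; capacity: {link: C_l}."""
    modules = {link: {"capacity": c, "rates": {}} for link, c in capacity.items()}
    for session, path in routes.items():
        for link in path:
            modules[link]["rates"][session] = init_rate  # local copy x_nl
    return modules

modules = build_local_modules({"s1": ["l1", "l2"], "s2": ["l1"]},
                              {"l1": 300, "l2": 180})
print(sorted(modules["l1"]["rates"]))  # ['s1', 's2'] -- both sessions cross l1
```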
4.4 Distributed and Asynchronous Particle Swarm Optimization
In the proposed DAPSO scheme, each local optimization problem is solved by parti-
cle swarm optimization (PSO). Through local communication, neighboring PSO modules
share important information in a distributed and asynchronous manner to expedite the
search process and make sure that the inter-dependent flow balance constraints in (4.13)
are satisfied. To do this, we propose to explore two major ideas: (1) in-network fusion
and particle migration to handle inter-dependent resource constraints and (2) collaborative
resource management to handle network bottleneck issues. In the following sections, we
discuss these design issues in more detail.
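As background, each local module runs a standard particle swarm update on its particles (the dissertation's own update rule is Eq. (2.19) in Section 2.3.1). The sketch below uses common textbook inertia and acceleration coefficients, not the experimental settings.

```python
import random

# One step of a generic particle swarm update, the building block each
# local DAPSO module applies to its particles. The inertia w and the
# acceleration coefficients c1, c2 are common textbook values.

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """Velocity/position update for a one-dimensional particle."""
    v = (w * v
         + c1 * random.random() * (pbest - x)   # pull toward personal best
         + c2 * random.random() * (gbest - x))  # pull toward group best
    return x + v, v

random.seed(0)
x, v = 0.0, 0.0
for _ in range(50):
    x, v = pso_step(x, v, pbest=25.0, gbest=25.0)
print(abs(x - 25.0) < 25.0)  # the particle has drifted toward the best position
```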
4.4.1 Original Optimization Problem Decomposition
In rate allocation, we propose to decompose the global optimization problem in (4.7)
into a set of local optimization modules, each of which is associated with a communication
link. More specifically, let L = {1, 2, · · ·L} be the set of links that are involved in the
multiple-session video communication, as illustrated in Fig. 4.1. In total, we have L local
optimization modules. For each link, which corresponds to a local optimization module,
we introduce a set of local resource parameters, Xl = {xnl}, where xnl represents the bit
rate of video stream n at local optimization module l. We define the l-th local optimization
module to be

min_{X_l} U_l(X_l) = w_1 · ∑_{n ∈ X_L[l]} D(x_nl) + w_2 · ∑_{n ∈ X_L[l]} |D(x_nl) − D̄*|,   D̄* = (1/N*) ∑_{n ∈ X_L[l]} D(x_nl),

s.t. ∑_{n ∈ X_L[l]} x_nl ≤ C_l,   (4.12)

x_nl ≥ 0,   n ∈ X_L[l].
Here w_1 and w_2 are weight parameters for the overall video distortion and the distortion
difference, and N* is the total number of video streams passing through link l. Note that, for a video stream,
its bit rate on each link along the transmission path should be the same. In other words,
x_nl = x_nk,   ∀ l, k ∈ L_S[n],   (4.13)
where LS[n] is the set of links used by video stream n. This is an inter-dependent flow
balance constraint. We can see that the original global optimization problem has been
decomposed into L local optimization modules in (4.12) plus a set of inter-dependent resource
constraints in (4.13), which will be solved by the following DAPSO algorithm.
4.4.2 In-Network Fusion and Particle Migration
In the proposed DAPSO algorithm, each local constrained optimization in (4.12) is
solved by the PSO algorithm discussed in Section 2.3.1. Each local PSO module has a set
of local particles moving around in the solution space defined by local constraints in (4.12)
and searching for optimum solution for the local optimization module. Neighboring PSO
modules share status information about their “group-best”, denoted by Xgl , and then use
this external information to guide the movement of their internal particles so as to meet the
inter-dependent resource constraints in (4.13) in a collaborative manner. This is achieved
by two major operations: in-network fusion and particle migration, as illustrated in Fig. 4.3.
Figure 4.3: Distributed and asynchronous PSO algorithm.
More specifically, during in-network fusion, neighboring local PSO modules exchange and
fuse information about their group-best particles so as to generate new group-best particles
which have the following two properties: (1) they still satisfy the independent constraints
of each local PSO module; in other words, they are still in the solution space; and (2) they
satisfy the inter-dependent resource constraints better than the previous group-best particles.
For example, consider two neighboring PSO modules l and k, 1 ≤ l, k ≤ L. Suppose video
stream n is optimized by both modules, and let x_nl^g and x_nk^g be the bit rates of video stream
n in the group-best particles found by local PSO modules l and k, respectively. During
in-network fusion, these two modules need to negotiate with each other to determine a
better bit rate for each, denoted by x̂_nl and x̂_nk, such that the inter-dependent resource
constraint is better satisfied. The specific fusion rule depends on the actual formulation of
the constraint. For example, in the rate allocation case, the following fusion rule can be
used:

x̂_nk = x̂_nl = min(x_nk^g, x_nl^g),   (4.14)
which gives the new group-best particles in local PSO modules l and k. The major reason that
we choose the “min” operation in (4.14) is that the new group-best particles still satisfy
the link capacity constraints in both DAPSO modules. Certainly, with this fusion rule
applied to all local PSO modules and for all video sessions, the new generation of group-best
particles of both modules will satisfy the inter-dependent flow balance constraint better.
Once the group-best particle in a local DAPSO module is updated, guided by the swarm
intelligence principle in (2.19), it will attract the rest of the particles towards this new location
in the subsequent local search. We call this operation particle migration because these
local particles move towards a new location due to external factors.
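The “min” fusion rule (4.14) can be sketched in a few lines; the module contents and helper name below are hypothetical example values.

```python
# The "min" fusion rule (4.14): two neighboring modules that both carry
# session n replace their group-best rates for n with the smaller of the
# two, which keeps both link-capacity constraints satisfied while moving
# the local copies toward flow balance. Values are hypothetical.

def fuse_group_best(gbest_l, gbest_k):
    """gbest_l, gbest_k: {session: group-best rate} of two neighbor modules."""
    for session in set(gbest_l) & set(gbest_k):
        fused = min(gbest_l[session], gbest_k[session])
        gbest_l[session] = gbest_k[session] = fused
    return gbest_l, gbest_k

gl = {"s1": 120.0, "s2": 80.0}   # module l also carries s2
gk = {"s1": 95.0, "s3": 40.0}    # module k also carries s3
fuse_group_best(gl, gk)
print(gl["s1"], gk["s1"])  # 95.0 95.0 -- the shared session now agrees
```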
4.4.3 Handling Network Bottleneck Issues Using Collaborative
Resource Control
One of the major challenges in network resource allocation is to deal with bottleneck
links, which have very limited bandwidth resources but are shared by a large number
of video sessions. This network bottleneck issue has not been adequately addressed in the
literature. The problem becomes even more difficult in distributed resource allocation
since local optimization modules may not be aware of bottleneck links in remote parts
of the network. Let us consider the example in Fig. 4.4. The network has four video
sessions. Video sessions s1 and s2 share link l1 whose link capacity is 300 kbps (kilo bits
per second). Sessions s1, s3, and s4 share link l2 whose capacity is only 180 kbps. Let x_{n,l_i},
1 ≤ n ≤ 3 and i = 1, 2, be the local bit rate of session s_n at local optimization module l_i.
We have the following two link capacity constraints:
size = 0.7, and increase window step size = 2. The parameters used in the WVSN are set as
follows: λ = 0.023, σ_1² = 100, σ_2² = σ_6² = 200, σ_3² = σ_5² = 400, and σ_4² = 800.
Figure 4.6: A randomly generated video mesh network with 16 nodes, 15 links, and 6 video sessions.
At the end of each iteration, we compute the average, minimum, and maximum utility
values of 6 video sessions at all local DAPSO modules. We obtain the global optimum
solution using brute-force search. We compare the DAPSO solution with the centralized
solution, which is the theoretical optimum but is impractical in a real network. Fig. 4.7
shows that the average, minimum, and maximum utility values all converge to the global
optima. When the DAPSO solution converges, all video sessions have similar quality, as
expected from the objective function in (4.7). This implies that the proposed DAPSO is
quite efficient.
To show more detail about the distributed rate allocation and interactions between
neighboring DAPSO modules, especially those at bottleneck links, we choose three links,
A, B, and C, as shown in Fig. 4.6 and plot the local bit rates of those video sessions that
pass through these three links in Figs. 4.8, 4.9, and 4.10, respectively. Here, we only
show the local bit rate from the group-best particle. We can clearly see that link A is a
bottleneck link for session 5, and more system resource (bandwidth) of link C is shifted
from session 5 to session 6 during the information sharing and resource negotiation process.
Figure 4.7: Convergence of network utility functions of DAPSO to the global optima (convex utility function).
4.5.3 Nonconvex Distributed Rate Allocation and Performance
Optimization Application
The system model used in the previous experiments has a convex utility function. Next,
we change the system model to a nonconvex utility function and repeat similar experiments.
We use the same R-D behavior analysis model for the video compression. We assume that
the performance of the whole WVSN is not only measured by the overall video distortion,
Figure 4.8: The traces of the particles moving in critical link A (convex utility function).
but also by the difference between every source. The system model [82] can be changed as
follows:

D = w_1 · ∑_{l=1}^{L} ∑_{n ∈ S(l)} D(x_n) + w_2 · ∑_{l=1}^{L} ∑_{n ∈ S(l)} |D(x_n) − D̄_l|.   (4.20)

Here w_1 and w_2 are weight parameters for the overall video distortion and the distortion
difference, and D̄_l is the average video distortion on link l in the network.
We test the proposed distributed and asynchronous optimization scheme using DAPSO
with different WVSN topologies. In the following experiment, the sensor network topology
is the same as in the previous experiment. The existing DAPSO parameters are the same as
in the previous experiment, and the new parameters are set to w_1 = 1 and w_2 = 0.1.
Similar to the previous experiment, we compute the average, minimum, and maximum
nonconvex utility function values of each video session at all local DAPSO modules. Fig. 4.11
shows that the average, minimum, and maximum utility values all converge to the global optima.
Similarly, to show more detail about the distributed rate allocation and the interactions between
neighboring DAPSO modules, especially at the bottleneck links A, B, and C which we mentioned
Figure 4.9: The traces of the particles moving in critical link B (convex utility function).
before. Figs. 4.12, 4.13, and 4.14 show the local bit rates of those video sessions that pass
through these three links, respectively. Here, we only show the local bit rate from the
group-best particle. The capacity usage of bottleneck links A, B, and C is near 100%, 84%,
and 65%, respectively, when the whole network is balanced.
The weight parameters control the optimization results. When we change the
parameters to w_1 = 1 and w_2 = 1, Fig. 4.15 shows that the average, minimum, and maximum
utility values all converge to the global optima, and Figs. 4.16, 4.17, and 4.18 show the
local bit rates of those video sessions that pass through the three bottleneck links A, B,
and C, respectively. Here we only show the local bit rate from the group-best particle. The
capacity usage of bottleneck links A, B, and C is near 68%, 52%, and 36%, respectively,
when the whole network is balanced. We can see that when the weight of the source rate
difference equals the weight of the source distortion, the link capacity usage drops
very quickly.
Figure 4.10: The traces of the particles moving in critical link C (convex utility function).
4.5.4 Comparison with Gradient-Based Lagrangian Dual Algorithms
Many existing methods for network utility maximization are based on the Lagrangian dual and
distributed gradient or sub-gradient search [17]. They often assume that the utility function
is convex. This is because, for convex functions, a local minimum is also a global minimum
and the duality gap between the primal and dual optimization problems is zero. In this
way, the decomposition of the centralized optimization problem can be performed on the
Lagrangian dual using a pricing approach where a resource price is set for each subproblem
to coordinate the resource utilization behaviors of local optimization modules [17].
In this section, we compare the proposed DAPSO algorithm with gradient-based
Lagrangian dual algorithms on convex network utility functions. To this end, we remove
the video quality smoothness term ∑_{n=1}^{N} |D(x_n) − D̄| from the network utility function
U(x_1, x_2, · · · , x_N) in (4.6). With this new convex network utility function, we apply the
gradient-based Lagrangian dual approach [17] to solve the rate allocation problem in (4.7) and
Figure 4.11: Convergence of network utility functions of DAPSO to the global optima (nonconvex utility function): w1 = 1 and w2 = 0.1.
compare it with the proposed DAPSO algorithm.
The Lagrangian is defined as follows:

L(R, p) = ∑_{l=1}^{L} ∑_{i ∈ S(l)} D_i + ∑_{l=1}^{L} p_l ( ∑_{i ∈ S(l)} R_i − c_l ),   (4.21)
where p_l is the price for link l. Using the gradient-based method [27, 28, 29], the link price
p_l can be adjusted as follows:

p_l(t + 1) = [ p_l(t) + α · (∂L/∂p_l)(R(t), p(t)) ]*,   (4.22)

where α is the step size, [x]* = max{x, β}, and β is the lower bound of the link price.
Since D_i is strictly convex and L(R, p) is continuously differentiable, we have

(∂L/∂p_l)(R(t), p(t)) = ∑_{i ∈ S(l)} R_i − c_l = R_l − c_l,   (4.23)
where R_l is the aggregate source rate at link l. Substituting (4.23) into (4.22), we obtain
the price adjustment scheme for link l ∈ L:

p_l(t + 1) = [ p_l(t) + α (R_l − c_l) ]*.   (4.24)
Figure 4.12: The traces of the particles moving in critical link A (nonconvex utility function): w1 = 1 and w2 = 0.1.
According to the optimality condition ∂L/∂R_i = 0, we can update the bit rate as follows:

R_i(t + 1) = − (1/λ) · ln( p_i(t) / (λ σ_i² ln 2) ) / ln 2,   (4.25)

where

p_i(t) = ∑_{l ∈ L(i)} p_l(t)   (4.26)
is the overall price for video session i on its transmission path. The above procedure
is repeated until the overall network utility function converges to the global optima. It
should be noted that, in distributed gradient search, the price information pl(t) should be
propagated along the video transmission path so that the overall price p_i(t) in (4.26) can be
computed.² Furthermore, the convergence behavior of distributed gradient search depends
on the initial settings of starting point and link pricing.
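For illustration, the price iteration (4.24) together with the rate update (4.25) can be simulated for a single link shared by two sessions. This is a toy sketch: the step size, price floor, R-D parameters, and the one-link setup are hypothetical simplifications, not the experimental configuration.

```python
import math

# Toy single-link run of the dual iteration: rate update (4.25) followed by
# the price update (4.24). Two sessions share one link; alpha (step size),
# beta (price floor), and the R-D parameters are hypothetical values.

lam, alpha, beta, cap = 0.023, 5e-4, 1.5e-4, 180.0
sigma2 = [200.0, 400.0]
price = 0.5  # initial link price p_l(0)

for _ in range(2000):
    # (4.25): R_i = -(1/lam) * ln(p / (lam * sigma_i^2 * ln 2)) / ln 2
    rates = [-math.log(price / (lam * s2 * math.log(2))) / math.log(2) / lam
             for s2 in sigma2]
    # (4.24): raise the price when the link is over-subscribed, floor at beta
    price = max(price + alpha * (sum(rates) - cap), beta)

print(abs(sum(rates) - cap) < 1.0)  # aggregate rate settles near the capacity
```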
²In essence, this is similar to the in-network fusion in the DAPSO algorithm because all effective distributed performance optimization schemes require local information sharing and propagation.
Figure 4.13: The traces of the particles moving in critical link B (nonconvex utility function): w1 = 1 and w2 = 0.1.
We compare distributed gradient search against the DAPSO algorithm on the example
in Fig. 4.6 with a convex network utility function, as discussed above. For distributed
gradient search, we set α = 0.5 × 10⁻³, β = 1.5 × 10⁻⁴, and the initial prices to 0.2, 0.5, and
0.8 in different experiments. Fig. 4.19 shows the convergence behaviors of DAPSO and
distributed gradient search with initial prices of 0.2, 0.5, and 0.8. It can be seen that DAPSO
converges to the global minimum much faster than distributed gradient search. It should
be noted that, in distributed gradient search, the link capacity constraint is enforced through
link pricing (or penalty) [17]. Therefore, when the link price is low (e.g. 0.2), video sessions
attempt to use higher bit rates (link bandwidth). In this case, the overall bit rate will exceed
the link capacity, and the overall network utility (video distortion) is even lower than in the
optimum case, as we can see in Fig. 4.19. Fig. 4.20 shows the average link bandwidth that
has been over-used by video sessions. The saw-tooth effect is caused by link price updates.
Figure 4.14: The traces of the particles moving in critical link C (nonconvex utility function): w1 = 1 and w2 = 0.1.
4.6 Discussion and Conclusion
In this chapter, based on the swarm intelligence principle, we have developed a distributed
asynchronous scheme for rate allocation and performance optimization of video mesh net-
works. We have studied the problem of decomposition of objective functions with global re-
source parameters and inter-dependent resource constraints. We have developed in-network
fusion and particle migration schemes for neighboring DAPSO modules to exchange their
group-best information to guide their local particle movements in search for the optimum
solution. By adjusting the resource budget window, we have developed a scheme to ad-
dress the network bottleneck issue. Our extensive simulation results demonstrate that the
proposed DAPSO algorithm is very efficient in distributed rate allocation and performance
optimization.
To test the response of DAPSO to time-varying video content, we double the variance
of video session 3 from 120 to 240 in the middle of simulation. Fig. 4.21 shows how the
Figure 4.15: Convergence of network utility functions of DAPSO to the global optima (nonconvex utility function): w1 = 1 and w2 = 1.
DAPSO adjusts its local resource allocation and re-converges to another optimum point. We
test the DAPSO algorithm under different network topologies. Fig. 4.22 shows different
wireless sensor network topologies with different numbers of sessions; there are 3, 4, 5, 6, 7,
and 8 sessions in sequence in our test. Table 4.1 gives the total video distortion and
communication cost for the DAPSO algorithm and the centralized optimization algorithm
using PSO under these different topologies. We can easily see that the DAPSO algorithm
achieves better results and has very low communication cost compared with the centralized
optimization algorithm. The centralized optimization algorithm using PSO assumes that there
exists one node which knows the whole network information; this algorithm is impractical and
too costly in actual networks. Figs. 4.23 to 4.25 show more convergence results for DAPSO
on other examples of random network topologies.
Compared with other methods, DAPSO has the following advantages:
(1) the algorithm is simple; (2) the algorithm is powerful, and its convergence speed
is very fast; (3) there is no predefined limitation on the objective function, unlike
Figure 4.16: The traces of the particles moving in critical link A (nonconvex utility function): w1 = 1 and w2 = 1.
other primal-dual based optimization algorithms; and (4) the algorithm is asynchronous
because each local module only checks its neighbors' communication information at specific
times.
Figure 4.17: The traces of the particles moving in critical link B (nonconvex utility function): w1 = 1 and w2 = 1.
Figure 4.18: The traces of the particles moving in critical link C (nonconvex utility function): w1 = 1 and w2 = 1.
Figure 4.18: The traces of the particles moving in critical link C of nonconvex function:w1=1 and w2=1