DEPARTMENT OF ELECTRICAL AND INFORMATION ENGINEERING
DEGREE PROGRAM IN ELECTRICAL ENGINEERING

POWER MINIMIZATION IN SINGLE-SINK DATA GATHERING WIRELESS SENSOR NETWORK VIA DISTRIBUTED SOURCE CODING

Author: Markus Leinonen

Supervisor: Markku Juntti

Accepted: ______ / ______ 2011

Grade: ______

Leinonen M. (2011) Power Minimization in Single-Sink Data Gathering Wireless Sensor Network via Distributed Source Coding. Department of Electrical and Information Engineering, University of Oulu, Oulu, Finland. Master’s thesis, 93 p.

ABSTRACT

Energy efficiency arises as a vital issue to consider in data gathering wireless sensor networks with energy-constrained sensor nodes. In order to achieve energy efficiency, one key enabler for the future wireless networks is cross-layer optimization. The objective of this thesis is to address distributed transmit power minimization in single-sink data gathering wireless sensor networks by utilizing cross-layer optimization.

The pursuit of energy efficiency starts with employing a lossless distributed source coding method, Slepian-Wolf coding, to remove all the redundancy in data. Slepian-Wolf coding is employed in a global or localized fashion, depending on the degree of the correlation of data available in the network. Then, the conventional data gathering model is extended to the wireless sensor network scenario by including the power-limited sensor nodes and the wireless links in the system. The data transmissions occur across capacity-constrained links that encounter no mutual interference. Multi-path routing is used across additive white Gaussian noise channels under Rayleigh fading. The data transmissions from each sensor node to the sink node are optimized with respect to the transmit powers. This is done with cross-layer optimization between the physical and network layers by jointly optimizing the power allocation and the routing in a distributed fashion. Two alternative frameworks are proposed for the optimization criterion: the first involves the total transmit power minimization and the second the minimization of the maximum transmit power, both appearing in convex form. The problem structures are exploited to derive distributed algorithms based on the dual decomposition technique, which distributes the solution process vertically across the protocol layers. Nevertheless, the second algorithm is shown to also require a small amount of centralized signalling.

Simulation results are provided to show the advantages of Slepian-Wolf coding and the functionalities of the proposed algorithms. The simulations were conducted with a simulator implemented in Matlab. The data transportation costs with Slepian-Wolf coding are compared to those with independent encoding in the single-sink data gathering scenario. The results show that the functionality of Slepian-Wolf coding relies heavily on the correlation properties of the data, and on the network and cluster sizes. The impact of Slepian-Wolf coding on the transmit power is studied in the wireless sensor network scenario. The results show that by using Slepian-Wolf coding, significant improvements in terms of energy efficiency can be achieved. The proposed distributed algorithms are shown to converge close to the optimal solutions in static channels. In addition, the algorithms are shown to be capable of tracking the solution under slow Rayleigh fading channels.

Keywords: Slepian-Wolf coding, cross-layer optimization, power allocation, routing, dual decomposition.

Leinonen M. (2011) Tehon minimointi yhden keruusolmun langattomassa datankeruuanturiverkossa hajautetun lähteenkoodauksen avulla [Power minimization in a single-sink data gathering wireless sensor network via distributed source coding]. University of Oulu, Department of Electrical and Information Engineering. Master’s thesis, 93 p.

TIIVISTELMÄ

Wireless data gathering sensor networks consist of energy-constrained sensor nodes, which makes the energy efficiency of the network a highly important issue. Energy efficiency can be improved by utilizing cross-layer optimization in future wireless sensor networks. The purpose of this work is to minimize the transmit power in a single-sink data gathering wireless sensor network in a distributed manner by means of cross-layer optimization.

To achieve good energy efficiency, redundancy is removed from the data with a lossless distributed source coding method, Slepian-Wolf coding. The sources are Slepian-Wolf coded either globally or cluster by cluster, depending on the amount of data correlation available. After this, data gathering is extended to wireless sensor networks by adding the power-limited sensor nodes and the wireless links to the system. Data is transferred over capacity-constrained, mutually non-interfering links using multi-path routing. The channels are modelled as Rayleigh fading, additive white Gaussian noise links. The data transmission from each node of the network to the sink node is optimized with respect to the transmit powers by utilizing cross-layer optimization. The optimization is carried out in a distributed manner between the physical and network layers by jointly optimizing the power allocation and the routing. Two different criteria are used as the optimization framework: the first comprises the minimization of the total power of the network and the second the minimization of the maximum power. By exploiting the convexity of the optimization problems and Lagrangian duality, distributed algorithms are derived as solution methods, with which the solution process can be distributed vertically across the protocol layers. The latter algorithm is, however, shown to also need a small amount of centralized signalling.

Simulation results are presented to show the benefits produced by Slepian-Wolf coding and to demonstrate the functionality of the proposed distributed algorithms. The simulations were carried out with a simulator implemented in Matlab. The costs incurred by data transfer are compared between Slepian-Wolf coding and independent source coding in a single-sink data gathering network. The results show how the functionality of Slepian-Wolf coding depends on the correlation properties of the data and on the sizes of the network and the clusters. The effect of Slepian-Wolf coding on the required transmit power is studied in a wireless sensor network. The results show that significant improvements in energy efficiency are achieved with Slepian-Wolf coding. The proposed distributed algorithms are shown to converge close to the optimal solutions in static channels. In addition, they are shown to be capable of tracking the optimal solution in slowly fading Rayleigh channels.

Keywords: Slepian-Wolf coding, cross-layer optimization, transmit power allocation, routing, Lagrangian duality.

TABLE OF CONTENTS

ABSTRACT
TIIVISTELMÄ
TABLE OF CONTENTS
FOREWORD
LIST OF ABBREVIATIONS AND SYMBOLS
1. INTRODUCTION
2. SLEPIAN-WOLF CODING
    2.1. Information Theoretical Concepts
        2.1.1. Entropy, Joint Entropy and Conditional Entropy
        2.1.2. Differential Entropy
        2.1.3. Gaussian Random Field
    2.2. Distributed Source Coding of Correlated Sources
    2.3. Slepian-Wolf Rate Region
3. SINGLE-SINK DATA GATHERING
    3.1. Single-Sink Data Gathering Network
    3.2. Single-Sink Data Gathering Problem
        3.2.1. Distributed Shortest Path Tree Algorithms
    3.3. Slepian-Wolf Rate Allocation Process in Gaussian Random Field
        3.3.1. Data Correlation Model
        3.3.2. Global Slepian-Wolf Coding Scenario
        3.3.3. Localized Slepian-Wolf Coding Scenario
4. SINGLE-SINK DATA GATHERING WIRELESS SENSOR NETWORK
    4.1. Network Topology
    4.2. Multi-path Routing Model
    4.3. Communication Model
5. TOTAL TRANSMIT POWER MINIMIZATION
    5.1. Centralized Approach for Joint Power and Routing Optimization
    5.2. Distributed Approach for Joint Power and Routing Optimization
6. MINIMIZATION OF MAXIMUM TRANSMIT POWER
    6.1. Centralized Approach for Joint Power and Routing Optimization
    6.2. Distributed Approach for Joint Power and Routing Optimization
7. NUMERICAL RESULTS
    7.1. Structure of the Simulator
        7.1.1. Creation of Network Topology
        7.1.2. Rate Allocation Process
        7.1.3. Wireless Sensor Network
    7.2. Data Transportation Costs in Single-Sink Data Gathering
    7.3. Total Transmit Power Minimization
        7.3.1. Total Transmit Power with Slepian-Wolf Coding
        7.3.2. Convergence of the Distributed Algorithm
    7.4. Minimization of Maximum Transmit Power
        7.4.1. Maximum Transmit Power with Slepian-Wolf Coding
        7.4.2. The Trade-off between Total and Maximum Transmit Powers
        7.4.3. Convergence of the Distributed Algorithm
8. DISCUSSION
9. SUMMARY
10. REFERENCES

FOREWORD

This master’s thesis has been carried out in the framework of Networks of 2020 (NETS2020) project in Centre for Wireless Communications (CWC) at the University of Oulu. I appreciate that the research for my thesis has been a part of the forward-looking project and I was allowed to be in a collaboration between the industry partners of the project. I would like to thank the Finnish Funding Agency for Technology and Innovation (Tekes), Nokia, Nokia Siemens Networks (NSN), Elektrobit, Ericsson Finland, Nethawk, Renesas Mobile Europe and University of Oulu for the financial support.

I would like to express my gratitude to the supervisor of this thesis, Professor Markku Juntti, for reviewing, commenting and examining the thesis. I am also grateful to Professor Marcos Katz who is the second examiner of this thesis. I want to thank my advisor, Mr. Juha Karjalainen, for giving me valuable guidance during the making of the thesis and providing indispensable aid and support through the completion of the work. I would like to thank all of my motivated colleagues and competent administration staff for creating a comfortable and innovative work environment.

I would like to express my sincere appreciation to my family and my friends for their support, encouragement and understanding they have provided. Especially, I would like to thank my colleague, M.Sc. Kalle "telecommunication guy" Lähetkangas, for keeping me humorous company at the work to relieve my stress of making this thesis.

Oulu, May 18, 2011

Markus Leinonen

LIST OF ABBREVIATIONS AND SYMBOLS

AWGN  additive white Gaussian noise
CC    capacity constraint
CR    cost ratio
CSI   channel state information
DSC   distributed source coding
FCL   flow conservation law
FDD   frequency division duplexing
FDMA  frequency division multiple access
LP    linear programming
MAC   medium access control
MMP   minimization of the maximum transmit power
MPR   maximum power ratio
NUM   network utility maximization
OSI   open systems interconnection
PDF   probability density function
PMF   probability mass function
QoS   quality of service
SISO  single input, single output
SPT   shortest path tree
SW    Slepian-Wolf
TPM   total power minimization
TPR   total power ratio
WSN   wireless sensor network

$\mathbf{1}$  column vector consisting of ones
$\mathbf{A}$  node-link incidence matrix
$a_{il}$  an entry of node-link incidence matrix
AB  matrix identifying the outgoing links of each node
$c$  the speed of light
$c_l$  the capacity of link $l$
$D$  number of clusters in the network
$d_0$  reference distance
$d_{i_1 i_2}$  the distance between nodes $i_1$ and $i_2$
$d^{\text{max-min}}_{i_1 i_2}$  the largest minimum distance among the nodes
$d^{t}_{i}$  transmission range of node $i$
$D_s$  Doppler spread
$e_{i_1 i_2}$  the edge between end nodes $i_1$ and $i_2$
$\mathbf{f}$  flow vector
$f_c$  carrier frequency
$f_l$  the flow on the link $l$
$f(R)$  a function that depends on rate $R$
$f(y)$  the probability density function of random variable $Y$
$f(\mathbf{y})$  the probability density function of random vector $\mathbf{Y}$
$\mathbf{g}$  vector of total transmit powers of nodes
$G_{\text{ssdgn}}$  graph for single-sink data gathering network
$G_{\text{wsn}}$  graph for wireless sensor network
$H(X)$  the entropy of random variable $X$
$H(X_1, X_2, \ldots, X_n)$  the joint entropy of random variables $X_1, X_2, \ldots, X_n$
$H(\mathbf{X}_{\mathcal{Q}}|\mathbf{X}_{\mathcal{U}})$  the conditional entropy of random vector $\mathbf{X}_{\mathcal{Q}}$ given $\mathbf{X}_{\mathcal{U}}$
$h(Y)$  the differential entropy of random variable $Y$
$i$  node index
$i$  local node index for clusters $\mathcal{C}_j$, $j = 1, 2, \ldots, D$
$(i_1, i_2)$  the link between nodes $i_1$ and $i_2$
$j$  cluster index
$k, m, u, q$  indices
$\mathbf{K}$  covariance matrix
$\mathbf{K}^j$  the covariance matrix associated with the sources in cluster $\mathcal{C}_j$
$l$  link index
$L$  number of directed wireless links
$M$  number of independent Gaussian random variables
$n$  number of random variables
$N$  number of source nodes in a network
$N + 1$  the sink node
$N_j$  number of source nodes in cluster $\mathcal{C}_j$
$\mathbf{p}$  vector of transmit powers of links
$p(x)$  the probability mass function of random variable $X$
$p(x_1, x_2, \ldots, x_n)$  the joint probability mass function of random variables $X_1, X_2, \ldots, X_n$
$p_l$  the transmit power allocated to link $l$
$P^{\text{tot}}_i$  the total transmit power of node $i$
$\mathbf{r}$  rate vector
$R$  source rate
$r_i$  the source rate of node $i$
$R^j_i$  the rate of node $i$ in cluster $\mathcal{C}_j$
$S_Y$  the support set of random variable $Y$, $S_Y = \{y \mid f(y) > 0\}$
$t$  iteration instance
$T_c$  coherence time
$T_{c,\text{norm}}$  normalized coherence time
$T_s$  duration of iteration instances
$v_r$  velocity of a receiver
$w_e$  the link weight of edge $e$
$w_i$  the total path weight of data transmission for node $i$
$w^{\text{SPT}}_i$  the total path weight of node $i$ on SPT
$w^{j,\text{SPT}}_i$  the total path weight of node $i$ in cluster $\mathcal{C}_j$ on SPT
$X$  discrete random variable
$\mathbf{X}$  vector of discrete random variables
$\hat{X}$  estimate of $X$
$Y$  continuous random variable
$\mathbf{Y}$  continuous random vector
$Y^{\Delta}$  quantized version of continuous random variable $Y$
$\mathbf{Y}^{\Delta}$  quantized version of continuous random vector $\mathbf{Y}$
$Y$  Gaussian random variable
$Y^j_i$  the continuous random variable in cluster $\mathcal{C}_j$
$\mathbf{Y}^j$  the continuous random vector in cluster $\mathcal{C}_j$
$Z$  random variable of information
$\mathbf{Z}$  vector of random variables of information

$\alpha, \beta$  step sizes
$\gamma_l$  the channel condition factor of link $l$
$\Delta$  quantization step length
$\varepsilon$  a positive constant
$\zeta, \lambda, \nu, \omega$  Lagrange multipliers
$\eta$  number of bits used in quantization
$\theta$  correlation coefficient
$\kappa_l$  Rayleigh distributed channel coefficient of the link $l$
$\mu$  mean value
$\boldsymbol{\mu}$  mean vector
$\boldsymbol{\mu}^j$  the mean vector associated with the sources in cluster $\mathcal{C}_j$
$\mu^j_i$  the mean value of source $i$ in cluster $\mathcal{C}_j$
$\xi$  a positive constant
$\pi_{km}$  a constant
$\Sigma_{i_1 i_2}$  the correlation model associated with nodes $i_1$ and $i_2$
$\sigma^2_Y$  the variance of random variable $Y$
$\tau$  the epigraph variable
$\Upsilon_{i_1 i_2}$  channel noise realization of link $(i_1, i_2)$
$\varsigma^2_l$  additive Gaussian noise power associated with link $l$
$\varphi$  the trade-off parameter

$\mathcal{A}$  set of nodes in WSN
$\mathcal{C}_j$  cluster of source nodes
$\mathcal{C}_j$  cluster of source nodes with the local indices
$\mathcal{D}$  dual function
$\mathcal{E}$  set of edges
$\mathcal{I}(i)$  incoming links of node $i$
$\mathcal{K}$  subset of sensor nodes
$\mathcal{K}_j$  subset of sensor nodes in $\mathcal{C}_j$
$\mathcal{L}$  set of directed wireless links in WSN
$\mathcal{L}$  Lagrangian
$\mathcal{M}$  subset of sources
$\mathcal{N}$  normal distribution
$\mathcal{O}(i)$  outgoing links of node $i$
$\mathbb{R}$  the set of real numbers
$\mathcal{R}_{\text{SW}}$  Slepian-Wolf rate region
$\mathcal{S}$  set of source nodes
$\mathcal{T}$  set of nodes
$\mathcal{U}, \mathcal{Q}$  set of indices
$\mathcal{X}$  the discrete alphabet of random variable $X$
$\mathbb{Z}$  the set of integer numbers
$\mathcal{Z}$  set of sources

$\mathbf{a}^{c}_{l}$  column vector consisting of $l$th column of matrix $\mathbf{A}$
$\mathbf{a}^{r}_{i}$  column vector consisting of $i$th row of matrix $\mathbf{A}$
$\mathrm{cov}(Y_1, Y_2)$  the covariance of random variables $Y_1$ and $Y_2$
$\det(\mathbf{K})$  the determinant of matrix $\mathbf{K}$
$\log_2(\cdot)$  logarithm of base 2
$\mathrm{var}(Y)$  the variance of random variable $Y$
$\mathcal{K}^c$  the relative complement of set $\mathcal{K}$
$O(\cdot)$  big-O notation for complexity analysis
$|\mathbf{X}|$  the cardinality of vector $\mathbf{X}$
$\|\cdot\|$  absolute value
$\|\cdot\|_1$  the L1-norm
$\|\cdot\|_\infty$  the L∞-norm
$[\upsilon]^+$  projection on to the set of non-negative numbers, $[\upsilon]^+ = \max\{0, \upsilon\}$

1. INTRODUCTION

Wireless sensor networks (WSNs) have been widely proposed for different kinds of measuring, monitoring and surveillance purposes in medical, industrial and military applications [1, 2]. Networks consisting of multiple collaborative sensor nodes have been proposed, for instance, for vehicle traffic monitoring, military reconnaissance and surveillance, and habitat monitoring [1]. Due to the nature of WSNs and the rough operation environment, issues that must be carefully taken into account include the small physical size of sensor nodes, low infrastructure, energy consumption and computation power, robustness in a rough operation environment, resilience to failures, high communication efficiency, and autonomous, distributed operation of the nodes [2–4]. In particular, in WSNs consisting of battery-powered sensor nodes with a restricted possibility of recharging, energy efficiency arises as a vital issue to consider [2]. In some applications, battery replacement can even be impossible due to the deployment of the network in inaccessible or hostile environments [5].

Due to the nature of applications running in WSNs, one of the main concepts to include is a distributed networking architecture [2]. Distributed networking is the alternative to centralized networks, in which one main coordinator is responsible for network management and maintenance and most of the processing load can be allocated to this head unit. A network with a central coordinator unit requires a large amount of overhead to be disseminated in the network. In a distributed network approach, the need for a network head operator can be avoided, since the nodes are designed to operate autonomously and need to exchange only a small amount of communication overhead with each other.

In traditional network planning, the layers of the so-called Open Systems Interconnection (OSI) model are designed individually and independently. This leads to optimization of network parameters related to only one specific communication layer at a time. However, in wireless network settings, the parameters of one layer also significantly affect the other layers: the layers are inherently coupled [6]. For instance, if the data rates are readjusted in the network, the optimal power allocation is altered in the physical layer and the optimal routing of data to the destination is influenced in the network layer [6]. To achieve the optimal performance in the network, optimization within each layer is clearly not enough. Thus, contrary to the traditional layered networking structure with virtually strict boundaries between layers, network planning can be done with the aid of cross-layer optimization, which takes the interdependence of the communication layers into account.

Distributed source coding (DSC) has been a hot topic in the field of sensor networks with autonomously operating sensor nodes and a low-overhead communication infrastructure [2, 7–10]. The basic idea of source coding, or compression, is to remove unnecessary redundancy from data such that the resulting code words are the most compressed versions possible [11, Ch. 7]. Information theory gives the fundamental and well-known result for the ultimate data compression rate for lossless coding, which is the entropy of the source [11, Ch. 1]. A major result in the field of distributed source coding was given by Slepian and Wolf in [12]. They showed that for two correlated sources the total rate given by the joint entropy is sufficient for lossless representation of data, even without collaboration between the nodes. By optimizing the source rates in the network, the need for valuable communication resources, such as transmit powers, is decreased, leading to significant improvements in energy efficiency.
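As a small worked illustration of the Slepian-Wolf result, consider the following hypothetical source pair (an assumption made for this example only, not a model used in the thesis). Let $X_1$ be a fair binary source and let $X_2 = X_1 \oplus E$, where $E$ is independent of $X_1$ and equals 1 with probability 0.1. Then

$$
H(X_1) = H(X_2) = 1 \text{ bit}, \qquad H(X_1, X_2) = H(X_1) + H(X_2|X_1) = 1 + h_b(0.1) \approx 1.469 \text{ bits},
$$

where $h_b(q) = -q \log_2 q - (1 - q) \log_2 (1 - q)$ is the binary entropy function. Independent encoding would spend 2 bits per pair of samples, whereas a total rate of about 1.469 bits suffices with Slepian-Wolf coding, even though the two encoders do not communicate.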

The correlated data gathering problem in sensor networks has been extensively investigated in the literature [3, 4, 7–9, 13]. In the correlated data gathering problem, the correlated data originating from the observations of densely deployed sensor nodes has to be transported to a central node for further data processing [14]. A DSC method, Slepian-Wolf (SW) coding, can be used to efficiently exploit the correlation structure of data and completely remove the redundancy in the data [8]. The reduction in total data in the network has a notable influence on the cost function of transporting the data to a sink node, which essentially corresponds to the total energy consumption of the nodes [10, 14]. Cristescu et al. [14] proved that a shortest path tree (SPT) is the optimal routing structure for transporting all the correlated data to the destination with a minimum cost. However, they ignored the influence of wireless links on the overall network performance by considering only distance-dependent link weights with no interference.

Mathematical optimization techniques have enabled the use of cross-layer design, leading to performance improvements over traditional network planning in wireless data networks [6]. Decomposition theory represents a mathematical field that makes it possible to create an analytical foundation for the design of modularized and distributed protocols in networks [15]. Recently, network utility maximization (NUM) has emerged as a typical framework for investigating different cross-layer related issues and optimization techniques [6]. One of the techniques is the so-called dual decomposition technique, which is widely covered in the existing literature of wireless networking [4, 6, 9, 15–18]. By applying the dual decomposition method to the global optimization problem, the solution can be found by coordinating the cross-layer interface with Lagrange multipliers between nodes [6]. After the decomposition, the structure of the resulting problem allows a gradient or subgradient method to be used to solve the problem iteratively [15]. These methods are simple and memory saving, they offer the option of parallel implementation, and, most importantly, they yield a distributed algorithm [15].
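The mechanics of dual decomposition can be sketched on a toy problem. The example below is only an illustration under stated assumptions and is not one of the algorithms derived in this thesis: the quadratic objective, the single coupling constraint, and the names `dual_decomposition`, `targets`, `budget` and `price` are all hypothetical. Each "node" minimizes its own part of the Lagrangian for a given multiplier, and the multiplier is then updated by a gradient step on the dual function.

```python
def dual_decomposition(targets, budget, step=0.5, iterations=60):
    """Toy problem: minimize sum_i (x_i - a_i)^2 subject to sum_i x_i = budget.

    The Lagrangian sum_i (x_i - a_i)^2 + price * (sum_i x_i - budget)
    separates over i once the multiplier `price` is fixed.
    """
    price = 0.0  # Lagrange multiplier of the coupling constraint
    x = list(targets)
    for _ in range(iterations):
        # Local step: node i minimizes (x_i - a_i)^2 + price * x_i by itself,
        # which gives x_i = a_i - price / 2 in closed form.
        x = [a - price / 2.0 for a in targets]
        # Coordination step: gradient ascent on the dual function, whose
        # gradient is the constraint violation sum_i x_i - budget.
        price += step * (sum(x) - budget)
    return x, price


x, price = dual_decomposition([3.0, 1.0, 2.0], budget=3.0)
print("allocation:", [round(v, 6) for v in x])
print("multiplier:", round(price, 6))
```

Only the scalar multiplier has to be exchanged between the nodes in this sketch, which is the property that makes the approach attractive for distributed implementation. The step size is chosen small enough for this particular toy problem; in general it has to be tuned.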

Thus, in addition to DSC, a cross-layer design is a key enabler for outperforming the designs of conventional intra-layer networking systems in terms of energy efficiency. Yuen et al. [4] proposed a distributed algorithm to minimize the total transmission energy consumption in a sensor network by using SW coding for the rate allocation and by finding an optimal transmission structure based on the cost functions of data transportation. The design included congestion control due to the interference on the links, but power allocation was not included since the link capacities were considered fixed. In practice, neglecting the varying conditions of the wireless links in the design can cause a considerable performance drop in a real deployment. Ramamoorthy [3] considered the minimum cost joint rate and flow allocation with fixed capacity links. The rate allocation involved SW coding, which was employed with a greedy algorithm due to the special property of conditional entropy.

A joint design of routing and power allocation provides an efficient way to deal with networking issues in the network and physical layers, respectively. Xiao et al. [17] investigated simultaneous routing and resource allocation for wireless data networks. They exploited the structure of the convex problem formulation via dual decomposition and derived efficient methods for finding the solution in a distributed fashion for the maximum-utility problem. In the system, mutual interference between the links was ignored by assuming an orthogonal multiple access scheme. He et al. [9] proposed a distributed algorithm that involves the joint optimization of the routing, the random access and the power allocation with the objective of maximizing the network lifetime in WSNs. The results showed improvements in network lifetime over the design that minimizes the total energy consumption in the network, as in [4]. The network lifetime was defined as the period between the initial deployment of the network and the exhaustion of the battery of the first node. Yuan et al. [18] addressed a cross-layer optimization framework by jointly optimizing the source quantization, the routing and the power allocation in a WSN. They proposed an algorithm to efficiently solve the problem in a modularized way with an objective that introduced a trade-off between the minimization of the total transmit power and the distortion incurred in the estimation process.

However, all these works assume that the channels remain fixed during the iterative convergence of the distributed algorithms. In practice, the channels in WSNs are time-varying, which raises a requirement of tracking ability for the optimization protocol. Cheng et al. [19] studied the tracking properties of a distributed scaled gradient projection algorithm under time-varying channels in a multi-carrier interference network. Chen et al. [20] considered the NUM problem under time-varying channels by applying a primal-dual scaled gradient algorithm with dynamic scaling matrices.

The objective of the thesis is to find a way to achieve energy efficient communications in the single-sink correlated data gathering wireless sensor network scenario. A convenient approach is to use distributed source coding, SW coding, to remove the redundancy in data, and then to optimize the data transmission to the destination in the network. The optimization is done by jointly optimizing the routing and the power allocation in a distributed fashion. The main contribution of this thesis is to propose distributed algorithms, based on dual decomposition, for two alternative optimization frameworks in the single-sink data gathering WSN. The first framework covers the total transmit power minimization and the second considers the minimization of the maximum transmit power. The functionalities of the proposed algorithms are studied under static and time-varying Rayleigh channel conditions.

This thesis is organized as follows. In Chapter 2, fundamental issues related to SW coding of correlated sources are covered from the information theoretical view. Chapter 3 defines the concept of single-sink data gathering problem with SW coding. The chapter includes two algorithms for finding the solution by using either global or localized SW coding. In Chapter 4, the concept of single-sink data gathering is extended to wireless sensor networks by giving the definitions for the essential system parameters. The total transmit power minimization problem in single-sink data gathering WSN is stated in Chapter 5. The chapter proposes a distributed algorithm for finding the solution to the problem. As an alternative, Chapter 6 includes the minimization of the maximum transmit power problem and proposes an algorithm to solve it in a distributed fashion with some centralized networking needed. Numerical results are provided in Chapter 7 to show the benefits of using SW coding in terms of energy efficiency and to examine the functionalities of the proposed distributed algorithms. Finally, issues for further study are discussed in Chapter 8 and the work is summarized in Chapter 9.

2. SLEPIAN-WOLF CODING

In terms of lossless distributed source coding, one promising method is the so-called Slepian-Wolf coding, which has been extensively investigated with the main aim of implementing it in applications running in sensor networks [2]. Slepian-Wolf coding is a distributed source coding method in which multiple correlated information sources can be compressed without collaboration between the sources [2]. However, the joint decoding of data has to be performed at the receiver and the correlation structure of data has to be known a priori. Slepian-Wolf coding efficiently exploits the correlation structure of data by completely removing the redundancy in data [8].

In Section 2.1, the fundamental information theoretical properties of random variables and vectors are defined as a basis for distributed source coding. Gaussian random field is introduced as a special case for defining the stochastic properties of random data. Section 2.2 covers the problem regarding the lossless distributed source coding of correlated sources with different collaboration schemes between encoders. The concept of Slepian-Wolf rate region is discussed in Section 2.3 for defining the admissible rates that can be achieved with Slepian-Wolf coding of correlated sources.

2.1. Information Theoretical Concepts

In this section, the concepts of entropy, joint entropy and conditional entropy with their basic properties are defined for discrete random variables and vectors. The essential chain rules are introduced to express the relationships between entropy, joint entropy and conditional entropy. Correspondingly, differential entropy is introduced for continuous random variables and vectors. The relation between a continuous random vector and its uniformly quantized version is defined related to data quantization. Gaussian random field is described for estimating the properties of correlated data. Particularly, differential entropy, joint entropy and conditional entropy are determined for Gaussian distributed random variables and vectors.

2.1.1. Entropy, Joint Entropy and Conditional Entropy

In information theory, entropy defines the ultimate data compression rate for lossless coding of a random variable [11, Ch. 1]. The entropy of discrete random variable $X$ with a discrete probability mass function (PMF) $p(x)$ and with discrete alphabet $\mathcal{X}$ is defined as [11, Ch. 2]

$$
H(X) = \sum_{x \in \mathcal{X}} p(x) \log_2 \frac{1}{p(x)} = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x). \tag{2.1}
$$

The property $0 \le p(x) \le 1$ implies that $\log_2 \frac{1}{p(x)} \ge 0$. Thus, the entropy of a random variable is always non-negative, that is, $H(X) \ge 0$. Entropy is a measure of the average uncertainty of a random variable. It gives the average number of bits needed to describe it.¹ The entropy of random variable $X$ is defined only as a function of its probability distribution, meaning that the actual values taken by the random variable have no influence on the entropy. [11, Ch. 2]
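Definition (2.1) can be evaluated directly for small alphabets. The following is a minimal sketch; the function name `entropy_bits` and the example PMFs are assumptions made for illustration. It also shows that relabeling the outcomes leaves the entropy unchanged, since only the probabilities enter the sum.

```python
import math


def entropy_bits(pmf):
    """Entropy H(X) in bits as in (2.1); pmf maps each outcome x to p(x)."""
    probabilities = list(pmf.values())
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p(x) = 0 are skipped: they contribute nothing to the sum.
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0.0)


fair_coin = entropy_bits({"heads": 0.5, "tails": 0.5})
biased_coin = entropy_bits({0: 0.9, 1: 0.1})
# Same probabilities attached to very different values: the entropy is identical.
relabeled = entropy_bits({-1000.0: 0.9, 1000.0: 0.1})
certain = entropy_bits({"always": 1.0})

print(f"fair coin:   {fair_coin:.3f} bits")
print(f"biased coin: {biased_coin:.3f} bits")
print(f"relabeled:   {relabeled:.3f} bits")
print(f"certain:     {certain:.3f} bits")
```

The fair coin needs one bit per outcome, the biased coin about 0.469 bits, and a deterministic outcome none, in line with the interpretation of entropy as average uncertainty.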

By extending the definition of entropy of a single variable to the case of several random variables, joint entropy is obtained. The joint entropy of $n$-dimensional discrete random vector $\mathbf{X} = [X_1, X_2, \ldots, X_n]^{\mathrm{T}} \in \mathbb{R}^n$ with joint PMF $p(x_1, x_2, \ldots, x_n)$ and with discrete alphabets $\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_n$, respectively, is defined as [11, Ch. 2]

$$
H(X_1, X_2, \ldots, X_n) = -\sum_{x_1 \in \mathcal{X}_1, x_2 \in \mathcal{X}_2, \ldots, x_n \in \mathcal{X}_n} p(x_1, x_2, \ldots, x_n) \log_2 p(x_1, x_2, \ldots, x_n). \tag{2.2}
$$

A relationship between the joint entropy $H(X_1, X_2, \ldots, X_n)$ and the individual entropies $H(X_1), H(X_2), \ldots, H(X_n)$ can be written with the inequality

$$
H(X_1, X_2, \ldots, X_n) \le \sum_{k=1}^{n} H(X_k) = H(X_1) + H(X_2) + \ldots + H(X_n), \tag{2.3}
$$

where the equality holds if and only if $X_1, X_2, \ldots, X_n$ are statistically independent [11, Ch. 2].
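Inequality (2.3) can be checked numerically for a small case. The sketch below uses a hypothetical joint PMF of two binary variables (an assumption for illustration only) and compares the joint entropy with the sum of the marginal entropies, both for the correlated pair and for an independent pair with the same marginals.

```python
import math
from itertools import product


def entropy_bits(probabilities):
    """Entropy in bits of a collection of probabilities, as in (2.1) and (2.2)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0.0)


def marginal(joint, position):
    """Marginal PMF of the variable at the given position of the outcome tuple."""
    pmf = {}
    for outcome, p in joint.items():
        pmf[outcome[position]] = pmf.get(outcome[position], 0.0) + p
    return pmf


# Hypothetical joint PMF p(x1, x2) of two correlated binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

p1, p2 = marginal(joint, 0), marginal(joint, 1)
h_joint = entropy_bits(joint.values())
h_sum = entropy_bits(p1.values()) + entropy_bits(p2.values())

# Independent pair with the same marginals: p(x1, x2) = p(x1) p(x2).
independent = {(a, b): p1[a] * p2[b] for a, b in product(p1, p2)}
h_independent = entropy_bits(independent.values())

print(f"H(X1, X2)             = {h_joint:.4f} bits")
print(f"H(X1) + H(X2)         = {h_sum:.4f} bits")
print(f"H(X1, X2), independent = {h_independent:.4f} bits")
```

The gap between `h_sum` and `h_joint` is the redundancy that joint (or Slepian-Wolf) encoding can remove; it vanishes for the independent pair.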

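In general, the conditional entropy defines the remaining uncertainty of a set of random variables given another, disjoint set of random variables. Let $\mathbf{X}_{\mathcal{Q}}$ and $\mathbf{X}_{\mathcal{U}}$ be disjoint vectors selected from vector $\mathbf{X}$. The subscripts $\mathcal{Q}$ and $\mathcal{U}$ define the sets of indices, such that $\mathcal{Q} = \{q_1, q_2, \ldots, q_{|\mathbf{X}_{\mathcal{Q}}|}\} \subset \{1, 2, \ldots, n\}$, $|\mathbf{X}_{\mathcal{Q}}| < n$ and $\mathcal{U} = \{u_1, u_2, \ldots, u_{|\mathbf{X}_{\mathcal{U}}|}\} \subset \{1, 2, \ldots, n\}$, $|\mathbf{X}_{\mathcal{U}}| < n$, with $\mathcal{Q} \cap \mathcal{U} = \emptyset$ and $|\mathcal{Q}| + |\mathcal{U}| \le n$. The operator $|\cdot|$ denotes the cardinality, such that, e.g., $|\mathbf{X}_{\mathcal{Q}}|$ is the cardinality of vector $\mathbf{X}_{\mathcal{Q}}$. Conditional entropy for discrete random variables in vector $\mathbf{X}_{\mathcal{Q}}$ given vector $\mathbf{X}_{\mathcal{U}}$ can be expressed as [10]

$$
H(\mathbf{X}_{\mathcal{Q}}|\mathbf{X}_{\mathcal{U}}) = H(\mathbf{X}_{\mathcal{Q}}, \mathbf{X}_{\mathcal{U}}) - H(\mathbf{X}_{\mathcal{U}}). \tag{2.4}
$$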

A basic property is that conditioning reduces entropy [11, Ch. 2]. According to this,

$$H(\boldsymbol{X}_{\mathcal{Q}} | \boldsymbol{X}_{\mathcal{U}}) \le H(\boldsymbol{X}_{\mathcal{Q}}) \tag{2.5}$$

holds. The equality holds if and only if the variables in sets $\mathcal{Q}$ and $\mathcal{U}$ are independent.

The joint entropy of the random variables in $\boldsymbol{X}$ can be expressed by means of the chain rule with the following summation over conditional entropies [11, Ch. 2]:

$$H(X_1, X_2, \ldots, X_n) = \sum_{k=1}^{n} H(X_k | X_{k-1}, \ldots, X_1). \tag{2.6}$$
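The relations (2.3) to (2.6) can be checked numerically on a small example. The sketch below assumes a hypothetical joint PMF of two correlated binary variables; the helper names `H`, `marginal` and `cond_entropy` are illustrative and not taken from the thesis.

```python
import math
from collections import defaultdict

def H(pmf):
    """Entropy in bits of a PMF stored as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def marginal(joint, keep):
    """Marginal PMF over the coordinate positions listed in `keep`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)

def cond_entropy(joint, q, u):
    """H(X_Q | X_U) = H(X_Q, X_U) - H(X_U), as in (2.4)."""
    return H(marginal(joint, list(q) + list(u))) - H(marginal(joint, list(u)))

# Hypothetical joint PMF of two correlated binary variables (X1, X2);
# positions 0 and 1 of each outcome tuple hold x1 and x2.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

h1 = H(marginal(joint, [0]))
h2 = H(marginal(joint, [1]))
h12 = H(joint)
h2_given_1 = cond_entropy(joint, [1], [0])
```

For this example the joint entropy is strictly smaller than `h1 + h2` because the variables are correlated, and `h1 + h2_given_1` reproduces `h12`, which is the two-variable case of the chain rule (2.6).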

2.1.2. Differential Entropy

Differential entropy is defined as the entropy of a continuous random variable. Differential entropy differs from the discrete entropy in some basic properties. However, the definitions of joint entropy, conditional entropy and chain rules can be defined for continuous random variables in a similar way as they are defined for discrete random variables.

¹ The entropy of a random variable is measured in bits when the base of the logarithm is 2.

The differential entropy is defined as a function of the probability density function (PDF) instead of the probability mass function. The differential entropy of a continuous random variable $Y$ with PDF $f(y)$ is defined as [11, Ch. 8]

$$h(Y) = -\int_{S_Y} f(y) \log_2 f(y) \, dy, \tag{2.7}$$

where $S_Y$ is the support set of random variable $Y$ such that $S_Y = \{y \,|\, f(y) > 0\}$. Unlike the entropy of a discrete random variable, differential entropy can also be negative [11, Ch. 8].

The relation between the entropies of a continuous random variable and its uniformly quantized version can be defined as a function of the length of the bins that divide the range of the variable. Within each bin, the PDF is assumed to be continuous. The relation for the quantized version of continuous random variable $Y$, denoted with $Y^{\Delta}$, with quantization step length $\Delta$ and with Riemann integrable PDF $f(y)$ is defined by [11, Ch. 8]

$$\lim_{\Delta \to 0} \left[ H(Y^{\Delta}) + \log_2 \Delta \right] = h(Y), \tag{2.8}$$

where $H(Y^{\Delta})$ is the entropy of the quantized, discrete random variable $Y^{\Delta}$. The expression (2.8) means that approximately $h(Y) + \eta$ bits are needed to describe an $\eta$-bit quantized continuous random variable. When the quantization step is sufficiently small, the entropy of the quantized variable, offset by $\log_2 \Delta$, approaches the differential entropy of the variable.
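Relation (2.8) can be observed numerically. The sketch below assumes a zero-mean Gaussian variable with unit variance, for which the differential entropy has the closed form $\frac{1}{2}\log_2(2\pi e \sigma^2)$, and compares it with the entropy of the uniformly quantized variable for decreasing step sizes. The function names and the truncation of the range to ten standard deviations are choices made here for illustration.

```python
import math

def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def quantized_entropy(sigma, delta, span=10.0):
    """Entropy in bits of a zero-mean Gaussian quantized with step `delta`.

    Bins cover [-span*sigma, span*sigma]; the probability mass outside
    this range is negligible for span = 10.
    """
    n_bins = int(math.ceil(span * sigma / delta))
    h = 0.0
    for k in range(-n_bins, n_bins):
        p = gaussian_cdf((k + 1) * delta, sigma) - gaussian_cdf(k * delta, sigma)
        if p > 0:
            h -= p * math.log2(p)
    return h

sigma = 1.0
h_diff = 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)

# Gap between H(Y^delta) + log2(delta) and h(Y) for shrinking steps.
gaps = {}
for delta in (0.5, 0.1, 0.01):
    gaps[delta] = quantized_entropy(sigma, delta) + math.log2(delta) - h_diff
```

The gap shrinks as the step decreases, in line with the limit in (2.8).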

For an $n$-dimensional continuous random vector $\boldsymbol{Y} = [Y_1, Y_2, \ldots, Y_n]^T \in \mathbb{R}^n$, a corresponding relation between the entropies can be defined. Under the aforementioned conditions and with the assumption that the samples of vector $\boldsymbol{Y}$ are independently quantized with the same, sufficiently small quantization step $\Delta$, the relation is given by [11, Ch. 8] [3]

$$\lim_{\Delta \to 0} \left[ H(\boldsymbol{Y}^{\Delta}) + n \log_2 \Delta \right] = h(\boldsymbol{Y}), \tag{2.9}$$

where $\boldsymbol{Y}^{\Delta} = [Y_1^{\Delta}, Y_2^{\Delta}, \ldots, Y_n^{\Delta}]^T \in \mathbb{R}^n$ is the quantized random vector and $H(\boldsymbol{Y}^{\Delta})$ is the respective entropy.

2.1.3. Gaussian Random Field

A Gaussian random process can be used for estimating the stochastic properties of correlated data, and it is widely accepted as providing excellent approximations in real application scenarios [9]. A Gaussian process is characterized by the mean and the covariance function due to the coincidence of the second-order statistics and strong stationarity [21, Ch. 2]. The use of a Gaussian random field makes the analysis of the data correlation at different sources convenient because the dependence of data between the sources can be fully expressed with the covariance matrix of the data [10]. The Gaussian random field has been frequently employed for modeling continuous random variables involved in different kinds of data gathering scenarios [3, 8–10, 14].


The differential entropy of a Gaussian distributed real-valued random variable $Y \sim \mathcal{N}(\mu, \sigma_Y^2)$ is defined as [11, Ch. 8]

$$h(Y) = \frac{1}{2} \log_2 (2 \pi e \sigma_Y^2), \tag{2.10}$$

where $\mu$ is the mean value and $\sigma_Y^2$ is the variance of random variable $Y$. It is notable that in (2.10) the differential entropy of a Gaussian real-valued random variable is fully expressed by its variance only; the mean value has no influence at all. This is meaningful, since it can be intuitively deduced that the more the data values are dispersed around the mean value, the more bits on average will be needed to get a complete description of the data. Among all distributions with the same variance, the Gaussian distribution maximizes the differential entropy of a random variable [11, Ch. 8].

The PDF for an $n$-dimensional Gaussian distributed continuous real-valued random vector can be presented with a mean vector $\boldsymbol{\mu} = [\mu_1, \mu_2, \ldots, \mu_n]^T \in \mathbb{R}^n_+$ and with a covariance matrix $\boldsymbol{K} \in \mathbb{R}^{n \times n}_+$. This is referred to as a multivariate Gaussian distributed random vector. A random vector $\boldsymbol{Y} = [Y_1, Y_2, \ldots, Y_n]^T \in \mathbb{R}^n$ with zero mean is said to follow a multivariate Gaussian distribution if each component of $\boldsymbol{Y}$ can be expressed as a linear combination of independent standard Gaussian random variables. Hence, if $\tilde{Y}_1, \tilde{Y}_2, \ldots, \tilde{Y}_M$ are $M$ independent standard Gaussian random variables, each element $Y_k$ of $\boldsymbol{Y}$ can be expressed as [22, Ch. 6]

$$Y_k = \sum_{m=1}^{M} \pi_{km} \tilde{Y}_m, \quad \forall k = 1, 2, \ldots, n, \tag{2.11}$$

where $\pi_{k1}, \pi_{k2}, \ldots, \pi_{kM}$ are real-valued constants for each $k = 1, 2, \ldots, n$.

By assuming the jointly Gaussian model for the data, the PDF for the vector $\boldsymbol{Y} = [Y_1, Y_2, \ldots, Y_n]^T$ having multivariate Gaussian distribution $\boldsymbol{Y} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{K})$ is expressed as follows [11, Ch. 8]

$$f(\boldsymbol{y}) = \frac{1}{\sqrt{(2\pi)^n \det(\boldsymbol{K})}} \, e^{-\frac{1}{2} (\boldsymbol{y} - \boldsymbol{\mu})^T \boldsymbol{K}^{-1} (\boldsymbol{y} - \boldsymbol{\mu})}, \tag{2.12}$$

where $\det(\boldsymbol{K})$ is the determinant of matrix $\boldsymbol{K}$. The mean vector $\boldsymbol{\mu}$ contains the real-valued, positive mean values of each normally distributed random variable. The covariance matrix $\boldsymbol{K}$ is a symmetric, positive definite matrix that has positive and real-valued entries [23, Ch. 2]. The structure of the covariance matrix for random vector $\boldsymbol{Y} = [Y_1, Y_2, \ldots, Y_n]^T$ can be written as [23, Ch. 3]

$$\boldsymbol{K} = \begin{bmatrix} \mathrm{var}(Y_1) & \mathrm{cov}(Y_1, Y_2) & \cdots & \mathrm{cov}(Y_1, Y_n) \\ \mathrm{cov}(Y_2, Y_1) & \mathrm{var}(Y_2) & \cdots & \mathrm{cov}(Y_2, Y_n) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}(Y_n, Y_1) & \mathrm{cov}(Y_n, Y_2) & \cdots & \mathrm{var}(Y_n) \end{bmatrix}, \tag{2.13}$$

where a diagonal element $\mathrm{var}(Y_k)$ represents the variance of random variable $Y_k$ and an off-diagonal value $\mathrm{cov}(Y_{k_1}, Y_{k_2})$ represents the covariance between random variables $Y_{k_1}$ and $Y_{k_2}$.


The differential joint entropy of random vector $\boldsymbol{Y} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{K})$ is expressed as follows [11, Ch. 8]:

$$h(Y_1, Y_2, \ldots, Y_n) = h\big(\mathcal{N}_n(\boldsymbol{\mu}, \boldsymbol{K})\big) = \frac{1}{2} \log_2 \big( (2 \pi e)^n \det(\boldsymbol{K}) \big). \tag{2.14}$$

The differential joint entropy is maximized with a jointly Gaussian distributed random vector, correspondingly to the case of single random variables [11, Ch. 9].

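According to the multivariate Gaussian probability law, two random vectors can be selected from an $n$-dimensional normally distributed random vector $\boldsymbol{Y} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{K})$ such that the resulting vectors preserve the properties of the Gaussian distribution [24, A.2]. Let $\boldsymbol{Y}_{\mathcal{Q}}$ and $\boldsymbol{Y}_{\mathcal{U}}$ be disjoint vectors selected from vector $\boldsymbol{Y}$, preserving the multivariate Gaussian distribution. The subscripts $\mathcal{Q}$ and $\mathcal{U}$ define the sets of indices, such that $\mathcal{Q} = \{q_1, q_2, \ldots, q_{|\boldsymbol{Y}_{\mathcal{Q}}|}\} \subset \{1, 2, \ldots, n\}$, $|\boldsymbol{Y}_{\mathcal{Q}}| < n$, and $\mathcal{U} = \{u_1, u_2, \ldots, u_{|\boldsymbol{Y}_{\mathcal{U}}|}\} \subset \{1, 2, \ldots, n\}$, $|\boldsymbol{Y}_{\mathcal{U}}| < n$, with $\mathcal{Q} \cap \mathcal{U} = \emptyset$ and $|\mathcal{Q}| + |\mathcal{U}| \le n$. By (2.4) and (2.14), the differential conditional entropy for the Gaussian random variables in vector $\boldsymbol{Y}_{\mathcal{U}}$ given vector $\boldsymbol{Y}_{\mathcal{Q}}$ can be written as follows:

$$\begin{aligned} h(\boldsymbol{Y}_{\mathcal{U}} | \boldsymbol{Y}_{\mathcal{Q}}) &= h(\boldsymbol{Y}_{\mathcal{U}}, \boldsymbol{Y}_{\mathcal{Q}}) - h(\boldsymbol{Y}_{\mathcal{Q}}) \\ &= \frac{1}{2} \log_2 \left( (2 \pi e)^{|\boldsymbol{Y}_{\mathcal{U}}|} \frac{\det(\boldsymbol{K}_{\mathcal{U} \cup \mathcal{Q}})}{\det(\boldsymbol{K}_{\mathcal{Q}})} \right), \end{aligned} \tag{2.15}$$

where $\boldsymbol{K}_{\mathcal{Q}}$ is the covariance submatrix of $\boldsymbol{K}$ with rows and columns determined by the indices in $\mathcal{Q}$, and $\boldsymbol{K}_{\mathcal{U} \cup \mathcal{Q}}$ is the submatrix with rows and columns given by $\mathcal{U} \cup \mathcal{Q}$.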
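The Gaussian entropy expressions (2.10), (2.14) and (2.15) reduce to determinants of covariance submatrices, which the following sketch evaluates with the standard library only. The covariance matrix `K` is a hypothetical example, indices are 0-based, and the helper names are assumptions made for illustration.

```python
import math

def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d                      # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d

def submatrix(K, idx):
    """Rows and columns of K selected by the index list idx."""
    return [[K[i][j] for j in idx] for i in idx]

def joint_entropy(K, idx):
    """Differential joint entropy (2.14) of the sub-vector with indices idx."""
    n = len(idx)
    return 0.5 * math.log2((2 * math.pi * math.e) ** n * det(submatrix(K, idx)))

def cond_entropy(K, u, q):
    """Differential conditional entropy (2.15): h(Y_U | Y_Q)."""
    ratio = det(submatrix(K, u + q)) / det(submatrix(K, q))
    return 0.5 * math.log2((2 * math.pi * math.e) ** len(u) * ratio)

# Hypothetical covariance matrix: unit variances, correlation decaying
# with index distance (symmetric and positive definite).
K = [[1.0, 0.8, 0.5],
     [0.8, 1.0, 0.8],
     [0.5, 0.8, 1.0]]

h_single = joint_entropy(K, [0])
h_all = joint_entropy(K, [0, 1, 2])
h_chain = h_single + cond_entropy(K, [1], [0]) + cond_entropy(K, [2], [0, 1])
```

Summing the conditional entropies in `h_chain` recovers `h_all`, the continuous counterpart of the chain rule (2.6).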

2.2. Distributed Source Coding of Correlated Sources

Distributed source coding has been widely studied in the literature, especially for data gathering scenarios where there exists correlation between data readings [2–4, 7–10, 13, 14]. For instance, in a wireless visual sensor network each sensor node captures digital visual information about the target and delivers the data to a particular sink node for further data processing. Intuitively, it is not necessary for each sensor node to send its data at a rate equal to its unconditioned entropy in order to recover the whole video information of the monitored target at the receiver. By utilizing the correlation between spatially adjacent video recordings, the total rate required to be transmitted at the sensor nodes can be decreased. In the presence of correlation, by using lossless DSC, the redundancy in data can be removed such that all the individual data associated with each node can be losslessly recovered by performing joint decoding at the decoder. [7]

As an essential starting point of the Slepian-Wolf coding scenario, a lossless source coding problem of correlated sources with different collaboration schemes is discussed. The collaboration schemes involve three distinct cases which differ in the employed encoding and decoding methods. The first scheme covers independent coding, which refers to separate encoding and decoding of correlated sources. The second case includes collaboration between the sources while encoding the data, whereas in the third scheme the sources perform the encoding without collaboration. For both of the latter schemes, joint decoding of the data is performed at the receiver. The collaboration schemes for discrete random sources $X_1$ and $X_2$ are illustrated in Fig. 1. Without loss of generality, a case of two correlated discrete random sources is considered.

Figure 1. (a) Separate encoding and decoding of sources $X_1$ and $X_2$, (b) joint encoding and decoding, (c) distributed encoding and joint decoding.

Independent encoding and decoding of sources $X_1$ and $X_2$ is illustrated in Fig. 1(a). Sources $X_1$ and $X_2$ perform independent encoding and send their encoded data to the receiver. At the receiver, separate decoding results in the estimates of $X_1$ and $X_2$, denoted with $\hat{X}_1$ and $\hat{X}_2$, respectively. When sources $X_1$ and $X_2$ have to encode their data separately followed by separate decoding, individual rates equal to or larger than the unconditional entropies of the sources are sufficient to completely reconstruct the messages at the decoder. According to (2.3), the rates for the sources can be expressed as

$$\begin{aligned} R_1 &\ge H(X_1) \\ R_2 &\ge H(X_2) \\ R_1 + R_2 &\ge H(X_1) + H(X_2) \ge H(X_1, X_2), \end{aligned} \tag{2.16}$$

where $R_1$ is the rate of source $X_1$ and $R_2$ is the rate of source $X_2$. In order to encode a source at a rate equal to its unconditioned entropy, the ultimate data compression limit for lossless source coding has to be reached [11, Ch. 1]. Slepian and Wolf [12] introduced a theorem asserting that arbitrarily small decoding error probability with block codes can be achieved when sending at a rate $R = H(X) + \varepsilon$, $\varepsilon > 0$. On the contrary, the rate $R = H(X) - \varepsilon$ cannot be achieved with arbitrarily small error probability. From this point on, it is assumed that this limit can be reached, i.e., ideal lossless source coding is used.

If sources $X_1$ and $X_2$ are able to communicate with each other, that is, to do joint encoding, they can coordinate their source coding in a cooperative manner. The collaboration of encoders is presented in Fig. 1(b) with joint decoding at the receiver. Sources $X_1$ and $X_2$ perform joint encoding and send their encoded data to the receiver for joint decoding. The joint decoding results in the estimates of $X_1$ and $X_2$, denoted with $\hat{X}_1$ and $\hat{X}_2$, respectively. The sources can send their encoded data at the total rate determined by the joint entropy $H(X_1, X_2)$ [11, Ch. 15]. This can be achieved, for example, when source $X_1$ sends its data at a rate equal to its unconditioned entropy $H(X_1)$ and source $X_2$ at a rate $H(X_2|X_1)$, that is, the remaining uncertainty of $X_2$ given $X_1$. By (2.6), the rates for sources $X_1$ and $X_2$ can be written as

$$\begin{aligned} R_1 &= H(X_1) \\ R_2 &= H(X_2|X_1) \\ R_1 + R_2 &= H(X_1, X_2) = H(X_1) + H(X_2|X_1). \end{aligned} \tag{2.17}$$

Naturally, the roles of the sources are interchangeable. According to relationship (2.5), $H(X_2|X_1) < H(X_2)$ when the sources are not independent. Thus, by having collaboration between the encoders, the total rate of correlated sources can be reduced compared to separate encoding, as long as joint decoding is implemented at the decoder.

Due to the collaboration between sources in the case depicted in Fig. 1(b), the redundancy in data can be totally removed because all the data generated by the sources is available at both encoders. This follows from the assumption that the communication channel between the sources has enough capacity to support the joint encoding of data.

Another scenario is that the encoders are not able to collaborate, while joint decoding is still employed. This is referred to as DSC and it is presented in Fig. 1(c) for sources $X_1$ and $X_2$. Slepian and Wolf [12] showed that for two correlated sources the total rate given by the joint entropy is sufficient for lossless representation of data, even without collaboration between the nodes. This is referred to as Slepian-Wolf coding of correlated sources. By performing Slepian-Wolf coding, the source rates given in (2.17) are still sufficient for sources $X_1$ and $X_2$ in order to recover the messages losslessly at the decoder with joint decoding. The Slepian-Wolf theorem states that separate encoding of correlated sources is as efficient as joint encoding, but at the cost of increased complexity of the joint decoder and the need for knowledge of the data correlation between the sources [2]. Strictly speaking, the Slepian-Wolf theorem states that the Slepian-Wolf rates are achievable with an arbitrarily low probability of error [11, Ch. 15].

In addition to the requirement of joint decoding at the receiver, Slepian-Wolf coding requires that the correlation structure between data sources is known in advance for each source [2]. In many applications, statistical models of the data can be available, and thus they can be efficiently exploited while encoding the sources [25]. If not, the correlation between data sources has to be estimated beforehand, e.g., with explicit communication in a distributed manner [14]. The estimation entails a training period associated with the DSC and, furthermore, extra overhead dissemination in the network. Since the functionality of Slepian-Wolf coding is inherently based on the knowledge of the data correlation, the accuracy of the estimation will have a direct impact on the coding efficiency and, on the other hand, on the rate feasibility.

Cheung et al. [25] proposed optimal strategies for information exchange that minimize the rate penalty due to inaccurate estimation, under constraints on the number of bits that can be exchanged between sources. They derived analytical expressions to quantify the rate penalty and analyzed how it changed with a priori knowledge of correlation. Another paper from the same authors [26] considers the correlation estimation subject to rate and complexity constraints and its impact on the amount of data exchange needed and on the coding efficiency of DSC, with the main focus on Slepian-Wolf coding. They proposed a model-based estimation method, where the continuous-valued joint PDF of the source and the side information at the decoder was first estimated by sampling the continuous-valued inputs, and the binary correlation was then derived from the estimated model.

2.3. Slepian-Wolf Rate Region

A rate region specifies the achievable rates for the sources such that, when satisfied, the rates are sufficient for reconstructing the message of each source losslessly at the decoder [12]. An achievable rate region can be specified as the closure of the set of achievable rates [11, Ch. 15].

Figure 2 shows an example of the Slepian-Wolf region for two correlated random sources $X_1$ and $X_2$ [2]. The horizontal axis stands for the rate of source $X_1$ and the vertical axis for the rate of source $X_2$. The distinct areas of the rate region are essentially defined by means of different limiting boundaries marked with the dotted lines. For instance, the boundaries associated with source $X_1$ are the unconditioned entropy $H(X_1)$, the conditional entropy $H(X_1|X_2)$ and the joint entropy $H(X_1, X_2)$. In fact, the rate region for two correlated random sources is fully defined when the aforementioned boundaries are known for each source.

The region of the admissible rates that can be achieved with Slepian-Wolf coding of sources $X_1$ and $X_2$ is illustrated with the gray-colored area in the Slepian-Wolf rate region. For the sake of comparison, the achievable rates with independent coding are also shown, highlighted with the tilted lines at the upper corner of the rate region. The admissible rates that independent encoding of sources $X_1$ and $X_2$ can provide are naturally lower-bounded by the unconditioned entropies $H(X_1)$ and $H(X_2)$. The achievable rates with the Slepian-Wolf coding of sources $X_1$ and $X_2$ can be expressed as [11, Ch. 15]

$$\begin{aligned} R_1 + R_2 &\ge H(X_1, X_2) \\ R_1 &\ge H(X_1|X_2) \\ R_2 &\ge H(X_2|X_1). \end{aligned} \tag{2.18}$$

It is worth noting that the inequalities determine the same admissible rate region as when the sources can collaborate with each other.
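The constraints in (2.18) can be made concrete with a small example. The sketch below assumes a hypothetical doubly symmetric binary source, in which $X_1$ is a fair bit and $X_2$ differs from it with probability 0.1, and tests a few rate pairs against the region. All names and numbers are illustrative assumptions.

```python
import math

def H(probs):
    """Entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical source: X1 is a fair bit, X2 equals X1 flipped with prob. 0.1.
flip = 0.1
joint = {(0, 0): 0.5 * (1 - flip), (0, 1): 0.5 * flip,
         (1, 0): 0.5 * flip, (1, 1): 0.5 * (1 - flip)}

h12 = H(joint.values())
h1 = H([joint[0, 0] + joint[0, 1], joint[1, 0] + joint[1, 1]])
h2 = H([joint[0, 0] + joint[1, 0], joint[0, 1] + joint[1, 1]])
h1_given_2 = h12 - h2
h2_given_1 = h12 - h1

def in_slepian_wolf_region(r1, r2, tol=1e-12):
    """True if the rate pair satisfies all three inequalities in (2.18)."""
    return (r1 + r2 >= h12 - tol
            and r1 >= h1_given_2 - tol
            and r2 >= h2_given_1 - tol)

corner_a = (h1_given_2, h2)                  # asymmetric coding, one ordering
corner_b = (h1, h2_given_1)                  # asymmetric coding, other ordering
midpoint_c = (h12 / 2, h12 / 2)              # symmetric coding for this source
too_low = (h1_given_2, h2_given_1)           # both at the conditional entropies
```

The pair `too_low` meets the two individual constraints but violates the sum-rate constraint, which shows why all three inequalities are needed.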

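[Figure: rate plane with horizontal axis $R_1$ and vertical axis $R_2$; boundaries at $H(X_1|X_2)$, $H(X_1)$, $H(X_2|X_1)$, $H(X_2)$ and $H(X_1, X_2)$; corner points A and B and mid point C on the Slepian-Wolf bound; marked areas for the rates achievable with Slepian-Wolf coding and with independent coding.]

Figure 2. The Slepian-Wolf rate region for two correlated sources $X_1$ and $X_2$.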

In terms of achieving the optimal rates with Slepian-Wolf coding, the most interesting area of the Slepian-Wolf region is referred to as the Slepian-Wolf bound. The Slepian-Wolf bound is a multidimensional plane defined by the joint entropy of the sources, and it possesses a particular meaning for the code design of Slepian-Wolf codes. The end points of the Slepian-Wolf bound are called the corner points of the Slepian-Wolf region and they are marked with points A and B in Fig. 2. The corner point A is achieved with the rate pair $\{R_1, R_2\} = \{H(X_1|X_2), H(X_2)\}$, whereas the corner point B stands for $\{R_1, R_2\} = \{H(X_1), H(X_2|X_1)\}$. Approaching the corner points is referred to as asymmetric coding [2]. The rest of the points between the end points A and B on the Slepian-Wolf bound are defined by the joint entropy $H(X_1, X_2)$. These boundary points can be achieved, for instance, by time-sharing or source-splitting [27]. If the code results exactly in the mid point of the Slepian-Wolf bound, corresponding to point C in Fig. 2, the coding approach is called symmetric coding [2]. Essentially, the code design of Slepian-Wolf codes aims at generating codes that achieve rates located exactly on the Slepian-Wolf bound in the Slepian-Wolf rate region [2].

Consider a case of $n$ sources where each source is supplied with individual data and Slepian-Wolf coding is employed to completely remove the redundancy in the data. For each source $k = 1, 2, \ldots, n$, a random variable of the information is denoted with $Z_k$. Set $\mathcal{Z}$ is the set containing all the sources, such that $\mathcal{Z} = \{1, 2, \ldots, n\}$. For any arbitrary subset $\mathcal{M} \subseteq \mathcal{Z}$, a $|\mathcal{M}|$-dimensional vector containing the variables of the information can be expressed as $\boldsymbol{Z}_{\mathcal{M}} = [Z_1, Z_2, \ldots, Z_{|\mathcal{M}|}]^T$. Correspondingly, for the relative complement of $\mathcal{M}$ in $\mathcal{Z}$, that is, $\mathcal{M}^c = \mathcal{Z} - \mathcal{M}$, $\mathcal{M} \cap \mathcal{M}^c = \emptyset$, a vector of respective information can be defined as $\boldsymbol{Z}_{\mathcal{M}^c} = [Z_1^c, Z_2^c, \ldots, Z_{|\mathcal{M}^c|}^c]^T$. The Slepian-Wolf rate region $\mathcal{R}_{SW}$ gives the admissible rates for the sources and is given by [3]

$$\mathcal{R}_{SW} = \left\{ [R_1, R_2, \ldots, R_n]^T : \forall \mathcal{M} \subseteq \mathcal{Z}, \ \sum_{k \in \mathcal{M}} R_k \ge H(\boldsymbol{Z}_{\mathcal{M}} | \boldsymbol{Z}_{\mathcal{M}^c}) \right\}, \tag{2.19}$$

where $R_1, R_2, \ldots, R_n$ are the source rates of sources $1, 2, \ldots, n$. Expression (2.19) means that every subset of sources has to be encoded with a total rate at least equal to the joint entropy of that subset conditioned on the remaining sources, in order to fully recover all the individual data at the destination [7].
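For a small number of sources, membership in the region (2.19) can be tested by enumerating all non-empty subsets. The sketch below assumes a hypothetical joint PMF of three binary sources forming a Markov chain; the function names, the 0-based source indices and the numerical tolerance are choices made here for illustration. The number of subsets grows as $2^n$, so this is only practical for small $n$.

```python
import itertools
import math
from collections import defaultdict

def entropy_of(joint, idx):
    """Joint entropy in bits of the sources at positions idx."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in idx)] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def in_sw_region(rates, joint, tol=1e-9):
    """Check every subset constraint of (2.19) for the given rate vector."""
    n = len(rates)
    everyone = list(range(n))
    h_all = entropy_of(joint, everyone)
    for size in range(1, n + 1):
        for subset in itertools.combinations(everyone, size):
            rest = [i for i in everyone if i not in subset]
            h_cond = h_all - entropy_of(joint, rest)   # H(Z_M | Z_Mc)
            if sum(rates[i] for i in subset) < h_cond - tol:
                return False
    return True

# Hypothetical sources: Z1 is a fair bit, Z2 flips Z1 and Z3 flips Z2,
# each with probability 0.1.
flip = 0.1
joint = {}
for z1, z2, z3 in itertools.product((0, 1), repeat=3):
    p = 0.5
    p *= flip if z2 != z1 else 1 - flip
    p *= flip if z3 != z2 else 1 - flip
    joint[(z1, z2, z3)] = p

# Chain-rule rates: H(Z1), H(Z2|Z1), H(Z3|Z2,Z1).
corner = [entropy_of(joint, [0]),
          entropy_of(joint, [0, 1]) - entropy_of(joint, [0]),
          entropy_of(joint, [0, 1, 2]) - entropy_of(joint, [0, 1])]
```

The chain-rule rates in `corner` lie on the boundary of the region, so lowering any of them makes the vector inadmissible.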


3. SINGLE-SINK DATA GATHERING

The single-sink data gathering scenario refers to the case where correlated data is located at the source nodes in the network and the sink node serves as a destination for the sources. The correlated data originates from the observations of densely deployed and spatially distributed source nodes, where the physical phenomenon under sensing follows a certain spatial correlation structure. The objective of correlated data gathering is to transport all the individual data observed by each source node to the sink node in a way that minimizes some predefined cost function related to the data transportation [10].

Since the data at the source nodes is correlated, distributed source coding of the sources provides an appropriate way to attain significant improvements with respect to the cost function by removing the redundancy in data while encoding the sources. Single-sink data gathering with Slepian-Wolf coding is a widely studied scenario in the literature and it has been applied to wireless sensor networks, where the typical objective is to minimize the total rates of the sources under predefined constraints in order to achieve high energy efficiency [3, 4, 7–9, 13]. In addition to removing the redundancy in the data, distributed source coding involves a relatively autonomous operation of the nodes, providing an advantage over centralized networking, especially in wireless sensor networks.

In the literature, a frequently encountered data correlation model used with Slepian-Wolf coding in correlated data gathering networks is the Gaussian random process [3, 8–10, 14]. A typical property of the Gaussian random process is that the dependence in data between nodes can be fully expressed with the data covariance matrix, which makes the modeling straightforward [14]. This was discussed in more detail in Section 2.1.3.

Related to the exploitation of the correlation structure of data, the partitioning of the network into disjoint clusters gives a possibility to employ Slepian-Wolf coding locally within each cluster. This approximative method facilitates the acquisition of the correlation structure of data between the source nodes due to the decreased number of nodes involved in local encoding. However, the smaller number of nodes in entropy conditioning inevitably introduces a trade-off between the requirement of correlation knowledge and the rate reduction achieved with Slepian-Wolf coding. Additionally, the clustering of the network arises as an optimization problem on its own, for which many algorithms have been proposed in the literature [8, 28].

The chapter starts by giving a definition for the concept of a single-sink data gathering network in Section 3.1. The single-sink data gathering problem with Slepian-Wolf coding is cast as an optimization problem in Section 3.2. The section also gives the solution for the problem with a linearly dependent rate function. In Section 3.3, the principles of allocating the rates for the sources with Slepian-Wolf coding are described with the assumption that the spatial data is generated by a continuous-space process in a Gaussian random field. The algorithms for global and localized Slepian-Wolf coding approaches are introduced as alternative methods for encoding the sources when a different degree of knowledge about the network is available at each node. The algorithms introduce an unavoidable trade-off between performance and complexity in correlated data gathering scenarios: the total reduction in the source rates against the preliminary knowledge that has to be available in order to perform the appropriate encoding of the sources.


3.1. Single-Sink Data Gathering Network

An example of a single-sink data gathering network is illustrated in Fig. 3. The network consists of $N$ source nodes and a sink node that is denoted with $N + 1$. The single-sink data gathering network can be modeled as a weighted graph $G_{\mathrm{ssdgn}} = (\mathcal{T}, \mathcal{E})$, where $\mathcal{T} = \{1, 2, \ldots, N, N + 1\}$ is the set of node indices $i \in \mathcal{T}$ and $\mathcal{E} = \{1, 2, \ldots, E\}$ is the set of edge indices $e \in \mathcal{E}$ in the network. The set of source nodes is defined as $\mathcal{S} = \{1, 2, \ldots, N\}$ such that $\mathcal{T} = \mathcal{S} \cup \{N + 1\}$. Source nodes $i \in \mathcal{S}$ and the sink node $N + 1$ are depicted with the dots and the square in the figure, respectively. Each edge $e \in \mathcal{E}$ between the end nodes $i_1$ and $i_2$ is assigned weight $w_e$ representing the cost of using the link.

Each source node $i \in \mathcal{S}$ produces a data reading $X_i$ that has to be recovered at the sink node. Discrete random sources $X_i$, $i \in \mathcal{S}$, form a discrete random vector containing the total data associated with all the source nodes in the network, that is, $\boldsymbol{X} = [X_1, X_2, \ldots, X_N]^T$. The links are undirected lossless point-to-point links, meaning that nodes cannot transmit data via multiple paths. In the figure, the links with solid lines show an example of the selected optimal transmission structure, whereas all the available links are marked with dashed lines.

Beyond the networking model, it is assumed that the adequate networking issues have been solved in higher communication layers to provide successful communications across the point-to-point links. The networking model is abstracted such that all the data sent across the links are received perfectly without any errors and such that there is no mutual interference occurring on the links. Moreover, the source nodes can act as relays such that each source node is capable of aggregating its own data with the data received from other source nodes and then forwarding the aggregated data in the network. [10, 14]

[Figure: source nodes with data readings $X_1, X_2, \ldots, X_N$ drawn as dots and the sink node $N + 1$ drawn as a square.]

Figure 3. Single-sink data gathering network.

3.2. Single-Sink Data Gathering Problem

Consider a single-sink data gathering scenario where the objective is to gather all the individual but correlated data of the source nodes and send them to the sink node in order to minimize transmission costs. The single-sink data gathering problem can be expressed as an optimization problem where the objective is to minimize some predefined transmission cost function under rate constraints. The problem is to jointly optimize the transmission structure and the rate allocation [10]. It has been proved that when the data gathering network consists of only one sink and Slepian-Wolf coding is used, the shortest path tree (SPT) is the optimal solution for the transmission structure [10], regardless of the rate allocation. Due to this result, the joint optimization problem separates into the rate allocation problem and the transmission structure problem. In other words, the optimization of the single-sink data gathering problem can be done by first finding the optimal transmission structure by considering only the link weights and then optimizing the rate allocation for the given transmission structure [10].

The transmission cost function of each source for sending the data to the sink node is considered to be the product of a function depending on the rate and the link weights associated with the data transmission. In practice, the link weights depend heavily on the distances between the nodes. The distance-dependent cost function is related to the energy consumption of the nodes and is a relevant cost model in sensor networks [10].

The objective is to find the optimal transmission structure, that is, to find the minimum weight paths $\{w_i^*\}_{i=1}^{N} = w_1^*, w_2^*, \ldots, w_N^*$ on graph $G_{\mathrm{ssdgn}}$, and to assign the optimal rate allocation $\{R_i^*\}_{i=1}^{N} = R_1^*, R_2^*, \ldots, R_N^*$ for the given transmission structure. The optimization problem for the single-sink data gathering problem can be written as [10]

$$\{R_i^*, w_i^*\}_{i=1}^{N} = \underset{\{R_i, w_i\}_{i=1}^{N}}{\operatorname{arg\,minimize}} \ \sum_{i=1}^{N} f(R_i) w_i, \tag{3.1}$$

where $R_i$ is the source rate of node $i$, $f(R_i)$ is a monotonically increasing function of source rate $R_i$ and $w_i$ is the sum of the link weights that node $i$ encounters when sending its data to the destination, namely

$$w_i = \sum_{i:\, e \in \mathcal{E}} w_e. \tag{3.2}$$

In single-sink data gathering with Slepian-Wolf coding, the optimal transmission structure has been proved to be an SPT, regardless of the rate allocation [10]. Assume that the SPT has been generated on the graph $G_{\mathrm{ssdgn}}$ by running, for instance, the distributed Bellman-Ford or Dijkstra's algorithm in the network [29, Ch. 7]. Distributed shortest path tree algorithms are discussed in more detail in Section 3.2.1. Additionally, it is assumed that the functions $f(R)$ are defined as linear functions of the rates, that is, $f(R_i) = R_i, \forall i \in \mathcal{S}$. Finally, the cost function in (3.1) can be written as follows:

$$\sum_{i=1}^{N} R_i w_i^{\mathrm{SPT}}, \tag{3.3}$$

where $w_i^{\mathrm{SPT}}$ is the total weight of the path on the SPT that source node $i$ uses to send its data to the sink node $N + 1$. Thus, the optimal transmission structure corresponds to $w_1^*, w_2^*, \ldots, w_N^* = w_1^{\mathrm{SPT}}, w_2^{\mathrm{SPT}}, \ldots, w_N^{\mathrm{SPT}}$, respectively.


Since the separation of the optimization of the transmission structure and the rate allocation holds, the optimization problem in (3.1) is reduced to a linear programming (LP) problem with rate variables $R_1, R_2, \ldots, R_N$ only [10]. By adding the linear Slepian-Wolf rate constraints defined in (2.19), the resulting rate allocation problem with the optimal transmission structure given by the SPT can be written as follows:

$$\begin{aligned} \underset{\{R_i\}_{i=1}^{N}}{\text{minimize}} \quad & \sum_{i=1}^{N} R_i w_i^{\mathrm{SPT}} \\ \text{subject to} \quad & \sum_{i \in \mathcal{K}} R_i \ge H(\boldsymbol{X}_{\mathcal{K}} | \boldsymbol{X}_{\mathcal{K}^c}), \quad \forall \mathcal{K} \subseteq \mathcal{S}, \end{aligned} \tag{3.4}$$

where vectors $\boldsymbol{X}_{\mathcal{K}}$ and $\boldsymbol{X}_{\mathcal{K}^c}$ denote the disjoint vectors that contain discrete random variables in vector $\boldsymbol{X}$. The vectors are defined as $\boldsymbol{X}_{\mathcal{K}} = [X_1, X_2, \ldots, X_{|\mathcal{K}|}]^T$ and $\boldsymbol{X}_{\mathcal{K}^c} = [X_1^c, X_2^c, \ldots, X_{|\mathcal{K}^c|}^c]^T$, where $\mathcal{K}^c \cup \mathcal{K} = \mathcal{S}$, $\mathcal{K} \cap \mathcal{K}^c = \emptyset$. It is remarkable that the solution requires a centralized algorithm because, for each node, the path weights $\{w_i^{\mathrm{SPT}}\}_{i=1}^{N}$ have to be known in advance in order to determine the associated source rate $R_i$. In addition, the use of Slepian-Wolf coding inherently involves a centralized networking procedure, since it requires global knowledge of the correlation structure of the data in the whole network. [10]

At the beginning of the rate allocation process with Slepian-Wolf coding, a certain ordering of the nodes is performed with information dissemination across the network. The ordering is a prerequisite for each node $i \in \mathcal{S}$ to know exactly on which other nodes the conditioning is done when determining its rate. Thus, the nodes are given particular indices based on their position on the SPT according to the path weights to the sink node, given by (3.2). Without loss of generality, the source nodes can be indexed in ascending order according to the path weights. Hence, the vector of discrete random sources $\boldsymbol{X} = [X_1, X_2, \ldots, X_N]^T$ with ordered indices corresponds to the path weights $w_1^{\mathrm{SPT}} \le w_2^{\mathrm{SPT}} \le \ldots \le w_N^{\mathrm{SPT}}$, respectively. With this particular node indexing, the optimal solution for the rate allocation problem of LP form in (3.4) can be expressed as [10]

$$\begin{aligned} R_1^* &= H(X_1) \\ R_2^* &= H(X_2 | X_1) \\ &\ \ \vdots \\ R_N^* &= H(X_N | X_{N-1}, X_{N-2}, \ldots, X_1). \end{aligned} \tag{3.5}$$

According to the chain rule in (2.6), the rate allocation leads to the sum rate given by the joint entropy of vector $\boldsymbol{X}$. Moreover, the rate allocation procedure results in the set of rates that approaches one of the corner points of the Slepian-Wolf rate region given in (2.19). Hence, the assignment of the rates is referred to as asymmetric coding.

The interpretation of the rate allocation in (3.5) goes as follows: the node with the smallest path weight to the sink encodes its data at a rate equal to its unconditioned entropy. Each of the other nodes encodes its data at a rate equal to its respective entropy conditioned on all the nodes that have a smaller path weight to the sink than the node itself. Hence, the largest rates are assigned to the nodes located in the proximity of the sink node on the SPT. This leads to the situation where the largest rates are transported to the destination across shorter links, leading to a decreased value for the cost function under optimization in (3.4). It is reasonable that this will have a substantial influence directly on the energy consumption of heavily energy-constrained sensor nodes in wireless sensor networks.
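The ordering and conditioning described for (3.5) can be sketched as follows for discrete sources with a known joint PMF. The joint PMF, the path weights and the 0-based node indices are hypothetical values chosen for illustration; the SPT path weights are assumed to be already available.

```python
import itertools
import math
from collections import defaultdict

def entropy_of(joint, idx):
    """Joint entropy in bits of the sources at positions idx."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in idx)] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def slepian_wolf_rates(joint, path_weight):
    """Rates as in (3.5): sort nodes by ascending path weight, then give each
    node its entropy conditioned on all nodes earlier in that order."""
    order = sorted(path_weight, key=path_weight.get)
    rates, seen = {}, []
    for node in order:
        rates[node] = entropy_of(joint, seen + [node]) - entropy_of(joint, seen)
        seen.append(node)
    return rates

# Hypothetical sources: node 0 is a fair bit, node 1 flips node 0 and
# node 2 flips node 1, each with probability 0.1.
flip = 0.1
joint = {}
for x0, x1, x2 in itertools.product((0, 1), repeat=3):
    p = 0.5
    p *= flip if x1 != x0 else 1 - flip
    p *= flip if x2 != x1 else 1 - flip
    joint[(x0, x1, x2)] = p

# Hypothetical SPT path weights to the sink for nodes 0, 1 and 2.
path_weight = {0: 3.0, 1: 1.0, 2: 2.0}

rates = slepian_wolf_rates(joint, path_weight)
cost_sw = sum(rates[i] * path_weight[i] for i in path_weight)
cost_independent = sum(entropy_of(joint, [i]) * path_weight[i] for i in path_weight)
```

Node 1 has the smallest path weight, so it is coded at its unconditioned entropy, while the two nodes farther from the sink are coded at conditional entropies. The rates sum to the joint entropy, and `cost_sw` is below `cost_independent` for this example.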

An example of the rate allocation problem in (3.4) for two sources $X_1$ and $X_2$ is illustrated in Fig. 4 with respect to the Slepian-Wolf rate region. The objective for the two random variables $X_1$ and $X_2$ is to minimize the cost function $R_1 w_1^{\mathrm{SPT}} + R_2 w_2^{\mathrm{SPT}}$ under the Slepian-Wolf rate constraints shown in (2.18). The dashed lines represent the level curves of the objective function. According to the slope of the curves, the path weights to the sink satisfy $w_1^{\mathrm{SPT}} < w_2^{\mathrm{SPT}}$. Thus, the optimal solution of the rate allocation problem is the rate pair $\{R_1, R_2\} = \{H(X_1), H(X_2|X_1)\}$, which is located at point B. In general, for two random variables the optimal point is located at one of the corner points A or B, depending on the ratio between the path weights of the sources. If the path weights are equal, $w_1^{\mathrm{SPT}} = w_2^{\mathrm{SPT}}$, the level curves of the objective are parallel to the Slepian-Wolf bound. Then the solution is not unique, since every point on the Slepian-Wolf bound is optimal [10]. Nevertheless, a corner point can still be achieved by assigning the indices of the equal-weighted nodes in an arbitrary order.
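The two-source case can be sketched numerically as follows. This is a minimal illustration, not part of the original derivation: the function names and the numeric entropies in the usage note are assumptions, and corner B is taken to be $(H(X_1), H(X_2|X_1))$ as stated in the text.

```python
def corner_costs(h1, h2, h12, w1, w2):
    """Cost R1*w1 + R2*w2 at the two Slepian-Wolf corner points.

    h1 = H(X1), h2 = H(X2), h12 = H(X1, X2), all in bits (assumed inputs).
    w1, w2 are the path weights of the two sources on the SPT.
    """
    # Corner B: node 1 codes at H(X1), node 2 at H(X2|X1) = H(X1,X2) - H(X1).
    corner_b = (h1, h12 - h1)
    # Corner A: node 2 codes at H(X2), node 1 at H(X1|X2) = H(X1,X2) - H(X2).
    corner_a = (h12 - h2, h2)

    def cost(rates):
        return rates[0] * w1 + rates[1] * w2

    return {"A": (corner_a, cost(corner_a)), "B": (corner_b, cost(corner_b))}


def best_corner(h1, h2, h12, w1, w2):
    """Name of the cheaper corner; ties (w1 == w2) are resolved in favour of B."""
    corners = corner_costs(h1, h2, h12, w1, w2)
    return min(("B", "A"), key=lambda name: corners[name][1])
```

For example, with the hypothetical values `h1 = h2 = 1.0`, `h12 = 1.5`, the call `best_corner(1.0, 1.0, 1.5, 1.0, 2.0)` selects corner B, since the node with the smaller path weight is the one coded at its full entropy.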

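[Figure 4: Slepian-Wolf rate region in the $(R_1, R_2)$ plane. The $R_1$ axis is marked with $H(X_1|X_2)$, $H(X_1)$ and $H(X_1, X_2)$; the $R_2$ axis with $H(X_2|X_1)$, $H(X_2)$ and $H(X_1, X_2)$. Corner points A and B are shown together with dashed level curves of $R_1 w_1^{\mathrm{SPT}} + R_2 w_2^{\mathrm{SPT}}$.]

Figure 4. An example of the rate allocation problem (3.4) for two sources $X_1$ and $X_2$.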

3.2.1. Distributed Shortest Path Tree Algorithms

At the beginning of the data gathering problem with Slepian-Wolf coding, a shortest path tree has to be calculated in the network. The SPT for graph $G_{\mathrm{ssdgn}} = (\mathcal{T}, \mathcal{E})$ can be found by running a distributed shortest path tree algorithm, such as the Bellman-Ford algorithm or Dijkstra’s algorithm, both of which require additional information dissemination through the network. The distributed Dijkstra’s algorithm solves the single-source shortest path tree problem for non-negative edge weights [30]. The distributed Bellman-Ford SPT algorithm solves the same problem, but also allows negative edge weights [31, Ch. 25]. Because the algorithms inherently solve the single-source shortest path tree problem, an algorithm has to be run separately for each source node $i \in \mathcal{S}$ to find its shortest path to the sink node $N + 1$.

The distributed Bellman-Ford algorithm solves the shortest path tree problem on a graph whose cycles have non-negative edge weights [29, Ch. 7]. The algorithm requires that each node in the network knows the weights of its incident edges, the identities of all other nodes, and estimates of the distances to all network nodes, which are received from its neighbors [29, Ch. 7]. Finding the SPT with the distributed Bellman-Ford algorithm is described in more detail in [29, Ch. 7], where the distributed Dijkstra’s algorithm is also summarized.

Bellman-Ford runs more slowly than Dijkstra’s algorithm [31, Ch. 25]. When run once from every node, Bellman-Ford has a complexity of $O(|\mathcal{T}|^2 |\mathcal{E}|)$, which corresponds to $O(|\mathcal{T}|^4)$ on dense graphs [31, Ch. 25]. The complexity of Dijkstra’s algorithm depends on the implementation of the min-priority queue: it is $O(|\mathcal{T}|^3 + |\mathcal{T}||\mathcal{E}|)$ with a linear array, $O(|\mathcal{T}||\mathcal{E}| \log|\mathcal{T}|)$ with a binary min-heap and $O(|\mathcal{T}|^2 \log|\mathcal{T}| + |\mathcal{T}||\mathcal{E}|)$ with a Fibonacci heap [31, Ch. 25].
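As an illustration of the shortest path tree computation, the following is a minimal centralized sketch of Dijkstra’s algorithm with a binary heap. It is not the distributed variant referenced above. It additionally assumes undirected links with symmetric non-negative weights, so that a single run rooted at the sink yields the path weights of all sources; the function names and the edge dictionary format are hypothetical.

```python
import heapq


def path_weights_to_sink(edges, sink):
    """Shortest path weights from every node to the sink.

    edges: dict {(u, v): weight} of undirected links, weight >= 0 (assumed format).
    Returns (dist, parent), where parent[v] is the next hop of v towards the sink.
    """
    adjacency = {}
    for (u, v), w in edges.items():
        adjacency.setdefault(u, []).append((v, w))
        adjacency.setdefault(v, []).append((u, w))

    dist = {sink: 0.0}
    parent = {sink: None}
    heap = [(0.0, sink)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in adjacency.get(u, []):
            candidate = d + w
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                parent[v] = u
                heapq.heappush(heap, (candidate, v))
    return dist, parent


def order_sources(dist, sink):
    """Source nodes sorted in ascending order of their path weight to the sink."""
    return sorted((n for n in dist if n != sink), key=lambda n: dist[n])
```

The ordering returned by `order_sources` corresponds to the ascending indexing of the source nodes used in the rate allocation.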

3.3. Slepian-Wolf Rate Allocation Process in Gaussian Random Field

Rate allocation processes and the associated algorithms for single-sink data gathering with global and localized Slepian-Wolf coding methods in a Gaussian random field are described in this section. Solving the optimization problem in (3.4) leads to two different algorithms, depending on the amount of knowledge available about the network structure and the correlation structure of the data. Spatially distributed source nodes are assumed to generate their data according to a continuous-space process with a distance-dependent correlation structure among the nodes. As a special case, a Gaussian random field is employed to express the stochastic properties of the sources and to provide a convenient way of determining the correlation between the data observations.

3.3.1. Data Correlation Model

The data produced by each source node in the single-sink data gathering network defined by graph $G_{\mathrm{ssdgn}} = (\mathcal{T}, \mathcal{E})$ is assumed to follow a so-called continuous-space process leading to purely spatial data. The data samples at each source are realizations of a stochastic spatial process, where the samples can be identified by their corresponding spatial indices. The indices refer to spatial locations that are continuous throughout the spatial region where the sensing of a physical phenomenon occurs. Spatio-temporal data is not considered, since the data readings at different locations are assumed to refer to a certain, fixed time instant. [21, Ch. 2]

Under the continuous-space process model for spatial data, the correlation between the data observations of the nodes becomes a function of the spatial locations of the source nodes. The properties of the data correlation are expressed by means of a covariance function. A widely accepted covariance model for spatially correlated data in data gathering networks, such as densely deployed wireless sensor networks, is the power exponential model [3, 8, 10, 14]. In general, the power exponential covariance model $\Sigma_{i_1 i_2}$, associated with nodes $i_1$ and $i_2$, $i_1, i_2 \in \mathcal{T}$, can be defined as [32]

$$
\Sigma_{i_1 i_2} = e^{-(d_{i_1 i_2}/\theta_1)^{\theta_2}}, \tag{3.6}
$$

where $d_{i_1 i_2}$ is the distance between nodes $i_1$ and $i_2$, and $\theta_1 > 0$ and $\theta_2 \in (0, 2]$ are the parameters for adjusting the correlation properties. The range parameter $\theta_1$ adjusts the rate at which the correlation decays with distance, and the smoothness parameter $\theta_2$ defines geometrical properties of the random field of interest [33]. In particular, with $\theta_2 = 1$ the covariance is exponential, and with $\theta_2 = 2$ it becomes squared exponential [32].
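The model in (3.6) can be evaluated directly, as in the following minimal sketch. The function name and the parameter checks are assumptions made for illustration.

```python
import math


def power_exponential_cov(d, theta1, theta2):
    """Power exponential covariance exp(-(d / theta1) ** theta2) as in (3.6).

    d: distance between two nodes (d >= 0),
    theta1: range parameter (> 0), theta2: smoothness parameter in (0, 2].
    """
    if d < 0 or theta1 <= 0 or not (0 < theta2 <= 2):
        raise ValueError("require d >= 0, theta1 > 0 and 0 < theta2 <= 2")
    return math.exp(-((d / theta1) ** theta2))
```

Setting `theta2 = 1` gives the exponential model and `theta2 = 2` the squared exponential model; in both cases the value is 1 at zero distance and decays towards 0 as the distance grows.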

The correlation structure of an $N$-dimensional continuous jointly Gaussian real-valued random vector $\boldsymbol{Y} = [Y_1, Y_2, \ldots, Y_N]^T$, $\boldsymbol{Y} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{K})$, can be defined with covariance matrix $\boldsymbol{K} \in \mathbb{R}^{N \times N}$. According to the continuous-space process and assuming the squared exponential covariance model, i.e., (3.6) with $\theta_2 = 2$, an entry of covariance matrix $\boldsymbol{K}$ associated with nodes $i_1$ and $i_2$, $i_1 \neq i_2$, can be defined as [10, 32]

$$
K_{i_1 i_2} = \sigma^2 e^{-\theta d_{i_1 i_2}^2}, \tag{3.7}
$$

where $\sigma^2$ is the common variance of random variables $Y_1, Y_2, \ldots, Y_N$ and $\theta > 0$ is a positive correlation coefficient. The diagonal elements reduce to expressions that depend on the variance only, namely [10]

$$
K_{ii} = \sigma_i^2, \quad i = 1, \ldots, N, \tag{3.8}
$$

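where $\sigma_i^2$, $i = 1, \ldots, N$, are the individual variances, which are the same for each random variable $Y_1, Y_2, \ldots, Y_N$. Finally, the covariance matrix of the jointly Gaussian random vector $\boldsymbol{Y} = [Y_1, Y_2, \ldots, Y_N]^T$ following the aforementioned correlation model is given as

$$
\boldsymbol{K} =
\begin{bmatrix}
\sigma_1^2 & \sigma^2 \exp(-\theta d_{12}^2) & \cdots & \sigma^2 \exp(-\theta d_{1N}^2) \\
\sigma^2 \exp(-\theta d_{21}^2) & \sigma_2^2 & \cdots & \sigma^2 \exp(-\theta d_{2N}^2) \\
\vdots & \vdots & \ddots & \vdots \\
\sigma^2 \exp(-\theta d_{N1}^2) & \sigma^2 \exp(-\theta d_{N2}^2) & \cdots & \sigma_N^2
\end{bmatrix}. \tag{3.9}
$$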

The entries of the covariance matrix are non-negative and decrease monotonically with distance, with the extreme values of $\sigma_{i_1}^2$ at $d_{i_1 i_2} = 0$ and $0$ at $d_{i_1 i_2} = \infty$ [32].
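A covariance matrix of the form (3.9) can be assembled from node coordinates as sketched below. This is an illustration under stated assumptions: the nodes are given as coordinate tuples, a common variance `sigma2` is used for all nodes, and the function name is hypothetical.

```python
import math


def covariance_matrix(positions, sigma2, theta):
    """Covariance matrix with entries sigma2 * exp(-theta * d^2) as in (3.7)-(3.9).

    positions: list of coordinate tuples, one per source node (assumed input),
    sigma2: common variance, theta: correlation coefficient (> 0).
    """
    n = len(positions)
    K = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            # Squared Euclidean distance between nodes a and b.
            d2 = sum((p - q) ** 2 for p, q in zip(positions[a], positions[b]))
            # For a == b the distance is zero and the entry equals sigma2.
            K[a][b] = sigma2 * math.exp(-theta * d2)
    return K
```

The diagonal entries equal the variance and the off-diagonal entries shrink as the nodes move apart, which matches the extreme values stated above.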


3.3.2. Global Slepian-Wolf Coding Scenario

The global Slepian-Wolf coding scenario refers to the case where global knowledge of the network is available at each source node. The scenario assumes that each node knows two things: the correlation structure of the data readings between the source nodes, and the path weights from each source node to the sink node. The correlation structure is needed for the rate allocation, since the source nodes use conditional entropy to define their individual rates. The path weights have to be disseminated through the network so that each node is tagged with a certain index and knows its proper order in the rate allocation process.

With the assumption that the nodes sense the physical phenomenon in a Gaussian random field, the data samples before quantization have a multivariate Gaussian distribution. However, assuming that the data at each node is quantized independently with the same, sufficiently small quantization step $\Delta$, the differential entropy can be used instead of the usual entropy to assign the rates for each source. This was discussed in more detail in Section 2.1.2. Hence, according to (2.9), the rate allocation process is performed for the continuous jointly Gaussian random vector $\boldsymbol{Y} = [Y_1, Y_2, \ldots, Y_N]^T$, $\boldsymbol{Y} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{K})$.

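The rate allocation problem in (3.4) with global Slepian-Wolf coding in a Gaussian random field can be expressed in the following form:

$$
\begin{aligned}
\underset{\{R_i\}_{i=1}^{N}}{\text{minimize}} \quad & \sum_{i=1}^{N} R_i w_i^{\mathrm{SPT}} \\
\text{subject to} \quad & \sum_{i \in \mathcal{K}} R_i \geq H(\boldsymbol{Y}_{\mathcal{K}} | \boldsymbol{Y}_{\mathcal{K}^c}), \quad \forall \mathcal{K} \subseteq \mathcal{S}.
\end{aligned} \tag{3.10}
$$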

The algorithm for solving the rate allocation problem in (3.10) with global Slepian-Wolf coding in Gaussian random field is summarized in Algorithm 1.


Algorithm 1 Rate allocation with global Slepian-Wolf coding in single-sink data gathering network

1. Find the SPT by running the distributed Dijkstra’s shortest path tree algorithm in given graph $G_{\mathrm{ssdgn}} = (\mathcal{T}, \mathcal{E})$. As a result, source nodes $i \in \mathcal{S}$ are assigned path weights $w_i$:

    for $i = 1 : N$
        $w_i \leftarrow w_i^{\mathrm{SPT}}$
    end for

2. Order the source nodes in ascending order on the SPT with respect to the path weights to the sink node $N + 1$ and reassign the corresponding indices $i = 1, 2, \ldots, N$:

    $w_1^{\mathrm{SPT}} \leq w_2^{\mathrm{SPT}} \leq \ldots \leq w_N^{\mathrm{SPT}} \Leftrightarrow Y_1, Y_2, \ldots, Y_N$

3. Assign rate $R_i$ for each node $i \in \mathcal{S}$:

    for $i = 1 : N$
        if $i = 1$

$$
R_1 = h(Y_1) = \frac{1}{2} \log_2\left(2 \pi e \sigma_Y^2\right) \tag{3.11}
$$

        end if
        if $i = 2, 3, \ldots, N$

$$
R_i = h(Y_i | Y_{i-1}, Y_{i-2}, \ldots, Y_1) = \frac{1}{2} \log_2\left( (2 \pi e) \frac{\det\left(\boldsymbol{K}_{[i, i-1, \ldots, 1]}\right)}{\det\left(\boldsymbol{K}_{[i-1, i-2, \ldots, 1]}\right)} \right) \tag{3.12}
$$

        end if
    end for

In the algorithm, the rate assignments for Gaussian distributed random sources are calculated according to (2.10), (2.15) and (3.5). Each random variable $Y_i$, $i = 1, 2, \ldots, N$, has the same variance $\sigma_Y^2$. Subscript indices $i, i-1, \ldots, 1$ of covariance matrix $\boldsymbol{K}$ denote the respective row and column indices of $\boldsymbol{K}$ that are selected to form covariance matrix $\boldsymbol{K}_{[i, i-1, \ldots, 1]}$. Indices $i-1, i-2, \ldots, 1$ refer to the source nodes that are closer to the sink node $N + 1$ on the SPT than node $i$ itself. Since the source nodes are reassigned new indices after finding the SPT, the covariance matrix also has to be rearranged accordingly. The covariance matrix has the structure given in (3.9), since the squared exponential correlation model is assumed for the data correlation.
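Step 3 of Algorithm 1 can be sketched as follows. This is an illustrative, centralized computation rather than the thesis implementation: the determinant helper, the function names and the zero-based node indices are assumptions, and the covariance matrix is assumed to be positive definite.

```python
import math


def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n = len(A)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result  # a row swap flips the sign
        result *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return result


def global_sw_rates(K, path_weights):
    """Rates (3.11)-(3.12) for nodes ordered by ascending path weight.

    K: covariance matrix indexed by node, path_weights: list of SPT weights.
    Returns (order, rates) with rates[node] in bits (differential entropy).
    """
    order = sorted(range(len(path_weights)), key=lambda i: path_weights[i])
    rates = {}
    for pos, node in enumerate(order):
        idx = order[:pos + 1]  # the node itself and all nodes closer to the sink
        num = det([[K[a][b] for b in idx] for a in idx])
        prev = idx[:-1]
        den = det([[K[a][b] for b in prev] for a in prev]) if pos else 1.0
        rates[node] = 0.5 * math.log2(2 * math.pi * math.e * num / den)
    return order, rates
```

By the chain rule, the rates returned by this sketch sum to the joint differential entropy of the vector, which offers a simple consistency check.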

An example of the starting point of the rate allocation problem with ordered node indices in the single-sink data gathering scenario with global Slepian-Wolf coding is illustrated in Fig. 5. The source nodes have been sorted in ascending order according to the path weights to the sink node on the SPT and assigned the corresponding indices, as described in Algorithm 1.

[Figure 5: source nodes $Y_1, Y_2, \ldots, Y_{17}$ placed around the sink node $N + 1$, indexed in ascending order of their path weights on the SPT.]

Figure 5. A single-sink data gathering scenario with global Slepian-Wolf coding.

To define the path weights for each source node on the SPT, Cristescu et al. [10] used a distance metric along the predetermined SPT. Thus, the path weights are determined by means of expression (3.2) as

$$
w_i = \sum_{i: e \in \mathcal{E}} d_e^2, \tag{3.13}
$$

where $d_e$ is the length of edge $e$, i.e., the distance between the end nodes $i_1$ and $i_2$ of the edge, $i_1, i_2 \in \mathcal{T}$. Thus, the source nodes located in the immediate proximity of the sink node are allocated higher rates than the nodes lying at the extremity of the sensing region. The procedure is reasonable because such an ordering reduces the total data flow occurring at long distances from the destination, which decreases the data transportation cost function.
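The path weight in (3.13) can be accumulated along the tree as in the following sketch. The parent-pointer representation of the SPT, the coordinate dictionary and the function name are assumptions made for illustration.

```python
def squared_distance_path_weight(node, parent, position):
    """Sum of squared edge lengths from a node to the sink along the SPT.

    parent: dict mapping each node to its next hop towards the sink
            (the sink maps to None), assumed to describe the SPT,
    position: dict mapping each node to a coordinate tuple.
    """
    weight = 0.0
    while parent[node] is not None:
        nxt = parent[node]
        # Squared Euclidean length of the edge (node, nxt).
        weight += sum((a - b) ** 2 for a, b in zip(position[node], position[nxt]))
        node = nxt
    return weight
```

For the sink itself the loop is skipped and the weight is zero, while every additional hop adds the squared length of the corresponding edge.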

3.3.3. Localized Slepian-Wolf Coding Scenario

Slepian-Wolf coding of correlated sources in a global fashion requires that the correlation structure of the data and the path weights associated with each node are known a priori in the whole network. For distributed networking this is not convenient, because of the communication overhead needed to gather all the necessary information for source encoding. Especially in wireless sensor networks, where the nodes face a scarcity of communication resources, the nodes should operate relatively autonomously and with a small amount of message exchange to save energy. An alternative to global Slepian-Wolf coding that is more applicable and less complex, yet provides solutions sufficiently close to the optimal rate allocation, is referred to as localized Slepian-Wolf coding. [10]

In the localized Slepian-Wolf coding scenario, the main idea is to partition the network into disjoint sets, or local areas, and to perform Slepian-Wolf coding inside each subset. The main difference compared to the global Slepian-Wolf coding scenario is that each node belonging to its designated subset has a smaller number of sources on which to condition when determining its rate. The localization of the encoding process performs relatively well because the correlation between the data readings of the nodes typically diminishes with the distance between the nodes [10]. Although the property of conditional entropy in (2.5) asserts that entropy can be reduced by conditioning, the ignored nodes outside the local area would have had only a negligible effect on the rate reduction. The advantage compared to the global Slepian-Wolf scenario is that the rate allocation can be done in a distributed manner with a reduced amount of overhead signalling and without global knowledge of the network. In addition, localized Slepian-Wolf coding can offer robustness against node failures in the network [8]: the failure of a node in one subset does not automatically cause a failure in data reception from the nodes belonging to another subset.

A prerequisite for the use of localized Slepian-Wolf coding is that the network is clustered in order to define the local areas for each node. The clustering of the network is an optimization problem of its own, and it has been investigated in the literature related to Slepian-Wolf coding in correlated data gathering scenarios. For instance, Wang et al. [8] proposed an algorithm for the selection of disjoint clusters covering the whole network such that the global compression gain of Slepian-Wolf coding is maximized. Once the clustering has been made, Slepian-Wolf coding is performed in the usual way inside each cluster by exploiting the correlation structure of the data only among the intra-cluster nodes. Zhang et al. [28] proposed a distributed algorithm for solving the weighted connected dominating set problem in order to minimize the total communication costs, e.g., in a spatially correlated wireless sensor network. The data correlation was exploited by dividing the network into clusters, where each node sends its data to the chosen cluster head. The cluster heads perform data compression and send the data to the sink node. The algorithm yielded substantial energy savings in the network.

Assume that the clustering of the network has been performed with some particular procedure in given graph $G_{\mathrm{ssdgn}} = (\mathcal{T}, \mathcal{E})$. The network is partitioned into $D$ disjoint subsets of source nodes, denoted by $\mathcal{C}_j$, $j = 1, 2, \ldots, D$, which cover the whole set of source nodes $\mathcal{S}$ in the network, hence $\mathcal{C}_1 \cup \mathcal{C}_2 \cup \ldots \cup \mathcal{C}_D = \mathcal{S}$. The set of $N_j$ source nodes in cluster $\mathcal{C}_j$ is defined as $\mathcal{C}_j = \{i_1, i_2, \ldots, i_{N_j}\} \subseteq \{1, 2, \ldots, N\}$, $N_j \leq N$, with the global node indices selected from $i \in \mathcal{S}$. The nodes in cluster $\mathcal{C}_j$ are also tagged with local node indices $\bar{i} = 1, 2, \ldots, \bar{N}_j$, $\bar{N}_j = N_j$, such that they correspond to the global indices in the same ascending order, that is $1, 2, \ldots, \bar{N}_j \Leftrightarrow i_1, i_2, \ldots, i_{N_j}$. The set of source nodes in cluster $\mathcal{C}_j$ with the local node indices is denoted by $\bar{\mathcal{C}}_j$. A Gaussian distributed continuous real-valued random vector associated with the sources in cluster $\mathcal{C}_j \subseteq \mathcal{S}$ is denoted by $\boldsymbol{Y}^j = [Y^j_{\bar{1}}, Y^j_{\bar{2}}, \ldots, Y^j_{\bar{N}_j}]^T \sim \mathcal{N}(\boldsymbol{\mu}^j, \boldsymbol{K}^j)$. The mean vector is given as $\boldsymbol{\mu}^j = [\mu^j_{\bar{1}}, \mu^j_{\bar{2}}, \ldots, \mu^j_{\bar{N}_j}]^T \in \mathbb{R}^{N_j}$, corresponding to indices $\bar{i} \in \bar{\mathcal{C}}_j$, and $\boldsymbol{K}^j \in \mathbb{R}^{N_j \times N_j}$ is the respective covariance matrix selected from matrix $\boldsymbol{K}$ with respect to the rows and columns referring to node indices $i \in \mathcal{C}_j$.

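Solving the rate allocation problem in (3.4) with localized Slepian-Wolf coding separates into $D$ distinct LP problems that are solved independently within each cluster $\mathcal{C}_j \subseteq \mathcal{S}$ [10]. The rate allocation problem with localized Slepian-Wolf coding in a Gaussian random field involves solving the following optimization problem for each cluster $\mathcal{C}_j$, $j = 1, 2, \ldots, D$:

$$
\begin{aligned}
\underset{\{R^j_{\bar{i}}\}_{\bar{i}=1}^{\bar{N}_j}}{\text{minimize}} \quad & \sum_{\bar{i}=1}^{\bar{N}_j} R^j_{\bar{i}} \, w^{j,\mathrm{SPT}}_{\bar{i}} \\
\text{subject to} \quad & \sum_{\bar{i} \in \mathcal{K}_j} R^j_{\bar{i}} \geq H(\boldsymbol{Y}^j_{\mathcal{K}_j} | \boldsymbol{Y}^j_{\mathcal{K}^c_j}), \quad \forall \mathcal{K}_j \subseteq \bar{\mathcal{C}}_j,
\end{aligned} \tag{3.14}
$$

where $R^j_{\bar{i}}$ is the rate of node $\bar{i}$ in cluster $\mathcal{C}_j$, $w^{j,\mathrm{SPT}}_{\bar{i}}$ is the corresponding path weight on the SPT, and $\boldsymbol{Y}^j_{\mathcal{K}_j}$ and $\boldsymbol{Y}^j_{\mathcal{K}^c_j}$ are the disjoint vectors containing all the random variables in $\boldsymbol{Y}^j$. The vectors can be expressed as $\boldsymbol{Y}^j_{\mathcal{K}_j} = [Y^j_{\bar{1}}, Y^j_{\bar{2}}, \ldots, Y^j_{|\mathcal{K}_j|}]^T$ and $\boldsymbol{Y}^j_{\mathcal{K}^c_j} = [Y^{j,c}_{\bar{1}}, Y^{j,c}_{\bar{2}}, \ldots, Y^{j,c}_{|\mathcal{K}^c_j|}]^T$, where the subsets within cluster $\mathcal{C}_j$ satisfy $\mathcal{K}_j \cup \mathcal{K}^c_j = \bar{\mathcal{C}}_j$, $\mathcal{K}_j \cap \mathcal{K}^c_j = \emptyset$.

The algorithm for solving the rate allocation problem in (3.14) with localized Slepian-Wolf coding in a Gaussian random field is summarized in Algorithm 2.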


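Algorithm 2 Rate allocation with localized Slepian-Wolf coding in single-sink data gathering

1. Partition the network into disjoint subsets of source nodes, such that they cover all the source nodes in the network, $\mathcal{C}_1 \cup \mathcal{C}_2 \cup \ldots \cup \mathcal{C}_D = \mathcal{S}$. Assign local indices $\bar{i} = 1, 2, \ldots, \bar{N}_j$ for source nodes within each cluster $\mathcal{C}_j$, $j = 1, 2, \ldots, D$, in the same ascending order $1, 2, \ldots, \bar{N}_j \Leftrightarrow i_1, i_2, \ldots, i_{N_j}$.

2. Find the SPT by running a distributed Dijkstra’s shortest path tree algorithm in given graph $G_{\mathrm{ssdgn}} = (\mathcal{T}, \mathcal{E})$. As a result, each source node $\bar{i} \in \bar{\mathcal{C}}_j$ is assigned path weight $w^j_{\bar{i}}$, $j = 1, 2, \ldots, D$:

    for $j = 1 : D$
        for $\bar{i} = 1 : \bar{N}_j$
            $w^j_{\bar{i}} \leftarrow w^{j,\mathrm{SPT}}_{\bar{i}}$
        end for
    end for

3. Within each cluster $\mathcal{C}_j \subseteq \mathcal{S}$, $j = 1, 2, \ldots, D$, order the source nodes $\bar{i} \in \bar{\mathcal{C}}_j$ in ascending order on the SPT with respect to the path weights to the sink node $N + 1$ and reassign the corresponding indices $\bar{i} = 1, 2, \ldots, \bar{N}_j$:

    for $j = 1 : D$
        $w^{j,\mathrm{SPT}}_{\bar{1}} \leq w^{j,\mathrm{SPT}}_{\bar{2}} \leq \ldots \leq w^{j,\mathrm{SPT}}_{\bar{N}_j} \Leftrightarrow Y^j_{\bar{1}}, Y^j_{\bar{2}}, \ldots, Y^j_{\bar{N}_j}$
    end for

4. Within each cluster $\mathcal{C}_j \subseteq \mathcal{S}$, $j = 1, 2, \ldots, D$, assign rate $R^j_{\bar{i}}$ for each node $\bar{i} \in \bar{\mathcal{C}}_j$:

    for $j = 1 : D$
        for $\bar{i} = 1 : \bar{N}_j$
            if $\bar{i} = 1$

$$
R^j_{\bar{1}} = h(Y^j_{\bar{1}}) = \frac{1}{2} \log_2\left(2 \pi e \sigma_Y^2\right) \tag{3.15}
$$

            end if
            if $\bar{i} = 2, 3, \ldots, \bar{N}_j$

$$
R^j_{\bar{i}} = h(Y^j_{\bar{i}} | Y^j_{\bar{i}-1}, Y^j_{\bar{i}-2}, \ldots, Y^j_{\bar{1}}) = \frac{1}{2} \log_2\left( (2 \pi e) \frac{\det\left(\boldsymbol{K}^j_{[\bar{i}, \bar{i}-1, \ldots, 1]}\right)}{\det\left(\boldsymbol{K}^j_{[\bar{i}-1, \bar{i}-2, \ldots, 1]}\right)} \right) \tag{3.16}
$$

            end if
        end for
    end for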

In the algorithm, the rate assignments for Gaussian distributed random sources within each cluster $\mathcal{C}_j \subseteq \mathcal{S}$ are computed according to expressions (2.10), (2.15) and (3.5). Each random variable $Y^j_{\bar{i}}$, $\bar{i} = 1, 2, \ldots, \bar{N}_j$, $j = 1, 2, \ldots, D$, is assumed to have the same variance $\sigma_Y^2$. Subscript indices $\bar{i}, \bar{i}-1, \ldots, 1$ of covariance matrix $\boldsymbol{K}^j$ denote the respective row and column indices of $\boldsymbol{K}^j$ that are selected to form covariance matrix $\boldsymbol{K}^j_{[\bar{i}, \bar{i}-1, \ldots, 1]}$. Indices $\bar{i}, \bar{i}-1, \ldots, 1$ correspond to the particular nodes that are involved in the calculation of the rate for node $\bar{i}$. For instance, indices $\bar{i}-1, \bar{i}-2, \ldots, 1$ refer to the source nodes in cluster $\mathcal{C}_j$ that are closer to sink node $N + 1$ on the SPT than node $\bar{i}$ itself. Since the source nodes are reassigned new indices after finding the SPT, the covariance matrix also has to be rearranged accordingly. The data correlation is assumed to follow the exponential correlation law, thus the covariance matrix has the form given in (3.9).

It is worth noting that in the localized Slepian-Wolf coding scenario the encoding of sources with conditioning considers only the sources located in the respective cluster. Hence, the source nodes outside the cluster under consideration have no influence on the rate allocation when the rate assignments are done in a localized manner.
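The cluster-wise rate assignment can be sketched as follows. Note the swapped-in technique: instead of the determinant ratio in (3.16), this sketch uses the squared diagonal entries of a Cholesky factor, which equal the same ratio of leading principal minors for a positive definite covariance matrix. The function names, the zero-based node indices and the cluster list format are assumptions.

```python
import math


def cholesky_diagonal(K):
    """Diagonal of the lower Cholesky factor L of K (K = L L^T, K positive definite)."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return [L[i][i] for i in range(n)]


def localized_sw_rates(K, path_weights, clusters):
    """Rates per node when conditioning only on nodes of the same cluster.

    K: full covariance matrix, path_weights: SPT weight per node,
    clusters: list of disjoint lists of node indices covering all sources.
    """
    rates = {}
    for cluster in clusters:
        # Order the cluster members by ascending path weight to the sink.
        order = sorted(cluster, key=lambda i: path_weights[i])
        Kj = [[K[a][b] for b in order] for a in order]
        for node, l_ii in zip(order, cholesky_diagonal(Kj)):
            # l_ii ** 2 is the variance of the node conditioned on the
            # cluster members that precede it in the ordering.
            rates[node] = 0.5 * math.log2(2 * math.pi * math.e * l_ii ** 2)
    return rates
```

A node that is alone in its cluster, or first in its cluster ordering, is coded at its unconditioned differential entropy, which illustrates the rate penalty of ignoring nodes outside the cluster.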

An example of the rate allocation problem in a single-sink data gathering scenario with localized Slepian-Wolf coding is illustrated in Fig. 6. The network is partitioned into four subsets $\mathcal{C}_1$, $\mathcal{C}_2$, $\mathcal{C}_3$ and $\mathcal{C}_4$ such that they cover all the nodes in the network. The figure also illustrates how the node indices have been reassigned, i.e., the closest node to the sink node on the SPT in each cluster $\mathcal{C}_j$, $j = 1, 2, 3, 4$, corresponds to $Y^j_{\bar{1}}$, the second closest to $Y^j_{\bar{2}}$, and so on. The figure shows the distinctive property of the localized Slepian-Wolf coding scenario: the area over which the conditioning is done when encoding the sources is smaller than in the global Slepian-Wolf case. An inevitable trade-off between the amount of information that has to be acquired and the rate reduction achieved by encoding the sources arises when choosing the method for Slepian-Wolf coding. Thus, the important system parameters for adjusting the trade-off are the clustering algorithm used and, naturally, the cluster sizes in the network.

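[Figure 6: source nodes grouped into four clusters $\mathcal{C}_1$, $\mathcal{C}_2$, $\mathcal{C}_3$ and $\mathcal{C}_4$ around the sink node $N + 1$; each node is labeled by its cluster index and its local index within the cluster.]

Figure 6. A single-sink data gathering scenario with localized Slepian-Wolf coding.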

4. SINGLE-SINK DATA GATHERING WIRELESS SENSOR NETWORK

In a single-sink data gathering wireless sensor network, densely deployed sensor nodes observe the phenomenon of interest with the objective of transporting all the measured data to the sink node for further processing and analysis. The observations of spatially proximal sensor nodes are highly correlated, which leads to a substantial amount of redundancy in the data. The degree of correlation typically decreases with the spatial separation of the sensor nodes. Slepian-Wolf coding can completely remove the redundancy in the data without inter-sensor communication, which makes it a promising method for data aggregation in energy-scarce wireless sensor networks. [8]

In this chapter, the concept of a single-sink correlated data gathering wireless sensor network is described. In Section 4.1, the network topology of the WSN is defined. Section 4.2 describes the multi-path routing model that is used for defining multiple routes to deliver the data packets to the sink node. The section introduces flow conservation laws by assuming lossless data transmissions across the links. A communication model across the wireless links is defined in Section 4.3 by assuming an orthogonal multiple access scheme, which leads to non-interfering communications over the links in the network. The section describes the power allocation scheme used in the system and defines the expression for the link capacity.

4.1. Network Topology

Consider a single-sink data gathering WSN consisting of $N$ sensing nodes and a sink node that is denoted by $N + 1$. The WSN can be modeled as a directed graph $G_{\mathrm{wsn}} = (\mathcal{A}, \mathcal{L})$, where set $\mathcal{A} = \{1, 2, \ldots, N, N + 1\}$ is the set of node indices $i \in \mathcal{A}$ and set $\mathcal{L} = \{1, 2, \ldots, L\}$ is the set of link indices $l \in \mathcal{L}$. The set of $N$ source nodes is defined as $\mathcal{S} = \{1, 2, \ldots, N\}$, such that $\mathcal{A} = \mathcal{S} \cup \{N + 1\}$. A single-sink data gathering WSN is illustrated in Fig. 7.

Each strictly energy-constrained sensor node $i \in \mathcal{S}$ is capable of transmitting, receiving and relaying data. Sensor nodes operate in a full-duplex mode by using frequency division duplexing (FDD). The sink node receives the data originating from each source and is capable of performing the joint decoding of the data. Each sensor node $i \in \mathcal{S}$ has a fixed transmission range $d^{t}_{i}$. The distance between nodes $i_1$ and $i_2$ is denoted by $d_{i_1 i_2}$, $i_1, i_2 \in \mathcal{A}$, $i_1 \neq i_2$. Thus, a directed wireless link from node $i_1$ to $i_2$, denoted by $(i_1, i_2)$, is available for data transmission if $d_{i_1 i_2} \leq d^{t}_{i_1}$. [7]

The network topology, in terms of the interactions between the nodes and the links, can be compactly described with a node-link incidence matrix $\boldsymbol{A} \in \mathbb{Z}^{(N+1) \times L}$. An entry $a_{il}$ of the matrix, associated with node $i$ and link $l$, can be written as [17]

$$
a_{il} =
\begin{cases}
1, & \text{if node } i \text{ is the start node of link } l \\
-1, & \text{if node } i \text{ is the end node of link } l \\
0, & \text{otherwise.}
\end{cases} \tag{4.1}
$$
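The incidence matrix in (4.1) can be built from a list of directed links as in the following minimal sketch. The link list format with one-based node indices and the function name are assumptions.

```python
def incidence_matrix(num_nodes, links):
    """Node-link incidence matrix A following (4.1).

    num_nodes: N + 1 (sources plus sink), with one-based node indices,
    links: list of directed links (start, end); link l is column l of A.
    """
    A = [[0] * len(links) for _ in range(num_nodes)]
    for l, (start, end) in enumerate(links):
        A[start - 1][l] = 1   # node is the start node of link l
        A[end - 1][l] = -1    # node is the end node of link l
    return A
```

Every column contains exactly one `1` and one `-1`, so the column sums are zero, which is a quick sanity check for a link list.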

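[Figure 7: sensor nodes $Z_1, Z_2, \ldots, Z_7$ and the sink node $N + 1$ connected by directed wireless links. Each link is labeled with its channel power gain $\kappa^2$ (links 12, 14, 23, 25, 35, 46, 47, 56, 67, 1(N+1) and 2(N+1)) and a channel noise realization, each sensor node with its source rate $r_1, \ldots, r_7$, and the sink with the sum $r_1 + r_2 + \ldots + r_7$.]

Figure 7. A single-sink data gathering wireless sensor network.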

4.2. Multi-path Routing Model

The routing of data packets across the WSN is assumed to follow a braided multi-path routing model. Multi-path routing allows the data from each source to be transmitted to the sink node via multiple paths, thus balancing the traffic load in the network [7]. Multi-path routing can provide significant improvements in efficiency, such as in network utilization, compared to traditional routing where the data is routed via a single path to the destination [34]. Multiple paths can be either partially or completely disjoint. Braided multi-path routing refers to the partially disjoint case, where the alternate paths are not necessarily required to be disjoint [35]. Ganesan et al. [35] showed that the braided technique provides energy savings over the disjoint multi-path routing method in some particular cases.

Let $f_l \geq 0$ denote the total flow on link $l \in \mathcal{L}$, so that the corresponding vector for the network is $\boldsymbol{f} = [f_1, f_2, \ldots, f_L]^T \in \mathbb{R}^L_+$. If flow $f_{i_1 i_2} > 0$, node $i_1$ has decided to send data to node $i_2$ by using link $(i_1, i_2)$. In addition, each source node $i \in \mathcal{S}$ is associated with an external flow $r_i > 0$, which is the source rate of node $i$. The sink node is associated with the sink rate $r_{N+1} < 0$. Thus, the rate vector for the whole network is $\boldsymbol{r} = [r_1, r_2, \ldots, r_N, r_{N+1}]^T \in \mathbb{R}^{N+1}$.

Lossless data gathering in the network requires that the flow conservation law holds at each node $i \in \mathcal{A}$. For each node $i \in \mathcal{A}$, the difference between all the outgoing flows and all the incoming flows has to equal the external flow of node $i$. The flow conservation law at each node $i \in \mathcal{A}$ can be expressed as follows [7]

$$
\sum_{l \in \mathcal{O}(i)} f_l - \sum_{l \in \mathcal{I}(i)} f_l =
\begin{cases}
r_i, & \text{if } i \in \mathcal{S} \\
-\sum_{i \in \mathcal{S}} r_i, & \text{if } i = N + 1
\end{cases} \tag{4.2}
$$

where $\mathcal{O}(i)$ denotes the outgoing links of node $i$ and $\mathcal{I}(i)$ the incoming links. The flow conservation law in (4.2) also introduces a quality of service (QoS) requirement, namely that all the source rates have to be transported to the sink node, which is given by $r_{N+1} = -\sum_{i \in \mathcal{S}} r_i$.

The compact expression for the flow conservation law in the whole network can be written as [17]

$$
\boldsymbol{A} \boldsymbol{f} = \boldsymbol{r}, \tag{4.3}
$$

where $\boldsymbol{A}$ is the node-link incidence matrix described in (4.1).
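The compact form (4.3) can be verified numerically for a candidate flow vector, as in the following sketch. The function names, the tolerance and the three-node example in the usage note are assumptions.

```python
def flow_residual(A, f, r):
    """Row-wise residual A f - r of the flow conservation law (4.3)."""
    return [sum(a * x for a, x in zip(row, f)) - ri for row, ri in zip(A, r)]


def satisfies_flow_conservation(A, f, r, tol=1e-9):
    """True if every node balances outgoing minus incoming flow with its rate."""
    return all(abs(res) <= tol for res in flow_residual(A, f, r))
```

For a hypothetical network with two sources and one sink, links (1,2), (1,3) and (2,3), and rates `r = [1.0, 0.5, -1.5]`, the flow vector `f = [0.25, 0.75, 0.75]` satisfies the law: node 1 sends out 1.0 in total, node 2 forwards the 0.25 it receives plus its own 0.5, and the sink absorbs 1.5.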

4.3. Communication Model

Throughout this thesis, it is assumed that appropriate and fixed medium access control, coding and modulation schemes are used in order to support feasible data traffic across the links in the WSN. Moreover, frequency division multiple access (FDMA) is assumed to be used in the system, leading to non-interfering communications across the links. Based on this assumption, the link capacities are functions of local resource allocations. The links are single-input, single-output (SISO) links; hence, multi-antenna communication is not supported. Each sensor node $i \in \mathcal{S}$ uses a fixed bandwidth and fixed time slots, but it can allocate different transmit powers to its outgoing links $\mathcal{O}(i)$. The channels are assumed to be AWGN channels with Rayleigh distributed channel coefficients. A channel noise realization associated with link $(i_1, i_2)$ is denoted by $\Upsilon_{i_1 i_2}$ in Fig. 7. [17]

The link capacity as a function of transmit power $p_l \geq 0$, with unit bandwidth allocated for each link $l \in \mathcal{L}$ and with the inverse-square path loss model, is given by [36, Ch. 4] [17, 37]

$$
c_l = \log_2\left(1 + \left(\frac{d_0}{d_l}\right)^2 \frac{\kappa_l^2 p_l}{\varsigma_l^2}\right), \quad l = 1, 2, \ldots, L, \tag{4.4}
$$

where $d_0 = \min_{l \in \mathcal{L}}(d_l)$ is a reference distance, $d_l$ is the length of link $l$, $\varsigma_l^2$ is the power spectral density of the AWGN present at each receiver and $\kappa_l \geq 0$ is a real-valued channel coefficient of link $l$, obtained by taking the real part of the time-varying and input-independent Rayleigh distributed complex random number. The wireless communication links are modeled as Rayleigh flat fading channels by assuming that the period of the transmitted signal is larger than the multi-path delay spread [37]. Moreover, each time-varying channel power gain $\kappa_l^2$ associated with link $l \in \mathcal{L}$ is correlated over time. The noise is assumed to be uniformly distributed across the network, such that $\varsigma_l^2 = \varsigma^2$, $\forall l \in \mathcal{L}$.
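The capacity expression (4.4) and its inverse can be sketched as follows. The inverse, i.e., the smallest transmit power that supports a given flow on a link, is obtained here by solving $f_l = c_l$ for $p_l$; it is an illustrative addition, and the function and parameter names are assumptions.

```python
import math


def link_capacity(p, d, d0, kappa2, noise):
    """Capacity (4.4) in bits per channel use for unit bandwidth.

    p: transmit power, d: link length, d0: reference distance,
    kappa2: channel power gain kappa^2, noise: noise power spectral density.
    """
    snr = (d0 / d) ** 2 * kappa2 * p / noise
    return math.log2(1.0 + snr)


def min_power_for_flow(f, d, d0, kappa2, noise):
    """Smallest power p with link_capacity(p, ...) >= f (requires kappa2 > 0)."""
    gain = (d0 / d) ** 2 * kappa2 / noise
    return (2.0 ** f - 1.0) / gain
```

Because the capacity is increasing in the transmit power, `min_power_for_flow` gives the least power for which the capacity constraint $f_l \leq c_l$ holds on that link.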

Since capacity-constrained communication links are assumed, the total flow on each link $l \in \mathcal{L}$ has to satisfy $f_l \leq c_l$ in order to achieve successful data transmissions across link $l$. Each sensor node $i \in \mathcal{S}$ is limited by the total transmit power $P_i^{\mathrm{tot}}$ that it can allocate to its outgoing links $\mathcal{O}(i)$, that is

$$
\sum_{l \in \mathcal{O}(i)} p_l \leq P_i^{\mathrm{tot}}, \quad i = 1, 2, \ldots, N. \tag{4.5}
$$

The total transmit powers associated with each node in the network can be expressed with vector $\boldsymbol{g} = [P_1^{\mathrm{tot}}, P_2^{\mathrm{tot}}, \ldots, P_N^{\mathrm{tot}}, 0]^T \in \mathbb{R}^{N+1}_+$, where the last entry corresponds to the sink node. The total power constraints for the entire network can be expressed as [17]

$$
\boldsymbol{B} \boldsymbol{p} \preceq \boldsymbol{g}, \tag{4.6}
$$

where $\boldsymbol{B} = \boldsymbol{A}^+ \in \mathbb{Z}^{(N+1) \times L}_+$ and $\boldsymbol{p} = [p_1, p_2, \ldots, p_L]^T \in \mathbb{R}^L_+$ is the transmit power vector. An element $(a_{il})^+$ of $\boldsymbol{A}^+$ is given by $(a_{il})^+ = \max\{0, a_{il}\}$; thus, matrix $\boldsymbol{B}$ identifies the outgoing links of each node $i$ [17].
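The construction of $\boldsymbol{B}$ and the check of (4.6) can be sketched as follows. The function names and the small numerical tolerance are assumptions.

```python
def outgoing_link_matrix(A):
    """B = A^+ with entries max(0, a_il), marking the outgoing links of each node."""
    return [[max(0, a) for a in row] for row in A]


def node_power_usage(A, p):
    """Total transmit power B p that each node spends on its outgoing links."""
    B = outgoing_link_matrix(A)
    return [sum(b * x for b, x in zip(row, p)) for row in B]


def within_power_budget(A, p, g, tol=1e-12):
    """True if B p <= g holds componentwise, as required by (4.6)."""
    return all(u <= gi + tol for u, gi in zip(node_power_usage(A, p), g))
```

The row of $\boldsymbol{B}$ that belongs to the sink is all zeros when the sink has no outgoing links, which is consistent with the last entry of $\boldsymbol{g}$ being zero.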


5. TOTAL TRANSMIT POWER MINIMIZATION

This chapter introduces a framework for total transmit power minimization in single-sink data gathering wireless sensor networks, where the source nodes employ Slepian-Wolf coding for the correlated data obtained by sensing a physical phenomenon of interest. Hence, the objective is to gather all the correlated and individual data in the network and deliver it to the sink node for lossless recovery of the message in a way that minimizes the total usage of the transmit powers of the nodes.

The total power minimization can be motivated by the fact that it achieves highly efficient communications in the network overall, mostly by stressing the sensor nodes that have good connectivity to the sink node. However, the approach causes unavoidable failures of the over-stressed nodes due to energy exhaustion. Hence, it is assumed that the failure of a single node does not have a dramatic influence on the connectivity, coverage and functionality of the correlated data gathering network, due to the dense deployment of sensor nodes associated with highly correlated spatial data. This implies an extended network lifetime, in the sense that the correlated data is still transported to the destination from the majority of the data gathering network for a longer time without renewing the energy throughout the mission. [2, 4]

Since one key requirement for wireless sensor networks is relatively autonomous operation, the joint optimization has to be carried out in a distributed manner. The optimization framework applies cross-layer optimization over the physical and network layers, that is, the total transmit power minimization is done by jointly optimizing the power allocation and the routing in the network.

The chapter starts with Section 5.1, where the total transmit power minimization is formulated as a convex optimization problem over the power and flow variables. The stated optimization problem inherently requires centralized networking to find the optimal solution. In order to obtain a solution procedure that is more applicable in wireless sensor networks, Section 5.2 gives the main principles of solving a convex optimization problem in a distributed fashion by means of the Lagrange dual decomposition technique. The section proposes a distributed algorithm for solving the total transmit power minimization problem by exchanging only the optimization and Lagrange variables within a proximal neighborhood of each sensor node.

5.1. Centralized Approach for Joint Power and Routing Optimization

The objective is to minimize the total transmit power in the single-sink data gathering WSN given by graph $G_{\mathrm{wsn}}$ while guaranteeing that all the individual but correlated data can be fully recovered at the destination. Source nodes produce data readings in the sensing region, leading to samples $Z_i$, $i = 1, 2, \ldots, N$, which are spatially correlated. Each random variable of information $Z_i$ is generated from a continuous-space random process, which is discussed in more detail in Section 3.3.1. By assuming a sufficiently small quantization step size and a Gaussian random field, as described in Section 3.3.2, the process constitutes a continuous Gaussian distributed random vector $\boldsymbol{Y} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{K})$, where $\boldsymbol{Y} = [Y_1, Y_2, \ldots, Y_N]^T$ [10].

Before the joint optimization of the power allocation and routing, Slepian-Wolf coding is performed at each source node $i \in \mathcal{S}$ in order to remove all the redundancy in the correlated data. Hence, the Slepian-Wolf coding part of the correlated data gathering is performed separately from the joint optimization of the power allocation and routing. Slepian-Wolf coding can be performed either with global SW coding, described in Algorithm 1, or with localized SW coding, described in Algorithm 2, depending on, e.g., the system requirements and the knowledge about the correlation structure of the data.

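After the Slepian-Wolf source coding rates are assigned to the source nodes, the total transmit power minimization problem can be expressed as a joint optimization over the power and flow variables as

\[
\begin{aligned}
\underset{p_l,\, f_l}{\text{minimize}} \quad & \sum_{l \in \mathcal{L}} p_l \\
\text{subject to} \quad & \mathbf{A}\mathbf{f} = \mathbf{r} \\
& f_l \le \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right), \quad \forall l \in \mathcal{L} \\
& \mathbf{B}\mathbf{p} \preceq \mathbf{g} \\
& f_l \ge 0, \; p_l \ge 0, \quad \forall l \in \mathcal{L},
\end{aligned}
\tag{5.1}
\]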

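where the first $N$ entries of the rate vector $\mathbf{r}$ belong to the Slepian-Wolf rate region, which is expressed by means of (2.19) as [3]

\[
\mathcal{R}_{SW} = \left\{ [r_1, r_2, \ldots, r_N]^T : \forall \mathcal{K} \subseteq \mathcal{S}, \; \sum_{i \in \mathcal{K}} r_i \ge H(\mathbf{Y}_{\mathcal{K}} \mid \mathbf{Y}_{\mathcal{K}^c}) \right\},
\tag{5.2}
\]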

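where $\mathcal{K}$ is a subset of the source nodes in $\mathcal{S}$ and $\mathcal{K}^c$ denotes the complementary set of $\mathcal{K}$, such that $\mathcal{K}^c \cup \mathcal{K} = \mathcal{S}$ and $\mathcal{K} \cap \mathcal{K}^c = \emptyset$.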
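As a minimal sketch of how membership in the rate region (5.2) could be checked, the following Python snippet enumerates every non-empty subset of the source set and compares the sum rate with the conditional entropy. It assumes, for illustration only, that the joint entropies are supplied as a lookup table and uses the chain rule $H(\mathbf{Y}_{\mathcal{K}} \mid \mathbf{Y}_{\mathcal{K}^c}) = H(\mathbf{Y}_{\mathcal{S}}) - H(\mathbf{Y}_{\mathcal{K}^c})$. The function name and the two-source entropy values are hypothetical and not taken from the thesis.

```python
from itertools import combinations

def in_slepian_wolf_region(rates, joint_entropy, tol=1e-12):
    """Check the subset inequalities of the Slepian-Wolf region (5.2).

    rates: dict mapping source index -> rate r_i (bits).
    joint_entropy: dict mapping frozenset of source indices -> H(Y_K);
                   must contain every non-empty subset (assumed input).
    """
    sources = frozenset(rates)
    h_all = joint_entropy[sources]
    for size in range(1, len(sources) + 1):
        for subset in combinations(sorted(sources), size):
            k = frozenset(subset)
            k_c = sources - k
            # Chain rule: H(Y_K | Y_Kc) = H(Y_S) - H(Y_Kc), with H(empty) = 0.
            h_cond = h_all - (joint_entropy[k_c] if k_c else 0.0)
            if sum(rates[i] for i in k) < h_cond - tol:
                return False
    return True

# Hypothetical two-source example: H(Y1) = H(Y2) = 1.0, H(Y1, Y2) = 1.5 bits.
H = {frozenset({1}): 1.0, frozenset({2}): 1.0, frozenset({1, 2}): 1.5}
print(in_slepian_wolf_region({1: 1.0, 2: 0.5}, H))  # corner point of the region
print(in_slepian_wolf_region({1: 0.6, 2: 0.6}, H))  # sum rate below H(Y1, Y2)
```

The number of subsets grows exponentially with the number of sources, so a check of this kind is only practical for small examples.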

The optimization problem stated in (5.1) is a convex problem, since it involves the minimization of a convex objective function over a set of convex constraints [38, p. 137]. More precisely, in order to have a convex optimization problem, three requirements have to be satisfied: the objective has to be convex, the inequality constraint functions have to be convex and the equality constraints have to be affine [38, p. 137]. Thus, the convexity of the problem in (5.1) is easily verified: the objective function is affine, hence also convex, the flow conservation law constraint is affine, the capacity and the total power constraints are both convex, and the constraints on the non-negativity of the optimization variables are convex. The convexity of the capacity constraint can be verified by noting that the logarithm on the right-hand side is concave on the set of positive numbers [38, p. 71] and that its argument is affine in $p_l$. In general, if a function is concave, its negative is convex [38, p. 67]. The constraint can be reorganized into the form $f_l + (-c_l) \le 0$, $l \in \mathcal{L}$, where $c_l$ denotes the link capacity on the right-hand side, so that the left-hand side is a sum of convex functions. Since a non-negative weighted sum of convex functions preserves convexity, the capacity constraint is convex [38, p. 79].
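The concavity of the capacity expression can also be checked directly. As a short worked step, write the right-hand side of the capacity constraint as $c_l(p_l) = \log_2(1 + \gamma_l p_l)$ with the shorthand $\gamma_l = (d_0/d_l)^2 \kappa_l^2 / \varsigma^2 > 0$ (introduced here for brevity; it coincides with the link condition factor defined in (5.25)). The first and second derivatives are then

```latex
\[
\frac{\mathrm{d} c_l}{\mathrm{d} p_l} = \frac{\gamma_l}{\ln 2\,(1 + \gamma_l p_l)},
\qquad
\frac{\mathrm{d}^2 c_l}{\mathrm{d} p_l^2} = -\frac{\gamma_l^2}{\ln 2\,(1 + \gamma_l p_l)^2} < 0
\quad \text{for } p_l \ge 0,
\]
```

so $c_l$ is strictly concave in $p_l$, and $f_l - c_l(p_l)$ is the sum of a linear function of $f_l$ and a convex function of $p_l$, hence jointly convex.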

In order to find the solution for the optimization problem in (5.1), the structure of the problem in its original form dictates that it inevitably requires centralized networking. The objective and the constraints require global knowledge of the network, such as all the flow and power variables of each link, the channel state information associated with each link and the source rates associated with each node. To find the optimal solution in a centralized fashion, a natural approach would be that the sink node is responsible for gathering the global knowledge about the network and then finding the solution. The centralized solution process would bring a substantial amount of signalling overhead between the source nodes and the sink node. This kind of information flooding and the strict reliance of the source nodes on the sink node are not practical for wireless sensor networks.

5.2. Distributed Approach for Joint Power and Routing Optimization

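The optimization problem in (5.1) includes a capacity constraint associated with each link $l \in \mathcal{L}$ that depends on both of the optimization variables $p_l$ and $f_l$. The capacity constraint, acting as a coupling constraint for the problem, can be relaxed by applying a partial dual decomposition with respect to the constraint [16]. By introducing Lagrange multipliers $\boldsymbol{\nu} = [\nu_1, \nu_2, \ldots, \nu_L]^T \in \mathbb{R}_+^L$ for the links, the following relaxed optimization problem is obtained:

\[
\begin{aligned}
\underset{p_l,\, f_l}{\text{minimize}} \quad & \sum_{l \in \mathcal{L}} \left[ p_l + \nu_l \left( f_l - \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right) \right) \right] \\
\text{subject to} \quad & \mathbf{A}\mathbf{f} = \mathbf{r} \\
& \mathbf{B}\mathbf{p} \preceq \mathbf{g} \\
& f_l \ge 0, \; p_l \ge 0, \quad \forall l \in \mathcal{L}
\end{aligned}
\tag{5.3}
\]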

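Due to the relaxation, the optimization problem in (5.3) is decomposed into two independently solvable subproblems: the routing problem in the network layer and the power allocation problem in the physical layer. Because this involves cross-layer optimization across the protocol stack, it is referred to as a vertical decomposition [9]. The objective function, i.e., the partial Lagrangian in (5.3), can be decomposed as

\[
L(\mathbf{f}, \mathbf{p}, \boldsymbol{\nu}) = L_f(\mathbf{f}, \boldsymbol{\nu}) + L_p(\mathbf{p}, \boldsymbol{\nu}),
\tag{5.4}
\]

where

\[
L_f(\mathbf{f}, \boldsymbol{\nu}) = \sum_{l \in \mathcal{L}} \nu_l f_l
\tag{5.5}
\]

is the Lagrangian associated with the routing subproblem and

\[
L_p(\mathbf{p}, \boldsymbol{\nu}) = \sum_{l \in \mathcal{L}} \left[ p_l - \nu_l \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right) \right]
\tag{5.6}
\]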

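is the Lagrangian associated with the power allocation subproblem. The associated dual function is of the form

\[
D(\boldsymbol{\nu}) = \inf_{\mathbf{f}} L_f(\mathbf{f}, \boldsymbol{\nu}) + \inf_{\mathbf{p}} L_p(\mathbf{p}, \boldsymbol{\nu})
\tag{5.7}
\]

with the constraint set in (5.3).

Let us denote the optimal flow variables by $\mathbf{f}^* = [f_1^*, f_2^*, \ldots, f_L^*]^T$ and the optimal power variables by $\mathbf{p}^* = [p_1^*, p_2^*, \ldots, p_L^*]^T$, attained by finding the infimum points for the dual function in (5.7). Finally, the master dual problem can be written as

\[
\begin{aligned}
\underset{\boldsymbol{\nu}}{\text{maximize}} \quad & D^*(\boldsymbol{\nu}) \\
\text{subject to} \quad & \boldsymbol{\nu} \succeq \mathbf{0},
\end{aligned}
\tag{5.8}
\]

where the objective function is

\[
D^*(\boldsymbol{\nu}) = L_f(\mathbf{f}^*, \boldsymbol{\nu}) + L_p(\mathbf{p}^*, \boldsymbol{\nu}).
\tag{5.9}
\]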

Solving the master dual problem in (5.8) yields a dual optimal value that equals the primal optimal value only if strong duality holds. Strong duality does not hold in general; it usually, but not always, holds when the primal problem is convex. For this reason, there exist constraint qualifications that establish conditions under which strong duality holds. Slater's condition is one such constraint qualification. It holds if a strictly feasible point exists, i.e., a feasible point at which the inequality constraints hold with strict inequalities. If Slater's condition holds and the primal problem is convex, strong duality holds for the problem. Since the primal problem in (5.1) is convex and Slater's condition is assumed to hold, strong duality holds. Thus, the duality gap is zero, that is, the best lower bound obtained by maximizing the dual function in (5.9) is tight. [38, p. 226]

Due to the convexity of the primal problem and the differentiability of the objective function of the dual problem, the solution for (5.8) can be found by using the gradient projection method [16]. Gradient updates are attractive for solving an optimization problem in a distributed fashion, since the method is simple, has low memory requirements and is convenient for parallel implementation [15]. The idea of the gradient projection method is to update a variable in the direction given by the positive or negative gradient (corresponding to a maximization and a minimization problem, respectively) and to project the resulting point onto the feasible set. The step size is the parameter that adjusts the convergence properties of the algorithm. With a sufficiently small step size, the gradient algorithm is proved to converge to the optimal solution [15].
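The update-then-project idea can be illustrated with a minimal sketch for a scalar variable restricted to the non-negative reals. The two toy objectives, the step size and the function name below are hypothetical choices made only for illustration; they are not part of the thesis.

```python
def projected_gradient(grad, x0, step, iterations, maximize=False):
    """Gradient projection onto the non-negative reals for a scalar variable."""
    x = x0
    sign = 1.0 if maximize else -1.0
    for _ in range(iterations):
        # Move along the positive (maximization) or negative (minimization)
        # gradient, then project back onto the feasible set x >= 0.
        x = max(0.0, x + sign * step * grad(x))
    return x

# Toy problem 1: minimize (x + 1)^2 over x >= 0. The unconstrained minimum
# x = -1 is infeasible, so the projection keeps the iterate at x = 0.
x_min = projected_gradient(lambda x: 2.0 * (x + 1.0), x0=5.0, step=0.1, iterations=200)

# Toy problem 2: maximize -(x - 2)^2 over x >= 0. The optimum x = 2 is
# feasible, so the projection is inactive near the solution.
x_max = projected_gradient(lambda x: -2.0 * (x - 2.0), x0=0.0, step=0.1,
                           iterations=200, maximize=True)
print(x_min, x_max)
```

The same pattern, with the projection $[\cdot]^+$, is what the multiplier and flow updates later in this chapter apply component-wise.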

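The derivative of the objective function in (5.8) with respect to $\nu_l$, $l \in \mathcal{L}$, is

\[
\frac{\partial D^*(\boldsymbol{\nu})}{\partial \nu_l} = f_l^* - \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l^*}{\varsigma^2}\right).
\tag{5.10}
\]

The master dual problem is in charge of updating the Lagrange variables $\nu_l$, $l \in \mathcal{L}$. The Lagrange variables are updated at each iteration instance $t$ with the gradient projection method as

\[
\nu_l(t+1) = \left[ \nu_l(t) + \beta_\nu(t)\, \frac{\partial D^*(\boldsymbol{\nu})}{\partial \nu_l(t)} \right]^+,
\tag{5.11}
\]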

where $\beta_\nu(t)$ is the step size and $[\upsilon]^+$ denotes the projection onto the set of non-negative numbers, such that $[\upsilon]^+ = \max\{0, \upsilon\}$. Since the dual problem involves the maximization of the objective, the gradient update is performed in the direction of the positive gradient.
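For a single link, the update (5.11) with the derivative (5.10) can be sketched as follows. The shorthand `gamma` for $(d_0/d_l)^2 \kappa_l^2 / \varsigma^2$ and all numerical values are assumptions made for this illustration.

```python
import math

def capacity(p, gamma):
    """Link capacity log2(1 + gamma * p), the right-hand side of the constraint in (5.1)."""
    return math.log2(1.0 + gamma * p)

def update_nu(nu, f_opt, p_opt, gamma, beta):
    """One projected gradient ascent step (5.11) for the multiplier of one link."""
    gradient = f_opt - capacity(p_opt, gamma)   # partial derivative (5.10)
    return max(0.0, nu + beta * gradient)       # projection [.]^+

# Hypothetical values. The flow (3 bits) exceeds the capacity log2(1 + 3) = 2,
# so the multiplier, which acts as a price on the link, increases.
print(update_nu(nu=1.0, f_opt=3.0, p_opt=1.0, gamma=3.0, beta=0.1))

# The flow (1 bit) is below the capacity, so the price decreases and is
# clipped at zero by the projection.
print(update_nu(nu=0.05, f_opt=1.0, p_opt=1.0, gamma=3.0, beta=0.1))
```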

In order to update the dual variables $\nu_l$, $l \in \mathcal{L}$, in (5.11), the optimal flows and powers have to be attained for a given $\nu_l$. The dual variables $\nu_l$ connect the subproblems to each other by acting as coordinators for the solution process. Due to the separability of the problem, the subproblems can be solved independently in the respective layers by communicating only the dual variables between the layers.

Due to the separability of the dual function in (5.7), the subproblem for the routing in the network layer involves solving the following convex problem over the flow variables:

\[
\begin{aligned}
\underset{f_l}{\text{minimize}} \quad & \sum_{l \in \mathcal{L}} \nu_l f_l \\
\text{subject to} \quad & \mathbf{A}\mathbf{f} = \mathbf{r} \\
& f_l \ge 0, \quad \forall l \in \mathcal{L}
\end{aligned}
\tag{5.12}
\]

The problem appears in a form that requires global knowledge about the flows and source rates in the network, as can be seen in the flow conservation constraint. A second-level dual decomposition is employed to obtain a locally solvable network flow subproblem for each node. By performing a second-level dual decomposition for distributing the solution process horizontally in the network layer, the relaxed problem of convex form appears as

\[
\underset{f_l \ge 0}{\text{minimize}} \quad \sum_{l \in \mathcal{L}} \nu_l f_l + \sum_{i \in \mathcal{A}} \lambda_i \left( \mathbf{a}_{ir}^T \mathbf{f} - r_i \right),
\tag{5.13}
\]

where $\lambda_i$, $i \in \mathcal{A}$, are the Lagrange multipliers associated with the flow conservation law constraint, hence corresponding to the vector $\boldsymbol{\lambda} = [\lambda_1, \lambda_2, \ldots, \lambda_N, \lambda_{N+1}]^T \in \mathbb{R}_+^{N+1}$, and $\mathbf{a}_{ir} = [a_{i1}, a_{i2}, \ldots, a_{iL}]^T$ is the vector formed by the $i$th row of the node-link incidence matrix $\mathbf{A}$ given in (4.1).

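The dual function for minimizing the partial Lagrangian $L_f(\mathbf{f}, \boldsymbol{\nu}, \boldsymbol{\lambda})$ in (5.13) is of the following form:

\[
\begin{aligned}
D_f(\boldsymbol{\nu}, \boldsymbol{\lambda}) &= \inf_{f_l} L_f(\mathbf{f}, \boldsymbol{\nu}, \boldsymbol{\lambda}) \\
&= \inf_{f_l} \sum_{l \in \mathcal{L}} \nu_l f_l + \sum_{i \in \mathcal{A}} \lambda_i \left( \mathbf{a}_{ir}^T \mathbf{f} - r_i \right)
\end{aligned}
\tag{5.14}
\]

After finding the optimal flow variables $\mathbf{f}^*$ for (5.14), the associated dual problem can be expressed as

\[
\begin{aligned}
\underset{\boldsymbol{\lambda}}{\text{maximize}} \quad & D_f^*(\boldsymbol{\nu}, \boldsymbol{\lambda}) \\
\text{subject to} \quad & \boldsymbol{\lambda} \succeq \mathbf{0}.
\end{aligned}
\tag{5.15}
\]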

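The dual problem in (5.15) can be solved with the primal-dual algorithm, which simultaneously updates the primal and dual variables towards the optimum points [7, 9]. For a given $\boldsymbol{\nu}$, the partial derivatives of the objective function in (5.15) with respect to $f_l$, $l \in \mathcal{L}$, and $\lambda_i$, $i \in \mathcal{A}$, are given as

\[
\frac{\partial L_f(\mathbf{f}, \boldsymbol{\nu}, \boldsymbol{\lambda})}{\partial f_l} = \nu_l + \mathbf{a}_{lc}^T \boldsymbol{\lambda},
\tag{5.16}
\]

where $\mathbf{a}_{lc} = [a_{1l}, a_{2l}, \ldots, a_{(N+1)l}]^T$ is the vector formed by the $l$th column of the node-link incidence matrix $\mathbf{A}$ given in (4.1), and

\[
\frac{\partial L_f(\mathbf{f}, \boldsymbol{\nu}, \boldsymbol{\lambda})}{\partial \lambda_i} = \mathbf{a}_{ir}^T \mathbf{f} - r_i.
\tag{5.17}
\]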
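The two partial derivatives (5.16) and (5.17) only involve the entries of one column or one row of the incidence matrix, which is what makes them locally computable. The sketch below evaluates them for a hypothetical three-node network. The sign convention (entry +1 if the link leaves the node, -1 if it enters) is an assumption made here, since the definition (4.1) is not reproduced in this chapter, and all numerical values are illustrative.

```python
def routing_gradients(A, f, r, nu, lam):
    """Partial derivatives (5.16) and (5.17) of the routing Lagrangian.

    A: node-link incidence matrix, given as a list of rows (one row per node).
    """
    n_nodes, n_links = len(A), len(A[0])
    # (5.16): nu_l + a_lc^T lambda, one value per link (column l of A).
    grad_f = [nu[l] + sum(A[i][l] * lam[i] for i in range(n_nodes))
              for l in range(n_links)]
    # (5.17): a_ir^T f - r_i, one value per node (row i of A).
    grad_lam = [sum(A[i][l] * f[l] for l in range(n_links)) - r[i]
                for i in range(n_nodes)]
    return grad_f, grad_lam

# Hypothetical network: nodes 1 and 2 are sources, node 3 is the sink.
# Links: l1 = (1 -> 2), l2 = (1 -> 3), l3 = (2 -> 3).
A = [[1, 1, 0],
     [-1, 0, 1],
     [0, -1, -1]]
r = [1.0, 2.0, -3.0]      # the sink entry is minus the sum rate
f = [0.5, 0.5, 2.5]       # a flow vector that satisfies A f = r
nu = [0.1, 0.2, 0.3]
lam = [0.0, 1.0, 2.0]

grad_f, grad_lam = routing_gradients(A, f, r, nu, lam)
print(grad_f)    # per-link derivative: link price plus end-node multipliers
print(grad_lam)  # per-node flow conservation residuals, zero for a feasible f
```

Each entry of `grad_f` depends only on the multipliers of the two end nodes of the link, and each entry of `grad_lam` only on the flows of the links attached to the node.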

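According to the gradient projection method, the flow variables are updated at each iteration instance $t$ as

\[
f_l(t+1) = \left[ f_l(t) - \alpha_f(t) \left( \frac{\partial L_f(\mathbf{f}, \boldsymbol{\nu}, \boldsymbol{\lambda})}{\partial f_l(t)} \right) \right]^+
\tag{5.18}
\]

and the Lagrange variables as

\[
\lambda_i(t+1) = \left[ \lambda_i(t) + \alpha_\lambda(t) \left( \frac{\partial L_f(\mathbf{f}, \boldsymbol{\nu}, \boldsymbol{\lambda})}{\partial \lambda_i(t)} \right) \right]^+,
\tag{5.19}
\]

where $\alpha_f(t)$ and $\alpha_\lambda(t)$ are the step sizes.

Due to the fact that the dual function in (5.7) is separable with respect to the power and flow variables, the power allocation subproblem in the physical layer can be written as a convex problem as

\[
\begin{aligned}
\underset{p_l}{\text{minimize}} \quad & \sum_{l \in \mathcal{L}} \left[ p_l - \nu_l \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right) \right] \\
\text{subject to} \quad & \mathbf{B}\mathbf{p} \preceq \mathbf{g} \\
& p_l \ge 0, \quad \forall l \in \mathcal{L}.
\end{aligned}
\tag{5.20}
\]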

Similar to the routing subproblem in (5.12), solving the power allocation problem in the network requires global information about the power variables. This can be seen in the objective function, which sums over all the links, and in the total power constraint of each node. A second-level dual decomposition is performed that results in a power allocation subproblem associated with each node and its outgoing links. By applying a second-level dual decomposition for distributing the solution process horizontally in the physical layer, the relaxed power allocation subproblem can be written as

\[
\underset{p_l \ge 0}{\text{minimize}} \quad \sum_{l \in \mathcal{L}} \left[ p_l - \nu_l \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right) \right] + \sum_{i \in \mathcal{A}} \omega_i \left( \mathbf{b}_{ir}^T \mathbf{p} - g_i \right),
\tag{5.21}
\]

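where $\omega_i$, $i \in \mathcal{A}$, are the Lagrange multipliers associated with the total power constraint of each node, hence corresponding to the vector $\boldsymbol{\omega} = [\omega_1, \omega_2, \ldots, \omega_N, \omega_{N+1}]^T \in \mathbb{R}_+^{N+1}$, and $\mathbf{b}_{ir} = [b_{i1}, b_{i2}, \ldots, b_{iL}]^T$ is the vector formed by the $i$th row of matrix $\mathbf{B}$.

The associated dual function minimizes the partial Lagrangian $L_p(\mathbf{p}, \boldsymbol{\nu}, \boldsymbol{\omega})$ in (5.21) and is thus of the following form:

\[
\begin{aligned}
D_p(\boldsymbol{\nu}, \boldsymbol{\omega}) &= \inf_{p_l} L_p(\mathbf{p}, \boldsymbol{\nu}, \boldsymbol{\omega}) \\
&= \inf_{p_l} \sum_{l \in \mathcal{L}} \left[ p_l - \nu_l \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right) \right] + \sum_{i \in \mathcal{A}} \omega_i \left( \mathbf{b}_{ir}^T \mathbf{p} - g_i \right)
\end{aligned}
\tag{5.22}
\]

The corresponding dual problem after attaining the optimal power variables $\mathbf{p}^*$ for (5.22) is written as

\[
\begin{aligned}
\underset{\boldsymbol{\omega}}{\text{maximize}} \quad & D_p^*(\boldsymbol{\nu}, \boldsymbol{\omega}) \\
\text{subject to} \quad & \boldsymbol{\omega} \succeq \mathbf{0}.
\end{aligned}
\tag{5.23}
\]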

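Due to the strict convexity and differentiability of the objective function in (5.21), the optimal powers $\mathbf{p}^*$ for a given $\boldsymbol{\nu}$ can be uniquely found by means of the derivative of the partial Lagrangian $L_p(\mathbf{p}, \boldsymbol{\nu}, \boldsymbol{\omega})$ with respect to $p_l$, $l \in \mathcal{L}$. In order to find the minimum, the derivative is set equal to zero, which gives the solution for each link $l \in \mathcal{L}$ as

\[
p_l^* = \max\left\{ 0, \; \frac{\nu_l}{\ln 2 \left(1 + \mathbf{b}_{lc}^T \boldsymbol{\omega}\right)} - \frac{1}{\gamma_l} \right\},
\tag{5.24}
\]

where $\mathbf{b}_{lc} = [b_{1l}, b_{2l}, \ldots, b_{(N+1)l}]^T$ is the vector formed by the $l$th column of matrix $\mathbf{B}$ and $\gamma_l$ stands for the link condition factor, which is given by

\[
\gamma_l = \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2}{\varsigma^2}.
\tag{5.25}
\]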
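The closed form (5.24) has a water-filling structure: the power is positive only when the link price $\nu_l$ is large enough relative to the inverse link quality $1/\gamma_l$. A minimal sketch follows, in which `b_omega` stands for the scalar $\mathbf{b}_{lc}^T \boldsymbol{\omega}$ and all numerical values are hypothetical. The helper `lagrangian_slope` is added only to check numerically that the derivative of the relaxed objective (5.21) vanishes at the returned power.

```python
import math

def link_condition(d0, d, kappa, noise_var):
    """Link condition factor (5.25): (d0/d)^2 * kappa^2 / noise variance."""
    return (d0 / d) ** 2 * kappa ** 2 / noise_var

def optimal_power(nu, gamma, b_omega):
    """Closed-form power (5.24); b_omega stands for b_lc^T omega."""
    return max(0.0, nu / (math.log(2.0) * (1.0 + b_omega)) - 1.0 / gamma)

def lagrangian_slope(p, nu, gamma, b_omega):
    """Derivative of the relaxed objective (5.21) with respect to p_l."""
    return 1.0 + b_omega - nu * gamma / (math.log(2.0) * (1.0 + gamma * p))

gamma = link_condition(d0=1.0, d=2.0, kappa=4.0, noise_var=1.0)
p_star = optimal_power(nu=2.0, gamma=gamma, b_omega=0.5)
print(gamma, p_star, lagrangian_slope(p_star, 2.0, gamma, 0.5))

# With a low link price the unconstrained stationary point is negative,
# so the projection returns zero power.
print(optimal_power(nu=0.1, gamma=gamma, b_omega=0.0))
```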

Similar to the routing subproblem in (5.12), the gradient projection method can be employed to solve the dual problem in (5.23) with the optimal powers attained with (5.24). The partial derivative with respect to each Lagrange multiplier $\omega_i$, $i \in \mathcal{A}$, is expressed as

\[
\frac{\partial L_p(\mathbf{p}^*, \boldsymbol{\nu}, \boldsymbol{\omega})}{\partial \omega_i} = \mathbf{b}_{ir}^T \mathbf{p}^* - g_i.
\tag{5.26}
\]

The Lagrange multipliers $\omega_i$, $i \in \mathcal{A}$, are updated at each iteration instance $t$ according to the gradient projection method as

\[
\omega_i(t+1) = \left[ \omega_i(t) + \alpha_\omega(t) \left( \frac{\partial L_p(\mathbf{p}^*, \boldsymbol{\nu}, \boldsymbol{\omega})}{\partial \omega_i(t)} \right) \right]^+,
\tag{5.27}
\]

where $\alpha_\omega(t)$ is the step size.

At the last iteration of the algorithm regarding the power allocation subproblem, the powers are recovered with respect to the capacity region to attain a feasible solution. The power recovery is done locally for each link $l \in \mathcal{L}$ such that the powers are ensured to supply the capacity for the flows achieved in the routing subproblem. The recovery of the powers is defined by using the capacity expression in (4.4) as

\[
p_l = \frac{2^{f_l} - 1}{\gamma_l}.
\tag{5.28}
\]
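The recovery step simply inverts the capacity expression, so the recovered power is the smallest one whose capacity supports the flow. A short sketch with hypothetical values:

```python
import math

def recover_power(f, gamma):
    """Power recovery (5.28): invert f = log2(1 + gamma * p) for p."""
    return (2.0 ** f - 1.0) / gamma

p = recover_power(f=3.0, gamma=7.0)        # (2^3 - 1) / 7
print(p, math.log2(1.0 + 7.0 * p))         # power and the capacity it provides
```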

The distributed algorithm for joint routing and power optimization is summarized in Algorithm 3. In the initialization part of the algorithm, the sink node gathers the information about the sum rate in the network. It is worth noting that the remainder of the algorithm requires only local information exchange of the optimization and Lagrange variables within the closest neighborhood of each node $i \in \mathcal{A}$. The algorithm also requires that the channel state information (CSI) of the outgoing links of each node $i \in \mathcal{S}$ is available at each iteration instance $t$. This requires a feedback channel between the transmitting and the receiving node.

An illustration of the information exchange from the perspective of a node (red) is presented in Fig. 8. The optimization and Lagrange variables that have to be updated at each iteration are depicted in blue. Since all the calculation has to be performed at the nodes, it is assumed that each node is responsible for the variables associated with its outgoing links in addition to its own variables. The purple-colored variables are the ones that the node actually has to acquire through the information exchange with the neighboring nodes. The variables in black stand for the constant parameters that are assigned in the initialization phase. The complexity of the algorithm increases proportionally to the density of the network, that is, to the number of neighboring nodes and the corresponding link connectivity associated with each node in the network.

Algorithm 3 Total transmit power minimization - joint power and routing optimization

1. Initialization
   a) For each node $i \in \mathcal{A}$, choose initial $\lambda_i \ge 0$ and $\omega_i \ge 0$, and set total power constraint $g_i$ for $i \in \mathcal{S}$
   b) For each link $l \in \mathcal{L}$, choose initial $f_l \ge 0$ and $\nu_l \ge 0$
   c) For the sink node, collect $r_{N+1} = -\sum_{i \in \mathcal{S}} r_i$

2. Distributed algorithm - At each iteration instance $t$:
   I. The routing subproblem
      a) For each node $i \in \mathcal{A}$, collect flow variables $f_l$ from the links connected to node $i$
      b) For each link $(i_1, i_2)$, $i_1, i_2 \in \mathcal{A}$, collect $\lambda_{i_1}$ and $\lambda_{i_2}$ from the end nodes of the link
      c) Update $f_l$ according to (5.18)
      d) For each node $i \in \mathcal{A}$, update $\lambda_i$ according to (5.19) by using the updated flow variables $f_l$
   II. The power allocation subproblem
      a) For each node $i \in \mathcal{S}$, collect power variables $p_l$ of the outgoing links of node $i$
      b) For each link $l = (i_1, i_2)$, $i_1, i_2 \in \mathcal{A}$, collect $\omega_{i_1}$ from the start node of the link and collect link condition factor $\gamma_l$
      c) For each link $l \in \mathcal{L}$, set the optimal powers $p_l^*$ according to (5.24)
      d) For each node $i \in \mathcal{A}$, update $\omega_i$ according to (5.27)
   III. The master dual problem
      a) For each link $l \in \mathcal{L}$, update $\nu_l$ according to (5.11)

3. The recovery of the powers
   a) At the last iteration instance $t$, recover the powers according to (5.28)
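The order of the steps in Algorithm 3 can be traced with a minimal sketch for the smallest possible network: one source node, one sink node and a single link between them. This is an illustration under stated assumptions, not the implementation used in the thesis. The incidence convention (+1 at the start node, -1 at the end node of the link), the constant step sizes, the initial values and the function name are all hypothetical, and convergence for a particular choice of step sizes is not verified here.

```python
import math

def algorithm3_single_link(r1, gamma, g, steps=5000, alpha=0.01, beta=0.01):
    """Sketch of Algorithm 3 for one source, one sink and one link."""
    # 1. Initialization (all multipliers non-negative).
    f, nu = 0.0, 1.0
    lam_src, lam_sink, omega = 0.0, 0.0, 0.0
    r_sink = -r1                      # step 1c: minus the sum rate
    for _ in range(steps):
        # 2.I Routing subproblem. The link leaves the source (+1) and
        # enters the sink (-1), so a_lc^T lambda = lam_src - lam_sink.
        f = max(0.0, f - alpha * (nu + lam_src - lam_sink))          # (5.18)
        lam_src = max(0.0, lam_src + alpha * (f - r1))               # (5.19)
        lam_sink = max(0.0, lam_sink + alpha * (-f - r_sink))        # (5.19)
        # 2.II Power allocation subproblem (only the source transmits).
        p = max(0.0, nu / (math.log(2.0) * (1.0 + omega)) - 1.0 / gamma)  # (5.24)
        omega = max(0.0, omega + alpha * (p - g))                    # (5.27)
        # 2.III Master dual problem.
        nu = max(0.0, nu + beta * (f - math.log2(1.0 + gamma * p)))  # (5.11)
    # 3. Recovery of the powers so that the capacity supports the final flow.
    p_rec = (2.0 ** f - 1.0) / gamma                                 # (5.28)
    return {"f": f, "p": p_rec, "nu": nu,
            "lam": (lam_src, lam_sink), "omega": omega}

result = algorithm3_single_link(r1=1.0, gamma=1.0, g=10.0)
print(result)
```

Whatever the state of the iteration when it stops, the recovery step guarantees that the returned power supports the returned flow; how close that flow is to the required rate depends on the step sizes and the number of iterations.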

[Figure 8 omitted: network diagram of nodes X1 to X6 annotated with the variables f21, f31, f14, f15, f16, p14, p15, p16, λ1, ω1, r1, g1, λ4, λ5, λ6, ν14, ν15, ν16, γ14, γ15 and γ16.]

Figure 8. Information exchange of the variables within the neighborhood of a node.

6. MINIMIZATION OF MAXIMUM TRANSMIT POWER

This chapter introduces an alternative framework for transmission power minimization in single-sink data gathering wireless sensor networks with the Slepian-Wolf coding of correlated data. As in Chapter 5, the objective is to gather all the correlated and individual data in the wireless sensor network and deliver it to the sink node for lossless recovery of the messages. Nevertheless, in contrast to the total transmit power minimization discussed in Chapter 5, the objective of the optimization framework is to minimize the maximum total transmit power of a single node.

The minimization of the maximum (min-max) transmit power is a relevant alternative objective to the total transmit power minimization. It introduces traffic load balancing into the network by avoiding overloading the sensor nodes that have good connectivity to the sink node. Hence, the data flow can be balanced in the network such that the total transmit power of each node is kept at an acceptable level. Since the transmit power is directly proportional to the energy consumption of the node, the use of the objective function is motivated by the fact that it essentially introduces network lifetime maximization into the network. Here, the network lifetime is defined as the time from the initial deployment of the network to the time when the first sensor node runs out of battery [9]. The data gathering scenario assumes that all the individual data associated with each sensor node is of equal importance and has to be transported to the sink node for further data processing.

The problem involves the joint optimization of the power allocation and the routing, similar to the total transmit power problem. Due to the requirement of distributed networking in wireless sensor networks, the problem is solved distributively by applying the Lagrange dual decomposition technique. However, due to the maximum transmit power term in the objective, some centralized networking is shown to be required in order to solve the problem with the distributed algorithm.

In Section 6.1, the min-max transmit power problem is formulated as a convex optimization problem over the power and flow variables. Similar to the problem in (5.1), the optimization problem appears in a form that requires global knowledge about the network to find the optimal solution. The problem includes a trade-off parameter that adjusts how much emphasis is given to the total transmit power and how much to the maximum transmit power in the network. Section 6.2 proposes a distributed algorithm for solving the min-max transmit power problem that also involves some centralized signalling in the network.

6.1. Centralized Approach for Joint Power and Routing Optimization

The objective of the min-max approach is to minimize the maximum total transmit power of a single sensor node in the single-sink data gathering WSN given by graph $G_{\mathrm{wsn}}$. In the scenario, the same assumptions as in Chapter 5 are assumed to hold for the system model and for the concept of correlated data gathering. Thus, the optimization problem consists of the same constraint set as the problem in (5.1). After the Slepian-Wolf source coding rates are assigned to the sensor nodes, the min-max transmit power problem can be expressed as a joint optimization over the power and flow variables as

\[
\begin{aligned}
\underset{p_l,\, f_l}{\text{minimize}} \quad & (1 - \phi)\, \|\mathbf{B}\mathbf{p}\|_\infty + \phi \sum_{l \in \mathcal{L}} p_l \\
\text{subject to} \quad & \mathbf{A}\mathbf{f} = \mathbf{r} \\
& f_l \le \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right), \quad \forall l \in \mathcal{L} \\
& \mathbf{B}\mathbf{p} \preceq \mathbf{g} \\
& f_l \ge 0, \; p_l \ge 0, \quad \forall l \in \mathcal{L},
\end{aligned}
\tag{6.1}
\]

where the first $N$ entries of the rate vector $\mathbf{r}$ belong to the Slepian-Wolf rate region given in (5.2), $\|\cdot\|_\infty$ denotes the $L_\infty$-norm and $\phi \in [0, 1]$ is the trade-off parameter that adjusts how much emphasis in the minimization is given to the total transmit power and how much to the maximum transmit power in the network. In general, the $L_\infty$-norm of a vector is the largest absolute value of its entries; since the powers are non-negative, it corresponds here to the maximum row sum $\|\mathbf{B}\mathbf{p}\|_\infty = \max\{\mathbf{b}_{1r}^T \mathbf{p}, \mathbf{b}_{2r}^T \mathbf{p}, \ldots, \mathbf{b}_{(N+1)r}^T \mathbf{p}\}$ in the objective [38, p. 636].

The objective function in (6.1) has a twofold purpose: besides minimizing the maximum transmit power, it also tends to minimize the total transmit power in the network. The min-max transmit power part is carried out with the $L_\infty$-norm with respect to the total power that each node allocates to its outgoing links. The penalization of the total transmit power is taken into account by adding the summation over the power variables to the objective. Precisely, the trade-off parameter $\phi$ is used to regulate the ratio of the two distinct terms in the objective.

The penalty term is added to the objective function because a basic property of the $L_\infty$-norm minimization is that even if the global minimum value is attained, the point where the minimum value is reached is not necessarily unique. In other words, at the point where the minimum value is attained, a set of inequality constraints can be inactive. Thus, the optimization variables that are associated with the inactive constraints can vary locally without changing the objective value. As a result, the problem in (6.1) without the penalty term would lead to unnecessary usage of transmit power at all the sensor nodes other than the node that consumes the maximum transmit power. In other words, the capacity constraint would not be active for each link. The extra energy exhaustion of the nodes would not have any influence on the optimal objective value, but by penalizing the total transmit power, significant improvements in terms of the total power can be achieved. Naturally, this is done at the cost of an increased maximum transmit power. [39]
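The non-uniqueness argument can be made concrete with a small numerical sketch of the objective in (6.1). The matrix `B` below is a hypothetical 0/1 matrix that sums the powers of the outgoing links of each node (the exact definition of B is given elsewhere in the thesis), and both power vectors are made-up values that are simply assumed to be feasible.

```python
def node_powers(B, p):
    """Total transmit power of each node: the entries of B p."""
    return [sum(b * x for b, x in zip(row, p)) for row in B]

def minmax_objective(B, p, phi):
    """Objective of (6.1): (1 - phi) * ||B p||_inf + phi * sum of link powers."""
    return (1.0 - phi) * max(node_powers(B, p)) + phi * sum(p)

# Hypothetical example: node 1 owns links 1 and 2, node 2 owns link 3.
B = [[1, 1, 0],
     [0, 0, 1]]
p_tight = [1.0, 2.0, 1.0]      # node powers [3, 1]
p_wasteful = [1.0, 2.0, 3.0]   # node powers [3, 3], same maximum

for phi in (0.0, 0.5):
    print(phi, minmax_objective(B, p_tight, phi), minmax_objective(B, p_wasteful, phi))
```

With `phi = 0.0` the two allocations give the same objective value although the second one spends more power at node 2; with `phi = 0.5` the penalty term separates them in favor of the first allocation.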

The optimization problem stated in (6.1) is convex, because it involves the minimization of a convex objective function over a set of convex constraints [38, p. 137]. The objective is the sum of the $L_\infty$-norm term, which is convex on $\mathbb{R}^n$, and an affine term. Since the sum of two convex functions preserves convexity, the objective is convex [38, p. 72, 79]. The constraint set is the same convex set as in the total transmit power problem in (5.1).

Solving the problem introduces similar issues as the total transmit power problem in (5.1): due to the same constraint set, centralized processing is needed to gather global information about the system parameters and the variables in the network. Additionally, the $L_\infty$-norm term in the objective inherently requires that all the transmit powers of the nodes are globally available for solving the problem. The acquisition of the global knowledge causes extra signalling overhead in the network.

6.2. Distributed Approach for Joint Power and Routing Optimization

A major issue to be considered is the $L_\infty$-norm term in the objective in (6.1). For the $L_\infty$-norm term, a standard trick can be applied for relaxing it in the objective function. The $L_\infty$-norm part of the objective can be expressed in an epigraph form by introducing a new variable into the problem and by moving the original objective term into the constraint set [38, p. 293]. After that, a linear, hence convex, term appears in the objective function, whereas a convex constraint is added to the constraint set. The convexity of the problem is preserved by these modifications [38, p. 143].

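By introducing a new variable $\tau \in \mathbb{R}$, called the epigraph variable henceforward, into the problem and casting the first part of the objective in an epigraph form, the optimization problem can be reformulated as

\[
\begin{aligned}
\underset{p_l,\, f_l,\, \tau}{\text{minimize}} \quad & \tau + \phi \sum_{l \in \mathcal{L}} p_l \\
\text{subject to} \quad & (1 - \phi)\, \|\mathbf{B}\mathbf{p}\| \preceq \tau \mathbf{1} \\
& \mathbf{A}\mathbf{f} = \mathbf{r} \\
& f_l \le \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right), \quad \forall l \in \mathcal{L} \\
& \mathbf{B}\mathbf{p} \preceq \mathbf{g} \\
& f_l \ge 0, \; p_l \ge 0, \quad \forall l \in \mathcal{L},
\end{aligned}
\tag{6.2}
\]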

where $\mathbf{1} \in \mathbb{Z}_+^{N+1}$ is a column vector consisting of $N + 1$ elements that are all ones and $\|\cdot\|$ denotes the component-wise absolute value. The first constraint states that the absolute value of each row sum of $\mathbf{B}\mathbf{p}$, scaled with $(1 - \phi)$, has to be smaller than or equal to $\tau$. The absolute value operation can be removed, since the powers are explicitly restricted to be non-negative. This implies that $\tau \ge 0$. The problem in (6.2) is equivalent to the original problem in (6.1), meaning that the solution $(\mathbf{f}^*, \mathbf{p}^*, \tau^*)$ for the epigraph form of the problem is optimal if and only if $(\mathbf{f}^*, \mathbf{p}^*)$ is optimal for the original problem [38, p. 134].
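The equivalence can be made explicit with a short worked step. Assuming that the entries of $\mathbf{B}$ are non-negative, so that $\mathbf{B}\mathbf{p} \succeq \mathbf{0}$ for every feasible $\mathbf{p}$, the smallest $\tau$ that satisfies the first constraint of (6.2) for a fixed $\mathbf{p}$ is

```latex
\[
\tau^\star(\mathbf{p})
= (1 - \phi) \max_{i \in \mathcal{A}} \mathbf{b}_{ir}^T \mathbf{p}
= (1 - \phi)\, \|\mathbf{B}\mathbf{p}\|_\infty .
\]
```

Because $\tau$ is minimized in the objective and appears in no other constraint, it takes this value at the optimum, and substituting it into $\tau + \phi \sum_{l \in \mathcal{L}} p_l$ recovers the objective of (6.1).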

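In order to solve the problem in (6.2) distributively, the capacity constraint has to be relaxed in a similar way as in the total transmit power minimization problem in (5.3). Moreover, the first constraint introduces a dependence between the power variables and the epigraph variable. Lagrange multipliers $\boldsymbol{\nu} = [\nu_1, \nu_2, \ldots, \nu_L]^T \in \mathbb{R}_+^L$ are introduced for the capacity constraint associated with each link $l \in \mathcal{L}$ and Lagrange multipliers $\boldsymbol{\zeta} = [\zeta_1, \zeta_2, \ldots, \zeta_{N+1}]^T \in \mathbb{R}_+^{N+1}$ for each node $i \in \mathcal{A}$ associated with the $L_\infty$-norm relaxation constraint. By applying partial dual decomposition, the relaxed problem can be written as

\[
\begin{aligned}
\underset{p_l,\, f_l,\, \tau}{\text{minimize}} \quad & \tau + \sum_{l \in \mathcal{L}} \left[ \phi p_l + \nu_l \left( f_l - \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right) \right) \right] + \sum_{i \in \mathcal{A}} \left[ \zeta_i \left( (1 - \phi)\, \mathbf{b}_{ir}^T \mathbf{p} - \tau \right) \right] \\
\text{subject to} \quad & \mathbf{A}\mathbf{f} = \mathbf{r} \\
& \mathbf{B}\mathbf{p} \preceq \mathbf{g} \\
& f_l \ge 0, \; p_l \ge 0, \quad \forall l \in \mathcal{L} \\
& \tau \ge 0.
\end{aligned}
\tag{6.3}
\]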

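By applying the partial dual decomposition, the problem is decomposed into three subproblems, each of which optimizes over one variable at a time. Similar to (5.4), the partial Lagrangian can be rewritten such that it involves the power allocation optimization in the physical layer and the routing optimization in the network layer. Additionally, the third subproblem considers the optimization over the epigraph variable $\tau$. The objective function, i.e., the partial Lagrangian in (6.3), can be decomposed as

\[
L(\mathbf{f}, \mathbf{p}, \tau, \boldsymbol{\nu}, \boldsymbol{\zeta}) = L_f(\mathbf{f}, \boldsymbol{\nu}) + L_p(\mathbf{p}, \boldsymbol{\nu}, \boldsymbol{\zeta}) + L_\tau(\tau, \boldsymbol{\zeta}),
\tag{6.4}
\]

where

\[
L_f(\mathbf{f}, \boldsymbol{\nu}) = \sum_{l \in \mathcal{L}} \nu_l f_l
\tag{6.5}
\]

is the Lagrangian associated with the routing subproblem similar to (5.5) and

\[
L_p(\mathbf{p}, \boldsymbol{\nu}, \boldsymbol{\zeta}) = \sum_{l \in \mathcal{L}} \left[ \phi p_l - \nu_l \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right) \right] + \sum_{i \in \mathcal{A}} \left[ \zeta_i \left( (1 - \phi)\, \mathbf{b}_{ir}^T \mathbf{p} \right) \right]
\tag{6.6}
\]

is the Lagrangian associated with the power allocation subproblem and

\[
L_\tau(\tau, \boldsymbol{\zeta}) = \tau \left( 1 - \sum_{i \in \mathcal{A}} \zeta_i \right)
\tag{6.7}
\]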

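is the Lagrangian associated with the $L_\infty$-norm relaxation. The associated dual function is of the form

\[
D(\boldsymbol{\nu}, \boldsymbol{\zeta}) = \inf_{\mathbf{f}} L_f(\mathbf{f}, \boldsymbol{\nu}) + \inf_{\mathbf{p}} L_p(\mathbf{p}, \boldsymbol{\nu}, \boldsymbol{\zeta}) + \inf_{\tau} L_\tau(\tau, \boldsymbol{\zeta})
\tag{6.8}
\]

with the constraint set in (6.3).

Let us denote the optimal flow variables by $\mathbf{f}^* = [f_1^*, f_2^*, \ldots, f_L^*]^T$, the optimal power variables by $\mathbf{p}^* = [p_1^*, p_2^*, \ldots, p_L^*]^T$ and the optimal epigraph variable by $\tau^*$, attained by finding the infimum points for the dual function in (6.8). The master dual problem can be written as a maximization problem over the dual function, that is

\[
\begin{aligned}
\underset{\boldsymbol{\nu},\, \boldsymbol{\zeta}}{\text{maximize}} \quad & D^*(\boldsymbol{\nu}, \boldsymbol{\zeta}) \\
\text{subject to} \quad & \boldsymbol{\nu} \succeq \mathbf{0}, \; \boldsymbol{\zeta} \succeq \mathbf{0},
\end{aligned}
\tag{6.9}
\]

where the objective function is

\[
D^*(\boldsymbol{\nu}, \boldsymbol{\zeta}) = L_f(\mathbf{f}^*, \boldsymbol{\nu}) + L_p(\mathbf{p}^*, \boldsymbol{\nu}, \boldsymbol{\zeta}) + L_\tau(\tau^*, \boldsymbol{\zeta}).
\tag{6.10}
\]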

The gradient projection method can be used for finding the solution for (6.9) due to the convexity of the primal problem in (6.2) and the differentiability of the objective function of the dual problem [16]. The master dual problem is in charge of updating the Lagrange variables $\nu_l$, $l \in \mathcal{L}$, and $\zeta_i$, $i \in \mathcal{A}$. The partial derivative of the objective function in (6.9) with respect to $\nu_l$, $l \in \mathcal{L}$, is given in (5.10) and the respective gradient projection update in (5.11). The partial derivative of the objective function with respect to $\zeta_i$, $i \in \mathcal{A}$, is given as

\[
\frac{\partial D^*(\boldsymbol{\nu}, \boldsymbol{\zeta})}{\partial \zeta_i} = (1 - \phi)\, \mathbf{b}_{ir}^T \mathbf{p}^* - \tau^*.
\tag{6.11}
\]

The Lagrange variables $\zeta_i$, $i \in \mathcal{A}$, are updated at each iteration instance $t$ with the gradient projection method as

\[
\zeta_i(t+1) = \left[ \zeta_i(t) + \beta_\zeta(t)\, \frac{\partial D^*(\boldsymbol{\nu}, \boldsymbol{\zeta})}{\partial \zeta_i(t)} \right]^+,
\tag{6.12}
\]

where $\beta_\zeta(t)$ is the step size.

In order to update the dual variables $\nu_l$, $l \in \mathcal{L}$, and $\zeta_i$, $i \in \mathcal{A}$, in (5.11) and (6.12), respectively, the optimal flows, powers and epigraph variable have to be attained for a given $\boldsymbol{\nu}$ and $\boldsymbol{\zeta}$. The dual variables $\nu_l$, $l \in \mathcal{L}$, and $\zeta_i$, $i \in \mathcal{A}$, connect the subproblems to each other by acting as coordinators for the solution process at the higher level. Due to the separability of the problem, the power allocation and the routing subproblems can be solved independently in the respective layers by communicating only the dual variables between the layers. However, the update in (6.12) involves centralized information exchange in the network, since the optimal $\tau^*$ has to be known at each node $i \in \mathcal{A}$ at each iteration instance $t$. The value has to be disseminated among all the nodes in the network.

Since the dual function in (6.8) is separable and the partial Lagrangian associated with the flow variables appears in the same form as in (5.5), the subproblem for the routing involves exactly the same convex minimization problem as expressed in (5.12). The subproblem can be solved in the network layer by using the gradient projection updates given in (5.18) and (5.19) with the partial derivatives given in (5.16) and (5.17), respectively.

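By (6.8), the power allocation subproblem in the physical layer can be written as a convex problem as

\[
\begin{aligned}
\underset{p_l}{\text{minimize}} \quad & \sum_{l \in \mathcal{L}} \left[ \phi p_l - \nu_l \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right) \right] + \sum_{i \in \mathcal{A}} \left[ \zeta_i \left( (1 - \phi)\, \mathbf{b}_{ir}^T \mathbf{p} \right) \right] \\
\text{subject to} \quad & \mathbf{B}\mathbf{p} \preceq \mathbf{g} \\
& p_l \ge 0, \quad \forall l \in \mathcal{L}.
\end{aligned}
\tag{6.13}
\]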

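A second-level dual decomposition is performed to decompose the problem into locally solvable power allocation subproblems that are associated with each node and its outgoing links. By applying a second-level dual decomposition for distributing the solution process horizontally in the physical layer, the relaxed power allocation subproblem can be written as

\[
\underset{p_l \ge 0}{\text{minimize}} \quad \sum_{l \in \mathcal{L}} \left[ \phi p_l - \nu_l \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right) \right] + \sum_{i \in \mathcal{A}} \left[ \zeta_i \left( (1 - \phi)\, \mathbf{b}_{ir}^T \mathbf{p} \right) \right] + \sum_{i \in \mathcal{A}} \omega_i \left( \mathbf{b}_{ir}^T \mathbf{p} - g_i \right),
\tag{6.14}
\]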

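where $\omega_i$, $i \in \mathcal{A}$, are the Lagrange multipliers associated with the total power constraint of each node, hence corresponding to the vector $\boldsymbol{\omega} = [\omega_1, \omega_2, \ldots, \omega_N, \omega_{N+1}]^T \in \mathbb{R}_+^{N+1}$.

The associated dual function minimizes the partial Lagrangian $L_p(\mathbf{p}, \boldsymbol{\nu}, \boldsymbol{\zeta}, \boldsymbol{\omega})$ in (6.14) and is thus of the following form:

\[
\begin{aligned}
D_p(\boldsymbol{\nu}, \boldsymbol{\zeta}, \boldsymbol{\omega}) &= \inf_{p_l} L_p(\mathbf{p}, \boldsymbol{\nu}, \boldsymbol{\zeta}, \boldsymbol{\omega}) \\
&= \inf_{p_l} \sum_{l \in \mathcal{L}} \left[ \phi p_l - \nu_l \log_2\!\left(1 + \left(\frac{d_0}{d_l}\right)^{2} \frac{\kappa_l^2\, p_l}{\varsigma^2}\right) \right] + \sum_{i \in \mathcal{A}} \left[ \zeta_i \left( (1 - \phi)\, \mathbf{b}_{ir}^T \mathbf{p} \right) \right] + \sum_{i \in \mathcal{A}} \omega_i \left( \mathbf{b}_{ir}^T \mathbf{p} - g_i \right).
\end{aligned}
\tag{6.15}
\]

The corresponding dual problem after attaining the optimal power variables $\mathbf{p}^*$ for (6.15) is written as

\[
\begin{aligned}
\underset{\boldsymbol{\omega}}{\text{maximize}} \quad & D_p^*(\boldsymbol{\nu}, \boldsymbol{\zeta}, \boldsymbol{\omega}) \\
\text{subject to} \quad & \boldsymbol{\omega} \succeq \mathbf{0}.
\end{aligned}
\tag{6.16}
\]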

Due to the strict convexity and differentiability of the objective function in (6.14), the optimal powers $\mathbf{p}^*$ for a given $\boldsymbol{\nu}$ and $\boldsymbol{\zeta}$ can be uniquely found by means of the derivative of the partial Lagrangian $L_p(\mathbf{p}, \boldsymbol{\nu}, \boldsymbol{\zeta}, \boldsymbol{\omega})$ with respect to $p_l$, $l \in \mathcal{L}$. By setting the derivative equal to zero, the optimal powers for each link $l \in \mathcal{L}$ are written as

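\[
p_l^* = \max\left\{ 0, \; \frac{\nu_l}{\ln 2 \left( \phi + (1 - \phi)\, \mathbf{b}_{lc}^T \boldsymbol{\zeta} + \mathbf{b}_{lc}^T \boldsymbol{\omega} \right)} - \frac{1}{\gamma_l} \right\}.
\tag{6.17}
\]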
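The stationarity condition behind this closed form can be checked numerically. The sketch below solves $\partial L_p / \partial p_l = 0$ for the relaxed objective (6.14) term by term and then evaluates the derivative at the result. The scalars `b_zeta` and `b_omega` stand for $\mathbf{b}_{lc}^T \boldsymbol{\zeta}$ and $\mathbf{b}_{lc}^T \boldsymbol{\omega}$; the function names and all numerical values are hypothetical.

```python
import math

def minmax_optimal_power(nu, gamma, phi, b_zeta, b_omega):
    """Power that zeroes the derivative of the relaxed objective (6.14)."""
    # Marginal cost of one unit of power on this link.
    price = phi + (1.0 - phi) * b_zeta + b_omega
    if price <= 0.0:
        raise ValueError("the closed form requires a positive power price")
    return max(0.0, nu / (math.log(2.0) * price) - 1.0 / gamma)

def minmax_slope(p, nu, gamma, phi, b_zeta, b_omega):
    """Derivative of (6.14) with respect to p_l."""
    return (phi + (1.0 - phi) * b_zeta + b_omega
            - nu * gamma / (math.log(2.0) * (1.0 + gamma * p)))

p_star = minmax_optimal_power(nu=2.0, gamma=4.0, phi=0.5, b_zeta=0.6, b_omega=0.2)
print(p_star, minmax_slope(p_star, 2.0, 4.0, 0.5, 0.6, 0.2))

# For phi = 1 the epigraph multipliers drop out and the expression
# reduces to the total power case (5.24).
p_total = minmax_optimal_power(nu=2.0, gamma=4.0, phi=1.0, b_zeta=0.6, b_omega=0.5)
print(p_total)
```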

The gradient projection method can be employed to solve the dual problem in (6.16) with the optimal powers attained in (6.17). Since the second-level relaxation was applied similarly to the total power minimization problem, the Lagrange variables $\omega_i$, $i \in \mathcal{A}$, are updated according to (5.27) with the partial derivative given in (5.26). To guarantee that the powers are sufficient to support the capacity for the flows, the power recovery is employed according to (5.28) at the last iteration instance.

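By (6.8), the subproblem associated with finding the optimal epigraph variable $\tau^*$ for a given $\boldsymbol{\zeta}$ involves solving the following linear optimization problem:

\[
\underset{\tau \ge 0}{\text{minimize}} \quad \tau \left( 1 - \sum_{i \in \mathcal{A}} \zeta_i \right)
\tag{6.18}
\]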

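Since the objective is differentiable, the gradient projection method can be employed to solve the problem. The derivative of the objective with respect to $\tau$ is given as

\[
\frac{\partial L_\tau(\tau, \boldsymbol{\zeta})}{\partial \tau} = 1 - \sum_{i \in \mathcal{A}} \zeta_i.
\tag{6.19}
\]

According to the gradient projection method, the epigraph variable is updated at each iteration instance $t$ as

\[
\tau(t+1) = \left[ \tau(t) - \alpha_\tau(t) \left( \frac{\partial L_\tau(\tau, \boldsymbol{\zeta})}{\partial \tau(t)} \right) \right]^+,
\tag{6.20}
\]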

where ατ(t) is the step size.

The distributed algorithm for the min-max transmit power problem as a joint optimization over the routing and the power allocation is summarized in Algorithm 4. The initialization phase of the algorithm requires the sink node to gather the information about the sum rate in the network. The power allocation and the routing subproblems require only local exchange of the optimization and Lagrange variables within the closest neighborhood of each node i ∈ A. The CSI is also needed at the transmitter side of each sensor node. The subproblem related to the optimization over the epigraph variable τ and the update of the associated Lagrange variables introduce an extra overhead signalling requirement for the algorithm. Every ζi, i ∈ A, has to be available to update τ in (6.12) and after that, the updated τ has to be available for each node to update ζi, i ∈ A, at each iteration instance t. In this work, it is assumed that the sink node is responsible for acquiring ζi, i ∈ S, updating τ and communicating it back to the sensor nodes i ∈ S at each iteration instance t.
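The sink-side update of the epigraph variable in (6.19) and (6.20) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the numerical values of ζi and the step-size scale are hypothetical, and the a/√t step form is borrowed from the diminishing step sizes used in Section 7.3.2.

```python
import math


def diminishing_step(t, scale=0.3):
    """Step size scale / sqrt(t) for t = 1, 2, ... (scale is a hypothetical value)."""
    return scale / math.sqrt(t)


def update_tau(tau, zetas, step):
    """One gradient projection step for the epigraph variable, as in (6.20)."""
    gradient = 1.0 - sum(zetas)            # derivative in (6.19)
    return max(0.0, tau - step * gradient)  # projection onto tau >= 0


def run_sink_updates(zeta_history, tau0=0.0):
    """Apply the update once per iteration, using the zetas collected by the sink."""
    tau = tau0
    for t, zetas in enumerate(zeta_history, start=1):
        tau = update_tau(tau, zetas, diminishing_step(t))
    return tau
```

The sketch makes the signalling pattern visible: `update_tau` needs the full list of ζi values, which is why the sink has to collect them before τ can be updated and sent back.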

The information exchange of the essential optimization variables and the system parameters from the perspective of a node (red) is illustrated in Fig. 9. Correspondingly, the acquisition of the necessary information with respect to the sink node is presented in Fig. 10. The meanings of the colors are analogous to the case in Fig. 8. From the perspective of the node, the need to acquire the epigraph variable τ from the sink node brings a considerable increase in complexity and overhead compared to the total transmit power case in Algorithm 3. Figure 10 shows how the update of τ brings its own challenge to the network since the sink node has to collect ζi, i ∈ S. As the network density increases, this causes a more dramatic increase in the overall complexity than in the total transmit power case.


Algorithm 4 Min-max transmit power - joint power and routing optimization

1. Initialization
   a) For each node i ∈ A, choose initial λi ≥ 0, ωi ≥ 0 and ζi ≥ 0, set total power constraint gi and ϕ ∈ [0, 1] for i ∈ S
   b) For each link l ∈ L, choose initial fl ≥ 0 and νl ≥ 0
   c) For the sink node, collect $r_{N+1} = -\sum_{i\in\mathcal{S}} r_i$
   d) Choose an initial τ ≥ 0 and set it to each node i ∈ A in the network

2. Distributed algorithm - At each iteration instance t:
   I. The routing subproblem
      a) For each node i ∈ A, collect flow variables fl from the links connected to node i
      b) For each link (i1, i2), i1, i2 ∈ A, collect λi1 and λi2 from the end nodes of the link
      c) For each link l ∈ L, update fl according to (5.18)
      d) For each node i ∈ A, update λi according to (5.19) by using the updated flow variables fl
   II. The power allocation subproblem
      a) For each node i ∈ S, collect power variables pl of the outgoing links of node i
      b) For each link l = (i1, i2), i1, i2 ∈ A, collect ωi1 and ζi1 from the start node of the link and collect link condition factor γl
      c) For each link l ∈ L, set the optimal powers p∗l according to (6.17)
      d) For each node i ∈ A, update ωi according to (5.27)
   III. The epigraph variable subproblem
      a) Acquire ζi associated with each node i ∈ S for the sink node
      b) Update τ according to (6.20)
      c) Inform the sensor nodes i ∈ S with the updated value of τ
   IV. The master dual problem
      a) For each link l ∈ L, update νl according to (5.11)
      b) For each node i ∈ A, update ζi according to (6.12)

3. The recovery of the powers
   a) At the last iteration instance t, recover the powers according to (5.28)

Figure 9. Information exchange of the variables within the neighborhood of a node.

Figure 10. Acquirement of the necessary information with respect to the sink node.

7. NUMERICAL RESULTS

This chapter gives numerical results about the use of Slepian-Wolf coding in single-sink data gathering network and the functionalities of the proposed distributed algorithms. In order to perform numerical simulations in different data gathering scenarios, a simulator was developed in Matlab 7.10.0 (R2010a) software environment.

In Section 7.1, the structure of the simulator is defined. The section determines the procedures on how the network is created, how the Slepian-Wolf coding is performed and how the data gathering model is finally extended to the wireless sensor network. Section 7.2 provides simulation results related to the data transportation costs achieved with Slepian-Wolf coding of correlated data. Section 7.3 gives results for the total transmit power minimization problem in wireless sensor network scenario. Results are provided to show the benefits of use of Slepian-Wolf coding in the network and to evaluate the convergence properties of the distributed algorithm under static and time-varying channels. The influence of Slepian-Wolf coding on the maximum transmit power is shown in Section 7.4. Additionally, the impact of the trade-off parameter on the system performance is investigated. The section also covers the convergence analysis of the corresponding distributed algorithm in static and time-varying channel conditions.

7.1. Structure of the Simulator

The section describes the structure of the simulator that was used throughout the simulations. Firstly, the section defines the procedure for the deployment of the nodes in the network with different topologies. Then, the Slepian-Wolf rate allocation process, generation of SPT and the clustering of the network are described. Finally, the extension of the data gathering network to the wireless sensor network and the introduction of the related system parameters are defined.

7.1.1. Creation of Network Topology

The construction of the topologies of a single-sink data gathering network and the corresponding wireless sensor network version are described in parallel in the following section. The creation of the system model begins with the deployment of the source/sensor nodes and the sink node in a particular restricted area. Throughout the simulations, two different ways of deploying the nodes are used. In the first model, the source/sensor nodes are deployed in a grid form with uniform distances between the nodes, whereas the sink node is located in the center of the network. The second model consists of deploying the source/sensor nodes in a pseudo-random manner such that the minimum distance between two nodes is guaranteed not to go below a specific value. This is performed by first dividing the network area into small square areas such that the squares are separated from each other by the minimum distance requirement. After that, each source/sensor node is deployed randomly inside its designated square, whereas the sink node is placed in the center of its designated square, located at the center of the network.
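The two deployment models described above can be sketched as follows. This is a minimal Python illustration and not the Matlab simulator of the thesis: the function names, the cell size parameter and the assumption of an odd number of squares per side (so that a center square exists for the sink) are hypothetical choices made here.

```python
import math
import random


def grid_deployment(n_side, spacing):
    """Source nodes on an n_side x n_side grid with uniform spacing."""
    return [(col * spacing, row * spacing)
            for row in range(n_side) for col in range(n_side)]


def pseudo_random_deployment(n_side, cell, min_dist, rng):
    """One node per square cell; cells are separated by min_dist gaps.

    Assumes n_side is odd. The sink sits at the center of the center cell,
    every other cell holds one source placed uniformly at random inside it.
    """
    pitch = cell + min_dist          # distance between cell origins
    center = n_side // 2
    sources, sink = [], None
    for row in range(n_side):
        for col in range(n_side):
            x0, y0 = col * pitch, row * pitch
            if row == center and col == center:
                sink = (x0 + cell / 2.0, y0 + cell / 2.0)
            else:
                sources.append((x0 + rng.uniform(0.0, cell),
                                y0 + rng.uniform(0.0, cell)))
    return sources, sink


def min_pairwise_distance(points):
    """Smallest Euclidean distance over all node pairs."""
    return min(math.dist(a, b)
               for i, a in enumerate(points) for b in points[i + 1:])
```

Because neighbouring squares are separated by a gap of `min_dist`, any two nodes in different squares are at least `min_dist` apart, which is the guarantee the pseudo-random model relies on.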


Once the deployment of the nodes is performed, the network topology is finished by setting the interconnections between the nodes, i.e., point-to-point links or wireless links, depending on the used scenario. For the single-sink data gathering network, the available communication links between the nodes are defined according to a predefined maximum distance allowed between a node pair i1, i2 ∈ T, i1 ≠ i2, and for the WSN, the available links are determined with the transmission range $d_i^{t}$ of each sensor node i ∈ S discussed in Section 4.1. According to the limitations in communication range, node-link incidence matrix A is formed for the WSN scenario to define the interconnections between the sensor nodes in the network. Finally, the network topology of the single-sink data gathering network can be expressed as graph $G_{\text{ssdgn}}$ and the topology of the corresponding WSN scenario as graph $G_{\text{wsn}}$. If the topologies are equal, the graphs are equivalent with the correspondences A ≜ T and L ≜ E.
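As an illustration of how the links and the node-link incidence matrix follow from the communication range, consider the Python sketch below. The function names and the sign convention (+1 for the start node of a link, -1 for its end node) are assumptions made for this example; the convention actually used for A is defined elsewhere in the thesis.

```python
import math


def build_links(nodes, comm_range):
    """Directed links (i1, i2), i1 != i2, whose length is within comm_range."""
    links = []
    for i1, a in enumerate(nodes):
        for i2, b in enumerate(nodes):
            if i1 != i2 and math.dist(a, b) <= comm_range:
                links.append((i1, i2))
    return links


def incidence_matrix(n_nodes, links):
    """A[i][l] = +1 if link l leaves node i, -1 if it enters node i, else 0.

    The sign convention is an assumption of this sketch.
    """
    matrix = [[0] * len(links) for _ in range(n_nodes)]
    for l, (i1, i2) in enumerate(links):
        matrix[i1][l] = 1
        matrix[i2][l] = -1
    return matrix
```

For a 2 × 2 grid with 100-unit spacing and a range of 0.99 × 100 × √2, only the horizontal and vertical neighbours are connected, giving eight directed links and no diagonal ones.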

7.1.2. Rate Allocation Process

The rate allocation process is performed in the single-sink data gathering network model because the process requires only the network topology to be available. Hence, the allocated source rates are valid in both network scenarios. In order to assign the Slepian-Wolf rates for the source nodes in the network, the shortest path tree has to be generated for the given network. The shortest path tree is found by running the distributed Dijkstra’s algorithm for each source node i ∈ T in the single-sink data gathering network given by graph $G_{\text{ssdgn}}$. For calculation of the SPT, an appropriate Matlab function was used [40]. The function takes the node coordinates and the undirected interconnections between the nodes as inputs and returns the shortest path and the corresponding distance from a starting node to an ending node. After the path weights have been calculated for each node i ∈ T, the sources are assigned the associated rates by performing Slepian-Wolf coding. Global and localized Slepian-Wolf coding are employed by using Algorithm 1 and Algorithm 2, respectively. For the sake of comparison, independent encoding of sources is performed in some simulation scenarios, in which case the source rate of every node corresponds to (2.10).
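The shortest path computation can be illustrated with a compact Dijkstra implementation in Python. This is a sketch and not the Matlab function of [40]: the function name is hypothetical, the search is run once from the sink over the undirected links, and the squared-distance edge weights are taken from the description in Section 7.2.

```python
import heapq


def shortest_path_weights(coords, edges, sink):
    """Dijkstra from the sink over undirected edges with weights d_e^2.

    Returns (dist, parent): dist[i] is the path weight of node i towards the
    sink and parent[i] is its next hop on the shortest path tree.
    """
    adjacency = {i: [] for i in range(len(coords))}
    for u, v in edges:
        (x1, y1), (x2, y2) = coords[u], coords[v]
        weight = (x1 - x2) ** 2 + (y1 - y2) ** 2   # squared edge length
        adjacency[u].append((v, weight))
        adjacency[v].append((u, weight))

    dist = {i: float("inf") for i in adjacency}
    parent = {i: None for i in adjacency}
    dist[sink] = 0.0
    heap = [(0.0, sink)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                                # stale queue entry
        for v, weight in adjacency[u]:
            if d + weight < dist[v]:
                dist[v] = d + weight
                parent[v] = u
                heapq.heappush(heap, (d + weight, v))
    return dist, parent
```

With squared distances as weights, two short hops are cheaper than one long hop of the same total length, so the resulting tree favours multi-hop routes over direct long links.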

In the localized Slepian-Wolf coding, the clustering of the source nodes in the network is performed in a simple way but such that it allows significant rate reductions within each cluster. Hence, in the network scenario where the nodes are placed in the grid alignment, each cluster is either square or rectangular. The clustering method exploits efficiently the correlation structure of the data in the symmetric network topology when the correlation depends on the spatial separation of the nodes. Covariance matrix K defines the amount of correlation in the data between the source nodes. The covariance matrix is defined by using the squared exponential law such that the off-diagonal entries correspond to (3.7) and the diagonal entries to (3.8). The correlation coefficient θ is the system parameter that is varied to adjust how fast the correlation diminishes with the distance between the nodes. Throughout the simulations, the variance of each Gaussian random variable Yi, i ∈ S, was $\sigma_Y^2 = 1$.

An example of the grid form network topology with N = 256 source nodes and with the partitioning of the network into D = 64 clusters is illustrated in Fig. 11(a). Each cluster Cj, j = 1, 2, . . . , D, consists of Nj = 4 source nodes of 2 × 2 grid form which can be distinguished by the different colors. In this particular case, the links that are available for data transportation between the source nodes also appear in a grid form. The minimum distance between the nodes was set to 100 units, and a communication link existed if the distance between the nodes was smaller than or equal to 100 units. Figure 11(b) illustrates a pseudo-random grid network topology with N = 24 source nodes without network partitioning. The minimum distance between the nodes was guaranteed not to go below 50 units. The communication range was set to $d^{\text{max-min}}_{i_1 i_2} \times \sqrt{2} \times 0.99$, where $d^{\text{max-min}}_{i_1 i_2}$ stands for the largest minimum distance among all the node pairs in the network.
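The communication range rule for the pseudo-random topology can be sketched as below. The reading of the "largest minimum distance" as the largest nearest-neighbour distance over all nodes is an interpretation made for this example, and the function names are hypothetical.

```python
import math


def largest_min_distance(nodes):
    """Largest nearest-neighbour distance over all nodes (assumed reading)."""
    nearest = []
    for i, a in enumerate(nodes):
        nearest.append(min(math.dist(a, b)
                           for j, b in enumerate(nodes) if j != i))
    return max(nearest)


def communication_range(nodes, margin=0.99):
    """Range d_max-min * sqrt(2) * margin, with margin = 0.99 as in the text."""
    return largest_min_distance(nodes) * math.sqrt(2.0) * margin
```

Under this reading, every node reaches at least its nearest neighbour, while on a regular grid the factor 0.99 keeps the range just below the diagonal distance.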

(a) The grid form with 256 source nodes and with 2 × 2 clusters.

(b) A pseudo grid form with 64 source nodes.

Figure 11. Examples about the network topologies used in the simulations.

7.1.3. Wireless Sensor Network

After the network topology has been generated for the WSN given by $G_{\text{wsn}}$ and the rates are assigned for the source nodes, the essential parameters of the wireless sensor network are added to the model to achieve the single-sink data gathering WSN scenario. The parameters involved in the model are described in more detail in Chapter 4.

For each directed wireless link (i1, i2), i1, i2 ∈ A, i1 ≠ i2, Gaussian noise power and a Rayleigh distributed channel coefficient are introduced. Gaussian noise power $\varsigma^2$ remains the same in the whole network. Rayleigh distributed channel coefficients κl, l ∈ L, are generated with a channel simulator in which the time correlation and fading rate properties can be adjusted. The channel generation procedure creates a different channel initialization for each link, and then the fading rate is adjusted by varying the coherence time. Each sensor node i ∈ S is assigned the total transmit power constraint $P_i^{\text{tot}}$ by defining vector g for the network. The rates resulting from Section 7.1.2 are assigned to the corresponding source nodes by introducing vector r.
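A single static Rayleigh channel coefficient can be drawn as the magnitude of a complex Gaussian sample, as in the Python sketch below. This does not reproduce the channel simulator used in the thesis, since no time correlation is modelled. The function names are hypothetical, and the expression used for the link condition factor γl is an assumption read off the logarithm term in (6.14).

```python
import math
import random


def rayleigh_coefficient(rng, mean_power=1.0):
    """Magnitude of a zero-mean complex Gaussian sample with E[|h|^2] = mean_power."""
    sigma = math.sqrt(mean_power / 2.0)     # per-component standard deviation
    real = rng.gauss(0.0, sigma)
    imag = rng.gauss(0.0, sigma)
    return math.hypot(real, imag)


def link_gain(kappa, d0, d, noise_var):
    """Assumed link condition factor (d0 / d)^2 * kappa^2 / noise_var."""
    return (d0 / d) ** 2 * kappa ** 2 / noise_var
```

A different seed of `random.Random` plays the role of a different channel initialization for a link in this sketch.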

7.2. Data Transportation Costs in Single-Sink Data Gathering

In the following section, simulation results are provided to show the influence of Slepian-Wolf coding of data on the cost function of data transportation. Both global and localized Slepian-Wolf coding were performed with varying values of correlation coefficient θ, and for localized Slepian-Wolf coding, different cluster sizes Nj were used. The simulations were also made for different network sizes by varying N. The cost function for data transportation was calculated according to (3.3), where the path weights $w_i^{\text{SPT}}$ for each node i ∈ T are determined with Dijkstra’s algorithm by using the edge weights $w_e = d_e^2$, e ∈ E, given in (3.13). The source nodes were deployed in a grid form in the network such that the minimum distance between each source node pair was 100 units. The link between two nodes existed if the distance between the nodes was smaller than or equal to 100 units.

Figure 12 shows the influence of the cluster size of the localized Slepian-Wolf coding and the value of the correlation coefficient on the data transportation costs. The cost ratio between the global and localized Slepian-Wolf coding ($\text{CR}_{\text{GL}}$) is defined with the objective functions in (3.14) and (3.10) as

$$\text{CR}_{\text{GL}} = \frac{\text{Costs for global SW}}{\text{Costs for localized SW}} = \frac{\sum_{i=1}^{N} R_i^{\text{Glob}}\, w_i^{\text{SPT}}}{\sum_{j=1}^{D} \sum_{i=1}^{N_j} R_i^{j,\text{Loc}}\, w_i^{j,\text{SPT}}}, \tag{7.1}$$

where $R_i^{\text{Glob}}$ is the source rate of node i achieved with the global Slepian-Wolf coding and $R_i^{j,\text{Loc}}$ is the source rate of node i achieved with localized Slepian-Wolf coding. To analyze also the impact of the network size, three different network sizes, N = 16, N = 100 and N = 256, were simulated. The correlation coefficient took values in $\theta \in [3.5 \times 10^{-5}, 1.2 \times 10^{-4}]$. The cluster sizes correspond to Nj = 2, 4, 8, 16 for N = 16, Nj = 2, 4, 25, 50, 100 for N = 100 and Nj = 2, 4, 16, 64, 128, 256 for N = 256. Thus, the last entry of each cluster size set corresponds to the global Slepian-Wolf scenario, resulting in $\text{CR}_{\text{GL}}$ being unity.
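The cost ratio in (7.1) reduces to two weighted sums, which the following Python sketch computes. The function names and the numbers in the usage note are hypothetical. The double sum over clusters is flattened into a single sum over nodes, which assumes that every node belongs to exactly one cluster.

```python
def transport_cost(rates, path_weights):
    """Sum of rate times shortest-path weight over all source nodes."""
    return sum(r * w for r, w in zip(rates, path_weights))


def cost_ratio_global_vs_localized(global_rates, localized_rates, path_weights):
    """Cost ratio of (7.1), with one path weight per node for both schemes."""
    return (transport_cost(global_rates, path_weights)
            / transport_cost(localized_rates, path_weights))
```

For example, if global coding halves every rate relative to localized coding, `cost_ratio_global_vs_localized([1.0, 1.0], [2.0, 2.0], [10000.0, 20000.0])` evaluates to 0.5, and identical rate vectors give a ratio of one.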

The simulation results in Fig. 12 show that when the correlation between the data readings is high, the global Slepian-Wolf coding clearly outperforms the localized version. This is due to the fact that with highly correlated data, in global Slepian-Wolf coding there are more nodes on which to condition when calculating the respective rate of a node. Hence, the "ignored" nodes outside each cluster in localized Slepian-Wolf coding cause a substantial penalty in terms of rate reduction, which directly influences the cost ratio in (7.1). The impact of the cluster size is also clearly seen in the figure: the larger the cluster sizes, the better the localized Slepian-Wolf coding performs. When the correlation decreases, Slepian-Wolf coding can no longer exploit the correlation structure of the data to any notable extent, regardless of the number of nodes in conditioning. That is why the localized Slepian-Wolf coding appears to perform proportionally better, as the ratio approaches unity.

The figure also provides information about the relationship between the network size and the cluster sizes. When the network size is increased while the cluster size and the correlation coefficient are kept fixed, the cost ratio decreases, meaning poorer performance for the localized Slepian-Wolf coding. For instance, there are 8 clusters of size 2 for N = 16 and 128 for N = 256. Since the number of clusters is much higher in the latter case, the overall rate reduction is dramatically smaller, which causes the increase in the cost function. However, with larger cluster sizes the impact of the network size on the cost ratio diminishes.

[Figure 12 panels: (a) N = 16, (b) N = 100, (c) N = 256. Horizontal axis: number of nodes in each cluster (Nj). Vertical axis: cost ratio Σ(R·w) global / Σ(R·w) local. One curve per correlation coefficient, from θ = 3.5 × 10−5 to θ = 1.2 × 10−4 in steps of 0.5 × 10−5.]

Figure 12. Data transportation cost ratios between global and localized Slepian-Wolf coding with different correlation coefficients and cluster sizes.

Figure 13 illustrates the data transportation cost ratio between independent encoding of the sources and Slepian-Wolf coding with different cluster sizes and with varying correlation coefficient values. The cost ratio between localized Slepian-Wolf coding and independent encoding ($\text{CR}_{\text{LI}}$) is defined as

$$\text{CR}_{\text{LI}} = \frac{\text{Costs for localized SW}}{\text{Costs for independent encoding}} = \frac{\sum_{j=1}^{D} \sum_{i=1}^{N_j} R_i^{j,\text{Loc}}\, w_i^{j,\text{SPT}}}{\sum_{i=1}^{N} R_i^{\text{Ind}}\, w_i^{\text{SPT}}}, \tag{7.2}$$

where $R_i^{\text{Ind}}$ is the source rate of node i achieved with independent encoding. The cost ratios were plotted with the same network sizes and cluster sizes as in the case presented in Fig. 12. The correlation coefficient took values in $\theta \in [3.5 \times 10^{-5}, 2.0 \times 10^{-4}]$.

Figure 13 shows a critical issue related to Slepian-Wolf coding, namely its reliance on the data correlation. If the correlation between the data associated with each source node is not sufficiently high, Slepian-Wolf coding does not provide observable benefits in terms of cost-effective data transportation. The second remarkable issue is that when the network consists of a large number of highly correlated source nodes, the use of Slepian-Wolf coding is profitable compared to independent encoding even with smaller cluster sizes. For instance, with the network size of N = 256, correlation coefficient $\theta = 3.5 \times 10^{-5}$ and cluster size Nj = 4, Slepian-Wolf coding improves the energy efficiency by about 25 per cent. When the global Slepian-Wolf coding is used in the corresponding case, a decrease of about 77 per cent is achieved.

[Figure 13 panels: (a) N = 16, (b) N = 100, (c) N = 256. Horizontal axis: correlation coefficient (θ), scale × 10−4. Vertical axis: cost ratio Σ(R·w) SW / Σ(R·w) ind. Legend entries include "Localized SW with Nj = 2" and "Global SW".]

Figure 13. Data transportation cost ratios between independent encoding and Slepian-Wolf coding with different correlation coefficients and cluster sizes.

7.3. Total Transmit Power Minimization

The section covers simulation results related to the total transmit power minimization problem in (5.1). Simulation results are provided to show the advantages of Slepian-Wolf coding in single-sink data gathering WSN in terms of the total transmit power. Also, the convergence properties of the proposed distributed algorithm summarized in Algorithm 3 are shown for both static and time-varying channel conditions. The simulation scenario consisted of a grid form topology with the minimum distance between nodes of 100 units. The communication range of each sensor node was set to $0.99 \times 100 \times \sqrt{2}$, meaning the existence of vertical and horizontal links only. The noise variance was set to $\varsigma^2 = 0.01$. The total transmit power constraint for each node i ∈ S was set to $g(i) = 1.0 \times 10^{3}$.

7.3.1. Total Transmit Power with Slepian-Wolf Coding

Total transmit power in the single-sink data gathering WSN was studied by finding the solution for the total transmit power minimization problem in (5.1). The sources were encoded with the global Slepian-Wolf coding, with localized Slepian-Wolf coding and with independent encoding. The solutions for the problem were found in a centralized manner with CVX software [41]. The total transmit power ratio between Slepian-Wolf coding and independent encoding ($\text{TPR}_{\text{SW\_Ind}}$) was investigated, which is defined as

$$\text{TPR}_{\text{SW\_Ind}} = \frac{\text{Total transmit power with SW coding}}{\text{Total transmit power with independent encoding}} = \frac{\sum_{l\in\mathcal{L}} p_l^{\text{SW}}}{\sum_{l\in\mathcal{L}} p_l^{\text{Ind}}}, \tag{7.3}$$

where $p_l^{\text{SW}}$ stands for the transmit power of link l when Slepian-Wolf coding is employed, whereas $p_l^{\text{Ind}}$ is the corresponding power with independent encoding.

The simulations were carried out by using varying network and cluster sizes. The combinations of the network and cluster sizes were N = 8 with Nj = 2, Nj = 4 and Nj = 8, N = 16 with Nj = 2, Nj = 4, Nj = 8 and Nj = 16, and N = 24 with Nj = 2, Nj = 6, Nj = 12 and Nj = 24. The total transmit power ratio between Slepian-Wolf coding and independent encoding with different correlation coefficients and cluster sizes is illustrated in Fig. 14. The ratios are averaged over 50 channel initializations. The correlation coefficient was set such that $\theta \in [3.5 \times 10^{-5}, 2.3 \times 10^{-4}]$.

[Figure 14 panels: (a) N = 8, (b) N = 16, (c) N = 24. Horizontal axis scale × 10−4. Vertical axis: total transmit power ratio Σp SW / Σp Ind. Legend entries include "Global SW".]

Figure 14. Total transmit power ratios between Slepian-Wolf coding and independent encoding with different correlation coefficients and cluster sizes.

The figure illustrates how beneficial the use of Slepian-Wolf coding is in reducing the total transmit power required to transmit all the individual data to the sink node. With high correlation, Slepian-Wolf coding provides energy savings even with small cluster sizes (the extreme case being a pair of nodes) by reducing the total rate occurring in the network. With increasing network size, the advantage becomes more obvious, since the larger number of sources involved in Slepian-Wolf coding provides a valuable rate reduction for the nodes lying at the edge of the network. Hence, the total amount of data transmitted over long distances is decreased. This directly influences the transmit power usage of the nodes that act as intermediate nodes for the data delivery. For instance, with the network size of N = 24, correlation coefficient $\theta = 3.5 \times 10^{-5}$ and cluster size Nj = 2, a $\text{TPR}_{\text{SW\_Ind}}$ of about 0.43 is achieved, which is a noticeable improvement. With the global Slepian-Wolf coding, the difference is enormous in the corresponding case, since a decrease of over 98 per cent in the total transmission power is achieved.

7.3.2. Convergence of the Distributed Algorithm

The convergence properties of the distributed algorithm described in Algorithm 3 were studied in static and time-varying channel conditions. The initial values for the optimization and Lagrange variables were set to zero. For the gradient updates, diminishing step sizes were used. A diminishing step size converges to zero while its sum diverges as the number of iterations approaches infinity [42, p. 32]. The diminishing step sizes used in the simulations were $\alpha_f = 4.0/\sqrt{t}$, $\alpha_\lambda = 0.3/\sqrt{t}$, $\alpha_\omega = 0.3/\sqrt{t}$ and $\beta_\nu = 0.3/\sqrt{t}$. The sources were encoded with the global Slepian-Wolf coding with a fixed correlation of $\theta = 3.5 \times 10^{-5}$.

The convergence properties of the algorithm in static channel conditions were evaluated by inspecting the feasibility of the solutions in the scenario that consisted of 16 source nodes deployed in the grid form. The number of iteration instances was set to 5000 for each channel initialization. The number of channel initializations was set to 50. The indicators used for the feasibility checks were the normalized duality gap and the amount of normalized violations of the flow conservation law and the capacity constraint, calculated at the last iteration instance. These constraints were the most critical ones in the optimization. The total power constraint for each node i ∈ S was set so loosely that it did not introduce any constraint violations to the solutions. The normalized duality gap was defined as the regular duality gap normalized with the primal value achieved with the CVX software to attain the percentage differences of the solutions. The normalized constraint violations were calculated with respect to the L1-norm as

$$\begin{aligned} \text{FCL}_{\text{viol}} &= \sum_{i\in\mathcal{S}} \left\| \frac{\mathbf{a}_{i_r}^{T}\mathbf{f} - r_i}{r_i} \right\|_1 \\ \text{CC}_{\text{viol}} &= \sum_{l\in\mathcal{L}} \left\| \frac{f_l - c_l}{f_l} \right\|_1, \quad \text{if } (c_l - f_l) < 0, \end{aligned} \tag{7.4}$$

where ‖·‖1 stands for the L1-norm, $\text{FCL}_{\text{viol}}$ stands for the flow conservation law violation and $\text{CC}_{\text{viol}}$ stands for the capacity constraint violation. Thus, $\text{FCL}_{\text{viol}}$ is a measure of the imbalance in the flows involved at each node and $\text{CC}_{\text{viol}}$ tells how much the flows exceed the capacity on each link.
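The two feasibility indicators in (7.4) can be computed as in the Python sketch below. The function names and the small numerical example are hypothetical. The sketch assumes that the rows of the incidence matrix are given for the sensor nodes only and that every rate ri is nonzero.

```python
def fcl_violation(incidence_rows, flows, rates):
    """Normalized flow conservation violation, summed over the given nodes."""
    total = 0.0
    for row, rate in zip(incidence_rows, rates):
        net_flow = sum(a * f for a, f in zip(row, flows))
        total += abs((net_flow - rate) / rate)
    return total


def cc_violation(flows, capacities):
    """Normalized capacity violation, counting only links where flow > capacity."""
    return sum(abs((f - c) / f)
               for f, c in zip(flows, capacities) if c - f < 0)
```

As a small check, with incidence rows [1, 0] and [-1, 1], flows (2.2, 3.0) and rates (2.0, 1.0), the node imbalances are 0.1 and 0.2, so `fcl_violation` is about 0.3. Links whose flow stays within the capacity contribute nothing to `cc_violation`.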

The normalized duality gaps achieved for different channel initializations are presented in Fig. 15(a) and the associated constraint violations in Fig. 15(b). In Fig. 16, the normalized differences in the total transmit powers between the solutions achieved with the centralized and distributed algorithms are presented. Hence, the figure represents the percentage difference in terms of the objective function, i.e., it tells how well the distributed algorithm is working regarding the total transmit power minimization (under the condition that the feasibility of the solution is at an acceptable level). Also, the evolution of the duality gap was plotted for a single channel initialization as a function of the iteration instances. This is shown in the semi-logarithmic scale in Fig. 17.

[Figure 15 panels: (a) Duality gaps: normalized duality gap at the last iteration versus channel initialization. (b) Constraint violations: FCLviol = ‖(A·f − r)/r‖1 and CCviol = ‖(f − c)/f‖1, if (c − f) < 0, at the last iteration versus channel initialization.]

Figure 15. Feasibilities of the solutions for N = 16 with 50 channel initializations.

[Figure 16 plot: difference on the total transmit powers, |Σ pl,opt − Σ pl,dist| / Σ pl,opt, versus channel initialization.]

Figure 16. Difference on the total transmit powers between the centralized and distributed solutions for N = 16 with 50 channel initializations.

[Figure 17 plot: normalized duality gap (logarithmic scale) versus iteration instance, 0 to 5000.]

Figure 17. Normalized duality gap as a function of iteration instances for a channel initialization.

Figures 15 and 16 illustrate that the proposed algorithm converges to near-optimal solutions with relatively small violation of the constraints in a given deterministic network topology with static channel conditions. The achieved normalized duality gaps and the differences between the objective values were on the order of $10^{-4}$ and the normalized constraint violations were below one per cent on average. These are negligible inaccuracies in terms of the proper operation of the algorithm. As a result, the accuracy of the solutions achieved with the distributed algorithm can be considered sufficient for the joint optimization framework.

The tracking ability of the proposed algorithm was tested under time-varying Rayleigh fading channel conditions with the same indicators for the convergence as in the static case. The tracking ability of the algorithm was studied with different normalized coherence times of the channels. In general, the coherence time Tc can be approximated to be the inverse of Doppler spread Ds for small scale fading, hence given by [43, p. 16, 41]

$$T_c \approx \frac{1}{D_s} = \left(\frac{2 f_c v_r}{c}\right)^{-1}, \tag{7.5}$$

where fc is the carrier frequency, vr is the velocity of the receiver in m/s and c is the speed of light. The definition assumes that only the receiver is moving. Normalized coherence time $T_{c,\text{norm}}$ was defined by normalizing the coherence time by the duration of the iteration instances, denoted with Ts, such that $T_{c,\text{norm}} = T_c/T_s$. Thus, the normalized coherence time indicates the rate of fading during the convergence process of the algorithm.
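A short numerical illustration of (7.5) and of the normalization by Ts is given below in Python. The carrier frequency, receiver speed and iteration duration used in the usage note are hypothetical values chosen only to show the arithmetic; they are not parameters of the thesis simulations.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def coherence_time(carrier_hz, receiver_speed):
    """Coherence time as the inverse of the Doppler spread 2 * fc * vr / c."""
    doppler_spread = 2.0 * carrier_hz * receiver_speed / SPEED_OF_LIGHT
    return 1.0 / doppler_spread


def normalized_coherence_time(carrier_hz, receiver_speed, iteration_duration):
    """Coherence time divided by the duration Ts of one iteration instance."""
    return coherence_time(carrier_hz, receiver_speed) / iteration_duration
```

For instance, a 2.4 GHz carrier and a receiver moving at 1 m/s give a Doppler spread of roughly 16 Hz and therefore a coherence time of roughly 62 ms. Doubling the receiver speed halves the coherence time, and with it the normalized coherence time for a fixed Ts.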

The tracking properties of the algorithm with different normalized coherence times were studied by checking the solution feasibilities after 5000 iterations. The feasibilities were averaged over 25 channel initializations for each normalized coherence time. The network consisted of N = 8 source nodes deployed in the grid form. The normalized coherence times used were $T_{c,\text{norm}}$ = [0.0022, 0.0045, 0.0075, 0.0150, 0.0225, 0.0450, 0.0750, 0.2250]. The normalized duality gaps achieved at the last iteration with different normalized coherence times are presented in Fig. 18(a) and the associated constraint violations in Fig. 18(b).

Figure 18 shows that the algorithm tracks the solution relatively well under a slow Rayleigh fading channel down to a normalized coherence time of $T_{c,\text{norm}} = 0.0075$. Below that, when the channel realizations change more rapidly between consecutive iteration instances, the algorithm fails to track the solution. This can be seen as a dramatic increase in the constraint violations of the solutions: for instance, with $T_{c,\text{norm}} = 0.0045$, $\text{FCL}_{\text{viol}}$ is around 5 per cent and with $T_{c,\text{norm}} = 0.00225$, it increases rapidly to about 13 per cent. The ability to track the solution under a slowly fading time-varying Rayleigh channel is an important property of the algorithm from the viewpoint of practical WSN scenarios.

[Figure 18 panels: (a) Duality gaps: normalized duality gap at the last iteration versus normalized coherence time (Tc,norm). (b) Constraint violations at the last iteration instance, FCLviol = ‖(A·f − r)/r‖1 and CCviol = ‖(f − c)/f‖1, if (c − f) < 0, versus normalized coherence time.]

Figure 18. Feasibilities of the solutions for N = 8 with different normalized coherence times.

7.4. Minimization of Maximum Transmit Power

The section covers simulation results related to the min-max transmit power problem in (6.1). Simulation results are provided to show the advantages of Slepian-Wolf coding in single-sink data gathering WSN in terms of the resulting maximum transmit power among the sensor nodes. The impact of the trade-off parameter ϕ on the energy efficiency with respect to the total and maximum transmit powers was also investigated. The convergence properties of the proposed distributed algorithm described in Algorithm 4 are shown in static and time-varying channel conditions. The noise variance was set to $\varsigma^2 = 0.01$. The total transmit power constraint for each node i ∈ S was set to $g(i) = 1.0 \times 10^{3}$.

7.4.1. Maximum Transmit Power with Slepian-Wolf Coding

The influence of Slepian-Wolf coding on the maximum transmit power needed in the single-sink data gathering wireless sensor network was studied by solving the min-max transmit power problem in (6.1). The sources were encoded with global Slepian-Wolf coding, with localized Slepian-Wolf coding and with independent encoding. The solutions were found in a centralized manner with the CVX software. The maximum transmit power ratio between Slepian-Wolf coding and independent encoding (MPR_SW_Ind) was investigated. It is defined as

\[
\mathrm{MPR}_{\mathrm{SW\_Ind}} = \frac{\text{Max. transmit power with SW coding}}{\text{Max. transmit power with independent encoding}} = \frac{\|\mathbf{B}\mathbf{p}^{\mathrm{SW}}\|_\infty}{\|\mathbf{B}\mathbf{p}^{\mathrm{Ind}}\|_\infty}, \tag{7.6}
\]

where $\mathbf{p}^{\mathrm{SW}}$ stands for the transmit power vector when Slepian-Wolf coding is employed, whereas $\mathbf{p}^{\mathrm{Ind}}$ corresponds to the transmit power vector with independent encoding.

The simulations were carried out with varying network and cluster sizes. The combinations of the network and cluster sizes were N = 8 with Nj = 2, Nj = 4 and Nj = 8; N = 16 with Nj = 2, Nj = 4, Nj = 8 and Nj = 16; and N = 24 with Nj = 2, Nj = 6, Nj = 12 and Nj = 24. The network topology was the grid form and the minimum distance between nodes was set to 100 units. The communication range of each sensor node was set to 0.99 × 100 × √2. The trade-off parameter ϕ was kept fixed at 0.001 to put most of the emphasis on the minimization of the maximum transmit power in the objective. Maximum transmit power ratios between Slepian-Wolf coding and independent encoding with different correlation coefficients and cluster sizes are illustrated in Fig. 19. The ratios were averaged over 50 channel initializations. The correlation coefficient was varied over the range θ = [3.5 × 10⁻⁵, 2.3 × 10⁻⁴].
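The communication range 0.99 × 100 × √2 ≈ 140.0 units lies just below the diagonal spacing 100 × √2 ≈ 141.4 units of a grid with a 100-unit pitch, so a node reaches its horizontal and vertical neighbors but not its diagonal ones. The following is a minimal sketch that checks this. It assumes a hypothetical 3 × 3 layout, since the grid dimensions and the sink placement used in the simulations are not stated in this section; the function name is illustrative only.

```python
import math

def grid_neighbors(rows, cols, spacing=100.0,
                   comm_range=0.99 * 100.0 * math.sqrt(2)):
    """Return {node: [neighbors]} for nodes placed on a rows x cols grid.

    Nodes are identified by (row, col). Two nodes are neighbors when their
    Euclidean distance does not exceed comm_range.
    """
    positions = {(r, c): (c * spacing, r * spacing)
                 for r in range(rows) for c in range(cols)}
    neighbors = {}
    for a, (xa, ya) in positions.items():
        neighbors[a] = [b for b, (xb, yb) in positions.items()
                        if b != a and math.hypot(xa - xb, ya - yb) <= comm_range]
    return neighbors

if __name__ == "__main__":
    nb = grid_neighbors(3, 3)  # hypothetical 3 x 3 grid
    print(sorted(nb[(1, 1)]))  # centre node: the four axis-aligned neighbors
    print(sorted(nb[(0, 0)]))  # corner node: two neighbors
```

With this range, multi-hop routing towards the sink can only proceed along the rows and columns of the grid.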

[Figure 19: three panels, (a) N = 8, (b) N = 16 and (c) N = 24. Vertical axis: maximum transmit power ratio ||Bp^SW||∞ / ||Bp^Ind||∞. Horizontal axis: ticks from 0.4 to 2.2 (× 10⁻⁴). The legend includes "Global SW".]

Figure 19. Maximum transmit power ratios between Slepian-Wolf coding and independent encoding with different correlation coefficients and cluster sizes.

It can be deduced from the figure that the use of Slepian-Wolf coding provides a substantial decrease in the maximum transmit power of a single node in the network. With high correlation, the reduction in the total source rate achieved by Slepian-Wolf coding is considerable, which leads to large energy savings in the network. Naturally, the benefit of Slepian-Wolf coding becomes more pronounced with increasing correlation and with larger network and cluster sizes. For instance, global Slepian-Wolf coding with N = 24 and θ = 3.5 × 10⁻⁵ yields over 90 per cent decrease in the maximum transmit power compared to the corresponding case with independent encoding. This has a significant impact on the maximum energy consumption among the source nodes, which in turn prolongs the network lifetime.

7.4.2. The Trade-off between Total and Maximum Transmit Powers

The impact of the trade-off parameter ϕ on the energy efficiency with respect to the total and maximum transmit powers was investigated. The simulation scenario consisted of N = 8 source nodes that were deployed pseudo-randomly in the network as described in Section 7.1.1. The minimum distance between the nodes was guaranteed not to go below 50 units. The communication range of each sensor node was set to $d^{\text{max-min}}_{i_1 i_2} \times \sqrt{2} \times 1.2$. Slepian-Wolf coding of the sources was performed in the global fashion with a fixed correlation of θ = 8.0 × 10⁻⁵. The trade-off between the total and maximum transmit powers was evaluated by solving the problem in (6.1) with varying trade-off parameter values and then comparing the resulting total and maximum powers to the corresponding powers achieved by solving the problem in (5.1). The solutions were found with the CVX software. The total power ratio (TPR_MMP_TPM) and the maximum power ratio (MPR_MMP_TPM) were defined as

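\[
\mathrm{TPR}_{\mathrm{MMP\_TPM}} = \frac{\text{Total transmit power with MMP}}{\text{Total transmit power with TPM}} = \frac{\sum_{l \in L} p_l^{\mathrm{MMP}}}{\sum_{l \in L} p_l^{\mathrm{TPM}}} \tag{7.7}
\]

\[
\mathrm{MPR}_{\mathrm{MMP\_TPM}} = \frac{\text{Maximum transmit power with MMP}}{\text{Maximum transmit power with TPM}} = \frac{\|\mathbf{B}\mathbf{p}^{\mathrm{MMP}}\|_\infty}{\|\mathbf{B}\mathbf{p}^{\mathrm{TPM}}\|_\infty}, \tag{7.8}
\]

where MMP refers to the min-max transmit power problem and TPM to the total transmit power minimization problem.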
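The two ratios in (7.7) and (7.8) can be evaluated directly from the link power vectors. The sketch below assumes that B is a node-by-link 0/1 matrix whose row i marks the outgoing links of node i, so that Bp collects the total transmit power of each node; this reading of B, the function names and the toy numbers are assumptions made for illustration and are not taken from the simulations.

```python
def node_powers(B, p):
    """Per-node total transmit power, i.e. the matrix-vector product B p."""
    return [sum(b * x for b, x in zip(row, p)) for row in B]

def power_ratios(B, p_mmp, p_tpm):
    """Return (TPR, MPR) as defined in (7.7) and (7.8)."""
    tpr = sum(p_mmp) / sum(p_tpm)
    mpr = (max(abs(v) for v in node_powers(B, p_mmp))
           / max(abs(v) for v in node_powers(B, p_tpm)))
    return tpr, mpr

if __name__ == "__main__":
    # Hypothetical toy network: node 1 owns links 1 and 2, node 2 owns link 3.
    B = [[1, 1, 0],
         [0, 0, 1]]
    p_tpm = [4.0, 4.0, 1.0]  # total 9, node powers [8, 1]
    p_mmp = [3.0, 3.0, 4.0]  # total 10, node powers [6, 4]
    tpr, mpr = power_ratios(B, p_mmp, p_tpm)
    print(f"TPR = {tpr:.3f}, MPR = {mpr:.3f}")
```

In this toy case the min-max solution spends more power in total (TPR above one) in exchange for a lower peak node power (MPR below one), which is the kind of trade-off the two ratios are meant to expose.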

Figure 20 illustrates the trade-off between MPR_MMP_TPM and TPR_MMP_TPM for the trade-off parameter values of ϕ = [1.0 × 10⁻⁴, 1.0 × 10⁻³, 1.0 × 10⁻², 1.0 × 10⁻¹, 0.2, 0.4, 0.6, 0.8, 0.95, 0.99, 0.999, 0.9999]. The ratios were averaged over 100 pseudo-random network drops. When the trade-off value ϕ in (6.1) is small, more emphasis is given to the minimization of the maximum transmit power of a single node in the network and less to the total transmit power, and vice versa.

It can be seen in the figure that with small trade-off parameter values, the decrease in the maximum transmit power is proportionally much larger than the corresponding increase in the total transmit power. For instance, by solving the min-max transmit power problem with ϕ = 0.4, an MPR_MMP_TPM of about 0.81 and a TPR_MMP_TPM of 1.02 were achieved. This means that with an increase of about 2 per cent in the total transmit power, a decrease of almost 19 per cent was achieved in the maximum transmit power.

[Figure 20: ratio versus the trade-off parameter ϕ (from 0 to 1), with the curves MPR = ||Bp^MMP||∞ / ||Bp^TPM||∞ and TPR = Σp^MMP / Σp^TPM.]

Figure 20. The ratios for the total and maximum transmit powers for N = 8 with different trade-off parameter values.

7.4.3. Convergence of the Distributed Algorithm

The convergence properties of the distributed algorithm summarized in Algorithm 4 were studied under static and time-varying Rayleigh fading channel conditions. The initial values for the optimization and Lagrange variables were set to zero. For the gradient updates, diminishing step sizes were used: α_f = 2.5/√t, α_λ = 0.3/√t, α_ω = 0.3/√t, α_τ = 1.1/√t, β_ζ = 0.7/√t and β_ν = 0.7/√t. The sources were encoded with global Slepian-Wolf coding with a fixed correlation of θ = 3.5 × 10⁻⁵. The trade-off parameter was kept fixed at ϕ = 0.8 throughout the simulations. The simulations for the convergence analysis were carried out with N = 8 source nodes deployed in the grid form in the network.
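The step-size rule above can be written down compactly. The sketch below uses the six constants from the text; the dictionary keys are illustrative labels, and the iteration index is assumed to start at t = 1, since the rule is undefined at t = 0.

```python
import math

# Constants c in the rule c / sqrt(t), as listed in the text.
STEP_CONSTANTS = {
    "alpha_f": 2.5,
    "alpha_lambda": 0.3,
    "alpha_omega": 0.3,
    "alpha_tau": 1.1,
    "beta_zeta": 0.7,
    "beta_nu": 0.7,
}

def step_sizes(t):
    """Return all diminishing step sizes c / sqrt(t) at iteration t >= 1."""
    if t < 1:
        raise ValueError("iteration index t is assumed to start at 1")
    root = math.sqrt(t)
    return {name: c / root for name, c in STEP_CONSTANTS.items()}

if __name__ == "__main__":
    for t in (1, 100, 5000):
        print(t, step_sizes(t))
```

A rule of this form keeps the step sizes non-summable while letting them shrink to zero, which is the usual requirement for subgradient-type updates to settle.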

The convergence analysis in static channel conditions was performed by setting the number of iterations to 5000 for each channel initialization. The number of channel initializations was set to 100. The feasibilities of the solutions were evaluated by calculating the amount of constraint violations according to (7.4) and by measuring the normalized duality gap between the solutions achieved with the CVX software and the proposed algorithm.
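The two violation measures appear in the legend of Fig. 21(b) as FCLviol = ||(A*f − r)/r||1 and CCviol = ||(f − c)/f||1, if (c − f) < 0. The following is a minimal sketch of how they could be evaluated. It assumes that A is the node-link incidence matrix, f the link flow vector, r the source rate vector and c the link capacity vector, that the division is element-wise, and that CCviol sums only over links whose flow exceeds the capacity. The exact form of (7.4) is not reproduced in this section, so these readings and the function names are assumptions.

```python
def flow_conservation_violation(A, f, r):
    """FCLviol: 1-norm of the relative mismatch (A f - r) / r, element-wise.

    Assumes every entry of r is non-zero.
    """
    total = 0.0
    for row, r_i in zip(A, r):
        net_flow = sum(a * x for a, x in zip(row, f))
        total += abs((net_flow - r_i) / r_i)
    return total

def capacity_violation(f, c):
    """CCviol: 1-norm of (f - c) / f over links where the flow exceeds capacity."""
    return sum(abs((f_l - c_l) / f_l)
               for f_l, c_l in zip(f, c) if c_l - f_l < 0)

if __name__ == "__main__":
    # Hypothetical two-node chain: node 1 sends on link 1,
    # node 2 receives link 1 and sends on link 2 towards the sink.
    A = [[1, 0],
         [-1, 1]]
    r = [2.0, 1.0]
    print(flow_conservation_violation(A, [2.0, 3.0], r))  # feasible flows
    print(flow_conservation_violation(A, [2.2, 3.0], r))  # slightly infeasible
    print(capacity_violation([2.5, 3.0], [2.0, 4.0]))      # link 1 overloaded
```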

The normalized duality gaps achieved for different channel initializations are presented in Fig. 21(a) and the respective constraint violations in Fig. 21(b). In Fig. 22, the normalized differences between the objective function values achieved in centralized and distributed fashions are illustrated to show how closely the distributed algorithm attains the objective with the given constraint violations. For a random channel initialization, the duality gap was measured as a function of the iteration instance, which is shown in Fig. 23 on a semi-logarithmic scale.

[Figure 21: two panels with the horizontal axis running from 10 to 100 (channel initializations). (a) Duality gaps: normalized duality gap at the last iteration. (b) Constraint violation at the last iteration, with the legend entries FCLviol = ||(A*f − r)/r||1 and CCviol = ||(f − c)/f||1, if (c − f) < 0.]

Figure 21. Feasibilities of the solutions for N = 8 with 100 channel initializations and with ϕ = 0.8.

[Figure 22: ratio between the objective functions (distr. / centr.), with the horizontal axis running from 10 to 100 (channel initializations).]

Figure 22. The ratio between the objective functions achieved with centralized and distributed fashions with ϕ = 0.8.

[Figure 23: normalized duality gap (logarithmic axis, 10⁻² to 10²) versus iteration instance (0 to 5000).]

Figure 23. Normalized duality gap as a function of iteration instances.

According to Fig. 21(a) and Fig. 21(b), the proposed algorithm converges sufficiently close to the optimal solutions with relatively small constraint violations in a given deterministic network topology with static channel conditions. In Fig. 21, the normalized duality gaps are on the order of a few per cent and the normalized constraint violations are below one per cent on average. Hence, the algorithm can be said to work properly under the simulation conditions and to provide near-optimal and feasible solutions for the optimization problem. However, there is a slight difference between the objective function values achieved with the distributed algorithm and the primal optimal values, which can be seen in Fig. 22. In most of the cases, the distributed algorithm yielded objective values that are greater than the corresponding primal optimal ones. Hence, the algorithm experiences a small performance loss in terms of minimizing the objective function.

In addition to the static channel conditions, the proposed algorithm was tested under time-varying Rayleigh fading channel conditions. The functionality under time-varying channel conditions was evaluated with the same indicators for the solution feasibilities as in the static case. The tracking ability of the algorithm was studied by using the same normalized coherence times as for the TPM in Section 7.3.2. The tracking was studied during 10000 iterations for each normalized coherence time by averaging over 100 channel initializations. The step sizes were set to diminish after each iteration instance as usual until t = 5000, after which each step size remained fixed at its current value. This modification improves the tracking of the algorithm, especially with relatively small coherence times, since the gradient updates then result in more significant changes in the variables.
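The modified schedule can be sketched as a step size that follows c/√t up to the freeze point and then stays at c/√5000. The function name and the default argument are illustrative, and the iteration index is assumed to start at t = 1.

```python
import math

def frozen_step(c, t, t_freeze=5000):
    """Step size c / sqrt(t) for t <= t_freeze, held at c / sqrt(t_freeze) afterwards."""
    if t < 1:
        raise ValueError("iteration index t is assumed to start at 1")
    return c / math.sqrt(min(t, t_freeze))

if __name__ == "__main__":
    # alpha_f uses c = 2.5 in the text; print its value around the freeze point.
    for t in (1, 4999, 5000, 5001, 10000):
        print(t, frozen_step(2.5, t))
```

For α_f this keeps the step at about 0.0354 for the second half of the 10000 iterations, so the updates retain enough gain to follow the changing channel instead of shrinking further.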

The normalized duality gaps achieved at the last iteration with different normalized coherence times are presented in Fig. 24(a) and the respective constraint violations in Fig. 24(b). Figure 24 shows that the algorithm tracks the solution with an acceptable level of the duality gap down to a coherence time of Tc,norm = 0.045. This is much worse than for the TPM algorithm, as can be seen in Fig. 18(a). However, the constraint violations remain at an acceptable level also with smaller coherence times. With coherence times smaller than Tc,norm = 0.0075, the algorithm can be considered to have led to infeasible solutions, with a constraint violation of over 5 per cent and a normalized duality gap of 0.29.

[Figure 24: two panels plotted against the normalized coherence time Tc,norm. (a) Duality gaps: normalized duality gap at the last iteration. (b) Constraint violation at the last iteration instance, with the legend entries FCLviol = ||(A*f − r)/r||1 and CCviol = ||(f − c)/f||1, if (c − f) < 0.]

Figure 24. Feasibilities of the solutions for N = 8 with different normalized coherence times.

8. DISCUSSION

The purpose of this thesis has been to examine distributed transmit power minimization in single-sink data gathering wireless sensor networks by utilizing cross-layer optimization. A lossless distributed source coding method, Slepian-Wolf (SW) coding, was applied in order to remove all the redundancy in correlated data. SW coding was performed in global or localized fashion, depending on the degree of the correlation of data available in the network. The Lagrange dual decomposition technique was applied to derive two distributed algorithms which jointly optimize the power allocation and the routing. The algorithms used different optimization criteria: the total transmit power minimization and the minimization of the maximum transmit power of a sensor node in the network.

The simulation results showed the benefits of using Slepian-Wolf coding in the correlated data gathering scenario in terms of energy efficiency in the particular simulation scenarios. In the conventional data gathering scenario with an abstract network model, substantial savings in the data transportation costs were achieved. The model assumed lossless data transportation over point-to-point links by routing the data through single paths and by ignoring networking issues such as transmit power, channel noise and link capacity. The scenario was extended to the WSN by including capacity-constrained AWGN channels with Rayleigh fading and by allowing multi-path routing for the data delivery. Due to the joint optimization of the power allocation and routing, significant improvements were attained in terms of energy efficiency. It was shown that when the amount of correlation between the sources is high, sensor nodes can operate with transmission powers that are fractions of the corresponding powers required with independent encoding. The drops in the transmission powers for both optimization frameworks were remarkable in the sense that they yield better energy efficiency and, therefore, a prolonged network lifetime.

The proposed distributed algorithms were shown to converge close to the optimal solutions in static channel conditions in the particular simulation scenarios. The feasibilities of the solutions were shown to remain at a level at which the algorithms can be implemented in practical WSNs while guaranteeing proper functionality. Additionally, the algorithms were shown to be capable of tracking the solution in time-varying slow-fading Rayleigh channels. This is an important feature because WSNs are often exposed to changing channel conditions due to their inherently fluctuating operation environment. An initial assumption is that the topology of the WSN remains fixed, and therefore, the fluctuations in the amplitude of the received signal are due to changes in the surrounding operation environment, e.g., due to the movement of obstacles in the vicinity of a node.

The definition of the network lifetime was assumed to be different for the two optimization frameworks. The lifetime depends highly on the nature of the application, since the relevance and importance of the data associated with each node can be unequal. In this connection, the choice of the trade-off parameter value in the min-max transmit power problem plays a big role from the system design point of view. It was shown that with small trade-off parameter values, the network lifetime can be increased considerably while also keeping the total transmit power at a reasonable level. This leads to a situation where battery recharging or node replacement, if access to the nodes permits it at all, is done most frequently for the nodes that are located near the sink. The nodes lying on the extremity of the network can operate for longer times without battery replacement. This can be an advantage over a framework with complete emphasis on the network lifetime maximization, where unnecessary usage of the transmit powers is totally neglected. Such a framework implicitly treats all sensor nodes as equally important in terms of guaranteeing sufficient data delivery to the sink node. If the depletion of the battery of any node is fatal for adequate functionality of the data gathering, that framework is relevant to use, since it inevitably gives the best solution in terms of maximizing the network lifetime.

During the making of the thesis, numerous assumptions and simplifications were made to simplify the system model and to make the analysis easier. Also, numerous ideas arose which were left unresolved in this thesis. Many of these issues are relevant and interesting topics for further study. Next, a few of these items are discussed.

Throughout the thesis, a major assumption was made about the data originating from the source nodes to motivate the use of Slepian-Wolf coding in the single-sink data gathering scenario: the degree of correlation is large enough to achieve a reasonable amount of rate reduction in the network. What would be the case if the spatially correlated data does not provide such notable improvements due to poor correlation? Since SW coding inherently involves increased signal processing costs and complexity in the network, SW coding with low correlation is not useful from the viewpoint of overall energy efficiency. In this case, it can be more relevant to encode the sources independently: the network has to carry a somewhat larger amount of data, but the savings in signal processing costs compensate for the loss in energy-efficient transmission. Another inherent assumption made in the thesis was that the number of source nodes in the network is also relatively large, contributing significantly to the rate reduction. It was shown that with decreasing network size, the gain provided by SW coding also decreases. If a node or a group of nodes is not vital for the objective of data gathering, its data can be ignored when SW coding does not greatly outperform independent encoding. At some point, the use of SW coding is not profitable in terms of overall costs when the number of nodes is small, regardless of there being a reasonable amount of correlation between the sources.

Thus, an interesting question inevitably arises about the trade-off between Slepian-Wolf coding and independent encoding: would it be possible to determine a particular indicator for the network which adaptively switches between SW coding and independent encoding of the sources? Intuitively, the state of the indicator would be based on the amount of data correlation between the sources and on the number of source nodes in the rate allocation process. More precisely, the indicator has to estimate the achievable benefits of the use of SW coding in terms of overall costs. The costs include the encoding costs and all the extra signalling costs that SW coding causes when acquiring the correlation structure of the data. In order to have a valid indicator value, the adaptive encoding method selection requires training periods in the network to obtain information about the current correlation structure. This by itself introduces extra signalling to the network.

Localized Slepian-Wolf coding inherently involves an optimization problem of its own, related to the optimal network clustering that was discussed briefly in Section 3.3.3. The simulation results regarding the localized SW coding scenario were obtained by using a very simple clustering method for the source nodes. Introducing more sophisticated network clustering methods with more random network topologies is a relevant topic for further study. By casting the network partitioning as an optimization problem that considers the covariance properties between the nodes, the rate reduction can be even more significant for localized SW coding.

The estimation of data correlation also introduces a major challenge for the network, since the proper encoding of sources with Slepian-Wolf coding has a direct influence on the QoS. The estimation needs a training period that has to be performed in the initialization phase after the network is deployed. High accuracy of the estimation is vital for achieving high QoS after the joint decoding of data has been performed at the sink node. Another issue to take into consideration is whether the continuous-space correlation model used is valid for the scenario at hand. The model also assumed that the correlation does not change in time after the data observations have been performed. What would be the case if the correlation depends on data that varies with the constantly changing physical phenomena being observed? Then, the estimated correlation will be valid only for a fraction of the time and estimation periods will be required more frequently.

In Chapter 4, numerous assumptions were made to simplify the system model related to the WSN. For instance, mutual interference between the wireless links was neglected by assuming an orthogonal multiple access scheme, FDMA, in the system. In practice, there is always interference from the wireless data transmissions of one group of links to another group of links. Hence, by including the interference in the link capacities, the system model would become closer to the practical situation. From the viewpoint of distributed algorithms, the local areas where overhead signalling is required will become larger due to the interference sets in the network. The existence of the interference sets will also introduce essential networking concepts related to interference-based or contention-based channel access methods. Then, in addition to the transmit power of the link, the link capacities will also be functions of, e.g., random access and transmission persistence probabilities of a node, the signal-to-interference-plus-noise ratio, and the time slots allocated to each node. Further, if the data transmission is not considered to be always error-free, the packet loss probability will have its own impact on the link capacities and the system performance.

In the system model, the energy efficiency was investigated by considering only the transmit powers of sensor nodes. A step closer to a more practical situation is to include also the energy consumption of data reception, which can typically be of considerable order. This would influence the optimal routing of the flows, since the cost of transmitting several small data flows via multiple intermediate nodes can turn out to be higher than the respective data transmission with fewer flows. Another issue affecting the energy consumption of the nodes is how frequently the sensor nodes transmit their data. In the model, the time-related issues of data transmission were ignored by assuming that each source node is continuously transmitting its data. Due to the nature of the measuring and monitoring applications for which WSNs are often intended, the data transmissions occur in bursts. This would reduce the energy consumption of each node in proportion to the fraction of time it spends transmitting. Nevertheless, it is worth noting that between the data bursts, i.e., during idle periods, the sensor nodes also consume a small amount of energy which cannot be ignored.


Throughout the thesis, the distributed algorithms inherently assumed that each node has the same amount of initial battery energy to support the data traffic. A relevant extension is to include the state of each battery in the optimization framework. This would influence the optimal routing of data as well as the power allocation, because the data has to be delivered across alternative paths to the sink node so as to avoid over-stressing the nodes with low battery. This is especially important when striving to lengthen the network lifetime by keeping all the nodes alive for the data gathering purpose. The knowledge about the battery states would introduce an additional inter-node information exchange similar to that needed for knowing the maximum transmit power among the sensor nodes in the network.

In terms of WSN signalling complexity, one challenging issue for the distributed algorithms is the requirement of CSI at each iteration instance. To acquire CSI, a feedback channel between the transmitter and receiver of the nodes is needed. With relatively slow fading channels, the CSI updates are not necessarily required at each iteration instance, which reduces the complexity. On the contrary, if the channels encounter fast fading, the update of CSI is truly required at each iteration instance. This demands efficient CSI signalling strategies for the sensor nodes operating in the WSN. In addition, the acquisition of CSI consumes the valuable energy of the sensor nodes, reducing the overall efficiency of data transmissions.

The work assumed a system model with fixed networks where unexpected changes in the topology were neglected. In practical scenarios, changes in the network topology due to node failures will surely take place and may cause serious connectivity problems for a node or even a group of nodes. This will influence the functionality of the target application depending on the importance of the data associated with the failed nodes. In addition, node mobility will affect the correlation structure of the data, and thus, it introduces a requirement for performing a training period to obtain the most up-to-date correlation structure in the network.

An interesting issue left for further study is how to address the joint optimization over all three layers, that is, the joint optimization of the source coding, the power allocation and the routing. Is it possible to add the Slepian-Wolf rate constraints to the joint optimization problem and derive a distributed algorithm for finding the solution? If the data covariance matrix is involved in the optimization problem together with information about the channel conditions, the node ordering can be generated in a more efficient way. The ordering would not be based just on the distances on the SPT but would depend also on the current channel states indicating the achievable channel capacities. The major problem in adding the source coding to the optimization problem is that the problem becomes hard to solve, since the number of rate constraints would increase exponentially with the number of sources. A further open question is whether the resulting optimization problem would remain convex.
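The exponential growth can be made concrete: the Slepian-Wolf rate region has one inequality per non-empty subset of the N sources, i.e. 2^N − 1 constraints. The small sketch below enumerates the subsets for a tiny network and counts them for the network sizes used in the simulations; the function names are illustrative.

```python
from itertools import combinations

def sw_constraint_subsets(n_sources):
    """Yield every non-empty subset of sources 1..n_sources, one per rate constraint."""
    nodes = range(1, n_sources + 1)
    for size in range(1, n_sources + 1):
        yield from combinations(nodes, size)

def sw_constraint_count(n_sources):
    """Number of Slepian-Wolf rate constraints, 2^N - 1."""
    return 2 ** n_sources - 1

if __name__ == "__main__":
    print(list(sw_constraint_subsets(3)))
    for n in (8, 16, 24):
        print(n, sw_constraint_count(n))
```

Already for N = 24 the count exceeds 16 million, which illustrates why adding the full rate region to the joint problem is hard in practice.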

The proposed distributed algorithms also raised some notable challenges and issues for further development. The tracking of the proposed distributed algorithms was evaluated by updating the variables with the gradient projection method without any modifications compared to the static channel case. However, the tracking ability could be improved by considering more sophisticated methods for the updates. One option could be to use dynamic scaling matrices that take the time correlation of the channel into account while updating the variables at each iteration instance. This kind of approach is found in the work of Cheng et al. [19].


The convergence of the algorithms was investigated in deterministic and symmetric network topologies where they were shown to converge. In more random topologies, the gradient projection method can introduce scaling problems to the convergence. This is due to the fact that the method is very sensitive to the order of magnitude of the variables, which makes the selection of a proper step size crucial. One approach would be to adjust the step size based on the order of magnitude of the system variables or even to calculate the optimal one for each network state. Naturally, this would increase the complexity of the gradient projection method.

The distributed algorithm for the min-max transmit power problem was shown not to be fully distributed due to the necessity of flooding the epigraph variable τ and the Lagrange variables ζi, i ∈ S, in the network. To avoid the extra signalling, an approximate method could be used as an alternative. Inspired by the approach in [9], the L∞-norm part of the objective function could be replaced with the approximation given by

∑i∈S

(Bp)ξ

ξ−1, ξ →∞, when no extra signalling will be needed in the network.

Of course, this approach will introduce a penalty for achieving the optimal solutions.


9. SUMMARY

The purpose of this thesis has been to find a way to achieve energy-efficient communications in a single-sink correlated data gathering wireless sensor network scenario. This was done by first removing the redundancy in the data by performing global or localized Slepian-Wolf coding of the correlated data and then by optimizing the data transmission to the destination in the network. The data transmissions were assumed to occur across capacity-constrained links that encounter no mutual interference. Multi-path routing was used to model the data traffic across additive white Gaussian noise channels under Rayleigh fading.

The data transmissions were optimized through the joint optimization of the power allocation and the routing. The optimization problems were stated as convex minimization problems with a decomposable structure. Two distributed algorithms with different optimization frameworks were proposed, of which the first considered the total transmit power minimization and the second the minimization of the maximum transmit power. The algorithms were derived by applying the dual decomposition technique, which relaxes the problem and allows it to be solved by considering the associated protocol layers separately. The algorithm for the total transmit power minimization was shown to require only the exchange of scalar values within a neighborhood of each source node. The framework for the minimization of the maximum transmit power inherently needs some extra overhead signalling across the network and cannot be applied in a fully distributed fashion. Nevertheless, the distributed algorithms were developed as competitive alternatives to the centralized versions for finding the solutions distributively. Typically, this is a major prerequisite for wireless sensor networks.

Numerical results were generated in the Matlab environment. They showed how beneficial it is to use Slepian-Wolf coding in order to achieve energy-efficient communications in data gathering scenarios. According to the results, a substantial amount of energy can be saved when Slepian-Wolf coding is employed in correlated data gathering scenarios. The savings were seen both in the total transmit power and in the maximum transmit power, which act as alternative optimization objectives depending on the nature of the data gathering application. In order to attain such benefits, the amount of correlation between data readings has to be adequate, since the efficiency of Slepian-Wolf coding relies heavily on the correlation structure of the data. The proposed distributed algorithms were evaluated and shown to converge close to the optimal solutions under static channel conditions with deterministic network topologies. In addition, the algorithms were shown to be capable of tracking the solution under slow-fading Rayleigh channels, which is important when intending to apply them in practical wireless sensor network environments.

The benefits of cross-layer design emerged clearly from this work. The distributed source coding combined with the joint optimization over the power allocation and the routing yielded results that show great potential for applying the approaches to practical designs. When assessing future trends, cross-layer design will be a key technique and it will be increasingly applied in future wireless sensor networks. Hence, the importance of considering it as an alternative in wireless network design cannot be emphasized too much.


10. REFERENCES

[1] Sridhar P., Madni A. & Jamshidi M. (2006) Hierarchical data aggregation in spa-tially correlated distributed sensor networks. In: World Automation Congress,2006. WAC ’06, Budapest, Hungary, pp. 1–6.

[2] Xiong Z., Liveris A. & Cheng S. (2004) Distributed source coding for sensornetworks. IEEE Signal Processing Magazine 21, pp. 80–94.

[3] Ramamoorthy A. (2007) Minimum cost distributed source coding over a network.In: 2007 IEEE International Symposium on Information Theory, (ISIT’07), Nice,France, pp. 1761 –1765.

[4] Yuen K., Liang B. & Baochun L. (2008) A distributed framework for correlateddata gathering in sensor networks. IEEE Transactions on Vehicular Technology57, pp. 578–593.

[5] Zhu J., Chen S., Bensaou B. & Hung K.L. (2007) Tradeoff between lifetime andrate allocation in wireless sensor networks: a cross layer approach. In: 2007 26thIEEE International Conference on Computer Communications, (INFOCOM’07),Anchorage, Alaska, USA, pp. 267–275.

[6] Johansson B., Soldati P. & Johansson M. (2006) Mathematical decompositiontechniques for distributed cross-layer optimization of data networks. IEEE Jour-nal on Selected Areas in Communications 24, pp. 1535–1547.

[7] Li C., Zou J., Xiong H. & Zhang Y. (2009) Joint coding / routing optimizationfor correlated sources in wireless visual sensor networks. In: 2009 IEEE GlobalTelecommunications Conference, (GLOBECOM’09), Honolulu, Havaii, USA,pp. 1–8.

[8] Wang P., Li C. & Zheng J. (2007) Distributed data aggregation using clusteredSlepian-Wolf coding in wireless sensor networks. In: 2007 IEEE InternationalConference on Communications, (ICC’07), Glasgow, Scotland, pp. 3616–3622.

[9] He S., Chen J., Yau D. & Sun Y. (2010) Cross-layer optimization of correlateddata gathering in wireless sensor networks. In: 2010 7th Annual IEEE Commu-nications Society Conference on Sensor Mesh and Ad Hoc Communications andNetworks, (SECON’10), Boston, Massachusetts, USA, pp. 1–9.

[10] Cristescu R., Beferull-Lozano B. & Vetterli M. (2005) Networked Slepian-Wolf:Theory, algorithms, and scaling laws. IEEE Transactions on Information Theory51, pp. 4057–4073.

[11] Cover T. & Thomas J. (2006) Elements of Information Theory. John Wiley andSons Inc., USA, 2nd ed., 776 p.

[12] Slepian D. & Wolf J. (1973) Noiseless coding of correlated information sources.IEEE Transactions on Information Theory 19, pp. 471–480.

91

[13] Roumy A. & Gesbert D. (2007) Optimal matching in wireless sensor networks.IEEE Journal of Selected Topics in Signal Processing 1, pp. 725–735.

[14] Cristescu R., Beferull-Lozano B. & Vetterli M. (2004) On network correlated datagathering. In: 2004 Twenty-third Annual Joint Conference of the IEEE Computerand Communications Societies, (INFOCOM’04), Hong Kong, China, pp. 2571–2582.

[15] Palomar D. & Chiang M. (2006) A tutorial on decomposition methods for net-work utility maximization. IEEE Journal on Selected Areas in Communications24, pp. 1439–1451.

[16] Palomar D. & Chiang M. (2007) Alternative distributed algorithms for networkutility maximization: framework and applications. IEEE Transactions on Auto-matic Control 52, pp. 2254–2269.

[17] Xiao L., Johansson M. & Boyd S. (2004) Simultaneous routing and resourceallocation via dual decomposition. IEEE Transactions on Communications 52,pp. 1136–1144.

[18] Yuan J. & Yu W. (2008) Joint source coding, routing and power allocation inwireless sensor networks. IEEE Transactions on Communications 56, pp. 886–896.

[19] Cheng Y. & Lau V. (2010) Distributive power control algorithm for multicarrierinterference network over time-varying fading channels - tracking performanceanalysis and optimization. IEEE Transactions on Signal Processing 58, pp. 4750–4760.

[20] Chen J., Lau V. & Cheng Y. (2011) Distributive network utility maximizationover time-varying fading channels. IEEE Transactions on Signal Processing 59,pp. 2395–2404.

[21] Cressie N. (1991) Statistics for Spatial Data. John Wiley and Sons Inc, New York,900 p.

[22] Larson H. & Shubert B. (1979) Probabilistic Models in Engineering Sciences:Random Variables and Stochastic Processes, Vol. 1. John Wiley and Sons Inc,556 p.

[23] Wasserman L. (2004) All of Statistics: A Concise Course in Statistical Inference.Springer Verlag, 1st ed., 461 p.

[24] Larson H. & Shubert B. (1979) Probabilistic Models in Engineering Sciences:Random Noise, Signals, and Dynamic Systems. John Wiley and Sons, 750 p.

[25] Cheung N.M., Wang H. & Ortega A. (2005) Correlation estimation for distributedsource coding under information exchange constraints. In: 2005 IEEE Interna-tional Conference on Image Processing, (ICIP’05), Genoa, Italy, pp. 682–685.

92

[26] Cheung N.M., Wang H. & Ortega A. (2008) Sampling-based correlation estima-tion for distributed source coding under rate and complexity constraints. IEEETransactions on Image Processing 17, pp. 2122–2137.

[27] Stankovic V., Liveris A.D., Xiong Z. & Georghiades C.N. (2004) Design ofSlepian-Wolf codes by channel code partitioning. In: 2004 Data CompressionConference, (DCC’04), Snowbird, Utah, USA, pp. 302–311.

[28] Zhang C., Wang B., Fang S. & Li Z. (2008) Clustering algorithms for wirelesssensor networks using spatial data correlation. In: 2008 International Conferenceon Information and Automation, (ICIA’08), Hunan, China, pp. 53–58.

[29] Glisic S. (2006) Advanced Wireless Networks: 4G Technologies. John Wiley andSons Ltd, 867 p.

[30] Dijkstra E.W. (1959) A note on two problems in connection with graphs. Nu-merische Mathematik 1, pp. 269–271.

[31] Cormen T., Leiserson C., Rivest R. & Stein C. (2001) Introduction to Algorithms.MIT Press, 2nd ed., 1056 p.

[32] Vuran M.C., Akan O.B. & Akyildiz I.F. (2004) Spatio-temporal correlation: the-ory and applications for wireless sensor networks. Computer Networks 45, pp.245–259.

[33] Berger J.O., Oliveira V.D. & Sanso B. (2000) Objective Bayesian analysis ofspatially correlated data. Journal of the American Statistical Association 96, pp.1361–1374.

[34] Banner R. & Orda A. (2007) Multipath routing algorithms for congestion mini-mization. IEEE Transactions on Networking 15, pp. 413–424.

[35] Ganesan D., Govindan R., Shenker S. & Estrin D. (2001) Highly-resilient,energy-efficient multipath routing in wireless sensor networks. ACM SIGMO-BILE Mobile Computing and Communications Review 5, pp. 11–25.

[36] Saunders S. & Aragón-Zavala A. (2007) Antennas and Propagation for WirelessCommunication Systems. John Wiley and Sons, 2nd ed., 546 p.

[37] Li J., Bose A. & Zhao Y. (2005) Rayleigh flat fading channels’ capacity. In:2005 3rd Annual Communication Networks and Services Research Conference,(CNSR’05), Halifax, Nova Scotia, Canada, pp. 214–217.

[38] Boyd S. & Vandenberghe L. (2004) Convex Optimization. Cambridge UniversityPress, New York, USA, 730 p.

[39] Hartley R. & Schaffalitzky F. (2004) L∞-minimization in geometric reconstruc-tion problems. In: 2004 IEEE Computer Society Conference on Computer Visionand Pattern Recognition, (CVPR’04), Washington DC, USA, pp. I–504 – I–509Vol.1.

93

[40] Kirk J. (2007), Matlab function: Dijkstra’s shortest path algorithm, release 1.3.http://www.mathworks.com/matlabcentral/fileexchange/12850-dijkstras-shortest-path-algorithm/.

[41] Grant M. & Boyd S. (2011), CVX: Matlab software for disciplined convex pro-gramming, version 1.21, build 808, April 2011. http://cvxr.com/cvx.

[42] Bertsekas D. (1999) Nonlinear Programming. Optimization and Neural Compu-tation Series, Athena Scientific, 2nd ed., 802 p.

[43] Tse D. & Viswanath P. (2005) Fundamentals of Wireless Communication. Cam-bridge University Press, 564 p.
