On Quantifying the Effects of Mobility on Data
Replication in Mobile Ad Hoc Networks
Yang Zhang (a), Guohong Cao (a), Bhaskar Krishnamachari (b), Tom La Porta (a)
(a) Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA
(b) Department of Electrical Engineering, University of Southern California, Los Angeles, CA
Abstract
In mobile ad hoc networks, nodes move freely and network partitions occur frequently. To mitigate this problem, data replication is commonly used to increase data availability and reduce the data access delay. However, most previous work assumed a particular mobility model and could not fully study the effects of mobility on data replication. In this paper, we quantify the effects of mobility on different data replication algorithms from various perspectives. The study is based on several metrics that go beyond the average access delay and data availability by also including the geographical distribution of these values. Through extensive experiments, we study the effects of four typical mobility models on data replication, and identify the most suitable data replication algorithms under various mobility models.
Keywords: Mobile ad hoc networks, Replication, Mobility model
1. Introduction
In mobile ad hoc networks (MANETs) [1], nodes move freely, so network partitions may occur, in which nodes in one partition cannot access data held by nodes in other partitions. To mitigate this problem, data replication can be used. By replicating the data at a number of nodes, a data request can be served by the closest node that holds a replica. Then, even if there is a network partition between the requesting node and the original data source, the request can still be served as long as it can reach a node with a replica. Further, since the request can be served within fewer hops, the data access delay is reduced.
Preprint submitted to Ad Hoc Networks, February 7, 2011
Data replication can increase data availability and reduce the data access delay, but at the cost of data storage. Since mobile nodes have only limited storage space, bandwidth, and power, it is impossible for one node to hold all the data. Therefore, it is important for mobile nodes to cooperate with each other to decide which node should hold which data replica. To increase data availability, a node may choose not to hold data that has already been replicated by its neighbors, so that its local storage can be used to hold additional data. However, this may increase the hop count for some data and thus the data access delay. The problem becomes more complex when mobility is considered, since mobility changes the location of the replicas and thereby affects both data availability and data access delay.
There have been some studies on data replication in MANETs [2, 3, 4]. These studies show that node mobility significantly affects the performance of data replication. However, most of these works assume a particular mobility model and only examine the effect of that model on the performance of their proposed algorithm. In other words, they do not provide general insights into the relationship between different mobility models and data replication algorithms. In this paper, we aim to study the effect of various mobility models on data replication and then identify the most suitable data replication algorithms under each of them. More specifically, the contributions of this paper are as follows:
1. We quantify the effect of mobility on data replication based on metrics such as data access delay and data availability. Besides these traditional metrics, we also look into the geographical distribution of access delay and data availability.
2. Our experimental results illustrate that different replication algorithms exhibit quite different degrees of node cooperation, and thus achieve different data access performance under different mobility models. Specifically, we provide an in-depth analysis and evaluation of the relationship between data replication and node mobility, and identify the reasons behind it.
3. We identify the most suitable data replication algorithms under various mobility models. These results can be used as guidelines for researchers and system developers to design and examine data replication algorithms when considering node mobility.
The remainder of this paper is organized as follows. In Section 2, we summarize related work in this area. In Section 3, we present the system model, performance metrics, and data replication algorithms that will be evaluated in this paper. Section 4 reports the evaluation results of how different data replication algorithms perform under various mobility models. Finally, Section 5 concludes the paper.
2. Related work
Data replication has been extensively studied in the Web environment [5, 6] and in distributed database systems [7], where the goal is to place replicas of web servers or databases among a number of possible locations so that the performance in terms of query delay or data availability is optimized. However, in all these conventional works, both web servers and database systems are assumed to be static, whereas our work targets a mobile ad hoc environment.
Recently, much research has been conducted to investigate the effect of mobility on network performance, such as the efficiency of routing protocols [8] and network partitioning [9]. For routing, the main objective is to find destinations and forward data with low message overhead, high data delivery ratio, and short delivery delay. For network partitioning, the dynamic changes of the size and shape of each partition are important issues. These studies are similar to our work in that they investigate the effect of mobility. However, they mainly focus on link stability and node distribution in the network; i.e., their studies are at the link and node level. Our work, in contrast, focuses on the effects of mobility on data replication.
Some existing works studied the effects of mobility on data availability and data dissemination speed in MANETs. In [10], the authors mathematically define some metrics that represent the effects on information diffusion in MANETs. Huang and Chen [11] studied how to replicate data when nodes follow a group mobility pattern. However, all these works aim at studying the network dynamics and the characteristics of node mobility. They cannot provide any deep and general insight into the internal relationship between mobility model and data access. The existing work that is most relevant to ours is [12], where Hara proposes metrics to evaluate the impact of mobility on data availability. However, all the metrics are limited to data availability. No specific data replication algorithm is analyzed and the data access performance is not considered. Therefore, it is not enough to fully examine the relationship between mobility model and data access. In this work, we study the effects of mobility on data replication in terms of both data access delay and data availability. We also identify the most suitable data replication algorithms under various mobility models.
3. Preliminary
In this section, we propose new metrics to quantify the effects of mobility on data replication and present four data replication algorithms that will be used in the evaluation.
3.1. System Model
We assume there are $m$ nodes in the network. The nodes are denoted by $N = \{N_1, N_2, \ldots, N_m\}$, where $N_k$ ($k = 1, \ldots, m$) is a node identifier. The communication range of each mobile node is represented by a circle with radius $R$. When two nodes move out of each other's communication range, the link between them fails; the link failure probability between $N_i$ and $N_j$ is denoted as $f_{ij}$. The link failure probability depends on the distance between the nodes and on their moving directions and velocities. For example, if two connected nodes are far apart and move in opposite directions, they are more likely to disconnect and the link failure probability between them is high. We assume every link is bidirectional, and thus $f_{ij}$ is equal to $f_{ji}$. The network can be partitioned due to the limited communication range and link failures.
There are $n$ different data items in the network. The set of data items is denoted by $D = \{d_1, d_2, \ldots, d_n\}$, where $d_k$ ($k = 1, \ldots, n$) is a data identifier. Each mobile node maintains some amount of data locally. For simplicity, we assume that data are not updated; techniques similar to those used in [13] and [14, 15] can be applied to extend the proposed scheme to handle data update or data consistency issues. These data items may be replicated to other nodes based on some data replication algorithm. Because of limited memory (or disk) size, each mobile node can only host $B$ ($B < n$) replicas, including its original data. When a mobile node $N_i$ needs to access a data item $d_j$, $N_i$ first searches its local memory. If $N_i$ cannot find a copy of $d_j$ in its local memory, it communicates with its reachable nodes (through one-hop or multi-hop links in its partition) to get $d_j$. If the requesting node cannot communicate with any of the nodes that have $d_j$, $d_j$ is considered to be not accessible to $N_i$.
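The access procedure described in Section 3.1 can be sketched as a breadth-first search over the current connectivity graph. This is a minimal illustration only: the function name `access_delay`, the `adjacency` and `replicas` dictionaries, and the toy topology are assumptions made for this example and do not come from the paper's simulator.

```python
from collections import deque


def access_delay(requester, item, replicas, adjacency):
    """Return the hop count from `requester` to the nearest node holding
    `item`, or None if no such node is reachable in the current partition.

    replicas:  dict node -> set of data items stored at that node (hypothetical layout)
    adjacency: dict node -> iterable of one-hop neighbors (hypothetical layout)
    """
    visited = {requester}
    queue = deque([(requester, 0)])
    while queue:
        node, hops = queue.popleft()
        # The local memory is checked first because the requester is
        # dequeued at hop count 0.
        if item in replicas.get(node, set()):
            return hops
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None  # the item is not accessible to the requester


# Toy topology: nodes 1-2-3 form a chain, node 4 is in its own partition.
adjacency = {1: [2], 2: [1, 3], 3: [2], 4: []}
replicas = {1: {"d1"}, 2: {"d2"}, 3: {"d3"}, 4: {"d4"}}

local_hit = access_delay(1, "d1", replicas, adjacency)   # served locally
remote_hit = access_delay(1, "d3", replicas, adjacency)  # served two hops away
miss = access_delay(1, "d4", replicas, adjacency)        # other partition
```

Because breadth-first search visits nodes in order of hop distance, the first holder found is also the nearest one, which matches the hop-count notion of delay used by the metrics in Section 3.2.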
3.2. Evaluation Metrics
Based on the system model, we define several metrics that represent the performance of different data replication algorithms.
3.2.1. Average Access Delay ($\mathcal{D}$)
This metric is defined as the average number of hops from the query node to the nearest node that has the data. Formally, if we use $t_{ij}$ to denote the access delay of the $j$th request of node $N_i$, the average access delay during the whole experiment can be expressed by the following equation:
$$\mathcal{D} = \frac{\sum_{i=1}^{m} \sum_{j=1}^{\mathcal{R}(i)} t_{ij}}{\sum_{i=1}^{m} \mathcal{R}(i)} \tag{1}$$
Here, $\mathcal{R}(i)$ is a function that returns the number of requests initiated by node $N_i$ during the experiment.
3.2.2. Average Availability ($\mathcal{A}$)
Average availability is the average probability that a query can be served successfully. Using a binary variable $s_{ij}$ to denote whether the $j$th request of node $N_i$ is satisfied, this metric can be formalized as
$$\mathcal{A} = \frac{\sum_{i=1}^{m} \sum_{j=1}^{\mathcal{R}(i)} s_{ij}}{\sum_{i=1}^{m} \mathcal{R}(i)} \tag{2}$$
where $s_{ij} = 1$ if the $j$th request of node $N_i$ is satisfied; otherwise, $s_{ij} = 0$.
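Equations (1) and (2) can be illustrated with a short sketch over a request log. The log layout (a dictionary mapping each node to a list of `(served, hops)` pairs) is hypothetical. The sketch also assumes that an unsatisfied request contributes $t_{ij} = 0$ to the numerator of Equation (1); the paper does not state this explicitly, so treat it as one possible reading.

```python
def average_metrics(log):
    """Compute (average access delay, average availability).

    log: dict node -> list of (served, hops) pairs, one pair per request.
         `hops` is ignored when `served` is False (assumed to count as 0).
    """
    total_requests = sum(len(requests) for requests in log.values())
    if total_requests == 0:
        raise ValueError("the log contains no requests")
    delay_sum = sum(hops for requests in log.values()
                    for served, hops in requests if served)
    served_count = sum(1 for requests in log.values()
                       for served, _ in requests if served)
    # Both metrics share the denominator: the total number of requests.
    return delay_sum / total_requests, served_count / total_requests


# Node 1 issues three requests (one fails); node 2 issues one request.
example_log = {
    1: [(True, 0), (True, 2), (False, None)],
    2: [(True, 1)],
}
avg_delay, avg_availability = average_metrics(example_log)
```

In this toy log, four requests are issued, three are served, and the served requests take 0, 2, and 1 hops, so both metrics evaluate to 3/4.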
3.2.3. Distribution of the Access Delay ($\mathcal{D}_h$)
The average access delay is not always a very informative metric, since it treats the following two cases equally: 1) every request has a similar access delay; and 2) some requests have a long access delay while others have a short one. Therefore, to study the performance of data replication algorithms, the distribution of the access delay is more informative than its average value, and it is heavily affected by the adopted mobility model and replication algorithm. We thus define the distribution of access delay as a new metric by the following equation:
$$\mathcal{D}_h = \sum_{i=1}^{m} \sum_{j=1}^{\mathcal{R}(i)} bel(t_{ij}, h), \quad (h = 0, t, 2t, \ldots) \tag{3}$$
where $t$ is the statistical interval of the access delay, and $bel(t_{ij}, h)$ is a function that indicates whether $t_{ij}$ belongs to the range $[h, h + t)$: $bel(t_{ij}, h) = 1$ if $h \le t_{ij} < h + t$; otherwise, $bel(t_{ij}, h) = 0$.
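Equation (3) is a histogram of the access delays with bin width $t$. The sketch below shows one way to compute it; the function name `delay_distribution` and the sample delays are made up for illustration, and the delays are assumed to be integer hop counts of satisfied requests.

```python
from collections import Counter


def delay_distribution(delays, t=1):
    """Count how many delays fall into each range [h, h + t), h = 0, t, 2t, ...

    delays: iterable of integer hop counts (one per satisfied request)
    t:      positive integer bin width (the statistical interval)
    Returns a dict mapping the lower bound h to the number of requests in the bin.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    counts = Counter()
    for delay in delays:
        lower_bound = (delay // t) * t  # the h for which bel(delay, h) = 1
        counts[lower_bound] += 1
    return dict(counts)


sample_delays = [0, 0, 1, 2, 2, 2, 5]
per_hop = delay_distribution(sample_delays, t=1)
per_two_hops = delay_distribution(sample_delays, t=2)
```

With `t=1` every hop count gets its own bin, which corresponds to the per-hop bars used later in the evaluation; a larger `t` merges neighboring hop counts.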
3.2.4. Geographical Distribution of the Access Delay ($\mathcal{D}_{\langle h_x, h_y \rangle}$)
Since different mobility models may lead to different deployment patterns of mobile nodes, we study the geographical distribution of the data access delay. We divide the entire network area into $h \times h$ small subareas and compare the results in different subareas. The geographical distribution of access delay at subarea $\langle h_x, h_y \rangle$ is expressed by the following equation:
$$\mathcal{D}_{\langle h_x, h_y \rangle} = \sum_{i=1}^{m} \sum_{j=1}^{\mathcal{R}(i)} \mathcal{L}(t_{ij}, \langle h_x, h_y \rangle) \tag{4}$$
where $\mathcal{L}(t_{ij}, \langle h_x, h_y \rangle)$ is a function that indicates whether the request takes place in the subarea $\langle h_x, h_y \rangle$: $\mathcal{L}(t_{ij}, \langle h_x, h_y \rangle) = t_{ij}$ if the $j$th request of node $N_i$ is initiated in the subarea $\langle h_x, h_y \rangle$; otherwise, $\mathcal{L}(t_{ij}, \langle h_x, h_y \rangle) = 0$.
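3.2.5. Geographical Distribution of Availability ($\mathcal{A}_{\langle h_x, h_y \rangle}$)
Similar to the definition of the geographical distribution of access delay, the geographical distribution of availability is represented by the following equation:
$$\mathcal{A}_{\langle h_x, h_y \rangle} = \sum_{i=1}^{m} \sum_{j=1}^{\mathcal{R}(i)} \mathcal{L}(s_{ij}, \langle h_x, h_y \rangle) \tag{5}$$
where $\mathcal{L}(s_{ij}, \langle h_x, h_y \rangle)$ is a function that indicates whether the request takes place in the subarea $\langle h_x, h_y \rangle$: $\mathcal{L}(s_{ij}, \langle h_x, h_y \rangle) = s_{ij}$ if the $j$th request of node $N_i$ is initiated in the subarea $\langle h_x, h_y \rangle$; otherwise, $\mathcal{L}(s_{ij}, \langle h_x, h_y \rangle) = 0$.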
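The two geographical metrics in Equations (4) and (5) can be accumulated in a single pass over the request records. The sketch below is illustrative only: the record layout `(x, y, served, hops)`, the function name, and the choice to map points on the far boundary into the last subarea are assumptions. As in the equations, the cells hold plain sums; any normalization per subarea would be a separate step that the paper does not specify here.

```python
def geographic_distribution(records, area_side, h):
    """Accumulate per-subarea sums of access delay and of satisfied requests.

    records:   iterable of (x, y, served, hops), where (x, y) is the position
               at which the request was initiated (hypothetical layout)
    area_side: side length of the square network area, in meters
    h:         number of subareas per side (the area is split into h x h cells)
    Returns (delay_sum, served_sum), two h x h nested lists indexed [hx][hy].
    """
    cell = area_side / h
    delay_sum = [[0.0] * h for _ in range(h)]
    served_sum = [[0] * h for _ in range(h)]
    for x, y, served, hops in records:
        # Clamp so that points exactly on the far boundary fall in the last cell.
        hx = min(int(x // cell), h - 1)
        hy = min(int(y // cell), h - 1)
        if served:
            delay_sum[hx][hy] += hops   # contributes t_ij to Equation (4)
            served_sum[hx][hy] += 1     # contributes s_ij = 1 to Equation (5)
    return delay_sum, served_sum


example_records = [
    (100.0, 100.0, True, 2),
    (2400.0, 2400.0, True, 1),
    (120.0, 80.0, False, None),
    (2500.0, 0.0, True, 3),
]
delay_grid, served_grid = geographic_distribution(example_records, 2500.0, 5)
```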
3.3. Data Replication Algorithms
To study the effects of mobility on data replication, we use the following four representative data replication algorithms.
3.3.1. Greedy Data Replication
Greedy data replication is a naive data replication algorithm. In this algorithm, each node replicates its most frequently accessed data until its memory is full. More specifically, let $a_{ij}$ denote the access frequency of node $N_i$ to data $d_j$. Then, each node always replicates the data with the highest $a_{ij}$. Since each node only takes its own data access pattern into account during data replication, this algorithm is non-cooperative.
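The Greedy rule amounts to sorting the data items by the node's own access frequency and keeping the top entries. The following sketch assumes a dictionary of access frequencies and ignores the node's original data for simplicity; the function name and the tie-breaking by item identifier are illustrative choices, not part of the paper.

```python
def greedy_replicas(access_freq, capacity):
    """Pick the `capacity` items that this node accesses most frequently.

    access_freq: dict item -> access frequency a_ij of this node (hypothetical layout)
    capacity:    number of replicas the node can store
    Ties are broken by item identifier so the result is deterministic.
    """
    ranked = sorted(access_freq, key=lambda item: (-access_freq[item], item))
    return ranked[:capacity]


node_freq = {"d1": 0.1, "d2": 0.5, "d3": 0.3, "d4": 0.1}
chosen = greedy_replicas(node_freq, capacity=2)
```

Since no information from other nodes enters the decision, two neighbors with similar interests end up storing the same items, which is the duplication effect discussed in the evaluation.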
3.3.2. Pairing Cooperation Data Replication
Different from Greedy data replication, in the Pairing algorithm (e.g., the OTOO scheme in [16] and the DAFN scheme in [2]), each mobile node cooperates with one of its neighbors to decide which data to replicate. More specifically, the two nodes of each pair, $N_i$ and $N_j$, each calculate a combined access frequency value for data item $d_k$, called $CAF_{ij}$. For example, for $N_i$:
$$CAF_{ij}(k) = a_{ik} + a_{jk} \times (1 - f_{ij}) \tag{6}$$
Similarly, $N_j$ calculates its combined access frequency. Each node sorts the data according to the CAF value and picks the data items with the highest values to replicate in its memory until no more data items can be replicated. The data replication decision thus does not simply depend on the access frequency of one single node. It also depends on the access frequency of the other node of the pair and on the link stability between them.
3.3.3. Reliable Neighboring Data Replication
The Pairing algorithm considers neighboring nodes when making data replication choices. However, each node still considers its own access frequency as the most important factor and only cooperates with one neighboring node. As described in [16], the Reliable Neighboring data replication algorithm further increases the degree of cooperation and allows a node to replicate and share data with multiple reliable neighbors within its one-hop range. The replication decision depends on the data access frequency and the link stability. More specifically, in this algorithm, part of the node's memory is used to hold the most interesting data for itself and the rest is used for its reliable neighbors. The combined access frequency function for node $N_i$ to data $d_k$ in the Neighboring algorithm is defined as:
$$CAF_i(k) = \sum_{N_j \in nb(i)} a_{jk} \times (1 - f_{ij}) \tag{7}$$
where $nb(i)$ is the set of all reliable neighbors of $N_i$; i.e., the neighbors whose link failure rate to $N_i$ is less than a threshold.
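Equation (7) can be sketched as a weighted sum over the neighbors that pass the reliability threshold. The data layout, the function name, and the numeric threshold below are hypothetical; the paper does not give a threshold value in this section. Only the neighbor-oriented part of the memory is modeled here, not the part reserved for the node's own most interesting data.

```python
def neighboring_caf(neighbor_freqs, failure_prob, threshold):
    """Combined access frequency CAF_i(k) over reliable neighbors, Equation (7).

    neighbor_freqs: dict neighbor -> {item: access frequency a_jk}
    failure_prob:   dict neighbor -> link failure probability f_ij
    threshold:      a neighbor is reliable if f_ij is strictly below this value
    """
    caf = {}
    for neighbor, freqs in neighbor_freqs.items():
        f_ij = failure_prob[neighbor]
        if f_ij >= threshold:
            continue  # unreliable neighbor, not a member of nb(i)
        for item, a_jk in freqs.items():
            caf[item] = caf.get(item, 0.0) + a_jk * (1.0 - f_ij)
    return caf


neighbor_freqs = {
    2: {"d1": 0.5, "d2": 0.5},
    3: {"d1": 0.2, "d2": 0.8},
    4: {"d1": 1.0},             # filtered out below: link too unreliable
}
failure_prob = {2: 0.2, 3: 0.5, 4: 0.9}
caf = neighboring_caf(neighbor_freqs, failure_prob, threshold=0.6)
```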
3.3.4. Reliable Grouping Data Replication
Reliable Grouping data replication (e.g., the DCG scheme in [2] and the DRAM scheme in [11]) is the most aggressively cooperative data replication algorithm. All nodes in a group contribute part of their memory to share and replicate data for all members of the same group. More specifically, the access frequency and access overhead of each data item are evaluated from the group perspective. During data replication, the data item with the highest group access frequency is allocated first, at the node that minimizes the total access delay within the group. The allocation process is repeated for all data items in the order of their access frequency until the memory of all nodes in the group is filled. The Grouping algorithm can fully exploit the cooperation among a group of well-connected nodes. Consequently, the performance of the Grouping algorithm depends heavily on the group connectivity: the better the group connectivity, the better the performance.
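The group-level allocation can be illustrated with a simplified sketch. It is not the DCG or DRAM scheme: it places at most one copy of each item in a single pass, measures access overhead as frequency-weighted hop counts, and assumes that the hop counts between group members are known. All names and the sample numbers are hypothetical.

```python
def grouping_allocate(freqs, hops, capacity):
    """Place items inside one group, most requested item first.

    freqs:    dict member -> {item: access count of that member}
    hops:     dict member -> {member: hop count between the two members}
    capacity: number of replica slots each member contributes
    Returns a dict item -> hosting member (single copy per item, single pass).
    """
    members = sorted(freqs)
    group_freq = {}
    for member in members:
        for item, count in freqs[member].items():
            group_freq[item] = group_freq.get(item, 0) + count
    order = sorted(group_freq, key=lambda item: (-group_freq[item], item))

    free_slots = {member: capacity for member in members}
    placement = {}
    for item in order:
        candidates = [m for m in members if free_slots[m] > 0]
        if not candidates:
            break  # the memory of every group member is full

        def total_delay(host):
            # Frequency-weighted hop count if `item` is stored at `host`.
            return sum(freqs[m].get(item, 0) * hops[m][host] for m in members)

        host = min(candidates, key=lambda m: (total_delay(m), m))
        placement[item] = host
        free_slots[host] -= 1
    return placement


# Three members connected in a chain A - B - C, one replica slot each.
hops = {
    "A": {"A": 0, "B": 1, "C": 2},
    "B": {"A": 1, "B": 0, "C": 1},
    "C": {"A": 2, "B": 1, "C": 0},
}
freqs = {
    "A": {"d1": 6, "d2": 1},
    "B": {"d1": 3, "d2": 1},
    "C": {"d1": 1, "d2": 4},
}
placement = grouping_allocate(freqs, hops, capacity=1)
```

In the example, `d1` has the highest group access count and goes to member A, where its weighted delay is smallest. `d2` is then placed at C, which is the cheapest member that still has a free slot.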
4. Experiments
In this section, we measure the performance of the four data replication algorithms under four typical mobility models: random walk [17], random waypoint [18], Manhattan mobility [10], and reference point group mobility [19].
4.1. Mobility Models
Random Walk (RW): In this model, at every unit of experimental time, each mobile node randomly chooses a movement direction and a movement speed between 0 and $V$ m/sec. In the long term, this model offers very low mobility, similar to vibrating around the same position, because mobile nodes keep changing their movement direction at random.
Random WayPoint (RWP): Each node remains stationary for a pause time of $S$ seconds. Then, it selects a random destination in the entire area and moves to the destination at a speed chosen randomly between 0 and $V$ m/sec. After reaching the destination, it pauses again, and then repeats this process. In this model, mobile nodes tend to gather at the center of the area.
Manhattan Mobility (MM): This model emulates node movement on streets, where nodes only travel on the pathways in the map. Manhattan grid maps of horizontal and vertical streets are used to restrict the node movement. On each street, the mobile nodes move along the lanes in both directions. At each intersection, the mobile nodes choose their direction and speed (0 to $V$ m/sec) randomly.
Reference Point Group Mobility (RPGM): This model is used to model group mobility. Each group has a logical "center" called a reference point, and group members (nodes). Each reference point moves according to the RWP model with $V'$ m/sec (maximum speed) and $S'$ sec (pause time). In each group, nodes are uniformly distributed within a certain radius from the reference point. To achieve this, we assume that each node moves according to the RW model with $V$ m/sec (maximum speed) within that range. Specifically, a node's movement vector is obtained by adding the movement vector of the RW model of the node to that of the RWP model of the reference point.
Table 1: Parameter Configuration
| Parameter | Symbol | Value (Range) |
|---|---|---|
| Number of nodes | m | 300 |
| Node movement speed | V | 5 m/s (3, 8 m/s) |
| Group movement speed (RPGM) | V' | 5 m/s (3, 8 m/s) |
| Radius of group (RPGM) | R | 300 m |
| Node pause time | S | 5 sec (3, 7 sec) |
| Group pause time (RPGM) | S' | 5 sec (3, 7 sec) |
| Communication range | C | 100 m |
| Number of data | n | 200 |
| Memory size | B | 10 |
| Zipf access | θ | 0.8 |
4.2. Simulation Settings
There are $m$ mobile nodes ($N = N_1, \ldots, N_m$) in a $2500\,\mathrm{m} \times 2500\,\mathrm{m}$ square area. All nodes move according to the selected mobility model. For the MM model, we use a grid road map with six vertical and six horizontal streets; i.e., 25 blocks of the same size ($500\,\mathrm{m} \times 500\,\mathrm{m}$). For the RPGM model, we assume that there are 25 reference points $rp_1, \ldots, rp_{25}$, and $N_j$ ($j = 1, \ldots, m$) sets its reference point as $rp_{\lceil (m/25) \rceil}$.
At the beginning of the simulations, the initial position of each mobile node is randomly determined in the space where the node can exist. For example, nodes can only exist on a road in the MM model. We set the simulation time $T$ to 500,000 seconds. Each node initiates a query request every 5 seconds. Therefore, each node issues almost 100,000 requests during the entire simulation period. We neglect the first 1000 seconds to remove the impact of the initial start. Table 1 summarizes the parameters and their values used in the experiments. Most parameters are fixed to constant values, while others can change within the range given in parentheses.
Table 2: Access Delay with Uniform Data Access Pattern
| | RW | RWP | MM | RPGM |
|---|---|---|---|---|
| Greedy | 0.8410 | 0.8511 | 1.4296 | 1.4346 |
| Pairing | 0.9047 | 0.8619 | 1.3238 | 1.3347 |
| Neighboring | 0.9863 | 0.8664 | 1.0423 | 1.3153 |
| Grouping | 1.0093 | 0.8689 | 1.237 | 1.3559 |
Table 3: Access Delay with Zipf Data Access Pattern (θ = 0.8)
| | RW | RWP | MM | RPGM |
|---|---|---|---|---|
| Greedy | 0.3967 | 0.3678 | 0.6565 | 0.7364 |
| Pairing | 0.4091 | 0.4934 | 0.5788 | 0.7390 |
| Neighboring | 0.4452 | 0.5015 | 0.5848 | 0.743 |
| Grouping | 0.5734 | 0.5247 | 0.6145 | 0.7612 |
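The request schedule and the skewed access pattern of Section 4.2 can be illustrated with a short calculation. The paper does not spell out the Zipf formula it uses; the sketch assumes the common Zipf-like form in which the $k$-th most popular item is requested with probability proportional to $1/k^{\theta}$. The function names are hypothetical.

```python
def zipf_probabilities(n, theta):
    """Access probability of each of n items under an assumed Zipf-like law.

    The item of rank k (1 = most popular) gets weight 1 / k**theta;
    theta = 0 yields the uniform access pattern.
    """
    weights = [1.0 / (k ** theta) for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def measured_requests_per_node(sim_time, warmup, interval):
    """Requests a node issues after the warm-up period has been discarded."""
    return (sim_time - warmup) // interval


zipf = zipf_probabilities(n=200, theta=0.8)    # n and theta from Table 1
uniform = zipf_probabilities(n=200, theta=0.0)
top10_share = sum(zipf[:10])                    # mass that B = 10 replicas can cover
requests = measured_requests_per_node(500_000, 1_000, 5)
```

With one request every 5 seconds and the first 1000 seconds discarded, each node contributes 99,800 measured requests, in line with the "almost 100,000" stated in the text. Under the assumed Zipf form, the ten most popular items carry far more than the 5% of accesses they receive under the uniform pattern, which gives an intuition for why skewed access is easier to serve from a memory of size $B = 10$.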
4.3. Results
4.3.1. Average Delay and Average Data Availability
In this subsection, we study the average delay and average data availability of the four data replication algorithms under the four mobility models, with a uniform data access pattern and a more skewed data access pattern, i.e., Zipf data access.
Table 2 shows the average query delay with the uniform data access pattern. Under the RW mobility model, both the Greedy algorithm and the Pairing algorithm achieve a shorter access delay than the other two. The short access delay of the Greedy algorithm is due to its low data availability (as shown in Table 4), since missed queries are not counted. The Pairing algorithm, however, helps share data with the one-hop pairing node. In this way, both the node and its pairing node can serve its requests. Considering the relatively reliable connectivity between pairing nodes under RW mobility, the Pairing algorithm can achieve higher data availability and lower query delay compared to other algorithms. Similar results can be observed from Table 3.
Table 4: Data Availability with Uniform Data Access Pattern
| | RW | RWP | MM | RPGM |
|---|---|---|---|---|
| Greedy | 0.3811 | 0.4830 | 0.5732 | 0.6252 |
| Pairing | 0.4691 | 0.4458 | 0.5725 | 0.6532 |
| Neighboring | 0.4703 | 0.4337 | 0.6074 | 0.6804 |
| Grouping | 0.4732 | 0.4251 | 0.6088 | 0.7595 |
Table 5: Data Availability with Zipf Data Access Pattern (θ = 0.8)
| | RW | RWP | MM | RPGM |
|---|---|---|---|---|
| Greedy | 0.5474 | 0.6141 | 0.6909 | 0.7489 |
| Pairing | 0.6634 | 0.5503 | 0.7033 | 0.7734 |
| Neighboring | 0.6185 | 0.5899 | 0.8077 | 0.8226 |
| Grouping | 0.6381 | 0.5541 | 0.7622 | 0.9182 |
In RWP, there is no reliable connectivity between any node pair, and hence the Pairing algorithm may not be helpful for data sharing. Similar to the Greedy algorithm, most requests in Pairing are served locally. Therefore, the average query delay of the Pairing algorithm becomes even shorter in this case, at the cost of low data availability (see Table 4). Similar results hold for the Neighboring and Grouping algorithms. However, the average delay of the Greedy algorithm increases in RWP. This is related to the network formation under RWP, where nodes tend to gather in the central area. Thus, large partitions may be formed in the center, which increases the possibility of finding available data at nearby nodes to serve query requests when data access is uniformly distributed.
Similarly, in the MM and the RPGM mobility models, due to the road layout constraint and the restricted mobility pattern, the network has a relatively higher density from the nodes' perspective. As expected, larger partitions can be formed in MM and RPGM compared to RW. Therefore, the query delay becomes larger in the MM and the RPGM models.
Table 3 shows the query delay with skewed data access following a Zipf ($\theta = 0.8$) distribution. By comparing Table 2 and Table 3, we can see that as data access becomes more skewed, the average data access delay decreases dramatically. This is because, as data access becomes more skewed, it becomes easier for each node to buffer and replicate the data it is interested in, either in its own memory or at nearby nodes, so that more query requests can be served locally or by nearby neighbors. Here we also note that there are two factors that may affect the performance in RWP. First, due to random mobility, there are fewer reliable connections in RWP. Therefore, the cooperation-based algorithms tend to work like the Greedy algorithm, resulting in low access delay. Second, the cooperative algorithms may still replicate and share data with other nodes when they occasionally find some reliable connections. This increases the access delay as two nodes move farther apart but remain reachable over multiple hops. When the data access pattern is uniform, the first factor has more weight on the performance. When data access becomes skewed, the second factor has more weight, because some interesting data with high access frequency may not be replicated locally. Therefore, the Pairing and Neighboring algorithms have a larger access delay in RWP than in RW when the access pattern follows a Zipf distribution.
Table 4 and Table 5 show the data availability with uniform and Zipf data access patterns. Similar to the results for data access delay, the data availability is much higher under Zipf data access than under uniform data access. Moreover, we can see that MM and RPGM always have better data availability than RW and RWP. This advantage comes from the higher relative node density and the more similar node mobility in MM and RPGM: more nodes can be accessed and more data can be used to serve query requests.
Tables 2, 3, 4, and 5 also demonstrate that cooperation improves performance in MM and RPGM, brings less improvement in RW, and brings none in RWP. This is because close nodes in MM and RPGM have more similar mobility than those in RW and RWP. If close nodes can move together for a long time, cooperative data replication algorithms such as Pairing, Neighboring, and Grouping have more advantages.
In summary, in RWP, where nodes move randomly, the Greedy algorithm is the best solution. In the RW model, where close nodes have more reliable connections, the Pairing algorithm works best. In MM, where mobile nodes tend to have more reliable neighbors and a higher node density, the Neighboring algorithm shows more advantage. In the RPGM model, where nodes follow strict group mobility, Grouping data replication outperforms the others.
Figure 1: Distribution of the data access delay (with uniform data access pattern). Panels: (a) RW, (b) RWP, (c) MM, (d) RPGM. In each panel, bars are grouped by algorithm (Greedy, Pairing, Neighboring, Grouping), with one bar per delay class (local, 1 hop, 2 hops, 3 hops, 4 hops, 5 hops, 6 hops, 6+ hops); the y-axis ranges from 0 to 0.25.
4.3.2. Distribution of Access Delay
Figure 1 and Figure 2 show the distribution of access delay under uniform and Zipf distributions. In these figures, the y-axis indicates the request success ratio. Each bar represents the query delay in terms of hops. The bars within each group correspond to the delay classes, from 0 hops (local) to 6+ hops.
As shown in Figure 1(a), for the RW model, since the Pairing, Neighboring, and Grouping algorithms share data among nearby nodes, a few requests that are not satisfied locally can be served by one-hop or two-hop neighbors. Because the data access is uniformly distributed, the improvement from cooperation is limited. When the data access becomes more skewed (as shown in Figure 2(a)), there is more cooperation, and more requests are served by nearby nodes. For example, compared to the Greedy algorithm, the Neighboring algorithm sacrifices 5% of the requests served locally, but gains 15% more requests that can be served by one-hop neighbors. Similarly, the Grouping algorithm tries to share data in a larger area. It has the fewest requests served locally, but the largest number of requests satisfied by two-hop or three-hop neighbors.
Figure 2: Distribution of the data access delay (with Zipf (θ = 0.8) data access pattern). Panels: (a) RW, (b) RWP, (c) MM, (d) RPGM. In each panel, bars are grouped by algorithm (Greedy, Pairing, Neighboring, Grouping), with one bar per delay class (local, 1 hop, 2 hops, 3 hops, 4 hops, 5 hops, 6 hops, 6+ hops); the y-axis ranges from 0 to 0.4.
In Figures 1(b) and 2(b), since nodes move randomly in RWP, cooperative algorithms such as Pairing, Neighboring, and Grouping do not benefit from cooperation. The Greedy algorithm, in which each node replicates the data it is most interested in, is more suitable for RWP.
In MM, due to the road layout constraint, nodes can only move along the roads. Therefore, each node has more neighbors in MM than in RW and RWP, and the average network partition size can be larger than in RW and RWP. As a result, as shown in Figures 1(c) and 2(c), each replication algorithm has more requests satisfied by multi-hop neighbors.
In Figures 1(d) and 2(d), the result is similar to that in MM due to the relatively reliable connectivity and higher density in RPGM. These two figures also clearly demonstrate that in RPGM, the Grouping replication algorithm has more requests served by neighboring nodes that are multiple hops away.
Figure 3: Geographical distribution of the access delay (Greedy Algorithm). Panels: (a) RW, (b) RWP, (c) MM, (d) RPGM. Axes: X (m), Y (m), and access delay.
4.3.3. Geographical Distribution of Access Delay
Figures 3 to 6 show the geographical distribution of access delay with different data replication algorithms. Due to the page limit, we only present the results with the Zipf distribution.
As shown in Figures 3(a), 4(a), 5(a), and 6(a), geographical location does not affect the access delay much under the RW mobility model. This is because nodes are initially randomly distributed and choose their movement directions randomly in RW. The node density is relatively even, and hence there is no large variation in data access delay across locations. However, we can still see that the Greedy algorithm and the Pairing algorithm have lower access delay than the other two, which is consistent with our previous results on access delay.
Figure 4: Geographical distribution of the access delay (Pairing Algorithm). Panels: (a) RW, (b) RWP, (c) MM, (d) RPGM. Axes: X (m), Y (m), and access delay.
From Figures 3(b), 4(b), 5(b), and 6(b), we can see some interesting results under RWP. When a node is at the boundary of the simulation area, its access delay is short. As it moves towards the center area, its access delay first becomes larger and then begins to decrease. In RWP, during each movement cycle, each node randomly chooses a destination and moves there. Therefore, nodes have a higher probability of appearing in the center area, and thus the center area has a higher node density than the boundary area. Consequently, nodes are more easily isolated in the boundary area, but form large partitions at the center of the simulation area. In the extreme case where a node is isolated, its access delay is the lowest, since it can only access locally replicated data. In the center area, the node density is high, which helps nodes find the data they are interested in at nearby nodes, resulting in a low access delay.
Figure 5: Geographical distribution of the access delay (Neighboring Algorithm). Panels: (a) RW, (b) RWP, (c) MM, (d) RPGM. Axes: X (m), Y (m), and access delay.
Under MM, shown in Figures 3(c), 4(c), 5(c), and 6(c), mobile nodes are only allowed to move in the vertical or horizontal directions following the road layout, and thus the access delay is only available at positions where there is a road. Restricting nodes to roads yields a relatively higher node density and avoids isolated nodes, but it results in a larger access delay compared to RW and RWP.
Finally, Figures 3(d), 4(d), 5(d), and 6(d) compare the access
delay of different data replication algorithms under the RPGM
mobility model. Similar to RWP, RPGM has lower access delay at the
boundary area and the center area but larger delay in the middle.
This is because the reference point of each group moves according
to the RWP mobility model, so each mobile group as a whole follows
the RWP mobility pattern. Because of the group
Figure 6: Geographical distribution of the access delay
(Grouping Algorithm). Panels: (a) RW, (b) RWP, (c) MM, (d) RPGM.
Each panel plots the access delay (0 to 1.2) over X (m) and Y (m).
mobility characteristic of RPGM, the connectivity among nodes in
the same group is relatively reliable. This helps nodes form
larger partitions, and thus more nodes can be reached. Therefore, the
access delay is larger in the RPGM mobility model than in
RWP.
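Why group members stay connected can be illustrated with a simplified
placement rule. The sketch below assumes each member lies within a
hypothetical `max_offset` of its group's reference point (the per-node
random motion of the full RPGM model is reduced here to a single random
offset). Any two members are then at most `2 * max_offset` apart, so a
radio range above that bound keeps the whole group in one partition.

```python
import math
import random


def rpgm_members(reference, group_size, max_offset, rng):
    """Place group members uniformly within max_offset of the reference point.

    `reference` is the (x, y) position of the group's reference point,
    which is assumed to be moved elsewhere (e.g. by an RWP process).
    """
    members = []
    for _ in range(group_size):
        # sqrt() makes the points uniform over the disk area.
        r = max_offset * math.sqrt(rng.random())
        theta = rng.uniform(0, 2 * math.pi)
        members.append((reference[0] + r * math.cos(theta),
                        reference[1] + r * math.sin(theta)))
    return members


if __name__ == "__main__":
    rng = random.Random(0)
    group = rpgm_members((1000.0, 1000.0), group_size=5, max_offset=100.0, rng=rng)
    widest = max(math.dist(p, q) for p in group for q in group)
    print(f"largest distance between two members: {widest:.1f} m")
```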
4.3.4. Geographical Distribution of Data Availability
Similar to the geographical distribution of data access delay,
from Figures 7(a), 8(a), 9(a), and 10(a), we can see that under RW
data availability is independent of the location where the query is
initiated. However, different data replication algorithms
achieve different data availability. In the Greedy algorithm,
there is no data sharing since each node only replicates data according
to its own interest. Therefore, there could be duplicated
data among closely connected nodes, which reduces the overall data
availability. The Neighboring algorithm and the Grouping algorithm
aim to share data with nearby
Figure 7: Geographical distribution of the data availability
(Greedy Algorithm). Panels: (a) RW, (b) RWP, (c) MM, (d) RPGM.
Each panel plots the availability (0 to 1) over X (m) and Y (m).
nodes, which can remove some data redundancy and improve the
data availability. However, when a partition occurs, data saved on
neighbors may not be available. The Paring algorithm, in contrast,
replicates data on the most reliable neighbor, and can
achieve the best balance between node cooperation and the risk of
partition. Therefore, the Paring algorithm has the highest data
availability.
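The trade-off between redundancy and partition risk can be made concrete
with a toy calculation. The sketch below uses a simplified availability
measure that is an assumption of this example, not the metric defined in
the paper: every node is equally interested in every data item, and a
node can reach an item only if some node in its own partition holds a
replica. The node names and replica sets are hypothetical.

```python
def availability(partitions, replicas, num_items):
    """Mean fraction of data items each node can reach inside its partition."""
    fractions = []
    for part in partitions:
        # Items held by at least one node of this partition.
        items = set().union(*(replicas[node] for node in part))
        fractions += [len(items) / num_items] * len(part)
    return sum(fractions) / len(fractions)


# Three nodes, four data items, memory for two replicas per node.
duplicated = {"a": {1, 2}, "b": {1, 2}, "c": {1, 2}}   # every node keeps the same items
shared = {"a": {1, 2}, "b": {3, 4}, "c": {1, 2}}       # a and b split the items

connected = [{"a", "b"}, {"c"}]      # a and b are in one partition
split = [{"a"}, {"b"}, {"c"}]        # a and b have separated

if __name__ == "__main__":
    print(availability(connected, duplicated, 4))
    print(availability(connected, shared, 4))
    print(availability(split, shared, 4))
```

In this toy setting, duplicated replicas give 0.5, sharing raises the
value to about 0.83 while `a` and `b` stay connected, and it drops back
to 0.5 once they separate, which is why replicating on the most reliable
neighbor matters.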
As shown in Figures 7(b), 8(b), 9(b), and 10(b), under RWP,
nodes tend to stay around the central area, and thus the data
availability is higher in the center area. We can also see that the
Greedy algorithm has the best data availability in RWP since it does
not rely on any cooperation.
In MM, shown in Figures 7(c), 8(c), 9(c), and 10(c), there are
more nodes around the intersection areas than in other areas, and hence
the data availability at the intersections is higher. We also find an
interesting phenomenon in the cooperative data replication
algorithms. Let us use the Paring algorithm as
Figure 8: Geographical distribution of the data availability
(Paring Algorithm). Panels: (a) RW, (b) RWP, (c) MM, (d) RPGM.
Each panel plots the availability (0 to 1) over X (m) and Y (m).
an example. Figure 11 shows the data availability along the
third horizontal road, i.e., the position (x, y) changes from (0,
1500) to (2500, 1500). In this figure, we can see that both the
intersection areas and the middle segments of the road have higher
data availability. However, the data availability is low in the other
areas that are close to the intersections. This comes from
the characteristics of the MM mobility model. Due to the road layout
constraint, mobile nodes may split at an intersection when
they choose different movement directions. Since cooperation-based
data replication algorithms rely on data sharing among nearby
nodes, some data may not be available when a split happens, which
lowers the data availability in these areas. However, once nodes
are aware of the split, they reorganize their collaborating
nodes and share data with them. Therefore, after the
reorganization process, i.e., at the middle segments of the road, they
can achieve relatively higher data availability.
Figure 9: Geographical distribution of the data availability
(Neighboring Algorithm). Panels: (a) RW, (b) RWP, (c) MM, (d) RPGM.
Each panel plots the availability (0 to 1) over X (m) and Y (m).
Figures 7(d), 8(d), 9(d), and 10(d) present results for RPGM.
Due to the similar mobility pattern of the mobile nodes in RWP and
the reference point in RPGM, the shape of the data availability
distribution under RPGM is similar to that under RWP, i.e., higher
data availability near the center area and lower data
availability at the boundary area.
Since nodes in the same group have quite similar mobility patterns
and more reliable connectivity, RPGM can achieve
much higher data availability than RWP. By comparing different data
replication algorithms, we can see that the Grouping algorithm has the
highest data availability. The advantage comes from its data sharing
within each mobile group, and thus nodes’ memory can be utilized
more efficiently.
4.4. Discussions
In this section, we summarize the experimental results and
identify the most suitable data replication algorithms under various
mobility models.
Figure 10: Geographical distribution of the data availability
(Grouping Algorithm). Panels: (a) RW, (b) RWP, (c) MM, (d) RPGM.
Each panel plots the availability (0 to 1) over X (m) and Y (m).
RW Model: Under RW, nodes have low mobility, similar to vibrating
around the same position. Thus, the connectivity between closely
connected nodes is relatively reliable. Also, RW tends to form small
network partitions and rarely forms large ones. Due to the low
mobility, even if there is a network partition,
each partition is relatively stable. Thus, when designing a data
replication algorithm, it is more appropriate for nodes to
cooperatively replicate data with their closely connected neighbors,
and the replication should not rely
on data sharing with a large number of nodes. This also explains why
the Paring algorithm is the
most efficient algorithm in the simulation.
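To make the notion of a network partition concrete, the sketch below
groups nodes into partitions from a snapshot of their positions. It
assumes a unit-disk connectivity model in which two nodes are linked
whenever they are within a hypothetical `radio_range` of each other; the
positions and the range value are illustrative only.

```python
import math


def find_partitions(positions, radio_range):
    """Return the connected components of the unit-disk graph.

    `positions` maps a node id to its (x, y) coordinates in meters.
    """
    unvisited = set(positions)
    parts = []
    while unvisited:
        start = unvisited.pop()
        part, frontier = {start}, [start]
        while frontier:
            u = frontier.pop()
            # Unvisited nodes within radio range of u join its partition.
            near = {v for v in unvisited
                    if math.dist(positions[u], positions[v]) <= radio_range}
            unvisited -= near
            part |= near
            frontier.extend(near)
        parts.append(part)
    return parts


if __name__ == "__main__":
    snapshot = {"a": (0, 0), "b": (200, 0), "c": (400, 0), "d": (2000, 2000)}
    for part in find_partitions(snapshot, radio_range=250):
        print(sorted(part))
```

Here `a` and `c` are out of range of each other but still share a
partition through `b`, while `d` is isolated. Running this on successive
snapshots would show how long each partition persists.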
RWP Model: In RWP, nodes move randomly and do not show
any reliable connections with each other, and hence the node
partition rate is high. Thus, it may not be good to share data with
others, and the non-cooperative Greedy algorithm may be the best
choice. On the other hand, since nodes tend to gather at the center
of the network, they form a large
Figure 11: Geographical distribution of the Paring algorithm in
MM. The plot shows the availability (0.68 to 0.75) along the road
from 0 to 2500, with the four intersections marked.
partition around the central area, where the availability is
high. Thus, when designing a data replication algorithm, it is
better to push and replicate the most important data on the nodes
around the central area. Further, mobile nodes should forward their
requests to the central area to improve the query success ratio.
MM Model: The MM model has several interesting features due to
its restricted mobility. First, in MM the connection between
neighboring nodes lasts longer than that in RWP. The connectivity is
relatively reliable because several neighboring nodes on the same
street with the same direction often move together. Therefore, when
designing a data replication algorithm under MM, it is effective to
share data among neighbors moving in the same direction.
Second, the node density is higher in the intersection areas than in
other areas. Similar to the RWP model, it is more suitable to buffer
some important data at these areas to better serve future requests.
Finally, partitions frequently occur after the intersection area,
resulting in low data availability in these areas. To maintain high
data availability and low query delay, new schemes
should be designed to predict partitions at the intersections and
pre-fetch the important data before a partition occurs.
RPGM Model: The RPGM model provides much higher data
availability but longer query delay than other mobility models.
Due to group mobility, RPGM always provides higher connectivity
among nodes in the same group and the most reliable group
connection. As a result, cooperation-based data replication
algorithms can achieve the best performance in terms of data
availability by cooperatively sharing data within each group.
However, the negative effect is that the query delay is relatively
longer than in other mobility models due to node cooperation. By
contributing more memory to replicate data for other group members,
mobile nodes have to access some of the data they are interested in
from other nodes over multiple hops. In summary, it is effective to
share data among nodes in the same group in RPGM. It is important to
have a good group detection technique to detect nodes moving in the
same group and then effectively allocate data within the group.
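The text does not specify a group detection technique. As one possible
starting point, the sketch below clusters nodes whose displacement
between two position snapshots is nearly the same. The function name,
the `max_drift` threshold, and the leader-based clustering rule are all
assumptions made for illustration, not a method from the paper.

```python
import math


def detect_groups(before, after, max_drift):
    """Cluster nodes whose displacement vectors differ by at most max_drift.

    `before` and `after` map node ids to (x, y) positions at two times.
    Each node is compared with the first member of every existing group.
    """
    move = {n: (after[n][0] - before[n][0], after[n][1] - before[n][1])
            for n in before}
    groups = []
    for node in sorted(move):
        for group in groups:
            if math.dist(move[node], move[group[0]]) <= max_drift:
                group.append(node)
                break
        else:
            # No existing group moves like this node: start a new one.
            groups.append([node])
    return groups


if __name__ == "__main__":
    t0 = {"a": (0, 0), "b": (10, 0), "c": (500, 500)}
    t1 = {"a": (100, 0), "b": (112, 3), "c": (500, 400)}
    print(detect_groups(t0, t1, max_drift=10))
```

This heuristic ignores distance, so two unrelated nodes that happen to
move in parallel would be grouped together; a practical detector would
likely also require the nodes to stay within radio range over several
snapshots.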
5. Conclusion
In mobile ad hoc networks, nodes move freely and network
partition occurs frequently. To mitigate this problem, data
replication is commonly used to increase the data availability and
reduce the data access delay. However, most previous work assumed a
particular mobility model and could not fully study the effects of
mobility on data replication. In this paper, we quantify the effects
of mobility on different data replication algorithms from
various perspectives. The study is based on several metrics which
are not limited to the average access delay and data availability,
by including the geographical distribution of these values. Through
extensive experiments, we study the effects of four typical mobility
models on data replication, and identify the most suitable data
replication algorithms under various mobility models.
We believe that the experimental results and the knowledge obtained
from them are very useful for researchers designing algorithms for
data sharing and replication under these typical
mobility models. To the best of our knowledge, this is the first work
that explores and provides a deep explanation
of the relationship between node mobility and data replication
algorithms.
References
[1] D. B. Johnson, D. A. Maltz, Dynamic source routing in ad hoc
wireless networks, Mobile Computing, Kluwer (1996) 153–181.
[2] T. Hara, S. K. Madria, Data replication for improving data
accessibility in ad hoc networks, IEEE Transactions on Mobile
Computing 5 (11) (2006) 1515–1532.
[3] K. Wang, B. Li, Efficient and guaranteed service coverage in
partitionable mobile ad-hoc networks, IEEE INFOCOM.
[4] J. Luo, J. Hubaux, P. Eugster, Pan: Providing Reliable
Storage in Mobile Ad Hoc Networks with Probabilistic Quorum Systems,
ACM MobiHoc.
[5] H. Yu, A. Vahdat, Minimal Replication Cost for Availability,
ACM Symposium on Principles of Distributed Computing (PODC).
[6] H. Yu, A. Vahdat, The costs and limits of availability for
replicated services, ACM Transactions on Computer Systems 24 (2006)
70–113.
[7] L. Gao, M. Dahlin, A. Nayate, J. Zheng, A. Iyengar,
Consistency and Replication: Application Specific Data Replication
for Edge Services, International Conference on World Wide Web.
[8] J. Zhao, G. Cao, Vadd: Vehicle-assisted data delivery in
vehicular ad hoc networks, IEEE Transactions on Vehicular Technology
57 (3) (2008) 1910–1922 (a preliminary version appeared in IEEE
INFOCOM’06).
[9] J. Hahner, D. Dudkowski, P. Marron, K. Rothermel,
Quantifying network partitioning in mobile ad hoc networks,
International Conference on Mobile Data Management (2007)
174–181.
[10] F. Bai, N. Sadagopan, A. Helmy, Important: A framework to
systematically analyze the impact of mobility on performance of
routing protocols for adhoc networks, IEEE INFOCOM.
[11] J. Huang, M. Chen, On the effect of group mobility to data
replication in ad hoc networks, IEEE Transactions on Mobile
Computing 5 (2006) 492–507.
[12] T. Hara, Quantifying impact of mobility on data
availability in mobile ad hoc networks, IEEE Transactions on Mobile
Computing 9 (2) (2010) 241–258.
[13] T. Hara, Replica Allocation in Ad hoc Networks with
Periodic Data Update, International Conference on Mobile Data
Management.
[14] T. Hara, S. Madria, Consistency management strategies for
data replication in mobile ad hoc networks, IEEE Transactions on
Mobile Computing 8 (7) (2009) 950–967.
[15] J. Cao, Y. Zhang, G. Cao, L. Xie, Data consistency for
cooperative caching in mobile environments, IEEE Computer 40 (4)
(2007) 60–66.
[16] L. Yin, G. Cao, Balancing the tradeoffs between data
accessibility and query delay in ad hoc networks, IEEE International
Symposium on Reliable Distributed Systems (2004) 289–298.
[17] K. Pearson, The problem of the random walk, Nature 72
(1867) (1905) 342.
[18] J. Broch, D. Maltz, D. Johnson, Y. Hu, J. Jetcheva, A
Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing
Protocols, ACM MobiCom (1998) 85–97.
[19] X. Hong, M. Gerla, G. Pei, C. Chiang, A group mobility
model for ad hoc wireless networks, ACM international workshop on
Modeling, analysis and simulation of wireless and mobile systems
(1999) 53–60.