Mathl. Comput. Modelling Vol. 19, No. 1, pp. 7-19, 1994. Copyright © 1994 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
0895-7177(94)E0002-5 0895-7177/94 $6.00 + 0.00
Database Placement in Communication Networks for Minimizing the Overall
Transmission Cost
X.-M. LIN,* M. E. ORLOWSKA AND Y.-C. ZHANG
Department of Computer Science, The University of Queensland
St. Lucia, QLD 4072, Australia
{lxue,maria,yan}@cs.uq.oz.au
(Received October 1993; accepted November 1993)
Abstract. The minimum spanning tree write policy for the maintenance of the consistency of a distributed database, where replicated data exist, has been proposed in [1]. In this paper, we first present a data placement heuristic algorithm in general networks for minimizing the overall transmission cost for processing the typical demands of queries (by a "simple" process strategy) and updates (by the minimum spanning tree write policy). Several interesting optimality estimation results of this algorithm are shown, while the computational intractability of the complete optimization, with respect to the simple strategy, is shown as well. Secondly, we apply a classical hill climbing technique to obtain a dynamic database placement algorithm based on an employed optimizer, that is, a collection of distributed query process algorithms. This is guaranteed to output a "locally optimal" data allocation. The implementation results also show that those two heuristics work well in practice.
Keywords-Communication cost, Data placement, Distributed database, Network, Optimization.
1. INTRODUCTION
Data transmission cost study in a network usually includes the investigation of the path selection
problem [2]. In this paper, we assume that data communication between different sites travels
through the shortest path [2] which is described as follows.
Given a network, each link $l_i$ has been assigned a maximum volume $w_i$ (i.e., link capacity) for a packet movement, and each packet pays cost $c_i$ to cross the link. Here $c_i$ could be either the time for a packet to traverse the link or the expense for a packet movement. Data required to traverse between different sites must first be packed into several packets in order to cross links to its destination. Further, each packet should have the size $w_i$ with respect to each link $l_i$, but there is at most one whose size is possibly smaller than $w_i$. In a heterogeneous communication network, those $c_i$ and $w_i$ are not necessarily the same with respect to the links, and a repacking may be required at a gateway (i.e., an interconnection between different local networks). Suppose that data volume $N$ is required to go from site $a$ to site $b$ in a network through a path $p$ which is connected by several links $l_i$ ($l_i \in p$). Thus, the total transmission cost for $N$ to pass the path can be expressed as
$$\sum_{l_i \in p} \lceil N / w_i \rceil \, c_i.$$
This can be approximately rewritten as
$$N \sum_{l_i \in p} c_i / w_i.$$
*Corresponding author.
The shortest path data communication is to choose a path $p$ connecting $a$ and $b$ such that $\sum_{l_i \in p} c_i / w_i$ is minimized.
A communication network $N$ can be represented as a weighted undirected graph [3] $(V, E, p)$ such that each vertex in $V$ represents a site, each edge represents a two-way communication link, and $p$ is assigned so that for each edge $e_i$, $p(e_i) = c_i / w_i$. In this paper, we may assume that all $p(e_i)$ are positive integers, since we can redefine the cost unit to make all $c_i / w_i$ integers.
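The packet model above can be illustrated with a small Python sketch. It reads the model literally (on each link the data is repacked into full packets plus at most one smaller packet) and compares that exact count with the linear form used to define $p(e_i)$. The function names and the two-link path are hypothetical, chosen only for illustration.

```python
import math


def exact_path_cost(volume, links):
    """Cost of shipping `volume` units over a path.

    `links` is a list of (c_i, w_i) pairs: per-packet cost and packet capacity.
    On each link the data is repacked into ceil(volume / w_i) packets.
    """
    return sum(c * math.ceil(volume / w) for c, w in links)


def approx_path_cost(volume, links):
    """Linear approximation: volume times the sum of c_i / w_i over the path."""
    return volume * sum(c / w for c, w in links)


# Hypothetical two-link path: (cost per packet, packet capacity).
path = [(3, 10), (2, 4)]
exact = exact_path_cost(25, path)    # 3 packets * 3 + 7 packets * 2
approx = approx_path_cost(25, path)  # 25 * (0.3 + 0.5)
```

The approximation drops the rounding of the last packet on each link, so it can only underestimate the exact cost, by less than $c_i$ per link.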
We consider a database which is located (possibly replicated) among the sites in a communication network. One of the major aspects in distributed databases is to minimize the transmission cost of data shipping between different sites for processing a transaction, since it has been recognized over the last decade that the transmission cost is the dominant factor in processing a transaction in a distributed database.
A number of significant results in processing a distributed query for minimizing transmission cost have been achieved in the last decade [4-14]. Meanwhile, work on processing a distributed update with the minimum transmission cost can be found in [1,15,16].
Of course, the following question has arisen in distributed database design. How can we find the optimal data placement such that the overall transmission cost for processing typical demands of transactions may be minimized? This data placement (data allocation) problem, also previously called the file-allocation problem in distributed computing, has been extensively studied [15-21]. Achievements in the classical file-allocation problem were surveyed in [19]. A simple application
of the file-allocation problem leads to a simple treatment (simple strategy) for processing a
distributed query. The simple treatment requires only that all referenced (by a query) relations
are sent to the result site, at which a query result is required, to finally perform the query at
the result site. Their works deal with processing a distributed query either through a uniform
communication network [17] (each pair of sites must be connected by a two-way communication
link, and all wi and ci are the same) or by the simple treatment [1,16,21].
The minimum spanning tree policy (see Section 2) for update propagation has been introduced recently in [1,16] for solving the data placement problem. However, they still use the simple strategy for processing a query. Further, to achieve the optimal solution, only special networks are considered.
To address the limited coverage of the previous works, we shall consider a general network and address the data placement problem not only with respect to the simple treatment for processing a query. We also use the minimum spanning tree policy for update propagation. These distinguish our work from all previous works.
Generally, each of the following factors will substantially impact on the quality and overall
performance of a data allocation:
1. How do we split a relation into several fragments?
2. What kind of strategy is used to process each transaction?
3. What kind of approach is used to find a data allocation, based on the solutions of 1 and 2, to minimize the overall communication costs?
There are a number of works on how to split a relation into fragments [22-26]. In this paper, we assume that a fragmentation, the set of fragments, has been automatically obtained by using the information about a set of most frequently used transactions [24].
The relationship between factor 2 and factor 3 makes the data allocation problem logically
intractable even if a fragmentation is given:
The quality of a data allocation can be evaluated through transaction processing strategies.
Different strategies may lead to different minimum overall transmission costs. On the other
hand, without a specific data allocation, it is generally difficult to justify what strategies should be chosen.
This logical intractability is mainly caused by considering the inter-relationship between the fragments (relations or objects) for processing a transaction.
The following will be applied to solving the above logical intractability. We first adopt the
simple treatment for a query process, and the minimum spanning tree policy for an update
propagation, to develop a data allocation algorithm on a given database fragmentation. Then, we propose an approach to refine the data allocation obtained from this algorithm. This refinement
approach starts from an initial data allocation, and then iteratively and greedily reduces the
overall transmission cost by a classical hill climbing technique based on an employed optimizer.
Because current developments in query process optimization are far from perfect, our refinement
approach may allow the use of any query process optimization approach. The rest of the paper is organized as follows. In Section 2, we present some preliminaries which
include the necessary formalizations. As well, we present the motivations of our research in more
detail. In Section 3, a heuristic algorithm is presented for obtaining a data allocation under given
simple query strategies, together with its theoretical performance guarantee. In Section 4, we
present a dynamic approach for obtaining a data allocation in cooperation with an employed
optimizer. This is followed by the conclusions, remarks, and experiment reports.
2. PRELIMINARIES AND MOTIVATIONS
This section includes a background discussion with respect to networks, fragments, and trans-
action process strategies.
2.1. Networks
As described above, a communication network $N$ can be represented as a weighted graph $(V, E, p)$ such that $p$ assigns each edge $e_i \in E$ a positive integer $p(e_i)$ which represents the cost for a unit data volume to traverse this link. The transmission cost, as we mean in this paper, of a data volume $U$ shipping through a link $e$ is $U \times p(e)$. Note that we can use $(u, v)$ to represent an edge where $u$ and $v$ are its two ends. A network $N$ is metric if it is fully connected, and for any triple $u$, $v$, $w$ of distinct sites:
$$p((u, v)) \le p((u, w)) + p((w, v)).$$
A communication network $N = (V, E, p)$ has a corresponding metric network $\hat{N} = (V, \hat{E}, \hat{p})$ such that $\hat{N}$ is constructed from $N$ as follows:
• $E \subseteq \hat{E}$,
• $\hat{p}(e) = p(e)$ if $e \in E$,
• $\hat{p}((u, v))$ is defined to be the length of the shortest path connecting $u$ and $v$ in $N$ if $(u, v) \notin E$.
Note that we assume that the communication between two sites in a network is through the
shortest path. It follows that finding a data allocation with the minimum overall transmission cost on a network is equivalent to finding such a data allocation on its metric map.
In the rest of the paper, we consider only metric networks. “Metric network” is abbreviated
to “network.”
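One way to compute the distances of a metric map is an all-pairs shortest-path pass. The Python sketch below uses the Floyd-Warshall algorithm, which is our choice for illustration and is not named in the paper. It assigns the shortest-path length to every pair of sites, including pairs already joined by a link, which matches the assumption that communication always follows the shortest path. The function name `metric_map`, the 0-based node numbering, and the sample network are assumptions.

```python
def metric_map(n, edges):
    """Return an n x n matrix of shortest-path distances.

    `edges` maps undirected pairs (u, v) to positive weights; nodes are 0..n-1.
    The input network is assumed to be connected.
    """
    inf = float("inf")
    dist = [[0 if u == v else inf for v in range(n)] for u in range(n)]
    for (u, v), w in edges.items():
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Floyd-Warshall relaxation over every intermediate node k.
    for k in range(n):
        for u in range(n):
            for v in range(n):
                if dist[u][k] + dist[k][v] < dist[u][v]:
                    dist[u][v] = dist[u][k] + dist[k][v]
    return dist


# Hypothetical chain network 0 - 1 - 2 - 3 with link weights 1, 2, 3.
chain = metric_map(4, {(0, 1): 1, (1, 2): 2, (2, 3): 3})
```

The resulting matrix is symmetric and satisfies the triangle inequality, so it can be used directly wherever the text refers to a (metric) network.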
2.2. Fragmentations, Data Allocation, and Transactions
A primary fragmentation of a relational database is a set $F = \{f_i : 1 \le i \le m\}$ of fragments with the property that $f_i \ne f_j$ if $i \ne j$. For the correctness issue of a fragmentation, we refer readers to [22,24-26]. Allocation of a primary fragmentation may induce a duplicated fragmentation in which each fragment may have several copies.
For the remainder of this paper, an element in a primary fragmentation will always be called a
fragment, while each element in a duplicated fragmentation will be called a copy of a fragment.
Clearly, we need only to consider the allocation problem of a primary fragmentation. Thus,
“primary fragmentation” is abbreviated to “fragmentation” in the rest of the paper.
A data allocation (data placement) $L$ of $F$ on a network $N = (V, E, p)$ is a mapping from $F$ to $2^V$. That is, for each fragment $f_i \in F$, $L(f_i) \subseteq V$. A data allocation with respect to a fragment $f_i$ is also called a data allocation (data placement) of $f_i$.
Without loss of generality, in this paper we assume that the transactions are either query only
or update only, and transactions are expressed by fragments. For a transaction consisting of several queries and updates, we could view it as several transactions.
2.3. Transaction Processings
Suppose that $F$ is a fragmentation, $N = (V, E, p)$ is a network, $L$ is a data allocation of $F$ on $N$, and $T$ is a transaction set. Further, let $F = \{f_i : 1 \le i \le m\}$ and $V = \{j : 1 \le j \le n\}$.
In this section, an overview is given of the current developments of techniques for processing a transaction in distributed databases.
Minimum spanning tree policy for an update
The consistency of a database should be maintained after an update. The following assumption
about the expression of an update may atomically maintain the consistency between different
fragments:
a transaction causing an update of several different fragments is always assumed to be
expressed in terms of those fragments.
Further, to maintain the consistency between different copies of a fragment, one needs to
propagate an update between those copies. Suppose that a user at node j of the network N
issues an update of a fragment fi. The following update strategy, MST-strategy, is applied to
update all copies of fi located on N:
• the route for processing this transaction is a minimum spanning tree of $L(f_i) \cup \{j\}$ in $N$.
For example, a given network, as illustrated in Figure 1, has the fragment fi located at site 1,
site 2, and site 4. A transaction issued at site 3 requires an update of fi. Using MST-strategy,
the update route is from site 3 to site 4, and then from site 4 to site 1 and site 2. In earlier papers
[17-19], the naive broadcasting policy (as called in [1]) was applied to update copies of the same
fragment. The naive policy asked the issue site to send write data content directly to every site
that has a copy of the fragment. In the above example, the route via the broadcasting strategy is
from site 3 to sites 1, 2, and 4. Usually, MST-strategy leads to a smaller transmission cost than
the naive broadcasting strategy, since the route of the naive policy is also a spanning tree.
Let $U_{ij}$ denote the total data volume required by the transactions in $T$ issued at node $j$ to update the fragment $f_i$. The overall transmission cost for updating $f_i$ by MST-strategy is:
$$\sum_{j=1}^{n} U_{ij}\,\mathrm{vmsp}(j, L(f_i)). \tag{1}$$
Here $\mathrm{vmsp}(j, L(f_i))$ is the summation of the weights of the links of the minimum spanning tree of $\{j\} \cup L(f_i)$ in $N$. The overall transmission cost to process the update type transactions of $T$ with respect to $F$ and $L$ is:
$$\sum_{i=1}^{m} \sum_{j=1}^{n} U_{ij}\,\mathrm{vmsp}(j, L(f_i)). \tag{2}$$
Figure 1. MSP-strategy.
We should note that MST-strategy is optimal if we consider only the routes consisting of the
issuing site and the sites where $f_i$ is located. Finding the optimal propagation route for an update is NP-hard if the route is allowed to include any sites (minimum Steiner tree problem [27]).
In the rest of the paper, we shall use MST-strategy for an update process.
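The difference between MST-strategy and naive broadcasting can be made concrete with a short Python sketch. It computes vmsp with Prim's algorithm (our choice; the paper does not prescribe a particular minimum spanning tree algorithm) on a metric distance matrix. The distance matrix below is hypothetical and is not the network of Figure 1; sites are numbered from 0.

```python
def vmsp(dist, j, copies):
    """Weight of a minimum spanning tree over {j} united with `copies`.

    `dist` is a symmetric matrix of metric distances between sites.
    """
    nodes = set(copies) | {j}
    in_tree = {j}
    total = 0
    while in_tree != nodes:
        # Prim's step: cheapest edge leaving the current tree.
        w, v = min((dist[a][b], b) for a in in_tree for b in nodes - in_tree)
        total += w
        in_tree.add(v)
    return total


def broadcast_cost(dist, j, copies):
    """Naive policy: the issuing site sends the update directly to every copy."""
    return sum(dist[j][k] for k in copies if k != j)


# Hypothetical metric network on four sites.
dist = [
    [0, 1, 3, 2],
    [1, 0, 3, 2],
    [3, 3, 0, 1],
    [2, 2, 1, 0],
]
mst_route = vmsp(dist, 2, {0, 1, 3})            # update issued at site 2
naive_route = broadcast_cost(dist, 2, {0, 1, 3})
```

Per unit of update volume, the spanning tree route is never more expensive than the broadcast route, because the broadcast star is itself a spanning tree of the same node set.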
Distributed query process
According to a query processing optimization technique [28], we may always assume that
selection and projection operations (initial process) have been pushed down on the query tree
when processing a query; that is, it is necessary to first process the relevant selections and
projections on the required fragments at the sites where those fragments are located.
A simple strategy to process a query is to send the contents of all fragments, which are required
for access, to the query result site to perform the query. Meanwhile, if a fragment has several
copies in the allocation $L$, then the closest copy to the result site is chosen. For example, assume that $f_1(A, B)$ and $f_2(B, C)$ are two fragments, and that $f_1$ is located at site 1 and $f_2$ is allocated at site 2 and site 4 of the network illustrated in Figure 1. A query, resulting at site 3, is represented in SQL [29] as follows:
    SELECT f1.A, f1.B, f2.C
    FROM f1, f2
    WHERE f1.B = f2.B AND f1.A <= 50
The simple strategy to process this query is to do a selection on $f_1$ at site 1 with the condition f1.A <= 50, and then to send the result of the selection and the copy of fragment $f_2$ at site 4 to site 3 to implement the join.
Let $Q_{ij}$ denote the total data volume of $f_i$ required to be sent to site $j$ to process the queries in $T$ which resulted at site $j$ by the simple strategy. Following this, the overall transmission cost to process the queries in $T$ is:
$$\sum_{i=1}^{m} \sum_{j=1}^{n} Q_{ij}\, d(j, L(f_i)), \tag{3}$$
where $d(j, L(f_i)) = \min\{p((j, l)) : l \in L(f_i)\}$ (the shortest distance from $j$ to $L(f_i)$ in $N$) if $j \notin L(f_i)$, and $d(j, L(f_i)) = 0$ if $j \in L(f_i)$. The overall transmission cost for processing the queries in $T$ with respect to $f_i$ is defined as:
$$\sum_{j=1}^{n} Q_{ij}\, d(j, L(f_i)). \tag{4}$$
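The per-fragment query cost under the simple strategy is straightforward to evaluate. The following Python sketch computes the nearest-copy distance and the sum over result sites for one fragment. The names `nearest_copy_distance` and `query_cost`, the 0-based site numbering, and the sample data are assumptions for illustration.

```python
def nearest_copy_distance(dist, j, copies):
    """d(j, L(f_i)): distance from site j to its closest copy.

    Returns 0 when j itself holds a copy, because dist[j][j] == 0.
    """
    return min(dist[j][l] for l in copies)


def query_cost(dist, q_i, copies):
    """Sum over result sites j of Q_ij * d(j, L(f_i)) for one fragment."""
    return sum(q * nearest_copy_distance(dist, j, copies) for j, q in enumerate(q_i))


# Hypothetical metric network on four sites and query volumes per result site.
dist = [
    [0, 1, 3, 2],
    [1, 0, 3, 2],
    [3, 3, 0, 1],
    [2, 2, 1, 0],
]
copies = {1, 3}
single = query_cost(dist, [0, 0, 10, 0], copies)  # only site 2 asks for the fragment
mixed = query_cost(dist, [5, 0, 10, 0], copies)   # sites 0 and 2 ask for it
```

Adding a copy can only shrink each nearest-copy distance, so the query part of the cost never increases when the allocation grows; the update part is what pushes in the opposite direction.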
There are a number of other ways developed to process a distributed query (especially to
process a join) for reducing the transmission cost with respect to the simple strategy, while the
problem of minimizing the transmission cost is known to be NP-complete [9,30]. Those proposed
approaches can be classified into two classes.
One is called the semi-join reduction approach. This approach suggests first applying a semi-join to each referenced relation (fragment), and then sending each semi-join reduced relation to the result site to perform the join. Numerous algorithms [4,6-8,10,11,13,31] have been published on the semi-join reduction approach, which are aimed at producing the optimal execution plan for a distributed join.
The other is called the join-based approach. This approach suggests using the join operation
only to reduce transmission cost. That is, instead of sending each referenced relation directly to
the result site, one may check each group of several fragments (relations) to see whether or not
the operation of implementing the join on them at some site and then sending the result to the
result site to perform the final join will save the transmission cost. Several papers [9,14,32] have
addressed this approach. Recently, a combination of these two approaches has been studied [7,33,34].
2.4. Motivations
Our motivations for the research are based on observations as follows:
Observation 1. The coverage of the previous works on data allocation has some limitation as pointed out in Section 1. Also, those works are concentrated on a special case in which
they are interested. This leads to difficulties for a generalization.
Observation 2. An obtained execution plan (better than the simple strategy) for a distributed query process usually should depend on a specific data allocation.
Observation 3. The overall transmission cost of a data allocation also depends on the employed strategies for processing transactions.
Thus, in this paper we will put our emphasis on a general network to resolve the problem in
Observation 1. We use a refinement algorithm to improve the quality of data allocation where
only the simple strategy has been addressed to resolve the problems in Observations 2 and 3.
3. A DATA ALLOCATION ALGORITHM UNDER THE SIMPLE STRATEGY
Suppose that $F = \{f_i : 1 \le i \le m\}$ is a fragmentation, $N = (V, E, p)$ is a network with the node set $V = \{j : 1 \le j \le n\}$, and $T$ is a transaction set. In this section, we investigate the
problem of finding a data allocation L of F on N so that the overall transmission cost to process
the transactions in T by the simple query strategy and MST-strategy is minimized.
Let Uij denote the total data volume required by the transactions in T which are issued at
node j to update the fragment fi, and let Qij denote the total data volume of fi required to send
to the result site j to process the queries in T by the simple strategy. The above optimization
problem can be precisely stated as the following problem.
Simple data allocation problem (SDAP)
Find an allocation $L$ of $F$ on $N$ so that the following value is minimized:
$$\sum_{i=1}^{m} \sum_{j=1}^{n} Q_{ij}\, d(j, L(f_i)) + \sum_{i=1}^{m} \sum_{j=1}^{n} U_{ij}\,\mathrm{vmsp}(j, L(f_i)). \tag{5}$$
The SDAP is equivalent to the problem of finding an allocation $L$ of each $f_i$ so that the following value is minimized:
$$\sum_{j=1}^{n} Q_{ij}\, d(j, L(f_i)) + \sum_{j=1}^{n} U_{ij}\,\mathrm{vmsp}(j, L(f_i)). \tag{6}$$
Thus, to investigate SDAP, we need only to investigate the following problem.
Simple data allocation problem of one fragment (SDAPOF)
Find an allocation L for a given fragment fi so that (6) is minimized.
In the appendix, we will show that both SDAP and SDAPOF are NP-hard.
THEOREM 1. Both SDAP and SDAPOF are NP-hard.
In the following, we present an effective heuristic algorithm, SIMPLE, to solve SDAPOF for
each fragment fi. Then we combine all the produced data allocations on each individual fragment
to obtain an approximation solution of SDAP.
3.1. Algorithm SIMPLE
Since SDAPOF is NP-hard, it is unlikely for us to find a polynomial time algorithm to solve
it [27]. In this section, we present an approximation algorithm, SIMPLE, for SDAPOF. We first
characterize some necessary conditions for the optimal solution of SDAPOF to reduce the search
space. For $1 \le i \le m$, let $U_i = \sum_{j=1}^{n} U_{ij}$; and for $1 \le i \le m$ and $1 \le j \le n$, let $B_{ij} = Q_{ij} + U_{ij} - U_i$. Here $U_i$ denotes the total update data volume on $f_i$ required by $T$, and $B_{ij}$ provides a beneficial measurement at site $j$. The total transmission cost with respect to a fragment $f_i$ and a placement $L$ of $f_i$ to process the transactions in $T$ by the simple strategy and MST-strategy is:
$$\mathrm{cost}(L(f_i)) = \sum_{j=1}^{n} Q_{ij}\, d(j, L(f_i)) + \sum_{j=1}^{n} U_{ij}\,\mathrm{vmsp}(j, L(f_i)).$$
The following lemma says that we should always locate a copy of fi at the site in the optimal
allocation if the beneficial measurement at the site is positive.
LEMMA 1. Given a fragment $f_i$, suppose that $L$ is an arbitrary allocation of fragment $f_i$. Then $\mathrm{cost}(L_1(f_i)) \le \mathrm{cost}(L(f_i))$, where $L_1(f_i) = L(f_i) \cup \{j\}$ for a site $j$ with $B_{ij} \ge 0$.
PROOF. Without loss of generality, we may assume that $j \notin L(f_i)$. We may immediately verify the following properties according to the definitions:
$$d(j, V_1) \le d(j, V_2) \quad \text{if } V_2 \subseteq V_1; \tag{7}$$
$$\mathrm{vmsp}(k, V_1 \cup \{j\}) \le d(j, V_1) + \mathrm{vmsp}(k, V_1); \tag{8}$$
$$\mathrm{vmsp}(k, V_1 \cup \{j\}) = \mathrm{vmsp}(j, V_1) \quad \text{if } j = k; \tag{9}$$
where $V_1$ and $V_2$ are subsets of the node set of the network. From the above properties, it follows that:
$$\mathrm{cost}(L(f_i)) \ge \sum_{k \notin L_1(f_i)} Q_{ik}\, d(k, L_1(f_i)) + Q_{ij}\, d(j, L(f_i)) + \sum_{k=1}^{n} U_{ik}\,\mathrm{vmsp}(k, L(f_i)), \tag{10}$$
$$\mathrm{cost}(L_1(f_i)) \le \sum_{k \notin L_1(f_i)} Q_{ik}\, d(k, L_1(f_i)) + U_{ij}\,\mathrm{vmsp}(j, L(f_i)) + \sum_{k \ne j} U_{ik}\,\bigl(\mathrm{vmsp}(k, L(f_i)) + d(j, L(f_i))\bigr). \tag{11}$$
Take (10) from (11):
$$\mathrm{cost}(L_1(f_i)) - \mathrm{cost}(L(f_i)) \le -B_{ij}\, d(j, L(f_i)) \le 0. \qquad \blacksquare$$
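The subtraction step in the proof is short but compresses some algebra. A sketch of the intermediate lines, using only the definitions $U_i = \sum_{k=1}^{n} U_{ik}$ and $B_{ij} = Q_{ij} + U_{ij} - U_i$ and our reading of (10) as a lower bound and (11) as an upper bound, is:

```latex
\begin{aligned}
\mathrm{cost}(L_1(f_i)) - \mathrm{cost}(L(f_i))
  &\le U_{ij}\,\mathrm{vmsp}(j, L(f_i))
     + \sum_{k \ne j} U_{ik}\,\mathrm{vmsp}(k, L(f_i))
     + \sum_{k \ne j} U_{ik}\, d(j, L(f_i)) \\
  &\quad - Q_{ij}\, d(j, L(f_i))
     - \sum_{k=1}^{n} U_{ik}\,\mathrm{vmsp}(k, L(f_i)) \\
  &= \Bigl(\sum_{k \ne j} U_{ik} - Q_{ij}\Bigr)\, d(j, L(f_i)) \\
  &= (U_i - U_{ij} - Q_{ij})\, d(j, L(f_i)) \\
  &= -B_{ij}\, d(j, L(f_i)).
\end{aligned}
```

The query sums over $k \notin L_1(f_i)$ cancel, as do the vmsp terms, which leaves only the terms proportional to $d(j, L(f_i))$.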
From the proof of Lemma 1, we have the following corollary.
COROLLARY 1. Given a fragment $f_i$, suppose that $L$ is an arbitrary data allocation. Then $\mathrm{cost}(L_1(f_i)) < \mathrm{cost}(L(f_i))$, where $L_1(f_i) = L(f_i) \cup \{j\}$ for a node $j$ with $B_{ij} > 0$ and $j \notin L(f_i)$.
From Corollary 1, it follows that an allocation $L$ of $f_i$ with the minimum value of $\mathrm{cost}(L(f_i))$ must allocate $f_i$ to those nodes $j$ with $B_{ij} > 0$. As well, it follows from Lemma 1 that we may assume that an allocation $L$ with the minimum value of $\mathrm{cost}(L(f_i))$ always allocates $f_i$ to those nodes $j$ with $B_{ij} = 0$ in the case that $U_i > 0$. Thus, an allocation $L$ of $f_i$ with the minimum overall communication cost $\mathrm{cost}(L(f_i))$ can be viewed as an extension of the data allocation which allocates $f_i$ to those nodes $j$ with $B_{ij} \ge 0$. This is the basic idea for the development of the algorithm SIMPLE.
The algorithm SIMPLE starts by allocating copies of $f_i$ to those nodes $j$ with $B_{ij} \ge 0$. Then it
iteratively extends the allocation by adding one copy to a node such that each time, the chosen
extension has the minimum communication cost. Finally, a data allocation is chosen such that its cost is minimized among the initial one and these extensions.
Algorithm 1. SIMPLE
Input: $\{Q_{ij}, U_{ij} : 1 \le j \le n\}$, $N$, $f_i$;
Output: $L$ is an allocation of $f_i$;
{ if $U_i > 0$ then
  { $V_0 := \{j : B_{ij} \ge 0\}$;
    if $V_0 = \emptyset$ then cost($V_0$) := $\infty$;
    $V_{temp} := V - V_0$;
    for $j = 1$ to $|V| - |V_0|$ do
    { choose a node $l$ in $V_{temp}$ such that cost($V_{j-1} \cup \{l\}$) is minimized;
      $V_j := V_{j-1} \cup \{l\}$;
      $V_{temp} := V_{temp} - \{l\}$; }
    Choose a $V_j$ such that cost($V_j$) is minimized;
    $L(f_i) := V_j$; }
  else $L(f_i) := \{j : Q_{ij} > 0\}$. }
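A minimal Python sketch of the algorithm SIMPLE for a single fragment follows. It assumes a metric distance matrix `dist`, per-site query volumes `Q` and update volumes `U` (lists indexed by site, numbered from 0), and computes vmsp with Prim's algorithm. All function names and the tie-breaking rule (lowest node index first) are assumptions, not taken from the paper.

```python
def vmsp(dist, j, sites):
    """Weight of a minimum spanning tree over {j} united with `sites`."""
    nodes = set(sites) | {j}
    in_tree, total = {j}, 0
    while in_tree != nodes:
        w, v = min((dist[a][b], b) for a in in_tree for b in nodes - in_tree)
        total += w
        in_tree.add(v)
    return total


def cost(dist, Q, U, sites):
    """cost(L(f_i)) under the simple strategy and MST-strategy."""
    if not sites:
        return float("inf")  # an empty allocation is treated as infinitely costly
    query = sum(Q[j] * min(dist[j][l] for l in sites) for j in range(len(Q)))
    update = sum(U[j] * vmsp(dist, j, sites) for j in range(len(U)) if U[j])
    return query + update


def simple(dist, Q, U):
    """Greedy allocation of one fragment, following the steps of SIMPLE."""
    n = len(dist)
    u_total = sum(U)
    if u_total == 0:
        return {j for j in range(n) if Q[j] > 0}
    # V_0: every site whose beneficial measurement B_ij is non-negative.
    current = {j for j in range(n) if Q[j] + U[j] - u_total >= 0}
    best, best_cost = set(current), cost(dist, Q, U, current)
    remaining = set(range(n)) - current
    while remaining:
        # Extend by the node that yields the cheapest enlarged allocation.
        nxt = min(sorted(remaining), key=lambda v: cost(dist, Q, U, current | {v}))
        current = current | {nxt}
        remaining.discard(nxt)
        c = cost(dist, Q, U, current)
        if c < best_cost:
            best, best_cost = set(current), c
    return best


# Hypothetical uniform network on three sites (all distances equal to 1).
uniform = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
read_heavy_site = simple(uniform, [10, 0, 0], [0, 1, 1])
read_only = simple(uniform, [3, 0, 4], [0, 0, 0])
read_everywhere = simple(uniform, [10, 10, 10], [1, 0, 0])
```

In the first example only site 0 has a non-negative beneficial measurement, and no extension lowers the cost, so a single copy stays at site 0. In the third example every site qualifies for the initial set, so the fragment is fully replicated.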
3.2. Performance Evaluation of the Algorithm SIMPLE
Clearly, this algorithm runs in polynomial time. Further, we have the following performance
guarantee.
THEOREM 2. For a fragment $f_i$, suppose that $L$ is a data allocation given by the algorithm SIMPLE and $L_{opt}$ is the data allocation with the minimum overall transmission cost to process the given transactions under the simple strategy. Further, suppose that $C$ is the largest value of the weights of the edges and $c$ is the smallest value of the weights of the edges in the network. Then
$$\frac{\mathrm{cost}(L(f_i))}{\mathrm{cost}(L_{opt}(f_i))} \le \frac{C}{c}.$$
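PROOF. Clearly, if $U_i = 0$ then the algorithm SIMPLE will output a data allocation with the minimum overall transmission cost 0. Below, we prove that this theorem is true for $U_i > 0$.
Suppose that $V_0 = \{j : B_{ij} \ge 0\}$ and $L_1$ is the allocation of $f_i$ such that $L_1(f_i) = L_{opt}(f_i) \cup V_0$. From Lemma 1, it follows that $\mathrm{cost}(L_1(f_i)) = \mathrm{cost}(L_{opt}(f_i))$.
Let $L_0$ be an allocation of $f_i$ such that $L_0(f_i) = V_0$ if $V_0 \ne \emptyset$, otherwise $L_0(f_i) = \{j\}$ where $j$ is an arbitrary node in $L_{opt}(f_i)$. It is clear that $L_0(f_i) \subseteq L_1(f_i)$.
From the algorithm SIMPLE, we have that $\mathrm{cost}(L(f_i)) \le \mathrm{cost}(L_0(f_i))$. We now prove that
$$\frac{\mathrm{cost}(L_0(f_i))}{\mathrm{cost}(L_1(f_i))} \le \frac{C}{c}.$$
Suppose that $|L_0(f_i)| = K_0 + 1$ and $|L_1(f_i)| = K_1 + 1$. We have that
$$\begin{aligned}
\mathrm{cost}(L_0(f_i)) &= \sum_{j \notin L_0(f_i)} Q_{ij}\, d(j, L_0(f_i)) + \sum_{j=1}^{n} U_{ij}\,\mathrm{vmsp}(j, L_0(f_i)) \\
&\le \sum_{j \notin L_1(f_i)} Q_{ij} C + \sum_{j \in L_1(f_i) - L_0(f_i)} Q_{ij} C + K_0 \sum_{j \in L_0(f_i)} U_{ij} C + (K_0 + 1) \sum_{j \notin L_0(f_i)} U_{ij} C \\
&\le \sum_{j \notin L_1(f_i)} (Q_{ij} + U_{ij}) C + \sum_{j \in L_1(f_i) - L_0(f_i)} (Q_{ij} + U_{ij}) C + K_0 U_i C.
\end{aligned}$$
From the fact that $L_0(f_i)$ contains all nodes $j$ with $B_{ij} \ge 0$ and the above fact, it follows that
$$\mathrm{cost}(L_0(f_i)) \le \sum_{j \notin L_1(f_i)} (Q_{ij} + U_{ij}) C + K_1 U_i C.$$
Similarly,
$$\begin{aligned}
\mathrm{cost}(L_1(f_i)) &= \sum_{j \notin L_1(f_i)} Q_{ij}\, d(j, L_1(f_i)) + \sum_{j=1}^{n} U_{ij}\,\mathrm{vmsp}(j, L_1(f_i)) \\
&\ge \sum_{j \notin L_1(f_i)} Q_{ij} c + K_1 \sum_{j \in L_1(f_i)} U_{ij} c + (K_1 + 1) \sum_{j \notin L_1(f_i)} U_{ij} c \\
&\ge \sum_{j \notin L_1(f_i)} (Q_{ij} + U_{ij}) c + K_1 U_i c.
\end{aligned}$$
Hence,
$$\frac{\mathrm{cost}(L_0(f_i))}{\mathrm{cost}(L_1(f_i))} \le \frac{C}{c}. \qquad \blacksquare$$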
COROLLARY 2. Suppose that the algorithm SIMPLE is applied to each fragment $f_i$ in $F$ to obtain the allocation $L_i$ of $f_i$. Then, in the data allocation $L$ of $F$ such that $L(f_i) = L_i(f_i)$ for each $f_i$, we have that
$$\frac{\sum_{i=1}^{m} \mathrm{cost}(L(f_i))}{\sum_{i=1}^{m} \mathrm{cost}(L_{opt}(f_i))} \le \frac{C}{c},$$
where $C$ is the largest value of the weights of the edges and $c$ is the smallest value of the weights of the edges in the given network.
From Corollary 2, it follows that:
COROLLARY 3. Suppose that the algorithm SIMPLE is applied to each fragment $f_i$ in $F$ to obtain the allocation $L_i$ of $f_i$ in a uniform network where $c = C$. Then the data allocation $L$ of $F$ with $L(f_i) = L_i(f_i)$ for each $f_i$ has the minimum overall transmission cost under the simple strategy and MST-strategy.
Furthermore, suppose the given communication network $N$ is not fully connected, each edge has the same weight, and the diameter [3] of the underlying graph is $d$ (for example, the graph in Figure 2 has diameter 2). Then after application of the algorithm SIMPLE on the metric map of $N$ to get a data allocation $L$, we have, by Theorem 2 (the largest weight in the metric map is $d$ times the smallest):
$$\frac{\mathrm{cost}(L(f_i))}{\mathrm{cost}(L_{opt}(f_i))} \le d.$$
Figure 2. Diameter 2.
Theoretically, the algorithm SIMPLE has a good performance for a local network, that is, a network in which $C/c$ is small. Our experiments show that in practice, this algorithm is quite efficient and quite effective (in most cases among random experiments it achieves the optimum) for a general network.
3.3. A Heuristic Algorithm: REFINEMENT
As pointed out in Section 2, the recent developments of query processing optimization tech-
niques may always guarantee to output an execution plan better than the simple strategy. The
data allocation output by the algorithm SIMPLE needs to be refined because the strategies to
process queries are not necessarily simple. In this section, we present a framework of a refinement
algorithm, based on an employed distributed optimizer OP, on the data allocation Lo obtained by
the algorithm SIMPLE. Suppose that cost(L, T, OP) is the overall transmission cost to process T
by OP on the data allocation L.
A local modification of a data allocation L of F is either:
• for a fragment $f_i$, drop a copy of $f_i$ from a node $j$ with $j \in L(f_i)$; or
• for a fragment $f_i$, add a copy of $f_i$ to a node $j$ with $j \notin L(f_i)$; or
• for a fragment $f_i$, move a copy of it from a node $j$ with $j \in L(f_i)$ to a node $l$ with $l \notin L(f_i)$.
A data allocation L of F on N is locally optimal with respect to a distributed optimizer OP if
no local modification will reduce the overall communication cost to process T by OP.
The algorithm REFINEMENT iteratively refines Lo through the choice of a local modification,
so that the overall communication cost is greedily reduced, until there is no reduction.
Algorithm 2. REFINEMENT
Input: $F$ is a fragmentation, $N$ is a network with node set $V$, OP is a distributed optimizer, $L_0$ is a data allocation of $F$ on $N$;
Output: $L$ is a data allocation of $F$ on $N$;
{ $L := L_0$;
  repeat
    $c_0$ := cost($L_0$, $T$, OP); $c := c_0$;
    for each fragment $f_i$ do
    { for each $j \in L_0(f_i)$ do
        for each node $k \ne j$ do
        { Locally modify $L_0$ to produce $L_1$ so that $L_1(f_i) := L_0(f_i) - \{j\} \cup \{k\}$, and so that $L_1(f_r) := L_0(f_r)$ for $f_r \ne f_i$;
          $c_1$ := cost($L_1$, $T$, OP);
          if $c_1 < c$ then { $L := L_1$; $c := c_1$ } };
      for $j \in V - L_0(f_i)$ do
      { Locally modify $L_0$ to produce $L_1$ so that $L_1(f_i) := L_0(f_i) \cup \{j\}$, and so that $L_1(f_k) := L_0(f_k)$ for $f_k \ne f_i$;
        $c_1$ := cost($L_1$, $T$, OP);
        if $c_1 < c$ then { $L := L_1$; $c := c_1$ } } };
    $L_0 := L$;
  until $c = c_0$ (no reduction on $c_0$). }
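The hill climbing loop of REFINEMENT can be sketched in Python with the optimizer abstracted as a callable. Here `total_cost` stands in for cost($L$, $T$, OP), and the allocation is a dictionary from fragment names to sets of nodes. These representation choices, the `max_iters` cap, and the toy cost function used in the example are all assumptions for illustration.

```python
def refinement(fragments, nodes, allocation, total_cost, max_iters=100):
    """Greedy local search over drop, add, and move modifications.

    `total_cost(allocation)` plays the role of cost(L, T, OP).
    """
    current = {f: frozenset(s) for f, s in allocation.items()}
    for _ in range(max_iters):
        base = total_cost(current)
        best, best_cost = current, base
        for f in fragments:
            here = current[f]
            candidates = []
            # Replace j by k: a drop when k already holds a copy, a move otherwise.
            for j in here:
                for k in nodes:
                    if k != j:
                        candidates.append((here - {j}) | {k})
            # Add a copy at a node that does not hold one yet.
            for j in nodes:
                if j not in here:
                    candidates.append(here | {j})
            for cand in candidates:
                trial = dict(current)
                trial[f] = cand
                c = total_cost(trial)
                if c < best_cost:
                    best, best_cost = trial, c
        if best_cost >= base:
            break  # no local modification reduces the cost
        current = best
    return current


# Toy cost, purely hypothetical: prefers exactly two copies, one of them at node 1.
def toy_cost(alloc):
    return sum(abs(len(s) - 2) for s in alloc.values()) + (0 if 1 in alloc["f"] else 5)


refined = refinement(["f"], [0, 1, 2], {"f": {0}}, toy_cost)
```

As in the pseudocode, each pass evaluates every local modification of the allocation held at the start of the pass and keeps the single cheapest one. In practice `total_cost` would only need to re-run the optimizer on the transactions that access the modified fragment.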
It is clear that in the algorithm REFINEMENT, each iteration runs in polynomial time if
OP runs in polynomial time for each transaction. Each iteration never increases the overall
communication cost, and the algorithm will stop if there is no reduction in the overall transmission
cost. In practice, we can choose a fixed number as the maximal iteration times. Let Ti be a
subset of the given transaction set in which all transactions are required to access fragment fi.
To implement the algorithm REFINEMENT efficiently, for each local modification of fragment fi,
we only need to run the optimizer again for the transactions in Ti instead of the whole T.
From the algorithm REFINEMENT, the following property immediately holds:
PROPOSITION 1. The algorithm REFINEMENT produces a locally optimal data allocation of F.
3.4. Experiments and Remarks
Note that the algorithm REFINEMENT allows the employed optimizer to use any kind of
distributed query process algorithms. To test the effectiveness and efficiency of the algorithm
REFINEMENT, we have developed a prototype of a distributed process optimizer JK which
consists of MST-strategy and the algorithm in [9]. The distributed query process algorithm
in [9] is a join-based approach, and it may produce the optimal (minimum transmission cost)
execution plan for processing a chain query by the join-based approach. We have tested the algorithms SIMPLE and REFINEMENT in extensive experiments in which the product of the node set size of a network and the fragmentation size ranges from 18 to 100. In our environment, we
always assume that each given query in the given transaction set T should have a chain order
(i.e., the implementation of a query follows that of a chain query), like the assumption in [9]. The
experiments show that the combination of the algorithms SIMPLE and REFINEMENT works
very well in practice. We also observe that the algorithm REFINEMENT is slower, and in most
cases, outputs a worse locally optimal allocation if it starts from an arbitrary initial allocation.
We report our experiment results below.
First, we test the efficiency of the combination. The experiments show that the algorithm REFINEMENT converges very fast when the data allocation output by the algorithm SIMPLE is used as the initial allocation. We have not experienced more than 10 iterations.
Second, we test the effectiveness of the combination. Suppose that $L_{\min}$ is the data allocation with the minimum overall transmission cost under the restriction that each query is processed in a chain order and is processed by the join-based approach [9], and that $L_{loc}$ is the data allocation output through the algorithm SIMPLE and the algorithm REFINEMENT. Our extensive experiments have the following statistical results.
• In the case that the product of the node set size and the fragmentation size is not greater than 18,
$$\frac{\mathrm{cost}(L_{\min}, T, JK)}{\mathrm{cost}(L_{loc}, T, JK)} > 0.95.$$
• In the case that the product of the node set size and the fragmentation size is from 18 to 100, for each experiment we randomly generate $10^6$ data allocations $L_{rad}$. A random data allocation $L_{rad}$ has no more than $10^{-3}$ chance to be better than $L_{loc}$. Also,
$$\frac{\mathrm{cost}(L_{rad}, T, JK)}{\mathrm{cost}(L_{loc}, T, JK)} > 0.95.$$
Note that we may apply the algorithm REFINEMENT to a data allocation under the simple strategy. But our experiments showed that the algorithm SIMPLE is effective enough; that is, in most cases the algorithm REFINEMENT does not improve the output of the algorithm SIMPLE.
For future study, we shall try to extend the work in this paper to a network such that the
cost of each link is not a constant, but a function which depends on the network load and other
factors.
APPENDIX
In this appendix, we prove Theorem 1. Because SDAP and SDAPOF are equivalent, we prove
the NP-hardness of SDAPOF in the following.
From [27], it follows that we need only to prove the NP-completeness of the following problem:
Allocation Problem (AP)
INSTANCE: Given a network N with n nodes, a fragment fi, two integers Uij and Qij for each
node j in N, and an integer K. QUESTION: Is there a data allocation L of fi such that (6) is not greater than K?
LEMMA 2. AP is NP-complete.
PROOF. Note that the following problem has been shown to be NP-complete in [27].
Steiner Tree Problem (STP)
INSTANCE: Given a metric network $N = (V, E, p)$, a subset $V' \subseteq V$, and an integer $l$.
QUESTION: Is there a subtree $(V_1, E_1)$ of N such that $V' \subseteq V_1$ and $\sum_{e \in E_1} p(e) \le l$?
For each instance $I_1 = (N, V', l)$ of STP, where $N = (V, E, p)$ is a graph, $V = \{j : 1 \le j \le n\}$
and $V' \subseteq V$, we now construct an instance $I_2 = (\bar{N}, \{U_{ij}, Q_{ij} : 1 \le j \le n\}, K)$ of AP as follows,
where $\bar{N} = (\bar{V}, \bar{E}, \bar{p})$ is a network:
• $\bar{N} = N$;
• for $1 \le j \le n$, $U_{ij} = c$ and $Q_{ij} = nc$ if $j \in V'$, where c is a constant integer, and both $U_{ij}$
and $Q_{ij}$ are zero if $j \notin V'$;
• $K = c\,l\,|V'|$.
It can be immediately shown that for a solution $(V_1, E_1)$ of STP, we may construct an allocation
L, where $L(f_i) = V_1$, which is a solution of AP. Also from Lemma 1 and Corollary 1, it
follows that for a solution L of AP, the minimum spanning tree of $L(f_i)$ is a solution of STP.
Hence the lemma holds. ∎
Thus, Theorem 1 holds.
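The construction of the AP instance from an STP instance can be written out as a short sketch. It assumes that the bound is read as $K = c\,l\,|V'|$ and that nodes are numbered 1 to n. The function name `stp_to_ap` and the dictionary representation are hypothetical and chosen only for illustration. The network itself is passed through unchanged, so only U, Q and K are computed.

```python
def stp_to_ap(n, terminals, l, c=1):
    """Build the AP parameters (U, Q, K) from an STP instance.

    n         -- number of nodes, numbered 1..n (assumed numbering)
    terminals -- the subset V' of nodes that the subtree must contain
    l         -- the STP bound on the total edge weight
    c         -- the constant integer used in the construction
    """
    terminals = set(terminals)
    # U_ij = c for nodes in V', and zero elsewhere
    U = {j: (c if j in terminals else 0) for j in range(1, n + 1)}
    # Q_ij = n * c for nodes in V', and zero elsewhere
    Q = {j: (n * c if j in terminals else 0) for j in range(1, n + 1)}
    # Bound for the AP question, read here as K = c * l * |V'|
    K = c * l * len(terminals)
    return U, Q, K

U, Q, K = stp_to_ap(n=4, terminals={1, 3}, l=5, c=2)
print(U, Q, K)
```

Each of the three quantities is computed in time linear in n, so the transformation is polynomial, as a reduction requires.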
REFERENCES
1. O. Wolfson and A. Milo, The multicast policy and its relationship to replicated data placement, ACM Transactions on Database Systems 16 (1), 181-205 (1991).
2. Y. Mansour and B. Patt-Shamir, Greedy packet scheduling on shortest paths, Proceedings of the 10th ACM Symposium on Principles of Distributed Computing, pp. 165-176, (1991).
3. J.A. Bondy and U.S.R. Murty, Graph Theory with Applications, Macmillan, (1978).
4. P.M.G. Apers, A. Hevner and S.B. Yao, Optimization algorithms for distributed queries, IEEE Transactions on Software Engineering SE-9 (1), 57-68 (1983).
5. P.A. Bernstein and D. Chiu, Using semi-joins to solve relational queries, Journal of ACM 28 (1), 25-40 (1981).
6. P.A. Bernstein, N. Goodman, E. Wong, C.L. Reeve and J.B. Rothnie, Query processing in a system for distributed databases (SDD-1), ACM Transactions on Database Systems 6 (4), 602-625 (1981).
7. M.-S. Chen and P.S. Yu, Interleaving a join sequence with semijoins in distributed query processing, IEEE Transactions on Parallel and Distributed Systems 3 (5), 611-621 (1992).
8. A.R. Hevner and S.B. Yao, Query processing in distributed database systems, IEEE Transactions on Software Engineering SE-5 (3), 177-187 (1979).
9. M.W. Orlowski, On optimisation of joins in distributed database system, Future Database '92, pp. 106-114, World Scientific, (1992).
10. S. Pramanik and D. Vineyard, Optimizing join queries in distributed databases, IEEE Transactions on Software Engineering 14 (9), 1319-1326 (1988).
11. C.P. Wang, V.O.K. Li and A.L.P. Chen, One-shot semi-join execution strategies for processing distributed queries, 7th IEEE Data Engineering Conference, pp. 756-763, (1991).
12. E. Wong, Dynamic rematerialization: Processing distributed queries using redundant data, IEEE Transactions on Software Engineering SE-9 (3), 228-232 (1983).
13. C.T. Yu and C.C. Chang, Distributed query processing, ACM Computing Surveys 16 (4) (1984).
14. C.T. Yu, Z.M. Ozsoyoglu and K. Lam, Optimization of distributed tree queries, Journal of Computer and System Science 29, 399-433 (1984).
15. X. Lin, M. Orlowska and Y. Zhang, On data allocation with the minimum overall communication cost in distributed database design, International Conference on Information and Computing 93, IEEE Computer Press.
16. Replicated fragment allocation using a clustering technique, Proceedings of 4th Australian Database Conference, (1993).
17. P.M.G. Apers, Data allocation in distributed database systems, ACM Transactions on Database Systems 13 (3), 263-304 (1988).
18. R.G. Casey, Allocation of copies of a file in information network, Proceedings of the 1972 Spring Joint Computer Conference, AFIPS, pp. 617-625, (1972).
19. L.W. Dowdy and D.V. Foster, Comparative models of the file assignment problem, ACM Computing Surveys 14 (2), 287-313 (1982).
20. H.L. Morgan and K.D. Levin, Optimal program and data location in computer network, Communications of ACM 20 (5), 315-322 (1977).
21. D. Sacca and G. Wiederhold, Database partitioning in a cluster of processors, ACM Transactions on Database Systems 10 (1), 28-56 (1985).
22. S. Navathe et al., Vertical partitioning algorithms for database design, ACM Transactions on Database Systems 9 (4), 680-710 (1984).
23. S. Navathe and M. Ra, Vertical partitioning for database design: A graphical algorithm, ACM SIGMOD, 440-450 (1989).
24. X. Lin, M. Orlowska and Y. Zhang, A graph based cluster approach for vertical partitioning in database design, Data and Knowledge Engineering (1993) (to appear).
25. Y. Zhang, M. Orlowska and B. Colomb, An efficient test for the validity of hybrid knowledge fragmentation in distributed databases, International Journal of Software Engineering and Knowledge Engineering 2 (4), 589-609 (1992).
26. Y. Zhang, On horizontal fragmentation of distributed database design, Proceedings of Australian Database Conference, (1993).
27. M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman and Company, (1978).
28. D. Maier, Theory of Relational Databases, Computer Science Press, (1983).
29. C.J. Date, An Introduction to Database Systems, Addison-Wesley, (1982).
30. C.P. Wang, The complexity of processing tree queries in distributed databases, 2nd IEEE Symposium on Parallel and Distributed Processing, pp. 604-611, (1990).
31. D. Chiu, P.A. Bernstein and Y. Ho, Optimizing chain queries in a distributed database system, SIAM Journal on Computing 13 (1), 116-134 (1984).
32. M.-S. Chen and P.S. Yu, Using join operations as reducers in distributed query processing, Proceedings of the 2nd Intern'l Symposium on Databases in Parallel and Distributed Systems, pp. 116-123, (1990).
33. M.-S. Chen and P.S. Yu, Using combination of join and semijoin operations for distributed query processing, Proceedings of the 2nd Intern'l Symposium on Databases in Parallel and Distributed Systems, pp. 328-335, (1990).
34. M.-S. Chen and P.S. Yu, Determining beneficial semijoins for a join sequence in distributed query processing, IEEE International Conference on Data Engineering, pp. 50-58, (1991).
35. S. Ganguly, W. Hasan and R. Krishnamurthy, Query optimization for parallel execution, SIGMOD Record 21 (2), 9-18 (1992).
36. R. Sedgewick, Algorithms, Addison-Wesley, (1988).