A Service-Oriented Componentization Framework for Java Software Systems

by Shimin Li

A thesis presented to the University of Waterloo in fulfilment of the thesis requirement for the degree of Master of Applied Science in Electrical and Computer Engineering

Waterloo, Ontario, Canada, 2006

© Shimin Li 2006
Usage Relationship. Given two classes, A and B, let A −US→ B represent that there is a usage relationship between A and B, where A is the source class and B is the target class. The UML specifies that a usage relationship is one in which the client (the source) requires the presence of the supplier (the target) for its correct functioning or implementation [65]. Furthermore, the UML defines five types of usage relationships: i) the call relationship signifies that the source operation invokes the target operation, ii) the create relationship signifies that the source class creates one or more instances of the target class, iii) the instantiation relationship signifies that one or more methods belonging to instances of the source class create instances of the target class, iv) the responsibility relationship signifies that the client has some kind of obligation to the supplier, and v) the send relationship signifies that instances of the source class send signals to instances of the target class. We
CHAPTER 4. ARCHITECTURE RECOVERY 45
formalize the usage relationship as follows:

A −US→ B = (GE(A, B) = null) ∧ (GE(B, A) = null) ∧
           (EX(A, B) ∈ B) ∧ (EX(B, A) ∈ B) ∧
           (IS(A, B) ⊆ all − {field, arrayfield, collectionfield}) ∧
           (IS(B, A) = φ) ∧
           (LT(A, B) ∈ ‖) ∧ (LT(B, A) ∈ ‖) ∧
           (MU(A, B) = [1, +∞]) ∧ (MU(B, A) = [0, +∞])          (4.12)
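For illustration, a usage relationship in Java typically shows up as a local instantiation and call without any field of the target type. The sketch below (class names and the tax rate are invented for this example, not taken from the thesis) combines the call and create relationship types in a single A −US→ B pair:

```java
// Hypothetical example: Invoice --US--> TaxCalculator.
// Invoice requires TaxCalculator for its correct functioning, but holds no
// field of that type, so the relationship is usage rather than association.
class TaxCalculator {
    double taxFor(double amount) {
        return amount * 0.13; // assumed flat rate, for the example only
    }
}

class Invoice {
    double totalWithTax(double subtotal) {
        TaxCalculator calc = new TaxCalculator(); // 'create': source creates a target instance
        return subtotal + calc.taxFor(subtotal);  // 'call': source operation invokes target operation
    }
}
```

Because the TaxCalculator instance never outlives the method call, a static extractor would classify this edge as usage rather than as an association or aggregation.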
4.3.2 Approach
The architecture modeling process identifies all relationships between the classes/interfaces and
represents the identified relationships in directed graphs. The process also computes the basic
reusability attributes for each class in the system. Figure 4.6 illustrates the architecture modeling
process.
[Figure: data flow from the source code models (XML documents) through the XML Parser, Relationship Extractor, Metric Generator, Graph Generator, and Graph Transformer, producing the CIRG and CIDG.]

Figure 4.6: The Approach for Architecture Modeling.
As we described before, the source code models built by the source code modeling process are exported as XML documents. First, these source code models are parsed by the XML Parser in Figure 4.6. Then, the Relationship Extractor identifies all relationships described in Section 4.3.1 and the Metric Generator computes a set of metrics for each class/interface. We define a metric suite at the class level to represent the basic reusability attributes for each class in the system. The metric suite is presented in Table 4.1. The definition of each metric is adapted from SDMetrics [72]. Finally, the Graph Generator and Graph Transformer generate the CIRG and CIDG, respectively. We will give formal definitions of the CIRG and CIDG in the following sections.
Metric: Definition

lines code: The number of lines of non-comment code in a class.

num attr: The number of attributes in a class. The metric counts all properties regardless of their type (data type, class, or interface), visibility, changeability (read-only or not), and owner scope (class-scope, i.e., static, or instance attribute). Not counted are inherited properties, and properties that are members of an association, i.e., that represent navigable association ends.

num ops: The number of methods in a class. Includes all methods in the class that are explicitly modeled (overriding methods, constructors), regardless of their visibility, owner scope (class-scope, i.e., static), or whether they are abstract or not. Inherited operations are not counted.

num pub ops: The number of public methods in a class. Same as metric num ops, but only counts operations with public visibility. Measures the size of the class in terms of its public interface.

num nested classes: The number of inner classes in a class.

setters: The number of operations with a name starting with 'set'. Note that this metric does not always yield accurate results. For example, an operation settleAccount will be counted as a setter method.

getters: The number of operations with a name starting with 'get', 'is', or 'has'. Again, note that this metric does not always yield accurate results. For example, an operation isolateNode will be counted as a getter method.

fan in: The number of classes/interfaces that depend on this class. This metric counts incoming plain UML dependencies and usage dependencies.

fan out: The number of classes/interfaces on which this class depends. This metric counts outgoing plain UML dependencies and usage dependencies.

Table 4.1: The Metric Suite at Class Level
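The setters and getters rows rely on a purely lexical prefix test, which is easy to state as code. This small sketch (method-name lists are invented; this is not the framework's implementation) reproduces the heuristic, including the false positives the table warns about:

```java
import java.util.List;

// Prefix-based counting, as described for the 'setters' and 'getters' metrics.
class NamingMetrics {
    static int setters(List<String> ops) {
        return (int) ops.stream().filter(op -> op.startsWith("set")).count();
    }

    static int getters(List<String> ops) {
        return (int) ops.stream()
                .filter(op -> op.startsWith("get") || op.startsWith("is") || op.startsWith("has"))
                .count();
    }
}
```

As the table notes, settleAccount is counted as a setter and isolateNode as a getter, so the two metrics are approximations rather than exact counts of accessor methods.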
4.3.3 Class/Interface Relationship Graph
The CIRG captures the UML-compliant relationships as explained in Section 4.3.1. The formal definition of the CIRG is given as follows:
Definition 4.1. A Labeled Directed Graph (LDG) is a tuple Γ(V, E, LV, LE, lV, lE), where V is a set of nodes (or vertices), E is a set of edges (or arcs), LV is a set of node labels, LE is a set of edge labels, lV : V → LV is a label function that maps nodes to node labels, and lE : E → LE is a label function that maps edges to edge labels.
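Definition 4.1 maps directly onto a small data structure. The following is a minimal sketch (generic label types, string node identifiers; not the framework's actual implementation):

```java
import java.util.*;

// A Labeled Directed Graph per Definition 4.1: nodes V, edges E, and the
// label functions lV and lE realized as maps.
class LDG<NL, EL> {
    final Set<String> nodes = new HashSet<>();
    final Map<String, NL> nodeLabels = new HashMap<>();       // lV : V -> LV
    final Map<List<String>, EL> edgeLabels = new HashMap<>(); // lE : E -> LE

    void addNode(String v, NL label) {
        nodes.add(v);
        nodeLabels.put(v, label);
    }

    void addEdge(String v, String w, EL label) {
        edgeLabels.put(List.of(v, w), label); // edge (v, w) in E
    }

    NL lV(String v) { return nodeLabels.get(v); }
    EL lE(String v, String w) { return edgeLabels.get(List.of(v, w)); }
}
```

For a CIRG (Definition 4.2 below), the node label type would be the fully qualified class name and the edge label type a set of relationship types such as IN or US.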
Definition 4.2. The Class/Interface Relationship Graph (CIRG) of an object-oriented system is an LDG as defined in Definition 4.1, where V is the set of all classes/interfaces of the system, lV(v) returns the full name (i.e., the package name concatenated with the class or interface name) of v for any v ∈ V, E = {(v, w) ∈ V × V | v references w}, and lE(e) returns the types of relationships between the source node and target node of e for any e ∈ E. The type of a relationship is one of IN, RE, AS, AG, CO, and US, which represent inheritance, realization, association, aggregation, composition, and usage, respectively.
Each class or interface of a Java system represents a node of the CIRG of the system. We name the node in the CIRG RClass, and each node is presented and exported as an XML document. The XML schema for each node is depicted in Figure 4.7. The XML schema shows that four types of information about the CIRG node are captured:

• Property: The property field records the name, the type (i.e., class or interface), the package name, and the Java source file name of the corresponding class or interface.

• Characteristics: The characteristics field records the accessibility (i.e., public, protected, or private) and the implementation status (i.e., concrete class or abstract class) of the corresponding class or interface.

• Metrics: The metrics field records values of the metrics in Table 4.1 for the corresponding class or interface.

• Relationships: The relationships field records all classes or interfaces which have one of the defined relationships with the corresponding class or interface. The type and the direction
Figure 5.1: The UML Representation of XML Schema for a Service.
5.2 Supporting Concepts
The proposed service identification approach involves a set of techniques such as graph transformations, dominance analysis on directed graphs, and evaluation of the modularization of a
system that is represented by directed graphs. It is helpful to introduce these techniques prior to
explaining the service identification process.
CHAPTER 5. SERVICE IDENTIFICATION 57
5.2.1 Graph Techniques
Graphs can be used to describe complex object structures in a mathematical way. In the context
of software engineering, we can use graphs to formalize object-oriented languages and concepts,
especially, the UML. In this thesis, we apply graph techniques to assist in service identification.
The important graph concepts and techniques involved in this thesis are reviewed as follows:
Definition 5.2. Let G = (V, E) be a directed graph (DG), where V represents all nodes (or vertices) in G and E represents all edges (or arcs) in G. Given a node v ∈ V, the in-degree of v is the number of edges directed into v and the out-degree of v is the number of edges directed out of v. A root of G is a node whose in-degree is zero. G is said to be a rooted directed graph iff there is only one root in V.
Definition 5.3. Let G = (V, E) be a DG, where V represents all nodes (or vertices) in G and E represents all edges (or arcs) in G. Given two nodes v ∈ V and w ∈ V, a path from vertex v to vertex w is a sequence of consecutive edges between v and w. A cycle is a path from a node to the same node. Node w is said to be reachable from node v if there is a path from v to w. G is a directed acyclic graph (DAG) iff there is no cycle in G.
Definition 5.4. A rooted tree is a DG G = (V, E), where V represents all nodes (or vertices) in G and E represents all edges (or arcs) in G, such that

1. there is a unique node in V (called the root) which has in-degree 0;

2. every node in V except the root has in-degree 1; and

3. there is a path from the root to every other node in G.
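The three conditions of Definition 5.4 can be checked mechanically. A sketch follows (adjacency lists over string node identifiers are an assumed representation, not the thesis's):

```java
import java.util.*;

class RootedTreeCheck {
    // Returns true iff the DG given as adjacency lists satisfies Definition 5.4.
    static boolean isRootedTree(Map<String, List<String>> adj) {
        Map<String, Integer> inDeg = new HashMap<>();
        adj.keySet().forEach(v -> inDeg.putIfAbsent(v, 0));
        adj.values().forEach(ws -> ws.forEach(w -> inDeg.merge(w, 1, Integer::sum)));

        // condition 1: a unique node with in-degree 0
        List<String> roots = inDeg.entrySet().stream()
                .filter(e -> e.getValue() == 0).map(Map.Entry::getKey).toList();
        if (roots.size() != 1) return false;

        // condition 2: every node except the root has in-degree 1
        for (var e : inDeg.entrySet())
            if (!e.getKey().equals(roots.get(0)) && e.getValue() != 1) return false;

        // condition 3: every node is reachable from the root
        Deque<String> stack = new ArrayDeque<>(roots);
        Set<String> seen = new HashSet<>(roots);
        while (!stack.isEmpty())
            for (String w : adj.getOrDefault(stack.pop(), List.of()))
                if (seen.add(w)) stack.push(w);
        return seen.containsAll(inDeg.keySet());
    }
}
```

Note that condition 3 is not implied by the first two: a graph with one isolated root plus a disjoint cycle satisfies the degree conditions but fails reachability.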
Definition 5.5. Let G = (V, E) be a DG, where V represents all nodes (or vertices) in G and E represents all edges (or arcs) in G. G is connected if the underlying undirected graph of G is connected. G is strongly connected if there is a path in G between every pair of nodes in V.
Definition 5.6. Let G = (V, E) be a DG, where V represents all nodes (or vertices) in G and E represents all edges (or arcs) in G. A connected component of G is a maximal (though not necessarily maximum) connected subgraph of G. A strongly connected component of G is a maximal (though not necessarily maximum) strongly connected subgraph of G. A rooted component is a subgraph of G that consists of a unique root and the collection of all nodes w such that there is a path from the root to w.
Definition 5.7. Let G = (V, E) be a DG, where V represents all nodes (or vertices) in G and E represents all edges (or arcs) in G. A clique in G is a collection of nodes in V such that each pair of nodes in the collection is joined by an edge. A k-clique is a clique that contains exactly k nodes.
Figure 5.2: An Example of a Directed Graph.
For example, given the directed graph G in Figure 5.2, there are two connected components: graphs (a) and (b) in Figure 5.3. The only strongly connected component of G is graph (c) in Figure 5.3. Note that the subgraphs {2, 5, 7} and {5, 6, 7} are not strongly connected components of G, because they are not maximal. Graph (d) and graph (e) in Figure 5.3 are two rooted components of graph (a) in Figure 5.3. The set {2, 3, 7} is a 3-clique in graph G.
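The maximality requirement in Definition 5.6 has a constructive reading: the strongly connected component containing a node v is exactly the set of nodes that both reach v and are reachable from v. The sketch below demonstrates this on a small made-up graph (not the graph of Figure 5.2, whose edges are not reproduced in the text):

```java
import java.util.*;

class SccByReachability {
    // Nodes reachable from start by following edges in adj.
    static Set<String> reach(Map<String, List<String>> adj, String start) {
        Deque<String> stack = new ArrayDeque<>(List.of(start));
        Set<String> seen = new HashSet<>(List.of(start));
        while (!stack.isEmpty())
            for (String w : adj.getOrDefault(stack.pop(), List.of()))
                if (seen.add(w)) stack.push(w);
        return seen;
    }

    // SCC of v = {w | v reaches w and w reaches v}; maximal by construction,
    // since any node satisfying both conditions is included.
    static Set<String> sccOf(Map<String, List<String>> adj, String v) {
        Map<String, List<String>> rev = new HashMap<>();
        adj.forEach((u, ws) -> ws.forEach(w ->
                rev.computeIfAbsent(w, k -> new ArrayList<>()).add(u)));
        Set<String> result = new HashSet<>(reach(adj, v));
        result.retainAll(reach(rev, v)); // reachability in the reversed graph
        return result;
    }
}
```

In the test graph 1 → 2 → 3 → 1 with an extra edge 3 → 4, node 4 is reachable from 1 but does not reach it back, so it is excluded, mirroring why a non-maximal subset such as {2, 5, 7} above is not a strongly connected component.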
Figure 5.3: (a) A connected component of the directed graph G in Figure 5.2. (b) The other connected component of G. (c) The only strongly connected component of G. (d) A rooted component of graph (a). (e) The other rooted component of graph (a).
5.2.2 Dominance Analysis
Dominance analysis is a fundamental concept in compiler optimizations and has been used exten-
sively to identify loops in basic block graphs [61]. It allows one to locate subordinated software
elements in a rooted dependency graph. Dominance analysis on call graphs of procedural lan-
guage applications has been used in reverse engineering to identify modules and subsystems and
recover system architectures [17, 26, 36]. In this thesis, we explore the use of dominance analysis
on SHGs. This assists us in identifying low-level services underneath a top-level service.
Dominance is a relation between nodes in a rooted directed graph. This relation can be formally defined as follows:
Definition 5.8. Let G = (V, E, r) be a rooted directed graph, where V represents all nodes in G, E represents all edges in G, and r ∈ V is the unique root node of G. Given any two different nodes v ∈ V and w ∈ V, node v dominates node w, written v dom w, iff every path from root r to w contains v. Node v directly dominates node w, written v ddom w, iff all the nodes that dominate w dominate v. Node v strongly directly dominates node w, written v sddom w, iff v ddom w and v is a predecessor of w.
Definition 5.9. Let G = (V, E, r) be a rooted directed graph, where V represents all nodes in G, E represents all edges in G, and r ∈ V is the unique root node of G. The dominance tree corresponding to G is a tree T = (V, Ed, r) where Ed = {(v, w) ∈ V × V | v ddom w ∨ v sddom w}. A ddom subtree of T is a subtree whose root has an incoming ddom edge. A sddom subtree of T is a subtree whose root has an incoming sddom edge. A consolidation subtree of the dominance tree is a subtree that contains only sddom edges. A maximal consolidation subtree is a maximal subtree that contains only sddom edges.
Figure 5.4: (a) A Simple Directed Graph. (b) The Dominance Tree Corresponding to the Graph in (a). (c) The Two Maximal Consolidation Subtrees of the Dominance Tree in (b).
Figure 5.4 shows a simple rooted directed graph, the corresponding dominance tree, and the maximal consolidation subtrees in the dominance tree. Note that the subtree {6, 9} is a ddom subtree and {2, 4, 5, 8} is a sddom subtree. The subtree {7, 10} is a consolidation subtree but not a maximal consolidation subtree, because it is not a maximal subtree that contains only sddom edges. In Figure 5.4, the dominance tree is constructed from an acyclic graph. However, this is not a necessary condition. We can construct a dominance tree from every directed graph as long as it is rooted.
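Definition 5.8 suggests a direct fixpoint computation of dominator sets: dom(r) = {r}, and for every other node, dom(v) = {v} ∪ the intersection of dom(p) over all predecessors p of v, iterated until nothing changes. This is the classic iterative data-flow formulation, sketched here on an assumed adjacency-list representation (the thesis does not specify which dominator algorithm it uses):

```java
import java.util.*;

class Dominators {
    // Iterative dominator sets for a rooted directed graph (Definition 5.8).
    static Map<String, Set<String>> dominators(Map<String, List<String>> adj, String root) {
        Set<String> all = new HashSet<>(adj.keySet());
        adj.values().forEach(all::addAll);

        Map<String, List<String>> preds = new HashMap<>();
        adj.forEach((u, ws) -> ws.forEach(w ->
                preds.computeIfAbsent(w, k -> new ArrayList<>()).add(u)));

        // Initialize: dom(root) = {root}; everything else starts at V.
        Map<String, Set<String>> dom = new HashMap<>();
        for (String v : all)
            dom.put(v, v.equals(root) ? new HashSet<>(Set.of(root)) : new HashSet<>(all));

        boolean changed = true;
        while (changed) {
            changed = false;
            for (String v : all) {
                if (v.equals(root)) continue;
                Set<String> next = new HashSet<>(all);
                for (String p : preds.getOrDefault(v, List.of())) next.retainAll(dom.get(p));
                next.add(v);
                if (!next.equals(dom.get(v))) { dom.put(v, next); changed = true; }
            }
        }
        return dom;
    }
}
```

The direct dominator (ddom) of a node is then the dominator, other than the node itself, dominated by all its other dominators; the dominance tree is formed by linking each node to that direct dominator.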
By Definitions 5.8 and 5.9, we can observe the following properties of dominance trees:
Property 5.1. Given a rooted directed graph G = (V, E, r), where V represents all nodes in G, E represents all edges in G, and r ∈ V is the unique root node of G, let T be the dominance tree corresponding to G. For each node (except the root) in a subtree (either a ddom subtree or an sddom subtree) of T, there is no incoming edge in E from any node outside the subtree.
Property 5.2. Given a rooted directed graph G = (V, E, r), where V represents all nodes in G, E represents all edges in G, and r ∈ V is the unique root node of G, let T be the dominance tree corresponding to G. For each node (except the root) in a consolidation subtree of T, there is no incoming edge in E from any other node (either inside the subtree or outside the subtree) except its parent in T.
In the analysis process of reverse engineering, it is essential to have an effective way of
abstracting information. The dominance tree provides such an abstraction. More importantly, it
represents high-level modularization of the software system through its branches. Each branch of
the dominance tree represents a concept or high level functionality of the system. In the context
of object-oriented design, one benefit of using dominance trees in program comprehension is the
reduction of the visualization complexity of the class dependency graph by decreasing a large
number of edges. In the class dependency graph of a real world software system, a class may
have been referenced by hundreds of classes and a reduction to a single edge on the dominance
tree greatly clarifies the graphic.
5.2.3 Modularization Quality Metric
The modularization quality (MQ) metric was first introduced in [54]. It has been used in a number
of software engineering projects to evaluate the quality of software modularization achieved by
graph partitioning [24, 76]. Basically, the MQ metric measures the difference between the average
inter-connectivity and intra-connectivity of a system and shows how well the system is structured.
In this thesis, we use the MQ metric to evaluate how well a top-level service is modularized by
its low-level services.
Let C(G1, G2, ..., Gk) be a partition of a given graph G(V, E), where V represents all nodes in G and E represents all edges in G. The MQ metric of the system, which is represented by the graph G, is defined as follows:

MQ(C, G) = [Σ_{i=1}^{k} s(Gi, Gi)] / n − [Σ_{i=1}^{k−1} Σ_{j=i+1}^{k} s(Gi, Gj)] / [n(n − 1)/2]    (5.1)
The function s() used in Formula (5.1) is defined as the ratio of the actual number of edges between two subsets of V of graph G with respect to the maximum number of possible edges between those two sets. Let U and W be two subsets of V (i.e., U ⊆ V and W ⊆ V); then we have

s(U, W) = e(U, W) / (|U| · |W|)    (5.2)

where e(U, W) denotes the number of edges connecting a vertex in U to a vertex in W.
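Formulas (5.1) and (5.2) translate directly into code. In the sketch below, n is taken to be k, the number of blocks in the partition; that identification is an assumption, since n is not defined in this excerpt:

```java
import java.util.*;

class ModularizationQuality {
    // e(U, W): number of edges from a vertex in U to a vertex in W.
    static long e(List<String[]> edges, Set<String> u, Set<String> w) {
        return edges.stream().filter(ed -> u.contains(ed[0]) && w.contains(ed[1])).count();
    }

    // s(U, W) = e(U, W) / (|U| * |W|), Formula (5.2).
    static double s(List<String[]> edges, Set<String> u, Set<String> w) {
        return (double) e(edges, u, w) / (u.size() * w.size());
    }

    // MQ(C, G), Formula (5.1), with n assumed to equal the number of blocks k.
    static double mq(List<String[]> edges, List<Set<String>> blocks) {
        int n = blocks.size();
        double intra = 0, inter = 0;
        for (int i = 0; i < n; i++) {
            intra += s(edges, blocks.get(i), blocks.get(i)); // cohesion term
            for (int j = i + 1; j < n; j++)
                inter += s(edges, blocks.get(i), blocks.get(j)); // coupling term
        }
        return intra / n - (n > 1 ? inter / (n * (n - 1) / 2.0) : 0);
    }
}
```

For a partition with two internally connected blocks and no edges between them, the coupling term vanishes and MQ reduces to the average intra-connectivity, illustrating the reward/penalty trade-off described below.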
The MQ metric determines the quality of the modularization quantitatively as the trade-off
between inter-connectivity and intra-connectivity of subsystems. This trade-off is based on the
assumption that well-designed software systems are organized into cohesive subsystems that are
loosely interconnected. Hence, the MQ metric is designed to reward the creation of highly cohe-
sive clusters, and to penalize excessive coupling between clusters. The value of the MQ metric is
between −1 (no internal cohesion) and 1 (no external coupling). A straightforward consequence is that a higher MQ value can be interpreted as better modularization, since it corresponds to a partition with either fewer edges connecting vertices from distinct blocks, or with more edges lying within the identical blocks of the partition, which is what most clustering or modularization algorithms aim to achieve [24].
5.3 The Proposed Processes
In the SOC4J framework, we aim to identify critical business services embedded in an existing Java system. Our service identification process, as shown in Figure 5.5, is supported by a combination of top-down and bottom-up techniques.
[Figure: in the top-level service identification portion, the MCIDGs from Stage 1 flow through CIDG Transformation, Top-Level Service Candidate Generation, and Service Validation to produce top-level service candidates. In the low-level service identification (service aggregation) portion, each SHG flows through SHG Transformation, Dominance Tree Generation (yielding the DTree of the SHG), Dominance Tree Reduction (yielding the reduced DTree), and SHG Reconstruction, repeating until the termination criteria are satisfied. The validated top-level services and their atomic services (described in SHGs) pass to Stage 3.]

Figure 5.5: Processes in Service Identification Stage.
In the top-down portion of the process, we identify the top-level services and the atomic
services (to be discussed later) underneath each top-level service. In the bottom-up portion,
we aggregate the atomic services to identify services with higher level of granularity (reusable
services). We will delve into these two portions in the subsequent two sections.
5.3.1 Top-Level Service Identification
The top-level service identification process is the top-down portion of the proposed service identification process. According to the definition of a top-level service (introduced in Section 5.1), top-level services of a software system partition the system into independent parts. Each of these independent parts represents a service to the outside world, from the user's point of view. We identify services of a system by starting with its top-level services, and then extracting a service hierarchy for each top-level service to identify low-level services underneath each top-level service.
Algorithm 5.1: CIDG-Transformation

Input: CIDG: The CIDG of the system.
Output: MCIDGs: A set of MCIDGs.

1   // decompose the CIDG into connected components
    MCIDGs ← φ;
2   CGraphs ← ConnectedComponents(CIDG);
    // decompose each connected component into a set of rooted components
3   foreach graph g ∈ CGraphs do
4       RGraphs ← RootedComponents(g);
5       MCIDGs ← MCIDGs ∪ RGraphs;
6   end
To identify the top-level services of an existing object-oriented system, the first step is to identify the entry points of the system. In Chapter 4, we have modeled the existing system as directed graphs: the class/interface relationship graph (CIRG) and the class/interface dependency graph (CIDG). At this stage, we decompose the CIDG into a set of connected components, each with a unique root, such that each component is an independent subgraph of the CIDG. Algorithm 5.1 describes the decomposition process.
In Algorithm 5.1, function ConnectedComponents() computes and returns all connected components of the given directed graph, while function RootedComponents() decomposes a connected directed graph into a set of rooted components. We name each of the rooted components a modularized CIDG (MCIDG). Essentially, Algorithm 5.1 applies a set of graph transformation rules to transform the CIDG into a set of rooted components (i.e., MCIDGs). Note that the output MCIDGs are subgraphs of the CIDG, and each node in an MCIDG represents a single class or interface of the system. No other class or interface in the system depends upon the unique root of each MCIDG. Consequently, the unique root of each MCIDG might represent an entry point of the system, and each MCIDG might therefore embed a top-level service represented by its root.
As we have mentioned, each node of an MCIDG contains only one class or interface. At this
stage, we consider the root of each MCIDG as a top-level service candidate and the other nodes
as the low-level service candidates underneath the top-level service candidate. The second step of
the top-level service identification is to generate the top-level service candidates from MCIDGs.
This is achieved by performing three tasks for each top-level service candidate represented by an
MCIDG: i) to compute the facade class set, ii) to build the SHG of the top-level service candidate,
and iii) to describe the candidate as the tuple that we have defined in Section 5.1.
The final step of the top-level service identification is to validate the top-level service candidates and assign a meaningful name to each accepted top-level service. This is a user-involved procedure. The user retrieves the functionality provided by the candidate through examining the classes/interfaces in its facade class set. Based on the functionality, the user can make a decision on the candidate.
Algorithm 5.2: Top-Level Service Identification

Input: CIDG: The CIDG of the system.
Output: TLSs: A set of identified top-level services that are represented by (name, CF, SHG) tuples.

1   // decompose the CIDG into a set of rooted components
    // each rooted component is an MCIDG
    MCIDGs ← Run CIDG-Transformation Alg. on CIDG;
2   // generate top-level service candidates
    // represent candidates as (name, CF, SHG) tuples
    Candidates ← φ;
3   foreach MCIDG(Vm, Em) ∈ MCIDGs do
4       Create a new graph G(V, E);
5       V ← φ;
6       E ← Em;
7       for i ← 1 to |Vm| do
            // Vm(i) means the ith node in Vm
8           V(i) ← Facade(Vm(i), MCIDG, CIDG);
9       end
10      Create a new tuple T(name, CF, SHG);
11      T.name ← null;
12      T.CF ← Root(G);
13      T.SHG ← G;
14      Add tuple T(name, CF, SHG) to Candidates;
15  end
16  // validate the top-level service candidates
    // assign a meaningful name for each accepted service
    TLSs ← φ;
17  foreach tuple T ∈ Candidates do
18      The user validates the candidate by examining T.CF;
19      if T is acceptable then
20          T.name ← A meaningful name for the service;
21          Add T(name, CF, SHG) to TLSs;
22      end
23  end
Algorithm 5.2 describes the details of these three steps in the top-level service identification process. In Algorithm 5.2, each iteration of the foreach loop on line 3 transforms an MCIDG into a top-level service candidate. Function Facade() computes and returns facade class sets for a given top-level service candidate and its low-level service candidates. As we have described, the facade class set contains classes/interfaces that describe the functionality of the service to the outside world. Therefore, function Facade() returns a set of classes/interfaces that have incoming edges from classes/interfaces in the CIDG but not in the MCIDG. Function Root() returns the root of a given directed graph.
The user validates a candidate by examining its facade class set, since classes in the set represent the functionality of the service. At this stage, the SHG corresponding to each top-level service is built from the MCIDG and therefore can be viewed as a subgraph of the CIDG. In other words, the SHG is an abstraction of an MCIDG that hides the information unnecessary for understanding the service hierarchy. The functionality of each low-level service in the hierarchy is provided by a single class. Hence these services are called atomic services. In most cases, these atomic services are too fine-grained and have little reusability. However, the SHG at this stage provides us a good starting point to identify services with a higher level of granularity by using the service aggregation techniques that are presented in the subsequent section.
After performing the top-level service identification, the critical top-level services of an existing system have been identified. Moreover, for each top-level service, we have extracted a service hierarchy graph (SHG) to model its low-level services. However, at this time, the low-level services in the SHG are atomic services with little or no reusability. We need to build a new SHG for each top-level service that contains low-level services with a higher level of granularity. Consequently, these low-level services in the new SHG are critical business services and have better reusability. This is achieved in the low-level service identification process.
5.3.2 Low-Level Service Identification
The low-level service identification process is the bottom-up portion of the entire service identification process. SHGs built in the top-level service identification process are rooted directed graphs that represent the structural dependency between a top-level service and its low-level services (atomic services). As we have mentioned, these atomic services are too fine-grained and therefore have limited reusability. At this stage, we aim to aggregate highly related atomic services to build a new SHG for each top-level service such that the services contained in the new SHG have a higher level of granularity and thus present a higher potential for reuse. The service aggregation is an iterative process, and the desired new SHG is achieved incrementally. The low-level services obtained from each iteration have a higher level of granularity than those of the previous iteration and hence modularize the top-level service in a different way. The resulting services of each iteration are presented as an intermediate SHG to users. An evaluation procedure can be performed at each iteration to determine whether specific goals have been reached. Then users can make a decision on repeating or terminating the process according to the predefined termination criteria.
Algorithm 5.3 describes the low-level service identification process for a given top-level service. Essentially, it repeatedly runs the service aggregation algorithm (i.e., Algorithm 5.4) on the low-level services underneath a top-level service until the Termination Criteria are satisfied. Once the iteration is terminated, the final SHG is built for the top-level service. Then, the algorithm represents the low-level services contained in the newly built SHG in the tuples defined in Section 5.1. Function MQ() computes the MQ metric of a given top-level service. The MQ metric quantitatively measures the quality of the modularization of a top-level service as the trade-off between inter-connectivity and intra-connectivity of its low-level services.
Based on the modularization of the top-level service and the level of granularity of the low-level services underneath the top-level service, we define two Termination Criteria to stop the
Algorithm 5.3: Low-Level Service Identification

Input: CIRG: The CIRG of the system,
       CIDG: The CIDG of the system,
       T(name, CF, SHG): The top-level service.
Output: LLSs: Identified low-level services represented in (name, CF, SHG) tuples.
        T(name, CF, SHG): The input top-level service with newly built SHG.

1   // compute the MQ metric of the input top-level service
    Compute MQ(T.SHG, CIDG);
    // represent identified low-level services in tuples
7   LLSs ← φ;
8   foreach non-root node v ∈ T.SHG do
9       Create a new tuple L(name, CF, SHG);
10      L.name ← Meaningful name for the service;
11      L.CF ← lV(v);
12      L.SHG ← φ;
13      Add L(name, CF, SHG) to LLSs;
14  end
service aggregation iteration in Algorithm 5.3:
Termination Criterion 5.1. The top-level service has been well modularized by its low-level services.
Termination Criterion 5.2. Low-level services present an appropriate level of granularity.
In terms of the structure of a top-level service, the low-level services underneath the top-level service modularize the top-level service. By the definition of the MQ metric, the higher the value of the MQ metric of a top-level service, the better structured the service is. This is based on the hypothesis that a well-modularized service becomes highly malleable; that is, the service can evolve in less time and at less cost. On the other hand, the level of granularity of services must be matched to the level of reusability and flexibility required for a given context. The basis of the second criterion is the hypothesis that the component that realizes a service with a higher level of granularity has better reusability.
Algorithm 5.4: Service-Aggregation

Input: CIRG: The CIRG of the system,
       CIDG: The CIDG of the system,
       SHG: The SHG that contains the low-level services to be aggregated,
       Heuristic1: Termination Criterion 1,
       Heuristic2: Termination Criterion 2.
Output: SHGnew: A new SHG that contains low-level services with a higher level of granularity.

⋃_{all c ∈ comp.CF} {v ∈ CIDG.V | c →* v} represents all classes or interfaces that are reachable from every class in comp.CF in the CIDG.
CHAPTER 6. COMPONENT GENERATION AND SYSTEM TRANSFORMATION 86
[Figure: class Customer (+id : String; −creditRecord : CreditRecord; −drivingRecords : DrivingRecord[]) with operations +Customer(), +Customer(String), +updateCreditRecord(int), +addDrivingRecord(String), +getCreditStatus() : int, +isSafeDriver() : boolean, and +evaluateVehicles() : String[]; and class Person (−name : String; −address : String; −phoneNumber : String) with operations +Person(), +setName(String), +getName() : String, +setAddress(String), +getAddress() : String, +setPhoneNumber(String), and +getPhoneNumber() : String.]

Figure 6.2: The UML Class Diagrams of Customer and Person in the CRS System.
In this example,

⋃_{all c ∈ comp.CF} {v ∈ CIDG.V | c →* v} =
    { com.uwstar.crs.person.Person,
      com.uwstar.crs.record.CreditRecord,
      com.uwstar.crs.record.DrivingRecord,
      com.uwstar.crs.record.Record }

Then, we have

comp.CC = comp.CF ∪ ⋃_{all c ∈ comp.CF} {v ∈ CIRG.V | c →* v} =
    { com.uwstar.crs.person.Customer,
      com.uwstar.crs.person.Person,
      com.uwstar.crs.record.CreditRecord,
      com.uwstar.crs.record.DrivingRecord,
      com.uwstar.crs.record.Record }
4. Create a new interface named ICustomer. Since there is only one class in comp.CF (i.e., com.uwstar.crs.person.Customer), we modify this class to implement ICustomer as shown in Figure 6.3.
5. The inheritance reachable class set of class com.uwstar.crs.person.Customer is extracted as follows:

   VIN = {com.uwstar.crs.person.Person}

   Figure 6.2 depicts the UML class diagrams of class com.uwstar.crs.person.Customer and class com.uwstar.crs.person.Person. We add declarations of all public methods defined in class com.uwstar.crs.person.Person to ICustomer, and we modify class com.uwstar.crs.person.Person to implement the interface ICustomer. These modifications are reflected in Figure 6.3.
6. Since the realization reachable class set of class com.uwstar.crs.person.Customer is empty (i.e., VRE = ∅), there is no action needed in this step.
7. As Figure 6.2 shows, there is only one public class field declared in class com.uwstar.crs.person.Customer (i.e., id) and no public class field in class com.uwstar.crs.person.Person. We add the setter method declaration setID(String) and the getter method declaration getID() : String to interface ICustomer. We also need to implement these two methods in class com.uwstar.crs.person.Customer. Listing 6.1 lists the implementation of these two methods. These modifications are also reflected in Figure 6.3.
8. Again, since VRE = ∅, there is no action needed in this step.
9. comp.if = ICustomer.
10. comp.CHG = ∅, because the service Customer is a low-level service. Hence, the generated component is a low-level component. If the service is a top-level service, the CHG of the generated component is the SHG of the top-level service, except that node names in the SHG are changed to the corresponding service names.
[Figure 6.3 shows the newly created interface ICustomer for component Customer, declaring updateCreditRecord(int), addDrivingRecord(String), getCreditStatus() : int, isSafeDriver() : boolean, evaluateVehicles() : String[], setID(String), getID() : String, setName(String), getName() : String, setAddress(String), getAddress() : String, setPhoneNumber(String), and getPhoneNumber() : String. Class Customer gains the newly added methods setID(String) and getID() : String, which implement the methods declared in the ICustomer interface; class Person keeps the fields and methods shown in Figure 6.2.]

Figure 6.3: Part of UML Class Diagram of the Component Customer.
Now we are ready to package the following classes (i.e., the constituent class set):
com.uwstar.crs.person.Customer,
com.uwstar.crs.person.Person,
com.uwstar.crs.record.CreditRecord,
com.uwstar.crs.record.DrivingRecord, and
com.uwstar.crs.record.Record
together with the newly created interface ICustomer as a JAR file named Customer.jar.
public class Customer extends Person implements ICustomer {

    public String id; // customer ID
    ...
    public void setID(String id) {
        this.id = id;
    }
    public String getID() {
        return id;
    }
    ...
}

Listing 6.1: The Implementation of methods setID and getID in class Customer.
6.2 System Transformation
One of the primary goals of the proposed SOC4J framework is to transform the monolithic architecture of an existing object-oriented system into a more flexible service-oriented architecture. In the previous stages of the framework, we have identified services and packaged the identified services into self-contained components. Now, we introduce a reconstruction technique that automatically reconstructs the existing source system into a component-based target system.
6.2.1 Approach
The reconstruction process is based on the extracted components. In this thesis, extracted components are categorized into two classes: top-level components and low-level components. A top-level component has an associated component hierarchy graph (CHG) to describe the low-level components contained in the top-level component. Each component is self-contained and has been packaged into a JAR file. Based on the extracted components, we design a meta-model, depicted in Figure 6.4, for the component-based target system. The target system is composed
[Figure 6.4 shows the meta-model: the target system (a component-based system) contains one or more top-level components (JAR files) and any number of classes/interfaces (Java files); each top-level component contains any number of low-level components (JAR files) together with classes/interfaces; a low-level component may in turn contain further low-level components and classes/interfaces.]

Figure 6.4: The Meta-Model for the Component-Based Target System.
of one or more top-level components, as well as a set of classes/interfaces, while each top-level component might consist of some low-level components together with a set of classes and interfaces. Like a top-level component, a low-level component might contain other low-level sub-components, classes, and interfaces. In the source system, some classes or interfaces may not be identified as business services or may not be contained in identified business services. Therefore, these classes or interfaces are not packaged into components. In order to preserve the behavior of the system, we have to include these classes or interfaces in the component-based target system.
We reconstruct the target system by adopting a bottom-up integration technique that operates on the extracted components, starting with the components in the lowest position in the component hierarchy. The reconstruction process should not change the observable behavior of the existing system. The surrounding parts of each component should use the newly extracted components in order to avoid the situation where two sets of classes that provide the same functionality exist in the same system. Algorithm 6.1 describes the transformation process, taking the source system and the extracted components as input. Extracted components are represented as tuples in the form of (name, if, CF, CC). The output of the algorithm will be an
Algorithm 6.1: System-Transformation
Input: An existing object-oriented system and extracted components from the system
Output: A component-based target system

 1  foreach top-level component t do
 2      while there exists a low-level component in t.CHG do
            // start with the component in the lowest position in the
            // component hierarchy
 3          c ← node without descendants in t.CHG;
            // retrieve components that contain component c
 4          P ← parents of c in t.CHG;
            // refactor the parents of component c to use c
 5          foreach p ∈ P do
 6              Change the code of classes in p.CC that reads (or writes) the
                public fields of classes in c.CF to the code that invokes the
                corresponding getter (or setter) methods in interface c.if;
 7              Replace the reference types in classes in p.CC, which refer to
                any classes in c.CF, with interface c.if;
 8          end
            // update t.CHG to remove component c
 9          Remove node c from t.CHG;
10      end
11  end
instance of the meta-model described in Figure 6.4.
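The control flow at the heart of Algorithm 6.1 (repeatedly pick a node without descendants, refactor its parents, then remove the node) can be sketched independently of the refactoring details. In the sketch below, the CHG is a hypothetical map from a component to its children, and "processing" a component is reduced to recording its position in the integration order:

```java
import java.util.*;

public class BottomUpIntegration {

    // Return the order in which components are integrated: always a node
    // without descendants first, as in lines 2-9 of Algorithm 6.1.
    static List<String> integrationOrder(Map<String, List<String>> children) {
        // mutable copy of the CHG so nodes can be removed as they are processed
        Map<String, List<String>> chg = new HashMap<>();
        children.forEach((k, v) -> chg.put(k, new ArrayList<>(v)));

        // collect every node mentioned in the graph (sorted for determinism)
        Set<String> nodes = new TreeSet<>(chg.keySet());
        chg.values().forEach(nodes::addAll);

        List<String> order = new ArrayList<>();
        while (!nodes.isEmpty()) {
            // a leaf is a node with no remaining children
            String leaf = nodes.stream()
                .filter(n -> chg.getOrDefault(n, List.of()).isEmpty())
                .findFirst().orElseThrow();
            order.add(leaf);                          // "process" the component
            nodes.remove(leaf);                       // remove it from the CHG
            chg.values().forEach(v -> v.remove(leaf));
        }
        return order;
    }

    public static void main(String[] args) {
        // hypothetical CHG: a top-level component containing two leaves,
        // loosely modelled on Figure 6.6 (a)
        Map<String, List<String>> chg = Map.of(
            "Vehicle Booking", List.of("Agent", "Customer"));
        System.out.println(integrationOrder(chg));
    }
}
```

Processing leaves before their parents guarantees that when a parent is refactored to use interface c.if, the component c it now depends on has already been finalized.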
6.2.2 An Example
To further describe the system transformation process, we give an example of reconstructing the
CRS system into a component-based target system.
Consider the following top-level services identified after the service identification stage:

• (Vehicle Booking, {com.uwstar.crs.Booking}, SHGVB). The service hierarchy graph
[Figure 6.5 (a) shows the service hierarchy graph of top-level service Vehicle Booking, whose nodes are com.uwstar.crs.Booking, com.uwstar.crs.person.Agent, com.uwstar.crs.person.Customer, com.uwstar.crs.VehicleRepository, com.uwstar.crs.vehicle.Car, com.uwstar.crs.vehicle.Truck, and com.uwstar.crs.vehicle.SUV; (b) shows the service hierarchy graph of top-level service Vehicle Evaluation, whose nodes are com.uwstar.crs.VehicleEvaluation and com.uwstar.crs.person.Customer.]

Figure 6.5: The Service Hierarchy Graphs of the CRS System.

[Figure 6.6 (a) shows the component hierarchy graph of top-level component Vehicle Booking, containing the low-level components Vehicle Repository, Agent, and Customer; (b) shows the component hierarchy graph of top-level component Vehicle Evaluation, containing the low-level component Customer.]

Figure 6.6: The Component Hierarchy Graphs of the CRS System.
SHGVB is shown in Figure 6.5 (a).
• (Vehicle Evaluation, {com.uwstar.crs.VehicleEvaluation}, SHGVE). The service hierarchy graph SHGVE is shown in Figure 6.5 (b).
We have two top-level components generated after the component generation stage, and the low-level components underneath each top-level component are described in the related component hierarchy graph. The two top-level components are described as follows:
• (Vehicle Booking, IBooking, CF1, CHGVB). The component hierarchy graph CHGVB is shown in Figure 6.6 (a).
• (Vehicle Evaluation, IEvaluation, CF2, CHGVE). The component hierarchy graph CHGVE is shown in Figure 6.6 (b).
After running Algorithm 6.1, we get the component-based version of the CRS system as shown in Figure 6.7. The component-based system has the same functionality as the original system.
[Figure 6.7 shows the component-based Car Rental System as a UML component diagram. The <<application>> Car Rental System contains the <<component>> instances Vehicle Repository, Vehicle Booking, Vehicle Evaluation, Agent, and Customer, which expose the interfaces IRepository, IBooking, IEvaluation, IAgent, and ICustomer respectively, together with the class Dealer.]

Figure 6.7: The Component-Based Car Rental System.
6.3 Summary
In this chapter, we explained the processes contained in the component generation stage and system transformation stage of the SOC4J framework. We have discussed how an identified service can be realized as a self-contained component and how the existing system can be reconstructed into a component-based system based on the components that realize the identified services.
Chapter 7
Empirical Studies
In this chapter, we perform a set of empirical studies on the proposed SOC4J framework to assess the service-oriented componentization techniques introduced in this thesis. The proposed technique has been implemented in a prototype that aims to i) identify critical business services embedded in an existing Java system, ii) realize identified services into self-contained reusable components, and iii) transform the existing system into a component-based system. Therefore, the purpose of the empirical study in this chapter is to test the effectiveness of the proposed SOC4J framework and assess i) the usefulness in terms of feasibility and effectiveness of the architecture recovery and representation approach, ii) the usefulness in terms of efficiency and effectiveness of the business service identification technique, iii) the usefulness in terms of effectiveness of the identified service modeling and packaging techniques, and iv) the time and space complexity of the service-oriented componentization technique as a function of source code size.
We outline the implementation of the prototype for the SOC4J framework in Section 7.1. In Section 7.2, we discuss two evaluation criteria for the proposed framework. We present empirical studies on two Java open source projects in Sections 7.3 and 7.4. Finally, we summarize this chapter in Section 7.5.
CHAPTER 7. EMPIRICAL STUDIES 95
7.1 A Prototype for the SOC4J Framework
As a part of this work, the proposed service-oriented componentization approach has been implemented in a prototype which offers an interactive and integrated environment for i) identifying critical business services embedded in an existing Java system, ii) realizing each identified service as a self-contained component, and iii) transforming the object-oriented design into a service-oriented architecture. We have named the prototype JComp, a Java Componentization Kit. The JComp is an integrated tool workbench targeted at rapidly integrating software tools for prototyping the SOC4J framework. Now, we examine the tool integration requirements for the SOC4J framework and discuss the implementation of the JComp.
7.1.1 Tool Integration Requirements
As we discussed in Chapter 3, several software tools are needed for the SOC4J framework to componentize an object-oriented system and re-modularize the existing assets for supporting service functionality. Figure 7.1 depicts the tool interconnection of the SOC4J framework. Five rounded rectangles on the right side of the figure represent the tools needed for the SOC4J framework, while the flow of data needed for integrating the tools within the framework is shown by the thick arrow on the right side of the diagram.
The functionality of each tool is outlined as follows:

Source Code Modeling This tool parses the Java source code and outputs a set of raw data of the facts. Based on the extracted facts, the tool further generates the source code models defined in Chapter 4, including JPackage, JFile, JClass, and JMethod. The raw data set and source code models are exported as XML documents.

Architecture Modeling Based on the source code models, this tool identifies all class relationships defined in Chapter 4. It exports identified relationships in graph representations, that
[Figure 7.1 shows the integrated tool workbench for the SOC4J framework together with the five tools: Source Code Modeling, Architecture Modeling, Service Identification, Component Generation, and System Transformation. The flow of integration data runs from Java source code through facts and source code models, to the CIRG and CIDG, to identified services, to self-contained components, and finally to the component-based system.]

Figure 7.1: The Tool Interconnection for the SOC4J Framework.
is, the CIRG and CIDG. Basic reusability attributes for each class in the system are also computed. The CIRG and CIDG are exported as XML documents.
Service Identification This tool assists users in identifying the business services embedded in
an existing Java system through analysis of the CIRG and CIDG. Firstly, it identifies the
top-level services of the system and builds a service hierarchy graph for each identified
top-level service. Then, it performs a graph transformation on the service hierarchy graph
to identify low-level services for each top-level service.
Component Generation This tool realizes identified services into self-contained components. For each identified service, it extracts all classes/interfaces that are necessary for implementing the service, generates an interface for the service, and packages these classes/interfaces together with the interface as a JAR file.
System Transformation This tool reconstructs an existing Java system into a component-based system by using the components generated from the source system. The system transformation process preserves the functionality of the source system.
7.1.2 JComp RCP Application
The JComp is built on top of the Eclipse Rich Client Platform (RCP) [68] and hence is called an Eclipse RCP application. An Eclipse RCP application is a collection of plug-ins and the runtime on which they run. The platform-independent Eclipse RCP architecture makes rich-client applications easy to write because business logic is organized into reusable components called plug-ins. Eclipse RCP provides a core set of services, representing a substantial percentage of the rich client platform development functionality, so that developers do not have to rewrite infrastructure code. These Eclipse RCP services are available to every application component plug-in. These services are the interface between a plug-in and the low-level platform-specific functionality that supports the plug-in, just like a J2EE container is the interface between an EJB and the application server. Moreover, because of the Eclipse open source license, we can use the technologies that went into Eclipse to create our own commercial-quality programs. The GUI toolkits used by Eclipse RCP are the same as those used by the Eclipse IDE and enable applications with optimal performance that have a native look and feel on any platform they run on.
The architecture of the JComp toolkit is depicted in Figure 7.2. The internals of the JComp are the same OSGi runtime and GUI toolkit provided by the Eclipse IDE. The OSGi runtime enables Java code from multiple sources to run together in a single Java Virtual Machine (JVM). The OSGi framework automatically loads and runs bundles, which are encapsulations of various files. This provides the mechanism by which plug-ins can be automatically detected and loaded into the JComp RCP application. The resource manager provides a GUI to show the current configuration, that is, a list of installed plug-ins. It assists the end user in finding and installing new plug-ins. It is also capable of scanning through the list of already-installed plug-ins to look for updates to
[Figure 7.2 shows the JComp RCP application built on the Eclipse RCP Platform: the Platform Runtime (OSGi), the Resource Manager, SWT, JFace, and the UI (generic workbench), with the Parser, Modeler, Extractor, Generator, and Transformer plug-ins on top.]

Figure 7.2: The Architecture of the JComp Java Componentization Kit.
these plug-ins. The Standard Widget Toolkit (SWT) provides a completely platform-independent API that is tightly integrated with the operating system's native windowing environment. Java widgets actually map to the platform's native widgets. This gives Java applications a look and feel that makes them virtually indistinguishable from native applications. The JFace toolkit is a platform-independent user interface API that extends and interoperates with the SWT. This library provides a set of components and helper utilities that simplify many of the common tasks in developing SWT user interfaces. The generic workbench provides extension points that the plug-ins extend. The plug-ins provide functionality that is integrated into the RCP platform just as if it were always part of the application.
As Figure 7.2 depicts, each tool described in Section 7.1.1 was implemented as a separate JComp plug-in. A snapshot of the JComp Java Componentization Kit is depicted in Figure 7.3.
Figure 7.3: A Snapshot of the JComp Java Componentization Kit.
7.2 Evaluation Criteria
Since the proposed framework aims to extract reusable components from an object-oriented system and migrate the object-oriented design to a service-oriented architecture, the evaluation criteria need to address component reusability and architectural improvement.
7.2.1 Component Reusability
The components acquired by applying the proposed framework are structurally reusable because
the internal structures are encapsulated and the components are self-contained and thus have no
dependency upon the entities outside of them. However, we still need to seek a way to assess the
reusability quantitatively.
Reusability Metric Suite
Components have two relatively static sources of information: the external documentation and the public interface. The external documentation is an important source of information that can greatly affect component reusability; such documentation is developed for a human audience, which makes it harder to measure. On the other hand, component interfaces are easily parsed by a computer, making them easier to measure. This is an important argument for developing reusability metrics based upon component interfaces. In this thesis, we aim to assess the reusability of the extracted components through the analysis of their interfaces as well as their internal methods. We define a reusability metric suite by selecting and adapting the metrics defined in [13, 25, 70, 91]:
Parameter Per Method (PPM) The PPM metric measures the mean size of method declarations in the interface of the component, and it is defined as follows:

    PPM = IPC / IMC   if IMC > 0;   0 otherwise.     (7.1)
where the metric IPC (Interface Parameter Count) is the count of parameters of all public methods in the interface of the component, and the metric IMC (Interface Method Count) is the count of public methods in the interface of the component.

It is believed that methods with fewer parameters are easier to understand, and so will be easier to reuse [58]. It follows that component interfaces with lower PPM will tend to have lower complexity and hence better understandability.
Reference Parameter Density (RPD) The RPD metric measures the occurrence of reference parameters in an interface, and it is defined as follows:

    RPD = IRPC / IPC   if IPC > 0;   0 otherwise.     (7.2)
where the metric IRPC (Interface Reference Parameter Count) is the count of reference type parameters of all public methods in the interface of the component.

It is believed that the use of references makes it more difficult to understand the program [87]. This is also applicable to interfaces, as arguments which are passed by reference tend to be more difficult to understand than arguments which are passed by value. A higher RPD indicates that an interface tends to be more difficult to understand. However, it is often necessary for reference arguments to be used so that useful functionality can be implemented. Therefore, a high value is not necessarily evidence of a poor interface, but it does suggest that good documentation is required [13].
Rate of Component Observability (RCO) The RCO metric measures the percentage of readable properties among all fields implemented within the interface of the component, and it is defined as follows:

    RCO = IRMC / IFRC   if IFRC > 0;   0 otherwise.     (7.3)
where the metric IRMC (Interface Reader Method Count) is the count of public methods in the interface of the component that read a field, and the metric IFRC (Interface Field and Reference Count) is the count of fields and references in the interface of the component.

RCO indicates the component's degree of observability for users of the component [91]. To understand the behavior of a component from outside the component, the observability of the component should be high. However, there is a possibility that it is difficult for users to find an important readable property among all of the readable properties when the observability is too high.
Rate of Component Customizability (RCC) The RCC metric measures the percentage of writable properties among all fields implemented within the interface of the component, and it is defined as follows:

    RCC = IWMC / IFRC   if IFRC > 0;   0 otherwise.     (7.4)
where the metric IWMC (Interface Writer Method Count) is the count of public methods in the interface of the component that write a field.

RCC indicates the component's degree of customizability for users of the component. To adapt the settings of a component from outside the component to the user's requirements, the customizability of the component should be high. However, too high a customizability violates the encapsulation of the component, and leads to greater opportunities for improper use [91].
Self-Completeness of Component's Return Values (SCCr) The SCCr metric measures the percentage of business methods without any return values among all business methods implemented in the component, and it is defined as follows:

    SCCr = VMC / MC   if MC > 0;   1 otherwise.     (7.5)
where the metric VMC (Void Method Count) is the count of public methods in the component that have void return type, and the metric MC (Method Count) is the count of public methods in the component.

SCCr indicates the component's degree of self-completeness and external dependency, based on the return values of methods. The smaller the number of business methods without return value, the smaller the possibility of the component having external dependency. High self-completeness of a component (i.e., low external dependency) leads to high portability of the component [91].
Self-Completeness of Component's Parameters (SCCp) The SCCp metric measures the percentage of business methods without any parameters among all business methods implemented in the component, and it is defined as follows:

    SCCp = NPMC / MC   if MC > 0;   1 otherwise.     (7.6)
where the metric NPMC (None Parameter Method Count) is the count of public methods in the component that do not have any parameters.

SCCp indicates the component's degree of self-completeness and external dependency, based on the parameters of methods. The fewer business methods without parameters, the smaller the possibility of having dependency outside the component [91].
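Because metrics such as PPM, SCCr, and SCCp depend only on method signatures, they can be computed directly from an interface with Java reflection. A minimal sketch is shown below; ISample is a made-up interface (loosely modelled on ICustomer), not the actual generated one, and SCCr/SCCp are evaluated here over the interface methods rather than over all component methods:

```java
import java.lang.reflect.Method;
import java.util.Locale;

public class InterfaceMetrics {

    // hypothetical component interface used only for illustration
    interface ISample {
        void setName(String name);
        String getName();
        boolean isSafeDriver();
    }

    public static void main(String[] args) {
        Method[] methods = ISample.class.getMethods();
        int imc = methods.length;   // Interface Method Count
        int ipc = 0;                // Interface Parameter Count
        int vmc = 0;                // methods with void return type
        int npmc = 0;               // methods without parameters
        for (Method m : methods) {
            ipc += m.getParameterCount();
            if (m.getReturnType() == void.class) vmc++;
            if (m.getParameterCount() == 0) npmc++;
        }
        double ppm  = imc > 0 ? (double) ipc  / imc : 0;  // Formula (7.1)
        double sccr = imc > 0 ? (double) vmc  / imc : 1;  // Formula (7.5), on the interface
        double sccp = imc > 0 ? (double) npmc / imc : 1;  // Formula (7.6), on the interface
        System.out.printf(Locale.ROOT,
            "PPM=%.2f SCCr=%.2f SCCp=%.2f%n", ppm, sccr, sccp);
    }
}
```

For an interface, Class.getMethods() returns only the interface's public methods (not those of java.lang.Object), so the counts correspond directly to the declared operations.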
Reusability Model
Reusability is a high-level quality of software components and hence is the result of the combination and interaction of many low-level properties. A component reusability model typically shows reusability as being composed of properties such as complexity, observability, customizability, and external dependency. From the user's point of view, we define a component reusability model as illustrated in Figure 7.4. This model is an adaptation of the reusability model introduced by Washizaki et al. [91]. The quality factors are selected only to provide an analysis of the reusability of a component, while factors related to other aspects of component quality that are not considered important to reusability are excluded. The choice of the three factors affecting reusability has been made on the basis of an analysis of the activities carried out when reusing a black-box component. We extend Washizaki's model to quantify the complexity of components by utilizing the metric Reference Parameter Density (RPD) proposed in [13]. Thus, the adapted model includes aspects related to the Understandability, Adaptability, and Portability factors given by ISO 9126 [1].
[Figure 7.4 shows the reusability model as a hierarchy from characteristic to quality factor to criteria to metric: the characteristic Reusability decomposes into the quality factors Understandability, Adaptability, and Portability; these are assessed through the criteria Complexity, Observability, Customizability, and External Dependency, which are measured by the metrics RPD, RCO, RCC, and SCCr/SCCp, respectively.]

Figure 7.4: The Component Reusability Model.
In order to quantify the reusability of the components generated by our framework, based on the reusability model we formulate the reusability measurement as follows:

    Reusability = w_complexity * RPD
                + w_observability * RCO
                + w_customizability * RCC
                + w_ex-dependency * ((SCCr + SCCp) / 2)     (7.7)

By their definitions, the values of all metrics in the above formula are in [0, 1]. Since complexity and external dependency have a negative effect on reusability, the weights w_complexity and w_ex-dependency take values in [−1, 0], while observability and customizability have a positive effect and hence the weights w_observability and w_customizability take values in [0, 1]. Nevertheless, the sum of these four weights is set to 1. Consequently, the reusability value will be in [0, 1], and a higher value represents a higher level of reusability.
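Formula (7.7) is straightforward to evaluate once the five metric values are known. A minimal sketch follows; the metric values and weights are illustrative (the weights sum to 1, with negative weights on complexity and external dependency), not taken from any actual measurement:

```java
public class ReusabilityScore {

    // Formula (7.7): a weighted combination of the five interface metrics.
    static double reusability(double rpd, double rco, double rcc,
                              double sccr, double sccp,
                              double wComplexity, double wObservability,
                              double wCustomizability, double wExDependency) {
        return wComplexity * rpd
             + wObservability * rco
             + wCustomizability * rcc
             + wExDependency * (sccr + sccp) / 2;
    }

    public static void main(String[] args) {
        // illustrative metric values in [0, 1] and weights summing to 1:
        // -0.3 + 0.8 + 0.7 - 0.2 = 1.0
        double r = reusability(0.2, 0.8, 0.5, 0.4, 0.6,
                               -0.3, 0.8, 0.7, -0.2);
        System.out.printf(java.util.Locale.ROOT, "%.2f%n", r);
    }
}
```

With these sample values the score works out to -0.06 + 0.64 + 0.35 - 0.10 = 0.83, illustrating how the negatively weighted terms pull the score down.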
7.2.2 Architectural Improvement
The software architecture of a program or computing system is the structure of the system, which comprises software components, the externally visible properties of those components, and the relationships among these components. The more complex a system structure is, the more difficult it is to understand, and therefore to maintain. We wish to measure the degree of conformance of the target (restructured) architecture to the architectural principles of high intra-module cohesion and low inter-module coupling. In this thesis, we introduce a metric for measuring a large software system to determine if it is "well-structured", based on the concept of entropy from information theory.
Entropy from an information theoretic point of view has been proposed in [78] for evaluating the structuredness of a software's design. We adopt the definition of entropy for an object-oriented design introduced in [20] to compute the entropy of our source systems and target systems, respectively. The smaller the entropy value, the better structured the system is. We then compare the results to see whether the structures of our target systems are improved. The entropy of an object-oriented system S with n classes is defined as follows [20]:

    H(S) = − Σ_{i=1}^{n} p(ci) log2 p(ci)     (7.8)
It is assumed that the system is described in a standard class diagram format following UML notation for associations between classes. For a randomly selected unary association, p(ci) is defined as the probability that the association leads to class ci. The existence of such an association indicates that class ci provides services to the rest of the system, since it responds to messages sent to it. Within this context, bi-directional associations are treated as two separate unary associations. Classes are used as the units for entropy measurement because classes represent the most important fundamental building blocks of an object-oriented system and are an identifiable abstraction that is present both in designs and implementations.
To compute the entropy metric of the source system of our framework, let n be the number of classes/interfaces of the source system; we compute p(ci) as the ratio of the number of incoming edges of class ci over the total number of edges in the CIDG of the source system. To compute the entropy metric of the target system of our framework, we consider n as the total number of components and classes/interfaces contained in the target system, and we then compute p(ci) in the same way as in the source system, except that there may exist an association between a class/interface and a component.
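Given the per-class incoming-edge counts from the CIDG, Formula (7.8) reduces to a short computation. A minimal sketch, with a hand-picked edge distribution rather than counts from a real system:

```java
public class DesignEntropy {

    // H(S) = - sum_i p(c_i) * log2 p(c_i), where p(c_i) is the share of
    // all edges whose target is class c_i (Formula 7.8).
    static double entropy(int[] incomingEdges) {
        int total = 0;
        for (int e : incomingEdges) total += e;
        double h = 0;
        for (int e : incomingEdges) {
            if (e == 0) continue;              // unused classes contribute 0
            double p = (double) e / total;
            h -= p * Math.log(p) / Math.log(2); // log base 2 via change of base
        }
        return h;
    }

    public static void main(String[] args) {
        // four classes, each the target of exactly one association:
        // maximally spread-out usage gives log2(4) = 2 bits
        System.out.printf(java.util.Locale.ROOT, "%.2f%n",
            entropy(new int[]{1, 1, 1, 1}));
    }
}
```

Concentrating all incoming edges on a single class would instead drive the entropy toward 0, which is why a lower value is read as a better-structured system.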
7.3 Case Study: Jetty

In this section, we apply the JComp Java Componentization Kit to Jetty [46] to empirically evaluate the usefulness of the proposed SOC4J framework.
7.3.1 Statistics of the Jetty
Jetty is an open-source, standards-based, full-featured web server implemented entirely in Java. It is released under the Apache 2.0 licence and is therefore free for commercial use and distribution. Jetty can be used as: i) a stand-alone traditional web server for static and dynamic content, ii) a dynamic content server behind a dedicated HTTP server such as Apache using the Apache module mod_proxy, and iii) an embedded component within a Java application.
Project   Version   LOC     Java Source Files   Packages   Classes   Interfaces
Jetty     5.1.10    44125   318                 25         273       47

Table 7.1: Statistics of the Jetty.
As shown in Table 7.1, we work on Jetty version 5.1.10, which was released on April 5, 2006. It has about 44K lines of source code and consists of 318 Java source files that define 273 classes and 47 interfaces distributed over 25 packages.
7.3.2 Discussions on Obtained Results
In order to componentize the Jetty system, we first applied the JComp Java Componentization Kit
to identify business services embedded in the system. The JComp then generated a self-contained
component for each identified service.
The Parser plug-in of the JComp imported the source code of the Jetty and built a set of source code models. These source code models were exported and stored as XML documents. The Modeler plug-in imported the source code models and recovered architectural models that are represented by the CIRG and CIDG. Like the source code models, the CIRG and CIDG were exported and stored as XML documents. Firstly, based on the CIRG and CIDG, the Extractor plug-in, which implements the top-level service identification algorithm (i.e., Algorithm 5.2) and the low-level service identification algorithm (i.e., Algorithm 5.3), identified 33 top-level service
Figure 7.5: The Accepted Service View of the Extractor plug-in.
candidates from the CIDG. We then validated each candidate by examining the facade class set of these candidates, and accepted 16 top-level services. These 16 top-level services represent the functionality of the Jetty from the point of view of end users. Appendix A lists and describes all accepted top-level services of the Jetty web server. Figure 7.5 depicts the accepted Service View of the Extractor plug-in, which displays all accepted top-level services of the Jetty. The unacceptable candidates are dead code, debugging modules, or testing modules. For instance, we found 8 dead classes in the org.mortbay.util package and a debugging module whose entry point is
the class org.mortbay.servlet.ProxyServlet.
ID    Top-Level Service               Classes/interfaces   Low-Level Services
T1    Win32 Server                    248                  11
T2    Dynamic Servlet Invoker         207                  12
T3    Jetty Server MBean              126                  9
T4    Proxy Request Handler           113                  7
T5    XML Configuration MBean         87                   5
T6    Web Application MBean           86                   6
T7    Administration Servlet          56                   5
T8    CGI Servlet                     49                   5
T9    Host Socket Listener            46                   5
T10   Web Configuration               34                   3
T11   Authentication Access Handler   30                   3
T12   Servlet Response Wrapper        27                   2
T13   IP Access Handler               18                   0
T14   Multipart Form Data Filter      16                   2
T15   HTML Script Block               12                   1
T16   Applet Block                    9                    1

Table 7.2: Top-Level Services Identified from Jetty.
After all the top-level services were validated, the Extractor plug-in then identified low-level
services underneath each top-level service. Table 7.2 shows the atomic services and identified
low-level services for each top-level service. The atomic services of a top-level service are the
Java classes or interfaces that implement the top-level service; they are represented by nodes of
the original SHG of the service. For example, as Table 7.3 shows, there are 11 low-level services
identified from top-level service Win32 Server (i.e., top-level service T1). This top-level service
runs the Jetty as a Windows HTTP server. When identifying low-level services, we used
Termination Criterion 5.1 described in Chapter 5 to terminate the iteration in Algorithm 5.3 by
setting MQ = 0.75. In cases where the level of granularity of services is crucial, the user may use
Termination Criterion 5.2 for Algorithm 5.3. As Figure 7.6 shows, we terminated the low-
level service identification process at the fifth iteration. The final low-level services identified for
top-level service Win32 Server are shown in Table 7.3.
Figure 7.6: Iterations of the Service Aggregation Process of Top-Level Service Win32 Server.
(The figure shows the original SHG, the 1st iteration, the 2nd iteration, and the final iteration.)
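The iterative aggregation described above can be sketched as a loop driven by a termination threshold. This is a hedged illustration only: the generic graph type, the aggregation pass, and the MQ measure below are hypothetical stand-ins, not the thesis's actual graph transformations or its definition of MQ.

```java
import java.util.function.ToDoubleFunction;
import java.util.function.UnaryOperator;

public final class Aggregation {
    /**
     * Repeatedly applies one aggregation pass to a service hierarchy graph
     * until its modularization quality reaches the threshold
     * (Termination Criterion 5.1; MQ = 0.75 in this study).
     */
    static <G> G aggregateUntil(G shg, UnaryOperator<G> pass,
                                ToDoubleFunction<G> mq, double threshold,
                                int maxIterations) {
        G current = shg;
        for (int i = 0; i < maxIterations; i++) {
            if (mq.applyAsDouble(current) >= threshold) break;
            current = pass.apply(current); // one graph-transformation pass
        }
        return current;
    }

    public static void main(String[] args) {
        // Toy stand-in: the "MQ" of an integer state rises by 0.17 per pass,
        // so the loop stops after the fifth pass (0.85 >= 0.75).
        int passes = aggregateUntil(0, n -> n + 1, n -> n * 0.17, 0.75, 100);
        System.out.println(passes); // prints 5
    }
}
```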
To realize each identified service (both top-level and low-level services), the Generator
plug-in generated a self-contained component for each service. Figure 7.7 illustrates the
component hierarchy graph (CHG) of the top-level component Win32 Server. There are 11
low-level components contained in the top-level component. Furthermore, the Generator plug-in
measured the reusability of each generated component, applying the component reusability model
by computing Formula (7.7). In this empirical study, we set w_complexity = -0.3,
w_observability = 0.8, w_customizability = 0.8, and w_ex-dependency = -0.3. Figure 7.8 shows
the reusability values of the top-level components of Jetty and the average reusability of the
low-level components in each top-level component.
Figure 7.7: The CHG of Top-Level Component Win32 Server of the Jetty. (The nodes of the
CHG include Win32 Server, Jetty Server, HTTP Connection, HTTP Request, HTTP Response,
Security Handler, Service Handlers, Resource Handler, Servlet Handler, Web Application
Context, and Servlet.)
Figure 7.8: The Reusability of Components Extracted from Jetty (the reusability of each
top-level component and the average reusability of low-level components in each top-level
component).
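The weighted reusability computation described above can be illustrated as follows. This is only a sketch of the weighted-sum structure with the weights used in the study; the exact form of Formula (7.7), the underlying metric definitions, and their normalization are not reproduced here, and the class and method names are hypothetical.

```java
// Hedged sketch of a weighted component-reusability score. The metric
// inputs are assumed to be normalized to [0, 1]; negative weights penalize
// complexity and external dependencies.
public final class ReusabilityModel {
    static final double W_COMPLEXITY = -0.3;
    static final double W_OBSERVABILITY = 0.8;
    static final double W_CUSTOMIZABILITY = 0.8;
    static final double W_EX_DEPENDENCY = -0.3;

    static double reusability(double complexity, double observability,
                              double customizability, double exDependency) {
        return W_COMPLEXITY * complexity
             + W_OBSERVABILITY * observability
             + W_CUSTOMIZABILITY * customizability
             + W_EX_DEPENDENCY * exDependency;
    }

    public static void main(String[] args) {
        // A component with low complexity, high observability and
        // customizability, and few external dependencies scores high.
        System.out.printf("%.2f%n", reusability(0.2, 0.9, 0.8, 0.1)); // prints 1.27
    }
}
```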
The Transformer plug-in transformed the Jetty into a component-based system based on the
generated components. We named the target system Jetty-JComp. As shown in Algorithm 6.1,
Jetty-JComp has the same functionality as Jetty. Jetty-JComp now contains 16 independent
JAR files. Each JAR file provides a top-level service and can be used independently. Also, each
independent JAR file is a component-based system that consists of a set of JAR files.
We have computed the entropy of both Jetty and Jetty-JComp by applying Formula (7.8).
When computing the entropy of Jetty-JComp, we used the component hierarchy graphs instead
of the CIDG because Jetty-JComp is comprised of components. We found that the entropy of
Jetty-JComp was reduced by 45.5% compared to the original Jetty project. Hence, we can
conclude that our transformation dramatically improves the structure of the system.
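Since Formula (7.8) is not restated in this section, the sketch below illustrates one common notion of structural entropy, the Shannon entropy of the distribution of classes over modules, as a stand-in; the thesis's actual entropy definition over the CIDG and CHGs may differ.

```java
import java.util.List;

public final class StructuralEntropy {
    /** Shannon entropy (in bits) of a partition, given each module's class count. */
    static double entropy(List<Integer> moduleSizes) {
        int total = moduleSizes.stream().mapToInt(Integer::intValue).sum();
        double h = 0.0;
        for (int size : moduleSizes) {
            if (size == 0) continue;            // empty modules contribute nothing
            double p = (double) size / total;   // fraction of classes in this module
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    public static void main(String[] args) {
        // Two example partitions of 100 classes into 4 modules.
        System.out.printf("%.3f%n", entropy(List.of(25, 25, 25, 25))); // prints 2.000
        System.out.printf("%.3f%n", entropy(List.of(70, 10, 10, 10)));
    }
}
```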
In Table 7.4, we summarize the time and space complexity of the proposed service-oriented
componentization framework as a function of the source code size of the Jetty project. The
experiment was carried out on a Windows desktop with an Intel Pentium IV 3.4 GHz CPU and
2 GB of memory.
Measurement Item                                      Value
Case Study Size (KLOC)                                44.1
Source Code Modeling Time (min:sec)                   2:18
Source Code Model Space (MB)                          1.43
Architecture Modeling Time (min:sec)                  4:19
Architecture Model Space (MB)                         1.57
Top-Level Service Identification Time (min:sec)       6:45
Average Low-Level Service Identification Time (sec)   66
Table 7.4: Some Time and Space Statistics of the SOC4J Framework on the Case Study: Jetty.
7.4 Case Study : Apache Ant
In this section, we apply the JComp Java Componentization Kit to another open source Java
project, namely Apache Ant [2], to further evaluate the usefulness of the proposed SOC4J
framework.
7.4.1 Statistics of the Apache Ant
Apache Ant is a software tool for automating software build processes. It is similar to
make but is written in the Java language and is primarily intended for use with Java. The most
immediately noticeable difference between Ant and make is that Ant uses a file in XML format to
describe the build process and its dependencies, whereas make has its own Makefile format. By
default the XML file is named build.xml. Ant is an Apache project. It is open source software
and is released under the Apache Software License 2.0.
As shown in Table 7.5, we work on Apache Ant version 1.6.5, which was the latest version at
the time of this study. It has around 86 KLOC of source code and consists of 690 Java source
files that define 640 classes and 60 interfaces distributed in 70 packages.
Project Version LOC Java Source Files Packages Classes Interfaces
Apache Ant 1.6.5 86468 690 70 640 60
Table 7.5: Statistics of the Apache Ant.
7.4.2 Discussions on Obtained Results
To componentize the Apache Ant system, as we did for the Jetty, we first applied the JComp
Java Componentization Kit to identify the business services embedded in the system. Then, the
JComp generated a self-contained component for each identified service.
Figure 7.10 shows the reusability values of the top-level components of the Apache Ant and the
average value of the low-level components underneath each top-level component. From
Figure 7.10, it was observed
that all top-level components, except C30, have reusability values above 0.5 and all the average
values are between 0.5 and 0.9. Thus, we can conclude that the services identified from the
Apache Ant project have a reasonable level of reusability.
Based on the generated components, the Transformer plug-in transformed the Apache Ant
into a component-based system. We named the target system Apache Ant-JComp. As shown in
Algorithm 6.1, Apache Ant-JComp has the same functionality as the Apache Ant. Apache
Ant-JComp now contains 101 independent JAR files. Each JAR file provides a top-level service
and can be used independently. Since we have further decomposed only 20 top-level components,
each of these 20 corresponding JAR files is a component-based system that consists of a set of
JAR files (i.e., low-level components). Also, we have computed the entropy of both Apache Ant and
Apache Ant-JComp by applying Formula (7.8). Again, when computing the entropy of Apache
Ant-JComp, we used the component hierarchy graphs instead of the CIDG because Apache
Ant-JComp is comprised of components. We found that the entropy of the Apache Ant-JComp
was reduced by 16.3% compared to the original Apache Ant project. The reduction in entropy is
not as large as that of Jetty-JComp, because we componentized only 20 of the 101 top-level
services identified from the Apache Ant project.
Measurement Item                                      Value
Case Study Size (KLOC)                                86.5
Source Code Modeling Time (min:sec)                   5:20
Source Code Model Space (MB)                          3.34
Architecture Modeling Time (min:sec)                  9:15
Architecture Model Space (MB)                         3.92
Top-Level Service Identification Time (min:sec)       19:43
Average Low-Level Service Identification Time (sec)   54
Table 7.8: Some Time and Space Statistics of the SOC4J Framework on the Case Study: Apache Ant.
In Table 7.8, we summarize the time and space complexity of the proposed service-oriented
componentization framework as a function of the source code size of the Apache Ant project. The
experiment was carried out on a Windows desktop with an Intel Pentium IV 3.4 GHz CPU and
2 GB of memory.
7.5 Summary
The design and implementation of supporting tools are fundamental requirements for assessing
the practical use of a re-engineering approach. In this chapter, we developed a toolkit
implementing the proposed componentization framework as an Eclipse Rich Client Platform (RCP)
application. The important aspects of the proposed framework have been tested through a series
of experiments. The empirical studies have shown that the proposed framework is effective in
identifying services from an existing Java system and reconstructing it as a component-based
system.
Chapter 8
Future Directions and Conclusions
In this chapter, we summarize the findings of this thesis and outline future research directions
that may arise from this research. In Section 8.1, we present the contributions of this thesis, and
in Section 8.2, we discuss some future work that could extend this research. Finally, we make
some concluding remarks on this work in Section 8.3.
8.1 Contributions
The principal contributions of this thesis were stated in Chapter 1. Based on the material already
presented, we discuss them in more detail:
• The design and implementation of comprehensive graph representations of an object-oriented
system at different levels of abstraction. These graph representations include the class/interface
relationship graph (CIRG), the class/interface dependency graph (CIDG), modularized
CIDGs (MCIDGs), service hierarchy graphs (SHGs), and component hierarchy graphs
(CHGs). Each graph represents the system at a different level of abstraction.
CHAPTER 8. FUTURE DIRECTIONS AND CONCLUSIONS 120
• The exploration of an incremental program comprehension approach, including describing
an object-oriented software system using different concurrent views, each of which
addresses a specific set of concerns of the system. The SOC4J framework extracts four
views to understand an object-oriented software system. The extracted source code models
provide the basic view (BView), while the recovered architectural models build the struc-
tural view (SView), the identified top-level services together with their service hierarchy
graphs give the service view (ServView), and the generated top-level components together
with their component hierarchy graphs introduce the component view (CompView) of the
system. Each view assists the user in understanding the system from a different perspective.
• The design and implementation of an efficient and effective methodology for identifying
and realizing critical business services embedded in an existing object-oriented system.
The business services embedded in an existing system were categorized into two classes:
Top-Level Services (TLS) and Low-Level Services (LLS). A top-level service is a service
that is not used by any other service of the system. However, it may contain a hierarchy of
low-level services further describing the service. From the requester's point of view,
top-level services are services provided by the system that can be accessed independently. A
low-level service is a service that is underneath a top-level service and may be agglomerated
with other low-level services to yield a new service with a higher level of granularity. The
service identification methodology is a combination of top-down and bottom-up techniques. In
the top-down portion of the methodology, we identify the top-level services and the atomic
services underneath each top-level service by identifying the entry points of the system. In
the bottom-up portion, we aggregate the atomic services to identify services with a higher
level of granularity by applying a series of graph transformations. The service aggregation
is performed incrementally.
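The top-down step above, finding services that no other service uses, can be sketched as locating the nodes with in-degree zero in a usage graph. The adjacency-map representation and the names below are illustrative assumptions, not the thesis's CIDG structures.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public final class TopLevelServices {
    /** A top-level service is a node that no other service uses (in-degree zero). */
    static Set<String> topLevel(Map<String, Set<String>> uses) {
        Set<String> candidates = new TreeSet<>(uses.keySet());
        for (Set<String> targets : uses.values()) {
            candidates.removeAll(targets); // used by some service -> not top-level
        }
        return candidates;
    }

    public static void main(String[] args) {
        // Hypothetical usage graph: Win32Server uses JettyServer, which
        // uses HttpConnection; only Win32Server is unused by others.
        Map<String, Set<String>> uses = new HashMap<>();
        uses.put("Win32Server", Set.of("JettyServer"));
        uses.put("JettyServer", Set.of("HttpConnection"));
        uses.put("HttpConnection", Set.of());
        System.out.println(topLevel(uses)); // prints [Win32Server]
    }
}
```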
• The design and implementation of an object-oriented restructuring methodology that trans-
forms the typically monolithic architecture of an existing system into a more flexible
service-oriented architecture. For each identified service (both top-level services and low-level
services), we generate a self-contained component. A component that realizes a top-level
service is called a Top-Level Component (TLC), while a component that realizes a low-level
service is called a Low-Level Component (LLC). Based on the extracted components, a
meta-model for the component-based target system is designed. We introduce a reconstruction
technique that automatically reconstructs the existing source system into a component-based
system.
• The design and implementation of a prototype system that supports the identification and
realization of critical business services embedded in a Java software system and the com-
ponentization of the Java system. The prototype is designed as an Eclipse Rich Client Plat-
form (RCP) application and named the JComp Java Componentization Kit. A set of JComp
plug-ins has been developed to implement the techniques introduced in the framework. A
series of empirical studies has been performed with the JComp toolkit.
8.2 Future Work
Several new research questions have arisen from this work. We believe that significant improve-
ments can be made in some aspects of the presented approach. Possible future work is
presented as follows:
• To apply the dynamic analysis on system behavior within the first stage of the SOC4J
framework to improve the detection of class relationships.
• To investigate algorithmic processes that can be used to automatically categorize the iden-
tified services.
• To measure the reusability and maintainability of the extracted components more concisely.
• To verify that our definitions are consensual with respect to developers’ intent when per-
forming software re-engineering.
• To apply our componentization toolkit, JComp, on more real-life programs and to validate
their results with the program developers.
• To extend our approach to other programming languages, for instance C++ programs, or
even C and COBOL systems.
• To develop our approach with more flavors of binary class relationships, such as shared-
aggregation and container relationships.
• To improve the precision of the service identification by considering design-patterns, alter-
nate implementations of the algorithms, and alternate definitions of the class relationships.
8.3 Conclusions
In this thesis, we presented a service-oriented componentization framework for Java systems.
The framework componentizes an object-oriented system to re-modularize the existing assets for
supporting service functionality. We introduced an approach for identifying, modeling, and pack-
aging critical business services embedded in an existing system. In addition to producing reusable
components realizing the identified services, the framework also provides a component-based in-
tegration approach to migrate an object-oriented design to a service-oriented architecture. Our
initial evaluation has shown that our framework is effective in identifying services from an object-
oriented design and migrating it to a service-oriented architecture. Moreover, the BView, SView,
ServView, and CompView built by our framework help users gain a program understanding of
the system.
Appendix A
Top-Level Services of Jetty
ID   Top-Level Service           Atomic Services  Description
T1   Win32 Server                248  Runs the Jetty as a Windows HTTP server.
T2   Dynamic Servlet Invoker     207  Invokes anonymous servlets that have not been defined in the web.xml or by other means.
T3   Jetty Server MBean          126  Configures a request log, which records all incoming HTTP requests.
T4   Proxy Request Handler       113  Makes the HTTP/1.1 proxy requests.
T5   XML Configuration MBean     87   Performs all required configurations for running the SESM applications in Jetty containers.
T6   Web Application MBean       86   Manages web applications' lifecycle.
T7   Administration Servlet      56   Jetty Administration Servlet. Allows start and/or stop of server components and control of debug parameters.
T8   CGI Servlet                 49   Runs CGI servlets on Windows.
T9   Host Socket Listener        46   Declares a socket listener for a Jetty HTTP server.
T10  Web Configuration           34   Creates web container configurations.
T12  Servlet Response Wrapper    27   Wraps a Jetty HTTP response as a 2.2 Servlet response.
T13  IP Access Handler           18   Creates a handler to authenticate access from certain IP addresses.
T14  Multipart Form Data Filter  16   Decodes the multipart/form-data stream sent by an HTML form that uses a file input item.
T15  HTML Script Block           12   Represents the script block in an HTML form.
T16  Applet Block                9    Represents the applet block in an HTML form.
Table A.2: Top-Level Services of Jetty (2).
Appendix B
Top-Level Services of Apache Ant
ID   Top-Level Service           Atomic Services  Description
T1   Project Building            205  Runs Ant on a supplied build file.
T2   JAR File Expansion          164  Unzips a jar file.
T3   WAR File Creation           152  Creates Web Application Archive files.
T4   TAR File Creation           144  Creates a tar archive.
T5   Zip File Expansion          117  Unzips a zip file.
T6   SQL Statement Execution     116  Executes a series of SQL statements via JDBC to a database.
T7   JUnit Invocation            114  Runs tests from the JUnit testing framework.
T8   JAR File Creation           113  Jars a set of files.
T9   TAR File Expansion          95   Expands a tar file.
T10  File Packing                92   Packs a file using the GZip or BZip2 algorithm.
T11  Unit Test Execution         86   Executes a unit test in the org.apache.testlet framework.
T12  WAR File Expansion          83   Unzips a war file.
T13  RPM Invocation              81   Invokes the rpm executable to build a Linux installation file.
T14  File Content Loading        80   Loads a file's contents as Ant properties.
T15  Metamata MParse Invocation  71   Invokes the Metamata MParse compiler-compiler on a grammar file.
T16  CAB File Creation           67   Creates Microsoft CAB Archive files.
Table B.1: Top-Level Services of Apache Ant (1).
APPENDIX B. TOP-LEVEL SERVICES OF APACHE ANT 126
ID   Top-Level Service                  Atomic Services  Description
T17  SSH File Copy                      67  Copies files to or from a remote server using SSH.
T18  Build File DTD Generation          67  Generates a DTD for Ant build files that contains information about all tasks currently known to Ant.
T19  File Encoding Converting           65  Converts files from native encodings to ASCII with escaped Unicode.
T20  Task Adding                        59  Adds a task definition to the current project, such that this new task can be used in the current project.
T21  Zip File Creation                  57  Creates a zip file.
T22  Macro Task Definition              56  Defines a new task as a macro built up upon other tasks.
T23  Path Converting                    56  Converts a path format from one platform to another platform.
T24  FTP Implementation                 56  Implements a basic FTP client that can send, receive, list, and delete files, and create directories.
T25  XML File Checking                  54  Checks that XML files are valid (or only well-formed).
T26  File Expansion                     52  Expands a file packed using GZip or BZip2.
T27  Directory Property Setting         51  Sets a property to the value of the specified file up to, but not including, the last path element.
T28  File Availability Property Setting 50  Sets a property if a specified file, directory, class in the classpath, or JVM system resource is available at runtime.
T29  Path Property Setting              50  Sets a property to the last element of a specified path.
T30  Java Class Execution               45  Executes a Java class within the running (Ant) VM, or in another VM if the fork attribute is specified.
T31  Dependency Manifest Generation     45  Generates a manifest that declares all the dependencies in manifest.
T32  Key Generation                     43  Generates a key in a key store.
T33  Property Setting                   43  Sets a property (by name and value), or set of properties (from a file or resource) in the project.
T34  XML Property File Loading          43  Loads property values from a well-formed XML file.
T35  Web Proxy Property Setting         43  Sets Java's web proxy properties.
T36  XML Report Generation              43  Generates an XML report of the changes recorded in a CVS repository.
Table B.2: Top-Level Services of Apache Ant (2).
ID   Top-Level Service                   Atomic Services  Description
T37  File Token Identification           40  Identifies keys in files, delimited by special tokens, and translates them with values read from resource bundles.
T38  Java Class Instrumenting            39  Instruments Java classes using the iContract DBC preprocessor.
T39  Existing Task Instrumenting         39  Defines a new task by instrumenting an existing task with default values for attributes or child elements.
T40  File Loading                        39  Loads a file into a property.
T41  Splash Screen Display               38  Displays a splash screen.
T42  File Set Packing                    37  GZips a set of files.
T43  CVS Pass Entry Adding               37  Adds entries to a .cvspass file.
T44  File Checksum Generation            36  Generates a checksum for a file or set of files.
T45  Default Exclude Pattern Modification 36  Modifies the list of default exclude patterns from within your build file.
T46  JDepend Invocation                  35  Invokes the JDepend parser.
T47  Time Stamp Setting                  35  Sets the DSTAMP, TSTAMP, and TODAY properties in the current project, based on the current date and time.
T48  GZip File Expansion                 34  Expands a GZip file.
T49  File Concatenation                  34  Concatenates multiple files into a single one or to Ant's logging system.
T50  Directory Synchronization           34  Synchronizes two directory trees.
T51  Condition Property Setting          34  Sets a property if a certain condition holds true.
T52  File Version Checking               34  Sets a property if a given target file is newer than a set of source files.
T53  Telnet Session Generation           34  Automates a remote telnet session.
T54  Attribute Permission Change         33  Changes the permissions and/or attributes of a file or all files inside the specified directories.
T55  Build File Importing                32  Imports another build file and potentially overrides targets in it with users' own targets.
T56  JJTree Invocation                   32  Invokes the JJTree preprocessor for the JavaCC compiler-compiler.
T57  Resource Search                     32  Finds a class or resource.
T58  Temp File Generation                31  Generates a name for a new temporary file and sets the specified property to that name.
T59  Remote Command Execution            30  Executes a command on a remote server using SSH.
T60  Manifest Creation                   29  Creates a manifest file.
Table B.3: Top-Level Services of Apache Ant (3).
ID   Top-Level Service                Atomic Services  Description
T61  Documentation Generation         29  Generates code documentation using the javadoc tool.
T62  XSLT Transformation              29  Processes a set of documents via XSLT.
T63  CVS Repository Retrieval         29  Handles packages/modules retrieved from a CVS repository.
T64  SMTP Email Sending               28  Sends SMTP emails.
T65  User Input                       28  Allows user interaction during the build process by displaying a message and reading a line of input from the console.
T66  JProbe Invocation                27  Invokes the JProbe suite.
T67  Stylebook Invocation             26  Executes the Apache Stylebook documentation generator.
T68  File Comparison                  26  Compares a set of source files with a set of target files; if any of the source files is newer than any of the target files, all the target files are removed.
T69  JavaCC Invocation                26  Invokes the JavaCC compiler-compiler on a grammar file.
T70  Regular Expression Replacement   25  Replaces the occurrence of a given regular expression with a substitution pattern in a file or set of files.
T71  JJDoc Invocation                 25  Invokes the JJDoc documentation generator for the JavaCC compiler-compiler.
T72  Current Property Listing         25  Lists the current properties.
T73  EAA File Creation                24  Creates Enterprise Application Archive files.
T74  File Permission Change           23  Changes the permissions of a file or all files inside the specified directories.
T75  File Deletion                    23  Deletes either a single file, all files and subdirectories in a specified directory, or a set of files specified by one or more FileSets.
T76  Data Type Adding                 23  Adds a data-type definition to the current project, such that this new type can be used in the current project.
T77  Change Report File Generation    23  Generates an XML-formatted report file of the changes between two tags or dates recorded in a CVS repository.
T78  File Move                        21  Moves a file to a new file or directory, or a set(s) of file(s) to a new directory.
T79  Log Recording                    21  Runs a listener that records the logging output of the build-process events to a file.
T80  Project Building Termination     21  Exits the current build by throwing a BuildException, optionally printing additional information.
Table B.4: Top-Level Services of Apache Ant (4).
ID    Top-Level Service               Atomic Services  Description
T81   Property File Creation          21  Creates or modifies property files.
T82   MMetrics Computation            19  Computes the metrics of a set of Java source files, using the Metamata Metrics/WebGain Quality Analyzer source-code analyzer.
T83   Script Execution                19  Executes a script in an Apache BSF-supported language.
T84   TAB Updating                    18  Modifies a file to add or remove tabs, carriage returns, line feeds, and EOF characters.
T85   URL File Retrieval              18  Gets a file from a URL.
T86   Extension Checking              18  Checks whether an extension is present in a file set or an extension set. If the extension is present, the specified property is set.
T87   Command Execution               17  Executes a system command.
T88   File Modification Time Change   17  Changes the modification time of a file and possibly creates it at the same time.
T89   Sound File Execution            17  Plays a sound file at the end of the build, according to whether the build failed or succeeded.
T90   ANTLR Invocation                17  Invokes the ANTLR Translator generator on a grammar file.
T91   JNI Header Generation           17  Generates JNI headers from a Java class.
T92   String Replacement              16  Replaces the occurrence of a given string with another string in a selected file.
T93   MAudit Computation              15  Performs static analysis on a set of Java source-code and byte-code files, using the Metamata Metrics/WebGain Quality Analyzer source-code analyzer.
T94   Directory Creation              15  Creates a directory.
T95   Text Output                     15  Echoes text to System.out or to a file.
T96   File Copying                    13  Copies a file or FileSet to a new file or directory.
T97   File Group Ownership Change     12  Changes the group ownership of a file or all files inside the specified directories.
T98   Project Filter Setting          12  Sets a token filter for this project, or reads multiple token filters from a specified file and sets these as filters.
T99   Source Code Extraction          12  Allows the user to extract the latest edition of the source code from a PVCS repository.
T100  File Ownership Change           11  Changes the owner of a file or all files inside the specified directories.
T101  JAR File Information Display    9   Displays the "Optional Package" and "Package Specification" information contained within the specified jars.
Table B.5: Top-Level Services of Apache Ant (5).
Bibliography
[1] Software product evaluation - quality characteristics and guidelines for their use. ISO/IEC
Standard 9126, 1991.
[2] Apache Ant. A Java-based build tool. http://ant.apache.org/, 2006.
[3] Jagdish Bansiya and Carl G. Davis. A class cohesion metric for object-oriented designs.
Journal of Object-Oriented Programming, 11:47-52, January 1999.
[4] Jagdish Bansiya and Carl G. Davis. A hierarchical model for object-oriented design quality
assessment. IEEE Transactions on Software Engineering, 28:4-17, January 2002.
[5] V. Basili, L. Briand, and W. Melo. A validation of object-oriented design metrics as quality
indicators. IEEE Transactions on Software Engineering, 22:751-761, October 1996.
[6] L. Belady and C. Evangelisti. System partitioning and its measure. Journal of Systems and
Software, 2:23-29, 1981.
[7] Martin Bernauer, Gerti Kappel, and Gerhard Kramler. Representing XML Schema in
UML - a comparison of approaches.