On the Use of Feature-Oriented Programming for Evolving Software Product Lines – A Comparative Study

Gabriel Coutinho Sousa Ferreira 1, Felipe Nunes Gaia 1, Eduardo Figueiredo 2 and Marcelo de Almeida Maia 1

1 Federal University of Uberlândia, Brazil
2 Department of Computer Science, Federal University of Minas Gerais, Brazil
{gabriel, felipegaia}@mestrado.ufu.br, [email protected], [email protected]

Abstract. Feature-oriented programming (FOP) is a programming technique based on composition mechanisms, called refinements. It is often assumed that feature-oriented programming is more suitable than other variability mechanisms for implementing Software Product Lines (SPLs). However, there is no empirical evidence to support this claim. In fact, recent research work found out that some composition mechanisms might degenerate the SPL modularity and stability. However, there is no study investigating these properties focusing on the FOP composition mechanisms. This paper presents quantitative and qualitative analysis of how feature modularity and change propagation behave in the context of two evolving SPLs, namely WebStore and MobileMedia. Quantitative data have been collected from the SPLs developed in three different variability mechanisms: FOP refinements, conditional compilation, and object-oriented design patterns. Our results suggest that FOP requires few changes in source code and a balanced number of added modules, providing better support than other techniques for non-intrusive insertions. Therefore, it adheres closer to the Open-Closed principle. Additionally, FOP seems to be more effective tackling modularity degeneration, by avoiding feature tangling and scattering in source code, than conditional compilation and design patterns. These results are based not only on the variability mechanism itself, but also on careful SPL design. However, the aforementioned results are weaker when the design needs to cope with crosscutting and fine-grained features.
Keywords: Software product lines, Feature-oriented programming, Variability management, Design patterns, Conditional compilation.

1 Introduction

Software Product Lines (SPLs) [17] are known to enable large scale reuse across applications that share a similar domain. The potential benefits of SPLs are achieved through a software architecture designed to increase reuse of features in several SPL products. There are common features found on all products of the product line (known as mandatory features) and variable features that allow distinguishing between products in a product line (generally represented by optional or alternative
The empirical cumulative distribution function (ecdf) can be used to evaluate the fit of
a distribution to our data and to compare the different distributions of our sample. The
stepped ecdf resembles a cumulative histogram without bars. The distribution that
best fitted our data was 3-parameter Gamma. Our data definitely does not follow a
normal distribution. Indeed, it does not follow a symmetric distribution. The data
values are typically concentrated in smaller values. In other words, the median values
for the metrics are generally smaller than the mean values.
The ecdf is interpreted as follows: the larger the area under the
curve, the higher the frequency of lower values for the corresponding metric.
Since lower values of the feature modularity metrics indicate better
modularization, we consider the best metric curve to be the one that presents the
highest frequency of lower values.
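As a minimal sketch of this interpretation (not the authors' analysis scripts), the Python fragment below builds a stepped ecdf and computes the exact area under it on a fixed range. The samples `fop` and `cc` are hypothetical metric values invented for illustration, and the values are assumed to be non-negative. The sample with more mass at low values yields the larger area, and its median falls below its mean, as expected for a right-skewed distribution.

```python
from bisect import bisect_right
from statistics import mean, median


def ecdf(sample):
    """Return a function F where F(x) is the fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def F(x):
        return bisect_right(ordered, x) / n

    return F


def area_under_ecdf(sample, upper):
    """Exact area under the stepped ecdf on [0, upper].

    Each observation v (assumed 0 <= v) raises the curve by 1/n from v
    onwards, so it contributes (upper - v) / n to the area.
    """
    return sum(max(0.0, upper - v) for v in sample) / len(sample)


# Hypothetical metric values, invented for illustration only.
fop = [1, 1, 2, 2, 3, 8]
cc = [2, 3, 4, 5, 6, 8]

for name, sample in (("FOP", fop), ("CC", cc)):
    print(name, "area:", round(area_under_ecdf(sample, 8), 3),
          "median:", median(sample), "mean:", round(mean(sample), 3))
```

Comparing areas over the same range is only one way to summarize the curves; when two curves cross, the visual inspection described in the text remains necessary.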
Figures 8 and 9 show the empirical cumulative distribution function for the feature
modularity metrics of WebStore and MobileMedia, respectively. One interesting point
is that WebStore and MobileMedia, despite some differences, have presented an
overall similar behavior, especially in CDC, CDO and LOCC. Concerning CDC, we
can observe that FOP outperformed DP and CC in both systems, and DP
outperformed CC. For CDLOC, we can observe that FOP clearly outperformed CC in
both systems, and clearly outperformed DP in WebStore. In MobileMedia, FOP
only slightly outperformed DP. The fact is that DP had a performance similar to FOP
in WebStore.
For CDO and LOCC, we could not see significant differences among the three
approaches in either system. Nonetheless, it is possible to see a slightly better
performance for FOP in both systems.
In Figure 10, we can observe the trend of the metrics for each
version of the system. In general, the global result presented previously
holds in all versions. However, this version-based analysis shows
that in the first versions, the CDC and CDLOC metrics have a higher frequency of
lower values for FOP. In general, the later the version, the
lower the metric values for all approaches and the smaller the difference between
the approaches, although the difference remains discriminative in the case of CDC.
Figure 11 presents the same metric values from the feature point of view. We could
see that, independently of the approach used, some features tend to produce
similar behavior. Some features behave remarkably worse than all the others
for all metrics, such as AlbumManagement (Black) and PhotoManagement (Dashed
Blue). They were followed by Base (Dashed Red) and SMS Transfer (Dashed Green).
These features are naturally complex. Concerning CDLOC, we can observe that,
besides the aforementioned features, none of the approaches achieved good metric values for
the features Sorting (Blue Dashed-Dotted) and Favourites (Lilac Solid-Dotted).
Figure 8. Empirical CDF for all versions of WebStore (3-parameter Gamma)
Figure 9. Empirical CDF for all versions of MobileMedia (3-parameter Gamma)
Figure 10. Empirical CDF per version of MobileMedia (3-parameter Gamma)
Figure 11. Empirical CDF per feature of MobileMedia (3-parameter Gamma)
5.4 Discussion
FOP succeeds in features with no shared code. This situation was observed with six
features of the MobileMedia SPL, namely, CreateAlbum, DeleteAlbum, CreatePhoto,
DeletePhoto, EditPhotoLabel, and ViewPhoto. Some features with no shared code in
WebStore SPL, namely, DisplayByCategory and DisplayWhatIsNew, produced
similar results. The common characteristic of these features is that there is no source
code sharing or overlapping, i.e., they do not share statements, methods or
components with other features. The FOP solution presents lower values and superior
modularity in terms of tangling (CDLOC) and scattering over components (CDC),
which are supported by data in Figures 6 to 11. Figure 11, for instance, shows that the
measured curves of these features are concentrated in lower values with FOP. The
effectiveness of FOP mechanisms to modularize these features is due to the ability to
move the code in charge of realizing the feature from large classes to a set of small
cohesive class refinements. Conditional compilation lacks this ability because it has a
somewhat intrusive effect on the code: it requires adding #ifdef and
#endif clauses at the places where features crosscut. The results obtained from
this quantitative analysis corroborate the common knowledge that feature
refinement mechanisms are more adequate to modularize features with no shared
code. The analysis of the other scattering metrics (CDO and LOCC) did not follow
the same trend as CDC, which can be explained by the fact that the granularity of
methods and lines of code is lower and the distribution of features occurs in a
proportional fashion over all mechanisms. On the other hand, since the granularity of
components is higher, the respective impact on modularity metrics is more
observable.
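The contrast between annotation and refinement can be sketched as follows. Python has neither a preprocessor nor AHEAD-style refinements, so this is only an analogy under stated assumptions: a module-level flag stands in for #ifdef/#endif, and a mixin class composed through the method resolution order stands in for a class refinement. All class and function names are hypothetical; only the feature name DeletePhoto is borrowed from the text.

```python
# Annotative style: feature code is interleaved with base code behind a flag,
# standing in for #ifdef DELETE_PHOTO ... #endif blocks.
DELETE_PHOTO = True


class MediaListAnnotated:
    def __init__(self):
        self.photos = []

    def add_photo(self, label):
        self.photos.append(label)

    def commands(self):
        result = ["add"]
        if DELETE_PHOTO:  # feature code tangled into a base method
            result.append("delete")
        return result

    if DELETE_PHOTO:  # feature code guarded inside the base class body
        def delete_photo(self, label):
            self.photos.remove(label)


# Compositional style: the base class knows nothing about the feature.
class MediaListBase:
    def __init__(self):
        self.photos = []

    def add_photo(self, label):
        self.photos.append(label)

    def commands(self):
        return ["add"]


class DeletePhotoRefinement:
    """Adds one method and extends one method without editing the base."""

    def delete_photo(self, label):
        self.photos.remove(label)

    def commands(self):
        return super().commands() + ["delete"]


def compose(base, *refinements):
    """Build a product class; later refinements wrap earlier ones."""
    return type("Product", tuple(reversed(refinements)) + (base,), {})


FullProduct = compose(MediaListBase, DeletePhotoRefinement)
MinimalProduct = compose(MediaListBase)
print(FullProduct().commands(), MinimalProduct().commands())
```

In the compositional variant, all DeletePhoto code lives in one small unit and the base class is left untouched, which is the property that the CDC and CDLOC values reward.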
When optional features are turned mandatory, DP removal may destabilize the SPL
architecture. Another interesting situation that emerged in our
analysis was the behavior of the releases using the DP mechanisms in the transition from
release 3 to release 4 of WebStore. While the FOP solution handles this
particular situation without major issues, we observed growth of the metrics in the
DP implementation when an optional feature was turned mandatory, as observed in
Figures 4 and 5. This problem can be explained by the fact that implementing
an optional feature with DP requires a larger number of components than
implementing the same feature as mandatory. Therefore, developers have to carefully
design a flexible core architecture to allow the inclusion of mandatory features. If the
patterns used to implement optional features are removed when the features become
mandatory, then the architecture may degenerate and become unstable. An alternative
solution would be to keep the features modularized in those patterns and make sure
that the modules are always present in all products. However, this solution would not
be fair to this specific change scenario since, by turning an optional feature into
a mandatory one, we should remove the components responsible for variation, i.e., the
pattern implementation. If we keep pattern modules responsible for an obsolete
variation point, we are keeping needless code in the SPL, which could
adversely affect future evolutions. For instance, the presence of these modules could
make program comprehension tasks more arduous. Moreover, keeping these patterns would
break the compliance between the SPL source code and the feature model, since the SPL
source code would contain modules created to support the instantiation of a
nonexistent variation point.
Crosscutting features are problematic for all studied approaches. We could see
from Figure 11 that the crosscutting features Sorting and Favourites were not
handled by the approaches as well as the majority of the other features. The reason is that the
typical design to introduce these features intrinsically tangles and scatters their code.
The code related to these features is highly tangled in some base components of
MobileMedia, such as ImageData, MediaData, and MediaUtil. Due to this high
coupling, these features are also scattered across the source code of other features.
These components were minimally modularized and, thus, they are almost equally
implemented with the three evaluated mechanisms. In these cases, the use of
aspectual approaches would enhance the modularity of these problematic features, easing
their code separation [7] [8].
Ratio-based analysis of metrics tends to be less discriminative in larger systems.
The larger the evaluated software version, the lower the metric ratios for all
approaches and the smaller the observable difference between the approaches. Hence,
we should consider that the size of the system can affect the discriminative
capability of the metrics to evaluate software modularity and stability. We performed
our analysis based on the ratio of the measured values to the number of components.
Since it is necessary to compare different mechanisms, we expect smaller differences in
metric values for larger systems due to the greater number of components. This
situation stems from the intrinsic nature of the studied metrics, which evaluate
scattering and tangling in relation to the whole system.
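The effect of the denominator can be illustrated with a small calculation. The sketch below is an assumption-laden toy, not the study's data: the raw CDC counts and the release sizes are invented, and the raw counts are deliberately kept fixed so that only the number of components changes.

```python
def metric_ratio(value, n_components):
    """Normalize a raw metric value by the number of components in a release."""
    return value / n_components


def ratio_gap(raw_values, n_components):
    """Difference between the worst and the best normalized value."""
    ratios = [metric_ratio(v, n_components) for v in raw_values.values()]
    return max(ratios) - min(ratios)


# Hypothetical raw CDC counts for one feature under each mechanism.
RAW_CDC = {"FOP": 2, "DP": 4, "CC": 5}
# Hypothetical number of components in three successive releases.
RELEASE_SIZES = [10, 25, 50]

gaps = [ratio_gap(RAW_CDC, size) for size in RELEASE_SIZES]
for size, gap in zip(RELEASE_SIZES, gaps):
    print(f"{size:3d} components: gap between worst and best ratio = {gap:.3f}")
```

With the same absolute difference of three components, the gap between the mechanisms shrinks as the release grows, which is why the curves of later versions lie closer together.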
On the use of a single variability mechanism to construct SPLs. In practice,
developers do not necessarily use only a single mechanism to address all kinds of
features during SPL construction. They often combine two or more variability
mechanisms depending on the kind of feature, feature location and granularity, and
quantification level [6, 34, 46]. Recent research shows that there is no silver bullet
when it comes to mechanisms that manage variability in SPLs [6, 46]. We could have
introduced more independent variables in the study, for example, by using hybrid
approaches. However, there is still a lack of data and studies about the strength of
individual mechanisms. For this reason, we decided to study the approaches
individually to identify their unique characteristics. For example, annotative
approaches, like CC, are well known to support fine-grained extensions on
statements, parameters, and conditional expressions [31, 34]. On the other hand, certain
fine-grained features are very hard, if not impractical, to implement with FOP. All
these points considered, the analysis of individual mechanisms showed that, in
general, FOP refinements provide more benefits related to modularity and change
propagation when compared to CC and DP. In order to draw more specific
conclusions about the mechanisms, such as proposing programming guidelines to
optimize their use, it is necessary to analyze them in more studies considering
different domains, change scenarios, and types of features.
6 Threats to Validity
Even with the careful planning of the study, some factors should be considered in the
evaluation of the results validity. We discuss the study validity with respect to its
conclusion, internal, external, and construct validity [51].
Concerning the conclusion validity, since 60264 data points were collected, the
reliability of the measurement process might be an issue. This issue was alleviated
because the measurements were independently checked by one of the authors who had
not collected the respective data. Moreover, the analysis may have been affected by
spurious evidence since, for instance, modularity metrics were indirectly used to
answer RQ1. In this particular case, we could only draw plausible conclusions since a
stronger data analysis could not be carried out with such indirect measurement.
Concerning the internal validity, most analyzed versions of the SPLs were
constructed by the authors for the purpose of this study. Different design options
might have produced different results. WebStore was inspired by a previous Java
application, named PetStore [16], developed based on industry-strength technology,
such as Java Server Pages (JSP) and Servlets. Additionally, its successive releases
were discussed among the developers and carefully developed to employ each
implementation technique in its most widely used form. All CC releases of MobileMedia
were designed and implemented in previous studies [24]. Therefore, in this case we
only adapted the available releases to conform to the DP and FOP designs.
Another issue with respect to internal validity is that the modularity metrics
depend on how accurate the mapping (assignment) of each concern to code
elements was. Fortunately, we observed in a previous study [25] that, apart from Concern
Diffusion over Lines of Code (CDLOC), the mapping process does not significantly
impact the modularity metrics used in this paper. Additionally, in order to mitigate
this threat, we relied on concern mappings produced by the original developers.
Whether or not the concern mapping was fully correct, it reflects how these
metrics would be used in practice.
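To make concrete how a concern mapping drives these metrics, the following simplified sketch computes CDC, CDO, LOCC and CDLOC from a line-level mapping. It assumes the usual reading of the metrics (components, operations and lines touched by a concern, and concern switches between adjacent lines), which may differ in detail from the measurement procedure used in the study. The component and operation names are hypothetical, and lines are assumed to be listed in source order, grouped by component.

```python
from itertools import groupby

# Each entry is one source line: (component, operation, concern).
# Hypothetical mapping, invented for illustration only.
mapping = [
    ("MediaList", "add", "Base"),
    ("MediaList", "add", "Sorting"),
    ("MediaList", "add", "Base"),
    ("MediaList", "show", "Base"),
    ("Sorter", "sort", "Sorting"),
    ("Sorter", "sort", "Sorting"),
]


def cdc(lines, concern):
    """Concern Diffusion over Components: components touched by the concern."""
    return len({comp for comp, _, k in lines if k == concern})


def cdo(lines, concern):
    """Concern Diffusion over Operations: operations touched by the concern."""
    return len({(comp, op) for comp, op, k in lines if k == concern})


def locc(lines, concern):
    """Lines of Concern Code: lines mapped to the concern."""
    return sum(1 for _, _, k in lines if k == concern)


def cdloc(lines, concern):
    """Concern Diffusion over LOC, simplified: concern switches between
    adjacent lines of the same component."""
    switches = 0
    for _, group in groupby(lines, key=lambda line: line[0]):
        flags = [k == concern for _, _, k in group]
        switches += sum(a != b for a, b in zip(flags, flags[1:]))
    return switches


for concern in ("Base", "Sorting"):
    print(concern, cdc(mapping, concern), cdo(mapping, concern),
          locc(mapping, concern), cdloc(mapping, concern))
```

Reassigning a single line in the middle of a method from one concern to another leaves CDC and CDO unchanged in most cases but adds or removes two switches, which is consistent with CDLOC being the metric most sensitive to the mapping.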
Concerning the external validity, some other factors limit the generalization of the
results:
• Although the SPLs were carefully designed to be as general as
  possible, it should be considered that WebStore and MobileMedia are special
  purpose systems that may not represent all properties of real world systems.
  However, both PetStore (predecessor of WebStore) and MobileMedia were
  used in research studies with purposes similar to ours [16, 24].
• The evolution scenarios may also not represent the large space of
  possibilities in real-world SPL evolution scenarios. For instance, we have not
  investigated some intricate situations involving feature interaction that may
  appear in larger SPLs.
• Only the Java programming language and the AHEAD environment were
  considered in this study. Some of our results could be different if other
  languages and environments, such as CaesarJ [43], were used. For example,
  different languages may support different types of constructs and the
  measures could have some variation.
• Only modularity and change propagation metrics were considered helpful to
  point out the benefits of the variability mechanisms. However, they provide only a
  limited view of these benefits, as they do not measure the real effort required
  to perform SPL changes. A similar limitation is observed in every study that
  relies on metrics.
Finally, concerning the construct validity, one issue is on how much support
change propagation and modularity metrics offer to produce robust answers to our
investigation. As a matter of fact, these proxy metrics offer a limited view on the
design stability and modularity problems, i.e., they only permit us to draw indirect
conclusions about SPL modularity and stability properties. The modularity metrics
are mostly related to separation of concerns properties, which are insufficient to allow
a complete analysis of the benefits of each variability mechanism with respect to SPL
modularity. Change propagation measures were used to complement the modularity
analysis. In fact, we have learned in this study that these two sets of metrics should
not be analyzed in isolation: they proved more useful
when analyzed in conjunction with the other metrics used.
7 Related Work
Several studies have investigated variability management on SPLs [3, 4, 11, 49].
Batory and others have reported an increased flexibility in changes and significant
reduction in program complexity measured by number of methods, lines of code, and
number of tokens per class [11]. Simplification in evolving SPL architecture has also
been reported in [38, 44], as a consequence of variability management. Other research
work has also analyzed stability and reuse of SPLs [18, 24]. For instance, Figueiredo
and his colleagues [24] performed an empirical study to assess modularity, change
propagation, and feature dependency of two evolving SPLs. Their results suggest that
AOP copes well with the separation of features with no shared code and does not
succeed when mandatory features are the change focus. Their study focused on
aspect-oriented programming (AOP) while, in this study, we analyzed variability
mechanisms available in feature-oriented programming (FOP).
Apel and Batory [8] have proposed the Aspectual Mixin Layers [7] approach to
allow the integration between aspects and FOP refinements. These authors have also
used size metrics to quantify the number of components and lines of code in an SPL
implementation. Similar to ours, their study can be seen as a step towards the proper
use of composition mechanisms available in these languages. Their study, however,
(i) did not consider a significant suite of software metrics, such as change propagation
metrics, and (ii) did not address SPL evolution scenarios and stability.
Dantas and his colleagues [18] conducted an exploratory study to analyze the
support of new modularization techniques to implement SPLs. Their study aimed at
comparing the advantages and drawbacks of different advanced programming
techniques in terms of SPL feature stability and reuse. These authors have compared
essentially three different AOP implementations using two evolving software product
lines: iBatis and MobileMedia. Moreover, they conducted their study considering two
additional stability metrics - Refactoring of Modules (RoM) and Alterations in Code
Elements (ACE). Their work suggests that CaesarJ [43], a hybrid AOP and FOP
approach, provides better stability and reuse of SPL modules. With respect to
modularity, their quantitative analysis, based on the same suite of SoC metrics,
showed that compositional approaches enable further modular decomposition of the
SPL code. Our work also supports this finding and presents new ones for the other
studied mechanisms in the context of SPL evolution, as discussed in Section 5.4.
Kästner and others [34] performed a study to compare other important properties to
be assessed when dealing with variability mechanisms for SPL: feature traceability,
ease of adoption and safety. Their study compared compositional and annotative
approaches, showing that each one has strengths and weaknesses. Their study
supports the synergistic use of both approaches for best results in expressiveness,
granularity and type-safety. Other studies also analyzed granularity and type-safety of
variability mechanisms in the context of SPL [9, 33]. These studies complement our
analysis since they investigate different SPL quality properties.
Several studies focused on challenges in the software evolution field [28, 39, 41].
These works have in common the concern about measuring different artifacts through
software evolution, which relies directly on the use of reliable software metrics. For
instance, Greenwood and others [29] used a similar suite of metrics to assess the
design stability of an evolving application. In general, there is a shared sense about
software metrics from the engineering perspective: they are far from being mature and
are constantly the focus of disagreements [1, 32, 40]. Unlike our study,
the one by Greenwood and others did not aim to assess the impact of changes in the core and
variable features of SPLs. Additionally, they used a different application as a case
study, named Health Watcher.
8 Concluding Remarks and Future Work
The use of variability mechanisms to develop SPLs largely depends on our ability to
empirically understand their positive and negative effects through design changes.
Generally speaking, the development of an SPL has to provide means to anticipate
changes. That is why incremental development has been largely adopted. This study
evolved SPLs in order to assess the capabilities of FOP mechanisms to provide SPL
modularity and stability in the presence of change requests. Such evaluation included
two complementary analyses: change propagation and feature modularity.
Our main contributions in this work were the development of an open benchmark
for the evaluation of evolving SPLs, a qualitative and quantitative data analysis
framework and an extensive data analysis of collected metrics using the benchmark
and the framework.
Some interesting results emerged from our analysis. First, the FOP design of the
studied SPLs tends to be more stable than the other traditional widely-used
approaches. This advantage of FOP is particularly true when a change targets optional
features. Second, we observed that FOP class refinements adhere more closely to the
Open-Closed principle [42]. Furthermore, such mechanisms usually scale well for
dependencies that do not involve shared code.
The results of Sections 4 and 5 indicate that conditional compilation (CC) may not
be adequate for evolving SPLs when feature modularity is a major concern.
For instance, the addition of new features using CC mechanisms usually
increases feature tangling and scattering. These crosscutting features destabilize the
SPL architecture and make it difficult to accommodate future changes.
The implementations using design patterns and FOP refinements also strive to
accommodate changes that require major restructuring. They usually require a higher
number of component insertions during this kind of SPL evolution, when compared
to CC. The results have shown that the removal of some design patterns makes the
SPL architecture unstable when optional features are turned into mandatory ones. This kind
of change negatively affects the SPL modularity properties (especially scattering).
This work has revealed evidence for developers and language designers that,
although FOP is well-suited for SPL implementation, it still has drawbacks that
require either combining it with other mechanisms or designing constructs to
handle fine-grained features, crosscutting features, and type safety.
For future work, the study of different metrics and their relationship to other
quality attributes in SPLs, such as robustness and reuse, could be interesting. In
addition, other modularity properties, such as coupling and cohesion, could be
assessed to increase the comprehensiveness of the results presented.
Also, aspects can be used symbiotically with one of the studied variability
mechanisms to develop SPLs. Studying these hybrid approaches would permit us to better
understand how they behave in change scenarios, especially because we have pointed
out that crosscutting features are an issue for which none of the studied mechanisms
provided a successful solution (Figure 11).
Finally, a key challenge in the development of SPLs is to guarantee that only
well-typed programs are generated. It is often hard, if not impractical, to type check all
possible products, especially because the number of feature combinations grows
exponentially with the number of features. The annotative and compositional
approaches studied in this paper do not support modular type checking. However,
there are solutions based on SAT solvers [19, 36, 50] and on type-checking
non-preprocessed code [9, 35, 37] proposed to address this problem. Thus, future studies
should analyze the ability of each approach to deal with this problem and increase the
breadth of our study.
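The exponential growth mentioned above is easy to reproduce with a brute-force enumeration. The sketch below is only an illustration of why checking every product separately does not scale; it is not one of the cited SAT-based or family-based techniques, which exist precisely to avoid this enumeration. The feature names are taken from MobileMedia, but the constraint between them is hypothetical.

```python
from itertools import product


def valid_products(optional_features, constraints=()):
    """Enumerate every selection of optional features that satisfies
    all constraints (each constraint is a predicate over a set of names)."""
    result = []
    for bits in product([False, True], repeat=len(optional_features)):
        selection = {f for f, on in zip(optional_features, bits) if on}
        if all(check(selection) for check in constraints):
            result.append(selection)
    return result


def configuration_count(n_optional):
    """Upper bound on the number of products with n independent optional features."""
    return 2 ** n_optional


FEATURES = ["Sorting", "Favourites", "SMSTransfer"]


def sms_requires_favourites(selection):
    """Hypothetical cross-tree constraint, invented for illustration."""
    return "SMSTransfer" not in selection or "Favourites" in selection


print(len(valid_products(FEATURES)))
print(len(valid_products(FEATURES, [sms_requires_favourites])))
print(configuration_count(20))
```

Constraints prune the space, but with independent optional features each additional feature still doubles the number of products that a per-product type checker would have to compile.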
Acknowledgments
This work was partially supported by FAPEMIG, grants CEX-APQ-02932-10 and
CEX-APQ-2086-11, and CNPq grant 475519/2012-4. This work was also partially
supported by CAPES and CNPq scholarships. We would like to thank the reviewers
for their comments, which helped to improve the quality of this work.
References
1. Abran, A., Sellami, A., Suryn, W.: Metrology, Measurement and Metrics in Software
Engineering. In Proceedings of the 9th International Software Metrics Symposium
(Metrics), pp. 2--11. (2003)
2. Adams, B., De Meuter, W., Tromp, H., Hassan, A. E.: Can we Refactor Conditional
Compilation into Aspects? In 8th ACM International Conference on Aspect-oriented
Software Development (AOSD), pp. 243--254. ACM, Virginia, New York (2009)
3. Adler, C.: Optional Composition - A Solution to the Optional Feature Problem? MSc
Dissertation, University of Magdeburg, Germany. (2011)
4. Ali Babar, M., Chen, L., Shull, F.: Managing Variability in Software Product Lines,