Dependable software
Bertrand Meyer, ETH Zurich

ABSTRACT
Achieving software reliability takes many complementary techniques, directed at the process or at the products. This survey summarizes some of the most fruitful ideas.

Cite as follows: Bertrand Meyer, Dependable Software, to appear in Dependable Systems: Software, Computing, Networks, eds. Jürg Kohlas, Bertrand Meyer, André Schiper, Lecture Notes in Computer Science, Springer-Verlag, 2006.

1 OVERVIEW
Everyone who uses software or relies on devices or processes that use software, in other words everyone, has a natural interest in guarantees that programs will perform properly. The following pages provide a review of techniques to improve software quality.

There are many subcultures of software quality research, often seemingly sealed off from each other; mentioning process-based approaches such as CMMI to programming language technologists, or tests to people working on proofs, can be as incongruous as bringing up Balanchine among baseball fans. This survey disregards such established cultural fences and instead attempts to include as many as possible of the relevant areas, on the assumption that producing good software is hard enough that every little bit counts [60]. As a result we will encounter techniques of very diverse kinds.

A note of warning to the reader seeking objectivity: I have not shied away from including references, easy to spot, to my own work, with the expectation (if a justification is needed) that it makes the result more lively than a cold inspection limited to other people's products and publications.

2 SCOPE AND TERMINOLOGY
The first task is to define some of the fundamental terms. Even the first word of this article's title, determined by the Hasler Foundation's Dependable Information and Communication Systems project, requires clarification.

Reliability and dependability
In the software engineering literature the more familiar term is not dependable but reliable, as in software reliability. A check through general-purpose and technical dictionaries confirms that the two have similar definitions and are usually translated identically into foreign languages.
There does exist a definition of dependability [1] from the eponymous IFIP Working Group 10.4 [39] that treats reliability as only one among dependability attributes, along with availability, safety, confidentiality, integrity and maintainability. While possibly applicable to a computing system as a whole, this classification does not seem right for its software part, as some attributes such as availability are not properties of the software per se, others such as confidentiality are included in reliability (through one of its components, security), and the remaining ones such as maintainability are of dubious meaning for software, being better covered by other quality factors such as extendibility and reusability [57]. As a consequence of these observations the present survey interprets dependability as meaning the same thing, for software, as reliability.

Defining reliability
The term software reliability itself lacks a universally accepted definition. One could argue for taking it to cover all external quality factors such as ease of use, efficiency and extendibility, and even internal quality factors such as modularity. (The distinction, detailed in [57], is that external factors are the properties, immediate or long-term, that affect companies and people purchasing and using the software, whereas internal factors are perceptible only to software developers, although in the end they determine the attainment of external factors.)

It is reasonable to retain a more restricted view in which reliability only covers three external factors: correctness, robustness and security. This doesn't imply that others are irrelevant; for example even the most correct, robust and secure system can hardly be considered dependable if in practice it takes ages to react to inputs, an efficiency problem. The same goes for ease of use: many software disasters on record happened with systems that implemented the right functions but made them available through error-prone user interfaces. The reasons for limiting ourselves to the three factors listed are, first, that including all others would turn this discussion into a survey of essentially the whole of software engineering (see [33]); second, that the techniques to achieve these three factors, although already very diverse, have a certain kindred spirit, not shared by those for enhancing efficiency (like performance optimization techniques), ease of use (like ergonomic design) and other external and internal factors.
Correctness, robustness, security
For the three factors retained, we may rely on the following definitions:
• Correctness is a system's ability to perform according to its specification in cases of use within that specification.
• Robustness is a system's ability to prevent damage in cases of erroneous use outside of its specification.
• Security is a system's ability to prevent damage in cases of hostile use outside of its specification.

They correspond to levels of increasing departure from the specification. The specification of any realistic system makes assumptions, explicit or implicit, about the conditions of its use: a C compiler's specification doesn't define a generated program if the input is payroll data, any more than a payroll program defines a pay check if the input is a C program; and a building access control software specification cannot define what happens if the building has burned. By nature, the requirements defined by robustness and security are different from those of correctness: outside of the specification, we can no longer talk of performing according to that specification, but only seek the more modest goal of preventing damage; note that this implies the ability to detect attempts at erroneous or hostile use.

Security deserves a special mention as in recent years it has assumed a highly visible place in software concerns. This is a phenomenon to be both lamented, as it signals the end of a golden age of software development when we could concentrate on devising the best possible functionality without too much concern about the world's nastiness, and at the same time taken to advantage, since it has finally brought home to corporations the seriousness of software quality issues, a result that decades of hectoring by advocates of modern software engineering practices had failed to achieve. One of the most visible signs of this phenomenon is Bill Gates's edict famously halting all development in February of 2001 in favor of code reviews for hunting down security flaws. Many of these flaws, such as the most obnoxious, buffer overflow, are simply the result of poor software engineering practices. Even if focusing on security means looking at the symptom rather than the cause, fixing security implies taking a coherent look at software tools and techniques and requires, in the end, ensuring reliability as a whole.

Product and process
Any comprehensive discussion of software issues must consider two complementary aspects: product and process. The products are the software elements whose reliability we are trying to assess; the process includes the mechanisms and procedures whereby people and their organizations build these products.
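The distinction between correctness (performing to specification, within the specification) and robustness (preventing damage under erroneous use outside it) can be sketched in code. This is a minimal illustration with invented names; compute_paycheck and its specification are hypothetical, not from the article:

```python
def compute_paycheck(hours: float, rate: float) -> float:
    """Illustrative specification: for non-negative hours and rate,
    return hours * rate."""
    # Robustness: erroneous use, outside the specification, must not
    # cause damage; detect it and fail safely rather than produce a
    # meaningless pay check.
    if hours < 0 or rate < 0:
        raise ValueError("erroneous use: inputs outside the specification")
    # Correctness: within the specification, perform according to it.
    return hours * rate

# Use within the specification: the correctness requirement applies.
assert compute_paycheck(40, 25.0) == 1000.0

# Use outside the specification: the robustness requirement applies;
# the attempt is detected and damage prevented.
try:
    compute_paycheck(-5, 25.0)
except ValueError:
    pass
```

Security, the third factor, concerns the same departure from the specification, but under hostile rather than accidental misuse; the obligation to detect the attempt is the same.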
The products of software
The products themselves are diverse. In the end the most important one, for which we may assess correctness, robustness and security, is code. But even that simple term covers several kinds of product: source code as programmers see it, machine code as the computer executes it, and any intermediate versions as exist on modern platforms, such as the bytecode of virtual machines.

Beyond code, we should consider many other products, which in their own ways are all software: requirements, specifications, design diagrams and other design documents, test data (but also test plans), user documentation, teaching aids...

To realize why it is important in the search for quality to pay attention to products other than code, it suffices to consider the results of numerous studies, some already decades old [10], showing the steep progression of the cost of correcting an error the later it is identified in the lifecycle.

Deficiencies
In trying to ascertain the reliability of a software product or process we must often, like a detective or a fire prevention engineer, adopt a negative mindset and look for sources of violation of reliability properties. The accepted terminology here distinguishes three levels:
• A failure is a malfunction of the software. Note that this term does not directly apply to products other than executable code.
• A fault is a departure of the software product from the properties it should have satisfied. A failure always comes from a fault, although not necessarily a fault in the code: it could be in the specification, in the documentation, or in a non-software product such as the hardware on which the system runs.
• An error is a wrong human decision made during the construction of the system. "Wrong" is a subjective term, but for this discussion it's clear what it means: a decision is wrong if it can lead to a fault (which can in turn cause failures).

In a discussion limited to software reliability, all faults and hence all failures result from errors, since software is an intellectual product not subject to the slings and arrows of the physical world.

The more familiar term for error is bug. The upper crust of the software engineering literature shuns it for its animist connotations. Error has the benefit of admitting that our mistakes don't creep into our software: we insert them ourselves. In practice, as may be expected, everyone says bug.
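The error/fault/failure chain can be traced on a small hypothetical example (the function and the scenario are invented for illustration): a wrong human decision leaves a fault in the code, and the fault produces a failure only on certain executions.

```python
# Error: the developer's (wrong) decision that the input list can
# never be empty, so no guard is needed.

def average(values):
    # Fault: the departure from the required properties that the
    # decision caused -- a missing empty-list guard.
    return sum(values) / len(values)

# On this execution the fault stays dormant: no failure.
assert average([2, 4]) == 3

# Failure: the malfunction, observable only when the fault is exercised.
try:
    average([])
except ZeroDivisionError:
    print("failure observed")
```

Note that removing the failure for good means correcting the fault, which in turn means revisiting the decision that caused it.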
Verification and validation
Even with subjectivity removed from the definition of error, definitions for the other two levels above remain relative: what constitutes a malfunction (for the definition of failures) or a departure from desirable properties (for faults) can only be assessed with respect to some description of the expected characteristics.

While such reference descriptions exist for some categories of software product (an element of code is relative to a design, the design is relative to a specification, the specification is relative to an analysis of the requirements), the chain always stops somewhere; for example one cannot in the end certify that the requirements have no faults, as this would mean assessing them against some higher-level description, and would only push the problem further to assessing the value of the description itself. Turtles all the way up.

Even in the absence of another reference (another turtle) against which to assess a particular product, we can often obtain some evaluation of its quality by performing internal checks. For example:
• A program that does not initialize one of its variables along a particular path is suspicious, independently of any of its properties vis-à-vis the fulfillment of its specification.
• A poorly written user manual may not explicitly violate the prescriptions of another project document, but is problematic all the same.

This observation leads to distinguishing two complementary kinds of reliability assessment, verification and validation, often combined in the abbreviation "V&V":
• Verification is internal assessment of the consistency of the product, considered just by itself. The last two examples illustrated properties that are subject to verification, the first for code, the second for documentation. Type checking is another example.
• Validation is relative assessment of a product vis-à-vis another that defines some of the properties that it should satisfy: code against design, design against specification, specification against requirements, documentation against standards, observed practices against company rules, delivery dates against project milestones, observed defect rates against defined goals, test suites against coverage metrics.

A popular version of this distinction [10] is that verification is about ascertaining that the product is doing things right and validation that it is doing the right thing. It only applies to code, however, since a specification, a project plan or a test plan do not "do" anything.
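The first internal check above, a variable not initialized along some path, can itself be sketched as a small verification tool. The sketch below is deliberately naive and all names are invented; real static analyzers, discussed later in this survey, perform much more precise dataflow analysis.

```python
import ast

SAMPLE = """
def f(flag):
    if flag:
        x = 1
    return x + 1   # suspicious: x has no value when flag is false
"""

def suspicious_reads(source: str):
    """Flag names that are read but assigned only inside an `if`,
    i.e. possibly uninitialized along the other path."""
    findings = []
    for fn in [n for n in ast.walk(ast.parse(source))
               if isinstance(n, ast.FunctionDef)]:
        always = {a.arg for a in fn.args.args}   # parameters: always set
        sometimes = set()                        # set only on some paths
        for stmt in fn.body:
            if isinstance(stmt, ast.Assign):     # unconditional assignment
                always.update(t.id for t in stmt.targets
                              if isinstance(t, ast.Name))
            elif isinstance(stmt, ast.If):       # conditional assignment
                sometimes.update(n.id for n in ast.walk(stmt)
                                 if isinstance(n, ast.Name)
                                 and isinstance(n.ctx, ast.Store))
            for node in ast.walk(stmt):          # reads of shaky names
                if (isinstance(node, ast.Name)
                        and isinstance(node.ctx, ast.Load)
                        and node.id in sometimes
                        and node.id not in always):
                    findings.append((fn.name, node.id))
    return findings

print(suspicious_reads(SAMPLE))   # [('f', 'x')]
```

The check is purely internal: it says nothing about what f is supposed to compute, only that the text of f is inconsistent with itself.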
3 CLASSIFYING APPROACHES
One of the reasons for the diversity of approaches to software quality is the multiplicity of problems they address. The following table shows a list of criteria, essentially orthogonal, for classifying them.

Criteria for classifying approaches to software reliability:
• A priori (build) vs A posteriori (assess and correct)
• Process vs Product
• Manual vs Tool-supported
• Technology-neutral vs Technology-specific
• Product- and phase-neutral vs Product- or phase-specific
• Static (uses software text) vs Dynamic (requires execution)
• Informal vs Mathematical
• Complete (guarantee) vs Partial (some progress)
• Free vs Commercial

The first distinction is cultural almost as much as it is technical. With a priori techniques the emphasis is methodological: telling development teams to apply certain rules to produce a better product. With a posteriori techniques, the goal is to examine a proposed software product or process element for possible deficiencies, with the aim of correcting them. While it is natural to state that the two are complementary rather than contradictory (a defense often used by proponents of a posteriori approaches such as testing when criticized for accepting software technology as it is rather than helping to improve it), they correspond to different views of the software world, one hopeful of prevention and the other willing to settle down for cure.

The second distinction corresponds to the two dimensions of software engineering cited above: are we working on the products, or on the processes leading to them?

Some approaches are of a methodological nature and just require applying some practices; we may call them manual, in contrast with techniques that are tool-supported and hence at least partially automated.
An idea can be applicable regardless of technology choices; for example process-based techniques such as CMMI, discussed below, explicitly stay away from prescribing specific technologies. At the other extreme, certain techniques may be applicable only if you accept a certain programming language, specification method, tool or other technology choice. We may talk of technology-neutral and technology-specific approaches; this is more a spectrum of possibilities than a black-and-white distinction, since many approaches assume a certain class of technologies, such as object-oriented development, encompassing many variants.

Some techniques apply to a specific product or phase of the lifecycle: specification (a specification language), implementation (a static analyzer of code)... They are product-specific, or phase-specific. Others, such as configuration management tools, apply to many or all product kinds; they are product-neutral. "Product" is used here to denote one of the types of outcome of the software construction process.

For techniques directed at program quality, an important division exists between dynamic approaches such as testing, which rely on executing the program, and purely static ones, such as static analysis and program proofs, which only need to analyze the program text. Here too some nuances exist: a simulation technique requires execution and hence can be classified as dynamic even though the execution doesn't use the normal run-time environment; model checking is classified as static even though in some respect it is close to testing.

Some methods are based on mathematical techniques; this is obviously the case with program proofs and formal specification in general. Many are more informal.

A technique intended to assess quality properties can give you a complete guarantee that they are satisfied, or more commonly some partial reassurance to this effect.

The final distinction is economic: between techniques in the public domain, usable for free in the ordinary sense of the term, and commercial ones.
4 PROCESS-BASED APPROACHES
We start with the least technical approaches, emphasizing management procedures and organizational techniques.

Lifecycle models
One of the defining acts of software engineering was the recognition of the separate activities involved, in the form of lifecycle models that prescribe a certain order of tasks (see the figure on the adjacent page). The initial model is the so-called waterfall [11], still used as a reference for discussions of the software process although no longer recommended for literal application. Variants include:
• The V model, which retains the sequential approach of the waterfall but divides the process into two parts, the branches of the V; activities along the first branch are for development, those in the second branch are for verification and validation, each applied to the results of one of the steps along the first branch.
• The Spiral model [11], which focuses on reducing risk in project management, in particular the risk caused by the all-or-nothing attitude of the Waterfall approach. The spiral model suggests isolating subsets of the system's functionality that are small enough to be implemented quickly, and when they have been implemented taking advantage of the experience to proceed to other parts of the system. The idea is connected with the notion of rapid prototyping.
• The Rational Unified Process, distinguishing four phases, inception, elaboration, construction and transition, with a spiral-like iterative style of development and a set of recommended best practices such as configuration management.
• The Cluster model [51] [57], emphasizing a different form of incrementality (building a system by layers, from the most fundamental to the most user-oriented) and a seamless process treating successive activities, from analysis to design, implementation and maintenance, as a continuum. This model also introduces, as part of the individual lifecycle of every cluster, a generalization step to prepare for future reuse of some of the developed elements.

The figure shows pictorial representations of some of these models.
[Figure: Lifecycle models, illustrated. Pictured: Waterfall, V-shaped, Spiral (from [11]) and Cluster.]
Whatever their effect on how people actually develop software, the contribution of lifecycle models has been a classification and definition of the activities involved in software development, even when these activities are not executed as phases in the precise order mandated by, for example, the waterfall model. Software quality benefits in particular from:
• A distinction between requirements, the recording of user requirements, and specification, their translation into a systematic form suitable for software development, where rigor and precision are essential.
• Recognition of the importance of Verification and Validation tasks.
• Recognition of post-delivery activities such as maintenance, although they still do not occupy a visible enough place. Many software troubles result from evolutions posterior to the initial release.
• In the Cluster model, the presence, for each cluster, of the generalization task to prepare for reuse.
• Also in the Cluster model, the use of a seamless and reversible approach which unifies the methods, tools, techniques and notations throughout the software process, and helps smooth the differences between successive activities rather than exaggerate them. (The textbook counter-example here is the use of UML for analysis and design [56].)
• The growing emphasis on incrementality in the development process, even if this concept is understood differently in, for example, the spiral, cluster and RUP models.

Organizational standards
Another process-related set of developments has had a major effect, largely beneficial, on some segments of the industry. In the early 1990s the US Department of Defense, concerned with the need to assess its suppliers' software capabilities and to establish consistent standards, entrusted the Software Engineering Institute with the task of developing a Capability Maturity Model, whose current incarnation, CMMI [74] (the I is for Integration), provides a collection of standards applicable to various disciplines, rather than a single model for software. Largely independently, the International Standard Organization has produced a set of software-oriented variants of its 9000-series quality standards, which share a number of properties with CMMI. The present discussion is based on CMMI.
Beyond its original target community, CMM and CMMI have been the catalyst for one of the major phenomena of the IT industry starting in the mid-nineties: the development of offshore software production, especially in India [63]. CMMI qualification provides suppliers of outsourcing development services with quality standards and the associated possibility of independent certification, without which customers would not have known how to trust distant, initially unknown contractors.

CMMI is (in the earlier classification) product-neutral, phase-neutral and technology-neutral. In its application to software it is intended only to determine how well an organization controls its development process by defining and documenting it, recording and assessing how it is applied in practice, and working to improve it. It doesn't prescribe what the process should be, only how much you are on top of it. You could presumably be developing in PL/I on IBM 370 and get CMMI qualification.

CMMI assesses both the capability level of individual process areas (such as software) in an organization, and the maturity of an organization as a whole. It distinguishes five levels of increasing maturity:
• Performed: projects happen and results get produced, but there is little control and no reproducibility; the process is essentially reactive.
• Managed: processes are clearly defined for individual projects, but not for the organization as a whole. They remain largely reactive.
• Defined: proactive processes are defined for the organization.
• Quantitatively managed: the control mechanisms do not limit themselves to qualitative techniques, but add well-defined numerical measurements.
• Optimizing: the mechanisms for controlling processes are sufficiently well established that the focus can shift to improving the organization and its processes.

Through their emphasis on the process and its repeatability, CMMI and ISO standards help improve the quality of software development. One may expect such improvements of the process to have a positive effect on the resulting products as well; but they are only part of the solution. After a software error (one module of the software was expecting measures in the metric system, another was providing them in English units) was identified as the cause of the failure of the NASA Mars Orbiter Vehicle mission [82], an engineer from the project noted that the organization was heavily into ISO and other process standards. Process models and process-focused practices are not a substitute for using the best technological solutions. Tailored versions of CMMI that would not shy away from integrating specific technologies such as object technology could be extremely useful. In the meantime, the technology-neutral requirements of CMMI can be applied by organizations to get a better hold on their software processes.
Extreme programming
The Extreme Programming movement [6] is a reaction against precisely the kinds of lifecycle models and process-oriented approaches just reviewed. XP (as it is also called) emphasizes instead the primacy of code. Some of the principal ideas include:
• Short release cycles to get frequent feedback.
• Pair programming (two people at a keyboard and terminal).
• Test-driven development.
• A general distrust of specification and design: testing is the preferred guide of development.
• Emphasis on programmers' welfare.

Some of these practices are clearly beneficial to quality but were developed prior to XP, in particular short release cycles (Microsoft's "daily build" as described in 1995 by Cusumano and Selby [19], see also [54]) and the use of frequent testing as part of development (see e.g. "quality first" [55]). Those really specific to XP are of limited interest (while sometimes a good practice, pair programming cannot be imposed indiscriminately, both because it doesn't work for some people and because those who find it useful may not find it useful all the time) or, in the case of tests viewed as a replacement for specifications, downright detrimental. See [75] and [64] for critiques of the approach.

Code inspections
A long-established quality practice is the inspection, also known as review: a session designed to examine a certain software element with the aim of finding flaws. The most common form is code inspection, but the process can be applied to any kind of software engineering product. Rules include:
• Small meeting: at most 8 people or so, including the developer of the element under review.
• The elements under review and any supporting documents must be circulated in advance; the participants should have read them and identified possible criticisms before the meeting. The allotted time should be bounded, for example 2 or 3 hours.
• The meeting must have a moderator to guide discussions and a secretary to record results.
• The moderator should not be the developer's manager. The intent is to evaluate products, not people.
• The sole goal is to identify deficiencies and confirm that they are indeed deficiencies; correction is not part of the process and should not be attempted during the meeting.

Code inspections can help avoid errors, but to assess their usefulness one must compare the costs with those of running automated tools that can catch some of the same problems without human intervention; static analyzers, discussed below, are an example.

Some companies have institutionalized the rule that no developer may check in code (integrate it into the repository for a current or future product) without approval by one other developer, a limited form of code inspection that has a clearly beneficial effect by forcing the original developer to convince at least one other team member of the suitability of the contribution.

Open-source processes
A generalization of the idea of code inspection is the frequent assertion, by members of the open-source community, that the open-source process dramatically improves quality by enabling many people to take a critical look at the software text; some have gone so far as to state that "given enough eyes, all bugs are shallow" [73].

As with many of the other techniques reviewed, we may see in this idea a beneficial contribution, but not a panacea. John Viega gives [78] the example of "a widely used security program in which in the past two years, several very subtle buffer overflow problems have been found... Almost all had been in the code for years, even though it had been examined many times by both hackers and security auditors... One tool was able to identify one of the problems as potentially exploitable, but researchers examined the code thoroughly and came to the conclusion that there was no way the problem could be exploited." (The last observation is anecdotal evidence for the above observation that tools such as static analyzers are potentially superior to human analysis.)

While there is no evidence that open-source software as a whole is better (or worse) than commercial software, and no absolute rule should be expected, if only because of the wide variety of products and processes on both sides, it is clear that more eyes potentially see more bugs.
Requirements engineering
In areas such as embedded systems, many serious software failures have been traced [45] to inadequate requirements rather than to deficiencies introduced in later phases. Systematic techniques for requirements analysis are available [76] [40] to improve this critical task of collecting customer wishes and translating them into a form that can serve as a basis for a software project.

Design patterns
A process-related advance that has had a strong beneficial effect on software development is the emergence of design patterns [32]. A pattern is an architectural scheme that has been recognized as fruitful through frequent use in applications, and for which a precise description exists according to a standard format. Patterns provide a common vocabulary to developers, hence simplifying design discussions, and enable them to benefit from the collective wisdom of their predecessors.

A (minority) view of patterns [62] [65] understands them as a first step towards the technique discussed next, reusable components. Patterns, in this interpretation, suffer from the limitation that each developer must manually insert the corresponding solutions into the architecture of every applicable system. If instead it is possible to turn the pattern into a reusable component, developers can directly reuse the corresponding solution through an API (Application Program Interface). The observation here is that it is better to reuse than to redo. Investigations [65] suggest that with the help of appropriate programming language constructs up to two thirds of common design patterns can be thus componentized.

Trusted components
Quality improvement techniques, whether they emphasize the process or the product, are only as good as their actual application by programmers. The magnitude of the necessary education effort is enough to temper any hope of major short-term improvements, especially given that many programmers have not had the benefit of a formal computer science education to start with.
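The componentization of design patterns described above can be sketched for the Observer pattern: the subject/observer plumbing is written once, behind an API, instead of being re-inserted into every system. The Event class and its interface below are hypothetical, invented for illustration and not taken from the cited investigations.

```python
from typing import Any, Callable

class Event:
    """Reusable Observer component: the pattern's architecture,
    captured once and reused through an API rather than redone."""

    def __init__(self) -> None:
        self._handlers: list[Callable[[Any], None]] = []

    def subscribe(self, handler: Callable[[Any], None]) -> None:
        """Register an observer."""
        self._handlers.append(handler)

    def publish(self, value: Any) -> None:
        """Notify all registered observers."""
        for handler in self._handlers:
            handler(value)

# Client code reuses the component instead of redoing the pattern.
received = []
temperature_changed = Event()
temperature_changed.subscribe(received.append)
temperature_changed.publish(21.5)
assert received == [21.5]
```

The application-specific part shrinks to the subscribe and publish calls; the pattern itself lives in one reusable, separately verifiable place.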
Another practical impediment to continued quality improvement comes from market forces. The short-term commercial interest of a company is generally to release software that is "good enough" [83]: software that has barely passed the threshold under which the market would reject it because of bad quality; not excellent software. The extra time and expense to go from the first to the second stage may mean, for the company, losing the market to a less scrupulous competitor, and possibly going out of business. For the industry as a whole, software quality has indeed improved regularly over time but tends to peak below the optimum.

An approach that can overcome these obstacles is increased reliance on reusable components, providing pre-built solutions to problems that arise in many different applications, either regardless of the technical domain (general-purpose component libraries) or in particular fields (specialized libraries). Components have already changed the nature of software development by providing conveniently packaged implementations, accessible through abstract interfaces, of common aspects such as graphical user interfaces, database manipulation, basic numerical algorithms, fundamental data structures and others, thereby elevating the level at which programmers write their applications. When the components themselves are of good quality, such reuse has highly beneficial effects since developers can direct their efforts to the quality of the application-specific part of their programs.

Examining more closely the relationship of components to quality actually highlights two separate effects: it is comforting to know that the quality of a system will benefit from the quality of its components; but we must note that reuse magnifies the bad as well as the good: imperfections can be even more damaging in components than in one-of-a-kind developments, since they affect every application that relies on a component.

The notion of trusted component [58] [61] follows from this analysis: one of the most pressing and promising tasks for improving software quality is the industrial production of reusable components equipped with a guarantee of quality. Producing such trusted components may involve most of the techniques discussed elsewhere in this article. For some of the more difficult ones, such as program proving, application to components may be the best way to justify the cost and effort and recoup the investment thanks to the scaling effect of component reuse: once a component has reached the level of quality at which it can really be trusted, it will benefit every application that relies on it.
5 TOOLS AND ENVIRONMENTS
Transitioning now to product-oriented solutions, we examine some of the progress in tools available to software developers, to the extent that it is relevant for software quality.
Configuration management
Configuration management is both a practice (for the software developer) and a service (from the supporting tools), so it could in principle be classified under process as well as under product. It belongs more properly to the latter category, since it is tools that make configuration management realistic; applied as a pure organizational practice without good tool support, it quickly becomes tedious and ceases being applied.

Configuration management may be defined as the systematic collecting and registering of project elements, including in particular the ability to:
• Register a new version of any project element.
• Retrieve any previously registered version of any project element.
• Register dependencies, both between project elements and between registered versions of project elements (e.g. A relies on B, and version 10 of A requires version 7, 8 or 9 of B).
• Construct composite products from their constituents, for example build an executable version of a program from its modules, or reconstruct earlier versions, in accordance with registered dependencies.

A significant number of software disasters on record followed from configuration management errors, typically due to reintroducing an obsolete version of a module when compiling a new release of a program, or using an obsolete version of some data file. Excuses no longer exist for such errors, as acceptable configuration management tools, both commercial and open-source, are widely available. These tools, while still far from what one could hope for, have made configuration management one of the most important practices of modern software development.

Source code is not the only beneficiary of configuration management. Any product that evolves, has dependencies on other elements and may need restoring to an earlier state should be considered for inclusion in the configuration management repository. Besides code this may include project plans, specification and design documents, user manuals, training documents such as PowerPoint slides, and test data files.
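The registration and dependency operations listed above can be illustrated with a toy in-memory sketch (Python used only for illustration; the class and method names are hypothetical, and real configuration management tools add persistence, branching and merging):

```python
# Toy sketch of the configuration management operations described above:
# registering versions, retrieving them, and recording dependencies
# between versions of elements. All names are illustrative.

class Repository:
    def __init__(self):
        self.versions = {}      # element name -> list of contents (v1 first)
        self.dependencies = []  # (element, version, required_element, allowed_versions)

    def register(self, element, contents):
        """Register a new version of an element; return its version number."""
        self.versions.setdefault(element, []).append(contents)
        return len(self.versions[element])

    def retrieve(self, element, version):
        """Retrieve any previously registered version (1-based)."""
        return self.versions[element][version - 1]

    def register_dependency(self, element, version, required, allowed_versions):
        """Record: this version of element requires one of allowed_versions of required."""
        self.dependencies.append((element, version, required, set(allowed_versions)))

    def consistent(self, configuration):
        """Check a configuration {element: version} against registered dependencies."""
        for element, version, required, allowed in self.dependencies:
            if configuration.get(element) == version and required in configuration:
                if configuration[required] not in allowed:
                    return False
        return True

repo = Repository()
for v in ["v1 of A", "v2 of A"]:
    repo.register("A", v)
for v in ["v7 of B", "v8 of B", "v9 of B", "v10 of B"]:
    repo.register("B", v)
# In the spirit of the example above: this version of A requires
# version 7, 8 or 9 of B.
repo.register_dependency("A", 2, "B", [7, 8, 9])

assert repo.retrieve("A", 1) == "v1 of A"
assert repo.consistent({"A": 2, "B": 8})
assert not repo.consistent({"A": 2, "B": 10})
```

The consistency check is what lets a tool reconstruct composite products only from mutually compatible versions.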
Metrics and models
If we believe Lord Kelvin's (approximate) maxim that all serious study is quantitative, then software and software development should be susceptible to measurement, tempered of course by Einstein's equally famous quote that not everything measurable is worth measuring. A few software properties, process or product, are at the same time measurable, worth measuring and relevant to software reliability.
On the process side, cost in its various dimensions is a prime concern. While it is important to record costs, if only for CMMI-style traceability, what most project managers want at a particular time is a model to estimate the cost of a future project or of the remainder of a current project. Such models do exist and can be useful, at least if the development process is stable and the project is comparable to previous ones: then, by estimating a number of project parameters and relying on historical data for comparison, one can predict costs, essentially person-months, within reasonable average accuracy. A well-known cost model, for which free and commercial tools are available, is COCOMO II [12].

During the development of a system, faults will be reported. In principle they should not be comparable to the faults of a material product, since software is an intellectual product and does not erode, wear out or collapse under attack from the weather. In practice, however, statistical analysis shows that faults in large projects can follow patterns that resemble those of hardware systems and are susceptible to similar statistical prediction techniques. That such patterns can exist is in fact consistent with intuition: if the tests on the last five builds of a product under development have each uncovered one hundred new bugs, it is unlikely that the next iteration will have zero bugs, or a thousand. Software reliability engineering [69] [46] elaborates on these ideas to develop models for assessing and predicting failures, faults and errors. As with cost models, a requirement for meaningful predictions is the ability to rely on historical data for calibration. Reliability models are not widely known, but could help software projects understand, predict and manage anomalies better.

More generally, numerous metrics have been proposed to provide quantitative assessments of software properties. Measures of complexity, for example, include: source lines of code (SLOC), the most primitive, but useful all the same; function points [25], which count the number of elementary mechanisms implemented by the software; measures of the complexity of the control graph, such as cyclomatic complexity [48] [49]; and measures specifically adapted to object-oriented software [35] [59]. The EiffelStudio environment [30] makes it possible to compute many metrics applied to a project under development, including measures regarding the use of contracts (section 8), and to compare them with values on record. While not necessarily meaningful in isolation, such measures are a useful control tool for the manager; they are in line with the CMMI's insistence that an organization can only reach the higher levels of process maturity (4 and 5) by moving from the qualitative to the quantitative, and should be part of the data collected for such an effort.
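As a small illustration of control-graph measures (not taken from the original text), cyclomatic complexity can be approximated as one plus the number of decision points; a sketch using Python's standard ast module, much simplified relative to McCabe's graph-based definition:

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision
    points (branches, loops, exception handlers, boolean operators).
    A simplification of McCabe's E - N + 2 graph formula."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.While, ast.For, ast.ExceptHandler, ast.IfExp)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' contributes two extra decision points.
            decisions += len(node.values) - 1
    return 1 + decisions

code = """
def classify(x):
    if x > 0 and x % 2 == 0:
        return "positive even"
    elif x > 0:
        return "positive odd"
    return "non-positive"
"""
print(cyclomatic_complexity(code))  # 4
```

Such a count is crude, but it illustrates why metrics of this kind are cheap to compute automatically inside an IDE.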
Static analyzers
Static analyzers are another important category of tools, increasingly integrated in development environments, whose purpose is to examine the software text for deficiencies. They lie somewhere between type checkers (themselves integrated in compilers) and full program provers, and will be studied below (page 26) after the discussion of proofs.

Integrated development environments

Beyond individual tools, the evolution of software development has led to the widespread use of integrated tool suites known as IDEs, for Integrated (originally: Interactive) Development Environments. Among the best known are Microsoft's Visual Studio [66] and IBM's Eclipse [27]; EiffelStudio [30] is another example. These environments, equipped with increasingly sophisticated graphical user interfaces, provide under a single roof a whole battery of mechanisms to write software (editors), manage its evolution (configuration management), compile it (compilers, interpreters, optimizers), examine it effectively (browsers), run it and elucidate the sources of faults (debuggers, testers), analyze it for possible inconsistencies and errors (static analysis), generate code from design and analysis diagrams or the other way around (diagramming, Computer-Aided Software Engineering or CASE, reverse engineering), change architecture in a safe way through tool-controlled transformations (refactoring), perform measurements as noted above (metric tools), and other tasks.

This is one of the most active areas in software engineering; programmers, for whom IDEs are the basic daily tools, are directly interested in their quality, so that open-source projects such as Eclipse and EiffelStudio benefit from active community participation. The effect of these advanced frameworks on software reliability, while diffuse, is undeniable, as their increasing cleverness supports quality in several ways: finding bugs through static and dynamic techniques; avoiding new bugs through mechanisms such as refactoring; generating some of the code without manual intervention; and, more generally, providing a level of comfort that frees programmers from distractions and lets them apply their best skills to the hardest issues of software construction.
6 PROGRAMMING LANGUAGES
The evolution of programming languages plays its part in the search for more reliable software. High-level languages contribute both positively, by providing higher levels of expression through advanced constructs freeing the programmer (in the same spirit as modern IDEs) from mundane, repetitive or irrelevant tasks, and negatively, by ruling out certain potentially unsafe constructs and, as a result, eradicating entire classes of bugs at the source.
The realization that programming language constructs could exert a major influence on software quality, both through what they offer and what they forbid, dates back to structured programming [22] [20] which, in the early seventies, led to rejecting the goto as a control structure in favor of more expressive constructs: sequence, conditional, loop, recursion. The next major step was object-oriented programming, introducing a full new set of abstractions, in particular the notion of class, providing decomposition based on object types rather than individual operations, and techniques of inheritance and genericity.

In both cases the benefit comes largely from being able to reason less operationally about software. A software text represents many possible executions, so many in fact that it is hard to understand the program, and hence to get it right, by thinking in terms of what happens at execution [22]. Both structured and object-oriented techniques make it possible to limit such operational thinking and instead understand the abstract properties of future run-time behaviors by applying the usual rules of logical reasoning.

In drawing the list of programming languages' most important contributions to quality, we must indeed put at the top all the mechanisms that have to do with structure. With ever larger programs addressing ever more ambitious goals, the production and maintenance of reliable software requires safe and powerful modular decomposition facilities. Particularly noteworthy are:
• As pointed out, the class mechanism, which provides a general basis for stable modules with a clear role in the overall architecture.
• Techniques for information hiding, which protect modules against details of other modules, and permit independent evolution of the various parts of a system.
• Inheritance, allowing the classification and systematic organization of classes into structured collections, especially with multiple inheritance.
• Genericity, allowing the construction of type-parameterized modules.
Another benefit of modern languages is static typing, which requires programmers to declare types for all the variables and other entities in their programs, then takes advantage of this information to detect possible inconsistencies in their use and reject programs, at compilation time, until all types fit. Static typing is particularly interesting in object-oriented languages since inheritance supports a flexible type system in which types can be compatible even if they are not identical, as long as one describes a specialization of the other.
Another key advance is garbage collection, which frees programmers from having to worry about the details of memory management and removes an entire class of errors, such as attempts to access a previously freed memory cell, which can otherwise be particularly hard to detect and to correct, in particular because the resulting failures are often intermittent rather than deterministic. Strictly speaking, garbage collection is a property of the language implementation, but it is the language definition that makes it possible, as with modern object-oriented languages, or not, as in languages such as C that permit arbitrary pointer arithmetic and type conversions.

Exception handling, as present in modern programming languages, helps improve software robustness by allowing developers to include recovery code for run-time faults that would otherwise be fatal, such as arithmetic overflow or running out of memory.

A mechanism that is equally far-reaching in its abstraction benefits is the closure, delegate or agent [62]. Such constructs wrap operations in objects that can then be passed around anonymously across modules of a system, making it possible to treat routines as first-class values. They drastically simplify certain kinds of software such as numerical applications, GUI programming and other event-driven (or publish-subscribe) schemes.

The application of programming language techniques to improving software quality is limited by the continued reliance of significant parts of the software industry on older languages. In particular:
• Operating systems and low-level system-related software tend to be written in C, which retains its attractions for such applications in spite of widely known deficiencies, such as the possibility of buffer overflow.
• The embedded and mission-critical community sometimes prefers to use low-level languages, including assembly, for fear of the risks potentially introduced by compilers and other supporting tools.

The Verifying Compiler Grand Challenge [38] [77] is an attempt to support the development of tools that will guarantee, during the process of compiling and thanks to techniques described in the following sections, the reliability of the programs they process, even with such programming languages.
7 STATIC VERIFICATION TECHNIQUES
Static techniques work solely from the analysis of the software text: unlike dynamic techniques such as tests, they do not require any execution to verify software or report errors.
Proofs
Perhaps the principal difference between mathematics and engineering is that only mathematics allows providing absolute guarantees. Given the proper axioms, I can assert with total confidence that two plus two equals four. But if I want to drive to Berne, the best assurance I can get that my car will not break down is a probability. I know it's higher than if I just drive it to the suburbs, and lower than if my goal were Prague, Alma-Ata, Peking or Bombay; I can make it higher by buying a new, better car; but it will never be one. Even with the highest attention to quality and maintenance, physical products will occasionally fail.

Under appropriate assumptions, a program is like a mathematical proposition rather than a material device: any general property of the program, stating that all executions of the program will achieve a certain goal, or that at least one possible execution will, is either true or false, and whether it is true or not is entirely determined by the text of the program, at least if we assume correct functioning of the hardware and of other software elements needed to carry out program execution (compiler, run-time system, operating system). Another way of expressing this observation is that a programming language is similar to a mathematical theory, in which certain propositions are true and others false, as determined by the axioms and inference rules.

In principle, then, it should be possible to prove or disprove properties of programs, in particular correctness, robustness and security properties, using the same rigorous techniques as in the proofs of any mathematical theorem. This assumes overcoming a number of technical difficulties:
• Programming languages are generally not defined as mathematical theories but through natural-language documents possessing a varying degree of precision. To make formal reasoning possible requires describing them in mathematical form; this is known as providing a mathematical semantics (or formal semantics) to a programming language, and is a huge task, especially when it comes to modeling advanced mechanisms such as exception handling and concurrency, as well as the details of computer arithmetic, since the computer's view of integers and reals strays from their standard mathematical properties.
The theorems to be proved involve specific properties of programs, such as the value of a certain variable not exceeding a certain threshold at a certain state of the execution. Any proof process requires the ability to express such properties; this means extending the programming language with boolean-valued expressions, called assertions. Common languages other than Eiffel do not include an assertion mechanism; this means that programmers will have to resort to special extensions such as JML for Java [43] (see also Spec#, an extension of the C# language [5]) and annotate programs with the appropriate assertions. Some tools, such as Daikon, help in this process by extracting tentative assertions from the program itself [31].
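In a language without native assertions, a rudimentary run-time flavor of such annotations, including access to "old" values on entry, can be emulated; a Python sketch (the contract helper below is hypothetical, not a standard facility, and run-time checking is of course far weaker than proof):

```python
import functools

def contract(require=None, ensure=None):
    """Check a precondition before the call and a postcondition after.
    The postcondition receives the result and the 'old' argument values
    snapshotted on entry. Illustrative sketch only."""
    def decorator(f):
        @functools.wraps(f)
        def wrapped(*args):
            if require is not None:
                assert require(*args), "precondition violated"
            old = args  # values on entry, playing the role of 'old'
            result = f(*args)
            if ensure is not None:
                assert ensure(result, *old), "postcondition violated"
            return result
        return wrapped
    return decorator

@contract(require=lambda counter: counter > 0,
          ensure=lambda result, old_counter:
              result == old_counter - 1 and result >= 0)
def decrement(counter):
    return counter - 1

print(decrement(3))  # 2
```

Calling decrement(0) violates the precondition and raises an AssertionError, mirroring the role of a require clause.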
• In practice the software's actual operation depends, as noted, on those of a supporting hardware and software environment; proofs of the software must be complemented by guarantees about that environment.
• Not all properties lend themselves to easy enunciation. In particular, non-functional properties such as performance (response time, bandwidth, memory occupation) are hard to model.
• More generally, a proof is only as useful as the program properties being proven. What is being proved is not the perfection of the program in any absolute sense, nor even its quality, but only that it satisfies the assertions stated. It is never possible to know that all properties of interest have been included. This is not just a theoretical problem: security attacks often take advantage of auxiliary aspects of the program's behavior, which its design and verification did not take into account.
• Even if the language, the context and the properties of interest are fully specified semantically and the properties relevant, the proof process remains a challenge. It cannot in any case be performed manually, since even the proof of a few properties of a moderately sized program quickly reaches into the thousands of proof steps. Fully automated proofs are, on the other hand, generally not possible. Despite considerable advances in computer-assisted proof technology (for programs as well as other applications), significant proofs still require considerable user interaction and expert knowledge.

Of course the effort may well be worthwhile, especially in two cases: life-critical systems in transportation and defense, to which, indeed, much proof work has been directed; and reusable components, for which the effort is justified, as explained in the discussion of Trusted Components above, by the scaling-up effect of reuse.
Here are some of the basic ideas about how proofs work. A typical program element to prove would be, in Eiffel notation:

decrement
        -- Decrease counter by one.
    require
        counter > 0
    do
        counter := counter - 1
    ensure
        counter = old counter - 1
        counter >= 0
    end

This has a program body, the do clause, and two assertions: a precondition introduced by require and a postcondition introduced by ensure and consisting of two subclauses, implicitly connected by an and. Assertions are essentially boolean expressions of the language, with the possibility, in a postcondition, of using the old notation to refer to values on entry: here the first subclause of the postcondition states that the value of counter will have been decreased by one after execution of the do clause.

Program proofs deal with such annotated programs, also called contracted programs (see section 8 below). The annotations remind us that proofs and other software quality assurance techniques can never give us absolute guarantees of quality: we can never say that a program is correct, only assess it, whether through rigorous techniques like proofs or using more partial ones such as those reviewed next, relatively to explicitly stated properties, expressed here through assertions integrated in the program text.

From a programmer's viewpoint the above extract is simply the text of a routine to be executed, with some extra annotations, the precondition and postcondition, expressing properties to be satisfied before and after. But for proof purposes this text is a theorem, asserting that whenever the body (the do clause with its assignment instruction) is executed with the precondition satisfied, it will terminate in such a way that the postcondition is satisfied.

This theorem appears to hold trivially, but even before addressing the concern noted above that computer integers are not quite the same as mathematical integers, proving it requires the proper mathematical framework. The basic rule of axiomatic semantics (or Hoare semantics [37]) covering such cases is the assignment axiom, which for any variable x and expression e states that the following holds:
require Q (e) do x := e ensure Q (x)

where Q (x) is an assertion which may depend on x; then Q (e) is the same assertion with every mention of x replaced by e, except for occurrences of old x, which must be replaced by x.

This very general axiom captures the properties of assignment (in the absence of side effects in the evaluation of e); its remarkable feature is that it is applicable even if the source expression e contains occurrences of the target variable x, as in the example (where x is counter).

We may indeed apply the axiom to prove the example's correctness. Let Q1 (x) be x = old x - 1, corresponding to the first subclause of the postcondition, and Q2 (x) be x >= 0. Applying the rule to Q1 (counter), we replace counter by counter - 1 and old counter by counter; this gives counter - 1 = counter - 1, which trivially holds. Applying now the same transformations to Q2 (counter), we get counter - 1 >= 0, which is equivalent to the precondition counter > 0. This proves the correctness of our little assertion-equipped example.

From there the theory moves to more complex constructions. An inference rule states that if you have proved

require P do Instruction_1 ensure Q

and

require Q do Instruction_2 ensure R

(note the postcondition of the first part matching the precondition of the second part) you are entitled to deduce

require P do Instruction_1 ; Instruction_2 ensure R

and so on for more instructions. A rule in the same style enables you to deduce properties of if c then I1 else I2 end from properties of I1 and I2. More advanced is the case of loops: to prove the properties of

from
    Initialization
until
    Exit
loop
    Body
end
you need, in this general approach, to introduce a new assertion called the loop invariant and an integer expression called the loop variant. The invariant is a weakened form of the desired postcondition, which serves as approximation of the final goal; for example if the goal is to compute the maximum of a set of values, the invariant will be "Result is the maximum of the values processed so far". The advantage of the invariant is that it is possible both to:
• Ensure the invariant through initialization (the from clause in the above notation); in the example the invariant will be trivially true if we start with just one value and set Result to that value.
• Preserve the invariant through one iteration of the loop body (the loop clause); in the example it suffices to extend the set of processed values by one element v and execute if v > Result then Result := v end.

If indeed a loop possesses such an invariant and its execution terminates, then on exit the invariant will still hold (since it was ensured by the initialization and preserved by all the loop iterations), together with the Exit condition. The combination of these two assertions gives the postcondition of the loop. Seen the other way around, if we started from a desired postcondition and weakened it to get an invariant, we will obtain a correct program. In the example, if the exit condition states that we have processed all values of interest, combining this property with the invariant "Result is the maximum of the values processed so far" tells us that Result is the maximum of all values.

Such reasoning is only interesting if the loop execution actually terminates; this is where the loop variant comes in. It is an integer expression which must have a non-negative value after the Initialization and decrease, while remaining non-negative, whenever the Body is executed with the Exit condition not satisfied. The existence of such an expression is enough to guarantee termination since a non-negative integer value cannot decrease forever. In the example a variant is N - i, where N is the total number of values being considered for the maximum (the proof assumes a finite set) and i the number of values processed.

Axioms and inference rules similarly exist for other constructs of programming languages, becoming, as noted, more intricate as one moves on to more advanced mechanisms.
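The invariant/variant reasoning for loops can also be mirrored by run-time checks; a Python sketch computing the maximum of a list, asserting the invariant (the result is the maximum of the values processed so far) after initialization and each iteration, and checking that the variant n - i decreases (illustration only; the proof itself is static, not run-time):

```python
def maximum(values):
    """Maximum of a non-empty list, with the loop invariant and
    variant from the discussion above checked at run time."""
    assert len(values) > 0  # precondition: a finite, non-empty set
    n = len(values)
    # Initialization (the 'from' clause): process the first value.
    result = values[0]
    i = 1
    variant = n - i
    # Invariant ensured by initialization.
    assert result == max(values[:i])
    while i < n:  # exit ('until') condition: i = n
        v = values[i]
        if v > result:
            result = v
        i += 1
        # Invariant preserved by one iteration of the body.
        assert result == max(values[:i])
        # Variant n - i stays non-negative and decreases.
        assert 0 <= n - i < variant
        variant = n - i
    return result

print(maximum([3, 1, 4, 1, 5]))  # 5
```

On exit, the invariant combined with the exit condition i = n yields the postcondition: the result is the maximum of all values.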
For concurrent, reactive and real-time systems, boolean assertions of the kind illustrated above may not be sufficient; it is often convenient to rely on properties of temporal logic [47], which, given a set of successive observations of a program's execution, can express, for a boolean property Q:
• forever Q: from now on, Q will always hold.
• eventually Q: at some point in the future (where "future" includes now), Q will hold.
• P until Q: Q will hold at some point in the future, and until then P will hold.

Regardless of the kind of programs and properties being targeted, there are two approaches to producing program proofs. The analytic method takes programs as they exist, then, after equipping them with assertions, either manually or with some automated aid as noted above, attempts the proof. The constructive method [24] [2] [68] integrates the proof process in the software construction process, often using successive refinements to go from specification to implementation through a sequence of transformations, each proved to preserve correctness, and integrating more practical constraints at every step.

Proof technology has had some notable successes, including in industrial systems (and in hardware design), but until recently has remained beyond the reach of most software projects.

Static analysis

If hoping for a proof covering all the correctness, reliability and security properties of potential interest is often too ambitious, the problem becomes more approachable if we settle for a subset of these properties, a subset that may be very partial but very interesting. For example being able to determine that no buffer overflow can ever arise in a certain program (in other words, to provide a firm guarantee, through analysis of the program text, that every index used at run time to access an item in an array or a character in a string will be within the defined bounds) is of great practical value, since this rules out a whole class of security attacks.

Static analysis is the tool-supported analysis of software texts for the purpose of assessing specific quality properties. Being static, it requires no execution and hence can in principle be applied to software products other than code. Proofs are a special case, the most far-reaching, but other static analysis techniques are available.
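The temporal operators introduced earlier (forever Q, eventually Q, P until Q) can be given a simple finite-trace semantics; a toy Python sketch over a list of successive observations (genuine temporal logic is defined over infinite behaviors, so this is an illustration only):

```python
# Toy finite-trace semantics for the three temporal operators above.
# A trace is a list of states; P and Q are predicates on states.

def forever(q, trace):
    """forever Q: from now on, Q always holds."""
    return all(q(s) for s in trace)

def eventually(q, trace):
    """eventually Q: Q holds at some point (including now)."""
    return any(q(s) for s in trace)

def until(p, q, trace):
    """P until Q: Q holds at some point, and P holds until then."""
    for s in trace:
        if q(s):
            return True
        if not p(s):
            return False
    return False

trace = [0, 1, 2, 3, 4]          # successive values of some variable
positive = lambda s: s >= 0
reached = lambda s: s == 3

print(forever(positive, trace))                # True
print(eventually(reached, trace))              # True
print(until(lambda s: s < 3, reached, trace))  # True
```

Evaluating such formulas over observed execution traces is the basic idea behind run-time verification of temporal properties.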
At the other extreme, a well-established form of elementary static analysis is type checking, which benefits programs written in a statically typed programming language. Type checking, usually performed by the compiler rather than by a separate tool, ascertains the type consistency of assignments, routine calls and expressions, and rejects any program that contains a type incompatibility.

More generally, techniques usually characterized as static analysis lie somewhere between such basic compiler checks and full program proofs. Violations that can typically be detected by static analysis include:
• Variables that, on some control paths, would be accessed before being initialized (in languages such as C that do not guarantee initialization).
• Improper array and string access (buffer overflow).
• Memory properties: attempts to access a freed location, double freeing, memory leaks.
• Pointer management (again in low-level languages such as C): attempts to follow void or otherwise invalid pointers.
• Concurrency control: deadlocks, data races.
• Miscellaneous: certain cases of arithmetic overflow or underflow, changes to supposedly constant strings.

Static analysis tools such as PREfix [72] have been regularly applied for several years to new versions of the Windows code base and have avoided many potential errors.

One of the issues of static analysis is the occurrence of false alarms: inconsistency reports that, on inspection, do not reveal any actual error. This was the weak point of older static analyzers, such as the widely known Lint tool which complements the type checking of C compilers: for a large program they can easily swamp their users under thousands of messages, most of them spurious, but requiring a manual walkthrough to sort out the good from the bad. (In the search for errors, of course, the good is what otherwise would be considered the bad: evidence of wrongdoing.) Progress in static analysis has been successful in considerably reducing the occurrence of false alarms.

The popularity of static analysis is growing; the current trend is to extend the reach of static analysis tools ever further towards program proofs. Two examples are:
• Techniques of abstract interpretation [18] with the supporting ASTRÉE tool [9], which has been used to prove the absence of run-time errors in the primary flight control software, written in C, for the Airbus A340 fly-by-wire system.
• ESC-Java [21] and, more recently, the Boogie analyzer [4] make program proving less obtrusive by incrementally extending the kind of diagnostics with which programmers are familiar, for example type errors, to more advanced checks such as the impossibility to guarantee that an invariant is preserved.
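To give the flavor of such checks, here is a toy detector of use-before-initialization for straight-line Python code, using the standard ast module; it is vastly simplified compared to real static analyzers, which track all control-flow paths:

```python
import ast

def uninitialized_uses(source: str):
    """Report names read before any assignment in straight-line code.
    Toy static analysis: ignores control flow, scopes and imports."""
    assigned, reported = set(), []
    for stmt in ast.parse(source).body:
        # Names read in this statement, checked before recording
        # the statement's own assignments (so 'x = x + 1' is flagged
        # when x was never assigned before).
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
                if node.id not in assigned and node.id not in reported:
                    reported.append(node.id)
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
    return reported

print(uninitialized_uses("total = count + 1\ncount = 0\nx = total"))  # ['count']
```

Note how easily such a toy raises false alarms (for example, when initialization happens in a branch it does not model), which is precisely the false-alarm problem discussed above.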
Model checking
The model checking approach to verification [36] [17] [3] is static, like proofs and static analysis, but provides a natural link to the dynamic techniques (testing) studied below. The inherent limitation of tests is that they can never be exhaustive; for any significant system, in fact even for toy examples, the number of possible cases skyrockets into the combinatorial stratosphere, where the orders of magnitude invite lyrical comparisons with the number of particles in the universe.

The useful measure is the number of possible states of a program. The notion of state was implicit in the earlier discussion of assertions. A state is simply a snapshot of the program execution, as could be observed, if we stop that execution, by looking up the contents of the program's memory, or more realistically by using the debugger to examine the values of the program's variables. Indeed it is the combination of all the variables' values that determines the state. With every 64-bit integer variable potentially having 2^64 values, it is not surprising that the estimates quickly go galactic.

Model checking attempts exhaustive analysis of program states anyway, by performing predicate abstraction. The idea is to simplify the program by replacing all expressions by boolean expressions (predicates), with only two possible values, so that the size of the state space decreases dramatically; it will still be large, but the power of modern computers, together with smart algorithms, can make its exploration tractable. Then to determine that a desired property holds (for example, a security property such as the absence of buffer overflows, or a timing property such as the absence of deadlock) it suffices to evaluate the corresponding assertion in all of the abstract states and, if a violation of that assertion (or counter-example) is found, to check that it also arises in the original program.
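Exhaustive exploration of a (small) state space can be sketched with a toy explicit-state checker: breadth-first search over reachable states, evaluating the desired assertion in each (illustrative only; real model checkers rely on predicate abstraction and symbolic representations to cope with realistic state spaces):

```python
from collections import deque

def check(initial, successors, holds):
    """Explore all reachable states from 'initial'; return a state
    violating the assertion 'holds' (a counter-example), or None."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not holds(state):
            return state  # counter-example found
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None

# Toy system: a counter modulo 8, incremented by 1 or 2 at each step.
successors = lambda s: [(s + 1) % 8, (s + 2) % 8]
# Claimed property: the counter never reaches 5 (false for this system).
violation = check(0, successors, lambda s: s != 5)
print(violation)  # 5
assert check(0, successors, lambda s: s < 8) is None  # this one holds
```

The returned violating state plays the role of the counter-example discussed next: concrete evidence that the property fails, rather than a mere failure to prove it.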
For example, predicate abstraction will reduce a conditional instruction if a > b then ... to if p then ..., where p is a boolean. This immediately cuts down the number of cases from 2^128 to 2. The drawback is that the resulting program is only a caricature of the original; it loses the relation of p to other predicates involving a and b. But it has an interesting property: if the original violates the assertion, then the abstracted version also does. So the next task is to look for any such violation in the abstracted version. This may be possible through exhaustive examination of its reduced state space, and if so is guaranteed to find any violation in the original program. Even so it is not the end of the story, since the reverse proposition does not hold: a counter-example in the abstracted program does not necessarily signal a counter-example in the original. It could result from the artificial merging of several cases, for example if it occurs on a path, impossible in an execution of the original program, obtained by selecting both p and q as true, where q is the abstraction of b > a + 1. Then examining the state space of the abstracted program will either:

- Not find any violations, in which case it proves there was none in the original program.
- Report violations, each of which might be an error in the original or simply a false alarm generated by the abstraction process.

So the remaining task, if counter-examples have been found, is to ascertain whether they arise in the original. This involves defining the path predicate that leads to each counter-example, expressing it in terms of the original program variables (that is to say, removing the predicate abstraction, giving, in the example, a > b and b > a + 1) and determining if any combination of values for the program variables can satisfy the predicate: if such a combination, or variable assignment, exists, then the counter-example is a real one; if not, as in the case given, it is spurious. This problem of predicate satisfiability is computationally hard; finding efficient algorithms is one of the central areas of research in model checking.

The focus on counter-examples gives model checking a practical advantage over traditional proof techniques. Unless a software element was built with verification in mind (through a constructive method as defined above), the first attempt to verify it will often fail. With proofs, this failure doesn't tell us the source of the problem and could actually signal a limitation of the proof procedure rather than an error in the program. With model checking, you get a counter-example which directly shows what's wrong.

Model checking has captured considerable attention in recent years, first in hardware design and then in reactive and real-time systems, for which the assertions of interest are often expressed in temporal logic.
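The abstraction-then-concretization loop just described can be sketched in a few lines. This is Python rather than a real model checker; the predicates p and q are the text's a > b and b > a + 1, and a brute-force search over a small domain stands in for the SAT/SMT machinery of actual tools.

```python
from itertools import product

# The two predicates from the text's example, over concrete variables a and b.
PREDICATES = {
    "p": lambda a, b: a > b,
    "q": lambda a, b: b > a + 1,
}

def abstract_counter_examples(violates):
    # Exhaustively explore the abstract state space: one True/False choice
    # per predicate, so 4 states here instead of the 2^128 concrete states
    # of two 64-bit integers.
    names = sorted(PREDICATES)
    return [dict(zip(names, values))
            for values in product([False, True], repeat=len(names))
            if violates(**dict(zip(names, values)))]

def is_spurious(state, domain=range(-50, 50)):
    # Concretization step: a counter-example is real only if some assignment
    # of concrete values satisfies its path predicate. Brute force here;
    # real model checkers use SAT/SMT solvers.
    return not any(
        all(PREDICATES[n](a, b) == v for n, v in state.items())
        for a in domain for b in domain)

# Property under scrutiny: the assertion is violated when p and q hold together.
for state in abstract_counter_examples(lambda p, q: p and q):
    print(state, "spurious" if is_spurious(state) else "real")
    # prints: {'p': True, 'q': True} spurious
```

As in the text, the abstract exploration reports one violation (p and q both true), and concretization shows it is spurious: no integers satisfy a > b and b > a + 1 together.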
8 DESIGN BY CONTRACT
The goal of developing software to support full proofs of correctness properties is, as noted, desirable but still unrealistic for most projects. Even a short brush with program proving methods suggests, however, that more rigor can be highly beneficial to software quality. The techniques of Design by Contract go in this direction and deliver part of the corresponding benefits without requiring the full formality of proof-directed development.
The discussion of proofs introduced Eiffel notations such as

require assertion -- A routine precondition
ensure assertion -- A routine postcondition

associated with individual routines. They are examples of contract elements, which specify abstract semantic properties of program constructs. Contracts apply in particular to:

- Individual routines: precondition, stating the condition under which a routine is applicable, and postcondition, stating what condition it will guarantee in return when it terminates.
- In object-oriented programming, classes: class invariant, stating consistency conditions that must hold whenever an object is in a stable state. For example, the invariant for a paragraph class in a text processing system may state that the total length of letters and spaces is equal to the paragraph width. Every routine that can modify an instance of the class may assume the class invariant on entry (in addition to its precondition) and must restore it on exit (in addition to ensuring its postcondition).
- Loops: invariant and (integer) variant as discussed above.
- Individual instructions: assert or check constructs.

The discipline of Design by Contract [53] [57] [67] gives a central role to these mechanisms in software development. It views the overall process of building a system as defining a multitude of relationships between client and supplier modules, each specified through a contract in the same manner as relationships between companies in the commercial world.

The benefits of such a method, if carried out systematically, extend throughout the lifecycle, supporting the goal of seamlessness discussed earlier: contracts can be used to express requirements and specifications in a precise yet understandable way, preferable to pure bubbles-and-arrows notations, although of course they can be displayed graphically too.
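As an illustration, the paragraph invariant described above can be transposed into executable checks. This is a minimal sketch in Python assertions rather than Eiffel's native require/ensure/invariant clauses; the class and method names are invented for the example.

```python
class Paragraph:
    # Sketch of the text's example: the class invariant states that the
    # total length of letters and spaces equals the paragraph width.
    def __init__(self, width):
        assert width > 0                     # creation precondition
        self.width = width
        self.text = " " * width
        self._check_invariant()

    def _check_invariant(self):
        assert len(self.text) == self.width  # class invariant

    def set_text(self, text):
        self._check_invariant()              # invariant assumed on entry
        assert len(text) <= self.width       # precondition (require)
        self.text = text.ljust(self.width)   # pad with spaces
        assert self.text.startswith(text)    # postcondition (ensure)
        self._check_invariant()              # invariant restored on exit

p = Paragraph(20)
p.set_text("Hello")
print(len(p.text))   # 20: the invariant still holds
```

With contract monitoring on, a call such as set_text with a string longer than the width fails immediately at the precondition, signalling the bug at the point of the faulty call rather than later.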
The method is also a powerful guide to design and implementation, helping developers to understand better the precise reason and context for every module they produce, and as a consequence to get the module right.

Contracts serve as a documentation mechanism: the contract view of a class, which discards implementation-dependent elements but retains externally relevant elements, in particular preconditions, postconditions and class invariants, often provides just the right form of documentation for software elements, especially reusable components: precise enough thanks to the contracts; abstract enough thanks to the removal of implementation properties; extracted from the program text, and hence having a better chance of being up to date (at least one major software disaster was traced [41] to a software element whose specification had changed, unbeknownst to the developers who reused it); cheap to produce, since this form of documentation can be generated by tools from the source text, rather than written separately; and multi-purpose, since the output can be tuned to any appropriate format such as HTML. Eiffel environments such as EiffelStudio produce such views [30], which serve as the basic form of software documentation.

Contracts are also useful for managers to understand the software at a high level of abstraction, and as a tool to control maintenance.

In object-oriented programming, contracts provide a framework for the proper use of inheritance, by allowing developers to specify the semantic framework within which routines may be further refined in descendant classes. This is connected with the preceding comment about management, since a consequence is to allow a manager to check that refinements to a design are consistent with its original intent, which may have been defined by the top designers in the organization and expressed in the form of contracts.

Most visibly, contracts are a testing and debugging mechanism. Since an execution that violates an assertion always signals a bug, turning on contract monitoring during development provides a remarkable technique for identifying bugs. This idea is pursued further by some of the tools cited in the discussion of testing below.

Design by Contract mechanisms are integrated in the design of the Eiffel language [52] [28] and a key part of the practice of the associated method. Dozens of contract extensions have been proposed for other programming languages (as well as UML [80]), including designs such as JML [43] for Java and the Spec# extension of C# [5].
9 TESTING
Testing [70] [8] is the most widely used form of program verification, and still for many teams essentially the only one. In academic circles testing has long suffered from a famous comment [23] that (because of the astronomical number of possible states) testing can only show the presence of bugs, but never their absence. In retrospect it is hard to find a rational explanation for why this comment ever deterred anyone from taking tests seriously, since it in no way disproves the usefulness of testing: finding bugs is a very important task of software development. All it indicates is that we should understand that finding bugs is indeed the sole purpose of testing, and not delude ourselves that test results directly reflect the level of quality of a product under development.

Components of a test

Successful testing relies on a test plan: a strategy, expressed in a document, describing choices for the tasks of the testing process. These tasks include:

- Determining which parts to test.
- Finding the appropriate input values to exercise.
- Determining the expected properties of the results (known as oracles). Input values and the associated oracles together make up test cases, the collection of which constitutes a test suite.
- Instrumenting the software to run the tests (rather than perform its normal operation, or in addition to it); this is known as building a test harness, which may involve test drivers to solicit specific parts to be tested, and stubs to stand for parts of the system that will not be tested but need a placeholder when other parts call them.
- Running the software on the selected inputs.
- Comparing the outputs and behavior to the oracles.
- Recording the test data (test cases, oracles, outputs) for future re-testing of the system, in particular regression testing, the task of verifying that previously corrected errors have not reappeared.

In addition there will be a phase of correction of the errors uncovered by the test, but in line with the above observations this is not part of testing in the strict sense.
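The tasks above can be sketched as a toy harness. This is Python, and the program under test, the test cases and the logging format are all invented for illustration; the point is only to show test cases as input-plus-oracle pairs, run and recorded for later regression use.

```python
def program_under_test(x):
    return x * x          # stand-in for the system being tested

# Each test case pairs an input value with an oracle on the result.
test_suite = [
    ("square of 3", 3, lambda result: result == 9),
    ("result non-negative", -4, lambda result: result >= 0),
]

def run_harness(suite, log):
    # The harness runs every case, compares behavior to the oracle,
    # and records the data for future regression runs; a crash in one
    # test must not stop the whole process.
    for name, value, oracle in suite:
        try:
            verdict = oracle(program_under_test(value))
        except Exception:
            verdict = False
        log.append((name, value, verdict))
    return all(verdict for _, _, verdict in log)

log = []
print("all passed:", run_harness(test_suite, log))   # all passed: True
```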
Kinds of test

One may classify tests with respect to their scope (this was used in the earlier description of the V model of the lifecycle):

- A unit test covers a module of the software.
- An integration test covers a complete cluster or subsystem.
- A system test covers the complete delivery.
- User Acceptance Testing involves the participation of the recipients of the system (in addition to the developers, responsible for the preceding variants) to determine whether they are satisfied with the delivery.
- Business Confidence Testing is further testing with the users, in conditions as close as possible to the real operating environment.

An orthogonal classification addresses what is being tested:

- Functional testing: whether the system fulfills the functions defined in the specification.
- Performance testing: its use of resources.
- Stress testing: its behavior under extreme conditions, such as heavy user load.

Yet another dimension is intent: testing can be fault-directed to find deficiencies but also (despite the above warnings) conformance-directed to estimate satisfaction of desired properties, or acceptance testing for users to decide whether to approve the product. Regression testing, as noted, re-runs tests corresponding to previously identified errors; surprisingly to the layman, errors have a knack for surging back into the software, sometimes repeatedly, long after they were thought corrected.

The testing technique, in particular the construction of test suites, can be:

- Black-box: based on knowledge of the system's specification only.
- White-box: based on knowledge of the code, which makes it possible for example to try to exercise as much of that code as possible.

Observing the state of the art in software testing suggests that four issues are critical: managing the test process; estimating the quality of test suites; devising oracles; and, the toughest, generating test cases automatically.
Managing the testing process

Test management has been made easier through the appearance of testing frameworks such as JUnit [42] and Gobo Eiffel Test [7], which record test harnesses to allow running the tests automatically. This removes a considerable part of the burden of testing and is important for regression testing.

An example of a framework for regression testing of a compiler, incorporating every bug ever found since 1991, is EiffelWeasel [29]. Such automated testing requires a solid multi-process infrastructure, to ensure for example that if a test run causes a crash the testing process doesn't also crash but records the problem and moves on to the next test.
Estimating test quality

Being able to estimate the quality of a test suite is essential in particular to know when to stop testing. The techniques are different for white-box and black-box testing.

With white-box testing it is possible to define various levels of coverage, each assuming the preceding ones: instruction coverage, ensuring that through the execution of the selected test cases every instruction is executed at least once; branch coverage, where every boolean condition tests at least once to true and once to false; condition coverage, where this is also the case for boolean sub-expressions; path coverage, for which every path has been taken; loop coverage, where each loop body has been executed at least n times for set n.

Another technique for measuring test suite quality in white-box approaches is mutation testing [79]. Starting with a program that passes its test suite, this consists of making modifications similar, if possible, to the kind of errors that programmers would make, and running the tests again. If a mutant program still passes the tests, this indicates (once you have made sure the mutant is not equivalent to the original, in other words that the changes are meaningful) that the tests were not sufficient. Mutation testing is an active area of research [71]; one of the challenges is to use appropriate mutation operators, to ensure diversity of the mutants.

With black-box testing the previous techniques are not available, since they assume access to the source code to set up the test plan. It is possible to define notions of specification coverage to estimate whether the tests have exercised the various cases listed in the specification; if contracts are present, this will mean analyzing the various cases listed in the preconditions. Partition testing [81] is the general name for techniques (black- or white-box) that split the input domain into representative subsets, with the implication that any test suite must cover all the subsets.
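A minimal sketch of the mutation idea follows. This is Python, the function and the single mutation operator are invented for illustration, and real mutation tools apply many operators to source or object code; here one mutant turns a strict comparison into a non-strict one, the kind of boundary error a programmer might make.

```python
def count_positive(items, mutated=False):
    if not mutated:
        return sum(1 for x in items if x > 0)    # original program
    return sum(1 for x in items if x >= 0)       # mutant: ">" became ">="

def weak_suite(f):
    return f([1, -2, 3]) == 2                    # never exercises the boundary 0

def stronger_suite(f):
    return weak_suite(f) and f([0]) == 0         # adds the boundary case

orig = lambda xs: count_positive(xs)
mut = lambda xs: count_positive(xs, mutated=True)

# The weak suite cannot tell the mutant apart: it was not sufficient.
print(weak_suite(orig), weak_suite(mut))          # True True
# The stronger suite kills the mutant.
print(stronger_suite(orig), stronger_suite(mut))  # True False
```

A surviving, non-equivalent mutant is exactly the signal described in the text: it points at a case (here, the input 0) the suite never exercised.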
Defining oracles

An oracle, allowing interpretation of testing results, provides a decision criterion for accepting or rejecting the result of a test. The preparation of oracles can be as much work as the rest of the test plan. The best solution that can be recommended is to rely on contracts: any functional property of a software system (with the possible exception of some user-interface properties for which human assessment may be required) can be expressed as a routine postcondition or a class invariant.

These assertions can be included in the test harness, but it is of course best, as noted in the discussion of Design by Contract, to make them an integral part of the software to be tested as it is developed; they will then provide the other benefits cited, such as aid to design and built-in documentation, and will facilitate regression testing.

Test case generation

The last of the four critical issues listed, test case generation, is probably the toughest; automatic generation in particular. Even though we can't ever get close to exhaustive testing, we want the test process to cover as many cases as possible, and especially to make sure they are representative of the various potential program executions, as can be assessed in white-box testing by coverage measures and mutation, but needs to be sought in any form of testing.

For any realistic program, manually prepared tests will never cover enough cases; in addition, they are tedious to prepare. Hence the work on automatic test case generation, which tries to produce as many representative test cases as possible, typically working from specifications only (black-box). Two tools in this area are Korat for JML [13] and AutoTest for Eiffel [15] (which draws on the advantage that, contracts being native to Eiffel, existing Eiffel software is typically equipped with large numbers of assertions, so that AutoTest can be run on software as is, and indeed has already uncovered a significant number of problems in existing programs and libraries).

Manual tests, which benefit from human insight, remain indispensable. The two kinds are complementary: manual tests are good at depth, automatically generated tests at breadth. In particular, any run that ever uncovered a bug, whether through manual or automatic techniques, should become part of the regression test suite. AutoTest integrates manual tests and regression tests within the automatic test case generation and execution framework [44].

Automatic test case generation needs a strategy for selecting inputs. Contrary to intuition, random testing [34], which selects test data randomly from the input domain, can be an effective strategy if tuned to ensure a reasonably even distribution over that domain, a policy known as adaptive random testing [14], which has so far been applied to integers and other simple values (for which a clear notion of distance exists, so that even distribution is immediately meaningful). Recent work [16] extends the idea to object-oriented programming by defining a notion of object distance.
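Combining the two ideas above, contracts as oracles and random input selection, gives a miniature version of the contract-based testing approach described. This is a Python sketch with an invented routine, using plain random selection rather than the adaptive variant; the postcondition serves as the oracle, so no expected outputs need to be written by hand.

```python
import random

def integer_sqrt(n):
    # An invented routine for illustration, carrying its own contract.
    assert n >= 0                               # precondition
    r = int(n ** 0.5)
    while r * r > n:                            # correct float rounding down
        r -= 1
    while (r + 1) * (r + 1) <= n:               # correct float rounding up
        r += 1
    assert r * r <= n < (r + 1) * (r + 1)       # postcondition as oracle
    return r

# Random test case generation: inputs drawn from the domain, with the
# embedded contract deciding pass or fail on every run.
random.seed(0)
failures = 0
for _ in range(1000):
    try:
        integer_sqrt(random.randint(0, 10**12))
    except AssertionError:
        failures += 1
print("contract violations:", failures)   # contract violations: 0
```

Any input that ever triggered a violation would, per the text, be saved and replayed as part of the regression suite.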
10 CONCLUSION
This survey has taken a broad sweep across many techniques that all have something to contribute to the aim of software reliability. While it has stayed away from the gloomy picture of the state of the industry which seems to be de rigueur in discussions of this topic, and is not justified given the considerable number of quality-enhancing ideas, techniques and tools that are available today and the considerable amount of good work currently in progress, it cannot fail to note as a conclusion that the industry could do much more to take advantage of all these efforts and results.

There is not enough of a reliability culture in the software world; too often, the order of concerns is cost, then deadlines, then quality. It is time to reassess priorities.

Acknowledgments

The material in this chapter derives in part from the slides for an ETH industry course on Testing and Software Quality Assurance prepared with the help of Ilinca Ciupa, Andreas Leitner and Bernd Schoeller. The discussion of CMMI benefited from the work of Peter Kolb in the preparation of another ETH course, Software Engineering for Outsourced and Offshored Development. Bernd Schoeller and Ilinca Ciupa provided important comments on the draft.

Design by Contract is a trademark of Eiffel Software.

The context for this survey was provided by the Hasler Foundation's grant for our SCOOP work in the DICS project. We are very grateful for the opportunities that the grant and the project have provided, in particular for the experience gained in the two DICS workshops in 2004 and 2005.

REFERENCES

Note: All URLs listed were active in April 2006.

[1] Algirdas Avizienis, Jean-Claude Laprie and Brian Randell: Fundamental Concepts of Dependability, in Proceedings of Third Information Survivability Report, October 2000, pages 7-12, available among other places at citeseer.ist.psu.edu/article/avizienis01fundamental.html.

[2] Ralph Back: A Calculus of Refinements for Program Derivations, in Acta Informatica, vol. 25, 1988, pages 593-624, available at crest.cs.abo.fi/publications/public/1988/ACalculusOfRefinementsForProgramDerivationsA.pdf.
[3] Thomas Ball and Sriram K. Rajamani: Automatically Validating Temporal Safety Properties of Interfaces, in SPIN 2001, Proceedings of Workshop on Model Checking of Software, Lecture Notes in Computer Science 2057, Springer-Verlag, May 2001, pages 103-122, available at tinyurl.com/qrm9m.
[4] Mike Barnett, Robert DeLine, Manuel Fähndrich, K. Rustan M. Leino and Wolfram Schulte: Verification of object-oriented programs with invariants, in Journal of Object Technology, vol. 3, no. 6, Special issue: ECOOP 2003 workshop on Formal Techniques for Java-like Programs, June 2004, pages 27-56, available at www.jot.fm/issues/issue_2004_06/article2.

[5] Mike Barnett, K. Rustan M. Leino and Wolfram Schulte: The Spec# Programming System: An Overview, in CASSIS 2004: Construction and Analysis of Safe, Secure Interoperable Smart devices, Lecture Notes in Computer Science 3362, Springer-Verlag, 2004, available at research.microsoft.com/specsharp/papers/krml136.pdf; see also other Spec# papers at research.microsoft.com/specsharp/.

[6] Kent Beck and Cynthia Andres: Extreme Programming Explained: Embrace Change, 2nd edition, Addison-Wesley, 2004.

[7] Éric Bezault: Gobo Eiffel Test, online documentation at www.gobosoft.com/eiffel/gobo/getest/index.html.

[8] Robert Binder: Testing Object-Oriented Systems: Models, Patterns, and Tools, Addison-Wesley, 1999.

[9] Bruno Blanchet, Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, David Monniaux and Xavier Rival: ASTRÉE: A Static Analyzer for Large Safety-Critical Software, in Applied Deductive Verification, Dagstuhl Seminar 3451, November 2003, available at www.di.ens.fr/~cousot/COUSOTtalks/Dagstuhl-3451-2003.shtml. See also the ASTRÉE page at www.astree.ens.fr.

[10] Barry W. Boehm: Software Engineering Economics, Prentice Hall, 1981.

[11] Barry W. Boehm: A Spiral Model of Software Development and Enhancement, in Computer (IEEE), vol. 21, no. 5, May 1988, pages 61-72.

[12] Barry W. Boehm et al.: Software Cost Estimation with COCOMO II, Prentice Hall, 2000.

[13] Chandrasekhar Boyapati, Sarfraz Khurshid and Darko Marinov: Korat: Automated Testing Based on Java Predicates, in Proceedings of the 2002 International Symposium on Software Testing and Analysis (ISSTA), Rome, July 22-24, 2002, available at tinyurl.com/qwwd3.
[14] T.Y. Chen, H. Leung and I.K. Mak: Adaptive random testing, in Advances in Computer Science - ASIAN 2004: Higher-Level Decision Making, 9th Asian Computing Science Conference, ed. Michael J. Maher, Lecture Notes in Computer Science 3321, Springer-Verlag, 2004, available at tinyurl.com/lpxn5.

[15] Ilinca Ciupa and Andreas Leitner: Automated Testing Based on Design by Contract, in Proceedings of Net.ObjectDays 2005, 6th Annual Conference on Object-Oriented and Internet-Based Technologies, Concepts and Applications for a Networked World, 2005, pages 545-557, available at se.ethz.ch/people/ciupa/papers/soqua05.pdf. See also the AutoTest page at se.ethz.ch/research/autotest.

[16] Ilinca Ciupa, Andreas Leitner: Object Distance and its Application to Adaptive Random Testing of Object-Oriented Programs, submitted for publication, available at publications/testing/object_distan

[17] Edmund M. Clarke Jr., Orna Grumberg and Doron A. Peled: Model Checking, MIT Press, 1999.

[18] Patrick Cousot: Verification by Abstract Interpretation, in Symposium on Verification Theory and Practice Honoring Zohar Manna's 64th Birthday, ed. Nachum Dershowitz, Springer-Verlag, 2003, pages 243-268.

[19] Mic