PRESENTATION
International Conference On Software Testing, Analysis & Review
NOV 8-12, 1999, BARCELONA, SPAIN
T23, Thursday, Nov 11, 1999
Neil Thompson
Zen and the Art of Object Oriented Risk Management: Does Anything Work Properly Now?
– At the talk: to think about where testing now stands within information systems philosophy, over 25 years on from the first testing conference (and since a popular book analysing quality), to understand why quality is still elusive, and to share new insights
– To take away: a desire and intention to make more direct use of risk management principles in testing
– To use: application to testing of two concepts from object orientation:
• encapsulating risk information with tests, to increase effectiveness
• inheritance of tests by other tests, to increase efficiency
• Target audience: intermediate, but:
• newcomers to testing may find it interesting
• experienced practitioners may benefit from a "reality check"
– the talk is not technical: more about risk management than object orientation
• The Road Ahead (when it's upgraded to "information superhighway") is good
• Business @ the Speed of Thought (via a digital nervous system) is good

The Dilbert Future, Scott Adams, 1997
• Bad things which can be foreseen will be prevented by humans
• One day technology will reduce human work, not increase it
• Democracy & capitalism will always coexist happily with lazy / stupid people

Release 2.0 and 2.1, Esther Dyson, 1997 & 1998
• Release 1.0 is fresh and new, the realisation of the hopes and dreams of its developers
• Release 2.0 is supposed to be perfect, but…
• …usually Release 2.1 comes out a few months after
• Internet causes decentralisation, where the masses separate into small groups (systems of which may automatically self-organise)

Moral and legal challenges of the information era, Chris Anderson, 1999
• Institutions and individuals seem to be coming apart, because…
• institutions are built on "machine" principles…
• and we now need "chaordic" organisations (eg Visa, Internet)
• How long can we survive computerising and interconnecting everything, with requirements increasingly volatile and undocumented, and responsibility distributed? What we already have doesn't work properly!
• Does anyone except testers care? Will the battle for quality send us mad?
• Is the future for testers worse or better:
– E-commerce threatens "disintermediation"; is there a similar threat to testing? (The technicians are expected to be also business-aware, and the business experts increasingly proficient in technical skills.)
– Or will testing get its actuaries, like the insurance industry?
• Good luck for 2000 and beyond! May the Zen be with you.
Zen and the Art of Object-Oriented Risk Management: Does Anything Work Properly Now?
Testing continues to struggle in the battle for information systems quality because:
• development continues to outpace testing, driven by market and other time pressures (such as year 2000 and European Economic and Monetary Union) and assisted by ever more productive development environments and tools;
• many managers are not sympathetic to the "culture of pessimism" often associated with testing, and refuse to move implementation target dates; and
• there are many obstacles to getting (and keeping) systems working properly, and these seem to be increasing as systems become more complex and more interconnected.
Testers can improve their effectiveness and efficiency by:
• accepting that they will always be under time pressure, so must manage quality;
• encapsulating risk information with tests, to help ensure not only that the right tests are planned, but also that, if time pressures necessitate a reduction in scope, the least important tests can be identified and omitted (effectiveness); and
• inheriting tests, for example reusing earlier, simpler tests within later, more complex tests (efficiency).
Finally, some thought is given to:
• risks and the year 2000 problem; and
• the future of testing, in the context of emerging organisational concepts.
Introduction
Around the time information systems testers held their first conference, a popular book analysed technology, art, science, quality and philosophy:
• the first testing conference was held in North Carolina, USA, in 1972 [HETZ98]; and
• two years later "Zen and the Art of Motorcycle Maintenance" was published [PIRS74], in which the narrator attempted to define quality but was declared insane in the process.
25 years later, testing appears to be a mature discipline, recognised as an indispensable contributor to systems quality, and its accepted main principles have changed little. But quality in information systems is still elusive, and many mass-market products incorporate software which clearly does not work properly. Standard software licence conditions explicitly allow for this, and the public apparently tolerates the situation.
Can anything be done about this, and how should testers position themselves as we enter the new millennium?
This paper is intended to provoke constructive thoughts and actions, under the headings:
1. Zen and now: a short history of testing philosophy
2. Millennial challenges: does anything work properly now?
3. How testing is based on risk management
4. Applying object-oriented concepts to testing
5. Role of metrics and measurement in risk management
6. Obstacles to attaining and maintaining quality systems
7. Risks and the year 2000 problem
8. Future of testing.
1. Zen and now: a short history of testing philosophy
1.1 Introductory themes
This paper is more about risk management than object orientation, and "Zen and the Art of Motorcycle Maintenance" turned out to be more about motorcycle maintenance than about Zen. But there are some entertaining parallels between some of the characteristics of Zen and the characteristics of testers as we enter a new millennium, when the fastest-growing "religion" seems to be a devotion to ever more pervasive, interconnected and complex technology, built on computer software (Figure 1a).
Thompson information Systems Consulting Limited

Figure 1a: Zen compared with the millennial "religion" of computerising and interlinking everything

Zen (from Encyclopaedia Britannica):
• Potential to achieve enlightenment is inherent in everyone but lies dormant because of ignorance
• Zen aims for: mental tranquillity, fearlessness, spontaneity
• Zen sect methods: (Rinzai) sudden shock & considering paradoxical statements; (Soto) sitting in meditation; (Obaku) continual chanting of Amida

Computerising & interlinking everything (a millennial religion?):
• Potential to work properly is (arguably) inherent in all systems but lies dormant because of imposed deadlines, market-led haste and habitual cynicism
• Testers should also aim for: mental tranquillity, fearlessness, spontaneity
• Testing sect methods: (Dynamikai) execution & debugging; (Statico) sitting inspecting; (Automatu) continual chanting of "tools"
1.2 Philosophical journeys
Actually, ZataoMM was not really about motorcycle maintenance either; it was about philosophy. The narrator rode his motorcycle across the USA with his son and some adult friends, and used maintenance metaphors and geographic imagery to illustrate the history of philosophy and some troubling issues, in particular around quality. Again there are some entertaining parallels with software testing (Figure 1b).
Figure 1b: Zen and the Art of Motorcycle Maintenance (Robert M. Pirsig, 1974) compared with testing (born ?; came of age 1972; in its prime yet?)

Route:
• From the US Central Plains up the Rocky Mountains, and down to the Pacific Ocean
• Down the left slope of the V-model (URS to module specifications), and up the right side (Unit Testing to Acceptance Testing)

Philosophies:
• Preventive / reactive maintenance; tools & meta-tools; classic / romantic understanding; testing & fixing / contracting out; functions / components; quality in bursts / way of life
• Static / dynamic; manual / automated; validation / verification; seeking defects / demonstrating absence; black box / glass box; quality assurance / control

Issues:
• Scientific method & limitations; defining quality causes insanity; (motorcycles were simple in 1974)
• Art or science?; codependent behaviour?; how keep up with development innovation?
Just as the motorcycle man had differences of opinion and attitude with his companions, so there are different philosophies among testers:
• Most people when they think of testing mean actually executing the software under test by inputting data and observing the expected results (ie dynamic testing), but there is a growing appreciation of the cost-effectiveness of static testing, eg code analysis, reviews, inspections, walkthroughs etc. Many testers do not find this exciting, however.
• The benefits of testing automation are still controversial, after many years of debate and considerable improvements in the capabilities of automated tools.
• Semantic differences sometimes confuse the distinction between verification and validation, though the usually-accepted definitions are that verification is about checking that the product is being built correctly (eg that the code meets its module specifications), and validation is about ensuring fitness for purpose, ie the right product is being built [GRAH95]. Everyone agrees that both are needed.
• There is general agreement that most testing is meant to detect faults and that testing cannot prove absence of errors, but there is less clarity over the role of acceptance testing in error detection and the extent to which it should be designed to run "smoothly", ie without serious faults. One way of achieving both is to rehearse the acceptance tests in advance.
• Both black-box and glass-box styles of testing are needed, but the emphasis varies through different levels of testing, eg integration testing is mostly glass-box but acceptance testing is mostly black-box. Glass-box is often called white-box, but the principle is that the tester should be able to see the structure inside.
The nature of quality was so troubling to the ZataoMM narrator's "alter ego", Phaedrus, that he was deemed insane and given electro-cranial therapy. Quality will be revisited several times in this paper, but in less dramatic terms.
There are some other particularly interesting issues which have been receiving recent attention in the testing world, and which are still the subject of lively discussion:
• Is testing an art or a science? At first sight it's a science, because we set out with a hypothesis that the software contains faults, then attempt to prove that hypothesis through experiments, ie tests (arguably during acceptance testing we also try to build confidence in business benefits and fitness for purpose by attempting to demonstrate absence of faults). But much of the data we would like to guide the design of our experiments (metrics and measurements from previous projects) are not readily available, so we often need to fall back on intuition to design clever, efficient tests and to diagnose errors. Glenford Myers apparently saw it this way when he titled the first testing bestseller "The Art of Software Testing" [MYER79]. A recent analysis of whether software testing is scientific found many reasons why not [BERE98]. Also, there is debate over whether software is an engineering discipline or not. Fortunately for the motivation and job satisfaction of many testers, it currently retains characteristics from both art and science.
• Are testers exceeding their job specification and taking on too many of other people's problems? Some managers hold testers responsible not just for finding errors but for getting them fixed, and some see faults as a delay to the schedule (bad news) rather than a prevention (good news) of later live failures. Others ask testers whether it's safe to go live yet. And the reason testers are under so much pressure is that they join the critical path of a project towards the end, when mistakes and shortcomings have accumulated from constrained budgets, optimistic plans, inexact requirements, imperfect design and imperfect coding. Worse, some of those problems may perhaps have been forgotten or hidden. And the testers admit that they have to fix it all, and fast! This has been persuasively likened to "co-dependent behaviour", a psychological disorder [COPE98].
• How can testing keep up with the pace and innovation of development? Until we can educate managers to delay projects, and markets to demand (and wait for) more reliable software, and until the holy grail of automation is found, we will have to find better ways of managing within the constraints imposed on us. This is the main theme of this paper, even though this is "to begin tolerating abnormal, unhealthy and inappropriate behaviours, then [to go one step further, to] convince ourselves these behaviours are normal" [COPE98] (making it arguably in itself codependent behaviour!)
1.3 Zen, Quality, Testing and Risk Management
The motorcycle man's journey took him over the summit of the Rocky Mountains, and it was here that he came closest to Zen's enlightenment and mental tranquillity. Before testers can attain such a state, they have to descend into the depths of the V-model and meet the standards of quality control and quality assurance. All of these are contributors to overall project risk management (Figure 1c).
"Soon, stunted pines disappear entirely and we're in alpine meadows. There's not a tree anywhere, only grass everywhere, filled with little pink and blue and white dots of intense colour… we've reached the high country… I look over my shoulder for one last view of the gorge… People spend their entire lives at those lower altitudes without any awareness that this high country exists"
1.4 The nature of quality
Like Phaedrus, information systems professionals have expended much effort trying to define quality. For the purposes of this paper, it is sufficient to outline (Figure 1d):
• the time-cost-quality triangle; and
• the distinction between quality control and quality assurance.
The time-cost-quality triangle sometimes quoted by project managers is shown on the next page with its true axes of good, fast and low-cost. It is common in commercial projects for the timescales and budgets to be set early, and only a proportion of managers accept that each phase of a project (eg level in the V-model) can be estimated only when the previous phase has completed, or nearly so (eg one needs to know the requirements before costing and planning the design phase). So the quality becomes constrained within quite a narrow range, and it is usually easier to increase the budget than to extend the timescales. Quality should of course be maximised within that range by effective quality control and quality assurance mechanisms.
Figure 1d: The time-cost-quality triangle, Quality Control and Quality Assurance
• … closely followed by cost
• quality is the best we can manage (unless we complain loudly)
Quality Control: "right first time"; internal responsibility
Quality Assurance: ISO9000 etc; audit, external (then fix?)
Like validation and verification, the difference between Quality Control and Quality Assurance has different interpretations. One useful distinction [THOM94] is that:
• QC is the responsibility of those doing the actual work, and may be improved by a "right first time" culture. However, the greater the time pressures on a project, the more staff are likely to make mistakes and feel that quality is a luxury for others.
• QA is an external "audit" or "policing" function, sometimes built on formal standards eg ISO9000. Unless QC is very good or QA is weak / open to negotiation, QA is likely to require some rework of tasks done.
1.5 Role of testing in quality
Most people have by now been convinced that testing can never be perfect, unless it is executed for an infinite time or with infinite resources. So all real-life systems go live based on a threshold of acceptable quality, which for some systems claims to be "zero-defect" but for most systems is based on acceptance criteria such as "no critical faults remaining, less than 10 important faults, less than 30 medium and less than 100 low-importance faults".
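Such acceptance criteria can be sketched as a simple gate; the threshold values below are taken from the example criteria just quoted, and the function and severity names are illustrative, not from any particular standard:

```python
# Sketch of an acceptance-criteria gate using the example thresholds quoted
# above: no critical faults, fewer than 10 important, fewer than 30 medium,
# fewer than 100 low-importance. Severity names and limits are illustrative.
ACCEPTANCE_THRESHOLDS = {"critical": 0, "important": 9, "medium": 29, "low": 99}

def meets_acceptance_criteria(open_faults):
    """open_faults maps a severity name to the count of unresolved faults."""
    return all(open_faults.get(severity, 0) <= limit
               for severity, limit in ACCEPTANCE_THRESHOLDS.items())

# A system with one critical fault fails the gate; one with a handful of
# minor faults can still go live on "acceptable quality".
```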
Testing should converge on this threshold via one, or ideally two or three, cycles of test-check-diagnose-fix-retest (Figure 1e).
Figure 1e: convergence on the acceptable-quality threshold, or divergence into instability through insufficient quality of design, debugging, maintenance, enhancements
It is normal for some fault-fixes themselves to contain errors. The proportion depends on a number of factors, eg working hours of staff and the amount of pressure on them, clarity of the original specification and of fault descriptions, quality of system design, modularity of code, adequacy of documentation. If the proportion of these "knock-on" errors becomes too high, it is theoretically possible for the spiral to diverge, ie the system becomes more and more unstable whenever it is changed. This is unlikely for new systems, but is a significant risk for old systems after a long period of maintenance by staff who were not involved in the original development. Sometimes such systems have to be frozen to keep them working, and replaced completely at the first opportunity.
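The convergence argument can be illustrated numerically. If each cycle of fixing re-introduces "knock-on" errors at some rate per fault fixed, the open-fault count shrinks geometrically when that rate is below 1 and grows when it is above 1. A minimal sketch (the rates and counts are invented for illustration):

```python
def faults_after_cycles(initial_faults, knock_on_rate, cycles):
    """Crude model of the test-fix-retest spiral: each cycle fixes all known
    faults, but fixing re-introduces knock_on_rate new faults per fault fixed."""
    faults = float(initial_faults)
    for _ in range(cycles):
        faults *= knock_on_rate
    return faults

# knock_on_rate < 1: the spiral converges towards stability;
# knock_on_rate > 1: the system grows less stable with every change.
```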
1.6 RAD versus Trad.
The above-mentioned risk of divergence into instability, or more likely just of failing to converge fast enough on stability, exists for Rapid Application Developments if insufficient control is applied. RAD and related methods such as iterative development mitigate this risk by strict "timeboxing" and repetitive testing.
Also, it is widely accepted nowadays that this process extends quickly into live running, as the initial version of a software-based system or product is usually known to be imperfect, and the descoped functions and lower-priority defects are scheduled for resolution in future releases (Figure 1f).
Microsoft is reported [CUSU95] to use a fine-grained version of this, with:
• extensive parallel work, but with strict configuration control and daily synchronisations and debugging;
• full build and test as frequently as the product and its current market context require (this could be monthly, fortnightly, weekly or quite commonly daily); and therefore
• never being far away from a deliverable “fit-for-market” product.
1.7 Test structure for effectiveness
It is important to know how the structure of tests planned and executed covers the functionality and other attributes of the system(s) under test. This is usually based on the very well-known V-model, in which each level of tests (unit, integration, system and acceptance) aims to exercise the system in different ways from different viewpoints, so that in total everything is done somewhere, preferably only once (plus some regression testing). Usually nowadays a refinement such as the W-model is used, which emphasises test specification as early as possible, thereby getting the benefit of static testing in addition to dynamic.
But measuring coverage at each level is not trivial. Typical measures used at different levels are:
• unit testing: statement and branch coverage of code
• integration testing: condition coverage of interfaces
• acceptance testing: coverage of stated requirements, user transactions for each user role profile, business events etc.
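At the acceptance level, for example, coverage of stated requirements reduces to bookkeeping over which requirements each planned test exercises. A hypothetical sketch (the data shapes are invented for illustration):

```python
def requirements_coverage(requirements, tests):
    """requirements: list of requirement ids; tests maps a test name to the
    set of requirement ids it exercises. Returns (coverage ratio, uncovered)."""
    covered = set()
    for exercised in tests.values():
        covered |= set(exercised)
    covered &= set(requirements)           # ignore ids not in the requirements
    return len(covered) / len(requirements), set(requirements) - covered
```

The uncovered set is exactly what is needed later when deciding where the holes in planned coverage are.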
It is system testing which presents the greatest challenge to coverage measurement, because it is expected to cover everything, not only all functionality but also all the "non-functional" or technical attributes such as performance, security, backup etc.
It is possible to think of this as a three-dimensional glass box, with:
• the first dimension being the structure of the system, in whatever terms it is specified, eg functional decomposition;
• the second dimension giving structure to the way the system behaves, again in whichever terms the specification is written; and
• the third dimension representing the various types of testing which have to be considered.
Following glass-box testing, or in association with it, we also need to do black-box testing based on data values in and out.
Figure 1g illustrates these principles [THOM93].
2. Millennial challenges: does anything work properly now?

Figure 2a: Millennial challenges to information systems: the year 2000 dates (9 Sep 99 etc, 1 Jan 00, 29 Feb 00); the conversion of national currencies (IR£, NFl, FM, BFr, LFr, DM, ASch, PEs, SPta, FFr, ItL) to the euro through 1998-2002; the growth of the internet ("2001: a cyberspace odyssey?"); and EMU in real life in 2002.
(Figure 2a) Over the last few years there have been some very large-scale changes in Europe with a big impact on information systems:
• the opening of the electricity and gas markets to competition has been completed in the UK, and similar initiatives are under way in other countries;
• the first wave of EMU has been implemented; and
• many companies and organisations have been repairing or replacing systems to handle the year 2000.
This level of change is not yet over, and in the next few years:
• further major changes are planned to electricity markets, and other utility companies are diversifying widely;
• the public conversion to the euro has still to be done, and other countries may join EMU; and
• year 2000 is not yet here.
At the same time, systems are becoming more complex, more pervasive and more interconnected. For example:
• information systems are becoming more internet-based, despite difficulties with communications bandwidth, browser incompatibilities and the need for cookies and other add-ons to produce usable systems;
• energy companies are carrying telecommunications, satellites are carrying internet traffic, new telecommunications methods are emerging but giving interference problems, and television is going interactive and internet-linked;
• motor vehicles are increasingly software-controlled, and this software can even be accessed over a mobile phone;
• functionality is being added to application software (and then changed in future releases) faster than people can learn to use it;
• new releases of operating systems are frequent, and reports of problems and instability are common;
• each new operating system version and browser version is supposed to work alongside all previous versions, as these may be in use for many years; and
• the same thing applies to application software, so it is common for new systems to be interfaced to legacy systems rather than to replace them; we therefore accumulate more and more diverse, interconnected systems.
2.2 Testing squeezed: even more!
It has been common for some years for testers to complain that the proper time for testing is compressed towards the end of a project as previous phases slip yet the implementation date does not (or if it does, by a lesser amount). This is even worse in the current environment, because:
• instead of this problem affecting each individual system at a different time,…
• now complete industries, with complete sets of diverse systems, have to change simultaneously for events like EMU and year 2000.
Also, modern systems for Enterprise Resource Planning and electronic commerce are really super-systems. So the traditional problems of delays, late deployment of testing staff, shortage of experts and "immovable" implementation dates compress testing more than ever before. Some of these dates really are immovable (Figure 2b).
3. How testing is based on risk management
So is there anything we can do to make a tester's life easier? It is well known that testing is a form of risk management, but this knowledge is not always used as explicitly as it could be (and should be) in planning and managing the testing process. The main components are illustrated in Figure 3a.
The usual convention is to express the degree of risk as the multiplied values of:
• probability, ie the likelihood of the risk becoming a real problem (0% is impossible, 100% is certainty); and
• severity, ie the degree of impact expected if the problem anticipated by the risk actually occurs (this may be measured in financial terms, eg lost income of between £40,000 and £100,000, most likely case £50,000).
Such precise measurements of percentage and financial loss are difficult in practice, and it is often sufficient to use a high/medium/low rating system for both probability and impact.
However, there are some risks for which, for emotional or other reasons, one or other of probability / severity is considered predominant. For example, most people are much more frightened of thunderstorms than they "should" be based on the real probability of being struck by lightning [BERN96]. Conversely, if a risk is not high-impact but is almost certain to happen, and we have the time and ability to prevent it, then most people would want to do that.
It is often useful to consider a third component, visibility, or more precisely the invisibility of the effects of a risk. If we see a risk to the correctness of the data in a system, it will be more of a problem if we don't detect it (ie the data looks plausible but is wrong) than if the problem is obvious (eg values suddenly becoming negative). This is one of the more worrying aspects of the year 2000 problem.
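The probability x severity convention, using the coarse high/medium/low scales suggested above plus an optional weighting for invisibility, can be sketched as follows; the numeric values given to the ratings and the invisibility weighting are assumptions for illustration, not from the paper:

```python
RATING = {"low": 1, "medium": 2, "high": 3}   # illustrative numeric scale

def risk_exposure(probability, severity, invisibility="low"):
    """Exposure = probability x severity, scaled up when the effects of the
    risk would be hard to detect: plausible-but-wrong data is more worrying
    than an obviously broken value."""
    exposure = RATING[probability] * RATING[severity]
    # Weight up low-visibility risks (the 0.5 factor is an assumption).
    return exposure * (1 + 0.5 * (RATING[invisibility] - 1))
```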
3.2 Risk management and testing
It is at the different levels of the V-model that we can see the distinction between the risks of faults within the software and the other risks of a system(s) implementation (Figure 3b).
Figure 3b: Risks addressed at each level of the V-model

Acceptance testing: script tests around the user guide and user & operator training materials.
System testing: risks: system ≠ specification; undetected errors waste user time & damage confidence. Mitigations: use independent testers, functional & technical, to get a fresh view; take the last opportunity to do automated stress testing before environments are re-used.
Integration testing: risks: interfaces don't match; undetected errors are too late to fix. Mitigations: use the skills of designers before they move away; take the last opportunity to exercise interfaces singly.
Unit testing: risks: units don't work right; undetected errors won't be found by later tests. Mitigations: use the detailed knowledge of developers before they forget; take the last opportunity to exercise every error message.
3.4 Detailed level
Within each level of the V-model, we should be specifying the tests which are most likely to address the specific risks expected at that level. For the lower levels (unit, integration and system), this means predicting the kind of defects which lurk there, and designing tests accordingly. Because testing is still at least as much art as science, this is often done almost subconsciously, using intuition based on previous experience. Sometimes explicit risks are assessed and documented, especially for system and acceptance testing, but after explicitly (or more likely implicitly) using these to specify the tests, often they are then filed away. We would do better to keep some record of the risks being addressed as part of the test specification (Figure 3d).
Figure 3d: Risk management during test specification

• To help decision-making during the "squeezing of testing", it would be useful to have recorded explicitly as part of the specification of each test:
– the type of risk the set of tests is designed to minimise
– any specific risks at which a particular test or tests is aimed
• Remember, each test is a means to an end, not an end in itself
• The "object" of each test is risk management, so let's encapsulate...

Test specification is based on the total magnitude of risks:

  total risk = Σ over all defects imaginable (estimated probability of defect occurring × estimated severity of defect)
4. Applying object-oriented concepts to testing
4.1 Encapsulation for effectiveness
The first OO concept considered here is encapsulation: if we want to keep a record of the risks which each test or set of tests is intended to mitigate, why not treat the risks as part of the overall test "object", in a similar way to how a development object encapsulates the local data it needs? If we want at some future date to reassess the risk to determine the status of that test, we won't have very far to look. Figure 4a illustrates the principles of "object-oriented risk management" compared to the corresponding concepts used in OO development.
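One way of realising this encapsulation in code (a hypothetical sketch; the class and field names are invented, not the paper's notation) is to make the mitigated risks part of the test record itself, so the risk information travels with the test:

```python
from dataclasses import dataclass, field

@dataclass
class Risk:
    description: str
    probability: int   # illustrative scale, eg 1=low, 2=medium, 3=high
    severity: int

    @property
    def exposure(self):
        return self.probability * self.severity

@dataclass
class Test:
    """A test 'object' encapsulating the risks it is designed to mitigate,
    just as a development object encapsulates its local data."""
    name: str
    risks: list = field(default_factory=list)

    @property
    def exposure(self):
        # Total risk exposure the test mitigates; used later for descoping.
        return sum(r.exposure for r in self.risks)
```

When a risk needs reassessing later, it is right there on the test, not filed away in a separate document.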
4.2 Object-oriented risk management in test specification and execution
The application of risk management to test specification was introduced in section 3.4 above. Figure 4b builds on this by applying similar principles to test execution. We are still looking at the probability and severity of defects, but whereas in specification we were trying to guess which defects we would be able to find, when we start execution we know about all the defects found so far.
Figure 4b: Risk management during test execution

  Test specification:
    total risk = Σ over all defects imaginable (estimated probability × estimated severity)

  Test execution:
    total risk = Σ over each defect detected (probability = 1; severity = f(urgency, importance))
               + Σ over all defects as yet undiscovered (estimated probability × estimated severity)
The severity of a defect is a function of its urgency and its importance, which can be defined as follows:
• when a test result either disagrees with the expected result or is otherwise deemed unsatisfactory, an incident should be recorded;
• the term "incident" is chosen because at this stage the tester cannot be sure whether this is a problem / defect (ie the system fails to meet its specification) or a change request (ie the system's specification is inadequate);
• either way, there are two categories of risk associated with each incident:
• that it delays or disrupts some or all of the tests planned to follow; and
• that if not fixed or otherwise acted upon before go-live, it damages the business or organisation;
• these two categories should be distinguished by recording two separate priorities in parallel:
• urgency, ie how quickly it needs fixing to minimise impact on progress of testing; and
• importance, ie how much impact it would have on the business if not fixed before go-live.
Often these two priorities are the same, but they can be opposite, eg invoices correctly calculated but printed with zero value will not stop the tests but would be disastrous for a business.
The actual severity for the purpose of risk calculation and resolution scheduling is best reviewed by a regular meeting of business, testing and technical representatives, since a low-importance but high-urgency incident could be blocking tests which are vital for confidence. The closer we get to go-live, the more value is placed on importance and the less on urgency for testing.
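The shifting balance just described, with importance counting for more as go-live approaches, can be sketched as a weighted blend; the linear weighting is an assumption for illustration, not a formula from the paper:

```python
def incident_severity(urgency, importance, fraction_to_golive):
    """Blend urgency (impact on testing progress) and importance (impact on
    the live business), weighting importance more heavily as go-live nears.
    fraction_to_golive runs from 0.0 (start of testing) to 1.0 (go-live)."""
    w = fraction_to_golive
    return (1 - w) * urgency + w * importance

# The zero-value-invoice example: low urgency (the tests can continue) but
# high importance, so its severity rises as go-live approaches.
```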
4.3 A framework for safe descoping
The main point about object-oriented risk management is that it recognises that testers are rarely allowed to take as long as they want to test a system. And because many things go not as well as planned, but few things go better than planned, a fixed implementation date often means several iterations of removing things from scope:
• in the relaxed early days, we write a testing strategy promising to cover everything adequately;
• when the testing plan is written, we already know that development is late and we won't be allowed such a large testing team;
• when it's time for test design, we discover that we will not be allowed two dedicated environments but only one, and that it will arrive late;
• any difficulties during test scripting, or any further development delays, will further attack the achievable test coverage; then
• test execution provides another set of threats, eg blocking errors delaying the schedule.
So our planned coverage is eaten away, bit by bit, until it is full of holes like a caterpillar-attacked leaf. We'd like those holes to be in the safest places; if a leaf is eaten away between the veins, it will still stand, but if the veins are broken, the leaf will collapse. So we need to know where it's safest to make those holes, where the risks are lowest.
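With risks encapsulated alongside each test, choosing where the holes are safest becomes a mechanical selection: keep the highest-exposure tests that fit in the time remaining, and omit the rest. A sketch (the simple greedy time-budget model is an assumption for illustration):

```python
def descope(tests, hours_available):
    """tests: list of (name, risk_exposure, hours) tuples. Keep the
    highest-exposure tests that fit the time left; return (kept, omitted)."""
    kept, omitted, used = [], [], 0.0
    for name, exposure, hours in sorted(tests, key=lambda t: -t[1]):
        if used + hours <= hours_available:
            kept.append(name)     # a "vein" of the leaf: too risky to omit
            used += hours
        else:
            omitted.append(name)  # a safe place for a hole
    return kept, omitted
```

The omitted list is then defensible: it names the tests whose risks were explicitly judged lowest, rather than whichever tests happened to be scheduled last.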
Finally, even when it comes to retesting and regression testing there are furthercompromises still to be made. If testing is as good as it normally is, we will have foundmany defects, probably hundreds. And how much time was allowed in the plan forretesting? Did we tell the project manager we expect 450 defects so we’d better allow 3weeks for retesting?. If so, did he / she accept this?
And the regression testing compromises are even worse, because we need a very sophisticated regression testing plan to tell us with high confidence what it is safe to leave out of regression testing. In the absence of that confidence, it feels good to go for a comprehensive re-run. But if we get any further failures during that (which we normally do), is there time to do it again? Not usually.
4.4 Revisiting risks
At each of these potential descoping points, we ought to check that the risks we originally assessed are still valid. There will not be time to repeat the whole process, but we can use the structure outlined in section 1.7 to check that our tests still cover the most important business processes, the key data inputs etc (Figure 4d).
[Figure 4d: Keep tests prioritised for best effectiveness. Example risk areas, spanning black-box to glass-box testing: important business processes; key data inputs, eg high-value transactions; important data outputs, eg invoices; complex functions and interfaces; risks to performance and auditability.]
4.5 Inheritance of tests
A second concept from object orientation may be used by testing, this time to improve efficiency. A common approach to specifying system-level tests is to start with a fresh ready-to-use database (an artificial, stable initial state) and then run a number of streams of testing in parallel. Each test requires some test data set-up, and the more complex the test, the more work is needed to get the data into the right state.
Inheritance can help us plan some short cuts here, particularly if we have test automation in place. It is sensible, and usual, to start with simple tests and keep the complex tests until the basics have been proven to work. But we can go further with this principle, and actually plan to reuse the simple tests as part of the complex tests, to get the data into the right state. If automated, we could even simply re-run the automated script. The principles are illustrated in Figure 4e.
[Figure 4e: Complex tests inheriting simpler tests. Instead of every test (Test 1, Test 2, Test 3 … Test 57) performing its own initial test data set-up from the artificial, stable initial state, try progressively more complex tests which re-run the early, simple tests to reach their start state (eg a later, complex test re-running Test 1, then needing only some more test data set-up); or even compose several earlier tests (eg Test 57 built from re-runs of simpler tests).]
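The inheritance idea can be sketched in code. This is a minimal illustration with hypothetical test and data names, not the paper's actual tests: a later, complex test inherits an early, simple test and re-runs it to reach its start state.

```python
# Sketch (hypothetical names): a complex test inheriting a simpler test
# for its data set-up, instead of building state from the artificial
# initial database itself.

class SimpleOrderTest:
    """An early, simple test: it also leaves behind the data a later test needs."""
    def run(self, db):
        db["order"] = {"id": 1, "lines": []}   # initial test data set-up
        assert "order" in db                    # the simple check itself
        return db

class ComplexInvoiceTest(SimpleOrderTest):
    """A later, complex test inherits the simple one to reach its start state."""
    def run(self, db):
        db = super().run(db)                    # re-run the simpler test first
        db["order"]["lines"].append({"item": "widget", "value": 100})  # more set-up
        invoice_total = sum(line["value"] for line in db["order"]["lines"])
        assert invoice_total > 0                # eg guarding the zero-value invoice risk
        return db

db = ComplexInvoiceTest().run({})               # one call runs both, in order
```

If the simple test were an automated script, the `super().run()` call would correspond to simply re-running that script, which is where the efficiency gain comes from.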
The advantages of this approach are that:
• we can save time, as there are fewer test steps to specify overall;
• as tests accumulate, a library of reusable tests is built, evolving into a life-like "test bed";
• there is an opportunity to "mix and match" different tests, giving ready-made variations;
• if two composite tests behave differently, it is relatively easy to isolate the differences in the tests themselves; and
• there is a by-product of some automatic regression testing, as repeating exactly the same tests which have worked earlier can reveal unexpected side-effects of fixes.
There are, however, some disadvantages:
• repeating an earlier test exactly may miss the opportunity to find a problem which a slight variation would have detected; and
• if there are problems with the earlier tests, the later tests which use them are delayed.
5. Role of metrics and measurement in risk management
5.1 Metrics and measurements: which?
Like verification and validation, or quality assurance and quality control, metrics and measurement are subject to semantic debate. A dictionary definition will suffice here: "metrics: the theory of measurement" (Chambers). The simplest interpretation is then that:
• defining metrics is deciding what we want to measure (eg progress through testing, defect priorities and resolution trends, defect sources), and how we can do that; and
• measurements are what we collect to populate our metrics with data.
Cynics often say that one can make statistics appear to prove anything, but that is exactly the point here: we should choose our metrics to give us information on the risks we want to manage, and then, at the next level of sophistication, to build up a knowledge base to guide future intelligent action.
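The metric/measurement distinction can be sketched in code. This is a minimal illustration with invented data: the metrics are decided first, and the raw measurements then populate them.

```python
# Sketch (hypothetical data): measurements collected during test execution,
# populating two metrics chosen in advance: cumulative testing progress
# (which typically gives an S-shaped curve) and defect source analysis.
from collections import Counter
from datetime import date

# Raw measurements, one record per day of test execution
measurements = [
    {"day": date(1999, 11, 1), "tests_run": 4,  "defect_source": "requirements"},
    {"day": date(1999, 11, 2), "tests_run": 7,  "defect_source": "coding"},
    {"day": date(1999, 11, 3), "tests_run": 12, "defect_source": "coding"},
]

# Metric 1: cumulative testing progress over time
progress, total = [], 0
for m in measurements:
    total += m["tests_run"]
    progress.append((m["day"], total))   # 23 tests run cumulatively by 3 Nov

# Metric 2: defect source analysis, feeding the knowledge base
sources = Counter(m["defect_source"] for m in measurements)
print(sources.most_common(1))  # [('coding', 2)]
```

The point is not the code but the separation: the metric definitions stay stable while measurements accumulate under them.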
Some common and useful metrics are illustrated in Figure 5a.
The subject is too wide to be given much space here, but a few comments are useful to give context:
• Testing progress is needed to predict whether we are going to finish in time. Typically initial progress is slow, owing to environmental problems and "blocking defects"; it then speeds up, then slows at the end as we struggle to complete the difficult cases. This gives an S-shaped cumulative curve.
• Testing productivity over time should be similar to testing progress, because productivity per test should be fairly uniform until we reach acceptance testing. This should contribute to our knowledge base of which kinds of tests are good at finding errors and which are not.
• Defect priorities and resolution trends, like testing progress, are essential to predict whether we will be ready in time, or if not, when.
• Defect source analysis is another "knowledge" contributor, because each testing level, eg integration, should specialise in the kinds of defects which suit it (and which later, higher levels have less chance of finding). Note here one criticism of the V-model: the kinds of errors in which acceptance testing specialises, ie requirements errors, are often the most difficult to fix, yet acceptance testing is the last to execute. The answer is that although acceptance testing is the last to run, it should be the first to start (as in the W-model). Defining acceptance test cases can and should begin as soon as the requirements are documented.
5.2 Metrics in risk management
If we return to our picture of estimated probability and estimated severity of defects as predicted in test specification and test execution, we can see how metrics can help us with some of the missing information (Figure 5b).
We can specify more effective tests in future if we know which tests were best at finding which errors (defect source analysis). Tests which run without failures, unless they are necessary for data set-up or are part of acceptance tests, are not good contributors to risk management.
During test execution, we can use not only current defect information and trends to predict when it will be safe to stop testing, but also defect information from the lower levels already executed, to predict where in the system defects are most likely to be (the author calls this "glass-shelf testing"). There is evidence that metrics from unit testing are very strong indicators of which units will be prone to rework at later, higher levels [HOLT98].
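The idea of unit-level metrics as leading indicators can be sketched simply. This follows the cited work only in spirit; the module names and defect counts are invented for illustration.

```python
# Sketch (hypothetical data): unit-test defect counts as leading indicators
# of which units will be prone to rework at later, higher test levels.
unit_defects = {"invoicing": 14, "reporting": 2, "login": 6}

# Units with the most unit-test defects are the likeliest rework candidates,
# so later, higher test levels can aim extra coverage at them.
watch_list = sorted(unit_defects, key=unit_defects.get, reverse=True)
print(watch_list)  # ['invoicing', 'login', 'reporting']
```

A real analysis would normalise by unit size and complexity, but even this crude ranking gives the higher test levels somewhere to concentrate.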
6. Obstacles to attaining and maintaining quality systems
6.1 Insufficient time, and neglect of risk
There is never as much time available for testing as testers want. We could respond to this better if we did not typically encounter the following obstacles:
• Insufficient use of risk. Initial risk analysis at the start of testing is common, but then we tend to put our heads down and do "the tests, the whole tests and nothing but the tests".
• Unclear probability of risk. Knowing the risk factors is currently more art than science, and the factors change as technology changes (for example, yesterday's stack pointer problems give way to today's browser incompatibilities). Very few commercial organisations seem to budget enough for metrics and their measurement and interpretation.
• Unclear severity of risk. This is usually easier to determine: interview the users and their managers. But remember to revisit it at key decision points.
Object-oriented risk management is intended to help overcome these obstacles. But there are other obstacles…
6.2 Motorcycles and information systems
In ZataoMM, the narrator discussed a number of obstacles he had identified in keeping motorcycles running properly. There are some interesting comparisons between these and similar obstacles met by information systems developers, maintainers and testers (Figure 6b).
Figure 6b: Obstacles … in motorcycle maintenance (analogue: needs frequent adjustments, part replacements) and … in testing information systems (digital: once it works, it should continue to work).
External:
• out-of-sequence re-assembly → installation etc. conflicts
• intermittent failure → intermittent failure!
• parts scarcity & confusion → development delays & configuration management
Internal (value):
• value rigidity → no lateral thinking
• ego → arrogance, cover-ups
• anxiety → desire for "smooth tests"
• boredom → uninspired tests
• impatience → inadequate strategy / design
Internal (truth):
• true / false / undefined → if… then… without else
Internal ("psycho-motor"):
• inadequate tools → no, or imperfect, tool set-up!
• poor working environment → inadequate test environments
• lack of "mechanic's feel" → "finger trouble" (deliberate?)
6.3 Why testing continues to struggle
There are some other books, more recent than ZataoMM, which suggest some insights into why getting information systems working properly, and keeping them working, is so difficult (Figure 6c).
Figure 6c: Recent books on why quality remains elusive.
• Bill Gates: the road ahead (when it's upgraded to the "information superhighway") is good; business @ the speed of thought (via a digital nervous system) is good.
• The Dilbert Future (Scott Adams, 1997): bad things which can be foreseen will be prevented by humans; one day technology will reduce human work, not increase it; democracy & capitalism will always coexist happily with lazy / stupid people.
• Release 2.0 and 2.1 (Esther Dyson, 1997 & 1998): Release 1.0 is fresh and new, the realisation of the hopes and dreams of its developers; Release 2.0 is supposed to be perfect, but usually Release 2.1 comes out a few months after; the internet causes decentralisation, where the masses separate into small groups (systems of which may automatically self-organise).
• Moral and Legal Challenges of the Information Era (Chris Anderson, 1999): institutions and individuals seem to be coming apart, because institutions are built on "machine" principles, and we now need "chaordic" organisations (eg Visa, the internet).
6.5 The surprise of Linux
Following up this idea of "chaordic" organisations: the Linux community is arguably one such organisation. One might at first sight expect a distributed collaboration of loosely-controlled developers and testers to produce less reliable software than large corporations, but the growing band of Linux adherents claim the opposite, and there is some evidence to support this. Figure 6e illustrates what we might expect compared to what seems to be emerging.
[Figure 6e: Institutional and chaordic convergence. Institutional (eg proprietary): we might expect closer control, defined scope and traditionally motivated staff; what we seem to get is convergence over successive releases (1.0, 1.1, 2.0, 2.1…), but only until the next major release or next product. Chaordic (eg Linux): we might expect anarchy, runaway scope and an insufficiently motivated "public"; what we seem to get is convergence, with fewer facilities but fewer faults and better stability. Why?]
7. Risks and the year 2000 problem
The risks presented by the millennium problem (or, more correctly, the century problem, since if computers had been in widespread use in 1899 then we would have had to fix it for 1900) are interesting for several reasons:
• The whole issue, despite some popular belief, was well known in advance by all, or nearly all, technical people (the author of this paper detected it when writing his first program in 1978, but declared himself "unlikely to be still in computing by then"!).
• The main reason it is still a problem, apart from a reluctance by some managers to listen to, or believe, technical people, is the natural human weakness of procrastination, or "short-term-ism". Few people wanted to pay for fixing it before it needed to be fixed.
• Even now, not many weeks away from the event (this was written in early September 1999), genuine experts cannot agree on whether it will really be a big problem or not. For every piece of bad news pounced on by the pessimists, there will be an immediate response by the optimists, declaring the news false or distorted, or failing that doubting the intellectual rigour of the reports. And vice versa.
Several "advance mini-crises", bundled by many commentators with year 2000 itself and its 01 January and 29 February problems, have passed with little or no visible disruption:
• the start of the financial year 1999-2000;
• the Global Positioning System date rollover in August from 1023 weeks to 0 weeks; and
• the potential, though less plausible, 9/9/99 problem.
These "non-events" are being taken in many quarters as indications that the pessimists have been wrong so far and will continue to be wrong. The pessimists, on the other hand, can quote the well-known stories of:
• the boy who falsely cried "wolf" (when a wolf did arrive, the boy was not believed and the sheep were eaten); and
• Cassandra, who knew she was telling the truth but also knew she would not be believed.
Arguably, year 2000 optimism or pessimism is like religion: it is not a conscious decision; one either believes or does not (agnostics being treated as non-believers rather than disbelievers). The only conscious decision most people take is to mix with the kind of people who agree with one's beliefs [BERN96]. This tends to reinforce those beliefs. So it is with year 2000.
It does seem, however, that the pessimists occupy the moral high ground. The most personal and emotional attacks are made by the optimists on the pessimists, declaring them cynical money-grabbers, opportunists, charlatans, self-promoters etc. Rarely do the pessimists attack the optimists, other than to pity them. Had the pessimists not been so vociferous over the last three or four years, much of the remedial work which has been done might have been too late. Yet there are undoubtedly cynical money-grabbers etc on the y2k bandwagon. And if the pessimists really do think the optimists are guilty of dangerous and culpable negligence, why do they not say so more loudly?
But there is a case to be made that even if the pessimists are right, they are right not to be pessimistic too loudly. They were five years ago, but not now. The reason is that whether or not it turns out to be a really big problem may depend as much as anything on how many people, at the last minute, think it will be a problem. There is a risk of a "self-fulfilling prophecy" (Figure 7).
[Figure 7: The risk of a year 2000 self-fulfilling prophecy. On one side: too little remediation done; almost-adequate remediation; too little testing ("are we currently here?"). On the other: panic amendments to systems; excessively disruptive tests; unnecessary system replacements; expensive contingency arrangements; investment suppressed; stockpiling & hoarding; stagnating markets, eg gilts; cash withdrawn.]
Not only are the causes and effects of possible year 2000 problems not independent,but:
• potential causes are dependent on other causes (eg one company's systems are compliant, but they are fed incorrect dates or date-affected data across an interface); and
• potential effects are dependent on other effects (eg supply chain problems, particularly in the widespread "just in time" chains).
It is this lack of independence which has almost certainly caused the insurance industry, near-unanimously, to refuse to insure year 2000 risks. Insurance is guided by actuaries, who perform sophisticated calculations on independent risks. Once the risks can feed off each other, "all bets are off". Interdependenzkraft? Nein danke! ("Interdependence power? No thanks!") But then there are always the lawyers on which to fall back…
Looking wider than the year 2000 problem, it has been argued that such problems of interdependence between cause and effect are actually a threat to the world's current mix of global capitalism with fragmented and diverse political control [SORO98]. The famous financier sets out concepts of:
• fallibility (not only is perfection impossible, but recognising this and planning for it can have positive consequences);
• reflexivity (not only are our expectations of future events affected by past events, but our expectations can themselves actually affect those future events); and
• open society (members of an open society recognise that understanding is imperfect and actions can have unintended consequences; capitalism is a distortion of an open society; market values need to be moderated by social values).
This strays outside the boundaries of software testing, but bears comparison with some of the ideas expressed in section 6:
• new, fragmented and cross-border groupings of mutually-interested people (via the internet) do not necessarily threaten orderly society;
• distributed but collaborative development and testing (Linux), rather surprisingly, does not necessarily lead to bad software;
• other "chaordic" organisations (eg Visa) can be outstandingly successful;
• market forces are taking software complexity out of control, and testers have a key role in setting up the missing "error correction" mechanism, based instead on social values; therefore perhaps…
• the worldwide community of testers may yet be able to escape the misery of codependent behaviour.
8. Future of testing
To conclude with some thoughts on the future:
• How long can we survive computerising and interconnecting everything, with requirements increasingly volatile and undocumented, and responsibility distributed? What we already have doesn't work properly!
• Does anyone but testers care? Will the battle for quality send us mad?
• Is the future for testers worse or better?
• E-commerce threatens disintermediation, ie "cutting out the middle-man"; is there a similar threat to testing? Technicians are expected to be increasingly business-aware, business experts are increasingly technically proficient, and testers often integrate the two other skill-sets. Will they always be necessary?
• On the other hand, will testers get the time and information in the future to assess risk better? The insurance industry does not do much without consulting its actuaries (and very well paid they are too), yet there are no equivalent positions visible in testing. Will testing get its own actuaries?
The future of object-oriented risk management has already arrived for the author of this paper, who has just finished a successful acceptance testing / user trials project in a very large company, using the concepts outlined here. There were learning points in addition to things that worked well, and it is hoped that a case study will be included in a future paper.
…to talk now about Phaedrus' exploration into the meaning of the term Quality, an exploration which he saw as a route through the mountains of the mind… In the first phase he made no attempt at a rigid, systematic definition… This was a happy, fulfilling and creative phase. The second phase emerged as a result of normal intellectual criticism of his lack of definition… he made systematic, rigid statements about what Quality is, and worked out an enormous hierarchic structure of thought to support them.
The take-home messages are therefore:
• keep your customer's risks evaluated as part of your tests, and revisit them at key stages in testing;
• define metrics (also based on what risks you want to manage), and measure against those metrics, but keep it simple;
• if necessary, be ready to go live on any of a range of dates, each with known risk;
• keep your measurements, and put them on the internet;
• work out what you would have done differently if you had had those measurements at the start of the project;
• take every opportunity to argue that testing needs expert assessors of risk just as much as the insurance industry needs actuaries;
• continue to attend, and contribute to, conferences like EuroSTAR;
• fight the battle against codependency, but don't only talk to other testers: go out and preach to the project managers and the "object orienteers" (an English pun);
• oh, and… survive year 2000.
• oh, and… survive year 2000.
References
ANDE99  Chris Anderson, Moral and Legal Challenges of the Information Era: the Effect of Y2k – a Philosophical Digression, Pretoria University, 1999
BERN96  Peter L Bernstein, Against the Gods: the Remarkable Story of Risk, Wiley, 1996
COPE98  Lee Copeland, When Helping Doesn't Help: Software Testing as Codependent Behaviour, EuroSTAR 1998
CUSU95  Michael A Cusumano & Richard W Selby, Microsoft Secrets, The Free Press, 1995
GRAH95  Dorothy Graham & Systeme Evolutif, CAST Report, CMI, 1995 (also other editions)
HETZ98  Bill Hetzel, Software Test and Evaluation: a 25-year Retrospective, EuroSTAR 1998
HOLT98  Peter Holt & Ronald Stewart, Leading Indicators of Rework: a Method of Preventing Software Defects, EuroSTAR 1998
MYER79  Glenford J Myers, The Art of Software Testing, Wiley, 1979
PIRS74  Robert M Pirsig, Zen and the Art of Motorcycle Maintenance: An Inquiry Into Values, Bodley Head, 1974
SORO98  George Soros, The Crisis of Global Capitalism: Open Society Endangered, Little, Brown, 1998
THOM93  Neil Thompson, Organisation before Automation: a Structured yet Pragmatic Testing Methodology, EuroSTAR 1993
THOM94  Neil Thompson, RAD v. Trad: a Case Study, EuroSTAR 1994
end of document NTCES99WP.doc v1.0 Neil Thompson 10 Sep 99
Neil Thompson
Neil Thompson is a graduate in Natural Sciences who has worked for over 20 years in information systems (with a hardware manufacturer, two software houses, a user organisation and two management consultancies). His roles have evolved through programming, systems analysis and project management, and he became a leading testing expert with Coopers & Lybrand. Now an independent testing consultant and manager, Neil works directly for blue-chip clients through his own company, sometimes in association with other consultancies or agencies.
He is a member of the British Computer Society's specialist interest groups in Software Testing and Configuration Management, and is an associate of the Institute of Management Consultancy.
He presented papers to EuroSTAR in 1993 (Organisation before Automation) and 1994 (RAD versus Trad), and to the BCS SIGiST in 1998 (Religion, Politics and Testing, from which this paper has evolved).