Generating Complex Input (and its other applications) Course Software Testing & Verification 2013/14 Wishnu Prasetya

Dec 14, 2015


Kara McCurdy
Page 1: Generating Complex Input (and its other applications)

Course Software Testing & Verification 2013/14

Wishnu Prasetya

Page 2: Content

• BNF to describe inputs
• Generating inputs
• Coverage
• Regular expression
• Other applications of regular expression

Important note: generating complex inputs is a non-trivial task, but AO addresses it only partially. For example, chapter 5 sets up the right background but then focuses mostly on mutation. Here we complement it with additional material. The sections of AO relevant to input generation are:

• Section 5.1.1 (very short!) about BNF
• Section 2.7 about regular expressions. That section is actually about using regular expressions to, e.g., calculate the test paths needed to deliver a given graph coverage. This is not directly related to input generation, but we will discuss it as well.

Page 3: Describing “allowed” inputs

• Using Backus-Naur Form (BNF) notation / context-free grammar (see the example on p171):

    S     → Brace | Curly | ε
    Brace → “(” S “)” S
    Curly → “{” S “}” S

  The first line is implicitly THREE production rules!

• Terminology: start symbol, terminal, non-terminal, epsilon (not in the book; check Wikipedia or the lecture notes of Languages & Compilers)

Page 4: Production Rule

• A production rule has the form N → Z, where N is a non-terminal and Z is a sequence of symbols.

• A rule like A → a(B|C)d is seen as shorthand for a set of production rules:

    A → aBd
    A → aCd

• People often use extended BNF, e.g.: Brace → ( “(” S “)” )*

Page 5: Example: NL post codes

    NLpostcode → Area Space Street
    Area       → FirstDigit Digit Digit Digit
    Street     → Letter Letter
    FirstDigit → 1 | 2 ...
    Digit      → 0 | 1 | 2 ...
    Letter     → a | b | c ... | A | B | C ...
    Space      → “ ” *

• Sometimes there are additional constraints, e.g. codes above 9999 XL do not actually exist (they do not map to an existing address). A constraint is not always expressible in BNF; or it is expressible, but not conveniently.

Page 6: Generating inputs

    S     → Brace | Curly | ε
    Brace → “(” S “)” S
    Curly → “{” S “}” S

A derivation is a series of expansions of the grammar that results in a sequence of terminal symbols. It follows that the sequence is a valid sentence of the grammar. We can use this to generate valid sentences. Example:

    S → Brace → ( S ) S → ( ) S → ( )

Page 7: Derivation tree

    S     → Brace        // RSb
    S     → Curly        // RSc
    S     → ε            // RSe
    Brace → “(” S “)” S  // RBrace
    Curly → “{” S “}” S  // RCurly

A derivation: S → Brace → ( S ) S → ( ) S → ( )

    RSb
     └ RBrace
        └ (  RSe  )  RSe

A derivation can also be described by a derivation tree such as the one above. Given such a tree, you can reconstruct the derived sentence.

Page 8: One more example

    S     → Brace        // RSb
    S     → Curly        // RSc
    S     → ε            // RSe
    Brace → “(” S “)” S  // RBrace
    Curly → “{” S “}” S  // RCurly

    RSb
     └ RBrace
        └ (  RSc  )  RSe
              └ RCurly
                 └ {  RSe  }  RSe

Represent such a tree e.g. by:

    data DTree = Term String | RuleName [DTree]

To get the sentence from a d :: DTree, simply flatten it.
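A minimal Haskell sketch of the tree representation and of flattening. The slide writes `RuleName [DTree]` informally; here it is assumed that a single constructor `Node` carries the rule name (the names `Node`, `Name` and `example` are illustrative, not from the slides). The example value encodes the tree shown above, whose sentence is `({})`.

```haskell
type Name = String

-- A derivation tree: a terminal leaf, or a rule applied to its children.
data DTree = Term String | Node Name [DTree]
  deriving (Eq, Show)

-- The sentence of a tree is the left-to-right concatenation of its leaves.
flatten :: DTree -> String
flatten (Term s)      = s
flatten (Node _ kids) = concatMap flatten kids

-- The tree from the slide: RSb / RBrace / ( RSc ) RSe, with RSc / RCurly.
example :: DTree
example =
  Node "RSb"
    [ Node "RBrace"
        [ Term "("
        , Node "RSc" [ Node "RCurly" [Term "{", rse, Term "}", rse] ]
        , Term ")"
        , rse
        ]
    ]
  where
    rse = Node "RSe" [Term ""]   -- S → ε
```

In GHCi, `flatten example` evaluates to the string `({})`.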

Page 9: A more generic formulation: basic idea

• Every production rule implicitly generates all derivation trees that begin with its LHS non-terminal. We will implement this by a function of type:

    type Gen = () -> [DTree]

• Later on, we still have to select which trees to take.
• To actually “run” a generator gen to produce sentences we do: map flatten . select . gen $ (), with some implementation of “select”.
• This may not terminate if there are infinitely many trees (addressed later).

Page 10: Combining generators

• Combining generators: ⊕ to interpret “alternatives”, and rule and rule_ to interpret a production rule:

    (⊕) :: Gen -> Gen -> Gen
    (f ⊕ g) () = f () ++ g ()

    rule :: Name -> [Gen] -> Gen
    rule n rhs = \() -> [ n [t1, t2, ...] | t1 <- rhs1 (), t2 <- rhs2 (), ... ]

• And this operator to represent the interpretation of terminals:

    term :: String -> Gen
    term v = \() -> [ Term v ]

Page 11: Defining “rule”

• rule :: Name -> [Gen] -> Gen

Only showing a simpler case. Suppose the rule was rule1 : A → B C

    rule “rule1” [genB, genC] = \() -> [ “rule1” [t, u] | t <- genB (), u <- genC () ]
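The combinators can be sketched in runnable Haskell as follows. This is only an illustration under some assumptions: the alternatives operator is written `<+>`, trees use a `Node` constructor for rule applications, and `rule` handles any number of right-hand-side generators by taking the Cartesian product of their results with `mapM`. The small grammar (A → B C, B → x | y, C → z) is hypothetical and deliberately non-recursive, so the generator terminates.

```haskell
type Name = String

data DTree = Term String | Node Name [DTree]
  deriving (Eq, Show)

type Gen = () -> [DTree]

-- Alternatives: all trees of f, followed by all trees of g.
infixr 2 <+>
(<+>) :: Gen -> Gen -> Gen
(f <+> g) () = f () ++ g ()

-- A terminal generates exactly one leaf.
term :: String -> Gen
term v = \() -> [Term v]

-- A production rule: every combination of sub-trees, one per RHS symbol.
rule :: Name -> [Gen] -> Gen
rule n rhs = \() -> [ Node n ts | ts <- mapM ($ ()) rhs ]

flatten :: DTree -> String
flatten (Term s)      = s
flatten (Node _ kids) = concatMap flatten kids

-- Hypothetical grammar: A → B C,  B → "x" | "y",  C → "z"
genA, genB, genC :: Gen
genA = rule "rule1" [genB, genC]
genB = rule "b1" [term "x"] <+> rule "b2" [term "y"]
genC = rule "c1" [term "z"]

sentences :: [String]
sentences = map flatten (genA ())
```

Here `sentences` contains the two strings `xz` and `yz`. With a recursive grammar such as the brace grammar, the same combinators would not terminate without a bound.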

Page 12: Example

    S     → Brace | Curly | ε
    Brace → “(” S “)” S
    Curly → “{” S “}” S

    s     = rule “RSb” [brace] ⊕ rule “RSc” [curly] ⊕ rule “RSe” [term “”]
    brace = rule “Rbrace” [term “(”, s, term “)”, s]
    curly = rule “Rcurly” [term “{”, s, term “}”, s]

• Notice the similarity between the structure of the generator and that of the original grammar.
• An initial generator is the generator that corresponds to the grammar’s start symbol. So, the function s above is an initial generator.

Page 13: Only generate trees of depth < kMax

• The previous generators generate an infinite number of trees, some of which may have infinite depth.

• In a non-lazy language, extend the generators so that they count how far the current node is from the root:

    type Gen = Int -> [Derivation]

• Adapting the combinators, e.g.:

    term s k = if k ≥ 0 then [Term s] else []

    (f ⊕ g) k = f k ++ g k

Page 14: Defining “rule”

• type Gen = Int -> [Derivation]
• rule :: Name -> [Gen] -> Gen

Only showing a simpler case:

    rule name [gen1, gen2] = newrule
      where
        newrule k = if k < 0 then [] else [ name [t, u] | t <- gen1 (k-1), u <- gen2 (k-1) ]

Page 15: Example revisited

    s     = rule “s0” [brace] ⊕ rule “s1” [curly] ⊕ rule “s2” [term “”]
    brace = rule “brace” [term “(”, s, term “)”, s]
    curly = rule “curly” [term “{”, s, term “}”, s]

• s 2 produces { "()", "{}", "" }
• s 4 produces:

    ["(())()","(()){}","(())","({})()","({}){}","({})","()()","(){}","()","{()}()","{()}{}","{()}","{{}}()","{{}}{}","{{}}","{}()","{}{}","{}",""]

• Too many... However, you actually have the full derivation trees, which you can exploit for filtering.
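The depth-bounded generators can be sketched as below. Several things here are assumptions rather than the course’s actual code: the operator name `<+>`, the `Node` constructor, the helper `sentences`, and in particular the choice that terminals are never cut off by the bound (`term` ignores `k`). That last choice was made because it yields the sentence lists quoted above; a `term` that checks `k >= 0` prunes more trees.

```haskell
type Name = String

data DTree = Term String | Node Name [DTree]
  deriving (Eq, Show)

-- A generator now receives the remaining depth budget.
type Gen = Int -> [DTree]

infixr 2 <+>
(<+>) :: Gen -> Gen -> Gen
(f <+> g) k = f k ++ g k

-- ASSUMPTION: terminals are always produced, whatever the budget.
term :: String -> Gen
term v _ = [Term v]

-- A rule fails once the budget is exhausted; children get budget k-1.
rule :: Name -> [Gen] -> Gen
rule n rhs k
  | k < 0     = []
  | otherwise = [ Node n ts | ts <- mapM ($ (k - 1)) rhs ]

flatten :: DTree -> String
flatten (Term s)      = s
flatten (Node _ kids) = concatMap flatten kids

-- The brace grammar, with the rule names used on the slide.
s, brace, curly :: Gen
s     = rule "s0" [brace] <+> rule "s1" [curly] <+> rule "s2" [term ""]
brace = rule "brace" [term "(", s, term ")", s]
curly = rule "curly" [term "{", s, term "}", s]

sentences :: Int -> [String]
sentences = map flatten . s
```

Under these assumptions `sentences 2` gives the three sentences `()`, `{}` and the empty string, and `sentences 4` gives 19 sentences. Because `s` returns full trees, a filter can inspect rule names before flattening.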

Page 16: We can see...

• We can see which terminals are produced.
• We can see which rules were used.
• We can infer which non-terminals were produced during the derivation.

    S     → Brace | Curly | ε
    Brace → “(” S “)” S
    Curly → “{” S “}” S

    RSb
     └ RBrace
        └ (  RSc  )  RSe
              └ RCurly
                 └ {  RSe  }  RSe

Page 17: BNF coverage

• (C5.29) TR contains each terminal symbol from the given grammar G.
• (C5.30) TR contains each production rule in G.
• Production coverage subsumes terminal coverage; but these are usually too weak.
• Pair-wise production coverage: TR contains every feasible pair (R1,R2) of production rules. Feasible means that they can actually be applied in succession in a derivation from G.
• Can be generalized to k-wise, but this may blow up the size of TR. Alternatively, if G is not too large you can still manually add new requirements to your TR.

Page 18: Pair-wise production coverage

• A derivation tree t covers a pair of rules R1;R2 if the pair appears as two consecutive nodes in t.

• A set T of derivation trees gives full pair-wise production coverage if every feasible pair of rules R1;R2 is covered by some t in T.

• Analogously for k-wise coverage.

    RSb
     └ RBrace
        └ (  RSc  )  RSe
              └ RCurly
                 └ {  RSe  }  RSe
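As an illustration, the rule pairs covered by a tree can be collected with a short traversal. The sketch assumes that “two consecutive nodes” means a parent rule and the rule of one of its direct children; `Node`, `rulesUsed`, `pairs` and `coveredPairs` are illustrative names. The example value is the tree drawn above.

```haskell
import Data.List (nub)

type Name = String

data DTree = Term String | Node Name [DTree]
  deriving (Eq, Show)

-- All rules applied in a tree (useful for production coverage).
rulesUsed :: DTree -> [Name]
rulesUsed (Term _)      = []
rulesUsed (Node r kids) = r : concatMap rulesUsed kids

-- All (parent rule, child rule) pairs occurring in a tree.
pairs :: DTree -> [(Name, Name)]
pairs (Term _)      = []
pairs (Node r kids) =
  [ (r, r') | Node r' _ <- kids ] ++ concatMap pairs kids

-- Pairs covered by a whole test set, without duplicates.
coveredPairs :: [DTree] -> [(Name, Name)]
coveredPairs = nub . concatMap pairs

-- The tree for the sentence ({}).
example :: DTree
example =
  Node "RSb"
    [ Node "RBrace"
        [ Term "("
        , Node "RSc" [ Node "RCurly" [Term "{", rse, Term "}", rse] ]
        , Term ")"
        , rse
        ]
    ]
  where
    rse = Node "RSe" [Term ""]
```

Comparing `coveredPairs` of a test set against the list of feasible pairs of the grammar then tells which test requirements are still open.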

Page 19: Example

    S     → Brace | Curly | ε
    Brace → “(” S “)” S
    Curly → “{” S “}” S

• { “()”, “{}” } gives full terminal as well as production coverage.

• Combinations of brace-curly and curly-brace can only be enforced by pair-wise coverage.

• But none of those coverage criteria can distinguish between e.g. ({}) and (){}

Page 20: Rule-rule coverage

    S     → Brace | Curly | ε   // RSb, RSc, RSe
    Brace → “(” S “)” S         // RBrace
    Curly → “{” S “}” S         // RCurly

• alts(N) = the set of production rules of non-terminal N; alts(R,i) = the set of production rules of the i-th symbol of the rule R; it is equal to alts(N) if N is the non-terminal at the i-th position.

• A derivation tree t covers R ;i R’ if R’ appears as the i-th child of some R in t.

• Each Rule-Rule Coverage (ERRC): for every rule R and every applicable i, TR includes every R ;i R’ for every R’ ∈ alts(R,i).

• For example, TR includes: RBrace ;1 RSb

Page 21: ERRC example

    S     → Brace | Curly | ε   // RSb, RSc, RSe
    Brace → “(” S “)” S         // RBrace
    Curly → “{” S “}” S         // RCurly

• Importantly, ERRC also requires these to be in TR:
  – <Brace ;1 RSb>, <Brace ;1 RSc>, <Brace ;1 RSe>
  – <Brace ;3 RSb>, <Brace ;3 RSc>, <Brace ;3 RSe>
  – Similarly for Curly
• Just ({}) covers Brace ;1 RSc, but not Brace ;3 RSc
• Similarly, (){} covers Brace ;3 RSc, but not Brace ;1 RSc
• Example of a test set giving full ERRC coverage:
  – (), ({}), (){}, (()), ()(), {}, {()}, {}(), {{}}, {}{}
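The claims above can be checked mechanically. The following sketch parses a sentence of the brace grammar into its derivation tree, collects the covered triples (R, i, R’) with 0-based child positions as on the slide, and compares them with the ERRC requirements. The parser, the `Node` constructor, the hand-written `required` list and all function names are assumptions made for this illustration; the slide’s `Brace ;1 RSc` corresponds to `("RBrace", 1, "RSc")` here.

```haskell
type Name = String

data DTree = Term String | Node Name [DTree]
  deriving (Eq, Show)

-- Recursive-descent parser for S → Brace | Curly | ε (one-symbol lookahead).
parseS :: String -> Maybe (DTree, String)
parseS ('(' : cs) = wrap "RSb" <$> parseBody "RBrace" '(' ')' cs
parseS ('{' : cs) = wrap "RSc" <$> parseBody "RCurly" '{' '}' cs
parseS cs         = Just (Node "RSe" [Term ""], cs)

wrap :: Name -> (DTree, String) -> (DTree, String)
wrap n (t, rest) = (Node n [t], rest)

-- Parses "S close S" after the opening bracket has been consumed.
parseBody :: Name -> Char -> Char -> String -> Maybe (DTree, String)
parseBody n open close cs = do
  (inner, rest1) <- parseS cs
  rest2 <- case rest1 of
             (c : r) | c == close -> Just r
             _                    -> Nothing
  (after, rest3) <- parseS rest2
  Just (Node n [Term [open], inner, Term [close], after], rest3)

parse :: String -> Maybe DTree
parse str = case parseS str of
  Just (t, "") -> Just t
  _            -> Nothing

-- All triples (R, i, R') such that R' is the i-th child of some R.
covered :: DTree -> [(Name, Int, Name)]
covered (Term _)      = []
covered (Node r kids) =
  [ (r, i, r') | (i, Node r' _) <- zip [0 ..] kids ] ++ concatMap covered kids

-- ERRC requirements for the brace grammar, written out by hand.
required :: [(Name, Int, Name)]
required =
  [ ("RSb", 0, "RBrace"), ("RSc", 0, "RCurly") ] ++
  [ (r, i, r') | r <- ["RBrace", "RCurly"], i <- [1, 3]
               , r' <- ["RSb", "RSc", "RSe"] ]

fullERRC :: [String] -> Bool
fullERRC tests = all (`elem` cov) required
  where cov = concat [ covered t | Just t <- map parse tests ]
```

With these definitions, `fullERRC` holds for the ten-sentence test set listed above, while the two-sentence set containing only `()` and `{}` does not reach full ERRC.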

Page 22: ERRC example

    S     → Brace | Curly | ε   // RSb, RSc, RSe
    Brace → “(” S “)” S         // RBrace
    Curly → “{” S “}” S         // RCurly

• Importantly, ERRC also requires these to be in TR:
  – <Brace ;1 RSb>, <Brace ;1 RSc>, <Brace ;1 RSe>
  – <Brace ;3 RSb>, <Brace ;3 RSc>, <Brace ;3 RSe>
  – Similarly for Curly
• ERRC does not force you to cover all “combinations”. ARRC (next slide) does, but this may produce a very large TR.

Page 23: All-combinations

• Let R be a rule producing k non-terminals. A combination of R is a vector c of: R ;1 R’1 , ... , R ;k R’k

• A derivation tree t covers such a combination c if it appears as sibling labels in t.

• All Rule-Rule Coverage (ARRC): for every rule R, TR includes every combination of R.

    RSb
     └ RBrace
        └ (  RSe  )  RSe

Page 24: Subsumption

[Figure: subsumption diagram relating ARRC, ERRC, 3-wise production coverage, pair-wise production coverage, production coverage, and terminal coverage]

Page 25: Regular expression

• Example: (aa | bb)* , ( “(” “)” | “{” “}” )*
• Easy to write, but not as expressive as BNF.
• Syntax:

    rexp | rexp
    rexp rexp
    rexp*
    rexp+
    ( rexp )

Page 26: The sentences of an Rexp

• L(e) = the set of sentences described by the rexp e.
• Defined as below:

    L(e*)    = { ε } ∪ L(e+)
    L(e+)    = L(ee*)
    L(e | f) = L(e) ∪ L(f)
    L(de)    = { s++t | s ∈ L(d), t ∈ L(e) }

Page 27: Regular expression

• Can be equivalently described by a BNF grammar, but this is beyond our scope; check the course Languages and Compilers.

• In practice people use e.g. the POSIX extended syntax; e.g. to describe NL post codes:

    [1-9][[:digit:]][[:digit:]][[:digit:]][[:blank:]]*[[:alpha:]][[:alpha:]]

Page 28: Generating sentences

• Discussed in AO. We’ll generalize; let’s represent a regular expression with values of this type:

    data Rexp = Term String | Seq Rexp Rexp | Alt Rexp Rexp | Star Rexp

• ε is just Term “”
• AO also has e^(M-N), for iterating e at least M times and at most N times. But this can be expressed with Seq and Alt.

Page 29: Generating sentences

    gen :: Rexp -> [String]
    gen (Term s)  = [s]
    gen (Alt d e) = gen d ++ gen e
    gen (Seq d e) = [ s++t | s <- gen d, t <- gen e ]
    gen (Star e)  = [“”] ++ gen (Seq e (Star e))

• Will generate all derivable strings, but of course may not terminate.

• Make it finite, e.g. by only expanding Star a finite number of times.
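One way to make the generator finite, sketched under the assumption that every Star is unrolled at most `n` times (the name `genN` and the bound parameter are illustrative, not from AO or the slides):

```haskell
data Rexp = Term String | Seq Rexp Rexp | Alt Rexp Rexp | Star Rexp

-- genN n e: the sentences of e in which each Star iterates 0..n times.
genN :: Int -> Rexp -> [String]
genN _ (Term s)  = [s]
genN n (Alt d e) = genN n d ++ genN n e
genN n (Seq d e) = [ s ++ t | s <- genN n d, t <- genN n e ]
genN n (Star e)  = concat [ reps i | i <- [0 .. n] ]
  where
    -- reps i: exactly i repetitions of e
    reps :: Int -> [String]
    reps 0 = [""]
    reps i = [ s ++ t | s <- genN n e, t <- reps (i - 1) ]

-- The example (aa | bb)* from the regular expression slide.
aabb :: Rexp
aabb = Star (Alt (Term "aa") (Term "bb"))
```

For instance, `genN 2 aabb` lists the empty sentence, then `aa` and `bb`, then the four two-fold repetitions. The result can still grow quickly when Stars are nested.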

Page 30: Another application, representing your control flow graph

• The language described by a regular expression can equivalently be described by a state automaton. Example:

    a*b(c|d)ef

[Figure: state automaton for a*b(c|d)ef, with edges labelled a, b, c, d, e, f leading from the initial state to the final state]

• Such an automaton can be “executed” by following the arrows from the initial state to a final state. This produces the corresponding sequence of “labels”, which is then the sentence generated by the execution.

Page 31: Representing control flow graph (CFG) with Rexp

• More on the equivalence between regular expressions and state automata is discussed in the course Languages & Compilers.

• Notice that a state automaton is a graph! So, it can be seen as describing a control flow graph. It follows that we can represent a CFG with a regular expression.

• To distinguish the arrows in the CFG, we will first assign a unique label to each.

Page 32: Simple example

    a*b(c|d1d2)ef

[Figure: CFG from the entry node to the exit node with edges labelled a, b, c, d1, d2, e, f]

• Not so complicated, but things can get a bit confusing when you have nested loops.

• Next slides describe a conversion algorithm; this is from AO 2.7.1

Page 33: You can merge sequential edges

© Ammann & Offutt, Introduction to Software Testing (Ch 2)

• Assuming one single end-node; else add a virtual end.
• Combine/multiply sequential edges
• Example: combine edges h and i

[Figure: graph with nodes 0 to 6 and edges a to i, shown before and after the sequential edges h and i are merged into one edge hi]

Page 34: You can merge parallel edges

© Ammann & Offutt

• Combine parallel edges (edges with the same source and target)
• Example: combine edges b and c

[Figure: the same graph before and after the parallel edges b and c are merged into one edge b + c]

Page 35: You can remove self-loops

© Ammann & Offutt

• Combine all self-loops (loops from a node to itself)
• Add a new “dummy” node
• An incoming edge with exponent
• Merge the resulting sequential edges with multiplication

[Figure: the self-loop e on node 3 becomes an edge e* into a new dummy node 3’; merging the sequential edges d, e* and f then gives a single edge de*f from node 2 to node 4]

Page 36: You can remove “middle node”

© Ammann & Offutt

• A middle node is neither an initial nor a final node.
• Replace the middle node by inserting edges from all predecessors to all successors.
• But the middle node should not self-loop.
• Multiply path expressions from all incoming with all outgoing edges

[Figure: middle node 3 with incoming edges A (from node 1) and B (from node 2) and outgoing edges C (to node 4) and D (to node 5); after removing node 3 the edges are AC, AD, BC and BD]

Page 37: Example of removing middle

© Ammann & Offutt

• Remove node 2
• Edges (1, 2) and (2, 4) become one edge
• Edges (4, 2) and (2, 4) become a self-loop

[Figure: before, the graph has edges a (0 to 1), b + c (1 to 2), de*f (2 to 4), g (4 to 2) and hi (4 to 6); after removing node 2 it has a (0 to 1), bde*f + cde*f (1 to 4), a self-loop gde*f on node 4, and hi (4 to 6)]

Page 38: Keep doing it until only one edge is left …

© Ammann & Offutt

    0 –a→ 1 –(bde*f + cde*f)→ 4 –hi→ 6, with self-loop gde*f on node 4
    0 –(abde*f + acde*f)→ 4 –hi→ 6, with self-loop gde*f on node 4
    0 –(abde*f + acde*f)→ 4 –(gde*f)*→ 4’ –hi→ 6
    0 –(abde*f (gde*f)* hi + acde*f (gde*f)* hi)→ 6

Page 39: Applications

    a*b(c|d1d2)ef

[Figure: the CFG of the simple example, with edges labelled a, b, c, d1, d2, e, f between the entry node and the exit node]

• The obvious one: we can use gen exp to get the set of “all” possible test paths through the CFG, but this is perhaps not very useful because what we ultimately want are test cases.

• But you can calculate some other useful information.

Page 40: Calculating the number of paths in the CFG

    a*b(c|d1d2)ef

[Figure: the CFG of the simple example, with edges labelled a, b, c, d1, d2, e, f between the entry node and the exit node]

• Iterating more than once is treated as equivalent to iterating just once.

• AO: you can do that by “transforming” your expression:

    a*b(c|d1d2)ef  →  (ε|a)b(c|d1d2)ef  →  (1 + 1)*1*(1 + 1*1)*1*1 = 4

Page 41: But of course we can also do it like this...

    cnt :: Rexp -> Int
    cnt (Term s)  = 1
    cnt (Alt d e) = cnt d + cnt e
    cnt (Seq d e) = cnt d * cnt e
    cnt (Star e)  = 1 + cnt e

• Notice the similar structure as in “gen”; we just use different operators.
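A small runnable check of `cnt` on the running example. The helper `paths` is an assumption added for illustration: it enumerates the paths under the same convention (a Star is taken zero times or once), so its length should agree with `cnt`.

```haskell
data Rexp = Term String | Seq Rexp Rexp | Alt Rexp Rexp | Star Rexp

-- Number of paths, treating e* as "skip the loop" or "iterate once".
cnt :: Rexp -> Int
cnt (Term _)  = 1
cnt (Alt d e) = cnt d + cnt e
cnt (Seq d e) = cnt d * cnt e
cnt (Star e)  = 1 + cnt e

-- The paths themselves, under the same convention: e* behaves as (ε | e).
paths :: Rexp -> [String]
paths (Term s)  = [s]
paths (Alt d e) = paths d ++ paths e
paths (Seq d e) = [ s ++ t | s <- paths d, t <- paths e ]
paths (Star e)  = paths (Alt (Term "") e)

-- a*b(c|d1d2)ef
example :: Rexp
example =
  foldr1 Seq
    [ Star (Term "a")
    , Term "b"
    , Alt (Term "c") (Seq (Term "d1") (Term "d2"))
    , Term "e"
    , Term "f"
    ]
```

Here `cnt example` is 4, matching the hand calculation, and `paths example` lists the four corresponding label sequences.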

Page 42: Other applications

    a*b(c|d1d2)ef

[Figure: the CFG of the simple example, with edges labelled a, b, c, d1, d2, e, f between the entry node and the exit node]

1. Calculating the longest path through the CFG/regular-exp

2. Calculating the minimum number of paths that would cover all branches (assuming loops always have an exit edge)

3. Calculating a minimal set of test paths that would satisfy (2).

4. ...

Page 43: We can do them too, by folding...

    maxLength :: Rexp -> Int
    maxLength (Term s)  = length s
    maxLength (Alt d e) = maxLength d `max` maxLength e
    maxLength (Seq d e) = maxLength d + maxLength e
    maxLength (Star e)  = maxLength e

    minCnt :: Rexp -> Int
    minCnt (Term s)  = 1
    minCnt (Alt d e) = minCnt d + minCnt e
    minCnt (Seq d e) = minCnt d `max` minCnt e
    minCnt (Star e)  = 1 + minCnt e

    minPaths :: Rexp -> [String]
    ... do this yourself.
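The shared recursion pattern can be factored into one fold, so that each analysis only supplies its four operators. This is a sketch; `foldRexp` and `example` are illustrative names, and the three analyses use exactly the operators given on the slides. Note that `maxLength` counts characters of the labels (`length s`), so a two-character label such as d1 counts as 2.

```haskell
data Rexp = Term String | Seq Rexp Rexp | Alt Rexp Rexp | Star Rexp

-- Generic fold: one function per constructor.
foldRexp :: (String -> r)      -- Term
         -> (r -> r -> r)      -- Seq
         -> (r -> r -> r)      -- Alt
         -> (r -> r)           -- Star
         -> Rexp -> r
foldRexp term sq alt star = go
  where
    go (Term s)  = term s
    go (Seq d e) = sq (go d) (go e)
    go (Alt d e) = alt (go d) (go e)
    go (Star e)  = star (go e)

cnt, maxLength, minCnt :: Rexp -> Int
cnt       = foldRexp (const 1) (*) (+) (1 +)
maxLength = foldRexp length    (+) max id
minCnt    = foldRexp (const 1) max (+) (1 +)

-- a*b(c|d1d2)ef
example :: Rexp
example =
  foldr1 Seq
    [ Star (Term "a")
    , Term "b"
    , Alt (Term "c") (Seq (Term "d1") (Term "d2"))
    , Term "e"
    , Term "f"
    ]
```

The bounded sentence generator fits the same scheme with list-valued operators, which is one possible starting point for the `minPaths` exercise.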

Page 44: Complementary operation analysis

• Examples of “complementary” operations (here called C/create and D/destruct):
  – push and pop
  – fileOpen and fileClose
  – getLock and releaseLock

• An execution path that contains more destructs than creates is suspicious. It is not necessarily an error, because at the actual run it may still happen that some of the destructs simply have no effect.

• Actually, #C ≥ #D should hold along any prefix of an execution path.

Page 45: Complementary operation analysis

    (CD* D | CD) D

[Figure: CFG with edges labelled C and D, described by the expression above]

• Given an execution t, let δ(t) = number of C’s in t – number of D’s in t.

• δ(t) has to be ≥ 0, otherwise unsafe.
• Actually, for all prefixes s of t check that δ(s) ≥ 0
• But can we check this for all executions of the CFG?

Page 46: Complementary operation analysis

    (CD* D | CD) D

[Figure: CFG with edges labelled C and D, described by the expression above]

• Consider the regexp that equivalently describes the graph.

• We’ll write a function δ :: Rexp -> [Formula] to generate formulas describing all possible δ’s of all sentences of the regular expression.

• Checking safety is then “simple”:

    safe :: Rexp -> Bool
    safe e = all [ f ≥ 0 is valid | f <- δ e ]

Page 47: The algorithm

    δ :: Rexp -> [Formula]
    δ C         = [ “1” ]
    δ D         = [ “-1” ]
    δ ε         = [ “0” ]
    δ (Alt d e) = δ d ++ δ e
    δ (Seq d e) = [ k + m | k <- δ d, m <- δ e ]
    δ (Star e)  = [ k * n | k <- δ e ], where n is a fresh name

To also generate constraints over prefixes, extend δ to produce two components, so that it also produces formulas for the prefixes, e.g.:

    δ (Seq d e) = (δ1 d ++ δ2 d, [ k + m | k <- δ2 d, m <- δ2 e ])
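A runnable sketch of the basic (non-prefix) version of the algorithm. Several choices here are assumptions for illustration: formulas are represented as a small expression type instead of strings, fresh names are drawn from an explicitly threaded counter (`n0`, `n1`, ...), and `eval` merely evaluates a formula for given iteration counts. Deciding validity of `f ≥ 0` for all non-negative values of the names would need a solver and is not attempted.

```haskell
data Rexp = Term String | Seq Rexp Rexp | Alt Rexp Rexp | Star Rexp

data Formula
  = Lit Int
  | Var String
  | Add Formula Formula
  | Mul Formula Formula
  deriving (Eq, Show)

-- delta e k: the formulas for e, using fresh names from index k onwards;
-- also returns the next unused index.
delta :: Rexp -> Int -> ([Formula], Int)
delta (Term "C") k = ([Lit 1], k)
delta (Term "D") k = ([Lit (-1)], k)
delta (Term _)   k = ([Lit 0], k)        -- ε and other labels are neutral
delta (Alt d e)  k =
  let (fs, k1) = delta d k
      (gs, k2) = delta e k1
  in (fs ++ gs, k2)
delta (Seq d e)  k =
  let (fs, k1) = delta d k
      (gs, k2) = delta e k1
  in ([ Add f g | f <- fs, g <- gs ], k2)
delta (Star e)   k =
  let (fs, k1) = delta e (k + 1)         -- name n<k> is reserved for this Star
  in ([ Mul f (Var ("n" ++ show k)) | f <- fs ], k1)

-- Evaluate a formula for a given assignment of iteration counts.
eval :: (String -> Int) -> Formula -> Int
eval _   (Lit i)   = i
eval env (Var v)   = env v
eval env (Add a b) = eval env a + eval env b
eval env (Mul a b) = eval env a * eval env b

-- (C D* D | C D) D
example :: Rexp
example = Seq (Alt (Seq c (Seq (Star d) d)) (Seq c d)) d
  where
    c = Term "C"
    d = Term "D"
```

For `example`, `fst (delta example 0)` has two formulas, one per alternative. Evaluating them with every name set to 0 gives -1 for both, so some execution performs more destructs than creates.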