The SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse

Coen De Roover, Carlos Noguera, Andy Kellens, Viviane Jonckers

Principles and Practice of Programming in Java (PPPJ11)

Querying Programs?

... identify code with characteristics of interest

structural

control flow

data flow

... commonly implemented by tool builders

e.g. plugin to find and highlight idiomatic uses of reflection

Logic Program Querying

... specify characteristics through logic conditions

quantify over reified program representation: AST, CFG, PTA

... leave operational search to logic evaluator

Logic Program Querying

single result: variable bindings

variable binding: reified AST node

... specify characteristics through logic conditions

... leave operational search to logic evaluator

Logic Program Querying

... many instantiations, yet seldom exploited by tools

Soul, TyRuBa, CodeQuest, PQL, JTL, ASTLog, JTransformer, LogEn, LogicAJ, GenTL, DeepWeaver

adoption hurdles?

H1: queries difficult to specify

H2: results difficult to exploit

H1 Illustrated... specifying queries is difficult

intricate details: program representation + reification

implementation variants: data flow + control flow

H2 Illustrated... exploiting results is difficult

de-reifying a variable binding: finding corresponding element

code changes: representation vs reified representation

prevalent reification of AST = transcription to logic facts

example tree: a has children b and c; c has child d

e.g. node(a,<node(b,<>),node(c,<node(d,<>)>)>).

e.g. node(a,<b,c>). node(b,<>). node(c,<d>). node(d,<>).

e.g. variable binding vs. corresponding node in IDE
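The flat transcription can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the encoding of any particular tool: `TreeNode` and `FactTranscriber` are hypothetical names introduced here, and the fact syntax simply mirrors the `node(a,<b,c>).` notation of the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical tree node; stands in for an AST node in this sketch.
final class TreeNode {
    final String id;
    final List<TreeNode> children;

    TreeNode(String id, TreeNode... children) {
        this.id = id;
        this.children = Arrays.asList(children);
    }
}

final class FactTranscriber {
    // Emits one fact per node, listing the ids of its children.
    // Facts are produced in pre-order.
    static List<String> toFacts(TreeNode root) {
        List<String> facts = new ArrayList<>();
        collect(root, facts);
        return facts;
    }

    private static void collect(TreeNode node, List<String> facts) {
        StringBuilder ids = new StringBuilder();
        for (TreeNode child : node.children) {
            if (ids.length() > 0) {
                ids.append(',');
            }
            ids.append(child.id);
        }
        facts.add("node(" + node.id + ",<" + ids + ">).");
        for (TreeNode child : node.children) {
            collect(child, facts);
        }
    }

    public static void main(String[] args) {
        TreeNode tree = new TreeNode("a",
                new TreeNode("b"),
                new TreeNode("c", new TreeNode("d")));
        for (String fact : toFacts(tree)) {
            System.out.println(fact);
        }
    }
}
```

The facts are copies of the tree: they go stale when the code changes, and a binding such as `d` is only an identifier that still has to be mapped back to the node in the IDE.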

The SOUL Tool Suite... an architectural overview

Eclipse platform: JDT

Soul evaluator

Cava library: logic predicates, reify AST node relations

Barista: plugin framework, on-demand + scheduled

Barista UI: UI for prototyping queries

Smart Annotations: case study

diagram arrows: quantifies over, interacts with

Addressing H2... exploiting results is difficult

Look Ma, No Fact Base!... reify(ASTNode)=ASTNode

through linguistic symbiosis: objects as first-class logic values

if ?m methodDeclarationHasName: ?n

?m: MethodDeclaration

?n: SimpleName

Cava library quantifies over JDT

through linguistic symbiosis: expressions in logic queries

addresses H2

trivial
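The idea that reify(ASTNode)=ASTNode can be illustrated with a small Java sketch. It is not SOUL or Cava code: `NameNode`, `MethodNode` and `SymbioticQuery` are hypothetical stand-ins (for JDT's `SimpleName` and `MethodDeclaration`, and for a predicate such as `methodDeclarationHasName:`). The point it shows is only that each solution binds variables to the program objects themselves.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a name node.
final class NameNode {
    final String identifier;

    NameNode(String identifier) {
        this.identifier = identifier;
    }
}

// Hypothetical stand-in for a method declaration node.
final class MethodNode {
    final NameNode name;

    MethodNode(NameNode name) {
        this.name = name;
    }
}

final class SymbioticQuery {
    // Mimics "?m methodDeclarationHasName: ?n": every solution binds ?m and ?n
    // to the objects themselves, not to copies or identifiers.
    static List<Map<String, Object>> methodDeclarationHasName(List<MethodNode> methods) {
        List<Map<String, Object>> solutions = new ArrayList<>();
        for (MethodNode method : methods) {
            Map<String, Object> bindings = new HashMap<>();
            bindings.put("?m", method);
            bindings.put("?n", method.name);
            solutions.add(bindings);
        }
        return solutions;
    }

    public static void main(String[] args) {
        MethodNode getter = new MethodNode(new NameNode("getX"));
        List<MethodNode> program = new ArrayList<>();
        program.add(getter);
        Map<String, Object> first = methodDeclarationHasName(program).get(0);
        // The binding is the node itself, so exploiting it needs no lookup.
        System.out.println(first.get("?m") == getter);
    }
}
```

Because a binding is the node, a tool can hand it straight to whatever API expects that node, which is why exploiting results becomes trivial.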

Addressing H1... specifying queries is difficult

intricate details: program representation + reification

implementation variants: data flow + control flow

Addressing H1... example-driven specification

developer exemplifies implementation of characteristics

e.g.

if jtExpression(?exp){
  Class.forName(?string)
}

but also

if jtClassDeclaration(?class){
  class ?className {
    private ?fieldDeclaration ?fieldName;
    ?modList ?returnType ?methodName(?parameterList) {
      return ?fieldName;
    }
  }
}

or

if jtMethodDeclaration(?m){
  ?modList ?returnType ?methodName(?paramList) {
    ?class := Class.forName(?string);
    ?instance := ?class.newInstance();
    (?type) ?casted;
  }
},
?casted mustAliasWith: ?instance

evaluator finds implicit implementation variants
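To make "implementation variants" concrete, here is a sketch of Java code that the reflection template would be expected to match. The class and method names are hypothetical, and whether a given evaluator accepts the second method depends on its may-alias analysis relating `alias` to `instance`; that is an assumption here, not something the slides confirm.

```java
// Class.newInstance() is deprecated since Java 9; it is used here only
// because the template on the slide exemplifies that call.
@SuppressWarnings("deprecation")
final class ReflectionIdiom {
    // Direct instance of the template: forName, newInstance, cast.
    static StringBuilder direct() throws ReflectiveOperationException {
        Class<?> clazz = Class.forName("java.lang.StringBuilder");
        Object instance = clazz.newInstance();
        return (StringBuilder) instance;
    }

    // Variant: the cast is applied to an alias of the created instance.
    static StringBuilder viaAlias() throws ReflectiveOperationException {
        Class<?> clazz = Class.forName("java.lang.StringBuilder");
        Object instance = clazz.newInstance();
        Object alias = instance;
        return (StringBuilder) alias;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(direct().getClass().getName());
        System.out.println(viaAlias().getClass().getName());
    }
}
```

A purely syntactic match would only find `direct()`; the `mustAliasWith:` condition is what lets the query also cover code shaped like `viaAlias()`.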

Addressing H1... demystified

if jtMethodDeclaration(?m){
  public static void main(String[] args) {
    ?scanner := new Scanner(?argList);
    ?scanner.close();
    ?scanner.next();
  }
}

lenient matching of code templates: varies from AST-based to flow-based

domain-specific unification procedure: occurrences of a variable can be bound to implementation variants

first binding         second binding         unify if
an ASTNode            an ASTNode             identical
a Type                a Type                 same or co-variant return types
a MethodInvocation    a MethodDeclaration    may invoke
an Expression         an Expression          may-alias
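Two rows of the unification table can be sketched in Java. This is a toy model under stated assumptions and not SOUL's unification procedure: `LenientUnifier` is a hypothetical name, expressions are represented by plain strings, the may-alias facts are supplied by hand rather than computed by an analysis, and `Class.isAssignableFrom` stands in for the "same or co-variant" relation on types.

```java
import java.util.HashSet;
import java.util.Set;

final class LenientUnifier {
    // Hypothetical may-alias facts, as a points-to analysis might report them.
    // Each entry encodes an ordered pair of expression names as "x|y".
    private final Set<String> mayAlias = new HashSet<>();

    void addMayAlias(String first, String second) {
        mayAlias.add(first + "|" + second);
        mayAlias.add(second + "|" + first);
    }

    // Expression vs. expression: identical, or reported as possible aliases.
    boolean unifyExpressions(String first, String second) {
        return first.equals(second) || mayAlias.contains(first + "|" + second);
    }

    // Type vs. type: the same type, or a subtype of the declared one.
    static boolean unifyTypes(Class<?> declared, Class<?> candidate) {
        return declared.isAssignableFrom(candidate);
    }

    public static void main(String[] args) {
        LenientUnifier unifier = new LenientUnifier();
        unifier.addMayAlias("instance", "casted");
        System.out.println(unifier.unifyExpressions("casted", "instance"));
        System.out.println(unifier.unifyExpressions("casted", "other"));
        System.out.println(unifyTypes(Number.class, Integer.class));
    }
}
```

The design point is that unification is decided per kind of value, so two occurrences of one logic variable need not be bound to syntactically identical code.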

Case Study... revisiting Smart Annotations

A. Kellens, C. Noguera, K. De Schutter, C. De Roover, and T. D’Hondt. Co-evolving annotations and source code through smart annotations. Conference on Software Maintenance and Re-engineering (CSMR10)

co-evolve code and annotations: embed usage constraints

@Target(ElementType.METHOD)
@TargetVariable("method")
public @interface Getter {
  @Necessary
  public static final String NAMING_CONVENTION =
    "?method methodDeclarationHasName: {get*}";

  @Sufficient
  public static final String STRUCTURAL =
    "jtClassDeclaration(?class){ class ?className { private ?fieldDeclaration ?fieldName; ?modList ?returnType ?methodName(?parameterList){ return ?fieldName; } } }";
}

all results should be annotated

all annotated should be result


to gather usage constraints

system. To extract these queries from the annotation-type definitions, the SMART ANNOTATIONS plugin uses the IBarista interface to launch a SOUL query that quantifies over the annotation type definitions and retrieves the constraints governing these annotations.

The gathered queries are then returned to BARISTA and scheduled for execution. When the scheduled queries are executed, BARISTA invokes the resultsDone() method on a SmartAnnotationQueryManager class. Once the scheduled queries representing the annotation constraints are executed by BARISTA, the SMART ANNOTATIONS plugin will retrieve the bindings for violations (these are actual ASTNode objects), and will create a warning marker for each of these violations.

6.3 Comparing Eclipse Search API and BARISTA

In order to illustrate the advantages that BARISTA and SOUL bring to tool builders, we compare how the first step (i.e., the gathering of annotation constraints) would be implemented using Eclipse’s search API and using BARISTA and SOUL.

Gathering constraints with Eclipse Search API. Eclipse, through the Java Development Tooling (JDT), provides developers with a search API for Java programs that offers other plugins search facilities to find source code element declarations and references. The JDT search API centers around three main concepts: search patterns specify the characteristics of the sought source code elements, search scopes restrict where the source code elements will be found, and search requestors accumulate results provided by the API’s search engine.

In the first step of checking annotation constraints, the SMART ANNOTATIONS plugin must find public static final fields of type String which belong to annotation-type declarations, and that are annotated with either @Necessary or @Sufficient. Such a constraint is rather difficult to express using the JDT’s search API, since the API is geared towards expressing the characteristics of a single code entity (i.e., its type) and not relationships between entities (i.e., the field belongs to an annotation type). Thus we opt for an over-approximating query that relies on the fact that we are searching for annotated items.

Figure 7 presents the snippet that, given a Java project, finds the constraints, the fields in which they are defined and the annotation types with which they are associated. Lines 1–5 and 7–11 construct search patterns for @Necessary- and @Sufficient-annotated elements. These patterns are composed into an or-pattern allConditions in lines 13–14. The scope of the search is restricted to the project jProject in lines 16–18, and a custom requestor SimpleSearchRequestor that will accumulate the annotations is constructed in line 20. Finally, the search is done in lines 24–27. Results gathered by the requestor are navigated in lines 29–37, using the getAncestor() method to obtain the field containing the constraint and the annotation type associated with it.

Gathering constraints using BARISTA. Figure 8 contains the listing for gathering the SMART ANNOTATIONS constraints out of annotation types using BARISTA’s interfaces. The query is performed in two steps. First, an evaluator (lines 1–11) is constructed. The evaluator is obtained from an IBarista instance (cf. 5.1), a String containing the query, and the project javaProject in which the query is to be run. The query itself (lines 3–10) will provide bindings for the annotation type (?type), the field that contains the constraint (?field) as well as the constraint itself (?rule). Results for the query are obtained by invoking getAllResults() on the evaluator, and extracting the map from variables to bindings (lines 13 and 15). A loop then iterates over the solutions, extracting the bindings for the constraint, kind of rule and annotation type (lines 17–22).

 1  SearchPattern allNecesary = SearchPattern.createPattern(
 2      "be.ac.vub.smart_annotations_library.Necessary",
 3      IJavaSearchConstants.ANNOTATION_TYPE,
 4      IJavaSearchConstants.ANNOTATION_TYPE_REFERENCE,
 5      SearchPattern.R_EXACT_MATCH);

 7  SearchPattern allSufficient = SearchPattern.createPattern(
 8      "be.ac.vub.smart_annotations_library.Sufficient",
 9      IJavaSearchConstants.ANNOTATION_TYPE,
10      IJavaSearchConstants.ANNOTATION_TYPE_REFERENCE,
11      SearchPattern.R_EXACT_MATCH);

13  SearchPattern allConditions = SearchPattern.
14      createOrPattern(allNecesary, allSufficient);

16  IJavaSearchScope projectScope = SearchEngine
17      .createJavaSearchScope(
18          new IJavaElement[] { jProject }, false);

20  SimpleSearchRequestor requestor = new SimpleSearchRequestor();

22  SearchEngine engine = new SearchEngine();

24  engine.search(allConditions,
25      new SearchParticipant[]
26          { SearchEngine.getDefaultSearchParticipant() },
27      projectScope, requestor, null);

29  for (Object res : requestor.getResult()) {
30      IAnnotation ann = (IAnnotation) res;
31      IField field = (IField) ann.
32          getAncestor(IJavaModel.FIELD);
33      IType annotation = (IType) ann.
34          getAncestor(IJavaModel.TYPE);
35      String rule = (String) field.getConstant();
36      // handle constraint
37  }

Figure 7. Gathering constraints using Eclipse search API

 1  IEvaluator eval = barista
 2      .query(
 3          "if ?type annotationTypeDeclarationHasName: ?annotation,"
 4          + "?type definesVariable: ?field,"
 5          + "?field variableDeclarationFragmentHasInitializer: ?rule,"
 6          + "?field variableDeclarationFragmentHasName: ?rulename,"
 7          + "?field variableHasAnnotation: ? named: ?annotationType,"
 8          + "or("
 9          + "equals(?annotationType,{Necessary}),"
10          + "equals(?annotationType,{Sufficient}))",
11          javaProject, "Evaluator", "JavaEclipse");

13  IResults iresults = eval.getAllResults();

15  Map<String, List<Object>> results = iresults.toMap();

17  for (int i = 0; i < iresults.getSize(); i++) {
18      ASTNode annType = (ASTNode) results.get("?type").get(i);
19      String type = results.get("?annotationType").get(i).toString();
20      String ruleString = results.get("?rule").get(i).toString();
21      // Handle constraint
22  }

Figure 8. Gathering constraints using BARISTA
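The result map in Figure 8 is organised per variable: each key maps to a list with one binding per solution. A small, self-contained sketch of turning such a map into one map per solution is given below. It does not use BARISTA itself; `SolutionRows` is a hypothetical helper, the values are placeholder strings rather than real ASTNode bindings, and it assumes every variable has the same number of bindings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SolutionRows {
    // Converts variable -> bindings (one per solution) into one map per solution.
    // Assumes all columns have the same length.
    static List<Map<String, Object>> toRows(Map<String, List<Object>> columns) {
        List<Map<String, Object>> rows = new ArrayList<>();
        if (columns.isEmpty()) {
            return rows;
        }
        int size = columns.values().iterator().next().size();
        for (int i = 0; i < size; i++) {
            Map<String, Object> row = new LinkedHashMap<>();
            for (Map.Entry<String, List<Object>> column : columns.entrySet()) {
                row.put(column.getKey(), column.getValue().get(i));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Placeholder data, standing in for real query results.
        Map<String, List<Object>> columns = new LinkedHashMap<>();
        columns.put("?annotationType", Arrays.<Object>asList("Necessary", "Sufficient"));
        columns.put("?rule", Arrays.<Object>asList("rule-1", "rule-2"));
        for (Map<String, Object> row : toRows(columns)) {
            System.out.println(row);
        }
    }
}
```

Each row keeps the bindings of one solution together, which is the relationship a list of single search matches does not preserve.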

11 2011/7/6













7. Discussion

Comparing BARISTA and Eclipse JDT Search. In the previous section, we presented two possible ways of gathering information about Java source code in the context of the SMART ANNOTATIONS plugin: through the Eclipse JDT Search API and through our tool suite. We believe that the latter is not only less operational in nature, but also conceptually simpler. In the case of the Search API, the developer must deal with search patterns, scopes and requestors. She must decompose the query into a number of patterns, each one characterizing individual nodes, and then specify the scope of the query. Additionally, the developer has to implement a class that extends SearchRequestor, which will be called back by the Eclipse search engine for each matched element.

In contrast, the BARISTA interface to SOUL allows for a simplified interaction. Given a SOUL query and a Java project to evaluate it against, the developer is provided a map that contains the solutions to the query. Additionally, a single query can provide several results in each solution: the query in Figure 8 results in bindings for the annotation type, field name and constraint. The Eclipse search API, on the other hand, will only return a list of single matches. Reconstructing the relationships between the resulting source code elements is left up to the developer. For example, by navigating the

using logic program queries

result variables bound to, for fields annotated with @Necessary or @Sufficient within an annotation type declaration:

- name
- initializer
- annotation type declaration

annotation type declaration: AST node



to gather usage constraints using org.eclipse.jdt.core.search

search patterns for @Necessary and @Sufficient

limited pattern composition

inter-node relations difficult

manually navigate upwards to field declaration

filter out non-fields

gather results manually

Conclusions... tool suite for querying Java programs

address adoption hurdles

SOUL: logic program query language

example-driven specification (H1: queries difficult to specify)

Cava: predicate library

reification through symbiosis (H2: results difficult to exploit)

Barista: Eclipse plugin

extension points + query prototyping

[email protected]/SOUL/

Want to know more?

example-driven pattern detection @ICSM

UML templates as queries @ICSM

history querying @WCRE

3) Co-changing Classes: As the third example in this section, we consider an ABSINTHE query that detects classes that consistently changed together. Finding co-changing classes is interesting, for they might hint at the presence of hidden dependencies. This query will find pairs of classes that, starting from a version where they first changed together, were always changed together from then on.

The query expressing this pattern is depicted in Figure 8. The main components of the query are:

• Lines 4–9 consume versions on the path until two different classes change together. The predicate wasChanged/1 (lines 7 and 9) checks whether its argument was changed in the current version of the path.

• Lines 10–14 consume one or more versions in which classes ?classA and ?classB are consistently changed together (lines 10–12) or not changed at all (lines 13 and 14).

• Finally, the path expression succeeds if we reach the end of a path (terminal on line 15).

Notice that line 12 binds the current version at that point in time to the logic variable ?changedVersion. By doing this, solutions to this query will contain bindings for all versions in which the two classes changed together.
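The co-change condition can be approximated in plain Java over a linear history. This is a sketch under stated assumptions, not ABSINTHE: `CoChange` is a hypothetical name, a version is reduced to the set of class names changed in it, the history is a single path, and the check starts from the first version in which both classes changed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CoChange {
    // Convenience helper: the set of classes changed in one version.
    static Set<String> changes(String... classNames) {
        return new HashSet<>(Arrays.asList(classNames));
    }

    // True if a and b are different classes that change together at least
    // once and, from that version on, are either both changed or both
    // untouched in every later version.
    static boolean coChanging(List<Set<String>> history, String a, String b) {
        if (a.equals(b)) {
            return false;
        }
        boolean together = false;
        for (Set<String> changed : history) {
            boolean hasA = changed.contains(a);
            boolean hasB = changed.contains(b);
            if (!together) {
                together = hasA && hasB;
            } else if (hasA != hasB) {
                return false;
            }
        }
        return together;
    }

    public static void main(String[] args) {
        List<Set<String>> history = new ArrayList<>();
        history.add(changes("A"));
        history.add(changes("A", "B"));
        history.add(changes("C"));
        history.add(changes("A", "B"));
        System.out.println(coChanging(history, "A", "B"));
        System.out.println(coChanging(history, "A", "C"));
    }
}
```

Unlike the logic query, this helper only verifies a given pair; enumerating all pairs would mean calling it for every combination of classes in the history.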

C. Temporal Bad smell: Zombie Code

 1  if
 2    ?start isOrigin,
 3    e(
 4      (true)*<,
 5      and(?m isMethodWithName:?name inClass: ?c,
 6          ?invoker methodSendsMessage:?name),
 7      (and(?m isMethodWithName:?name inClass: ?c,
 8          (? methodSendsMessage:?name) not))+<,
 9      and(?m isMethodWithName:?name inClass: ?c,
10          ?newInvoker methodSendsMessage:?name))
11    matches: ?path
12    start: ?start
13    end: ?end

Fig. 9. Detecting zombie methods.

As a final example we illustrate the use of ABSINTHE to detect a temporal bad smell. These are bad smells that only become apparent when analyzing the evolution of the source code of a system. In particular, we introduce the concept of zombie code. This bad smell is defined as pieces of code in the system that stop being used in a particular version but that are not removed (i.e., become dead code), and that are used again in a later version. While the presence of zombie code is not necessarily an indicator for a problem in the system, instances of this bad smell might point at uses of implicitly deprecated code.

Figure 9 depicts the ABSINTHE query that retrieves instances of zombie methods. This query consists of a single quantified regular path expression that matches a sub-path that exhibits the following characteristics:

• Lines 4–6 advance until a version that contains methods ?m with name ?name defined in a class ?c that are called by a method ?invoker.

• The second part (lines 7–8) matches one or more versions where the method ?m is still present, but no callers are found (i.e., the method is implicitly deprecated).

• Finally, the third part (lines 9–10) matches a version in which the method ?m is still present, and it is again called by method ?newInvoker.

When evaluated, this query will bind the logic variable ?m to all methods considered as a zombie method. The logic variable ?end will be bound to the first version where the zombie method gets called again. Notice that we also retrieve bindings for the methods ?invoker and ?newInvoker: the query does not only provide information regarding the zombie methods, but also indicates all callers of such methods.
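The three-part shape of the zombie pattern (called, then present but uncalled for one or more versions, then called again) can be sketched as a small state machine in Java. This is an illustration under stated assumptions, not ABSINTHE: `ZombieDetector` and `Version` are hypothetical names, the history is a single linear path, and each version is reduced to the sets of present and called method names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ZombieDetector {
    // Hypothetical snapshot of one version.
    static final class Version {
        final Set<String> present;
        final Set<String> called;

        Version(List<String> present, List<String> called) {
            this.present = new HashSet<>(present);
            this.called = new HashSet<>(called);
        }
    }

    // Returns the index of the first version in which the method is called
    // again after being present but uncalled, or -1 if that never happens.
    static int firstRevival(List<Version> history, String method) {
        int state = 0; // 0: not yet seen called, 1: called, 2: present but uncalled
        for (int i = 0; i < history.size(); i++) {
            Version version = history.get(i);
            boolean present = version.present.contains(method);
            boolean called = present && version.called.contains(method);
            if (!present) {
                state = 0; // removed: the pattern has to start over
            } else if (called) {
                if (state == 2) {
                    return i;
                }
                state = 1;
            } else if (state >= 1) {
                state = 2;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Version> history = new ArrayList<>();
        history.add(new Version(Arrays.asList("m"), Arrays.asList("m")));
        history.add(new Version(Arrays.asList("m"), Collections.<String>emptyList()));
        history.add(new Version(Arrays.asList("m"), Arrays.asList("m")));
        System.out.println(firstRevival(history, "m"));
    }
}
```

The returned index plays the role of the ?end binding; the sketch does not record the callers, which the logic query provides through ?invoker and ?newInvoker.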

V. DISCUSSION AND FUTURE WORK

a) Performance of ABSINTHE: As this paper focuses on introducing the ABSINTHE tool and in particular on the suitability of quantified regular path expressions to query the history of a system, we have not performed extensive benchmarking of our tool.

However, in order to obtain indications of the performance of our tool, we have applied all the queries presented in Section IV to the last two years of development history of the SOUL program query language. Our repository representation of this part of the history includes 179 versions; on average each version contains 244 classes and 2369 methods (10 KLoC per version). As our representation only stores entities that have changed (see Section III-A), it only consumes a total of 43Mb of memory.

Early results show that the tool performs adequately. For example, the query in Section IV-B1 (Figure 6) identifies all classes that the first author of the paper has been working on, and that were also modified by other developers, in 23 seconds. Solutions for most of the other queries in this paper are found within one minute.

Note that the time it takes to evaluate a query corresponds to the size of the search space that needs to be investigated. Consider the query for identifying co-changing classes (Section IV-B3). When launched with bindings for variables ?classA and ?classB (i.e., to verify whether two classes are co-changing), evaluation terminates in less than half a second. When this query is launched with one variable bound (i.e., to find all co-changing classes of a particular class), our tool requires 30 seconds to find all solutions. However, launching this query with both variables unbound (i.e., to find all pairs of co-changing classes) considerably widens the search space. For every possible pair of classes in the history of the system, ABSINTHE has to verify whether they are co-changing. This takes approximately two and a half hours before all answers are computed. However, it only takes about 40 seconds to produce the first result.

There are a number of optimization opportunities in the current implementation of ABSINTHE. Internally, our imple-

 1  if jtClassDeclaration(?subjectClass){
 2    class ?subjectName {
 3      ?mod1List ?t1 ?observers = ?init;
 4      public ?t2 ?addObserver(?observerType ?observer) {
 5        ?observers.?add(?observer);
 6      }
 7      public ?t3 ?removeObserver(?observerType ?otherObserver) {
 8        ?observers.?remove(?otherObserver);
 9      }
10      ?mod2List ?t4 ?notifyObservers(?param1List) {
11        ?observers;
12        ?observer.?update(?argList);
13      }
14    }
15  },
16  jtClassDeclaration(?observerClass){
17    class ?observerName {
18      ?mod3List ?t5 ?update(?argList) {}
19    }
20  },
21  jtExpression(?register){ ?subject.?addObserver(?lapsed) },
22  not(jtExpression(?unregister){ ?subject.?removeObserver(?lapsed) }),
23  jtExpression(?alloc){ ?lapsed := new ?observerName(?argList) }

Figure 6: Example-based specification for the Observer design pattern [GHJV94] (lines 1–21) and the lapsed listener pitfall [Liv05] in its implementation (lines 1–25).

Detecting Lapsed Listeners

“Lapsed listeners” [Liv05] are observer participants in implementations of the Observer design pattern that are no longer needed, but never unregister from their subject. Lines 1–21 of Figure 6 exemplify the classes that participate in the design pattern (i.e., ?subjectClass and ?observerClass). Lines 23–25 exemplify the lapsed listener pitfall at the instance-level: as instances of the participating classes that exhibit the characteristics of the pitfall. They identify ?lapsed objects that are added to a ?subject (line 23), but never removed from it (line 24). The final condition is optional. It identifies the expression that instantiated the lapsed object. Lines 1–21 of the specification are therefore situated at the class-level, while lines 23–25 are situated at the instance-level.

Note that the depicted specification only detects possible lapsed listeners. It does not identify the point in the program’s execution after which an observer is no longer needed, nor does it specify that the ?unregister expression should be executed after the ?register expression. It can therefore only be used to issue warnings.
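The instance-level part of the specification amounts to a set difference: registrations for which no unregistration exists. A minimal Java sketch follows, assuming the register and unregister expressions have already been collected as "subject->observer" strings; `LapsedListeners` and that string encoding are hypothetical choices made for this illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class LapsedListeners {
    // Returns the registrations without any matching unregistration.
    // Order of execution is ignored, so the result is only a list of
    // candidates that warrant a warning.
    static Set<String> possiblyLapsed(Set<String> registered, Set<String> unregistered) {
        Set<String> lapsed = new TreeSet<>(registered);
        lapsed.removeAll(unregistered);
        return lapsed;
    }

    public static void main(String[] args) {
        Set<String> registered = new TreeSet<>(Arrays.asList("button->logger", "button->view"));
        Set<String> unregistered = new TreeSet<>(Arrays.asList("button->view"));
        System.out.println(possiblyLapsed(registered, unregistered));
    }
}
```

Like the specification, the sketch says nothing about when an observer stops being needed, so its output should be read as warnings rather than confirmed defects.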

5.2 Scientific Activities

5.2.1 Industrial Validation

We have used SOUL++ to detect potentially enhanceable for-statements (cf. Figure 4), violations against the protocol of an API, design pattern [GHJV94] instances in JHOTDRAW, pitfalls in the implementation of design patterns, violations against coding conventions of the AMBIENTTALK [Amb] interpreter and instances of µ-patterns [GM05].

5.2.2 Refined Specification Language

5.2.3 Methodology for Pattern Specification

5.2.4 Real-time Detection Mechanism

We will shift our focus in the remaining research efforts from the specification language to the detection mechanism of our example-driven approach to pattern detection. As this approach is founded in logic programming, we will mainly adapt results from this domain to our specific problem setting. Other pattern detection tools that are based on variants


One more thing;)

inspector on Smalltalk proxy for Java object
