IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. SE-11, NO. 11, NOVEMBER 1985

The Roles of Execution and Analysis in Algorithm Design

DAVID M. STEIER AND ELAINE KANT

Abstract-The analysis and execution of partial algorithm descriptions is an important part of the algorithm design process (as is borne out by studying the behavior of human algorithm designers). In this paper, we describe a language for representing partially designed algorithms and a process, developmental evaluation, that can discover useful knowledge to guide design. Using these and other results from our research in artificial intelligence, we are building a system, DESIGNER, that automatically designs algorithms. This paper also compares developmental evaluation to execution and analysis techniques used for testing complete programs and for validation of abstract specifications; concepts similar to those found in developmental evaluation are thus shown to apply to all stages of the software life cycle.

Index Terms-Algorithm design, automatic programming, developmental evaluation, meta-evaluation, symbolic execution.

I. INTRODUCTION

Our studies of how people design algorithms [21], [22] reveal that the execution and analysis of partially completed designs is frequently used to guide design in the absence of specific knowledge. Such execution and analysis assess the consequences of the current set of design decisions to help decide on a next step. A companion paper in this issue, "Understanding and Automating Algorithm Design," provides an overview of our framework for algorithm design. In this paper, we first describe the representation for algorithms in the framework. Then we discuss the execution and analysis procedure (which we will call developmental evaluation) and illustrate it with an example. The last part of the paper surveys related work in the software engineering and artificial intelligence literature and identifies computational support requirements for developmental evaluation.

Since the literature comes from subfields of computer science that until recently have focused on different issues, it is helpful to begin by defining some terms. The phrase "developmental evaluation" is our own; we introduce it because no term currently in widespread use includes all the techniques we wish to discuss. We intend developmental evaluation to include the following:

* Symbolic execution, the production of a symbolic expression for each output of a program in terms of the inputs by dynamic interpretation of the program. Symbolic execution allows input data to contain not only concrete values such as "point B at coordinates (1, 5)," but also abstract symbols such as "an arbitrary point." When all items are concrete data values, we call the process test-case execution, which is similar to standard program interpretation to obtain an actual result. (A toy sketch of this distinction follows the list of definitions below.)

Manuscript received May 1, 1985; revised July 3, 1985. This work was supported in part by the Defense Advanced Research Projects Agency under Contract F33615-81-K-1539 and in part by the National Science Foundation under Grant DCR-8412139.
D. M. Steier is with the Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA 15213.
E. Kant is with Schlumberger-Doll Research, Ridgefield, CT 06877, on leave from the Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA 15213.

* Symbolic evaluation, the production of symbolic expressions for program outputs, but by static analysis rather than dynamic interpretation. The distinction between evaluation and execution is due to Cheatham [8].

* Meta-evaluation, static analysis of high-level specifications to resolve ambiguities. We use meta-evaluation to denote the mechanism described by Balzer [4] rather than the more rigorous procedure described by Yonezawa and Hewitt [41].

* Partition analysis, a procedure described by Richardson and Clarke [29] to execute both the specification and the implementation of a module and compare the results.

* Simulation of mental models, a process to detect unexpected interactions between new elements in a plan or design. This use of simulation was introduced in the cognitive science literature discussing planning [17] and software system design [1].
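To make the distinction between symbolic and test-case execution concrete, consider the following toy interpreter. This is our illustrative sketch, not code from any of the systems cited above; the names Sym, add, and run are hypothetical.

    class Sym:
        """A symbolic value, printed as an expression over the inputs."""
        def __init__(self, expr):
            self.expr = expr
        def __repr__(self):
            return self.expr

    def add(a, b):
        # Concrete operands compute a value (test-case execution);
        # any symbolic operand yields a symbolic expression instead.
        if isinstance(a, Sym) or isinstance(b, Sym):
            return Sym(f"({a} + {b})")
        return a + b

    def run(x, y):
        t = add(x, y)      # t := x + y
        return add(t, t)   # out := t + t

    print(run(1, 5))         # test-case execution: 12
    print(run(Sym("x"), 5))  # symbolic execution: ((x + 5) + (x + 5))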

Developmental evaluation incorporates concerns from traditional symbolic execution and the other processes listed above: creating symbolic representations of outputs as functions on inputs (useful in generating invariants and formally verifying programs or algorithms); describing the conditions for following each path, to detect nonexecutable segments and to define subdomains of the input; and indicating coverage of concrete test data and helping to generate additional tests. However, little work has extended and integrated symbolic execution into the problem-solving processes in design.1 Developmental evaluation, in contrast, does exactly this; it includes features from symbolic execution and related processes that, to our knowledge, have never before been combined into a single framework.

1) Developmental evaluation analyzes incomplete algorithms expressed as data-flow graphs rather than full programs written in a procedural language.

2) Developmental evaluation may uncover errors in the design and may post difficulties that inform other problem-solving processes of the problem. These other processes use knowledge encoded in production rules to take appropriate action.

3) Propagation of information during developmental evaluation is restricted by active design goals. Testing for special-case handling, determining the run time of an algorithm, and checking for type consistency between inputs and outputs of adjacent algorithm steps are three such goals.

1 The execution of program plans on concrete data for debugging during design was, however, suggested a decade ago [35].

4) Data objects may be arbitrarily complex structures (e.g., geometrical objects or sets). In contrast, most symbolic evaluators (an exception is KOKO [12]) allow only variables subject to conjunctions of numerical constraints, which are inadequate to represent many algorithms. (However, there is no guarantee that these more complex structures can be reasoned about satisfactorily in all cases.)

An implementation of developmental evaluation embodying most of the principles described in this paper is operational in DESIGNER, an initial version of an automatic algorithm design system. Algorithms are represented as collections of object instantiations in an object-oriented system with a few simple forms of inheritance; developmental evaluation operators are implemented in Lisp; and evaluation control rules are written in the production-system language OPS5 [16]. This implementation has evaluated algorithms in several task domains, including geometric algorithms (such as finding the convex hull of a set of points), set operations, and numeric algorithms (such as Fibonacci and recursive factorial). While our current system does not design these algorithms, we have worked out detailed hypothetical syntheses for some designs and are now undertaking a reimplementation of DESIGNER that should result in rapid progress towards a complete design system.

Before describing the implementation, we would like to stress that although it has been guided by a detailed study of human behavior, we do not claim that every detail of the system, or even every representation and operator it uses, has a cognitive equivalent. For example, the level of formality and detail in our algorithm language approximates that which people use to describe algorithms to one another. But at the same level of detail, variations in experience may result in different sets of algorithm building blocks for different people. We do believe, however, that a DESIGNER-like system has a greater chance of exhibiting the flexibility, robustness, and creativity necessary for the design of nontrivial algorithms than an automatic programming or design system based on techniques such as formal derivation. Reasons for this belief are discussed in the companion paper in this issue, which provides a more general description of our approach to the automation of algorithm design.

II. REPRESENTING ALGORITHMS FOR DESIGN AND EXECUTION

In this section, we describe the language to set the context for our description of developmental evaluation. Although we believe that languages based on the same principles as the one described here will be useful for design, we are not claiming that successful design requires this particular language.

A. An Algorithm Language

Part of our research is the development of a language called AL (Algorithm Language) for representing partial designs at any step of the creation process.2 In developing AL, we tried to identify a set of algorithmic objects that correspond closely to the conceptual building blocks our subjects use. We call such blocks components. Components are connected by links between ports (for input or output) of the components. A few basic component types are predefined, but new components may also be defined at any time in terms of other components. Assertions state facts about components and data objects. A collection of components, links, ports, and assertions grouped together forms a network called a configuration. Configurations, which are the partial algorithm descriptions, bear some resemblance to circuit diagrams, where the links are wires carrying signals between electronic components.

Assertions are crucial to our representation of algorithms because the variations in processing that make configurations meaningful come from the assertions on the components. Our assertional language is a variant of first-order predicate calculus that can express the state of computations over time. Some assertions are predefined as being relevant to execution of all instances of a given type; for example, all selects must have a composite object as one of their inputs. Other assertions are added during the design phase to specific component instances; for example, a particular generator produces elements from its input collection in increasing order, or a particular test places its input numerical item on the true-exit port if the value of the item is greater than one. An assertion may also refer to the complexity of the algorithm, for example, that a test must be performed in time linearly proportional to the size of the input in the worst case.

Only a brief overview of AL's type hierarchies of components, ports, links, and assertions will be given here, since a detailed language description is the subject of another report in preparation [34].

Components: In AL, some components represent steps in algorithms: applies create new data based on their inputs; selects extract an element from an input collection; tests conditionally alter the data flow; compares report the relationship between two items; generators produce individual items from a collection in any specified order; and recursive-calls apply a process recursively. Other components, memories, hold a representation of an object or collection.

Ports: Input ports and output ports represent data inputs and outputs of components, while signal ports serve control functions on specific component types, such as resetting a generator.

Links: Links usually connect components or configurations at the same level of the hierarchy. Special kinds of links, vertical-input links and vertical-output links, connect components to their refinements.

2 Other areas in our research include: implementation of operators for editing algorithms expressed in AL; collection of algorithm design heuristics in a number of task domains; and development of a problem-solving architecture to handle the mixture of top-down and opportunity-driven goals present in algorithm design. In addition, we have also done preliminary work on specification of a subsystem for reasoning with visual images in geometric algorithm design. For more details of this work, see the companion paper in this issue and past publications on DESIGNER [21]-[23].


Assertions: Assertions are classified in AL by the domain of the operations they describe, for example, numerical assertions and geometrical assertions. When used in a data-flow configuration, each assertion instance is assigned a role in accordance with its use. Roles relevant to developmental evaluation are: operation, to specify the operation of an active component; description, to describe a data item; precondition and postcondition, to assert that certain conditions hold before and after the execution of a component; and complexity, to state the time or space required for the execution of a component.

Generators cause certain components to be repeatedly executed. As the algorithm design proceeds, it may be necessary to represent this repeated computation explicitly, which we do with a loop. We classify the computation in a loop into separate configurations by function: the initialization, the loop-body, the repetition (how inputs for the next iteration should be computed using the outputs from the iteration just completed), and the termination. In the graphic representation of our language, all configurations belonging to the same loop are enclosed in one box in a special format: initialization at the left side, termination at the right side, and repetition along the bottom. All parts of a loop-box are optional (although the absence of an essential part may lead to a semantically incorrect design). During refinement, it is expected that some of the parts will be unspecified, implying that default processing is in effect for those parts. (The sketch following this paragraph renders these objects as data structures.)
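The following sketch approximates the AL objects just described (components, ports, links, assertions, configurations, and the four-part loop-box) as Python data structures. It is a hypothetical rendering for illustration only; DESIGNER itself represents these objects as instantiations in a Lisp-based object-oriented system.

    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple

    @dataclass
    class Assertion:
        role: str        # "operation", "description", "precondition", ...
        predicate: str   # e.g. "Convex-hull(Output, Input)"

    @dataclass
    class Port:
        name: str        # e.g. "add-element", "true-exit"
        kind: str        # "input", "output", or "signal"

    @dataclass
    class Component:
        ctype: str       # "apply", "select", "test", "memory", ...
        ports: List[Port] = field(default_factory=list)
        assertions: List[Assertion] = field(default_factory=list)
        refinement: Optional["Configuration"] = None  # subconfiguration, if refined

    @dataclass
    class Link:
        source: Tuple[Component, Port]   # (component, output port)
        target: Tuple[Component, Port]   # (component, input port)

    @dataclass
    class Configuration:
        components: List[Component] = field(default_factory=list)
        links: List[Link] = field(default_factory=list)

    @dataclass
    class LoopBox:
        # All four parts are optional; a missing part implies default processing.
        initialization: Optional[Configuration] = None
        loop_body: Optional[Configuration] = None
        repetition: Optional[Configuration] = None
        termination: Optional[Configuration] = None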

Before presenting an example of AL's use, we compare AL to traditional data-flow languages. Both AL and traditional data-flow languages use a directed graph representation. The execution sequencing is governed by the data flow (with null tokens transmitted for synchronization when necessary), and the data movement is forward-on-availability rather than fetch-on-demand.3 On the other hand, AL is not a "minimalist" data-flow representation. Our goal is to use objects that occur naturally as algorithmic steps, not to limit the system to a minimal set of primitive components. To avoid proliferation of object types, a design idea is initially represented as a component of approximately the right type with descriptive assertions. As design proceeds, any component representing a complex process that does not correspond directly to a single action "known" to DESIGNER may be refined into a complete configuration called a refinement configuration. This ability to refine components into subconfigurations, along with the ability to define new concepts by adding assertions to simpler concepts, provides a natural abstraction mechanism in AL.

B. A Sample Algorithm Description

We now illustrate the use of AL by presenting a specification of the convex hull problem and a configuration showing the state at one point in a hypothetical synthesis of an algorithm that meets the specifications. Specified informally (as it was for our subjects), the problem is to find the convex hull of a set of input points. A convex hull is defined as either a convex polygon that encloses all the input points and whose vertices are a subset of the input points, or the set of points on that polygon. This ability to view the output as either a set of points or a polygon turns out to be important during the design.

3 A good summary of these data-flow concepts may be found in Chapter 29 of the Handbook of Software Engineering [26].

Assertions:
    Precondition on CH:  Type(Input, point-set)
    Operation on CH:     Convex-hull(Output, Input)
    Complexity on CH:    Runtime(CH, O(|Input|))
    Precondition on S1:  S1-input ⊆ Input
    Operation on S1:     Output(S1-output, arbitrary-element-in(S1-input))
    Postcondition on T1: On-hull(T1-output-true-exit, Input)
    Precondition on M2:  Type(M2-input, point)

Fig. 1. Partial algorithm design for generating a sequence of hull points.

An input to DESIGNER that represents this problem is a configuration with an input memory (labeled Input in Fig. 1) connected to an apply (labeled CH) connected to an output memory (Output). The required input/output relationship is described by an assertion of role operation on CH. We use a dot notation to indicate part-whole relationships; if Output contains a polygon, Output.vertices indicates the vertices of the polygon in Output. The assertion, which will be abbreviated henceforth as Convex-hull(Output, Input), is written in expanded form as:

(Type(Output, convex-polygon) ∧ Output.vertices ⊆ Input
     ∧ (∀pt)(pt ∈ Input → Inside(pt, Output)))
∨ (Type(Output, point-set) ∧ Output ⊆ Input
     ∧ (∃cp)(Type(cp, convex-polygon)
         ∧ (∀pt)(pt ∈ Output → On-boundary-of(cp, pt))
         ∧ (∀pt)(pt ∈ Input → Inside(pt, cp))))
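As an illustration of how such a specification can be evaluated directly against concrete data, the following sketch implements the first disjunct (Output viewed as a convex polygon). It is our hypothetical rendering: Inside is realized here by a standard cross-product test, whereas in DESIGNER that role is played by presupplied geometric knowledge.

    def cross(o, a, b):
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

    def inside(pt, polygon):
        # True if pt is inside or on the boundary of a convex polygon
        # given as counterclockwise vertices (boundary counts as inside,
        # matching the specification's convention).
        n = len(polygon)
        return all(cross(polygon[i], polygon[(i+1) % n], pt) >= 0
                   for i in range(n))

    def satisfies_spec(output_vertices, input_points):
        # First disjunct: Output is a convex polygon whose vertices are
        # a subset of Input and which encloses every input point.
        return (set(output_vertices) <= set(input_points)
                and all(inside(pt, output_vertices) for pt in input_points))

    square = [(0, 0), (4, 0), (4, 4), (0, 4)]
    print(satisfies_spec(square, square + [(1, 5)]))  # False: (1, 5) escapes
    print(satisfies_spec(square, square + [(2, 2)]))  # True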

This specification has several interesting properties. Note that in both the English and AL specifications, the mechanism or efficiency of any algorithm satisfying the specifications is not constrained. There is no concern with atypical cases yet; for example, a point on the boundary of a polygon must be defined to be inside the polygon for this specification to be strictly correct. People usually consider such details later in the design process, and this postponement of concern with the details is also characteristic of DESIGNER. We also assume that the Inside predicate may be evaluated directly, since an operator to recognize examples of enclosure is available from presupplied knowledge of the task domain (geometry). The ability to use predicates from the task domain in this manner contributes to the informality of our algorithm specifications.

The configuration of Fig. 1 represents an intermediate state in an attempt to get an algorithm for this specification without using sophisticated design principles.4 The kernel schema is a transfer paradigm, similar to that described by Barstow [5]. In this schema, the points in the input set are generated by the producer and built into an output set by the consumer. During a previous part of the design (not described here) this kernel schema was refined into a loop. The body of the loop has been refined into a configuration producing the desired output, which is shown in Fig. 1. On components, assertions with operation roles specify the functionality of those components, assertions with precondition roles are expected to be true about the inputs before the component is executed, and assertions with postcondition roles should be true about the outputs after the component finishes executing. On each iteration of the loop, the algorithm arbitrarily selects a point (in select S1) from the set of input points that have not yet been considered (stored in memory M1) and tests whether the selected point is on the hull (in test T1). The point is added to the set of points found to be on the hull so far (stored in memory M2) if the test returns true. Some of this functionality is conveyed by the port types used as labels in the figure; if the test returns true, an item will be placed on the true-exit port of the test, and then, since that port feeds M2's add-element port, added to the contents of M2. In this kind of loop, defaults handle the facts that M1 is initialized to the contents of Input, that a different point is selected by S1 on each loop iteration, and that the loop terminates when all the points in M1 have been considered.

Also, built-in knowledge about the kernel schema has led to a hope (recorded as an assertion) that the algorithm will be linear in the number of input points. The belief that this schema leads to a linear algorithm arises because the producer should be executed at most once for each input point to test all the points. The refinement has not yet been evaluated to analyze its run time, so the fact that this belief is incorrect (if the test cannot be performed in constant time) has not yet been discovered. We will continue this example in Section III-C.

During developmental evaluation, assertions are also used to describe items, which may represent concrete or symbolic data objects passed between components (and stored on links). Items have associated domains, such as geometry and arithmetic. Several items may be grouped together into a collection (which corresponds to a mathematical set or sequence). Fig. 2 shows the loop-body of the partially refined convex hull algorithm in the middle of an evaluation sequence. The symbolic items g and p have assertions on them indicating the results of the data flow through the components. We use time-copies of items to indicate that one item is essentially the same object as another, but that it has acquired new assertions indicating different information known about it as a result of flowing through components. The predicate Time-copy(p, g) indicates that p is just g later in time and inherits its assertions, so that p also represents a point that is an element of the input set.

4 The convex hull algorithm that eventually results from this design sequence is not a particularly efficient one. Many other convex hull algorithms have been developed, and the problem can be solved in O(N log N) time. A more experienced algorithm designer might be able to develop such an algorithm from a divide-and-conquer kernel schema, for example.

[Fig. 2 diagram: memory M1 feeds select S1, which feeds test T1; T1's true-exit feeds an add-element port.]

Assertions:
    Description on g: Type(g, point) ∧ (g ∈ M1)
    Description on p: Time-copy(p, g) ∧ On-hull(p, Input)

Fig. 2. Symbolic items during execution of partial convex hull design.

Ensuring that a time-copy item has the right properties is not a trivial task. Simply collecting the assertions associated with previous time-copies of the item (what is implemented currently in DESIGNER) only works in simple cases. Suppose, for example, that assertions are being made about a set of points. There may be assertions about whether individual points are on or off the convex hull, and there may also be an assertion about the point set as a whole containing only points that are on the hull. Now if a point is added to or removed from the point set, this can change the validity of the assertions about whether other points are on the set's convex hull and about whether all points in the set are on the hull. Keeping the validity of these assertions updated is basically the same problem that occurs in any form of nonmonotonic reasoning. Thus, either there needs to be some way of determining which operations are liable to invalidate previous assertions, or all assertions must be rechecked after any operation that might affect them. There is also an issue in the implementation of the inheritance mechanism. If an inheritance method that collects assertions from all previous time-copies of an item is used, either there must be a provision for cancellation links or there must be a way to indicate how far back to go when collecting assertions. In the latter case, when assertions are invalidated, a new set of valid assertions can be collected and explicitly stored with that time-copy, and that object is marked as a stopping point in further collection of assertions. The semantics of different implementations to handle this problem have been studied in detail elsewhere [37].
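One possible shape for the stopping-point scheme just described is sketched below; the names Item, local, and consolidated are our inventions, not DESIGNER's.

    class Item:
        def __init__(self, name, prev=None):
            self.name = name
            self.prev = prev            # earlier time-copy, if any
            self.local = []             # assertions added at this step
            self.consolidated = False   # True: stop collecting here

        def assertions(self):
            # Walk back through time-copies, stopping at a consolidation
            # point; one way to keep invalidated assertions from leaking
            # forward (the nonmonotonic-reasoning problem noted above).
            collected, item = [], self
            while item is not None:
                collected.extend(item.local)
                if item.consolidated:
                    break
                item = item.prev
            return collected

    g = Item("g"); g.local = ["Type(g, point)", "g ∈ M1"]
    p = Item("p", prev=g); p.local = ["On-hull(p, Input)"]
    print(p.assertions())   # p inherits g's assertions

    # After an operation invalidates earlier facts, re-store the valid
    # set and mark p as a stopping point for further collection:
    p.local = ["Type(p, point)", "On-hull(p, Input)"]
    p.consolidated = True
    print(p.assertions())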

III. DEVELOPMENTAL EVALUATION IN DESIGNER

Now we are ready to describe how algorithms expressed in AL may be executed and analyzed. We explain the role of developmental evaluation in design, describe the steps of the process, and illustrate its use with a fragment of a design sequence that uses the configuration introduced in Section II-B.

A. The Uses of Developmental Evaluation

One way to view the design process in our framework is to consider each change to a configuration as a modification of the current set of constraints on the behavior of the algorithm. The possible configuration changes are not limited to producing only satisfiable sets of constraints, that is, configurations that can be refined into correct algorithms. Altering one part of the configuration may cause problems in another part, since refinement is often locally driven, without consideration of possible global effects of changes. Propagation of altered constraints helps to focus attention on potential problems in the development of other parts of the algorithm. Propagation also brings together new combinations of facts that may lead to additional design opportunities. This selective propagation of configuration-derived constraints is driven by developmental evaluation.

Advocates of the formal derivation approach to algorithm design will argue that developmental evaluation would not be necessary if the design operators were guaranteed in advance to produce only useful, well-formed configurations. It is not our intention here to argue that such correctness-preserving transformations are the wrong approach to algorithm design; when such transformations are available and known to be useful, they should of course be applied. On the other hand, if it is not obvious which transformation best applies to a partial design, a system that relies exclusively on such techniques will not make much progress. Our subjects often produce novel ideas by doing the right thing without a fully understood reason, and we do not wish to prevent our system from making similar accidental discoveries. Rules encoding the experience of good designers, combined with the results of developmental evaluation, would prevent the system from searching through all possible designs.

The principal activity in developmental evaluation is a combination of symbolic and test-case execution known as trial execution, or more simply, execution. Execution is an information-gathering strategy that executes the current description of the algorithm on symbolic or actual data objects to expose problems or opportunities for refinement. Execution uses assertions on items that arrive on the input links of a component to produce appropriate assertions on items on the output links. The assertions on the output items describe the results of applying the operations defined by the component and its assertions to the input items. Developmental evaluation continually evaluates assertions and compares the results against expected values determined locally by the component being executed. If an expectation is violated, then a difficulty (containing a subgoal to solve, and perhaps a method for solving it) is posted to notify the rest of the system of the inconsistency. Knowledge of the inconsistency, combined with algorithm design heuristics, should guide the system in selection of the next operator to apply. This process allows detection of interactions between components in a uniform and efficient manner.
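A minimal sketch of this expectation-checking cycle appears below. It is hypothetical: assertions become Python predicates, and a posted difficulty is just a record on a list, standing in for the production rules that would respond to it.

    difficulties = []

    def post_difficulty(component, subgoal):
        # A difficulty carries a subgoal for the rest of the system to solve.
        difficulties.append({"component": component, "subgoal": subgoal})

    def trial_execute(name, expectation, input_assertions):
        # Compare what the input items assert against the component's
        # locally determined expectation; post a difficulty on violation.
        if not expectation(input_assertions):
            post_difficulty(name, "make the input satisfy the expectation")
            return None
        return input_assertions + [f"passed-through({name})"]

    # Suppose T1 expects its input item to be asserted to be a segment:
    def expects_segment(asserts):
        return "Type(item, segment)" in asserts

    trial_execute("T1", expects_segment, ["Type(item, point)"])
    print(difficulties)   # one posted difficulty, naming T1's subgoal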

One of the most important choices in developmental evaluation is whether symbolic or concrete data is used. Several factors influence this decision. For example, symbolic execution gathers more complete information for verification. Also, it may be better to use a description such as "an integer less than 10" than actually to pick one when the system is looking for an example or counterexample and does not yet have all the constraints on an object. On the other hand, symbolic execution can be more expensive if considerable expression simplification is required. Furthermore, discoveries caused by the combination of unexpected assertions and previous experience are more likely to result from working with concrete examples than from abstract reasoning based on the results of symbolic execution. In the case of the geometry domain, manipulation of mental images provides new kinds of symbolic reasoning abilities. Subjects seem to be able to use these abilities to solve problems even if they do not know how to derive an answer formally.

The constraint propagation performed in DESIGNER is selective, and the nature of the selectivity is determined by the goal of the execution effort. This is not only efficient, but also defines a simple criterion for deciding when evaluation of a configuration is complete. For example, if the goal is to guarantee that the algorithm is correct, then developmental evaluation is tantamount to a verification of partial correctness using the data-flow language. If the goal is to analyze the efficiency of the algorithm, then only those constraints relevant to the time and space use of each component are propagated. Since many of the constraints in the design are often irrelevant to the active goal, this selectivity limits the computation necessary for automated design.5 However, the development of a surprising assertion can change the active goal or modify the type of information collected during developmental evaluation. The design goals currently satisfiable through developmental evaluation include:

* Checking the consistency of the design, including checking for missing inputs or specific errors such as data-type conflicts between components.

* Given actual test-case data for some or all of the input, applying the algorithm to the data to compare the resulting output items to expected values.

* Producing symbolic values for outputs in terms of the inputs. When combined with a theorem prover, this allows checking that the algorithm (or component) satisfies its specifications as indicated by its preconditions and postconditions and those of any subcomponents.

* Analyzing the time complexity of the algorithm.

We have also identified several other goals for possible implementation: explanation, space use analysis, checking for boundary conditions, test-case example generation using assertions on symbolic items, and induction of loop invariants from repeated executions. These goals would result in functionality similar to that of the symbolic execution systems discussed in Section IV. (A toy rendering of this goal-keyed selectivity follows.)
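The goal-directed selectivity described above can be pictured as a filter on assertion roles; the mapping below is a hypothetical simplification (DESIGNER encodes this control knowledge in OPS5 rules).

    # Hypothetical mapping from active design goal to the assertion
    # roles worth propagating under that goal.
    RELEVANT_ROLES = {
        "check-consistency": {"precondition", "description"},
        "analyze-complexity": {"complexity"},
        "verify": {"precondition", "postcondition", "operation"},
    }

    def propagate(assertions, goal):
        roles = RELEVANT_ROLES[goal]
        return [a for a in assertions if a[0] in roles]

    assertions = [("description", "Type(g, point)"),
                  ("complexity", "Runtime(T1, O(N))"),
                  ("postcondition", "On-hull(p, Input)")]
    print(propagate(assertions, "analyze-complexity"))
    # only the complexity assertion survives under this goal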

B. The Developmental Evaluation Process

The processing follows a standard sequence of steps, described below (a schematic sketch follows step 8). The process is similar to that of the symbolic interpretation used in the Programmer's Apprentice [31]. In that system, however, the steps are not explicitly identified but rather result from the system's attempt to verify that a plan achieves its intended purposes. Also, in our system, the necessary processing may be limited by the presence of actual data items, which simplify the assertions produced and reduce the number of execution paths that need to be considered.

1) Do the necessary preprocessing: Evaluation starts with some preprocessing of a data-flow configuration built by design operator application. The main purpose of this initialization stage is identification of the data items to use as input. Unlike similar processes in other systems, developmental evaluation in DESIGNER may be started, stopped, or resumed in midstream and interwoven with other design operations. If it is being resumed, there should be items on the links of the component that can serve as data for execution. Otherwise, input items must be generated, usually by executing the components whose outputs connect to the input links of the current component. If there are no connected components, symbolic input items are created from the precondition assertions on the input ports of the current component.

5 Other advantages of explicitly considering design goals in automated systems are explained in Mostow's survey of design research [25].

2) Check that the component's preconditions are met: There may be preconditions on components in the form of assertions. These assertions may come from the predefined base of knowledge about the type of component being executed or may be added during the design phase to the specific instance of the component. Depending on which goals are active, various subsets of these preconditions (e.g., data types, domain restrictions, boundary conditions, run-time constraints) are checked. In effect, this step tries to prove the preconditions of the component using the assertions guaranteed by the specifications in conjunction with the assertions on items that flow into the component.

A difficulty is posted during this step if a precondition assertion, evaluated with input items as parameters, fails to return true. The difficulty notifies the rest of the system which precondition has not been satisfied, so that other processes may fix the problem.

3) Create input profiles from the input items: An input profile is created for each input port of the component being executed. The input profile is a new item that summarizes all the items that are on links attached to a specific input port. Each input port may have several input links, and each link holds an item. (If there is an input port with no link attached to it, a difficulty is posted, with the suggestion of adding a link to fix the problem.) All assertions on items going into the same port that have the same role are combined (according to a function that depends on the role) into a single assertion. The end result of the input profile step is that each input port has one profile containing at least a descriptive assertion, and possibly assertions of other roles.

An example of how assertions are combined is that all assertions with role "description" from a single item are made into a conjunction, and those conjuncts from separate input items are made into a conjunction of conditionals (path-condition → description) that is attached to the profile associated with the port. Another example is that if the appropriate goal is active, analysis assertions are combined; for rough worst-case time complexity estimation, the frequencies of the items are added, ignoring constants and low-order terms.

4) Execute refinements, if necessary: If the component is defined by a subconfiguration, and if evaluation of the refinements is desired, then the input profiles are copied to the corresponding vertical-input links and the sequence of steps defined here is applied recursively to each component in the subconfiguration. When evaluation is complete, the outputs will be on the vertical-output links.

A problem that may occur during execution is that a component may have the appropriate number of inputs and outputs, but may have assertions that describe the results of operations that are not described algorithmically. That is, the process required to produce an output has not been decomposed into a sequence of functions known (to DESIGNER) to apply to the input. This situation indicates that the component needs refinement. If the current goal is to detect places for refinement, rather than just to evaluate the component at the current level of abstraction as if it worked, a difficulty is posted to notify the system that the design process should be invoked recursively on this component.

5) Create an output profile for each output port: Each output port has an operational assertion whose parameters are a (possibly empty) subset of the input profiles. The result of evaluating the operational assertion is an output profile for the port. When analyzing a component's subconfiguration rather than the component itself, the items on the vertical-output links associated with each port serve as the output profile.

If an output port has no associated operational assertion, a difficulty is posted, suggesting that an assertion be added. If the operational assertions on the ports require parameters (excluding constants already known) that are not covered by the input ports on the component, a difficulty is posted, with the suggestion of adding a port to fix the problem. If there are not enough output ports, indicated by an operational assertion defining an output for a port type that has no corresponding instance on the component, the posted difficulty will suggest adding a port.

6) Check that the component's postconditions are met: Postconditions have the same origins, format, and use as preconditions, except that output profiles are substituted into postconditions where outputs are referenced. This step tries to prove the postconditions of a component from its preconditions in combination with the assertions on the output profiles. If a postcondition assertion does not evaluate to true with output profiles as parameters, a difficulty is posted to notify the system of the unsatisfied postcondition.

7) Create output items from the output profiles: A time-copy of the output profile is placed on each link emanating from an output port.

8) Do the necessary postprocessing: The exact actions differ depending on the type and instance of the component being executed. A common action, guided by control assertions, is removing items from a component's input links to allow further processing. Other possible actions are copying items back up to higher-level configurations after execution at the lower level is complete.
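The schematic sketch promised above follows. It compresses steps 1-8 into a single function over dictionary-based components; refinement execution (step 4) and postprocessing (step 8) are stubbed out, and all names are hypothetical rather than DESIGNER's.

    difficulties = []

    def post_difficulty(component, subgoal):
        # Difficulties notify the rest of the system; design rules respond.
        difficulties.append((component["name"], subgoal))

    def combine(items):
        # Step 3: one profile per port; here we simply pool descriptions.
        return {"description": [a for item in items for a in item]}

    def evaluate(component, get_input_items):
        items = {p: get_input_items(p) for p in component["inputs"]}    # step 1
        for port, port_items in items.items():
            if not port_items:                                          # missing link
                post_difficulty(component, f"add a link to port {port}")
                return None
        for pre in component.get("preconditions", []):                  # step 2
            if not pre(items):
                post_difficulty(component, "satisfy precondition")
                return None
        profiles = {p: combine(items[p]) for p in component["inputs"]}  # step 3
        # Step 4 (recursive execution of a refinement) is omitted here.
        op = component.get("operation")
        if op is None:                                                  # step 5
            post_difficulty(component, "add an operation assertion")
            return None
        outputs = {p: op(profiles) for p in component["outputs"]}
        for post in component.get("postconditions", []):                # step 6
            if not post(outputs):
                post_difficulty(component, "satisfy postcondition")
                return None
        # Steps 7-8: place time-copies on outgoing links and postprocess;
        # this sketch just returns the output profiles.
        return outputs

    T1 = {"name": "T1", "inputs": ["P1"], "outputs": ["true-exit"],
          "preconditions": [lambda items: "point" in items["P1"][0]],
          "operation": lambda prof: {"description":
                                     prof["P1"]["description"] + ["on-hull"]}}

    print(evaluate(T1, lambda port: [["point", "in-M1"]]))
    print(difficulties)   # empty: no difficulty was posted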

Currently, the flexibility needed to satisfy the goals in developmental evaluation is provided by allowing a variety of processing options while executing the sequence of steps described above. More than one option may be active at a time; which options are activated depends on the reason for trying developmental evaluation. For example, the standard-symbolic-execution option attaches assertions related to the functionality of the components to symbolic items, and the big-o-time option attaches assertions about the rough time complexity of executing the algorithm to the items and to the components they pass through. Options are not completely independent, and activating one may require activating others to compute prerequisite information. As a simple example, a space-time analysis option requires information about both space and time; if the appropriate information has not been computed previously, other options will be activated along with the original request.
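Computing which options to activate is a small transitive-closure problem; a toy sketch with an invented requirement table follows.

    # Hypothetical prerequisite table; real control knowledge lives in
    # DESIGNER's OPS5 rules.
    REQUIRES = {
        "space-time-analysis": {"big-o-time", "space-use"},
        "big-o-time": set(),
        "space-use": set(),
        "standard-symbolic-execution": set(),
    }

    def activate(requested):
        active, frontier = set(), set(requested)
        while frontier:
            opt = frontier.pop()
            if opt not in active:
                active.add(opt)
                frontier |= REQUIRES.get(opt, set())
        return active

    print(activate({"space-time-analysis"}))
    # the two prerequisite options are pulled in with the request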

determined by a simple default: unless directed to do oth-erwise by options or interrupted by another process, eval-uation generally proceeds by depth-first search of thegraph formed by the configuration. The 'search is re-stricted by the semantics of the data flows and bounded indepth when a component is sufficiently understood forpresent purposes (possibly from previous execution). De-velopmental evaluation terminates when an active goal issatisfied, a difficulty is encountered or if nothing remainsto be executed.The issue of termination of developmental evaluation

brings up a problem that we do not feel has yet been ad-equately resolved in current symbolic execution and re-lated systems: handling loops that may be repeated for anundetermined number of times, yielding a potentially in-finite number of execution paths to be taken into account.In the general case, of course, we do not want to try toguarantee that all loops can be completely analyzed, forsuch a guarantee implies that we would be able to deter-mine if any given algorithm ever halts. But even in re-stricted cases, it may not be necessary for design to verifythat a loop is completely correct. Our subjects are usuallyconvinced when a small number of test cases work, byrecognizing some pattern which they believe generalizescorrectly to all possible cases. We wish to capture the abil-ity to recognize familiar patterns from the structure of thealgorithm and from the results of developmental evalua-tion in our system. We do not think analysis will usuallybe difficult because most loops are not pathological; theyare designed with a particular simple behavior in mind.Most automated symbolic interpreters either execute

loops for a user-determined number of iterations or ignoreloops entirely. This approach is not helpful for us, bothbecause of the automatic nature of DESIGNER and be-cause too few iterations may cause bugs to be overlooked.One approach that has been taken [8], [40] is to try toautomatically derive and solve recurrence relations to ex-press the behavior of loops without the need for repeatedexecution. We are currently investigating an approach thatshares some of the characteristics of temporal abstractionin the Programmer's Apprentice. As mentioned before,we decompose loops into the standard parts of initializa-tion, the body, the repetition, and termination condition.If the number of iterations is not totally constrained dueto the prese ce/jpf symbolic data, we execute each partonce to che¢k ernal consistency and give the results toa set of rVles :h't can recognize familiar configurations

and results. Such rules could perform the temporal ab-straction necessary to determine the behavior of a loop inthe general case. The more powerful the pattern-recog-nizing heuristics possessed by the system, the easier theanalysis will be.We conclude this section by discussing how the nature

of assertions and the developmental evaluation procedurein DESIGNER allows us to solve posted difficulties. Sincethe roles that assertions play in configurations, (e.g., pre-conditions or complexity constraints) are known in ad-vance, many difficulties can be fixed using one of a smallset of procedures. When a specific port or link is knownto be missing, then the suggestion to add that port or linkas specified by the posted difficulty may be applied di-rectly. When a postcondition or precondition is unsatisfiedby the current design, then more complex algorithm de-sign heuristics must be applied. There could also be amechanism (not yet implemented) for determining, whensome condition is unsatisfied, what else has to be true inthe current context to make the entire condition true. Sucha mechanism was implemented by Smith [33] in a systemfor deriving the specifications for divide and conquer al-gorithms.

C. An Example of Developmental Evaluation

In this section we illustrate developmental evaluation with an example. The process is applied to the convex hull example to integrate a test predicate discovered in the domain space (geometry) into the current design. During the evaluation it is also noticed that the algorithm is not linear, as was originally hoped. Details of the discovery of the test predicate have been described elsewhere [22], and so will not be repeated here. The original goal was to find a test to determine whether a point is on the hull, but the discovered predicate tests whether a segment is on the hull (a line segment is on the convex hull of a set of points if all points in the input set are on the same side of that segment). Since it is known that the convex hull may be defined either by the vertices of an enclosing polygon or by the line segments that form the sides of the enclosing polygon, the discovered test seems sufficiently related to the desired test. The system decides to use the currently designed loop as much as possible, but to modify it to make use of the predicate. As a first approximation, the test is initially represented as an assertion on test T1, with the expectation that developmental evaluation will indicate what changes need to be made to the design. The Output predicate indicates that the output of the port indicated by the first argument is the result of evaluating the second argument. Here Points-one-side is a predicate that is true when all the points in its second parameter are on the same side of the line segment in its first parameter. If the predicate returns false, no item is put on the true-exit output.

Operation on T1:

    Output(T1-output-true-exit, Points-one-side(T1-input, Input))

Since this assertion has just been added to T1, T1 is evaluated to see if it is fully refined. The first problem that is uncovered is that the inputs are not well-defined, because the assertion of role operation requires different input than is currently available. First of all, there is only one input port, and there need to be two ports. Execution is therefore suspended, and another port (we will call the existing one P1 and the new one P2) is added. A link is attached to the new port, and preconditions on inputs to the test are added to further specify P1 and P2.

Precondition on T1:

    Type(T1-input-P1, segment) ∧ Equal(T1-input-P2, Input)

When evaluation of T1 is attempted again, there is no input on the link to P2, since it is not connected to a component's output. There is a design rule that applies to this problem:

If a link in the current design needs to be attached, and there is another output in the current configuration that will satisfy this precondition on the port at the end of the link, then connect the link to the output.

When this rule fires, the link into P2 will be attached to the Input memory. With all the inputs present, the precondition on the input to T1 is compared to the description assertions on the input items. The clause about P1 is not satisfied, since T1 is not connected to a component producing a segment. The previous rule will not apply, since no segment is present in the design, but another rule does apply:

If a precondition is not satisfied by a link attached to a port in the current design, but it is known that an apply component with an appropriate operation assertion can create a component with the required output, then add or splice in an apply to the configuration with the required assertion and connect its output to the port described by the unsatisfied precondition. (This is a rule that is not expected to be used, since its unrestricted application would lead to a combinatorial explosion of possible designs. It is only to be used when no other knowledge is available.)
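DESIGNER's rules are written in OPS5; purely as an illustration of their condition-action character, the two rules above can be caricatured in Python as follows (the state layout and all names are invented).

    def attach_existing_output(state):
        # "If a link needs to be attached, and another output in the current
        #  configuration satisfies the precondition on the port, connect it."
        for link in state["unattached_links"]:
            for out in state["available_outputs"]:
                if link["precondition"](out):
                    return ("connect", link["port"], out["port"])
        return None

    def splice_in_apply(state):
        # Last-resort rule: splice in an apply whose operation assertion can
        # create the required output (used only when nothing else matches).
        for link in state["unattached_links"]:
            op = state["domain_kb"].get(link["needed_type"])
            if op:
                return ("splice-apply", op, link["port"])
        return None

    RULES = [attach_existing_output, splice_in_apply]  # ordered by preference

    def step(state):
        for rule in RULES:
            action = rule(state)
            if action:
                return action
        return ("post-difficulty", "no applicable rule")

    state = {"unattached_links": [{"port": "T1-P2",
                                   "precondition": lambda o: o["type"] == "point-set",
                                   "needed_type": "segment"}],
             "available_outputs": [{"port": "Input-output", "type": "point-set"}],
             "domain_kb": {"segment": "draw-segment"}}
    print(step(state))   # ('connect', 'T1-P2', 'Input-output')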

In this case, the output of the apply should be a segment. This knowledge is used to find the operation assertion for the apply by searching the domain knowledge base of the system for an assertion producing a segment. This search suggests draw-segment, so apply A1 is added with a draw-segment assertion that takes a head point and a tail point and produces the segment connecting the two points.

Operation on A1:

    Output(A1-output, draw-segment(A1-input-1, A1-input-2))

Since A1 has just been added, that component is evaluated to see if it is complete. The evaluation points out the difficulty that ports and links are missing from the segment-constructing component and collects information about the inputs. The link from S1 to A1 will furnish one of the inputs to the apply. Unless directed otherwise, the same input should not be used twice for the same component (since that may produce only degenerate cases of the desired output), so another point is needed. The description of Output contains the fact that the output may be viewed as a polygon, so that the process of finding a convex hull may be seen as building a polygon. Possible polygon fragments may be constructed by repeated extension from the most recently added vertex. Therefore, the other input to A1 is obtained by following the memory of the hull so far (M2) with a select component (S2) that gets the point most recently added, and connecting the output of S2 to A1.

Operation on S2:

    Output(S2-output, most-recent-element-in(S2-input))

Execution of S2 and A1 proceeds correctly, but M2 must now be changed to a component that manipulates polygon fragments by adding and deleting segments rather than adding or deleting from a point set. This is done by changing the precondition on M2. The configuration resulting from all the changes described is shown in Fig. 3. Further design will focus on the details of the initialization and termination of the loop, which were not previously addressed in refinement of the loop-body, but we will not discuss this here.

[Fig. 3 diagram: Input memory connected to apply CH, connected to Output memory.]

Assertions:
    Precondition on CH:  Type(Input, point-set)
    Operation on CH:     Convex-hull(Output, Input)
    Complexity on CH:    Runtime(CH, O(|Input|³))
    Precondition on S1:  S1-input ⊆ Input
    Operation on S1:     Output(S1-output, arbitrary-element-in(S1-input))
    Operation on A1:     Output(A1-output, draw-segment(A1-input-1, A1-input-2))
    Operation on T1:     Output(T1-output-true-exit, Points-one-side(T1-input, Input))
    Precondition on T1:  Type(T1-input-P1, segment) ∧ Equal(T1-input-P2, Input)
    Postcondition on T1: On-hull(T1-output-true-exit, Input)
    Precondition on M2:  Type(M2-input, segment)
    Operation on S2:     Output(S2-output, most-recent-element-in(S2-input))

Fig. 3. Convex hull algorithm after changes (changed assertions were shown in boldface in the original figure).

Another use of developmental evaluation is analysis of the algorithm's run time, in a manner similar to symbolic execution. Analysis proceeds by propagating constraints about the number of different items that may occur on links that force re-execution of certain components. This is equivalent to determining the size of the set of items on a link as a function of the size of the input, when the set is considered as a temporal abstraction of all items that may flow on the link over the entire computation. The repetition and termination parts of loops are especially important to this analysis. In this example, at a later point in the design, it is observed that M1 will have to be reset, perhaps several times, if the select does not pick a point on the hull as a starting point. This means that the worst-case run time of this piece of the algorithm is O(N · N · T), where N is the number of points in Input and T is the time to execute T1 for each segment. Since the formulation of the test in the domain space requires comparing the input segment to each point in Input, the test takes time proportional to N. The total time is therefore O(N^3) in the worst case. This assertion is shown in Fig. 3, and it contradicts the original expectation of linear time. In this example, only starting with a new kernel schema will yield a more efficient algorithm.
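In our notation the argument is a two-line derivation, with N = |Input| and T the cost of one execution of T1 (up to N resets of M1, each driving up to N segment tests):

\[
  T_{\text{total}} = O(N \cdot N \cdot T), \qquad T = O(N)
  \;\Longrightarrow\; T_{\text{total}} = O(N^3),
\]

which matches the Runtime(CH, O(|Input|^3)) assertion of Fig. 3.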

IV. RESEARCH RELATED TO DEVELOPMENTAL EVALUATION

Previous research on techniques in the category that we have defined as developmental evaluation focused primarily on the goals of testing and verification, although other goals such as debugging, understanding, explanation, and analysis have also been addressed. Here, we survey the research in this area that is most relevant to algorithm design. The first part of this section summarizes the research in software engineering and artificial intelligence that we have found to be relevant to developmental evaluation, presented roughly in historical order. The second part lists some support requirements which, while not a primary focus of our research, would be useful in developing a fully operational developmental-evaluation subsystem within DESIGNER.

A. Symbolic Execution and Similar Techniques

EXDAMS [3], one of the first sophisticated tools designed to help programmers understand and debug high-level language programs, influenced many later symbolic execution systems although it considered only concrete data. EXDAMS monitored program execution on actual data and stored a history of variable values to help the programmer (or additional debugging aids) detect results that differed from expectations. It included interactive graphic displays of program execution, maintenance of data-flow links, and the ability to replay the history forwards or backwards using the data-flow information, but analysis and test-case generation were not automated.

In the 1970's, several systems introduced the concept of symbolic execution as an extension of normal program execution, allowing input values to be symbols in addition to concrete data objects. In addition to the more common uses of testing and verification discussed below, symbolic execution, in conjunction with other techniques, has also been used to analyze the run-time performance of simple programs [20], [40]. For a more detailed survey of the uses of symbolic execution for program testing specifically, see the chapter categorizing the different methods and their applications in the book Computer Program Testing [11].6
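The core idea can be conveyed in a few lines. The toy interpreter below is our own illustration, not any of the surveyed systems: when the inputs are concrete numbers it performs test-case execution, and when they are symbols it builds a symbolic expression for the output.

class Sym:
    """A symbolic input value; arithmetic builds up an expression string."""
    def __init__(self, expr): self.expr = expr
    def __add__(self, other): return Sym(f"({self.expr} + {_as_expr(other)})")
    def __radd__(self, other): return Sym(f"({_as_expr(other)} + {self.expr})")
    def __mul__(self, other): return Sym(f"({self.expr} * {_as_expr(other)})")
    def __rmul__(self, other): return Sym(f"({_as_expr(other)} * {self.expr})")
    def __repr__(self): return self.expr

def _as_expr(v):
    return v.expr if isinstance(v, Sym) else repr(v)

def program(x, y):
    # the program under analysis: runs on concrete ints or on Syms alike
    z = x * x
    return z + 3 * y

print(program(2, 5))                # test-case execution: 19
print(program(Sym("x"), Sym("y")))  # symbolic execution: ((x * x) + (3 * y))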

1) The EFFIGY system [15] suggested the use of symbolic execution for testing and verification. For testing, the programmer could inspect the execution tree to determine whether the program performed correctly and to identify test cases needing further examination. For verification, the programmer could see if the patterns in the results of symbolic execution suggested loop invariants or other assertions. We note that in a design system, such patterns could be a source of data to be examined by a discovery or assertion-induction mechanism.

6Test-case generation by methods other than symbolic execution [24], [27], [39] is usually language- and implementation-dependent and is not as helpful in high-level design.

2) The SELECT system [7] emphasized symbolic execution as an alternative to mechanical program verification. SELECT generated conditions on inputs that caused each path to be executed and produced specific input data to satisfy these conditions. The path conditions were expressed as systems of linear equalities and inequalities (a hill-climbing routine for nonlinear path conditions was suggested but not implemented). Also introduced were the notions of and techniques for a) simplifying path conditions at each step, so that if an expression is inconsistent the system can stop considering a particular path, and b) verifying user-supplied assertions. (A toy illustration of path conditions as linear inequalities appears after this list.)

3) DISSECT [19] drew ideas from EFFIGY and SELECT, using symbolic execution as a software testing tool. DISSECT relied on the user to specify a desired path with selection commands and then produced symbolic values for each output. DISSECT's author distinguished errors that affect the domain of a path from errors in the function computed by a path, and demonstrated that with symbolic execution users could reliably detect certain kinds of path-function errors but not path-domain errors.

4) The goals of ATTEST [9], [10] are similar to those of EFFIGY and SELECT: to generate test data that drives execution down a particular path, to detect nonexecutable paths, and to create symbolic representations of the outputs as functions of the inputs. ATTEST works on full ANSI Fortran (unlike previous systems, which were restricted to language subsets). Another feature is the addition of artificial constraints that simulate error conditions during execution to detect common errors such as array subscripts being out of bounds.

5) Dannenberg and Ernst [14] described a language and rules of inference that formalized symbolic execution for use as the basis of a mechanical verification-condition generator. The language and rules handle loops and procedures with multiple exits and formally treat expressions with side effects. However, this work did not treat path-selection problems, and its handling of complex control constructs and side effects was tied too closely to the syntax and semantics of the formal language definition to be adapted for our use in executing informal algorithm descriptions.
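The toy illustration promised above: a path condition represented as a conjunction of linear inequalities, with concrete test data found by search. The conditions are invented, and SELECT itself solved such systems directly rather than by enumeration.

from itertools import product

# invented path condition: x + y <= 10  AND  x - y >= 2  AND  x >= 0
path_condition = [
    lambda x, y: x + y <= 10,
    lambda x, y: x - y >= 2,
    lambda x, y: x >= 0,
]

def find_test_data(conds, lo=-20, hi=20):
    """Concrete input data satisfying every conjunct, or None if the
    path is infeasible over the searched range."""
    for x, y in product(range(lo, hi + 1), repeat=2):
        if all(c(x, y) for c in conds):
            return (x, y)
    return None

print(find_test_data(path_condition))   # -> (0, -20), driving this path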

Also, a symbolic evaluator for the language EL1, the base language of ECL [8], statically analyzes rather than dynamically interprets programs to describe their behavior. In addition to handling complex control structures, this system analyzes the behavior of loops and procedures more completely than previous systems. Loops are evaluated by deriving recurrence relations for the values of each variable changed within a loop as a function of the number of iterations through the loop (this is not always possible, as in the case of interdependencies between variables and complex conditionals). Procedures are analyzed by creating a template, a generalized description of the call environment, for each procedure the first time it is evaluated. The template is then used on future invocations of the procedure to minimize repeated analysis.
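The recurrence idea, for the simplest case, can be sketched as follows; the loop-description format is our own invention, and Cheatham's evaluator handled far more general recurrences.

# Toy sketch: for a loop whose body adds a loop-invariant increment to
# each variable, the value after n iterations has the closed form
# init + n * increment.
initial_values  = {"s": "s0", "i": "0"}     # values before the loop
loop_increments = {"s": "c",  "i": "1"}     # body: s = s + c; i = i + 1

def closed_form(var, n="n"):
    """Symbolic value of `var` after n loop iterations."""
    return f"{initial_values[var]} + {n}*{loop_increments[var]}"

print(closed_form("s"))   # s0 + n*c
print(closed_form("i"))   # 0 + n*1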

The systems described so far all analyze behavior from the program code and make no use of program specifications. More recent work, however, includes symbolic evaluation of specifications to help in the development process.

Meta-evaluation as a tool for understanding informal program specifications has been explored by researchers at USC-ISI [4] and at M.I.T. [18]. The USC-ISI meta-evaluation procedure is less rigorous than symbolic execution. For example, loop bodies are executed only once, and the exact path conditions under which statements are executed are not computed, since information that would be obtained from further analysis is not required to disambiguate informalities in the specifications. The M.I.T. meta-evaluation procedure is grounded in the actors formalism [41]. The actor model is based on objects that can send or receive messages, unifying the notions of data structures and procedures. Contracts are agreements about desired behavior between implementers of a model and its users. Meta-evaluation is used to show that contracts hold by showing that when a contract's preconditions are satisfied, the postconditions will hold (this is similar to developmental evaluation in DESIGNER). The preconditions and postconditions reference states of actors during the computation. The authors also suggested that meta-evaluation be used to answer user questions and to trace the implications of changes to the current implementation.

Cohen wrote a symbolic evaluator called KOKO [12] for the GIST high-level specification language. He noted that standard symbolic execution techniques could not handle GIST's very high level specifications of behavior, which include nondeterminism, reference by description rather than by actual name, and arbitrary constraints. KOKO formulates the task of symbolically executing a specification in the GIST language as the computation of nonobvious or surprising inferences from a set of axioms. Although the mechanism by which this is done, theorem proving, is different from that of previous symbolic execution systems, the same kinds of consequences are computed. A noteworthy feature of this system is that traces from the symbolic evaluator have been used as the basis of an explanation system [36]. KOKO's symbolic execution cannot be directly adapted to execute lower-level specifications since, among other reasons, KOKO can only execute loops in which the computation on each iteration is independent of the results of previous iterations.

The partition analysis method [11], [29] detects errors by comparing the results of symbolic execution of both the specification and the implementation. Partition analysis partitions the domain of the program into subdomains: all elements in a subdomain are treated the same way by both the specification and the implementation of a program. Similarities and differences between the two execution behaviors embodied in the partition are used to generate test data and to verify consistency between parts of the specification and corresponding parts of the implementation. Thus partition analysis is the only method described so far that can find missing-path errors. However, Richardson and Clarke's specification language does not contain enough abstract constructs to be directly useful for algorithm design.

The Programmer's Apprentice project [28], [31] includes a symbolic evaluation component that proves assertions about plan diagrams (which are design representations).7 This component, REASON [32], represents the reasoning strategies and justifications of conclusions within a Reason Maintenance System, a system embodying nonmonotonic logic so that conclusions that are no longer true after a design change are automatically retracted. Goals of deduction (subgoals of symbolic evaluation) trigger the firing of rules that suggest appropriate reasoning strategies to achieve the goal, resulting in a flexible and robust reasoning engine. In addition, REASON works with descriptions of programs at different levels of abstraction so that it can reason about the purposes fulfilled by implementations (links to specifications).

7The plan calculus of the Programmer's Apprentice is similar to AL as a language for representing processes. Features which distinguish AL from the plan calculus are that in AL control flow can be defined implicitly by data-flow links rather than by control-flow links (with a few exceptions), and that AL has only a small number of initial built-in components. Also, in the Programmer's Apprentice all loops must be expressed as recursive calls, while in AL they can be expressed iteratively.

A concept similar to developmental evaluation is that of simulation, suggested by Adelson and Soloway [1] as part of a model of the human design process that focuses on complete software systems rather than on algorithms. In this model, simulation of the current design plays a major role in software system creation. They describe the design process as repeated expansion of the partially completed design to add more detail by evaluating the present design, preparing a new set of refinements for the design, and integrating these refinements into the design. Evaluation is driven by means-ends analysis, in which the current design is compared to the goal design and the differences are used to choose between refinement strategies. Simulation is apparently one of the principal methods of determining differences between current and goal designs, but the exact simulation mechanism was not worked out. Simulation is also used to detect interactions between elements of the design and to uncover the existence of poorly understood elements.

B. Support Requirements for Developmental Evaluation

Developmental evaluation requires several forms of symbolic expression processing support. As described in this section, the types of processing necessary vary in complexity from data-flow analysis to theorem proving. We would like to connect the DESIGNER system to existing, more powerful general-purpose symbolic manipulation and theorem-proving systems, but we have not yet investigated enough to determine whether there are compatible systems with the necessary capabilities. Rather than try to build such systems ourselves, we are collecting examples of what we want DESIGNER to be able to do and implementing those abilities with simple special-purpose expression simplifiers, example-generation rules, verification rules, and interactive advice.

Data-flow analysis can detect some missing-link or initialization errors based on use-definition chaining. Algorithms to solve use-definition chaining problems, using bit-vector equations for example, have been described in the compiler literature [2], [38] and should be fairly straightforward to implement for the simple configuration characteristics we wish to identify.
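A minimal sketch of the bit-vector formulation follows. The three-block flow graph and definition sets are invented examples; in DESIGNER's setting, a port whose reaching-definitions set is empty would signal a missing link or initialization.

# Reaching definitions by iterative bit-vector equations:
# OUT[b] = GEN[b] | (IN[b] & ~KILL[b]),  IN[b] = union of OUT over preds.
GEN  = {"B1": 0b0011, "B2": 0b0100, "B3": 0b1000}   # defs d0..d3 as bits
KILL = {"B1": 0b0100, "B2": 0b0001, "B3": 0b0000}
PRED = {"B1": [], "B2": ["B1"], "B3": ["B1", "B2"]}

IN  = {b: 0 for b in GEN}
OUT = {b: GEN[b] for b in GEN}
changed = True
while changed:                        # iterate to a fixed point
    changed = False
    for b in GEN:
        IN[b] = 0
        for p in PRED[b]:
            IN[b] |= OUT[p]           # meet: union over predecessors
        new_out = GEN[b] | (IN[b] & ~KILL[b])
        if new_out != OUT[b]:
            OUT[b], changed = new_out, True

print({b: bin(v) for b, v in OUT.items()})   # defs reaching each block exit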

Expression simplification also has many applications in developmental evaluation. For example, if expressions representing the conditions under which particular paths are taken can be simplified to either "true" or "false," then those paths are, respectively, either always taken or never reached. In addition to simplification of logical and mathematical expressions, an evaluator for expressions involving task-domain objects and operations is needed. If developmental evaluation is used to analyze program costs, then a symbolic algebraic manipulation system is required, and the ability to solve common recurrence relations in closed form is helpful. These types of simplification may be provided by rewrite rules or as part of a more elaborate theorem-proving system.
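A rewrite-rule simplifier for the purely logical case might look like the sketch below; this is a toy of our own, and DESIGNER's simplifiers must also handle domain expressions.

def simplify(expr):
    """Expressions: "true", "false", a variable name, or a nested
    tuple ("and"|"or", e1, e2).  Apply identity/absorption rules."""
    if isinstance(expr, str):
        return expr
    op, a, b = expr[0], simplify(expr[1]), simplify(expr[2])
    rules = {
        ("and", "true"):  lambda x: x,        # true AND x  -> x
        ("and", "false"): lambda x: "false",  # false AND x -> false
        ("or",  "true"):  lambda x: "true",   # true OR x   -> true
        ("or",  "false"): lambda x: x,        # false OR x  -> x
    }
    for const, other in ((a, b), (b, a)):
        rule = rules.get((op, const))
        if rule is not None:
            return rule(other)
    return (op, a, b)

# a path condition that simplifies to a single branch predicate:
print(simplify(("and", "true", ("or", "false", "p"))))   # -> p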

Example generation, the process of finding specific examples to test a design or attempting to find counter-examples to disprove a conjecture, is also an important aspect of the design process. A technique that has been used in finding test cases for programs coded in a conventional language is linear inequality solving [9]. Finding examples for partially specified designs might require knowledge about solving expressions involving domain concepts as well as standard arithmetic inequalities, and would be likely to depend on more heuristic problem solving. One system that has pioneered the generation of examples from specifications is LOPS [6]. Other studies of example generation have not focused specifically on the design domain [30].
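A toy counter-example search in this spirit (our own illustration, not LOPS's strategy): try to refute the conjecture that three points chosen from a small grid are never collinear.

import random

def conjecture(p, q, r):
    """Claim (false in general): p, q, r are never collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    cross = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
    return cross != 0

random.seed(0)
for trial in range(10000):
    pts = [(random.randint(0, 3), random.randint(0, 3)) for _ in range(3)]
    if not conjecture(*pts):
        print("counter-example:", pts)   # a collinear triple refutes it
        break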

Theorem proving for developmental evaluation will have to include the same sorts of features as those used as parts of symbolic evaluators. It remains to be seen exactly how powerful a theorem prover we will need for the algorithm design process we consider here. There is some evidence that it is within the abilities of current technology, because we do not intend to rely on complex proofs; we expect to use theorem proving mainly for simplification and local inferencing. There is one extension of theorem proving that is of interest, however. As mentioned earlier, it is often desirable to find out more than just whether an expression is "true" or "false" when determining the satisfiability of preconditions or fulfilling other theorem-proving requests. An expression may be almost true, and indicating what must be added to complete the proof will move the design forward [33]. Cohen has pointed out, based on work described in [13], that deriving the consequences of the negation of an almost-true expression may also be useful in such a case.
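The "almost true" idea can be pictured with a small residue computation; this is our toy illustration, loosely in the spirit of [33], checking a conjunctive precondition against known facts and reporting what remains to be established.

def residual(precondition, facts):
    """Return the conjuncts of `precondition` not among `facts`.
    An empty residue means the precondition is satisfied; a short
    residue tells the designer what must be added to complete the
    proof."""
    return [c for c in precondition if c not in facts]

precond = ["Type(T1-input-P1, segment)", "Equal(T1-input-P2, Input)"]
facts   = ["Equal(T1-input-P2, Input)"]
print(residual(precond, facts))   # -> ['Type(T1-input-P1, segment)']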

V. SUMMARY

We have described a system that makes explicit the role of developmental evaluation in the design process. Other researchers have demonstrated the usefulness of symbolic execution and related techniques in testing and debugging complete programs. We make a much stronger claim: that in uncovering opportunities for refinement of a data-flow algorithm representation, developmental evaluation is the principal method guiding design in the absence of specific control knowledge. A standard set of difficulties, combined with appropriate design heuristics, allows automated selection of appropriate design operators even when the algorithm is incompletely described. The power and flexibility of this approach is being validated by our current experience in implementing the DESIGNER system.

ACKNOWLEDGMENT

This research is the result of joint work with A. Newell. Contributions to the initial implementation of developmental evaluation in the system were also provided by D. Marshall, B. Milnes, J. Muller, A. Peterson, and R. Thompson. We also appreciate the very useful suggestions made by D. Cohen and an anonymous referee.

REFERENCES

[1] B. Adelson and E. Soloway, "A model of software design," in The Nature of Expertise, Chi, Glaser, and Farr, Eds. Hillsdale, NJ: Lawrence Erlbaum, to be published.
[2] A. V. Aho and J. D. Ullman, Principles of Compiler Design. Reading, MA: Addison-Wesley, 1977.
[3] R. M. Balzer, "EXDAMS-Extendable debugging and monitoring system," in Proc. 1969 Spring Joint Comput. Conf., AFIPS, 1969, pp. 567-580.
[4] R. Balzer, N. Goldman, and D. Wile, "Meta-evaluation as a tool for program understanding," Univ. Southern California Inform. Sci. Inst., Marina del Rey, CA, Tech. Rep. ISI/RR-78-69, Jan. 1978.
[5] D. R. Barstow, "The roles of knowledge and deduction in algorithm design," in Automatic Program Construction Techniques, A. W. Biermann, Ed. New York: Macmillan, 1984, ch. 10, pp. 201-222.
[6] W. Bibel and K. M. Horning, "LOPS-A system based on a strategical approach to program synthesis," in Proc. Int. Workshop Program Construction, France, Sept. 1980.
[7] R. S. Boyer, B. Elspas, and K. N. Levitt, "SELECT-A formal system for testing and debugging programs by symbolic execution," in Proc. Int. Conf. Software Rel., 1975.
[8] T. E. Cheatham, G. H. Holloway, and J. A. Townley, "Symbolic evaluation and the analysis of programs," IEEE Trans. Software Eng., vol. SE-5, July 1979.
[9] L. A. Clarke, "A system to generate test data and symbolically execute programs," IEEE Trans. Software Eng., vol. SE-2, Sept. 1976.
[10] L. Clarke, "Test data generation and symbolic execution of programs as an aid to program validation," Ph.D. dissertation, Univ. Colorado, Boulder, July 1976.
[11] L. Clarke and D. J. Richardson, "Symbolic evaluation methods-Implementations and applications," in Computer Program Testing, B. Chandrasekaran and S. Radicchi, Eds. Amsterdam, The Netherlands: North-Holland, 1981, pp. 65-102.
[12] D. Cohen, "Symbolic execution of the Gist specification language," in Proc. 8th Int. Joint Conf. Artificial Intell., Karlsruhe, West Germany, Aug. 1983, pp. 457-462.
[13] -, "A forward inference engine to aid in understanding specifications," in Proc. AAAI-84, Amer. Ass. Artificial Intell., 1984.
[14] R. B. Dannenberg and G. W. Ernst, "Formal program verification using symbolic execution," IEEE Trans. Software Eng., vol. SE-8, Jan. 1982.
[15] J. A. Darringer and J. C. King, "Applications of symbolic execution to program testing," Computer, vol. 11, no. 4, Apr. 1978.
[16] C. L. Forgy, "OPS5 user's manual," Dep. Comput. Sci., Carnegie-Mellon Univ., Pittsburgh, PA, Tech. Rep. CMU-CS-81-135, July 1981.
[17] B. Hayes-Roth and F. Hayes-Roth, "A cognitive model of planning," Cogn. Sci., vol. 3, 1979.
[18] C. E. Hewitt and B. Smith, "Towards a programming apprentice," IEEE Trans. Software Eng., vol. SE-1, Mar. 1975.
[19] W. E. Howden, "Symbolic testing and the DISSECT symbolic evaluation system," IEEE Trans. Software Eng., vol. SE-3, July 1977.
[20] E. Kant, Efficiency in Program Synthesis. Ann Arbor, MI: UMI Research Press, 1981.
[21] E. Kant and A. Newell, "Naive algorithm design techniques: A case study," in Proc. European Conf. Artificial Intell., Orsay, France, July 1982; reprinted in Progress in Artificial Intelligence, L. Steels and J. A. Campbell, Eds. Chichester, England: Ellis Horwood, 1985.
[22] -, "Problem solving techniques for the design of algorithms," Inform. Processing and Management, vol. 20, no. 1-2, Spring 1984.
[23] -, "An automatic algorithm designer: An initial implementation," in Proc. AAAI-83, Amer. Ass. Artificial Intell., 1983.
[24] E. F. Miller, Jr. and R. A. Melton, "Automated generation of testcase datasets," in Proc. Int. Conf. Reliable Software, Los Angeles, CA, Apr. 1975.
[25] J. Mostow, "Towards better models of the design process," AI Mag., vol. 6, no. 1, Spring 1985.
[26] D. Oxley, B. Sauber, and M. Cornish, "Software development for data-flow machines," in Handbook of Software Engineering, C. R. Vick and C. V. Ramamoorthy, Eds. New York: Van Nostrand Reinhold, 1984, ch. 29, pp. 640-655.
[27] C. V. Ramamoorthy, S. F. Ho, and W. T. Chen, "On the automated generation of test data," IEEE Trans. Software Eng., vol. SE-2, Dec. 1976.
[28] C. Rich, "Inspection methods in programming," Ph.D. dissertation, Massachusetts Inst. Technol., Cambridge, Tech. Rep. AI-TR 604, June 1981.
[29] D. J. Richardson and L. A. Clarke, "A partition analysis method to increase program reliability," in Proc. 5th Int. Conf. Software Eng., Mar. 1981.
[30] E. L. Rissland and E. M. Soloway, "Overview of an example generation system," in Proc. AAAI-80, Amer. Ass. Artificial Intell., 1980.
[31] H. E. Shrobe, "Dependency directed reasoning for complex program understanding," Ph.D. dissertation, Massachusetts Inst. Technol., Cambridge, Tech. Rep. AI-TR 503, Apr. 1979.
[32] -, "Explicit control of reasoning in the programmer's apprentice," in Proc. 4th Workshop Automated Deduction, Feb. 1979.
[33] D. R. Smith, "Top-down synthesis of simple divide and conquer algorithms," Naval Postgraduate School, Tech. Rep. NPS52-82-011, Nov. 1982.
[34] D. M. Steier, "A language for representing and executing partial algorithm descriptions," Dep. Comput. Sci., Carnegie-Mellon Univ., Pittsburgh, PA, Tech. Rep., 1985, in preparation.
[35] G. J. Sussman, A Computer Model of Skill Acquisition. New York: American Elsevier, 1975.
[36] W. R. Swartout, "The GIST behavior explainer," Univ. Southern California Inform. Sci. Inst., Marina del Rey, CA, Tech. Rep. ISI/RS-83-3, July 1983.
[37] D. Touretzky, "Implicit ordering of defaults in inheritance systems," in Proc. AAAI-84, Amer. Ass. Artificial Intell., 1984.
[38] J. D. Ullman, "A survey of data flow analysis techniques," in Proc. 2nd USA-Japan Comput. Conf., AFIPS and IPSJ, 1975, pp. 335-342.
[39] U. Voges, L. Gmeiner, and A. A. von Mayrhauser, "SADAT-An automated testing tool," IEEE Trans. Software Eng., vol. SE-6, May 1980.
[40] B. Wegbreit, "Mechanical program analysis," Commun. ACM, vol. 18, no. 9, Sept. 1975.
[41] A. Yonezawa and C. E. Hewitt, "Symbolic evaluation using conceptual representations for programs with side effects," Massachusetts Inst. Technol., Cambridge, Tech. Rep. AI-TR-399, Dec. 1976.

David M. Steier was born in Chicago, IL, in 1961. He received the B.Sc. degree in computer science from Purdue University, West Lafayette, IN, in 1982.

He has held summer research positions at Schlumberger-Doll Research, Texas Instruments, and Evanston Hospital. He is currently a doctoral candidate in computer science at Carnegie Mellon University, Pittsburgh, PA. His research interests in artificial intelligence include automatic programming, machine learning and discovery, and expert systems.

Elaine Kant, for a photograph and biography, see this issue, p. 1374.
