
Automatic Accurate Cost-Bound Analysis for High-Level Languages

Yanhong A. Liu and Gustavo Gómez

Abstract: This paper describes a language-based approach for automatic and accurate cost-bound analysis. The approach consists of transformations for building cost-bound functions in the presence of partially known input structures, symbolic evaluation of the cost-bound function based on input size parameters, and optimizations to make the overall analysis efficient as well as accurate, all at the source-language level. The calculated cost bounds are expressed in terms of primitive cost parameters. These parameters can be obtained based on the language implementation or can be measured conservatively or approximately, yielding accurate, conservative, or approximate time or space bounds. We have implemented this approach and performed a number of experiments for analyzing Scheme programs. The results helped confirm the accuracy of the analysis.

Index Terms: Cost analysis, cost bound, performance analysis and measurements, program analysis and transformation, program optimization, timing analysis, time analysis, space analysis, worst-case execution time.

1 INTRODUCTION

Analysis of program cost, such as running time and space consumption, is important for real-time systems, embedded systems, interactive environments, compiler optimizations, performance evaluation, and many other computer applications. It has been extensively studied in many fields of computer science: algorithms [25], [16], [17], [53], programming languages [50], [26], [41], [44], and systems [46], [37], [43], [42]. It is particularly important for many applications, such as real-time systems and embedded systems, to be able to predict accurate time bounds and space bounds automatically and efficiently, and it is particularly desirable to be able to do so for high-level languages [46], [37], [38].

For analyzing system running time, since Shaw proposed timing schema for high-level languages [46], a number of people have extended it for analysis in the presence of compiler optimizations [37], [12], pipelining [20], [28], cache memory [4], [28], [14], etc. However, there remains an obvious and serious limitation of the timing schema, even in the absence of low-level complications. This is the inability to provide loop bounds, recursion depths, or execution paths automatically and accurately for the analysis [36], [3]. For example, the inaccurate loop bounds cause the calculated worst-case time to be as much as 67 percent higher than the measured worst-case time in [37], while the manual way of providing such information is potentially an even larger source of error, in addition to its inconvenience [36]. Various program analysis methods have been proposed to provide loop bounds or execution paths [3], [13], [19], [21]; they ameliorate the problem but cannot completely solve it because they apply only to some classes of programs or use approximations that are too crude for the analysis. Similarly, loop bounds and recursion depths are needed also for space analysis [38].

This paper describes a language-based approach for automatic and accurate cost-bound analysis. The approach combines methods and techniques studied in theory, languages, and systems. We call it a language-based approach because it primarily exploits methods and techniques for static program analysis and transformation.

The approach consists of transformations for building cost-bound functions in the presence of partially known input structures, symbolic evaluation of the cost-bound function based on input size parameters, and optimizations to make the overall analysis efficient as well as accurate, all at the source-language level. We describe analysis and transformation algorithms and explain how they work. The calculated cost bounds are expressed in terms of primitive cost parameters. These parameters can be obtained based on the language implementation or can be measured conservatively or approximately, yielding accurate, conservative, or approximate time or space bounds. The cost analysis currently does not include cache analysis. We have implemented this approach and performed a number of experiments for analyzing Scheme programs. The results helped confirm the accuracy of the analysis. We describe our prototype system, ALPA, as well as the analysis and measurement results.

This approach is general in the sense that it works for multiple kinds of cost analysis. Our main analysis sums the cost in terms of different operations performed; it gives upper bounds for all kinds of operations, such as arithmetic operations, data field selections, and constructor allocations. Variations of it can analyze stack space, live heap space, output size, etc., and can analyze lower bounds as well as upper bounds. The basic ideas also apply to other programming languages.

IEEE TRANSACTIONS ON COMPUTERS, VOL. 50, NO. 12, DECEMBER 2001 1295

Y.A. Liu is with the Computer Science Department, State University of New York at Stony Brook, Stony Brook, NY 11794-4400. E-mail: [email protected].

G. Gómez is with the Computer Science Department, Indiana University, Bloomington, IN 47405-7104. E-mail: [email protected].

Manuscript received 18 Sept. 1998; revised 6 Dec. 2000; accepted 22 May 2001. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number 107380.

0018-9340/01/$10.00 © 2001 IEEE


The rest of the paper is organized as follows: Section 2 outlines our language-based approach. Sections 3, 4, and 5 present the analysis and transformation methods and techniques. Section 6 describes our implementation and experimental results. Section 7 compares with related work and concludes.

2 LANGUAGE-BASED APPROACH

2.1 Cost and Cost Bound

Language-based cost-bound analysis starts with a given program written in a high-level language, such as C or Lisp. The first step is to build a cost function that takes the same input as the original program but returns the cost in place of (or in addition to) the original return value. This is done easily by associating a parameter with each program construct representing its cost and by summing these parameters based on the semantics of the constructs [50], [10], [46]. We call parameters that describe the costs of program constructs primitive cost parameters. To calculate actual cost bounds based on the cost function, three difficult problems must be solved.

First, since the goal is to calculate cost without being given particular inputs, the calculation must be based on certain assumptions about inputs. Thus, the first problem is to characterize the input data and reflect them in the cost function. In general, due to imperfect knowledge about the input, the cost function is transformed into a cost-bound function.

In algorithm analysis, inputs are characterized by their size; accommodating this requires manual or semi-automatic transformation of the cost (time or space) function [50], [26], [53]. The analysis is mainly asymptotic, and primitive cost parameters are considered independent of input size, i.e., they are constants while the computation iterates or recurses. Whatever values of the primitive cost parameters are assumed, a second problem arises, and it is theoretically challenging: optimizing the cost-bound function to a closed form in terms of the input size [50], [10], [26], [41], [17], [7]. Although much progress has been made in this area, closed forms are known only for subclasses of functions. Thus, such optimization cannot be automatically done for analyzing general programs.

In systems, inputs are characterized indirectly using loop bounds or execution paths in programs, and such information must in general be provided by the user [46], [37], [36], [28], even though program analyses can help in some cases [3], [13], [19], [21]. Closed forms in terms of parameters for these bounds can be obtained easily from the cost function. This isolates the third problem, which is the most interesting to systems research: obtaining values of primitive cost parameters that depend on compilers, runtime systems, operating systems, and machine hardware. In recent years, much progress has been made in analyzing low-level dynamic factors, such as clock interrupt, memory refresh, cache usage, instruction scheduling, and parallel architectures, for time analysis [37], [4], [28], [14]. Nevertheless, the inability to compute loop bounds or execution paths automatically and accurately has led calculated bounds to be much higher than measured worst-case time.

In the programming-language area, Rosendahl proposed using partially known input structures [41]. For example, instead of replacing an input list l with its length n, as done in algorithm analysis, or annotating loops with numbers related to n, as done in systems, we simply use as input a list of n unknown elements. We call parameters, such as n, for describing partially known input structures input size parameters. The cost function is then transformed automatically into a cost-bound function: At control points where decisions depend on unknown values, the maximum cost of all possible branches is computed; otherwise, the cost of the chosen branch is computed. Rosendahl concentrated on proving the correctness of this transformation. He assumed constant 1 for primitive cost parameters and relied on optimizations to obtain closed forms in terms of input size parameters, but, again, closed forms cannot be obtained for all cost-bound functions.

2.2 Language-Based Cost-Bound Analysis

Combining results from theory to systems and exploring methods and techniques for static program analysis and transformation, we have studied a language-based approach for computing cost bounds automatically, efficiently, and more accurately. The approach has three main components.

First, we use an automatic transformation to construct a cost-bound function from the original program based on partially known input structures. The resulting function takes input size parameters and primitive cost parameters as arguments. The only caveat here is that the cost-bound function might not terminate. However, nontermination occurs only if the recursive/iterative structure of the original program depends on unknown parts in the given partially known input structures.

Then, to compute worst-case cost bounds efficiently without relying on closed forms, we optimize the cost-bound function symbolically with respect to given values of input size parameters. This is based on partial evaluation and incremental computation. This symbolic evaluation always terminates, provided the cost-bound function terminates. The resulting function expresses cost bounds as counts of different operations performed, where the cost of each kind of operation is denoted by a primitive cost parameter.

A third component consists of transformations that enable more accurate cost bounds to be computed: lifting conditions, simplifying conditionals, and inlining nonrecursive functions. The transformations should be applied on the original program before the cost-bound function is constructed. They may result in larger code size, but they allow subcomputations based on the same control conditions to be merged, leading to more accurate cost bounds, which can be computed more efficiently as well.

The approach is general because all three components we developed are based on general methods and techniques. Each particular component is not meant to be a new analysis or transformation, but the combination of them for the application of automatic and accurate cost-bound analysis for high-level languages is new. In the resulting cost bounds, primitive cost parameters can be obtained based on the language implementation, or can be measured conservatively or approximately, to give accurate, conservative, or approximate time or space bounds.

We have implemented the analyses and transformations for a subset of Scheme [2], [11], [1], a dialect of Lisp. All the transformations are done automatically, and the cost bounds, expressed as operation counts, are computed efficiently and accurately. Example programs analyzed include a number of classical sorting programs, matrix computation programs, and various list processing programs. We also estimated approximate bounds on the actual running times by measuring primitive cost parameters for running times using control loops and calculated accurate bounds on the heap space allocated for constructors in the programs based on the number of bytes allocated for each constructor by the compiler. We used a functional subset of Scheme for three reasons:

1. Functional programming languages, together with features like automatic garbage collection, have become increasingly widely used, yet work for calculating actual running time and space of functional programs has been lacking.

2. Much work has been done on analyzing and transforming functional programs, including complexity analysis, and it can be used for estimating actual running time and space efficiently and accurately as well.

3. Analyses and transformations developed for functional languages can be applied to improve analyses of imperative languages as well [52].

All our analyses and transformations are performed at the source level. This allows implementations to be independent of compilers and underlying systems and allows analysis results to be understood at the source level.

2.3 Language

We use a first-order, call-by-value functional language that has structured data, primitive arithmetic, Boolean, and comparison operations, conditionals, bindings, and mutually recursive function calls. A program is a set of mutually recursive function definitions of the form

$$f(v_1, \ldots, v_n) \triangleq e,$$

where an expression $e$ is given by the grammar below (see footnote 1):

$$
\begin{array}{rcll}
e & ::= & v & \text{variable reference} \\
  & \mid & c(e_1, \ldots, e_n) & \text{data construction} \\
  & \mid & p(e_1, \ldots, e_n) & \text{primitive operation} \\
  & \mid & \mathbf{if}\ e_1\ \mathbf{then}\ e_2\ \mathbf{else}\ e_3 & \text{conditional expression} \\
  & \mid & \mathbf{let}\ v = e_1\ \mathbf{in}\ e_2\ \mathbf{end} & \text{binding expression} \\
  & \mid & f(e_1, \ldots, e_n) & \text{function application}
\end{array}
$$

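For binary primitive operations, we will be changing between infix and prefix notations, depending on whichever is easier for the presentation. Following Lisp and Scheme, we use $\mathit{cons}(h, t)$ to construct a list with head $h$ and tail $t$ and use $\mathit{car}(l)$ and $\mathit{cdr}(l)$ to select the head and tail, respectively, of list $l$. We use $\mathit{nil}$ to denote an empty list and use $\mathit{null}(l)$ to test whether $l$ is an empty list. For example, the program below selects the least element in a nonempty list.

$$
\begin{array}{l}
\mathit{least}(x) \triangleq \mathbf{if}\ \mathit{null}(\mathit{cdr}(x))\ \mathbf{then}\ \mathit{car}(x) \\
\qquad \mathbf{else}\ \mathbf{let}\ s = \mathit{least}(\mathit{cdr}(x)) \\
\qquad\qquad \mathbf{in}\ \mathbf{if}\ \mathit{car}(x) \le s\ \mathbf{then}\ \mathit{car}(x)\ \mathbf{else}\ s\ \mathbf{end};
\end{array}
$$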

We use least as a small running example. To present various analysis results, we also use several other examples: insertion sort, selection sort, merge sort, set union, list reversal (the standard linear-time version), and reversal with append (the standard quadratic-time version).

Even though this language is small, it is sufficiently powerful and convenient to write sophisticated programs. Structured data is essentially records in Pascal, structs in C, and constructor applications in ML. Conditionals and bindings easily simulate conditional statements and assignments, and recursions can simulate loops. We can also see that cost analysis in the presence of arrays and pointers is not fundamentally harder [37] because the costs of the program constructs for them can be counted in a similar way as costs of other constructs. For example, accessing an array element $a[i]$ has the cost of accessing $i$, offsetting the element address from that of $a$, and, finally, getting the value from that address. Note that side effects caused by these features often cause other analyses to be difficult [9], [22]. For pure functional languages, higher-order functions and lazy evaluations are important. Cost functions that accommodate these features have been studied [49], [44]. The symbolic evaluation and optimizations we describe apply to them as well.

3 CONSTRUCTING COST-BOUND FUNCTIONS

3.1 Constructing Cost Functions

We first transform the original program to construct a cost function, which takes the original input and primitive cost parameters as arguments and returns the cost. This is straightforward based on the semantics of the program constructs.

Given an original program, we add a set of cost functions, one for each original function, which simply count the cost while the original program executes. The algorithm, given below, is presented as a transformation $\mathcal{C}$ on the original program, which calls a transformation $\mathcal{C}_e$ to recursively transform subexpressions. For example, a variable reference is transformed into a symbol $C_{\mathit{varref}}$ representing the cost of a variable reference; a conditional statement is transformed into the cost of the test plus, if the condition is true, the cost of the true branch, otherwise, the cost of the false branch, plus the cost for the transfers of control. We use $cf$ to denote the cost function for $f$.


1. The keywords are taken from ML [35]. Our implementation supports both this syntax and Scheme syntax.


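program:

$$
\mathcal{C}\left[\!\!\left[\begin{array}{l}
f_1(v_1, \ldots, v_{n_1}) \triangleq e_1; \\
\quad\ldots \\
f_m(v_1, \ldots, v_{n_m}) \triangleq e_m;
\end{array}\right]\!\!\right]
=
\begin{array}{l}
f_1(v_1, \ldots, v_{n_1}) \triangleq e_1; \\
\quad\ldots \\
f_m(v_1, \ldots, v_{n_m}) \triangleq e_m; \\
cf_1(v_1, \ldots, v_{n_1}) \triangleq \mathcal{C}_e[\![e_1]\!]; \\
\quad\ldots \\
cf_m(v_1, \ldots, v_{n_m}) \triangleq \mathcal{C}_e[\![e_m]\!];
\end{array}
$$

$$
\begin{array}{ll}
\text{variable reference:} & \mathcal{C}_e[\![v]\!] = C_{\mathit{varref}} \\
\text{data construction:} & \mathcal{C}_e[\![c(e_1, \ldots, e_n)]\!] = \mathit{add}(C_c, \mathcal{C}_e[\![e_1]\!], \ldots, \mathcal{C}_e[\![e_n]\!]) \\
\text{primitive operation:} & \mathcal{C}_e[\![p(e_1, \ldots, e_n)]\!] = \mathit{add}(C_p, \mathcal{C}_e[\![e_1]\!], \ldots, \mathcal{C}_e[\![e_n]\!]) \\
\text{conditional:} & \mathcal{C}_e[\![\mathbf{if}\ e_1\ \mathbf{then}\ e_2\ \mathbf{else}\ e_3]\!] = \mathit{add}(C_{\mathit{if}}, \mathcal{C}_e[\![e_1]\!], \mathbf{if}\ e_1\ \mathbf{then}\ \mathcal{C}_e[\![e_2]\!]\ \mathbf{else}\ \mathcal{C}_e[\![e_3]\!]) \\
\text{binding:} & \mathcal{C}_e[\![\mathbf{let}\ v = e_1\ \mathbf{in}\ e_2\ \mathbf{end}]\!] = \mathit{add}(C_{\mathit{let}}, \mathcal{C}_e[\![e_1]\!], \mathbf{let}\ v = e_1\ \mathbf{in}\ \mathcal{C}_e[\![e_2]\!]\ \mathbf{end}) \\
\text{function call:} & \mathcal{C}_e[\![f(e_1, \ldots, e_n)]\!] = \mathit{add}(C_{\mathit{call}}, \mathcal{C}_e[\![e_1]\!], \ldots, \mathcal{C}_e[\![e_n]\!], cf(e_1, \ldots, e_n))
\end{array}
$$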
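The rules for the expression-level transformation can be sketched as a short source-to-source function in Python. The tuple encoding of expressions used here (tags such as "var", "prim", and "call") and the names cost_expr and cost_program are assumptions made for illustration; they are not ALPA's internal representation.

```python
# Assumed expression encoding (hypothetical, for illustration only):
#   ("var", v)              ("con", c, [args])      ("prim", p, [args])
#   ("if", e1, e2, e3)      ("let", v, e1, e2)      ("call", f, [args])
# Cost expressions additionally use ("cost", name) and ("add", [terms]).

def cost_expr(e):
    """Build the cost expression for e, one rule per construct."""
    kind = e[0]
    if kind == "var":
        return ("cost", "varref")
    if kind in ("con", "prim"):
        _, name, args = e
        return ("add", [("cost", name)] + [cost_expr(a) for a in args])
    if kind == "if":
        _, e1, e2, e3 = e
        # The test e1 is kept so that only the chosen branch is charged.
        return ("add", [("cost", "if"), cost_expr(e1),
                        ("if", e1, cost_expr(e2), cost_expr(e3))])
    if kind == "let":
        _, v, e1, e2 = e
        # The binding is kept so the body's cost can refer to v.
        return ("add", [("cost", "let"), cost_expr(e1),
                        ("let", v, e1, cost_expr(e2))])
    if kind == "call":
        _, f, args = e
        return ("add", [("cost", "call")] + [cost_expr(a) for a in args]
                + [("call", "c" + f, args)])
    raise ValueError(f"unknown expression kind: {kind}")

def cost_program(defs):
    """defs maps a function name to (params, body); add one cost function each."""
    out = dict(defs)
    for f, (params, body) in defs.items():
        out["c" + f] = (params, cost_expr(body))
    return out
```

Each construct is visited once, which is consistent with a transformation that is linear in the size of the program.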

Applying this transformation to program least, we obtain function least as originally given and cost function cleast below, where infix notation is used for additions, and unnecessary parentheses are omitted. Note that various $C$s are indeed arguments to the cost function cleast; we omit them from argument positions for ease of reading.

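$$
\begin{array}{l}
\mathit{cleast}(x) \triangleq C_{\mathit{if}} + C_{\mathit{null}} + C_{\mathit{cdr}} + C_{\mathit{varref}} \\
\quad +\ (\mathbf{if}\ \mathit{null}(\mathit{cdr}(x))\ \mathbf{then}\ C_{\mathit{car}} + C_{\mathit{varref}} \\
\qquad \mathbf{else}\ C_{\mathit{let}} + C_{\mathit{call}} + C_{\mathit{cdr}} + C_{\mathit{varref}} + \mathit{cleast}(\mathit{cdr}(x)) \\
\qquad\quad +\ (\mathbf{let}\ s = \mathit{least}(\mathit{cdr}(x)) \\
\qquad\qquad \mathbf{in}\ C_{\mathit{if}} + C_{\le} + C_{\mathit{car}} + C_{\mathit{varref}} + C_{\mathit{varref}} \\
\qquad\qquad\quad +\ (\mathbf{if}\ \mathit{car}(x) \le s\ \mathbf{then}\ C_{\mathit{car}} + C_{\mathit{varref}} \\
\qquad\qquad\qquad \mathbf{else}\ C_{\mathit{varref}})\ \mathbf{end})).
\end{array}
$$

This transformation is similar to the local cost assignment [50], step-counting function [41], cost function [44], etc. in other work. Our transformation extends those methods with bindings and makes all primitive cost parameters explicit at the source-language level. For example, each primitive operation $p$ is given a different symbol $C_p$ and each constructor $c$ is given a different symbol $C_c$. Note that the cost function terminates with the appropriate sum of primitive cost parameters if the original program terminates and it runs forever to sum to infinity if the original program does not terminate, which is the desired meaning of a cost function.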
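To make the cost function concrete, the following Python sketch mirrors cleast for lists encoded as nested pairs (h, t) with None as the empty list. It assumes that a symbolic sum of primitive cost parameters can be represented as a collections.Counter keyed by operation name; this representation and the key names (for example "le" for the comparison) are illustrative choices, not taken from the paper.

```python
from collections import Counter

def cost(*names):
    """A symbolic sum of primitive cost parameters, stored as operation counts."""
    return Counter(names)

def least(x):
    """least on pairs (head, tail), with None as the empty list."""
    if x[1] is None:
        return x[0]
    s = least(x[1])
    return x[0] if x[0] <= s else s

def cleast(x):
    """Count the operations that least(x) performs, construct by construct."""
    total = cost("if", "null", "cdr", "varref")           # test null(cdr(x))
    if x[1] is None:
        return total + cost("car", "varref")              # then-branch: car(x)
    total += cost("let", "call", "cdr", "varref") + cleast(x[1])
    s = least(x[1])                                        # value bound by the let
    total += cost("if", "le", "car", "varref", "varref")  # test car(x) <= s
    if x[0] <= s:
        total += cost("car", "varref")
    else:
        total += cost("varref")
    return total
```

Substituting measured or assumed values for each key in the resulting Counter then gives a single number for a particular input.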

3.2 Constructing Cost-Bound Functions

Characterizing program inputs and capturing them in the cost function are difficult to automate [50], [26], [46]. However, partially known input structures provide a natural means [41]. A special value $\mathit{unknown}$ represents unknown values. For example, to capture all input lists of length $n$, the following partially known input structure can be used.

$$
\mathit{list}(n) \triangleq \mathbf{if}\ n = 0\ \mathbf{then}\ \mathit{nil}\ \mathbf{else}\ \mathit{cons}(\mathit{unknown}, \mathit{list}(n - 1));
$$

Similar structures can be used to describe an array of $n$ elements, a matrix of $m$-by-$n$ elements, a complete binary tree of height $h$, etc.

Since partially known input structures give incomplete knowledge about inputs, the original functions need to be transformed to handle the special value $\mathit{unknown}$. In particular, for each primitive function $p$, we define a new function $f_p$ such that $f_p(v_1, \ldots, v_n)$ returns $\mathit{unknown}$ if any $v_i$ is $\mathit{unknown}$ and returns $p(v_1, \ldots, v_n)$ as usual otherwise. For example,

$$
f_{\le}(v_1, v_2) \triangleq \mathbf{if}\ v_1 = \mathit{unknown} \lor v_2 = \mathit{unknown}\ \mathbf{then}\ \mathit{unknown}\ \mathbf{else}\ v_1 \le v_2.
$$

We also define a new function $\mathit{lub}$, denoting least upper bound, that takes two values and returns the most precise partially known structure that both values conform with. For example, if $v_1 = \mathit{cons}(3, \mathit{nil})$ and $v_2 = \mathit{cons}(4, \mathit{nil})$, then $\mathit{lub}(v_1, v_2) = \mathit{cons}(\mathit{unknown}, \mathit{nil})$.

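$$
\begin{array}{l}
f_p(v_1, \ldots, v_n) \triangleq \mathbf{if}\ v_1 = \mathit{unknown} \lor \ldots \lor v_n = \mathit{unknown} \\
\qquad \mathbf{then}\ \mathit{unknown} \\
\qquad \mathbf{else}\ p(v_1, \ldots, v_n); \\[4pt]
\mathit{lub}(v_1, v_2) \triangleq \mathbf{if}\ v_1\ \text{is}\ c_1(x_1, \ldots, x_i) \land v_2\ \text{is}\ c_2(y_1, \ldots, y_j) \land c_1 = c_2 \land i = j \\
\qquad \mathbf{then}\ c_1(\mathit{lub}(x_1, y_1), \ldots, \mathit{lub}(x_i, y_i)) \\
\qquad \mathbf{else}\ \mathit{unknown};
\end{array}
$$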
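The definitions of list(n), the lifted primitives, and lub can be sketched in Python as follows. The encoding is an assumption made for illustration: constructor applications are tuples whose first component is the constructor name, unknown is a singleton object, and atoms such as numbers are treated as nullary constructors, so the lub of two equal atoms is that atom. The names lift and partial_list are hypothetical.

```python
import operator

class Unknown:
    """Singleton marker standing for the special value unknown."""
    def __repr__(self):
        return "unknown"

UNKNOWN = Unknown()
NIL = ("nil",)  # assumed encoding: constructor applications are tuples

def partial_list(n):
    """list(n): a list of n unknown elements."""
    return NIL if n == 0 else ("cons", UNKNOWN, partial_list(n - 1))

def lift(p):
    """Return f_p: unknown if any argument is unknown, otherwise p as usual."""
    def fp(*args):
        if any(a is UNKNOWN for a in args):
            return UNKNOWN
        return p(*args)
    return fp

def lub(v1, v2):
    """Most precise partially known structure that both values conform with."""
    if (isinstance(v1, tuple) and isinstance(v2, tuple)
            and v1[0] == v2[0] and len(v1) == len(v2)):
        return (v1[0],) + tuple(lub(a, b) for a, b in zip(v1[1:], v2[1:]))
    # Assumption: equal atoms are nullary constructors and stay known.
    return v1 if v1 == v2 else UNKNOWN

f_le = lift(operator.le)  # the lifted comparison
```

With this encoding, lub(("cons", 3, NIL), ("cons", 4, NIL)) keeps the list spine and replaces only the differing head by UNKNOWN, matching the example in the text.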

Also, the cost functions need to be transformed to compute an upper bound of the cost: If the truth value of a conditional test is known, then the cost of the chosen branch is computed normally; otherwise, the maximum of the costs of both branches is computed. Transformation $\mathcal{B}$, given below, embodies these algorithms, where $\mathcal{B}_e$ transforms an expression in the original functions and $\mathcal{B}_c$ transforms an expression in the cost functions. We use $uf$ to denote function $f$ extended with the value $\mathit{unknown}$ and we use $cbf$ to denote the cost-bound function for $f$.


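program:

$$
\mathcal{B}\left[\!\!\left[\begin{array}{l}
f_1(v_1, \ldots, v_{n_1}) \triangleq e_1; \\
\quad\ldots \\
f_m(v_1, \ldots, v_{n_m}) \triangleq e_m; \\
cf_1(v_1, \ldots, v_{n_1}) \triangleq e'_1; \\
\quad\ldots \\
cf_m(v_1, \ldots, v_{n_m}) \triangleq e'_m;
\end{array}\right]\!\!\right]
=
\begin{array}{l}
uf_1(v_1, \ldots, v_{n_1}) \triangleq \mathcal{B}_e[\![e_1]\!]; \\
\quad\ldots \\
uf_m(v_1, \ldots, v_{n_m}) \triangleq \mathcal{B}_e[\![e_m]\!]; \\
cbf_1(v_1, \ldots, v_{n_1}) \triangleq \mathcal{B}_c[\![e'_1]\!]; \\
\quad\ldots \\
cbf_m(v_1, \ldots, v_{n_m}) \triangleq \mathcal{B}_c[\![e'_m]\!]; \\
f_p(v_1, \ldots, v_n) \triangleq \ldots\ \text{as above} \\
\mathit{lub}(v_1, v_2) \triangleq \ldots\ \text{as above}
\end{array}
$$

$$
\begin{array}{ll}
\text{variable reference:} & \mathcal{B}_e[\![v]\!] = v \\
\text{data construction:} & \mathcal{B}_e[\![c(e_1, \ldots, e_n)]\!] = c(\mathcal{B}_e[\![e_1]\!], \ldots, \mathcal{B}_e[\![e_n]\!]) \\
\text{primitive operation:} & \mathcal{B}_e[\![p(e_1, \ldots, e_n)]\!] = f_p(\mathcal{B}_e[\![e_1]\!], \ldots, \mathcal{B}_e[\![e_n]\!]) \\
\text{conditional:} & \mathcal{B}_e[\![\mathbf{if}\ e_1\ \mathbf{then}\ e_2\ \mathbf{else}\ e_3]\!] = \mathbf{let}\ v = \mathcal{B}_e[\![e_1]\!]\ \mathbf{in}\ \mathbf{if}\ v = \mathit{unknown}\ \mathbf{then}\ \mathit{lub}(e'_2, e'_3) \\
 & \qquad \mathbf{else}\ \mathbf{if}\ v\ \mathbf{then}\ e'_2\ \mathbf{else}\ e'_3\ \mathbf{end} \\
 & \qquad \text{where } e'_2 = \mathcal{B}_e[\![e_2]\!],\ e'_3 = \mathcal{B}_e[\![e_3]\!] \\
\text{binding:} & \mathcal{B}_e[\![\mathbf{let}\ v = e_1\ \mathbf{in}\ e_2\ \mathbf{end}]\!] = \mathbf{let}\ v = \mathcal{B}_e[\![e_1]\!]\ \mathbf{in}\ \mathcal{B}_e[\![e_2]\!]\ \mathbf{end} \\
\text{function call:} & \mathcal{B}_e[\![f(e_1, \ldots, e_n)]\!] = uf(\mathcal{B}_e[\![e_1]\!], \ldots, \mathcal{B}_e[\![e_n]\!]) \\[6pt]
\text{primitive cost parameter:} & \mathcal{B}_c[\![C]\!] = C \\
\text{summation:} & \mathcal{B}_c[\![\mathit{add}(e_1, \ldots, e_n)]\!] = \mathit{add}(\mathcal{B}_c[\![e_1]\!], \ldots, \mathcal{B}_c[\![e_n]\!]) \\
\text{conditional:} & \mathcal{B}_c[\![\mathbf{if}\ e_1\ \mathbf{then}\ e_2\ \mathbf{else}\ e_3]\!] = \mathbf{let}\ v = \mathcal{B}_e[\![e_1]\!]\ \mathbf{in}\ \mathbf{if}\ v = \mathit{unknown}\ \mathbf{then}\ \mathit{max}(e'_2, e'_3) \\
 & \qquad \mathbf{else}\ \mathbf{if}\ v\ \mathbf{then}\ e'_2\ \mathbf{else}\ e'_3\ \mathbf{end} \\
 & \qquad \text{where } e'_2 = \mathcal{B}_c[\![e_2]\!],\ e'_3 = \mathcal{B}_c[\![e_3]\!] \\
\text{binding:} & \mathcal{B}_c[\![\mathbf{let}\ v = e_1\ \mathbf{in}\ e_2\ \mathbf{end}]\!] = \mathbf{let}\ v = \mathcal{B}_e[\![e_1]\!]\ \mathbf{in}\ \mathcal{B}_c[\![e_2]\!]\ \mathbf{end} \\
\text{function call:} & \mathcal{B}_c[\![cf(e_1, \ldots, e_n)]\!] = cbf(\mathcal{B}_e[\![e_1]\!], \ldots, \mathcal{B}_e[\![e_n]\!])
\end{array}
$$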

Applying this transformation on functions least and cleast yields functions uleast and cbleast below, where function $f_p$ for each primitive operator $p$ and function $\mathit{lub}$ are as given above. Shared code is presented with where-clauses when this makes the code smaller.

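$$
\begin{array}{l}
\mathit{uleast}(x) \triangleq \mathbf{let}\ v = f_{\mathit{null}}(f_{\mathit{cdr}}(x)) \\
\qquad \mathbf{in}\ \mathbf{if}\ v = \mathit{unknown}\ \mathbf{then}\ \mathit{lub}(e_1, e_2) \\
\qquad\quad \mathbf{else}\ \mathbf{if}\ v\ \mathbf{then}\ e_1\ \mathbf{else}\ e_2\ \mathbf{end} \\
\quad \text{where } e_1 = f_{\mathit{car}}(x) \\
\qquad e_2 = \mathbf{let}\ s = \mathit{uleast}(f_{\mathit{cdr}}(x)) \\
\qquad\qquad \mathbf{in}\ \mathbf{let}\ v = f_{\le}(f_{\mathit{car}}(x), s) \\
\qquad\qquad\quad \mathbf{in}\ \mathbf{if}\ v = \mathit{unknown}\ \mathbf{then}\ \mathit{lub}(f_{\mathit{car}}(x), s) \\
\qquad\qquad\qquad \mathbf{else}\ \mathbf{if}\ v\ \mathbf{then}\ f_{\mathit{car}}(x)\ \mathbf{else}\ s\ \mathbf{end}\ \mathbf{end} \\[6pt]
\mathit{cbleast}(x) \triangleq C_{\mathit{if}} + C_{\mathit{null}} + C_{\mathit{cdr}} + C_{\mathit{varref}} \\
\quad +\ (\mathbf{let}\ v = f_{\mathit{null}}(f_{\mathit{cdr}}(x)) \\
\qquad \mathbf{in}\ \mathbf{if}\ v = \mathit{unknown}\ \mathbf{then}\ \mathit{max}(e_1, e_2) \\
\qquad\quad \mathbf{else}\ \mathbf{if}\ v\ \mathbf{then}\ e_1\ \mathbf{else}\ e_2\ \mathbf{end}), \\
\quad \text{where } e_1 = C_{\mathit{car}} + C_{\mathit{varref}}, \\
\qquad e_2 = C_{\mathit{let}} + C_{\mathit{call}} + C_{\mathit{cdr}} + C_{\mathit{varref}} + \mathit{cbleast}(f_{\mathit{cdr}}(x)) \\
\qquad\quad +\ (\mathbf{let}\ s = \mathit{uleast}(f_{\mathit{cdr}}(x)) \\
\qquad\qquad \mathbf{in}\ C_{\mathit{if}} + C_{\le} + C_{\mathit{car}} + C_{\mathit{varref}} + C_{\mathit{varref}} \\
\qquad\qquad\quad +\ (\mathbf{let}\ v = f_{\le}(f_{\mathit{car}}(x), s) \\
\qquad\qquad\qquad \mathbf{in}\ \mathbf{if}\ v = \mathit{unknown}\ \mathbf{then}\ \mathit{max}(C_{\mathit{car}} + C_{\mathit{varref}}, C_{\mathit{varref}}) \\
\qquad\qquad\qquad\quad \mathbf{else}\ \mathbf{if}\ v\ \mathbf{then}\ C_{\mathit{car}} + C_{\mathit{varref}}\ \mathbf{else}\ C_{\mathit{varref}}\ \mathbf{end})\ \mathbf{end})
\end{array}
$$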

The resulting cost-bound function takes as arguments partially known input structures, such as $\mathit{list}(n)$, which take as arguments input size parameters, such as $n$. Therefore, we can obtain a resulting function that takes as arguments input size parameters and primitive cost parameters and computes the most accurate cost bound possible.
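As a rough end-to-end illustration, the following Python sketch evaluates a cost-bound function in the style of cbleast on list(n). Several choices here are assumptions rather than the paper's method: values are encoded as tuples, sums of primitive cost parameters are stored as operation counts in a collections.Counter, and the symbolic maximum is approximated by a componentwise maximum of counts (the hypothetical helper cmax), which is a safe upper bound only if every primitive cost parameter is nonnegative.

```python
from collections import Counter

UNKNOWN = object()   # the special value unknown
NIL = ("nil",)       # assumed encoding: constructor applications are tuples

def partial_list(n):
    """list(n): a list of n unknown elements."""
    return NIL if n == 0 else ("cons", UNKNOWN, partial_list(n - 1))

# Primitives extended with unknown (the f_p functions).
def fcar(l):
    return UNKNOWN if l is UNKNOWN else l[1]

def fcdr(l):
    return UNKNOWN if l is UNKNOWN else l[2]

def fnull(l):
    return UNKNOWN if l is UNKNOWN else l == NIL

def fle(a, b):
    return UNKNOWN if a is UNKNOWN or b is UNKNOWN else a <= b

def lub(v1, v2):
    if (isinstance(v1, tuple) and isinstance(v2, tuple)
            and v1[0] == v2[0] and len(v1) == len(v2)):
        return (v1[0],) + tuple(lub(a, b) for a, b in zip(v1[1:], v2[1:]))
    return v1 if v1 == v2 else UNKNOWN  # assumption: equal atoms stay known

def uleast(x):
    """least extended with the value unknown."""
    def e2():
        s = uleast(fcdr(x))
        w = fle(fcar(x), s)
        if w is UNKNOWN:
            return lub(fcar(x), s)
        return fcar(x) if w else s
    v = fnull(fcdr(x))
    if v is UNKNOWN:
        return lub(fcar(x), e2())  # recurses forever if the list spine is unknown
    return fcar(x) if v else e2()

def cost(*names):
    return Counter(names)

def cmax(a, b):
    """Componentwise maximum of counts: an assumed stand-in for symbolic max."""
    return a | b

def cbleast(x):
    """Upper bound, as operation counts, on the cost of least on inputs matching x."""
    def e2():
        s = uleast(fcdr(x))
        w = fle(fcar(x), s)
        then_cost, else_cost = cost("car", "varref"), cost("varref")
        if w is UNKNOWN:
            branch = cmax(then_cost, else_cost)
        else:
            branch = then_cost if w else else_cost
        return (cost("let", "call", "cdr", "varref") + cbleast(fcdr(x))
                + cost("if", "le", "car", "varref", "varref") + branch)
    total = cost("if", "null", "cdr", "varref")
    v = fnull(fcdr(x))
    e1 = cost("car", "varref")
    if v is UNKNOWN:
        return total + cmax(e1, e2())
    return total + (e1 if v else e2())
```

Calling cbleast(partial_list(n)) for a concrete n returns counts that depend only on n, since every comparison between unknown elements falls into the maximum case. Calling it on a fully known list reduces to the exact count for that list.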

Both transformations $\mathcal{C}$ and $\mathcal{B}$ take linear time in terms of the size of the program, so they are extremely efficient, as also seen in our prototype system ALPA. Note that the resulting cost-bound function might not terminate, but this occurs only if the recursive structure of the original program depends on unknown parts in the partially known input structure. As a trivial example, if the partially known input structure given is $\mathit{unknown}$, then the corresponding cost-bound function for any recursive function does not terminate since the original program does cost infinite resources in the worst case. We can modify the analysis to detect nontermination in many cases, as, for example, in [27]. For the example of giving $\mathit{unknown}$ to a recursive cost-bound function, nontermination is trivial to detect since the arguments to recursive calls would remain $\mathit{unknown}$.

4 OPTIMIZING COST-BOUND FUNCTIONS

This section describes symbolic evaluation and optimizations that make computation of cost bounds more efficient. The transformations consist of partial evaluation, realized as global inlining, and incremental computation, realized as local optimization.


We first point out that cost-bound functions might be extremely inefficient to evaluate given values for their parameters. In fact, in the worst case, the evaluation takes exponential time in terms of the input size parameters since it essentially searches for the worst-case execution path for all inputs satisfying the partially known input structures.

4.1 Partial Evaluation of Cost-Bound Functions

In practice, values of input size parameters are given for almost all applications. This is why time-analysis techniques used in systems can require loop bounds from the user before time bounds are computed. While generally it is not possible to obtain explicit loop bounds automatically and accurately, we can implicitly achieve the desired effect by evaluating the cost-bound function symbolically in terms of primitive cost parameters given specific values of input size parameters.

The evaluation simply follows the structures of cost-bound functions. Specifically, the control structures determine conditional branches and make recursive function calls as usual, and the only primitive operations are sums of primitive cost parameters and maximums among alternative sums, which can easily be done symbolically. Thus, the transformation inlines all function calls, sums all primitive cost parameters symbolically, determines conditional branches if it can, and takes maximum sums among all possible branches if it cannot.

The symbolic evaluation $\mathcal{E}$ defined below performs the transformations. It takes as arguments an expression $e$ and an environment $\rho$ of variable bindings (where each variable is mapped to its value) and returns as a result a symbolic value that contains the primitive cost parameters. The evaluation starts with the application of the cost-bound function to a partially known input structure, e.g., $\mathit{cbleast}(\mathit{list}(100))$, and it starts with an empty environment. We assume that $\mathit{add}_s$ is a function that symbolically sums its arguments, and $\mathit{max}_s$ is a function that symbolically takes the maximum of its arguments.

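$$
\begin{array}{ll}
\text{variable reference:} & \mathcal{E}[\![v]\!]\rho = \rho(v) \quad \text{(look up binding of } v \text{ in environment)} \\[4pt]
\text{primitive cost parameter:} & \mathcal{E}[\![C]\!]\rho = C \\[4pt]
\text{data construction:} & \mathcal{E}[\![c(e_1,\ldots,e_n)]\!]\rho = c(\mathcal{E}[\![e_1]\!]\rho,\ldots,\mathcal{E}[\![e_n]\!]\rho) \\[4pt]
\text{primitive operation:} & \mathcal{E}[\![p(e_1,\ldots,e_n)]\!]\rho = p(\mathcal{E}[\![e_1]\!]\rho,\ldots,\mathcal{E}[\![e_n]\!]\rho) \\[4pt]
\text{summation:} & \mathcal{E}[\![\mathit{add}(e_1,\ldots,e_n)]\!]\rho = \mathit{add}_s(\mathcal{E}[\![e_1]\!]\rho,\ldots,\mathcal{E}[\![e_n]\!]\rho) \\[4pt]
\text{maximum:} & \mathcal{E}[\![\mathit{max}(e_1,\ldots,e_n)]\!]\rho = \mathit{max}_s(\mathcal{E}[\![e_1]\!]\rho,\ldots,\mathcal{E}[\![e_n]\!]\rho) \\[4pt]
\text{conditional:} & \mathcal{E}[\![\textbf{if } e_1 \textbf{ then } e_2 \textbf{ else } e_3]\!]\rho =
  \begin{cases}
    \mathcal{E}[\![e_2]\!]\rho & \text{if } \mathcal{E}[\![e_1]\!]\rho = \mathit{true} \\
    \mathcal{E}[\![e_3]\!]\rho & \text{if } \mathcal{E}[\![e_1]\!]\rho = \mathit{false}
  \end{cases} \\[4pt]
\text{binding:} & \mathcal{E}[\![\textbf{let } v = e_1 \textbf{ in } e_2 \textbf{ end}]\!]\rho = \mathcal{E}[\![e_2]\!]\rho[v \mapsto \mathcal{E}[\![e_1]\!]\rho] \quad \text{(bind } v \text{ to value of } e_1 \text{ in environment)} \\[4pt]
\text{function call:} & \mathcal{E}[\![f(e_1,\ldots,e_n)]\!]\rho = \mathcal{E}[\![e]\!]\rho[v_1 \mapsto \mathcal{E}[\![e_1]\!]\rho,\ldots,v_n \mapsto \mathcal{E}[\![e_n]\!]\rho] \\
 & \quad \text{where } f \text{ is defined by } f(v_1,\ldots,v_n) \triangleq e
\end{array}
$$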
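The evaluation rules can be read as a small interpreter. The following Python sketch is an illustration only and is not the ALPA implementation: the tuple-based expression encoding, the helper form `const`, and the example function `count` are assumptions introduced here. Symbolic values are represented as `Counter` objects that map primitive cost parameter names to counts, data construction is folded into the `prim` form for brevity, and the symbolic maximum (`maxs` below) is approximated by a component-wise maximum in the spirit of the vector maximum of Section 4.2.

```python
from collections import Counter

def adds(*vals):
    """Symbolic sum: add up the counts of each primitive cost parameter."""
    total = Counter()
    for v in vals:
        total.update(v)
    return total

def maxs(*vals):
    """Symbolic maximum, approximated component-wise (an assumption of this sketch)."""
    result = Counter()
    for v in vals:
        for name, count in v.items():
            result[name] = max(result[name], count)
    return result

def E(e, env, defs):
    """Evaluate expression e in environment env; defs maps names to (params, body)."""
    kind = e[0]
    if kind == "var":      # variable reference: look up the binding
        return env[e[1]]
    if kind == "const":    # literal data value (hypothetical helper form)
        return e[1]
    if kind == "cost":     # primitive cost parameter C
        return Counter({e[1]: 1})
    if kind == "prim":     # primitive operation, given as a Python callable
        return e[1](*(E(a, env, defs) for a in e[2:]))
    if kind == "add":      # summation
        return adds(*(E(a, env, defs) for a in e[1:]))
    if kind == "max":      # maximum
        return maxs(*(E(a, env, defs) for a in e[1:]))
    if kind == "if":       # conditional with a known test result
        return E(e[2], env, defs) if E(e[1], env, defs) else E(e[3], env, defs)
    if kind == "let":      # binding: extend the environment
        _, v, e1, e2 = e
        return E(e2, {**env, v: E(e1, env, defs)}, defs)
    if kind == "call":     # function call: evaluate the body with bound parameters
        params, body = defs[e[1]]
        args = [E(a, env, defs) for a in e[2:]]
        return E(body, {**env, **dict(zip(params, args))}, defs)
    raise ValueError(f"unknown expression form: {kind!r}")

# Hypothetical cost-bound function:
#   count(n) = if n = 0 then C_if else add(C_if, C_call, count(n - 1))
defs = {
    "count": (["n"],
              ("if", ("prim", lambda n: n == 0, ("var", "n")),
               ("cost", "if"),
               ("add", ("cost", "if"), ("cost", "call"),
                ("call", "count", ("prim", lambda n: n - 1, ("var", "n")))))),
}

result = E(("call", "count", ("const", 3)), {}, defs)
```

For the input 3, `result` holds four occurrences of the `if` parameter and three of the `call` parameter, that is, a symbolic sum that still leaves the actual parameter values open.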

As an example, applying symbolic evaluation to cb_least on a list of size 100, we obtain the following result:

$$
\begin{array}{l}
\mathit{cb}_{least}(\mathit{list}(100)) = 497 \cdot C_{varref} + 100 \cdot C_{null} + 199 \cdot C_{car} \\
\quad +\ 199 \cdot C_{cdr} + 99 \cdot C_{\le} + 199 \cdot C_{if} + 99 \cdot C_{let} + 99 \cdot C_{call}
\end{array}
$$

This symbolic evaluation is exactly a specialized partial evaluation. It is fully automatic and computes the most accurate cost bound possible with respect to the given program structure. It always terminates as long as the cost-bound function terminates.

The symbolic evaluation given only values of input size parameters is inefficient compared to direct evaluation given values of both input size parameters and particular primitive cost parameters, even though the resulting function takes virtually constant time given any values of primitive cost parameters. For example, directly evaluating a quadratic-time reverse function (that uses the append operation) on input of size 20 takes about 0.96 milliseconds, whereas the symbolic evaluation takes 670 milliseconds, roughly 700 times slower. We propose further optimizations below that greatly speed up the symbolic evaluation.

4.2 Avoiding Repeated Summations over Recursions

The symbolic evaluation above is a global optimization over all cost-bound functions involved. During the evaluation, summations of symbolic primitive cost parameters within each function definition are performed repeatedly while the computation recurses. Thus, we can speed up the symbolic evaluation by first performing such summations in a preprocessing step. Specifically, we create a vector and let each element correspond to a primitive cost parameter. The transformation S, given below, performs this optimization. We use vcb_f to denote the transformed cost-bound function of f that operates on vectors. We use function add_v to compute the component-wise sum of the argument vectors, and we use function max_v to compute the component-wise maximum of the argument vectors.

1300 IEEE TRANSACTIONS ON COMPUTERS, VOL. 50, NO. 12, DECEMBER 2001

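$$
\begin{array}{ll}
\text{program:} &
\mathcal{S}\left[\!\!\left[
\begin{array}{l}
\mathit{cb}_{f_1}(v_1,\ldots,v_{n_1}) = e_1; \\
\quad\ldots \\
\mathit{cb}_{f_m}(v_1,\ldots,v_{n_m}) = e_m;
\end{array}
\right]\!\!\right]
=
\begin{array}{l}
\mathit{vcb}_{f_1}(v_1,\ldots,v_{n_1}) = \mathcal{S}_c[\![e_1]\!]; \\
\quad\ldots \\
\mathit{vcb}_{f_m}(v_1,\ldots,v_{n_m}) = \mathcal{S}_c[\![e_m]\!];
\end{array} \\[12pt]
\text{primitive cost parameter:} & \mathcal{S}_c[\![C]\!] = \text{create a vector of 0s except with the component corresponding to } C \text{ set to 1} \\[4pt]
\text{summation:} & \mathcal{S}_c[\![\mathit{add}(e_1,\ldots,e_n)]\!] = \mathit{add}_v(\mathcal{S}_c[\![e_1]\!],\ldots,\mathcal{S}_c[\![e_n]\!]) \\[4pt]
\text{maximum:} & \mathcal{S}_c[\![\mathit{max}(e_1,\ldots,e_n)]\!] = \mathit{max}_v(\mathcal{S}_c[\![e_1]\!],\ldots,\mathcal{S}_c[\![e_n]\!]) \\[4pt]
\text{all others:} & \mathcal{S}_c[\![e]\!] = e
\end{array}
$$

Let V be the following vector of primitive cost parameters:

$$
\langle C_{varref},\ C_{nil},\ C_{cons},\ C_{null},\ C_{car},\ C_{cdr},\ C_{\le},\ C_{if},\ C_{let},\ C_{call} \rangle.
$$

Applying the above transformation on function cb_least yields function vcb_least, where components of the vectors correspond to the components of V and infix notation +_v is used for vector addition.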

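$$
\begin{array}{l}
\mathit{vcb}_{least}(x) \triangleq \langle 1,0,0,1,0,1,0,1,0,0 \rangle \\
\quad +_v\ (\textbf{let } v = f_{null}(f_{cdr}(x)) \\
\qquad\ \ \textbf{in if } v = \mathit{unknown} \textbf{ then } \mathit{max}_v(e_1, e_2) \\
\qquad\ \ \phantom{\textbf{in }} \textbf{else if } v \textbf{ then } e_1 \textbf{ else } e_2 \textbf{ end}), \\[6pt]
\text{where } e_1 = \langle 1,0,0,0,1,0,0,0,0,0 \rangle, \\[2pt]
\phantom{\text{where }} e_2 = \langle 1,0,0,0,0,1,0,0,1,1 \rangle +_v \mathit{vcb}_{least}(\mathit{cdr}(x)) \\
\qquad +_v\ (\textbf{let } s = u_{least}(f_{cdr}(x)) \\
\qquad\quad\ \ \textbf{in } \langle 2,0,0,0,1,0,1,1,0,0 \rangle \\
\qquad\qquad +_v\ (\textbf{let } v = f_{\le}(f_{car}(x), s) \\
\qquad\qquad\quad\ \ \textbf{in if } v = \mathit{unknown} \textbf{ then } \langle 1,0,0,0,1,0,0,0,0,0 \rangle \\
\qquad\qquad\quad\ \ \phantom{\textbf{in }} \textbf{else if } v \textbf{ then } \langle 1,0,0,0,1,0,0,0,0,0 \rangle \\
\qquad\qquad\quad\ \ \phantom{\textbf{in }} \textbf{else } \langle 1,0,0,0,0,0,0,0,0,0 \rangle \textbf{ end}) \textbf{ end})
\end{array}
$$

The cost-bound function cb_least(x) is simply the dot product of vcb_least(x) and V.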
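The vector form can be checked against the symbolic result reported for a list of size 100 in Section 4.1. The Python sketch below is a minimal illustration under stated assumptions, not the authors' Scheme code: a partially known list is modeled as a nonempty Python list whose elements may be the marker `UNKNOWN`, the list spine is assumed to be fully known (so the test on `null(cdr(x))` is always decidable), and the names `addv`, `maxv`, `f_leq`, `u_least`, `vcb_least`, and `cb_least` are chosen here to mirror the notation.

```python
UNKNOWN = object()  # marker for an unknown list element (assumed representation)

# Component order follows V: varref, nil, cons, null, car, cdr, <=, if, let, call
PARAMS = ("varref", "nil", "cons", "null", "car", "cdr", "leq", "if", "let", "call")

def addv(*vs):
    """Component-wise sum of vectors."""
    return tuple(sum(col) for col in zip(*vs))

def maxv(*vs):
    """Component-wise maximum of vectors."""
    return tuple(max(col) for col in zip(*vs))

def f_leq(a, b):
    """<= on partially known values: unknown if either operand is unknown."""
    return UNKNOWN if a is UNKNOWN or b is UNKNOWN else a <= b

def u_least(x):
    """least on a partially known, nonempty list; may return UNKNOWN."""
    if len(x) == 1:
        return x[0]
    s = u_least(x[1:])
    v = f_leq(x[0], s)
    if v is UNKNOWN:
        return UNKNOWN
    return x[0] if v else s

def vcb_least(x):
    """Vector of operation counts bounding the cost of least(x)."""
    head = (1, 0, 0, 1, 0, 1, 0, 1, 0, 0)
    e1 = (1, 0, 0, 0, 1, 0, 0, 0, 0, 0)
    if len(x) == 1:  # null(cdr(x)) is known to be true
        return addv(head, e1)
    s = u_least(x[1:])
    v = f_leq(x[0], s)
    then_cost = (1, 0, 0, 0, 1, 0, 0, 0, 0, 0)
    else_cost = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
    if v is UNKNOWN:
        inner = maxv(then_cost, else_cost)  # test undecided: take the maximum
    else:
        inner = then_cost if v else else_cost
    e2 = addv((1, 0, 0, 0, 0, 1, 0, 0, 1, 1), vcb_least(x[1:]),
              (2, 0, 0, 0, 1, 0, 1, 1, 0, 0), inner)
    return addv(head, e2)

def cb_least(x, costs):
    """Dot product of vcb_least(x) with concrete primitive cost parameters."""
    return sum(n * costs[p] for n, p in zip(vcb_least(x), PARAMS))

counts = vcb_least([UNKNOWN] * 100)
```

For 100 unknown elements, `counts` is `(497, 0, 0, 100, 199, 199, 99, 199, 99, 99)`, which agrees component by component with the symbolic bound given earlier for cb_least(list(100)).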

This transformation incrementalizes the computation over recursions to avoid repeated summation. Again, this is fully automatic and takes time linear in terms of the size of the cost-bound function.

The result of this optimization is a drastic speedup of the evaluation. For example, optimized symbolic evaluation of the same quadratic-time reverse on input of size 20 takes only 2.55 milliseconds, while direct evaluation takes 0.96 milliseconds, a slowdown of less than a factor of three; it is over 260 times faster than symbolic evaluation without this optimization.
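The reported ratios follow directly from the three timings quoted for reverse on input of size 20. The short Python check below only redoes that arithmetic; the variable names are chosen here for illustration.

```python
direct_ms = 0.96      # direct evaluation
symbolic_ms = 670.0   # symbolic evaluation without the vector optimization
optimized_ms = 2.55   # symbolic evaluation with the vector optimization

slowdown_unoptimized = symbolic_ms / direct_ms   # hundreds of times slower
slowdown_optimized = optimized_ms / direct_ms    # less than three times slower
speedup = symbolic_ms / optimized_ms             # over 260 times faster
```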

5 MAKING COST-BOUND FUNCTIONS ACCURATE

While loops and recursions affect cost bounds most, the accuracy of the cost bounds calculated also depends on the handling of the conditionals in the original program, which is reflected in the cost-bound function. For conditionals whose test results are known to be true or false at symbolic-evaluation time, the appropriate branch is chosen; so, other branches, which may even take longer, are not considered for the worst-case cost. This is a major source of accuracy for our worst-case bound.

For conditionals whose test results are not known at symbolic-evaluation time, we need to take the maximum cost among all alternatives. The only case in which this would produce an inaccurate cost bound is when the test in a conditional in one subcomputation implies the test in a conditional in another subcomputation. For example, consider a variable v whose value is unknown and

$$
\begin{array}{l}
e_1 = \textbf{if } v \textbf{ then } 1 \textbf{ else } \mathit{Fibonacci}(1000), \\
e_2 = \textbf{if } v \textbf{ then } \mathit{Fibonacci}(2000) \textbf{ else } 2.
\end{array}
$$

If we compute the cost bound for e1 + e2 directly, the result is at least c_Fibonacci(1000) + c_Fibonacci(2000). However, if we consider only the two realizable execution paths, we know that the worst case is c_Fibonacci(2000) plus some small constants. This is known as the false-path elimination problem [3].

Two transformations, lifting conditions and simplifying conditionals, applied on the source program before constructing the cost-bound function, allow us to achieve accurate analysis results. In each function definition, the former lifts conditions to the outermost scope that the test does not depend on and the latter simplifies conditionals according to the lifted condition. For e1 + e2 in the above example, lifting the condition for e1, we obtain

$$
\textbf{if } v \textbf{ then } 1 + e_2 \textbf{ else } \mathit{Fibonacci}(1000) + e_2.
$$

Simplifying the conditionals in the two occurrences of e2 to Fibonacci(2000) and 2, respectively, we obtain

$$
\textbf{if } v \textbf{ then } 1 + \mathit{Fibonacci}(2000) \textbf{ else } \mathit{Fibonacci}(1000) + 2.
$$

To facilitate these transformations, we inline all function calls where the function is not defined recursively.
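The effect of lifting and simplifying on the bound can be seen with made-up numbers. In the Python sketch below, the costs assigned to the Fibonacci calls are hypothetical stand-ins (the text gives no concrete values), the constant branches are given placeholder costs 1 and 2, and both function names are introduced here for illustration. Each conditional on the same unknown v is modeled as a pair of branch costs.

```python
# Hypothetical stand-in costs; the real costs of Fibonacci(1000) and
# Fibonacci(2000) are not given in the text.
COST_FIB_1000 = 1_000
COST_FIB_2000 = 2_000

# Each conditional on v is a pair (cost if v is true, cost if v is false).
e1 = (1, COST_FIB_1000)
e2 = (COST_FIB_2000, 2)

def bound_without_lifting(conditionals):
    """Take the maximum at each conditional independently, then sum."""
    return sum(max(t, f) for t, f in conditionals)

def bound_with_lifting(conditionals):
    """All conditionals test the same v, so only two paths are realizable."""
    cost_if_true = sum(t for t, _ in conditionals)
    cost_if_false = sum(f for _, f in conditionals)
    return max(cost_if_true, cost_if_false)

naive = bound_without_lifting([e1, e2])   # 1000 + 2000 = 3000
lifted = bound_with_lifting([e1, e2])     # max(1 + 2000, 1000 + 2) = 2001
```

With these stand-in costs, the bound without lifting includes the infeasible path that pays for both Fibonacci calls, while the bound with lifting pays only for the more expensive realizable path.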

The power of these transformations depends on the reasoning used in simplifying the conditionals, as has been studied in many program transformation methods [51], [45], [47], [18], [32]. At least syntactic equality can be used, which identifies the most obvious source of inaccuracy. These optimizations also speed up the symbolic evaluation since now obviously infeasible execution paths are not searched.

These transformations have been implemented and applied on many test programs. Even though the resulting programs can be analyzed more accurately and more efficiently, we have not performed separate measurements. The major reason is that our example programs do not contain conditional tests that are implied by other conditional tests. These simple transformations are just examples of many powerful program optimization techniques, especially on functional programs, that can be used to make cost-bound functions more accurate as well as more efficient. We plan to explore more of these optimizations and measure their effects as we experiment with more programs.

Note that these transformations on the source program are aimed at making the cost-bound function more accurate and more efficient, not at optimizing the source program. Even though making the source program faster also makes the corresponding cost-bound function faster, these two goals are different. Optimizing the source program is meant to produce a different program that has a smaller cost. Cost analysis is meant to accurately analyze the cost of a given program.

To make use of all the techniques for making cost-bound analysis efficient and accurate, we perform an overall cost-bound analysis by applying the following transformations in order to the source program: lifting conditions and simplifying conditionals (as in Section 5), constructing cost functions and then cost-bound functions (as in Section 3), and precomputing repeated local summations and then performing global symbolic evaluation (as in Section 4).

6 IMPLEMENTATION AND EXPERIMENTATION

We have implemented the analysis approach in a prototype system, ALPA (Automatic Language-based Performance Analyzer). We performed a large number of experiments and obtained encouragingly good results.

6.1 Implementation and Experimental Results

The implementation is for a subset of Scheme [2], [11], [1]. An editor for the source programs is implemented using the Synthesizer Generator [40] and, thus, we can easily change the syntax for the source programs. For example, the current implementation supports both the syntax used in this paper and Scheme syntax. Construction of cost-bound functions is written in SSL, a simple functional language used in the Synthesizer Generator. Lifting conditions, simplifying conditionals, and inlining nonrecursive calls are also implemented in SSL. The symbolic evaluation and optimizations are written in Scheme.

Fig. 1 gives the results of symbolic evaluation of the cost-bound functions for six example programs on inputs of sizes 10 to 2,000. For example, the second row of the figure means that, for insertion sort on inputs of size 10, the cost-bound function is

$$
\begin{array}{l}
\mathit{cb}_{insertionsort}(\mathit{list}(10)) = 321 \cdot C_{varref} + 11 \cdot C_{nil} + 55 \cdot C_{cons} + 66 \cdot C_{null} \\
\quad +\ 100 \cdot C_{car} + 55 \cdot C_{cdr} + 45 \cdot C_{\le} + 111 \cdot C_{if} + 65 \cdot C_{call}.
\end{array}
$$

The last column lists the sums for every row. For the set union example, we used inputs where both arguments were of the given sizes. The numbers in the figure characterize various aspects of the examples; they contribute to the actual time and space bounds discussed below. We verified that all numbers are also exact worst-case counts. For example, for insertion sort on inputs of size 10, 65 function calls are indeed made during a worst-case execution. The worst-case counts are verified by using a modified evaluator. These experiments show that our cost-bound functions can give accurate cost bounds in terms of counts of different operations performed.
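Once the symbolic counts are available, a concrete bound is just a weighted sum. The Python sketch below uses the insertion sort counts quoted above for inputs of size 10; the dictionary layout and the function name `bound` are assumptions made here for illustration. Setting every primitive cost parameter to 1 gives the total operation count, and weighting only one parameter (for example, a per-constructor byte size such as 8 for cons) isolates a single resource.

```python
# Counts from the symbolic bound for insertion sort on inputs of size 10.
insertion_sort_10 = {
    "varref": 321, "nil": 11, "cons": 55, "null": 66, "car": 100,
    "cdr": 55, "leq": 45, "if": 111, "call": 65,
}

def bound(counts, params):
    """Weighted sum of counts; parameters not listed in params are treated as 0."""
    return sum(n * params.get(name, 0) for name, n in counts.items())

total_operations = bound(insertion_sort_10, dict.fromkeys(insertion_sort_10, 1))
cons_only = bound(insertion_sort_10, {"cons": 8})
```

Here `total_operations` is 829 and `cons_only` is 440 (55 constructions weighted by 8).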

Fig. 2 compares the times of direct evaluation of cost-bound functions, with each primitive cost parameter set to 1, and the times of optimized symbolic evaluation, obtaining the exact symbolic counts as in Fig. 1. These measurements are taken on a Sun Ultra 1 with a 167 MHz CPU and 64 MB of main memory. They include garbage-collection time. The times excluding garbage collection are all about 1 percent faster, so they are not shown here. These experiments show that our optimizations of cost-bound functions allow symbolic evaluation to be only a few times slower than direct evaluation rather than hundreds of times slower.

For merge sort, the cost-bound function constructed using the algorithms in this paper takes several days to evaluate on inputs of size 50 or larger. Special but simple optimizations were done to obtain the numbers in Fig. 1, namely, letting the cost-bound function for merge avoid base cases as long as possible and using sizes of lists in place of lists of unknowns; the resulting symbolic evaluation takes only seconds. Such optimizations are yet to be implemented to be performed automatically. For all other examples, it takes at most 2.7 hours to evaluate the cost-bound functions.

Note that, on small inputs, symbolic evaluation takes relatively much more time than direct evaluation, due to the relatively large overhead of vector setup; as inputs get larger, symbolic evaluation is almost as fast as direct evaluation for most examples. Again, after the symbolic evaluation, cost bounds can be computed in virtually no time given values of primitive cost parameters.

Among over 20 programs we have analyzed using ALPA, two did not terminate. One is quicksort and the other is a contrived variation of sorting; both diverge because the recursive structure for splitting a list depends on the values of unknown list elements. This is similar to nontermination caused by merging paths in other methods [33], [34], but nontermination happens much less often in our method since we essentially avoid merging paths as much as possible. We have found a different symbolic-evaluation strategy that uses a kind of incremental path selection and the evaluation would terminate for both examples, as well as all other examples, giving accurate worst-case bounds. That evaluation algorithm is not yet implemented. Future work is to exploit results from static analysis for identifying sources of nontermination [27] to make cost-bound analysis terminate more often. For practical use of a cost-bound analyzer that might not terminate on certain inputs, we can modify the evaluator so that, if it is stopped at any time, it outputs the cost bound calculated up to that point. This means that a longer-running analysis might yield a higher bound.

6.2 Further Experiments

We also estimated approximate bounds on the actual running times by measuring primitive cost parameters for running times using control loops and calculated accurate bounds on the heap space allocated for constructors in the programs based on the number of bytes allocated for each constructor by the compiler. For time-bound analysis, we performed two sets of experiments: the first for a machine with cache enabled and the second for a machine with cache disabled. The first gives tight bounds in most cases, but has a few underestimations for inputs that are very small or very large, which we attribute to the cache effects. The second gives conservative and tight bounds for all inputs.


Fig. 2. Times of direct evaluation vs. optimized symbolic evaluations (in milliseconds).

Fig. 1. Results of symbolic evaluation of cost-bound functions.


We first describe experiments for time-bound analysis with cache enabled and for analysis of heap space allocation bound and then analyze the cache effects and show results for time-bound analysis with cache disabled.

The measurements and analyses for time bounds are performed for source programs compiled with the Chez Scheme compiler [8]. The source program does not use any library; in particular, no numbers are large enough to trigger the bignum implementation of Chez Scheme. We tried to avoid compiler optimizations by setting the optimization level to 0; we view necessary optimizations as having already been applied to the program. To handle garbage-collection time, we performed separate sets of experiments: those that exclude garbage-collection times in both calculations and measurements and those that include garbage-collection time in both (see footnote 2). Our current analysis does not handle the effects of cache memory or instruction pipelining; we approximated cache effects by taking operands circularly from a cycle of 2,000 elements when measuring primitive cost parameters, as discussed further below. For time-bound analysis with cache enabled, the particular numbers reported are taken on a Sun Ultra 1 with a 167 MHz CPU and 64 MB of main memory; we have also performed the analysis for several other kinds of SPARCstations and the results are similar.

Since the minimum running time of a program construct is about 0.1 microseconds and the precision of the timing function is 10 milliseconds, we use control/test loops that iterate 10,000,000 times, keeping measurement error under 0.001 microseconds, i.e., 1 percent. Such a loop is repeated 100 times and the average value is taken to compute the primitive cost parameter for the tested construct (the variance is less than 10 percent in most cases). The calculation of the time bound is done by plugging these measured parameters into the optimized time-bound function. We then run each example program an appropriate number of times to measure its running time with less than 1 percent error.
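The error figures follow from dividing the timer precision by the number of loop iterations. The Python lines below only redo this arithmetic with the values stated above; the variable names are chosen here for illustration.

```python
timer_precision_s = 10e-3        # precision of the timing function: 10 milliseconds
loop_iterations = 10_000_000     # iterations of the control/test loop
min_construct_time_s = 0.1e-6    # minimum running time of a construct: 0.1 microseconds

# Timer precision spread over all iterations: 1e-9 s, i.e., 0.001 microseconds.
error_per_iteration_s = timer_precision_s / loop_iterations

# Relative to the fastest construct this is about 0.01, i.e., 1 percent.
relative_error = error_per_iteration_s / min_construct_time_s
```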

Fig. 3 shows the estimated and measured worst-case times for six example programs on inputs of sizes 10 to 2,000. These times do not include garbage-collection times. The item me/ca is the measured time expressed as a percentage of the calculated time. In general, all measured times are closely bounded by the calculated times (with about 90-95 percent accuracy) except when inputs are very small (20, in one case) or very large (2,000, in three cases), which is analyzed and addressed below. The measurements including garbage-collection times are similar except with a few more cases of underestimation. Fig. 4 depicts the numbers in Fig. 3 for inputs of sizes up to 1,000. Examples such as sorting are classified as complex examples in previous studies [37], [28], where calculated time is as much as 67 percent higher than measured time and where only the result for one sorting program on a single input (of size 10 [37] or 20 [28]) is reported in each experiment.

Using the cost bounds computed, we can also calculate, accurately instead of approximately, bounds on the heap space dynamically allocated for constructors in the source programs. The number of bytes allocated for each constructor can be obtained precisely based on the language implementation. For example, Chez Scheme allocates 8 bytes for a cons-cell on the heap; this information can also be obtained easily using its statistics utilities. Based on results in Fig. 1, by setting C_cons to 8 and other primitive cost parameters to 0, we obtain exact bounds on the heap space dynamically allocated for constructors in the programs, as shown in Fig. 5.

2. We had originally tried to avoid garbage collection by writing loops instead of recursions as much as possible and tried to exclude garbage-collection times completely. The idea of including garbage-collection times comes from an earlier experiment, where we mistakenly used a timing function of Chez Scheme that included garbage-collection time.

Fig. 3. Calculated and measured worst-case times (in milliseconds) with cache enabled.

Consider the accuracy of the time-bound analysis with cache enabled. We found that, when inputs are very small (20), the measured time is occasionally above the calculated time for some examples. Also, when inputs are very large (1,000 for measurements including garbage-collection time or 2,000 excluding garbage-collection time), the measured times for some examples are above the calculated time. We attribute these to cache memory effects for the following reasons: First, the initial cache misses are more likely to show up on small inputs. Second, underestimation for inputs of size 2,000 in Fig. 3 happens exactly for the three examples whose allocated heap space is very large in Fig. 5; recall that we used a cycled data structure of size 2,000 when measuring primitive cost parameters. Furthermore, for programs that use less space, our calculated bounds are accurate for even larger input sizes and, for programs that use an extremely large amount of space even on small inputs, we have much worse underestimation. For example, for Cartesian product, underestimation occurs for small input sizes (50 to 200); as an example, on input of size 200, the measured time is 65 percent higher than the calculated time.

Fig. 4. Comparison of calculated and measured worst-case times with cache enabled.

We performed a second set of experiments for time-bound analysis for a machine with cache disabled. The machine used is a Sun Ultra 10 with a 333 MHz CPU and 256 MB of main memory. Fig. 6 shows the estimated and measured worst-case times for the same six programs on inputs of sizes 10 to 2,000. These times do not include garbage-collection times. We can see that all measured times are closely bounded by the calculated times, with no underestimation. Fig. 7 depicts the numbers in Fig. 6.

To accommodate cache effects in time-bound analysis with cache enabled, we could adjust our measurements of primitive cost parameters on data structures of appropriate size. The appropriate size can be determined based on a precise space usage analysis. Heap-space allocation is only one less direct aspect. More directly, we can incorporate precise knowledge about compiler-generated machine instructions into our analysis method. We leave this as future work. Our current method can be used to approximate time-bound estimation in the presence of low-level effects or precise analysis in their absence and can be used for more accurate space-bound analysis that helps in addressing memory issues.

7 RELATED WORK AND CONCLUSION

A preliminary version of this work appeared in [30]. An overview of comparison with related work in cost analysis appears in Section 2. Certain detailed comparisons have also been discussed while presenting our method. This section summarizes them, compares our method with analyses for loop bounds and execution paths in more detail, and concludes.

Compared to work in algorithm analysis and program complexity analysis [26], [44], [53], [7], this work consistently pushes through symbolic primitive cost parameters, so it allows us to calculate actual cost bounds and validate the results with experimental measurements. There is also work on analyzing average-case complexity [17], which has a different goal than worst-case bounds. Compared to work in systems [46], [37], [36], [28], this work explores program analysis and transformation techniques to make the analysis automatic, efficient, and accurate, overcoming the difficulties caused by the inability to obtain loop bounds, recursion depths, or execution paths automatically and precisely. There is also work for measuring primitive cost parameters for the purpose of general performance prediction [43], [42]. In that work, information about execution paths was obtained by running the programs on a number of inputs; for programs such as insertion sort, whose best-case and worst-case execution times differ greatly, the predicted time using this method could be very inaccurate.

Fig. 5. Bounds of heap space allocated for constructors (in bytes).

Fig. 6. Calculated and measured worst-case times (in milliseconds) with cache disabled.

A number of techniques have been studied for obtaining loop bounds or execution paths for time analysis [36], [3], [13], [19], [21]. Manual annotations [36], [28] are inconvenient and error-prone [3]. Automatic analysis of such information has two main problems. First, even when a precise loop bound can be obtained by symbolic evaluation of the program [13], separating the loop and path information from the rest of the analysis is generally less accurate than an integrated analysis [34]. Second, approximations for merging paths from loops, or recursions, very often lead to nontermination of the time analysis, not just looser bounds [13], [19], [34]. Some newer methods, while powerful, apply only to certain classes of programs [21]. In contrast, our method allows recursions, or loops, to be considered naturally in the overall cost analysis based on partially known input structures. In addition, our method does not merge paths from recursions, or loops; this may cause exponential time complexity of the analysis in the worst case, but our experiments on test programs show that the analysis is still feasible for inputs of sizes in the thousands. We have also studied simple but powerful optimizations to speed up the analysis dramatically.

Fig. 7. Comparison of calculated and measured worst-case times with cache disabled.

In the analysis for cache behavior [14], [15], loops are transformed into recursive calls, and a predefined callstring level determines how many times the fixed-point analysis iterates and, thus, how the analysis results are approximated. Our method allows the analysis to perform the exact number of recursions, or iterations, for the given partially known input data structures. The work by Lundqvist and Stenström [33], [34] is based on ideas similar to ours. They apply the ideas at machine instruction level and can more accurately take into account the effects of instruction pipelining and data caching, but they cannot handle dynamically allocated data structures as we can and their method for merging paths for loops would lead to nonterminating analysis for many more programs than our method. We apply the ideas at the source level and our experiments show that we can calculate more accurate cost bounds, and for many more programs, than methods that merge paths, while the calculation is still efficient. There are also methods for time analysis based on program flow graphs [39], [6]. Unlike our method, these methods do not exploit given input sizes and they require programmers to give precise path information.

The idea of using partially known input structures originates from Rosendahl [41]. We have extended it to manipulate primitive cost parameters. We also handle binding constructs, which is simple but necessary for efficient computation. An innovation in our method is to optimize the cost-bound function using partial evaluation, incremental computation, and transformations of conditionals to make the analysis more efficient and more accurate. Partial evaluation [5], [24], [23], incremental computation [32], [31], [29], and other transformations have been studied intensively in programming languages. Their applications in our cost-bound analysis are particularly simple and clean; the resulting transformations are fully automatic and efficient.

We have started to explore a suite of new language-based techniques for cost analysis, in particular, analyses and optimizations for further speeding up the evaluation of the cost-bound function. We have also applied our general approach to analyze stack space and live heap space [48], which can further help predict garbage-collection and caching behavior. We can also analyze lower bounds using a symmetric method, namely, by replacing maximum with minimum at all conditional points. Future work is to accommodate more lower-level dynamic factors for timing at the source-language level [28], [14] by examining the corresponding compiler-generated code, where cache and pipelining effects are explicit.

In conclusion, the approach we propose is based entirely on high-level programming languages. The methods and techniques are intuitive; together they produce automatic tools for analyzing cost bounds efficiently and accurately and can be used to accurately or approximately analyze time and space bounds.

ACKNOWLEDGMENTS

The authors thank the anonymous referees for their careful reviews and many very helpful comments. This work was supported in part by the US National Science Foundation under Grant CCR-9711253 and US Office of Naval Research under Grants N00014-99-1-0132 and N00014-01-1-0109.

REFERENCES

[1] H. Abelson et al., "Revised Report on the Algorithmic Language Scheme," Higher-Order and Symbolic Computation, vol. 11, no. 1, pp. 7-105, Aug. 1998.

[2] H. Abelson, G.J. Sussman, and J. Sussman, Structure and Interpretation of Computer Programs. MIT Press and McGraw-Hill, 1985.

[3] P. Altenbernd, "On the False Path Problem in Hard Real-Time Programs," Proc. Eighth EuroMicro Workshop Real-Time Systems, pp. 102-107, June 1996.

[4] R. Arnold, F. Mueller, D.B. Whalley, and M.G. Harmon, "Bounding Worst-Case Instruction Cache Performance," Proc. 13th IEEE Real-Time Systems Symp., 1994.

[5] Partial Evaluation and Mixed Computation, B. Bjørner, A.P. Ershov, and N.D. Jones, eds. Amsterdam: North-Holland, 1988.

[6] J. Blieberger, "Data-Flow Frameworks for Worst-Case Execution Time Analysis," Real-Time Systems, to appear.

[7] J. Blieberger and R. Lieger, "Worst-Case Space and Time Complexity of Recursive Procedures," Real-Time Systems, vol. 11, no. 2, pp. 115-144, 1996.

[8] Cadence Research Systems, Chez Scheme System Manual, revision 2.4, Cadence Research Systems, Bloomington, Ind., July 1994.

[9] D.R. Chase, M. Wegman, and F.K. Zadeck, "Analysis of Pointers and Structures," Proc. ACM SIGPLAN '90 Conf. Programming Language Design and Implementation, pp. 296-310, June 1990.

[10] J. Cohen, "Computer-Assisted Microanalysis of Programs," Comm. ACM, vol. 25, no. 1, pp. 724-733, Oct. 1982.

[11] R.K. Dybvig, The Scheme Programming Language. Englewood Cliffs, N.J.: Prentice Hall, 1987.

[12] J. Engblom, P. Altenbernd, and A. Ermedahl, "Facilitating Worst-Case Execution Time Analysis for Optimized Code," Proc. 10th EuroMicro Workshop Real-Time Systems, June 1998.

[13] A. Ermedahl and J. Gustafsson, "Deriving Annotations for Tight Calculation of Execution Time," Proc. Euro-Par '97, pp. 1298-1307, Aug. 1997.

[14] C. Ferdinand, F. Martin, and R. Wilhelm, "Applying Compiler Techniques to Cache Behavior Prediction," Proc. ACM SIGPLAN 1997 Workshop Languages, Compilers, and Tools for Real-Time Systems, pp. 37-46, 1997.

[15] C. Ferdinand and R. Wilhelm, "Efficient and Precise Cache Behavior Prediction for Real-Time Systems," Real-Time Systems, vol. 17, nos. 2-3, pp. 131-181, Nov. 1999.

[16] P. Flajolet, B. Salvy, and P. Zimmermann, "Lambda-Upsilon-Omega: An Assistant Algorithms Analyzer," Applied Algebra, Algebraic Algorithms and Error-Correcting Codes, T. Mora, ed., pp. 201-212, July 1989.

[17] P. Flajolet, B. Salvy, and P. Zimmermann, "Automatic Average-Case Analysis of Algorithms," Theoretical Computer Science, Series A, vol. 79, no. 1, pp. 37-109, Feb. 1991.

[18] Y. Futamura and K. Nogi, "Generalized Partial Evaluation," Partial Evaluation and Mixed Computation, B. Bjørner, A.P. Ershov, and N.D. Jones, eds., pp. 133-151, Amsterdam: North-Holland, 1988.

[19] J. Gustafsson and A. Ermedahl, "Automatic Derivation of Path and Loop Annotations in Object-Oriented Real-Time Programs," J. Parallel and Distributed Computing Practices, vol. 1, no. 2, June 1998.

[20] M.G. Harmon, T.P. Baker, and D.B. Whalley, "A Retargetable Technique for Predicting Execution Time," Proc. 11th IEEE Real-Time Systems Symp., pp. 68-77, Dec. 1992.

[21] C. Healy, M. Sjödin, V. Rustagi, and D. Whalley, "Bounding Loop Iterations for Timing Analysis," Proc. IEEE Real-Time Applications Symp., June 1998.

[22] L.J. Hendren, J. Hummel, and A. Nicolau, "Abstractions for Recursive Pointer Data Structures: Improving the Analysis and Transformation of Imperative Programs," Proc. ACM SIGPLAN '92 Conf. Programming Language Design and Implementation, pp. 249-260, June 1992.

[23] N.D. Jones, "An Introduction to Partial Evaluation," ACM Computing Surveys, vol. 28, no. 3, pp. 480-503, 1996.

[24] N.D. Jones, C.K. Gomard, and P. Sestoft, Partial Evaluation and Automatic Program Generation. Englewood Cliffs, N.J.: Prentice Hall, 1993.

[25] D.E. Knuth, The Art of Computer Programming, vol. 1. Reading, Mass.: Addison-Wesley, 1968.

[26] D. Le Métayer, "Ace: An Automatic Complexity Evaluator," ACM Trans. Programming Languages and Systems, vol. 10, no. 2, pp. 248-266, Apr. 1988.

[27] C.S. Lee, N.D. Jones, and A.M. Ben-Amram, "The Size-Change Principle for Program Termination," Conf. Record 28th Ann. ACM Symp. Principles of Programming Languages, Jan. 2001.

[28] S.-S. Lim, Y.H. Bae, G.T. Jang, B.-D. Rhee, S.L. Min, C.Y. Park, H. Shin, K. Park, S.-M. Moon, and C.-S. Kim, "An Accurate Worst Case Timing Analysis for RISC Processors," IEEE Trans. Software Eng., vol. 21, no. 7, pp. 593-604, July 1995.

[29] Y.A. Liu, "Efficiency by Incrementalization: An Introduction," Higher-Order and Symbolic Computation, vol. 13, no. 4, pp. 289-313, Dec. 2000.

[30] Y.A. Liu and G. Gómez, "Automatic Accurate Time-Bound Analysis for High-Level Languages," Proc. ACM SIGPLAN 1998 Workshop Languages, Compilers, and Tools for Embedded Systems, pp. 31-40, June 1998.

[31] Y.A. Liu, S.D. Stoller, and T. Teitelbaum, "Static Caching for Incremental Computation," ACM Trans. Programming Languages and Systems, vol. 20, no. 3, pp. 546-585, May 1998.

[32] Y.A. Liu and T. Teitelbaum, "Systematic Derivation of Incremental Programs," Scientific Computer Programming, vol. 24, no. 1, pp. 1-39, Feb. 1995.

[33] T. Lundqvist and P. Stenström, "Integrating Path and Timing Analysis Using Instruction-Level Simulation Techniques," Proc. ACM SIGPLAN 1998 Workshop Languages, Compilers, and Tools for Embedded Systems, pp. 1-15, June 1998.

[34] T. Lundqvist and P. Stenström, "An Integrated Path and Timing Analysis Method Based on Cycle-Level Symbolic Execution," Real-Time Systems, vol. 17, nos. 2-3, pp. 183-207, Nov. 1999.

[35] R. Milner, M. Tofte, and R. Harper, The Definition of Standard ML. Cambridge, Mass.: MIT Press, 1990.

[36] C.Y. Park, "Predicting Program Execution Times by Analyzing Static and Dynamic Program Paths," Real-Time Systems, vol. 5, pp. 31-62, 1993.

[37] C.Y. Park and A.C. Shaw, "Experiments with a Program Timing Tool Based on Source-Level Timing Schema," Computer, vol. 24, no. 5, pp. 48-57, May 1991.

[38] P. Persson, "Live Memory Analysis for Garbage Collection in Embedded Systems," Proc. ACM SIGPLAN 1999 Workshop Languages, Compilers, and Tools for Embedded Systems, pp. 45-54, May 1999.

[39] P.P. Puschner and A.V. Schedl, "Computing Maximum Task Execution Times - A Graph-Based Approach," Real-Time Systems, vol. 13, no. 1, pp. 67-91, 1997.

[40] T. Reps and T. Teitelbaum, The Synthesizer Generator: A System for Constructing Language-Based Editors. New York: Springer-Verlag, 1988.

[41] M. Rosendahl, "Automatic Complexity Analysis," Proc. Fourth Int'l Conf. Functional Programming Languages and Computer Architecture, pp. 144-156, Sept. 1989.

[42] R.H. Saavedra and A.J. Smith, "Analysis of Benchmark Characterization and Benchmark Performance Prediction," ACM Trans. Computer Systems, vol. 14, no. 4, pp. 344-384, Nov. 1996.

[43] R.H. Saavedra-Barrera, A.J. Smith, and E. Miya, "Machine Characterization Based on an Abstract High-Level Language Machine," IEEE Trans. Computers, vol. 38, no. 12, pp. 1659-1679, Dec. 1989.

[44] D. Sands, "Complexity Analysis for a Lazy Higher-Order Language," Proc. Third European Symp. Programming, N.D. Jones, ed., pp. 361-376, May 1990.

[45] W.L. Scherlis, "Program Improvement by Internal Specialization," Conf. Record Eighth Ann. ACM Symp. Principles of Programming Languages, pp. 41-49, Jan. 1981.

[46] A. Shaw, "Reasoning about Time in Higher Level Language Software," IEEE Trans. Software Eng., vol. 15, no. 7, pp. 875-889, July 1989.

[47] V.F. Turchin, "The Concept of a Supercompiler," ACM Trans. Programming Languages and Systems, vol. 8, no. 3, pp. 292-325, July 1986.

[48] L. Unnikrishnan, S.D. Stoller, and Y.A. Liu, "Automatic Accurate Live Memory Analysis for Garbage-Collected Languages," Proc. ACM SIGPLAN 2001 Workshop Languages, Compilers, and Tools for Embedded Systems, June 2001.

[49] P. Wadler, "Strictness Analysis Aids Time Analysis," Conf. Record 15th Ann. ACM Symp. Principles of Programming Languages, Jan. 1988.

[50] B. Wegbreit, "Mechanical Program Analysis," Comm. ACM, vol. 18, no. 9, pp. 528-538, Sept. 1975.

[51] B. Wegbreit, "Goal-Directed Program Transformation," IEEE Trans. Software Eng., vol. 2, no. 2, pp. 69-80, June 1976.

[52] D. Weise, R.F. Crew, M. Ernst, and B. Steensgaard, "Value Dependence Graphs: Representation without Taxation," Conf. Record 21st Ann. ACM Symp. Principles of Programming Languages, Jan. 1994.

[53] P. Zimmermann and W. Zimmermann, "The Automatic Complexity Analysis of Divide-and-Conquer Algorithms," Computer and Information Sciences VI, Elsevier, 1991.

Yanhong Annie Liu received the BS degree from Peking University, the MEng degree from Tsinghua University, and the MS and PhD degrees from Cornell University, all in computer science. She is an associate professor in the Computer Science Department at the State University of New York at Stony Brook. Her primary research interests are in the areas of programming languages, compilers, and software systems, with emphasis on general and systematic methods for improving the efficiency of computations.

Gustavo Gómez received the BS and MS degrees in computer science from the Monterrey Institute of Technology (ITESM). He is a PhD candidate in the Computer Science Department at Indiana University. His research interests include programming language-based methods for cost-bound analysis, garbage collection, and efficient compilation.
