Part 4 - Expressions and Statement

Louden, 2003

Chapter 7 - Control I: Expressions and Statements

Programming Languages: Principles and Practice, 2nd Ed. Kenneth C. Louden

Chapter 7 K. Louden, Programming Languages 2

Expression and Statement: Basic Introduction

"Control" is the general study of the semantics of execution paths through code: what gets executed, when, and in what order.

The most important control issue in modern languages is procedure/function/method call and return, studied in Chapter 8.

Here we study more localized control issues in expressions and statements, as well as exception handling, which involves non-local control but has a local component too.

Introduction

An expression represents a single data item, usually a number.

The expression may consist of a single entity, such as a constant or variable, or it may consist of some combination of such entities, interconnected by one or more operators.

Expressions can also represent logical conditions, which are either true or false.

In C, however, the conditions true and false are represented by the integer values 1 and 0, respectively. When a condition is tested, any nonzero value counts as true.


Several simple expressions are given below:

a + b
x = y
t = u + v
x <= y
++j

The first expression, which employs the addition operator (+), represents the sum of the values assigned to variables a and b.

The second expression involves the assignment operator (=), and causes the value represented by y to be assigned to x.


In the third expression, the value of the expression (u + v) is assigned to t.

The fourth expression takes the value 1 (true) if the value of x is less than or equal to the value of y.

Otherwise, the expression takes the value 0 (false). Here, <= is a relational operator that compares the values of x and y.

The final example causes the value of j to be increased by 1. Thus, the expression is equivalent to

j = j + 1


The increment (by unity) operator ++ is called a unary operator, because it only possesses one operand.

A statement causes the computer to carry out some definite action.

There are three different classes of statements in C: expression statements, compound statements, and control statements.

An expression statement consists of an expression followed by a semicolon. The execution of such a statement causes the associated expression to be evaluated. For example:

a = 6;
c = a + b;
++j;

The first two expression statements both cause the value of the expression on the right of the equal sign to be assigned to the variable on the left.

The third expression statement causes the value of j to be incremented by 1.

There is no restriction on the length of an expression statement: such a statement can even be split over many lines, so long as its end is signaled by a semicolon.


A compound statement consists of several individual statements enclosed within a pair of braces { }.

The individual statements may themselves be expression statements, compound statements, or control statements.

Unlike expression statements, compound statements do not end with semicolons.

A typical compound statement is shown below:

{
    pi = 3.141593;
    circumference = 2. * pi * radius;
    area = pi * radius * radius;
}


This particular compound statement consists of three expression statements, but acts like a single entity in the program in which it appears.

A symbolic constant is a name that substitutes for a sequence of characters.

The characters may represent either a number or a string.

When a program is compiled, each occurrence of a symbolic constant is replaced by its corresponding character sequence.

Symbolic constants are usually defined at the beginning of a program, by writing

#define NAME text

where NAME represents a symbolic name, typically written in upper-case letters, and text represents the sequence of characters that is associated with that name.

Note that text does not end with a semicolon, since a symbolic constant definition is not a true C statement.

In fact, during compilation, the resolution of symbolic names is performed (by the C preprocessor) before the start of true compilation.

For instance, suppose that a C program contains the following symbolic constant definition:

#define PI 3.141593

Suppose, further, that the program contains the statement

area = PI * radius * radius;

During the compilation process, the preprocessor replaces each occurrence of the symbolic constant PI by its corresponding text. Hence, the above statement becomes

area = 3.141593 * radius * radius;


Symbolic constants are particularly useful in scientific programs for representing constants of nature, such as the mass of an electron, the speed of light, etc.

Since these quantities are fixed, there is little point in assigning variables in which to store them.



Expressions: Expounded Definition

In their purest form, expressions do not involve control issues: subexpressions can be evaluated in arbitrary order, and the order does not affect the result. Functional programming tries to achieve this goal for whole programs.

Of course, there must always be a few expressions that can modify the execution/evaluation process: if-then-else expressions, short-circuit boolean operators, case/switch expressions.

If these could have arbitrary evaluation order, programs would become non-deterministic: any of a number of different outcomes might be possible. Usually this is not desirable, but see later.


Side Effects

A side effect is any observable change to memory, input, or output.

A program without any side effects is useless.

Side effects expose evaluation order:

class Order {
    static int x = 1;
    public static int getX() { return x++; }
    public static void main(String[] args) {
        System.out.println(x + getX());
    }
}

This prints 2, because Java evaluates the left operand x (value 1) before calling getX() (which returns 1 and sets x to 2). The corresponding C program will usually print 3: C leaves the operand order unspecified, and the call is typically made first.

Referential transparency limits side effects, so this cannot happen for referentially transparent expressions.


Prefix Notation

Ordinary prefix: the function name precedes its arguments, as in f(a,b,c).
Example: (a+b)*(c/d) becomes *(+(a,b),/(c,d))

Cambridge Polish (fully parenthesized): a variant that moves the left paren before the operator and deletes the commas.
Example: (a+b)*(c/d) becomes (* (+ a b) (/ c d)), as in LISP.

Polish: allows the parens to be dropped.
– Parens are unnecessary if the number of arguments of each operator is fixed and known.
– Example: *+ab/cd
– Named after the Polish mathematician Łukasiewicz, who invented the notation.
– Difficult to read.
– Works for any number of operands (unlike infix notation).
– Easy to decode mathematically.


Postfix Notation (suffix or reverse Polish)

– Rarely used as the surface syntax of programming languages (Forth and PostScript are exceptions), but frequently used as an execution-time representation.
– Easily evaluated using a stack, which makes code generation easy.

Infix Notation

– Suitable only for binary operations.
– In common use in mathematics and programming languages.

Problems with infix

– Since it only works for binary operations, other operations must use prefix (or postfix), which makes translation worse.
– Ambiguity: must be resolved by parens or precedence rules.

Comparison of notations

– Infix is a natural representation, but requires complex implicit rules and doesn't work for non-binary operators. In the absence of implicit rules, a large number of parens are required.
– Prefix and Cambridge Polish require a large number of parens.
– Polish requires no parens, but requires that you know the arity of each operator. Hard to read.


Functional Side Effects

Two possible solutions to the problem:

1. Write the language definition to disallow functional side effects
– No two-way parameters in functions
– No non-local references in functions
– Advantage: it works!
– Disadvantage: Programmers want the flexibility of two-way parameters (what about C?) and non-local references

2. Write the language definition to demand that operand evaluation order be fixed
– Disadvantage: limits some compiler optimizations


Overloaded Operators

C++ and Ada allow user-defined overloaded operators.

Potential problems:
– Users can define nonsense operations
– Readability may suffer, even when the operators make sense


Type Conversions

Def: A narrowing conversion is one that converts an object to a type that cannot include all of the values of the original type, e.g., float to int.

Def: A widening conversion is one in which an object is converted to a type that can include at least approximations to all of the values of the original type, e.g., int to float.


Def: A mixed-mode expression is one that has operands of different types.

Def: A coercion is an implicit type conversion.

The disadvantage of coercions:
– They decrease the type error detection ability of the compiler

In most languages, all numeric types are coerced in expressions, using widening conversions.

In Ada, there are virtually no coercions in expressions.


Explicit Type Conversions

Often called casts.


Eager evaluation

For each operation node in the expression tree, first evaluate (or generate code to evaluate) each operand, then apply the operation.

Sounds good, but there are complications. Consider

Z+(y==0?x:x/y)

If we evaluate all the operands of the conditional first, we compute x/y even when y is 0, which is exactly what the condition was set up to avoid.

Sometimes optimizations make use of the fact that association can be changed. Sometimes, reordering causes problems:

– Adding large and small values together can lose the small ones, because only a limited number of significant digits can be stored.
– It can be confusing to the reader: an exception is thrown or side effects are seen out of order.

Evaluation Order

– Bottom-up evaluation of the syntax tree.
– For function calls: all arguments are evaluated before the call.
– The translator may rearrange the order of computation so that it is more efficient.
– If the order of evaluation is unspecified, code that depends on it is not portable.


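Short Circuit Evaluation

Suppose Java did not use short-circuit evaluation.

Problem: table look-up

index = 1;
while ((index <= length) && (LIST[index] != value))
    index++;

When index exceeds length, LIST[index] would still be evaluated, giving a subscript out of range.

Problem: divide by zero (a test for a zero divisor in the first operand no longer protects a division in the second)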


C, C++, and Java use short-circuit evaluation for the usual Boolean operators (&& and ||), but also provide bitwise Boolean operators that are not short-circuit (& and |).

Ada: the programmer can specify either (short-circuit is specified with and then and or else).

Short-circuit evaluation exposes the potential problem of side effects in expressions, e.g. (a > b) || (b++ / 3): b is incremented only when a > b is false.


Delayed order evaluation

Used in functional languages. Don't evaluate until actually needed.

Example:

function sq(x:integer):integer;
begin
  sq := x*x;
end;

With delayed evaluation, the call sq(i++) becomes (i++)*(i++), so i++ is evaluated twice.


Strictness

An evaluation order for expressions is strict if all subexpressions of an expression are evaluated, whether or not they are needed to determine the value of the result; it is non-strict otherwise.

Arithmetic is almost always strict.

Every language has at least a few non-strict expressions (?:, &&, || in Java).

Some languages use a form of non-strictness called normal-order evaluation: no expression is ever evaluated until it is needed (Haskell). Also called delayed evaluation.

A form of strict evaluation called applicative-order is more common: "bottom-up" or "inside-out".

This still leaves open whether evaluation is left-to-right or not.


Function Calls

Obey evaluation rules like other expressions.

Applicative order: evaluate all arguments (left to right?), then call the procedure.

Normal order: pass in unevaluated representations of the arguments. Only evaluate when needed.

With side effects, order makes a difference.

Representation of argument value also makes a difference (value or reference?).


Examples

C and Scheme: no explicit order is required for subexpressions or arguments to calls.

Java always says left to right, but warns against using that knowledge.

Case/switch/cond expressions imply a top-down order:

(define (f) (cond (#t 1) (#t 2)))

Theoretically, this could return either 1 or 2 (non-determinism: the "guarded if" of the text).

Java and C outlaw this in switch statements: no duplicate cases.


Sequencing and Statements

A sequence of expressions makes no sense without side effects. Thus, a referentially transparent program should not need sequences.

Both ML and Scheme have sequences: (e1;e2;…) [ML] and (begin e1 e2 …) [Scheme].

What about a let expression? Is there an implied sequence? (let val x = e1 in e2 end;)

Applicative order would say yes: e1 is an argument to a call: (fn x => e2) e1.

Normal order would say no: only evaluate e1 if the value of x is actually needed in e2.

Statements by definition imply sequencing, since there is no computed value.


Statements

Can be viewed as expressions with no value.

Java and C have very few: if, while, do, for, switch, plus "expression statements."

Scheme: valueless expressions also exist: define, set! (some versions give these values).

ML: "valueless" expressions have value ().

What about val and fun? Declarations may be neither expressions nor statements.


Summary

Every language has three major program components: expressions, statements, and declarations.

Expressions are executed for their values (but may have side effects), and may or may not be sequenced.

Statements are executed solely for their side effects, and they must be sequenced.

Declarations define names; they can also give values to those names. They may or may not be viewed by a language as expressions or statements.
