A Set-Oriented Method Definition Language for Object Databases

CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE
Concurrency Computat.: Pract. Exper. 2003; 15:1275–1335 (DOI: 10.1002/cpe.731)

Object Systems

A set-oriented method definition language for object databases and its semantics

Elisa Bertino1, Giovanna Guerrini2,∗,† and Isabella Merlo2

1 Dipartimento di Scienze dell’Informazione, Università degli Studi di Milano, Via Comelico 39 - I20135 Milano, Italy
2 Dipartimento di Informatica e Scienze dell’Informazione, Università degli Studi di Genova, Via Dodecaneso 35 - I16146 Genova, Italy

SUMMARY

In this paper we propose a set-oriented rule-based method definition language for object-oriented databases. Most existing object-oriented database systems exploit a general-purpose imperative object-oriented programming language as the method definition language. Because methods are written in a general-purpose imperative language, it is difficult to analyze their properties and to optimize them. Optimization is important when dealing with large numbers of objects, as in databases. We therefore believe that the use of an ad hoc, set-oriented language can offer some advantages, at least at the specification level. In particular, such a language can offer an appropriate framework to reason about method properties.

In this paper, besides defining a set-oriented rule-based language for method definition, we formally define its semantics, addressing the problems of inconsistency and non-determinism in set-oriented updates. Moreover, we characterize some relevant properties of methods, such as conflicts among method specifications in sibling classes and behavioral refinement in subclasses. Copyright © 2003 John Wiley & Sons, Ltd.

KEY WORDS: object-oriented database systems; rule-based languages; database programming languages

1. INTRODUCTION

One relevant innovation introduced in the database field by the object-oriented paradigm is the removal of the distinction between data and the operations manipulating them. In relational database systems, two languages are provided: a data definition language (DDL) and a data manipulation language (DML), with the query language (QL) included as a sublanguage. The DML provides a small number of update primitives to express atomic data manipulations (such as insertion or deletion of a tuple or the modification of the field of a tuple). The DML is thus simple but not computationally complete. As a consequence, complex data manipulations must be expressed by embedding DML primitives in a host language, which is usually a general-purpose, computationally complete language. A well-known example is the use of SQL [1] embedded in host languages, such as C or COBOL.

∗ Correspondence to: Giovanna Guerrini, Dipartimento di Informatica e Scienze dell’Informazione, Università degli Studi di Genova, Via Dodecaneso 35 - I16146 Genova, Italy.
† E-mail: [email protected]

Received 15 February 1999
Revised 11 August 2000
Published online 22 September 2003

The dichotomy between the DML and the host language has both advantages and disadvantages. Because of the limited expressive power of the DML, queries can be optimized by rewriting them into equivalent queries that can be efficiently executed by taking advantage of auxiliary structures (e.g. indexes). The same query, if written in a general-purpose language, would not be amenable to optimization. The DML supports efficient data access because of its simplicity. Traditional DMLs are ‘ad hoc’ languages that reflect the data model for which they have been designed. Therefore, although they are not computationally complete, they allow queries and updates to be expressed according to a declarative style. Queries and updates are thus formulated by compact expressions which are easy to understand and to maintain. We can undoubtedly state that the success of relational database systems is due to their declarative DMLs. However, a DML with these characteristics is not well suited for the development of applications requiring tight communication between the host language and the DML. In general, in commonly adopted solutions, the two languages do not adequately integrate. They use different types and they have different computational models. In particular, DMLs support sets as the logical units of computation, whereas conventional programming languages reason on a single instance (record) at a time. These factors ‘impede’ the communication between the data manipulation and host languages. For this reason, the problem of effectively interfacing the DML with the host language has been termed the impedance mismatch [2].

In an object-oriented database, objects encapsulate both the state and the operations manipulating it. Thus, information related to data manipulation is recorded in the database rather than dispersed in application programs. In object-oriented database systems, we can then distinguish three different languages: the DDL, the QL, and the method definition language (MDL). Most object-oriented database systems [3–6] use existing object-oriented programming languages, or extensions of them, to define methods. We argue, by contrast, that the use of a rule-based, set-oriented language for defining methods rather than an object-oriented programming language can offer some advantages. Indeed, object-oriented programming languages have not been designed to manipulate large amounts of data. Predefined primitives, well suited for data manipulation and implemented at system level, are missing. Another missing feature is set-orientedness. In the database context, the common case is to execute a given update operation on a set of objects rather than on a single object. The MDLs of some existing systems [5,7] provide an iterate operator to solve this problem. We consider an MDL supporting a number of predefined data manipulation primitives, such as those found in SQL (insertion, update and delete primitives). These primitives have set-oriented semantics, i.e. they work on a set of objects at a time, exactly as in SQL. Such a language could also be used as a method specification language, from which methods containing SQL-like statements would be automatically derived.

Writing data manipulation code in an ad hoc set-oriented language has two important advantages over using general-purpose object-oriented programming languages: it offers better possibilities for optimization and for analysis and verification. The use of a special purpose language, specifically designed to support set-oriented access to large amounts of data, substantially increases efficiency, because access and query optimization techniques can be exploited. Methods written in a general-purpose object-oriented programming language are black boxes to the QL optimizers. In a set-oriented rule-based language, by contrast, optimizations can be applied both in evaluating the condition (query) that controls update execution and in executing the update sequences themselves. Moreover, such a language is well suited for reasoning about methods. That is, useful semantic properties can be easily characterized for methods specified in that language. Important examples of these properties are: behavioral refinement of methods in subclasses [8], i.e. how to ensure that, when overriding a method, the method in the subclass refines, from a behavioral viewpoint, the method it overrides; and the absence of conflicts between different method implementations, which is useful for verifying implementations in sibling classes in models supporting multiple class direct membership [9].

In this paper we propose a set-oriented rule-based method definition language. Besides defining the syntax of the language, we formally define the language semantics addressing the problems of set-oriented update semantics. An important issue is related to the non-determinism of the update semantics [10–12]. Because of non-determinism, set-oriented updates may lead to different correct database states. Therefore, an important goal of our work is to define a language with deterministic semantics. Then we show how, by relying on such semantics, the relevant properties of the methods can be formally stated and investigated. In particular, we address the problem of conflicting method implementations in sibling classes and behavioral refinement of methods in subclasses.

The language we propose in this paper has some similarities with deductive MDLs for object-oriented databases (such as those in [13,14]). However, we distinguish two different kinds of methods: side-effect-free methods and methods with side effects. The main difference between the two is that side-effect-free methods do not modify the database state, whereas methods with side effects modify the database state. Side-effect-free methods are formulas which allow one to derive information from the database. By contrast, methods with side effects allow us to update objects. Thus, an important difference between our approach and the deductive one is that our language not only allows the deduction of derived information but also updates on objects. This state evolution aspect is crucial for databases and is missing in existing proposals for which formal semantics has been specified.

Our language can be regarded as a proposal for an MDL for the object-oriented standard data model ODMG [15]. No ad hoc data manipulation language has been proposed for that model. Standards for data definition and query language have been developed for the ODMG model. However, the issue of updates has been largely neglected. A number of ad hoc data manipulation primitives, such as those we propose here, should in our opinion be provided by that model. The choice of a deductive-like syntax is only a syntactic choice, since our methods can also be expressed as sequences of SQL-like data manipulation statements.

The contributions of this paper can thus be summarized as follows:

(i) the definition of a set-oriented, rule-based MDL for object-oriented databases;
(ii) the definition of formal semantics for such a language, addressing the problem of inconsistency and non-determinism; and
(iii) the use of the proposed language as a formal basis for characterizing some relevant properties of object-oriented methods, namely, conflicting method implementations and the refinement of method implementations.

The paper is organized as follows. In Section 2 we introduce the syntax of the MDL. The semantics of the language is presented in Section 3, while the method properties are discussed in Section 4. Section 5 compares our approach with other related approaches; Section 6 concludes the paper, outlining some directions for future work. Finally, Appendix A is devoted to typing issues.

[Figure 1: class diagram with person at the root, employee and consultant as subclasses of person, and manager as a subclass of employee.]

Figure 1. Inheritance hierarchy of Example 1.

2. SYNTAX

In this section, before introducing the syntax of the language, we first briefly introduce the notion of our schema. Then, we specify the syntax of the MDL by introducing the formulas, which are used in method conditions, the update primitives, which are used in the method actions, and, finally, the method implementations.

2.1. Database schema

Our model, like most data models, distinguishes between the schema level, which represents the database structure definition and is the time-invariant component, and the instance level, which is the time-varying one. Intuitively, a database schema consists of a set of class declarations. An example of a database schema follows.

Example 1. In this example we introduce the definitions of the classes which will be used throughout the paper as a running example. Chimera [16] syntax is used. The inheritance hierarchy is represented in Figure 1. The meaning of the classes and attributes should be clear from their names, whereas the meaning of the methods will be clarified in Example 9 where their implementations will be presented.



define object class person
    attributes    name: string(40),
                  address: string(50),
                  phone: string(15),
                  birthday: date,
                  spouse: person,
                  money: integer,
                  debit: integer,
    operations    spend_money(in expense: integer),
                  hire(out cod: integer, in dep: integer,
                       in new_rank: string, in new_boss: manager, in sal: integer),
                  del()

define object class employee
    superclasses  person
    attributes    emp#: integer,
                  department: department,
                  rank: integer,
                  boss: manager,
                  monthly_salary: integer,
                  hiring_date: date,
                  firing_date: date,
    operations    raise_salary(in amount: integer),
                  update_boss(in new_boss: manager)

define object class manager
    superclasses  employee
    attributes    bonus: integer,
                  dependents: set-of(employee),
    operations    casc_del(),
                  is_responsible(in an_employee: employee)

define object class consultant
    superclasses  person
    attributes    cons#: integer,
                  hourly_salary: integer,
                  internal_contact: employee

define object class department
    attributes    nbr_of_employees: integer,
                  chief: manager



In a database schema each class declaration has three components:

– the declaration of the superclasses of the declared class;
– the class signature, which consists of the names and types of the attributes, and the method signatures, each of which in turn consists of the method name and the type of its parameters; and
– the method implementations.

Note that in Example 1 only the signature specification and inheritance information are reported. Thus, the database schema is not complete. Method definitions will be presented in Example 9; the complete schema is composed of the definitions in Examples 1 and 9.

The formal definition of the set of types of our reference object model, T, can be found in Appendix A. The set T includes a set BVT of basic value types (such as integer and real) which are predefined. Given a basic value type T ∈ BVT, its corresponding domain (i.e. the set of legal values for T) is postulated and is denoted as dom(T). Besides the basic value types, the set of types T includes a set of class names CI, defined through a database schema, and structured types built by means of the set-of, list-of and record-of constructors.

The structural part of the objects that are instances of a class of a schema is represented by the names and the types of its attributes. Specifically, this information is modeled as a function σ that defines the type of a class. Let AN be a set of attribute names, then the following definition formalizes the previous notion.

Definition 1. (Class type) Given a class c, its corresponding class type σ(c) is the record type:

record-of(a1 : T1, ..., an : Tn)

where ai ∈ AN (i = 1, ..., n) is an attribute of class c and Ti ∈ T (i = 1, ..., n) is the domain of ai according to the declaration of c.

The information, i.e. signature and implementation, related to each method in a class c is represented by a function µ, formally defined as follows.

Definition 2. (Class methods) Given a class c, its methods are represented by µ(c) = {〈sign(m, c), impl(m, c)〉 | m is a method of class c} which associates with c the signatures and the implementations of its methods, that is,

• for each method m declared in class c for which the variables P1, ..., Pj of type T1, ..., Tj, respectively, are input variables and the variables Pj+1, ..., Pn of type Tj+1, ..., Tn, respectively, are output variables, the signature of m, denoted by sign(m, c), is an expression of the form: T1 × ··· × Tj → Tj+1 × ··· × Tn;

• for each method m declared in class c, the implementation of m, denoted by impl(m, c), is a method implementation according to the syntax that will be defined in what follows (see Definition 13).

In our model we distinguish two different kinds of methods:

• side-effect-free methods, which do not modify the database state; and
• methods with side effects, which modify the database state.

Intuitively, side-effect-free methods allow information to be derived from the database. By contrast, methods with side effects allow objects to be updated.



In our model, the inheritance relationships among the classes are described by an ISA hierarchy established by the user when defining the schema [15,17]. This ISA hierarchy identifies which classes are subclasses of (inherit from) other classes. Let c1 and c2 be two classes such that c2 is defined as a subclass of c1, then the ISA relationship between them is denoted as c2 ≤ISA c1. Note that the ISA relationship is reflexive, antisymmetric and transitive. c2 ≰ISA c1 and c1 ≰ISA c2 denote that the two classes c1 and c2 are not related by the ISA relationship. The relationship ≤ISA is extended to the subtyping relationship ≤T in Appendix A.

Example 2. Referring to the inheritance hierarchy in Figure 1, the ISA relationships among the classes are:

employee ≤ISA person
manager ≤ISA employee
manager ≤ISA person
consultant ≤ISA person
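As an informal illustration of Example 2, the following Python sketch derives the ≤ISA relationships from the direct superclass declarations of Example 1 by taking their reflexive and transitive closure. The dictionary encoding and the function names (superclasses, isa, unrelated) are assumptions made for this illustration and are not part of the language defined in this paper.

```python
# Direct superclass declarations taken from Example 1
# (class name -> set of directly declared superclasses).
DIRECT_SUPER = {
    "person": set(),
    "employee": {"person"},
    "manager": {"employee"},
    "consultant": {"person"},
    "department": set(),
}


def superclasses(cls, direct=DIRECT_SUPER):
    """Return every class c such that cls <=ISA c, including cls itself."""
    seen = {cls}          # reflexivity: every class is related to itself
    stack = [cls]
    while stack:          # transitivity: follow superclass edges upwards
        current = stack.pop()
        for parent in direct[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def isa(c2, c1, direct=DIRECT_SUPER):
    """True when c2 <=ISA c1."""
    return c1 in superclasses(c2, direct)


def unrelated(c1, c2, direct=DIRECT_SUPER):
    """True when the two classes are not related by ISA in either direction."""
    return not isa(c1, c2, direct) and not isa(c2, c1, direct)
```

Under this encoding, isa("manager", "person") holds by transitivity, while employee and consultant are unrelated sibling classes.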

A set of conditions must be satisfied by two classes related by the ISA relationship. These conditions are related to the fact that each subclass must contain all attributes and methods of all its superclasses. Apart from the inherited concepts, additional features can be introduced into a subclass, e.g. a new attribute. Inherited methods may be redefined (overridden) in a subclass definition under a number of restrictions. The signature of a method must satisfy the covariance rule for result parameters and the contravariance rule for the input ones. These rules establish that result parameter domains may be specialized, whereas input parameter domains may be generalized, in the signature of the method defined in the subclass. The implementation of a method may also be redefined, introducing a different implementation of the respective concept, which ‘overrides’ the inherited definition. Some constraints have to be satisfied by the overriding implementation. For instance, if a method has no side effects, each redefinition of such a method must be without side effects. The same holds for methods with side effects. Issues regarding method overriding in subclasses are extensively addressed in Section 4.
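The covariance and contravariance rules stated above can be sketched as a simple check on signatures. The Python fragment below is only an illustration under simplifying assumptions: subtyping is restricted to identical types and to class names related by ISA (the full relationship ≤T of Appendix A, including structured types, is not modeled), and the Signature class, the function names and the signatures used in the test are hypothetical.

```python
from dataclasses import dataclass

# Direct ISA declarations of Example 1; basic types have no entry.
DIRECT_SUPER = {
    "employee": ("person",),
    "manager": ("employee",),
    "consultant": ("person",),
}


def is_subtype(t2, t1):
    """t2 <= t1: identical types, or class t2 reaches class t1 via ISA."""
    if t2 == t1:
        return True
    return any(is_subtype(parent, t1) for parent in DIRECT_SUPER.get(t2, ()))


@dataclass(frozen=True)
class Signature:
    inputs: tuple    # types of the input parameters
    outputs: tuple   # types of the result parameters


def valid_override(sub, sup):
    """Check a subclass signature `sub` against the inherited signature `sup`."""
    if len(sub.inputs) != len(sup.inputs) or len(sub.outputs) != len(sup.outputs):
        return False
    # Contravariance: input domains may only be generalized in the subclass.
    inputs_ok = all(is_subtype(sup_in, sub_in)
                    for sub_in, sup_in in zip(sub.inputs, sup.inputs))
    # Covariance: result domains may only be specialized in the subclass.
    outputs_ok = all(is_subtype(sub_out, sup_out)
                     for sub_out, sup_out in zip(sub.outputs, sup.outputs))
    return inputs_ok and outputs_ok
```

For a hypothetical inherited signature employee → person, the redefinition person → employee is accepted, whereas manager → person is rejected because it specializes an input parameter.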

We now have all the notions to define the database schema formally.

Definition 3. (Database schema) A database schema is a 4-tuple Sch = (CI, σ, µ,≤ISA) where

• CI is a set of class names;
• σ associates with each class c ∈ CI its type (see Definition 1);
• µ associates with each class c ∈ CI the signatures and the implementations of its methods (see Definition 2); and
• ≤ISA is a subclass relationship.

We refer the interested reader to [18] for a complete formal treatment of the reference data model.

2.2. Terms and formulas

In what follows, we consider a set of variables Var_T for each type T ∈ T of the language, and the set Var = ⋃_{T ∈ T} Var_T, to which the special variable Self belongs. This special variable denotes the object on which the method is being executed, as usual in object-oriented programming languages. Finally, OI denotes the set of object identifiers, also referred to as oids.



Method execution is controlled by conditions on the database expressed as formulas, built from the terms. The formulas we consider are similar to first-order formulas extended with some notions typical of the object-oriented paradigm and, in particular, with path expressions and method invocations.

2.2.1. Terms

Before formally defining the formulas, we define the values and terms on which the formulas are built.

Definition 4. (Values) The set of values V is inductively defined as follows:

• each v ∈ dom(T), T ∈ BVT, is a value;
• each object identifier i ∈ OI is a value;
• null is a value;
• let v1, ..., vn, n ≥ 0 be values, then
  – {v1, ..., vn} is a value, called set value;
  – [v1, ..., vn] is a value, called list value;
  – let a1, ..., an ∈ AN be distinct attribute names, then (a1 : v1, ..., an : vn) is a value, called record value.

Terms are built starting from values and variables using set, list and record constructors, the dot notation for path expressions, a set of standard predefined operators, OP, and aggregate operations. These predefined operators include arithmetic operators, set operators and list operators; for instance, + ∈ OP, − ∈ OP, and so on. We do not detail here all the operators for building terms, since they are the standard operators for integers, reals, strings, sets, lists and so on. Each operation symbol which can be ambiguous is introduced before using it in the examples. Moreover, we consider the standard aggregate functions on sets, i.e. count, min, max, avg and sum.

Definition 5. (Terms) The set of terms Term is inductively defined as follows:

• each variable is a term (Var ⊆ Term);
• each value, other than oids, is a term ((V \ OI) ⊆ Term);
• let t1, ..., tn, n ≥ 0 be terms, then
  – {t1, ..., tn} is a term, called set term;
  – [t1, ..., tn] is a term, called list term;
  – let a1, ..., an ∈ AN be distinct attribute names, then (a1 : t1, ..., an : tn) is a term, called record term;
• let t be a term, a ∈ AN an attribute name, then t.a is a term, called a path expression term;
• let t1 and t2 be terms and op ∈ OP be an operator between terms, then t1 op t2 is a term;
• let t be a term, op ∈ {count, min, max, avg, sum}, then op(t) is a term.

Example 3. Let X, Y ∈ Var, +, ∪ ∈ OP, and a, b, c, name ∈ AN, then the following are examples of terms: X; {‘bob’, ‘john’, ‘sue’}; [true, X]; (a : X, b : true, c : 1627); X.name; X + Y; X ∪ Y.
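The inductive structure of Definition 5 can be mirrored by a small abstract syntax tree. The following Python sketch is an illustration only: the class names (Var, Const, SetTerm, and so on) and the helper variables are assumptions introduced here, and typing constraints are deliberately ignored.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:                      # a variable, e.g. X or Self
    name: str

@dataclass(frozen=True)
class Const:                    # a value other than an oid
    value: object

@dataclass(frozen=True)
class SetTerm:                  # {t1, ..., tn}
    items: Tuple["object", ...]

@dataclass(frozen=True)
class ListTerm:                 # [t1, ..., tn]
    items: Tuple["object", ...]

@dataclass(frozen=True)
class RecordTerm:               # (a1 : t1, ..., an : tn)
    fields: Tuple[Tuple[str, "object"], ...]

@dataclass(frozen=True)
class Path:                     # t.a
    base: object
    attr: str

@dataclass(frozen=True)
class BinOp:                    # t1 op t2 with op in OP
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Aggregate:                # count, min, max, avg or sum applied to t
    op: str
    arg: object


def variables(term):
    """Collect the names of all variables occurring in a term."""
    if isinstance(term, Var):
        return {term.name}
    if isinstance(term, Const):
        return set()
    if isinstance(term, (SetTerm, ListTerm)):
        return set().union(*(variables(t) for t in term.items))
    if isinstance(term, RecordTerm):
        return set().union(*(variables(t) for _, t in term.fields))
    if isinstance(term, Path):
        return variables(term.base)
    if isinstance(term, BinOp):
        return variables(term.left) | variables(term.right)
    if isinstance(term, Aggregate):
        return variables(term.arg)
    raise TypeError(f"not a term: {term!r}")
```

With this encoding, the term X + Y of Example 3 is BinOp("+", Var("X"), Var("Y")) and the path expression X.name is Path(Var("X"), "name"). A term in which variables returns the empty set is built from values only.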

According to this definition, path expressions (built by making use of the dot notation) are terms. Let O be an object-denoting variable and let a1, ..., an be attribute names. The path expressions we consider throughout the paper are of the form O.a1. ··· .an. Such path expressions allow us to navigate through the aggregation hierarchy among objects.



The set of terms deduced according to Definition 5 is a superset of the legal ones. Type constraints, presented in Figure A1 in Appendix A, establish the legal terms. For instance, we consider a number of terms obtained using classical predefined operators, belonging to OP, for integers, reals, strings, lists, sets, and so on. However, there are type constraints for the applicability of such operators. For example, if the symbol ∪ belongs to OP and denotes the union of sets, one of the requirements for applying ∪ to two terms t1 and t2 is that t1 and t2 be set terms. Obviously, not all the terms obtained by applying the operators to any two terms will result in correct terms according to the typing rules.

According to Definition 5, oids, even though they are values, are not included in terms. This is due to the fact that we do not allow oids to be manipulated explicitly by the user. In a formula the user may bind a variable to an oid and then ‘access’ the object denoted by this variable. We have indeed included in the set of terms the typed set of variables, Var, which also contains variables of class types, i.e. variables denoting objects.

2.2.2. Formulas

Formulas are either atomic or complex. Each atomic formula consists of a predicate symbol and a list of parameter terms. Predicate symbols include a number of special predefined relations including comparison predicates, such as < or ==, and a membership predicate (in). Three different types of equality between objects are supported: = denotes object identity; == denotes shallow value equality whereas ==d denotes deep value equality [18]. Equality by identity (=) means that the two compared terms, ranging over a class, denote the same object, i.e. the two objects have the same oid. Shallow value equality (==) holds if the two compared terms denote objects with the same direct attribute values but not necessarily the same oids. Deep value equality (==d) considers objects not necessarily with the same oids and, in addition to the equality between direct attributes, requires the equality of the values of all attributes of the objects that are recursively reached by means of oid-based references. Type names, seen as unary predicate symbols, are predicate symbols, too. Finally, side-effect-free method invocations are formulas. Side-effect-free methods do not modify the database state. They can be invoked in formulas for two reasons:

(i) to compute values that can be used inside the condition itself; for example, the method invocation X.year_salary(Y), which, executed on an employee X, multiplies his monthly salary by twelve and binds Y to the amount X gets in one year;

(ii) to compute Boolean predicates; for example, the method invocation X.is_mgr(), which, if invoked on an employee X, evaluates to true if such an employee is a manager.
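The three notions of equality between objects introduced earlier in this section (=, == and ==d) can be contrasted on a toy object store. The Python sketch below is one possible reading under stated assumptions: the store is a dictionary from oids to attribute dictionaries, references are wrapped in a hypothetical Ref class, only direct single-valued references are followed, and cyclic references are handled by assuming that a pair already under comparison is equal. The formal definitions are those of [18].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """An oid-based reference to another object of the store."""
    oid: int


def identical(a, b):
    """Identity (=): both references denote the same oid."""
    return a.oid == b.oid


def shallow_equal(db, a, b):
    """Shallow value equality (==): same direct attribute values.

    Reference-valued attributes are compared by oid.
    """
    return db[a.oid] == db[b.oid]


def deep_equal(db, a, b, assumed=None):
    """Deep value equality (==d): references are followed recursively."""
    assumed = set() if assumed is None else assumed
    if a.oid == b.oid or (a.oid, b.oid) in assumed:
        return True
    assumed.add((a.oid, b.oid))       # guards against cyclic references
    state_a, state_b = db[a.oid], db[b.oid]
    if state_a.keys() != state_b.keys():
        return False
    for attr in state_a:
        va, vb = state_a[attr], state_b[attr]
        if isinstance(va, Ref) and isinstance(vb, Ref):
            if not deep_equal(db, va, vb, assumed):
                return False
        elif va != vb:
            return False
    return True


# Hypothetical store: objects 1 and 5 share the same spouse object,
# object 2 references a distinct spouse object with an equal state.
db = {
    1: {"name": "Ann", "spouse": Ref(3)},
    2: {"name": "Ann", "spouse": Ref(4)},
    3: {"name": "Bob", "spouse": None},
    4: {"name": "Bob", "spouse": None},
    5: {"name": "Ann", "spouse": Ref(3)},
}
```

In this store, objects 1 and 5 are shallow-equal but not identical, whereas objects 1 and 2 are deep-equal but not shallow-equal, since their spouse attributes reference different oids.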

Definition 6. (Atomic formulas) Atomic formulas are defined as follows.

• Comparison formulas: if t1, t2 are terms and op ∈ {<, >, ≥, ≤, =, ==, ==d, ≠} is a predefined predicate, then t1 op t2 is a comparison atomic formula;

• Membership formulas: if t1, t2 are terms, then t1 in t2 is a membership atomic formula; if t is a term and c ∈ CI is a class name then t in c is a membership atomic formula;

• Type formulas: if t is a term and T ∈ T is a type name, then T(t) is an atomic formula, referred to as a type formula;

• Method invocations: if m is the name of a side-effect-free method, O is an object-denoting variable and t1, ..., tn are terms, then O.m(t1, ..., tn) is an atomic formula.



Since class names are type names, if t is a term and c is a class name, then c(t) is a particular case of a type formula; these formulas are referred to as class formulas. Some examples of atomic formulas follow.

Example 4. Examples of atomic formulas:

• comparison formulas: X = Y (tests for identity); X == Z (tests for shallow value equality); Self.money > 150; X.name = ‘John Smith’

• membership formulas: X in Self.dependents; {1, 2} in X.rank; Y.boss in employee

• type formulas: person(X); integer(Y)

Complex formulas (or simply formulas) are obtained from atomic formulas and negated atomic formulas by means of conjunctions. All variables are assumed to be implicitly quantified as in Datalog [19].

Definition 7. (Formulas) Formulas are inductively defined as follows:

• all atomic formulas are formulas;
• if F is an atomic comparison or membership formula‡, then ¬F is a (complex) formula; and
• if F1 and F2 are formulas, then F1, F2 is a (complex) formula, where the symbol ‘,’ denotes the ∧ logical connective.

In evaluating complex formulas we consider the precedence of the negation operator with respect to conjunction.

Example 5. Examples of formulas:

• person(X), X.money = 200;
• person(X), ¬X in employee, X.debit < 600.
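To give an informal intuition of how such a conjunctive formula selects a set of objects, the second formula of Example 5 can be read as a set comprehension over a toy database. This Python sketch is not the semantics of the language, which is defined formally in Section 3; the object store, the oids o1 to o4 and the assumption that a class formula ranges over all members of the class are hypothetical choices made for illustration.

```python
# Single-inheritance fragment of the hierarchy of Figure 1.
DIRECT_SUPER = {
    "person": None,
    "employee": "person",
    "manager": "employee",
    "consultant": "person",
}

# Hypothetical objects: oid -> (most specific class, attribute values).
objects = {
    "o1": ("person", {"debit": 100}),
    "o2": ("employee", {"debit": 100}),
    "o3": ("person", {"debit": 900}),
    "o4": ("consultant", {"debit": 0}),
}


def belongs_to(cls, c):
    """True when class cls is c or one of its (transitive) subclasses."""
    while cls is not None:
        if cls == c:
            return True
        cls = DIRECT_SUPER[cls]
    return False


def members(c):
    """All oids of objects that are members of class c."""
    return {oid for oid, (cls, _) in objects.items() if belongs_to(cls, c)}


def example5_bindings():
    """Bindings of X satisfying: person(X), not X in employee, X.debit < 600."""
    employees = members("employee")
    return {
        x
        for x in members("person")          # person(X)
        if x not in employees               # negated membership formula
        and objects[x][1]["debit"] < 600    # comparison formula
    }
```

The whole set of satisfying bindings is computed at once, which is the set-oriented style of evaluation that the update primitives of the language then operate on.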

As for terms, the set of formulas deduced by Definition 7 is a superset of the legal ones. Indeed, legal formulas have to satisfy type constraints which are expressed in Figure A2 in Appendix A. In addition, formulas appearing in methods have to satisfy some requirements which are discussed in the following section.

2.2.3. Range-restricted formulas

Certain syntactically correct formulas which can be deduced by Definition 7 are not valid method conditions. Specifically, two constraints have to be satisfied by formulas appearing in the method conditions:

(i) formulas must contain exactly one type formula for each variable; and
(ii) they must be range restricted.

‡Type formulas and method invocations cannot be negated.



The first constraint has to be satisfied in order to associate a unique type with each variable; type-checking is then performed with respect to this type. Indeed, type formulas included in a given formula, together with parameter declarations, form the basis from which the considered formula is typed. We recall that an important restriction imposed on complex formulas is that for every variable there may be at most one type formula in order to retain assignment of a unique type for each variable. Such a type formula can be explicitly inserted in the formula or can be implicitly expressed through method signature declarations, in which parameter types are specified. We will elaborate further on this condition in Section 2.4 and in Appendix A.2.

The second constraint is related to range restriction, which prevents formulas from being satisfied by an infinite set of instances. In fact, range restriction guarantees that a formula is satisfied by finite sets of instances if applied to a finite database. The notion of a range-restricted formula is based on that of a range-restricted variable which is guaranteed to be bound only to a finite (possibly empty) set of values. In addition, it also relies on the notion of a ground term. A term t is a ground term if it either consists of values or is bound to values. Thus, for instance, 2, ‘bababaa’ and [true, false, true] are ground terms since they are values. Moreover, given the formulas X = 3 + 5 and Y = ‘bababaa’, X and Y are ground terms since they are bound to values.

The following definition formalizes the notion of a range-restricted variable.

Definition 8. (Range-restricted variable) A variable X is range restricted in a formula F if one of the following conditions holds:

• X occurs in at least one class formula of F;
• X occurs in an equation X = t or t = X in F where t is a ground term or where all variables in t are range restricted in F; or
• X occurs in a membership formula (X in S) in F where S is a ground set or list term or all variables in S are range restricted in F.

Example 6. Consider the following formula:

X in Z.dependents, employee(Y), Y.boss = Z.

Each variable appearing in such a formula is range restricted. In fact, Y is range restricted because it appears in a class formula. Z is range restricted because it appears in an equation of the form term = var where the unique variable in the considered term Y.boss, i.e. Y, is range restricted. Finally, X is range restricted because it appears in a membership formula of the form var in set where the unique variable appearing in the set term Z.dependents, i.e. Z, is range restricted.

The definition of a range-restricted formula derives from that of a range-restricted variable as follows.

Definition 9. (Range-restricted formula) A formula F is range restricted if and only if all variables of F are range restricted in F.

Example 7. An example of a range-restricted formula is the one presented in Example 6. Another example is:

employee(X), integer(Z), integer(W), Z = X.monthly_salary, W = Z + 50

whereas the following example presents a formula which is not range restricted:

employee(X), integer(Y), X.monthly_salary > Y

It is easy to see that such a formula is satisfied by an infinite set of instances. In fact, for each integer representing the salary of an employee (X.monthly_salary), there is an infinite number of smaller integers to which Y can be bound.
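Definitions 8 and 9 suggest a simple fixpoint computation, sketched below in Python. The encoding of atoms is an assumption introduced for this illustration: ("class", X) stands for a class formula on X; ("eq", X, deps) for an equation X = t or t = X, where deps is the set of variables of t (empty when t is ground); ("in", X, deps) for a membership formula X in S; and ("other", vars) for any remaining atom, such as a basic type formula or an inequality.

```python
def range_restricted_vars(atoms):
    """Compute the variables that are range restricted per Definition 8."""
    # First condition: variables occurring in a class formula.
    restricted = {atom[1] for atom in atoms if atom[0] == "class"}
    changed = True
    while changed:                      # iterate until a fixpoint is reached
        changed = False
        for atom in atoms:
            if atom[0] in ("eq", "in"):
                _, var, deps = atom
                # Second and third conditions: every variable of the other
                # side is already restricted (vacuously true if it is ground).
                if var not in restricted and deps <= restricted:
                    restricted.add(var)
                    changed = True
    return restricted


def all_vars(atoms):
    """Collect every variable mentioned in the encoded formula."""
    result = set()
    for atom in atoms:
        if atom[0] == "class":
            result.add(atom[1])
        elif atom[0] in ("eq", "in"):
            result.add(atom[1])
            result |= atom[2]
        else:                           # ("other", vars)
            result |= atom[1]
    return result


def is_range_restricted(atoms):
    """Definition 9: all variables of the formula are range restricted."""
    return all_vars(atoms) <= range_restricted_vars(atoms)


# Example 6: X in Z.dependents, employee(Y), Y.boss = Z
example6 = [("in", "X", {"Z"}), ("class", "Y"), ("eq", "Z", {"Y"})]

# Example 7, first formula.
example7_ok = [
    ("class", "X"), ("other", {"Z"}), ("other", {"W"}),
    ("eq", "Z", {"X"}), ("eq", "W", {"Z"}),
]

# Example 7, second formula: employee(X), integer(Y), X.monthly_salary > Y
example7_bad = [("class", "X"), ("other", {"Y"}), ("other", {"X", "Y"})]
```

Under this encoding the first two formulas are accepted, while the last one is rejected because Y occurs only in a basic type formula and in an inequality. An equation between two variables would have to be encoded once in each direction by the caller.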

2.3. Update primitives

Our language provides a number of predefined operations for expressing basic data manipulations on the object database. Supported update primitives are: object creation and deletion; object migration from one class to another; modifications to the value of an object attribute. In addition, method invocations are also (non-atomic) update operations. The following definition defines the update operations of our language.

Definition 10. (Updates) Let c, c′ ∈ CI be class names, a ∈ AN an attribute name, O an object-denoting variable, m a method name and t, t1, ..., tn terms. An update has one of the following forms:

• create(c, t, O);
• delete(c, O);
• specialize(c, c′, O, t);
• generalize(c, c′, O);
• modify(c.a, O, t);
• O.m(t1, ..., tn).

A create(c, t, O) operation creates a new object of class c, returning in O the oid of the newly created object. t is a record term, containing the values for initializing the object attributes. A delete(c, O) operation deletes the object denoted by the oid to which O is bound from the class c to which it belongs (and also from any other class of the database).

A specialize(c, c′, O, t) operation migrates the object denoted by O from class c to class c′, which must be a subclass of c. t is a record term, containing the values for initializing additional attributes of class c′. The new state of the object denoted by O is obtained by appending t to its old state. Obviously, the object denoted by O still belongs to c as a member§. A generalize(c, c′, O) operation migrates the object denoted by O from class c to class c′, which must be a superclass of c. The object is thus deleted from class c and becomes an instance of class c′, of which it was already a member. All the values for the proper attributes of class c are removed from the object state. Note that object migration operations do not affect object identity.

A modify(c.a, O, t) operation sets the value of attribute a, which must be an attribute of class c, of the object denoted by O, to the value to which t evaluates. The object denoted by O must be a member of class c. Finally, an O.m(t1, ..., tn) operation invokes the operation m on the object denoted by O, with parameters t1, ..., tn.

§An object is an instance of a class c if it belongs to class c and it does not belong to any subclass of c. An object is a memberof a class c if it is an instance of c or is an instance of some subclass of c.



Example 8. Given the class person and its subclass employee from Example 1 and a variable X, the following are examples of updates:

• create(person, (name:‘John Smith’, address:‘3 Red St. London’, phone:‘65436’, birthday:(day:8,month:10,year:1969), spouse:null, money:5000, debit:0), X);
• delete(person, X);
• specialize(person, employee, X, (emp#:14038, department:null, rank:4, boss:null, monthly_salary:1000, hiring_date:(day:1,month:1,year:2000), firing_date:null));
• generalize(employee, person, X);
• modify(person.money, X, 6000); and
• X.spend_money(1000).

Type constraints for updates are presented in Figure A3 in Appendix A.3.

2.4. Methods

We can now discuss how method implementations are expressed in our language. We first introduce the update rules by which method implementations are specified, then we consider typing issues and finally we discuss recursive methods.

2.4.1. Method implementations

The update primitives we have introduced in the previous section are the constructs which can appear in the action part of a method. An action list is a sequence of update primitives.

Definition 11. (Action list) Let ui, i = 1, . . . , n, be updates defined according to Definition 10; then u1; . . . ; un is an action list.

Note that in the previous definition n ≥ 0. If n = 0 then the action list is empty.

An update rule is a rule of the form condition → action, where condition is a formula (see Definition 7) expressing a declarative control upon action execution. That is, the action is executed only if the condition is verified, and the evaluation of the condition determines the set of bindings on which the action is executed. The condition must be a range-restricted formula such that a unique type is associated with each variable appearing in it.

Definition 12. (Update rule) An update rule has the form condition → u1; . . . ; un where condition is a formula (according to Definition 7) such that

• it contains exactly one type formula for each variable;
• it is range restricted (according to Definition 9);

and u1; . . . ; un is an action list (according to Definition 11).

A method implementation is specified by a set of update rules. The rules in a method implementation are executed sequentially, according to the order in which they appear in the implementation.



Definition 13. (Method implementation) Given a class c and a method name m, a method implementation impl(m, c) (see Definition 2) has the form:

m(P1, . . . , Pn) : condition1 → action1 ‖ · · · ‖ conditionm → actionm

where m is a method name; P1, . . . , Pn are the formal method parameters, i.e. variables; and conditioni → actioni, i = 1, . . . , m, is an update rule (according to Definition 12). If each rule has an empty action list, then the method is side-effect-free. By contrast, if each rule has a non-empty action list, then the method has side effects.
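The structure of Definition 13 can be sketched as a small Python data model. This is only an illustration: the class names `UpdateRule` and `MethodImpl`, the function `classify` and the label "mixed" are hypothetical, and conditions and updates are kept as plain strings rather than parsed.

```python
from dataclasses import dataclass, field

@dataclass
class UpdateRule:
    condition: str                                 # formula, kept as text in this sketch
    actions: list = field(default_factory=list)    # action list u1; ...; un (n >= 0)

@dataclass
class MethodImpl:
    name: str
    params: list
    rules: list                                    # executed in order of appearance

def classify(impl):
    """Apply the two cases of Definition 13 to a method implementation."""
    empties = [len(r.actions) == 0 for r in impl.rules]
    if all(empties):
        return "side-effect-free"
    if not any(empties):
        return "has side effects"
    return "mixed"   # covered by neither case of Definition 13

is_responsible = MethodImpl("is_responsible", ["X"], [
    UpdateRule("X in Self.dependents"),
    UpdateRule("manager(E), E in Self.dependents, E.is_responsible(X)"),
])
spend_money = MethodImpl("spend_money", ["expense"], [
    UpdateRule("Self.money >= expense, Self.debit = 0",
               ["modify(person.money, Self, Self.money - expense)"]),
])
```

Under this reading, `classify(is_responsible)` yields "side-effect-free" and `classify(spend_money)` yields "has side effects".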

As for the previously defined language constructs, the set of method implementations deduced by Definition 13 is a superset of the legal ones. One of the constraints that has to be satisfied by a method implementation impl(m, c) is that if a method m′ is invoked in the condition of a rule of another method m, then it must not have side effects, whereas if a method m′ is invoked in the action of a rule of another method m, then it must have side effects.

In addition, each method must meet a number of type constraints, discussed in the following section.

2.4.2. Type constraints

We refer to a generic method implementation of the form:

m(P1, . . . , Pn) : R1‖ · · · ‖Rm

where P1, . . . , Pj are input parameters and Pj+1, . . . , Pn are output parameters. Suppose, moreover, that the signature of the method is

sign(m, c) = T1 × · · · × Tj → Tj+1 × · · · × Tn

Variables can be assigned types in different ways within the body of a method. Variables in a method implementation fall into two categories: parameters and local variables. The scope of a type assignment in a method implementation is the whole implementation (i.e. rules R1, . . . , Rm) for method parameters, whereas it is the single update rule (i.e. a rule Ri) for local variables.

Parameter types are determined by the method signature. Thus, the basis from which we start to type check method implementations is established by the types of the method parameters:

Pi : Ti,   1 ≤ i ≤ n

For local variables, an update rule can establish their type in one of the following ways:

(i) a type formula in the update rule condition explicitly specifies a type for the variable;
(ii) the variable is the output parameter of a method invocation contained in the condition or action of the update rule; its type is determined by the signature of the invoked method; and
(iii) the variable is the output parameter of a create operation invoked in the action of the update rule; its type is determined by the class on which the creation is executed.

Starting from one of these criteria, a unique type is assigned to each variable. Each update rule in the method implementation is type checked with this type.



An update rule is type correct if both its condition and its action are type correct. The typing rules for conditions, i.e. formulas, and actions, i.e. updates, are given in Appendix A. Specifically, each method invocation in impl(m, c) must conform to one of the signatures of the invoked method. This constraint is formalized in Appendix A.4 where type-checking of method invocations is discussed.

A method implementation is type correct if all its update rules are type correct.

2.4.3. Recursive methods

Our language allows us to write recursive methods. A method is recursive if its implementation directly or indirectly invokes the method itself. Both side-effect-free methods and methods with side effects can be recursive: in Example 9 the method is_responsible is an example of a side-effect-free recursive method whereas the method casc_del is an example of a recursive method with side effects.

As an obvious consequence of admitting recursion, the possibility of non-terminating computations exists in our language. As usual in programming languages, however, avoiding non-termination is the responsibility of the programmer. The semantics of our language captures non-termination since non-terminating computations simply fail to converge to a final state.

Finally, we would like to point out that because of the limitations we have imposed on our language, recursion does not introduce problems in evaluating the update rule conditions. In particular, since method invocations cannot be negated (see Definition 7), recursive method implementations depending on each other through negation are not allowed in our language. Moreover, we have not introduced into our language any grouping construct, allowing all the elements in a set satisfying a certain condition (expressed through a formula) to be grouped together. Thus, recursive method implementations that depend on one another, such that the implementation of one method involves some sets whose elements are determined by executing the other method, cannot be expressed in our language. Note that the possibility of including some limited form of negation for (side-effect-free) method invocations, as well as some grouping mechanisms, could be interesting for our language. In this case, some stratification conditions would need to be imposed on the method implementations, both with respect to negation [19] and sets [20]. Since, however, negation and grouping do not raise new issues in our language, we have decided not to include them in order to focus on the more innovative aspects of the language.

2.5. Examples

In this section the implementation of the methods whose signatures were presented in Example 1 is reported.

Example 9. We assume the existence of two methods whose implementations have not been reported. The method next_emp#, given an object of the class person, computes the code to be generated for him/her as a new employee and associates it with a variable which is passed as a parameter. The system method today returns the current date.

The method spend_money in the class person, given an input parameter, which represents the money to be spent, updates the attribute representing the money owned by the person on which the method is called.

spend_money(expense):
    Self.money ≥ expense, Self.debit = 0 →
        modify(person.money, Self, Self.money - expense)



The method hire in the class person, given some parameters representing the new state of the objects, migrates the object from the class person to the class employee or manager depending on the new rank of the object:

hire(dep, cod, new_rank, new_boss, sal):
    new_rank ≤ 5, Self.next_emp#(cod), date(d), system.today(d) →
        specialize(person, employee, Self, (emp#:cod, department:dep, rank:new_rank, boss:new_boss, monthly_salary:sal, hiring_date:d, firing_date:null))
    ||
    new_rank > 5, Self.next_emp#(cod), date(d), system.today(d) →
        specialize(person, employee, Self, (emp#:cod, department:dep, rank:new_rank, boss:new_boss, monthly_salary:sal, hiring_date:d, firing_date:null));
        specialize(employee, manager, Self, (bonus:0, dependents:null))

The method del in the class person deletes a person and sets to null the spouse attribute of the person to whom he/she was married.

del():
    Self.spouse ≠ null →
        modify(person.spouse, Self.spouse, null);
        delete(person, Self)

The method raise_salary in the class employee, given an input parameter amount, sets the salary of the calling object equal to that of his/her boss and increases the salary of the boss of the given object by a value equal to the amount:

raise_salary(amount):
    M = Self.boss →
        modify(employee.monthly_salary, Self, M.monthly_salary);
        modify(employee.monthly_salary, M, M.monthly_salary + amount)

The method update_boss in the class employee updates the boss of the employee represented by the calling object after checking some constraints in the condition:

update_boss(new_boss):
    new_boss.rank ≥ Self.rank, new_boss ≠ Self, new_boss in manager →
        modify(employee.boss, Self, new_boss)



The method is_responsible in the class manager evaluates to true if the employee passed as a parameter is one of the manager's subordinates:

is_responsible(X):
    X in Self.dependents
    ||
    manager(E), E in Self.dependents, E.is_responsible(X)

The method casc_del in the class manager propagates the deletion of a manager to her/his subordinates:

casc_del():
    employee(D), D in Self.dependents, ¬ D in manager →
        delete(employee, D);
        delete(manager, Self)
    ||
    manager(D), D in Self.dependents →
        D.casc_del();
        delete(manager, Self)

3. SEMANTICS

In this section we define the language semantics. We start by providing an intuitive idea and we introduce some preliminary notions. In giving the semantics we then distinguish between two kinds of updates: atomic and non-atomic. An update is atomic when it cannot be decomposed into simpler updates. The atomic updates of our language are create, delete, modify, specialize and generalize, while non-atomic updates are method invocations. Atomic operations transform the database from one state to another without any intermediate states, whereas non-atomic updates require some intermediate states. Therefore, the semantics of atomic update operations is simpler than that of non-atomic operations.

3.1. Intuitive idea

As we have seen, a method implementation in our language (see Definition 13) is an expression of the form:

m(P1, . . . , Pn) : condition1 → action1 ‖ · · · ‖ conditionm → actionm

where conditioni, i = 1, . . . , m, is a formula expressing control over method execution and each actioni, i = 1, . . . , m, is an action list of the form u^i_1; . . . ; u^i_k, where each u^i_j, j = 1, . . . , k, is an update operation. An update operation can be either an update primitive or a method invocation in our language. An expression conditioni → actioni in this method implementation is called an update rule (see Definition 12). Moreover, conditioni is called the condition part of the rule, whereas actioni is called the action part of the rule. A method invocation is an expression of the form

O.m(t1, . . . , tn)



where O is an object-denoting variable. For the sake of simplicity, we assume that the first parameters P1, . . . , Pj (j ≥ 0) in a method implementation are the formal input parameters, and the last Pj+1, . . . , Pn (j ≤ n) are the formal output parameters. Note that a method can have no input or no output parameters. The actual parameters, t1, . . . , tn, are terms. In particular, the actual output parameters, tj+1, . . . , tn, are variables.

As is usual in database systems, a method is invoked with respect to a database state, denoted by S, and a set B of bindings (i.e. assignments of values to variables). S and B represent the state with respect to which the method is invoked. In particular, B represents the state of the variables of the program within which the method is invoked. This state together with the method behavior completely defines the semantics of the method execution¶.

The semantics of the execution of method m is informally defined as follows. When the method is first invoked, the actual parameters t1, . . . , tj are evaluated and a new set of bindings B/P is generated. The generation of this set of bindings establishes the correspondence between the formal input parameters P1, . . . , Pj and the values resulting from the evaluation of t1, . . . , tj. Note that no bindings for the formal output parameters Pj+1, . . . , Pn are included in B/P. Then, each condition is evaluated against state S starting from the set of bindings B/P. As a result we obtain the sets of bindings B1, . . . , Bm which can be considered as extensions of the initial set B/P. Starting from the first rule, updates u^1_1; . . . ; u^1_n are performed on state S with the set of bindings B1; after the execution of u^1_n the database state will be S^1_n. Similarly, the second rule body is executed in state S^1_n, with set of bindings B2, and so on. After all the rules have been executed, the database is in the final state Sf, which is returned as a result of the method invocation. In the following, we will also specify the final set of bindings Bf resulting from the method invocation. We note that if an update operation ui is a method invocation, its evaluation makes the database go through a sequence of intermediate states resulting in a final set of bindings and a final state.
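The informal execution scheme above can be sketched in Python. This is a minimal sketch that takes the description literally (every condition is evaluated against the initial state S, then the action lists are run rule by rule); it ignores output parameters and run-time errors. All names (`invoke`, `rich`, `poor`, `add`) and the toy state mapping oids to an amount of money are hypothetical.

```python
def invoke(rules, state, bp):
    """rules: list of (condition, actions); bp: the set of bindings B/P."""
    # 1. Evaluate every condition against the initial state S, starting from B/P.
    binding_sets = [cond(state, bp) for cond, _ in rules]
    # 2. Run the action lists in rule order, threading the state through.
    for (_, actions), b in zip(rules, binding_sets):
        for update in actions:
            b, state = update(b, state)
    return state

def rich(state, bp):                      # condition: Self.money >= 100
    return [th for th in bp if state[th["Self"]] >= 100]

def poor(state, bp):                      # condition: Self.money < 100
    return [th for th in bp if state[th["Self"]] < 100]

def add(amount):
    """Build an update that adds `amount` to the money of every bound Self."""
    def update(b, state):
        new_state = dict(state)
        for th in b:
            new_state[th["Self"]] += amount
        return b, new_state
    return update

rules = [(rich, [add(-100)]), (poor, [add(10)])]
final = invoke(rules, {"o1": 150}, [{"Self": "o1"}])
```

Here `final` is `{"o1": 50}`: the second rule does not fire even though the object has become "poor" after the first rule, because its condition was evaluated against the initial state.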

3.2. Preliminaries

As we have already mentioned, the semantics of a method invocation transforms a database state and a set of bindings into a new database state and a new set of bindings. Thus, we first have to define the notions of database state and of set of bindings formally. Then, we introduce our semantic domains.

3.2.1. Database state

Our model, like most data models, distinguishes between the schema level, which represents the database structure definition and is the time-invariant component, and the instance level, which is the time-varying one. We have already introduced the schema level in Section 2. The notion of database state we introduce here is quite simple, since we focus on the aspects that allow us to specify the semantics of state changes and these are quite similar to the ones that can be found in the literature [21].

¶We assume that concurrency control is used to synchronize accesses on the underlying database. Therefore, there is no interference between concurrent executions.



In the following definitions we consider a set V of values and a set OI ⊆ V of all possible oids. Moreover, given a set S, 2^S denotes the powerset of S.

To specify a database state we must first of all assign a set of objects as instances to each class, i.e. assign an extent to each class.

Definition 14. (Class extent) π : CI → 2^OI is a function that associates with a class the set of oids of its instances, i.e. for each c ∈ CI: π(c) = {oid | oid ∈ OI and oid is an instance of class c}. π(c) is called the extent of class c.

We note that π(c) denotes the proper instances of class c, i.e. the set of oids of those objects for which c is the most specific class in the inheritance hierarchy. Thus, objects that are members of c without being instances of c are not included in π(c).

Definition 15. (Class extent closure) Given a function π defined according to Definition 14, π∗ denotes the closure of the extent of class c with respect to the subclass relationship, defined as follows: π∗(c) = ⋃_{c′ ≤ISA c} π(c′). π∗(c) is called the non-proper extent of class c.
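Definitions 14 and 15 can be illustrated with a short Python sketch. The hierarchy person, employee, manager follows the running example, while the oids, the `parent` encoding of ≤ISA (single inheritance only) and the function names are assumptions made for illustration.

```python
# Hypothetical schema: direct superclass of each class (None for a root class).
parent = {"person": None, "employee": "person", "manager": "employee"}

# Proper extents (the function pi of Definition 14), with made-up oids.
pi = {"person": {"o1", "o2"}, "employee": {"o3"}, "manager": {"o4"}}

def is_subclass(c1, c2):
    """c1 <=ISA c2, taken as reflexive and transitive."""
    while c1 is not None:
        if c1 == c2:
            return True
        c1 = parent[c1]
    return False

def pi_star(c):
    """Non-proper extent: union of pi(c') over all c' <=ISA c (Definition 15)."""
    return set().union(*(pi[c1] for c1 in pi if is_subclass(c1, c)))
```

With these values, `pi_star("person")` contains all four oids, whereas `pi["person"]` contains only the two proper instances.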

Then we must assign a state to each object, as formalized by the following definition.

Definition 16. (Object state) Function ν : OI → V such that ν(oid) = (a1 : v1, . . . , an : vn) returns for each object its state, i.e. the value of its attributes.

To denote the values of single attributes we use the following notation: ν(oid).ai = vi, i = 1, . . . , n. Note that ν is a partial function. Indeed, the state of a given object is unique, but since OI is the set of all possible oids and includes oids that are not part of the database, there exists oid ∈ OI such that ν(oid) = ⊥. Function ν is not injective: given two distinct objects, ν can return the same state, although they are distinct entities.

A database state S is then simply defined as a pair S = (π, ν). Some constraints must, however, be imposed for a database state to be consistent. Given a type T ∈ T, consider the set of legal values of type T. Note that the set of values of value types (such as integer) does not depend on the considered database state, whereas the set of legal values of object types (i.e. classes) depends on the current extent of the class, i.e. on the set of objects that are currently instances of the class. Thus, let us model the set of legal values of a type as a function dom : T × State → V, where, as we will formalize shortly, State denotes the set of all possible database states over a given schema. If T is a predefined value type, dom(T, S) = dom(T) ∀ S ∈ State, whereas if T ∈ CI is an object type and S = (π, ν), dom(T, S) = π∗(T).

We are now able to define the notion of database state over a given schema.

Definition 17. (Database state) An instance of a schema (CI, σ, µ, ≤ISA) or a database state (database for short) is a pair S = (π, ν) where π, ν are defined according to Definitions 14 and 16, respectively, such that ∀ c ∈ CI, ∀ oid ∈ π(c), ν(oid) ∈ dom(σ(c), S).

This condition requires that for each class c and for each of its instances oid, if σ(c) = record-of(a1 : T1, . . . , an : Tn), then ν(oid) = (a1 : v1, . . . , an : vn) where vi ∈ dom(Ti, S) for i = 1, . . . , n. Thus the condition also requires that in the case of object-valued attributes, all the referenced objects exist and belong to the correct class (either the type of the attribute or one of its subclasses). Thus, referential integrity is enforced in the database state (and it is ensured by the semantics of atomic updates).

In what follows S.π and S.ν denote the first and the second component of the database state S, respectively. An example of a database state follows.

Example 10. Consider the class definitions presented in Example 1 where the only populated class is person. Let o1 and o2 be identifiers of objects belonging to the class person; then an example of a database state on the schema presented in Example 1 is the pair S = (π, ν), where

π(person) = {o1, o2} and π(c) = ∅ for each c ≠ person;

ν(oid) =⊥ for each oid /∈ {o1, o2} and

ν(o1) = (name: ‘Mary Brown’,address: ‘118 Green St. London’,phone: ‘071-2543678’,birthday: (day:31,month:12,year:1971),spouse: o2,money: 10000,debit: 0 )

ν(o2) = (name: ‘Robert Spencer’,address: ‘118 Green St. London’,phone: ‘071-2543678’,birthday: (day:22,month:5,year:1969),spouse: o1,money: 20000,debit: 0 )

3.2.2. Set of bindings

For the notion of set of bindings, we can informally state that a set of bindings B is a set of substitutions. This set is the means by which the bindings for variables obtained by evaluating the condition are passed to the action part of the rule. The condition and action parts may, indeed, share some variables, in which case the action must be executed for every binding generated by the condition on the shared variables. We now formally define the notion of binding we will use in the subsequent discussion. First of all, we introduce a definition of substitution which is similar to the one generally adopted [19]. We recall that Var denotes the set of possible variables, while V denotes the set of possible values in our language.

Definition 18. (Substitution and set of bindings) A substitution θ is a partial function from Var to V; θ : Var → V. A set of bindings B is a set of substitutions {θ1, . . . , θm}.

We note that, according to the previous definition, a substitution is a partial function. The domain of function θ is the set of possible variables of the language. However, given a formula, only a subset of the set of all variables Var is used in the formula. Therefore, for the variables appearing in the formula, function θ returns a value, whereas for the remaining ones it is undefined. Given a variable X ∈ Var and a substitution θ, θ(X) = ⊥ denotes that θ is not defined on X. In the following, we often refer to substitutions as a subset of the Cartesian product Var × V; such a subset includes only pairs for which θ is defined.

Intuitively, the set of bindings B = {θ1, . . . , θm} satisfying a condition C is the set of substitutions such that the application of each θi (i = 1, . . . , m) to C, denoted Cθi, is a ground formula which is true according to first-order logic. Finally, we give the following definitions, which will be used in the remainder of the paper.

Definition 19. (Variable interpretation with respect to a set of bindings) Given a variable X and a set of bindings B, X_B denotes the interpretation of X in B, i.e. the set of values (or oids) to which X is bound in B. That is, given B = {θ1, . . . , θm}, X_B = ⋃_{i=1}^{m} {θi(X)}.

Definition 20. (Substitution domain) Given a substitution θ, the substitution domain, denoted by dom(θ), is defined as the set dom(θ) = {X | X ∈ Var ∧ θ(X) ≠ ⊥}.

Definition 21. (Restriction of a substitution to a set of variables) Given a substitution θ and a set of variables V ⊆ dom(θ), θ|V denotes the restriction of θ to V, i.e.

θ|V (X) = θ(X)   if X ∈ V
          ⊥      otherwise

Definition 22. (Union of substitutions) Let θi and θj be two substitutions such that dom(θi) ∩ dom(θj) = ∅. The union of θi with θj, denoted as θi ∪ θj, is defined as the following substitution:

(θi ∪ θj)(X) = θi(X)   if X ∈ dom(θi)
               θj(X)   if X ∈ dom(θj)
               ⊥       otherwise

The following example illustrates the previous definitions.

Example 11. Given X, Y, Z, W ∈ Var and a set of values V including integers, the following are examples of substitutions:

θ1 = {X/5, Y/7, Z/8}   θ2 = {X/7, W/10}   θ3 = {W/8}

Given B = {θ1, θ2}, X_B = {5, 7}. Moreover, dom(θ1) = {X, Y, Z}, dom(θ2) = {X, W}, dom(θ3) = {W}. Finally, θ1|{X,Z} = {X/5, Z/8} whereas θ1 ∪ θ3 = {X/5, Y/7, Z/8, W/8}.
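Definitions 19 to 22 translate almost directly into operations on Python dictionaries. The following is a sketch under stated assumptions: a substitution is modeled as a `dict` from variable names to values, ⊥ is modeled by the absence of a key (so undefined variables simply contribute nothing to an interpretation), and the function names are ours.

```python
def dom(theta):
    """Substitution domain (Definition 20)."""
    return set(theta)

def restrict(theta, variables):
    """Restriction of theta to a set of variables (Definition 21)."""
    return {x: v for x, v in theta.items() if x in variables}

def union(theta_i, theta_j):
    """Union of two substitutions with disjoint domains (Definition 22)."""
    if dom(theta_i) & dom(theta_j):
        raise ValueError("domains must be disjoint")
    return {**theta_i, **theta_j}

def interpretation(x, bindings):
    """Interpretation of variable x in a set of bindings (Definition 19)."""
    return {theta[x] for theta in bindings if x in theta}

# The substitutions of Example 11.
theta1 = {"X": 5, "Y": 7, "Z": 8}
theta2 = {"X": 7, "W": 10}
theta3 = {"W": 8}
```

With the substitutions of Example 11, `interpretation("X", [theta1, theta2])` gives `{5, 7}` and `union(theta1, theta3)` gives the four-pair substitution reported above.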

3.2.3. Semantic domains

We now introduce semantic domains. In defining semantic domains we refer to well-formed syntactic objects, i.e. objects which meet the given static constraints and for which a semantics can be defined.

Definition 23. (Semantic domains) The semantic domains we consider in giving the semantics are:

• Bind = set of possible sets of bindings.
• State = set of possible database states over a given database schema.
• Term = set of well-formed terms of the language, over a given database schema.
• 2^V = powerset of V, the set of possible values of the language.
• Cond = set of well-formed formulas of the language, over a given database schema.
• Update = set of well-formed update sequences of the language, over a given database schema.

Definition 24. (Semantics) The semantics of the method language is a family of functions defined as follows:

E : (Term × State) → ((Bind × State) → 2^V ∪ {⟨∅, s⟩ | s ∈ State})
C : (Cond × State) → ((Bind × State) → Bind ∪ {⟨∅, s⟩ | s ∈ State})
U : (Update × State) → ((Bind × State) → (Bind × State))

Note that semantic functions take pairs of the form ⟨construct to be evaluated, state⟩ as input. The state is needed because in the case of run-time errors the initial database state must be restored to keep the database consistent.

In each semantic function, the considered construct, which is a term in the case of E, a condition in the case of C and an update in the case of U, is evaluated with respect to a set of bindings, i.e. an environment, and a database state. According to each type of construct a different result is returned. A value belonging to 2^V is returned if a term is evaluated. If a condition is evaluated, then the new set of bindings for the variables appearing in the formula is returned. Finally, in the case of updates both a new set of bindings and a new database state are returned. For instance, the database operation create modifies the database state creating new objects and binds these new objects to the variable appearing in the statement. Thus, both the set of bindings and the database state are modified by the update execution. In evaluating these constructs run-time errors can arise. For instance, in the case of term evaluation, a division by zero raises a run-time error. In the case of run-time error occurrence, an empty set (∅) of bindings is returned and, thus, the variables are not bound to any value. In addition, the initial database state is restored. Thus, each semantic function can return the correct evaluation of the considered construct or the pair ⟨∅, initial state⟩ in the case of run-time errors, as specified by the resulting domains.

In what follows, we specify semantic functions C and U. We do not detail the definition of E here; instead we only present some examples of the semantic evaluation of terms. The formal definition of E is not detailed, since it is the standard semantic evaluation of expressions manipulating integer, string, real, set values and so on.

Example 12. Let o1 and o2 be two objects of the class person defined in Example 1. Moreover, let B = {{X/5, Y/o1}, {X/13, Y/o2}} be a set of bindings and S be a database state such that S.ν(o1).money = 450 and S.ν(o2).money = 600. Finally, let S0 be the initial state; then the following are examples of semantic evaluation of terms.

• E⟦X_{S0}⟧BS = {5, 13};
• E⟦125_{S0}⟧BS = {125};
• E⟦Y.money + X_{S0}⟧BS = {455, 613};
• E⟦Y.money − 100_{S0}⟧BS = {350, 500}.

Let ∗ be the symbol denoting string concatenation and let ∪ be the symbol denoting set union, then

• E⟦{‘Bob’,‘John’,‘Sue’} ∪ {‘Mary’,‘Clare’,‘Sue’}_{S0}⟧BS = {{‘Bob’,‘John’,‘Sue’,‘Mary’,‘Clare’}};
• E⟦‘abbabba’ ∗ ‘ccaabb’_{S0}⟧BS = {‘abbabbaccaabb’}.

Let S.ν(o1).dependents = {o′1, o′2, o′3} and S.ν(o2).dependents = {o′′1}, then

E⟦count(Y.dependents)_{S0}⟧BS = {3, 1}.
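The arithmetic evaluations of Example 12 can be reproduced with a few lines of Python. This is only a sketch of the set-oriented reading of E: a term is modeled as a hypothetical Python function of one substitution and of ν, the result collects its value over every substitution in B, and the restoration of S0 on run-time errors is left out.

```python
# Bindings and (partial) object states taken from Example 12.
B = [{"X": 5, "Y": "o1"}, {"X": 13, "Y": "o2"}]
nu = {"o1": {"money": 450}, "o2": {"money": 600}}

def eval_term(term, bindings, nu):
    """Collect the value of `term` under every substitution in `bindings`."""
    return {term(theta, nu) for theta in bindings}

xs = eval_term(lambda th, nu: th["X"], B, nu)
sums = eval_term(lambda th, nu: nu[th["Y"]]["money"] + th["X"], B, nu)
diffs = eval_term(lambda th, nu: nu[th["Y"]]["money"] - 100, B, nu)
```

The three results are `{5, 13}`, `{455, 613}` and `{350, 500}`, matching the values listed in the example.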



U⟦create(c, t, O)_{S0}⟧BS = ⟨B′, S′⟩

B′ = ∅ and S′ = S0 if ∃ θi ∈ B such that E⟦t_{S0}⟧θiS ∉ dom(σ(c), S)

otherwise:

B′ = {θ1 ∪ ⟨O/oid_f(1)⟩, . . . , θm ∪ ⟨O/oid_f(m)⟩}, where:
    let V be the set of variables in t, then f : {1, . . . , m} → {1, . . . , k}, k ≤ m, is a function such that f(i) = j if θi|V = σj,
    where {σ1, . . . , σk} = {θi|V | i = 1, . . . , m} and {oid1, . . . , oidk} = OIDNEW

S′ = ⟨π′, ν′⟩, where:
    π′(c̄) = π(c̄)              ∀ c̄ ∈ CI such that c̄ ≠ c
    π′(c̄) = π(c̄) ∪ OIDNEW     c̄ = c
    ν′(oid) = ν(oid)           ∀ oid ∉ OIDNEW
    ν′(oidi) = E⟦t⟧σiS         ∀ oidi ∈ OIDNEW, 1 ≤ i ≤ k

U⟦delete(c, O)_{S0}⟧BS = ⟨B′, S′⟩

B′ = ∅ and S′ = S0 if ∃ oid∗ ∉ O_B, ∃ a∗ such that ν(oid∗).a∗ ∈ O_B or ν(oid∗).a∗ ∩ O_B ≠ ∅

otherwise B′ = B and S′ = ⟨π′, ν′⟩, where:
    π′(c̄) = π(c̄)          ∀ c̄ ∈ CI such that c̄ ≰ISA c
    π′(c̄) = π(c̄) \ O_B    ∀ c̄ ≤ISA c
    ν′(oid) = ν(oid)       ∀ oid ∉ O_B
    ν′(oid) = ⊥            ∀ oid ∈ O_B

Figure 2. Semantics of atomic update operations (I).

3.3. Semantics: atomic updates

Figures 2 and 3 specify the semantics for each atomic update operation. In what follows we briefly discuss each operation in turn.

3.3.1. Creation

Consider the operation create(c, t, O) executed on a set of bindings B = {θ1, . . . , θm} in a state S. The intuitive idea of the create operation is to create as many distinct objects as the different evaluations of the record t with respect to the set of variables appearing in t. Each evaluation represents the state of the new, just created objects.



U⟦specialize(c1, c2, O, t)_{S0}⟧BS = ⟨B′, S′⟩

B′ = ∅ and S′ = S0 if ∃ θi ∈ B such that ν(oid) ∪ E⟦t⟧θiS ∉ dom(σ(c2), S)

otherwise B′ = B and S′ = ⟨π′, ν′⟩, where:
    π′(c) = π(c)                        ∀ c ∈ CI such that c ≠ c1 and c ≠ c2
    π′(c) = π(c) \ O_B                  c = c1
    π′(c) = π(c) ∪ (O_B ∩ π(c1))        c = c2
    ν′(oid) = ν(oid)                    ∀ oid ∉ O_B ∩ π(c1)
    ν′(oid) = ν(oid) ∪ E⟦t⟧θiS          ∀ oid ∈ O_B ∩ π(c1) such that θi(O) = oid

U⟦generalize(c1, c2, O)_{S0}⟧BS = ⟨B′, S′⟩

B′ = ∅ and S′ = S0 if ∃ oid∗, ∃ c∗, ∃ a∗ such that oid∗ ∈ π(c∗), c2 ≰ISA σ(c∗).a∗, and (ν(oid∗).a∗ ∈ O_B or ν(oid∗).a∗ ∩ O_B ≠ ∅)

otherwise B′ = B and S′ = ⟨π′, ν′⟩, where:
    π′(c) = π(c)            ∀ c ∈ CI such that c ≰ISA c1 and c ≠ c2
    π′(c) = π(c) \ O_B      c ≤ISA c1
    π′(c) = π(c) ∪ O_B      c = c2
    ν′(oid) = ν(oid)        ∀ oid ∉ O_B
    ν′(oid) = ν(oid)|c2     ∀ oid ∈ O_B such that θi(O) = oid

U⟦modify(c.a, O, t)_{S0}⟧BS = ⟨B′, S′⟩

B′ = ∅ and S′ = S0 if (∃ θi ∈ B such that E⟦t_{S0}⟧θiS ∉ dom(σ(c).a, S)) or (∃ θi, θj ∈ B such that θi(O) = θj(O) but E⟦t_{S0}⟧θiS ≠ E⟦t_{S0}⟧θjS)

otherwise B′ = B and S′ = ⟨π′, ν′⟩, where:
    π′(c) = π(c)                  ∀ c ∈ CI
    ν′(oid) = ν(oid)              ∀ oid ∉ O_B
    ν′(oid).ak = ν(oid).ak        ∀ oid ∈ O_B and ak ≠ a
    ν′(oid).ak = v                ∀ oid ∈ O_B, ak = a and if θi(O) = oid then E⟦t_{S0}⟧θiS = {v}

Figure 3. Semantics of atomic update operations (II).

First we have to ensure that these states are legal, that is, that each attribute value is a legal value for the corresponding domain. Thus, we must check that the evaluation of t produces a value which is a legal initialization value for the object state, i.e. for each substitution θi ∈ B, E⟦t_{S0}⟧θiS must be a value in dom(σ(c), S). If this constraint is not satisfied, the pair ⟨∅, S0⟩ is returned, since an error situation has occurred.



Otherwise, given the set V of variables appearing in term t, the set B_V = {θ1|V, . . . , θm|V} is considered. Two restricted substitutions, say θi|V and θj|V, can be equal even if θi ≠ θj. As a consequence, the set B_V includes only k substitutions, k ≤ m, rather than m; i.e. B_V = {σ1, . . . , σk}. Therefore, the evaluation of t in ⟨B_V, S⟩ produces k records, one for each substitution in B_V. Then, k distinct objects are created, each with state ti = E⟦t⟧σiS, i = 1, . . . , k. Two distinct objects with the same state can, however, be created, when two ti's (i = 1, . . . , k) are equal, but these ti's result from the evaluation of t with respect to two different substitutions in B_V. Therefore, these ti's must correspond to different oids. Let OIDNEW = {oid1, . . . , oidk} be the set of the oids of the new, just created objects. The idea is to add to each θi ∈ B the pair ⟨O/oid_f(i)⟩, where oid_f(i) is the object whose state is given by E⟦t⟧θi|V S. To model this correspondence we introduce a function f : {1, . . . , m} → {1, . . . , k} such that f(i) = j if θi|V = σj, i.e. E⟦t⟧θi|V S = tj.

We note that in the state S which precedes the create operation, the state of the oids assigned to the new objects is undefined, i.e. ∀ oidi ∈ OIDNEW, i = 1, . . . , k, ν(oidi) = ⊥. Moreover, each new object state ti can be the result of the evaluation of record t with respect to different substitutions θi. In particular, if each evaluation of t in B returns a different value, then the number of created objects is equal to the number of substitutions in B, i.e. m, and the state of the objects would be defined as ν′(oidi) = E⟦t⟧θiS. Finally, note that the create operation returns O as the output parameter, bound to the oids of the created objects.
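The role of the function f in the create semantics can be sketched in Python as follows. This is an illustrative sketch rather than the formal definition: the domain check that may yield ⟨∅, S0⟩ is omitted, the record term t is modeled as a hypothetical Python function of a restricted substitution, and the oid naming scheme in `fresh_oids` is an assumption.

```python
def fresh_oids(nu, k):
    """Return k oids on which nu is undefined (naming scheme is hypothetical)."""
    oids, i = [], 0
    while len(oids) < k:
        i += 1
        if f"oid{i}" not in nu:
            oids.append(f"oid{i}")
    return oids

def create(c, t, O, variables, B, state):
    pi, nu = state
    restricted = [{x: th[x] for x in variables} for th in B]   # theta_i|V
    sigmas = []                                   # distinct sigma_1 .. sigma_k
    for s in restricted:
        if s not in sigmas:
            sigmas.append(s)
    new = fresh_oids(nu, len(sigmas))             # OIDNEW
    f = [sigmas.index(s) for s in restricted]     # f(i) = j iff theta_i|V = sigma_j
    new_B = [{**th, O: new[j]} for th, j in zip(B, f)]
    new_pi = {**pi, c: pi.get(c, set()) | set(new)}
    new_nu = {**nu, **{oid: t(s) for oid, s in zip(new, sigmas)}}
    return new_B, (new_pi, new_nu)

# Three substitutions, but t only mentions N, so only two objects are created.
B = [{"N": "Ann", "K": 1}, {"N": "Ann", "K": 2}, {"N": "Bob", "K": 3}]
B2, (pi2, nu2) = create("person", lambda s: {"name": s["N"]}, "X", {"N"},
                        B, ({"person": set()}, {}))
```

In this run the first two substitutions agree on the variables of t, so both are extended with the same new oid, while the third receives a second one.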

3.3.2. Deletion

Consider the operation delete(c, O) executed on a set of bindings B in a state S. The delete operation deletes all the objects which are bound to the variable O in B, i.e. $O_B$. Note that for each $oid \in O_B$ the most specific class can be c or a subclass of c.

First of all, we must check whether the deletion is legal, that is, whether it preserves referential integrity. If it does not, i.e. if any other object in the database references one of the objects we are attempting to delete, the update fails and the pair $\langle \emptyset, S_0 \rangle$ is returned, since an error situation has occurred‖.

Otherwise, the set of bindings B remains unchanged but the state S changes. Function π, which represents the proper instances of a class, is updated in the following way. For each class that is not a subclass of c, function π remains the same. By contrast, for c and for each class which is a subclass of c the set $O_B$ is subtracted from the extent. We recall that given two sets X and Y, $X \setminus Y = \{el \mid el \in X \text{ and } el \notin Y\}$. Thus, each object identifier $oid \in O_B$ is, in fact, dropped from the set of objects π(c), where c is its most specific class. For function ν, the new state of the deleted objects is undefined, i.e. $\nu'(oid) = \bot$, $\forall\, oid \in O_B$. Note that, in general, the oids such that $\nu(oid) = \bot$ are the ones that are not currently assigned to any object. They can thus be assigned to new objects. Hence, in our language, oids of deleted objects can be reassigned.

Note finally that the delete operation does not modify the set of bindings B on which it is executed. This is in agreement with the idea we have of B and S: if an update returns output variables (as in the create operation), both B and S change, otherwise only the state is modified (as in the delete operation).

‖Referring to SQL terminology, note that we adopt a restrict semantics to enforce referential integrity; alternative semantics, such as cascade or set null, could also be adopted.
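A minimal Python sketch of deletion under the restrict semantics follows. It rests on several assumptions that are not part of the paper: records are flat dictionaries whose values may be oids, an undefined state is modelled by the absence of the oid from `values`, only references held by surviving objects block the deletion, and `subclasses` maps a class to all of its (transitive) subclasses. The function and parameter names are hypothetical.

```python
def delete(cls, var, bindings, state, initial_state, subclasses):
    """Sketch of delete(c, O) with restrict semantics; returns (bindings, state)."""
    extent, values = state
    doomed = {theta[var] for theta in bindings}          # the set O_B
    # Referential integrity: no surviving object may reference a deleted one.
    for oid, record in values.items():
        if oid not in doomed and any(v in doomed for v in record.values()):
            return [], initial_state                     # error: restore the initial state
    affected = {cls} | subclasses.get(cls, set())
    new_extent = {
        c: (oids - doomed if c in affected else set(oids))
        for c, oids in extent.items()
    }
    new_values = {oid: rec for oid, rec in values.items() if oid not in doomed}
    return bindings, (new_extent, new_values)            # B is unchanged, only S changes


s0 = (
    {"person": {"o1", "o2", "o3"}, "student": {"o4"}},
    {
        "o1": {"spouse": None},
        "o2": {"spouse": None},
        "o3": {"spouse": "o1"},     # o3 references o1
        "o4": {"spouse": None},
    },
)
subclasses = {"person": {"student"}}

rejected = delete("person", "X", [{"X": "o1"}], s0, s0, subclasses)
accepted = delete("person", "X", [{"X": "o2"}, {"X": "o4"}], s0, s0, subclasses)
```

The first call fails because o3 still references o1; the second succeeds and removes o2 from the extent of person and o4 from the extent of the subclass student.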

3.3.3. Specialization

Consider the operation specialize($c_1$, $c_2$, O, t) executed on a set of bindings $B = \{\theta_1, \ldots, \theta_m\}$ in a state S. The intuitive idea is that objects to which variable O is bound migrate from class $c_1$, to which they belong, to class $c_2$, which is a direct subclass of $c_1$, i.e. $c_2 \leq_{ISA} c_1$ and a class c such that $c_2 \leq_{ISA} c \leq_{ISA} c_1$ does not exist. t is the record term providing the values for the new attributes acquired by such objects. Note that one of the problems which arise in executing the specialize operation is that for each $oid \in O_B$ the most specific class can be a subclass of $c_1$, thus objects which already belong to $c_2$ or to a subclass of $c_2$ may be among those to be migrated. In these cases, as stated by the given semantics, such objects are not manipulated by the update and no run-time error is given. Thus, the set of objects manipulated by specialize($c_1$, $c_2$, O, t) is not, in fact, $O_B$, but $O_B \cap \pi(c_1)$, i.e. the set of objects belonging to $O_B$ for which $c_1$ is the most specific class. In the semantic clauses the symbol ∪ denotes record concatenation. Note that the objects in the set $O_B \cap \pi(c_1)$ become instances of class $c_2$, but they are still members of class $c_1$. The new state of each object is obtained by adding to the old state the record $\mathcal{E}[\![t]\!]\,\theta_i\,S$, containing the values for the proper attributes of $c_2$. If any of these values, however, is not correct, i.e. if, for a $\theta_i \in B$, $\nu(oid) \cup \mathcal{E}[\![t]\!]\,\theta_i\,S \notin dom(\sigma(c_2), S)$, then the pair $\langle \emptyset, S_0 \rangle$ is returned, since an error situation has occurred.

3.3.4. Generalization

Consider the operation generalize($c_1$, $c_2$, O) executed on a set of bindings $B = \{\theta_1, \ldots, \theta_m\}$ in a state S. The intuitive idea is that objects to which variable O is bound migrate from class $c_1$, to which they belong, to class $c_2$. In the semantic clauses $\nu(oid)_{|c_2}$ denotes the restriction of $\nu(oid)$ to attributes of class $c_2$; i.e. $\nu'(oid).a = \nu(oid).a$ if a is an attribute of $c_2$, otherwise $\nu'(oid).a = \bot$. Note that the objects in the set $O_B$ become instances of class $c_2$, while they are removed from the extent of class $c_1$. Thus, since a generalization is a partial deletion, referential integrity must also be ensured for generalization; i.e. if a reference to an object in $O_B$ exists through an attribute whose type is a class more specific than $c_2$, the generalization is not allowed, since that object, once migrated to $c_2$, would no longer be a legal value for that attribute. In that case, the pair $\langle \emptyset, S_0 \rangle$ is returned, since an error situation has occurred.

3.3.5. Attribute modification

Consider the operation modify(c.a, O, t) executed on a set of bindings $B = \{\theta_1, \ldots, \theta_m\}$ in a state S. Three cases can be distinguished, the first and second corresponding to error situations and the last one corresponding to a correct situation. If term t evaluates to a value not belonging to $dom(\sigma(c).a, S)$, or if it evaluates to different values corresponding to two different substitutions $\theta_i, \theta_j \in B$ which assign the same value to variable O, then an error situation occurs and the initial database state is restored. In the first case, the value to be assigned to the attribute is not legal, thus this obviously corresponds to an error situation. In the second case, variable O has the same value, say object o, in the two substitutions, whereas t has different values, say $v_1$ and $v_2$. Thus it is not possible to decide which value, between $v_1$ and $v_2$, to assign to attribute a of object o. If no problems of these kinds arise, the value resulting from the evaluation of t is assigned to attribute a of the object to which O evaluates. Thus, this semantics models intrinsic ambiguity problems and clearly states that the evaluation of the term t must be in the state preceding the operation execution.
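The three cases of attribute modification can be illustrated with a small Python sketch. The representation is an assumption made for illustration only: substitutions are dictionaries, `evaluate` stands for the evaluation of term t, and `is_legal` stands for the domain check; none of these names come from the paper.

```python
def modify(attr, var, evaluate, is_legal, bindings, state, initial_state):
    """Sketch of modify(c.a, O, t); returns (bindings, state) or ([], initial_state)."""
    extent, values = state
    assigned = {}                                   # oid -> value to be written
    for theta in bindings:
        oid = theta[var]
        value = evaluate(theta, values)             # t is evaluated in the state preceding the update
        if not is_legal(value):
            return [], initial_state                # case 1: illegal value
        if oid in assigned and assigned[oid] != value:
            return [], initial_state                # case 2: two different values for one object
        assigned[oid] = value
    new_values = {oid: dict(rec) for oid, rec in values.items()}
    for oid, value in assigned.items():             # case 3: perform the assignments
        new_values[oid][attr] = value
    return bindings, (extent, new_values)


s0 = ({"person": {"o1", "o2"}}, {"o1": {"money": 10}, "o2": {"money": 20}})
term = lambda theta, values: theta["V"]
is_int = lambda v: isinstance(v, int)

ok = modify("money", "X", term, is_int, [{"X": "o1", "V": 5}, {"X": "o2", "V": 7}], s0, s0)
ambiguous = modify("money", "X", term, is_int, [{"X": "o1", "V": 5}, {"X": "o1", "V": 6}], s0, s0)
illegal = modify("money", "X", term, is_int, [{"X": "o1", "V": "abc"}], s0, s0)
```

All values are computed before any assignment is made, so every evaluation of t sees the state that precedes the operation.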

In what follows some examples of the semantic evaluation of some atomic updates are presented.

Example 13. Let $\theta_1 = \{X/o_1\}$ and $\theta_2 = \{X/o_2\}$ be two substitutions, and $B = \{\theta_1, \theta_2\}$ be a set of bindings. Moreover, let S be the database state presented in Example 10. In the following we present the semantics of some updates presented in Example 8 according to the semantics defined in Figures 2 and 3.

(i) $\mathcal{U}[\![\text{create(person, } t \text{, Z)}]\!]\,B\,S = \langle B', S' \rangle$, where

t = (name:‘John Smith’, address:‘29 Red St.London’, phone:‘071-643764’, birthday:(day:8, month:10, year:1969), spouse:null, money:5000, debit:0).

First of all, according to the semantics of the create operation, we have to check that the values obtained by applying the substitutions in B to term t, defining the state of the newly created objects, are legal values for σ(person). In this example, t is a ground term and hence a legal value for σ(person). Then, the set V of variables appearing in term t and function f must be determined. Since no variables appear in t, then $V = \emptyset$, and thus $\theta_{i|V} = \emptyset$, for each $i \in \{1, 2\}$. Then only one substitution $\sigma_1$ exists, and $f : \{1, 2\} \to \{1\}$ is such that $f(1) = f(2) = 1$. In this case only one object of type person is created, thus OIDNEW $= \{o_1\}$. The resulting pair $\langle B', S' \rangle$ is defined as follows:

$B' = \{\theta_1 \cup \langle Z/o_1 \rangle, \theta_2 \cup \langle Z/o_1 \rangle\}$
$S' = \langle \pi', \nu' \rangle$

where $\pi'(\text{person}) = \pi(\text{person}) \cup \{o_1\}$; $\pi'(c) = \pi(c)$ if $c \neq$ person; and $\nu'(o_1)$ = (name:‘John Smith’, address:‘29 Red St.London’, phone:‘071-643764’, birthday:(day:8, month:10, year:1969), spouse:null, money:5000, debit:0), and $\nu'(oid) = \nu(oid)$ if $oid \neq o_1$.

(ii) $\mathcal{U}[\![\text{delete(person, X)}]\!]\,B\,S = \langle B', S' \rangle$. Let X be a variable declared of type person. We recall that $X_B = \{\theta_1(X)\} \cup \{\theta_2(X)\} = \{o_1, o_2\}$. The resulting pair $\langle B', S' \rangle$ is defined as follows:

$B' = B$
$S' = \langle \pi', \nu' \rangle$

where $\pi'(c) = \pi(c) = \emptyset$ for each class $c \neq$ person and $\pi'(\text{person}) = \pi(\text{person}) \setminus \{o_1, o_2\}$; and $\nu'(oid) = \nu(oid)$ if $oid \notin \{o_1, o_2\}$ and $\nu'(o_1) = \nu'(o_2) = \bot$. Note that the deletion is indeed allowed, since it leaves no dangling references.

3.4. Semantics: method invocations

Method invocations are the only non-atomic updates in our language. A method is implemented by several rules, where each rule has the form: $condition \to u_1; \ldots; u_n$. Hence, we first formalize the semantics of update concatenation and the semantics of conditions. Then the semantics of each rule in the method implementation is presented and finally we specify the way in which the different rules interact with each other.

3.4.1. Concatenation

The semantics of concatenation is defined inductively on the number of updates in the sequence as follows.

Definition 25. (Semantics of concatenation) Let $u_1, \ldots, u_n$ be updates according to Definition 10, let B be a set of bindings, $S_0$ the initial state and S a database state; then

\[
\mathcal{U}[\![u_1; u_2; \ldots; u_n]\!]\,B\,S =
\begin{cases}
\langle \emptyset, S_0 \rangle & \text{if } \mathcal{U}[\![(u_1)_{S_0}]\!]\,B\,S = \langle \emptyset, S_0 \rangle \\
\mathcal{U}[\![(u_2; \ldots; u_n)_{S_0}]\!]\,(\mathcal{U}[\![(u_1)_{S_0}]\!]\,B\,S) & \text{otherwise}
\end{cases}
\]

This definition is an inductive definition where the basis (n = 1) consists of sequences composed of a single update. Each update in turn can be an atomic update, whose semantics is specified according to Figures 2 and 3, or a method invocation, whose semantics is specified later according to Definition 31.
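The error-propagating behaviour of concatenation can be sketched as a simple loop. The sketch assumes that each update is a Python callable taking (bindings, state, initial_state) and returning a pair, and that the error result is the pair made of an empty set of bindings and the initial state itself; both conventions are assumptions made for illustration.

```python
def run_sequence(updates, bindings, state, initial_state):
    """Sketch of u1; ...; un: stop at the first update that yields the error pair."""
    for update in updates:
        bindings, state = update(bindings, state, initial_state)
        if not bindings and state is initial_state:
            return [], initial_state          # error: the remaining updates are skipped
    return bindings, state


log = []

def increment(bindings, state, initial_state):
    log.append("inc")
    return bindings, {"n": state["n"] + 1}

def fail(bindings, state, initial_state):
    log.append("fail")
    return [], initial_state

s0 = {"n": 0}
done = run_sequence([increment, increment], [{"X": 1}], s0, s0)
aborted = run_sequence([increment, fail, increment], [{"X": 1}], s0, s0)
```

Each update receives the bindings and the state produced by the previous one, which is the nesting expressed by the second case of Definition 25.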

3.4.2. Conditions

The semantic domain of function $\mathcal{C}$ is a set of well-formed formulas in our language. Intuitively, the evaluation of a condition on a set of bindings B and in a state S returns the subset of the Cartesian product of the values of the variables appearing in the formula that satisfies the formula itself, as specified by the following definition.

Definition 26. (Semantics of conditions) Let C be a formula according to Definition 7, let $B = \{\theta_1, \ldots, \theta_m\}$ be a set of bindings, $S_0$ the initial state and S a database state. The evaluation of condition C in $\langle B, S \rangle$ is defined as

\[
\mathcal{C}[\![C_{S_0}]\!]\,B\,S =
\begin{cases}
\langle \emptyset, S_0 \rangle & \text{if run-time errors occur in the evaluation of } C \\
\bigcup_{i=1}^{m} \mathcal{C}[\![C_{S_0}]\!]\,\theta_i\,S & \text{otherwise}
\end{cases}
\]

where $\mathcal{C}[\![C_{S_0}]\!]\,\theta_i\,S = \{\theta^i_1, \ldots, \theta^i_k\}$ and $\forall j = 1, \ldots, k$,

• $\theta_i \subseteq \theta^i_j$, $i = 1, \ldots, m$, i.e. each substitution $\theta^i_j$ coincides with $\theta_i$ on the variables for which $\theta_i$ is defined;

• $C\theta^i_j$, $i = 1, \ldots, m$, is a ground formula which evaluates to true in state S.

Note that the condition evaluation returns $\langle \emptyset, S_0 \rangle$ when a run-time error (such as a division by zero) occurs in the condition evaluation. Otherwise, the result of the evaluation of the condition C is a set of substitutions, each grounding for C and such that the instantiation of C with respect to each of them evaluates to true in state S according to first-order logic. Each substitution $\theta_i \in B$ partially instantiates the formula C, while substitutions $\theta^i_1, \ldots, \theta^i_k$ are extensions of $\theta_i$ specifying the bindings for the variables in C not bound by $\theta_i$. More formally, if Var(C) denotes the variables appearing in condition C and dom(θ) denotes the domain of substitution θ, according to Definition 20, then $dom(\theta^i_j) = Var(C) \cup dom(\theta_i)$.



Example 14. Consider the simple class formula c(X). $\mathcal{C}[\![c(X)_{S_0}]\!]\,B\,S = B'$ where $B = \{\theta_1, \ldots, \theta_m\}$ and X does not appear in B. $B' = \{\theta_1 \cup \{X/oid_1\}, \ldots, \theta_1 \cup \{X/oid_n\}, \ldots, \theta_m \cup \{X/oid_1\}, \ldots, \theta_m \cup \{X/oid_n\}\}$ where $S.\pi^*(c) = \{oid_1, \ldots, oid_n\}$. In this case, since $Var(C) \cap dom(\theta_i) = \emptyset$, $i = 1, \ldots, m$, the set of substitutions generated for each $\theta_i \in B$ is the same, i.e. $\{\{X/oid_1\}, \ldots, \{X/oid_n\}\}$. For each substitution $\theta'_i$ belonging to $B'$, $(c(X))\theta'_i = c(\theta'_i(X)) = c(oid_i) = true$ since $oid_i \in S.\pi^*(c)$ by definition of $B'$.

Based on first-order logic, the semantics of conditions can be easily formalized. We do not introduce all the clauses here; rather, we focus on the semantics of other constructs in our language for which no well-known formalisms such as first-order logic exist.
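As an illustration of how a condition extends a set of bindings, the class formula c(X) of Example 14 can be evaluated with the following Python sketch. The dictionary `members`, which stands for the extent function π* of the state, and the function name are hypothetical; the case in which X is already bound is handled as a plain membership test, which is an assumption consistent with Definition 26.

```python
def eval_class_formula(cls, var, bindings, members):
    """Sketch of the evaluation of c(X): extend each substitution with every member of c."""
    result = []
    for theta in bindings:
        if var in theta:
            # X already bound: keep the substitution only if it satisfies c(X).
            if theta[var] in members[cls]:
                result.append(theta)
        else:
            # X unbound: one extension of theta per member of the class.
            result.extend({**theta, var: oid} for oid in sorted(members[cls]))
    return result


members = {"c": {"oid1", "oid2"}}
extended = eval_class_formula("c", "X", [{"Y": 1}, {"Y": 2}], members)
checked = eval_class_formula("c", "X", [{"X": "oid1"}, {"X": "oid9"}], members)
```

With m = 2 input substitutions and n = 2 members, the first call produces the m × n = 4 substitutions listed in Example 14.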

3.4.3. Method invocations

We recall that a method invocation has the form $O.m(t_1, \ldots, t_n)$ where $t_1, \ldots, t_n$ are terms. Moreover, the first j parameters are assumed to be input parameters and the remaining parameters are assumed to be output parameters. $t_1, \ldots, t_j$ are thus terms which represent the values associated with the formal input parameters; $t_{j+1}, \ldots, t_n$ are, by contrast, variables.

In formalizing the semantics of method invocations, we can distinguish four phases:

(i) selection of the method implementation,
(ii) generation of the set of bindings in which the rule conditions are evaluated,
(iii) evaluation of the method implementation and
(iv) generation of the set of bindings to return as a result.

Phase (iii) differs for side-effect-free methods and those with side effects, whereas the other phases are not affected by this difference. In the following, we discuss in detail the various phases.

3.4.3.1. Selection of method implementation. The first problem arising in the definition of the semantics of method invocations is to establish which method has to be invoked for each oid bound to the variable on which the method is invoked. Our language supports late binding, i.e. at run-time, for each object the method related to its most specific class is invoked. In what follows d denotes the dispatching function, whose definition is: given the object O and the method m, $d(O, m) = c$ where c is the class whose implementation for m is selected for execution. Note that the most specific implementation is selected for each object, i.e. $d(\theta_i(O), m) = \min_{\leq_{ISA}} \{c \mid \theta_i(O) \in S.\pi^*(c) \wedge m \in \mu(c)\}$.

Late binding, however, complicates the semantics of set-oriented method invocations, especially for methods with side effects. Suppose we invoke method m on object O, with respect to a set of bindings B; then the set $O_B = \{oid_1, \ldots, oid_m\}$ can be partitioned into several disjoint subsets of oids according to the method implementation selected for each object. Therefore, each subset contains the oids of objects for which the same method implementation is considered, as stated by the following definition.

Definition 27. (Partitioning) Let $B = \{\theta_1, \ldots, \theta_m\}$ be a set of bindings. The dispatching function d induces a partitioning $D(B) = \{B_1, \ldots, B_k\}$ of B such that for $j = 1, \ldots, k$, $B_j = \{\theta_i \mid \theta_i \in B \wedge d(\theta_i(O), m) = c_j\}$.
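Late binding and the induced partitioning can be sketched in Python as follows. The sketch assumes single inheritance, so that the most specific implementing class is found by walking up the ISA chain from the most specific class of the object; the maps `class_of`, `parent` and `implements` and the sample schema are hypothetical.

```python
def dispatch(oid, method, class_of, parent, implements):
    """Sketch of d(oid, m): the closest class, going up the ISA chain, that implements m."""
    cls = class_of[oid]                      # most specific class of the object
    while cls is not None:
        if method in implements.get(cls, set()):
            return cls
        cls = parent.get(cls)                # move to the direct superclass
    raise LookupError(f"no implementation of {method!r} for {oid!r}")


def partition(bindings, var, method, class_of, parent, implements):
    """Sketch of D(B): group the substitutions by the implementation selected for theta(var)."""
    blocks = {}
    for theta in bindings:
        cls = dispatch(theta[var], method, class_of, parent, implements)
        blocks.setdefault(cls, []).append(theta)
    return blocks


parent = {"manager": "employee", "employee": "person", "person": None}
implements = {"person": {"age"}, "employee": {"age", "raise_salary"}}
class_of = {"o1": "manager", "o2": "employee", "o3": "person"}
bindings = [{"O": "o1"}, {"O": "o2"}, {"O": "o3"}]
blocks = partition(bindings, "O", "age", class_of, parent, implements)
```

Every substitution ends up in exactly one block, so the blocks are disjoint and cover B.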

According to the set-oriented approach, given the invocation $O.m(t_1, \ldots, t_n)$ to be executed in the state S with respect to the set of bindings B, the body of method m should be executed in the state S on each object O and on each parameter evaluation corresponding to a substitution in B. Note that each of these body executions transforms the database from one state to another. Therefore, when different method implementations must be executed for different sets of objects, the database transformation into a new state is not atomic. Each implementation may perform completely different updates on the database. In Section 4 we discuss which conditions must be imposed to ensure absence of conflicts.

3.4.3.2. Generation of the set of bindings in which the rule conditions are evaluated. Let m be a method name, $B = \{\theta_1, \ldots, \theta_m\}$ a set of bindings and S a database state. We define the semantics of $O.m(t_1, \ldots, t_n)$ where $t_1, \ldots, t_j$ are the actual input parameters and $t_{j+1}, \ldots, t_n$ are the actual output parameters. Since we assume that the database schema meets all the static constraints, the formal parameters of method m are $P_1, \ldots, P_n$, and a correspondence between the actual parameters $t_1, \ldots, t_n$ and the formal parameters $P_1, \ldots, P_n$ must be established. For each substitution $\theta_i$ we generate a substitution, $\theta_{i/P}$, local to the method. A local substitution establishes the correspondence between each formal input parameter $P_i$, $i = 1, \ldots, j$, and the corresponding actual input parameter $t_i$. No binding is established for the output parameters, since the substitutions obtained from the evaluation of the conditions implementing the method establish such bindings. We recall that variable Self is implicitly included in the input parameters and, locally to the method, it is bound to the objects corresponding to variable O. The set of bindings local to the method is, thus, obtained as specified by the following definition.

Definition 28. (Local substitution) Let $O.m(t_1, \ldots, t_n)$ be a method invocation. Let $P_1, \ldots, P_j$ be the formal input parameters of method m and $P = \{P_1, \ldots, P_j\}$. Let $\theta_i$ be a substitution; then the substitution local to m, $\theta_{i/P}$, is:

\[
\theta_{i/P} = \{P_1/\mathcal{E}[\![t_1]\!]\,\theta_i\,S, \ldots, P_j/\mathcal{E}[\![t_j]\!]\,\theta_i\,S, \mathit{Self}/\mathcal{E}[\![O]\!]\,\theta_i\,S\}
\]

Given a set of bindings $B = \{\theta_1, \ldots, \theta_m\}$, $B_{/P} = \{\theta_{i/P} \mid i = 1, \ldots, m\}$.

3.4.3.3. Evaluation of method implementation: side-effect-free methods. The semantics of a side-effect-free method invocation, which is an atomic formula, given a database state S and a set of bindings $B = \{\theta_1, \ldots, \theta_m\}$, is defined as follows.

Definition 29. (Semantics of side-effect-free method invocations) Let $O.m(t_1, \ldots, t_n)$ be a side-effect-free method invocation, let S be a database state and $B = \{\theta_1, \ldots, \theta_m\}$ a set of bindings. Then,

\[
\mathcal{C}[\![O.m(t_1, \ldots, t_n)_{S_0}]\!]\,B\,S = \mathcal{C}[\![O.m(t_1, \ldots, t_n)_{S_0}]\!]\,B_{/P}\,S = \mathcal{C}[\![O.m(t_1, \ldots, t_n)_{S_0}]\!]\,\theta_{1/P}\,S \cup \cdots \cup \mathcal{C}[\![O.m(t_1, \ldots, t_n)_{S_0}]\!]\,\theta_{m/P}\,S = B'_{/P}
\]

where $\mathcal{C}[\![O.m(t_1, \ldots, t_n)_{S_0}]\!]\,\theta_{i/P}\,S$, for $i = 1, \ldots, m$, is defined as follows.

Let $d(\mathcal{E}[\![O_{S_0}]\!]\,\theta_{i/P}\,S, m) = c$ and let the implementation for method m in class c be $m(P_1, \ldots, P_n) : condition_1 \| \cdots \| condition_k$. Then,

\[
\mathcal{C}[\![O.m(t_1, \ldots, t_n)_{S_0}]\!]\,\theta_{i/P}\,S = (\mathcal{C}[\![(condition_1)_{S_0}]\!]\,\theta_{i/P}\,S) \cup \cdots \cup (\mathcal{C}[\![(condition_k)_{S_0}]\!]\,\theta_{i/P}\,S)
\]

Note that, because of the definition of condition evaluation, this set of substitutions $\mathcal{C}[\![O.m(t_1, \ldots, t_n)_{S_0}]\!]\,\theta_{i/P}\,S$ is the union of the sets of substitutions each of which verifies one of the conditions. Thus, for methods without side effects, the order in which the conditions appear, i.e. the order of rules, is not relevant.



3.4.3.4. Evaluation of method implementation: methods with side effects. For methods with side effects, by contrast, the order of rules is relevant. Consider a method implementation consisting of m rules $condition_1 \to action_1 \| \cdots \| condition_m \to action_m$. As we stated earlier, the evaluation of the condition for each rule results in a set of bindings. For the ith condition, $1 \leq i \leq m$, let $B_i$ denote $\mathcal{C}[\![(condition_i)_{S_0}]\!]\,B\,S$. Note that we omit the restriction to P in this context, because it is irrelevant and makes the notation heavier.

One of the main issues in giving the semantics of a method implemented by several rules is that the sets of bindings returned by the condition evaluations are not, in general, disjoint. Intuitively we want a method to perform different operations corresponding to different sets of objects. If, in the same method, two conditions return the same set of bindings B, then both the operations in the first action, related to the first condition, and the operations in the second action, related to the second condition, would be executed on such a set. We could impose that the sets of bindings resulting from the evaluation of the conditions of different rules be pairwise disjoint, i.e. $B_i \cap B_j = \emptyset$ for $i \neq j$, and return an error whenever such a restriction is not fulfilled. In our opinion such a restriction would be too strict. Suppose that two $B_i$'s have in common a single substitution θ. Because of such a substitution, the method invocation would fail, even when all other substitutions raise no errors. Thus, we have chosen to take into account the order among the rules in the implementation of a method. If a substitution θ is shared between two $B_i$'s, only the updates related to the first rule whose condition evaluation includes θ are performed on such a substitution. Thus, the evaluation of the body of the ith rule takes place in $B_i \setminus (B_1 \cup \cdots \cup B_{i-1})$, i.e. in $B_i$ from which all the substitutions already considered by the previous rules have been eliminated, if they are present.

Note that this semantics does not prevent the same object from being modified by two different rules. As an example, consider two substitutions θ′ and θ′′ belonging to two different $B_i$'s, such that for a variable X, θ′(X) = θ′′(X) holds. Correct handling of these cases is left to the programmer, who must ensure that the variable X is used only as an input parameter or that, if changes are subsequently performed, they are coherent.

The basic idea in dealing with the issues related to the $B_i$'s is that, intuitively, we want a method to perform different updates for different sets of objects. In a certain sense, we would like rules to be mutually exclusive. Since it is difficult to establish whether the rules in a method always modify disjoint sets of objects for every possible database state, our semantics solves this problem by making the $B_i$'s disjoint if they are not, by using the implicit order of the rules. If the $B_i$'s are already disjoint, mutual exclusion is automatic.

To model this notion of ‘binding consumption’, we introduce the notion of filtered set of bindings.

Definition 30. (Filtered set of bindings) Let $m(P_1, \ldots, P_n) : condition_1 \to action_1 \| \cdots \| condition_m \to action_m$ be a method implementation, let S be a database state and B a set of bindings. Moreover, for $1 \leq i \leq m$, let $B_{i/P}$ denote $\mathcal{C}[\![(condition_i)_{S_0}]\!]\,B_{/P}\,S$. The filtered set of bindings $FB_{i/P}$ is simply defined as

\[
FB_{i/P} = B_{i/P} \setminus (B_{1/P} \cup \cdots \cup B_{i-1/P})
\]

Thus, starting from this filtered set of bindings $FB_{i/P}$ and from the state $S_{i-1}$, the update sequence of the body of the ith rule is evaluated in $\mathcal{U}[\![(action_i)_{S_0}]\!]\,FB_{i/P}\,S_{i-1}$. Note that the set of bindings in which the update sequence constituting the action of the ith rule is evaluated is not $B_{i/P}$; rather, it is $FB_{i/P} = B_{i/P} \setminus (B_{1/P} \cup \cdots \cup B_{i-1/P})$.
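The notion of binding consumption can be illustrated with a few lines of Python. Substitutions are modelled as dictionaries and each condition result as a list without duplicates; the function name and the sample data are hypothetical.

```python
def filtered_bindings(condition_results):
    """Sketch of FB_i = B_i minus (B_1 union ... union B_{i-1}), for every rule i in order."""
    consumed = []                 # substitutions already taken by an earlier rule
    filtered = []
    for b_i in condition_results:
        fb_i = [theta for theta in b_i if theta not in consumed]
        filtered.append(fb_i)
        consumed.extend(fb_i)     # consumed now equals B_1 union ... union B_i
    return filtered


b1 = [{"X": "o1"}, {"X": "o2"}]
b2 = [{"X": "o2"}, {"X": "o3"}]
b3 = [{"X": "o1"}, {"X": "o3"}, {"X": "o4"}]
fb1, fb2, fb3 = filtered_bindings([b1, b2, b3])
```

The substitution {X/o2} satisfies the conditions of the first two rules but is handed only to the first one, so the filtered sets are pairwise disjoint by construction.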



The set of bindings $B'_{/P}$ produced by the evaluation of the method invocation is the union of all the $B^i_f$'s resulting from the evaluation of the different rules. The resulting substitutions in $B'_{/P}$ are bound to the local set of bindings restricted to the formal parameters. Therefore, this set of bindings must be joined to the set B of bindings on which the method has been invoked, and a correspondence must be established between the formal and the actual output parameters. We address this issue in the following paragraph.

For the state, the semantics of the body of the ith rule is computed in the state $S_{i-1}$ resulting from the evaluation of the body of the previous rule, according to the implicit order.

The semantics of a method invocation for methods with side effects is thus defined as follows.

Definition 31. (Semantics of method invocations with side effects) Let $O.m(t_1, \ldots, t_n)$ be a method invocation with side effects, let S be a database state and B a set of bindings. Let $d(O_B, m) = c$∗∗ and let the implementation for method m in class c be

\[
m(P_1, \ldots, P_n) : condition_1 \to u^1_1; \ldots; u^1_n \,\|\, \cdots \,\|\, condition_m \to u^m_1; \ldots; u^m_k.
\]

Then,

\[
\mathcal{U}[\![O.m(t_1, \ldots, t_n)_{S_0}]\!]\,B\,S = \mathcal{U}[\![O.m(t_1, \ldots, t_n)_{S_0}]\!]\,B_{/P}\,S = \mathcal{U}[\![(condition_1 \to u^1_1; \ldots; u^1_n \,\|\, \cdots \,\|\, condition_m \to u^m_1; \ldots; u^m_k)_{S_0}]\!]\,B_{/P}\,S = \langle B'_{/P}, S' \rangle
\]

where

\[
B'_{/P} = \bigcup_{i=1}^{m} \pi_1(\mathcal{U}[\![(u^i_1; \ldots; u^i_h)_{S_0}]\!]\,FB_{i/P}\,S_{i-1}) = \bigcup_{i=1}^{m} \pi_1(\langle B^i_f, S^i_f \rangle)
\]

and $S' = S_m$ is defined by the following recurrence:

\[
\begin{cases}
S_0 = S \\
S_i = \pi_2(\mathcal{U}[\![(u^i_1; \ldots; u^i_h)_{S_0}]\!]\,FB_{i/P}\,S_{i-1})
\end{cases}
\]

where $\pi_i$, $i = 1, 2$, denotes the projection of the ith component of a pair $\langle B, S \rangle$.

3.4.3.5. Generation of the set of bindings to return as a result. Two cases must be distinguished:

(i) the method with output parameters and
(ii) the method without output parameters.

If the method has output parameters, before defining the set $B'$, we have to establish the link between the formal output parameters $P_x$ ($x = j + 1, \ldots, n$) and the actual output parameters $t_x$ ($x = j + 1, \ldots, n$). Let $B_{output}$ be the set defined as follows. Intuitively, the set $B_{output}$ is the set $B'_{/P}$ corresponding to the actual output parameters.

Definition 32. ($B_{output}$) Let $B'_{/P} = \{\theta'_{1/P}, \ldots, \theta'_{k/P}\}$ as before; for each substitution $\theta'_{i/P} \in B'_{/P}$ let $\theta_{i\,output}$ be defined as follows:

• if $\theta'_{i/P}(P_x) = v$ then $\theta_{i\,output}(t_x) = v$, $x = j + 1, \ldots, n$;

• if $\theta'_{i/P}(P_x) = \bot$ then $\theta_{i\,output}(t_x) = null$, $x = j + 1, \ldots, n$.

Then $B_{output} = \{\theta_{1\,output}, \ldots, \theta_{h\,output}\}$.

∗∗For the sake of simplicity we consider the case in which this class is unique. The extension to more than one class will be presented in Section 4.



Let $f : \{1, \ldots, k\} \to \{1, \ldots, m\}$, $m \leq k$, be a function such that given $B = \{\theta_1, \ldots, \theta_m\}$ and $B'_{/P} = \{\theta'_{1/P}, \ldots, \theta'_{k/P}\}$ as before, $f(i) = j$ if $\theta'_{i/P}(P_x) = \mathcal{E}[\![t_x]\!]\,\theta_j\,S$, where $P_x$ is a formal parameter and $t_x$ is an actual parameter.

To define $B'$ we have to join each substitution $\theta_{i\,output}$, which binds the actual output parameters, with the substitution $\theta_{f(i)}$, with respect to which the corresponding actual input parameters have been evaluated, as stated in the following definition.

Definition 33. (Resulting set of bindings for methods with output parameters) Let f be the function defined earlier, then

\[
B' = \bigcup_{i=1}^{m} \{\theta_{f(i)} \cup \theta_{i\,output}\}
\]

where $\theta_{f(i)} \in B$ and $\theta_{i\,output} \in B_{output}$.

In the second case, i.e. if the method has no output parameters, two subcases must be distinguished: either the method is side-effect-free, or it has side effects. $B'$ is defined as follows.

Definition 34. (Resulting set of bindings for methods without output parameters) Let f be the previously defined function, then

\[
B' =
\begin{cases}
\bigcup_{j=1}^{m} \{\theta_j \in B \text{ s.t. } \exists\, i,\ 1 \leq i \leq k,\ f(i) = j\} & \text{for side-effect-free methods} \\
B & \text{otherwise}
\end{cases}
\]

Intuitively, methods implementing Boolean predicates have no output parameters. Therefore, for each substitution $\theta_i \in B$ they compute the corresponding $\theta_{i/P}$. If such a substitution verifies one of the conditions implementing the method, then the corresponding substitution $\theta_i \in B$ is given as the result in $B'$.

This semantics formalizes cases (i) and (ii) introduced at the beginning of the section for side-effect-free methods. In case (i), the returned set of bindings is completed by including the output values of the method corresponding to each substitution of input parameters which satisfies the conditions. In case (ii), the returned set of bindings consists of the substitutions in B verifying one of the conditions. If the method has side effects but no output parameters, the method aims at updating the database; thus the database state changes whereas the set of bindings remains invariant, i.e. $B' = B$.

The following example illustrates the semantic evaluation of a method invocation.

Example 15. Here we report the semantic evaluation of the method raise_salary presented in Example 9, whose implementation is the following:

raise_salary(amount):
    M = Self.boss →
        modify(employee.monthly_salary, Self, M.monthly_salary);
        modify(employee.monthly_salary, M, M.monthly_salary + amount)

Consider the invocation of method raise_salary with respect to the following set of bindings $B = \{\theta_1, \theta_2\}$ such that $\theta_1 = \{O/o_1, amount/100\}$ and $\theta_2 = \{O/o_2, amount/100\}$, and a database state $S = \langle \pi, \nu \rangle$ such that $S.\nu(o_1).boss = o_2$ and $S.\nu(o_2).boss = o_2$ (i.e. $o_2$ is manager of himself), and such that $S.\nu(o_1).monthly\_salary = 200$ and $S.\nu(o_2).monthly\_salary = 400$.



In this example we present the semantics of the method invocation O.raise_salary(amount) with respect to the set of bindings B and the state S. In order to focus on the important characteristics of the presented semantics, we do not detail all the semantic steps. To keep the formulas short, in what follows $u_1$ abbreviates the update modify(employee.monthly_salary, Self, M.monthly_salary) and $u_2$ abbreviates the update modify(employee.monthly_salary, M, M.monthly_salary + amount).

Since we assume that no run-time errors occur in the evaluation of O.raise_salary(amount), the initial state $S_0$ is not needed, thus it is omitted to simplify the notation.

Before evaluating $\mathcal{U}[\![\text{O.raise\_salary(amount)}]\!]\,B\,S$, thus obtaining a pair $\langle B', S' \rangle$, we have to compute the local set of bindings $B_{/P}$. According to Definition 28,

\[
B_{/P} = \{\{\mathit{Self}/o_1, amount/100\}, \{\mathit{Self}/o_2, amount/100\}\}
\]

Thus, according to Definition 31,

\[
\mathcal{U}[\![\text{O.raise\_salary(amount)}]\!]\,B\,S = \mathcal{U}[\![\text{O.raise\_salary(amount)}]\!]\,B_{/P}\,S = \mathcal{U}[\![M = \mathit{Self}.boss \to u_1; u_2]\!]\,B_{/P}\,S = \langle B'_{/P}, S' \rangle.
\]

Since the body of the method is composed of a single rule, we have:

\[
B'_{/P} = \bigcup_{i=1}^{1} \pi_1(\mathcal{U}[\![u_1; u_2]\!]\,B_{1/P}\,S)
\]

where $B_{1/P} = \mathcal{C}[\![M = \mathit{Self}.boss]\!]\,B_{/P}\,S = \{\theta'_{1/P}, \theta'_{2/P}\} = \{\{\mathit{Self}/o_1, amount/100, M/o_2\}, \{\mathit{Self}/o_2, amount/100, M/o_2\}\}$, and

\[
S' = \pi_2(\mathcal{U}[\![u_1; u_2]\!]\,B_{1/P}\,S)
\]

According to Definition 25,

\[
\mathcal{U}[\![u_1; u_2]\!]\,B_{1/P}\,S = \mathcal{U}[\![u_2]\!]\,(\mathcal{U}[\![u_1]\!]\,B_{1/P}\,S)
\]

The semantics of the modify operation is specified in Figure 3. The first condition of the modify operation semantics, i.e. $\exists\, \theta_i, \theta_j \in B_{1/P}$ such that $\theta_i(\mathit{Self}) = \theta_j(\mathit{Self})$ but $\mathcal{E}[\![\text{M.monthly\_salary}]\!]\,\theta_i\,S \neq \mathcal{E}[\![\text{M.monthly\_salary}]\!]\,\theta_j\,S$, is not satisfied. We recall that a run-time error occurs if this condition is satisfied. Thus we have:

$\mathcal{U}[\![u_1]\!]\,B_{1/P}\,S = \langle B_{1/P}, S_1 \rangle = \langle B_{1/P}, \langle \pi, \nu' \rangle \rangle$ where

$\nu'(oid) = \nu(oid)$ $\forall\, oid \notin \mathit{Self}_{B_{1/P}} = \{o_1, o_2\}$
$\nu'(oid).a_k = \nu(oid).a_k$ $\forall\, oid \in \mathit{Self}_{B_{1/P}} = \{o_1, o_2\}$ and $a_k \neq$ monthly_salary
$\nu'(o_1).monthly\_salary = 400$ since $\theta'_{1/P}(\mathit{Self}) = o_1$ and $\mathcal{E}[\![\text{M.monthly\_salary}]\!]\,\theta'_{1/P}\,S = \{400\}$
$\nu'(o_2).monthly\_salary = 400$ since $\theta'_{2/P}(\mathit{Self}) = o_2$ and $\mathcal{E}[\![\text{M.monthly\_salary}]\!]\,\theta'_{2/P}\,S = \{400\}$

$\mathcal{U}[\![u_2]\!]\,B_{1/P}\,S_1$ is then computed analogously, giving as result the pair $\langle B_{1/P}, S_2 \rangle = \langle B_{1/P}, \langle \pi, \nu'' \rangle \rangle$ where

$\nu''(oid) = \nu'(oid)$ $\forall\, oid \notin M_{B_{1/P}} = \{o_2\}$
$\nu''(oid).a_k = \nu'(o_2).a_k$ $\forall\, oid \in M_{B_{1/P}} = \{o_2\}$ and $a_k \neq$ monthly_salary
$\nu''(o_2).monthly\_salary = 500$ since $\theta'_{1/P}(M) = \theta'_{2/P}(M) = o_2$ and $\mathcal{E}[\![\text{M.monthly\_salary + amount}]\!]\,\theta'_{1/P}\,S = \mathcal{E}[\![\text{M.monthly\_salary + amount}]\!]\,\theta'_{2/P}\,S = \{500\}$.

This implies that

\[
B'_{/P} = \bigcup_{i=1}^{1} \pi_1(\langle B_{1/P}, \langle \pi, \nu'' \rangle \rangle) = B_{1/P}
\]

and

\[
S' = \pi_2(\langle B_{1/P}, \langle \pi, \nu'' \rangle \rangle) = \langle \pi, \nu'' \rangle
\]

The last step in the semantics for method invocations is the generation of the set of bindings to return as a result. Intuitively, since our method does not have any output parameters and has side effects, the set of bindings to be returned as a result is the same as the input one, as formalized in Definition 34. Finally, the semantic evaluation of the method invocation O.raise_salary(amount) with respect to the set of bindings B and state S is $\mathcal{U}[\![\text{O.raise\_salary(amount)}]\!]\,B\,S = \langle B, S' \rangle$, where $S'$ is the state previously built.
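The figures of Example 15 can be reproduced with a short Python sketch. The encoding is an assumption made for illustration: only the function ν is modelled (as a dictionary of records), substitutions are dictionaries, attribute names are written with underscores, and the helper `modify` is a simplified stand-in for the modify semantics of Figure 3 that raises an exception instead of returning the error pair.

```python
def modify(attr, var, expr, bindings, values):
    """Simplified modify: evaluate expr in the current state, then assign attr on theta[var]."""
    assigned = {}
    for theta in bindings:
        oid, value = theta[var], expr(theta, values)
        if assigned.setdefault(oid, value) != value:
            raise ValueError(f"ambiguous value for {oid}.{attr}")
    new_values = {oid: dict(rec) for oid, rec in values.items()}
    for oid, value in assigned.items():
        new_values[oid][attr] = value
    return new_values


nu = {
    "o1": {"boss": "o2", "monthly_salary": 200},
    "o2": {"boss": "o2", "monthly_salary": 400},
}
# Local substitutions (Definition 28), then the condition M = Self.boss extends them with M.
local = [{"Self": "o1", "amount": 100}, {"Self": "o2", "amount": 100}]
b1 = [{**theta, "M": nu[theta["Self"]]["boss"]} for theta in local]

# First update: Self gets the salary of M; second update: M gets its salary plus amount.
nu1 = modify("monthly_salary", "Self", lambda th, v: v[th["M"]]["monthly_salary"], b1, nu)
nu2 = modify("monthly_salary", "M",
             lambda th, v: v[th["M"]]["monthly_salary"] + th["amount"], b1, nu1)
```

Both substitutions bind M to o2 and evaluate the second term to the same value, so the second update is not ambiguous even though o2 is reached twice.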

4. PROPERTIES

In this section we discuss how some relevant semantic properties of methods can be reasoned about in our language. The properties we consider are the behavioral refinement of methods in subclasses and conflicts among method implementations in sibling classes. Both these properties can be formally characterized in terms of the semantics of our language. Since, however, both those properties are undecidable in general, some sufficient static conditions ensuring the verification of both of them are then devised.

First of all, we discuss the expressive power of our method definition language, leading to the undecidability of the considered properties, and introduce some notions that will be used in the following development. Each of these properties is then discussed, in turn.

4.1. Preliminaries

We first show that our MDL is computationally complete and then we introduce the notion of updates performed by a method, both as a semantic and as a syntactic notion.

4.1.1. Expressive power

First of all we show that our MDL is computationally complete, that is, Turing equivalent. We simply show that we can encode a counter machine [22] in a method of our language.



\[
\begin{array}{ll}
(l, \sigma) \to (l + 1, \sigma[i \mapsto j + 1]) & \text{if } I_l = \text{“}X_i := X_i + 1\text{” and } \sigma(i) = j \\
(l, \sigma) \to (l + 1, \sigma[i \mapsto j - 1]) & \text{if } I_l = \text{“}X_i := X_i - 1\text{” and } \sigma(i) = j \neq 0 \\
(l, \sigma) \to (l + 1, \sigma[i \mapsto 0]) & \text{if } I_l = \text{“}X_i := X_i - 1\text{” and } \sigma(i) = 0 \\
(l, \sigma) \to (l_1, \sigma) & \text{if } I_l = \text{“if } X_i = 0 \text{ goto } l_1 \text{ else } l_2\text{” and } \sigma(i) = 0 \\
(l, \sigma) \to (l_2, \sigma) & \text{if } I_l = \text{“if } X_i = 0 \text{ goto } l_1 \text{ else } l_2\text{” and } \sigma(i) \neq 0
\end{array}
\]

Figure 4. Counter machine transition rules.

τ(Il) =

    if Il = “Xi := Xi + 1”:
        label = l → modify(c.Xi, self, self.Xi + 1); m(label + 1)

    if Il = “Xi := Xi − 1”:
        label = l, self.Xi ≠ 0 → modify(c.Xi, self, self.Xi − 1); m(label + 1) ||
        label = l, self.Xi = 0 → modify(c.Xi, self, 0); m(label + 1)

    if Il = “if Xi = 0 goto l1 else l2”:
        label = l, self.Xi = 0 → m(l1) ||
        label = l, self.Xi ≠ 0 → m(l2)

Figure 5. Counter machine instruction encoding.

A counter machine program, CM, has as storage a finite number of counters X0, X1, X2, . . . , each holding a natural number. Program instructions allow a counter to be tested for zero, or its content to be incremented or decremented by 1 (where, by definition, 0 − 1 = 0). The following grammar describes the CM syntax.

I ::= Xi := Xi + 1 | Xi := Xi − 1 | if Xi = 0 goto l else l′

A counter machine program is a sequence of instructions, each instruction Il being identified by its label l, the execution of which follows the transition rules in Figure 4, where the store is modeled as a function σ : N → N and σ(i) is the current content of counter Xi.

A counter machine program CM = I1I2 . . . Ik can be encoded in a database schema consisting of a single class c with natural attributes X1, . . . , Xn corresponding to counters and a single method m with a single natural parameter label whose implementation is

m(label) : τ (I1) || τ (I2) || · · · || τ (Ik)

where the translation τ(Il) of instruction Il is obtained depending on the form of Il as shown in Figure 5.

A computation in the counter machine corresponds to the creation of an object instance of c, with the input values given as the values of the corresponding attributes, followed by the invocation of m(1) on that object. The output values are the final values of the object attributes.
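The encoding can be illustrated with a small Python sketch. It is not part of the language definition: the tuple representation of instructions, the names run_counter_machine, make_rule, encode and invoke_m, and the step budget are all assumptions introduced here, and the sketch works on a single object rather than on sets of bindings. It runs a counter machine directly following the transition rules of Figure 4, and then simulates the invocation of m(label) on one object using guarded rules shaped like the translation τ of Figure 5.

```python
# Hypothetical instruction format: ("inc", i), ("dec", i), ("jz", i, l1, l2).
# Labels start at 1; a label outside the program halts the machine.

def run_counter_machine(program, counters, max_steps=10_000):
    """Direct interpreter following the transition rules of Figure 4."""
    sigma, label = dict(counters), 1
    for _ in range(max_steps):
        if not 1 <= label <= len(program):
            return sigma
        instr = program[label - 1]
        if instr[0] == "inc":
            sigma[instr[1]] += 1
            label += 1
        elif instr[0] == "dec":
            sigma[instr[1]] = max(sigma[instr[1]] - 1, 0)  # 0 - 1 = 0
            label += 1
        else:
            _, i, l1, l2 = instr
            label = l1 if sigma[i] == 0 else l2
    raise RuntimeError("step budget exhausted")

def make_rule(l, i, test, new_value, target):
    """Rule 'label = l, test(self.Xi) -> modify(c.Xi, self, new_value); m(target)'."""
    def guard(obj, label):
        return label == l and test(obj[i])
    def action(obj):
        obj[i] = new_value(obj[i])
        return target
    return guard, action

def encode(program):
    """Translate each instruction into one or two rules, as in Figure 5."""
    rules = []
    for l, instr in enumerate(program, start=1):
        if instr[0] == "inc":
            rules.append(make_rule(l, instr[1], lambda v: True, lambda v: v + 1, l + 1))
        elif instr[0] == "dec":
            rules.append(make_rule(l, instr[1], lambda v: v != 0, lambda v: v - 1, l + 1))
            rules.append(make_rule(l, instr[1], lambda v: v == 0, lambda v: 0, l + 1))
        else:
            _, i, l1, l2 = instr
            rules.append(make_rule(l, i, lambda v: v == 0, lambda v: v, l1))
            rules.append(make_rule(l, i, lambda v: v != 0, lambda v: v, l2))
    return rules

def invoke_m(rules, obj, label=1, max_steps=10_000):
    """Simulate m(label) on one object; the recursion is unrolled into a loop."""
    for _ in range(max_steps):
        applicable = [action for guard, action in rules if guard(obj, label)]
        if not applicable:
            return obj  # no rule applies: the computation halts
        label = applicable[0](obj)  # guards for one label are mutually exclusive
    raise RuntimeError("step budget exhausted")

# Adds X1 to X0; instruction 4 is an unconditional jump back to label 1.
program = [("jz", 1, 5, 2), ("dec", 1), ("inc", 0), ("jz", 2, 1, 1)]
direct = run_counter_machine(program, {0: 2, 1: 3, 2: 0})
encoded = invoke_m(encode(program), {0: 2, 1: 3, 2: 0})
```

Both runs start from the same counter values, so comparing direct and encoded gives a quick check that the rule-based encoding agrees with the interpreter on this program.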

The fact that a method in our language can encode a counter machine also implies the undecidability of all non-trivial extensional properties of our methods (Rice’s theorem [22]). The considered properties of our methods are thus undecidable, and we therefore identify some sufficient static conditions to ensure them. Note that we will not address the complexity of testing these static conditions, since this is not an issue: first of all, these conditions are checked once, statically, at method (i.e. schema) definition time, and, furthermore, the cost of checking them only depends on the size of the syntactic expression defining the method and not on the database size. Note also that the static conditions we identify are sufficient conditions and, as such, they could seem to be too restrictive. Some of them can be weakened, through a more sophisticated analysis, at the expense, however, of making them more complex. We have chosen not to do this since, in our opinion, they already capture most of the real needs occurring in practice. Note, finally, that an alternative approach could be to detect violations of these properties at run-time during method execution, rather than preventing them. This alternative approach has the advantage of being less restrictive, although it incurs some run-time overhead. Since, however, we are mostly interested in showing that our language offers some potential for the static analysis of methods, we do not consider this alternative approach further in this paper.

4.1.2. Updates performed by a method execution

In this section we characterize the updates performed by a method execution both from a semantic (i.e. dynamic) and from a syntactic (i.e. static) point of view.

4.1.2.1. Updates performed by a method execution. The first notion we need to model, to characterize the two semantic properties we are interested in, is that of changes made by a method execution. Given a method call m∗ = O.m(t1, . . . , tn), executed on a database state S with respect to a set of bindings B, let $\delta^{BS}_{m^*}(c)$ be the set of objects deleted from class c and $\iota^{BS}_{m^*}(c)$ be the set of objects inserted in class c as a consequence of the execution of m∗. Moreover, given an oid oid and an attribute name a, let $\nu^{BS}_{m^*}(oid).a$ be defined if and only if the execution of m∗ has modified the value of attribute a of the object identified by oid and, if defined, let it contain the new value of the attribute. These functions are formally defined as follows.

Definition 35. (Updates performed by a method execution) Given a method invocation m∗ = O.m(t1, . . . , tn), a database state S and a set of bindings B such that $U[\![m^*]\!]_{BS} = \langle B', S' \rangle$, consider the following functions:

• ∀ c ∈ CI : $\delta^{BS}_{m^*}(c) = S.\pi(c) \setminus S'.\pi(c)$;

• ∀ c ∈ CI : $\iota^{BS}_{m^*}(c) = S'.\pi(c) \setminus S.\pi(c)$;

• ∀ c ∈ CI, ∀ oid ∈ S′.π(c) : $\nu^{BS}_{m^*}(oid).a = v$ if and only if the execution of method m has modified the value of the attribute a of object oid, setting it equal to v.

The triple $\langle \delta^{BS}_{m^*}, \iota^{BS}_{m^*}, \nu^{BS}_{m^*} \rangle$ models the updates performed by the method execution.

In what follows, we will also make use of function $\nu^{BS}_{m^*} : CI \to 2^{OI}$, defined as follows: $\nu^{BS}_{m^*}(c) = \{oid \mid oid \in S.\pi^*(c) \wedge \exists a \text{ s.t. } \nu^{BS}_{m^*}(oid).a \neq \bot\}$. Moreover, let $\delta^{*BS}_{m^*}(c) = S.\pi^*(c) \setminus S'.\pi^*(c)$ (that is, the objects deleted either from c or from one of its subclasses) and $\iota^{*BS}_{m^*}(c) = S'.\pi^*(c) \setminus S.\pi^*(c)$ (i.e. the objects inserted either in c or in one of its subclasses). Note that $\delta^{BS}_{m^*}(c) \subseteq \delta^{*BS}_{m^*}(c)$ and $\iota^{BS}_{m^*}(c) \subseteq \iota^{*BS}_{m^*}(c)$.
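As an illustration only, the triple of Definition 35 can be approximated by comparing the state before and the state after an execution. The sketch below assumes a hypothetical state representation (a dictionary with a "pi" map from class names to sets of oids and an "attrs" map from oids to attribute values); neither this representation nor the function name updates comes from the language definition. Comparing two states also slightly under-approximates ν, since a modification that rewrites the old value is not visible, and the sketch ignores the attributes of newly created objects.

```python
def updates(before, after):
    """Approximate (delta, iota, nu) of Definition 35 from a pair of states.

    Assumed state layout (hypothetical): {"pi": {class: set of oids},
                                          "attrs": {oid: {attribute: value}}}
    """
    classes = set(before["pi"]) | set(after["pi"])
    # delta(c) = S.pi(c) \ S'.pi(c): objects removed from class c
    delta = {c: before["pi"].get(c, set()) - after["pi"].get(c, set())
             for c in classes}
    # iota(c) = S'.pi(c) \ S.pi(c): objects added to class c
    iota = {c: after["pi"].get(c, set()) - before["pi"].get(c, set())
            for c in classes}
    # nu(oid).a is recorded only when the attribute value differs
    nu = {}
    for oid, new_attrs in after["attrs"].items():
        old_attrs = before["attrs"].get(oid, {})
        changed = {a: v for a, v in new_attrs.items()
                   if a in old_attrs and old_attrs[a] != v}
        if changed:
            nu[oid] = changed
    return delta, iota, nu

before = {"pi": {"person": {1, 2}, "employee": {3}},
          "attrs": {1: {"money": 10}, 2: {"money": 5}, 3: {"money": 7}}}
after = {"pi": {"person": {1}, "employee": {3, 4}},
         "attrs": {1: {"money": 8}, 3: {"money": 7}, 4: {"money": 0}}}
delta, iota, nu = updates(before, after)
```

In this example object 2 is deleted from person, object 4 is inserted into employee, and the money attribute of object 1 is modified.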



4.1.2.2. Schema elements updated by a method execution. The notion just introduced depends on the actual execution of the method and, thus, on the specific database state and set of bindings with respect to which the invocation is executed. To perform the syntactic analysis of methods, we need to introduce the notion of the portion of a schema on which an update, a rule or a method performs a deletion, an insertion or a modification. These sets are denoted through functions D, I, M and can be determined by a syntactic analysis of the construct.

The schema elements updated by atomic updates can be easily determined:

• A create(c, t, O) operation always inserts in class c and has no other effect; thus, functions D and M return the empty set, whereas function I returns the singleton {c}.

• A delete(c, O) operation deletes from class c or from one of its subclasses, and has no other effect; thus, functions I and M return the empty set, whereas function D returns the set containing c and its subclasses.

• Similarly, a modify(c.a, O, t) operation updates attribute a of objects belonging to class c or to one of its subclasses and has no other effect; thus, functions I and D return the empty set, whereas function M returns the set containing the pairs 〈c′, a〉 for c′ equal to c or to any of its subclasses.

The schema elements updated by specialization and generalization are similarly determined.

When considering non-atomic update operations, we must define the schema elements updated by a method implementation. They are simply defined as the union of the schema elements updated by each update operation in the action of each rule. Two further issues must, however, be taken into account:

• recursive method implementations and

• method overriding in subclasses.

For recursive method implementations, a fixpoint notion is used to obtain the set of updated schema elements. Let F ∈ {D, I, M} denote the function that, applied to an update operation, returns the schema elements updated (by deletion, insertion, attribute modification, respectively) by that operation. Consider, moreover, the complete poset ($2^{CI} \cup 2^{CI \times AN}$, ⊆), where ⊆ denotes subset inclusion. Each function F is monotonic on that poset, since it is only defined through union. Thus [23], it has a least fixpoint, denoted as fix(F)††. Since, moreover, the database schema is finite, $2^{CI} \cup 2^{CI \times AN}$ is finite as well, thus this fixpoint is reached in a finite number of steps.

For method overriding, the invocation of a method m on an object o of static type c may result at run-time in the execution of the implementation of m in any of the subclasses of c. Object instances of subclasses of c are indeed legal values for type c, and for them the most specific behavior will be exhibited, through late binding. Thus, to determine the schema elements (potentially) updated by a method execution correctly, all the implementations of method m in subclasses of c must be considered and the union of the schema elements updated by each of them must be taken.

Functions D, I, M are thus formally defined as follows. In what follows, given a class c ∈ CI, c↓ denotes the set of its subclasses (c itself included), i.e. $c^{\downarrow} = \{\bar{c} \mid \bar{c} \in CI \wedge \bar{c} \leq_{ISA} c\}$. We need to consider this set since, because of subtyping, each operation op on a class c can actually, at run-time, involve objects that are instances of a subclass $\bar{c}$ of c.

††We recall that an element S ∈ $2^{CI} \cup 2^{CI \times AN}$ is a fixpoint for a function F if F(S) = S.



Definition 36. (Updated schema elements) Given an update u, D(u) and I(u) are sets of classes and M(u) is a set of class-attribute pairs, defined as follows:

• if u = create(c, t, O): D(u) = ∅, I(u) = {c}, M(u) = ∅;

• if u = delete(c, O): D(u) = c↓, I(u) = ∅, M(u) = ∅;

• if u = specialize(c1, c2, O, t): D(u) = {c1}, I(u) = {c2}, M(u) = ∅;

• if u = generalize(c1, c2, O): D(u) = c1↓, I(u) = {c2}, M(u) = ∅;

• if u = modify(c.a, O, t): D(u) = ∅, I(u) = ∅, M(u) = $\{\langle \bar{c}, a \rangle \mid \bar{c} \in c^{\downarrow}\}$;

• if u = O.m(t1, . . . , tn): $F(u) = \bigcup_{\bar{c} \in c^{\downarrow}} \mathrm{fix}(F(\mathrm{impl}(m, \bar{c})))$, where c is the type of O, F ∈ {D, I, M} and fix denotes the fixpoint w.r.t. the poset ($2^{CI} \cup 2^{CI \times AN}$, ⊆)

where each function F ∈ {D, I, M} is generalized to a method implementation as follows:

• if u1; . . . ; un is an update sequence, F(u1; . . . ; un) = F(u1) ∪ · · · ∪ F(un);

• if R = C → A is an update rule, F(R) = F(A);

• if impl(m, c) = m(P1, . . . , Pn) : R1 || · · · || Rm is a method implementation, F(impl(m, c)) = F(R1) ∪ · · · ∪ F(Rm).
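The computation of D, I and M, including the fixpoint needed for recursive methods, can be sketched in Python as follows. The tuple encoding of updates, the isa map from a class to its direct subclasses, and the function names down, atomic and updated_elements are assumptions made for this illustration. The sketch also assumes that every class with its own implementation of m appears as a key (m, c) of impls; inherited implementations are not resolved.

```python
def down(isa, c):
    """c together with all its transitive subclasses (the set written c↓)."""
    seen, stack = set(), [c]
    while stack:
        x = stack.pop()
        if x not in seen:
            seen.add(x)
            stack.extend(isa.get(x, ()))
    return seen

def atomic(isa, result, u):
    """(D, I, M) for one update, given the current approximation for calls."""
    kind = u[0]
    if kind == "create":        # ("create", c)
        return set(), {u[1]}, set()
    if kind == "delete":        # ("delete", c)
        return down(isa, u[1]), set(), set()
    if kind == "specialize":    # ("specialize", c1, c2)
        return {u[1]}, {u[2]}, set()
    if kind == "generalize":    # ("generalize", c1, c2)
        return down(isa, u[1]), {u[2]}, set()
    if kind == "modify":        # ("modify", c, a)
        return set(), set(), {(c, u[2]) for c in down(isa, u[1])}
    if kind == "call":          # ("call", m, c), c being the static type of O
        d, i, m = set(), set(), set()
        for c in down(isa, u[2]):
            rd, ri, rm = result.get((u[1], c), (set(), set(), set()))
            d |= rd
            i |= ri
            m |= rm
        return d, i, m
    raise ValueError(kind)

def updated_elements(isa, impls):
    """Iterate from empty sets until nothing changes (least fixpoint)."""
    result = {key: (set(), set(), set()) for key in impls}
    changed = True
    while changed:
        changed = False
        for key, rules in impls.items():
            d, i, m = set(), set(), set()
            for action in rules:        # each rule is reduced to its action
                for u in action:
                    ud, ui, um = atomic(isa, result, u)
                    d |= ud
                    i |= ui
                    m |= um
            if (d, i, m) != result[key]:
                result[key] = (d, i, m)
                changed = True
    return result

isa = {"person": ["employee", "consultant"], "employee": ["manager"]}
impls = {
    ("del", "person"): [[("modify", "person", "spouse"), ("delete", "person")]],
    ("m", "c"): [[("modify", "c", "X1"), ("call", "m", "c")]],  # recursive
}
result = updated_elements(isa, impls)
```

Because every step only adds elements and the schema is finite, the loop terminates; the recursive method m stabilizes after the second pass.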

Example 16. Referring to the method implementations in Example 9 and assuming that no other method implementations belong to the schema:

• I(impl(spend_money, person)) = ∅,
D(impl(spend_money, person)) = ∅,
M(impl(spend_money, person)) = {〈person, money〉, 〈employee, money〉, 〈manager, money〉, 〈consultant, money〉}

• I(impl(hire, person)) = {employee, manager},
D(impl(hire, person)) = {person, employee},
M(impl(hire, person)) = ∅

• I(impl(del, person)) = ∅,
D(impl(del, person)) = {person, employee, manager, consultant},
M(impl(del, person)) = {〈person, spouse〉, 〈employee, spouse〉, 〈manager, spouse〉, 〈consultant, spouse〉}

• I(impl(is_responsible, manager)) = ∅,
D(impl(is_responsible, manager)) = ∅,
M(impl(is_responsible, manager)) = ∅

• I(impl(casc_del, manager)) = ∅,
D(impl(casc_del, manager)) = {employee, manager},
M(impl(casc_del, manager)) = ∅

Given the notion of updated schema elements, we can introduce the (syntactic) notion of non-interfering updates, which will be used in what follows and is formalized by the following definition. Intuitively, two update sequences are non-interfering if they do not perform different actions on the same schema element. In the following definition, given a set S in $2^{CI \times AN}$, Π1(S) denotes the set of the first components of pairs in S, i.e. Π1(S) = {c | 〈c, a〉 ∈ S}.



Definition 37. (Non-interfering updates) Given A1, A2 updates, update sequences, update rules or method implementations, they are said to be non-interfering (i.e. to perform disjoint updates) if, for i, j ∈ {1, 2}, i ≠ j,

• D(Ai) ∩ I(Aj) = ∅;

• D(Ai) ∩ Π1(M(Aj)) = ∅;

• I(Ai) ∩ Π1(M(Aj)) = ∅;

• M(Ai) ∩ M(Aj) = ∅.
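The four conditions of Definition 37 translate directly into set intersections. The following Python sketch assumes that each argument is already a (D, I, M) triple, with D and I sets of class names and M a set of (class, attribute) pairs; the function names are hypothetical.

```python
def first_components(m):
    """Classes appearing as first component of the class-attribute pairs in M."""
    return {c for c, _ in m}

def non_interfering(a1, a2):
    """Check the conditions of Definition 37 on two (D, I, M) triples."""
    for (d, i, _), (_, i_other, m_other) in ((a1, a2), (a2, a1)):
        touched = first_components(m_other)
        if d & i_other or d & touched or i & touched:
            return False
    return not (a1[2] & a2[2])  # M(A1) and M(A2) must be disjoint

# Two hypothetical triples: one logs and deletes a consultant, the other
# deletes employees and consultants and updates a department counter.
log_and_delete = ({"consultant"}, {"consultant_log"}, set())
delete_employee = ({"consultant", "employee", "manager"}, set(),
                   {("department", "nbr_of_employees")})
# A triple that deletes persons and also modifies an attribute of person.
delete_person = ({"person"}, set(), {("person", "spouse")})

ok = non_interfering(log_and_delete, delete_employee)
clash = non_interfering(delete_person, delete_person)
```

Note that two triples may both delete from the same class without interfering, since the definition places no condition on the intersection of the two D sets.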

Finally, we introduce some notation that will be used in what follows. A method m in a class c will be denoted by $m_c$. Moreover, given a class c, c↕ will denote the set $\{\bar{c} \mid \bar{c} \in CI \wedge (\bar{c} \leq_{ISA} c \vee c \leq_{ISA} \bar{c})\}$. Given a record term t and a class c, t|c denotes the restriction of the term to the fields whose labels are attributes of c. Finally, given an update rule R and two class names c and c′, R[c/c′] denotes the rule obtained from R by replacing each occurrence of c with an occurrence of c′. The same notation will also be applied for conditions and updates.

4.2. Behavioral refinement

In object-oriented languages, polymorphic code can be written. However, this makes reasoning about methods difficult. There might be several method implementations that could be executed upon a method invocation. In most object-oriented programming languages, moreover, method overriding is uncontrolled. The only restrictions imposed on method redefinitions in subclasses concern method signatures, but no restriction is imposed on method implementations to ensure that the semantics is preserved. Intuitively, if a class c′ is a subclass of a class c, every object instance of c′ must behave like some object of class c. This notion of semantic method refinement is known as behavioral subtyping or behavioral refinement [8]. Consider, as an example, an init method initializing the attributes of an object. The implementation of such a method in each class initializes the attributes of that class to the proper values. An init implementation in class c′ that only extends the init implementation of class c with a new piece of code to initialize the additional attributes of c′ is an example of behavioral subtyping. To obtain correct behavioral subtyping, super calls or the inner mechanism could be exploited.

A method m in a class c′ (denoted by $m_{c'}$) is a behavioral refinement of method m in a class c (denoted by $m_c$) if the updates performed by $m_{c'}$ include the updates performed by $m_c$‡‡. This means that a method executed on an object in a subclass performs a set of changes which includes the set of changes that the execution of the same method on the same object would have performed if the object were a proper instance of the superclass. In this inclusion, however, we take into account the fact that the creation of an object instance of a class c implies the creation of an object member of all the superclasses of c. The following definition formalizes this notion.

Definition 38. (Behavioral refinement) Given a class c′ subclass of a class c, method $m_{c'}$ is a behavioral refinement of method $m_c$ if, for each invocation m∗ = O.m(t1, . . . , tn), for each database state S, and for each set of bindings B, such that OB ⊆ S.π(c′), letting $\langle B, S_g \rangle = U[\![generalize(c', c, O)]\!]_{BS}$, the following conditions hold:

• ∀c ∈ CI: $\delta^{*BS_g}_{m^*}(c) \subseteq \delta^{*BS}_{m^*}(c)$ and $\iota^{*BS_g}_{m^*}(c) \subseteq \iota^{*BS}_{m^*}(c)$;

• ∀oid ∈ OI, ∀a ∈ AN: if $\nu^{BS_g}_{m^*}(oid).a$ is defined, then $\nu^{BS}_{m^*}(oid).a$ is defined and $\nu^{BS_g}_{m^*}(oid).a = \nu^{BS}_{m^*}(oid).a$.

‡‡Note that we have decided to take into account only the updates performed on the database state and not the set of computed bindings.

Given two method definitions, behavioral refinement is undecidable. Indeed, a non-trivial extensional property of counter machines can be reduced to it.

Some sufficient static conditions ensuring that a method $m_{c'}$ is a behavioral refinement of a method $m_c$ can, however, be devised. These conditions can be checked at method definition time, so that the overriding of a method in a subclass can be disallowed if the overriding method is not a behavioral refinement of the overridden one. These conditions (referred to as refinement conditions) require that for each rule R in $m_c$ a corresponding rule R′ in $m_{c'}$ must exist†. All the rules in $m_c$ except the last one must correspond to rules in $m_{c'}$ that are obtained by replacing occurrences of c with occurrences of c′. By contrast, the last rule of $m_c$, say R, can correspond to a rule R′ refining R. The implementation for $m_{c'}$ may also contain additional rules.

An update rule R′ refines an update rule R if the condition of R′ refines the condition of R, and the action of R′ refines the action of R. Condition refinement basically means that the condition of R′ is weaker (i.e. less selective) than the condition of R. In fact, condition refinement must be tested modulo variable renaming, since methods could exploit different parameter names and different local variables. Since, however, considering variable renaming does not bring in any relevant issues and complicates the notation, we will not consider this aspect in what follows. Moreover, the condition of the refined rule may also share some more variables with its action. We will take this issue into account in defining update rule refinement.

The following definition formalizes condition refinement. Basically, we require that for each formula in the conjunction that forms the condition of R′ a corresponding formula appears in the condition of R that is either the same or, in the case of comparison, class and membership formulas, implies it. In the following definition, in establishing whether a comparison formula is weaker than another one, we make use of the notation F ⊨ F′ to denote that an atomic comparison formula F entails another one F′, i.e. all the solutions of F are also solutions of F′. For instance, X < Y ⊨ X ≤ Y and X ≥ 4 ⊨ X ≥ 0.

Definition 39. (Condition refinement) A formula C′ = F′1, . . . , F′m is a refinement of a formula C = F1, . . . , Fn (denoted as C′ ≤c C) if and only if for all F′i in C′, 1 ≤ i ≤ m, a formula Fj in C exists, 1 ≤ j ≤ n, such that one of the following conditions holds:

• F′i = Fj;

• F′i and Fj are atomic comparison formulas and Fj ⊨ F′i;

• F′i = c′(X) and Fj = c(X) are atomic class formulas and c ≤ISA c′;

• F′i = t′ in c′ and Fj = t in c are atomic membership formulas and c ≤ISA c′.

†This requirement could be weakened by allowing a rule to be refined by two or more rules, the disjunction of whose conditions must subsume the condition of the original rule. However, we do not consider this case.



Example 17. Referring to the database schema of Example 1, the following are examples of condition refinement.

• Self.birthday ≤ ‘01/01/1930’ ≤c Self.birthday ≤ ‘01/01/1900’;

• Self.money > 0 ≤c Self.money > 0, Self.debit = 0;

• person(X), X.birthday ≤ Self.birthday ≤c employee(X), X.birthday ≤ Self.birthday.
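The check of Definition 39 can be sketched in Python under several simplifying assumptions: formulas are encoded as tuples (a hypothetical representation), dates are reduced to years, the entailment test between comparison formulas is sound but deliberately incomplete, and membership formulas are required to have identical terms. None of the names below belongs to the language definition.

```python
WEAKER_OP = {("<", "<="), (">", ">="), ("=", "<="), ("=", ">=")}

def entails(f, g):
    """Sound, incomplete test that comparison f entails comparison g.

    Comparisons are ("cmp", left, op, right) with right a term or a number.
    """
    _, left1, op1, right1 = f
    _, left2, op2, right2 = g
    if left1 != left2:
        return False
    if right1 == right2:
        return op1 == op2 or (op1, op2) in WEAKER_OP
    both_numbers = all(isinstance(r, (int, float)) for r in (right1, right2))
    if both_numbers and op1 == op2:
        if op1 in ("<", "<="):
            return right1 <= right2   # X <= 1900 entails X <= 1930
        if op1 in (">", ">="):
            return right1 >= right2   # X >= 4 entails X >= 0
    return False

def is_subclass(isa_up, c, d):
    """c ≤ISA d, with isa_up mapping a class to its direct superclasses."""
    return c == d or any(is_subclass(isa_up, p, d) for p in isa_up.get(c, ()))

def refines_condition(isa_up, c_new, c_old):
    """C' ≤c C: each conjunct of C' is matched by some conjunct of C."""
    def matched(f_new, f_old):
        if f_new == f_old:
            return True
        if f_new[0] == f_old[0] == "cmp":
            return entails(f_old, f_new)
        if f_new[0] == f_old[0] == "class":    # ("class", c, X)
            return f_new[2] == f_old[2] and is_subclass(isa_up, f_old[1], f_new[1])
        if f_new[0] == f_old[0] == "in":       # ("in", t, c), same term assumed
            return f_new[1] == f_old[1] and is_subclass(isa_up, f_old[2], f_new[2])
        return False
    return all(any(matched(fn, fo) for fo in c_old) for fn in c_new)

isa_up = {"employee": ["person"], "manager": ["employee"]}
born_1930 = [("cmp", "Self.birthday", "<=", 1930)]
born_1900 = [("cmp", "Self.birthday", "<=", 1900)]
rich = [("cmp", "Self.money", ">", 0)]
rich_no_debit = rich + [("cmp", "Self.debit", "=", 0)]
older_person = [("class", "person", "X"),
                ("cmp", "X.birthday", "<=", "Self.birthday")]
older_employee = [("class", "employee", "X"),
                  ("cmp", "X.birthday", "<=", "Self.birthday")]
```

The three pairs of conditions mirror the items of Example 17, so refines_condition is expected to accept them in the stated direction and to reject the first pair when the two conditions are swapped.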

Action refinement basically means that the action of the refined rule must do at least what the action of the original rule does. This means that, for each update in the sequence constituting the action of the original rule, there must be a corresponding update in the refined action. However, since rule actions are update sequences, the corresponding update could be discarded by some complementary updates executed after it in the sequence. Consider as an example the case of a rule creating an object in its action, overridden by a rule whose action first creates a corresponding object and then deletes it. We do not allow, therefore, addition of updates in the sequence that can undo (i.e. compensate) the action performed by the updates in the sequence. Note that the notion of potential compensation we consider is purely syntactical and only relies on complementary database operations. We start by defining how single updates can be refined.

Definition 40. (Basic update refinement) An atomic update u′ is a refinement of an atomic update u (denoted as u′ ≤b u) if and only if one of the following conditions holds:

• u = create(c, t, O), u′ = create(c′, t′, O) and c′ ≤ISA c, t′|c = t;

• u = delete(c, O), u′ = delete(c′, O) and c′ ∈ c↕;

• u = generalize(c1, c2, O), u′ = generalize(c′1, c′2, O) and c′1 ∈ c1↕, c2 ≤ISA c′2;

• u = specialize(c1, c2, O, t), u′ = specialize(c′1, c′2, O, t′) and c′1 ∈ c1↕, c′2 ≤ISA c2, t′|c2 = t;

• u = modify(c.a, O, t), u′ = modify(c′.a, O, t) and c′ ∈ c↕;

• u = O.m(t1, . . . , tn), u′ = u.

We now introduce the notion of a potentially compensating update, i.e. an update that could undo the effect of another update operation. To denote in a compact way a set of updates we express them as an update type, i.e. as a pair 〈update name, target schema element〉, where the update name belongs to the set UN = {create, delete, specialize, generalize, modify} ∪ MN whereas the schema element is a class, a pair of classes or a class–attribute pair, depending on the update. Let UPD denote the set of all types of the possible updates of our language. Finally, let us, by abuse of notation, extend the notion of non-interfering updates to update types (the details of this extension are straightforward).

Definition 41. (Potentially compensating update) Given an update u, C(u) is a set of update types, defined as follows:

• If u = create(c, t, O), C(u) = $\{\langle delete, \bar{c} \rangle \mid \bar{c} \in c^{\updownarrow}\}$ ∪ {〈generalize, 〈c1, c2〉〉 | c1 ≤ISA c, c ≤ISA c2}.

• If u = delete(c, O), C(u) = ∅.

• If u = specialize(c1, c2, O, t), C(u) = $\{\langle delete, \bar{c} \rangle \mid \bar{c} \in c_2^{\updownarrow}\}$ ∪ {〈generalize, 〈c1, c2〉〉 | c1 ≤ISA c2, c1 ≤ISA c2}.

• If u = generalize(c1, c2, O), C(u) = {〈specialize, 〈c1, c2〉〉 | c1 ≤ISA c1 ≤ISA c2, c2 ∈ c1↕}.

• If u = modify(c.a, O, t), C(u) = $\{\langle modify, \langle \bar{c}, a \rangle \rangle \mid \bar{c} \in c^{\updownarrow}\}$.

• If u = O.m(t1, . . . , tn): C(u) = UPD \ $\{\bar{u} \mid \bar{u}$ and $\bigcup_{\bar{c} \in c^{\downarrow}} \mathrm{impl}(m, \bar{c})$ are non-interfering according to Definition 37}.

Two updates u and u′ are potentially compensating if the update type of u belongs to C(u′) or the update type of u′ belongs to C(u).

An update sequence refines another update sequence if the former contains all the updates appearing in the latter (possibly refined), in the same order. Additional updates can be added, provided that they do not (potentially) compensate the updates in the refined sequence. The following definition formalizes this notion.

Definition 42. (Action refinement) An action list A′ is a refinement of an action list A (denoted as A′ ≤a A) if the following conditions hold:

• A′ = u′1; . . . ; u′m, A = u1; . . . ; un and m ≥ n.

• For each ui, 1 ≤ i ≤ n, in A, an update u′j, 1 ≤ j ≤ m, in A′ exists, such that u′j ≤b ui, i.e. u′j is a refinement of ui according to Definition 40. Let function ξ : {1, . . . , n} → {1, . . . , m}, such that ξ(i) = j, model this correspondence.

• If ui precedes uk in A, then u′ξ(i) precedes u′ξ(k) in A′.

• For each u′j, 1 ≤ j ≤ m, such that j ∉ codom(ξ), ∄ ui, 1 ≤ i ≤ n, such that ui and u′j are potentially compensating.

We note that, since both the basic action refinement and the computation of potentially compensating updates only rely on the syntactical properties of the action list, action refinement is decidable.

Example 18. Suppose we add to the schema in Example 1 the classes employee_log and person_log such that employee_log ≤ISA person_log, then

• A′ = create(employee_log, (who : Self.name, age : 2000 − Self.birthday.year, salary : Self.monthly_salary), L)

is a refinement of

A = create(person_log, (who : Self.name, age : 2000 − Self.birthday.year), L); and

• A′ = modify(department.nbr_of_employees, Y, Y.nbr_of_employees − 1); delete(employee, X)

is a refinement of

A = delete(person, X).

We are now able to define update rule refinement. Given an update rule R, let BVar(R) denote the set of variables appearing both in the condition of R and in the action of R (i.e. the variables employed for passing bindings). Given a formula F and a set of variables V, let F|V denote the formula obtained from F by eliminating all atomic formulas containing only variables which do not belong to V.

Definition 43. (Update rule refinement) An update rule R′ = C′ → A′ is a refinement of an update rule R = C → A if the following conditions are satisfied:

• BVar(R) ⊆ BVar(R′);

• C′|BVar(R) ≤c C, i.e. C′ restricted to the variables appearing in R is a refinement of C according to Definition 39; and

• A′ ≤a A, i.e. A′ is a refinement of A according to Definition 42.

Finally, we can state when a method implementation refines a superclass method implementation. All the rules but the last one are identical up to replacing the occurrences of the class name. This guarantees that for each object for which a rule of the superclass is applicable, the same rule is applied, and thus the same behavior is obtained in the subclass. The last rule can be refined, i.e. it can be applicable to additional objects and perform additional actions. Finally, new rules can be added. Note that new rules can be added only after the refined ones, to guarantee that this additional behavior is exhibited only when no rules of the original method would have been applicable.

Definition 44. (Syntactic refinement) Given a class c, a subclass c′ of c, the definition of method $m_{c'}$, consisting of rules R′1, . . . , R′n, and the definition of method $m_c$, consisting of rules R1, . . . , Rk, $m_{c'}$ is a syntactic refinement of method $m_c$ if k ≤ n and

• for each i, 1 ≤ i ≤ k − 1, R′i = Ri[c/c′] and all the methods invoked in Ri have identical (up to a proper substitution of classes) implementations in c and c′; and

• R′k is a refinement of Rk, according to Definition 43.

Example 19. Referring again to the schema in Example 1 the following are examples of behavioral refinement:

(i) • impl(del, person):

del(): Self.spouse ≠ null →
    modify(person.spouse, Self.spouse, null);
    delete(person, Self)

• impl(del, employee):

del(): Self.spouse ≠ null, consultant(C), C.internal_contact = Self, department(D), Self.department = D →
    modify(person.spouse, Self.spouse, null);
    modify(consultant.internal_contact, C, null);
    modify(department.nbr_of_employees, D, D.nbr_of_employees - 1);
    delete(employee, Self)

• impl(del, manager):

del(): Self.spouse ≠ null, consultant(C), C.internal_contact = Self, department(D), Self.department = D, employee(S), S in Self.dependents →
    modify(person.spouse, Self.spouse, null);
    modify(consultant.internal_contact, C, null);
    modify(department.nbr_of_employees, D, D.nbr_of_employees - 1);
    modify(employee.boss, S, Self.boss);
    delete(manager, Self)

(ii) • impl(spend_money, person):

spend_money(expense): Self.money ≥ expense, Self.debit = 0 →
    modify(person.money, Self, Self.money - expense)

• impl(spend_money, employee):

spend_money(expense): Self.money ≥ expense →
    modify(employee.money, Self, Self.money - expense)

• impl(spend_money, manager):

spend_money(expense): Self.money ≥ expense →
    modify(manager.money, Self, Self.money - expense) ||
Self.bonus ≠ 0 →
    modify(manager.money, Self, Self.money - expense)

The following result holds.

Proposition 1. Given two classes c and c′, such that c′ ≤ISA c, and methods $m_c$ and $m_{c'}$,

• we can decide whether $m_{c'}$ is a syntactic refinement of method $m_c$; and

• if $m_{c'}$ is a syntactic refinement of method $m_c$, then $m_{c'}$ is a behavioral refinement of $m_c$.

Proof sketch. The proof of the first item is based on the fact that both condition and action refinement are decidable.

The second item is more complex to prove. We prove it by contradiction, thus we suppose that a database state S and a set of bindings B exist violating a condition of Definition 38. Let us first consider the first condition of Definition 38; this means, for instance, that a class c and an object identifier oid exist such that $oid \in \delta^{*BS_g}_{m^*}(c)$ and $oid \notin \delta^{*BS}_{m^*}(c)$. This means that oid has been deleted from the members of c as an effect of the execution of $U[\![m^*]\!]_{BS_g}$, whereas oid has not been deleted from the members of c as an effect of the execution of $U[\![m^*]\!]_{BS}$. Because of the semantics of our methods (see Section 3), $oid \in \delta^{*BS_g}_{m^*}(c)$ if $oid \in S_g.\pi^*(c)$ and the implementation of $m_c$ contains a rule R whose action contains an update u of one of the following forms:

• a delete(c∗, O) operation for a class c∗ ∈ c↕;

• a generalize(c1, c2, O) operation, for c1 ∈ c↕, c ≤ISA c2; or

• the invocation of a method m′ which performs the deletion of oid from the members of c.

Since $m_{c'}$ is a syntactic refinement of $m_c$, from Definition 44, either R is not the last rule of $m_c$, and thus $m_{c'}$ contains a rule identical to R, thus containing the same action, or $m_{c'}$ contains a rule R′ that refines R. This means that the action of R′ contains an update u′ that is a refinement of u according to Definition 40, that is, of one of the following forms:

• u′ = delete($\bar{c}$, O) operation for a class $\bar{c}$ ∈ c∗↕ = c↕;

• u′ = generalize(c′1, c′2, O) operation, for c′1 ∈ c1↕ = c↕, c ≤ISA c2 ≤ISA c′2; or

• u = u′.



Thus, if action u′ were executed with oid belonging to the bindings for variable O, its effect would be the deletion of oid from the members of c.

Since this violates our starting hypothesis, there are two more cases to consider:

(i) oid does not belong to the bindings for O, i.e. it does not satisfy the R′ condition; and

(ii) u′ is discarded by an update following it in R′.

In case (i), this violates the hypothesis that R′ is a refinement of R, since oid belongs to the bindings for O computed by the condition of R, which, by Definition 39, is more selective than that of R′. Moreover, since all rules preceding R′ in $m_{c'}$ have conditions identical to those of the corresponding rules in $m_c$, oid cannot have satisfied the condition of one of the rules preceding R′ (otherwise, it would have satisfied the condition of the corresponding rule in $m_c$).

In case (ii), let us consider the actions that could have discarded the deletion of oid from the members of c:

• If oid has been deleted as an effect of a delete operation or as an effect of the invocation of a method m′ containing (also in an indirect way) a delete operation, it cannot be discarded.

• If oid has been deleted as an effect of a generalize operation or as the effect of the invocation of a method m′ containing (also in an indirect way) a generalize operation, it can only be discarded by (a sequence of) specialize operation(s) or by the invocation of a method containing them; in this case the fourth item of Definition 42 would be violated, contradicting the hypothesis that $m_{c'}$ is a syntactic refinement of $m_c$, since the two updates are potentially compensating according to Definition 41.

Similar arguments can be employed to conclude the proof considering the other items of Definition 38. □

4.3. Conflicting method definitions

Some object-oriented data models [9,24,25] allow an object to be an instance of multiple classes. This possibility has several advantages from a modeling viewpoint. However, it poses a number of problems with respect to method dispatching. Each class in the inheritance hierarchy can specify a different implementation for a certain method. For each method invocation, an implementation must be chosen among the most specific ones for the receiver object. In a model supporting multiple class direct membership, however, the choice of a method implementation that ‘most closely matches’ an invocation is not obvious. In [9] we have addressed the problem of dispatching in object systems supporting multiple class direct membership. The approach is based on selecting the method implementation from one of the most specific classes of the object. This approach is, however, not always adequate. When different implementations for a method are defined in the most specific classes of an object, there are cases in which the most intuitive approach is executing all of them. Consider the previous init method, initializing the attributes of an object. Suppose that different classes provide different implementations for the init method. Remember that the implementation of the init method in each class initializes the attributes of that class to the proper values. If, when the init method is invoked on an object, all the init implementations of all the classes of which the object is a direct member are executed, the desired result is obtained, i.e. all the attributes of the object are properly initialized. This approach, however, raises the problem that some attributes can be assigned different values by different implementations of the method. To solve these issues we introduce the notion of non-conflicting method implementations. Different method implementations are non-conflicting if each of them modifies a portion of data distinct from the one modified by the others and if data modified by both methods are modified consistently.

Intuitively, two methods are non-conflicting if, considering their executions on states that only differ for the classes of which the receiver objects are instances, their effects do not give rise to conflicts, i.e. they can be composed. In particular, the situations to be excluded are as follows.

(i) The two methods perform different update operations on the same object:

• one deletes it from a class, the other one updates one of its attributes,

• one creates it, or inserts it into a class, and the other one deletes it or

• one creates it, or inserts it into a class, and the other one updates one of its attributes.

(ii) The two methods both perform an update on the same attribute of the same object, assigning it different values.

Those situations are excluded since they correspond to cases in which the two executions cannot be properly combined.

The semantic notion of non-conflicting method implementations is formalized as follows.

Definition 45. (Non-conflicting method implementations) Given two classes c1 and c2 both containing an implementation of method m, methods $m_{c_1}$ and $m_{c_2}$ are said to be non-conflicting if, for each invocation $m^* = O.m(t_1, \ldots, t_n)$, for each set of bindings B and for each pair of database states S1 and S2, such that $OB \subseteq S_1.\pi(c_1)$ and $\langle B, S_2 \rangle = \mathcal{U}[\![\text{generalize}(c_1, c, O);\ \text{specialize}(c, c_2, O, t)]\!]_{B S_1}$, where c is the most specific common superclass of c1 and c2 and t is a term of the appropriate type, the following conditions are satisfied:

(i) $\forall c \in \mathcal{CI}$, for $i, j \in \{1, 2\}$, $i \neq j$,

• $\delta^{B S_i}_{m^*}(c) \cap \nu^{B S_j}_{m^*}(c) = \emptyset$,
• $\delta^{B S_i}_{m^*}(c) \cap \iota^{B S_j}_{m^*}(c) = \emptyset$ and
• $\iota^{B S_i}_{m^*}(c) \cap \nu^{B S_j}_{m^*}(c) = \emptyset$;

(ii) $\forall c \in \mathcal{CI}$, $\forall a$ attribute of c, $\forall oid$ instance of c: if $\nu^{B S_1}_{m^*}(oid).a \neq \bot$ and $\nu^{B S_2}_{m^*}(oid).a \neq \bot$ then $\nu^{B S_1}_{m^*}(oid).a = \nu^{B S_2}_{m^*}(oid).a$.
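The two conditions of Definition 45 can be illustrated on recorded effect sets. The following is a minimal sketch, assuming the effects of each execution have already been collected into per-class sets of deleted (δ), inserted (ι) and modified (ν) object identifiers plus a table of assigned values; the `Effects` class and its field names are hypothetical and are not part of the language defined in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Effects:
    """Effects of one execution of m* (hypothetical encoding, not from the paper)."""
    deleted: dict = field(default_factory=dict)    # class -> oids removed from it (delta)
    inserted: dict = field(default_factory=dict)   # class -> oids added to it (iota)
    modified: dict = field(default_factory=dict)   # class -> oids with an updated attribute (nu)
    values: dict = field(default_factory=dict)     # (oid, attribute) -> assigned value


def non_conflicting(e1: Effects, e2: Effects) -> bool:
    """Check conditions (i) and (ii) of Definition 45 on two recorded effect sets."""
    classes = set().union(e1.deleted, e1.inserted, e1.modified,
                          e2.deleted, e2.inserted, e2.modified)
    for a, b in ((e1, e2), (e2, e1)):               # i != j, in both orders
        for c in classes:
            deleted = a.deleted.get(c, set())
            inserted = a.inserted.get(c, set())
            if deleted & b.modified.get(c, set()):   # delete vs. attribute update
                return False
            if deleted & b.inserted.get(c, set()):   # delete vs. create/insert
                return False
            if inserted & b.modified.get(c, set()):  # create/insert vs. attribute update
                return False
    # Condition (ii): an attribute assigned by both executions must get equal values.
    return all(e2.values.get(key, value) == value for key, value in e1.values.items())
```

Note that this checks one pair of executions only; Definition 45 quantifies over all invocations, bindings and states, which is why the semantic notion cannot be decided by such a test.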

The notion of non-conflicting method definitions is undecidable. Indeed, a non-trivial extensional property of counter machines can be reduced to it.

Some static conditions ensuring the absence of conflicts between two method implementations can, however, be devised. These conditions can be checked at method definition time, so that the user can be warned when two (potentially) conflicting methods are both executed as a consequence of a method invocation on an object.

A first static condition ensuring that two method implementations are non-conflicting is that the actions of the rules of the two method implementations do not contain updates on the same attributes or on the same classes. This condition (we will refer to it as disjointness of actions) can be checked through a simple syntactic analysis of the actions of the rules of which the method implementation consists. The notion of methods with disjoint actions, formalized by the following definition, simply requires that the schema elements on which conflicting updates are performed by the two methods are disjoint.

Definition 46. (Methods with disjoint actions) Given two classes c1 and c2 both containing an implementation of method m, methods $m_{c_1}$ and $m_{c_2}$ are said to have disjoint actions if impl(m, c1) and impl(m, c2) are non-interfering according to Definition 37.

Example 20. Referring to the database schema in Example 1 extended with a class consultant_log, this is an example of methods with disjoint actions.

• impl(del, consultant):

  del(): →
      create(consultant_log, (who : Self.name, age : 2000 − Self.birthday.year,
                              salary : Self.hourly_salary), L);
      delete(consultant, Self)

• impl(del, employee):

  del(): consultant(C), C.internal_contact = Self, department(D),
         Self.department = D →
      delete(consultant, C);
      modify(department.nbr_of_employees, D, D.nbr_of_employees - 1);
      delete(employee, Self)

Indeed, according to Definition 36:

• I(impl(del, consultant)) = ∅, D(impl(del, consultant)) = {consultant}, M(impl(del, consultant)) = ∅
• I(impl(del, employee)) = ∅, D(impl(del, employee)) = {employee, manager, consultant}, M(impl(del, employee)) = {〈department, nbr_of_employees〉}

thus, the disjointness conditions of Definition 37 are satisfied.
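A purely syntactic test of this kind can be sketched over the sets I, D and M. Definition 37 is not restated in this section, so the conditions coded below are an assumed reading, inferred from the excluded situations listed before Definition 45 and from the proof of Proposition 2: no class is inserted into by one method and deleted from by the other, no class that one method deletes from or inserts into has an attribute modified by the other, and no attribute is modified by both. The function name and the tuple encoding are hypothetical.

```python
def disjoint_actions(sets1, sets2):
    """Assumed non-interference test on (I, D, M) triples.

    I and D are sets of class names; M is a set of (class, attribute) pairs.
    The three conditions are an assumption, not a restatement of Definition 37.
    """
    for (i1, d1, _), (_, d2, m2) in ((sets1, sets2), (sets2, sets1)):
        touched = {cls for cls, _attr in m2}   # classes whose attributes the other method modifies
        if i1 & d2:                            # insert vs. delete on the same class
            return False
        if d1 & touched or i1 & touched:       # delete/insert vs. attribute update
            return False
    return not (sets1[2] & sets2[2])           # same attribute modified by both


# The sets computed in Example 20.
del_consultant = (set(), {"consultant"}, set())
del_employee = (set(), {"employee", "manager", "consultant"},
                {("department", "nbr_of_employees")})
print(disjoint_actions(del_consultant, del_employee))
```

Under this reading, two methods that both delete from the same class (as in Example 20) are not considered interfering, which matches the conclusion stated for that example.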

Proposition 2. Given two classes c1 and c2, and methods $m_{c_1}$ and $m_{c_2}$

• we can decide whether $m_{c_1}$ and $m_{c_2}$ are methods with disjoint actions; and
• if $m_{c_1}$ and $m_{c_2}$ are methods with disjoint actions, they are non-conflicting.

Proof sketch. The first item immediately follows from the definition of methods with disjoint actions which, in fact, gives an algorithm for testing the property.

The second item is more complex to prove. We prove it by contradiction. Thus we suppose that a pair of database states S1, S2, and a set of bindings B exist violating a condition of Definition 45. Let us consider the first condition of Definition 45. Its violation means that a class c and an object identifier oid exist such that $oid \in \delta^{B S_1}_{m^*}(c) \cap \nu^{B S_2}_{m^*}(c)$. This means that oid has been deleted from c as an effect of the execution of $\mathcal{U}[\![m^*]\!]_{B S_1}$, whereas an attribute of oid has been updated as an effect of the execution of $\mathcal{U}[\![m^*]\!]_{B S_2}$. Because of the semantics of our methods (see Section 3), $oid \in \delta^{B S_1}_{m^*}(c)$ if $oid \in S_1.\pi(c)$ and the implementation of $m_{c_1}$ contains one of the following operations (where, as usual in logic languages, the _ symbol denotes a dummy variable):

• a delete(c, _) operation or a delete($\bar{c}$, _) operation for a superclass $\bar{c}$ of c;
• a specialize(c, c′, _, _) operation;
• a generalize(c, c′, _) operation or a generalize($\bar{c}$, c′, _) operation for a superclass $\bar{c}$ of c; or
• the invocation of a method m′ which performs the deletion of oid.

In all these cases, from Definition 36, $c \in D(m_{c_1})$.

Similarly, $oid \in \nu^{B S_2}_{m^*}(c)$ if $oid \in S_2.\pi^*(c)$ and an attribute a exists such that $\nu^{B S_2}_{m^*}(oid).a \neq \bot$. Because of the semantics of our methods (see Section 3), this can only happen if the implementation of $m_{c_2}$ contains one of the following operations:

• a modify(c.a, _) operation, or a modify($\bar{c}$.a, _) operation for a superclass $\bar{c}$ of c; or
• the invocation of a method m′ which performs the update of attribute a of oid.

In all these cases, from Definition 36, $\langle c, a \rangle \in M(m_{c_2})$.

Thus, by Definition 46, they cannot be methods with disjoint actions (they violate the second condition of Definition 37), which violates the hypothesis.

Similar arguments can be employed to conclude the proof considering the other items in Definition 45. □

Though the condition of methods with disjoint actions ensures that two method implementations are non-conflicting, it is very restrictive. Indeed, methods modifying an object in the same way (e.g. methods with identical implementations) are also non-conflicting. Thus, the notion of methods with disjoint additional actions is defined. This notion requires that the two method implementations are identical (up to a proper substitution of classes) up to a certain point and that the additional parts are non-conflicting.

Definition 47. (Methods with disjoint additional actions) Given two classes c1 and c2 both containing an implementation of method m, such that

$$\text{impl}(m, c_1) = m(P_1, \ldots, P_l) : C_1 \rightarrow u_1^1; \ldots; u_1^{n_1} \parallel \cdots \parallel C_h \rightarrow u_h^1; \ldots; u_h^{n_h}$$

and

$$\text{impl}(m, c_2) = m(P_1, \ldots, P_l) : C'_1 \rightarrow w_1^1; \ldots; w_1^{n'_1} \parallel \cdots \parallel C'_k \rightarrow w_k^1; \ldots; w_k^{n'_k}$$

methods $m_{c_1}$ and $m_{c_2}$ are said to have disjoint additional actions if i, j exist, $i \in [0, \min\{h, k\}]$, $j \in [0, \min\{n_i, n'_i\}]$, such that

• $\forall s < i$: $R'_s = R_s[c_1/c_2]$;
• $C'_i|_{BVar(R_i) \cap BVar(R'_i)} = C_i|_{BVar(R_i) \cap BVar(R'_i)}[c_1/c_2]$, and $\forall t < j$: $w_i^t = u_i^t[c_1/c_2]$;
• all the methods invoked in $\bigcup_{s<i} C_s$ and in $\bigcup_{s<i,\, t<j} u_s^t$ have identical implementations (up to a proper substitution of classes) in c1 and c2;
• the updates in set $\{u_s^t \mid s > i, t > j\}$ are non-interfering, according to Definition 37, with the updates in set $\{w_s^t \mid 1 \leq s \leq k, 1 \leq t \leq n'_k\}$, and the updates in set $\{w_s^t \mid s > i, t > j\}$ are non-interfering, according to Definition 37, with the updates in set $\{u_s^t \mid 1 \leq s \leq h, 1 \leq t \leq n_h\}$.

Example 21. Referring to the database schema in Example 1 extended with a consultant_log class, this is an example of methods with disjoint additional actions.

• impl(del, consultant):

  del(): Self.spouse ≠ null →
      modify(person.spouse, Self.spouse, null);
      create(consultant_log, (who : Self.name, age : 2000 − Self.birthday.year,
                              salary : Self.hourly_salary), L);
      delete(consultant, Self)

• impl(del, employee):

  del(): Self.spouse ≠ null, department(D), Self.department = D →
      modify(person.spouse, Self.spouse, null);
      modify(department.nbr_of_employees, D, D.nbr_of_employees - 1);
      delete(employee, Self)

A variation of Proposition 2 holds for methods with disjoint additional actions. The proof can be easily obtained by extending that for Proposition 2.

The notion of methods with disjoint additional actions is, however, still too restrictive. Indeed, methods modifying the same attributes of different objects are also non-conflicting. Some sufficient conditions ensuring that two rule conditions are exclusive, i.e. they will never be satisfied by the same objects, can be developed. Then, if two rules in a method implementation have non-disjoint actions but do have exclusive conditions, the methods are non-conflicting. Since, however, the notion of exclusiveness of conditions is much more difficult to characterize than the notion of action disjointness, we restrict ourselves to the case in which the methods to analyze do not contain method invocations in their implementations.

We start from the notion of exclusiveness of pairs of atomic comparison and membership formulas, denoted as F1 ⊗ F2, modeling the fact that F1 and F2 are mutually exclusive. For instance, X = 5 ⊗ X = 7, X > 7 ⊗ X < 5 and X in employee ⊗ ¬ X in person. Two formulas are then exclusive if they contain exclusive atomic formulas on the same variable.
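For comparison atoms on the same variable, exclusiveness amounts to an empty intersection of the value ranges the two atoms admit. The sketch below is one possible way to test this, assuming atoms of the form `X op k` with a numeric constant `k` and a dense ordered domain; the encoding of an atom as an `(op, k)` pair and the function names are hypothetical. Over integers the test is only sufficient (for example, it does not report `X > 5` and `X < 6` as exclusive).

```python
import math


def interval(op, k):
    """Return (lo, lo_closed, hi, hi_closed) for the atom 'X op k'."""
    return {
        "=":  (k, True, k, True),
        "<":  (-math.inf, False, k, False),
        "<=": (-math.inf, False, k, True),
        ">":  (k, False, math.inf, False),
        ">=": (k, True, math.inf, False),
    }[op]


def exclusive(atom1, atom2):
    """True if no value of X satisfies both atoms, each given as (op, constant)."""
    lo1, lc1, hi1, hc1 = interval(*atom1)
    lo2, lc2, hi2, hc2 = interval(*atom2)
    # Lower end of the intersection: the larger bound; closed only if both agree.
    if lo1 > lo2:
        lo, lc = lo1, lc1
    elif lo2 > lo1:
        lo, lc = lo2, lc2
    else:
        lo, lc = lo1, lc1 and lc2
    # Upper end of the intersection: the smaller bound.
    if hi1 < hi2:
        hi, hc = hi1, hc1
    elif hi2 < hi1:
        hi, hc = hi2, hc2
    else:
        hi, hc = hi1, hc1 and hc2
    if lo > hi:
        return True
    if lo == hi:
        return not (lc and hc)   # a single point survives only if both ends are closed
    return False


print(exclusive(("=", 5), ("=", 7)), exclusive((">", 7), ("<", 5)))
```

Membership atoms such as `X in employee` would need the class hierarchy instead of intervals and are not covered by this sketch.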

Of course, exclusiveness makes sense only when the two formulas are tested on the same database state. Thus, we require the two method implementations to be identical (up to a proper substitution of classes) up to the rule containing the exclusive condition. We, moreover, require the additional parts to be non-conflicting.

Definition 48. (Methods with disjoint additional non-exclusive actions) Given two classes c1 and c2 both containing an implementation of method m, such that

$$\text{impl}(m, c_1) = m(P_1, \ldots, P_l) : C_1 \rightarrow A_1 \parallel \cdots \parallel C_h \rightarrow A_h$$

and

$$\text{impl}(m, c_2) = m(P_1, \ldots, P_l) : C'_1 \rightarrow A'_1 \parallel \cdots \parallel C'_k \rightarrow A'_k$$

methods $m_{c_1}$ and $m_{c_2}$ are said to have disjoint additional non-exclusive actions if i, $i \in [0, \min\{h, k\}]$ exists, such that

• $\forall s < i$ rule $R'_s = R_s[c_1/c_2]$;
• all the methods invoked in $\bigcup_{s<i} C_s$ and in $\bigcup_{s<i} A_s$ have identical implementations in c1 and c2;
• $C_i$ and $C'_i$ are exclusive;
• the updates in set $\{A_s \mid i < s \leq h\}$ are non-interfering, according to Definition 37, with the updates in set $\{A'_s \mid 1 \leq s \leq k\}$, and the updates in set $\{A'_s \mid i < s \leq k\}$ are non-interfering, according to Definition 37, with the updates in set $\{A_s \mid 1 \leq s \leq h\}$.

Example 22. Referring to the database schema in Example 1, this is an example of methods with disjoint additional non-exclusive actions.

• impl(reclassify, consultant):

  reclassify(): Self.birthday < ‘01/01/1930’ →
      generalize(consultant, person, Self)

• impl(reclassify, employee):

  reclassify(): Self.birthday > ‘01/01/1970’, Self.monthly_salary > 2000,
                Self.hiring_date < ‘01/01/1999’ →
      specialize(employee, manager, Self, (bonus : 0, dependents : null))

A variation of Proposition 2 holds for methods with disjoint additional non-exclusive actions. The proof can be obtained by extending that for Proposition 2, taking into account that if two rules have exclusive conditions their actions will never be executed together.

Given two non-conflicting method implementations m1 and m2, moreover, we can execute the two method implementations ‘in parallel’, obtaining two different final states $S'_1$ and $S'_2$, and we can then recombine (i.e. merge) these states so that the effects of both method executions are taken into account. Since the two method implementations are non-conflicting, their resulting states can be unambiguously merged. This means that we take the union of the objects inserted/deleted from each class by the two methods, and the union of the attribute updates performed by the two methods. Note that our notion of ‘non-conflicting’ is weaker than requiring that the sequential execution of m1 and m2, or vice versa, will result in the same database state obtained by this ‘parallel’ execution with state merge.
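The merge step described above can be sketched as a union of recorded effects. This is a minimal illustration, assuming each execution's effects are given as a dictionary with per-class sets of inserted and deleted object identifiers and a table of attribute updates; the dictionary layout, the key names and the sample object identifiers are hypothetical. The sketch raises an error if the two inputs assign different values to the same attribute, which cannot happen when the methods are non-conflicting.

```python
def merge_effects(e1, e2):
    """Union the effects of two non-conflicting executions.

    Each argument has the keys "inserted" and "deleted" (class -> set of oids)
    and "updates" ((oid, attribute) -> value). The layout is illustrative only.
    """
    merged = {"inserted": {}, "deleted": {}, "updates": {}}
    for kind in ("inserted", "deleted"):
        for effects in (e1, e2):
            for cls, oids in effects[kind].items():
                merged[kind].setdefault(cls, set()).update(oids)
    for effects in (e1, e2):
        for key, value in effects["updates"].items():
            # setdefault returns the value already stored for key, if any.
            if merged["updates"].setdefault(key, value) != value:
                raise ValueError(f"conflicting updates on {key}")
    return merged


# Effects loosely modeled on Example 21 (object identifiers are made up).
consultant_del = {"inserted": {"consultant_log": {"l1"}},
                  "deleted": {"consultant": {"o1"}},
                  "updates": {("o2", "spouse"): None}}
employee_del = {"inserted": {},
                "deleted": {"employee": {"o1"}},
                "updates": {("o2", "spouse"): None,
                            ("d1", "nbr_of_employees"): 9}}
merged = merge_effects(consultant_del, employee_del)
```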

5. RELATED WORK

In this section a comparison with other approaches is presented. First, we compare our approach with those which deal with method definition languages for object-oriented databases performing set-oriented updates. Then, existing work on identifying method properties and their static analysis is briefly discussed.

5.1. Method definition languages and set-oriented updates

The issue of defining an ad hoc data manipulation language (DML) for object-oriented databases has been largely neglected. One indication of this is that the standard for object-oriented databases, ODMG [15], provides both a data definition language and a query language but no DML. Data manipulation is left to methods written in an object-oriented programming language such as C++ or Java. One of the main differences between these two languages and an ad hoc DML is that they manipulate one object at a time, whereas DMLs manipulate collections of objects. This is the reason why updates in DMLs are said to be set-oriented.

One of the main problems related to updating object-oriented databases by manipulating collections of objects is non-determinism. Let us introduce this problem through an example.

Example 23. Consider the class employee presented in Example 1, and consider another class project with two attributes: budget of type integer which stores the project budget, and members of type set-of(employee) which stores the employees involved in the project. Consider the following method of class employee which increases the salary of each employee involved in the project and earning less than 5000 by the project budget divided by 1000.

  div_budget(): project(P), Self in P.members, Self.monthly_salary < 5000 →
      modify(employee.monthly_salary, Self,
             Self.monthly_salary + (P.budget/1000))

Let o1 be the oid of an object belonging to class employee, such that o1 is a member of two projects identified by the oids p1 and p2, and let the salary of o1 be less than 5000. Moreover, let the budget of p1 be 100 000 and the budget of p2 be 200 000. Consider the invocation of div_budget on o1. Depending on the order in which projects p1 and p2 are considered the salary of o1 could be increased by 100 or 200. Thus, the semantics is non-deterministic.

To the best of our knowledge, [10–12] are the most well-known approaches which have dealt with the non-determinism of set-oriented updates in object-oriented databases, pointing out problems and proposing solutions. All these proposals follow a similar approach: they aim to establish at compile time whether or not an update can cause non-determinism and they then process only the updates that do not cause any problems. Let us consider each approach separately.

The first is by Laasch and Scholl [11], who in order to make the semantics deterministic, only accept as valid those update sequences which commute on the set of objects with respect to which the update is performed. This semantics eliminates the sequences which cause problems, by considering them invalid. Moreover, some problems arising from aggregate functions and data-sharing are then considered.

In [10] the update behavior is analyzed thanks to schema annotations called colorings which allow order-independent updates to be described. Decidability of order-independence for a restricted class of methods is discussed. In the last part of the paper, an alternative ‘parallel’ way of applying an update method to a set of objects is considered.

In [12] the notion of a potentially non-deterministic update is introduced. The authors give a proof technique for establishing when an update is deterministic. If the update cannot be proved to be deterministic using their reasoning system, then it could be non-deterministic. Since the proof system is not complete, updates which are not deduced by the system are not guaranteed to be non-deterministic, thus they refer to such situations as potential non-determinism. In this case the user is notified of the fact that an update could cause non-determinism and he/she can decide what to do.

According to all the previously mentioned approaches an update such as the one presented in Example 23 is non-deterministic. This is in accordance with intuition: if an employee is participating in two projects with different budgets, there is no way to assign him/her a percentage of the budget in a deterministic way.

Our approach differs from the previous ones. Indeed, the goal of our work was not to detect non-determinism, but rather to give the semantics of the updates of our language in order to establish good properties. However, in defining the semantics we had to cope with the problem of non-determinism. In our language, every update is evaluated and if the semantic evaluation is non-deterministic, an error is said to have occurred and the update is ‘aborted’, i.e. the database state preceding the update execution is recovered. Thus, in our approach, the semantic evaluation of the method presented in Example 23 fails if the two projects have different budgets, whereas if the budget is the same it does not fail.
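The evaluate-then-abort behavior described for Example 23 can be mimicked in a few lines. The following is an illustrative sketch only, not the formal semantics of Section 3: it assumes salaries are kept in a dictionary, each project is a dictionary with `budget` and `members`, and `P.budget/1000` is read as integer division. The exception class and all names are hypothetical.

```python
class NonDeterministicUpdate(Exception):
    """Raised when one attribute of one object would receive different values."""


def div_budget(salaries, projects, oid):
    """Evaluate div_budget on employee `oid` and return the new salary table.

    All bindings for P are collected first; the update is applied only if they
    agree on the value to assign, otherwise the evaluation is aborted.
    """
    if salaries[oid] >= 5000:
        return dict(salaries)                      # rule condition not satisfied
    candidates = {salaries[oid] + p["budget"] // 1000
                  for p in projects if oid in p["members"]}
    if len(candidates) > 1:
        raise NonDeterministicUpdate(f"{oid}: candidate salaries {sorted(candidates)}")
    updated = dict(salaries)                       # the input state is never mutated
    for value in candidates:                       # zero or one value
        updated[oid] = value
    return updated


salaries = {"o1": 3000}
same_budget = [{"budget": 100000, "members": {"o1"}},
               {"budget": 100000, "members": {"o1"}}]
different_budget = [{"budget": 100000, "members": {"o1"}},
                    {"budget": 200000, "members": {"o1"}}]
```

Because the function builds a new table and raises before modifying anything, the state preceding the update is trivially preserved when the evaluation fails.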

A different area in which set-oriented updates have been investigated is that of deductive databases. Some proposals can be found in the literature concerning deductive MDLs for object databases [13,14,25–27] and dealing with ways to incorporate updates in deductive languages [28–30]. In the deductive approach a notion of consistent updates is defined. Consistency is important because it prevents a set of updates containing both an insertion and a deletion of the same object from being executed. One disadvantage of this approach is that not all syntactically correct update sequences are semantically correct. Moreover, the sharp distinction between the phase in which bindings are collected and the update phase is not convenient for object-oriented database languages, such as ours, in which updates can be method invocations.

5.2. Method properties and static analysis

The notion of behavioral refinement derives from the behavioral subtyping notions proposed in the context of object-oriented programming languages [8,32,33]. In this context behavioral subtyping is achieved by weakening the precondition of an inherited method by a disjunction with an additional Boolean expression, while the postcondition of an inherited method is strengthened by a conjunction with an additional Boolean expression. The same ideas are applied by approaches aiming at enriching Java-based object-oriented database systems with constraints [34,35].

The notion of non-conflicting methods, by contrast, could have some similarities with the several notions of serializability proposed in database concurrency control. These notions have been deeply investigated in the context of relational databases, whereas they become very cumbersome in the context of object-oriented databases, where transactions can invoke methods whose implementation can be specified in an object-oriented programming language. The notion of non-conflicting methods is, however, weaker than the notion of view serializability, since it does not impose any constraint on the order in which various rule conditions are tested, provided that their actions do not interfere.

Our proposal for the development of static conditions has been influenced by some existing approaches, though in different respects. In particular, Benzaken and Doucet [36] dealt with ways in which to detect statically whether the execution of a method can potentially violate an integrity constraint, and introduced the idea of identifying the portion of a schema manipulated by an update operation. As we have already pointed out, Liefke and Davidson [12] also address the problem of analyzing pairs of updates and query-update paths to detect possible sources of non-determinism. Their analysis, however, heavily depends on the fact that certain attributes are known to be partitioning attributes, i.e. attributes whose inverse is single valued; such knowledge is not assumed in our model (nor is any attribute required to have an inverse). Work on active rule termination [37] also addressed the problem of statically detecting the interactions among different rules. Some other work, finally, could offer some interesting ideas for extensions and refinements of our proposal, for example Benzaken and Schaefer [38] in which abstract interpretation techniques are employed to obtain a more accurate analysis of method behavior.

6. CONCLUSIONS AND FUTURE WORK

In this paper we have proposed a set-oriented MDL for object databases. The use of an ad hoc set-oriented language offers several advantages compared to the use of general purpose imperative programming languages. Our language, though computationally complete, still has potential for optimization and automatic verification.

In particular, in the paper we have presented the syntax and defined the semantics of our set-oriented rule-based language. We have compared our choices concerning set-oriented update semantics with other approaches. Finally, we have shown how the relevant semantic properties of methods can be formally stated in terms of our language. In particular, we have considered conflicts among method definitions in sibling classes and behavior refinement in subclasses. Note that these properties are undecidable in general. Some sufficient static conditions, ensuring the verification of these properties, have, however, been devised. Comparisons with approaches developing static analysis of methods have been presented.

The work reported in this paper could be extended in several directions. We are interested in investigating method optimization and verification in the context of our language. The topic of update optimization has been quite neglected in the literature; we think, however, that some optimization through static rewriting of methods can be devised for our language, in the spirit of [39]. The approach proposed in [40] dealing with the static analysis of deductive languages with updates could also be extended to our framework. For verification, there are interesting properties that could be investigated in the context of our language. For example, compensation, i.e. whether a method undoes the effects of another one; this notion is exploited in the context of advanced transaction models to provide more flexible mechanisms for concurrency control [41]. Another is the verification of consistency requirements, i.e. whether a method can potentially violate a number of integrity constraints [38,42].

APPENDIX A. TYPING ISSUES

In this section we deal with the typing rules of our language. We first present a formal definition of the set of types of the reference model and the subtype relationship [43]. Then, the rules for typing terms and formulas, making explicit the type requirements we impose on such constructs, are presented. Typing is performed assuming a given set of classes, $\mathcal{CI}$, defined in a database schema (see Definition 3) and that for each variable a type is known, thus we explicitly investigate these issues. Finally, we consider typing rules for update primitives and also discuss type correctness of method invocations.

A.1. Types and subtype relationship

The set of types in our language, $\mathcal{T}$, is formally defined as follows.

Definition A1. (Types) The set of legal types $\mathcal{T}$ is inductively defined as follows:

• each $T \in \mathcal{BVT}$ is a type ($\mathcal{BVT} \subseteq \mathcal{T}$);
• each class name $c \in \mathcal{CI}$ is a type ($\mathcal{CI} \subseteq \mathcal{T}$);
• let T be a type, then
  – set-of(T) is a type, called a set type;
  – list-of(T) is a type, called a list type;
• let $T_1, \ldots, T_n$, $n \geq 1$, be types and let $a_1, \ldots, a_n \in \mathcal{AN}$ be distinct labels, then record-of($a_1 : T_1, \ldots, a_n : T_n$) is a type, called a record type.

In the previous definition the set $\mathcal{BVT}$ is the set of basic valued types (such as integer and real) which are predefined. Moreover, $\mathcal{AN}$ denotes a set of attribute names.

Starting from the ISA relationship established by the user when defining the schema, the subtype relationship $\leq_T$ can be defined. Note that the subtype relationship for basic valued types is the identity.

Definition A2. (Subtypes) Given $T_1, T_2 \in \mathcal{T}$, $T_2$ is a subtype of $T_1$ (denoted as $T_2 \leq_T T_1$) if and only if one of the following conditions holds:

• $T_1 = T_2$;
• $T_1 \leq_{ISA} T_2$;
• $T_2 = \text{set-of}(T'_2)$, $T_1 = \text{set-of}(T'_1)$ and $T'_2 \leq_T T'_1$;
• $T_2 = \text{list-of}(T'_2)$, $T_1 = \text{list-of}(T'_1)$ and $T'_2 \leq_T T'_1$;
• $T_1 = \text{record-of}(a_1 : T'_1, \ldots, a_n : T'_n)$, $T_2 = \text{record-of}(a_1 : T''_1, \ldots, a_n : T''_n)$, and for each $i = 1 \ldots n$, $T''_i \leq_T T'_i$.
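The inductive structure of Definition A2 maps directly onto a recursive check. The sketch below is a hedged illustration: types are encoded as strings (basic types and class names) or tagged tuples, the ISA clause is read as "T2 is a possibly indirect subclass of T1", and the small hierarchy (employee and consultant under person, manager under employee) is assumed from the way Example 1 is used in this section. None of these names or encodings come from the paper.

```python
# Direct ISA links, assumed from the uses of Example 1 in this section.
ISA = {"employee": {"person"}, "consultant": {"person"}, "manager": {"employee"}}


def isa(sub, sup):
    """True if `sub` equals `sup` or reaches it by following ISA links."""
    if sub == sup:
        return True
    return any(isa(parent, sup) for parent in ISA.get(sub, ()))


def is_subtype(t2, t1):
    """True if t2 is a subtype of t1 under the assumed reading of Definition A2.

    Constructed types are tuples: ("set-of", T), ("list-of", T) or
    ("record-of", ((label, T), ...)).
    """
    if t1 == t2:
        return True
    if isinstance(t1, str) and isinstance(t2, str):
        return isa(t2, t1)            # basic types have no ISA links: identity only
    if isinstance(t1, tuple) and isinstance(t2, tuple) and t1[0] == t2[0]:
        if t1[0] in ("set-of", "list-of"):
            return is_subtype(t2[1], t1[1])
        if t1[0] == "record-of":
            fields1, fields2 = t1[1], t2[1]
            return len(fields1) == len(fields2) and all(
                a1 == a2 and is_subtype(x2, x1)
                for (a1, x1), (a2, x2) in zip(fields1, fields2))
    return False


print(is_subtype(("set-of", "manager"), ("set-of", "person")))
```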

The set of types $\mathcal{T}$ with the ordering $\leq_T$ is a poset. Indeed $\leq_T$ is a partial order: it can be easily checked that it is reflexive, antisymmetric and transitive. Given the poset $(\mathcal{T}, \leq_T)$ and a set $TS \subseteq \mathcal{T}$ we may consider an upper bound of TS (a type $T \in \mathcal{T}$ such that $T_{TS} \leq_T T$ $\forall T_{TS} \in TS$). Intuitively an upper bound of a set of classes is a class c that is a superclass of all the classes in the set. Consider, for instance, the hierarchy of Example 1, depicted in Figure 1. The upper bound of the set {employee, person} is person, as is the upper bound of the set {employee, consultant}.

We consider the notion of least upper bound (lub) in our poset, denoted by the symbol $\sqcup$. An upper bound T for a set TS is the least upper bound if and only if for all upper bounds T′, $T \leq_T T'$ holds. Therefore, the lub of a set of types is the most specific among the supertypes of the types in the set.

Note, however, that since we do not assume a single root of the inheritance hierarchy, the lub does not always exist. Furthermore, if the lub exists, it may not be unique. We refer the interested reader to [18] for a comprehensive treatment of the properties of the lub in our model.
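For class names, the situations just described (no upper bound, one most specific upper bound, several incomparable ones) can be observed by computing the minimal common superclasses. This is a small sketch under stated assumptions: the hierarchy is passed as a dictionary of direct superclasses, the Example 1 fragment is assumed as in the text above, and the classes `a`, `b`, `x`, `y` and `department` as an unrelated root are made up for illustration.

```python
def ancestors(cls, isa):
    """All superclasses of `cls`, including `cls` itself."""
    result = {cls}
    for parent in isa.get(cls, ()):
        result |= ancestors(parent, isa)
    return result


def minimal_upper_bounds(classes, isa):
    """Most specific common superclasses of `classes`.

    An empty result means no upper bound exists; more than one element means
    the most specific upper bound is not unique.
    """
    common = set.intersection(*(ancestors(c, isa) for c in classes))
    # Drop u whenever some other common superclass v lies below it.
    return {u for u in common
            if not any(v != u and u in ancestors(v, isa) for v in common)}


example1 = {"employee": {"person"}, "consultant": {"person"}, "manager": {"employee"}}
diamond = {"x": {"a", "b"}, "y": {"a", "b"}}   # hypothetical multiple inheritance

print(minimal_upper_bounds({"employee", "consultant"}, example1))
print(minimal_upper_bounds({"x", "y"}, diamond))
```

With the assumed hierarchy, the first call yields the single class person, while the second yields both `a` and `b`, which is the non-unique case.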

A.2. Typing of terms and formulas

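Figure A1 presents the term typing rules. Typing rules for terms built with predefined operators are not presented in detail here since they are the standard ones. For instance, if the symbol ∪ denotes the set union, the rule of Figure A1

$$\frac{t_1 : T_1 \qquad t_2 : T_2}{t_1 \ op\ t_2 : T} \qquad T = T_1 \sqcup T_2$$

would be

$$\frac{t_1 : \text{set-of}(T'_1) \qquad t_2 : \text{set-of}(T'_2)}{t_1 \cup t_2 : T} \qquad T = \text{set-of}(T'_1) \sqcup \text{set-of}(T'_2)$$

where

$$\text{set-of}(T'_1) \sqcup \text{set-of}(T'_2) = \text{set-of}(T'_1 \sqcup T'_2)$$

$$\frac{X \in Var_T}{X : T} \qquad T \in \mathcal{T}$$

$$\frac{t \in dom(T)}{t : T} \qquad T \in \mathcal{BVT}$$

$$\frac{}{null : T} \qquad T \in \mathcal{T}$$

$$\frac{t_i : T_i \quad (1 \leq i \leq n)}{\{t_1, \ldots, t_n\} : \text{set-of}(T)} \qquad T = \bigsqcup_{i=1}^{n} T_i$$

$$\frac{t_i : T_i \quad (1 \leq i \leq n)}{[t_1, \ldots, t_n] : \text{list-of}(T)} \qquad T = \bigsqcup_{i=1}^{n} T_i$$

$$\frac{t_i : T_i \quad (1 \leq i \leq n)}{(a_1 : t_1, \ldots, a_n : t_n) : \text{record-of}(a_1 : T_1, \ldots, a_n : T_n)} \qquad a_1, \ldots, a_n \in \mathcal{AN}$$

$$\frac{t : \text{record-of}(a_1 : T_1, \ldots, a_n : T_n)}{t.a_i : T_i} \qquad i = 1 \ldots n$$

$$\frac{t : c}{t.a_i : T_i} \qquad c \in \mathcal{CI},\ \sigma(c) = \text{record-of}(a_1 : T_1, \ldots, a_n : T_n),\ i = 1 \ldots n$$

$$\frac{t_1 : T_1 \qquad t_2 : T_2}{t_1 \ op\ t_2 : T} \qquad T = T_1 \sqcup T_2$$

$$\frac{t : \text{set-of}(T)}{count(t) : \text{integer}}$$

$$\frac{t : \text{set-of}(T)}{op(t) : T} \qquad op \in \{min, max\},\ T \in \{\text{integer}, \text{real}, \text{string}\}$$

$$\frac{t : \text{set-of}(T)}{op(t) : T} \qquad op \in \{avg, sum\},\ T \in \{\text{integer}, \text{real}\}$$

Figure A1. Term typing rules.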

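Figure A2 specifies some conditions that must be satisfied for the correct application of predicates to terms. Therefore, these conditions specify legal formulas exactly. As far as type formulas are concerned we do not specify any type constraint for the application of type predicates because type formulas are the means with which to specify the type of a variable. We only require that type formulas be applied to variables.

$$\frac{t_1, t_2 : T}{t_1 \ op\ t_2} \qquad T \in \{\text{integer}, \text{real}, \text{character}, \text{string}\},\ op \in \{<, >, \geq, \leq\}$$

$$\frac{t_1 : T_1, \quad t_2 : T_2}{t_1 = t_2} \qquad T_1 \sqcup T_2 \text{ is defined}$$

$$\frac{t_1 : T_1, \quad t_2 : T_2}{t_1 == t_2} \qquad T_1, T_2 \in \mathcal{CI},\ T_1 \sqcup T_2 \text{ is defined}$$

$$\frac{t_1 : T_1, \quad t_2 : T_2}{t_1 ==_d t_2} \qquad T_1, T_2 \in \mathcal{CI},\ T_1 \sqcup T_2 \text{ is defined}$$

$$\frac{t_1 : T_1, \quad t_2 : \text{set-of}(T_2)}{t_1 \text{ in } t_2} \qquad T_1 \sqcup T_2 \text{ is defined}$$

$$\frac{t_1 : T_1, \quad t_2 : \text{list-of}(T_2)}{t_1 \text{ in } t_2} \qquad T_1 \sqcup T_2 \text{ is defined}$$

$$\frac{t : T}{t \text{ in } c} \qquad T, c \in \mathcal{CI},\ T \sqcup c \text{ is defined}$$

$$\frac{X \in Var_T}{T(X)} \qquad T \in \mathcal{T}$$

Figure A2. Formula typing rules.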

The typing rules in Figures A1 and A2 are based on the existence of a basis, that is, of a set of statements of the form $X \in Var_T$, where X is a variable and T is a type. Such a basis is built with respect to a given database schema. Other typing rules use this basis. In the following we discuss how these bases are obtained. The formulas we consider appear as conditions in method implementation rules. In method implementations, we distinguish two kinds of variable: parameters and local variables. We refer to Section 2.4 for typing parameters, whereas, in what follows, we investigate local variable typing in formulas.

There are two different ways to assign a type to a local variable in a formula:

(i) explicitly, through a type formula; or
(ii) implicitly, if the variable is an output parameter of a (side-effect-free) method, invoked in the formula.

When writing a formula, we can explicitly state the type of each variable used in it. The scope of a type assignment for a variable is the formula itself. Typing variables can be accomplished in our language by using type formulas. Indeed, we use type formulas as a typing mechanism. A type formula has the form T(X) where $T \in \mathcal{T}$ is a type name and X is a variable. The meaning of this formula is to assign the type corresponding to T to variable X and to state that the objects/values, to which X may be instantiated, must belong to the set of legal objects/values of type T. Type formulas are built from type names representing unary predicate symbols. Thus, to state that a variable X is of type T (i.e. to add $X \in Var_T$ to the basis) we add to our formula the type formula T(X). Note that this also has the effect of stating that $X \in dom(T)$, thus providing a domain for the evaluation of the formula. To type the pseudovariable Self properly, if a rule appears in the context of a class c we add $Self \in Var_c$ to the basis.

$$\frac{t : T' \qquad O : \bar{c}}{\text{create}(c, t, O)} \qquad c \in \mathcal{CI},\ \sigma(c) = T,\ T' \leq_T T,\ c \leq_T \bar{c},\ \bar{c} \in \mathcal{CI} \cup \{\bot\}$$

$$\frac{\text{create}(c, t, O) \qquad O : \bot}{O : c}$$

$$\frac{O : c}{\text{delete}(c, O)} \qquad c \in \mathcal{CI}$$

$$\frac{O : c_1 \qquad t : T'}{\text{specialize}(c_1, c_2, O, t)} \qquad c_1, c_2 \in \mathcal{CI},\ c_2 \leq_T c_1 \text{ and } \nexists\, c \in \mathcal{CI} \text{ s.t. } c_2 \leq_T c \leq_T c_1,\ \sigma(c_2) = T,\ T' \leq_T T_{|c_2}$$

$$\frac{\text{specialize}(c_1, c_2, O, t)}{O : c_2}$$

$$\frac{O : c_1}{\text{generalize}(c_1, c_2, O)} \qquad c_1, c_2 \in \mathcal{CI},\ c_1 \leq_T c_2$$

$$\frac{\text{generalize}(c_1, c_2, O)}{O : c_2}$$

$$\frac{O : c \qquad t : T'}{\text{modify}(c.a, O, t)} \qquad c \in \mathcal{CI},\ dom(a, c) = T,\ T' \leq_T T$$

Figure A3. Update operation typing rules.

Another way to type variables in formulas is offered by the possibility of invoking side-effect-free methods. A side-effect-free method is characterized by some input parameters (for which a type must be declared in the formula in which the method invocation appears) and by some output parameters (for which the type can be determined by the method invocation itself).

Summarizing, the basis from which we start to evaluate types in conditions consists of

(i) $X \in Var_T$ for each type formula T(X); and
(ii) $P_i \in Var_{T_i}$ for each method output parameter $P_i$, $T_i$ being the type of the ith parameter in the method signature.

In what follows, we say that a formula is type correct if it can be deduced according to the rules specified in Figures A1 and A2.



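$$\frac{O : c \qquad t_i : T'_i}{O.m(t_1, \ldots, t_n)} \qquad Cond$$

Cond: $c \in \mathcal{CI}$; $sign(m, c) = T_1 \times \cdots \times T_j \rightarrow T_{j+1} \times \cdots \times T_n$; $T'_i \leq_T T_i$, $T'_i \in \mathcal{T}$ $(i = 1 \ldots j)$; $T_i \leq_T T'_i$, $T'_i \in \mathcal{T} \cup \{\bot\}$ $(i = j+1 \ldots n)$

$$\frac{O : c \qquad O.m(t_1, \ldots, t_n) \qquad t_i : \bot \ (i \in \{j+1, \ldots, n\})}{t_i : T_i} \qquad Cond'$$

Cond′: $c \in \mathcal{CI}$; $sign(m, c) = T_1 \times \cdots \times T_j \rightarrow T_{j+1} \times \cdots \times T_n$

Figure A4. Typing rules for method invocations.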

A.3. Typing of atomic updates

Figure A3 illustrates the typing rules for atomic updates. Note that the only atomic update which can be executed on a non-typed variable is the create operation, which, if executed on a non-typed variable, sets its type to the class on which the creation is executed.

In the rule for the specialize operation, the notation $T_{|c_2}$ is employed, where $T = \sigma(c_2)$. This notation denotes the restriction of the record type T to class c2. That is, given a record type T, $T_{|c_2}$ denotes T containing only those fields whose labels are proper attributes of class c2. Formally, let c2 and c1 be two classes such that $c_2 \leq_T c_1$ and $\sigma(c_1) = (a_1 : T_1, \ldots, a_n : T_n)$. Suppose that c2 adds some attributes to the attributes of c1. Let $a_{n+1}, \ldots, a_{n+m}$ be the names of these attributes and $T_{n+1}, \ldots, T_{n+m}$ the corresponding types, $m \geq 1$. Thus, $T = \sigma(c_2) = (a_1 : T_1, \ldots, a_n : T_n, a_{n+1} : T_{n+1}, \ldots, a_{n+m} : T_{n+m})$ and $T_{|c_2} = (a_{n+1} : T_{n+1}, \ldots, a_{n+m} : T_{n+m})$. In the rule related to the operation specialize(c1, c2, O, t), the term t is deduced of type $T' = (a'_1 : T'_1, \ldots, a'_m : T'_m)$ such that $T' \leq_T T_{|c_2}$. That is, $a'_j \in \{a_{n+1}, \ldots, a_{n+m}\}$, $j = 1, \ldots, m$, and, letting $a'_j = a_{n+j}$, then $T'_j \leq_T T_{n+j}$, $j = 1, \ldots, m$. Note, moreover, that the generalize operation (upward migration) changes the type of the variable on which it is executed. Updates also include method invocations. The typing rules for method invocations are given in Figure A4, and they apply both to side-effect and side-effect-free method invocations. Similarly to formulas, we say that an update is type correct if it can be deduced according to the rules specified in Figure A3.

A.4. Typing of method invocations

Figure A4 illustrates the rules for method invocations. A generic method invocation has the form $O.m(t_1, \ldots, t_n)$ where $t_i$, for $i = 1, \ldots, j$, are input parameters, whereas $t_i$, for $i = j+1, \ldots, n$, are output parameters. While input parameters can be arbitrary terms, output parameters must be variables. In the rules, we have assumed that if no type has been specified for an output parameter, the corresponding variable has an undefined type, which we have denoted as ⊥. Thus, $T \leq_T \bot$ for any $T \in \mathcal{T}$.

The typing rules point out that, while types of the method input parameters must be known before the invocation, and they must be compatible with those specified in the method signature, types of the method output parameters, by contrast, may or may not be already known. In the former case they must be compatible with those specified in the method signature, whereas in the latter the types of output parameters are deduced from the types specified for them in the method signature.
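The treatment of input and output parameters can be sketched as follows. The function `type_invocation`, the subtype table, and the class names are hypothetical. In particular, the sketch assumes that compatibility for an already-typed output variable means that the type declared in the signature is a subtype of the variable's type; this is consistent with every type being a subtype of the undefined type, but it is a reading of the prose above rather than a rule quoted from Figure A4.

```python
BOTTOM = None  # stands for the undefined type of a non-typed output variable

# Hypothetical subtype pairs (subtype, supertype), assumed for illustration.
SUBTYPE = {("employee", "person")}


def is_subtype(t1, t2):
    """Simplified subtype test; every type is a subtype of the undefined type."""
    return t2 is BOTTOM or t1 == t2 or (t1, t2) in SUBTYPE


def type_invocation(sig_in, sig_out, arg_types, out_var_types):
    """Return the types of the output variables after the invocation,
    or None if the invocation is not type correct."""
    if len(arg_types) != len(sig_in) or len(out_var_types) != len(sig_out):
        return None
    # Input parameter types must be known and compatible with the signature.
    if any(t is BOTTOM for t in arg_types):
        return None
    if not all(is_subtype(a, s) for a, s in zip(arg_types, sig_in)):
        return None
    result = []
    for declared, var in zip(sig_out, out_var_types):
        # Assumed direction of compatibility: declared type <= variable type.
        if not is_subtype(declared, var):
            return None
        # A non-typed output variable takes the type from the signature.
        result.append(declared if var is BOTTOM else var)
    return result


deduced = type_invocation(["person"], ["integer"], ["employee"], [BOTTOM])
```

In this example the non-typed output variable receives the type `integer` from the signature, while an invocation whose input argument has type `person` where `employee` is required would be rejected.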

ACKNOWLEDGEMENTS

Thanks are due to the anonymous referees, whose comments helped us to improve both the technical content and presentation of the paper. We also thank Giorgio Delzanno for helpful discussions on the subject of this paper.

REFERENCES

1. Cannan SJ, Otten GAM. SQL—The Standard Handbook. McGraw-Hill: New York, 1992.2. Bancilhon F. Object-oriented database systems. Proceedings of the Seventh ACM SIGACT-SIGMOD-SIGART Symposium

on Principles of Database Systems, 1998. ACM: New York, 1988; 152–162.3. Agrawal R, Gehani N. ODE (Object Database and Environment): The language and the data model. Proceedings of the

ACM SIGMOD International Conference on Management of Data, 1989. ACM: New York, 1989; 36–45.4. Breitl R, Maier D, Otis A, Penney J, Schuchardt B, Stein J, Williams EH, Williams M. The GemStone data management

system. Object-Oriented Concepts, Databasases, and Applications, 1989. Kim W, Lochovsky FH (eds.). Addison-Wesley:Reading, MA, 1989; 283–308.

5. Deux O et al. The story of O2. IEEE Transactions on Knowledge and Data Engineering 1990; 2(1):91–108.6. Object Design Inc. ObjectStore Reference Manual. Object Design: Burlington, MA, 1990.7. Servio Logic Development Corporation. Programming in OPAL, 1990. Version 2.0.8. Leavens G, Weihl W. Reasoning about object-oriented programs that use subtypes. Proceedings of the Fifth International

Conference on Object-Oriented Programming: Systems, Languages, and Applications Joint with the Fourth EuropeanConference on Object-Oriented Programming, 1990. ACM: New York, 1990; 212–223.

9. Bertino E, Guerrini G. Objects with multiple most specific classes. Proceedings of the Ninth European Conference onObject-Oriented Programming, 1995. Olthoff W (ed.). Springer: Berlin, 1995; 102–126.

10. Andries M, Cabibbo L, Paredaens J, van den Bussche J. Applying an update method to a set of receivers. Proceedings ofthe Fourteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. ACM: New York, 1995;208–218.

11. Laasch C, Scholl MH. Deterministic semantics of set-oriented update sequences. Proceedings Ninth IEEE InternationalConference on Data Engineering. IEEE: Los Alamitos, CA, 1993; 4–13.

12. Liefke H, Davidson S. Updates and non-determinism in object-oriented databases. Technical Report MS-CIS-99-11,Department of Computer and Information Science, University of Pennsylvania, 1999.

13. Abiteboul S, Lausen G, Uphoff H, Waller E. Methods and rules. Proceedings of the ACM SIGMOD InternationalConference on Management of Data, 1993. Buneman P, Jajodia S (eds.). ACM: New York, 1993; 32–41.

14. Lou Y, Ozsoyoglu ZM. LLO: An object-oriented deductive language with methods and methods inheritance. Proceedingsof the ACM SIGMOD International Conference on Management of Data, 1991. ACM: New York, 1991; 198–207.

15. Cattel R, Barry D, Berler M, Eastman J, Jordan D, Russel C, Schadow O, Stanienda T, Velez F. The Object DatabaseStandard: ODMG 3.0. Morgan Kaufmann: San Mateo, CA, 1999.

16. Ceri S, Fraternali P. Designing Database Applications with Objects and Rules—The IDEA Methodology. Addison-Wesley:Reading, MA, 1997.

17. Bertino E, Martino L. Object-Oriented Database Systems. Addison-Wesley: Reading, MA, 1993.18. Guerrini G, Bertino E, Bal R. A formal definition of the chimera object-oriented data model. Journal of Intelligent

Information Systems 1998; 11(1):5–40.19. Ceri S, Gottlob G, Tanca L. Logic Programming and Databases. Springer: Berlin, 1990.20. Beeri C, Naqvi S, Shmueli O, Tsur S. Set constructors in a logic database language. Journal of Logic Programming 1991;

10(3&4):181–232.



21. Abiteboul S, Hull R, Vianu V. Foundations of Databases. Addison-Wesley: Reading, MA, 1995.22. Jones N. Computability and Complexity. The MIT Press: Cambridge, MA, 1997.23. Birkhoff G. Lattice theory. American Mathematical Society Colloquium Publications, vol. 25. American Mathematical

Society: Providence, RI, 1973.24. Albano A, Bergamini R, Ghelli G, Orsini R. An object data model with roles. Proceedings of the Nineteenth International

Conference on Very Large Data Bases, 1993. Agrawal R, Baker S, Bell D (eds.). Morgan Kaufmann: San Francisco, CA,1993; 39–51.

25. Chambers C. Predicate classes. Proceedings of the Seventh European Conference on Object-Oriented Programming, 1993.Nierstrasz O (ed.). Springer: Berlin, 1993; 268–296.

26. Bal R, Balsters H. A deductive and typed object-oriented language. Proceedings of the Third International Conference onDeductive and Object-Oriented Databases, 1993. Tsur S, Ceri S, Tanaka K (eds.). Springer: Berlin, 1993; 340–359.

27. Dobbie G, Topor RW. A model for inheritance and overriding in deductive object-oriented systems. Proceedings of theSixteenth Australian Computer Science Conference. Springer: Berlin, 1993; 625–634.

28. Jamil HM, Lakshmanan LVS. A declarative semantics of behavioral inheritance and conflict resolution. Proceedings of theInternational Logic Programming Symposium. MIT Press: Boston, MA, 1995; 130–144.

29. Bertino E, Guerrini G, Montesi D. Deductive object databases. Proceedings of the Eighth European Conference on Object-Oriented Programming, 1994. Tokoro M, Pareschi R (eds.). Springer: Berlin, July 1994; 213–235.

30. Bonner A, Kifer M. An overview of transaction logic. Theoretical Computer Science 1994; 133(2):205–265.31. Montesi D, Bertino E, Martelli M. Transactions and updates in deductive databases. IEEE Transactions on Knowledge and

Data Engineering 1997; 9(5):784–797.32. Liskov BH, Wing JM. A behavioral notion of subtyping. ACM Transactions on Programming Languages and Systems

1994; 16(6):1811–1841.33. Meyer B. Eiffel: The Language. Prentice-Hall: Englewood Cliffs, NJ, 1992.34. Alagic S, Solorzano J, Gitchell D. Orthogonal to the Java imperative. Proceedings of the Twelfth European Conference on

Object-Oriented Programming, Jul E (ed). Springer: Berlin, 1998; 212–233.35. Collet P, Vignola G. Towards a consistent viewpoint on consistency for persistent applications. Objects and Databases

(Lecture Notes in Computer Science, vol. 1944). Dittrich K et al. (eds.). Springer: Berlin, 2001; 47–60.36. Benzaken V, Doucet A. Themis: A database programming language handling integrity constraints. The VLDB Journal

1995; 4(3):493–517.37. Baralis E, Ceri S, Paraboschi S. Compile-time and runtime analysis of active behaviors. IEEE Transactions on Knowledge

and Data Engineering 1998; 10(3):353–370.38. Benzaken V, Schaefer X. Static integrity constraint management in object-oriented database programming languages via

predicate transformers. Proceedings of the Eleventh European Conference on Object-Oriented Programming. Aksit M(ed.). Springer: Berlin, 1997; 60–84.

39. Liefke H, Davidson S. Processing updates on complex value databases. Proceedings of the International Conference ofInformation Resource Management Association, 1999.

40. Bertino E, Catania B. Static analysis of intensional databases in U-datalog. Proceedings of the Fifteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. ACM: New York, 1996; 202–212.

41. Elmagarmid AK. Database Transaction Models for Advanced Applications. Morgan Kaufmann: San Mateo, CA, 1992.42. Spelt D, Balsters H. Automatic verification of transactions on an object-oriented database. Proceedings of the Sixth

International Workshop on Database Programming Languages. Springer: Berlin, 1997; 396–412.43. Abadi M, Cardelli L. A Theory of Objects. Springer: Berlin, 1996.

