
Concurrency Control in Advanced Database Applications

NASER S. BARGHOUTI AND GAIL E. KAISER

Department of Computer Science, Columbia University, New York, New York 10027

Concurrency control has been thoroughly studied in the context of traditional database applications such as banking and airline reservations systems. There are relatively few studies, however, that address the concurrency control issues of advanced database applications such as CAD/CAM and software development environments. The concurrency control requirements in such applications are different from those in conventional database applications; in particular, there is a need to support nonserializable cooperation among users whose transactions are long-lived and interactive and to integrate concurrency control mechanisms with version and configuration control. This paper outlines the characteristics of data and operations in some advanced database applications, discusses their concurrency control requirements, and surveys the mechanisms proposed to address these requirements.

Categories and Subject Descriptors: D.2.6 [Software Engineering]: Programming Environments - interactive; D.2.9 [Software Engineering]: Management - programming teams; H.2.4 [Database Management]: Systems - concurrency; transaction processing; H.2.8 [Database Management]: Database Applications

General Terms: Algorithms, Design, Management

Additional Key Words and Phrases: Advanced database applications, concurrency control, cooperative transactions, design environments, extended transaction models, long transactions, object-oriented databases, relaxing serializability

INTRODUCTION

Many advanced computer-based applications, such as computer-aided design and manufacturing (CAD/CAM), network management, financial instruments trading, medical informatics, office automation, and software development environments (SDEs), are data intensive in the sense that they generate and manipulate large amounts of data (e.g., all the software artifacts in an SDE). It is desirable to base these kinds of application systems on data management capabilities similar to those provided by database management systems (DBMSs) for traditional data processing. These capabilities include adding, removing, retrieving, and updating data from on-line storage and maintaining the consistency of the information stored in a database. Consistency in a database is maintained if every data item satisfies specific consistency constraints. These are typically implicit in data processing in the sense that they are known to the implementors of the applications and programmed into atomic units called transactions that transform the database from one consistent state to another. Consistency can be violated by concurrent access to the same data item by multiple transactions. A DBMS solves this problem by enforcing a concurrency control policy that allows only consistency-preserving schedules of concurrent transactions to be executed.

We use the term advanced database applications to describe application

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

© 1991 ACM 0360-0300/91/0900-0269 $01.50

ACM Computing Surveys, Vol. 23, No. 3, September 1991


CONTENTS

INTRODUCTION
1. MOTIVATING EXAMPLE
2. ADVANCED DATABASE APPLICATIONS
3. CONSISTENCY PROBLEMS IN CONVENTIONAL DBMSs
   3.1 The Transaction Concept
   3.2 Serializability
4. TRADITIONAL APPROACHES TO CONCURRENCY CONTROL
   4.1 Locking Mechanisms
   4.2 Timestamp Ordering
   4.3 Multiversion Timestamp Ordering
   4.4 Optimistic Nonlocking Mechanisms
   4.5 Multiple Granularity Locking
   4.6 Nested Transactions
5. CONCURRENCY CONTROL REQUIREMENTS IN ADVANCED DATABASE APPLICATIONS
6. SUPPORTING LONG TRANSACTIONS
   6.1 Extending Serializability-based Techniques
   6.2 Relaxing Serializability
7. SUPPORTING COORDINATION AMONG MULTIPLE DEVELOPERS
   7.1 Version and Configuration Management
   7.2 Pessimistic Coordination
   7.3 Optimistic Coordination
8. SUPPORTING SYNERGISTIC COOPERATION
   8.1 Cooperation Primitives
   8.2 Cooperating Transactions
9. SUMMARY
ACKNOWLEDGMENTS
REFERENCES

systems, such as the ones mentioned above, that use DBMS capabilities. They are termed advanced to distinguish them from traditional database applications such as banking and airline reservations systems. In traditional applications, the nature of the data and the operations performed on the data are amenable to concurrency control mechanisms that enforce the classical transaction model. Advanced applications, in contrast, have different kinds of consistency constraints, and, in general, the classical transaction model is not applicable. For example, applications like network management and medical informatics may require real-time processing. Others like CAD/CAM and office automation involve long, interactive database sessions and cooperation among multiple database users. Conventional concurrency control mechanisms are not applicable “as is” in these new domains. This paper is concerned with the latter class of advanced applications, which involve computer-supported cooperative work. The requirements of these applications are elaborated in Section 5.

Some researchers and practitioners question the adoption of terminology and concepts from on-line transaction processing (OLTP) systems for advanced applications. In particular, these researchers feel the terms long transactions and cooperating transactions are an inappropriate and misleading use of the term transaction since they do not carry the atomicity and serializability properties of OLTP transactions. We agree that atomicity, serializability, and the corresponding OLTP implementation techniques are not appropriate for advanced applications. The term transaction, however, provides a nice intuition regarding the need for consistency, concurrency control, and fault recovery. Basic OLTP concepts such as locks, versions, and validation provide a good starting point for the implementation of long transactions and cooperating transactions. In any case, nearly all the relevant literature uses the term transaction. We do likewise in our survey.

The goals of this paper are to provide a basic understanding of the difference between concurrency control in advanced database applications and in traditional data processing applications, to outline mechanisms used to control concurrent access in these advanced applications, and to point out some problems with these mechanisms. We assume the reader is familiar with database concepts but do not assume an in-depth understanding of transactions and concurrency control issues. Throughout the paper we define the concepts we use and give practical examples of them. We explain the mechanisms at an intuitive level rather than at a detailed technical level.


The paper is organized as follows. Section 1 presents an example to motivate the need for new concurrency control mechanisms. Section 2 describes the data handling requirements of advanced database applications and shows why there is a need for capabilities like those provided by DBMSs. Section 3 gives a brief overview of the consistency problem in traditional database applications and explains the concept of serializability. Section 4 presents the main serializability-based concurrency control mechanisms. Readers who are familiar with conventional concurrency control schemes may wish to skip Sections 3 and 4. Section 5 enumerates the concurrency control requirements of advanced database applications. It focuses on software development environments, although many of the problems of CAD/CAM and office automation systems are similar. Sections 6, 7, and 8 survey the various concurrency control mechanisms proposed for this class of advanced database applications. Section 9 discusses some of the shortcomings of these mechanisms and concludes with a summary of the mechanisms.

1. MOTIVATING EXAMPLE

We motivate the need for extended concurrency control policies by a simple example from the software development domain. Variants of the following example are used throughout the paper to demonstrate the various concurrency control models.

Two programmers, John and Mary, are working on the same software project. The project consists of four modules A, B, C, and D. Modules A, B, and C consist of procedures and declarations that comprise the main code of the project; module D is a library of procedures called by the procedures in modules A, B, and C. Figure 1 depicts the organization of the project.

When testing the project, two bugs are discovered. John is assigned the task of fixing one bug that is suspected to be in module A. He “reserves” A and starts

Figure 1. Organization of example project (a project comprising modules A, B, C, and D, each containing procedures).

working on it. Mary’s task is to explore a possible bug in the code of module B, so she starts browsing B after “reserving” it. After a while, John finds there is a bug in A caused by bugs in some of the procedures in the library module, so he “reserves” module D. After modifying a few procedures in D, John proceeds to compile and test the modified code.

Mary finds a bug in the code of module B and modifies various parts of the module to fix it. Mary then wants to test the new code of B. She is not concerned with the modifications John made in A because module A is unrelated to module B. She does, however, want to access the modifications John made in module D because the procedures in D are called in module B. The modifications John made to D might have introduced inconsistencies to the code of module B. But since John is still working on modules A and D, Mary will either have to access module D at the same time John is modifying it or wait until he is done.

In the above example, if the traditional concurrency control scheme of two-phase locking were used, for example, John and Mary would not have been able to access the modules in the manner described above. They would be allowed to concurrently lock module A and module B, respectively, since they work in isolation on these modules. Both of them, however, need to work cooperatively on module D and thus neither of them can lock it. Even if the locks were at the granularity of procedures, they would still have a problem because both John and Mary might need to access the same procedures in order, for example, to recompile


D. The locks are released only after reaching a satisfactory stage of modification of the code, such as the completion of unit testing. Other traditional concurrency control schemes would not solve the problem because they would also require the serialization of Mary’s work with John’s.
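
The blocking described in this example can be illustrated with a minimal sketch of an exclusive lock table. The class and method names (`LockTable`, `reserve`, `release`) are hypothetical and are not taken from any system discussed in the paper; the sketch assumes one exclusive lock per module and simply refuses a conflicting request where a real DBMS would make the requester wait.

```python
class LockTable:
    """Toy exclusive-lock table: at most one holder per module (illustrative only)."""

    def __init__(self):
        self.holder = {}  # module name -> name of the user holding its lock

    def reserve(self, user, module):
        """Grant the lock if the module is free or already held by this user."""
        current = self.holder.get(module)
        if current is None or current == user:
            self.holder[module] = user
            return True
        return False  # a real system would block the requester here

    def release(self, user, module):
        """Release the lock, but only if this user actually holds it."""
        if self.holder.get(module) == user:
            del self.holder[module]


locks = LockTable()
john_a = locks.reserve("John", "A")
mary_b = locks.reserve("Mary", "B")
john_d = locks.reserve("John", "D")
mary_d = locks.reserve("Mary", "D")        # refused: John still holds D
locks.release("John", "D")                 # e.g., after John completes unit testing
mary_d_later = locks.reserve("Mary", "D")  # granted only now
```

Because John's lock on D is long-lived, Mary's request is refused for the whole duration of his work, which is the serialization of Mary's work behind John's that the example objects to.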

The problem might be solved by supporting parallel versions of module D. Mary would access the last compiled version of module D while John works on a new version. This requires Mary to retest her code after the new version of D is released, which is really unnecessary. What is needed is a flexible concurrency control scheme that allows cooperation between John and Mary. In the rest of this paper we explain the basic concepts behind traditional concurrency control mechanisms, show how these mechanisms do not support the needs of advanced applications, and describe several concurrency control mechanisms that provide some of the necessary support.

2. ADVANCED DATABASE APPLICATIONS

Many large multiuser software systems, such as software development environments, generate and manipulate large amounts of data. SDEs, for example, generate and manipulate source code, object code, documentation, test suites, and so on. Traditionally, users of such systems manage the data they generate either manually or by the use of special-purpose tools. For example, programmers working on a large-scale software project use system configuration management tools such as Make [Feldman 1979] and RCS [Tichy 1985] to manage the configurations and versions of the programs they are developing. Releases of the finished project are stored in different directories manually. The only common interface among all these tools is the file system, which stores project components in text or binary files regardless of their internal structures. This significantly limits the ability to manipulate these objects in desirable ways. It also causes inefficiencies in the storage of collections of objects and leaves data, stored as a collection of related files, susceptible to corruption due to incompatible concurrent access.

Recently, researchers have attempted to use database technology to manage the objects belonging to a system uniformly. Design environments, for example, need to store the objects they manipulate (design documents, circuit layouts, programs, etc.) in a database and have it managed by a DBMS for several reasons [Bernstein 1987; Dittrich et al. 1987; Nestor 1986; Rowe and Wensel 1989]:

(1) Data integration. Providing a single data management and retrieval interface for all tools accessing the data.

(2) Application orientation. Organizing data items into structures that capture much of the semantics of the intended applications.

(3) Data integrity. Preserving consistency and recovery to ensure all the data satisfy the integrity constraints required by the application.

(4) Convenient access. Providing a powerful query language to access multiple sets of data items at a time.

(5) Data independence. Hiding the internal structure of data from tools so that if the structure is changed, it will have a minimal impact on the applications using the data.

Since there are numerous commercial DBMSs available, several projects have tried to use them in advanced applications. Researchers discovered quite rapidly, however, that even the most sophisticated of today’s DBMSs are inadequate for advanced applications [Bernstein 1987; Korth and Silberschatz 1986]. One of the shortcomings of traditional general-purpose DBMSs is their inability to provide flexible concurrency control mechanisms. To understand the reasons behind this, we need to explain the concepts of transactions and serializability. These two concepts are central to all conventional concurrency control mechanisms.


3. CONSISTENCY PROBLEMS IN CONVENTIONAL DBMSs

Database consistency is maintained if every data item in the database satisfies the application-specific consistency constraints. For example, in an airline reservations system, one consistency constraint might be that each seat on a flight can be reserved by only one passenger. It is often the case, however, that the consistency constraints are not known beforehand to the designers of general-purpose DBMSs. This is due to the lack of information about the computations in potential applications and the semantics of database operations in these applications. Thus, the best a DBMS can do is abstract each database operation to be either a read operation or a write operation, irrespective of the particular computation. Then it can guarantee the database is always in a consistent state with respect to reads and writes, independent of the semantics of the particular application.
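
To make the notion of an application-specific consistency constraint concrete, the airline example can be written as a predicate over a database state. This is only an illustrative sketch: the function name `seats_consistent`, the tuple layout, and the sample flight and seat identifiers are assumptions, not part of the paper. A general-purpose DBMS would not know such a predicate, which is why it falls back on reasoning about reads and writes.

```python
def seats_consistent(reservations):
    """Check that each (flight, seat) pair is reserved at most once.

    reservations: list of (flight, seat, passenger) tuples (hypothetical layout).
    """
    seen = set()
    for flight, seat, _passenger in reservations:
        if (flight, seat) in seen:
            return False  # the same seat was reserved twice
        seen.add((flight, seat))
    return True


ok_state = [("F1", "12A", "John"), ("F1", "12B", "Mary")]
bad_state = ok_state + [("F1", "12A", "Bob")]  # seat 12A double-booked
```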

Ignoring the possibility of bugs in the DBMS program and the application program, inconsistent data result from two main sources: (1) software or hardware failures, such as bugs in the operating system or a disk crash in the middle of operations, and (2) concurrent access of the same data item by multiple users or programs.

3.1 The Transaction Concept

To solve these problems, the operations performed by a program accessing the database are grouped into sequences called transactions [Eswaran et al. 1976]. Users interact with a DBMS by executing transactions. In traditional DBMSs, transactions serve three distinct purposes [Lynch 1983]: (1) They are logical units that group together operations comprising a complete task; (2) they are atomicity units whose execution preserves the consistency of the database; and (3) they are recovery units that ensure that either all the steps enclosed within them are executed or none are. It is thus by definition that if the database is in a consistent state before a transaction starts executing, it will be in a consistent state when the transaction terminates.

In a multiuser system, users execute their transactions concurrently. The DBMS must provide a concurrency control mechanism to guarantee that consistency of data is maintained in spite of concurrent accesses by different users. From the user’s viewpoint, a concurrency control mechanism maintains the consistency of data if it can guarantee that each of the transactions submitted to the DBMS by a user eventually gets executed and that the results of the computations performed by each transaction are the same whether it is executed on a dedicated system or concurrently with other transactions in a multiprogrammed system [Bernstein et al. 1987; Papadimitriou 1986].

Let us follow up our previous example to demonstrate the transaction concept. John and Mary are now assigned the task of fixing two bugs that were suspected to be in modules A and B. The first bug is caused by an error in procedure p1 in module A, which is called by procedure p3 in module B. Thus, fixing the bug might affect both p1 and p3. The second bug is caused by an error in the interface of procedure p2 in module A, which is called by procedure p4 in B. John and Mary agree that John will fix the first bug and Mary will fix the second. John starts a transaction T_John and proceeds to modify procedure p1 in module A. After completing the modification to p1, he starts modifying procedure p3 in module B. At the same time, Mary starts a transaction T_Mary to modify procedure p2 in module A and procedure p4 in module B.

Although T_John and T_Mary are executing concurrently, their outcomes are expected to be the same as they would have been had each of them been executed on a dedicated system. The overlap between T_Mary and T_John results in a sequence of actions from both transactions, called a schedule. Figure 2 shows an example of a schedule made up by interleaving


T_John              T_Mary

reserve(A)
modify(p1)
write(A)
                    reserve(A)
                    modify(p2)
                    write(A)
reserve(B)
modify(p3)
write(B)
                    reserve(B)
                    modify(p4)
                    write(B)

(Time increases downward.)

Figure 2. Serializable schedule.

operations from T_John and T_Mary. A schedule that gives each transaction a consistent view of the state of the database is considered a consistent schedule. Consistent schedules are a result of synchronizing the concurrent operations of transactions by allowing only those operations that maintain consistency to be interleaved.

3.2 Serializability

Let us give a more formal definition of a consistent schedule. A schedule is consistent if the transactions comprising the schedule are executed serially. In other words, a schedule consisting of transactions T_1, T_2, ..., T_n is consistent if for every i = 1 to n - 1, transaction T_i is executed to completion before transaction T_{i+1} begins. We can then establish that a serializable execution, one that is equivalent to a serial execution, is also consistent. From the perspective of a DBMS, all computations in a transaction either read or write a data item from the database. Thus, two schedules S1 and S2 are said to be computationally equivalent if [Korth and Silberschatz 1986]:

(1) The set of transactions that participates in S1 and S2 is the same.

(2) For each data item Q in S1, if transaction T_i executes read(Q), and the value of Q read by T_i is written by T_j, the same will hold in S2 (i.e., read-write synchronization).

(3) For each data item Q in S1, if transaction T_i executes write(Q) before T_j executes write(Q), the same will hold in S2 (i.e., write-write synchronization).

For example, the schedule shown in Figure 2 is computationally equivalent to the serial schedule T_John, T_Mary (execute T_John to completion, then execute T_Mary) because the set of transactions in both schedules is the same, both data items A and B read by T_Mary are written by T_John in both schedules, and T_Mary executes both write(A) and write(B) after T_John in both schedules.
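
The three equivalence conditions can be checked mechanically. The following is a minimal sketch under stated assumptions: a schedule is represented as a list of hypothetical `(transaction, operation, item)` tuples, `reserve(X)` in Figure 2 is modeled as a read of X, and the helper names (`reads_from`, `write_order`, `equivalent`) are invented for illustration.

```python
def reads_from(schedule):
    """Condition (2): for every read, record whose write supplied the value."""
    last_writer = {}
    relation = []
    for txn, op, item in schedule:
        if op == "r":
            relation.append((txn, item, last_writer.get(item, "INIT")))
        elif op == "w":
            last_writer[item] = txn
    return sorted(relation)


def write_order(schedule):
    """Condition (3): the order in which transactions write each item."""
    order = {}
    for txn, op, item in schedule:
        if op == "w":
            order.setdefault(item, []).append(txn)
    return order


def equivalent(s1, s2):
    """True if s1 and s2 satisfy conditions (1), (2), and (3)."""
    same_txns = {t for t, _, _ in s1} == {t for t, _, _ in s2}
    return (same_txns
            and reads_from(s1) == reads_from(s2)
            and write_order(s1) == write_order(s2))


john_a = [("John", "r", "A"), ("John", "w", "A")]
john_b = [("John", "r", "B"), ("John", "w", "B")]
mary_a = [("Mary", "r", "A"), ("Mary", "w", "A")]
mary_b = [("Mary", "r", "B"), ("Mary", "w", "B")]

figure2 = john_a + mary_a + john_b + mary_b           # interleaving of Figure 2
serial_john_mary = john_a + john_b + mary_a + mary_b  # T_John, then T_Mary
serial_mary_john = mary_a + mary_b + john_a + john_b  # T_Mary, then T_John

same_as_john_first = equivalent(figure2, serial_john_mary)
same_as_mary_first = equivalent(figure2, serial_mary_john)
```

Under this model the Figure 2 interleaving matches the serial order T_John, T_Mary but not the reverse order, because in the reverse order T_Mary would read the initial values of A and B rather than the values written by T_John.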

The consistency problem in conventional database systems reduces to that of testing for serializable schedules because it is accepted that the consistency constraints are unknown. Each operation within a transaction is abstracted into either reading a data item or writing one. Achieving serializability in DBMSs can thus be decomposed into two subproblems: read-write synchronization and write-write synchronization, denoted rw and ww synchronization, respectively [Bernstein and Goodman 1981]. Accordingly, concurrency control algorithms can be categorized into those that guarantee rw synchronization, those that are concerned with ww synchronization, and those that integrate the two. The rw synchronization refers to serializing transactions in such a way that every read operation reads the same value of a data item as it would have read in a serial execution. The ww synchronization refers to serializing transactions so the last write operation of every transaction leaves the database in the same state as it would have left it in a serial execution. The rw and ww synchronizations together result in a consistent schedule.
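
A related test that is widely used in practice, though it is not the definition given in the text, is conflict serializability: build a precedence graph with an edge from one transaction to another whenever they access the same item in that order and at least one access is a write, then check the graph for cycles. A schedule that passes this test is serializable; the test is stricter than the equivalence defined above. The sketch below assumes the same hypothetical `(transaction, operation, item)` tuples, and all function names are illustrative.

```python
def precedence_edges(schedule):
    """Edges (earlier_txn, later_txn) for every rw, wr, or ww conflict."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed precedence graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: the graph contains a cycle
        visiting.add(node)
        if any(visit(nxt) for nxt in graph[node]):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in graph)


def conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


# Interleaving of Figure 2, with reserve(X) modeled as a read of X.
figure2 = [("John", "r", "A"), ("John", "w", "A"),
           ("Mary", "r", "A"), ("Mary", "w", "A"),
           ("John", "r", "B"), ("John", "w", "B"),
           ("Mary", "r", "B"), ("Mary", "w", "B")]

# Hypothetical bad interleaving: both read A before either writes it.
lost_update = [("John", "r", "A"), ("Mary", "r", "A"),
               ("John", "w", "A"), ("Mary", "w", "A")]
```

In the second schedule Mary's write overwrites John's without having seen it, which shows up as a cycle between the two transactions in the precedence graph.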

Thus, even though a DBMS may not have any information about application-specific consistency constraints, it can guarantee consistency by allowing only serializable executions of concurrent transactions. This concept of serializability is central to all the concurrency control mechanisms described in the next section. If more semantic information


about transactions and their operations is available, schedules that are not serializable but do maintain consistency could be permitted. This is exactly the goal of the extended transaction mechanisms discussed later.

4. TRADITIONAL APPROACHES TO CONCURRENCY CONTROL

To understand why conventional concurrency control mechanisms are too restrictive for advanced applications, it is necessary to be familiar with the basic ideas of the main serializability-based concurrency control mechanisms in conventional DBMSs. Most of the mechanisms follow one of four main approaches to concurrency control: two-phase locking (the most popular example of locking protocols), timestamp ordering, multiversion timestamp ordering, and optimistic concurrency control. Some mechanisms add multiple granularities of locking and nesting of transactions. In this section, we briefly describe these approaches. There have been a few comprehensive discussions and surveys of traditional concurrency control mechanisms, including Bernstein and Goodman [1981] and Kohler [1981]; a book has also been written on the subject [Bernstein et al. 1987].

4.1 Locking Mechanisms

4.1.1 Two-Phase Locking

The two-phase locking mechanism (2PL) introduced by Eswaran et al. [1976] is now accepted as the standard solution to the concurrency control problem in conventional DBMSs. 2PL guarantees serializability in a centralized database when transactions are executed concurrently. The mechanism depends on well-formed transactions, which do not relock entities that have been locked earlier in the transaction and are divided into a growing phase, in which locks are only acquired, and a shrinking phase, in which locks are only released. During the shrinking phase, a transaction is prohibited from acquiring locks. If a transaction tries during its growing phase to acquire a lock that has already been acquired by another transaction, it is forced to wait. This situation might result in deadlock if transactions are mutually waiting for each other’s resources.
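
The two rules for well-formed transactions (no relocking, and no lock acquisition after the first release) can be sketched as a small per-transaction guard. This is an illustrative sketch only: the class names `TwoPhaseTransaction` and `TwoPhaseViolation` are hypothetical, and the sketch deliberately omits conflicts between transactions, waiting, and deadlock handling.

```python
class TwoPhaseViolation(Exception):
    """Raised when a transaction breaks the two-phase discipline."""


class TwoPhaseTransaction:
    """Tracks one transaction's locks and enforces growing/shrinking phases."""

    def __init__(self, name):
        self.name = name
        self.held = set()         # locks currently held
        self.ever_locked = set()  # every item locked so far
        self.shrinking = False    # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise TwoPhaseViolation(f"{self.name}: lock({item}) in shrinking phase")
        if item in self.ever_locked:
            raise TwoPhaseViolation(f"{self.name}: relock of {item}")
        self.held.add(item)
        self.ever_locked.add(item)

    def unlock(self, item):
        if item not in self.held:
            raise TwoPhaseViolation(f"{self.name}: {item} is not held")
        self.held.remove(item)
        self.shrinking = True     # the growing phase is over


t = TwoPhaseTransaction("T_John")
t.lock("A")
t.lock("B")
t.unlock("A")          # shrinking phase begins here
try:
    t.lock("D")        # not allowed once a lock has been released
    violated = False
except TwoPhaseViolation:
    violated = True
```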

4.1.2 Tree Protocol

2PL allows only a subset of serializable schedules. In the absence of information about how and when the data items are accessed, however, 2PL is both necessary and sufficient to ensure serializability by locking [Yannakakis 1982]. In advanced applications, it is often the case that the DBMS has prior knowledge about the order of access of data items. The DBMS can use this information to ensure serializability by using locking protocols that are not 2PL. One such protocol is the tree protocol, which can be applied if there is a partial ordering on the set of data items accessed by concurrent transactions [Silberschatz and Kedem 1980]. To illustrate this protocol, assume a third programmer, Bob, joined the programming team of Mary and John and is now working with them on the same project. Suppose Bob, Mary, and John want to modify modules A and B concurrently in the manner depicted in schedule S1 of Figure 3. The tree protocol would allow this schedule because it is serializable (equivalent to T_Bob T_John T_Mary) even though it does not follow the 2PL protocol (because T_John releases the lock on A before it acquires the lock on B). It is possible to construct S1 because all of the transactions in the example access (write) A before B. This information about the access patterns of the three transactions is the basis for allowing the non-2PL schedule shown in the figure.
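
Whether a transaction's locking behavior is two-phase can be checked from its sequence of lock and unlock actions alone. The sketch below is illustrative: the function name `is_two_phase` is hypothetical, and the three action lists are the lock and unlock steps of each transaction as read from Figure 3, with read, modify, and write steps omitted.

```python
def is_two_phase(actions):
    """Return True if no lock is acquired after any lock has been released.

    actions: list of ("lock" | "unlock", item) pairs in execution order.
    """
    released_any = False
    for kind, _item in actions:
        if kind == "unlock":
            released_any = True
        elif kind == "lock" and released_any:
            return False  # a lock acquired in the shrinking phase
    return True


# Lock/unlock steps per transaction in schedule S1 (as read from Figure 3).
t_john = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
t_mary = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
t_bob = [("lock", "B"), ("unlock", "B")]

john_ok = is_two_phase(t_john)
mary_ok = is_two_phase(t_mary)
bob_ok = is_two_phase(t_bob)
```

Only T_John fails the check, which is the reason the text gives for S1 not following 2PL.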

4.2 Timestamp Ordering

One of the problems of locking mechanisms is the potential for deadlock. Deadlock occurs when two or more transactions are mutually waiting for each other’s resources. This problem can be solved by assigning each transaction a unique number, called a timestamp, chosen from a monotonically increasing


Schedule S1 (operations of each transaction, in order; time increases from left to right within each transaction):

T_John: lock(A), read(A), modify(A), write(A), unlock(A), lock(B), read(B), modify(B), write(B), unlock(B)

T_Mary: lock(A), read(A), modify(A), write(A), lock(B), read(B), modify(B), write(B), unlock(A), unlock(B)

T_Bob: lock(B), read(B), modify(B), write(B), unlock(B)

Figure 3. Serializable but not 2PL schedule.

sequence. This sequence is often a function of the time of day [Kohler 1981]. Using timestamps, a concurrency control mechanism can totally order requests from transactions according to the transactions’ timestamps [Rosenkrantz et al. 1978]. When a transaction T_1 requests access to a data item x that is being held by another transaction T_2, the mechanism forces T_1 to do one of three things: wait until T_2 terminates, abort itself and restart if it cannot be granted access to x, or preempt T_2 and get hold of x. A scheduling protocol decides which one of these three actions to take after comparing the timestamps of T_1 and T_2.

Two of the possible alternative scheduling protocols used by timestamp-based mechanisms are the WAIT-DIE protocol, which forces a transaction to wait if it conflicts with a running transaction whose timestamp is more recent or to die (abort and restart) if the running transaction’s timestamp is older, and the WOUND-WAIT protocol, which allows a transaction to wound (preempt by suspending) a running one with a more recent timestamp or forces the requesting transaction to wait otherwise. Locks are used implicitly in both protocols since some transactions are forced to wait as if they were locked out. Both protocols guarantee that a deadlock situation will not arise.
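The two scheduling decisions can be written as small functions. This is a hedged sketch rather than the authors' formulation: the function names, the string results, and the convention that a smaller timestamp means an older transaction are assumptions of the example.

```python
def wait_die(requester_ts, holder_ts):
    """WAIT-DIE: an older requester waits, a younger requester dies.

    Assumption of this sketch: a smaller timestamp means an older transaction.
    """
    return "wait" if requester_ts < holder_ts else "die"


def wound_wait(requester_ts, holder_ts):
    """WOUND-WAIT: an older requester wounds the holder, a younger one waits."""
    return "wound" if requester_ts < holder_ts else "wait"


# An old transaction (timestamp 1) meets a young one (timestamp 2).
print(wait_die(requester_ts=1, holder_ts=2))    # wait
print(wait_die(requester_ts=2, holder_ts=1))    # die
print(wound_wait(requester_ts=1, holder_ts=2))  # wound
print(wound_wait(requester_ts=2, holder_ts=1))  # wait
```

Under WAIT-DIE only older transactions ever wait for younger ones, and under WOUND-WAIT only younger ones wait for older ones, so a cycle of waiting transactions cannot form in either case.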

4.3 Multiversion Timestamp Ordering

The timestamp ordering mechanism above assumes that only one version of a data item exists. Consequently, only one transaction can access a data item at a time. This restriction can be relaxed by allowing multiple transactions to read and write different versions of the same data item as long as each transaction sees a consistent set of versions for all the data items it accesses. This is the basic idea of the first multiversion timestamp ordering scheme introduced by Reed [1978]. In Reed’s mechanism, each transaction is assigned a unique timestamp when it starts; all operations of the transaction are assigned the same timestamp. In addition, each data item x has a set of transient versions, each of which is a (write timestamp, value) pair, and a set of read timestamps. If a transaction reads a data item, the transaction’s timestamp is added to the set of read timestamps of the data item. A write operation, if permitted by the concurrency control protocol, causes the creation of a new transient version with the same timestamp as that of the transaction requesting the write operation. The concurrency control mechanism operates as follows:

Let T_i be a transaction with timestamp TS(i), and let R(x) be a read operation requested by T_i [i.e., R(x) will also be assigned the timestamp TS(i)]. R(x) is processed by reading the value of the version of x whose timestamp is the largest timestamp smaller than TS(i) (i.e., the latest value written before T_i started). TS(i) is then added to the set of read timestamps of x. Read operations are always permitted. Write operations, in contrast, might cause a conflict. Let T_j be another transaction with timestamp TS(j), and let W(x) be a write operation requested by T_j that assigns value u to item x. W(x) will be permitted only if no transaction with a more recent timestamp than TS(j) has already read a version of x whose timestamp is smaller than TS(j).

A situation like this can occur because of the delays in executing operations within a transaction. That is, an operation O_j belonging to transaction T_j is executed a certain period of time after T_j has started. Meanwhile, other operations from a transaction with a more recent timestamp might have been performed. In order to detect such situations, let interval(W) be the interval from TS(j) to the smallest timestamp of a version of x greater than TS(j) (i.e., a version of x that was written by a transaction whose timestamp is more recent than T_j’s timestamp). If any read timestamps lie in the interval [i.e., a transaction more recent than T_j has already read a value of x that W(x) should have superseded], then W(x) is rejected (and the transaction is aborted). Otherwise, W(x) is allowed to create a new version of x with timestamp TS(j).

The existence of multiple versions eliminates the need for write-write synchronization since each write operation produces a new version and thus cannot conflict with another write operation. The only possible conflicts are those corresponding to read-from relationships [Bernstein et al. 1987], as demonstrated by the protocol above.
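The read rule and the interval test for writes can be illustrated for a single data item. The sketch below is a simplified illustration, not Reed's implementation: the class name MultiversionItem, its attributes, the initial version with timestamp 0, and the use of positive integer timestamps are all assumptions. The interval is treated as open at TS(j) and closed at the next version's timestamp; the closed end is this sketch's reading, since the text leaves the endpoints unspecified.

```python
import bisect
import math


class MultiversionItem:
    """One data item with multiple versions (hypothetical helper class)."""

    def __init__(self, initial_value):
        self.write_ts = [0]            # sorted write timestamps of versions
        self.values = [initial_value]  # values[k] belongs to write_ts[k]
        self.read_ts = set()           # timestamps of all readers so far

    def read(self, ts):
        """Return the version with the largest write timestamp below ts."""
        index = bisect.bisect_left(self.write_ts, ts) - 1
        if index < 0:
            raise ValueError("no version is older than the reader")
        self.read_ts.add(ts)
        return self.values[index]

    def write(self, ts, value):
        """Create a version stamped ts, or return False if rejected."""
        position = bisect.bisect_right(self.write_ts, ts)
        upper = (self.write_ts[position]
                 if position < len(self.write_ts) else math.inf)
        # A reader inside the interval saw an older value that this
        # write would have had to supersede, so the write is too late.
        if any(ts < reader <= upper for reader in self.read_ts):
            return False
        if self.write_ts[position - 1] == ts:
            self.values[position - 1] = value  # same writer writes again
        else:
            self.write_ts.insert(position, ts)
            self.values.insert(position, value)
        return True


x = MultiversionItem("v0")
print(x.read(5))         # v0
print(x.write(3, "v3"))  # False: reader 5 already saw the older value
print(x.write(7, "v7"))  # True
print(x.read(9))         # v7
print(x.read(6))         # v0
print(x.write(8, "v8"))  # False: reader 9 already saw version 7
```

Reads never fail in this sketch, which mirrors the statement that read operations are always permitted; only late writes are rejected.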

4.4 Optimistic Nonlocking Mechanisms

In many applications, locking has been found to constrain concurrency and to add an unnecessary overhead. The locking approach has the following disadvantages [Kung and Robinson 1981]:

(1) Lock maintenance represents an unnecessary overhead for read-only transactions, which do not affect the integrity of the database.

(2) There are no locking mechanisms that provide high concurrency in all cases. Most of the general-purpose, deadlock-free locking mechanisms work well only in some cases but perform rather poorly in other cases.

(3) When large parts of the database reside on secondary storage, locking of objects that are accessed frequently (referred to as congested nodes) while waiting for secondary memory access causes a significant decrease in concurrency.

(4) Not permitting locks to be released except at the end of the transaction, which although not required is always done in practice to avoid cascaded aborts, decreases concurrency.

(5) Most of the time it is not necessary to use locking to guarantee consistency since most transactions do not overlap; locking may be necessary only in the worst cases.

To avoid these disadvantages, Kung and Robinson [1981] presented the concept of “optimistic” concurrency control. They require each transaction to consist of two or three phases: a read phase, a validation phase, and possibly a write phase. During the read phase, all writes take place on local copies (also referred to as transient versions) of the records to be written. Then, if it can be established during the validation phase that the changes the transaction made will not violate serializability with respect to all committed transactions, the local copies are made global. Only then, in the write phase, do these copies become accessible to other transactions.

Validation is done by assigning each transaction a timestamp at the end of the read phase and synchronizing using timestamp ordering. The correctness criteria used for validation are based on the notion of serial equivalence. Any schedule produced by this technique ensures that if transaction T_i has a timestamp older than the timestamp of transaction T_j, the schedule is equivalent to the serial schedule T_i followed by T_j. This can be ensured if any one of the following three conditions holds:

(1) T_i completes its write phase before T_j starts its read phase.

(2) The set of data items written by T_i does not intersect with the set of data items read by T_j, and T_i completes its write phase before T_j starts its write phase.

(3) The set of data items written by T_i does not intersect with the set of data items read or written by T_j, and T_i completes its read phase before T_j completes its read phase.
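The three conditions translate directly into a validation test between an older transaction T_i and a younger transaction T_j. The following Python sketch is illustrative only: the Txn record, its field names, and the use of integer clock values for phase boundaries are assumptions, not part of Kung and Robinson's presentation.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    """Bookkeeping for one transaction (hypothetical field names)."""
    read_set: set
    write_set: set
    read_start: int   # start of the read phase
    read_end: int     # end of the read phase
    write_start: int  # start of the write phase
    write_end: int    # end of the write phase


def serial_order_holds(ti, tj):
    """True if one of the three conditions orders ti before tj."""
    # (1) ti finished writing before tj began reading.
    if ti.write_end < tj.read_start:
        return True
    # (2) ti wrote nothing tj read, and ti finished writing
    #     before tj started writing.
    if not (ti.write_set & tj.read_set) and ti.write_end < tj.write_start:
        return True
    # (3) ti wrote nothing tj read or wrote, and ti finished its
    #     read phase before tj finished its read phase.
    touched_by_tj = tj.read_set | tj.write_set
    if not (ti.write_set & touched_by_tj) and ti.read_end < tj.read_end:
        return True
    return False


older = Txn({"A"}, {"A"}, read_start=0, read_end=2, write_start=3, write_end=4)
disjoint = Txn({"B"}, {"B"}, read_start=1, read_end=5, write_start=6, write_end=7)
overlapping = Txn({"A"}, {"A"}, read_start=1, read_end=5, write_start=6, write_end=7)

print(serial_order_holds(older, disjoint))     # True  (condition 2)
print(serial_order_holds(older, overlapping))  # False (tj must be rolled back)
```

In the second call the younger transaction read module A while the older one was still going to overwrite it, so none of the conditions holds and validation fails.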

Although optimistic concurrency control allows more concurrency under certain circumstances, it decreases concurrency when the read and write sets of concurrent transactions overlap. For example, Kung and Robinson’s protocol would cause one of the transactions in the simple 2PL schedule in Figure 2 to be rolled back and restarted. From the viewpoint of advanced applications, the use of rollback as the main mechanism for maintaining consistency is a serious disadvantage. Since operations in advanced applications are generally long-lived (e.g., compiling a module), rolling them back and restarting them wastes all the work these operations did (the object code produced by compilation). The inappropriateness of rolling back a long transaction in advanced applications is discussed further in Section 5.

4.5 Multiple Granularity Locking

The concurrency control protocols described so far operate on individual data items to synchronize transactions. It is sometimes desirable, however, to be able to access a set of data items as a single unit. Gray et al. [1975] presented a multiple granularity concurrency control protocol that aims to minimize the number of locks used while accessing sets of objects in a database. In their model, Gray et al. organize data items in a tree where small items are nested within larger ones. Each nonleaf item represents the data associated with its descendants. This is different from the tree protocol presented above in that the nodes of the tree do not represent the order of access of individual data items but rather the organization of data objects. The root of the tree represents the whole database. Transactions can lock nodes explicitly, which in turn locks descendants implicitly. Two kinds of locks are defined: exclusive and shared. An exclusive (X) lock excludes any other transaction from accessing (reading or writing) the node; a shared (S) lock permits other transactions to read the same node concurrently but prevents any updating of the node.

To determine whether to grant a lock on a node to a transaction, the transaction manager would have to follow the path from the root to the node to find out if any other transaction has explicitly locked any of the ancestors of the node. This is clearly inefficient. To solve this problem, a third kind of lock mode called an intention lock was introduced [Gray 1978]. All the ancestors of a node must be locked in intention mode before an explicit lock can be put on the node. In particular, nodes can be locked in five different modes. A nonleaf node is locked in intention-shared (IS) mode to specify that descendant nodes will be explicitly locked in shared (S) mode. Similarly, an intention-exclusive (IX) lock implies that explicit locking is being done at a lower level in exclusive (X) mode. A shared and intention-exclusive (SIX) lock on a nonleaf node implies that the whole subtree rooted at the node is being locked in shared mode and that explicit locking will be done at a lower level with exclusive-mode locks. A compatibility matrix for the five kinds of locks is shown in Figure 4. The matrix is used to determine when to grant lock requests and when to deny them.

Gray et al. defined the following multiple granularity protocol based on the compatibility matrix:

(1) A transaction T_i can lock a node in S or IS mode only if all ancestors of the node are locked in either IX or IS mode by T_i.

(2) A transaction T_i can lock a node in X, SIX, or IX mode only if all the ancestors of the node are locked in either SIX or IX mode by T_i.

(3) Locks should be released either at the end of the transaction (in any order) or in leaf-to-root order. In particular, if locks are not held to the end of the transaction, the transaction should not hold a lock on a node after releasing the locks on its ancestors.

       IS    IX    S     SIX   X
IS     yes   yes   yes   yes   no
IX     yes   yes   no    no    no
S      yes   no    yes   no    no
SIX    yes   no    no    no    no
X      no    no    no    no    no

Figure 4. Compatibility matrix of granularity locks.
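The compatibility matrix and the first two rules of the protocol can be encoded as lookup tables. The sketch below is one possible encoding under stated assumptions: the names COMPATIBLE, ANCESTOR_MODES, can_grant, and ancestors_allow are invented for the example, and lock release (rule 3) is not modeled.

```python
MODES = ("IS", "IX", "S", "SIX", "X")

# COMPATIBLE[m] is the set of modes another transaction may already hold
# on the same node when mode m is requested (the "yes" entries of Figure 4).
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

# Modes the requesting transaction itself must hold on every ancestor
# (rules 1 and 2 of the protocol).
ANCESTOR_MODES = {
    "IS":  {"IS", "IX"},
    "S":   {"IS", "IX"},
    "IX":  {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
    "X":   {"IX", "SIX"},
}


def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every lock others hold."""
    return all(held in COMPATIBLE[requested] for held in held_by_others)


def ancestors_allow(requested, own_ancestor_modes):
    """True if the requester's own ancestor locks permit `requested`."""
    return all(mode in ANCESTOR_MODES[requested] for mode in own_ancestor_modes)


print(can_grant("IX", ["IS", "IX"]))       # True
print(can_grant("S", ["IX"]))              # False
print(ancestors_allow("X", ["IX", "IX"]))  # True
print(ancestors_allow("X", ["IS"]))        # False
```

A lock manager would apply both tests before granting a request: ancestors_allow against the requester's own locks on the path from the root, and can_grant against the locks other transactions hold on the target node.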

The multiple granularity protocol increases concurrency and decreases overhead. This is especially true when there is a combination of short transactions with a few accesses and transactions that last for a long time accessing a large number of objects, such as audit transactions that access every item in the database. The Orion object-oriented database system provides a concurrency control mechanism based on the multigranularity mechanism described above [Garza and Kim 1988; Kim et al. 1988].

4.6 Nested Transactions

A transaction, as presented above, is a set of primitive atomic actions abstracted as read and write operations. Each transaction is independent of all other transactions. In practice, there is a need to compose several transactions into one unit (i.e., one transaction) for two reasons: (1) to provide modularity and (2) to provide finer-grained recovery. The recovery issue may be the more important one, but it is not addressed in detail here since the focus of this paper is on concurrency control. The modularity problem is concerned with preserving serializability when composing two or more transactions. One way to compose transactions is gluing together the primitive actions of all the transactions by concatenating the transactions in order into one big transaction. This preserves consistency but decreases concurrency because the resulting transaction is really a serial ordering of the subtransactions. Interleaving the actions of the transactions to provide concurrent behavior, on the other hand, can result in violation of serializability and thus consistency. What is needed is to execute the composition of transactions as a transaction in its own right and to provide concurrency control within the transaction.

The idea of nested spheres of control, which is the origin of the nested transactions concept, was first introduced by Davies [1973] and expanded by Bjork [1973]. Reed [1978] presented a comprehensive solution to the problem of composing transactions by formulating the concept of nested transactions. A nested transaction is a composition of a set of subtransactions; each subtransaction can itself be a nested transaction. To other transactions, only the top-level nested transaction is visible and appears as a normal atomic transaction. Internally, however, subtransactions are run concurrently and their actions are synchronized by an internal concurrency control mechanism. The more important point is that a subtransaction can fail and be restarted or replaced by another subtransaction without causing the whole nested transaction to fail or restart. In the case of gluing the actions of subtransactions together, on the other hand, the failure of any action would cause the whole new composite transaction to fail.

In Reed’s design, timestamp ordering is used to synchronize the concurrent actions of subtransactions within a nested transaction. Moss designed a nested transaction system that uses locking for synchronization [Moss 1985].
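The recovery benefit of nesting, namely that a failed subtransaction can be replaced without aborting its parent, can be sketched in a few lines. This is a deliberately simplified illustration and neither Reed's nor Moss's design: subtransactions run one after another rather than concurrently, the database is a plain dictionary, and the names run_nested, failing_edit, replacement_edit, and edit_b are hypothetical.

```python
def run_nested(state, subtransactions):
    """Run subtransactions against private copies of `state`.

    `subtransactions` is a list of lists; each inner list holds alternative
    callables for one subtransaction. A callable mutates the dict it is
    given and raises an exception to signal failure.
    """
    working = dict(state)  # the top-level transaction's private view
    for alternatives in subtransactions:
        for attempt in alternatives:
            scratch = dict(working)  # the subtransaction's own copy
            try:
                attempt(scratch)
            except Exception:
                continue  # discard the scratch copy, try the replacement
            working = scratch  # the subtransaction commits to its parent
            break
        else:
            # every alternative failed: only now does the parent abort
            raise RuntimeError("nested transaction aborted")
    return working  # the caller installs this as the new committed state


def failing_edit(db):
    db["A"] = "half-written A"
    raise RuntimeError("edit of A failed")


def replacement_edit(db):
    db["A"] = "new A"


def edit_b(db):
    db["B"] = "new B"


committed = {"A": "old A", "B": "old B"}
result = run_nested(committed, [[failing_edit, replacement_edit], [edit_b]])
print(result)     # {'A': 'new A', 'B': 'new B'}
print(committed)  # {'A': 'old A', 'B': 'old B'}
```

The partial update made by the failing subtransaction never reaches the parent's view, and the committed state is untouched until the top-level transaction finishes.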

As far as concurrency is concerned, the nested transaction model presented above does not change the meaning of transactions (in terms of being atomic). It also does not alter the concept of serializability. The only advantage of nesting is performance improvement because of the possibility of increasing concurrency at the subtransaction level, especially in a multiprocessor system. To illustrate this, consider transactions T_John and T_Mary of Figure 2. We can construct each as a nested transaction as shown in Figure 5. Using Moss’s algorithm, the concurrent execution of John’s transaction and Mary’s transaction will produce the same schedule presented in Figure 2. Within each transaction, however, the two subtransactions can be executed concurrently, improving the overall performance.

T_John: subtransactions [read(A) modify(A) write(A)] and [read(B) modify(B) write(B)], executed concurrently

T_Mary: subtransactions [read(A) modify(A) write(A)] and [read(B) modify(B) write(B)], executed concurrently

Figure 5. Scheduling nested transactions.

It should be noted that many of the concurrency control mechanisms proposed for advanced database applications are based on combinations of optimistic concurrency control, multiversion objects, and nested transactions. To understand the reasons behind this, we must first address the concurrency control requirements of advanced database applications. We explore these requirements in Section 5; in the rest of the paper, we present several approaches that take these requirements into consideration.

5. CONCURRENCY CONTROL REQUIREMENTS IN ADVANCED DATABASE APPLICATIONS

Traditional DBMSs enforce serializable executions of transactions with respect to read and write operations because of the lack of semantic knowledge about the application-specific operations. This leads to the inability to specify or check semantic consistency constraints on data. But there is nothing that makes a nonserializable schedule inherently inconsistent. If enough information is known about the transactions and operations, a nonserializable but consistent schedule can be constructed. In fact, equating the notions of consistency with serializability causes a significant loss of concurrency in advanced applications. In these applications, it is often possible to define specific consistency constraints. The DBMS can use these specifications rather than serializability as a basis for maintaining consistency. Several researchers have studied the nature of concurrent behavior in advanced applications and have arrived at new requirements for concurrency control [Bancilhon et al. 1985; Yeh et al. 1987]:

(1) Supporting long transactions. Operations on objects in design environments (such as compiling source code or circuit layout) are often long-lived. If these operations are embedded in transactions, these transactions, unlike traditional ones, will also be long-lived. Long transactions need different support than traditional short transactions. In particular, blocking a transaction until another commits is rarely acceptable for long transactions. It is worthwhile noting that the problem of long transactions has also been addressed in traditional data processing applications (e.g., bank audit transactions).

(2) Supporting user control. In order to support user tasks that are nondeterministic and interactive in nature, the concurrency control mechanism should provide the user with the ability to start a transaction, interactively execute operations within it, dynamically restructure it, and commit or abort it at any time. The nondeterministic nature of transactions implies that the concurrency control mechanism will not be able to determine whether or not the execution of a transaction will violate database consistency, except by actually executing it and validating its results against the changed database. This might lead to situations in which the user might have invested many hours running a transaction only to find out later, when he or she wants to commit the work, that some of the operations performed within the transaction violated some consistency constraints. The user would definitely oppose deleting all of the work (by rolling back the transaction). He or she might, however, be able to reverse the effects of some operations explicitly in order to regain consistency. Thus, there is a need to provide more user control over transactions.

(3) Supporting synergistic cooperation. Cooperation among programmers to develop project components has significant implications for concurrency control. In CAD/CAM systems, SDEs, and other design environments, several users might have to exchange knowledge (i.e., share it collectively) in order to be able to continue their work. The activities of two or more users working on shared objects may not be serializable. The users may pass the shared objects back and forth in a way that cannot be accomplished by a serial schedule. Also, two users might be modifying two parts of the same object concurrently, with the intent of integrating these parts to create a new version of the object. In this case, they might need to look at each other’s work to make sure they are not modifying the two parts in a way that would make their integration difficult. This kind of sharing and exchanging knowledge was termed synergistic interaction by Yeh et al. To insist on serializable concurrency control in design environments might thus decrease concurrency or, more significantly, actually prevent desirable forms of cooperation among developers.

There has been a flurry of research to develop new approaches to transaction management that meet the requirements of advanced applications. In the rest of the paper, we survey the mechanisms that address the requirements listed above. We group these mechanisms into three categories according to which requirement they support best. All the mechanisms that address only the problems introduced by long transactions are grouped in one section. Of the mechanisms that address the issue of cooperation, some achieve only coordination of the activities of multiple users, whereas others allow synergistic cooperation. The two classes of mechanisms are separated into two sections. Issues related to user control are briefly addressed by mechanisms in both categories, but we did not find any mechanism that provides satisfactory support for user control over transactions in advanced applications.

In addition to the three requirements listed above, many advanced applications require support for complex objects. For example, objects in a software project might be organized in a nested object system (projects consisting of modules that contain procedures), where individual objects are accessed hierarchically. We do not survey mechanisms that support complex objects because describing these mechanisms would require explaining concepts of object-oriented programming and object-oriented database systems, both of which are outside the scope of this paper. It is worthwhile noting, however, that the complexity of the structure and the size of objects in advanced applications strongly suggest the appropriateness of concurrency control mechanisms that combine and extend multiversion and multiple granularity mechanisms.

Many of the ideas implemented in the mechanisms we survey in the rest of the paper have been discussed earlier in other contexts. For instance, some of the ideas related to multilevel transactions, long transactions, and cooperative transactions were discussed by Davies [1978].

6. SUPPORTING LONG TRANSACTIONS

Many of the operations performed on data in advanced database applications are long-lived. Some, such as compiling code or printing a complete layout of a VLSI chip, last for several minutes or hours. When these operations are part of a transaction, they result in a long transaction (LT), which lasts for an arbitrarily long period of time (ranging from hours to weeks). Such transactions occur in traditional domains (e.g., printing the monthly account statements at a bank) as well as in advanced applications, but they are usually an order of magnitude longer in advanced applications. LTs are particularly common in design environments. The length of their duration causes serious performance problems if these transactions are allowed to lock resources until they commit. Other short or long transactions wanting to access the same resources are forced to wait even though the LT might have finished using the resources. LTs also increase the likelihood of automatic aborts (rollback) to avoid deadlock or in the case of failing validation in optimistic concurrency control.

Two main approaches have been pursued to solve these problems: extending serializability-based mechanisms while still maintaining serializable schedules and relaxing serializability of schedules containing LTs. These alternative approaches use the application-specific semantics of operations in order to increase concurrency. Several examples of each approach are presented in this section. Some of the schemes were proposed to support LTs for traditional DBMSs, but the techniques themselves seem pertinent to advanced applications and thus are discussed in this section.

6.1 Extending Serializability-Based Techniques

In traditional transaction processing, all database operations are abstracted into read and write operations. This abstraction is necessary for designing general-purpose concurrency control mechanisms that do not depend on the particulars of applications. Two-phase locking (2PL), for example, can be used to maintain consistency in any database system, regardless of the intended application. This is true because 2PL maintains serializability, and thus consistency, of transaction schedules by guaranteeing the atomicity of all transactions.

The performance of 2PL, however, is unacceptable for advanced applications because it forces LTs to lock resources for a long time even after they have finished using these resources. In the meantime, other transactions that need to access the same resources are blocked. Optimistic mechanisms that use timestamp ordering also suffer from performance problems when applied to long transactions. These mechanisms cause repeated rollback of transactions when the rate of conflicts increases significantly, which is generally the case in the context of long transactions.

One approach for solving the problems introduced by LTs is to extract semantic information about transactions and operations and use that information to extend traditional techniques. The extended technique should revert to the traditional scheme when the additional information is not available (it might be available for some transactions but not for others). This approach is the basis for extending both two-phase locking and optimistic concurrency control in order to address the requirements of long transactions.

6.1.1 Altruistic Locking

One piece of information that can be used to increase concurrency is the point at which a transaction no longer needs a resource, so that the resource can be released and used by other transactions. This information can be used to allow a long transaction, which otherwise follows a serializable mechanism such as two-phase locking, to release some of its resources conditionally. These resources can then be used by other transactions, provided that they satisfy certain requirements.

One formal mechanism that follows this approach is altruistic locking, which is an extension of the basic two-phase locking algorithm [Salem et al. 1987]. Altruistic locking makes use of information about access patterns of a transaction to decide which resources it can release. In particular, the technique uses two types of information: negative access pattern information, which describes objects that will not be accessed by the transaction, and positive access pattern information, which describes which objects will be accessed by the transaction and in what order. Taken together, these two types of information allow long transactions to release their resources after they are done with them. The set of all data items that have been locked and then released by an LT is called the wake of the transaction. Releasing a resource is a conditional unlock operation because it allows other transactions to access the released resource as long as they follow the restrictions stated in the protocol below, which ensures serializability.

A two-phase with release schedule is then defined as any schedule that adheres to two restrictions:

(1) No two transactions can hold locks on the same data item simultaneously unless one of them has locked and released the object before the other locks it; the later lock holder is said to be in the wake of the releasing transaction.

(2) If a transaction is in the wake of another transaction, it must be completely in the wake of that transaction. This means that if John’s transaction locks a data item that has been released by Mary’s transaction, any data item that is accessed by both John and Mary and that is currently locked by John must have been released by Mary before it was locked by John.

These two restrictions guarantee serializability of transactions without altering their structure. The protocol assumes transactions are programmed and not user controlled (i.e., the user cannot make up the transactions as he or she goes along). In the following example, however, we will assume an informal extension to this mechanism that will allow user-controlled transactions.

Consider again the example in Figure 1, where each module in the project contains a number of procedures (subobjects). Suppose Bob, who joined the programming team of Mary and John, wants to familiarize himself with the code of all the procedures of the project. Bob starts a long transaction, T_Bob, that accesses all of the procedures, one procedure at a time. He needs to access each procedure only once to read it and add some comments about the code; as he finishes accessing each procedure he releases it. In the meantime, John starts a short transaction, T_John, that accesses only two procedures, p1 then p2, from module A. Assume T_Bob has already accessed p2 and released it and is currently reading p1. T_John has to wait until T_Bob is finished with p1 and releases it. At that point T_John can start accessing p1 by entering the wake of T_Bob. T_John will be allowed to enter the wake of T_Bob (i.e., to be able to access p1) because all of the objects T_John needs to access (p1 and p2) are in the wake of T_Bob. After finishing with p1, T_John can start accessing p2 without delay since it has already been released by T_Bob.


[Figure: data objects on the vertical axis, time on the horizontal axis; solid lines trace the objects accessed by T_Bob, T_John, and T_Mary.]

Figure 6. Access patterns of three transactions.

Now assume Mary starts another short transaction, T_Mary, that needs to access both p2 and a third procedure p3 that is not yet in the wake of T_Bob. T_Mary can access p2 after T_John terminates, but then it must wait until either p3 has been accessed by T_Bob (i.e., until p3 enters the wake of T_Bob) or until T_Bob terminates. If T_Bob never accesses p3 (Bob changes his mind about viewing p3), T_Mary is forced to wait until T_Bob terminates (which might take a long time since it is a long transaction). To improve concurrency in this situation, Salem et al. [1987] introduced a mechanism for expanding the wake of a long transaction dynamically in order to enable short transactions that are already in the wake of a long transaction to continue running. The mechanism uses the negative access information provided to it in order to add objects that will not be accessed by the long transaction to its wake. Continuing the example, the mechanism would add p3 to the wake of T_Bob by issuing a release on p3 even if T_Bob had not locked it. This would allow T_Mary to access p3 and thus continue executing without delay.
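The two restrictions and the wake expansion in this example can be sketched as a single admission test. The code below is an informal illustration under strong assumptions, not the algorithm of Salem et al.: it considers one long transaction only, represents its state as three sets (wake, held, future), and the name may_lock is invented for the example.

```python
def may_lock(item, requester_locks, wake, held, future):
    """Decide whether a short transaction may lock `item` right now.

    requester_locks: items the short transaction currently holds
    wake:   items the long transaction has locked and released
    held:   items the long transaction holds at this moment
    future: items the long transaction may still lock later
    (all four sets are assumptions of this sketch)
    """
    if item in held:
        return False  # restriction 1: the item has not been released yet
    touches_wake = item in wake or bool(requester_locks & wake)
    if not touches_wake:
        return True   # no contact with the wake: ordinary locking applies
    # Restriction 2: every item shared with the long transaction
    # must already be in its wake.
    shared = (requester_locks | {item}) & (wake | held | future)
    return shared <= wake


# T_Bob has released p2, is reading p1, and may still visit p3 and p4.
wake, held, future = {"p2"}, {"p1"}, {"p3", "p4"}
print(may_lock("p1", set(), wake, held, future))   # False: T_John waits

# T_Bob releases p1.
wake, held = {"p1", "p2"}, set()
print(may_lock("p1", set(), wake, held, future))   # True: T_John enters the wake
print(may_lock("p2", {"p1"}, wake, held, future))  # True

# T_Mary holds p2 and asks for p3, which T_Bob might still access.
print(may_lock("p3", {"p2"}, wake, held, future))  # False: T_Mary waits

# Negative access information: T_Bob will never access p3,
# so p3 is added to the wake without ever having been locked.
wake, future = wake | {"p3"}, future - {"p3"}
print(may_lock("p3", {"p2"}, wake, held, future))  # True
```

The last two calls show the effect of expanding the wake: the same request from T_Mary is refused while p3 is still a possible future access of T_Bob and is admitted once p3 has been declared released.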

Figure 6 depicts the example above. Each data object is represented along the vertical axis. Time is represented along the horizontal axis. The transactions belonging to Bob, John, and Mary are represented by solid lines. For example, T_Bob is represented by a solid line that passes through several black dots. Each black dot represents a data object (horizontal dotted lines connect black dots to the objects they stand for). The objects T_Bob accesses include p1, p2, and p10 (the thick line extending to p3 is not part of T_Bob). T_Bob accesses p2 at time t1, as indicated by the black dot at point (t1, p2) in the graph. T_John is totally in the wake of T_Bob because every object accessed by T_John (p1 and p2) was accessed before by T_Bob. This is not the case with T_Mary. To allow T_Mary to execute, the mechanism expands the wake of T_Bob by adding p3 to it (as shown by the thick line); T_Mary is then totally in the wake of T_Bob. In this case, the schedule of the three transactions is equivalent to the serial execution of T_Bob, followed by T_John, followed by T_Mary.

Figure 7. Validation conflicts. (Timelines of T_Bob, T_John, and T_Mary along a TIME axis, each divided into READ, VAL, and WRITE phases, with the points "read p1" and "read p2" marked.)

The basic advantage of altruistic locking is its ability to use the knowledge that a transaction no longer needs access to a data object it has locked. It maintains serializability and assumes the data stored in the database are of the conventional form. Furthermore, if access information is not available, any transaction, at any time, can run under the conventional 2PL protocol without performing any special operations. As observed earlier, however, because of the interactive nature of transactions in design environments, the access patterns of transactions are not predictable. In the absence of this information, altruistic locking reduces to two-phase locking. Altruistic locking also suffers from the problem of cascaded rollbacks: When a long transaction aborts, all the short transactions in its wake have to be aborted even if they have already terminated.
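The wake rule of altruistic locking can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the implementation of Salem et al.: the class `LongTransaction`, its `donate` method, and the helper `may_run_in_wake` are hypothetical names, and the sketch models nothing beyond the requirement that a short transaction be completely in the wake.

```python
class LongTransaction:
    """Hypothetical model of a long transaction under altruistic locking."""

    def __init__(self, name):
        self.name = name
        self.wake = set()  # objects the transaction has released early

    def donate(self, obj):
        # Release an object the transaction is finished with, or one it has
        # declared it will never access (negative access information).
        self.wake.add(obj)


def may_run_in_wake(long_tx, needed_objects):
    """A short transaction may run concurrently only if every object it
    needs is already in the wake of the long transaction."""
    return set(needed_objects) <= long_tx.wake


t_bob = LongTransaction("T_Bob")
t_bob.donate("p1")
t_bob.donate("p2")

john_ok = may_run_in_wake(t_bob, ["p1", "p2"])      # completely in the wake
mary_before = may_run_in_wake(t_bob, ["p2", "p3"])  # p3 is not yet in the wake
t_bob.donate("p3")                                  # wake expanded with p3
mary_after = may_run_in_wake(t_bob, ["p2", "p3"])
```

In this sketch T_John may proceed at once, whereas T_Mary may proceed only after p3 has been added to the wake of T_Bob.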

6.1.2 Snapshot Validation

Altruistic locking assumes two-phase locking as its basis and thus suffers from the overhead of locking mechanisms noted in Section 4. An alternative approach that avoids this overhead is to assume an underlying validation mechanism. As presented in Section 4, validation (also called optimistic) techniques allow concurrent transactions to proceed without restrictions. Before committing a transaction, however, a validation phase has to be passed in order to establish that the transaction did not produce conflicts with other committed transactions. The main shortcoming of the traditional validation technique is its weak definition of conflict. Because of this weak definition some transactions, such as those in Figure 2, are restarted unnecessarily. In other words, the transactions might actually have been serializable but the conflict mechanism did not recognize them as such. This is not a serious problem in conventional applications where transactions are short. It is very undesirable, however, to restart a long transaction that has done a significant amount of work. Pradel et al. [1986] observed that the risk of restarting a transaction can be reduced by distinguishing between serious conflicts, which require restart, and nonserious conflicts, which do not. They introduced a mechanism called snapshot validation that uses this approach.

Going back to our example, assume Bob, John, and Mary start three transactions T_Bob, T_John, and T_Mary simultaneously. T_Bob modifies (i.e., writes) procedures p1 and p2 during the read phase of T_John and T_Mary as shown in Figure 7. The validation phase of T_John and T_Mary will thus consider operations in T_Bob. According to the traditional optimistic concurrency control protocol, both T_John and T_Mary would have to be restarted because of conflicts. Procedures p1 and p2 that they read have been updated by T_Bob. T_John read p1, which was later changed by T_Bob; thus, what T_John read was out of date. This conflict is "serious" since it violates serializability and must be prevented. In this case, T_John has to be restarted to read the updated p1. The conflict between T_Mary and T_Bob, however, is not serious since the concurrent schedule presented in Figure 7 is equivalent to the serial schedule of T_Bob followed by T_Mary. This schedule is not allowed under the traditional protocol, but the snapshot technique allows T_Mary to commit because the conflict is not serious.

Pradel et al. [1986] presented a simple mechanism for determining whether or not conflicts are serious. In the example above, T_Bob terminates while T_Mary is still in its read phase. Each transaction has a read set that is ordered by the time of access of each object. For example, if object p1 is accessed before object p2 by the same transaction, then p1 appears before p2 in the transaction's read set. When T_Bob terminates, T_Mary takes note of the termination in its read set. During its validation phase, T_Mary has to consider only the objects that were read before T_Bob terminated. Any conflicts that occur after that point in the read set are not considered serious. Thus, the conflict between T_Bob and T_Mary regarding procedure p2 is not serious because T_Mary read p2 after T_Bob has terminated.
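The ordered read set with a termination mark can be sketched as follows. This is a minimal illustration, not the data structure of Pradel et al.: the list-of-tuples representation, the `TERMINATED` marker, and the function `serious_conflicts` are assumptions made for the example.

```python
TERMINATED = "terminated"  # hypothetical marker kind in an ordered read set


def serious_conflicts(read_log, writer, write_set):
    """Return the objects whose reads conflict seriously with `writer`.

    `read_log` is the reader's read set in access order: ("read", obj)
    entries plus a (TERMINATED, writer_name) entry noted when that writer
    terminated. Reads that follow the marker saw the writer's final
    values, so conflicts on them are not serious.
    """
    serious = []
    for kind, value in read_log:
        if kind == TERMINATED and value == writer:
            break  # everything after this point is nonserious
        if kind == "read" and value in write_set:
            serious.append(value)
    return serious


bob_writes = {"p1", "p2"}
john_log = [("read", "p1"), (TERMINATED, "T_Bob")]  # read p1 too early
mary_log = [(TERMINATED, "T_Bob"), ("read", "p2")]  # read p2 after T_Bob ended

john_serious = serious_conflicts(john_log, "T_Bob", bob_writes)
mary_serious = serious_conflicts(mary_log, "T_Bob", bob_writes)
```

A transaction with a nonempty result must restart; here that is T_John, while T_Mary is allowed to commit.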

Pradel et al. [1986] also analyzed the starvation problem in the conventional optimistic protocol and discovered that the longer the transaction, the greater the risk of starvation. Starvation occurs when a transaction that is restarted because it had failed its validation phase keeps failing its validation phase due to conflicts with other transactions. Starvation is detected after a certain number of trials and restarts. The classical optimistic concurrency control protocol solves the starvation problem by locking the whole database for the starving transaction, thus allowing it to proceed uninterrupted. Such a solution is clearly not acceptable for advanced applications. Pradel et al. [1986] present an alternative solution based on the concept of a substitute transaction.

If a transaction, T_John, is starving, a substitute transaction, ST_John, is created such that ST_John has the same read set and write set as T_John. At this point, T_John is restarted. ST_John simply reads its transaction number (the first thing any transaction does), then immediately enters its validation phase. This will force all other transactions to validate against ST_John. Since ST_John has the same read and write sets as T_John, it will make sure that any other transaction that conflicts with T_John would not pass its validation against ST_John and thus would have to restart. This "clears the way" for T_John to continue its execution with a much decreased risk of restart. ST_John terminates only after T_John commits.

6.1.3 Order-Preserving Serializability for Multilevel Transactions

The two mechanisms presented above extend traditional single-level protocols in which a transaction is a flat computation made up of a set of atomic operations. In advanced applications, however, most computations are long-duration operations that involve several lower-level suboperations. For example, linking the object code of a program involves reading the object code of all its component modules, accessing system libraries, and generating the object code of the program. Each of these operations might itself involve suboperations that are distinguishable. If traditional single-level protocols are used to ensure atomicity of such long transactions, the lower-level operations will be forced to be executed in serial order, resulting in long delays and a decrease in concurrency.

Beeri et al. [1988] observed that concurrency can be increased if long-duration operations are abstracted into subtransactions that are implemented by a set of lower-level operations. If these lower-level operations are themselves translated into yet more lower-level operations, the abstraction can be extended to multiple levels. This is distinct from the traditional nested transactions model presented in Section 4 in two main respects: (1) A multilevel transaction has a predefined number of levels, of which each pair of adjacent levels defines a layer of the system, whereas nested transactions have no predefined notion of layers. (2) In contrast to nested transactions, where there need not be a notion of abstraction, in a multilevel transaction, the higher the level, the more abstract the operations.

These two distinctions lead to a major difference between transaction management for nested transactions and for multilevel transactions. In nested transactions, a single global mechanism must be used because there is no predefined notion of layers. The existence of layers of abstraction in multilevel transactions opens the way to a modular approach to concurrency control. Different concurrency control protocols (schedulers) are applied at different layers of the system. More specifically, layer-specific concurrency control protocols can be used. Each of these protocols must ensure serializability with respect to its layer. In addition, a protocol at one layer should not invalidate the protocols at higher levels. In other words, the protocols at all layers of the multilevel transaction should work together to produce a correct execution (schedule).

Unfortunately, not all combinations of concurrency control protocols lead to correct executions. To illustrate, assume we have a three-level transaction and the protocol between the second and third levels is commutativity-based. This means that if two adjacent operations at the third level can commute, their order in the schedule can be changed. Changing the order of operations at the third level, however, might change the order of subtransactions at the second level. Since the protocol only considers operations at the third level, it may change the order of operations in such a way as to result in a nonserializable order of the subtransactions at the second level.

The example above shows that serializability is too weak a correctness criterion to use for the "handshake" between the protocols of adjacent layers in a multilevel system. The correctness criteria must be extended to take into account the order of transactions at the adjacent layers. Beeri et al. [1986, 1989] introduced the notion of order-preserving correctness as the necessary property that layer-specific protocols must use to guarantee consistency. This notion was used earlier in a concurrency control model for multilevel transactions implemented in the DASDBS system [Weikum 1986; Weikum and Schek 1984]. A combined report on both of these efforts appears in Beeri et al. [1988].

The basic idea of order-preserving serializability is to extend the concept of commutativity. Commutativity states that order transformation of two operations belonging to the same transaction can be applied if and only if the two operations commute (i.e., the order of their execution with respect to each other is immaterial). This notion can be translated to multilevel systems by allowing the order of two adjacent operations to change only if their least common ancestor does not impose an order on their execution. If commuting operations leads to serializing the operations of a subtransaction in one unit (i.e., they are not interleaved with operations of other subtransactions) and thus making it an atomic computation, the tree rooted at the subtransaction can be replaced by a node representing the atomic execution of the subtransaction. Pruning serial computations and thus reducing the number of levels in a multilevel transaction by one is termed reduction.

To illustrate, assume Mary is assigned the task of adding a new procedure p10 to module A and recompiling the module to make sure the addition of procedure p10 does not introduce any compile-time errors. Bob is simultaneously assigned the task of deleting procedure p0 from module A. Adding or deleting a procedure from module A is an abstraction that is implemented by two operations: updating the attribute that maintains the list of procedures contained in A (i.e., updating the object containing module A) and updating the documentation D to describe the new functionality of module A after adding or deleting a procedure. Recompiling a module is an abstraction for reading the source code of the module and updating the object containing the module (e.g., to update its timestamp and modify the object code). Consider the concurrent execution of T_Mary and T_Bob in Figure 8a. Although the schedule is not serializable, it is correct because the operations at the lower level can be commuted so as to produce a serializable schedule while preserving the order of the subtransactions at the second level.

Figure 8. Order-preserving serializable schedule. (Panels (a) through (e) show the transaction trees after successive commute and reduce steps.)

The results of successive commutations are shown in Figures 8b and 8c. The result of applying reduction is shown in Figure 8d, and the final result of applying commutation to the reduced tree, which is a serial schedule, is shown in Figure 8e.

Beeri et al. [1988] have shown that order preservation is only a sufficient condition to maintain consistency across layers in a multilevel system. They present a weaker necessary condition, conflict-based, order-preserving serializability. This condition states that a layer-specific protocol need only preserve the order of conflicting operations of the top level of its layers. For example, consider the schedule in Figure 9a, which shows a concurrent execution of three transactions initiated by Mary, Bob, and John. Compiling module A and compiling module B are nonconflicting operations since they do not involve any shared objects. Linking the subsystem containing both A and B, however, conflicts with the other two operations. Although the schedule is not order-preserving serializable, it is correct because it could be serialized, as shown in Figure 9b, by changing the order of the two compile operations. Since these are nonconflicting subtransactions, the change of order preserves correctness.
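The conflict-based condition can be illustrated with a small check that a reordering preserves the relative order of every conflicting pair. This is a sketch under assumptions rather than the protocol of Beeri et al.: the `Operation` type, the read and write sets assigned to each operation, and the object name "S" for the linked subsystem are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    name: str
    reads: frozenset
    writes: frozenset


def conflicts(a, b):
    """Two operations conflict if either writes an object the other touches."""
    return bool(a.writes & (b.reads | b.writes) or b.writes & (a.reads | a.writes))


def reorder_allowed(original, reordered):
    """True if `reordered` keeps every conflicting pair in its original order."""
    position = {op.name: i for i, op in enumerate(reordered)}
    for i, first in enumerate(original):
        for second in original[i + 1:]:
            if conflicts(first, second) and position[first.name] > position[second.name]:
                return False
    return True


# Assumed read/write sets for the three operations of the example.
compile_a = Operation("compile(A)", frozenset({"A"}), frozenset({"A"}))
compile_b = Operation("compile(B)", frozenset({"B"}), frozenset({"B"}))
link = Operation("link", frozenset({"A", "B"}), frozenset({"S"}))

schedule = [compile_a, compile_b, link]
swap_compiles = reorder_allowed(schedule, [compile_b, compile_a, link])
move_link_first = reorder_allowed(schedule, [link, compile_a, compile_b])
```

Swapping the two compile operations is accepted because they share no objects, whereas moving the link operation ahead of either compile operation is rejected.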

Martin [1987] presented a similar model based on the paradigm of nested objects, which models hierarchical access to data by defining a nested object system. Each object in the system exists at a particular level of data abstraction. Operations at level i are specified in terms of operations at level i - 1. Thus, the execution of operations at level i results in the execution of perhaps several suboperations at level i - 1. The objects accessed by suboperations at level i - 1 on behalf of an operation on an object at level i are called subobjects of the object at level i.

Figure 9. Conflict-based, order-preserving serializable schedule. (Panel (a) shows the concurrent schedule; panel (b) shows the serialized schedule.)

Martin's model allows two kinds of schedules that are not serializable: externally serializable schedules and semantically verifiable schedules. Externally serializable schedules allow only serializable access to top-level objects while allowing nonserializable access to subobjects. Subobjects may be left in a state that cannot be produced by any serial execution. Semantically verifiable schedules allow nonserializable access to objects at all levels. Nonserializable behavior can be proven to be correct if the semantics of operations at all levels are given and considered. In Martin's model, weakening an object's conflict specification may produce a correct nonserializable schedule. For example, in Figure 9 it can be specified that a write operation on a specific object at a specific level does not conflict with a read operation on the same node. The scheduler would have then allowed the link operation and the compile operations to be commuted. Such a schedule might be considered correct if the semantics of linking the object code of two modules does not prohibit the linker from reading different versions of the two modules.

6.2 Relaxing Serializability

The approaches presented in Section 6.1 extend traditional techniques while maintaining serializability as a basis for guaranteeing consistency. Another approach that aims at supporting long transactions is based on relaxing the serializability requirement by using the semantics of either data or application-specific operations. Relaxing serializability allows more concurrency and thus improves the performance of a system of concurrent transactions.

The semantics-based mechanisms can be divided into two main groups [Skarra and Zdonik 1989]. One group defines concurrency properties on abstract data types; the other defines concurrency properties on the transactions themselves. The first group of mechanisms [Herlihy and Weihl 1988; Weihl 1988] constrains interleaving of concurrent transactions by considering conflicts between operations defined on typed objects. Describing the details of this group of mechanisms requires an overview of abstract data types and object-oriented systems. Since these concepts are outside the scope of this paper, we have chosen to limit our discussion to the mechanisms in the second group, which use semantics of transactions rather than typed objects.

6.2.1 Semantics-Based Concurrency Control

Garcia-Molina [1983] observed that by using semantic information about transactions, a DBMS can replace the serializability constraint with the semantic consistency constraint. The gist of this approach is that from a user's point of view, not all transactions need to be atomic. Garcia-Molina introduced the notion of sensitive transactions to guarantee that users see consistent data on their terminals. Sensitive transactions are those that must output only consistent data to the user and thus must see a consistent database state in order to produce correct data. Not all transactions that output data are sensitive since some users might be satisfied with data that are only relatively consistent. For example, suppose Bob wants to get an idea about the progress of his programming team. He starts a transaction T_Bob that browses the modules and procedures of the project. Meanwhile, John and Mary have two in-progress transactions, T_John and T_Mary, respectively, that are modifying the modules and procedures of the project. Bob might be satisfied with information returned by a read-only transaction that does not take into consideration the updates being made by T_John and T_Mary. This would avoid delays that would result from having T_Bob wait for T_John and T_Mary to finish before reading the objects they updated.

A semantically consistent schedule is one that transforms the database from one semantically consistent state to another. It does so by guaranteeing that all sensitive transactions obtain a consistent view of the database. Each sensitive transaction must appear to be an atomic transaction with respect to all other transactions.

It is more difficult to build a general concurrency control mechanism that decides which schedules preserve semantic consistency than it is to build one that recognizes serializable schedules. Even if all the consistency constraints were given to the DBMS (which is not possible in the general case), there is no way for the concurrency control mechanism to determine a priori which schedules maintain semantic consistency. The DBMS must run the schedules and check the constraints on the resulting state of the database in order to determine if they maintain semantic consistency [Garcia-Molina 1983]. Doing that, however, would be equivalent to implementing an optimistic concurrency control scheme that suffers from the problem of rollback. To avoid rollback, the concurrency control mechanism must be provided with information about which transactions are compatible with each other.

Two transactions are said to be compatible if their operations can be interleaved at certain points without violating semantic consistency. Having the user provide this information is not feasible in the general case because it burdens the user with having to understand the details of applications. In some applications, however, this kind of burden might be acceptable in order to avoid the performance penalty of traditional general-purpose mechanisms. If this is the case, the user still has to be provided with a framework for supplying information about the compatibility of transactions.

Garcia-Molina presented a framework that explicitly defines the semantics of database operations. He defines four kinds of semantic information: (1) transaction semantic types; (2) compatibility sets associated with each type; (3) division of transactions into smaller steps (subtransactions); and (4) countersteps to compensate for some of the steps executed within transactions. The first three kinds of information are declarative; the fourth piece of information consists of a procedural description. Transactions are categorized into types. The type of a transaction is determined by the nature of the operations it performs on objects. Each transaction type defines the steps that make up a transaction of that type. The steps are assumed to be atomic. A compatibility set associated with a transaction type defines allowable interleaving between steps of transactions of the particular kind with the same or other kinds of transactions. Countersteps specify what to do in case a step needs to be undone.

Using these definitions, Garcia-Molina defines an alternative concept to atomicity called semantic atomicity. A transaction is said to be semantically atomic if all its steps are executed or if any executed steps are eventually followed by their countersteps. An atomic transaction, in contrast, is one in which all or none of the steps are executed. In the context of a DBMS, the four pieces of information presented above are used by a locking mechanism that uses two kinds of locks: local locks, which ensure the atomicity of transaction steps, and global locks, which guarantee that the interleavings between transactions do not violate semantic consistency.
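The definition of semantic atomicity can be stated as a predicate over an execution log. The following sketch is an illustration only: the function `is_semantically_atomic`, the log representation, and the step and counterstep names are hypothetical, and the locking mechanism itself is not modeled.

```python
def is_semantically_atomic(all_steps, log, counterstep_of):
    """Check semantic atomicity for one transaction.

    all_steps: ordered names of the steps the transaction type defines.
    log: names of the steps and countersteps actually executed, in order.
    counterstep_of: maps each step name to the name of its counterstep.
    """
    executed = [step for step in all_steps if step in log]
    if executed == list(all_steps):
        return True  # every step ran, so the transaction completed
    # Otherwise each executed step must be followed later by its counterstep.
    for step in executed:
        counter = counterstep_of[step]
        if counter not in log[log.index(step) + 1:]:
            return False
    return True


steps = ["edit_module", "update_docs"]
counter = {"edit_module": "restore_module", "update_docs": "restore_docs"}

complete = is_semantically_atomic(steps, ["edit_module", "update_docs"], counter)
compensated = is_semantically_atomic(steps, ["edit_module", "restore_module"], counter)
dangling = is_semantically_atomic(steps, ["edit_module"], counter)
```

The first two logs satisfy the definition; the third does not, because the executed step is never compensated.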

Thus, depending on the compatibility sets of different types of transactions, various levels of concurrency can be achieved. In one extreme, if the compatibility sets of all kinds of transactions are empty, the mechanism reverts to a traditional locking mechanism that enforces serializability of long transactions. In the other extreme, if all transaction types are compatible, the mechanism only enforces the atomicity of the small steps within each transaction, and thus the mechanism reverts to a system of short atomic transactions (i.e., the steps). In advanced applications where this kind of mechanism might be the most applicable, allowable interleavings would be between these two extremes.
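The two extremes can be made concrete with a small lookup over compatibility sets. This is a sketch under assumptions: the dictionary representation, the type names, and the requirement that compatibility be declared in both directions are choices made for the example, not details taken from Garcia-Molina's framework.

```python
def types_compatible(type_a, type_b, compatibility):
    """Steps of two transactions may interleave only if each transaction
    type lists the other in its compatibility set (assumed symmetric)."""
    return (type_b in compatibility.get(type_a, set())
            and type_a in compatibility.get(type_b, set()))


# Extreme 1: empty compatibility sets, so no interleaving is permitted and
# long transactions must be serialized.
none_compatible = {"modify": set(), "browse": set()}

# Extreme 2: every type is compatible with every type, so only the atomicity
# of individual steps is enforced.
all_compatible = {
    "modify": {"modify", "browse"},
    "browse": {"modify", "browse"},
}

serialized = types_compatible("modify", "browse", none_compatible)
interleaved = types_compatible("modify", "browse", all_compatible)
```

Practical configurations would fall between these two dictionaries, listing some but not all types in each set.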

In Garcia-Molina's scheme, transactions are statically divided into atomic steps. Compatibility sets define the allowable interleaving with respect to those steps. Thus, if transactions of type X are compatible with transactions of types Y and Z, a transaction of type Y and a transaction of type Z can each arbitrarily interleave their steps with a transaction of type X. There is thus no distinction between interleaving with respect to Y and interleaving with respect to Z. Lynch [1983] observed that it might be more appropriate to have different sets of interleavings with respect to different transaction types. More specifically, it would be useful if for every two transaction types X and Y, the DBMS is provided with information about the set of breakpoints at which the steps of a transaction of type X can be interleaved with the steps of a transaction of type Y. A breakpoint specifies a point between two operations within a transaction at which one or more operations of another transaction can be executed.

Lynch's observation seems to be valid for systems in which activities tend to be hierarchical in nature, for example, software development environments. Transactions in such systems can often be nested into levels. Each level groups transactions that have something in common in terms of access to data items. Level one groups all the transactions in the system, whereas subsequent levels group transactions that are more strongly related to each other. A strong relation between two transactions might be that they often need to access the same objects at the same time in a nonconflicting way. A set of breakpoints is then described for each level. The higher-order sets (for the higher levels) always include the lower-order sets. This results in a total ordering of all sets of breakpoints. This means the breakpoints that specify interleavings at any level cannot be more restrictive than those that define interleaving at a higher level.

Let us illustrate this concept by continuing our example from the software development domain. Recall that Bob, John, and Mary are cooperatively developing a software project. In their development effort, they need to modify objects (code and documentation) as well as get information about the current status of development (e.g., the latest cross-reference information between procedures in modules A and B). Suppose Mary starts two transactions (e.g., in two different windows), T_Mary1 and T_Mary2, to modify a procedure in module A and get cross-reference information, respectively. Bob starts a transaction T_Bob1 to update a procedure in module B. John starts two transactions, T_John1 to modify module A and T_John2 to get cross-reference information.

A hierarchy of transaction classes for this example can be set up as shown in Figure 10. The top level includes all transactions. Level 2 groups all modification transactions (T_Mary1, T_Bob1, and T_John1) together and all cross-reference transactions (T_Mary2 and T_John2) together. Level 3 separates the transactions according to which modules they affect: those that modify module A (T_Mary1 and T_John1) are separated from those that modify module B (T_Bob1). Level 4 contains all the singleton transactions.

Figure 10. Multilevel transaction classes. (The hierarchy spans Level 1 through Level 4, with the singleton transactions T_Mary1, T_John1, T_Bob1, T_Mary2, and T_John2 at Level 4.)

The sets of breakpoints for these levels can be specified by describing the transaction segments between the breakpoints. For example, the top-level set might specify that no interleaving is allowed and the second-level set might specify that all modification transactions might interleave at some granularity and that cross-reference transactions might similarly interleave. The two types of transactions, however, cannot interleave their operations. This guarantees that cross-reference information does not change while a modification transaction is in progress.
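The class hierarchy of the example can be encoded as a path of classes per transaction, from which the deepest shared level follows directly. This is a hypothetical sketch, not Lynch's formalism: the class labels, the level-3 class assumed for the cross-reference transactions, and the rule that sharing a class at level 2 or deeper permits some interleaving are assumptions drawn from the example.

```python
# Class path of each transaction from level 1 (all transactions) down to
# level 4 (singletons). The labels are hypothetical; the text does not
# subdivide the cross-reference class at level 3, so it is repeated there.
CLASS_PATH = {
    "T_Mary1": ("all", "modify", "modify-A", "T_Mary1"),
    "T_John1": ("all", "modify", "modify-A", "T_John1"),
    "T_Bob1": ("all", "modify", "modify-B", "T_Bob1"),
    "T_Mary2": ("all", "xref", "xref", "T_Mary2"),
    "T_John2": ("all", "xref", "xref", "T_John2"),
}


def common_level(t1, t2):
    """Deepest level (1-based) at which two transactions share a class."""
    level = 0
    for a, b in zip(CLASS_PATH[t1], CLASS_PATH[t2]):
        if a != b:
            break
        level += 1
    return level


def classes_may_interleave(t1, t2):
    # Assumption from the example: the level-1 breakpoint set allows no
    # interleaving, whereas sets at level 2 and deeper allow some.
    return common_level(t1, t2) >= 2


same_module = common_level("T_Mary1", "T_John1")
same_kind = common_level("T_Mary1", "T_Bob1")
different_kind = common_level("T_Mary1", "T_John2")
```

Under these assumptions two modification transactions may interleave, while a modification transaction and a cross-reference transaction, which share only the level-1 class, may not.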

The concurrency control mechanism can then use the sets of breakpoints to provide as much concurrency as permitted by the allowed interleaving between these breakpoints at each level. Atomicity with respect to breakpoints and allowed interleaving is maintained at each level. Thus, the mechanism in our example might allow transactions T_Mary1 and T_John1 to interleave their steps while modifying module A (i.e., allow some degree of cooperation so as not to block out module A for a long time by one of them), but it will not allow T_Mary1 and T_John2 to interleave their operations.

Breakpoints are not the only way to provide semantic information about transactions. In some advanced applications such as CAD, where the different parts of the design are stored in a project database, it is possible to supply semantic information in the form of integrity constraints on database entities. Design operations incrementally change those entities in order to reach the final design [Eastman 1980, 1981]. By definition, full integrity of the design, in the sense of satisfying its specification, exists only when the design is complete. Unlike in conventional domains where database integrity is maintained during all quiescent periods, the iterative design process causes the integrity of the design database to be only partially satisfied until the design is complete. There is a need to define transactions that maintain the partial integrity required by design operations. Kutay and Eastman [1983] proposed a transaction model that is based on the concept of entity state.

Each entity in the database is associated with a state that is defined in terms of a set of integrity constraints. Like a traditional transaction, an entity state transaction is a collection of actions that reads a set of entities and potentially writes into a set of entities. Unlike traditional transactions, however, entity state transactions are instances of transaction classes. Each class defines (1) the set of entities that instance transactions read, (2) the set of entities that instance transactions write, (3) the set of constraints that must be satisfied on the read and write entity sets prior to the invocation of a transaction, (4) the set of constraints that can be violated during the execution of an instance transaction, (5) the set of constraints that hold after the execution of the transaction is completed, and (6) the set of constraints that is violated after the transaction execution is completed. A simple example of a transaction class is the class of transactions that have all the entities in the database as their read set. All other sets are empty since these transactions do not transform the database in any way.
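The six sets that make up a transaction class can be pictured as a small record type. The following Python sketch is only an illustration under our own assumptions: the database is modeled as a dictionary, constraints are named predicates, and the names TransactionClass, can_invoke, and the toy constraints are hypothetical rather than part of Kutay and Eastman's model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

# Hypothetical modeling choices: the database is a dict of entity values and
# each integrity constraint is a named predicate over that dict.
Database = Dict[str, int]
Constraint = Callable[[Database], bool]


@dataclass(frozen=True)
class TransactionClass:
    """The six sets that define an entity state transaction class."""
    read_set: FrozenSet[str]                           # (1) entities read
    write_set: FrozenSet[str] = frozenset()            # (2) entities written
    required_before: FrozenSet[str] = frozenset()      # (3) must hold at invocation
    may_violate_during: FrozenSet[str] = frozenset()   # (4) may be violated meanwhile
    hold_after: FrozenSet[str] = frozenset()           # (5) hold after completion
    violated_after: FrozenSet[str] = frozenset()       # (6) violated after completion


def can_invoke(tc: TransactionClass, db: Database,
               constraints: Dict[str, Constraint]) -> bool:
    """An instance may start only if every constraint in set (3) holds."""
    return all(constraints[name](db) for name in tc.required_before)


# Illustrative constraints for a toy building design (names are made up).
constraints = {
    "wall_exists": lambda db: db["wall_height"] > 0,
    "door_fits": lambda db: 0 < db["door_height"] < db["wall_height"],
}
db = {"wall_height": 0, "door_height": 0}

# The read-only class from the text: reads everything, all other sets empty.
inspect_all = TransactionClass(read_set=frozenset(db))

# A class that places a door: it needs a wall first and establishes "door_fits".
place_door = TransactionClass(
    read_set=frozenset({"wall_height"}),
    write_set=frozenset({"door_height"}),
    required_before=frozenset({"wall_exists"}),
    hold_after=frozenset({"door_fits"}),
)

print(can_invoke(inspect_all, db, constraints))  # no preconditions to check
print(can_invoke(place_door, db, constraints))   # the wall is not designed yet
```

In this sketch the precedence ordering among classes arises implicitly: place_door becomes invocable only after some other transaction has established the wall_exists constraint.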

The integrity constraints associated with transaction classes define a partial ordering of these classes in the form of a precedence ordering. Transaction classes can thus be depicted as a finite-state machine where the violation or satisfaction of specific integrity constraints defines a transition from one database state to another. Based on this, Kutay and Eastman [1983] define a concurrency control protocol that detects violations of the precedence ordering defined by the application-specific integrity constraints. Violations are resolved by communication among transactions to negotiate the abort of one or more of the conflicting transactions. Kutay and Eastman did not provide details of intertransaction communication.

6.2.2 Sagas

Semantic atomicity, multilevel atomicity, and entity-based integrity constraints are theoretical concepts that are not immediately practical. For example, neither Garcia-Molina [1983] nor Lynch [1983] explains how a multilevel atomicity scheme might be implemented. It is not clear how the user decides on the levels of atomicity and breakpoint sets. Simplifying assumptions are needed to make these concepts practical. One restriction that simplifies the multilevel atomicity concept is to allow only two levels of nesting: the LT at the top level and simple transactions. Making this simplifying restriction, Garcia-Molina and Salem [1987] introduced the concept of sagas, which are LTs that can be broken up into a collection of subtransactions that can be interleaved in any way with other transactions.

A saga is not just a collection of unrelated transactions because it guarantees that all its subtransactions will be completed or they will be compensated (explained shortly). A saga thus satisfies the definition of a transaction as a logical unit; a saga is similar to Moss’s nested transactions and Lynch’s multilevel transactions in that respect. Sagas are different from nested transactions, however, in that, in addition to there being only two levels of nesting, they are not atomicity units since sagas may view the partial results of other sagas. By structuring long transactions in this way, nonserializable schedules that allow more concurrency can be produced. Mechanisms based on nested transactions as presented in Section 4 produce only serializable schedules.

In traditional concurrency control, when a transaction is aborted for some reason, all the changes that it introduced are undone and the database is returned to the state that existed before the transaction began. This operation is called rollback. The concept of rollback is not applicable to sagas because, unlike atomic transactions, sagas permit other transactions to change the same objects that their committed subtransactions have changed. Thus, it would not be possible to restore the database to its state before the saga started without cascaded aborts of all the committed transactions that viewed the partial results of the aborted transaction. Instead, user-supplied compensation functions are executed to compensate for each transaction that was committed at the time of failure or automatic abort.

A compensation function undoes the actions performed by a transaction from a semantic point of view. For example, if a transaction reserves a seat on a flight, its compensation function would cancel the reservation. We cannot say, however, that the database was returned to the state that existed before the transaction started, because, in the meantime, another transaction could have reserved another seat and thus the number of seats that are reserved would not be the same as it was before the transaction.
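A saga with compensation functions can be sketched in a few lines of Python. This is a minimal illustration, not the mechanism of Garcia-Molina and Salem: the function run_saga, the dictionary database, and the seat and payment steps are all hypothetical, and running the compensations in reverse commit order is our assumption (the text only requires that each committed subtransaction be compensated).

```python
from typing import Callable, Dict, List, Tuple

Database = Dict[str, int]
Action = Callable[[Database], None]
Step = Tuple[Action, Action]  # (subtransaction, its compensation function)


def run_saga(db: Database, steps: List[Step]) -> bool:
    """Run subtransactions in order; on failure, compensate the committed ones.

    Assumption: compensations run in reverse commit order.
    """
    compensations: List[Action] = []
    for action, compensate in steps:
        try:
            action(db)                      # commits independently of the saga
        except Exception:
            for undo in reversed(compensations):
                undo(db)                    # semantic undo, not a state restore
            return False
        compensations.append(compensate)
    return True


def reserve_seat(db: Database) -> None:
    db["seats_reserved"] += 1


def cancel_seat(db: Database) -> None:
    db["seats_reserved"] -= 1


def charge_card(db: Database) -> None:
    raise RuntimeError("payment declined")  # simulated failure


def refund_card(db: Database) -> None:
    pass                                    # nothing was charged


db = {"seats_reserved": 10}
ok = run_saga(db, [(reserve_seat, cancel_seat), (charge_card, refund_card)])
print(ok, db)
```

Because the compensation decrements the counter instead of restoring a saved snapshot, a reservation committed by another transaction between reserve_seat and cancel_seat would survive the compensation, which is exactly the behavior described above.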

Although sagas were introduced to solve the problem of long transactions in traditional applications, their basic idea of relaxing serializability is applicable to design environments. For example, a long transaction to fix a bug in a design environment can be naturally modeled as a saga that consists of subtransactions to edit a file, compile source code, and run the debugger. These subtransactions can usually be interleaved with subtransactions of other long transactions. The three transactions in Figure 9 can be considered sagas, and the interleavings shown in the figure can be allowed under the sagas scheme. Using compensation functions instead of cascaded aborts is also suitable for advanced applications. For example, if one decided to abort the modifications introduced to a file, one could revert to an older version of the file and delete the updated version.

One shortcoming of sagas, however, is that they limit nesting to two levels. Most design applications require several levels of nesting to support high-level operations composed of a set of suboperations [Beeri et al. 1989]. In software development, for example, a high-level operation such as modifying a subsystem translates into a set of operations to modify its component modules, each of which in turn is an abstraction for modifying the procedures that make up the module.

Realizing the multilevel nature of advanced applications, several researchers have proposed models and proof techniques that address multilevel transactions. We already described three related models in Section 6.1.3 [Beeri et al. 1988, 1989; Weikum and Schek 1984]. Two other nested transaction models [Kim et al. 1984; Walter 1984] are described in Section 7, since these two models address the issue of groups of users and coordinated changes. We now describe a formal model of correctness without serializability that is based on multilevel transactions.

6.2.3 Conflict Predicate Correctness

Korth and Speegle [1988] have presented a formal model that allows mathematical characterization of correctness without serializability. Their model combines three features that lead to enhancing concurrency over the serializability-based models: (1) versions of objects, (2) multilevel transactions, and (3) explicit consistency predicates. These features are similar to Kutay and Eastman’s [1983] predicates described earlier. We describe Korth and Speegle’s model at an intuitive level.

The database in Korth and Speegle’s model is a collection of entities, each of which has multiple versions (i.e., multiple values). The versions are persistent and not transient as in the traditional multiversion schemes. A specific combination of versions of entities is termed a unique database state. A set of unique database states that involve different versions of the same entities forms one database state. In other words, each database state has multiple versions. The set of all versions that can be generated from a database state is termed the version state of the database. A transaction in Korth and Speegle’s model is a mapping from a version state to a unique database state. Thus, a transaction transforms the database from one consistent combination of versions of entities to another. Consistency constraints are specified in terms of pairs of input and output predicates on the state of the database. A predicate, which is a logical conjunction of comparisons between entities and constants, can be defined on a set of unique states that satisfy it. Each transaction guarantees that if its input predicate holds when the transaction begins, its output predicate will hold when it terminates.

Instead of being implemented by a set of flat operations, a transaction is implemented by a pair consisting of a set of subtransactions and a partial ordering on these subtransactions. Any transaction that cannot be divided into subtransactions is a basic operation such as read and write. Thus, a transaction in Korth and Speegle’s model is a quadruple (T, P, I_t, O_t), where T is the set of subtransactions, P is a partial ordering on these subtransactions, I_t is the input predicate on the set of all database states, and O_t is the output predicate. The input and output predicates define three sets of data items related to a transaction: (1) the input set, (2) the update set, and (3) the fixed-point set, which is the set of entities not updated by the transaction. Given this specification, Korth and Speegle define a parent-based execution of a transaction as a relation on the set of subtransactions T that is consistent with the partial order P. The relation encodes dependencies between subtransactions based on their three data sets. This definition allows independent executions on different versions of database states.

Finally, Korth and Speegle define a new multilevel correctness criterion: An execution is correct if at each level, every subtransaction can access a database state that satisfies its input predicate and the result of all of the subtransactions satisfies the output predicate of the parent transaction. But since determining whether an execution is in the class of correct executions is NP-complete, Korth and Speegle consider subsets of the set of correct executions that have efficient protocols. One of these subsets is the conflict predicate correct (CPC) class in which the only conflicts that can occur are a read of a data item followed by a write of the same data item (this is the same as in traditional multiversion techniques). In addition, if two data items are in different conjuncts of the consistency predicate, execution order must be serializable only with respect to each conjunct individually. If for each conjunct the execution order is serializable, the execution is correct. The protocol that recognizes the CPC class creates a graph for each conjunct where each node is a transaction. An arc is drawn between two nodes if one node reads a data item in the conjunct and the other node writes the same data item in the same conjunct. A schedule is correct if the graphs of all conjuncts are acyclic. This class contains executions that could not be produced by any of the mechanisms mentioned above except for sagas.
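The per-conjunct graph test can be sketched as follows. This is an illustrative Python rendering under stated assumptions, not Korth and Speegle's protocol: a schedule is a list of (transaction, operation, item) triples, each conjunct is given as the set of data items it mentions, and an arc is directed from the reader to the later writer (the text does not fix a direction). The schedule used is hypothetical and is not the one in Figure 11.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Op = Tuple[str, str, str]  # (transaction, "r" or "w", data item)
Graph = Dict[str, Set[str]]


def conflict_graph(schedule: List[Op], items: Set[str]) -> Graph:
    """Arc Ti -> Tj when Ti reads x and Tj later writes x, for x in `items`."""
    graph: Graph = defaultdict(set)
    for i, (ti, kind_i, x) in enumerate(schedule):
        if kind_i != "r" or x not in items:
            continue
        for tj, kind_j, y in schedule[i + 1:]:
            if kind_j == "w" and y == x and tj != ti:
                graph[ti].add(tj)
    return graph


def is_acyclic(graph: Graph) -> bool:
    """Depth-first search that fails as soon as a back edge is found."""
    WHITE, GREY, BLACK = 0, 1, 2
    color: Dict[str, int] = defaultdict(int)

    def visit(node: str) -> bool:
        color[node] = GREY
        for nxt in graph.get(node, ()):
            if color[nxt] == GREY:
                return False
            if color[nxt] == WHITE and not visit(nxt):
                return False
        color[node] = BLACK
        return True

    return all(color[n] != WHITE or visit(n) for n in list(graph))


def is_cpc(schedule: List[Op], conjuncts: List[Set[str]]) -> bool:
    """Correct under this sketch if every conjunct's graph is acyclic."""
    return all(is_acyclic(conflict_graph(schedule, c)) for c in conjuncts)


# Hypothetical schedule: T1 reads A before T2 writes it, and T2 reads B
# before T1 writes it.
schedule = [("T1", "r", "A"), ("T2", "r", "B"), ("T2", "w", "A"), ("T1", "w", "B")]

print(is_cpc(schedule, [{"A"}, {"B"}]))  # A and B in different conjuncts
print(is_cpc(schedule, [{"A", "B"}]))    # A and B in the same conjunct
```

With A and B in separate conjuncts each graph has a single arc and is acyclic, so the schedule is accepted; with both items in one conjunct the two arcs form a cycle and the schedule is rejected, as a conventional serialization-graph test would do.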

Korth and Speegle [1990], recognizing that the practicality of their model was in question, applied the model to a realistic example from the field of computer-aided software engineering (CASE). Rather than using the same example they presented, we use another example here. Consider the schedule shown in Figure 11 (which is adapted from [Korth and Speegle 1988]). This schedule is clearly not serializable and is not allowed by any of the traditional protocols. Suppose, however, that the database consistency constraint is a conjunction of the form P1 AND P2, where P1 is over A while P2 is over B. This occurs when A and B are not related to each other (i.e., the value of B does not depend on A and vice versa). In this case, the schedule is in CPC since the data items A and B are in different conjuncts of the database consistency constraint and the graphs for both conjuncts P1 and P2 individually are acyclic, as shown in Figure 12. In other words, the schedule is correct because both T_John and T_Mary access A in a serializable manner and also access B in a serializable manner.

Figure 11. Nonserializable but conflict-predicate-correct schedule (T_John and T_Mary interleave reads and writes of A and B; time runs downward).

Figure 12. Graphs built by CPC protocol (one graph per conjunct, P1 and P2).

6.2.4 Dynamic Restructuring of Transactions

In many advanced database applications, such as design environments, operations are interactive. The operations a user performs within a transaction might be (1) of uncertain duration, (2) of uncertain development (i.e., it cannot be predicted a priori which operations the user will invoke), and (3) dependent on other concurrent operations. Both altruistic locking and sagas address only the first and third of these characteristics. They do not address the uncertainty of the development of a transaction. Specifically, neither sagas nor long transactions in the altruistic locking scheme can be restructured dynamically to reflect a change in the needs of the user. To solve this problem, Pu et al. [1988] introduced two new operations, split-transaction and join-transaction, which are used to reconfigure long transactions while in progress.

The basic idea is that all sets of database actions included in a set of concurrent transactions are performed in a schedule that is serializable when the actions are committed. The schedule, however, may include new transactions that result from splitting and joining the original transactions. Thus, the committed set of transactions may not correspond in a simple way to the originally initiated set. A split-transaction divides an ongoing transaction into two or more serializable transactions by dividing the actions and the resources between the new transactions. The resulting transactions can proceed independently from that point on. More important, the resulting transactions behave as if they had been independent all along, and the original transaction disappears entirely, as if it had never existed. Thus, the split-transaction operation can be applied only when it is possible to generate two serializable transactions.

One advantage of splitting a transaction is the ability to commit one of the new transactions in order to release all of its resources so they can be used by other transactions. The splitting of a transaction reflects the fact that the user who controlled the original transaction has decided he or she is done with some of the resources reserved by the transaction. These resources can be treated as part of a separate transaction. Note that the splitting of a transaction in this case has resulted from new information about the dynamic access pattern of the transaction (the fact that it no longer needs some resources). This is different from the static access pattern that altruistic locking uses to determine that a resource can be released. Another difference from altruistic locking is that rather than only allowing resources to be released by committing one of the transactions that result from a split, the split-transactions can proceed in parallel and be controlled by different users. A join-transaction does the reverse operation of merging the results of two or more separate transactions, as if these transactions had always been a single transaction, and releasing their resources atomically.

To clarify this technique, suppose both Mary and John start two long transactions T_Mary and T_John to modify the two modules A and B. After a while, John finds that he needs to access module A. Being notified that T_John needs to access module A, Mary decides she can “give up” the module since she finished her changes to it. Therefore, she splits T_Mary into T_Mary and T_MaryA. Mary then commits T_MaryA, thus committing her changes to A while continuing to retain B. Mary can do that only if the changes committed to A do not depend in any way on the previous or planned changes to B, which might later be aborted. T_John can now read A and use it for testing code. Mary independently commits T_Mary, thus releasing B. T_John can then access B and finally commit changes to both A and B. The schedule of T_Mary, T_MaryA, and T_John is shown in Figure 13.
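The Mary and John scenario can be traced with a toy lock table. The Python sketch below is a strong simplification and not the algorithm of Pu et al.: the class LongTransaction, its shared lock table, and the depends set (pairs of items whose changes rely on each other) are hypothetical, and the independence condition for a split is reduced to checking that no recorded dependency crosses the two parts.

```python
from typing import Dict, Set, Tuple


class LongTransaction:
    """Toy long transaction holding exclusive locks in a shared lock table."""

    def __init__(self, name: str, lock_table: Dict[str, str]) -> None:
        self.name = name
        self.lock_table = lock_table               # item -> name of the holder
        self.items: Set[str] = set()
        # (x, y) means the change to x relies on y (hypothetical bookkeeping).
        self.depends: Set[Tuple[str, str]] = set()

    def lock(self, item: str) -> bool:
        if self.lock_table.get(item, self.name) != self.name:
            return False                           # held by another transaction
        self.lock_table[item] = self.name
        self.items.add(item)
        return True

    def split(self, new_name: str, give: Set[str]) -> "LongTransaction":
        """Hand the items in `give` to a new, independent transaction."""
        if not give <= self.items:
            raise ValueError("can only give away items this transaction holds")
        for x, y in self.depends:
            if (x in give) != (y in give):
                raise ValueError("split would not yield independent transactions")
        other = LongTransaction(new_name, self.lock_table)
        other.items = set(give)
        other.depends = {(x, y) for x, y in self.depends if x in give}
        self.depends -= other.depends
        self.items -= give
        for item in give:
            self.lock_table[item] = new_name
        return other

    def commit(self) -> None:
        for item in self.items:
            del self.lock_table[item]              # release the resource
        self.items.clear()


locks: Dict[str, str] = {}
mary = LongTransaction("T_Mary", locks)
john = LongTransaction("T_John", locks)
mary.lock("A")
mary.lock("B")

blocked = john.lock("A")               # Mary still holds A
mary_a = mary.split("T_MaryA", {"A"})  # Mary gives up A
mary_a.commit()                        # changes to A are committed
got_a = john.lock("A")                 # John can now access A
still_blocked = john.lock("B")         # B is retained by T_Mary
mary.commit()
got_b = john.lock("B")
print(blocked, got_a, still_blocked, got_b)
```

A join-transaction would be the inverse bookkeeping step: the item and dependency sets of two transactions are merged under one name so that their resources are released by a single commit.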

The split-transaction and join-transaction operations relax the traditional concept of serializability by allowing transactions to be dynamically restructured. Eventually, the restructuring produces a set of transactions that are serialized. Unlike all the other approaches described earlier in this section, this approach addresses the issue of user control over transactions since it allows users to restructure their long transactions dynamically. This is most useful when opportunities for limited exchange of data among transactions arise while they are in progress. The split and join operations can be combined with the other techniques discussed in Section 8 to provide for the requirements of cooperation.

T_Mary              T_MaryA       T_John

initiate
read(A)
read(B)
write(A)                          initiate
                                  request read(A)
notify(A)
split((B), (A))
                    commit(A)
write(B)                          actual read(A)
commit(B)                         write(A)
                                  read(B)
                                  write(B)
                                  commit(A, B)

(Time runs downward.)

Figure 13. Example of split-transaction.

7. SUPPORTING COORDINATION AMONG MULTIPLE DEVELOPERS

When a small group of developers works on a large project, a need arises to coordinate the access of its members to the database in which project components are stored. Most of the time, the developers work independently on the parts of the project for which they are responsible, but they need to interact at various points to integrate their work. Thus, a few coordination rules, which moderate the concurrent access to the project database by multiple developers, need to be enforced to guarantee that one developer does not duplicate or invalidate the work of other developers.

In this section we describe mechanisms that coordinate the efforts of members of a group of developers. It is important to emphasize that all the mechanisms described in this section fall short of supporting synergistic cooperation in the sense of being able to pass incomplete but relatively stable data objects between developers in a nonserializable fashion. It is also important to note that unlike the mechanisms presented in Section 6, most of the models presented here were not developed as formal transaction models but rather as practical systems to support design projects, mostly software development efforts. The behavior of these systems, however, can be formulated in terms of transaction models, as we do in this section.

7.1 Version and Configuration Management

The simplest form of supporting coordination among members of a development team is to control the access to shared files so only one developer can modify any file at any one time. One approach that has been implemented by widely used version control tools like the Source Code Control System (SCCS) [Rochkind 1975] and the Revision Control System (RCS) [Tichy 1985] is the checkout/checkin mechanism (also called reserve/replace and reserve/deposit). Each data object is considered to be a collection of different versions. Each version represents the state of the object at some time in the history of its development. The versions are usually stored in the form of a compact representation that allows the full reconstruction of any version, if needed. Once the original version of the object has been created, it becomes immutable, which means it cannot be modified. Instead, a new version can be created after explicitly reserving the object. The reservation makes a copy of the original version of the object (or the latest version thereafter) and gives the owner of the reservation exclusive access to the copy so he or she can modify it and deposit it as a new version.

Other users who need to access the same object must either wait until the new version is deposited or reserve another version, if one exists. Thus, two or more users can modify the same object only by working on two parallel versions, creating branches in the version history. Branching ensures write serializability by guaranteeing that only one writer per version of an object exists. The result of consecutive reserves, deposits, and branches is a version tree that records the full history of development of the object. When two branches of the version tree are merged (by manually merging the latest version of each branch into one version), the tree becomes a directed acyclic graph (DAG). This scheme is pessimistic since it does not allow access conflicts to occur on the same version (rather than allowing them to occur and then correcting them as in optimistic schemes). It is optimistic, however, in the sense that it allows multiple parallel versions of the same object to be created even if these versions are conflicting. The conflicts are resolved manually when the users merge their versions.
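The reserve/deposit discipline and the resulting version tree can be sketched as a small Python class. This is an illustration only and does not reproduce the storage format or commands of SCCS or RCS: the class VersionedObject, its integer version identifiers, and the use of plain strings as version contents are all hypothetical.

```python
from typing import Dict, List, Optional


class VersionedObject:
    """Toy checkout/checkin object: immutable versions, one writer per version."""

    def __init__(self, content: str) -> None:
        self.versions: Dict[int, str] = {1: content}   # version id -> content
        self.parent: Dict[int, Optional[int]] = {1: None}
        self.reserved_by: Dict[int, str] = {}          # version id -> owner
        self._next_id = 2

    def reserve(self, version: int, user: str) -> str:
        """Give `user` exclusive access to a working copy of `version`."""
        if version in self.reserved_by:
            raise RuntimeError(f"version {version} is already reserved")
        self.reserved_by[version] = user
        return self.versions[version]      # strings are immutable, so this is a copy

    def deposit(self, version: int, user: str, new_content: str) -> int:
        """Store the modified copy as a new version derived from `version`."""
        if self.reserved_by.get(version) != user:
            raise PermissionError("only the owner of the reservation may deposit")
        del self.reserved_by[version]
        new_id = self._next_id
        self._next_id += 1
        self.versions[new_id] = new_content
        self.parent[new_id] = version
        return new_id

    def children(self, version: int) -> List[int]:
        return sorted(v for v, p in self.parent.items() if p == version)


obj = VersionedObject("draft")
copy = obj.reserve(1, "john")
try:
    obj.reserve(1, "mary")                 # conflict on the same version
    conflict = False
except RuntimeError:
    conflict = True
v2 = obj.deposit(1, "john", copy + " +john")

obj.reserve(v2, "john")                    # John keeps working on the new version
obj.reserve(1, "mary")                     # Mary reserves another version instead
v3 = obj.deposit(1, "mary", "draft +mary") # a parallel version: the tree branches
print(conflict, obj.children(1))
```

Merging the two branches would add a version with two predecessors, which is the point at which the tree described in the text becomes a directed acyclic graph; the sketch leaves that step out.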

The basic checkout/checkin mechanism provides minimal coordination between multiple developers. It does not use semantic information about the objects or the operations performed on these objects. The model suffers from two main problems as far as concurrency control is concerned. First, it does not support any notion of aggregate or composite objects, forcing the user to reserve and deposit each object individually. This can lead to problems if a programmer reserves several objects, all of which belong conceptually to one aggregate object, creates new versions of all of them, makes sure they are consistent as a set, then forgets to deposit one of the objects. This will lead to an inconsistent set of versions being deposited. Second, the reserve/deposit mechanism does not provide support for reserved objects beyond locking them in the public database. Thus, once an object has been reserved by a programmer, it is not controlled by the concurrency control mechanism. The owner of the reservation can decide to let other programmers access that object.

The first problem results from not keeping track of which versions of objects are consistent with each other. For example, if each component (object) of a software system has multiple versions, it would be impossible to find out which versions of the objects actually participated in producing a particular executable version being tested. Further, a programmer cannot tell which versions of different objects are consistent with each other. There is a need to group sets of versions consistent with each other into configurations. This would enable programmers to reconstruct a system using the correct versions of the objects that comprise the system. This notion of configurations is supported by many software development systems (e.g., Apollo Domain Software Engineering Environment (DSEE) [Leblang and Chase, Jr. 1987]). A recent survey by Katz [1990] gives a comprehensive overview of version and configuration management systems.

Supporting configurations reduces the problem of consistency to the problem of explicitly naming the set of consistent versions in configuration objects. This basically solves the problem of checkout/checkin where only ad hoc ways (associating attributes with versions deposited at the same time) can be used to keep track of which versions of different objects belong together.

7.1.1 Domain Relative Addressing

Walpole et al. [1987, 1988a] addressed the problem of consistency in configuration management systems and introduced a concurrency control notion called domain-relative addressing that supports versions of configuration objects. Domain-relative addressing extends the notion of Reed’s time-relative addressing (multiversion concurrency control) described in Section 4.3. Whereas Reed’s mechanism synchronizes accesses to objects with respect to their timestamps, domain-relative addressing does so with respect to the domain of the data items accessed by the transaction. The database is partitioned into separate consistent domains, where each domain (configuration) consists of one version of each of the conceptual objects in a related set.

Figure 14. Domain-relative addressing schedule (T_Bob and T_John read, modify, and write procedures p1 and p2; time runs downward).

To illustrate this technique, consider the two transactions T_Bob and T_John of Figure 14. By all the conventional concurrency control schemes, the schedule in the figure is disallowed. Under domain-relative addressing, the schedule in Figure 14 is allowed because T_Bob and T_John operate on different versions of procedures p1 and p2. A similar scenario occurs if Bob wants to modify module A then modify module B to make it consistent with the updated A. At the same time, John wants to modify B, keeping it consistent with A. This can be done if T_Bob and T_John use different versions of modules A and B as shown in Figure 15. This scheme captures the semantics of the operations performed (consistent updates) by maintaining that version A1 (the original version of module A) is consistent with B1 (the version of module B modified by T_John), while A2 (module A after T_Bob has modified it) is consistent with B2 (the new version of module B that T_Bob has created). All of A1, A2, B1, and B2 become immutable versions. Domain-relative addressing is the concurrency control mechanism used in the Cosmos software development environment [Walpole et al. 1988b].

Figure 15. Maintaining consistency using domain-relative addressing.
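The idea that each transaction addresses objects relative to its own domain can be sketched as follows. This Python fragment is an illustration under our own assumptions and does not describe the Cosmos implementation: VersionStore and DomainTransaction are hypothetical names, a domain is a mapping from object name to version number, and the version numbers are assigned by the toy store, so they do not match the labels A1, A2, B1, and B2 used above.

```python
from typing import Dict, Tuple


class VersionStore:
    """Append-only store of immutable versions, keyed by (name, version)."""

    def __init__(self) -> None:
        self.content: Dict[Tuple[str, int], str] = {}
        self.latest: Dict[str, int] = {}

    def add(self, name: str, value: str) -> int:
        version = self.latest.get(name, 0) + 1
        self.latest[name] = version
        self.content[(name, version)] = value
        return version


class DomainTransaction:
    """Reads and writes are resolved relative to the transaction's own domain."""

    def __init__(self, store: VersionStore, domain: Dict[str, int]) -> None:
        self.store = store
        self.domain = dict(domain)     # private map: object name -> version

    def read(self, name: str) -> str:
        return self.store.content[(name, self.domain[name])]

    def write(self, name: str, value: str) -> None:
        # A write never overwrites: it creates a new immutable version and
        # makes only this transaction's domain point at it.
        self.domain[name] = self.store.add(name, value)


store = VersionStore()
base = {"A": store.add("A", "A-original"), "B": store.add("B", "B-original")}

bob = DomainTransaction(store, base)
john = DomainTransaction(store, base)

bob.write("A", "A-bob")                          # Bob modifies module A
john.write("B", john.read("B") + "+john")        # John modifies B against the old A
bob.write("B", "B-for-" + bob.read("A"))         # Bob updates B to match his A

print(bob.domain, john.domain)
print(john.read("A"), bob.read("B"))
```

Each domain stays internally consistent: John's domain pairs the original A with his B, and Bob's domain pairs his A with the B he derived from it, even though the writes to B were interleaved.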

7.2 Pessimistic Coordination

Although domain-relative addressing solves the problem of configurations of objects, it does not address the second problem of the checkout/checkin model, which is concurrency control support for the checked-out objects. Two mechanisms that provide partial solutions to this problem are the conversational transactions mechanism provided as an extension to System R [Lorie and Plouffe 1983; William et al. 1981] and the design transactions mechanism [Katz and Weiss 1984]. Although the models differ in their details, they are similar as far as concurrency control is concerned. Thus, we refer to both models as the conversational transactions model. In this model, the database of a design project consists of a public database and several private databases. The public database is shared among all designers, whereas each private database is accessed only by a single designer. Each designer starts a long transaction in his or her private database that lasts for the duration of the design task.

When the long transaction needs to access an object in the public database, it requests to check out the object in a particular mode, either to read it, write it, or delete it. This request initiates a short transaction on the public database. The short transaction first sets a short-lived lock on the object, then checks if the object has been checked out by another transaction in a conflicting mode. If it has not, the short transaction sets a permanent lock on the object for the duration of the long transaction. Before the short transaction commits, it copies the object to the specific private database and removes the short-lived lock. If the object has been checked out by another long transaction, the short transaction removes the short-lived lock, notifies the user that he or she cannot access the object, and aborts. The short-lived lock that was created by the short transaction on the public database prevents other short transactions from accessing the same object at the same time. The permanent locks prevent long transactions from checking out an object that has already been checked out in an exclusive mode.

All objects that are checked out by a long transaction are checked back in by initiating short checkin transactions on the public database at the end of the long transaction. A checkin transaction copies the object to the public database and deletes the old version of the object that was locked by the corresponding checkout transaction. The new version of the object does not inherit the long-lived lock from its predecessor. Thus, each conversational transaction ensures that all the objects that it checked out will be checked back in before it commits. This mechanism solves the first problem described above with the reserve/deposit model.
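The interplay of short-lived and permanent locks can be sketched in Python. This is a simplified illustration, not the System R extension or the design transactions mechanism: the class PublicDatabase and its method names are hypothetical, private databases are plain dictionaries, and only an exclusive checkout mode is modeled, so the read, write, and delete modes and their conflict rules are left out.

```python
import threading
from typing import Dict


class PublicDatabase:
    """Toy public database with short-lived and permanent (long) locks."""

    def __init__(self, objects: Dict[str, str]) -> None:
        self.objects = dict(objects)
        self.long_locks: Dict[str, str] = {}   # object -> owning long transaction
        # One short-lived lock per object, held only during a short transaction.
        self._short = {name: threading.Lock() for name in objects}

    def checkout(self, name: str, owner: str, private_db: Dict[str, str]) -> bool:
        """Short checkout transaction; returns False if it has to abort."""
        with self._short[name]:                # set the short-lived lock
            if name in self.long_locks:
                return False                   # already checked out: abort
            self.long_locks[name] = owner      # permanent lock for the long transaction
            private_db[name] = self.objects[name]
            return True                        # short-lived lock released on exit

    def checkin(self, name: str, owner: str, private_db: Dict[str, str]) -> None:
        """Short checkin transaction: replace the old version, drop the long lock."""
        with self._short[name]:
            if self.long_locks.get(name) != owner:
                raise PermissionError(f"{owner} has not checked out {name}")
            self.objects[name] = private_db.pop(name)
            del self.long_locks[name]          # the new version is not locked


public = PublicDatabase({"design": "v1"})
john_db: Dict[str, str] = {}
mary_db: Dict[str, str] = {}

john_ok = public.checkout("design", "T_John", john_db)
mary_ok = public.checkout("design", "T_Mary", mary_db)   # refused while John holds it

john_db["design"] = "v2"                                 # work in the private database
public.checkin("design", "T_John", john_db)
mary_retry = public.checkout("design", "T_Mary", mary_db)
print(john_ok, mary_ok, mary_retry, mary_db)
```

The short-lived lock is held only for the few statements inside each with block, whereas the entry in long_locks persists for the whole design task, which mirrors the division of labor between the two kinds of lock described above.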

A concurrency control mechanism similar to conversational transactions is used in Smile, a multiuser software development environment [Kaiser and Feiler 1987]. Smile adds semantics-based consistency preservation to the conversational transactions model by enforcing global consistency checks before allowing a set of objects to be checked in. Smile also maintains semantic information about the relations among objects, which enables it to reason about collections of objects rather than individual objects. It thus provides more support to composite objects such as modules or subsystems.

Like the conversational transactions model, Smile maintains all information about a software project in a main database, which contains the baseline version of a software project. Modification of any part of the project takes place in private databases called experimental databases. To illustrate Smile’s transaction model, assume John wants to modify modules A and B; he starts a transaction T_John and reserves A and B in an experimental database (EDB_John). When a module is reserved, all of its subobjects (e.g., procedures, types) are also reserved. Reserving A and B guarantees that other transactions will not be able to modify these modules until John has deposited them. Other transactions, however, can read the baseline version of the modules from the main database. John then proceeds to modify the body of the modules. When the modification process is complete, he requests a deposit operation to return the updated A and B to the main database and make all the changes available to other transactions.

Before a set of modules is deposited from an experimental database to the main database, Smile compiles the set of modules together with the unmodified modules in the main database. The compilation verifies that the set of modules is self-consistent and did not introduce any errors that would prevent integrating it with the rest of the main database. If the compilation succeeds, the modules are deposited and T_John commits. Otherwise, John is informed of the errors and the deposit operation is aborted. In this case, John has to fix the errors in the modules and repeat the deposit operation when he is done. T_John commits only when the set of modules that was reserved is successfully compiled then deposited.
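The deposit step can be sketched as a check-then-install operation. The Python fragment below is an illustration under loud assumptions and is not Smile's implementation: a module is reduced to the procedures it defines and calls together with their parameter counts, and the hypothetical function check_interfaces stands in for the compilation that Smile performs.

```python
from typing import Callable, Dict, List

# Hypothetical module representation: for each procedure, a parameter count.
Module = Dict[str, Dict[str, int]]   # {"defines": {...}, "calls": {...}}
Modules = Dict[str, Module]


def check_interfaces(modules: Modules) -> List[str]:
    """Stand-in for compilation: every call must match the callee's arity."""
    arity = {proc: n for m in modules.values() for proc, n in m["defines"].items()}
    errors = []
    for name, module in modules.items():
        for proc, n_args in module["calls"].items():
            if arity.get(proc) != n_args:
                errors.append(f"{name}: bad call to {proc}")
    return errors


def deposit(main_db: Modules, experimental_db: Modules,
            check: Callable[[Modules], List[str]]) -> List[str]:
    """Install the experimental modules only if the combined system checks."""
    candidate = {**main_db, **experimental_db}   # modified + unmodified modules
    errors = check(candidate)
    if errors:
        return errors                            # deposit aborted, main_db untouched
    main_db.update(experimental_db)
    experimental_db.clear()
    return []


main: Modules = {
    "A": {"defines": {"p1": 1}, "calls": {}},
    "C": {"defines": {"p7": 0}, "calls": {"p1": 1}},
}

# John changes the interface of p1 but leaves its caller in module C alone.
edb_john: Modules = {"A": {"defines": {"p1": 2}, "calls": {}}}
first_try = deposit(main, edb_john, check_interfaces)

# After also reserving and updating module C, the deposit goes through.
edb_john["C"] = {"defines": {"p7": 0}, "calls": {"p1": 2}}
second_try = deposit(main, edb_john, check_interfaces)
print(first_try, second_try)
```

The first attempt fails because the unmodified module C in the main database still calls p1 with one argument, so the main database is left unchanged until the whole set is consistent.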

Smile’s model of consistency not only enforces self-consistency of the set of modules, it also enforces global consistency with the baseline version of all other modules. Thus, John will not be permitted to make a change to the interface of module A (e.g., to the number or types of parameters of a procedure) within EDB_John unless he has reserved all other modules that may be affected by the change. For example, if procedure p1 of module A is called by procedure p7 of module C, John has to reserve module C (in addition to modules A and B, which he has already reserved) before he can modify the interface of p1. If another transaction T_Mary has module C reserved in another experimental database, EDB_Mary, the operation to change p1 is aborted and T_John is forced to either wait until T_Mary deposits module C, at which point T_John can reserve it, or to continue working on another task that does not require module C. From this example, it should be clear that by enforcing semantics-based consistency, Smile restricts cooperation even more than the conversational transactions because two users cannot simultaneously access objects semantically related to each other at the interface level.

Although the two-level database hierarchy of Smile and the conversational transactions mechanism provide better coordination support than the basic checkout/checkin model, they do not allow for a natural representation of hierarchical design tasks in which groups of users participate. Supporting such a hierarchy requires a nested database structure similar to the one provided by the multilevel transaction schemes described in Section 6.

7.2.1 Multilevel Pessimistic Coordination

A more recent system, Infuse, supports a multilevel, rather than a two-level, hierarchy of experimental databases. Infuse relaxes application-specific consistency constraints by requiring only that modules in an experimental database be self-consistent before they are deposited to the parent database [Kaiser and Perry 1987]. More global consistency is enforced only when the modules reserved in top-level experimental databases are deposited to the main database.

Returning to our example, assume both Bob and Mary are involved in a task that requires modifying modules A and C; Figure 16 depicts the situation. They prefer to create an experimental database in which both modules A and C are reserved (EDB_{A,C}). Bob and Mary decide that Bob should modify module A and Mary should work on module C. Bob creates a child experimental database in which he reserves module A (EDB_A). Mary creates EDB_C in which she reserves module C. Bob decides his task requires changing the interface of procedure p1 by adding a new parameter. At the same time, Mary starts modifying module C in her database. Recall that procedure p7 of module C calls p1 in module A. After Bob completes his changes, he deposits module A to EDB_{A,C}. No errors are detected at that point because Infuse only checks that A is self-consistent. This is possible because Infuse assumes any data types or procedures used in the module but not defined in it must be defined elsewhere. If they are not defined anywhere in the system, the final attempt to deposit into the main database will detect that. Infuse only checks that all uses of a data type or object in the same module are consistent with each other.

Figure 16. Experimental databases in Infuse. [Diagram: the main database is the parent of EDB_{A,C}, which is the parent of EDB_A (Bob) and EDB_C (Mary).]

Mary then finishes her changes and deposits module C. Again no errors are detected at that level. When either Bob or Mary attempts to deposit the modules in EDB_{A,C} to the main database, however, the compiler reports that modules A and C are not consistent with each other because of the new parameter of procedure p1. At that point, either Bob or Mary must create a child experimental database in which he or she can fix the bug by changing the call to p1 in procedure p7.
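The difference between the two levels of checking can be sketched as follows. This is an illustrative Python model only, not Infuse's actual checker: a module is assumed to be a dictionary with `defs` (procedure name to parameter count) and `calls` (pairs of callee name and argument count), and both function names are hypothetical.

```python
def self_consistent(module):
    """Local check: only calls to procedures defined in the same module are
    verified; anything undefined here is assumed to be defined elsewhere."""
    defs = module["defs"]
    return all(defs[callee] == nargs
               for callee, nargs in module["calls"]
               if callee in defs)


def globally_consistent(modules):
    """Top-level check: every call must match a definition somewhere in the system."""
    defs = {}
    for module in modules:
        defs.update(module["defs"])
    return all(callee in defs and defs[callee] == nargs
               for module in modules
               for callee, nargs in module["calls"])


# Bob adds a parameter to p1 (now two parameters); Mary's p7 still passes one.
module_a = {"defs": {"p1": 2}, "calls": []}
module_c = {"defs": {"p7": 0}, "calls": [("p1", 1)]}

local_ok = self_consistent(module_a) and self_consistent(module_c)
global_ok = globally_consistent([module_a, module_c])
```

Under these assumptions both deposits into the shared parent pass the local check, while the final deposit to the main database fails the global one, mirroring the scenario above.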


Infuse’s model allows greater concurrency at the cost of greater semantics-based inconsistency and the potential need for a later round of changes to reestablish consistency. But serializability is always maintained by requiring sibling EDBs to reserve disjoint subsets of the resources locked by the parent EDB.

7.3 Optimistic Coordination

The coordination models presented in Section 7.2 are pessimistic in that they do not allow concurrent access to the same object in order to prevent any consistency violations that might occur. They are thus more restrictive in that sense than the configuration and version management schemes. It is often the case in design efforts, however, that two or more developers within the same team prefer to access different versions of the same object concurrently. Since these developers are typically familiar with each other’s work, they can resolve any conflicts they introduce during their concurrent access by merging the different versions into a single consistent version. Rather than supporting multiple versions in a flat database, however, software development environments need to provide a hierarchical structure like Infuse’s.

7.3.1 Copy/Modify/Merge

Like Infuse, Sun’s Network Software Environment (NSE) supports a nested transaction mechanism that operates on a multilevel hierarchical database structure [Adams et al. 1989]. Like Cosmos (and unlike Infuse), NSE supports concurrent access to the same data objects. NSE combines the checkout/checkin model with an extension to the classical optimistic concurrency control policy, thus allowing limited cooperation among programmers. Unlike the checkout/checkin model and Cosmos, however, NSE provides some assistance to developers in merging different versions of the same data item.

NSE requires programmers to acquire (reserve) copies of the objects they want to modify in an environment (not to be confused with a software development environment) where they can modify the copies. Programmers in other environments at the same level cannot access these copies until they are deposited to the parent environment. Environments can, however, have child environments that acquire a subset of their set of copies. Multiple programmers can operate in the same environment where the basic reserve/deposit mechanism is enforced to coordinate their modifications.

Several sibling environments can concurrently acquire copies of the same object and modify them independently, thus creating parallel versions of the same object. To coordinate the deposit of these versions to the parent environment, NSE requires that each environment merge its version (called reconcile in NSE’s terminology) with the previously committed version of the same object. Thus, the first environment to finish its modifications deposits its version as the new version of the original object in the parent environment; the second environment to finish has to merge its version with the first environment’s version, creating a newer version; the third environment to finish will merge its version with this newer version, and so on.

Like the optimistic concurrency control (OCC) mechanism, NSE’s mechanism allows concurrent transactions (programmers in sibling environments in this case) to access private copies of the same object simultaneously. Before users can make their copies visible to other users (i.e., the write phase in the OCC mechanism), they have to reconcile (validate) the changes they made with the changes other users in sibling environments have concurrently made on the same objects. If conflicts are discovered, rather than rolling back transactions, the users of conflicting updates have to merge their changes, producing a new version of the object.
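The acquire, reconcile, and deposit cycle can be made concrete with a small Python sketch. It is a toy model written for this discussion, not NSE's interface: the `Environment` and `Copy` classes, the version counter, and the caller-supplied `merge` function are all assumptions, and an object's content is modeled as a frozen set of changes so that a merge is simply a set union.

```python
class Environment:
    """A parent environment holding the committed versions of its objects."""

    def __init__(self, objects):
        self.objects = dict(objects)  # name -> (version number, content)

    def acquire(self, name):
        """Hand out a private copy that remembers the version it was taken from."""
        version, content = self.objects[name]
        return Copy(self, name, version, content)


class Copy:
    """A private copy of one object inside a child environment."""

    def __init__(self, parent, name, base_version, content):
        self.parent, self.name = parent, name
        self.base_version, self.content = base_version, content

    def reconcile(self, merge):
        """Validation step: merge instead of rolling back if the parent moved on."""
        version, committed = self.parent.objects[self.name]
        if version != self.base_version:
            self.content = merge(committed, self.content)
            self.base_version = version

    def deposit(self):
        """Write step: allowed only against the version this copy is based on."""
        version, _ = self.parent.objects[self.name]
        if version != self.base_version:
            raise RuntimeError("reconcile before deposit")
        self.parent.objects[self.name] = (version + 1, self.content)


parent = Environment({"p5": (1, frozenset({"base"}))})
john, mary = parent.acquire("p5"), parent.acquire("p5")
john.content = john.content | {"john's change"}
mary.content = mary.content | {"mary's change"}
john.deposit()                                      # first to finish: plain deposit
mary.reconcile(lambda theirs, mine: theirs | mine)  # second must merge first
mary.deposit()
```

The design point the sketch tries to capture is that a failed validation leads to a merge rather than an abort, which is how NSE departs from classical OCC.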

To illustrate this mechanism, assume the modules of the project depicted in Figure 1 represent the following: Module A comprises the user interface part of the project, module B is the kernel of the project, module C is the database manager, and module D is a library module. The development happens in three layers as shown in Figure 17. At the top layer, the environment PROJ_ENV represents the released project. All the objects of the project belong to this environment. At the second level, two environments coexist: one to develop the user interface, FRONT_END, and the other to develop the kernel, BACK_END. FRONT_END acquires copies of modules A and C; BACK_END acquires copies of B and C. John works on modifying the front end in his private environment, JOHN, while Mary works on developing the back end in her private environment.

Figure 17. Layered development in NSE. [Diagram: the environments PROJ_ENV, FRONT_END, BACK_END, JOHN, and MARY, each shown in two successive states and linked by acquire and deposit operations.]

John acquires module A in order to modify it. He creates a new version of p1 but then finds out that in order to modify p2, he needs to modify p5. Consequently, he acquires p5 into his environment and creates new versions of p2 and p5. Finally, he deposits all his changes to FRONT_END, creating new versions of modules A and C as shown in Figure 17. Concurrently, Mary acquires module B, modifies it, and deposits the changes to BACK_END. Mary can then test her code in BACK_END.

Suppose before Mary starts testing her code, John finishes testing his code and deposits all of his changes to the top-level environment, creating a new version of the project and making all of his changes visible to everybody. Before testing her code, Mary can check to see if any of the code that is relevant to her (modules B and C) has been changed by another programmer. NSE provides a command, resync, to do that automatically on demand. Resync will inform Mary that John has changed procedure p5. At this point, Mary can decide to acquire John’s new version and proceed to test her code.

In another scenario, the exact same series of actions as above occurs except that Mary discovers she needs to modify procedure p5 in module C, so she acquires it. In this case, after the resync command informs her that John has already deposited a new version of p5, Mary has to merge her new version with John’s. This is done by invoking a special editor that facilitates the merging process. Merging produces a new version of p5, which Mary can use to test her code. Finally, she can deposit all of her code, creating a new version of the whole project.

7.3.2 Backout and Commit Spheres

Both Infuse and NSE implicitly use the concept of nested transactions; they also enforce a synchronous interaction between a transaction and its child subtransactions, in which control flows from the parent transaction to the child subtransaction. Subtransactions can access only the data items the parent transaction can access, and they commit their changes only to their parent transaction. A more general model is needed to support a higher level of coordination among transactions. Walter [1984] observed that there are three aspects that define the relationship between a parent transaction and a child subtransaction: the interface aspect, the dependency aspect, and the synchronization aspect.

The interface between a parent transaction and a child subtransaction can either be single-request, that is, the parent requests a query from the child and waits until the child returns the result, or conversational, that is, the control changes between the parent that issues a sequence of requests and the child that answers these requests. A conversational interface, in which values are passed back and forth between the parent and the child, necessitates grouping the parent and child transactions in the same rollback domain, because if the child transaction is aborted (for any reason) in the middle of a conversation, not only does the system have to roll back the changes of the child transaction, but the parent transaction has to be rolled back to the point before the conversation began. In this case, the two transactions are said to belong to the same backout sphere. A backout sphere includes all transactions involved in a chain of conversations and requires backing out (rollback) of all transactions in the sphere if any one of them is backed out. A single-request interface, which is what the traditional nested transaction model supports, does not require rolling back the parent, because the computation of the child transaction does not affect the computation in progress in the parent transaction.

The dependency aspect concerns a child transaction’s ability to commit its updates independently of when its parent transaction commits. If a child is independent of its parent, it is said to be in a different commit sphere. Any transaction within a commit sphere can commit only if all other transactions in its sphere also commit. If a child in a different commit sphere than its parent commits, then the parent must either remember the child committed (e.g., by writing the committed values in its variables) or be able to execute the child transaction again if the parent is restarted.

The synchronization aspect concerns the ability to support the concurrent execution of the parent transaction and its subtransactions. Such concurrency can occur if the child subtransaction is called from the parent transaction asynchronously (i.e., the parent continues its execution and fetches the results of the child subtransaction at a later time). In this case, both the parent and the child may attempt to access the same data items at the same time, thus the need for synchronization. If the child is called synchronously (i.e., the parent waits until the child terminates), it can safely access the data items locked by its parent.

Given these three aspects, Walter [1984] presented a nested transaction model in which each subtransaction has three attributes that must be defined when the transaction is created. The first attribute, reflecting the interface criterion, can be set to either COMMIT or NOCOMMIT. The dependency attribute is set to either BACKOUT or NOBACKOUT, and the third attribute, reflecting the synchronization mode, is set to either SYNC or NOSYNC. The eight combinations of these attributes define levels of coordination between a transaction and its subtransactions. For example, a subtransaction created with the attributes COMMIT, BACKOUT, SYNC is independent of its parent since it possesses its own backout sphere and its own commit sphere and it can access data items not locked by its parent.

Walter claims it is possible to define all other nested transaction models in his model. Moss’ model, for example, is defined as creating subtransactions with attributes set to BACKOUT, NOCOMMIT, SYNC. Beeri et al.’s [1988] multilevel transaction model described in Section 6.1.3 supports the combination COMMIT, BACKOUT, NOSYNC. No synchronization is needed between a transaction and its subtransactions because they operate at two different levels of abstraction (e.g., if locking is used, different levels would use different types of locks).
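The attribute space can be enumerated with a short Python sketch. The `Attributes` type, its field names, and the ordering used by `label` are hypothetical conveniences for illustration and are not taken from Walter's paper; only the attribute values themselves and the two named combinations come from the text above.

```python
from itertools import product
from typing import NamedTuple


class Attributes(NamedTuple):
    """Creation-time attributes of a subtransaction (illustrative encoding)."""
    commit: bool   # True = COMMIT (own commit sphere), False = NOCOMMIT
    backout: bool  # True = BACKOUT (own backout sphere), False = NOBACKOUT
    sync: bool     # True = SYNC, False = NOSYNC

    def label(self):
        return ", ".join([
            "COMMIT" if self.commit else "NOCOMMIT",
            "BACKOUT" if self.backout else "NOBACKOUT",
            "SYNC" if self.sync else "NOSYNC",
        ])


# The eight levels of coordination are all combinations of the three flags.
ALL_LEVELS = [Attributes(*flags) for flags in product([True, False], repeat=3)]

# The two existing models named in the text, expressed in this encoding.
MOSS = Attributes(commit=False, backout=True, sync=True)
BEERI_ET_AL = Attributes(commit=True, backout=True, sync=False)

for level in ALL_LEVELS:
    print(level.label())
```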

The models we described in this section support limited cooperation among teams of developers mainly by coordinating their access to shared data. Both NSE and Cosmos allow two or more environments to acquire copies of the same object, modify them, and merge them. NSE also provides programmers with the ability to set notification requests on particular objects so they are informed when other programmers acquire or reconcile these objects. Infuse provides a notion of workspaces that cuts across the hierarchy to permit grouping of an arbitrary set of experimental databases. This “cutting across” enables users to look at the partial results of other users’ work under certain circumstances for the purpose of early detection of inconsistencies. None of the models described so far, however, supports all the requirements of synergistic cooperation among teams of developers.

8. SUPPORTING SYNERGISTIC COOPERATION

In Section 7 we addressed the issue of coordinating the access of a group of developers to the shared project database. Although this coordination is often all that is needed for small groups of developers, it is not sufficient when a large number of developers works on a large-scale design project [Perry and Kaiser 1991]. The developers are often subdivided into several groups, each responsible for a part of the design task. Members of each group usually cooperate to complete their part. In this case, there is a need to support cooperation among members of the same group, as well as coordination of the efforts of multiple groups. The mechanisms described in Section 7 address the coordination issue, but most of them do not support any form of cooperation.

Supporting synergistic cooperation necessitates relying on sharing the collective knowledge of designers. For example, in an SDE it is common to have several programmers cooperate on developing the same subsystem. Each programmer becomes an “expert” in a particular part of the subsystem, and it is only through the sharing of the expertise of all the programmers that the subsystem is integrated and completed. In such a cooperative design environment, the probability of conflicting accesses to shared data is relatively high because it is often the case that several users, with overlapping expertise, are working on related tasks concurrently. Note that in an SDE there is overlapping access to executable and status information even if not to source code components.

Many of the conflicts that occur in design environments are not serious in the sense that they can be tolerated by users. In particular, designers working closely together often need to exchange incomplete designs, knowing they might change shortly, in order to coordinate the development of various parts of the design. A DBMS supporting such an environment should not obstruct this kind of cooperation by disallowing concurrent access to shared objects or nonserializable interaction.

Instead, the concept of database consistency preservation needs to be refined along the lines of Section 7 to allow nonserializable cooperative interaction. Such a refinement can be based on four observations [Bancilhon et al. 1985]: (1) design efforts are usually partitioned into separate projects, where each project is developed by a team of designers; (2) available workstations provide multiple windows in which multiple tasks can be executed concurrently by the same designer; (3) projects are divided into subtasks where a group of designers, each working on a subtask, has a great need to share data among themselves; and (4) in complex design projects, some subtasks are contracted to other design groups (subcontractors) that have limited access to the project’s main database.

In this section, we present two models that aim at defining the underlying primitives needed for the implementation of cooperative concurrency control mechanisms. We then describe four mechanisms, two from the CAD/CAM community and two from the SDE domain, that use combinations of these primitives to implement cooperative concurrency control policies. It is worthwhile to note that much of the work described in this section is very recent, and some of it is preliminary. We believe the models presented here provide a good sample of the research efforts under way in the area of cooperative transaction models.

8.1 Cooperation Primitives

In order to address the four observations listed above, there is a need to introduce two new primitives that can be used by mechanisms supporting cooperation. The first primitive is notification (mentioned in Section 7), which enables developers to monitor accesses to particular objects in the database. The second is the concept of a group of cooperating developers. The members of a group usually work on the same task (or at least related tasks) and thus need to cooperate among themselves much more than with members of other groups.

8.1.1 Interactive Notification

One approach to maintaining consistency, while still allowing some kind of cooperation, is to support notification and interactive conflict resolution rather than enforce serialization [Yeh et al. 1987]. To do this, the Gordion database system provides a notification primitive that can be used in conjunction with other primitives (such as different lock modes) to implement cooperative concurrency control policies [Ege and Ellis 1987]. Notification alerts users about “interesting” events such as an attempt to lock an object that has already been locked in an exclusive mode.

Two policies that use notification in conjunction with nonexclusive locks and versions were implemented in the Gordion system: immediate notification and delayed notification [Yeh et al. 1987]. Immediate notification alerts the user of any conflict (attempt to access an object that has an exclusive lock on it or from which a new version is being created by another user) as soon as the conflict occurs. Delayed notification alerts the user of all the conflicts that have occurred only when one of the conflicting transactions attempts to commit. Conflicts are resolved by instigating a “phone call” between the two parties with the assumption that they can interact (hence the name interactive notification) to resolve the conflict.
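The contrast between the two policies can be sketched in Python as follows. This is not Gordion's interface: the `NotifyingStore` class and its methods are hypothetical, only exclusive locks are modeled, and an alert is reduced to a recorded tuple standing in for the “phone call” between the parties.

```python
class NotifyingStore:
    """Toy object store that reports conflicts instead of blocking accessors."""

    def __init__(self, policy):
        if policy not in ("immediate", "delayed"):
            raise ValueError("policy must be 'immediate' or 'delayed'")
        self.policy = policy
        self.exclusive = {}  # object -> user holding an exclusive lock on it
        self.pending = []    # conflicts held back under the delayed policy
        self.alerts = []     # conflicts already reported to the parties

    def lock(self, user, obj):
        self.exclusive[obj] = user

    def access(self, user, obj):
        """Access is never refused; a conflict is only recorded or reported."""
        holder = self.exclusive.get(obj)
        if holder is not None and holder != user:
            conflict = (user, holder, obj)
            if self.policy == "immediate":
                self.alerts.append(conflict)   # report as soon as it occurs
            else:
                self.pending.append(conflict)  # report at commit time

    def commit(self, user):
        """Under the delayed policy, a commit attempt releases the held alerts."""
        mine = [c for c in self.pending if user in c[:2]]
        self.alerts.extend(mine)
        self.pending = [c for c in self.pending if c not in mine]


store = NotifyingStore("delayed")
store.lock("bob", "A")
store.access("mary", "A")   # conflict recorded, nobody alerted yet
alerts_before_commit = list(store.alerts)
store.commit("mary")        # both parties now learn of the conflict
```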


These policies incorporate human beings as part of the conflict resolution algorithm. This, on the one hand, enhances concurrency in advanced applications, where many of the tasks are interactive. On the other hand, it can also degrade consistency because human beings might not really resolve conflicts, which could result in inconsistent data. Like the sagas model, this model burdens the user with knowledge about the semantics of applications. This points to the need for incorporating some intelligent tools, similar to NSE’s merge tool, to help the user resolve conflicts.

8.1.2 The Group Paradigm

Since developers of a large project often work in small teams, there is a need to define formally the kinds of interactions that can happen among members of the same team as opposed to interactions between teams. El Abbadi and Toueg [1989] defined the concept of a group as a set of transactions that, when executed, transforms the database from one consistent state to another. They presented the group paradigm to deal with consistency of replicated data in an unreliable distributed system. They hierarchically divide the problem of achieving serializability into two simpler ones: a local policy that ensures a total ordering of all transactions within a group and a global policy that ensures correct serialization of all groups.

Groups, like nested transactions, are an aggregation of a set of transactions. There are significant differences, however, between groups and nested transactions. A nested transaction is designed a priori in a structured manner as a single entity that may invoke subtransactions, which may themselves invoke other subtransactions. Groups do not have any a priori assigned structure and do not have predetermined precedence ordering imposed on the execution of transactions within a group. Another difference is that the same concurrency control policy is used to ensure synchronization among nested transactions at the root level and within each nested transaction. Groups, however, could use different local and global policies (e.g., an optimistic local policy and a 2PL global policy).

The group paradigm was introduced to model intersite consistency in a distributed database system. It can also be used to model teams of developers, where each team is modeled as a group with a local concurrency control policy that supports synergistic cooperation. A global policy can then be implemented to coordinate the efforts of the various groups. The local policies and the global policy have to be compatible in the sense that they do not contradict each other. El Abbadi and Toueg do not sketch the compatibility requirements between global and local policies.

Dowson and Nejmeh [1989] applied the group concept to model teams of programmers. They introduced the notion of visibility domains, which model groups of programmers executing nested transactions on immutable objects. A visibility domain is a set of users that can share the same data items. Each transaction has a particular visibility domain associated with it. Any member of a visibility domain of a transaction may start a subtransaction on the copy of data that belongs to the transaction. The only criterion for data consistency is that the visibility domain of a transaction be a subset of the visibility domain of its parent.
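The subset criterion is simple enough to state as a check. The following Python sketch assumes a hypothetical representation in which each transaction maps to its parent's name (or None for a root) and its visibility domain as a set of user names; it is an illustration of the criterion, not Dowson and Nejmeh's system.

```python
def domains_consistent(transactions):
    """Check that every transaction's visibility domain is a subset of its parent's.

    transactions -- dict: name -> (parent name or None, set of users)
    """
    for parent, users in transactions.values():
        if parent is not None and not users <= transactions[parent][1]:
            return False
    return True


project = {
    "T_root": (None, {"bob", "mary", "john"}),
    "T_front": ("T_root", {"john"}),
    "T_back": ("T_root", {"bob", "mary"}),
}
ok = domains_consistent(project)

# Adding a user who is outside the parent's domain violates the criterion.
project["T_front"] = ("T_root", {"john", "alice"})
violated = not domains_consistent(project)
```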

8.2 Cooperating Transactions

Variations of the primitives defined above have been the basis for several concurrency control mechanisms that provide various levels of cooperation. In this section we present four mechanisms that support some form of synergistic cooperation. Two of the mechanisms were designed for CAD environments; the other two were designed for SDEs. The notion of cooperation in SDEs is similar to that in CAD. Differences, however, arise from the differences in the structure of projects in the two domains. It seems that CAD projects are more strictly organized than software development projects, with a more stringent division of tasks, and with less sharing among tasks.

In both cases, designers working on the same subtask might need unconstrained cooperation, whereas two designers working on different tasks of the same project might need more constrained cooperation. CAD designers working on two different projects (although within the same division, for example) might be content with traditional transaction mechanisms that enforce isolation. In software development, programmers working on different projects might still need shared access to libraries and thus might need more cooperation than is provided by traditional mechanisms, even if their tasks are unrelated.

One approach to providing such support is to divide users (designers or programmers) into groups. Each group is then provided with a range of lock modes that allows various levels of isolation and cooperation among multiple users in the same group and between different groups. Specific policies that allow cooperation can then be implemented by the environment using the knowledge about user groups and lock modes. In this section, we describe four mechanisms that are based on the group concept. All four mechanisms avoid using blocking to synchronize transactions, thus eliminating the problem of deadlock.

8.2.1 Group-Oriented CAD Transactions

One approach, called the group-oriented model, extends the conversational transactions model described in Section 7 [Klahold et al. 1985]. Unlike the conversational transactions model, the group-oriented model does not use long-lived locks on objects in the public database. The conversational transactions model sets long-lived locks on objects that are checked out from the public database until they are checked back in to the public database.

The group-oriented model categorizes transactions into group transactions (GT) and user transactions (UT). Any UT is a subtransaction of a GT. The model also provides primitives to define groups of users with the intention of assigning each GT a user group. Each user group develops a part of the project in a group database. A GT reserves objects from the public database into the group database of the user group it was assigned. Within a group database, individual designers create their own user database and invoke UTs to reserve objects from the group database to their user database.

In the group-oriented model, user groups are isolated from each other. One user group cannot see the work of another user group until the work is deposited in the public database. Group transactions are thus serializable. Within a group transaction, several user transactions can run concurrently. These transactions are serializable unless users intervene to make them cooperate in a nonserializable schedule. The basic mechanism provided for relaxing serializability is a version concept that allows parallel development (branching) and notification. Versions are derived, deleted, and modified explicitly by a designer only after being locked in any one of a range of lock modes.

The model supports five lock modes on a version of an object: (1) read only, which makes a version available only for reading; (2) read-derive, which allows multiple users either to read the same version or derive a new version from it; (3) shared derivation, which allows the owner to both read the version and derive a new version from it, while allowing parallel reads of the same version and derivation of different new versions by other users; (4) exclusive derivation, which allows the owner of the lock to read a version of an object and derive a new version and allows only parallel reads of the original version; and (5) exclusive lock, which allows the owner to read, modify, and derive a version and allows no parallel operations on the locked version.

Using these lock modes, several designers can cooperate on developing the same design object. The exclusive lock modes allow for isolation of development efforts (as in traditional transactions), if that is what is needed. To guarantee consistency of the database, designers are only allowed to access objects as part of a transaction. Each transaction in the group-oriented model is two-phased, consisting of an acquire phase and a release phase. Locks can only be strengthened (converted into a more exclusive mode) in the acquire phase and weakened (converted into a more flexible lock) in the release phase. If a user requests a lock on a particular object and the object is already locked with an incompatible lock, the request is rejected and the initiator of the requesting transaction is informed of the rejection. This avoids the problem of deadlock, which is caused by blocking transactions that request unavailable resources. The initiator of the transaction is notified later when the object he or she requested becomes available for locking.
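The reject-and-notify behavior can be sketched in Python. The two permission tables are one reading of the prose description of the five lock modes, and the grant rule (a request succeeds only if everything it lets its owner do is permitted as a parallel operation by every lock already held by others) is a simplifying assumption; the class and method names are hypothetical, and the two-phase acquire/release discipline is not modeled.

```python
from enum import Enum


class Mode(Enum):
    READ_ONLY = 1
    READ_DERIVE = 2
    SHARED_DERIVATION = 3
    EXCLUSIVE_DERIVATION = 4
    EXCLUSIVE = 5


# What a lock lets its owner do (assumed reading of the five modes).
OWNER_OPS = {
    Mode.READ_ONLY: {"read"},
    Mode.READ_DERIVE: {"read", "derive"},
    Mode.SHARED_DERIVATION: {"read", "derive"},
    Mode.EXCLUSIVE_DERIVATION: {"read", "derive"},
    Mode.EXCLUSIVE: {"read", "modify", "derive"},
}
# What a held lock still lets other users do in parallel (same caveat).
PARALLEL_OPS = {
    Mode.READ_ONLY: {"read"},
    Mode.READ_DERIVE: {"read", "derive"},
    Mode.SHARED_DERIVATION: {"read", "derive"},
    Mode.EXCLUSIVE_DERIVATION: {"read"},
    Mode.EXCLUSIVE: set(),
}


class VersionLocks:
    """Locks on one version: incompatible requests are rejected, never blocked."""

    def __init__(self):
        self.held = {}     # user -> Mode
        self.waiting = []  # rejected (user, Mode) requests awaiting notification

    def _grantable(self, user, mode):
        return all(OWNER_OPS[mode] <= PARALLEL_OPS[held]
                   for other, held in self.held.items() if other != user)

    def request(self, user, mode):
        if self._grantable(user, mode):
            self.held[user] = mode
            return True
        self.waiting.append((user, mode))  # remember whom to notify later
        return False

    def release(self, user):
        """Drop a lock and return the users whose rejected request could now succeed."""
        self.held.pop(user, None)
        ready = [w for w in self.waiting if self._grantable(*w)]
        self.waiting = [w for w in self.waiting if w not in ready]
        return [u for u, _ in ready]
```

Because a rejected requester is never suspended, no waits-for cycle can form; the cost is that the user must retry after being notified.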

In addition to this flexible locking mechanism, the model provides a read operation that breaks any lock by allowing a user to read any version, knowing that it might be about to be changed. This operation provides the designer (more often a manager of a design effort) with the ability to observe the progress of development of a design object without affecting the designers doing the development.

8.2.2 Cooperating CAD Transactions

Like the group-oriented model, the cooperating CAD transactions model, introduced by Bancilhon et al. [1985], envisions a design workspace as consisting of a global database that contains a public database for each project and private databases for active designers’ transactions. Traditional two-phase locking is used to synchronize access to shared data among different projects in the database. Within the same project, however, each designer invokes a long transaction to complete a well-defined subtask for which he or she is responsible.

All the designers of a single project participate in one cooperating transaction, which is the set of all long transactions initiated by those designers. All the short-duration transactions invoked by the designers within the same cooperating transaction are serialized as if they were invoked by one designer. Thus, if a designer invokes a short transaction (within his or her long transaction) that conflicts with another designer's short transaction, one of them has to wait only for the duration of the short transaction. Each cooperating transaction encapsulates a complete design task. Some of the subtasks within a design task can be "subcontracted" to another group instead of being implemented by members of the project. In this case, a special cooperating transaction called a client/subcontractor transaction is invoked for that purpose. Each client/subcontractor transaction can invoke other client/subcontractor transactions, leading to a hierarchy of such transactions spawned by a single client (designer). This notion is similar to Infuse's hierarchy of experimental databases, discussed in Section 7.2.1.

A cooperating transaction is thus a nested transaction that preserves some consistency constraints defined as part of the transaction. Each subtransaction (itself a cooperating transaction) in turn preserves some integrity constraints (not necessarily the same ones as its parent transaction). The only requirement here is that subtransactions have weaker constraints than their ancestors. Thus, the integrity constraints defined at the top level of a cooperating transaction imply all the constraints defined at lower levels. At the lowest level of the nested transaction are the database operations, which are atomic sequences of physical instructions such as reading and writing of a single data item.

To replace the conventional concept of a serializable schedule for a nested transaction, Bancilhon et al. [1985] define the notion of an execution of a cooperating transaction to be a total order of all the operations invoked by the subtransactions of the cooperating transaction that is compatible with the partial orders imposed by the different levels of nested transactions. A protocol is a set of rules that restrict the set of admissible executions. Thus, if the set of rules is strict enough, they would allow only serializable executions. The set of rules can, however, allow nonserializable, and even incorrect, executions.
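The two definitions above can be made concrete with a small sketch. It is not the formalism of Bancilhon et al.; the representation of operations as strings of the form `owner:operation`, and the function names `is_execution`, `admissible`, and `serial_only`, are assumptions chosen for illustration. An execution is checked as a total order that respects every given partial order, and a protocol is modelled as a list of predicates that further restrict which executions are admitted.

```python
def is_execution(total_order, partial_orders):
    """Return True if total_order is compatible with every partial order.

    total_order: list of operation names, each appearing exactly once.
    partial_orders: iterable of sets of (before, after) pairs, one set per
    level of the nested transaction.
    """
    position = {op: i for i, op in enumerate(total_order)}
    if len(position) != len(total_order):
        return False  # an operation appears twice
    for order in partial_orders:
        for before, after in order:
            if before not in position or after not in position:
                return False
            if position[before] >= position[after]:
                return False
    return True


def admissible(total_order, partial_orders, protocol):
    """A protocol is modelled here as a list of predicates over executions."""
    return is_execution(total_order, partial_orders) and all(
        rule(total_order) for rule in protocol
    )


def serial_only(total_order):
    """Example of a strict rule: each owner's operations must be contiguous."""
    seen, previous = set(), None
    for op in total_order:
        owner = op.split(":")[0]
        if owner != previous:
            if owner in seen:
                return False  # this owner was interrupted by another one
            seen.add(owner)
            previous = owner
    return True
```

With an empty protocol every execution is admitted, including interleavings that are not serializable. Adding `serial_only` admits only serial executions, which are trivially serializable; weaker rules would sit between these two extremes.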

8.2.3 Transaction Groups

In order to allow members of the same group to cooperate and to monitor changes in the database, there is a need to provide concurrency control mechanisms with a range of lock modes of varying exclusiveness. The transaction groups model proposed for the ObServer system replaces classical locks with (lock mode, communication mode) pairs to support the implementation of a nested framework for cooperating transactions [Zdonik 1989; Skarra and Zdonik 1989]. A transaction group (TG) is defined as a process that controls the access of a set of cooperating transactions (members of the transaction group) to objects from the object server. Since a TG can include other TGs, a tree of TGs is composed.

Within each TG, member transactions and subgroups are synchronized according to an input protocol that defines some semantic correctness criteria appropriate for the application. The criteria are specified by semantic patterns and enforced by a recognizer and a conflict detector. The recognizer ensures that a lock request from a member transaction matches an element in the set of locks that the group may grant its members. The conflict detector ensures that a request to lock an object in a certain mode does not conflict with the locks already held on the object.

If a transaction group member requests an object that is not currently locked by the group, the group has to request a lock on the object from its parent. The input protocol of the parent group, which controls access to objects, might be different from that of the child group. Therefore, the child group might have to transform its requested lock into a different lock mode accepted by the parent's input protocol. The transformation is carried out by an output protocol, which consults a lock translation table to determine how to transform a lock request into one that is acceptable by the parent group.

The lock modes provided by ObServer indicate whether the transaction intends to read or write the object and whether it permits reading while another transaction writes, writing while other transactions read, and multiple writers of the same object. The communication modes specify whether the transaction wants to be notified if another transaction needs the object or if another transaction has updated the object. Transaction groups and the associated locking mechanism provide suitable low-level primitives for implementing a variety of concurrency control policies.
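A minimal sketch of how a recognizer, a conflict detector, and a lock translation table might fit together is given below. It is not ObServer's interface: the class names `Lock` and `TransactionGroup`, the mode strings, and the representation of compatibility as a set of unordered mode pairs held by the group are all assumptions made for this illustration. The sketch is also simplified in that it does not upgrade the lock held at the parent when a later member asks for a stronger mode.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lock:
    mode: str            # hypothetical lock mode, e.g. "read" or "write"
    communication: str   # hypothetical communication mode


@dataclass
class TransactionGroup:
    name: str
    grantable: set     # locks the input protocol recognizes
    compatible: set    # unordered pairs of modes that may coexist
    translation: dict = field(default_factory=dict)  # output protocol table
    parent: "TransactionGroup | None" = None
    held: dict = field(default_factory=dict)         # object -> [(member, Lock)]

    def recognizes(self, lock):
        """Recognizer: is the request one this group may grant?"""
        return lock in self.grantable

    def conflicts(self, obj, member, lock):
        """Conflict detector: does the request clash with locks of others?"""
        return any(
            frozenset((lock.mode, other.mode)) not in self.compatible
            for holder, other in self.held.get(obj, [])
            if holder != member
        )

    def request(self, member, obj, lock):
        if not self.recognizes(lock) or self.conflicts(obj, member, lock):
            return False
        if obj not in self.held and self.parent is not None:
            # The group does not hold the object yet: ask the parent, after the
            # output protocol translates the lock into one the parent accepts.
            outer = self.translation.get(lock, lock)
            if not self.parent.request(self.name, obj, outer):
                return False
        self.held.setdefault(obj, []).append((member, lock))
        return True


# An isolating parent group and a cooperative child group (assumed policies).
isolated = {frozenset({"read"})}
cooperative = isolated | {frozenset({"read", "write"}), frozenset({"write"})}

tg1 = TransactionGroup(
    "TG1",
    grantable={Lock("read", "none"), Lock("write", "none")},
    compatible=isolated,
)
tg2 = TransactionGroup(
    "TG2",
    grantable={Lock("read", "notify-on-update"), Lock("write", "notify-on-update")},
    compatible=cooperative,
    translation={
        Lock("read", "notify-on-update"): Lock("read", "none"),
        Lock("write", "notify-on-update"): Lock("write", "none"),
    },
    parent=tg1,
)
```

In this arrangement two members of `tg2` can both hold write locks on the same object, while a member of `tg1` outside `tg2` is refused even a read lock on it, because `tg1` sees a single write lock held on behalf of `tg2`.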

To illustrate, consider the following example. Mary and John are assigned the task of updating modules A and B, which are strongly related (i.e., procedures in the modules call each other, and type dependencies exist between the two modules), while Bob is assigned responsibility for updating the documentation of the project. Mary and John need to cooperate while updating the modules, whereas Bob only needs to access the final result of the modification of both modules in order to update the documentation. Two transaction groups are defined, TG1 and TG2. TG1 has T_Bob and TG2 as its members, and TG2 has T_John and T_Mary as its members. The output protocol of TG2 states that changes made by the transactions within TG2 are committed to TG1 only when all the transactions of TG2 have either committed or aborted. The input protocol of TG2 accepts lock modes that allow T_Mary and T_John to cooperate (e.g., see partial results of their updates to the modules) while isolation is maintained within TG1 (to prevent T_Bob from accessing the partial results of the transactions in TG2). This arrangement is depicted in Figure 18.
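The output protocol in this example can be sketched as a small piece of bookkeeping. The class `GroupWorkspace` and its methods are hypothetical, and the treatment of aborted members (their writes are simply skipped when the group publishes) is an assumption of this sketch rather than something the example specifies.

```python
class GroupWorkspace:
    """Hypothetical bookkeeping for one transaction group such as TG2."""

    def __init__(self, members):
        self.status = {m: "active" for m in members}
        self.log = []   # (member, object, value) in the order written

    def write(self, member, obj, value):
        if self.status[member] != "active":
            raise RuntimeError(f"{member} has already terminated")
        self.log.append((member, obj, value))

    def finish(self, member, outcome, parent_store):
        """Record 'committed' or 'aborted'; publish once every member is done."""
        if outcome not in ("committed", "aborted"):
            raise ValueError(outcome)
        self.status[member] = outcome
        if "active" in self.status.values():
            return False          # nothing becomes visible in the parent yet
        for writer, obj, value in self.log:
            if self.status[writer] == "committed":
                parent_store[obj] = value
        return True
```

Until the last member terminates, `parent_store` is untouched, so a transaction such as T_Bob that works only against the parent group never observes partial results.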

Figure 18. Transaction groups. [Diagram: TG1 contains T_Bob and the cooperative group TG2, which in turn contains T_John and T_Mary.]

8.2.4 Participant Transactions

The transaction groups mechanism defines groups in terms of their access to database objects in the context of a nested transaction system. Another approach is to define a group of transactions as participants in a specific domain¹ [Kaiser 1990]. Participant transactions in a domain need not appear to have been performed in some serial order with respect to each other. The set of transactions that is not a participant in a domain is considered an observer of the domain. The set of observer transactions of a domain must be serialized with respect to the domain. A particular transaction may be a participant in some domains and an observer of others whose transactions access the same objects.

A user can initiate a transaction that nests subtransactions to carry out subtasks or to consider alternatives. Each subtransaction may be part of an implicit domain, with itself as the sole participant. Alternatively, one or more explicit domains may be created for subsets of the subtransactions. In the case of an implicit domain, there is no requirement for serializability among the subtransactions. Each subtransaction must, however, appear atomic with respect to any participants, other than the parent, in the parent transaction's domain.

The domain in which a transaction participates would typically be the set of transactions associated with the members of a cooperating group of users working toward a common goal. Unlike transaction groups, however, there is no implication that all the transactions in the domain commit together or even that all of them commit (some may abort). Thus, it is misleading to think of the domain as a top-level transaction, with each user's transaction as a subtransaction, although in practice this is likely to be a frequent case. The transaction groups mechanism described above is thus a special case of participant transactions.

¹ The word domain means different things in participant transactions, visibility domains, and domain relative addressing.

Each transaction is associated with zero or one particular domain at the time it is initiated. A transaction that does not participate in any domain is the same as a classical (but interactive) transaction. Such a transaction must be serializable with respect to all other transactions in the system. A transaction is placed in a domain in order to share partial results with other transactions in the same domain nonserializably, but it must be serializable with respect to all transactions not in the domain.

To illustrate, say a domain X is defined to respond to a particular modification request, and programmers Mary and John start transactions T_Mary and T_John that participate in X. Assume an access operation is either a read or a write operation. The schedule shown in Figure 19 is not serializable according to any of the conventional concurrency control mechanisms. T_Mary reads the updates T_John made to module B that are written but are not yet committed by T_John, modifies parts of module B, then commits. T_John continues to modify modules A and B after T_Mary has committed. Since T_Mary and T_John participate in the same domain X, the schedule is legal according to the participant transactions mechanism.

Figure 19. Participation schedule. [Two-column timeline of transactions in domain X. T_John: begin(X), access(A), read(B), write(B), access(A), read(B), write(B), read(B), write(B), commit(A,B). T_Mary: begin(X), access(C), read(B), write(B), commit(B,C). The operations of T_Mary are interleaved with those of T_John, and T_Mary commits before T_John.]

Now suppose Bob starts a transaction T_Bob that is an observer of domain X. Assume the sequence of events shown in Figure 20 happens. Bob first modifies module C. This by itself would be legal, since T_Bob thus far could be serialized before T_John (but not after). But then T_Bob attempts to read module B, which has been modified and committed by T_Mary. This would be illegal even though T_Mary was committed. T_Mary cannot be serialized before T_Bob, and thus before T_John, because T_Mary reads the uncommitted changes to module B written by T_John. In fact, T_Mary cannot be serialized either before or after T_John. This would not be a problem if it were not necessary to serialize T_Mary with any transactions outside the domain. Mary's update to module B would be irrelevant if John committed his final update to module B before any transactions outside the domain accessed module B. Thus, the serializability of transactions within a participation domain need be enforced only with respect to what is actually observed by the users who are not participants in the domain.
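The reasoning in this example can be approximated by a conflict-graph test in which all participants of a domain are collapsed into a single node. This is a simplified reading of the rule that observers must be serialized with respect to the domain, not the mechanism of Kaiser [1990]; in particular it ignores commit points. The function names and the two operation sequences are hypothetical: the sequences are chosen to be consistent with the prose above and are not transcriptions of Figures 19 and 20.

```python
from itertools import combinations


def conflict_edges(schedule, domain_of=None):
    """Build precedence edges between transactions from a schedule.

    schedule: list of (transaction, "read" | "write", object) in time order.
    domain_of: optional map from a transaction to its domain; participants
    of the same domain are collapsed into one node named after the domain.
    """
    domain_of = domain_of or {}
    edges = set()
    # combinations() keeps input order, so the first operation is the earlier.
    for (t1, op1, obj1), (t2, op2, obj2) in combinations(schedule, 2):
        if obj1 != obj2 or "write" not in (op1, op2):
            continue  # operations on different objects or two reads
        a, b = domain_of.get(t1, t1), domain_of.get(t2, t2)
        if a != b:
            edges.add((a, b))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the precedence graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" | "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)


domain = {"T_John": "X", "T_Mary": "X"}

participants_only = [
    ("T_John", "write", "B"),
    ("T_Mary", "read", "B"),    # reads John's uncommitted update
    ("T_Mary", "write", "B"),
    ("T_John", "read", "B"),    # John continues after Mary has committed
    ("T_John", "write", "B"),
]

with_observer = [
    ("T_Bob", "write", "C"),    # Bob modifies C first
    ("T_John", "write", "B"),
    ("T_John", "read", "C"),
    ("T_John", "write", "C"),
    ("T_Mary", "read", "B"),
    ("T_Mary", "write", "B"),
    ("T_Bob", "read", "B"),     # Bob then reads what Mary committed
]
```

Without the domain map, `participants_only` yields a cycle between T_John and T_Mary, so it is not conflict serializable; with the map, both collapse into X and no edge remains. For `with_observer`, the prefix that stops before Bob's read of B is acyclic, while the full sequence places T_Bob both before and after X.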

Figure 20. Participation conflict. [Three-column timeline. T_John: begin(X), modify(A), modify(B), read(C), write(C), modify(A), read(B), modify(B), modify(B). T_Mary: begin(X), access(D), read(B), write(B), commit(B,D). T_Bob: begin, modify(C), read(B).]

9. SUMMARY

This paper investigates concurrency control issues for a class of advanced database applications involving computer-supported cooperative work. We concentrate primarily on software development environments, although the requirements of CAD/CAM environments and office automation are similar. The differences between concurrency control requirements in these advanced applications and traditional data processing applications are discussed, and several new mechanisms and policies that address these differences are presented. Many of these have not yet been implemented in any system. This is due to two factors: Many are theoretical frameworks rather than practical schemes, and many of the more pragmatic schemes are so recent that there has not been a sufficient period of time to design and implement even prototype systems. Table 1 summarizes the discussions in Sections 6 to 8 by indicating whether or not each mechanism or policy addresses long transactions, user control and cooperation, and naming a system, if any, in which the ideas have been implemented.

Table 1. Advanced Database Systems and Their Concurrency Control Schemes

Mechanism                            System    Long Trans.  User Control  Cooperation
Altruistic locking                   N/A       Yes          No            Limited
Snapshot validation                  N/A       Yes          No            Limited
Order-preserving transactions        DASDBS    Yes          No            Limited
Entity-state transactions            N/A       Yes          No            Limited
Semantic atomicity                   N/A       Yes          No            Limited
Multilevel atomicity                 N/A       Yes          No            Yes
Sagas                                N/A       Yes          No            Limited
Conflict-based serializability       N/A       Yes          No            Limited
Split-transaction, join-transaction  N/A       Yes          Yes           Limited
Checkout/checkin                     RCS       No           No            Limited
Domain-relative addressing           Cosmos    Yes          No            Limited
Conversational transactions          System R  Limited      Limited       No
Multilevel coordination              Infuse    Yes          Yes           Limited
Copy/modify/merge                    NSE       Yes          Yes           Limited
Backout and commit spheres           N/A       Yes          Yes           Limited
Interactive notification             Gordion   No           Limited       Limited
Visibility domains                   N/A       Yes          Limited       Yes
Group-oriented CAD trans.            N/A       Yes          Limited       Yes
Cooperating CAD transactions         Orion     Yes          Limited       Yes
Transaction groups                   ObServer  Limited      Limited       Yes
Participant transactions             N/A       Yes          Limited       Yes

There are four other concerns that extended transaction models for advanced applications should address: (1) the interface to and requirements for the underlying DBMS, (2) the interface to the application tools and environment kernel, (3) the end-user interface, and (4) the environment/DBMS administrator's interface. In a software development environment, for example, there is a variety of tools that need to retrieve different kinds of objects from the database. A tool that builds the executable code of the whole project might access the most recent version of all objects of type code. Another tool, for document preparation, accesses all objects of type document or of type description in order to produce a user manual. There might be several relationships between documents and code (a document describing a module may have to be modified if the code of the module is changed, for instance). Users collaborating on a project invoke tools as they go along in their sessions, which might result in tools being executed concurrently. In such a situation, the transaction manager, which controls concurrent access to the database, must "understand" how to provide each user and each tool with access to a consistent set of objects upon which they operate, where consistency is defined according to the needs of the application.

A problem that remains unsolved is the lack of performance metrics by which to evaluate the proposed policies and mechanisms in terms of the efficiencies of both implementation and use. We have encountered only one empirical study [Yeh et al. 1987] that investigates the needs of developers working together on the same project and how different concurrency control schemes might affect the development process and the productivity of developers. It might be that some of the schemes that appear adequate theoretically will turn out to be inefficient or unproductive for the purposes of a particular application. But it is not clear how to define appropriate measures.

Another problem is that most of the notification schemes are limited to attaching the notification mechanism to the locking primitives and notifying human users, generally about the availability of resources. These schemes assume that only the human user is active and that the database is just a repository of passive objects. It is important, however, for the DBMS of an advanced application to be active in the sense that it be able to monitor the activities in the database and automatically perform some operations in response to changes made to the database (i.e., what the database community calls triggers [Stonebraker et al. 1988]). Notification must be expanded, perhaps in combination with triggers, to detect a wide variety of database conditions, to consider indirect as well as direct consequences of database updates, and to notify appropriate monitor and automation elements provided by the software development environment.
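The difference between a passive repository and an active one can be illustrated with a toy sketch. The class `ActiveStore`, its methods, and the object names are hypothetical and do not describe any system surveyed here. A trigger is a (condition, action) pair evaluated after every update, and because an action may itself update the store, indirect consequences of an update are propagated as well.

```python
class ActiveStore:
    """Toy object store that fires registered triggers after each update."""

    def __init__(self):
        self.objects = {}
        self.triggers = []   # (condition, action) pairs

    def on(self, condition, action):
        self.triggers.append((condition, action))

    def update(self, name, value):
        old = self.objects.get(name)
        self.objects[name] = value
        for condition, action in list(self.triggers):
            if condition(name, old, value):
                action(self, name, old, value)


notices = []   # stands in for messages sent to a monitor or automation tool
store = ActiveStore()


def code_changed(name, old, new):
    return name.endswith(".c") and old != new


def mark_document_stale(s, name, old, new):
    # Direct consequence: the document describing the module needs review.
    s.update(name[:-2] + ".doc.status", "stale")


def became_stale(name, old, new):
    return name.endswith(".status") and new == "stale"


def notify_monitor(s, name, old, new):
    # Indirect consequence, reached through the first trigger's update.
    notices.append(f"review {name}")


store.on(code_changed, mark_document_stale)
store.on(became_stale, notify_monitor)
store.update("parser.c", "v2")
```

After the single update to `parser.c`, the store also holds a stale marker for the corresponding document and `notices` contains one review request, without any user having asked for either.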

In addition to supporting automation, advanced applications like SDEs typically provide the user with capabilities to execute queries about the status of the development process. By definition, this requires access to the internal status of in-progress tasks or transactions, perhaps restricted to distinguished users such as managers. If a manager of a design project needs to determine the exact status of the project in terms of what has been completed (and how far the schedule has slipped!), the database management system must permit access to subparts of tasks that are still in progress and not yet committed. Some queries may, however, require a consistent state of the database in the sense that no other activities may be concurrently writing to the database. The brief lack of concurrency in this case may be deemed acceptable to fulfill managerial goals.

One key reason why traditional concurrency control mechanisms are too restrictive for advanced applications is that they do not make use of the available semantics. Many of the extended transaction models presented in this paper use some kind of information about transactions, such as their access patterns, and about users, such as which design group they belong to. Most, however, do not define or use the semantics of the task that a transaction is intended to perform or the semantics of database operations in terms of when an operation is applicable, what effects it has, and what implications it has for leading to future database operations. Consequently, these mechanisms capture only a subset of the interactions possible in advanced applications. One approach to solving this problem is to define a formal model that can characterize the whole range of interactions among transactions. This approach was pursued in developing the ACTA framework, which is capable of specifying both the structure and behavior of transactions, as well as concurrency and recovery properties [Chrysanthis and Ramamritham 1990].

Although all of the extended transaction models presented in this paper address at least one of the concurrency control requirements, which include supporting long-duration transactions, user control over transactions, and cooperation among multiple users, none of them supports all requirements. For example, some mechanisms that support long transactions, such as altruistic locking, do not support user control. Some mechanisms that support user control, such as optimistic coordination, do not directly support cooperation. All three requirements must be fulfilled for the class of advanced applications considered here, those involving computer-supported cooperative work.

ACKNOWLEDGMENTS

This work was supported in part by the National Science Foundation under grants CCR-8858029 and CCR-8802741, by grants from AT&T,² BNR, Citicorp, DEC, IBM,³ Siemens, Sun, and Xerox,⁴ by the Center for Advanced Technology, and by the Center for Telecommunications Research. We would like to thank Terrance Boult and Soumitra Sengupta for reviewing earlier versions of this paper. We would also like to thank the anonymous referees, the associate editor Hector Garcia-Molina, and the editor-in-chief Salvatore March, all of whom provided many detailed suggestions and corrections.

² AT&T is a registered trademark of American Telephone and Telegraph Company.
³ IBM is a registered trademark of International Business Machines Corporation.
⁴ Xerox is a registered trademark of Xerox Corporation.

REFERENCES

ADAMS, E. W., HONDA, M., AND MILLER, T. C. 1989. Object management in a CASE environment. In Proceedings of the 11th International Conference on Software Engineering (May), IEEE Computer Society Press, pp. 154-163.

BANCILHON, F., KIM, W., AND KORTH, H. 1985. A model of CAD transactions. In Proceedings of the 11th International Conference on Very Large Databases (August), Morgan Kaufmann, pp. 25-33.

BEERI, C., BERNSTEIN, P. A., AND GOODMAN, N. 1986. A model for concurrency in nested transaction systems. Tech. Rep. TR-86-03, The Wang Institute for Graduate Studies, Tyngsboro, Mass.

BEERI, C., BERNSTEIN, P. A., AND GOODMAN, N. 1989. A model for concurrency in nested transaction systems. J. ACM 36, 230-269.

BEERI, C., SCHEK, H.-J., AND WEIKUM, G. 1988. Multilevel transaction management: Theoretical art or practical need? Adv. Database Tech. (Mar.), pp. 134-154. Published as Lecture Notes in Computer Science #303, G. Goos and J. Hartmanis, Eds.

BERNSTEIN, P. A. 1987. Database systems support for software engineering: An extended abstract. In Proceedings of the 9th International Conference on Software Engineering (March), IEEE Computer Society Press, pp. 166-178.

BERNSTEIN, P. A., AND GOODMAN, N. 1981. Concurrency control in distributed database systems. ACM Comput. Surv. 13, 2 (June), 185-221.

BERNSTEIN, P. A., HADZILACOS, V., AND GOODMAN, N. 1987. Concurrency Control and Recovery in Database Systems. Addison-Wesley, Reading, Mass.

BJORK, L. A. 1973. Recovery scenario for a DB/DC system. In Proceedings of the 28th ACM National Conference (Aug.), ACM Press, pp. 142-146.

CHRYSANTHIS, P. K., AND RAMAMRITHAM, K. 1990. ACTA: A framework for specifying and reasoning about transaction structure and behavior. In Proceedings of the 1990 ACM SIGMOD International Conference on the Management of Data (May), ACM Press, pp. 194-203.

DAVIES, C. T. 1973. Recovery semantics for a DB/DC system. In Proceedings of the 28th ACM National Conference (Aug.), ACM Press, pp. 136-141.

DAVIES, C. T. 1978. Data processing spheres of control. IBM Syst. J. 17, 2, 179-198.

DITTRICH, K., GOTTHARD, W., AND LOCKEMANN, P. 1987. DAMOKLES: The database system for the UNIBASE software engineering environment. IEEE Bullet. Database Eng. 10, 1 (March), 37-47.

DOWSON, M., AND NEJMEH, B. 1989. Nested transactions and visibility domains. In Proceedings of the 1989 ACM SIGMOD Workshop on Software CAD Databases (Feb.), ACM Press, pp. 36-38. Position paper.

EASTMAN, C. 1980. System facilities for CAD databases. In Proceedings of the 17th ACM Design Automation Conference (June), ACM Press, pp. 50-56.

EASTMAN, C. 1981. Database facilities for engineering design. In Proceedings of the IEEE (Oct.), IEEE Computer Society Press, pp. 1249-1263.

EGE, A., AND ELLIS, C. A. 1987. Design and implementation of Gordion, an object-based management system. In Proceedings of the 3rd International Conference on Data Engineering (Feb.), IEEE Computer Society Press, pp. 226-234.

EL ABBADI, A., AND TOUEG, S. 1989. The group paradigm for concurrency control protocols. IEEE Trans. Knowledge Data Eng. 1, 3 (Sept.), 376-386.

ESWARAN, K., GRAY, J., LORIE, R., AND TRAIGER, I. 1976. The notions of consistency and predicate locks in a database system. Commun. ACM 19, 11 (Nov.), 624-632.

FELDMAN, S. I. 1979. Make: A program for maintaining computer programs. Softw. Pract. Exper. 9, 4 (Apr.), 255-265.

FERNANDEZ, M. F., AND ZDONIK, S. B. 1989. Transaction groups: A model for controlling cooperative work. In Proceedings of the 3rd International Workshop on Persistent Object Systems: Their Design, Implementation, and Use (Jan.), Morgan Kaufmann, pp. 128-138.

GARCIA-MOLINA, H. 1983. Using semantic knowledge for transaction processing in a distributed database. ACM Trans. Database Syst. 8, 2 (June), 186-213.

GARCIA-MOLINA, H., AND SALEM, K. 1987. SAGAS. In Proceedings of the ACM SIGMOD 1987 Annual Conference (May), ACM Press, pp. 249-259.

GARZA, J., AND KIM, W. 1988. Transaction management in an object-oriented database system. In Proceedings of the ACM SIGMOD International Conference on the Management of Data (June), ACM Press, pp. 37-45.

GRAY, J. 1978. Notes on database operating systems. IBM Res. Rep. RJ2188, IBM Research Laboratory, San Jose, Calif.

GRAY, J., LORIE, R., AND PUTZOLU, G. 1975. Granularity of locks and degrees of consistency in a shared database. IBM Res. Rep. RJ1654, IBM Research Laboratory, San Jose, Calif.

HERLIHY, M. P., AND WEIHL, W. E. 1988. Hybrid concurrency control for abstract data types. In Proceedings of the 7th ACM Symposium on Principles of Database Systems (Mar.), ACM Press, pp. 201-210.

KAISER, G. E. 1990. A flexible transaction model for software engineering. In Proceedings of the 6th International Conference on Data Engineering (Feb.), IEEE Computer Society Press, pp. 560-567.

KAISER, G. E., AND FEILER, P. H. 1987. Intelligent assistance without artificial intelligence. In Proceedings of the 32nd IEEE Computer Society International Conference (Feb.), IEEE Computer Society Press, pp. 236-241.

KAISER, G. E., AND PERRY, D. E. 1987. Workspaces and experimental databases: Automated support for software maintenance and evolution. In Proceedings of the Conference on Software Maintenance (Sept.), IEEE Computer Society Press, pp. 108-114.

KATZ, R. H. 1990. Toward a unified framework for version modeling in engineering databases. ACM Comput. Surv. 22, 4 (Dec.), 375-408.

KATZ, R. H., AND WEISS, S. 1984. Design transaction management. In Proceedings of the ACM IEEE 21st Design Automation Conference (June), IEEE Computer Society Press, pp. 692-693.

KIM, W., LORIE, R. A., MCNABB, D., AND PLOUFFE, W. 1984. A transaction mechanism for engineering databases. In Proceedings of the 10th International Conference on Very Large Databases (Aug.), Morgan Kaufmann, pp. 355-362.

KIM, W., BALLOU, N., CHOU, H., AND GARZA, J. 1988. Integrating an object-oriented programming system with a database system. In Proceedings of the 3rd International Conference on Object-Oriented Programming Systems, Languages and Applications (Sept.), ACM Press, pp. 142-152.

KLAHOLD, P., SCHLAGETER, G., UNLAND, R., AND WILKES, W. 1985. A transaction model supporting complex applications in integrated information systems. In Proceedings of the ACM SIGMOD International Conference on the Management of Data (May), ACM Press, pp. 388-401.

KOHLER, W. 1981. A survey of techniques for synchronization and recovery in decentralized computer systems. ACM Comput. Surv. 13, 2 (June), 149-183.

KORTH, H., AND SILBERSCHATZ, A. 1986. Database System Concepts. McGraw-Hill, New York.

1990. Long-duration transactions in software design projects. In Proceedings of the 6th International Conference on Data Engineering (Feb.), IEEE Computer Society Press, pp. 568-574.

KORTH, H., AND SPEEGLE, G. 1988. Formal model of correctness without serializability. In Proceedings of the ACM SIGMOD International Conference on the Management of Data (June), ACM Press, pp. 379-386.

KUNG, H., AND ROBINSON, J. 1981. On optimistic methods for concurrency control. ACM Trans. Database Syst. 6, 2 (June), 213-226.

KUTAY, A., AND EASTMAN, C. 1983. Transaction management in engineering databases. In Proceedings of the Annual Meeting of Database Week; Engineering Design Applications (May), IEEE Computer Society Press, pp. 73-80.

LEBLANG, D. B., AND CHASE, R. P., JR. 1987. Parallel software configuration management in a network environment. IEEE Softw. 4, 6 (Nov.), 28-35.

LORIE, R., AND PLOUFFE, W. 1983. Complex objects and their use in design transactions. In Proceedings of the Annual Meeting of Database Week; Engineering Design Applications (May), IEEE Computer Society Press, pp. 115-121.

LYNCH, N. A. 1983. Multilevel atomicity: A new correctness criterion for database concurrency control. ACM Trans. Database Syst. 8, 4 (Dec.), 484-502.

MARTIN, B. 1987. Modeling concurrent activities with nested objects. In Proceedings of the 7th International Conference on Distributed Computing Systems (Sept.), IEEE Computer Society Press, pp. 432-439.

MOSS, J. E. B. 1985. Nested Transactions: An Approach to Reliable Distributed Computing. The MIT Press, Cambridge, Mass.

NESTOR, J. R. 1986. Toward a persistent object base. In Advanced Programming Environments, Conradi, R., Didriksen, T. M., and Wanvik, D. H., Eds. Springer-Verlag, Berlin, pp. 372-394.

PAPADIMITRIOU, C. 1986. The Theory of Database Concurrency Control. Computer Science Press, Rockville, Md.

PERRY, D., AND KAISER, G. 1991. Models of software development environments. IEEE Trans. Softw. Eng. 17, 3 (Mar.).

PRADEL, U., SCHLAGETER, G., AND UNLAND, R. 1986. Redesign of optimistic methods: Improving performance and availability. In Proceedings of the 2nd International Conference on Data Engineering (Feb.), IEEE Computer Society Press, pp. 466-473.

Pu, C., KAISER, G., AND HUTCHINSON, N. 1988.

Split transactions for open-ended activities In

Proceedings of the 14th International Confer-ence on Very Large Databases (Aug.), MorganKaufmann, pp. 26-37.

REED, R. 1978 Naming andsynchromzation in a

decentralized computer system Ph.D. dmserta-tion, MIT Laboratory of Computer Science, MITTech. Rep, 205.

ROCHKIND, M. J 1975. The source code controlsystem IEEE Trans. Softw. Eng. SE-1,

364-370

ROSENKRANTZ, D., STEARNS, R., AND LEWIS, P. 1978

System-level concurrency control for dis-

tributed database systems ACM Trans.

Database Syst 3,2(June),178-198

ROWE, L. A , AND WENSEL, S. 1989. 1989 ACM

SIG MOD Workshop on Software CADDatabases, Napa, Cahf

SALEM, K., GARCIA-M• LINA, H., AND ALONSO, R.1987. Altrumtlc locking: A strategy for cop-ing with long-hved transactions In Proceed-ings of th e 2 nd International Workshop on HzghPerformance Transaction Systems (Sept ),

Amdahl Corporation, IBM Corporation, pp.19.1-19 24.2 RSilberschatz, A., and Kedem, Z.

Consistency in hierarchical database systems.

J. ACM 27, 1 (Jan), 72-80

SKARRA, A. H , AND ZDONnC, S. B. 1989. Concur-rency control and object-oriented databases In

Kim, W., and Lochovsky, F. H., eds ,

ObJect- Oriented Concepts, Databases, and Ap-plications, ACM Press, New York, pp 395-421

STONEBR~KER, M., KATZ, R , PATTERSON, D., AND

OUSTERHOUT, J. 1988. The design of XPRS.In Proceedings of the 14th International

Conference on Very Large Databases (Aug.),Morgan Kaufmann, pp 318-330

TICHY, W. F. 1985. RCS: A system for version

control. So ftw. Pratt. Exper. 15, 7 (July),637-654

WALPOLE, J., BLAIR, G., HUTCHISON, D., AND NICOL,

J. 1987. Transaction mechanisms for dis-tributed programming environments. Softw.Eng. J. 2, 5 (Sept.), 169-177

WALPOLE, J., BLAIR, G., MALIK, J., AND NICOL, J. 1988a. A unifying model for consistent distributed software development environments. In Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments (Nov.), ACM Press, pp. 183-190.

WALPOLE, J., BLAIR, G., MALIK, J., AND NICOL, J. 1988b. Maintaining consistency in distributed software engineering environments. In Proceedings of the 8th International Conference on Distributed Computing Systems (June), IEEE Computer Society Press, pp. 418-425.

WALTER, B. 1984. Nested transactions with multiple commit points: An approach to the structuring of advanced database applications. In Proceedings of the 10th International Conference on Very Large Databases (Aug.), Morgan Kaufmann, pp. 161-171.

WEIHL, W. 1988. Commutativity-based concurrency control for abstract data types (Preliminary Report). In Proceedings of the 21st Annual Hawaii International Conference on System Sciences (Jan.), IEEE Computer Society Press, pp. 205-214.

WEIKUM, G. 1986. A theoretical foundation of multilevel concurrency control. In Proceedings of the 5th ACM Symposium on Principles of Database Systems (Mar.), ACM Press, pp. 31-42.

WEIKUM, G., AND SCHEK, H.-J. 1984. Architectural issues of transaction management in multilevel systems. In Proceedings of the 10th International Conference on Very Large Databases (Aug.), Morgan Kaufmann.

WILLIAMS, R., DANIELS, D., HAAS, L., LAPIS, B., LINDSAY, B., NG, P., OBERMARCK, R., SELINGER, P., WALKER, A., WILMS, P., AND YOST, R. 1981. R*: An overview of the architecture. In Improving Database Usability and Responsiveness, Scheuermann, P., Ed., Academic Press, NY, pp. 1-27.

YANNAKAKIS, M. 1982. Issues of correctness in database concurrency control by locking. J. ACM 29, 3 (July), 718-740.

YEH, S., ELLIS, C., EGE, A., AND KORTH, H. 1987. Performance analysis of two concurrency control schemes for design environments. Tech. Rep. STP-036-87, MCC, Austin, Tex.

Received November 1989; final revision accepted March 1991.
